
LBSIM Business Analytics Slides - Day 7


Advanced Data Science Using SAS
Day 7 – Sun, 1 Dec 2024
Lal Bahadur Shastri Institute of Management
G Krishnamurthy
Programming Constructs In SAS
Topics
• Recap
– Conditional Processing
– Iterative Processing: Looping in SAS
• Working With SAS Dates
• Creating Data Sets
• Creating Formats and Labels
Recap
Performing Conditional Processing
Logical Comparison Operators
Comparison                     Mnemonic   Symbol
Equal To                       EQ         =
Not Equal To                   NE         ^= or ~=
Less Than                      LT         <
Less Than or Equal To          LE         <=
Greater Than                   GT         >
Greater Than or Equal To       GE         >=
Equal To Any Value In A List   IN         (none)
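A minimal sketch of these operators inside IF-THEN/ELSE statements. The Grades data set, its variables, and the pass mark of 50 are hypothetical and chosen only for illustration.

```sas
/* Hypothetical data: names, scores, and sections are made up */
data Grades;
   length Result $ 4;
   input Name $ Score Section $;
   /* Mnemonic operator: GE means the same as >= */
   if Score ge 50 then Result = 'Pass';
   else Result = 'Fail';
   /* IN tests whether a value matches any item in a list */
   if Section in ('A', 'B') then Group = 1;
   else Group = 2;
datalines;
Asha 72 A
Ravi 45 C
Meena 50 B
;

title "Listing of Data Set Grades";
proc print data=Grades noobs;
run;
```

The mnemonic and symbol forms are interchangeable, so `Score ge 50` could equally be written `Score >= 50`.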
Performing Iterative Processing
Using DO-END statements
• Consider a program that converts temperatures from degrees Celsius to degrees Fahrenheit and lists the results as a table

   data Convert_Temp;
      do Temp_C = 0 to 100;
         Temp_F = 1.8 * Temp_C + 32;
         output;
      end;
   run;

   title "Listing of Data Set Convert_Temp";
   proc print data=Convert_Temp noobs;
   run;
Exiting Loops
• DO WHILE
– One of two statements that perform a logical test to decide when to leave a DO loop
– do while (<expression>);
     statements
  end;
– Here, the expression is evaluated at the top of the loop
• DO UNTIL
– DO UNTIL exits the loop when the UNTIL condition becomes true
– The UNTIL condition is evaluated at the bottom of the loop
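The practical difference between the two tests can be sketched as follows. The variable names and starting values are illustrative assumptions: both loops start with a condition that should stop them immediately, yet only DO WHILE skips its body.

```sas
data _null_;
   /* DO WHILE: tested at the top, so the body can run zero times */
   X = 10;
   Count_While = 0;
   do while (X < 10);
      Count_While = Count_While + 1;
      X = X + 1;
   end;

   /* DO UNTIL: tested at the bottom, so the body runs at least once */
   Y = 10;
   Count_Until = 0;
   do until (Y >= 10);
      Count_Until = Count_Until + 1;
      Y = Y + 1;
   end;

   /* Writes Count_While=0 Count_Until=1 to the log */
   put Count_While= Count_Until=;
run;
```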
Working With SAS Dates
Introduction
• SAS reads and writes dates in many layouts, including ddmmyyyy, mmddyyyy, and dd/mmm/yyyy.
• Every date is stored as a number: the count of days since January 1, 1960.
– January 1, 1960 is 0
– January 2, 1960 is 1
• Dates before January 1, 1582 cannot be represented, because the Gregorian calendar was introduced in 1582.
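A short sketch that writes the stored values to the log, using date constants of the form 'ddMonyyyy'd. The variable names are illustrative; the third date is an added example showing that dates before 1960 are stored as negative numbers.

```sas
data _null_;
   D0 = '01Jan1960'd;        /* day zero */
   D1 = '02Jan1960'd;        /* one day later */
   D_Before = '31Dec1959'd;  /* one day earlier, stored as a negative value */
   /* Writes D0=0 D1=1 D_Before=-1 to the log */
   put D0= D1= D_Before=;
run;
```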
INFORMATS
• INFORMAT is used to read data into SAS variables
• There are three categories
– Character $INFORMATw.
– Numeric INFORMATw.d
– Date/ Time INFORMATw.
• There are other INFORMAT categories used to read
Hebrew and Asian languages
• The period is mandatory: it distinguishes a SAS INFORMAT name from a SAS variable name
• Usually, ‘w’ in the case of DATE/ TIME is set to 10
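One way to see the three categories together is the sketch below. The data set, variable names, and the single data line are hypothetical. It uses list input with the colon modifier (an assumption made here so the fields do not need to sit in fixed columns), with $10. as the character informat, comma10. as the numeric informat, and date9. as the date informat.

```sas
/* Hypothetical data set illustrating the three informat categories */
data Informat_Demo;
   input Name :$10. Amount :comma10. Joined :date9.;
   /* Formats control display only; Joined is still stored as a number */
   format Amount comma10.2 Joined date9.;
datalines;
Asha 1,234.56 23May2015
;

title "Listing of Data Set Informat_Demo";
proc print data=Informat_Demo noobs;
run;
```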
Program to read dates
   data Read_Dates;
      infile "~/gkrishnamurthy/Date_Data.txt" pad;
      input @1  Date1 mmddyy10.
            @12 Date2 date9.
            @22 Date3 ddmmyy10.;
   run;

   title "Listing of Data Set dates";
   proc print data=Read_Dates noobs;
   run;

• Date_Data.txt (the three fields start in columns 1, 12, and 22)
05/23/2015 23May2015 23/05/2015
10/21/1950 21Oct1950 21/10/1950
5/7/2013   7Jul2013  7/5/2013
Output
• Output would be
– Listing of Data Set Dates

   Date1    Date2    Date3
   20231    20231    20231
   -3359    -3359    -3359
   19485    19546    19485
Formatted Program to read dates
   data Read_Dates;
      infile "~/gkrishnamurthy/Date_Data.txt" pad;
      input @1  Date1 mmddyy10.
            @12 Date2 date9.
            @22 Date3 ddmmyy10.;
      format Date1 mmddyy10. Date2 Date3 date9.;
   run;

   title "Listing of Data Set dates";
   proc print data=Read_Dates noobs;
   run;

• Date_Data.txt (the three fields start in columns 1, 12, and 22)
05/23/2015 23May2015 23/05/2015
10/21/1950 21Oct1950 21/10/1950
5/7/2013   7Jul2013  7/5/2013
Formatted Output
• Output would be
– Listing of Data Set Dates

   Date1        Date2       Date3
   05/23/2015   23MAY2015   23MAY2015
   10/21/1950   21OCT1950   21OCT1950
   05/07/2013   07JUL2013   07MAY2013
Extracting Dates / Creating a Bar Chart

   data Extract;
      informat Date mmddyy10.;
      input Date @@;
      Day_of_Week = weekday(Date);
      Day_of_Month = day(Date);
      Year = year(Date);
      format Date mmddyy10.;
   datalines;
   1/5/2000 2/8/2000 4/23/2000 4/12/2000 8/21/2000 8/22/2000 8/23/2000
   12/12/2000 12/15/2000 12/18/2000
   2/22/2001 2/1/2001 4/18/2001 4/18/2001 4/18/2001 9/17/2001 12/25/2001
   12/22/2001 3/3/2001 3/6/2001 3/7/2001
   ;

   title "Listing of the First Twenty Observations from Extract";
   proc print data=Extract(obs=20);
   run;

   title "Frequencies for Day of the Week";
   proc sgplot data=Extract;
      vbar Day_of_Week;
   run;
Formatting Dates

   proc format;
      value DOW 1='Sun' 2='Mon' 3='Tue' 4='Wed'
                5='Thu' 6='Fri' 7='Sat';
   run;

   data Extract;
      informat Date mmddyy10.;
      input Date @@;
      Day_of_Week = weekday(Date);
      Day_of_Month = day(Date);
      Year = year(Date);
      format Date mmddyyd10. Day_of_Week DOW.;
   datalines;
   1/5/2000 2/8/2000 4/23/2000 4/12/2000 8/21/2000 8/21/2000 8/22/2000
   12/12/2000 12/15/2000 12/18/2000
   2/22/2001 2/1/2001 4/18/2001 4/18/2001 4/18/2001 9/17/2001 12/25/2001
   12/22/2001 3/3/2001 3/6/2001 3/7/2001
   ;

   title "Listing of the First Twenty Observations from EXTRACT";
   proc print data=Extract(obs=20);
   run;

   title "Frequencies for Day of the Week";
   proc sgplot data=Extract;
      vbar Day_of_Week;
   run;
Getting Days Between Two Dates
• Computing the years between two dates
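   data _null_;
      /* Replace <Your DOB> with a date such as 1990-05-23 */
      DOB = input('<Your DOB>', yymmdd10.);
      /* Exact age in years, including the fractional part */
      age = yrdif(DOB, today(), 'AGE');
      /* Age as of the last birthday */
      age_last = int(yrdif(DOB, today(), 'AGE'));
      put age= 'years';
      put age_last= 'years';
   run;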
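Because SAS dates are day counts, the number of days between two dates is a simple subtraction. The sketch below uses two hypothetical dates and illustrative variable names, and shows YRDIF for comparison.

```sas
data _null_;
   /* Hypothetical start and end dates */
   Start  = '01Jan2020'd;
   Finish = '31Dec2021'd;
   /* Dates are day counts, so subtraction gives elapsed days (730 here) */
   Days_Between = Finish - Start;
   /* YRDIF with the 'AGE' basis gives the same interval in years */
   Years_Between = yrdif(Start, Finish, 'AGE');
   put Days_Between= Years_Between=;
run;
```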
Using Date Constants
   data _null_;
      title "Checking for Out-of-Range Dates";
      input @1 Date mmddyy10.;
      file print;
      /* AND is evaluated before OR; the parentheses only make that explicit */
      if (Date lt '01Jan2020'd and not missing(Date)) or
         Date gt '31Dec2021'd then put "Date " Date "is out of range";
      format Date mmddyy10.;
   datalines;
   10/13/2020
   5/1/2012
   1/1/2015
   6/5/2020
   1/1/2000
   ;
Creating Formats And Labels
Introduction
• Most databases store information as codes rather than as actual values
• However, we usually want the output to show the descriptive labels
• This is done by using SAS formats
Using Formats
• We can create a Taxes data set using the DATALINES statement
• Its values can be displayed using your own formats, as demonstrated
– VALUE statements are used to define formats
– Format names can be up to 32 characters long
– Unlike other SAS names, they cannot end in a digit
– Names of formats applied to character variables must begin with a $ sign
– The NOCUM option on the PROC FREQ TABLES statement omits cumulative frequencies from the results
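The points above can be sketched as follows. The slides do not show the Taxes data set itself, so the variables, values, format names, and bracket cut-offs here are all hypothetical.

```sas
proc format;
   /* Character format: the name begins with $ */
   value $Gender 'M' = 'Male'
                 'F' = 'Female';
   /* Numeric format: ranges of values mapped to labels */
   value TaxBracket low -< 10000   = 'Low'
                    10000 -< 50000 = 'Medium'
                    50000 - high   = 'High';
run;

/* Hypothetical Taxes data set; the values are made up */
data Taxes;
   input ID $ Gender $ Tax_Paid;
   format Gender $Gender. Tax_Paid TaxBracket.;
datalines;
001 M 5000
002 F 25000
003 F 75000
004 M 12000
;

title "Frequencies for the Hypothetical Taxes Data Set";
proc freq data=Taxes;
   /* NOCUM omits the cumulative frequency and percent columns */
   tables Gender Tax_Paid / nocum;
run;
```

PROC FREQ groups observations by their formatted values, so Tax_Paid is tabulated by bracket rather than by individual amount.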
SAS Built-In Formats
Format   Display (of 1234.567)   Explanation
8.3      1234.567                Total width = 8, including the decimal point
10.4     b1234.5670              Leading space represented by b, followed by the number and a 0 at the end
8.2      b1234.57                Leading space represented by b
4.       1235                    Width shorter than value, so the value is rounded to fit
10.1     bbbb1234.6              Four leading spaces
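The table rows can be checked with a short sketch like this one. The variable names are illustrative, and the square brackets are added only to make the leading spaces visible in the log.

```sas
data _null_;
   X = 1234.567;
   length Shown $ 12;
   /* PUT() returns the formatted value as text, right-aligned in the width */
   Shown = '[' || put(X, 8.3)  || ']';  put Shown;
   Shown = '[' || put(X, 10.4) || ']';  put Shown;
   Shown = '[' || put(X, 8.2)  || ']';  put Shown;
   Shown = '[' || put(X, 4.)   || ']';  put Shown;
   Shown = '[' || put(X, 10.1) || ']';  put Shown;
run;
```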
Making Formats Permanent
• By default, formats are temporary: they exist only for the duration of the session
• Formats that are used frequently can be made permanent, so that PROC FORMAT does not have to be rerun
• Usually, these are stored in your home directory
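A minimal sketch of one way to do this, using the LIBRARY= option of PROC FORMAT and the FMTSEARCH= system option. The libref MyFmts and the directory ~/formats are hypothetical, and the directory must already exist.

```sas
/* Hypothetical libref and path; point it at a directory you own */
libname MyFmts '~/formats';

/* LIBRARY= stores the format in a permanent catalog instead of WORK */
proc format library=MyFmts;
   value DOW 1='Sun' 2='Mon' 3='Tue' 4='Wed'
             5='Thu' 6='Fri' 7='Sat';
run;

/* In later sessions, reissue the LIBNAME statement and tell SAS where to look */
options fmtsearch=(MyFmts);
```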
Filtering SAS Data Sets
Programs
• Use the data set SASHELP.retail.
• A. Get all observations in the data set that deal
with sale values, date, year, month, and day for
the year 1981
• B. Get all observations in the data set that deal
with sale values, date, year, month, and day for
the years 1980, 1981, 1983 and 1985, where
sale values are in excess of $250.
• Use the SAS SET statement for this exercise
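One possible approach is sketched below, not a model answer. It assumes SASHELP.RETAIL contains the variables Sales, Date, Year, Month, and Day; check this with PROC CONTENTS before relying on it. The output data set names are hypothetical.

```sas
/* Part A: observations for 1981 (variable names are assumed) */
data Retail_1981;
   set sashelp.retail(keep=Sales Date Year Month Day);
   where Year = 1981;
run;

/* Part B: selected years with sales above 250 */
data Retail_Selected;
   set sashelp.retail(keep=Sales Date Year Month Day);
   where Year in (1980, 1981, 1983, 1985) and Sales > 250;
run;

title "Retail Sales for Selected Years";
proc print data=Retail_Selected noobs;
run;
```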
Thank You
