2 - Getting Your Data Into SAS: The Basics
Math 3210
Dr. Zeng
Department of Mathematics
California State University, Bakersfield
Outline
• Getting data into SAS
-Entering data directly into SAS
-Creating SAS data sets from external data files
Method TWO: Creating SAS data sets from raw data files (external data).
This method lets you import data into SAS from files created by other
software, such as Excel or text files. In this chapter, you will learn
about:
• Importing Excel data into SAS Enterprise Guide
• Importing text data into SAS Studio
• Importing Excel data into SAS Studio
• The INFILE statement
• The IMPORT procedure
Entering data directly into SAS (internal data)
4. Type the following SAS code in the editor window and submit it:
data sampledata1;
infile '/home/bzeng/my_content/sampledata1.txt';
input name $ gender $ score;
run;
proc print data=sampledata1;
run;
Remark:
4. In this example:
• When you use column input in the INPUT statement, list the
variable names and specify column positions that identify the
location of the corresponding data field.
• With column input, the INPUT statement takes the following
form. After the INPUT keyword, list the first variable’s name. If
the variable is character, leave a space; then place a $. After
the dollar sign, or variable name if it is numeric, leave a space;
then list the column or range of columns for that variable.
Repeat for all variables.
• If you read the data from an external data file, use the INFILE
statement before the INPUT statement and remove the
DATALINES statement.
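As a minimal sketch of the column input form described above, the DATA step below reads three variables from fixed column positions. The data set name, variable names, column ranges, and data values are illustrative assumptions, not taken from the course files.

```sas
/* Hypothetical layout: name in columns 1-12, gender in column 14, */
/* score in columns 16-17                                          */
data columninput_demo;
   input name $ 1-12 gender $ 14 score 16-17;
   datalines;
Mary Ann Lee F 85
John Smith   M 92
;
run;

proc print data=columninput_demo;
run;
```

Because each value is located by its column position, a name such as "Mary Ann Lee" can contain embedded blanks; list input would split it at the first blank.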
Advantages of Column Input over List Input
a) 123
b) 1234
c) 123456
d) 1.25
e) 12.2567
Example 1: date and comma
data total_sales;
   /* Date: columns 1-10; +2 skips columns 11-12; Amount: columns 13-17 */
   input Date mmddyy10. +2 Amount comma5.;
   datalines;
09/05/2013  1,382
10/19/2013  1,235
11/30/2013  2,391
;
run;
proc print data=total_sales;
   title 'Reading Raw Data Not in Standard Format: Date and Comma';
run;
Note:
• The MMDDYY10. informat for the variable Date tells
SAS to interpret the raw data as a month, day, and year,
ignoring the slashes. The 10 is the width of the field in
columns.
• Notice that these dates are printed as the number of
days since January 1, 1960. In later chapters, we will
talk about how to format these values into readable
dates.
• The COMMA5. informat for the variable Amount tells
SAS to interpret the raw data as a number, ignoring the
comma. The 5 is the width of the field; the comma
counts as one of the five columns.
• The +2 is a column pointer control. It moves the input
pointer forward two columns before SAS reads the next
value.
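The same layout can also be read with an absolute column pointer instead of a relative one. The sketch below assumes the amount always starts in column 13 (ten date columns followed by two blanks); the data set name total_sales_at is hypothetical.

```sas
data total_sales_at;
   /* @13 moves the pointer to column 13, the same position that +2 */
   /* reaches after the 10-column date field has been read          */
   input Date mmddyy10. @13 Amount comma5.;
   datalines;
09/05/2013  1,382
10/19/2013  1,235
;
run;

proc print data=total_sales_at;
run;
```

A relative pointer (+n) depends on where the previous field ended, while an absolute pointer (@n) names the column directly, which can be easier to check against a printed record layout.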
Example 2: standard character and numeric data
data example2;
   /* name: columns 1-10, Age: 11-13, Height: 14-18, BirthDate: 19-26 */
   input name $10. Age 3. Height 5.1 BirthDate MMDDYY8.;
   datalines;
Jane Matt 35 175.603-21-82
Jose Lee  32 172.806-15-85
;
run;
proc print data=example2;
   title 'Example 2-Reading Raw Data not in Standard Format: standard character and numeric data';
run;
Note:
• Name $10. tells SAS to read the first variable
“name” from columns 1 through 10.
• The starting point for the second variable is then
column 11, and SAS reads values for “age” from
columns 11 through 13.
• The third variable “height” is in columns 14
through 18. The informat 5.1 means the field is five
columns wide with one decimal place.
• The last variable “BirthDate” starts in column
19 and is read as a date with the MMDDYY8.
informat, which is eight columns wide.
Example 3: standard and non-standard data
data contest;
   /* name: 1-16, Age: 17-19, Type: 21, Date: 23-32, */
   /* Score1-Score5: four columns each, starting at column 33 */
   input name $16. Age 3. +1 Type $1. +1 Date MMDDYY10.
         (Score1 Score2 Score3 Score4 Score5) (4.1);
   datalines;
Alicia Grossman  13 c 10-28-1999 7.8 6.5 7.2 8.0 7.9
Matthew Lee       9 D 10-30-1999 6.5 5.9 6.8 6.0 8.1
Elizabeth Garcia 10 C 10-29-1999 8.9 7.9 8.5 9.0 8.8
Lori Newcombe     6 D 10-30-1999 6.7 5.6 4.9 5.2 6.1
Jose Martinez     7 d 10-31-1999 8.9 9.510.0 9.7 9.0
Brian Williams   11 C 10-29-1999 7.8 8.4 8.5 7.9 8.0
;
run;
proc print data=contest;
   title 'Example3';
run;
Note:
• The variable “name” is standard character data. $16. means it
occupies columns 1 through 16.
• The variable “age” is standard numeric data. 3. means
it is three columns wide and has no decimal places.
• The +1 skips over one column.
• $1. means the variable “type” is standard character data that
is one column wide.
• MMDDYY10. reads dates in the form 10-31-2016 or
10/31/2016, each 10 columns wide.
• Score1 through Score5 all require the same informat, 4.1.
• By putting the variables and the informat in separate sets
of parentheses, you only need to list the informat once.
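The grouped informat list described above can be sketched on a smaller record. The example below also uses a numbered range list (Score1-Score5) in place of the five separate names; the data set name scores_demo, the 8-column name field, and the data values are assumptions made for illustration.

```sas
data scores_demo;
   /* name: columns 1-8; each score is 4 columns wide with 1 decimal place */
   input name $8. (Score1-Score5) (4.1);
   datalines;
Alicia   7.8 6.5 7.2 8.0 7.9
Jose     8.9 9.510.0 9.7 9.0
;
run;

proc print data=scores_demo;
run;
```

Because each score is read from a fixed 4-column field, the values 9.5 and 10.0 in the second record are read correctly even though no blank separates them.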
Working with SAS Dates: The FORMAT Statement
If you print a SAS date value, SAS will by default print the actual
value, that is, the number of days since January 1, 1960. In most
cases, this is not very meaningful. SAS has a variety of formats
for printing dates in different forms. Here is a list of selected SAS
date formats.
Example: The FORMAT Statement
A FORMAT statement such as FORMAT date MMDDYY8.; tells SAS to print the
variable date using the MMDDYY8. format.
Similarly, FORMAT date WORDDATE18.; tells SAS to print the variable date
using the WORDDATE18. format.
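A minimal sketch of the two FORMAT statements described above, applied to data laid out like Example 1. The data step is repeated here so the sketch runs on its own; placing the FORMAT statement inside PROC PRINT is one possible choice and affects only that printout.

```sas
data total_sales;
   input Date mmddyy10. +2 Amount comma5.;
   datalines;
09/05/2013  1,382
10/19/2013  1,235
;
run;

/* Print dates as mm/dd/yy, for example 09/05/13 */
proc print data=total_sales;
   format Date mmddyy8.;
run;

/* Print dates spelled out, for example September 5, 2013 */
proc print data=total_sales;
   format Date worddate18.;
run;
```

The stored value of Date is unchanged in both cases; the format controls only how the number of days since January 1, 1960 is displayed.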
DATA nationalparks;
INFILE '/home/bzeng/my_content/NatPark.dat';
INPUT ParkName $ 1-22 State $ Year @40 Acreage COMMA9.;
RUN;
Practice 5: Mix Inputs
Create the data set club1 by following the instructions below:
• Use list input for variables IdNumber, StartWeight, and
EndWeight
• Use formatted input for variable Name
• Use column input for variable Team