Proc SQL
Proc SQL can be used either to create new variables or to merge datasets.
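Note: One limitation of PROC SQL is that a single query can join only a
limited number of tables (datasets): up to 16 at a time. The normal limit
in SAS is 100. Limits of this kind vary by SAS release, so check the
documentation for your version. Also note that, unlike a DATA step MERGE
with a BY statement, PROC SQL does not require any of the datasets to be
sorted before they are merged.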
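The point about sorting can be illustrated with a minimal sketch. The dataset names (work.names, work.scores), the key variable id, and all data values below are hypothetical and chosen only for illustration. Both inputs are deliberately left unsorted.

```sas
/* Two small, deliberately unsorted datasets (hypothetical names and values) */
data work.scores;
  input id score;
  datalines;
3 71
1 85
2 90
;
run;

data work.names;
  input id name $;
  datalines;
2 Ben
3 Cara
1 Ann
;
run;

/* Inner join on the common variable id; no PROC SORT step is needed */
proc sql;
  create table work.combined as
  select n.id, n.name, s.score
  from work.names as n, work.scores as s
  where n.id = s.id
  order by n.id;
quit;
```

The ORDER BY clause only controls the order of rows in the result; the join itself does not depend on the order of the input rows.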
Exercise
Ex 1: Given two datasets A and B in the folder, use PROC SQL to merge
them by a common I variable, creating a new dataset D.
SQL Examples
Removing duplicates
Proc sql;
create table new as
select distinct jobcode from trg.fltattnd;
Quit;
Proc sql;
create table new as
select salary, salary*0.05 as tax, hiredate format=date7. from trg.fltattnd
where salary > 10000
order by salary, hiredate desc;
quit;
(Any of the DATA step functions can be used to create new variables,
except SOUND, DIF and LAG.)
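As a hedged sketch of this point, the query below applies two ordinary DATA step functions inside a SELECT clause. It assumes that the libref trg is already assigned, that jobcode is a character variable, and that hiredate holds SAS date values (as the date7. format used above suggests). The new variable names jobcode_uc and hireyear are made up for illustration.

```sas
proc sql;
  create table new as
  select lastname,
         upcase(jobcode) as jobcode_uc,  /* character function */
         year(hiredate)  as hireyear     /* date function: extracts the year */
  from trg.fltattnd;
quit;
```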
Proc sql;
create table new as
select count(*) as n, round(mean(salary),0.01) as salarymean format=10.2
from trg.fltattnd;
quit;
Proc sql;
create table new as
select jobcode, count(jobcode) as n, hiredate format=date7., salary,
max(salary) as salarymax, round(salary/(calculated salarymax)*100,0.01)
as salpct format=6.2
from trg.fltattnd
group by jobcode;
quit;
Proc sql;
create table new as
select jobcode, count(jobcode) as n, hiredate format=date7., salary,
max(salary) as salarymax
from trg.fltattnd
group by jobcode
having salary=calculated salarymax
order by calculated salarymax;
quit;
Proc sql;
create table new as
select lastname,
case when salary > 30000 then 'high sal'
when salary < 20000 then 'low salary'
else 'error' end as saltype length=10
from trg.fltattnd
order by salary;
quit;
Note: The case expression can be used to create a new variable that is a
re-categorization of the values of another variable.
Proc sql;
select name
from student
where course like '%EE%';
quit;
Example: Run the following code on the dataset forsql and note the
results.
proc sql;
create table matrix as select * from
(select ans as ans0001 from trg.forsql where var='0001'),
(select ans as ans0006 from trg.forsql where var='0006'),
(select ans as ans0003 from trg.forsql where var='0003')
order by ans0001, ans0006, ans0003;
quit;
Proc Sql;
select distinct T.branch_name
from branch as T, branch as S
where T.assets > S.assets and
S.branch_city = 'Brooklyn';
quit;