Regression Analysis Using SAS and Stata

Hsueh-Sheng Wu
CFDR Workshop Series
October 19, 2015

Outline
• What is regression analysis?
• Why is regression analysis popular?
• A primitive way of conducting regression analysis
• A better way of conducting regression analysis:
Corrections for violations in regression assumptions for
– Linearity
– Mean independence
– Homoscedasticity
– Uncorrelated disturbances
– Normal disturbance

• Conclusions

What Is Regression?
Regression is used to study the relation between a single
dependent variable and one or more independent
variables. In regression, the dependent variable y is a
linear function of the x’s, plus a random disturbance ε.

y = a + b1x1 + b2x2 + ε

y is the dependent variable

a is the intercept
x1 and x2 are independent variables
b1 and b2 are regression coefficients
ε represents the combined effects of all the causes of y that are
not included in the equation, but can influence the relations
between x’s and y

Five Assumptions of Regression
1. Linearity
– y is a linear function of the x’s
2. Mean independence
– the mean of the disturbance term is always 0 and
does not depend on the value of x’s
3. Homoscedasticity
– The variance of ε does not depend on the x’s
4. Uncorrelated disturbances
– The value of ε for any individual in the sample is not
correlated with the value of ε for any other individual
5. Normal disturbance
– ε has a normal distribution
Why Is Regression Analysis Popular?
• Statistical convenience. All statistical software packages provide
regression analysis.
• Intuitive logic. Regression analysis fits the way we think: once we
observe a phenomenon (i.e., the dependent variable), we ask what may
contribute to it.
• Various types of regression models
– Based on the number of independent variables
• Simple regression
• Multiple Regression
– Based on the type of the dependent variable
• Ordinary least squares regression
• Logistic regression
• Ordered logistic regression
• Multinomial logistic regression
• Poisson regression
– Based on the number of dependent variables
• Structural Equation Modeling
• Hierarchical Linear Regression
A Primitive Way of Conducting Regression Analysis
• Decide on a research question
e.g., whether the price of a car is determined by its weight,
length, and repair record

• Decide on the dependent and independent variables

Dependent variable: the price of the car
Independent variables: the weight, length, and repair record

• Find a data set

Data set: the information on prices, weights, lengths, and repair
records of 74 cars

• Decide on the regression model

An Ordinary Least Squares (OLS) model is used because price is a
continuous variable

• Run the regression analysis

• Interpret the results

Stata and SAS Commands for Regression Analysis
SAS commands:
proc reg data = auto;
MODEL price = weight length rep78;
run;
The REG Procedure
Model: MODEL1
Dependent Variable: price Price

Number of Observations Read                   74
Number of Observations Used                   69
Number of Observations with Missing Values     5

Analysis of Variance

Source             DF    Sum of Squares    Mean Square    F Value    Pr > F
Model               3         246375736       82125245      16.16    <.0001
Error              65         330421222        5083403
Corrected Total    68         576796959

Root MSE          2254.64042    R-Square    0.4271
Dependent Mean    6146.04348    Adj R-Sq    0.4007
Coeff Var           36.68442

Parameter Estimates

Variable     Label                 DF    Parameter Estimate    Standard Error    t Value    Pr > |t|
Intercept    Intercept              1            6850.95187        4312.73825       1.59      0.1170
weight       Weight (lbs.)          1               5.25210           1.10343       4.76      <.0001
length       Length (in.)           1            -103.60163          37.78457      -2.74      0.0079
rep78        Repair Record 1978     1             844.94616         302.03629       2.80      0.0068
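
The fit statistics in the output above can be re-derived by hand from the ANOVA table. The following is an illustrative sketch, not part of the original workshop material: it uses Stata's display command as a calculator, and the only inputs are the sums of squares, degrees of freedom, and mean squares printed above. It is meant to be run from a do-file, because the trailing // comments are not accepted at the interactive prompt.

```stata
* Re-derive the fit statistics from the ANOVA table (n = 69 cars, k = 3 predictors).
display "R-square     = " 246375736/576796959                // model SS / total SS, about 0.4271
display "F value      = " 82125245/5083403                   // model MS / error MS, about 16.16
display "Root MSE     = " sqrt(5083403)                      // square root of the error MS, about 2254.64
display "Adj R-square = " 1 - (330421222/65)/(576796959/68)  // 1 - error MS / total MS, about 0.4007
```

Seeing where each number comes from makes it easier to interpret the output: R-square is the share of the variation in price that the three predictors account for, and the F value compares the explained variation per predictor with the unexplained variation per residual degree of freedom.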

Stata commands:
webuse auto.dta, clear
reg price weight length rep78
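
After the regression, Stata keeps the fit statistics as stored results, which can be compared directly with the SAS output. The sketch below is an addition for illustration. It assumes the auto data set that ships with Stata (loaded with sysuse, so no internet connection is needed) matches the one loaded with webuse above, and it should be run from a do-file so that the // comments are accepted.

```stata
* Minimal sketch: inspect the stored results after the regression.
sysuse auto, clear                  // assumption: the auto data installed with Stata
count if missing(rep78)             // cars that drop out of the model because rep78 is missing
regress price weight length rep78
display "Observations used: " e(N)
display "R-squared:         " e(r2)
display "Adjusted R-sq.:    " e(r2_a)
display "F statistic:       " e(F)
display "Root MSE:          " e(rmse)
```

The count of missing rep78 values explains why fewer observations are used than are read: regress drops any car with a missing value on a model variable.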

A Better Way of Conducting Regression Analysis
• Decide on a research question

• Decide on the dependent and independent variables

• Find a data set

• Decide on the regression model

• Run the regression analysis

• Check for violations of the regression assumptions

• Fix the violations and then run the analysis again

• Interpret the results

Linearity Assumption
What does it mean?
• The dependent variable y is a linear function of the x’s
• Possible causes of violating this assumption:
– Inaccurate specification of the regression models
– Influential observations

What are the consequences?

• Biased estimates of intercept and regression coefficients
• Inaccurate prediction of y

Linearity Assumption (Cont.)
How to detect inaccurate specification of the
model?
• Plot y against x
• Plot residuals against x
• Plot residuals against yhat

SAS commands:
proc reg data=auto;
model price = length;
plot price*length;
plot rstudent.*length;
plot rstudent.*p. / noline;
run;
Linearity Assumption (Cont.)
Stata commands:
webuse auto.dta, clear
reg price length
predict r, rstudent
predict yhat, xb
scatter price length
scatter r length
scatter r yhat
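
Stata also has built-in postestimation plots that cover the same checks. The sketch below is an addition for illustration, assuming the auto data installed with Stata (sysuse) in place of webuse. Note that rvfplot and rvpplot draw ordinary residuals rather than the studentized residuals used above, and the // comments require a do-file.

```stata
* Minimal sketch: built-in residual plots for checking linearity.
sysuse auto, clear              // assumption: the auto data installed with Stata
regress price length
rvfplot, yline(0)               // residuals against fitted values
rvpplot length, yline(0)        // residuals against the predictor length
* A lowess smoother can make curvature easier to see than the raw scatter.
predict r, rstudent
lowess r length
```

If the linear specification is adequate, the points should scatter evenly around the zero line and the lowess curve should stay roughly flat.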

Linearity Assumption (Cont.)
[Figure: studentized residuals plotted against Length (in.)]
Linearity Assumption (Cont.)
Check for influential observations:
• Outliers:
If observations have standardized residuals above 2 or below -2, they may
be outliers.

• Observations with high leverage:

If an observation has leverage larger than (2k+2)/n, where k is the
number of predictors and n is the number of observations, it is
said to have high leverage.

• Observations with high impact on the regression coefficients:

Influential observations can be identified with Cook’s D, DFITS, or
DFBETA statistics. An observation is influential if
– its Cook’s D is larger than 4/n,
– the absolute value of its DFITS is larger than 2*sqrt(k/n), or
– the absolute value of its DFBETA is larger than 2/sqrt(n).
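
For the car price model used in this workshop (k = 3 predictors, n = 69 observations used), the cutoffs work out as follows. This is an illustrative sketch added here, using Stata's display command as a calculator; run it as one block from a do-file so that the local macros and // comments work.

```stata
* Cutoffs for the example model: k = 3 predictors, n = 69 observations used.
local k = 3
local n = 69
display "Leverage cutoff (2k+2)/n : " (2*`k' + 2)/`n'    // about 0.116
display "Cook D cutoff 4/n        : " 4/`n'              // about 0.058
display "DFITS cutoff 2*sqrt(k/n) : " 2*sqrt(`k'/`n')    // about 0.417
display "DFBETA cutoff 2/sqrt(n)  : " 2/sqrt(`n')        // about 0.241
```

These are rules of thumb rather than formal tests, so observations that exceed them are candidates for a closer look, not automatic deletions.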

Linearity Assumption (Cont.)
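SAS commands:
/* The libref in must first be assigned with a LIBNAME statement */
/* Save studentized residuals, leverage, Cook D, and DFFITS for each car */
proc reg data = in.auto;
  model price = weight length rep78;
  output out = in.outlier(keep = make price weight length rep78 r lever cooked dffit)
         rstudent = r h = lever cookd = cooked dffits = dffit;
run;
quit;

/* Outliers: absolute studentized residual above 2 */
proc print data = in.outlier;
  var make r;
  where abs(r) > 2 & r ~= .;
run;

/* High leverage: above (2k+2)/n with k = 3 and n = 69 */
proc print data = in.outlier;
  var make lever;
  where lever > (2*3+2)/69 & lever ~= .;
run;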
Linearity Assumption (Cont.)
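/* The INFLUENCE option requests DFFITS and DFBETAS; ODS saves them to a data set */
proc reg data = in.auto;
  model price = weight length rep78 / influence;
  ods output OutputStatistics = in.dfbetas;
  id make;
run;
quit;

/* Influence on the fitted values: absolute DFFITS above 2*sqrt(k/n) */
proc print data = in.dfbetas;
  var make DFFITS;
  where abs(DFFITS) > (2*sqrt(3/69)) & DFFITS ~= .;
run;

/* Influence on the weight coefficient: absolute DFBETAS above 2/sqrt(n) */
proc print data = in.dfbetas;
  var make DFB_Intercept DFB_weight DFB_length DFB_rep78;
  where abs(DFB_weight) > (2/sqrt(69)) & DFB_weight ~= .;
run;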
Linearity Assumption (Cont.)

Obs    make                 DFFITS
  2    Linc. Mark V         0.4797
  4    Cad. Eldorado        0.8512
  5    Linc. Versailles     0.5270
 15    AMC Pacer           -1.0048
 18    Volvo 260            0.5247
 39    Cad. Seville         1.0777
 49    Audi Fox             0.6182
 66    Plym. Arrow         -1.0159

Obs    make                Intercept     weight     length      rep78
  2    Linc. Mark V          -0.0130     0.2530    -0.1184     0.1010
  4    Cad. Eldorado          0.4435     0.4704    -0.4156    -0.4082
  5    Linc. Versailles       0.2646     0.4147    -0.3478     0.0204
 15    AMC Pacer             -0.8790    -0.9209     0.9525     0.0170
 39    Cad. Seville           0.6489     0.9956    -0.8688     0.1391
 49    Audi Fox              -0.2089    -0.5201     0.4191    -0.2670
 51    VW Dasher             -0.1254    -0.2461     0.1961     0.0210
 66    Plym. Arrow           -0.9049    -0.9223     0.9670     0.0298
Linearity Assumption (Cont.)
Stata commands:

reg price weight length rep78

predict r, rstudent
predict lever, leverage
predict cooked, cooksd
predict dfit, dfits
list make r if abs(r) > 2 & r ~=.
list make lever if lever > (2*3+2)/69 & lever ~=.
list make cooked if cooked >4/69 & cooked ~=.
list make dfit if abs(dfit)>2*sqrt(3/69) & dfit ~=.

dfbeta
list make _dfbeta_1 _dfbeta_2 _dfbeta_3 if abs(_dfbeta_1) > (2/sqrt(69)) & _dfbeta_1 ~=.

Linearity Assumption (Cont.)
. list make dfit if abs(dfit)>2*sqrt(3/69) & dfit ~=.

      make                      dfit
  2.  AMC Pacer            -1.004767
 12.  Cad. Eldorado         .8511783
 13.  Cad. Seville          1.077664
 27.  Linc. Mark V          .4797307
 28.  Linc. Versailles      .5269713
 42.  Plym. Arrow          -1.015867
 54.  Audi Fox              .6182262
 74.  Volvo 260             .5247175

. list make _dfbeta_1 _dfbeta_2 _dfbeta_3 if abs(_dfbeta_1) > (2/sqrt(69)) & _dfbeta_1 ~=.

      make                 _dfbeta_1    _dfbeta_2    _dfbeta_3
  2.  AMC Pacer            -.9209325     .9525123     .0170096
 12.  Cad. Eldorado           .47041    -.4156323    -.4082073
 13.  Cad. Seville          .9955547    -.8688278     .1390504
 27.  Linc. Mark V          .2530411     -.118375     .1010498
 28.  Linc. Versailles      .4147299    -.3477834     .0203597
 42.  Plym. Arrow          -.9222513     .9670225     .0297615
 54.  Audi Fox             -.5201173     .4191374    -.2670405
 70.  VW Dasher            -.2461434     .1960774     .0209733
Linearity Assumption (Cont.)
Solutions:
• Re-specify the model by mathematically transforming the x’s; e.g., for a
curvilinear relation, you can add the square of x.
– Log transformation, as in x’ = ln(x)
– Exponential transformation, the inverse of the logarithm, as in x’ = e^x
– Polynomial transformation, which uses powers of the variable, as in x’ = x^2, x’ = x^3, or
x’ = SQRT(x). This approach is used often in multiple regression.
– Rescale the x variable into a dummy (dichotomous) variable
• Restrict the range of x
• Identify the influential cases and examine whether they should be
included in the sample
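
The re-specification options above could be tried in Stata as follows. This is a minimal sketch added for illustration, assuming the auto data installed with Stata (sysuse). The variable names lnlength and long_car and the median cutoff for the dummy are hypothetical choices, not from the original slides. Run it from a do-file so that the // comments are accepted.

```stata
* Minimal sketch: three ways to re-specify price as a function of length.
sysuse auto, clear                          // assumption: the auto data installed with Stata

* Option 1: polynomial term, using factor-variable notation for length squared.
regress price c.length##c.length

* Option 2: log transformation of the predictor.
generate lnlength = ln(length)              // hypothetical variable name
regress price lnlength

* Option 3: rescale length into a dummy at the sample median (hypothetical cutoff).
summarize length, detail
generate byte long_car = length > r(p50) if !missing(length)
regress price long_car
```

Which option is appropriate depends on the shape seen in the residual plots; re-running the plots after each change shows whether the curvature has been removed.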

Mean Independence
What does it mean?
• The mean of the disturbance term is always 0 and does not depend
on the value of x’s.
• Possible causes of violating this assumption:
– omitted x variables: if any of the omitted variables is associated
with the x’s.
– reverse causation: if y influences the x’s, then ε is associated with the
x’s.
– measurement error in x: the measured x contains not only the true x but also
an error component, and this error component ends up in ε.

What are the consequences?

• Biased estimates of intercept and regression coefficients
• Inaccurate prediction of Y

Mean Independence (Cont.)
How to detect the violation?
Link test: regress y on the predicted value (yhat) and its square (yhat2).
If the current model is specified correctly, yhat2 should have no
significant association with the dependent variable once yhat is in the model.
Mean Independence (Cont.)
SAS commands for Link test:
proc reg data=auto;
model price = length;
output out=auto2 (keep= price length yhat) predicted=yhat;
run;
quit;
data auto3;
set auto2;
yhat2= yhat**2;
run;
proc reg data=auto3;
model price = yhat yhat2;
run;

Stata commands:
webuse auto.dta, clear
reg price length
predict yhat, xb
gen yhat2 = yhat*yhat
reg price yhat yhat2
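
Stata also provides the link test as a built-in command, so the manual steps above can be cross-checked. The sketch below is an addition for illustration, assuming the auto data installed with Stata (sysuse). The Ramsey RESET test (estat ovtest) is a related specification check that is not covered in the original slides and is included only as a suggestion. Run it from a do-file so that the // comments are accepted.

```stata
* Minimal sketch: built-in specification checks after the regression.
sysuse auto, clear      // assumption: the auto data installed with Stata
regress price length
estat ovtest            // Ramsey RESET test using powers of the fitted values
linktest                // refits price on _hat and _hatsq; a significant _hatsq suggests misspecification
```

In the linktest output, _hat plays the role of yhat and _hatsq the role of yhat2 in the manual version above.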

Mean Independence (Cont.)
Solutions:
• Use the existing literature to justify your model
• Use an experimental design to collect your data, which not only supports
the mean independence assumption but also avoids reverse
causation
• If you use survey design and have measures of relevant variables
that have not been included in the model, you can include these
variables in the model to reduce the possibility of violating this
assumption
• Use simultaneous equations to model reciprocal relations between
x’s and y
• Choose measures with high reliability or include measurement
models in regression analysis

Homoscedasticity
What does it mean?
• Homoscedasticity means that the variance of ε is the same across
all levels of x’s.
• Possible causes of violating this assumption.
– Improvement in data collection techniques: During the course of
data collection, the interviewers are getting better and less likely
to commit error in collecting data.
– Learning: Respondents are less likely to have errors in
answering the same questions when being interviewed in the
follow-up survey than in the baseline survey.
– Outliers
What are the consequences?
• Inefficiency: observations with larger disturbance variance contain
less information than observations with smaller disturbance
variance, but OLS weights them equally.
• Bias in standard errors, which can lead to incorrect conclusions.
Homoscedasticity (Cont.)
How to detect the violation?
• Plot residuals against X
• Plot residuals against Yhat
• White test
• Cameron & Trivedi's decomposition of IM-test
• Breusch-Pagan / Cook-Weisberg

SAS commands:
proc reg data=auto;
model price = length weight rep78/ spec;
run; quit;

Stata commands:
webuse auto.dta, clear
reg price length weight rep78
estat imtest, white
estat hettest
Homoscedasticity (Cont.)
Solutions:
• Re-specify the model or transform the dependent variable
• Use robust standard errors
• Use weighted least squares only if you know what weights to use
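
The first two solutions could look like this in Stata. This is a minimal sketch added for illustration, assuming the auto data installed with Stata (sysuse). The variable name lnprice is a hypothetical choice, and the log is only one possible transformation of the dependent variable. Weighted least squares is left out because it requires known weights. Run it from a do-file so that the // comments are accepted.

```stata
* Minimal sketch: two remedies for heteroscedasticity.
sysuse auto, clear                                  // assumption: the auto data installed with Stata

* Remedy 1: keep the OLS coefficients but report robust standard errors.
regress price length weight rep78, vce(robust)

* Remedy 2: transform the dependent variable, then test again.
generate lnprice = ln(price)                        // hypothetical variable name
regress lnprice length weight rep78
estat hettest                                       // Breusch-Pagan / Cook-Weisberg test on the new model
```

Robust standard errors change only the standard errors, t values, and p-values; the coefficient estimates are the same as in the ordinary regression.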

Uncorrelated Disturbances
What does it mean?
• The disturbances for any two individuals must be
uncorrelated.
• Possible causes of violating this assumption
– Sample design: simple random sampling is not likely to cause this
problem, but a cluster sampling is.
– The selection of unit of analysis, e.g., the couple
– The use of panel data

What are the consequences?

• Inefficient estimates
• Downward bias in estimated standard errors, which means that
there will be a tendency to conclude that relations exist when they
really don’t.
Uncorrelated Disturbances (Cont.)
How to detect the violation?
• Calculate the residuals for all respondents and then examine
correlations between the residuals of suspected groups of
respondents
• Intra-class correlation

Solutions:
• Include the cluster variables into the models as a control
• Use the cluster option in the regression analysis
• Use regression that can control for correlations among observations,
for example, Hierarchical Linear Model

Uncorrelated Disturbances (Cont.)
Solutions:
• Including the correlations among respondents into the regression
models

SAS commands:
proc genmod data=auto;
class foreign;
model price = length weight rep78;
repeated subject=foreign / type=ind ;
run;

Stata commands:
reg price length rep78 weight, cluster(foreign)
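
The detection and correction steps can be combined in Stata as sketched below. This is an addition for illustration, assuming the auto data installed with Stata (sysuse). It treats foreign as the cluster variable only because the slides do; with just two clusters the clustered standard errors are not trustworthy, so the example shows the syntax rather than a sound analysis. Run it from a do-file so that the // comments are accepted.

```stata
* Minimal sketch: check for within-group correlation, then cluster the standard errors.
sysuse auto, clear                                        // assumption: the auto data installed with Stata
regress price length rep78 weight
predict res if e(sample), residuals                       // residuals for the cars used in the model
loneway res foreign                                       // intraclass correlation of the residuals
regress price length rep78 weight, vce(cluster foreign)   // cluster-robust standard errors
```

An intraclass correlation near zero suggests that the residuals are not grouped by the cluster variable; a clearly positive value suggests that the ordinary standard errors are too small.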

Normality
What does it mean?
• The disturbance term ε needs to be normally distributed, but the x’s and y
do not.
– Positive Skewness
– Negative Skewness
– Positive Kurtosis
– Negative Kurtosis
• Possible causes of violation of this assumption
– The true distribution of the variable, e.g., some variables follow a binomial or
Poisson distribution.
– Measurement artifacts
– Inadequate sample

What are the consequences?

• When the sample is small (e.g., below 100), the violation
of this assumption leads to inaccurate estimates of confidence
intervals and p-values. As the sample gets larger, the central limit
theorem suggests that we can get fairly accurate confidence
intervals and p-values.
Normality (Cont.)
How to detect the violation:
• Graphic methods: Stem-and-leaf plot, (skeletal) box plot,
dot plot, histogram
• Shapiro-Wilk W test for normality

Normality (Cont.)
SAS Commands:
proc reg data=auto;
model price= length weight rep78;
output out=auto2 (keep= price length weight rep78 res yhat)
residual=res predicted=yhat;
run;

proc univariate data=auto2 normal;

var res;
qqplot res / normal(mu=est sigma=est);
run;

Stata commands:
reg price length rep78 weight
predict res, residuals
swilk res
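
The graphic methods listed earlier can be produced in Stata alongside the formal test. The sketch below is an addition for illustration, assuming the auto data installed with Stata (sysuse). The skewness and kurtosis test (sktest) is not mentioned in the original slides and is included only as a complementary check. Run it from a do-file so that the // comments are accepted.

```stata
* Minimal sketch: graphical and formal checks of residual normality.
sysuse auto, clear                     // assumption: the auto data installed with Stata
regress price length rep78 weight
predict res if e(sample), residuals    // residuals for the cars used in the model
histogram res, normal                  // histogram with a normal curve overlaid
qnorm res                              // residual quantiles against normal quantiles
swilk res                              // Shapiro-Wilk W test
sktest res                             // skewness and kurtosis test
```

In the qnorm plot, points that bend away from the diagonal line at either end indicate skewness or heavy tails in the residuals.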
Normality (Cont.)

Solutions:
• Using larger samples
• Using conservative p-values (e.g., using 0.01 rather than
0.05)

Conclusions
• Regression analysis is the most commonly used technique
in the social sciences
• To use regression analysis accurately, you need to check
for possible violations of the regression assumptions
• Other useful resources for learning to conduct regression analysis
– http://www.ats.ucla.edu/stat/sas/webbooks/reg/default.htm
– http://www.ats.ucla.edu/stat/stata/webbooks/reg/
– http://www.indiana.edu/~statmath/stat/all/panel/
– http://dss.princeton.edu/online_help/analysis/regression_intro.htm

• If you have any questions about running regression
analysis, CFDR provides programming support. Please
feel free to contact Hsueh-Sheng Wu at 372-3119 or
[email protected]
