Running a regression model and interpreting the results:
title 'Simple Linear Regression';
data Class;
   input Name $ Height Weight Age @@;
   datalines;
Alfred  69.0 112.5 14  Alice  56.5  84.0 13  Barbara 65.3  98.0 13
Carol   62.8 102.5 14  Henry  63.5 102.5 14  James   57.3  83.0 12
Jane    59.8  84.5 12  Janet  62.5 112.5 15  Jeffrey 62.5  84.0 13
John    59.0  99.5 12  Joyce  51.3  50.5 11  Judy    64.3  90.0 14
Louise  56.3  77.0 12  Mary   66.5 112.0 15  Philip  72.0 150.0 16
Robert  64.8 128.0 12  Ronald 67.0 133.0 15  Thomas  57.5  85.0 11
William 66.5 112.0 15
;
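/* Turn on ODS Graphics so PROC REG produces the fit and diagnostic plots */
ods graphics on;

/* Regress Weight on Height using the Class data set created above */
proc reg data=Class;
   model Weight = Height;
run;
quit;   /* PROC REG is interactive; QUIT ends the procedure */

ods graphics off;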
The $F$ statistic for the overall model is highly significant ($F = 57.076$,
$p < 0.0001$), indicating that the model explains a significant portion of the
variation in the data.
The degrees of freedom can be used in checking the accuracy of the data and
model. The model degrees of freedom are one less than the number of parameters
to be estimated. This model estimates two parameters, $\beta_0$ and $\beta_1$;
thus, the model degrees of freedom should be $2 - 1 = 1$. The corrected total
degrees of freedom are always one less than the total number of observations
in the data set, in this case $19 - 1 = 18$.
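As a worked check of the degrees of freedom described above, the arithmetic can be written out as follows. The symbols $n$ (number of observations, 19 here) and $p$ (number of estimated parameters, 2 here: intercept and slope) are introduced only for this illustration.

$$
\begin{aligned}
\mathrm{DF}_{\text{model}} &= p - 1 = 2 - 1 = 1,\\
\mathrm{DF}_{\text{total}} &= n - 1 = 19 - 1 = 18,\\
\mathrm{DF}_{\text{error}} &= \mathrm{DF}_{\text{total}} - \mathrm{DF}_{\text{model}} = 18 - 1 = 17.
\end{aligned}
$$

If the ANOVA table reports different values, a likely cause is that an observation was dropped (for example, because of a missing value) or that the model contains a different number of terms than intended.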
Several simple statistics follow the ANOVA table. The Root MSE is an estimate
of the standard deviation of the error term. The coefficient of variation, or
Coeff Var, is a unitless expression of the variation in the data. The R-square
and Adj R-square are two statistics used in assessing the fit of the model;
values close to 1 indicate a better fit. The R-square of 0.77 indicates that
Height accounts for 77% of the variation in Weight.
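The summary statistics above can be related to the ANOVA sums of squares as sketched below. The notation is assumed for this illustration and does not appear in the original text: $\mathrm{SSE}$ is the error sum of squares, $\mathrm{SST}$ the corrected total sum of squares, $\bar{y}$ the mean of Weight, $n = 19$ the number of observations, and $p = 2$ the number of parameters in a model with an intercept.

$$
\begin{aligned}
\text{Root MSE} &= \sqrt{\frac{\mathrm{SSE}}{n - p}}, &
\text{Coeff Var} &= 100 \times \frac{\text{Root MSE}}{\bar{y}},\\
R^2 &= 1 - \frac{\mathrm{SSE}}{\mathrm{SST}}, &
R^2_{\text{adj}} &= 1 - \left(1 - R^2\right)\frac{n - 1}{n - p}.
\end{aligned}
$$

With the rounded value $R^2 = 0.77$, the adjustment gives $1 - 0.23 \times 18/17 \approx 0.76$, only slightly below the unadjusted value because the model has a single regressor.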
The "Parameter Estimates" table in Figure 73.2 contains the estimates of
$\beta_0$ and $\beta_1$. The table also contains the $t$ statistics and the
corresponding $p$-values for testing whether each parameter is significantly
different from zero. The $p$-values ($t = -4.43$, $p = 0.0004$ and $t = 7.55$,
$p < 0.0001$) indicate that the intercept and Height parameter estimates,
respectively, are highly significant.
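For reference, each $t$ statistic in the table is the estimate divided by its standard error, tested against a $t$ distribution with the error degrees of freedom (17 for 19 observations and two parameters). The hat and $\mathrm{SE}$ notation below is assumed for this illustration. In a simple linear regression, the square of the slope's $t$ statistic equals the overall $F$ statistic, which offers a quick consistency check against the ANOVA table:

$$
t_j = \frac{\hat{\beta}_j}{\mathrm{SE}(\hat{\beta}_j)}, \qquad
t_1^2 = F \;\Longrightarrow\; |t_1| = \sqrt{57.076} \approx 7.55 .
$$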
A trend in the residuals would indicate nonconstant variance in the data. The
plot of residuals by predicted values in the upper-left corner of the
diagnostics panel in Figure 73.4 might indicate a slight trend in the residuals;
they appear to increase slightly as the predicted values increase. A
fan-shaped trend might indicate the need for a variance-stabilizing
transformation. A curved trend (such as a semicircle) might indicate the need
for a quadratic term in the model. Since these residuals show no pronounced
trend, the analysis is considered to be acceptable.
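Had the residual plot shown a curved or fan-shaped pattern, the two remedies mentioned above could be tried along the following lines. This is a minimal sketch, not part of the original example: the data set name ClassAlt and the variables HeightSq and LogWeight are hypothetical, and the log transformation is only one common choice of variance-stabilizing transformation.

```sas
/* Hypothetical derived data set: add a squared term and a log response */
data ClassAlt;
   set Class;
   HeightSq  = Height * Height;   /* quadratic term for a curved trend   */
   LogWeight = log(Weight);       /* natural log, assumed transformation */
run;

ods graphics on;

proc reg data=ClassAlt;
   /* Labeled MODEL statements keep the two fits apart in the output */
   Quadratic: model Weight    = Height HeightSq;
   LogScale:  model LogWeight = Height;
run;
quit;

ods graphics off;
```

The residual-by-predicted plot of each refit can then be compared with the original one; a remedy is worth keeping only if the pattern it targets actually disappears.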