Part II - Statistical Methods

INTRODUCTION

The correlation and regression coefficients discussed earlier measure the degree of the effect of one variable on another. While it is useful to know how one phenomenon is affected by another, it is also important to know how one phenomenon is influenced by other variables. The relationship tends to be complex rather than simple. One variable is related to a great number of others, many of which may be interrelated among themselves. For example, the yield of rice is affected by the type of soil, temperature, amount of rainfall, etc. Whether phenomena are biological, physical, chemical or economic, they are affected by a multiplicity of causal factors. It is a part of the statistician's task to determine the effect of one cause, or of two or more causes acting separately or simultaneously, when the effect of others is estimated. This is done with the help of multiple and partial correlation analysis.

The basic distinction between multiple and partial correlation analysis is that whereas in the former we measure the degree of relationship between the variable Y and all the variables X1, X2, ..., Xn taken together, in the latter we measure the degree of relationship between Y and one of the variables X1, X2, ..., Xn with the effect of all the other variables removed.
PARTIAL CORRELATION

It is often important to measure the correlation between a dependent variable and one particular independent variable when all other variables involved are kept constant, i.e., when the effects of all other variables are removed (often indicated by the phrase "other things being equal"). This can be obtained by calculating the coefficient of partial correlation. For example, if we have three variables, yield of wheat, amount of rainfall and temperature, and if we limit our analysis of yield and rainfall to periods when a certain average daily temperature existed, or if we treat the problem mathematically in such a way that changes in temperature are allowed for, the problem becomes one of partial correlation. Thus, partial correlation analysis measures the strength of the relationship between Y and one independent variable in such a way that variations in the other independent variables are taken into account. A partial correlation coefficient is analogous to a partial regression coefficient in that all other factors are "held constant". Simple correlation, on the other hand, ignores the effect of all other variables even though these variables might be quite closely related to the independent variable, or to one another.

Partial Correlation Coefficient


The partial correlation coefficient provides a measure of the relationship between the dependent variable and a particular independent variable, with the effect of the rest of the variables eliminated.

If we denote by r12.3 the coefficient of partial correlation between X1 and X2 keeping X3 constant, we find that

r12.3 = (r12 - r13 r23) / √[(1 - r13²)(1 - r23²)]

Similarly,

r13.2 = (r13 - r12 r23) / √[(1 - r12²)(1 - r23²)]

r23.1 = (r23 - r12 r13) / √[(1 - r12²)(1 - r13²)]

where r13.2 is the coefficient of partial correlation between X1 and X3 keeping X2 constant, and r23.1 is the coefficient of partial correlation between X2 and X3 keeping X1 constant.
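As a minimal sketch, the first of these formulas can be computed directly in Python; the numbers below reuse the zero-order coefficients that appear in Illustration 6 later in this chapter (r12 = 0.98, r13 = 0.44, r23 = 0.54):

```python
from math import sqrt

def partial_r(r12, r13, r23):
    """First-order partial correlation r12.3: the correlation of X1 and X2
    with X3 held constant, computed from the three zero-order coefficients."""
    return (r12 - r13 * r23) / sqrt((1 - r13**2) * (1 - r23**2))

r12_3 = partial_r(0.98, 0.44, 0.54)
print(round(r12_3, 3))  # 0.982
```

The other two partials follow from the same function by permuting the arguments.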


Thus, for three variables X1, X2 and X3, there will be three partial correlation coefficients, each studying the relationship between two variables when the third is held constant. A partial correlation coefficient helps us to answer questions such as this: Is the correlation between, say, X1 and X2 merely due to the fact that both are affected by X3, or is there a net association between X1 and X2 over and above the association due to the common influence of X3? Thus, in determining a partial correlation coefficient between X1 and X2, we attempt to eliminate the influence of X3 from each of the two variables, so as to ascertain whether a relationship exists between the "unexplained" residuals that remain.
It should be noted that the value of a partial correlation coefficient is always interpreted via the corresponding coefficient of partial determination, i.e., by squaring the partial correlation coefficient. Thus, if X1, X2 and X3 represent sales, advertisement expenditure and price respectively, and we get r²12.3 = 0.912, it means that more than 91 per cent of the variation in X1 (sales) that is not associated with X3 (price) is associated with X2 (advertisement expenditure).
MULTIPLE CORRELATION
In problems of multiple correlation, we are dealing with situations that involve three or more variables. For example, we may consider the association between the yield of wheat per acre and both the amount of rainfall and the average daily temperature. We are trying to make estimates of the value of one of these variables based on the values of all the others. The variable whose value we are trying to estimate is called the dependent variable, and the variables on which our estimates are based are known as independent variables. The statistician himself chooses which variable is to be dependent and which variables are to be independent; it is merely a question of the problem being studied. If we are trying to determine the probable weight of men, we make weight the dependent variable and height, age, etc., the independent variables. If, on the other hand, we are interested in estimating height, we will make height the dependent variable and weight, age, etc., the independent variables. Thus, in problems of multiple correlation, we always have three or more variables (one dependent and the others independent). In order that we may distinguish them easily, we follow the custom of representing them by the letter X with subscripts. The dependent variable is always denoted by X1 and the others by X2, X3, etc. Thus, in the height, age and weight problem, if we are trying to estimate men's weight (that is, if weight be the dependent variable), we might denote:

X1 → weight in lbs.
X2 → height in inches.
X3 → age in years.
Coefficient of Multiple Correlation

The coefficient of multiple linear correlation is represented by R, and it is common to add subscripts designating the variables involved. Thus, R1.234 would represent the coefficient of multiple linear correlation between X1 on the one hand and X2, X3 and X4 on the other. The subscript of the dependent variable is always to the left of the point.

The coefficient of multiple correlation can be expressed in terms of r12, r13 and r23 as follows:

R1.23 = √[(r12² + r13² - 2 r12 r13 r23) / (1 - r23²)]

R2.13 = √[(r12² + r23² - 2 r12 r13 r23) / (1 - r13²)]

R3.12 = √[(r13² + r23² - 2 r12 r13 r23) / (1 - r12²)]

It should be noted that R1.23 is the same as R1.32.
A coefficient of multiple correlation such as R1.23 lies between 0 and 1. The closer it is to 1, the better is the linear relationship between the variables; the closer it is to 0, the worse is the linear relationship. If the coefficient of multiple correlation is 1, the correlation is called perfect. A coefficient of 0 indicates no linear relationship between the variables; it is possible, however, that a non-linear relationship may exist. It should be noted that whereas the coefficients of simple correlation range from +1.0 through 0 to -1.0, the coefficients of multiple correlation range from +1.0 to 0. Since some of the individual variables may be positively correlated with the dependent variable and others negatively correlated, no purpose would be served in distinguishing between a positive and a negative value of R.

*Unless a curvilinear regression equation is used, the coefficient of multiple correlation is called the coefficient of linear multiple correlation. Unless otherwise specified, whenever we refer to multiple correlation, we shall imply linear multiple correlation.
By squaring R1.23, we obtain the coefficient of multiple determination.

An alternative formula for obtaining the value of R1.23 is as follows:

R1.23 = √[1 - (1 - r12²)(1 - r13.2²)]    or    R²1.23 = 1 - (1 - r12²)(1 - r13.2²)

Similarly,

R2.13 = √[1 - (1 - r12²)(1 - r23.1²)]

and

R3.12 = √[1 - (1 - r13²)(1 - r23.1²)]

To determine a multiple correlation coefficient with three independent variables, the following formula will be used:

R²1.234 = 1 - (1 - r12²)(1 - r13.2²)(1 - r14.23²)
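The formula for R1.234 needs a first-order partial (r13.2) and a second-order partial (r14.23). A minimal Python sketch of this build-up follows; all six zero-order coefficients below are invented for illustration only (they are not from the text), and the second-order partial is obtained by applying the first-order formula to partials that already hold X2 constant:

```python
from math import sqrt

def partial(rxy, rxz, ryz):
    """First-order partial correlation r_xy.z from zero-order coefficients."""
    return (rxy - rxz * ryz) / sqrt((1 - rxz**2) * (1 - ryz**2))

# Hypothetical zero-order correlations among X1..X4 (illustrative values only)
r12, r13, r14 = 0.8, 0.6, 0.5
r23, r24, r34 = 0.4, 0.3, 0.2

r13_2 = partial(r13, r12, r23)  # X2 held constant
# Second-order partial r14.23: apply the same recursion to the
# first-order partials that hold X2 constant, now holding X3 as well.
r14_2 = partial(r14, r12, r24)
r34_2 = partial(r34, r23, r24)
r14_23 = (r14_2 - r13_2 * r34_2) / sqrt((1 - r13_2**2) * (1 - r34_2**2))

R2_1234 = 1 - (1 - r12**2) * (1 - r13_2**2) * (1 - r14_23**2)
```

Since each factor (1 - r²) is at most 1, R²1.234 can never be smaller than r12² alone: adding independent variables never reduces the coefficient of multiple determination.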
Illustration 6: The following zero-order correlation coefficients are given:

r12 = 0.98, r13 = 0.44 and r23 = 0.54.

Calculate the multiple correlation coefficient treating the first variable as dependent and the second and third variables as independent. (M.Com.)

Solution: We have to calculate the multiple correlation coefficient treating the first variable as dependent and the second and third variables as independent, i.e., we have to find R1.23.

R1.23 = √[(r12² + r13² - 2 r12 r13 r23) / (1 - r23²)]

Substituting the given values, we have

R1.23 = √[((0.98)² + (0.44)² - 2(0.98)(0.44)(0.54)) / (1 - (0.54)²)]
      = √[(0.9604 + 0.1936 - 0.4657) / 0.7084] = 0.986
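The arithmetic above can be checked with a short Python sketch, which also verifies that the alternative formula R²1.23 = 1 - (1 - r12²)(1 - r13.2²) gives the same answer:

```python
from math import sqrt

r12, r13, r23 = 0.98, 0.44, 0.54

# Direct formula for the multiple correlation coefficient R1.23
R_direct = sqrt((r12**2 + r13**2 - 2 * r12 * r13 * r23) / (1 - r23**2))

# Alternative route via the partial coefficient r13.2
r13_2 = (r13 - r12 * r23) / sqrt((1 - r12**2) * (1 - r23**2))
R_alt = sqrt(1 - (1 - r12**2) * (1 - r13_2**2))

print(round(R_direct, 3), round(R_alt, 3))  # both 0.986
```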

Advantages of Multiple Correlation Analysis

The coefficient of multiple correlation serves the following purposes:

1. It serves as a measure of the degree of association between one variable taken as the dependent variable and a group of other variables taken as the independent variables.
2. It also serves as a measure of goodness of fit of the calculated plane of regression and, consequently, as a measure of the general degree of accuracy of estimates made by reference to the equation for the plane of regression.

R²1.23 = Explained variation / Total variation
Limitations of Multiple Correlation Analysis

1. Multiple correlation analysis is based on the assumption that the relationship between the variables is linear. In other words, the rate of change in one variable in terms of another is assumed to be constant for all values. In practice, most relationships are not linear but follow some other pattern. This limits somewhat the use of multiple correlation analysis. The linear regression coefficients are not accurately descriptive of curvilinear data.
2. A second important limitation is the assumption that the effects of the independent variables on the dependent variable are separate, distinct and additive. When the effects are additive, a given change in one independent variable has the same effect on the dependent variable regardless of the sizes of the other independent variables.
3. Further, multiple correlation involves a great deal of work relative to the results frequently obtained. When the results are obtained, only a few students, well-trained in the method, are able to interpret them. The misuse of correlation has probably led to more doubt on the results of the method than is justified. However, this lack of understanding and the resulting misuse are due to the complexity of the method.
MULTIPLE REGRESSION ANALYSIS

Multiple Regression and Correlation Analysis

Multiple regression analysis represents a logical extension of two-variable regression analysis. Instead of a single independent variable, two or more independent variables are used to estimate the values of a dependent variable. However, the fundamental concept in the analysis remains the same. The following are the three main objectives of multiple regression and correlation analysis:

1. To derive an equation which provides estimates of the dependent variable from values of the two or more independent variables.
2. To obtain a measure of the error involved in using this regression equation as a basis for estimation.
3. To obtain a measure of the proportion of variance in the dependent variable accounted for, or "explained by", the independent variables.

The first purpose is accomplished by deriving an appropriate regression equation by the method of least squares. The second purpose is achieved through the calculation of a standard error of estimate. The third purpose is accomplished by computing the multiple coefficient of determination.
The multiple regression equation: The multiple regression equation describes the average relationship between these variables, and this relationship is used to predict or control the dependent variable. A regression equation is an equation for estimating a dependent variable, say, X1, from the independent variables X2, X3, ... and is called a regression equation of X1 on X2, X3, .... In functional notation, this is sometimes written briefly as X1 = F(X2, X3, ...), read as "X1 is a function of X2, X3, and so on."

In the case of three variables, the regression equation of X1 on X2 and X3 has the form

X1.23 = a1.23 + b12.3 X2 + b13.2 X3

where X1.23 is the computed or estimated value of the dependent variable and X2, X3 are the independent variables. The constant a1.23 is the intercept made by the regression plane; it gives the value of the dependent variable when all the independent variables assume a value equal to zero. b12.3 and b13.2 are the partial regression coefficients.
Construction of a regression equation

Illustration: Find the multiple linear regression equation of X1 on X2 and X3 from the data relating to three variables given below: (M.Com., Delhi)

X1:  4   6   7   9  13  15
X2: 15  12   8   6   4   3
X3: 30  24  20  14  10   4

Solution: The regression equation of X1 on X2 and X3 is

X1 = a1.23 + b12.3 X2 + b13.2 X3

The values of the constants a1.23, b12.3 and b13.2 are obtained by solving the following three normal equations:

ΣX1 = N a1.23 + b12.3 ΣX2 + b13.2 ΣX3
ΣX1X2 = a1.23 ΣX2 + b12.3 ΣX2² + b13.2 ΣX2X3
ΣX1X3 = a1.23 ΣX3 + b12.3 ΣX2X3 + b13.2 ΣX3²
CALCULATING THE REQUIRED VALUES

X1   X2   X3   X1X2  X1X3  X2X3   X2²    X3²    X1²
 4   15   30    60   120   450    225    900     16
 6   12   24    72   144   288    144    576     36
 7    8   20    56   140   160     64    400     49
 9    6   14    54   126    84     36    196     81
13    4   10    52   130    40     16    100    169
15    3    4    45    60    12      9     16    225

ΣX1=54  ΣX2=48  ΣX3=102  ΣX1X2=339  ΣX1X3=720  ΣX2X3=1,034  ΣX2²=494  ΣX3²=2,188  ΣX1²=576
Putting the values in the normal equations:

6 a1.23 + 48 b12.3 + 102 b13.2 = 54            ...(i)
48 a1.23 + 494 b12.3 + 1,034 b13.2 = 339       ...(ii)
102 a1.23 + 1,034 b12.3 + 2,188 b13.2 = 720    ...(iii)

Multiplying Eqn. (i) by 8, we get

48 a1.23 + 384 b12.3 + 816 b13.2 = 432         ...(iv)

Subtracting Eqn. (iv) from Eqn. (ii), we get

110 b12.3 + 218 b13.2 = -93                    ...(v)

Multiplying Eqn. (i) by 17, we get

102 a1.23 + 816 b12.3 + 1,734 b13.2 = 918      ...(vi)

Subtracting Eqn. (vi) from Eqn. (iii), we get

218 b12.3 + 454 b13.2 = -198                   ...(vii)

Multiplying Eqn. (v) by 109, we obtain

11,990 b12.3 + 23,762 b13.2 = -10,137          ...(viii)

Multiplying Eqn. (vii) by 55, we get

11,990 b12.3 + 24,970 b13.2 = -10,890          ...(ix)

Subtracting Eqn. (viii) from Eqn. (ix), we get

1,208 b13.2 = -753
b13.2 = -753/1,208 = -0.623

Substituting the value of b13.2 in Eqn. (v), we get

110 b12.3 + 218(-0.623) = -93
110 b12.3 = 135.814 - 93 = 42.814
b12.3 = 42.814/110 = 0.389

Substituting the values of b12.3 and b13.2 in Eqn. (i), we get

6 a1.23 + 48(0.389) + 102(-0.623) = 54
6 a1.23 = 54 + 63.546 - 18.672 = 98.874
a1.23 = 16.479

Thus, the required regression equation is:

X1 = 16.479 + 0.389 X2 - 0.623 X3
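As a cross-check, the same normal equations can be set up from the raw data and solved programmatically. The following is a minimal pure-Python sketch (no external libraries; the helper names solve3 and S are mine, not from the text):

```python
def solve3(A, y):
    """Solve a 3x3 linear system A x = y by Gaussian elimination
    with partial pivoting (pure Python)."""
    n = 3
    M = [row[:] + [y[i]] for i, row in enumerate(A)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))) / M[r][r]
    return x

def S(u, v=None):
    """Sum of one column, or sum of elementwise products of two columns."""
    return sum(u) if v is None else sum(a * b for a, b in zip(u, v))

# Data from the illustration: X1 (dependent), X2 and X3 (independent)
X1 = [4, 6, 7, 9, 13, 15]
X2 = [15, 12, 8, 6, 4, 3]
X3 = [30, 24, 20, 14, 10, 4]
N = len(X1)

# Normal equations for X1 = a + b12.3*X2 + b13.2*X3
A = [[N,     S(X2),     S(X3)],
     [S(X2), S(X2, X2), S(X2, X3)],
     [S(X3), S(X2, X3), S(X3, X3)]]
y = [S(X1), S(X1, X2), S(X1, X3)]
a, b12_3, b13_2 = solve3(A, y)
print(round(a, 3), round(b12_3, 3), round(b13_2, 3))
# close to the hand-computed 16.479, 0.389, -0.623 (which used rounded steps)
```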
Coefficient of Multiple Determination

The coefficient of multiple determination is analogous to the coefficient of determination in the two-variable case. As explained earlier, the fit of a straight line to the two-variable scatter is measured by the simple coefficient of determination, which was defined as the ratio of the explained sum of squares to the total sum of squares. In the same fashion, we can define the coefficient of multiple determination, which is denoted by R². Symbolically:

R² = SSR/SST = 1 - SSE/SST

Similar to the case of r² and r in two-variable analysis, R² is easier to interpret than R, since R² is a percentage figure whereas R is not. As a ratio of explained variation to the total variation in X1, R² can be interpreted as the proportion of the total variation in the dependent variable that is associated with, or explained by, the regression of X1 on X2 and X3. We may also think of R² as a measure of closeness of the fit of the regression plane to the actual points: the closer the value of R² to 1, the smaller is the scatter of the points about the regression plane and the better is the fit.

The square root of the coefficient of multiple determination is called the coefficient of multiple correlation, denoted as R. This measure is seldom used in practice.
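For the regression illustration earlier in this chapter, R² can be computed directly from this definition. A minimal sketch, reusing that illustration's data and its fitted equation X1 = 16.479 + 0.389 X2 - 0.623 X3:

```python
X1 = [4, 6, 7, 9, 13, 15]
X2 = [15, 12, 8, 6, 4, 3]
X3 = [30, 24, 20, 14, 10, 4]

# Fitted values from the regression equation derived in the illustration
pred = [16.479 + 0.389 * x2 - 0.623 * x3 for x2, x3 in zip(X2, X3)]

mean = sum(X1) / len(X1)
sst = sum((x - mean) ** 2 for x in X1)             # total sum of squares
sse = sum((x - p) ** 2 for x, p in zip(X1, pred))  # error sum of squares
r_squared = 1 - sse / sst
print(round(r_squared, 3))  # 0.968
```

So about 97 per cent of the variation in X1 is explained by the regression plane on X2 and X3.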
