Factor Analysis - SPSS
Take a look at the initial communalities (for each variable, this is the R² for
predicting that variable from an optimally weighted linear combination of the remaining
variables). Recall that they were all 1s for the principal components analysis we did
earlier, but now each is less than 1. If we sum these communalities we get 5.675. We
started with 7 units of standardized variance and we have now reduced that to 5.675
units of standardized variance (by eliminating unique variance).
Communalities

            Initial   Extraction
COST         .738       .745
SIZE         .912       .914
ALCOHOL      .866       .866
REPUTAT      .499       .385
COLOR        .922       .892
AROMA        .857       .896
TASTE        .881       .902

Extraction Method: Principal Axis Factoring.
For an iterated principal axis solution SPSS first estimates communalities with
squared multiple correlations (R²s) and then conducts the analysis. It then takes the
communalities from that first analysis, inserts them into the main diagonal of the
correlation matrix in place of the R²s, and does the analysis again. The variables' SSLs
from this second solution are then inserted into the main diagonal, replacing the
communalities from the previous iteration, and so on, until the change from one
iteration to the next is trivial.
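If you are curious what that iteration looks like computationally, here is a minimal
sketch in Python/NumPy (this is my own illustration, not SPSS's code; the function name
and defaults are mine), assuming the 7 x 7 correlation matrix is available as a NumPy
array R:

```python
import numpy as np

def iterated_paf(R, n_factors=2, tol=1e-6, max_iter=100):
    """Iterated principal axis factoring of a correlation matrix R (rough sketch)."""
    R = np.asarray(R, dtype=float)
    # Initial communality estimates: each variable's squared multiple correlation
    # with the remaining variables, 1 - 1/diag(R^-1).
    h2 = 1.0 - 1.0 / np.diag(np.linalg.inv(R))
    for _ in range(max_iter):
        R_reduced = R.copy()
        np.fill_diagonal(R_reduced, h2)          # put communalities on the diagonal
        eigval, eigvec = np.linalg.eigh(R_reduced)
        idx = np.argsort(eigval)[::-1]           # largest eigenvalues first
        eigval, eigvec = eigval[idx], eigvec[:, idx]
        loadings = eigvec[:, :n_factors] * np.sqrt(np.clip(eigval[:n_factors], 0, None))
        new_h2 = (loadings ** 2).sum(axis=1)     # each variable's SSL = its communality
        if np.max(np.abs(new_h2 - h2)) < tol:    # stop when the change is trivial
            h2 = new_h2
            break
        h2 = new_h2
    return loadings, h2
```

The loadings returned here are unrotated; a varimax or promax rotation would be applied
to them afterwards.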
Look at the communalities after this iterative process and for a two-factor
solution. They now sum to 5.60. That is, 5.6/7 = 80% of the variance is common
variance and 20% is unique. Here you can see how we have packaged that common
variance into two factors after a varimax rotation:
Rotated Factor Matrix

             Factor
               1       2
TASTE        .950    -.022
AROMA        .946     .021
COLOR        .942     .068
SIZE         .073     .953
ALCOHOL      .030     .930
COST        -.046     .862
REPUTAT     -.431    -.447

Extraction Method: Principal Axis Factoring.
Rotation Method: Varimax with Kaiser Normalization.
Rotation converged in 3 iterations.
These loadings are very similar to those we obtained previously with a principal
components analysis.
Below are the pattern matrix and the structure matrix for a promax (oblique) rotation of
the same two-factor solution, followed by the factor correlation matrix.

Pattern Matrix

             Factor
               1       2
TASTE        .955    -.071
AROMA        .949    -.028
COLOR        .943     .019
SIZE         .022     .953
ALCOHOL     -.021     .932
COST        -.093     .868
REPUTAT     -.408    -.426

Extraction Method: Principal Axis Factoring.
Rotation Method: Promax with Kaiser Normalization.
Rotation converged in 3 iterations.

Structure Matrix

             Factor
               1       2
TASTE        .947     .030
AROMA        .946     .072
COLOR        .945     .118
SIZE         .123     .956
ALCOHOL      .078     .930
COST        -.002     .858
REPUTAT     -.453    -.469

Extraction Method: Principal Axis Factoring.
Rotation Method: Promax with Kaiser Normalization.
Factor Correlation Matrix

Factor       1        2
1          1.000     .106
2           .106    1.000

Extraction Method: Principal Axis Factoring.
Rotation Method: Promax with Kaiser Normalization.
Notice that this solution is not much different from the previously obtained
varimax solution, so little was gained by allowing the factors to be correlated.
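As a check on how the two oblique matrices are related: the structure matrix is the
pattern matrix postmultiplied by the factor correlation matrix, S = PΦ. For TASTE, for
example, .955 + (.106)(-.071) ≈ .947 and (.106)(.955) + (-.071) ≈ .030, which reproduces
the TASTE row of the structure matrix above.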
SPSS will also compute factor scores for you, which you can then use in other
procedures. In the Factor Analysis window, click Scores and select Save As Variables,
Regression, and Display Factor Score Coefficient Matrix.
Factor Score Coefficient Matrix

             Factor
               1       2
COST         .026     .157
SIZE        -.066     .610
ALCOHOL      .036     .251
REPUTAT      .011    -.042
COLOR        .225    -.201
AROMA        .398     .026
TASTE        .409     .110

Extraction Method: Principal Axis Factoring.
Rotation Method: Varimax with Kaiser Normalization.
Factor Scores Method: Regression.
Look back at your data sheet. You will find that two columns have been added to
the right, one for scores on Factor 1 and another for scores on Factor 2.
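For those who want to see what the Regression scoring method is doing, here is a
minimal sketch (the function name is mine), assuming Z holds the standardized scores on
the observed variables and that you have the correlation matrix and the rotated loadings:

```python
import numpy as np

def regression_factor_scores(Z, R, loadings):
    """Regression-method factor score estimates (a sketch, not SPSS's code).

    Z        -- n x p array of standardized scores on the observed variables
    R        -- p x p correlation matrix of the observed variables
    loadings -- p x m matrix of correlations between variables and factors
                (the rotated loading matrix for an orthogonal rotation)
    """
    W = np.linalg.solve(R, loadings)   # scoring coefficients, W = R^-1 * loadings
    return Z @ W                       # n x m array of estimated factor scores
```

The matrix W computed this way should match, to within rounding, the Factor Score
Coefficient Matrix displayed above.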
SPSS also gives you a Factor Score Covariance Matrix. On the main diagonal of
this matrix are, for each factor, the R² between the factor and the observed variables.
This is treated as an indicator of the internal consistency of the solution. Values below
.70 are considered undesirable.
Factor Score Covariance Matrix

Factor       1       2
1           .966    .003
2           .003    .953

Extraction Method: Principal Axis Factoring.
Rotation Method: Varimax with Kaiser Normalization.
Factor Scores Method: Regression.
These squared multiple correlation coefficients are equal to the variance of the
factor scores:
Descriptive Statistics

            N        Mean      Variance
FAC1_1     220     .0000000      .966
FAC2_1     220     .0000000      .953
The input data included two variables (SES and Group) not included in the factor
analysis. Just for fun, try conducting a multiple regression predicting subjects' SES
from their factor scores, and also try using Student's t to compare the two groups' means
on the factor scores. Do note that the scores for factor 1 are not correlated with those
for factor 2. Accordingly, in the multiple regression the squared semipartial correlation
coefficients are identical to the squared zero-order correlation coefficients, and
R² = rY1² + rY2².
ANOVA

Model                Sum of Squares     df    Mean Square        F        Sig.
1   Regression           1320.821        2      660.410      4453.479     .000
    Residual               32.179      217         .148
    Total                1353.000      219

Predictors: (Constant), FAC2_1, FAC1_1
Dependent Variable: SES
Coefficients

                   Standardized
                   Coefficients                         Correlations
Model                  Beta          t        Sig.    Zero-order    Part
1   (Constant)                    134.810     .000
    FAC1_1             .681        65.027     .000        .679      .681
    FAC2_1            -.718       -68.581     .000       -.716     -.718

Dependent Variable: SES
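As a check on the identity noted above: R² ≈ .679² + (-.716)² ≈ .974, essentially the
1320.821/1353 ≈ .976 implied by the ANOVA table. The small difference is rounding plus
the very small (but nonzero) correlation between the two sets of factor scores.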
8
Group Statistics

          GROUP     N       Mean        Std. Deviation    Std. Error Mean
FAC1_1    1        121    -.4198775       .97383364          .08853033
          2         99     .5131836       .71714232          .07207552
FAC2_1    1        121     .5620465       .88340921          .08030993
          2         99    -.6869457       .55529938          .05580969
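If you want to verify the t test outside of SPSS, the summary statistics above are all
you need. Here is a sketch using scipy's summary-statistics t test (pooled variances
assumed), with the Factor 1 values from the table:

```python
from scipy.stats import ttest_ind_from_stats

# Pooled-variance t test comparing the two groups on the Factor 1 scores,
# using the means, SDs, and ns from the Group Statistics table above.
t, p = ttest_ind_from_stats(mean1=-0.4198775, std1=0.97383364, nobs1=121,
                            mean2=0.5131836, std2=0.71714232, nobs2=99,
                            equal_var=True)
print(t, p)
```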
Unit weights based simply on the loadings perform poorly under conditions of non-simple
structure and variable loadings, which are typical of the conditions most often found in
actual practice. Grice and Harris developed an alternative unit-weighting scheme which
produced factor scores that compared favorably with exact factor scores -- they based
the weightings on the factor score coefficients rather than on the loadings.
Grice's article extended the discussion to the case of oblique factor analysis,
where one could entertain several different sorts of unit-weighting schemes -- for
example, basing them on the pattern matrix (loadings, standardized regression
coefficients for predicting), the structure matrix (correlations of variables with factors), or
the factor score coefficients. Grice defined a variable as salient on a factor if it had a
weighting coefficient whose absolute value was at least 1/3 as large as that of the
variable with the largest absolute weighting coefficient on that factor. Salient items'
weights were replaced with +1 or -1, and nonsalient variables' weights with 0. The
results of his Monte Carlo study indicated that factor scores using this unit-weighting
scheme based on scoring coefficients performed better than those using various other
unit-weighting schemes and at least as well as exact factor scores (by most criteria
and under most conditions). He did note, however, that exact factor scores may be
preferred under certain circumstances -- for example, when using factor scores on the
same sample as that from which they were derived, especially when sample size is
relatively small. If we followed Grice's advice we would drop Reputation from both
subscales and Cost from the second subscale.
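Grice's salience rule is easy to apply yourself. Here is a minimal sketch (the function
name and NumPy implementation are mine) that takes a matrix of factor score coefficients
and returns the unit weights:

```python
import numpy as np

def unit_weights(score_coefs, threshold=1/3):
    """Grice-style unit weights built from a matrix of factor score coefficients.

    A variable is salient on a factor if the absolute value of its scoring
    coefficient is at least `threshold` times the largest absolute coefficient
    on that factor.  Salient variables are weighted +1 or -1 (the sign of the
    coefficient); nonsalient variables are weighted 0.
    """
    C = np.asarray(score_coefs, dtype=float)
    cutoffs = threshold * np.abs(C).max(axis=0)   # one cutoff per factor (column)
    salient = np.abs(C) >= cutoffs
    return np.where(salient, np.sign(C), 0.0)
```

Applied to the factor score coefficient matrix shown earlier, only color, aroma, and
taste pass the cut on Factor 1, and only size and alcohol pass it on Factor 2 (cost's
.157 and color's -.201 fall just short of the .203 cutoff there), which is why
Reputation drops from both subscales and Cost from the second.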
Cronbach's Alpha
If you have developed subscales such as the Aesthetic Quality and Cheap Drunk
subscales above, you should report an estimate of the reliability of each subscale. Test-retest
reliability can be employed if you administer the scale to the same persons twice, but usually
you will only want to administer it once to each person. Cronbach's alpha is an easy and
generally acceptable estimate of reliability.
Suppose that we are going to compute AQ (Aesthetic Quality) as color + taste + aroma
- reputat and CD as cost + size + alcohol - reputat. How reliable would such subscales be?
We conduct an item analysis to evaluate the reliability (and internal consistency) of each
subscale.
Before conducting the item analysis, we shall need to multiply the Reputation variable
by minus 1, since it is negatively weighted in the AQ and CD subscale scores: Transform,
Compute NegRep = -1*reputat.
Analyze, Scale, Reliability Analysis. Scoot color, aroma, taste, and NegRep into the
items box.
Reliability Statistics

Cronbach's Alpha    N of Items
      .886               4
Item-Total Statistics
Notice that NegRep is not as well correlated with the corrected total scores as are the
other items and that dropping it from the scale would increase the value of alpha considerably.
That might be enough to justify dropping the reputation variable from this subscale.
If you conduct an item analysis on the CD items you will find that alpha = .878 and that
it increases to .941 if Reputation is dropped from the scale.
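If you want to see what SPSS is computing here, this is a minimal sketch of Cronbach's
alpha (the function name is mine), assuming the item scores for color, aroma, taste, and
NegRep are the columns of a cases-by-items NumPy array:

```python
import numpy as np

def cronbach_alpha(items):
    """Cronbach's alpha for an n-cases x k-items array of item scores."""
    items = np.asarray(items, dtype=float)
    k = items.shape[1]
    item_variances = items.var(axis=0, ddof=1)      # variance of each item
    total_variance = items.sum(axis=1).var(ddof=1)  # variance of the summed scale
    return (k / (k - 1)) * (1 - item_variances.sum() / total_variance)

# Given the raw AQ item scores, cronbach_alpha(aq_items) should reproduce the
# .886 reported above (aq_items is a hypothetical array name).
```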
Each loading is classified as Positively Salient (Cattell used a criterion of > .10;
I'll use a higher cut, > .30), Negatively Salient (< -.30), or neither (HyperPlane). One
then constructs a third-order square [PS, HP, NS] matrix comparing Group 1 with Group
2. I'll abbreviate the contents of this table using these cell indices:
                   Group 1
               PS     HP     NS
Group 2   PS   11     12     13
          HP   21     22     23
          NS   31     32     33
The loading of X1 on F1 is PS for both groups, so it is counted in cell 11. Ditto
for X2. The loading of X3 on F1 is HP in both groups, so it is counted in cell 22. Ditto
for X4. The loading of X5 on F1 is NS in Group 1 but PS in Group 2, so it is counted in
cell 13.
Thus, the table for comparing Factor 1 in Group 1 with Factor 1 in Group 2 with
frequency counts inserted in the cells looks like this:
                   Group 1
               PS     HP     NS
Group 2   PS    2      0      1
          HP    0      2      0
          NS    0      0      0
The 1 in the upper right corner reflects the difference in the two patterns with
respect to X5. Counts in the main diagonal, especially in the upper left and the lower
right, indicate similarity of structure; counts off the main diagonal, especially in the upper
right or lower left, indicate dissimilarity.
Cattell's s is computed from these counts this way (the numbers here are cell indices):

s = (11 + 33 - 13 - 31) / [11 + 33 + 13 + 31 + .5(12 + 21 + 23 + 32)]

For our data,

s = (2 + 0 - 1 - 0) / [2 + 0 + 1 + 0 + .5(0 + 0 + 0 + 0)] = 1/3 = .33
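Here is the same bookkeeping as a short sketch (function name mine), using the ±.30
salience cut and the hypothetical Factor 1 loadings for the two groups that appear in the
Pearson r example below:

```python
import numpy as np

def cattell_s(load1, load2, cut=0.30):
    """Cattell's salient similarity index s for two vectors of loadings.

    Loadings > cut are Positively Salient (+1), loadings < -cut are Negatively
    Salient (-1), and everything in between is in the hyperplane (0).
    """
    c1 = np.where(np.asarray(load1) > cut, 1, np.where(np.asarray(load1) < -cut, -1, 0))
    c2 = np.where(np.asarray(load2) > cut, 1, np.where(np.asarray(load2) < -cut, -1, 0))
    count = lambda a, b: int(np.sum((c1 == a) & (c2 == b)))
    same = count(1, 1) + count(-1, -1)                 # cells 11 and 33
    opposite = count(1, -1) + count(-1, 1)             # cells 13 and 31
    hybrid = count(1, 0) + count(0, 1) + count(-1, 0) + count(0, -1)  # mixed cells
    return (same - opposite) / (same + opposite + 0.5 * hybrid)

group1 = [.90, .63, .15, -.09, -.74]   # hypothetical F1 loadings, Group 1
group2 = [.45, .65, .27, -.15, .95]    # hypothetical F1 loadings, Group 2
print(cattell_s(group1, group2))       # 0.33, as computed above
```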
Cattell et al. (Educ. & Psych. Measurement, 1969, 29, 781-792) provide tables to
convert s to an approximate significance level, P, for testing the null hypothesis that the
two factors being compared (one from population 1, one from population 2) are not
related to one another. [I have these tables, in Tabachnick & Fidell, 1989, pages 717 &
718, if you need them.] These tables require you to compute the percentage of
hyperplane counts (60, 70, 80, or 90) and to have at least 10 variables (the table has
rows for 10, 20, 30, 40, 50, 60, 80, & 100 variables). We have only 5 variables, and a
hyperplane percentage of only 40%, so we can't use the table. If we had 10 variables
and a hyperplane percentage of 60%, P = .138 for s = .26 and P = .02 for s = .51.
Under those conditions our s of .33 would have a P of about .10, not low enough to
reject the null hypothesis (if alpha = .05) and conclude that the two factors are related
(similar). In other words, we would be left with the null hypothesis that Factor 1 is not
the same in population 1 as population 2.
It is not always easy to decide which pairs of factors to compare. One does not
always compare Factor 1 in Group 1 with Factor 1 in Group 2, and 2 in 1 with 2 in 2, etc.
Factor 1 in Group 1 may look more like Factor 2 in Group 2 than it does like Factor 1 in
Group 2, so one would compare 1 in 1 with 2 in 2. Remember that factors are ordered
from highest to lowest SSL, and sampling error alone may cause inversions in the
orders of factors with similar SSLs. For our hypothetical data, comparing 1 in 1 with 1
in 2 makes sense, since F1 has high loadings on X1 and X2 in both groups. But what
factor in Group 2 would we choose to compare with F2 in Group 1? The structures are so
different that a simple eyeball test tells us that there is no factor in Group 2 similar to F2
in Group 1.
One may also use a simple Pearson r to compare two factors. Just correlate
the loadings on the factor in Group 1 with the loadings on the possibly similar factor in
Group 2. For 1 in 1 compared with 1 in 2, the Group 1 (Group 2) loadings are .90 (.45),
.63 (.65), .15 (.27), -.09 (-.15), and -.74 (.95). The r is -0.19, indicating little similarity
between the two factors.
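You can verify that r with a couple of lines (NumPy assumed):

```python
import numpy as np

g1 = [.90, .63, .15, -.09, -.74]   # Factor 1 loadings in Group 1
g2 = [.45, .65, .27, -.15, .95]    # Factor 1 loadings in Group 2
print(round(np.corrcoef(g1, g2)[0, 1], 2))   # -0.19
```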
The Pearson r can detect not only differences in two factors' patterns of loadings,
but also differences in the relative magnitudes of those loadings. One should beware
that with factors having a large number of small loadings, those small loadings could
cause the r to be large (if they are similar between factors) even if the factors had
dissimilar loadings on the more important variables.
Cross-Correlated Factor Scores. Compute factor scoring coefficients for Group
1 and, separately, for Group 2. Then for each case compute the factor score using the
scoring coefficients from the group in which it is located and also compute it using the
scoring coefficients from the other group. Correlate these two sets of factor scores
(Same Group and Other Group). A high correlation between these two sets of factor
scores should indicate similarity of the two factors between groups. Of course, this
method and the other two could be used with random halves of one sample to assess
the stability of the solution or with different random samples from the same population at
different times to get something like a test-retest measure of stability across samples
and times.
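A sketch of that bookkeeping (names mine), assuming you have one group's standardized
data and both groups' scoring coefficient matrices:

```python
import numpy as np

def cross_correlated_scores(Z, W_own, W_other):
    """Correlate same-group and other-group factor scores, factor by factor.

    Z        -- n x p standardized data for the cases in one group
    W_own    -- p x m scoring coefficients estimated in that group
    W_other  -- p x m scoring coefficients estimated in the other group
    """
    same = Z @ W_own
    other = Z @ W_other
    return [np.corrcoef(same[:, j], other[:, j])[0, 1] for j in range(same.shape[1])]
```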
RMS, root mean square. For each variable square the difference between the
loading in the one group and that in the other group. Find the mean of these differences
and then the square root of that mean. If there is a perfect match between the two
groups' loadings, RMS = 0. The maximum value of RMS, 2, would result when all of
the loadings are one or minus one, with those in the one group opposite in sign to those
in the other group.
CC, coefficient of congruence. Multiply each loading in the one group by the
corresponding loading in the other group. Sum these products and then divide by the
square root of (the sum of squared loadings for the one group times the sum of squared
loadings for the other group).
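Both indices are simple to compute from two vectors of loadings; here is a sketch,
reusing the hypothetical Factor 1 loadings from above:

```python
import numpy as np

def rms_difference(load1, load2):
    """Root mean square difference between two vectors of loadings."""
    d = np.asarray(load1, dtype=float) - np.asarray(load2, dtype=float)
    return np.sqrt(np.mean(d ** 2))

def congruence(load1, load2):
    """Coefficient of congruence between two vectors of loadings."""
    a, b = np.asarray(load1, dtype=float), np.asarray(load2, dtype=float)
    return np.sum(a * b) / np.sqrt(np.sum(a ** 2) * np.sum(b ** 2))

g1 = [.90, .63, .15, -.09, -.74]
g2 = [.45, .65, .27, -.15, .95]
print(rms_difference(g1, g2), congruence(g1, g2))   # about .78 and .10
```

For these loadings the RMS is large (about .78 of a possible 2) and the congruence is
near zero, both pointing to dissimilar factors, consistent with the Pearson r above.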
See Factorial Invariance of the Occupational Work Ethic Inventory -- an example
of the use of multiple techniques to compare factor structures.
High communalities and a high degree of overdetermination [each factor having at least
three or four high loadings and simple structure (few, nonoverlapping factors)] each
increase your chances of faithfully reproducing the population factor pattern.
Strengths in one area can compensate for weaknesses in another area.
When communalities are high (> .6), you should be in good shape even with N
well below 100.
With communalities moderate (about .5) and the factors well-determined, you
should have 100 to 200 subjects.
With communalities low (< .5) but high overdetermination of factors (not many
factors, each with 6 or 7 high loadings), you probably need well over 100
subjects.
With low communalities and only 3 or 4 high loadings on each, you probably
need over 300 subjects.
With low communalities and poorly determined factors, you will need well over
500 subjects.
Of course, when planning your research you do not know for sure how good the
communalities will be nor how well determined your factors will be, so maybe the best
simple advice, for an a priori rule of thumb, is "the more subjects, the better."
MacCallum's advice to researchers is to try to keep the number of variables and factors
small and to select variables (write items) so as to assure moderate to high communalities.
Closing Comments
Please note that this has been an introductory lesson that has not addressed
many of the less common techniques available. For example, I have not discussed
Alpha Extraction, which extracts factors with the goal of maximizing alpha (reliability)
coefficients of the Extracted Factors, or Maximum-Likelihood Extraction, or several
other extraction methods.
I should remind you of the necessity of investigating (maybe even deleting)
outlying observations. Subjects' factor scores may be inspected to find observations
that are outliers with respect to the solution [very large absolute value of a factor score].
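For example, if the saved factor scores are in a pandas data frame (SPSS names the saved
columns FAC1_1 and FAC2_1, as above), a quick way to flag suspicious cases is something
like this sketch; the ±3 cutoff is just one common choice, not a rule from this lesson:

```python
import pandas as pd

def flag_factor_score_outliers(df, cols=("FAC1_1", "FAC2_1"), cutoff=3.0):
    """Return the cases whose saved factor scores are unusually far from zero."""
    extreme = (df[list(cols)].abs() > cutoff).any(axis=1)
    return df[extreme]
```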