Let's Interact! Modeling Interaction Effects in Linear and Generalized Linear Models Using SAS
Topics
Two examples
Theory
Theorizing and specifying interaction effects
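Moderation: X2 moderates the X1 → Y relationship (the effect of X1 on Y depends on the level of X2).
Mediation: X1 → X2 → Y (X1 affects Y through X2).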
β0 intercept term
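A standard two-predictor specification with a product term, given here as an assumption consistent with the models estimated in the two cases, places the β0 term in context. The symbols β1, β2, β3 and ε are the usual slope, product-term and error notation, not taken from the slide.

```latex
Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \beta_3 (X_1 \times X_2) + \varepsilon

\frac{\partial Y}{\partial X_1} = \beta_1 + \beta_3 X_2
```

Under this form the effect of X1 on Y is not a single number: it changes by β3 for each one-unit change in X2, and β1 alone is the effect of X1 only where X2 = 0.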
Case #1
Predicting Investment Advisor Productivity
Data Preparation
Mean Centering
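One way to carry out this step is sketched below, under stated assumptions. The dataset names data_1 and data_2 are hypothetical, and PROC STANDARD is used only as a convenient way to subtract the mean. The ODS OUTPUT statement saves PROC UNIVARIATE's BasicMeasures table as a dataset named means, which has the layout (VarName, LocMeasure, LocValue) that the effect-plot code later reads.

```sas
/* Save the sample means (BasicMeasures table) for later use */
ODS OUTPUT BasicMeasures=means;
PROC UNIVARIATE DATA=data_1;            /* data_1: hypothetical input dataset */
  VAR ACCTS_PER_HH MEDIAN_HH_ASSETS;
RUN;

/* MEAN=0 without STD= shifts each variable to mean zero
   and leaves its scale unchanged */
PROC STANDARD DATA=data_1 MEAN=0 OUT=data_2;   /* data_2: hypothetical output */
  VAR ACCTS_PER_HH MEDIAN_HH_ASSETS;
RUN;
```

Centering in place keeps the original variable names, which matches the parameter names that appear in the regression output.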
Product Terms
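A minimal sketch of this step, assuming the two predictors have already been mean-centered. The product-term name ACCTS_PER_HH_MED_HH_ASSETS is the one that appears in the results; the dataset names data_2 and data_3 are assumptions.

```sas
DATA data_3;                 /* hypothetical output dataset */
  SET data_2;                /* hypothetical dataset holding the centered predictors */
  /* Product term: centered accounts per household x centered median household assets */
  ACCTS_PER_HH_MED_HH_ASSETS = ACCTS_PER_HH * MEDIAN_HH_ASSETS;
RUN;
```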
Let’s Interact!
Specifying the Model with PROC REG
Model Specification
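A specification along the following lines would produce output with the model labels, test label and summary dataset shown in the results. It is a sketch, not the original code: the input dataset name data_3 is an assumption, while the labels MAINEFFECTS, INTERACTION and INT_EFFECT and the dataset name parmest are taken from the output and later code.

```sas
/* OUTEST= saves the estimates; RSQUARE adds _RSQ_ to that dataset */
PROC REG DATA=data_3 OUTEST=parmest RSQUARE;
  /* Main effects only */
  MAINEFFECTS: MODEL TOTAL_REV_11 = ACCTS_PER_HH MEDIAN_HH_ASSETS;
  /* Main effects plus the product term */
  INTERACTION: MODEL TOTAL_REV_11 = ACCTS_PER_HH MEDIAN_HH_ASSETS
                                    ACCTS_PER_HH_MED_HH_ASSETS;
  /* F test of the product term; applies to the preceding MODEL statement */
  INT_EFFECT: TEST ACCTS_PER_HH_MED_HH_ASSETS = 0;
RUN;
QUIT;
```

Labelling each MODEL statement is what populates the _MODEL_ column of the OUTEST= dataset, so the two sets of estimates can be told apart later.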
Results: Main Effects Model
Analysis of Variance
Source    DF    Sum of Squares    Mean Square    F Value    Pr > F
Parameter Estimates
Variable    Parameter Estimate    Standard Error    t Value    Pr > |t|
Results: Interactive Model
The REG Procedure
Model: INTERACTION
Dependent Variable: TOTAL_REV_11
Analysis of Variance
Source    DF    Sum of Squares    Mean Square    F Value    Pr > F
Parameter Estimates
Variable    Parameter Estimate    Standard Error    t Value    Pr > |t|
Test INT_EFFECT Results for Dependent Variable TOTAL_REV_11
Source         DF    Mean Square    F Value    Pr > F
Numerator       1         431040       9.87    0.0017
Denominator   998          43664
Results: Summary Dataset
_MODEL_      _TYPE_  _DEPVAR_      Intercept  ACCTS_PER_HH  MEDIAN_HH_ASSETS  ACCTS_PER_HH_MED_HH_ASSETS  _RSQ_
MAINEFFECTS  PARMS   TOTAL_REV_11    751.270        53.813            -0.114                           .  0.812
INTERACTION  PARMS   TOTAL_REV_11    739.794        46.247            -0.305                       0.562  0.814
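The INTERACTION row can be read as a set of conditional slopes. The sketch below computes the slope of MEDIAN_HH_ASSETS at several values of the centered moderator, using the two coefficients from the table. The probe values (-1 to 1.5) and the names simple_slopes, ACCTS_CTR and SLOPE_ASSETS are illustrative assumptions.

```sas
DATA simple_slopes;
  b_assets = -0.305;   /* MEDIAN_HH_ASSETS coefficient, INTERACTION model */
  b_int    =  0.562;   /* product-term coefficient, INTERACTION model */
  /* Hypothetical values of centered accounts per household */
  DO ACCTS_CTR = -1 TO 1.5 BY 0.5;
    /* Conditional slope: b_assets + b_int * moderator */
    SLOPE_ASSETS = b_assets + b_int * ACCTS_CTR;
    OUTPUT;
  END;
  KEEP ACCTS_CTR SLOPE_ASSETS;
RUN;

PROC PRINT DATA=simple_slopes NOOBS;
RUN;
```

At the moderator's mean (ACCTS_CTR = 0) the slope equals the reported coefficient, -0.305. One unit above the mean it is -0.305 + 0.562 = 0.257, so the sign of the relationship changes across the range of the moderator.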
A Plot is Worth a Thousand Words
(or Coefficients)
Graphical Depictions of Interaction Effects
• Two strategies:
- Effect plots (effect displays) depict the strength and
direction of the relationship between the focal
independent variable and dependent variable at
different levels of the moderator variable.
- Coefficient plots display the coefficient (and
confidence interval) for the focal independent variable
with the scores for the moderator variable centered at
different values. This serves to highlight the regions of
significance of the focal independent variable.
Effect Plot
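/* Store the sample means as macro variables for use in centering. */
/* CALL SYMPUTX trims blanks and avoids the numeric-to-character   */
/* conversion note that CALL SYMPUT produces for numeric values.   */
DATA _NULL_;
  SET means;
  IF VarName="ACCTS_PER_HH" AND LocMeasure="Mean"
    THEN CALL SYMPUTX('AVG_ACCTS_PER_HH', LocValue);
  IF VarName="MEDIAN_HH_ASSETS" AND LocMeasure="Mean"
    THEN CALL SYMPUTX('AVG_MEDIAN_HH_ASSETS', LocValue);
RUN;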
/* Build predicted values from the INTERACTION model estimates */
DATA plot_1 (DROP=i j _MODEL_);
  SET parmest (WHERE=(_MODEL_="INTERACTION")
               KEEP=_MODEL_ Intercept MEDIAN_HH_ASSETS ACCTS_PER_HH
                    ACCTS_PER_HH_MED_HH_ASSETS
               RENAME=(MEDIAN_HH_ASSETS=b_MED_ASSETS ACCTS_PER_HH=b_ACCTS
                       ACCTS_PER_HH_MED_HH_ASSETS=b_MED_ASSETS_ACCTS));
  DO i=100 TO 600;            /* median household assets (x-axis) */
    DO j=1.5 TO 4 BY 0.5;     /* accounts per household (one line each) */
      MEDIAN_HH_ASSETS=i;
      /* The macro variables resolve to numeric text, so no INPUT() is needed */
      MEDIAN_HH_ASSETS_CTR=i - (&AVG_MEDIAN_HH_ASSETS);
      ACCTS_PER_HH=j;
      ACCTS_PER_HH_CTR=j - (&AVG_ACCTS_PER_HH);
      PRED=Intercept +                                  /* Intercept */
        (b_MED_ASSETS * MEDIAN_HH_ASSETS_CTR) +         /* Median HH assets */
        (b_ACCTS * ACCTS_PER_HH_CTR) +                  /* Accounts per household */
        (b_MED_ASSETS_ACCTS *
          (MEDIAN_HH_ASSETS_CTR * ACCTS_PER_HH_CTR));   /* Interaction */
      OUTPUT;
    END;
  END;
RUN;
/* Reshape to one row per MEDIAN_HH_ASSETS value, one PRED column per line */
DATA plot_2;
  MERGE plot_1 (WHERE=(ACCTS_PER_HH=1.5) RENAME=(PRED=PRED_1_5))
        plot_1 (WHERE=(ACCTS_PER_HH=2.0) RENAME=(PRED=PRED_2_0))
        plot_1 (WHERE=(ACCTS_PER_HH=2.5) RENAME=(PRED=PRED_2_5))
        plot_1 (WHERE=(ACCTS_PER_HH=3.0) RENAME=(PRED=PRED_3_0))
        plot_1 (WHERE=(ACCTS_PER_HH=3.5) RENAME=(PRED=PRED_3_5))
        plot_1 (WHERE=(ACCTS_PER_HH=4.0) RENAME=(PRED=PRED_4_0));
  BY MEDIAN_HH_ASSETS;
RUN;
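The plotting step could then look like the sketch below. PROC SGPLOT is an assumption; any line-plot procedure would serve. The axis and legend labels are illustrative, and three of the six lines are drawn to keep the example short.

```sas
PROC SGPLOT DATA=plot_2;
  /* One SERIES statement per level of the moderator */
  SERIES X=MEDIAN_HH_ASSETS Y=PRED_1_5 / LEGENDLABEL="1.5 accounts per household";
  SERIES X=MEDIAN_HH_ASSETS Y=PRED_2_5 / LEGENDLABEL="2.5 accounts per household";
  SERIES X=MEDIAN_HH_ASSETS Y=PRED_4_0 / LEGENDLABEL="4.0 accounts per household";
  XAXIS LABEL="Median household assets";
  YAXIS LABEL="Predicted TOTAL_REV_11";
RUN;
```

Lines that are not parallel are the visual signature of the interaction: the slope for median household assets differs by the number of accounts per household.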
Coefficient Plot
%INTPROBE(DataIn=data_3, DataOut=data_4);
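The internals of %INTPROBE are not shown. The general idea behind a coefficient plot can be sketched with a hypothetical macro, %PROBE_SKETCH, which is not the author's code. It re-centers the moderator at each requested value, refits the model, and keeps the focal variable's estimate and 95% confidence limits. The Centers= parameter, the working names _probe, _pe, ACCTS_C and PROD_C, and the assumption that the input holds the centered predictors under their original names are all illustrative.

```sas
%MACRO probe_sketch(DataIn=, DataOut=, Centers=-1 0 1);
  %LOCAL k n c;
  /* Count the space-delimited probe values */
  %LET n = %SYSFUNC(COUNTW(&Centers, %STR( )));
  %DO k = 1 %TO &n;
    %LET c = %SCAN(&Centers, &k, %STR( ));
    /* Re-center the moderator at &c and rebuild the product term */
    DATA _probe;
      SET &DataIn;
      ACCTS_C = ACCTS_PER_HH - (&c);
      PROD_C  = MEDIAN_HH_ASSETS * ACCTS_C;
    RUN;
    /* Refit; CLB adds 95% confidence limits to the estimates table */
    ODS OUTPUT ParameterEstimates=_pe;
    PROC REG DATA=_probe;
      MODEL TOTAL_REV_11 = MEDIAN_HH_ASSETS ACCTS_C PROD_C / CLB;
    RUN;
    QUIT;
    /* Keep only the focal variable's row and tag it with the probe value */
    DATA _pe;
      SET _pe (WHERE=(UPCASE(Variable)="MEDIAN_HH_ASSETS"));
      ACCTS_PER_HH_CENTER = &c;
    RUN;
    /* Appends to &DataOut; delete that dataset first when re-running */
    PROC APPEND BASE=&DataOut DATA=_pe;
    RUN;
  %END;
%MEND probe_sketch;

%probe_sketch(DataIn=data_3, DataOut=probe_out, Centers=-1 -0.5 0 0.5 1 1.5);
```

Because the focal variable's coefficient is its effect where the other term in the product equals zero, shifting the zero point of the moderator yields the focal effect, with a correct standard error, at each probe value.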
Variable    ACCTS_PER_HH_CENTER    Parameter Estimate    Lower 95% CL Parameter    Upper 95% CL Parameter    Pr > |t|
Case #2
Canadian Attitudes Toward
Canada–US Relations
Research Questions
Data: Canadian Election Studies (1997–2011)
“Do you think Canada’s ties with the United States should be much
closer, somewhat closer, about the same as now, somewhat more
distant, or much more distant?”
[Figure: Percentage of respondents answering "Much/Somewhat Closer," "About the Same as Now," and "Much/Somewhat More Distant," by election year (1997, 2000, 2004, 2006, 2008, 2011). Y-axis: % (0 to 60).]
Let’s Interact Some More!
Specifying the Model with PROC LOGISTIC
Model Specification
/* Main effects model: cumulative logit link for the ordinal response */
PROC LOGISTIC DATA=data_7;
  MODEL CANADA_TIES_US=
        POST_PARTY_CONS POST_PARTY_NDP POST_PARTY_BQ
        POST_PARTY_OTHER POST_NO_PARTY LEFT_RIGHT
        LN_DISTANCE_USA
        / LINK=CLOGIT RSQUARE;
  WEIGHT WEIGHT;
RUN;
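The interactive model adds party-by-distance product terms. A sketch under assumptions follows: the product-term names (LN_DIST_USA_CONS, LN_DIST_USA_NDP, LN_DIST_USA_BQ) and the dataset name parmest come from the effect-plot code, while the dataset name data_8, the test label INT_EFFECT, and the assumption that LN_DISTANCE_USA is already mean-centered in data_7 are not from the source.

```sas
/* Build the party x distance product terms */
DATA data_8;                                   /* hypothetical dataset name */
  SET data_7;
  LN_DIST_USA_CONS = LN_DISTANCE_USA * POST_PARTY_CONS;
  LN_DIST_USA_NDP  = LN_DISTANCE_USA * POST_PARTY_NDP;
  LN_DIST_USA_BQ   = LN_DISTANCE_USA * POST_PARTY_BQ;
RUN;

/* OUTEST= saves the estimates for use in the effect plots */
PROC LOGISTIC DATA=data_8 OUTEST=parmest;
  MODEL CANADA_TIES_US=
        POST_PARTY_CONS POST_PARTY_NDP POST_PARTY_BQ
        POST_PARTY_OTHER POST_NO_PARTY LEFT_RIGHT
        LN_DISTANCE_USA
        LN_DIST_USA_CONS LN_DIST_USA_NDP LN_DIST_USA_BQ
        / LINK=CLOGIT RSQUARE;
  WEIGHT WEIGHT;
  /* Joint Wald test that all three product terms are zero */
  INT_EFFECT: TEST LN_DIST_USA_CONS=0, LN_DIST_USA_NDP=0, LN_DIST_USA_BQ=0;
RUN;
```

A joint test is the natural counterpart of the single-coefficient test in Case #1, because the interaction here is carried by three product terms rather than one.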
Criterion    Intercept Only    Intercept and Covariates
Results: Main Effects Model
Analysis of Maximum Likelihood Estimates
Parameter    Estimate    Standard Error    Wald Chi-Square    Pr > ChiSq
Results: Interactive Model
The LOGISTIC Procedure
Criterion    Intercept Only    Intercept and Covariates
Analysis of Maximum Likelihood Estimates
Parameter    Estimate    Standard Error    Wald Chi-Square    Pr > ChiSq
Linear Hypotheses Testing Results
Label    Wald Chi-Square    DF    Pr > ChiSq
Results: Summary Dataset
POST_PARTY_CONS    POST_PARTY_NDP    POST_PARTY_BQ    LN_DISTANCE_USA
Effect Plots
/* Cumulative probabilities by party and distance from the border */
DATA plot_1 (KEEP=DISTANCE_CAN_US_BORDER LN_DISTANCE_USA
                  LN_DISTANCE_USA_CTR CP_: );
  SET parmest (KEEP=Intercept_: POST_PARTY_: LN_DIST:
               RENAME=(POST_PARTY_CONS=b_CONS POST_PARTY_NDP=b_NDP
                       POST_PARTY_BQ=b_BQ LN_DISTANCE_USA=b_LN_DISTANCE_USA
                       LN_DIST_USA_CONS=b_LN_DIST_USA_CONS
                       LN_DIST_USA_NDP=b_LN_DIST_USA_NDP
                       LN_DIST_USA_BQ=b_LN_DIST_USA_BQ));
  DO i=0.1, 0.5, 1 TO 2500;
    DISTANCE_CAN_US_BORDER=i;
    LN_DISTANCE_USA=(LOG(DISTANCE_CAN_US_BORDER));
    /* Center logged distance at its mean */
    LN_DISTANCE_USA_CTR=(LOG(DISTANCE_CAN_US_BORDER))
                        -4.6644800622;

    /* First intercept */
    REG_EQN_1_LIB=Intercept_1 +
      (b_LN_DISTANCE_USA*LN_DISTANCE_USA_CTR);
    CP_1_LIB=CDF('LOGISTIC',REG_EQN_1_LIB);
    REG_EQN_1_CONS=Intercept_1 + (b_CONS*1) +
      (b_LN_DISTANCE_USA*LN_DISTANCE_USA_CTR) +
      (b_LN_DIST_USA_CONS*(1*LN_DISTANCE_USA_CTR));
    CP_1_CONS=CDF('LOGISTIC',REG_EQN_1_CONS);
    REG_EQN_1_NDP=Intercept_1 + (b_NDP*1) +
      (b_LN_DISTANCE_USA*LN_DISTANCE_USA_CTR) +
      (b_LN_DIST_USA_NDP*(1*LN_DISTANCE_USA_CTR));
    CP_1_NDP=CDF('LOGISTIC',REG_EQN_1_NDP);
    REG_EQN_1_BQ=Intercept_1 + (b_BQ*1) +
      (b_LN_DISTANCE_USA*LN_DISTANCE_USA_CTR) +
      (b_LN_DIST_USA_BQ*(1*LN_DISTANCE_USA_CTR));
    CP_1_BQ=CDF('LOGISTIC',REG_EQN_1_BQ);

    /* Second intercept */
    REG_EQN_2_LIB=Intercept_2 +
      (b_LN_DISTANCE_USA*LN_DISTANCE_USA_CTR);
    CP_2_LIB=CDF('LOGISTIC',REG_EQN_2_LIB);
    REG_EQN_2_CONS=Intercept_2 + (b_CONS*1) +
      (b_LN_DISTANCE_USA*LN_DISTANCE_USA_CTR) +
      (b_LN_DIST_USA_CONS*(1*LN_DISTANCE_USA_CTR));
    CP_2_CONS=CDF('LOGISTIC',REG_EQN_2_CONS);
    REG_EQN_2_NDP=Intercept_2 + (b_NDP*1) +
      (b_LN_DISTANCE_USA*LN_DISTANCE_USA_CTR) +
      (b_LN_DIST_USA_NDP*(1*LN_DISTANCE_USA_CTR));
    CP_2_NDP=CDF('LOGISTIC',REG_EQN_2_NDP);
    REG_EQN_2_BQ=Intercept_2 + (b_BQ*1) +
      (b_LN_DISTANCE_USA*LN_DISTANCE_USA_CTR) +
      (b_LN_DIST_USA_BQ*(1*LN_DISTANCE_USA_CTR));
    CP_2_BQ=CDF('LOGISTIC',REG_EQN_2_BQ);

    /* Third intercept */
    REG_EQN_3_LIB=Intercept_3 +
      (b_LN_DISTANCE_USA*LN_DISTANCE_USA_CTR);
    CP_3_LIB=CDF('LOGISTIC',REG_EQN_3_LIB);
    REG_EQN_3_CONS=Intercept_3 + (b_CONS*1) +
      (b_LN_DISTANCE_USA*LN_DISTANCE_USA_CTR) +
      (b_LN_DIST_USA_CONS*(1*LN_DISTANCE_USA_CTR));
    CP_3_CONS=CDF('LOGISTIC',REG_EQN_3_CONS);
    REG_EQN_3_NDP=Intercept_3 + (b_NDP*1) +
      (b_LN_DISTANCE_USA*LN_DISTANCE_USA_CTR) +
      (b_LN_DIST_USA_NDP*(1*LN_DISTANCE_USA_CTR));
    CP_3_NDP=CDF('LOGISTIC',REG_EQN_3_NDP);
    REG_EQN_3_BQ=Intercept_3 + (b_BQ*1) +
      (b_LN_DISTANCE_USA*LN_DISTANCE_USA_CTR) +
      (b_LN_DIST_USA_BQ*(1*LN_DISTANCE_USA_CTR));
    CP_3_BQ=CDF('LOGISTIC',REG_EQN_3_BQ);
    OUTPUT;
  END;
RUN;
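The cumulative probabilities in plot_1 can then be drawn by party. The sketch below assumes PROC SGPLOT and plots only the first cumulative probability (the CP_1_ variables); the axis and legend labels are illustrative.

```sas
PROC SGPLOT DATA=plot_1;
  /* One line per party; the equations without a party dummy are labelled _LIB */
  SERIES X=DISTANCE_CAN_US_BORDER Y=CP_1_LIB  / LEGENDLABEL="Liberal";
  SERIES X=DISTANCE_CAN_US_BORDER Y=CP_1_CONS / LEGENDLABEL="Conservative";
  SERIES X=DISTANCE_CAN_US_BORDER Y=CP_1_NDP  / LEGENDLABEL="NDP";
  SERIES X=DISTANCE_CAN_US_BORDER Y=CP_1_BQ   / LEGENDLABEL="BQ";
  XAXIS LABEL="Distance from the Canada-US border";
  YAXIS LABEL="Cumulative probability (first intercept)" MIN=0 MAX=1;
RUN;
```

Repeating the step with the CP_2_ and CP_3_ variables gives the remaining panels.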
Effect Plots
Coefficient Plots
Recap
References
Aiken, L.S. and S.G. West (1991) Multiple Regression: Testing and Interpreting
Interactions. Thousand Oaks, CA: Sage.
Allison, P.D. (1977) “Testing for Interaction in Multiple Regression.” American
Journal of Sociology 83(1): 144–153.
Baron, R.M. and D.A. Kenny (1986) “The Moderator-Mediator Variable
Distinction in Social Psychological Research.” Journal of Personality and
Social Psychology 51(6): 1173–1182.
Braumoeller, B.F. (2004) “Hypothesis Testing and Multiplicative Interaction
Terms.” International Organization 58(4): 807–820.
Edwards, J.R. (2008) “Seven Deadly Myths of Testing Moderation in
Organizational Research.” Statistical and Methodological Myths and Urban
Legends. Eds. C.E. Lance and R.J. Vandenberg. New York: Routledge.
Fox, J. (2008) Applied Regression Analysis and Generalized Linear Models. 2nd
ed. Thousand Oaks, CA: Sage.
Thank you!
Timothy B. Gravelle
Principal Scientist & Director, Insights Lab