Path Analysis Introduction and Example

Joel S Steele, PhD

Winter 2017

Path Analysis

Model specification

There are two main ways of communicating the system of equations that represents a theoretical model:
with a set of simultaneous equations or with a path diagram. Below we explore both and provide an example.

Path Model Assumptions

For this example we will be accepting a number of assumptions.

1. All causal relations are linear and additive
2. All models are recursive
   • results in uncorrelated error terms
   • no two-way causal relations
   • no feedback loops
3. Error terms are uncorrelated with other independent variables
4. There is a weak causal ordering
5. Causal closure, meaning all of the relevant causal variables are included in the model

If these assumptions are met, then we can use least squares regression for our estimation. In what follows we
will be fitting our model to the standardized data.

A simple example

Figure 1: Regression path diagram

Figure 1 depicts a simple multiple regression, which is no more difficult than what we have seen
previously. Everything that we know from multiple regression should replicate in this situation. However,
there is another important aspect to this illustration: the goal of path modeling, and of its
multivariate extensions such as SEM and latent variable modeling, is to reproduce the variance-covariance
matrix of the variables included. In this example we will be using z-scores, so we will be interested in
reproducing the correlation matrix among the variables z1, z2, and z3.
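As a quick illustration of why z-scores turn the target into a correlation matrix, the sketch below standardizes three simulated variables and compares the covariance matrix of the z-scores with the correlation matrix of the raw values. The data and the names `x` and `z` are made up for this example and are not the data analyzed later.

```r
# Simulated variables on arbitrary scales (illustrative only)
set.seed(42)
a <- rnorm(100)
b <- 10 * a + rnorm(100, sd = 5)
c <- 3 - 0.5 * a + rnorm(100)
x <- cbind(a = a, b = b, c = c)

# z-scores: subtract each column mean and divide by each column SD
z <- scale(x)

# Covariances of z-scores are the correlations of the raw variables
same <- all.equal(cov(z), cor(x), check.attributes = FALSE)
same
```

Because every z-score has unit variance, reproducing the covariance matrix of the standardized variables is the same task as reproducing the correlation matrix.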

Simultaneous equation modeling approach

The equation that represents the path model above in Figure 1 can be expressed as,

$$z_{1i} = \beta_{12} z_{2i} + \beta_{13} z_{3i} + \beta_{1a} u_{ai}. \tag{1}$$

In the following steps we compute the model-based correlations among the variables. That is, we use
Equation 1 to see how each relation is decomposed based on our theoretical arrangement.

Correlation r12

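In order to compute the model-based expected correlation between z1 and z2 we multiply both sides of
Equation 1 by z2i, average over the n observations, and simplify.

$$
\begin{aligned}
\frac{1}{n}\sum z_{1i} z_{2i} &= \frac{1}{n}\sum \beta_{12} z_{2i} z_{2i} + \frac{1}{n}\sum \beta_{13} z_{3i} z_{2i} + \frac{1}{n}\sum \beta_{1a} u_{ai} z_{2i} \\
\frac{1}{n}\sum z_{1i} z_{2i} &= \beta_{12} \frac{1}{n}\sum z_{2i} z_{2i} + \beta_{13} \frac{1}{n}\sum z_{3i} z_{2i} + \beta_{1a} \frac{1}{n}\sum u_{ai} z_{2i} \\
r_{12} &= \beta_{12}(1) + \beta_{13} r_{23} + \beta_{1a} r_{a2}
\end{aligned}
$$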

It is important to note that, by assumption, the errors are uncorrelated with all of the predictors, thus ra2 = 0.
Making this substitution we obtain,

$$r_{12} = \beta_{12} + \beta_{13} r_{23}, \tag{2}$$

which represents our model-based estimate of the correlation between z1 and z2.

Correlation r13

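In order to compute the model-based expected correlation between z1 and z3 we multiply both sides of
Equation 1 by z3i, average over the n observations, and simplify.

$$
\begin{aligned}
\frac{1}{n}\sum z_{1i} z_{3i} &= \frac{1}{n}\sum \beta_{12} z_{2i} z_{3i} + \frac{1}{n}\sum \beta_{13} z_{3i} z_{3i} + \frac{1}{n}\sum \beta_{1a} u_{ai} z_{3i} \\
r_{13} &= \beta_{12} r_{23} + \beta_{13}(1) + 0 \\
r_{13} &= \beta_{12} r_{23} + \beta_{13}
\end{aligned}
\tag{3}
$$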

Parameter estimation of β12 and β13

Now that we have the model implied correlations for both r12 and r13 , we can focus on the estimation of the
parameters β12 and β13 . Starting from the model implied relations among the variables, the estimation of
these parameters can be expressed using our earlier solutions in equations 2 and 3.
To begin, we will focus on the estimation of β12. Our first step is to solve equation 3 for the parameter β13.
This gives an expression for β13 in terms of β12, which we need in order to solve for β12.

$$
\begin{aligned}
r_{13} &= \beta_{12} r_{23} + \beta_{13} \\
\beta_{13} &= r_{13} - \beta_{12} r_{23}
\end{aligned}
$$

Substituting this expression into equation 2 we obtain,

$$
\begin{aligned}
r_{12} &= \beta_{12} + (r_{13} - \beta_{12} r_{23}) r_{23} \\
r_{12} &= \beta_{12} + r_{13} r_{23} - \beta_{12} r_{23}^2 \\
r_{12} - r_{13} r_{23} &= \beta_{12} - \beta_{12} r_{23}^2 \\
r_{12} - r_{13} r_{23} &= \beta_{12} (1 - r_{23}^2) \\
\beta_{12} &= \frac{r_{12} - r_{13} r_{23}}{1 - r_{23}^2}
\end{aligned}
\tag{4}
$$

A similar process can be performed for the estimation of β13 .
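Carrying out the same steps with the roles of z2 and z3 exchanged gives $\beta_{13} = (r_{13} - r_{12} r_{23}) / (1 - r_{23}^2)$. The sketch below evaluates both expressions and plugs them back into equations 2 and 3 to confirm that the model returns the correlations it started from. The three correlation values are hypothetical and were chosen only for illustration.

```r
# Hypothetical correlations, chosen only for illustration
r12 <- 0.50
r13 <- 0.40
r23 <- 0.30

# Equation 4, and its counterpart with z2 and z3 exchanged
b12 <- (r12 - r13 * r23) / (1 - r23^2)
b13 <- (r13 - r12 * r23) / (1 - r23^2)

# Model-implied correlations from equations 2 and 3
implied_r12 <- b12 + b13 * r23
implied_r13 <- b12 * r23 + b13

c(b12 = b12, b13 = b13, implied_r12 = implied_r12, implied_r13 = implied_r13)
```

The implied correlations equal the inputs, which is what reproducing the correlation matrix means for this single-equation model.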

Standard Error of Estimation

Finally, we will solve for the model based correlation of z1i with itself. We multiply through our structural
equation by z1i ,

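$$
\begin{aligned}
\frac{1}{n}\sum z_{1i} z_{1i} &= \frac{1}{n}\sum \beta_{12} z_{2i} z_{1i} + \frac{1}{n}\sum \beta_{13} z_{3i} z_{1i} + \frac{1}{n}\sum \beta_{1a} u_{ai} z_{1i} \\
1 &= \beta_{12} r_{12} + \beta_{13} r_{13} + \beta_{1a} r_{1a} \\
\beta_{1a} r_{1a} &= 1 - (\beta_{12} r_{12} + \beta_{13} r_{13})
\end{aligned}
$$

Recall that the multiple $R^2$ for a model is equal to $\sum_{p=1}^{k} \beta_{yp} r_{yp}$, where k is the number of predictors for the
variable y. In our above equation this translates to $R^2 = \beta_{12} r_{12} + \beta_{13} r_{13}$, thus we can express the above
equation as,

$$\beta_{1a} r_{1a} = 1 - R^2. \tag{5}$$

You may also notice that since uai is uncorrelated with any other predictor, the correlation r1a = β1a. This
results in our final expression of equation 5,

$$
\begin{aligned}
\beta_{1a}^2 &= 1 - R^2 \\
\beta_{1a} &= \sqrt{1 - R^2}.
\end{aligned}
\tag{6}
$$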

This last expression is our standard error of the estimate from the model.
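A small numerical check of equations 5 and 6, sketched on simulated data (the variables and generating coefficients below are invented for the example): after z-scoring, the sum of coefficient-by-correlation products should match the usual R-squared, and the standard deviation of the residuals should match the square root of 1 − R².

```r
# Simulated data (illustrative only), then converted to z-scores
set.seed(1)
n  <- 500
z2 <- rnorm(n)
z3 <- 0.3 * z2 + rnorm(n)
z1 <- 0.4 * z2 + 0.3 * z3 + rnorm(n)
z  <- as.data.frame(scale(cbind(z1, z2, z3)))

fit  <- lm(z1 ~ z2 + z3, data = z)
beta <- coef(fit)[c("z2", "z3")]        # standardized coefficients
r    <- cor(z)

R2      <- sum(beta * r["z1", c("z2", "z3")])   # R^2 as the sum of beta * r
beta_1a <- sqrt(1 - R2)                          # equation 6

c(R2 = R2, lm_R2 = summary(fit)$r.squared,
  beta_1a = beta_1a, resid_sd = sd(resid(fit)))
```

Note that `summary(fit)$sigma` divides by the residual degrees of freedom rather than n − 1, so it will be slightly larger than `beta_1a` in a finite sample.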

Data Example

Motivation

The difference from what we have seen before is that we are now considering multiple equations with multiple
possible outcomes. Each equation is still for a single outcome, but we can consider the entire system
of equations. This allows us not only to see the influence of other inputs on the relations among predictors and
outcomes, as with moderation, but also to examine the possible mechanisms of causation.
These causal relations can be either direct or indirect, meaning that they can operate through other variables.
These data represent a subset of 62 academic professionals who were measured on a number of variables
including:
• sex : Biological sex of respondent (male=1)
• time : Time, in years, since earning their PhD
• pub : Number of publications
• cit : Number of citations
• salary : Annual salary in dollars

Table 1: Descriptive statistics

             mean        sd    min    max  range        se
time        6.790     4.278      1     21     20     0.543
pub        18.177    14.004      1     69     68     1.779
sex         0.565     0.500      0      1      1     0.063
cit        40.226    17.172      1     90     89     2.181
salary  54815.758  9706.023  37939  83503  45564  1232.666

Below we present a path diagram in Figure 2, as well as the mathematical specification of the system of
equations in Equation 7.

Zero-order correlations

It is always informative to look at the raw associations among the variables before any modeling is proposed.
Below is the correlation table for these data.

Table 2: Correlations among the raw data

         time    pub    sex    cit  salary
time    1.000  0.651  0.210  0.373   0.608
pub     0.651  1.000  0.159  0.333   0.506
sex     0.210  0.159  1.000  0.149   0.201
cit     0.373  0.333  0.149  1.000   0.550
salary  0.608  0.506  0.201  0.550   1.000

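The entire system can be expressed as,

$$
\begin{aligned}
\text{time} &\sim \text{sex} \\
\text{pub} &\sim \text{sex} + \text{time} \\
\text{cit} &\sim \text{sex} + \text{time} + \text{pub} \\
\text{salary} &\sim \text{sex} + \text{time} + \text{pub} + \text{cit}
\end{aligned}
\tag{7}
$$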

Figure 2: Path diagram

Model fit using linear multiple regression

Next we explore what the estimates will be for each of our linear equations using the multiple regression
estimation framework.

time ∼ sex

      Estimate  Std. Error  t value  Pr(>|t|)
sex      0.210       0.125    1.674     0.099

pub ∼ sex + time

      Estimate  Std. Error  t value  Pr(>|t|)
sex      0.023       0.100    0.234     0.816
time     0.646       0.100    6.442     0.000
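The two coefficients in the pub equation can be recovered by hand from the zero-order correlations in Table 2 using equation 4, with pub playing the role of z1 and sex and time the roles of the two predictors. The sketch below does this; the variable names are illustrative, and small discrepancies are possible because the tabled correlations are rounded to three decimals.

```r
# Zero-order correlations copied from Table 2
r_pub_time <- 0.651
r_pub_sex  <- 0.159
r_sex_time <- 0.210

# Equation 4 applied to each predictor of pub
b_time <- (r_pub_time - r_pub_sex * r_sex_time) / (1 - r_sex_time^2)
b_sex  <- (r_pub_sex - r_pub_time * r_sex_time) / (1 - r_sex_time^2)

round(c(sex = b_sex, time = b_time), 3)
```

Rounded to three decimals, these values agree with the regression estimates of 0.023 and 0.646 reported above.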

cit ∼ sex + time + pub

      Estimate  Std. Error  t value  Pr(>|t|)
sex      0.071       0.122    0.578     0.566
time     0.257       0.159    1.620     0.110
pub      0.155       0.157    0.983     0.330

salary ∼ sex + time + pub + cit

      Estimate  Std. Error  t value  Pr(>|t|)
sex      0.047       0.095    0.498     0.621
time     0.378       0.126    3.002     0.004
pub      0.134       0.123    1.089     0.281
cit      0.357       0.101    3.542     0.001
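With more than two predictors the closed form in equation 4 generalizes to a small linear system: the standardized coefficients solve R_xx β = r_xy, where R_xx holds the correlations among the predictors and r_xy their correlations with the outcome. The sketch below applies this to all four equations using only the correlations in Table 2. The helper `std_coefs` and the object names are hypothetical, and the results can differ from the tables in the third decimal because the correlations are rounded.

```r
# Correlation matrix copied from Table 2
vars <- c("time", "pub", "sex", "cit", "salary")
R <- matrix(c(
  1.000, 0.651, 0.210, 0.373, 0.608,
  0.651, 1.000, 0.159, 0.333, 0.506,
  0.210, 0.159, 1.000, 0.149, 0.201,
  0.373, 0.333, 0.149, 1.000, 0.550,
  0.608, 0.506, 0.201, 0.550, 1.000
), nrow = 5, byrow = TRUE, dimnames = list(vars, vars))

# Solve R_xx %*% beta = r_xy for one equation (hypothetical helper)
std_coefs <- function(R, y, x) {
  beta <- solve(R[x, x, drop = FALSE], R[x, y])
  setNames(as.numeric(beta), x)
}

# Predictors of each outcome, following equation 7
equations <- list(
  time   = "sex",
  pub    = c("sex", "time"),
  cit    = c("sex", "time", "pub"),
  salary = c("sex", "time", "pub", "cit")
)

est <- lapply(names(equations), function(y) std_coefs(R, y, equations[[y]]))
names(est) <- names(equations)
lapply(est, round, 3)
```

This works because the regressions above were fit to standardized data, so each set of estimates depends on the data only through the correlation matrix.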
Structural Equation Modeling of the System

Next we will use the R package lavaan to fit the above model to our data.
suppressMessages(library(lavaan))
# Path model in lavaan syntax: one regression per endogenous variable
fig12.2.1_mod <- '
time ~ sex
pub ~ sex + time
cit ~ sex + time + pub
salary ~ sex + time + pub + cit'
# dat is the data frame holding the five raw variables
fit <- sem(fig12.2.1_mod, data = dat)
summary(fit, fit.measures = TRUE)

lavaan (0.5-22) converged normally after 135 iterations

  Number of observations                            62

  Estimator                                         ML
  Minimum Function Test Statistic                0.000
  Degrees of freedom                                 0

Model test baseline model:

  Minimum Function Test Statistic               91.009
  Degrees of freedom                                10
  P-value                                        0.000

User model versus baseline model:

  Comparative Fit Index (CFI)                    1.000
  Tucker-Lewis Index (TLI)                       1.000

Loglikelihood and Information Criteria:

  Loglikelihood user model (H0)              -1348.081
  Loglikelihood unrestricted model (H1)      -1348.081

  Number of free parameters                         14
  Akaike (AIC)                                2724.162
  Bayesian (BIC)                              2753.942
  Sample-size adjusted Bayesian (BIC)         2709.893

Root Mean Square Error of Approximation:

  RMSEA                                          0.000
  90 Percent Confidence Interval          0.000  0.000
  P-value RMSEA <= 0.05                             NA

Standardized Root Mean Square Residual:

  SRMR                                           0.000

Parameter Estimates:

  Information                                 Expected
  Standard Errors                             Standard

Regressions:
                   Estimate      Std.Err  z-value  P(>|z|)
  time ~
    sex               1.794        1.063    1.688    0.091
  pub ~
    sex               0.657        2.762    0.238    0.812
    time              2.114        0.323    6.548    0.000
  cit ~
    sex               2.426        4.096    0.592    0.554
    time              1.034        0.622    1.661    0.097
    pub               0.190        0.188    1.008    0.314
  salary ~
    sex             917.767     1783.362    0.515    0.607
    time            857.006      276.091    3.104    0.002
    pub              92.746       82.391    1.126    0.260
    cit             201.931       55.141    3.662    0.000

Variances:
                   Estimate      Std.Err  z-value  P(>|z|)
   .time             17.214        3.092    5.568    0.000
   .pub             111.191       19.971    5.568    0.000
   .cit             244.239       43.867    5.568    0.000
   .salary     46042901.212  8269549.178    5.568    0.000

Estimation comparisons

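Below we present tables of estimates from both the SEM and the separate linear regressions. The
standardized coefficients agree to three decimal places. The test statistics differ slightly because lavaan
reports maximum likelihood z-values while the regressions report least squares t-values.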

Table 7: Standardized estimates from SEM

lhs     rhs   std.all      z  pvalue
time    sex     0.210  1.688   0.091
pub     sex     0.023  0.238   0.812
pub     time    0.646  6.548   0.000
cit     sex     0.071  0.592   0.554
cit     time    0.257  1.661   0.097
cit     pub     0.155  1.008   0.314
salary  sex     0.047  0.515   0.607
salary  time    0.378  3.104   0.002
salary  pub     0.134  1.126   0.260
salary  cit     0.357  3.662   0.000
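The lavaan summary reports coefficients in raw units, whereas Table 7 reports standardized ones. The two are linked by the usual rescaling, standardized = raw × sd(predictor) / sd(outcome). The sketch below applies that rescaling to four of the raw estimates using the standard deviations in Table 1. It is an approximate check: the tabled values are rounded, and the names `sds` and `est` are illustrative.

```r
# Standard deviations copied from Table 1
sds <- c(time = 4.278, pub = 14.004, sex = 0.500, cit = 17.172, salary = 9706.023)

# Four unstandardized estimates (lhs ~ rhs) copied from the lavaan output
est <- data.frame(
  lhs = c("time", "pub", "salary", "salary"),
  rhs = c("sex", "time", "time", "cit"),
  b   = c(1.794, 2.114, 857.006, 201.931),
  stringsAsFactors = FALSE
)

# Rescale each raw slope by sd(predictor) / sd(outcome)
est$std <- round(est$b * unname(sds[est$rhs]) / unname(sds[est$lhs]), 3)
est
```

The rescaled values match the corresponding std.all entries in Table 7 (0.210, 0.646, 0.378, and 0.357).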

Table 8: Estimates from linear regression models

               Estimate  Std. Error  t value  Pr(>|t|)
time ~ sex        0.210       0.125    1.674     0.099
pub ~ sex         0.023       0.100    0.234     0.816
pub ~ time        0.646       0.100    6.442     0.000
cit ~ sex         0.071       0.122    0.578     0.566
cit ~ time        0.257       0.159    1.620     0.110
cit ~ pub         0.155       0.157    0.983     0.330
salary ~ sex      0.047       0.095    0.498     0.621
salary ~ time     0.378       0.126    3.002     0.004
salary ~ pub      0.134       0.123    1.089     0.281
salary ~ cit      0.357       0.101    3.542     0.001
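The motivation for this example noted that causal relations can be direct or indirect. As an illustration that goes one step beyond the tables, the sketch below uses the standard path-tracing rule (an indirect effect is the product of the coefficients along a path) to split the effect of time on salary into its direct part and its three indirect routes. The coefficients are the standardized estimates reported above; the decomposition itself is not part of the original output, and no standard errors are computed.

```r
# Standardized path coefficients copied from the tables above
p <- c(time_pub = 0.646, time_cit = 0.257, pub_cit = 0.155,
       time_salary = 0.378, pub_salary = 0.134, cit_salary = 0.357)

direct <- p[["time_salary"]]

# Each indirect effect is the product of the coefficients along its path
indirect <- c(
  via_pub         = p[["time_pub"]] * p[["pub_salary"]],
  via_cit         = p[["time_cit"]] * p[["cit_salary"]],
  via_pub_and_cit = p[["time_pub"]] * p[["pub_cit"]] * p[["cit_salary"]]
)

total <- direct + sum(indirect)
round(c(direct = direct, indirect, total = total), 3)
```

Under these estimates the total standardized effect of time on salary is about 0.592, of which 0.378 is direct. The remaining gap to the observed correlation of 0.608 comes from sex, which influences both time and salary.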

Table 9: Correlations among the raw data

         time    pub    sex    cit  salary
time    1.000  0.651  0.210  0.373   0.608
pub     0.651  1.000  0.159  0.333   0.506
sex     0.210  0.159  1.000  0.149   0.201
cit     0.373  0.333  0.149  1.000   0.550
salary  0.608  0.506  0.201  0.550   1.000
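To connect the estimates back to the goal of reproducing the correlation matrix, the sketch below rebuilds one entry of Table 9 from the model. Multiplying the salary equation through by time, as was done for equations 2 and 3, gives the implied correlation between salary and time as the sum of each salary coefficient times that predictor's correlation with time. The inputs are rounded table values, so agreement is expected only to about three decimals.

```r
# Standardized coefficients of the salary equation (Table 7)
beta <- c(sex = 0.047, time = 0.378, pub = 0.134, cit = 0.357)

# Correlation of each predictor with time (time row of Table 9)
r_with_time <- c(sex = 0.210, time = 1.000, pub = 0.651, cit = 0.373)

# Model-implied correlation between salary and time
implied_r <- sum(beta * r_with_time)
round(implied_r, 3)
```

The result rounds to 0.608, the observed correlation between time and salary in Table 9.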
