Assignment on Data Analysis Using Stata and EViews

Oromia State University, School of Graduate Studies, MSc in Development Economics (Burayu Center)

OROMIA STATE UNIVERSITY

SCHOOL OF GRADUATE STUDIES

DEPARTMENT OF DEVELOPMENT ECONOMICS

MSc Development Economics 2nd Year

Advanced Research Methods and Software Application (DECON532)
Group Assignment on Data Analysis Using Stata and EViews.
Name Id.No

1. Sura Girma Umeta……………………………. SGS/DE/W/15/0819

2. Milkiyas Lamessa Wegi.………………………. SGS/DE/W/15/0816
3. Beekbar Debeli Merga….………………………SGS/DE/W/15/0809
4. Miresa Bane Tefera….….………………………PG/DE/W/14/08022

Zemed D.M. (Ph.D.)

Assistant Professor of Economics
December, 2023
Burayyu, Ethiopia

Part I: Cross-Sectional Data Analysis using Stata

1. Descriptive Analysis
➢ Measures of Central Tendency:

Age: Mean age is 24.6 with a standard deviation of 2.84.

Minimum age is 19, and the maximum age is 30.
Sales: Mean sales amount is 43.55 with a standard deviation of 19.97.
Minimum sales amount is 11, and the maximum is 77.

➢ Measures of Dispersion:

Age: The variance in age is 8.04.
Skewness is negative (-0.0648), indicating a very slight left skew.
Kurtosis is 2.64, slightly below the normal-distribution value of 3, so the distribution is a
little flatter than normal.
Sales: The variance in sales is 398.9974.
Skewness is positive (0.3328), indicating a slight right skew.
Kurtosis is 1.9513, well below 3, indicating a flat distribution with light tails.

➢ Measures of Distribution:

Age: The data are nearly symmetric, with a very slight left skew.
The distribution is slightly flatter (less peaked) than a normal distribution.
Sales: The sales data are slightly positively skewed.
The distribution is clearly flatter than a normal distribution.
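
The skewness and kurtosis figures above are moment-based summaries. The following is a minimal Python sketch of how such values can be computed. It assumes the population-moment definitions (dividing by n), which to our understanding match what Stata's summarize, detail reports; the data list is hypothetical and is not the assignment data.

```python
def moments(values):
    """Return mean, sample variance, moment skewness and moment kurtosis."""
    n = len(values)
    mean = sum(values) / n
    # Central moments computed with a divisor of n (population moments).
    m2 = sum((x - mean) ** 2 for x in values) / n
    m3 = sum((x - mean) ** 3 for x in values) / n
    m4 = sum((x - mean) ** 4 for x in values) / n
    return {
        "mean": mean,
        "variance": m2 * n / (n - 1),   # sample variance uses n - 1
        "skewness": m3 / m2 ** 1.5,     # 0 for a symmetric distribution
        "kurtosis": m4 / m2 ** 2,       # 3 for a normal distribution
    }


# Hypothetical ages, for illustration only.
example = moments([19, 22, 24, 24, 25, 27, 30])
print(example)
```

A kurtosis below 3 from this function corresponds to a distribution flatter than the normal, which is the reading applied to the age and sales variables.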

➢ Frequency Distribution and Percentiles:

Sales: Percentiles range from 11 to 77.

The median (50th percentile) sales value is 38. 95% of sales fall below 76.
Age: Percentiles range from 19 to 30.
The median (50th percentile) age is 24. 95% of ages fall below 29.

➢ Comparative Descriptive Analysis:


By Gender: Gender 1 (presumably male) has a mean age of 24.13 and mean sales of 45.
Gender 2 (presumably female) has a mean age of 24.92 and mean sales of 42.58.
By Region: Region 1 has the highest mean sales (58.25) and a mean age of 24.5.
Region 2 has mean sales of 37.33 and a mean age of 22.5.
Region 3 has mean sales of 30.17 and a mean age of 26.83.

➢ Frequency Distribution by Categories:

Sales: Sales range from 11 to 77. The most common sales values are in the mid-range
(between 40 and 50).
Age: Ages range from 19 to 30. The most common age values are centered around 24.
Gender: 40% are in Gender 1, and 60% are in Gender 2.
Region: 40% are in Region 1, 30% in Region 2, and 30% in Region 3.

2. Hypothesis Testing
➢ One Sample t-test:

H0: The average salary of respondents is equal to 750.
H1: The average salary of respondents is different from 750.
We reject H0 and accept H1 because the p-value is less than 5%. The mean salary of the sample is
$713.5, which is significantly lower than the hypothesized mean of $750 (t = -2.4601, p = 0.0100).
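
The test statistic behind this result is t = (sample mean - hypothesized mean) / (s / sqrt(n)). A minimal sketch in Python follows; the function name and the salary list are illustrative assumptions, since the raw data are not reproduced in this report.

```python
import math
import statistics


def one_sample_t(values, mu0):
    """Return the one-sample t statistic and its degrees of freedom."""
    n = len(values)
    mean = statistics.mean(values)
    sd = statistics.stdev(values)          # sample standard deviation (n - 1)
    standard_error = sd / math.sqrt(n)
    return (mean - mu0) / standard_error, n - 1


# Hypothetical salaries, for illustration only.
salaries = [690, 720, 705, 735, 700, 715, 725, 710]
t_stat, df = one_sample_t(salaries, 750)
print(t_stat, df)
```

A negative t statistic, as in the reported result, means the sample mean lies below the hypothesized value.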

➢ Independent Sample t-test (Private Sector vs Government Sector):

H0: The efficiency of workers in the private and government sectors is the same.
H1: The efficiency of workers in the private and government sectors is not the same.
The mean for the private sector (42.55) is significantly higher than that for the government
sector (39.125), with a t-value of 2.6778 (p = 0.0105). Because the p-value is less than 5%, we
reject H0.

➢ Dependent Sample t-test (Before Training vs After Training):

H0: The mean value of the observations before and after training is the same.
H1: The mean value of the observations before and after training is not the same.
We reject H0 and accept H1 because the p-value is less than 5%, so the means before and after
training are not the same. The mean difference in salary (before minus after) is -3.45, meaning
that the mean after training is higher than the mean before training, and this difference is
statistically significant (t = -2.3913, p = 0.0273).

➢ One-way ANOVA Analysis (among Location 1, 2, and 3):

H0: The sales volumes in the three cities are the same.
H1: The sales volumes in the three cities are not the same.
Decision rule: reject H0 if the p-value is less than 5%; otherwise do not reject H0.
We reject H0 because Prob > F is less than 5%, and we conclude that the sales volumes in the
three cities are not the same (F = 35.52, p < 0.0001). Post hoc tests (e.g., Bonferroni, Scheffe)
indicate which pairs of locations differ.

➢ Two-way ANOVA Analysis (among educational qualifications and experience):

The overall model is significant (F = 8.24, p = 0.0005), and experience has a significant effect on
salary (F = 23.90, p < 0.0001). Education and the interaction between experience and education
are not significant.

➢ Chi-Square Test of Independence (Location vs Performance):

H0: Performance is independent of location.
H1: Performance is dependent on location.
Pr = 0.552, which is greater than 5%, so we do not reject H0. There is no significant association
between location and performance (χ² = 1.1902, p = 0.552), which suggests that location and
performance are independent in this sample.
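
As a quick check on the reported p-value, the upper-tail probability of a chi-square variable with 2 degrees of freedom has the closed form exp(-x/2). The sketch below assumes the test has 2 degrees of freedom (for example three locations and two performance categories); the report does not state the degrees of freedom, so this is an assumption that happens to reproduce the reported value.

```python
import math


def chi2_upper_tail_df2(statistic):
    """Upper-tail probability of a chi-square variable with 2 degrees of freedom."""
    return math.exp(-statistic / 2.0)


p_value = chi2_upper_tail_df2(1.1902)
print(round(p_value, 3))  # 0.552
```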

3. Correlation Analysis
➢ Bivariate/Pairwise Correlation:

Salary and Marks: Correlation coefficient: 0.7018

Interpretation: There is a strong positive correlation (0.7018) between salary and marks,
indicating that individuals with higher marks tend to have higher salaries.

Salary and Communication Awareness: Correlation coefficient: 0.8132

Interpretation: There is a very strong positive correlation (0.8132) between salary and
communication awareness. This suggests that higher communication awareness is associated
with higher salaries.

Salary and IQ: Correlation coefficient: 0.8444

Interpretation: There is a very strong positive correlation (0.8444) between salary and IQ,
indicating that individuals with higher IQ scores tend to have higher salaries.

Communication Awareness and IQ: Correlation coefficient: 0.4441

Interpretation: There is a moderate positive correlation (0.4441) between communication
awareness and IQ.

➢ Partial Correlation:

Partial correlation measures the relationship between two variables while controlling for the
influence of other variables. The partial correlations of salary with other variables are as follows:

Salary and Marks: Partial correlation: 0.3982

Interpretation: After controlling for other variables, the partial correlation suggests a moderate
positive relationship between salary and marks.

Salary and Communication Awareness: Partial correlation: 0.4527

Interpretation: After controlling for other variables, the partial correlation indicates a moderate
positive relationship between salary and communication awareness.

Salary and Awareness: Partial correlation: 0.2415

Interpretation: After controlling for other variables, the partial correlation suggests a weak
positive relationship between salary and awareness.

Salary and IQ: Partial correlation: 0.3493

Interpretation: After controlling for other variables, the partial correlation indicates a moderate
positive relationship between salary and IQ.

4. Multiple Linear Regression Analysis:

➢ Coefficients and Significance:

Size (coef = 0.2091, p = 0.003): The coefficient is positive, suggesting that as the size increases,
the performance tends to increase. The p-value indicates that the size variable is statistically
significant.

Age (coef = 0.1897, p = 0.083): The coefficient is positive, suggesting that as age increases,
performance tends to increase. However, the p-value is greater than 0.05, indicating that age is
not statistically significant at the 5% level.

Intercept (coef = -0.3568, p = 0.939): The intercept is not statistically different from zero.

➢ Model Fit:

Adjusted R-squared = 0.1375: The adjusted R-squared suggests that the model explains
approximately 13.75% of the variance in the dependent variable.

F-Statistic (p = 0.0116): The F-statistic tests the overall significance of the model. The p-value is
less than 0.05, indicating that at least one independent variable is statistically significant in
predicting the dependent variable.

➢ Estimation of Yhat and Residual:

Yhat (predicted values): The Yhat values have been estimated for each observation.

Residuals (prediction errors): Residuals have been calculated as the difference between the
observed and predicted values.

➢ Diagnostic Statistical Tests:

▪ Normality Test:

➢ Shapiro-Wilk and Shapiro-Francia Tests for Residuals: Both tests show non-significant
results (p-values > 0.05), so there is no evidence against normality of the residuals.

➢ Skewness/Kurtosis tests for Normality for Residuals: The joint test indicates a
non-significant result (p-value > 0.05), suggesting that the skewness and kurtosis of the
residuals are not significantly different from those of a normal distribution.

➢ Histogram and Kernel Density Plot for Residuals: The histogram and kernel density plot
of residuals also visually support the normality assumption.

Normality Tests for Other Variables (Performance, Age, Size):

➢ Skewness/Kurtosis tests for Normality: The joint tests show mixed results. Age and size
have significant p-values, indicating non-normality, while performance does not exhibit a
significant departure from normality.

➢ Shapiro-Wilk Tests for Normality: Similar to the joint tests, the Shapiro-Wilk tests show
that performance is borderline significant, while age and size are significantly non-normal.

In conclusion, the residuals from the multiple linear regression model appear to be normally
distributed, satisfying one of the key assumptions of linear regression. Some of the predictor
variables (age and size) are not normally distributed; linear regression places the normality
assumption on the residuals rather than on the predictors, so this does not by itself invalidate
the model. Further investigation or transformation of these variables might still be considered
to address their departures from normality.

▪ Heteroscedasticity Test:

H0: The model has constant variance (homoscedasticity).
H1: The model does not have constant variance (heteroscedasticity).
Breusch-Pagan / Cook-Weisberg test: We do not reject H0 because the p-value (0.0825) is greater
than 5%, so there is no significant evidence of heteroscedasticity at the 5% level.

▪ Functional Form Test (Omitted Variable Bias):

H0: The model has no omitted variables.
H1: The model has omitted variables.
Ramsey RESET test: We do not reject H0 because Prob > F (approximately 0.35) is greater than 5%,
so there is no strong evidence of omitted variable bias.

▪ Multicollinearity Test:

H0: There is no multicollinearity between the variables.
H1: There is multicollinearity between the variables.
Variance Inflation Factor (VIF): The mean VIF is 1.20, which is far below the common threshold
of 10, and the individual VIF values are close to 1, so the model has no multicollinearity problem.
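
The VIF of a predictor is 1 / (1 - R²), where R² comes from regressing that predictor on the other predictors. The sketch below is illustrative only; it assumes the model has exactly two predictors (size and age), in which case both VIFs are equal and the mean VIF of 1.20 can be inverted to the implied R² between them.

```python
def vif_from_r2(r_squared):
    """VIF of a predictor given the R-squared from regressing it on the others."""
    if not 0.0 <= r_squared < 1.0:
        raise ValueError("r_squared must lie in [0, 1)")
    return 1.0 / (1.0 - r_squared)


def r2_from_vif(vif):
    """Invert the VIF formula to recover the auxiliary R-squared."""
    return 1.0 - 1.0 / vif


# With two predictors, a VIF of 1.20 implies this shared R-squared.
implied_r2 = r2_from_vif(1.20)
print(round(implied_r2, 4))  # 0.1667
```

An implied R² of about 0.17 between size and age corresponds to a correlation of roughly 0.41 in absolute value, which is consistent with low multicollinearity.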

5. Qualitative Response Models Analysis

➢ Logistic Regression (Logit) Analysis:

Model: The logit regression model was applied to analyze the relationship between the
performance (dependent variable) and the size and age of entities.

Results: The log-likelihood increased from -34.62 to -11.48 across iterations, indicating model
improvement. The LR chi2(2) test is significant (p < 0.001), suggesting that the model as a
whole is statistically significant. Pseudo R2 of 0.67 indicates a good fit.
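
The pseudo R² and the LR statistic both follow from the two log-likelihood values. The sketch below assumes that -34.62 is the log-likelihood of the constant-only model (iteration 0) and -11.48 that of the fitted model, and that the pseudo R² is McFadden's measure; the function names are illustrative.

```python
def lr_chi2(ll_null, ll_model):
    """Likelihood-ratio statistic comparing the fitted and constant-only models."""
    return 2.0 * (ll_model - ll_null)


def mcfadden_r2(ll_null, ll_model):
    """McFadden pseudo R-squared: 1 - ll_model / ll_null."""
    return 1.0 - ll_model / ll_null


lr_stat = lr_chi2(-34.62, -11.48)
pseudo_r2 = mcfadden_r2(-34.62, -11.48)
print(round(lr_stat, 2), round(pseudo_r2, 2))  # 46.28 0.67
```

The computed pseudo R² of 0.67 agrees with the value reported for the logit model.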

Variable Coefficients: Size (coef: 0.32, p < 0.01) and age (coef: 0.19, p < 0.01) are positively
associated with performance. The intercept (_cons) is negative; it is the log-odds of the
outcome when size and age are both zero.

➢ Probit Regression Analysis:

Model: The probit regression model was used to analyze the same relationship as in the logit
model.

Results: Similar to logit, LR chi2(2) is significant (p < 0.001), indicating overall model
significance. Pseudo R2 is 0.68, suggesting a good fit.

Variable Coefficients: Size (coef: 0.19, p < 0.01) and age (coef: 0.11, p < 0.01) positively
influence performance. The intercept (_cons) is negative; it is the value of the probit index
when size and age are both zero.

➢ Tobit Regression Analysis:

Model: Tobit regression was used to analyze censored data regarding performance, considering
size and age.

Results: LR chi2(2) is significant (p < 0.001), indicating model significance. Pseudo R2 is 0.56,
suggesting a moderate fit.

Variable Coefficients: Size (coef: 0.014, p < 0.001) and age (coef: 0.015, p < 0.001) positively
impact performance. The intercept (_cons) is negative; it is the predicted value of the latent
performance variable when size and age are both zero. /sigma is the estimated standard deviation
of the error term in the latent-variable equation.

Part II: Time-Series Data Analysis using EViews

1. Descriptive Statistics
➢ Measures of Central Tendency:

              LNRGDP     LNGFCF     LNEXPO     LNEXD      LNEHE      LNAID      INF
Mean          12.13975   10.61725   9.818750   9.571250   7.476000   8.877750   9.697500
Median        11.93500   10.52500   9.665000   10.02500   7.275000   9.055000   7.500000
Maximum       13.43000   12.47000   11.72000   12.49000   10.94000   11.20000   36.40000
Minimum       11.54000   9.560000   8.460000   6.370000   5.340000   6.790000   -10.60000
Std. Dev.     0.555469   0.775100   0.885925   1.605901   1.668928   1.312427   10.18585
Skewness      0.885804   0.661192   0.588851   -0.352645  0.552152   -0.036467  0.492548
Kurtosis      2.620992   2.566188   2.248437   2.327600   2.182485   1.802270   3.462591

Jarque-Bera   5.470401   3.228153   3.253045   1.582595   3.146365   2.399796   1.974006
Probability   0.064881   0.199074   0.196612   0.453256   0.207384   0.301225   0.372692

Sum           485.5900   424.6900   392.7500   382.8500   299.0400   355.1100   387.9000
Sum Sq. Dev.  12.03330   23.43040   30.60964   100.5778   108.6276   67.17610   4046.310

Observations  40         40         40         40         40         40         40

Mean: The average value of each variable over the observation period.

The mean of LNRGDP is 12.14, LNGFCF is 10.62, LNEXPO is 9.82, LNEXD is 9.57, LNEHE
is 7.48, LNAID is 8.88 and INF is 9.70.

Median: The middle value of each variable when arranged in ascending order.

The medians are: LNRGDP about 11.94, LNGFCF about 10.53, LNEXPO about 9.67, LNEXD about 10.03,
LNEHE about 7.28, LNAID about 9.06 and INF 7.50.

➢ Measures of Dispersion:

Maximum and Minimum: The highest and lowest values observed for each variable.

The maximum and minimum values of LNRGDP, LNGFCF, LNEXPO, LNEXD, LNEHE,
LNAID and INF are 13.43 & 11.54, 12.47 & 9.56, 11.72 & 8.46, 12.49 & 6.37, 10.94 & 5.34,
11.20 & 6.79 and 36.40 & -10.60 respectively.

Standard Deviation: A measure of the amount of variation or dispersion in the dataset.

The standard deviations of LNRGDP, LNGFCF, LNEXPO, LNEXD, LNEHE, LNAID and INF are 0.56, 0.78,
0.89, 1.61, 1.67, 1.31 and 10.19 respectively. INF has by far the largest standard deviation
(10.19); among the log-transformed variables, LNEHE has the highest (approximately 1.67).

➢ Measures of Distribution:
▪ Skewness: Indicates the asymmetry of the distribution. Positive skewness (LNRGDP,
LNGFCF, LNEXPO, LNEHE, INF) indicates a right-skewed distribution; LNEXD and LNAID
have negative skewness and are slightly left-skewed.
▪ Kurtosis: Measures the "tailedness" of the distribution relative to the normal value of 3.
All variables except INF have kurtosis below 3, suggesting lighter tails than a normal
distribution; INF (3.46) has slightly heavier tails.
▪ Jarque-Bera Test: Tests whether the data follow a normal distribution. The higher the
statistic, the more the distribution diverges from normal. For all variables, the p-values
are greater than 0.05, suggesting that they do not significantly depart from normality.
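
The Jarque-Bera statistic combines skewness S and kurtosis K as JB = (n/6) × (S² + (K - 3)²/4) and is compared with a chi-square distribution with 2 degrees of freedom, whose upper-tail probability is exp(-JB/2). The short sketch below reproduces the LNRGDP figures from the table; the function name is illustrative.

```python
import math


def jarque_bera(n, skewness, kurtosis):
    """Return the Jarque-Bera statistic and its chi-square(2) p-value."""
    jb = n / 6.0 * (skewness ** 2 + (kurtosis - 3.0) ** 2 / 4.0)
    p_value = math.exp(-jb / 2.0)   # upper tail of chi-square with 2 df
    return jb, p_value


# LNRGDP: 40 observations, skewness 0.885804, kurtosis 2.620992.
jb, p_value = jarque_bera(40, 0.885804, 2.620992)
print(round(jb, 2), round(p_value, 3))  # 5.47 0.065
```

The result matches the reported Jarque-Bera value of 5.470401 and probability of 0.064881 for LNRGDP.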

2. Unit Root Test (Stationarity Test)

➢ Augmented Dickey-Fuller (ADF) Test:

UNIT ROOT TEST RESULTS TABLE (ADF)

Null Hypothesis: the variable has a unit root

At Level
                                     LNRGDP     LNGFCF     LNEXPO     LNEXD      LNEHE      LNAID      INF
With Constant           t-Statistic  3.2206     2.6018     0.7327     -0.9446    3.2294     -1.0715    -1.6933
                        Prob.        1.0000     1.0000     0.9914     0.7631     1.0000     0.7174     0.4262
                                     n0         n0         n0         n0         n0         n0         n0
With Constant & Trend   t-Statistic  1.0331     -2.1530    -1.9384    -3.1483    -0.1891    -3.6496    -1.7723
                        Prob.        0.9998     0.5013     0.6150     0.1106     0.9911     0.0390     0.6977
                                     n0         n0         n0         n0         n0         **         n0

At First Difference
                                     d(LNRGDP)  d(LNGFCF)  d(LNEXPO)  d(LNEXD)   d(LNEHE)   d(LNAID)   d(INF)
With Constant           t-Statistic  -1.9055    -7.9327    -5.2222    -5.0934    -4.0044    -4.2229    -8.7776
                        Prob.        0.3262     0.0000     0.0001     0.0002     0.0036     0.0021     0.0000
                                     n0         ***        ***        ***        ***        ***        ***
With Constant & Trend   t-Statistic  -6.6128    -4.4115    -5.4835    -5.0214    -3.4666    -4.1609    -8.7882
                        Prob.        0.0000     0.0066     0.0003     0.0012     0.0599     0.0122     0.0000
                                     ***        ***        ***        ***        *          **         ***

Notes:
a: (*) Significant at the 10%; (**) Significant at the 5%; (***) Significant at the 1% and (n0) Not Significant
b: Lag Length based on AIC
c: Probability based on MacKinnon (1996) one-sided p-values.

At Level:

With a constant: For all variables (LNRGDP, LNGFCF, LNEXPO, LNEXD, LNEHE, LNAID, INF) the
p-values are greater than 0.05, so the null hypothesis of a unit root cannot be rejected; all
variables are non-stationary at level.

With a constant and trend: Only LNAID is stationary at the 5% level (p = 0.0390); the other
variables have a unit root (p-values > 0.05).

At First Difference:

With a constant: All variables except LNRGDP (p = 0.3262), that is LNGFCF, LNEXPO, LNEXD,
LNEHE, LNAID and INF, are stationary (p-values < 0.05).

With a constant and trend: All variables are stationary at the 5% level except LNEHE, which is
stationary only at the 10% level (p = 0.0599).
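
Because the ADF null hypothesis is a unit root, a small p-value means the series is stationary. The sketch below applies the star convention from the table notes to the first-difference, constant-only p-values; the dictionary and function names are illustrative.

```python
# ADF p-values at first difference, with constant (from the table above).
ADF_FIRST_DIFF_CONSTANT = {
    "LNRGDP": 0.3262, "LNGFCF": 0.0000, "LNEXPO": 0.0001, "LNEXD": 0.0002,
    "LNEHE": 0.0036, "LNAID": 0.0021, "INF": 0.0000,
}


def significance(p_value):
    """Map a p-value to the star notation used in the unit root tables."""
    if p_value < 0.01:
        return "***"
    if p_value < 0.05:
        return "**"
    if p_value < 0.10:
        return "*"
    return "n0"


def stationary_at(p_values, alpha=0.05):
    """Series whose unit-root null is rejected at the given level."""
    return [name for name, p in p_values.items() if p < alpha]


print(stationary_at(ADF_FIRST_DIFF_CONSTANT))
```

Only LNRGDP is left out of the printed list, which is why its first difference is flagged n0 in the constant-only specification.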

➢ Phillips-Perron (PP) Test:

UNIT ROOT TEST RESULTS TABLE (PP)

Null Hypothesis: the variable has a unit root

At Level
                                     LNRGDP     LNGFCF     LNEXPO     LNEXD      LNEHE      LNAID      INF
With Constant           t-Statistic  8.0179     2.8194     1.3220     -0.9769    3.6504     -1.0786    -4.1354
                        Prob.        1.0000     1.0000     0.9983     0.7520     1.0000     0.7146     0.0025
                                     n0         n0         n0         n0         n0         n0         ***
With Constant & Trend   t-Statistic  0.9705     -2.1343    -1.4248    -2.1886    -0.3007    -2.3320    -4.1656
                        Prob.        0.9998     0.5113     0.8377     0.4824     0.9878     0.4077     0.0112
                                     n0         n0         n0         n0         n0         n0         **

At First Difference
                                     d(LNRGDP)  d(LNGFCF)  d(LNEXPO)  d(LNEXD)   d(LNEHE)   d(LNAID)   d(INF)
With Constant           t-Statistic  -4.3960    -8.4930    -5.1493    -5.1152    -3.9998    -6.0232    -9.2348
                        Prob.        0.0012     0.0000     0.0001     0.0002     0.0036     0.0000     0.0000
                                     ***        ***        ***        ***        ***        ***        ***
With Constant & Trend   t-Statistic  -6.1563    -11.2515   -6.8779    -5.0448    -5.0918    -6.0215    -9.0811
                        Prob.        0.0000     0.0000     0.0000     0.0011     0.0010     0.0001     0.0000
                                     ***        ***        ***        ***        ***        ***        ***

Notes:
a: (*) Significant at the 10%; (**) Significant at the 5%; (***) Significant at the 1% and (n0) Not Significant
b: Lag Length based on AIC
c: Probability based on MacKinnon (1996) one-sided p-values.

At Level:

With a constant: Only INF is stationary (p = 0.0025); for all other variables the unit-root null
cannot be rejected (p-values > 0.05).

With a constant and trend: Again only INF is stationary (p = 0.0112); all other variables have
a unit root (p-values > 0.05).

At First Difference:

With a constant: All variables are stationary.

With a constant and trend: All variables are stationary.

➢ Kwiatkowski-Phillips-Schmidt-Shin (KPSS) Test:

UNIT ROOT TEST RESULTS TABLE (KPSS)

Null Hypothesis: the variable is stationary

At Level
                                     LNRGDP     LNGFCF     LNEXPO     LNEXD      LNEHE      LNAID      INF
With Constant           t-Statistic  0.7276     0.7442     0.7085     0.7576     0.7606     0.7347     0.1691
                        Prob.        **         ***        **         ***        ***        **         n0
With Constant & Trend   t-Statistic  0.1998     0.2059     0.1967     0.1123     0.1970     0.1242     0.1383
                        Prob.        **         **         **         n0         **         *          *

At First Difference
                                     d(LNRGDP)  d(LNGFCF)  d(LNEXPO)  d(LNEXD)   d(LNEHE)   d(LNAID)   d(INF)
With Constant           t-Statistic  0.6870     0.4544     0.3036     0.0795     0.6451     0.1218     0.0212
                        Prob.        **         *          n0         n0         **         n0         n0
With Constant & Trend   t-Statistic  0.1312     0.2403     0.1152     0.0702     0.0510     0.0924     0.0181
                        Prob.        *          ***        n0         n0         n0         n0         n0

Notes:
a: (*) Significant at the 10%; (**) Significant at the 5%; (***) Significant at the 1% and (n0) Not Significant
b: Lag Length based on AIC
c: Probability based on Kwiatkowski-Phillips-Schmidt-Shin (1992, Table 1)

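At Level:

With a constant: The null hypothesis of stationarity is rejected at the 5% level for all
variables except INF, so LNRGDP, LNGFCF, LNEXPO, LNEXD, LNEHE and LNAID are non-stationary at
level, while INF is stationary.

With a constant and trend: Stationarity is rejected at the 5% level for LNRGDP, LNGFCF, LNEXPO
and LNEHE. It is not rejected at the 5% level for LNEXD, LNAID and INF (LNAID and INF reject
only at the 10% level).

At First Difference:

With a constant: Stationarity is rejected at the 5% level only for d(LNRGDP) and d(LNEHE); the
remaining variables are stationary at the 5% level (d(LNGFCF) rejects only at the 10% level).

With a constant and trend: Stationarity is rejected only for d(LNGFCF) (at the 1% level); all
other variables are stationary at the 5% level (d(LNRGDP) rejects only at the 10% level).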
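
Unlike ADF and PP, the KPSS test has stationarity as its null hypothesis, so a large statistic is evidence of non-stationarity. The sketch below compares a KPSS statistic with the asymptotic critical values from Kwiatkowski-Phillips-Schmidt-Shin (1992, Table 1); the function and dictionary names are illustrative.

```python
# Asymptotic critical values at the 10%, 5% and 1% levels (KPSS 1992, Table 1).
KPSS_CRITICAL = {
    "constant": (0.347, 0.463, 0.739),
    "trend": (0.119, 0.146, 0.216),
}


def kpss_flag(statistic, specification):
    """Return the star flag for a KPSS statistic; stars mean stationarity is rejected."""
    crit_10, crit_5, crit_1 = KPSS_CRITICAL[specification]
    if statistic > crit_1:
        return "***"
    if statistic > crit_5:
        return "**"
    if statistic > crit_10:
        return "*"
    return "n0"


print(kpss_flag(0.7276, "constant"))  # LNRGDP at level
print(kpss_flag(0.1691, "constant"))  # INF at level
```

Here "n0" means that stationarity is not rejected, which is the opposite reading from an "n0" in the ADF and PP tables.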

3. Optimum Lag Length Determination

➢ Using LR, FPE, AIC, SIC and HQC

VAR Lag Order Selection Criteria

Endogenous variables: LNRGDP LNGFCF LNEXPO LNEXD LNEHE LNAID INF
Exogenous variables: C
Date: 12/19/23 Time: 21:07
Sample: 1974 2013
Included observations: 38

Lag LogL LR FPE AIC SC HQ

0 -167.1257 NA 2.25e-05 9.164509 9.466170 9.271837

1 49.06070 341.3469* 3.55e-09* 0.365226* 2.778511* 1.223855*
2 94.78917 55.35552 5.54e-09 0.537412 5.062321 2.147340

* indicates lag order selected by the criterion

LR: sequential modified LR test statistic (each test at 5% level)
FPE: Final prediction error
AIC: Akaike information criterion
SC: Schwarz information criterion
HQ: Hannan-Quinn information criterion

The presented results are the lag order selection criteria of a vector autoregression (VAR)
model for lag orders 0, 1 and 2, with the endogenous variables (LNRGDP, LNGFCF, LNEXPO, LNEXD,
LNEHE, LNAID, INF) and a constant (C) as the only exogenous term. The criteria used are the
sequential modified likelihood ratio test (LR), final prediction error (FPE), Akaike information
criterion (AIC), Schwarz information criterion (SC), and Hannan-Quinn information criterion (HQ).
The lag order selected by each criterion is marked with an asterisk (*).

Likelihood Ratio Test (LR):

Lag 0: NA (Not Applicable)

Lag 1: LR = 341.35* (significant at 5% level)

Lag 2: LR = 55.36

Final Prediction Error (FPE):

Lag 0: 2.25e-05

Lag 1: FPE = 3.55e-09* (lower FPE is better)

Lag 2: FPE = 5.54e-09

Akaike Information Criterion (AIC):

Lag 0: 9.16

Lag 1: AIC = 0.37* (lower AIC is better)

Lag 2: AIC = 0.54

Schwarz Information Criterion (SC):

Lag 0: 9.47

Lag 1: SC = 2.78* (lower SC is better)

Lag 2: SC = 5.06

Hannan-Quinn Information Criterion (HQC):

Lag 0: 9.27

Lag 1: HQC = 1.22* (lower HQC is better)

Lag 2: HQC = 2.15

Interpretation:

Based on the lag order selection criteria, lag 1 appears to be the optimum lag length for this VAR
model. This conclusion is supported by the significant likelihood ratio test, the lowest final
prediction error, and the lower values of AIC, SC, and HQC compared to lag 0 and lag 2.

It's important to note that the choice of lag order can significantly impact the results and
interpretation of a VAR model. In this case, lag 1 is preferred for its better fit based on the
specified criteria. Further analysis and model diagnostics should be conducted to ensure the
reliability of the chosen lag order.
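
For FPE, AIC, SC and HQ the selected lag is simply the one with the smallest criterion value. The following sketch applies that rule to the figures in the table; the dictionary layout and function name are illustrative assumptions.

```python
# Criterion values by lag, copied from the VAR lag order selection table.
CRITERIA = {
    "FPE": {0: 2.25e-05, 1: 3.55e-09, 2: 5.54e-09},
    "AIC": {0: 9.164509, 1: 0.365226, 2: 0.537412},
    "SC": {0: 9.466170, 1: 2.778511, 2: 5.062321},
    "HQ": {0: 9.271837, 1: 1.223855, 2: 2.147340},
}


def best_lag(values_by_lag):
    """Return the lag with the smallest criterion value."""
    return min(values_by_lag, key=values_by_lag.get)


selected = {name: best_lag(values) for name, values in CRITERIA.items()}
print(selected)  # {'FPE': 1, 'AIC': 1, 'SC': 1, 'HQ': 1}
```

The LR criterion is handled differently because it is a sequential test rather than a minimum, so it is not included in this sketch.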

4. Co-integration Test
➢ ARDL Bounds Test:

ARDL Bounds Test

Date: 12/19/23 Time: 13:48
Sample: 1976 2013
Included observations: 38
Null Hypothesis: No long-run relationships exist

Test Statistic Value k

F-statistic 2.960692 6

Critical Value Bounds


Significance I0 Bound I1 Bound

10% 2.12 3.23

5% 2.45 3.61
2.5% 2.75 3.99
1% 3.15 4.43

Wald Test:
Equation: Untitled

Test Statistic Value df Probability

F-statistic 10.76525 (6, 23) 0.0000

Chi-square 64.59147 6 0.0000

Null Hypothesis: C(2)=C(3)=C(6)=C(9)=C(10)=C(12)=0

Null Hypothesis Summary:

Normalized Restriction (= 0) Value Std. Err.

C(2) 0.183663 0.044881

C(3) 0.006744 0.045834
C(6) -0.042454 0.021099
C(9) 0.167963 0.065460
C(10) -0.052981 0.021957
C(12) 0.001099 0.000786

Restrictions are linear in coefficients.

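Test Statistic: The bounds-test F-statistic is 2.960692 with k = 6 regressors.

Critical Value Bounds: The F-statistic is compared to the critical value bounds at different
significance levels. At the 10%, 5% and 2.5% levels it lies between the I0 and I1 bounds (for
example, 2.45 < 2.96 < 3.61 at 5%), so the test is inconclusive at those levels. At the 1%
level it is below the I0 bound of 3.15, so the null hypothesis of no long-run relationship
cannot be rejected.

Wald Test: The Wald test assesses the joint hypothesis that certain coefficients are zero.
F-statistic: 10.76525 with (6, 23) degrees of freedom, p-value: 0.0000.

Null Hypothesis Summary: The coefficients C(2), C(3), C(6), C(9), C(10) and C(12) are tested
jointly for being equal to zero. The joint p-value is less than 0.05, so the null hypothesis that
all six coefficients are zero is rejected. This is a joint result and does not imply that each
coefficient is individually significant.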
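
The bounds-test decision rule has three outcomes: reject the null if F is above the I1 bound, do not reject if F is below the I0 bound, and treat the test as inconclusive in between. A minimal sketch using the bounds reported above follows; the names are illustrative.

```python
# (I0 bound, I1 bound) by significance level, from the Critical Value Bounds table.
BOUNDS = {
    "10%": (2.12, 3.23),
    "5%": (2.45, 3.61),
    "2.5%": (2.75, 3.99),
    "1%": (3.15, 4.43),
}


def bounds_decision(f_statistic, lower, upper):
    """Classify an ARDL bounds-test F-statistic against the I0 and I1 bounds."""
    if f_statistic > upper:
        return "cointegration"
    if f_statistic < lower:
        return "no cointegration"
    return "inconclusive"


for level, (lower, upper) in BOUNDS.items():
    print(level, bounds_decision(2.960692, lower, upper))
```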

➢ Johansen Co-integration Test:

Date: 12/19/23 Time: 21:21

Sample (adjusted): 1977 2013
Included observations: 37 after adjustments
Trend assumption: Linear deterministic trend
Series: LNRGDP LNGFCF LNEXPO LNEXD LNEHE LNAID INF
Lags interval (in first differences): 1 to 2

Unrestricted Cointegration Rank Test (Trace)

Hypothesized Trace 0.05
No. of CE(s) Eigenvalue Statistic Critical Value Prob.**

None * 0.789434 200.9355 125.6154 0.0000

At most 1 * 0.720882 143.2911 95.75366 0.0000
At most 2 * 0.703474 96.07462 69.81889 0.0001
At most 3 * 0.524104 51.09665 47.85613 0.0240
At most 4 0.377024 23.62206 29.79707 0.2169
At most 5 0.151112 6.111896 15.49471 0.6823
At most 6 0.001358 0.050263 3.841466 0.8226

Trace test indicates 4 cointegrating eqn(s) at the 0.05 level

* denotes rejection of the hypothesis at the 0.05 level
**MacKinnon-Haug-Michelis (1999) p-values

Unrestricted Cointegration Rank Test (Maximum Eigenvalue)

Hypothesized Max-Eigen 0.05

No. of CE(s) Eigenvalue Statistic Critical Value Prob.**

None * 0.789434 57.64431 46.23142 0.0021

At most 1 * 0.720882 47.21652 40.07757 0.0067
At most 2 * 0.703474 44.97797 33.87687 0.0016
At most 3 0.524104 27.47459 27.58434 0.0516
At most 4 0.377024 17.51016 21.13162 0.1493
At most 5 0.151112 6.061633 14.26460 0.6053
At most 6 0.001358 0.050263 3.841466 0.8226

Max-eigenvalue test indicates 3 cointegrating eqn(s) at the 0.05 level

* denotes rejection of the hypothesis at the 0.05 level
**MacKinnon-Haug-Michelis (1999) p-values

Unrestricted Cointegration Rank Test (Trace):

Tests for the number of cointegrating equations using the trace statistic.

The test indicates 4 cointegrating equations at the 0.05 level.

Unrestricted Cointegration Rank Test (Maximum Eigenvalue):

Another test for the number of cointegrating equations using the maximum eigenvalue.

This test suggests 3 cointegrating equations at the 0.05 level.

Interpretation: The results suggest that there are long-run relationships among the variables.

The ARDL Bounds Test on its own is inconclusive at the 5% level, because the F-statistic (2.96)
lies between the I0 and I1 bounds (2.45 and 3.61).

The Johansen Co-integration Test suggests the presence of 3 to 4 cointegrating equations.
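
Both Johansen statistics are functions of the estimated eigenvalues: the max-eigenvalue statistic for rank r is -T × ln(1 - λ(r+1)), and the trace statistic is the sum of these terms from r+1 onward. The sketch below recomputes them from the eigenvalues and the 37 included observations reported above; small differences from the EViews output come from the rounding of the eigenvalues.

```python
import math

EIGENVALUES = [0.789434, 0.720882, 0.703474, 0.524104, 0.377024, 0.151112, 0.001358]
OBSERVATIONS = 37  # included observations after adjustments


def max_eigen_stats(eigenvalues, t):
    """Max-eigenvalue statistic for each hypothesized rank."""
    return [-t * math.log(1.0 - lam) for lam in eigenvalues]


def trace_stats(eigenvalues, t):
    """Trace statistic for each hypothesized rank (sum of the remaining terms)."""
    terms = max_eigen_stats(eigenvalues, t)
    return [sum(terms[r:]) for r in range(len(terms))]


max_stats = max_eigen_stats(EIGENVALUES, OBSERVATIONS)
trace = trace_stats(EIGENVALUES, OBSERVATIONS)
print(round(max_stats[0], 2), round(trace[0], 2))
```

The first values come out close to the reported 57.64 (max-eigenvalue) and 200.94 (trace) for the hypothesis of no cointegrating equation.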


5. Long-run and Short-run Model Estimation

➢ Case 1: Using ARDL Approach

ARDL Cointegrating And Long Run Form

Dependent Variable: LNRGDP
Selected Model: ARDL(1, 0, 2, 2, 0, 1, 2)
Date: 12/19/23 Time: 14:39
Sample: 1974 2013
Included observations: 38

Cointegrating Form

Variable Coefficient Std. Error t-Statistic Prob.

D(LNGFCF) 0.183663 0.044881 4.092233 0.0004

D(LNEXPO) 0.006744 0.045834 0.147132 0.8843
D(LNEXPO(-1)) -0.076854 0.036535 -2.103602 0.0466
D(LNEXD) -0.042454 0.021099 -2.012179 0.0561
D(LNEXD(-1)) 0.050479 0.022078 2.286333 0.0318
D(LNEHE) 0.167963 0.065460 2.565875 0.0173
D(LNAID) -0.052981 0.021957 -2.412910 0.0242
D(INF) 0.001099 0.000786 1.397514 0.1756
D(INF(-1)) 0.001095 0.000718 1.525293 0.1408
CointEq(-1) -0.630755 0.133659 -4.719128 0.0001

Cointeq = LNRGDP - (0.2912*LNGFCF + 0.0010*LNEXPO - 0.0809*LNEXD
+ 0.2663*LNEHE + 0.0108*LNAID + 0.0018*INF + 7.7382)

Long Run Coefficients

Variable Coefficient Std. Error t-Statistic Prob.

LNGFCF 0.291179 0.075103 3.877050 0.0008

LNEXPO 0.000993 0.049915 0.019901 0.9843
LNEXD -0.080894 0.022525 -3.591263 0.0015
LNEHE 0.266288 0.066655 3.995004 0.0006
LNAID 0.010785 0.036939 0.291979 0.7729
INF 0.001787 0.002436 0.733592 0.4706
C 7.738165 0.715410 10.816402 0.0000

Cointegrating Form:

The coefficient for the variable D(LNGFCF) is 0.183663, and it is statistically significant (t-
statistic = 4.092233, p-value = 0.0004).

The coefficient for the variable D(LNEXPO(-1)) is -0.076854, and it is statistically significant
(t-statistic = -2.103602, p-value = 0.0466).

The coefficient for the variable D(LNEXD(-1)) is 0.050479, and it is statistically significant (t-
statistic = 2.286333, p-value = 0.0318).

The coefficient for the variable D(LNEHE) is 0.167963, and it is statistically significant (t-
statistic = 2.565875, p-value = 0.0173).

The coefficient for the variable D(LNAID) is -0.052981, and it is statistically significant (t-
statistic = -2.412910, p-value = 0.0242).

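The coefficient of the error-correction term CointEq(-1) is -0.630755; it is negative and
statistically significant (t-statistic = -4.719128, p-value = 0.0001), which implies that about
63% of a deviation from the long-run equilibrium is corrected each year.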

The cointegrating equation is:

CointEq=LNRGDP−(0.2912×LNGFCF+0.0010×LNEXPO−0.0809×LNEXD+0.2663×LNEHE+
0.0108×LNAID+0.0018×INF+7.7382)

Long Run Coefficients:

The coefficient for LNGFCF is 0.291179, indicating a positive relationship with the dependent
variable, and it is statistically significant (t-statistic = 3.877050, p-value = 0.0008).

The coefficient for LNEXD is -0.080894, indicating a negative relationship with the dependent
variable, and it is statistically significant (t-statistic = -3.591263, p-value = 0.0015).

The coefficient for LNEHE is 0.266288, indicating a positive relationship with the dependent
variable, and it is statistically significant (t-statistic = 3.995004, p-value = 0.0006).

The constant C is 7.738165, and it is statistically significant (t-statistic = 10.816402,
p-value = 0.0000).

Overall, the cointegrating equation and long-run coefficients suggest that variables such as
LNGFCF, LNEXD, and LNEHE have a significant impact on the long-run behavior of the
dependent variable LNRGDP. The signs of the coefficients provide insights into the direction of
these relationships.
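
The long-run coefficients are linked to the short-run estimates through the error-correction coefficient. For a regressor that enters the ARDL model with zero lags (here LNGFCF and LNEHE), the long-run coefficient equals its short-run coefficient divided by the absolute value of the CointEq(-1) coefficient. The sketch below checks this against the reported tables; the half-life calculation is an added illustration that assumes simple first-order geometric adjustment, and the function names are hypothetical.

```python
import math

ECT = -0.630755  # coefficient of CointEq(-1)


def long_run_coefficient(short_run_sum, ect):
    """Long-run coefficient from the summed level coefficients and the ECT."""
    return short_run_sum / -ect


def half_life(ect):
    """Periods needed to close half of a disequilibrium (first-order adjustment)."""
    return math.log(0.5) / math.log(1.0 + ect)


print(round(long_run_coefficient(0.183663, ECT), 4))  # LNGFCF
print(round(long_run_coefficient(0.167963, ECT), 4))  # LNEHE
print(round(half_life(ECT), 2))
```

The first two values agree with the long-run coefficients of 0.2912 and 0.2663 reported for LNGFCF and LNEHE.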

➢ Case 2: Using Johansen Co-integration Approach

Date: 12/20/23 Time: 15:42

Sample: 1974 2013
Included observations: 40
Lags interval (in first differences): 1 to 2
Endogenous variables: LNRGDP LNGFCF LNEXPO LNEXD LNEHE LNAID INF
Deterministic assumptions: Case 3 (Johansen-Hendry-Juselius): Cointegrating
relationship includes a constant. Short-run dynamics include a constant.

Unrestricted Cointegration Rank Test (Trace)

Hypothesized Trace 0.05
No. of CE(s) Eigenvalue Statistic Critical Value Prob.**

None * 0.789434 200.9355 125.6154 0.0000

Trace test indicates 4 cointegrating equation(s) at the 0.05 level

* denotes rejection of the hypothesis at the 0.05 level
**MacKinnon-Haug-Michelis (1999) p-values

Unrestricted Cointegration Rank Test (Max-eigenvalue)

Hypothesized Max-Eigen 0.05
No. of CE(s) Eigenvalue Statistic Critical Value Prob.**

None * 0.789434 57.64431 46.23142 0.0021

Max-eigenvalue test indicates 3 cointegrating equation(s) at the 0.05 level

* denotes rejection of the hypothesis at the 0.05 level
**MacKinnon-Haug-Michelis (1999) p-values

Trace Test Results:

The null hypothesis in each row is that the number of cointegrating equations (CE) is at most
the hypothesized number (none, at most 1, at most 2, and so on).

At the 0.05 significance level:

None (no CE): Eigenvalue 0.7894, Trace Statistic 200.9355 (Critical Value 125.6154), p-value
0.0000 (Reject the null hypothesis)

At most 1 CE: Eigenvalue 0.7209, Trace Statistic 143.2911 (Critical Value 95.75366), p-value
0.0000 (Reject the null hypothesis)

At most 2 CE: Eigenvalue 0.7035, Trace Statistic 96.07462 (Critical Value 69.81889), p-value
0.0001 (Reject the null hypothesis)

At most 3 CE: Eigenvalue 0.5241, Trace Statistic 51.09665 (Critical Value 47.85613), p-value
0.0240 (Reject the null hypothesis)

The trace test indicates that there are 4 cointegrating equations at the 0.05 significance level.

Max-Eigenvalue Test Results:

The null hypothesis in each row is again the hypothesized number of cointegrating equations,
here tested against the alternative of exactly one more.

At the 0.05 significance level:

None (no CE): Eigenvalue 0.7894, Max-Eigen Statistic 57.64431 (Critical Value 46.23142),
p-value 0.0021 (Reject the null hypothesis)

At most 1 CE: Eigenvalue 0.7209, Max-Eigen Statistic 47.21652 (Critical Value 40.07757), p-
value 0.0067 (Reject the null hypothesis)

At most 2 CE: Eigenvalue 0.7035, Max-Eigen Statistic 44.97797 (Critical Value 33.87687), p-
value 0.0016 (Reject the null hypothesis)

At most 3 CE: Eigenvalue 0.5241, Max-Eigen Statistic 27.47459 (Critical Value 27.58434), p-
value 0.0516 (Do not reject the null hypothesis)

The max-eigenvalue test indicates that there are 3 cointegrating equations at the 0.05 significance
level.

Conclusion:

The results suggest that there is evidence of cointegration among the variables. The trace test
suggests 4 cointegrating equations, while the max-eigenvalue test suggests 3 cointegrating
equations. These results are important for understanding the long-run relationships among the
variables in the time-series data.
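
The number of cointegrating equations is read off sequentially: starting from "None", each hypothesis is rejected in turn until the first p-value at or above 0.05, and the row where testing stops gives the rank. A minimal sketch using the p-values from the two Johansen tables follows; the list and function names are illustrative.

```python
# p-values for None, At most 1, ..., At most 6 (from the Johansen tables).
TRACE_P_VALUES = [0.0000, 0.0000, 0.0001, 0.0240, 0.2169, 0.6823, 0.8226]
MAX_EIGEN_P_VALUES = [0.0021, 0.0067, 0.0016, 0.0516, 0.1493, 0.6053, 0.8226]


def cointegration_rank(p_values, alpha=0.05):
    """Return the first hypothesized rank that is not rejected at level alpha."""
    for rank, p_value in enumerate(p_values):
        if p_value >= alpha:
            return rank
    return len(p_values)


print(cointegration_rank(TRACE_P_VALUES))      # 4
print(cointegration_rank(MAX_EIGEN_P_VALUES))  # 3
```

The max-eigenvalue test stops one step earlier because its p-value for "At most 3" (0.0516) is just above the 5% threshold.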