Assignment 2 Mba 652 PDF

The document contains the output of several linear regression models. It runs regressions with education level (ed) as the dependent variable and distance to college (dist) as the primary independent variable. Additional models control for other factors like test scores, demographics, income levels, etc. The estimated slope for the effect of distance on education level ranges from -0.07 to -0.03 depending on which other factors are included in the model. Controlling for more variables reduces the estimated effect of distance, suggesting some omitted variable bias.
Copyright
© Attribution Non-Commercial (BY-NC)

Chapter 5

> ans1a<-lm(ahe~age)
> summary(ans1a)

Call:
lm(formula = ahe ~ age)

Residuals:
    Min      1Q  Median      3Q     Max
-19.248  -6.964  -1.924   4.536  63.186

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  1.08228    1.18426   0.914    0.361
age          0.60499    0.03985  15.180   <2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 9.992 on 7709 degrees of freedom
Multiple R-squared:  0.02902,  Adjusted R-squared:  0.0289
F-statistic: 230.4 on 1 and 7709 DF,  p-value: < 2.2e-16

a. For a two-sided test, the critical values of the t-statistic are 1.64 at the 10% significance level, 1.96 at 5%, and 2.58 at 1%. The t-value for the slope on age is 15.180, which exceeds all three. So we reject the null hypothesis that β1 = 0 and conclude that the regression coefficient is statistically significant.

b. The 95% confidence interval for the slope is (0.605 - 1.96*0.04, 0.605 + 1.96*0.04) = (0.527, 0.683).
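The interval in part b can be checked directly from the reported estimate and standard error. The following is a minimal sketch in base R, assuming the large-sample normal critical value; the helper name `slope_ci` is hypothetical and was not part of the original session.

```r
# Hypothetical helper: large-sample confidence interval for a coefficient,
# built from a reported estimate and its standard error.
slope_ci <- function(estimate, se, level = 0.95) {
  z <- qnorm(1 - (1 - level) / 2)  # about 1.96 when level = 0.95
  c(lower = estimate - z * se, upper = estimate + z * se)
}

# Values copied from the summary(ans1a) output above.
ci_age <- slope_ci(0.60499, 0.03985)
print(round(ci_age, 3))
```

When the fitted object is still in the workspace, `confint(ans1a)` returns the t-based interval, which is practically identical at this sample size.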

c.
> rishav2<-read.table("D:\\econometrics.txt",header=TRUE)
> attach(rishav2)
> ans1c=lm(ahe~age)
> summary(ans1c)

Call:
lm(formula = ahe ~ age)

Residuals:
    Min      1Q  Median      3Q     Max
-23.792  -7.359  -2.034   5.400  59.119

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept) -4.43916    1.79999  -2.466   0.0137 *
age          0.92460    0.06057  15.265   <2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 10.63 on 3707 degrees of freedom
Multiple R-squared:  0.05914,  Adjusted R-squared:  0.05889
F-statistic:   233 on 1 and 3707 DF,  p-value: < 2.2e-16

For a two-sided test, the critical values of the t-statistic are 1.64 at the 10% significance level, 1.96 at 5%, and 2.58 at 1%. The t-value for the slope on age is 15.265. So we reject the null hypothesis that β1 = 0 and conclude that the regression coefficient is statistically significant.

MBA:-652 ASSIGNMENT 2 SUBMITTED BY:-RISHAV SHIV RANJAN 12114011 M.TECH IME

d.
> rishav3<-read.table("D:\\neweconometrics.txt",header=TRUE)
> attach(rishav3)
The following object(s) are masked from 'rishav2':

    age, ahe

The following object(s) are masked from 'rishav1':

    age, ahe

> ans1d=lm(ahe~age)
> summary(ans1d)

Call:
lm(formula = ahe ~ age)

Residuals:
    Min      1Q  Median      3Q     Max
-14.245  -5.111  -1.351   3.225  57.551

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  6.52194    1.26999   5.135 2.95e-07 ***
age          0.29786    0.04274   6.969 3.73e-12 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Multiple R-squared:  0.01199,  Adjusted R-squared:  0.01175
F-statistic: 48.56 on 1 and 4000 DF,  p-value: 3.727e-12

For a two-sided test, the critical values of the t-statistic are 1.64 at the 10% significance level, 1.96 at 5%, and 2.58 at 1%. The t-value for the slope on age is 6.969. So we reject the null hypothesis that β1 = 0 and conclude that the regression coefficient is statistically significant.

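e. For college graduates the slope is β1 = 0.9246 with standard error s1 = 0.06057. For non-graduates (high school graduates) the slope is β2 = 0.29786 with standard error s2 = 0.04274. Treating the two samples as independent, the t-statistic for the difference is (0.9246 - 0.29786)/sqrt(0.06057^2 + 0.04274^2) = 8.45, which exceeds the 1% critical value of 2.58. So β1 > β2, and we conclude that the effect of age on earnings differs between high school graduates and college graduates.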
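The comparison in part e can be reproduced from the two reported slopes and standard errors. This is a sketch under the assumption that the two subsamples are independent, so the variance of the difference is the sum of the two squared standard errors; `slope_diff_t` is a hypothetical helper name.

```r
# Hypothetical helper: t-statistic for the difference between two slopes
# estimated on independent samples.
slope_diff_t <- function(b1, se1, b2, se2) {
  (b1 - b2) / sqrt(se1^2 + se2^2)
}

# College graduates (part c) versus high school graduates (part d).
t_diff <- slope_diff_t(0.92460, 0.06057, 0.29786, 0.04274)
print(round(t_diff, 2))

# Two-sided p-value under the large-sample normal approximation.
p_diff <- 2 * pnorm(-abs(t_diff))
print(p_diff < 0.01)
```

A t-statistic above 2.58 in absolute value means the difference is significant at the 1% level.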

ANSWER TO 5.2

> rishav4<-read.table("D:\\Teaching.txt", header=TRUE)
> attach(rishav4)
The following object(s) are masked from 'rishav3':

    age

The following object(s) are masked from 'rishav2':

    age

The following object(s) are masked from 'rishav1':

    age, female

> ans2=lm(course_eval~beauty)
> summary(ans2)

Call:
lm(formula = course_eval ~ beauty)

Residuals:
     Min       1Q   Median       3Q      Max
-1.80015 -0.36304  0.07254  0.40207  1.10373

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  3.99827    0.02535 157.727  < 2e-16 ***
beauty       0.13300    0.03218   4.133 4.25e-05 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.5455 on 461 degrees of freedom
Multiple R-squared:  0.03574,  Adjusted R-squared:  0.03364
F-statistic: 17.08 on 1 and 461 DF,  p-value: 4.247e-05

For a two-sided test, the critical values of the t-statistic are 1.64 at the 10% significance level, 1.96 at 5%, and 2.58 at 1%. The t-value for the slope on beauty is 4.133. So we reject the null hypothesis that β1 = 0 at all of these significance levels and conclude that the regression coefficient is statistically significant. The p-value is 4.25e-05.

ANSWER TO 5.3

> rishav5<- read.table("D:\\CollegeDistance.txt",header=TRUE)
> attach(rishav5)
> ans3=lm(ed~dist)
> summary(ans3)

Call:
lm(formula = ed ~ dist)

Residuals:
    Min      1Q  Median      3Q     Max
-1.9559 -1.8091 -0.6624  2.0515  4.4844

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept) 13.95586    0.03772 369.945   <2e-16 ***
dist        -0.07337    0.01375  -5.336    1e-07 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 1.807 on 3794 degrees of freedom
Multiple R-squared:  0.00745,  Adjusted R-squared:  0.007188
F-statistic: 28.48 on 1 and 3794 DF,  p-value: 1.004e-07

For a two-sided test, the critical values of the t-statistic are 1.64 at the 10% significance level, 1.96 at 5%, and 2.58 at 1%. The t-value for the slope on dist is -5.336, so its absolute value is 5.336. We therefore reject the null hypothesis that β1 = 0 at all of these significance levels and conclude that the regression coefficient is statistically significant.

b. 95% confidence interval for slope coefficient = (-0.07337 - 1.96*0.01375, -0.07337 + 1.96*0.01375) = (-0.10032, -0.04642)

c.
> ans3c=lm(ed~dist)
> summary(ans3c)

Call:
lm(formula = ed ~ dist)

Residuals:
    Min      1Q  Median      3Q     Max
-1.9359 -1.8075 -0.6632  2.0706  4.4491

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept) 13.93587    0.05112 272.593  < 2e-16 ***
dist        -0.06417    0.01881  -3.412 0.000657 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 1.802 on 2068 degrees of freedom
  (1726 observations deleted due to missingness)
Multiple R-squared:  0.005599,  Adjusted R-squared:  0.005118
F-statistic: 11.64 on 1 and 2068 DF,  p-value: 0.0006567

95% confidence interval for slope coefficient = (-0.06417 - 1.96*0.01881, -0.06417 + 1.96*0.01881) = (-0.10104, -0.02730)

d.
> summary(ans1d)

Call:
lm(formula = ed ~ dist)

Residuals:
    Min      1Q  Median      3Q     Max
-1.9790 -1.8113 -0.6856  2.0294  4.3983

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept) 13.97899    0.05593 249.939  < 2e-16 ***
dist        -0.08384    0.02017  -4.157 3.38e-05 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Multiple R-squared:  0.009925,  Adjusted R-squared:  0.009351
F-statistic: 17.28 on 1 and 1724 DF,  p-value: 3.38e-05

95% confidence interval for slope coefficient = (-0.08384 - 1.96*0.02017, -0.08384 + 1.96*0.02017) = (-0.12337, -0.04431)

e. For women the slope is β1 = -0.06417 with standard error s1 = 0.01881 and n1 = 2070. For men the slope is β2 = -0.08384 with standard error s2 = 0.02017 and n2 = 1726. The t-statistic for the difference is (-0.06417 + 0.08384)/sqrt(0.01881^2 + 0.02017^2) = 0.01967/0.02758 = 0.71, which is below even the 10% critical value of 1.64. So although the estimated effect of distance is larger in magnitude for men, we cannot conclude that the effect of distance on years of education differs between men and women.

Chapter 6
ANSWER TO E6.1

a)
> data1 <- read.table("D:\\Econometrics\\TeachingRatings.txt", header=TRUE)
> attach(data1)
> model1=lm(course_eval~beauty)
> summary(model1)

Call:
lm(formula = course_eval ~ beauty)

Residuals:
     Min       1Q   Median       3Q      Max
-1.80015 -0.36304  0.07254  0.40207  1.10373

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  3.99827    0.02535 157.727  < 2e-16 ***
beauty       0.13300    0.03218   4.133 4.25e-05 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.5455 on 461 degrees of freedom
Multiple R-squared:  0.03574,  Adjusted R-squared:  0.03364
F-statistic: 17.08 on 1 and 461 DF,  p-value: 4.247e-05

Regression of course_eval on beauty: course_eval = 3.998 + 0.133*beauty. The estimated slope is 0.133.

b)
> model1=lm(course_eval~beauty+intro+onecredit+female+minority+nnenglish)
> summary(model1)

Call:
lm(formula = course_eval ~ beauty + intro + onecredit + female +
    minority + nnenglish)

Residuals:
     Min       1Q   Median       3Q      Max
-1.84611 -0.33300  0.04912  0.37672  1.05872

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  4.06829    0.03754 108.364  < 2e-16 ***
beauty       0.16561    0.03073   5.389 1.14e-07 ***
intro        0.01133    0.05448   0.208 0.835413
onecredit    0.63453    0.11134   5.699 2.17e-08 ***
female      -0.17348    0.04928  -3.520 0.000474 ***
minority    -0.16662    0.07628  -2.184 0.029448 *
nnenglish   -0.24416    0.10696  -2.283 0.022903 *
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.5135 on 456 degrees of freedom
Multiple R-squared:  0.1546,  Adjusted R-squared:  0.1435
F-statistic:  13.9 on 6 and 456 DF,  p-value: 1.529e-14

Using the additional regressors, we obtain the following regression equation:
course_eval = 4.068 + 0.166*beauty + 0.011*intro + 0.635*onecredit - 0.173*female - 0.167*minority - 0.244*nnenglish
The estimated effect of beauty on course_eval is 0.166. Since this is not substantially different from the estimate of 0.133 in a), the regression in a) does not appear to suffer from important omitted variable bias.

d)
> newdata1=data.frame(female=0,nnenglish=0,minority=1,onecredit=0,intro=0,beauty=0)
> predict(model1, newdata1)
       1
3.901673

So Professor Smith's predicted course evaluation is 3.90.
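The prediction in part d is simply the fitted equation evaluated at Professor Smith's regressor values. A minimal sketch that redoes the arithmetic from the printed coefficients, without needing the fitted object; the vector names `coefs` and `smith` are illustrative only.

```r
# Coefficients copied from the multiple regression output in part b.
coefs <- c(intercept = 4.06829, beauty = 0.16561, intro = 0.01133,
           onecredit = 0.63453, female = -0.17348, minority = -0.16662,
           nnenglish = -0.24416)

# Regressor values used in part d; the leading 1 multiplies the intercept.
smith <- c(intercept = 1, beauty = 0, intro = 0, onecredit = 0,
           female = 0, minority = 1, nnenglish = 0)

# Fitted value = sum of coefficient times regressor value.
pred_smith <- sum(coefs * smith[names(coefs)])
print(round(pred_smith, 2))
```

With every regressor except minority set to zero, the prediction reduces to the intercept plus the minority coefficient.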


ANSWER TO E 6.2

a)
> data2 <- read.table("D:\\Econometrics\\CollegeDistance.txt",header=TRUE)
> attach(data2)
> model1=lm(ed~dist)
> model1

Call:
lm(formula = ed ~ dist)

Coefficients:
(Intercept)         dist
   13.95586     -0.07337

The estimated slope is -0.07.

b)
> model2=lm(ed~dist+bytest+female+black+hispanic+incomehi+ownhome+dadcoll+cue80+stwmfg80)
> model2

Call:
lm(formula = ed ~ dist + bytest + female + black + hispanic +
    incomehi + ownhome + dadcoll + cue80 + stwmfg80)

Coefficients:
(Intercept)         dist       bytest       female        black     hispanic
    8.82752     -0.03154      0.09382      0.14541      0.36797      0.39852
   incomehi      ownhome      dadcoll        cue80     stwmfg80
    0.39520      0.15213      0.69613      0.02321     -0.05178

After incorporating the other regressors mentioned in the problem, the estimated effect of dist on ed is -0.0315387, or about -0.03.
c) Yes, the regression in a) seems to suffer from important omitted variable bias: the coefficient on dist moves from -0.073 to -0.032 when the additional regressors are added, so its magnitude falls by more than half.
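The mechanics behind part c can be illustrated with simulated data. The sketch below does not use the CollegeDistance file: every number in it is made up, and the variable `ability` is a hypothetical stand-in for any omitted control that is correlated with `dist`. It shows the in-sample identity that the short-regression slope equals the long-regression slope plus the omitted variable's coefficient times the slope from regressing the omitted variable on `dist`.

```r
set.seed(1)
n <- 500

# Simulated data (assumption, not the assignment's data set).
sim <- data.frame(ability = rnorm(n))
sim$dist <- 1 - 0.5 * sim$ability + rnorm(n)
sim$ed <- 14 - 0.03 * sim$dist + 0.8 * sim$ability + rnorm(n)

short <- lm(ed ~ dist, data = sim)            # omits ability
long <- lm(ed ~ dist + ability, data = sim)   # includes ability
aux <- lm(ability ~ dist, data = sim)         # omitted variable on dist

# Omitted variable bias identity, exact in sample.
implied <- coef(long)[["dist"]] + coef(long)[["ability"]] * coef(aux)[["dist"]]
print(c(short = coef(short)[["dist"]], implied = implied))
```

Because `ability` raises `ed` and is negatively related to `dist` in this simulation, leaving it out makes the slope on `dist` more negative, which is the same direction of change seen between a) and b).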

d)
> summary(model1)

Residual standard error: 1.807 on 3794 degrees of freedom
Multiple R-squared:  0.00745,  Adjusted R-squared:  0.007188
F-statistic: 28.48 on 1 and 3794 DF,  p-value: 1.004e-07

> summary(model2)

Residual standard error: 1.542 on 3785 degrees of freedom
Multiple R-squared:  0.2788,  Adjusted R-squared:  0.2769
F-statistic: 146.3 on 10 and 3785 DF,  p-value: < 2.2e-16

Based on both R-squared and adjusted R-squared, model2, i.e. the regression in b), is the better fit. Within each model the two measures are nearly equal because the sample size is very large relative to the number of regressors.
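The adjusted R-squared values reported above can be recovered from the R-squared values, the sample size, and the number of regressors. This is a sketch of the standard degrees-of-freedom correction; `adj_r2` is a hypothetical helper name, and n = 3796 is inferred from the 3794 residual degrees of freedom of the one-regressor model.

```r
# Hypothetical helper: adjusted R-squared from R-squared, sample size n,
# and number of slope regressors k (the intercept is not counted in k).
adj_r2 <- function(r2, n, k) {
  1 - (1 - r2) * (n - 1) / (n - k - 1)
}

# R-squared values copied from the two summaries above.
adj_model1 <- adj_r2(0.00745, n = 3796, k = 1)
adj_model2 <- adj_r2(0.2788, n = 3796, k = 10)
print(round(c(model1 = adj_model1, model2 = adj_model2), 4))
```

The correction factor (n - 1)/(n - k - 1) is very close to 1 here, which is why the adjustment barely changes either value.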

e) The coefficient on dadcoll is positive: 0.6961. The variable dadcoll indicates whether a student's father attended college. So the positive value suggests that, holding the other regressors constant, a student whose father went to college completes on average about 0.696 more years of education.

f) cue80 measures the county unemployment rate in 1980 and stwmfg80 measures the state hourly wage in manufacturing in 1980. The two variables appear in the regression because years of education are affected by local employment conditions and wage rates. The estimated coefficient on cue80 is positive and that on stwmfg80 is negative. I would have expected the opposite, i.e. negative and positive, on the view that the unemployment rate would be negatively related to years of schooling and the hourly wage positively related. The estimated signs are, however, consistent with an opportunity-cost reading: when unemployment is high or manufacturing wages are low, less is given up by staying in school. The magnitudes of the two coefficients are 0.02321 and -0.05178, which are quite small, i.e. a one-unit change in either variable has little effect on the dependent variable.

g)
> newdata=data.frame(dist=2,bytest=58, incomehi=1, ownhome=1, dadcoll=0,momcoll=1, cue80=7.5,stwmfg80=9.75,female=0,black=1,hispanic=0)
> predict(model2, newdata)
       1
14.79051

Bob's predicted years of completed schooling using the regression in b) is 14.79 years.

h)
> newdata=data.frame(dist=4,bytest=58, incomehi=1, ownhome=1, dadcoll=0,momcoll=1, cue80=7.5,stwmfg80=9.75,female=0,black=1,hispanic=0)
> predict(model2, newdata)
       1
14.72744

Jim's predicted years of completed schooling using the regression in b) is 14.73 years.
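Because the model is linear, the gap between the two predictions depends only on the change in dist and its coefficient, since every other regressor is the same for Bob and Jim. A short check using the printed values; the variable names are illustrative.

```r
# Slope on dist copied from the model2 output in part b.
b_dist <- -0.03154

# Jim's dist minus Bob's dist, from the two newdata calls above.
delta_dist <- 4 - 2

# Implied change in predicted years of schooling.
delta_ed <- b_dist * delta_dist
print(round(delta_ed, 4))

# Difference between the two predict() results reported above.
reported_gap <- 14.72744 - 14.79051
print(round(reported_gap, 4))
```

The two numbers agree up to rounding of the printed coefficient.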

ANSWER TO E6.3

a)
> data3 <- read.table("D:\\Econometrics\\Growth.txt",header=TRUE)
> summary(data3)

Summary statistics, with the standard deviation of each variable in the last column:

Variable         Min.      1st Qu.   Median    Mean      3rd Qu.   Max.      Std. Dev.
growth           -2.8119   0.8382    1.9751    1.9427    2.8803    7.1569    1.897119762
oil              0         0         0         0         0         0         0
rgdp60           367       1148      2019      3104      5143      9895      2512.656846
tradeshare       0.1405    0.3933    0.5433    0.5647    0.6816    1.9926    0.289270317
yearsschool      0.200     1.940     3.650     3.985     5.560     10.070    2.542000359
rev_coups        0.00000   0.00000   0.06667   0.16745   0.26667   0.97037   0.224679773
assasinations    0.0000    0.0000    0.1000    0.2776    0.2333    2.4667    0.49152842

b)
> model3=lm(growth~tradeshare+yearsschool+rev_coups+assasinations+rgdp60)
> model3

Call:
lm(formula = growth ~ tradeshare + yearsschool + rev_coups +
    assasinations + rgdp60)

Coefficients:
  (Intercept)     tradeshare    yearsschool      rev_coups  assasinations
    0.4897603      1.5616957      0.5748461     -2.1575029      0.3540784
       rgdp60
   -0.0004693

The coefficient on rev_coups is -2.157.

c)
> newdata=data.frame(rgdp60=3104, tradeshare=0.5647, yearsschool=3.985, rev_coups=0.1674, assasinations=0.2776)
> predict(model3, newdata)
       1
1.942686

The predicted value of the average annual growth rate is 1.94.

d) The country's value for trade share becomes 0.5647 + 0.2892 = 0.8539.
> newdata=data.frame(rgdp60=3104, tradeshare=0.8539, yearsschool=3.985, rev_coups=0.1674, assasinations=0.2776)
> predict(model3, newdata)
       1
2.394329

Now the predicted value is 2.39.

e) Oil is omitted from the regression because it equals 0 for every country in the sample. A regressor with no variation is perfectly collinear with the intercept, so including it would cause perfect multicollinearity and its coefficient could not be estimated.
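The change between parts c and d can be checked by hand: raising tradeshare by one standard deviation shifts the prediction by the tradeshare coefficient times that standard deviation. A minimal sketch using the printed values; the variable names are illustrative.

```r
b_trade <- 1.5616957     # coefficient on tradeshare from part b
sd_trade <- 0.289270317  # standard deviation of tradeshare from part a

# Implied change in predicted growth for a one standard deviation increase.
delta_growth <- b_trade * sd_trade
print(round(delta_growth, 3))

# Prediction at the sample means (part c) plus the implied change.
pred_mean <- 1.942686
pred_shifted <- pred_mean + delta_growth
print(round(pred_shifted, 2))
```

The result matches the `predict()` output in part d up to the rounding of the standard deviation to 0.2892.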
