
The Nature of Dummy Variables: Mid Term

The document discusses dummy variables and their use in regression analysis. Dummy variables allow qualitative, nominal variables to be included in regression models by assigning numeric values (usually 0 and 1) to categories. For example, a dummy variable could take the value 1 for males and 0 for females. Regression models that include only dummy variables are called analysis of variance (ANOVA) models. Dummy variables classify data into mutually exclusive categories and allow nominal variables to influence the dependent variable in regression analysis.

Uploaded by

Saadullah Khan
Copyright
© All Rights Reserved

MID TERM

SOLUTION
PREVIOUS MID TERM

The nature of Dummy Variables


In regression analysis the dependent variable is frequently influenced not only by ratio scale variables (e.g., income, output, prices, costs, height, and temperature) but also by variables that are essentially qualitative, or nominal scale, in nature, such as sex, race, color, religion, nationality, geographical region, political upheavals, and party affiliation. For example, holding all other factors constant, female workers are found to earn less than their male counterparts, and nonwhite workers are found to earn less than whites. This shows that qualitative variables are no less important than quantitative ones and should be included in the regression analysis.

How to quantify? Qualitative variables usually indicate the presence or absence of a "quality" or an attribute. Construct artificial variables that take on values of 1 or 0, with 1 indicating the presence of the attribute and 0 indicating its absence. Dummy variables are thus essentially a device to classify data into mutually exclusive categories such as male or female.

How to incorporate in regression models: Dummy variables can be incorporated in regression models just as easily as quantitative variables. In fact, a regression model may contain regressors that are all exclusively dummy, or qualitative, in nature. Such models are called Analysis of Variance (ANOVA) models.

For example, variables like sex (male or female), colour (black, white), nationality, and employment status (employed, unemployed) are defined on a nominal scale. Such variables do not have any natural scale of measurement. They usually indicate the presence or absence of a quality or an attribute, like employed or unemployed, graduate or non-graduate, smoker or non-smoker, yes or no, acceptance or rejection. Such variables can be quantified by artificially constructing variables that take the values 1 and 0, where 1 usually indicates the presence of the attribute and 0 usually indicates its absence. For example, 1 indicates that the person is male and 0 indicates that the person is female. Similarly, 1 may indicate that the person is employed, and then 0 indicates that the person is unemployed. Such variables classify the data into mutually exclusive categories. These variables are called indicator variables or dummy variables.

Usually, dummy variables take on the values 0 and 1 to identify the mutually exclusive classes of the explanatory variables. For example:
1 if person is male, 0 if person is female
1 if person is employed, 0 if person is unemployed
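
To make the 0/1 coding concrete, here is a minimal Python sketch. The wage figures and the helper name `ols_simple` are invented for illustration and do not come from these notes. It suggests that when a single dummy is the only regressor (an ANOVA-type model), the OLS intercept equals the mean of the group coded 0 and the slope equals the difference between the two group means.

```python
# Hypothetical data, invented for illustration only (not taken from the notes).
sex = ["male", "female", "female", "male", "female", "male"]
wage = [12.0, 9.0, 10.0, 14.0, 11.0, 13.0]

# Dummy variable: 1 if the person is male, 0 if the person is female.
male = [1 if s == "male" else 0 for s in sex]


def ols_simple(x, y):
    """Return (intercept, slope) of the least-squares line y = b0 + b1 * x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    slope = s_xy / s_xx
    return y_bar - slope * x_bar, slope


# ANOVA-type model: the dummy is the only regressor.
b0, b1 = ols_simple(male, wage)

# Group means, computed directly for comparison.
mean_female = sum(w for w, d in zip(wage, male) if d == 0) / male.count(0)
mean_male = sum(w for w, d in zip(wage, male) if d == 1) / male.count(1)

print(f"intercept = {b0:.2f}, female mean = {mean_female:.2f}")
print(f"slope = {b1:.2f}, male mean - female mean = {mean_male - mean_female:.2f}")
```

Switching which category is coded 1 would change the intercept and the sign of the slope, but not the fitted group means.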


Consider an equation to explain salaries of CEOs in terms of annual firm sales, return on equity (roe, in percentage form), and return on the firm's stock (ros, in percentage form):

$$\log(salary) = \beta_0 + \beta_1 \log(sales) + \beta_2\, roe + \beta_3\, ros + u$$

(i) In terms of the model parameters, state the null hypothesis that, after controlling for sales and roe, ros has no effect on CEO salary. State the alternative that better stock market performance increases a CEO's salary.
Ans. $H_0: \beta_3 = 0$. $H_1: \beta_3 > 0$.


(ii) Using a data set on firms, suppose the following was obtained via OLS:

$$\widehat{\log(salary)} = 4.32 + .280\,\log(sales) + .0174\,roe + .00024\,ros$$

with standard errors (.32), (.035), (.0041), and (.00054), respectively; $N = 209$, $R^2 = .283$.

By what percentage is salary predicted to increase if ros increases by 50 points? Does ros share an economically large relationship with salary?

Ans. Recall that in a log-level model the coefficient is interpreted as $\%\Delta y \approx (100\,\beta)\,\Delta x$.

- So, a 50 point increase in ros is associated with an increase in salary of 1.2 percent (.00024 × 50 = .012, and 100 × .012 = 1.2).
- A 1.2 percent increase in salary associated with a 50 percentage point increase in the return on a firm's stock does not seem economically meaningful.
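
The arithmetic in this answer can be checked with a short Python sketch. The variable names are illustrative. The "exact" figure uses the standard exponential formula for a log dependent variable, which the answer itself does not use, so it is included only as a cross-check on the approximation.

```python
import math

beta_ros = 0.00024   # estimated coefficient on ros, from the OLS output
delta_ros = 50       # increase in ros, in percentage points

# Log-level approximation: %change in salary ~ 100 * beta * change in ros
approx_pct = 100 * beta_ros * delta_ros

# Exact percentage change implied by the log model:
# 100 * (exp(beta * change in ros) - 1)
exact_pct = 100 * (math.exp(beta_ros * delta_ros) - 1)

print(f"approximate change in salary: {approx_pct:.2f}%")
print(f"exact change in salary:       {exact_pct:.2f}%")
```

The two figures are nearly identical because the implied change in log(salary) is small, so the approximation loses very little here.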


(iii) Test the null hypothesis that ros has no effect on salary against the alternative that ros has a positive effect. Carry out the test at the 10% significance level.
Ans. The 10% critical value for a one-tailed test, using df = 200, is 1.29 (from the table; the actual df is 209 − 4 = 205, which is close to 200). The t statistic on ros is .00024/.00054 = .44, which is well below the critical value. Therefore, we fail to reject $H_0$ at the 10% significance level and say that the relationship between ros and salary is statistically indistinguishable from zero.
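
A minimal Python sketch of this one-tailed test follows. The critical value is hard-coded from the answer (the table value for df = 200) and is not computed from the t distribution, and the variable names are illustrative.

```python
coef_ros = 0.00024   # OLS estimate of the coefficient on ros
se_ros = 0.00054     # its standard error
n, k = 209, 3        # observations and number of slope coefficients

df = n - k - 1                 # degrees of freedom
t_stat = coef_ros / se_ros     # t statistic for H0: beta_3 = 0

# 10% one-tailed critical value as quoted in the answer (table, df = 200).
critical_value = 1.29

# One-tailed test against H1: beta_3 > 0, so reject only for large positive t.
reject_h0 = t_stat > critical_value

print(f"df = {df}, t = {t_stat:.2f}, reject H0 at 10%: {reject_h0}")
```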

(iv) Would you include ros in a final model explaining CEO compensation in terms of firm performance?
Ans. Based on this sample, ros is not a statistically significant predictor of CEO compensation. However, including ros may not be causing harm.
Q. What does this depend on?
- It depends on how correlated ros is with the other independent variables.

The following table contains the ACT scores and the GPA (grade point average) for eight college students. Grade point average is based on a four-point scale and has been rounded to one digit after the decimal.

Student   GPA   ACT
1         2.8   21
2         3.4   24
3         3.0   26
4         3.5   27
5         3.6   29
6         3.0   25
7         2.7   25
8         3.7   30
(i) Estimate the relationship between GPA and ACT using OLS; that is, obtain the intercept and slope estimates in the equation

$$\widehat{GPA} = \hat{\beta}_0 + \hat{\beta}_1\, ACT$$

Comment on the direction of the relationship. Does the intercept have a useful interpretation here? Explain. How much higher is the GPA predicted to be if the ACT score is increased by five points?

The easiest way to answer this question is to enter the data by hand into Stata (or Excel) and run a simple OLS regression. Since there are only 8 data points, this is quick. Alternatively, again because there aren't many data points, you could compute the OLS coefficients with a calculator using the formula on page 29. I used Stata (the easiest way to enter this number of data points is to use the edit command and fill in the spreadsheet) and found the following coefficients (rounded to two decimal places):

$$\hat{\beta}_0 = 0.57; \quad \hat{\beta}_1 = 0.10$$


The direction of the relationship is positive, as expected: students with a higher ACT score have a higher GPA. Whether this has to do with intelligence, test-taking ability, or something else, I am not sure, but the direction of the relationship is certainly expected.

The literal interpretation of the intercept is "the GPA if the ACT score was 0." Is that even possible? Can you score a 0 on your ACT? Or do you get 10 points for writing your name? I don't know. But either way, my answer to the question is probably "no, the intercept does not have a useful interpretation." All of the ACT scores in our data are well above 0, indicating to me that a 0 ACT score is well outside our sample, and thus well outside what I am comfortable predicting with this model.

If the ACT score increases by 5 points, I expect to observe a 0.5 point increase in GPA (= 0.1 × 5).
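
For readers without Stata, the estimates in part (i) can be reproduced with the textbook simple-regression formulas (slope = sample covariance over sample variance of ACT, intercept = mean GPA minus slope times mean ACT). The Python sketch below uses only the eight observations from the table. The variable names are illustrative.

```python
# Data from the table: ACT scores and GPAs of the eight students.
act = [21, 24, 26, 27, 29, 25, 25, 30]
gpa = [2.8, 3.4, 3.0, 3.5, 3.6, 3.0, 2.7, 3.7]

n = len(act)
act_bar = sum(act) / n
gpa_bar = sum(gpa) / n

# Sums of cross-products and squares in deviation-from-mean form.
s_xy = sum((x - act_bar) * (y - gpa_bar) for x, y in zip(act, gpa))
s_xx = sum((x - act_bar) ** 2 for x in act)

b1 = s_xy / s_xx              # slope estimate
b0 = gpa_bar - b1 * act_bar   # intercept estimate

print(f"intercept = {b0:.2f}, slope = {b1:.2f}")
print(f"predicted GPA change for a 5-point ACT increase: {5 * b1:.2f}")
```

Because the slope is rounded to 0.10 in the hand calculation, the 5-point effect computed from the unrounded slope comes out marginally above 0.5.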

(iii) What is the predicted value of GPA when ACT = 20?

The predicted value of GPA when ACT = 20 is

$$0.57 + 0.1 \times 20 = 2.57$$

(With the unrounded coefficient estimates, the prediction is about 2.61.) Even though we don't have an observation with ACT = 20, we can predict what would happen if we did, using our model.

(iv) How much of the variation in GPA for these eight students is explained by ACT? Explain.
About 58% of the variation in GPA is explained by ACT. We see this in the R-
squared statistic, which represents the proportion of the total variation in GPA that
is explained by all the explanatory variables. Since we only have one explanatory
variable, R-squared gives us exactly what the question is looking for. (This would
not be true if there were multiple explanatory variables).
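
The prediction in part (iii) and the R-squared in part (iv) can be verified with a short Python sketch that refits the line from the table data. The variable names are illustrative, and R-squared is computed as 1 − SSR/SST, the usual definition for a regression with an intercept.

```python
# Data from the table: ACT scores and GPAs of the eight students.
act = [21, 24, 26, 27, 29, 25, 25, 30]
gpa = [2.8, 3.4, 3.0, 3.5, 3.6, 3.0, 2.7, 3.7]

n = len(act)
act_bar = sum(act) / n
gpa_bar = sum(gpa) / n

# OLS slope and intercept for GPA = b0 + b1 * ACT.
b1 = sum((x - act_bar) * (y - gpa_bar) for x, y in zip(act, gpa)) / sum(
    (x - act_bar) ** 2 for x in act
)
b0 = gpa_bar - b1 * act_bar

# Part (iii): predicted GPA at ACT = 20, using unrounded coefficients.
gpa_hat_20 = b0 + b1 * 20

# Part (iv): R-squared = 1 - (residual sum of squares / total sum of squares).
fitted = [b0 + b1 * x for x in act]
ssr = sum((y - f) ** 2 for y, f in zip(gpa, fitted))
sst = sum((y - gpa_bar) ** 2 for y in gpa)
r_squared = 1 - ssr / sst

print(f"predicted GPA at ACT = 20: {gpa_hat_20:.2f}")
print(f"R-squared: {r_squared:.3f}")
```

The prediction from unrounded coefficients is slightly higher than the figure obtained by hand with the rounded values 0.57 and 0.1, which is purely a rounding effect.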
