14 Statistics and Probability

The learning objectives of this chapter are to teach students how to: 1) Develop simple and multiple linear regression models to predict dependent variables from independent variables 2) Interpret regression results such as slope, intercept, and statistical tests 3) Evaluate the fit of regression models and address assumptions like nonlinearity and outliers


Learning Objectives

After completing this chapter, students will be able to:

1. Identify variables and use them in a regression model


2. Develop simple linear regression equations from sample data and interpret the
slope and intercept
3. Compute the coefficient of determination and the coefficient of correlation and
interpret their meanings
4. Interpret the F-test in a linear regression model
5. List the assumptions used in regression and use residual plots to identify
problems

6. Develop a multiple regression model and use it to predict


7. Use dummy variables to model categorical data
8. Determine which variables should be included in a multiple regression model
9. Transform a nonlinear function into a linear one for use in regression
10. Understand and avoid common mistakes made in the use of regression analysis
Chapter Outline

1) Introduction
2) Scatter Diagrams
3) Simple Linear Regression
4) Measuring the Fit of the Regression Model
5) Assumptions of the Regression Model
Introduction
• Regression analysis is a very valuable tool for a manager
• Regression can be used to
•Understand the relationship between variables
•Predict the value of one variable based on another variable
• Examples
•Determining best location for a new store
•Studying the effectiveness of advertising dollars in increasing sales
volume
• The variable to be predicted is called the dependent variable
•Sometimes called the response variable
• The value of this variable depends on the value of the independent
variable
•Sometimes called the explanatory or predictor variable
Dependent variable = Independent variable + Independent variable
Scatter Diagram
• Graphing is a helpful way to investigate the relationship between
variables.
• A scatter diagram or scatter plot is often used for that relation.
• The independent variable is normally plotted on the X axis.
• The dependent variable is normally plotted on the Y axis.
Triple A Construction
• Triple A Construction renovates old homes.
• They have found that the dollar volume of renovation work is dependent
on the area payroll.

TRIPLE A’S SALES ($100,000s)    LOCAL PAYROLL ($100,000,000s)
6                               3
8                               4
9                               6
5                               4
4.5                             2
9.5                             5

Table 1
Triple A Construction
[Scatter diagram: Sales ($100,000) on the Y axis versus Payroll ($100 million) on the X axis]

Figure 1
Simple Linear Regression
◼ Regression models are used to test if there is a relationship between
variables (predict sales based on payroll)
◼ There is some random error that cannot be predicted

Y = β₀ + β₁X + ε
where
Y = dependent variable (response)
X = independent variable (predictor or explanatory)
β₀ = intercept (value of Y when X = 0)
β₁ = slope of the regression line
ε = random error
Simple Linear Regression

◼ True values for the slope and intercept are not known so they are
estimated using sample data

Ŷ = b₀ + b₁X

where
Ŷ = predicted value of the dependent variable (response)
X = independent variable (predictor or explanatory)
b₀ = estimated intercept (value of Ŷ when X = 0)
b₁ = estimated slope of the regression line
Example: Triple A Construction

• Triple A Construction is trying to predict sales based on area payroll

Y = Sales
X = Area payroll

◼ The line chosen in Figure 1 is the one that minimizes the errors

Error = (Actual value) – (Predicted value)

e = Y − Ŷ
Least Squares Regression
Errors can be positive or negative so the average error could be zero even
though individual errors could be large.
Least squares regression minimizes the sum of the squared errors.

[Payroll line fit plot: Sales ($100,000) versus Payroll ($100,000,000s), showing the data points and the fitted regression line]
• For the simple linear regression model, the values of the intercept and
slope can be calculated using the formulas below

Ŷ = b₀ + b₁X

X̄ = ΣX / n = average (mean) of X values

Ȳ = ΣY / n = average (mean) of Y values

b₁ = Σ(X − X̄)(Y − Ȳ) / Σ(X − X̄)²

b₀ = Ȳ − b₁X̄
• Regression calculations

Y      X      (X − X̄)²         (X − X̄)(Y − Ȳ)
6      3      (3 − 4)² = 1     (3 − 4)(6 − 7) = 1
8      4      (4 − 4)² = 0     (4 − 4)(8 − 7) = 0
9      6      (6 − 4)² = 4     (6 − 4)(9 − 7) = 4
5      4      (4 − 4)² = 0     (4 − 4)(5 − 7) = 0
4.5    2      (2 − 4)² = 4     (2 − 4)(4.5 − 7) = 5
9.5    5      (5 − 4)² = 1     (5 − 4)(9.5 − 7) = 2.5
ΣY = 42   ΣX = 24   Σ(X − X̄)² = 10   Σ(X − X̄)(Y − Ȳ) = 12.5
Ȳ = 42/6 = 7      X̄ = 24/6 = 4

Table 2
• Regression calculations

X̄ = ΣX / n = 24/6 = 4

Ȳ = ΣY / n = 42/6 = 7

b₁ = Σ(X − X̄)(Y − Ȳ) / Σ(X − X̄)² = 12.5/10 = 1.25

b₀ = Ȳ − b₁X̄ = 7 − (1.25)(4) = 2

Therefore Ŷ = 2 + 1.25X
• In terms of the original variables: sales = 2 + 1.25(payroll)
• If the payroll next year is $600 million (X = 6):

Ŷ = 2 + 1.25(6) = 9.5, or sales of $950,000
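The least squares calculation above can be sketched in a few lines of Python using the Triple A data from Table 1; it reproduces the slope, intercept, and the payroll prediction.

```python
# Least squares fit for the Triple A Construction data (Table 1).
X = [3, 4, 6, 4, 2, 5]          # payroll ($100,000,000s)
Y = [6, 8, 9, 5, 4.5, 9.5]      # sales ($100,000s)

n = len(X)
x_bar = sum(X) / n              # 4
y_bar = sum(Y) / n              # 7

# b1 = sum((X - Xbar)(Y - Ybar)) / sum((X - Xbar)^2)
b1 = sum((x - x_bar) * (y - y_bar) for x, y in zip(X, Y)) / \
     sum((x - x_bar) ** 2 for x in X)   # slope = 1.25
b0 = y_bar - b1 * x_bar                 # intercept = 2.0

# Predict sales if next year's payroll is $600 million (X = 6)
y_hat = b0 + b1 * 6                     # 9.5, i.e. $950,000
print(b0, b1, y_hat)
```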
Measuring the Fit
of the Regression Model
◼ Regression models can be developed for any variables X and Y
◼ How do we know the model is actually helpful in predicting Y based on X?
◼ We could just take the average error, but the positive and negative
errors would cancel each other out
◼ Three measures of variability are
◼ SST – Total variability about the mean
◼ SSE – Variability about the regression line
◼ SSR – Total variability that is explained by the model
◼ Sum of squares total
SST = Σ(Y − Ȳ)²

◼ Sum of squares due to error
SSE = Σe² = Σ(Y − Ŷ)²

◼ Sum of squares due to regression
SSR = Σ(Ŷ − Ȳ)²

◼ An important relationship
SST = SSR + SSE
Y      X      (Y − Ȳ)²            Ŷ = 2 + 1.25X         (Y − Ŷ)²    (Ŷ − Ȳ)²
6      3      (6 − 7)² = 1        2 + 1.25(3) = 5.75    0.0625      1.5625
8      4      (8 − 7)² = 1        2 + 1.25(4) = 7.00    1           0
9      6      (9 − 7)² = 4        2 + 1.25(6) = 9.50    0.25        6.25
5      4      (5 − 7)² = 4        2 + 1.25(4) = 7.00    4           0
4.5    2      (4.5 − 7)² = 6.25   2 + 1.25(2) = 4.50    0           6.25
9.5    5      (9.5 − 7)² = 6.25   2 + 1.25(5) = 8.25    1.5625      1.5625

Ȳ = 7    Σ(Y − Ȳ)² = 22.5    Σ(Y − Ŷ)² = 6.875    Σ(Ŷ − Ȳ)² = 15.625
         SST = 22.5          SSE = 6.875           SSR = 15.625

Table 3
For Triple A Construction

◼ Sum of squares total: SST = Σ(Y − Ȳ)² = 22.5
◼ Sum of squares due to error: SSE = Σe² = Σ(Y − Ŷ)² = 6.875
◼ Sum of squares due to regression: SSR = Σ(Ŷ − Ȳ)² = 15.625
◼ An important relationship: SST = SSR + SSE (22.5 = 15.625 + 6.875)
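The decomposition SST = SSR + SSE can be verified directly on the Triple A data, using the fitted line Ŷ = 2 + 1.25X:

```python
# Verify SST = SSR + SSE for the Triple A data (Table 3).
X = [3, 4, 6, 4, 2, 5]
Y = [6, 8, 9, 5, 4.5, 9.5]
b0, b1 = 2.0, 1.25              # intercept and slope fitted earlier
y_bar = sum(Y) / len(Y)         # 7

Y_hat = [b0 + b1 * x for x in X]
SST = sum((y - y_bar) ** 2 for y in Y)                 # 22.5
SSE = sum((y - yh) ** 2 for y, yh in zip(Y, Y_hat))    # 6.875
SSR = sum((yh - y_bar) ** 2 for yh in Y_hat)           # 15.625
print(SST, SSE, SSR)            # SST equals SSR + SSE
```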
◼ SSR – explained variability
◼ SSE – unexplained variability
[Plot of the data with the regression line Ŷ = 2 + 1.25X, illustrating the deviations Y − Ȳ, Y − Ŷ, and Ŷ − Ȳ for a single point; Sales ($100,000) versus Payroll ($100 million)]

Figure 4.2
Coefficient of Determination
• The proportion of the variability in Y explained by the regression equation is called the coefficient of determination
• The coefficient of determination is r²

r² = SSR/SST = 1 − SSE/SST

◼ For Triple A Construction

r² = 15.625/22.5 = 0.6944

◼ About 69% of the variability in Y is explained by the equation based on payroll (X)
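Using the sums of squares computed above, the coefficient of determination is one division:

```python
# Coefficient of determination for Triple A: r^2 = SSR/SST = 1 - SSE/SST
SST, SSE, SSR = 22.5, 6.875, 15.625
r2 = SSR / SST
print(round(r2, 4))   # 0.6944
```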
Assumptions of the Regression Model

◼ If we make certain assumptions about the errors in a regression model,


we can perform statistical tests to determine if the model is useful
1. Errors are independent
2. Errors are normally distributed
3. Errors have a mean of zero
4. Errors have a constant variance
◼ A plot of the residuals (errors) will often highlight any glaring violations
of these assumptions
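As a minimal check of the zero-mean assumption, the residuals of a least squares fit always average to zero; a sketch with the Triple A data (a residual plot against X would reveal nonlinearity or non-constant variance):

```python
# Residuals e = Y - Yhat for the fitted line Yhat = 2 + 1.25X.
# Their mean is exactly zero for a least squares fit.
X = [3, 4, 6, 4, 2, 5]
Y = [6, 8, 9, 5, 4.5, 9.5]
residuals = [y - (2.0 + 1.25 * x) for x, y in zip(X, Y)]
mean_residual = sum(residuals) / len(residuals)
print(residuals, mean_residual)
```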
REGRESSION LINE OF Y ON X
The line that expresses the trend between two observed variables is called a
regression line. For example, given sample data, the value of 𝒚 corresponding
to a given value of 𝒙 can be estimated by the method of least squares. Because
the value of 𝒚 is estimated from a given value of 𝒙, the resulting line is
called the regression line of 𝒚 on 𝒙, which means that 𝒚 is dependent on 𝒙.
The general equation of 𝒚 on 𝒙 is

𝒀 = 𝒂 + 𝒃𝒙

where
𝒃 = (n Σxy − Σx Σy) / (n Σx² − (Σx)²)
𝒂 = ȳ − 𝒃x̄ = Σy/n − 𝒃 Σx/n
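These raw-sum formulas give the same line as the deviation formulas used earlier; a quick sketch on the Triple A data confirms this:

```python
# Slope b and intercept a via the raw-sum formulas; the result matches
# the deviation form (b1 = 1.25, b0 = 2) computed for the Triple A data.
x = [3, 4, 6, 4, 2, 5]
y = [6, 8, 9, 5, 4.5, 9.5]
n = len(x)
sxy = sum(xi * yi for xi, yi in zip(x, y))
sx, sy = sum(x), sum(y)
sxx = sum(xi ** 2 for xi in x)

b = (n * sxy - sx * sy) / (n * sxx - sx ** 2)   # 1.25
a = sy / n - b * sx / n                          # 2.0
print(a, b)
```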
Question
Problem: The following table shows the chart of price and demand for an
item at different periods of time.
I. Forecast demand for the price of $ 25
• Solution:
Correlation
Correlation measures the degree of interdependence (association) between two
variables. If two variables are so related that an increase or decrease in one
is accompanied by an increase or decrease in the other, the two variables are
said to be correlated. Note, however, that two variables may show similar
movement, such as automobile sales and the demand for shoes, without having any
real connection; calculating a correlation for such variables is meaningless.
Care must therefore be taken that the two variables have some genuine
connection before the calculation can make sense.
Correlation Coefficient
• The correlation coefficient gives a mathematical value for measuring
the strength of the linear relationship between two variables.
• r lies between -1 and +1
• +1 indicates perfect positive relation
• -1 indicates perfect negative relation
• 0 shows no correlation
Pearson Product Moment Correlation Coefficient
• The formula for calculating the linear correlation coefficient is the
product-moment formula presented by Karl Pearson. Therefore it is also called
the Pearsonian coefficient of correlation. The formula is given as:

r = (n Σxy − Σx Σy) / √[(n Σx² − (Σx)²)(n Σy² − (Σy)²)]
Properties of the Coefficient of Correlation
• The coefficient of correlation lies between −1 and +1, i.e. −1 ≤ r ≤ +1.
• The coefficient of correlation is independent of change of origin and scale.
• The coefficient of correlation possesses the property of symmetry, i.e. r_xy = r_yx.
• The coefficient of correlation is the geometric mean of the two regression
coefficients: r = ±√(b_xy · b_yx)
  r = +√(b_xy · b_yx), if b_xy and b_yx are positive.
  r = −√(b_xy · b_yx), if b_xy and b_yx are negative.
• Note: the magnitude of the correlation is the geometric mean of the absolute
values of the two regression coefficients, i.e. |r| = √(|b_xy| · |b_yx|).
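The geometric-mean property can be checked numerically; a sketch on the Triple A data, where both regression coefficients are positive so r takes the positive root:

```python
import math

# Verify r = +sqrt(b_xy * b_yx) on the Triple A data.
x = [3, 4, 6, 4, 2, 5]
y = [6, 8, 9, 5, 4.5, 9.5]
n = len(x)
xb, yb = sum(x) / n, sum(y) / n
sxy = sum((xi - xb) * (yi - yb) for xi, yi in zip(x, y))  # 12.5
sxx = sum((xi - xb) ** 2 for xi in x)                     # 10
syy = sum((yi - yb) ** 2 for yi in y)                     # 22.5

b_yx = sxy / sxx                     # regression coefficient of y on x
b_xy = sxy / syy                     # regression coefficient of x on y
r_direct = sxy / math.sqrt(sxx * syy)
r_geo = math.sqrt(b_xy * b_yx)       # geometric mean of the coefficients
print(r_direct, r_geo)               # both ≈ 0.8333; note r^2 ≈ 0.6944
```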
Question
Solution
• Rank correlation is used when the variable under consideration is not
directly measurable, e.g. intelligence, knowledge, experience, beauty. Such
variables are judged by two different people or by two procedures, so it is
necessary to find the correlation between the judgments of the two people or
procedures. For this purpose the observations are ranked, and the method is
called the rank correlation coefficient.
• The formula for calculating the (Spearman) rank correlation is given as:

r_s = 1 − 6Σd² / (n(n² − 1))

where d is the difference between the two ranks assigned to each observation
and n is the number of observations.
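A minimal sketch of Spearman's rank correlation, using hypothetical ranks assigned by a company and by customers (the rank data here is illustrative, not from the text):

```python
# Spearman rank correlation: r_s = 1 - 6*sum(d^2) / (n*(n^2 - 1)),
# on hypothetical company vs. customer rankings of six characteristics.
company_rank  = [1, 2, 3, 4, 5, 6]   # hypothetical data
customer_rank = [2, 1, 4, 3, 6, 5]   # hypothetical data

n = len(company_rank)
d2 = sum((a - b) ** 2 for a, b in zip(company_rank, customer_rank))
r_s = 1 - 6 * d2 / (n * (n ** 2 - 1))
print(r_s)   # ≈ 0.829: a fairly strong positive association of the rankings
```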
Question
Solution

Interpretation
The value of r shows that there is a positive correlation between the company
ranking and the customer ranking. It also indicates that both the company and
the customers consider these characteristics important for customer
satisfaction, which is why the association between the two rankings is
positive.

You might also like