
Review of Statistics

1 Random Variables and Key Statistics


Random Variable: A random variable is a variable that takes on numerical values from a sample space, with the values determined by chance according to a probability distribution f(x). For example, the outcome of rolling a fair die is a random variable with possible values 1, . . . , 6, each with probability 1/6. A random variable is discrete if it can assume at most a countable number of values.
Key statistics for a random variable, X:
• Expected value: µ = E(X) = Σ_{all x} x f(x); for example, µ = Σ_{x=1}^{6} (1/6)x for rolling a fair die.

• Variance: measures the dispersion of a random variable, the average squared distance to the mean:

  σ² = V(X) = E[(X − µ)²] = Σ_{all x} (x − µ)² f(x)

  or

  σ² = V(X) = E(X²) − [E(X)]²
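These formulas are easy to check numerically for the fair-die example. Below is a minimal Python sketch (added for illustration, not part of the original notes) that computes E(X) and V(X) directly from the probability distribution f(x):

```python
import numpy as np

# Fair die: values 1..6, each with probability 1/6
x = np.arange(1, 7)
f = np.full(6, 1 / 6)

mu = np.sum(x * f)                      # E(X) = sum of x * f(x)
var = np.sum((x - mu) ** 2 * f)         # V(X) = sum of (x - mu)^2 * f(x)
var_alt = np.sum(x ** 2 * f) - mu ** 2  # equivalent form E(X^2) - [E(X)]^2

print(mu, var, var_alt)                 # 3.5, about 2.9167, about 2.9167
```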

Hypothesis testing: Two-Tailed, Large-Sample Test for the Population Mean


• H0 : µ = µ0
  H1 : µ ≠ µ0

• The significance level of the test: α (usually, we set α = 0.01, 0.05, or 0.1)

• Test statistic: z = (x̄ − µ0)/(s/√n)

• Critical values: ±Zα/2

• The decision rule: Reject the null hypothesis if either z > Zα/2 or z < −Zα/2.
Example 1.1 An insurance company executive believes that, over the last few years, the average liability insurance per board seat in companies defined as “small companies” has been $2,000. A recent survey of small businesses by Growth Resources, Inc., reports that the average liability tab per board seat in their sample is $2,700. Assume that the sample used by Growth Resources contained 100 randomly chosen small firms (as defined by their total annual gross billing) and that the sample standard deviation was $947. Do these sampling results provide evidence to reject the executive’s claim that the average liability per board seat is $2,000, using an α = 0.01 level of significance?
Answer: We set H0 : µ = 2000 and H1 : µ ≠ 2000. Since α = 0.01, the two critical values are ±Zα/2 = ±2.575, while the test statistic is

z = (x̄ − µ0)/(s/√n) = (2700 − 2000)/(947/√100) = 700/94.7 = 7.39 > Zα/2
Thus, we reject the null hypothesis.
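This test is easy to reproduce in code. The Python sketch below (added for illustration) recomputes the test statistic from the summary numbers in Example 1.1 and compares it with the two-tailed critical value; scipy.stats.norm is used only to look up that critical value:

```python
from math import sqrt
from scipy.stats import norm

# Summary statistics from Example 1.1
x_bar, mu_0, s, n, alpha = 2700, 2000, 947, 100, 0.01

z = (x_bar - mu_0) / (s / sqrt(n))      # test statistic, about 7.39
z_crit = norm.ppf(1 - alpha / 2)        # two-tailed critical value, about 2.576

print(f"z = {z:.2f}, critical value = {z_crit:.3f}")
if abs(z) > z_crit:
    print("Reject H0: mu = 2000")       # the conclusion reached in the text
else:
    print("Fail to reject H0")
```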

2 Measures of Association Between Two Variables
In data analysis, we sometimes want to learn the relationship between two variables; for example, does a higher temperature in July lead to higher electricity consumption? The statistics covariance and correlation serve that purpose. They are the building blocks of many advanced multivariate analyses.
• Sample covariance: sxy = Σ(xi − x̄)(yi − ȳ)/(n − 1)

• Population covariance: σxy = cov(x, y) = Σ(xi − µx)(yi − µy)/N

Effect of variable scaling: the covariance depends on the units in which x and y are measured, so rescaling either variable rescales the covariance. The correlation coefficient below removes this dependence on scale.
• Pearson sample correlation coefficient: rxy = sxy/(sx sy), where sx and sy are the sample standard deviations of x and y respectively, sx = √(Σ(xi − x̄)²/(n − 1)) and sy = √(Σ(yi − ȳ)²/(n − 1)). Note that −1 ≤ rxy ≤ 1.

• Pearson population correlation coefficient: ρxy = σxy/(σx σy), where σx and σy are the population standard deviations of x and y respectively, σx = √(Σ(xi − µx)²/N) and σy = √(Σ(yi − µy)²/N).

• Graphic interpretation: scatter plots of the (x, y) pairs (figure not reproduced here).
Example 2.1 The following data set contains 2 variables and 10 observations. For example, the data might come from a survey of 10 female respondents. Variable x represents the number of children the respondent has and variable y records the age of the respondent. We are interested in knowing whether the older generation tends to raise more children than the younger generation. Note that all respondents are either in the late stage of their reproductive period or have passed it. For survey data, we usually arrange the data in rows and columns, with each row corresponding to the answers to all survey questions from one respondent and each column listing the answers to one question from all respondents.
obs. xi yi xi − x̄ yi − ȳ (xi − x̄)(yi − ȳ)
1 2 50 -1 -1 1
2 5 57 2 6 12
3 1 41 -2 -10 20
4 3 54 0 3 0
5 4 54 1 3 3
6 1 38 -2 -13 26
7 5 63 2 12 24
8 3 48 0 -3 0
9 4 59 1 8 8
10 2 46 -1 -5 5
Sum 30 510 0 0 99
Average 3 51 0 0 9.9

Answer:

• sxy = Σ(xi − x̄)(yi − ȳ)/(n − 1) = 99/(10 − 1) = 11

• sx = √(Σ(xi − x̄)²/(n − 1)) = √(20/9) = 1.4907

• sy = √(Σ(yi − ȳ)²/(n − 1)) = √(566/9) = 7.9303

• rxy = sxy/(sx sy) = 11/((1.4907)(7.9303)) = 0.93
When two variables X and Y are positively correlated, a higher value of X usually comes with a higher value of Y, and a smaller value of X is more likely to be associated with a smaller value of Y.
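The calculations in Example 2.1 can be reproduced with a few lines of Python; this sketch is added for illustration, with np.corrcoef shown only as a cross-check:

```python
import numpy as np

# Data from Example 2.1: number of children (x) and age (y) of 10 respondents
x = np.array([2, 5, 1, 3, 4, 1, 5, 3, 4, 2])
y = np.array([50, 57, 41, 54, 54, 38, 63, 48, 59, 46])

n = len(x)
s_xy = np.sum((x - x.mean()) * (y - y.mean())) / (n - 1)   # sample covariance
s_x, s_y = x.std(ddof=1), y.std(ddof=1)                    # sample standard deviations
r_xy = s_xy / (s_x * s_y)                                  # Pearson sample correlation

print(s_xy, s_x, s_y, r_xy)        # 11.0, 1.4907, 7.9303, 0.93
print(np.corrcoef(x, y)[0, 1])     # same correlation from numpy's built-in routine
```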

3 Linear Combinations of Random Variables


We now consider multivariate cases where there are two or more variables. Consider a scenario where we are creating a portfolio of n individual stocks with an initial capital of one million dollars. We must decide what percentage of the initial capital to invest in each stock so that certain goals can be achieved, for example, at least a 10% expected daily return and no more than 15% risk (measured by the standard deviation). To facilitate the decision process, we need to evaluate the portfolio's expected return and risk under various alternatives. Assuming that in one alternative we invest a proportion ai of the total capital in stock i, where 0 ≤ ai ≤ 1 and Σ_{i=1}^{n} ai = 1, we can find the expected return and risk for this alternative if we know the expected daily return and risk of each individual stock. These can be obtained from historical data: for example, the expected daily return of stock i is the mean daily return of stock i over the past three years (or any period for which we have data), and the risk of stock i is the standard deviation of its daily return over the same period. In addition to return and risk, we also need the covariance between every pair of stocks in the portfolio, which can again be obtained from historical data. Once this information about the individual stocks is available, the expected return and risk of the portfolio for a given composition ai, i = 1, . . . , n can easily be calculated using the theorems below. In this example, the daily return of each stock Xi, i = 1, . . . , n is a random variable, and the daily return of the portfolio is also a random variable, namely a linear combination of the n individual random variables (Xp = a1X1 + a2X2 + · · · + anXn).

Theorem 1 Let X1, X2, . . . , Xn be random variables with means µ1, µ2, . . . , µn and variances σ1², σ2², . . . , σn² respectively. Then

E[a1X1 + a2X2 + · · · + anXn] = a1E[X1] + a2E[X2] + · · · + anE[Xn]
                             = a1µ1 + a2µ2 + · · · + anµn                                           (1)

Var[a1X1 + a2X2 + · · · + anXn] = a1²σ1² + a2²σ2² + · · · + an²σn² + 2 Σ_{i=1}^{n−1} Σ_{j=i+1}^{n} ai aj Cov(Xi, Xj)    (2)

where

µ = E(X) = Σ_{all x} x f(x)

and

σ² = V(X) = E[(X − µ)²] = Σ_{all x} (x − µ)² f(x)
Theorem 1 shows that the expected value of a linear combination of random variables is the same linear combination of the means of those variables.

Theorem 2 Let X1, X2, . . . , Xn be independent random variables with means µ1, µ2, . . . , µn and variances σ1², σ2², . . . , σn² respectively. Then

Var[a1X1 + a2X2 + · · · + anXn] = a1²σ1² + a2²σ2² + · · · + an²σn²

When two variables Xi and Xj are independent, Cov(Xi, Xj) = 0. Thus, the covariance term in the variance formula of Theorem 1 drops out.

Theorem 3 Let X1, X2, . . . , Xn be independent, identically distributed random variables with mean µ and variance σ². Then

Var[a1X1 + a2X2 + · · · + anXn] = [a1² + a2² + · · · + an²]σ²

Example 3.1 Let X1, X2, . . . , Xn be independent, identically distributed random variables with mean µ and variance σ².

E[(1/n)X1 + (1/n)X2 + · · · + (1/n)Xn] = (1/n)E[X1] + (1/n)E[X2] + · · · + (1/n)E[Xn]
                                       = (1/n)µ + (1/n)µ + · · · + (1/n)µ = n(1/n)µ = µ

Var[(1/n)X1 + (1/n)X2 + · · · + (1/n)Xn] = (1/n)²σ² + (1/n)²σ² + · · · + (1/n)²σ² = n(1/n)²σ² = σ²/n

Note that (1/n)X1 + (1/n)X2 + · · · + (1/n)Xn = (X1 + X2 + · · · + Xn)/n = X̄. So E[X̄] = µ, Var(X̄) = σ²/n and σX̄ = σ/√n; that is, the mean and standard deviation of the sampling distribution of X̄ are µ and σ/√n, respectively. It is clear that as the sample size n increases, the standard deviation of the sampling distribution becomes smaller.
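The result E[X̄] = µ and σX̄ = σ/√n is easy to confirm by simulation. The sketch below is added for illustration, with an exponential distribution chosen arbitrarily as the parent population; it draws many samples of size n and watches the standard deviation of the sample mean shrink like σ/√n:

```python
import numpy as np

rng = np.random.default_rng(1)

# Exponential draws with mean 2 and standard deviation 2 (an arbitrary choice;
# any i.i.d. distribution with finite variance would illustrate the same point).
mu, sigma = 2.0, 2.0
for n in (4, 16, 64):
    xbar = rng.exponential(scale=mu, size=(50_000, n)).mean(axis=1)
    # xbar.mean() stays near mu; xbar.std() tracks sigma / sqrt(n)
    print(n, round(xbar.mean(), 3), round(xbar.std(), 3), round(sigma / np.sqrt(n), 3))
```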
We now rewrite Theorem 1 in matrix form. Let m = [µ1, µ2, . . . , µn]ᵀ and let C be the variance-covariance matrix, i.e.

    C = [ σ11   σ12   · · ·    σ1n ]
        [ σ21   σ22   · · ·    σ2n ]
        [  ⋮     ⋮      ⋱       ⋮  ]
        [ σn1   · · ·  σn,n−1  σnn ]

X1 \ X2   1    2    3    4    5    6
   1      2    3    4    5    6    7
   2      3    4    5    6    7    8
   3      4    5    6    7    8    9
   4      5    6    7    8    9   10
   5      6    7    8    9   10   11
   6      7    8    9   10   11   12

Table 1: Possible outcomes of Y = X1 + X2 (rows: value of X1; columns: value of X2)

If a = [a1, a2, . . . , an]ᵀ contains the coefficients of the linear combination in Theorem 1, Equations (1) and (2) can be rewritten as

E(Y) = aᵀm

and

Var(Y) = aᵀCa

where Y = a1X1 + a2X2 + · · · + anXn.
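In the portfolio setting, these two matrix products give the expected return and the variance of the portfolio return directly. Below is a small numpy sketch for a hypothetical three-stock portfolio; the weights, returns, and covariance matrix are made-up numbers used only for illustration, not data from the notes:

```python
import numpy as np

a = np.array([0.5, 0.3, 0.2])                 # portfolio weights, summing to 1
m = np.array([0.0008, 0.0005, 0.0010])        # assumed expected daily returns
C = np.array([[0.00040, 0.00012, 0.00010],    # assumed variance-covariance matrix
              [0.00012, 0.00025, 0.00008],
              [0.00010, 0.00008, 0.00060]])

exp_return = a @ m            # E(Y) = a'm
variance = a @ C @ a          # Var(Y) = a'Ca
risk = np.sqrt(variance)      # portfolio standard deviation (risk)

print(exp_return, variance, risk)
```

Changing the weight vector a and recomputing these two quantities is exactly the evaluation of alternatives described at the start of this section.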

Example 3.2 Let X be a discrete uniformly distributed random variable with possible values
1, 2, . . . , 6. Find the mean and standard deviation of the mean of nine randomly chosen observa-
tions.

µ = E(X) = Σ_{all x} x f(x) = (1)(1/6) + (2)(1/6) + · · · + (6)(1/6) = 21/6 = 3.5

Var(X) = E(X − µ)² = Σ_{all x} (x − µ)² f(x) = E(X²) − [E(X)]²

Since E(X²) = (1²)(1/6) + (2²)(1/6) + · · · + (6²)(1/6) = 91/6, Var(X) = E(X²) − [E(X)]² = 91/6 − (21/6)² = 546/36 − 441/36 = 105/36 = 2.9167

σX = 1.7

E(X̄) = E(X) = µ = 3.5

σX̄ = σX/√n = 1.7/√9 = 0.57

If we rolled a set of nine fair dice and averaged the number of dots on the top faces, we would expect this average to fall between 3.5 − 1.96(0.57) and 3.5 + 1.96(0.57), or between 2.38 and 4.62, about 95% of the time, if we believe the Central Limit Theorem applies to a sample this small.
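A quick simulation makes the last claim concrete; this Python sketch is added for illustration and is not part of the original notes:

```python
import numpy as np

rng = np.random.default_rng(0)

# Average of nine fair dice, repeated many times
reps = 100_000
averages = rng.integers(1, 7, size=(reps, 9)).mean(axis=1)

print(averages.mean())   # close to 3.5
print(averages.std())    # close to 1.7078 / 3 = 0.57
coverage = np.mean((averages >= 2.38) & (averages <= 4.62))
print(coverage)          # roughly 0.95 if the normal approximation is adequate
```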

Example 3.3 Let X1, X2 be independent, discrete uniformly distributed random variables with possible values 1, 2, . . . , 6. Find the mean and standard deviation of the random variable Y = X1 + X2.

From Table 1, we have

µ = E(Y) = Σ_{all y} y f(y) = (2)(1/36) + (3)(2/36) + (4)(3/36) + (5)(4/36) + (6)(5/36) + (7)(6/36)
    + (8)(5/36) + (9)(4/36) + (10)(3/36) + (11)(2/36) + (12)(1/36) = 252/36 = 7

This is the same as 2 × E(X) = 2 × 3.5.

Var(Y) = E(Y − µ)² = Σ_{all y} (y − µ)² f(y) = E(Y²) − [E(Y)]²

       = [2²(1/36) + 3²(2/36) + 4²(3/36)
        + 5²(4/36) + 6²(5/36) + 7²(6/36)
        + 8²(5/36) + 9²(4/36) + 10²(3/36)
        + 11²(2/36) + 12²(1/36)] − 7² = 1974/36 − 49 = 5.8333

σY = √5.8333 = 2.415

We can also obtain this from

Var(Y) = Var(X) + Var(X) = 2 Var(X) = 2 × 2.9167
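These values can also be checked by brute-force enumeration of the 36 equally likely outcomes in Table 1; the Python sketch below is added for illustration:

```python
from itertools import product

# Y = X1 + X2 for two independent fair dice: enumerate all 36 equally likely sums
sums = [x1 + x2 for x1, x2 in product(range(1, 7), repeat=2)]

mean_y = sum(sums) / 36
var_y = sum((y - mean_y) ** 2 for y in sums) / 36

print(mean_y, var_y, var_y ** 0.5)   # 7.0, about 5.8333, about 2.415
```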

Example 3.4 A car dealer estimates that she has a 30% chance of selling 3 cars a day, a 40% chance of selling 2 cars a day, a 20% chance of selling 1 car a day, and a 10% chance of no sales in a day.

1. What is the expected number of cars sold by the dealer and what is the standard deviation?

2. If the dealer now owns 3 stores and the distribution of number of cars sold in a day is
identical in all stores, what is the expected total number of cars sold in a day by the 3
stores and what is the standard deviation of the total? (assume the distribution for each
store is the same as the one described for the one store case)

3. In the 3-store case, what is the expected value of the average number of cars sold a day
from the three stores and what is the standard deviation of the average?
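One way to check answers to these questions is to compute the quantities directly from the stated distribution. The Python sketch below is added for illustration; for parts 2 and 3 it additionally assumes the three stores' daily sales are independent, which is what makes Theorems 2 and 3 applicable:

```python
import numpy as np

# Daily sales distribution at one store, from Example 3.4
cars = np.array([0, 1, 2, 3])
probs = np.array([0.10, 0.20, 0.40, 0.30])

mu = np.sum(cars * probs)                  # expected cars sold per day at one store
var = np.sum((cars - mu) ** 2 * probs)
sd = np.sqrt(var)
print(mu, sd)                              # part 1: 1.9 and about 0.94

# Parts 2 and 3: three stores, assumed independent with identical distributions
print(3 * mu, np.sqrt(3 * var))            # mean and standard deviation of the daily total
print(mu, sd / np.sqrt(3))                 # mean and standard deviation of the daily average
```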
