
E-Note 20895 Content Document 20240607120458PM

Class Notes

Uploaded by

isseihyoudou609

Unified file - Probability and Statistics - 4th Sem,

B.Tech. 2023
Department of Mathematics - SOE, DSU

June 7, 2024

Contents
1. Syllabus
2. Module1 - Probability
3. Module2-Discrete RVs
4. Module2-Non-discrete RVs
5. Module 3 - Class notes
6. Module4-HypTest
7. Module5GoodnessOfFit
8. Module5LinearReg
9. Module5LastReg
10. Practice-problems-Mod1
11. Practice-problems-Mod2
12. Practice-problems2-Mod2
13. PracticeQuestionsMod3
14. Assignment-Mod1+2
15. Assignment2

PROBABILITY AND STATISTICS
[As per Choice Based Credit System (CBCS) scheme]
SEMESTER – IV
Subject Code : Credits : 03
Hours / Week : 03 Hours Total Hours : 39 Hours
L–T–P–S : 3–0–0–0
Course Learning Objectives:
This Course will enable students to:
1. Apply statistical principles and probability concepts to solve complex problems in
real-world scenarios involving uncertainty and randomness.
2. Evaluate and select appropriate probability distributions and statistical techniques
to analyze and interpret data accurately in various applications.
3. Justify the use of estimation methods and hypothesis testing techniques for
drawing meaningful inferences about population parameters.
4. Analyze and interpret sample test results for different statistical relationships, such
as means, variances, correlation coefficients, regression coefficients, goodness of
fit, and independence, to make informed decisions.
5. Identify sample tests using appropriate statistical procedures to investigate the
significance of observed data and communicate findings effectively.
Teaching-Learning Process (General Instructions)
These are sample pedagogical methods that teachers can use to accelerate the
attainment of the various course outcomes.
1. Lecture method: this includes not only the traditional lecture but also other
teaching methods that may be adopted to develop the course outcomes.
2. Interactive teaching: adopt active learning, which includes brainstorming,
discussing, group work, focused listening, formulating questions, note-taking,
annotating, and role-playing.
3. Show videos or animations to explain how various concepts work.
4. Encourage collaborative (group) learning in the class.
5. To promote critical thinking, ask at least three higher-order thinking questions in the
class.
6. Adopt problem-based learning, which fosters students' analytical skills and develops
thinking skills such as the ability to evaluate, generalize, and analyse information
rather than simply recall it.
7. Show different ways to solve the same problem and encourage the students to
come up with their own creative ways to solve it.
8. Discuss how each concept can be applied to the real world; where that is
possible, it helps improve the students' understanding.
UNIT – I : Probability 09 Hours
Definitions of Probability, Addition Theorem, Conditional Probability, Multiplication
Theorem, Bayes’ Theorem of Probability

UNIT – II: Random Variables and their Properties and Probability 09 Hours
Distributions
Discrete Random Variable, Continuous Random Variable, Joint Probability Distributions
and their Properties. Probability Distributions: Discrete Distributions: Binomial and Poisson
Distributions and their Properties; Continuous Distributions: Exponential and Normal
Distributions and their Properties.

UNIT – III: Estimation and testing of hypothesis 06 Hours


Sample, Populations, Statistic, Parameter, Sampling Distribution, Standard Error,
Un-Biasedness, Efficiency, Maximum Likelihood Estimator, Notion & Interval Estimation.

UNIT – IV: Sample Tests-1 07 Hours


Large Sample Tests Based on Normal Distribution , Small Sample Tests : Testing Equality
of Means, Testing Equality of Variances, Test of Correlation Coefficient

UNIT – V: Sample Tests-2 08 Hours


Test for Regression Coefficient; Coefficient of Association, χ² – Test for Goodness of Fit,
Test for Independence.

Course Outcomes (Bloom's Taxonomy Level shown in brackets)
At the end of the course the student will be able to:
CO1 (L2 & L3): Apply the principles of probability to solve complex problems in
various real-world scenarios.
CO2 (L2 & L3): Solve and compare different probability distributions, including
discrete and continuous random variables, in order to make
informed decisions and predictions.
CO3 (L3): Apply statistical estimation techniques, such as maximum
likelihood estimation and interval estimation, to draw meaningful
inferences about population parameters from sample data.
CO4 (L4): Examine hypothesis testing methods, including large and small
sample tests, to assess the significance of observed data and draw
valid conclusions.
CO5 (L4): Analyze statistical relationships and perform sample tests to
assess the equality of means in different populations, correlation
coefficients between variables to determine the strength and
direction of the relationship, and independence of variables using
appropriate statistical tests to assess the absence of any
relationship.

Table: Mapping Levels of COs to POs / PSOs

COs Program Outcomes (POs) PSOs


1 2 3 4 5 6 7 8 9 10 11 12 1 2
CO1 3 2 2 2 1
CO2 3 2 2 2 1
CO3 3 2 2 1
CO4 3 2 2 2 1
CO5 3 2 2 2 1
3: Substantial (High) 2: Moderate (Medium) 1: Poor (Low)
TEXT BOOKS:

1. Probability & Statistics for Engineers and Scientists, Walpole, Myers, Myers, Ye.
Pearson Education.
REFERENCE BOOKS:
1. Probability, Statistics and Random Processes T. Veerarajan Tata McGraw – Hill
2. Probability & Statistics with Reliability, Queuing and Computer Applications, Kishor
S. Trivedi, Prentice Hall of India ,1999
E-Resources:
1. https://nptel.ac.in/courses/106104233
2. https://nptel.ac.in/courses/117103067
3. https://nptel.ac.in/courses/103106120
4. https://www.coursera.org/learn/probability-intro#syllabus
5. https://nptel.ac.in/courses/111104073
Activity Based Learning (Suggested Activities in Class)

1. Tools such as Python or R programming can be used, which help students
develop the skill to analyze a problem and provide a solution.
2. Regular chapter-wise assignments, activities and case studies can help students develop
critical thinking, an expert mindset, problem-solving and teamwork.
The following assignments/activities can be carried out using the R programming language,
Python programming or Excel Solver.
1. There are n people gathered in a room. What is the probability that at least 2 of them
will have the same birthday? (Use excel solver, R Programming, Python Programming)
a. Use simulation to estimate this for various n, and produce a simulation graph.
b. Find the smallest value of n for which the probability of a match is greater than 0.5.
c. Explore how the number of trials in the simulation affects the variability of our
estimates.
2. Case Study 1: Customer Arrivals at a Coffee Shop
a. A coffee shop wants to analyze the number of customer arrivals during its
morning rush hour (7:00 AM to 9:00 AM). The shop has been recording the
number of customer arrivals every 15 minutes for the past month.
b. Data: The data consists of the number of customer arrivals recorded at the coffee
shop during each 15-minute interval for the past month.
c. Here is a sample of the data:

Time Interval Customer Arrivals


7:00 AM - 7:15 AM 6
7:15 AM - 7:30 AM 4
7:30 AM - 7:45 AM 9
7:45 AM - 8:00 AM 7
8:00 AM - 8:15 AM 5
8:15 AM - 8:30 AM 8
8:30 AM - 8:45 AM 10
8:45 AM - 9:00 AM 6

Analyze the customer arrivals and determine the probability distribution that
best fits the data. Specifically, explore both discrete and continuous probability
distributions, including the binomial, Poisson, exponential, and normal
distributions.
3. Case Study 2: Comparing the Performance of Two Groups
a. Suppose you are a data analyst working for a company that manufactures a new
energy drink. The marketing team conducted a promotional campaign in two
different cities (City A and City B) to determine the effectiveness of the campaign
in increasing sales. The sales data for a random sample of customers in each city
was collected over a week. Your task is to compare the average sales between the
two cities and test whether there is a significant difference in the variance of
sales.
b. Data: Let's assume the following sample data for the number of energy drinks
sold in each city:
City A: [30, 28, 32, 29, 31, 33, 34, 28, 30, 32]
City B: [25, 24, 26, 23, 22, 27, 29, 30, 26, 24]
Perform a two-sample t-test to test the equality of means and a test for equality
of variances using Python's SciPy library.
4. Case Study 3: Testing independence between two categorical variables.
a. Data: Sample of 100 employees, and each employee is classified as either Male or
Female. They were asked to rate their job satisfaction on a scale of 1 to 5, where
1 represents low satisfaction and 5 represents high satisfaction. The data is as
follows:

Employee Gender Job Satisfaction

1 Male 4

2 Female 3

3 Male 2

4 Female 5

... ... ...

100 Female 4

b. Test for independence between gender and job satisfaction using the chi-squared
test in R.
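
Activity 1 above (the birthday problem) can be sketched in Python as follows. This is a minimal illustration, not a prescribed solution: it assumes 365 equally likely birthdays (no leap years), and the function names, the trial count and the random seed are all hypothetical choices made for the example. Plotting is left out so that only the standard library is needed.

```python
import random
from math import prod

def exact_match_probability(n, days=365):
    """Exact P(at least two of n people share a birthday).
    Assumption: 'days' equally likely birthdays, no leap years."""
    if n > days:
        return 1.0
    # P(no match) = (365/365) * (364/365) * ... * ((365 - n + 1)/365)
    p_no_match = prod((days - k) / days for k in range(n))
    return 1.0 - p_no_match

def simulated_match_probability(n, trials=10_000, days=365, seed=0):
    """Monte Carlo estimate of the same probability (part a)."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        birthdays = [rng.randrange(days) for _ in range(n)]
        if len(set(birthdays)) < n:  # a repeated value means a shared birthday
            hits += 1
    return hits / trials

def smallest_n_exceeding(threshold=0.5, days=365):
    """Smallest n whose exact match probability exceeds the threshold (part b)."""
    n = 1
    while exact_match_probability(n, days) <= threshold:
        n += 1
    return n

# Part a: compare simulation with the exact value for a few group sizes.
for n in (10, 20, 23, 30, 40):
    print(n, round(simulated_match_probability(n), 3),
          round(exact_match_probability(n), 3))

# Part b: smallest n with probability of a match greater than 0.5.
print("smallest n:", smallest_n_exceeding(0.5))

# Part c: more trials should give estimates that vary less between seeds.
for trials in (100, 1_000, 10_000):
    estimates = [simulated_match_probability(23, trials, seed=s) for s in range(5)]
    print(trials, [round(e, 3) for e in estimates])
```

Rerunning the last loop with different seeds gives a feel for how the spread of the estimates shrinks as the number of trials grows.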
************************************
Department of Mathematics
COURSE TITLE: PROBABILITY AND STATISTICS
MODULE 1: PROBABILITY
Contents:
Definitions of Probability, Addition Theorem, Conditional Probability, Multiplication Theorem, Bayes’ Theorem of
Probability.
Introduction:
➢ The theory of probability is the study of random phenomena/experiments, which are not deterministic.
➢ A deterministic experiment is an experiment whose outcome or result is known with certainty or
is predictable, i.e., the result is unique.
➢ A non-deterministic/random experiment is an experiment whose outcome or result is not unique and
therefore cannot be predicted with certainty.
➢ In such random experiments, we are not sure of the outcome, but we intend to estimate the chance that a
given outcome occurs.
➢ Examples:
➢ Tossing a coin: head or tail may occur.
➢ Throwing a die: 1, 2, 3, 4, 5, or 6 may appear.
Preliminary definitions
• A random experiment is an experiment whose outcome or result is not unique and therefore cannot
be predicted with certainty.
• Trial is a single performance of an experiment.
• Sample space S of a random experiment is the set of all possible outcomes of the experiment.
• Examples:
i) Tossing of a coin: S = {H, T }
ii) Throwing a die: S = {1, 2, 3, 4, 5, 6}
• Event: Event E is a subset of a sample space S.
• Examples:
1) Tossing of a die
• E1 = {odd number} = {1, 3, 5} , E2 = {even number} = {2, 4, 6}
• E3 = {prime number} = {2, 3, 5}, E4 = {number greater than 2} = {3, 4, 5, 6}

2) Tossing a coin twice. The sample space is S = {HH, HT, TH, TT}. E = {HH, HT} is an event,
which can be described in words as "the first toss results in a head".
• Mutually exclusive events: Two events A and B are mutually exclusive if A and B cannot happen
(occur) simultaneously, i.e., A ∩ B = ∅, i.e., A and B are disjoint.
• Example :
➢ In tossing a coin, getting a head and getting a tail are mutually exclusive because if a head turns up, then
getting a tail is not possible.
➢ In throwing a die, getting any of the numbers 1, 2, 3, 4, 5, 6 are mutually exclusive events, as the
appearance of any one number rules out the appearance of the other numbers.

• Collectively exhaustive events: Events A1, A2, . . . , An are said to be collectively exhaustive if
$\bigcup_{i=1}^{n} A_i = S$.
• Independent events: Two or more events are said to be independent if the happening or non-happening of
one event does not affect the probability of the happening or non-happening of the others.
• Example:
1) When two coins are tossed, getting a head on one coin is independent of getting a head on the other, since the
outcome of one coin does not influence the outcome of the other.
2) When a card is drawn at random from a pack of 52 cards and the card is replaced, the result of
the second draw is independent of the first. But if the card is not replaced, then the result of the second
draw depends on the result of the first draw.
Mathematical definition of Probability
➢ If the outcome of a trial consists of n exhaustive, mutually exclusive, equally likely cases, of which m
are favorable to an event E, then the probability of the happening of the event E, usually denoted by P(E)
or simply p, is defined to be equal to $\frac{m}{n}$,
i.e., $P(E) = p = \frac{\text{number of favourable cases for } E}{\text{number of possible cases}} = \frac{m}{n}$

Note: ➢ Since m cases are favorable to the event E, it follows that (n − m) cases are not favorable to the event. The set of
unfavorable cases is denoted by $\bar{E}$ or $E'$ or $E^c$ (complement of E).
Therefore the probability of the non-happening of the event (probability of failure), denoted by q, is given by
$q = P(\bar{E}) = \frac{n-m}{n} = 1 - \frac{m}{n} = 1 - P(E)$

➢ Therefore, p + q = 1, 0 ≤ p ≤ 1 and 0 ≤ q ≤ 1.

➢ If P(E) = 1, then the event E is called the sure event, and if P(E) = 0, then the event
E is called the impossible event.
➢ The number of elements in the sample space S is called the order of the sample space, denoted by o(S).
The number of elements in the event E gives the order of the event E, which is denoted by o(E).
➢ ∴ $P(E) = \frac{o(E)}{o(S)} = \frac{m}{n}$.
Statistical or Empirical definition of Probability:

➢ If the experiment/trial is repeated a large number of times, then

$p = P(E) = \lim_{n \to \infty} \frac{m}{n}$

where m is the number of times event E happens (occurs) in n trials, assuming that the trials are performed under
essentially identical conditions.
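
The empirical definition can be illustrated with a short simulation. The sketch below is only an illustration under stated assumptions: it simulates a fair die with Python's pseudo-random generator, and the names `relative_frequency` and `is_odd` are hypothetical helpers introduced for this example. It tracks the relative frequency m/n of the event "odd number" as n grows.

```python
import random

def relative_frequency(event, n_trials, seed=0):
    """Return m/n, where m counts how often 'event' occurs in n_trials
    simulated throws of a fair six-sided die."""
    rng = random.Random(seed)
    m = sum(1 for _ in range(n_trials) if event(rng.randint(1, 6)))
    return m / n_trials

def is_odd(face):
    """Event E1 = {1, 3, 5} from the examples above."""
    return face % 2 == 1

# The ratio m/n should settle near the classical value 3/6 = 0.5 as n grows.
for n in (10, 100, 1_000, 10_000, 100_000):
    print(n, relative_frequency(is_odd, n))
```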

Axioms of Probability :
Consider an experiment with sample space S. A real-valued function P on the space of all events of the experiment is
called a probability measure if

(i) For every event E, 0 ≤ P(E) ≤ 1

(ii) P(S) = 1 for the sure or certain event S.

(iii) For any sequence of mutually exclusive events E1, E2, ...,

$P\left(\bigcup_{i=1}^{\infty} E_i\right) = \sum_{i=1}^{\infty} P(E_i)$
Probability Theorems
➢ Addition theorem of probability
If A and B are any two arbitrary events of S, then P(A ∪ B) = P(A) + P(B) − P(A ∩ B)
➢ Special case :
The probability of the happening of one or the other of two mutually exclusive events is equal to the sum of
the probabilities of the two events, i.e., if A and B are two mutually exclusive events, then
P(A or B) = P(A ∪ B) = P(A) + P(B)
➢ Multiplication rule of Probability
If A and B are any two arbitrary events of S with P(A) > 0, then P(A and B) = P(A ∩ B) = P(A) · P(B|A), where
P(B|A) is the probability of the event B given that A has occurred (in other words, the conditional probability of
the event B when the event A has already happened).

➢ Special case:
If a compound event is made up of a number of independent events, the probability of the happening of the
compound event is equal to the product of the probabilities of the independent events, i.e., if A and B are independent
events, then
P(A and B) = P(A ∩ B) = P(A) · P(B)
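Note:
➢ $P(\emptyset) = 0$
➢ $P(\bar{A}) = 1 - P(A)$, where $\bar{A}$ is the complement of event A.
➢ If A ⊆ B, then P(A) ≤ P(B).
➢ If A and B are independent events, then
➢ $\bar{A}$ and $\bar{B}$ are independent: $P(\bar{A} \cap \bar{B}) = P(\bar{A}) \cdot P(\bar{B})$
➢ $\bar{A}$ and B are independent: $P(\bar{A} \cap B) = P(\bar{A}) \cdot P(B)$
➢ A and $\bar{B}$ are independent: $P(A \cap \bar{B}) = P(A) \cdot P(\bar{B})$

➢ De Morgan's laws: $P(\overline{A \cup B}) = P(\bar{A} \cap \bar{B})$ and $P(\overline{A \cap B}) = P(\bar{A} \cup \bar{B})$

➢ $P(A \cup B \cup C) = P(A) + P(B) + P(C) - P(A \cap B) - P(B \cap C) - P(C \cap A) + P(A \cap B \cap C)$

➢ $P(A|\bar{B}) = \frac{P(A) - P(A \cap B)}{1 - P(B)}$, where $P(B) \neq 1$, and $P(\bar{A}|B) = 1 - \frac{P(A \cap B)}{P(B)}$, where $P(B) \neq 0$

➢ If A, B, C are any three events with P(C) > 0, $P((A \cup B)|C) = P(A|C) + P(B|C) - P((A \cap B)|C)$

➢ If A and B are mutually exclusive events, $P((A \cup B)|C) = P(A|C) + P(B|C)$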
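
Several of the identities above can be checked on a small sample space. The following sketch is an illustration only: it reuses the die events from the preliminary definitions (odd, prime, greater than 2), assumes a fair die so that the classical definition applies, and the helper names `P`, `P_given` and `complement` are hypothetical.

```python
from fractions import Fraction

S = frozenset(range(1, 7))  # sample space for one throw of a fair die

def P(event):
    """Classical probability o(E)/o(S), kept as an exact fraction."""
    return Fraction(len(event & S), len(S))

def P_given(event, given):
    """Conditional probability P(event | given); requires P(given) > 0."""
    return P(event & given) / P(given)

def complement(event):
    return S - event

A = frozenset({1, 3, 5})     # odd number
B = frozenset({2, 3, 5})     # prime number
C = frozenset({3, 4, 5, 6})  # number greater than 2

# Addition theorem for two events
addition_ok = P(A | B) == P(A) + P(B) - P(A & B)

# Addition theorem for three events
three_ok = P(A | B | C) == (P(A) + P(B) + P(C)
                            - P(A & B) - P(B & C) - P(C & A)
                            + P(A & B & C))

# De Morgan's law: complement of a union is the intersection of complements
de_morgan_ok = complement(A | B) == complement(A) & complement(B)

# Conditional form of the addition theorem
conditional_ok = (P_given(A | B, C)
                  == P_given(A, C) + P_given(B, C) - P_given(A & B, C))

print(P(A | B), addition_ok, three_ok, de_morgan_ok, conditional_ok)
```

Using `Fraction` avoids floating-point rounding, so the equalities are tested exactly.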
Review of counting:
• Sum rule:
If a first task can be done in n1 ways and a second task in n2 ways and if these two tasks
cannot be performed simultaneously, then there are n1 + n2 ways of performing either task.
• Example:
Suppose a university representative is to be chosen either from 200 teaching or 300
nonteaching employees. Then there are 200 + 300 = 500 possible ways to pick this
representative.
• Extension of Sum Rule:
If tasks T1, T2 . . . Tm can be done in n1, n2, . . . , nm ways respectively and no two of
these tasks can be performed at the same time, then the number of ways to do one of these
tasks is n1 + n2 + · · · + nm.
• Example:
If a student can choose a project from either 20 from Mathematics or 35 from
computer science or 15 from Mechanics then the student can choose a project in 20 + 35 + 15
= 70 ways.
• The Product Rule:
Suppose a procedure can be broken down into two tasks T1 and T2. If the first task T1 can be performed in n1 ways
and the second task T2 can be performed in n2 ways after the first task T1 has been done, then the total procedure can be
carried out, in the designated order, in n1 · n2 ways.
• Example:
A tourist can travel from Hyderabad to Tirupati in 4 ways (by plane, train, bus or taxi). He can travel from Tirupati to
Tirumala hills in 5 ways (by bus, taxi, walk, chair car or motor cycle). Then the tourist can travel from Hyderabad to
Tirumala hills in 4 × 5 = 20 ways.

• Extension of Product Rule:


Suppose a procedure consists of performing tasks T1, T2, . . ., Tm in that order. Suppose task Ti can be performed in
ni ways, after the tasks T1, T2, . . ., Ti−1 are performed, then the number of ways the procedure can be executed in the
designated order is n1 · n2 · n3 . . . nm.
• Example:
• A clothing brand produces shirt in 12 colors, both male and female version, and it comes in 4 sizes for each in three
ranges economy, standard and luxury. Then the number of different types of shirts produced are 12 × 2 × 4 × 3 = 288
types.
• A hotel offers 12 kinds of sweets, 10 kinds of hot tiffins and 5 kinds of beverages (hot tea, hot coffee,
juice, coke, ice cream).
The breakfast consists of a sweet and a hot beverage or a hot tiffin and cold beverage.
The number of ways in which the above breakfast can be ordered is 12 × 2 + 10 × 3 = 24 + 30 = 54.
(Here we have applied both product rule and sum rule.)
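
The hotel breakfast count can be confirmed by listing every allowed order. This is a small sketch with made-up item labels (`sweet1`, `tiffin1`, and so on are placeholders, not names from the notes); only the counts 12, 10, 2 hot beverages and 3 cold beverages come from the example.

```python
from itertools import product

# Placeholder labels; only the counts matter for the argument.
sweets = [f"sweet{i}" for i in range(1, 13)]    # 12 kinds of sweets
tiffins = [f"tiffin{i}" for i in range(1, 11)]  # 10 kinds of hot tiffins
hot_beverages = ["hot tea", "hot coffee"]
cold_beverages = ["juice", "coke", "ice cream"]

# Product rule inside each option, sum rule across the two options.
sweet_with_hot = list(product(sweets, hot_beverages))     # 12 * 2 orders
tiffin_with_cold = list(product(tiffins, cold_beverages))  # 10 * 3 orders
breakfasts = sweet_with_hot + tiffin_with_cold

print(len(sweet_with_hot), len(tiffin_with_cold), len(breakfasts))
```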
Permutation and Combinations:
• A Permutation of a set of n distinct objects is an ordered arrangement of these n objects.
• An r-permutation is an ordered arrangement of r elements taken from the n objects.
• Example:
• A = {a, b, c, d}. Arrangements dcba, cdba are permutation of A.
• Arrangements abc, abd, bcd, dbc, etc. are 3-permutations of A.
• Arrangements ab, ba, cd, dc, etc. are 2-permutations of A.
• The number of r-permutations of a set with n distinct elements is denoted by ${}^{n}P_{r}$ and is given by
${}^{n}P_{r} = \frac{n!}{(n-r)!}, \quad 0 \le r \le n$
• An r-combination is an unordered selection or combination of r elements from a set with n distinct elements.
• The number of combinations of size r from a set of size n is denoted by C(n, r) or ${}^{n}C_{r}$ and is given by
${}^{n}C_{r} = \frac{n!}{(n-r)!\, r!}, \quad 0 \le r \le n$
• Example:
A = {a, b, c, d, e}. The number of 3-combinations is ${}^{5}C_{3} = \frac{5!}{3!\,2!} = 10$.
They are {a, b, c}, {a, b, d}, {a, b, e}, {b, c, d}, {b, c, e}, {c, d, e}, {a, c, e}, {a, c, d}, {b, d, e}, {d, e, a}. Observe
that the order is irrelevant in combinations. Thus {a, b, c}, {a, c, b}, {b, a, c}, {b, c, a}, {c, a, b}, {c, b, a} are all one
and the same 3-combination of a, b, c.
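
The counting formulas can be cross-checked against explicit enumeration. The sketch below uses Python's standard `math.perm`, `math.comb` and `itertools`; the variable names are chosen for this example only.

```python
from itertools import combinations, permutations
from math import comb, perm

A = ["a", "b", "c", "d", "e"]

# Enumerate the 3-combinations (unordered) and 3-permutations (ordered) of A.
three_combinations = list(combinations(A, 3))
three_permutations = list(permutations(A, 3))

# The formulas n!/((n-r)! r!) and n!/(n-r)! give the same counts.
print(len(three_combinations), comb(5, 3))
print(len(three_permutations), perm(5, 3))

# Each 3-combination corresponds to 3! = 6 orderings, so nPr = nCr * r!.
print(perm(5, 3) == comb(5, 3) * 6)
```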
Examples:
➢ The probability of getting a 'head' in tossing a coin.
The possible outcomes are head and tail. S = {H, T}
Number of possible (exhaustive) cases/outcomes (n) = 2 and number of favorable cases/outcomes (m) = 1.
∴ Probability of getting a head, $p = \frac{m}{n} = \frac{1}{2}$

➢ The probability of getting (a) a king (b) a king or a queen, when a card is drawn at random from a pack of 52 cards.

Number of possible (exhaustive) cases/outcomes (n) = 52

(a) Number of favorable cases (m) = 4.
∴ Probability of getting a king, $p = \frac{m}{n} = \frac{4}{52} = \frac{1}{13}$

(b) Number of favorable cases (m) = 4 + 4 = 8.
∴ Probability of getting a king or a queen, $p = \frac{m}{n} = \frac{8}{52} = \frac{2}{13}$
➢ The probability of getting (a) a number greater than 2 (b) an
odd number when a 'die' is thrown.
Number of possible (exhaustive) cases (n) = 6
(a) Number of favorable cases (m) = 4 (the numbers 3, 4, 5, 6 are favourable to the event).
∴ Probability of getting a number greater than 2, $p = \frac{m}{n} = \frac{4}{6} = \frac{2}{3}$

(b) Number of favorable cases (m) = 3.
∴ Probability of getting an odd number, $p = \frac{m}{n} = \frac{3}{6} = \frac{1}{2}$

➢ Probability that a leap year will have 53 Sundays.


Question 1: A box contains 3 white, 5 black and 6 red balls. If a ball is drawn at random what is the probability that it is
(a) either red or white (b) either white or black (c) either black or red
(d) white or black or red.

Solution: Total number of balls in the box = 3 white+ 5 black + 6 red = 14


One ball has to be chosen at random. Therefore $o(S) = {}^{14}C_{1} = 14$.
Let us consider the events in the sample space as
W: drawing a white ball from the box, so $o(W) = {}^{3}C_{1} = 3$
B: drawing a black ball from the box, so $o(B) = {}^{5}C_{1} = 5$
R: drawing a red ball from the box, so $o(R) = {}^{6}C_{1} = 6$
(a) The probability of drawing 1 red ball: $P(R) = \frac{o(R)}{o(S)} = \frac{6}{14}$

The probability of drawing 1 white ball: $P(W) = \frac{o(W)}{o(S)} = \frac{3}{14}$

Here R and W are mutually exclusive events. So by the addition theorem of probability,
$P(R \text{ or } W) = P(R \cup W) = P(R) + P(W) = \frac{6}{14} + \frac{3}{14} = \frac{9}{14}$

(b) Similarly, the probability of drawing 1 black ball: $P(B) = \frac{o(B)}{o(S)} = \frac{5}{14}$

The probability of drawing 1 white ball: $P(W) = \frac{o(W)}{o(S)} = \frac{3}{14}$

Here B and W are mutually exclusive events. So by the addition theorem of probability,
$P(B \text{ or } W) = P(B \cup W) = P(B) + P(W) = \frac{5}{14} + \frac{3}{14} = \frac{8}{14} = \frac{4}{7}$

(c) Similarly,
the probability of drawing 1 black ball: $P(B) = \frac{o(B)}{o(S)} = \frac{5}{14}$

The probability of drawing 1 red ball: $P(R) = \frac{o(R)}{o(S)} = \frac{6}{14}$

Here B and R are mutually exclusive events. So by the addition theorem of probability,
$P(B \text{ or } R) = P(B \cup R) = P(B) + P(R) = \frac{5}{14} + \frac{6}{14} = \frac{11}{14}$
(d) $P(B \text{ or } R \text{ or } W) = P(B \cup R \cup W) = P(B) + P(R) + P(W) = \frac{5}{14} + \frac{6}{14} + \frac{3}{14} = \frac{14}{14} = 1$
Question 2: A box contains 2 white and 2 black balls and a second box contains 2 white and 4 black balls.
If one ball is drawn at random from each box what is the probability that they are of the same color?

Solution :
Total no. of balls in the first box = 2W + 2B = 4 and total no. of balls in the second box = 2W + 4B = 6.
Case (i) Event A: both the balls drawn are white; the probability is $\frac{2}{4} \times \frac{2}{6} = \frac{1}{6}$
Case (ii) Event B: both the balls drawn are black; the probability is $\frac{2}{4} \times \frac{4}{6} = \frac{1}{3}$

Since either of these two mutually exclusive cases is favourable to the event,
the probability is $P(A \text{ or } B) = P(A \cup B) = P(A) + P(B) = \frac{1}{6} + \frac{1}{3} = \frac{1}{2}$
Question 3: 5 balls are drawn at random from a bag of 6 white and 4 black balls. What is the probability that 3 of them
are white and 2 are black?

Solution:
➢ By data, 5 balls are to be drawn at random from the bag of 10 balls.
Therefore $o(S) = {}^{10}C_{5} = 252$

➢ Number of ways of getting 3 white and 2 black balls = ${}^{6}C_{3} \times {}^{4}C_{2} = 20 \times 6 = 120$ ways.
➢ Therefore the probability of getting 3 white and 2 black balls when 5 balls are drawn at random
from the bag of 10 balls is

$P(\text{3 white and 2 black}) = \frac{{}^{6}C_{3} \times {}^{4}C_{2}}{{}^{10}C_{5}} = \frac{120}{252} = \frac{10}{21} \approx 0.4762$
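
The calculation in Question 3 follows a pattern that can be written once and reused. A minimal sketch, assuming draws without replacement where every set of drawn balls is equally likely; the function name `prob_white_black` is a hypothetical helper, not part of the notes.

```python
from fractions import Fraction
from math import comb

def prob_white_black(white, black, draw_white, draw_black):
    """P(exactly draw_white white and draw_black black balls) when
    draw_white + draw_black balls are drawn without replacement."""
    favourable = comb(white, draw_white) * comb(black, draw_black)
    possible = comb(white + black, draw_white + draw_black)
    return Fraction(favourable, possible)

# Question 3: bag with 6 white and 4 black balls, draw 3 white and 2 black.
p = prob_white_black(6, 4, 3, 2)
print(p, float(p))
```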
Question 4: Suppose the manufacturer’s specifications for the length of a certain type of computer cable are 2000±10
millimeters. In this industry, it is known that small cable is just as likely to be defective (not meeting specifications)
as large cable (i.e.,) the probability of randomly producing a cable with length exceeding 2010 millimeters is equal
to probability of producing a cable with length smaller than 1990 millimeters. The probability that the production
procedure meets specifications is known to be 0.99.
(a) What is the probability that a cable selected randomly is too large?
(b) What is the probability that a randomly selected cable is larger than 1990 millimeters?

Solution:
➢ Let M be the event that a cable meets specifications. Given that P(M) = 0.99.
➢ Then $P(\bar{M}) = 1 - 0.99 = 0.01$, where $\bar{M}$ is the event that a cable does not meet specifications, i.e., its length is more than 2010
millimeters (too large) or less than 1990 millimeters (too small).
➢ (a) Let L be the event that the cable is too large and S be the event that the cable is too small.
➢ P(L) = P(S) (given)
➢ P(L) + P(S) = 0.01 ⇒ P(L) = 0.01/2 = 0.005
➢ (b) Let X be the length of a randomly selected cable. Then P(1990 ≤ X ≤ 2010) = P(M) = 0.99
➢ P(X > 2010) = P(L) = 0.005
➢ P(X ≥ 1990) = P(1990 ≤ X ≤ 2010) + P(L) = 0.99 + 0.005 = 0.995.
Question 5: A bag contains 4 white and 2 black balls. Another bag contains 3 white and 5 black balls. If a ball is drawn from
each, find the probability that (i) both are white (ii) both are black (iii) one is black and another is white.

Solution :
Total no. of balls in the first bag B1 = 4W + 2B = 6 and total no. of balls in the second bag B2 = 3W + 5B = 8.
(i) Both the balls drawn are white: the probability is $\frac{4}{6} \times \frac{3}{8} = \frac{1}{4}$
(ii) Both the balls drawn are black: the probability is $\frac{2}{6} \times \frac{5}{8} = \frac{5}{24}$
(iii) Event A: black from B1 and white from B2, with $P(A) = \frac{2}{6} \times \frac{3}{8} = \frac{1}{8}$ (OR)
Event B: white from B1 and black from B2, with $P(B) = \frac{4}{6} \times \frac{5}{8} = \frac{5}{12}$

Since either of these two mutually exclusive cases is favourable to the event,
the required probability is $P(A \text{ or } B) = P(A \cup B) = P(A) + P(B) = \frac{1}{8} + \frac{5}{12} = \frac{13}{24}$
Question 6: A bag contains 10 white and 3 red balls while another bag contains 3 white and 5 red balls. 2 balls are
drawn at random from the first bag and put in the second bag. Then a ball is drawn at random from the second bag.
What is the probability that it is a white ball?

Solution:

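➢ B1: 10W + 3R = 13 balls; B2: 3W + 5R = 8 balls

➢ The number of possible ways of choosing two balls from the first bag is ${}^{13}C_{2}$. The outcomes and their probabilities are
➢ (W and W): $P(2W) = \frac{{}^{10}C_{2}}{{}^{13}C_{2}}$; B2 + 2W: 5W + 5R = 10 balls; then $P(1W) = \frac{5}{10} \times \frac{{}^{10}C_{2}}{{}^{13}C_{2}}$
➢ (R and R): $P(2R) = \frac{{}^{3}C_{2}}{{}^{13}C_{2}}$; B2 + 2R: 3W + 7R = 10 balls; then $P(1W) = \frac{3}{10} \times \frac{{}^{3}C_{2}}{{}^{13}C_{2}}$
➢ (W and R): $P(1W \text{ and } 1R) = \frac{{}^{10}C_{1} \times {}^{3}C_{1}}{{}^{13}C_{2}}$; B2 + (1W + 1R): 4W + 6R = 10 balls; then $P(1W) = \frac{4}{10} \times \frac{{}^{10}C_{1} \times {}^{3}C_{1}}{{}^{13}C_{2}}$

➢ Since any one of these 3 mutually exclusive cases is favourable to the event, the required probability is

$\frac{5}{10} \times \frac{{}^{10}C_{2}}{{}^{13}C_{2}} + \frac{3}{10} \times \frac{{}^{3}C_{2}}{{}^{13}C_{2}} + \frac{4}{10} \times \frac{{}^{10}C_{1} \times {}^{3}C_{1}}{{}^{13}C_{2}} = \frac{1}{10 \times {}^{13}C_{2}} \times 354 = \frac{354}{780} \approx 0.454$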
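
The case-by-case sum in Question 6 can be reproduced by looping over the number of white balls transferred. This is a sketch under the same assumptions as the worked solution (balls drawn without replacement, all pairs equally likely); the function name and its parameters are hypothetical.

```python
from fractions import Fraction
from math import comb

def prob_white_from_second_bag(w1=10, r1=3, w2=3, r2=5, moved=2):
    """P(white ball drawn from bag 2) after 'moved' balls are transferred
    at random from bag 1 (w1 white, r1 red) into bag 2 (w2 white, r2 red)."""
    total = Fraction(0)
    for k in range(moved + 1):  # k = number of white balls among those moved
        # Probability that exactly k of the moved balls are white
        p_transfer = Fraction(comb(w1, k) * comb(r1, moved - k),
                              comb(w1 + r1, moved))
        # Probability of then drawing white from the enlarged second bag
        p_white = Fraction(w2 + k, w2 + r2 + moved)
        total += p_transfer * p_white
    return total

answer = prob_white_from_second_bag()
print(answer, round(float(answer), 3))
```

The three loop iterations correspond to the (R and R), (W and R) and (W and W) cases of the solution, in that order.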
Definition: Events A and B are independent if P(A ∩ B) = P(A)P(B); otherwise they are dependent.

Example:
➢ A fair coin is tossed three times, yielding the sample space S = {HHH, HHT, HTH, HTT, THH, THT, TTH, TTT}.
➢ Consider the events: A = {first toss is heads} = {HHH, HHT, HTH, HTT}; B = {second toss is heads} = {HHH, HHT,
THH, THT}.
➢ Here A ∩ B = {HHH, HHT}, so A and B are independent events: $P(A)P(B) = \frac{1}{2} \times \frac{1}{2} = \frac{1}{4} = P(A \cap B)$

Exercise: Suppose that we toss two fair dice. Let E1 denote the event that the sum of the dice is 6, E2
denote the event that the first die equals 4 and E3 denote the event that the sum is 7. Determine if E1 and
E2 are independent. Also, determine if E2 and E3 are independent.
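
One way to check an answer to this exercise is to enumerate all 36 equally likely outcomes and test the definition P(A ∩ B) = P(A)P(B) directly. The sketch below is an illustration; the helper names `P` and `independent` are hypothetical, and fair dice are assumed.

```python
from fractions import Fraction
from itertools import product

# All 36 ordered outcomes (first die, second die) of two fair dice.
S = list(product(range(1, 7), repeat=2))

def P(event):
    """Probability of an event given as a predicate on an outcome."""
    return Fraction(sum(1 for outcome in S if event(outcome)), len(S))

def independent(e, f):
    """True when P(e and f) equals P(e) * P(f)."""
    return P(lambda o: e(o) and f(o)) == P(e) * P(f)

def E1(o):
    return o[0] + o[1] == 6   # sum of the dice is 6

def E2(o):
    return o[0] == 4          # first die equals 4

def E3(o):
    return o[0] + o[1] == 7   # sum of the dice is 7

print("E1, E2 independent:", independent(E1, E2))
print("E2, E3 independent:", independent(E2, E3))
```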
Question: The probability that a person A solves the problem is 1/3, that of B is 1/2 and that of C is 3/5. If the problem is
simultaneously assigned to all of them what is the probability that the problem is solved? Assume that a given person,
independent of other two persons, solves the problem.
Solution :
➢ Note: Even if only one of them solves the problem, it is presumed that the problem is solved.
➢ Let E be the event of solving the problem and p be the probability of solving the problem.
➢ The probability that the problem is solved: p = P(at least one of them solves the problem) = 1 − P(none of them solves the
problem), i.e., $P(E) = 1 - P(\bar{A}) \cdot P(\bar{B}) \cdot P(\bar{C})$
➢ The probability that A does not solve the problem: $P(\bar{A}) = 1 - \frac{1}{3} = \frac{2}{3}$
➢ The probability that B does not solve the problem: $P(\bar{B}) = 1 - \frac{1}{2} = \frac{1}{2}$
➢ The probability that C does not solve the problem: $P(\bar{C}) = 1 - \frac{3}{5} = \frac{2}{5}$
∴ The probability that none of them solves the problem $= \frac{2}{3} \times \frac{1}{2} \times \frac{2}{5} = \frac{2}{15}$, i.e., $q = \frac{2}{15}$
Then, the probability of solving the problem is $p = 1 - q = 1 - \frac{2}{15} = \frac{13}{15}$.
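
The "at least one" argument used above generalises to any number of independent attempts. A minimal sketch, assuming independence as the question states; `prob_at_least_one` is a hypothetical helper name.

```python
from fractions import Fraction

def prob_at_least_one(success_probs):
    """P(at least one success) = 1 - product of the failure probabilities,
    valid when the individual attempts are independent."""
    q = Fraction(1)
    for p in success_probs:
        q *= 1 - p  # multiply in the probability that this attempt fails
    return 1 - q

# A, B and C solve the problem with probabilities 1/3, 1/2 and 3/5.
p_solved = prob_at_least_one([Fraction(1, 3), Fraction(1, 2), Fraction(3, 5)])
print(p_solved)
```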
Conditional Probability:

➢ Conditional probability is the probability of an event E with respect to a reduced sample space.

➢ Let A and B be two events. The probability of the happening of the event B when the event A has already
happened is called the conditional probability of B given A, and it is denoted by P(B|A).

➢ $P(B|A) = \frac{\text{probability of the occurrence of both } B \text{ and } A}{\text{probability of the occurrence of the given event } A}$

➢ $P(B|A) = \frac{P(A \cap B)}{P(A)}$, where $P(A) > 0$, and $P(A|B) = \frac{P(A \cap B)}{P(B)}$, where $P(B) > 0$.

➢ $P(A \cap B) = P(B|A) \cdot P(A)$ and $P(A \cap B) = P(A|B) \cdot P(B)$ (Multiplication rule of probability)

➢ Special case: If A and B are independent events, i.e., $P(A \cap B) = P(A) \cdot P(B)$, then
➢ $P(B|A) = \frac{P(A) P(B)}{P(A)} = P(B)$
➢ $P(A|B) = \frac{P(A) P(B)}{P(B)} = P(A)$
➢ For example, consider the following table

➢ Total sample space is 420 persons. If a person is male, what is the probability that he is unemployed?

➢ In this case we should consider only the reduced sample space of males (since it is given or known that
the person is male). Thus the required probability is $\frac{\text{unemployed and male}}{\text{male}} = \frac{140}{300} = \frac{7}{15}$
Example 1: The probability that it is Friday and that a student is absent is 0.03. Since there are 5 school days in a week,
the probability that it is Friday is 0.2. What is the probability that a student is absent given that today is Friday?

Solution: $P(\text{Absent}|\text{Friday}) = \frac{P(\text{Friday and Absent})}{P(\text{Friday})} = \frac{0.03}{0.2} = 0.15$

Example 2: A jar contains black and white marbles. Two marbles are chosen without replacement. The probability of
selecting a black marble and a white marble is 0.34, and the probability of selecting a black marble on the first draw is
0.47. What is the probability of selecting a white marble on the second draw, given that the first marble drawn was
black?
Solution: $P(\text{white}|\text{black}) = \frac{P(\text{white and black})}{P(\text{black})} = \frac{0.34}{0.47} \approx 0.72$

Question: The probability of raining on Sunday is 0.07. If today is Sunday then find the probability of rain today?

Solution:
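Let "rain" be the event that it rains and "Sunday" be the event that the day is Sunday. The given value 0.07 is taken to be
the probability that it is Sunday and it rains.
Then $P(\text{Sunday}) = \frac{1}{7}$; P(rain and Sunday) = 0.07
$P(\text{rain}|\text{Sunday}) = \frac{P(\text{rain and Sunday})}{P(\text{Sunday})} = \frac{0.07}{1/7} = 0.07 \times 7 = 0.49$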
Question: A coin is flipped twice . Assuming that all four points in the sample space are equally likely , What is the
conditional probability that both flips land on heads, given that (a) the first flip lands on heads? (b) at least one flip lands
on heads?

Solution:

➢ Sample space S = {HH, HT, TH, TT}

➢ Let A be the event that both flips land on heads: A = {HH}; $P(A) = \frac{1}{4}$

➢ Let B be the event that the first flip lands on heads: B = {HH, HT}; $P(B) = \frac{2}{4} = \frac{1}{2}$

➢ Let C be the event that at least one flip lands on heads: C = {HH, HT, TH}; $P(C) = \frac{3}{4}$

➢ A ∩ B = {HH}: $P(A \cap B) = \frac{1}{4}$; A ∩ C = {HH}: $P(A \cap C) = \frac{1}{4}$
➢ (a) $P(A|B) = \frac{P(A \cap B)}{P(B)} = \frac{1/4}{1/2} = \frac{1}{2}$
➢ (b) $P(A|C) = \frac{P(A \cap C)}{P(C)} = \frac{1/4}{3/4} = \frac{1}{3}$
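
The reduced-sample-space idea behind this solution can be made concrete by enumeration. The sketch below assumes the four outcomes are equally likely, as the question states; the helper names `P` and `conditional` are hypothetical.

```python
from fractions import Fraction
from itertools import product

# Sample space for two flips: HH, HT, TH, TT (equally likely).
S = {"".join(flips) for flips in product("HT", repeat=2)}

def P(event):
    return Fraction(len(event), len(S))

def conditional(a, b):
    """P(a | b) = P(a and b) / P(b); requires P(b) > 0."""
    return P(a & b) / P(b)

A = {s for s in S if s == "HH"}    # both flips land on heads
B = {s for s in S if s[0] == "H"}  # first flip lands on heads
C = {s for s in S if "H" in s}     # at least one flip lands on heads

print("(a)", conditional(A, B))
print("(b)", conditional(A, C))
```

Conditioning on B shrinks the sample space to {HH, HT}, and conditioning on C shrinks it to {HH, HT, TH}, which is why the two answers differ.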
Question: Rahul is undecided as to whether to take a French course or a chemistry course. He estimates that his
probability of receiving an A grade given that he takes a French course would be 1/2 and 2/3 in a Chemistry course. If
Rahul decides to base his decision on the flip of a fair coin, what is the probability that he gets an A in chemistry?

Solution:
➢ Let C be the event that Rahul takes chemistry: $P(C) = \frac{1}{2}$
➢ Let A be the event that he gets an A grade in whatever course he takes.
➢ The probability that he gets an A grade when he takes chemistry: $P(A|C) = \frac{2}{3}$
➢ Then, the probability that he takes chemistry and gets an A: $P(A \cap C) = P(C) \cdot P(A|C) = \frac{1}{2} \times \frac{2}{3} = \frac{1}{3}$

Question: Joe is 80% certain that his missing key is in one of the two pockets of his hanging jacket, being 40% certain it is
in the left-hand pocket and 40% certain it is in the right- hand pocket . If a search of the left hand pocket does not find the
key, What is the conditional probability that it is in the other pocket?

Solution:
➢ Let P(L) be the probability that the key is in the left hand pocket, P(L)=0.4
➢ Let P(R) be the probability that the key is in the right hand pocket, P(R)=0.4
➢ The probability that the key is in the right pocket given that it is not in the left pocket is
P(R|L̄) = P(R ∩ L̄) / P(L̄)
➢ Since the key cannot be in both pockets, R ∩ L̄ = R, so
P(R|L̄) = P(R) / P(L̄) = P(R) / (1 − P(L)) = 0.4 / (1 − 0.4) = 0.6667
Theorem of total probability:

Statement : Let A1, A2, . . . , An be a set of exhaustive and mutually exclusive events of the sample space S with P(Ai) ≠ 0
for each i. If A (any event of S) is a subset of the union of the Ai, denoted by A ⊂ ⋃_{i=1}^{n} Ai, then

P(A) = ∑_{i=1}^{n} P(Ai ∩ A) = ∑_{i=1}^{n} P(Ai) P(A|Ai)


Question: Three machines A, B and C produce respectively 50%, 30% and 20% of the total number of items of a factory. The
percentage of defective output of these machines are 3%, 4% and 5%. If an item is selected at random, find the probability
that the item is defective.

Solution: Let D be the event that the selected item is defective. Let A,B,C be the event that the item is manufactured by
machine A,B and C respectively.
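A minimal sketch of how the theorem of total probability applies to this question, using the production shares and defect rates given above. The dictionary names and the helper `total_probability` are assumptions made for illustration.

```python
from fractions import Fraction

# Share of total production for each machine: P(A), P(B), P(C).
share = {"A": Fraction(50, 100), "B": Fraction(30, 100), "C": Fraction(20, 100)}
# Defect rate of each machine: P(D|A), P(D|B), P(D|C).
defect_rate = {"A": Fraction(3, 100), "B": Fraction(4, 100), "C": Fraction(5, 100)}

def total_probability(prior, likelihood):
    """P(D) = sum over machines of P(machine) * P(D | machine)."""
    return sum(prior[k] * likelihood[k] for k in prior)

p_defective = total_probability(share, defect_rate)
print(p_defective, float(p_defective))
```

With these inputs the sum is (0.5)(0.03) + (0.3)(0.04) + (0.2)(0.05) = 0.037.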
Bayes’ theorem on conditional probability (theorem of inverse probability)

Statement : Let A1, A2, . . . , An be a set of exhaustive and mutually exclusive events of
the sample space S with P(Ai) ≠ 0 for each i. If A (any event of S) is a subset of the union of
the Ai, denoted by A ⊂ ⋃_{i=1}^{n} Ai, with P(A) ≠ 0, then

P(Ai|A) = P(Ai) P(A|Ai) / ∑_{j=1}^{n} P(Aj) P(A|Aj) = P(Ai) P(A|Ai) / P(A)
Question 1: A bin contains 3 different types of disposable flashlights. The probability that a type 1 flashlight
will give more than 100 hours of use is 0.7, with the corresponding probabilities for type 2 and type 3
flashlights being .4 and .3, respectively. Suppose that 20 percent of the flashlights in the bin are type 1,
30 percent are type 2 and 50 percent are type 3.

i) What is the probability that a randomly chosen flashlight will give more than 100 hours of use?

ii) Given that a flashlight lasted over 100 hours, what is the conditional probability that it was a type j
flashlight, j=1,2,3?

Solution:
➢ Let A be the event that the flashlight will give more than 100 hours of use and Let F1 , F2 , F3 be the event
that a type 1, 2,3 flashlight, respectively, is chosen.
➢ P(F1)=20/100 = 0.2 ; P(F2)=30/100 = 0.3 ; P(F3)=50/100 = 0.5.
➢ P(A| F1)=0.7 ; P(A| F2)=0.4 ; P(A| F3)=0.3
➢ (i) P(A)= P(F1)P(A|F1)+ P(F2)P(A|F2)+ P(F3)P(A|F3) =
(0.2)(0.7)+(0.3)(0.4)+(0.5)(0.3)=0.41
➢ (i.e.,) there is 41% chance that the flash light will last for more than 100 hours.

➢ By Bayes’ theorem,

➢ (ii) P(F1|A) = P(F1)P(A|F1) / ∑_{i=1}^{3} P(Fi)P(A|Fi) = P(F1)P(A|F1) / P(A) = (0.2)(0.7) / 0.41 = 0.14 / 0.41 = 14/41 = 0.341

➢ P(F2|A) = P(F2)P(A|F2) / ∑_{i=1}^{3} P(Fi)P(A|Fi) = P(F2)P(A|F2) / P(A) = (0.3)(0.4) / 0.41 = 0.12 / 0.41 = 12/41 = 0.293

➢ P(F3|A) = P(F3)P(A|F3) / ∑_{i=1}^{3} P(Fi)P(A|Fi) = P(F3)P(A|F3) / P(A) = (0.5)(0.3) / 0.41 = 0.15 / 0.41 = 15/41 = 0.366
Question 2: In a class 70% are boys and 30% are girls. 5% of boys, 3% of the girls are irregular to the classes.
(i) What is the probability of a student selected at random is irregular to the classes?
(ii) What is the probability that the irregular student is a girl?
Solution:
➢ Probability of selecting a boy P(B)=70/100 = 0.7 and Probability of selecting a girl P(G)=30/100=0.3
➢ Let A be the event of selecting an irregular student.
➢ P(A|B)=5/100=0.05 and P(A|G)=3/100=0.03
➢(i) P(A)= P(B)P(A|B)+ P(G)P(A|G)
= (0.7)(0.05)+(0.3)(0.03)=0.044
(i.e.,) probability of selecting an irregular student is 0.044.

➢(ii) By Bayes’ theorem,

➢ P(G|A) = P(G)P(A|G) / [P(B)P(A|B) + P(G)P(A|G)]

= P(G)P(A|G) / P(A) = (0.3)(0.03) / 0.044 = 0.009 / 0.044 = 0.204
Question 3: In a bolt factory there are four machines A, B, C, D manufacturing respectively 20%, 15%, 25%, 40% of the
total production. Out of these 5%, 4%, 3%, 2% are defective. If a bolt drawn at random was found defective what is the
probability that it was manufactured by A or D?

Solution:
➢ Probability that the bolt is manufactured by A,B,C,D :

➢ P(A)=20/100 = 0.2, P(B)=15/100=0.15, P(C)=25/100=0.25, P(D)=40/100=0.4,

➢ Let X be the event of selecting a defective bolt.

➢ P(X|A)=5/100=0.05, P(X|B)=4/100=0.04, P(X|C)=3/100=0.03, P(X|D)=2/100=0.02

➢ To compute : P( (A U D) | X)

➢ Since A and D are mutually exclusive events, P( (A U D) | X) = P(A|X)+ P(D|X)


➢ By Bayes’ theorem,

➢ P(A|X) = P(A)P(X|A) / [P(A)P(X|A) + P(B)P(X|B) + P(C)P(X|C) + P(D)P(X|D)]
= (0.2)(0.05) / [(0.2)(0.05) + (0.15)(0.04) + (0.25)(0.03) + (0.4)(0.02)]
= 0.01 / 0.0315 = 0.317

➢ P(D|X) = P(D)P(X|D) / [P(A)P(X|A) + P(B)P(X|B) + P(C)P(X|C) + P(D)P(X|D)]
= (0.4)(0.02) / [(0.2)(0.05) + (0.15)(0.04) + (0.25)(0.03) + (0.4)(0.02)]
= 0.008 / 0.0315 = 0.254

➢ P( (A U D) | X) = P(A|X) + P(D|X) = 0.317 + 0.254 = 0.571
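The Bayes’ theorem calculation for the bolt factory can be sketched in a few lines. The function name `posterior` and the dictionary layout are assumptions for illustration; the numbers are the ones given in the question.

```python
def posterior(prior, likelihood):
    """Bayes' theorem: P(machine | X) for every machine, given priors and P(X | machine)."""
    # Denominator is the total probability P(X).
    evidence = sum(prior[k] * likelihood[k] for k in prior)
    return {k: prior[k] * likelihood[k] / evidence for k in prior}

prior = {"A": 0.20, "B": 0.15, "C": 0.25, "D": 0.40}       # share of production
likelihood = {"A": 0.05, "B": 0.04, "C": 0.03, "D": 0.02}  # P(defective | machine)

post = posterior(prior, likelihood)
# A and D are mutually exclusive, so the posteriors add.
p_a_or_d = post["A"] + post["D"]
print(round(post["A"], 3), round(post["D"], 3), round(p_a_or_d, 3))
```

The same helper works for the flashlight and classroom questions by changing the two dictionaries.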


Question 4: A bag contains three coins, one of which is two headed and the other two
are normal and fair. A coin is chosen at random from the bag and tossed four times in
succession.
i) Find the probability that head turns up each time.
ii) Given that head turns up each time in succession, what is the probability that it was
the two headed coin?

Solution:
➢ Let C1 be the two headed coin and C2 , C3 be the normal coins.
➢ Probability that the chosen coin is C1 , C2 , C3 is P(C1)=1/3 , P(C2)=1/3, P(C3)=1/3.
➢ Let E be the event of getting 4 heads in succession.

➢ P(E|C1) = 1 because C1 is a two headed coin ( Getting a head {H} is the sure event).

➢ P(E|C2) = (1/2)(1/2)(1/2)(1/2) = 1/16 and P(E|C3) = (1/2)(1/2)(1/2)(1/2) = 1/16, because P({H}) = ½ in a normal coin.

➢ (i) To compute : P(E) = P(C1)P(E|C1) + P(C2)P(E|C2) + P(C3)P(E|C3)

= (1/3)(1) + (1/3)(1/16) + (1/3)(1/16) = 18/48 = 3/8.

➢ (ii) To compute : P(C1 | E) = P(C1)P(E|C1) / P(E) = (1/3)(1) / (3/8) = 8/9.
Question 5: A company manufactures ball pens in two colors blue & red and make packets of 10 pens with 5 pens of
each color. In a particular shop it was found that after sales , packet 1 contained 3 blue and 2 red pens, packet 2
contained 3 blue and 5 red pens. On the demand of a customer for a pen, the packet was drawn at random and a pen
was taken out. It was found blue. Find the probability that packet 1 was selected.

Solution:
➢ Let B be the event of selecting a blue pen and
Let E1 , E2 be the event of selecting packet 1, 2
respectively.
➢ P(E1)=1/2 ; P(E2)=1/2
➢ P(B| E1)=3/5 ; P(B| E2)=3/8
➢ To find : P(E1 |B)
➢ By Bayes’ theorem,
➢ P(E1|B) = P(E1)P(B|E1) / [P(E1)P(B|E1) + P(E2)P(B|E2)]
= (1/2)(3/5) / [(1/2)(3/5) + (1/2)(3/8)] = 0.3 / (0.3 + 0.1875)
= 0.3 / 0.4875 = 0.615
Question 6: The chance that a doctor will diagnose a disease correctly is 60%. The chance that a patient will die after
a correct diagnosis is 40% and the chance of death after a wrong diagnosis is 70%. If a patient dies, what is the chance that his
disease was correctly diagnosed?

Solution:
➢ Let A be the event of correct diagnosis
and B be the event of wrong diagnosis
➢ P(A)=60/100 = 0.6, P(B)=40/100=0.4
➢ Let E be the event that the patient dies.
➢ P(E|A)=40/100=0.4, P(E|B)=70/100=0.7
➢ To compute : P(A |E)

➢ By Bayes’ theorem,
P(A|E) = P(A)P(E|A) / [P(A)P(E|A) + P(B)P(E|B)]

= (0.6)(0.4) / [(0.6)(0.4) + (0.4)(0.7)] = 0.24 / 0.52 = 0.4615
Department of Mathematics
Course Title: PROBABILITY AND STATISTICS
Module 2: Random Variables and their Properties and Probability Distributions

Contents
• Random variables
• Discrete Probability Distributions
• Binomial distribution
• Poisson distribution
• Continuous Probability Distributions
• Normal distribution
• Exponential distribution
• Joint Probability Distributions
Random Variables
Definitions:
 Random variables: A random variable X, X: S → ℝ, is a function that associates a real number with each element
in the sample space.
 We shall use a capital letter X to denote a random variable and small letter 𝑥 for one of its values.
 Range space is the set of all possible values of a random variable X(which is a subset of real numbers ℝ).
Examples:
• When tossing a fair coin, the sample space is S={H,T}. Let the value 1 be assigned to head and the value 0 be
assigned to tail. If X is the random variable, then X(H)=1, X(T)=0. Range of X={0,1}

 Suppose a coin is tossed twice S={HH,HT,TH,TT}. Let X denote the number of heads in the outcome and Y denote
number of tails in the outcome. Range of X={0,1,2} and Range of Y={0,1,2}

Outcome HH HT TH TT
X 2 1 1 0
Y 0 1 1 2
 Suppose we toss 3 fair coins and let Y denote the number of heads that appear, then Y is a random
variable taking on one of the values 0,1,2,3 with respective probabilities
 P{Y=0}=P{TTT}=1/8
 P{Y=1}=P{TTH,THT,HTT}=3/8
 P{Y=2}=P{THH,HTH,HHT}=3/8
 P{Y=3}=P{HHH}=1/8

 Discrete random variable: If a random variable takes finite or countably infinite number of possible values,
then it is discrete.
• Examples : Tossing a coin and observing the number of heads turning up.
Throwing a pair of dice and assigning the sum of the numbers that appear on the two dice.

 Continuous random variable: If a random variable takes uncountable number of possible values, then it is
continuous (possible values comprise either a single interval on the number line or a union of disjoint intervals).
Examples:
 Let X be the random variable defined by the waiting time, in hours, between successive speeders
spotted by a radar unit. The random variable X takes on all values 𝑥 for which 𝑥 ≥ 0.
 The distance travelled by a certain automobile over a prescribed test course on 5 litres of gasoline. In
this case, we have infinite number of possible distances in the sample space.
DISCRETE PROBABILITY DISTRIBUTIONS
Definitions:

• A function p(x) ( or P(X) ) is called the probability mass function (p.m.f) of the discrete random
variable X if, for each value xi (i=1,2,…) of X, we assign a real number p(xi) such that
• p(xi) ≥ 0 and p(x) = 0 for all other values of x.
• ∑_i p(xi) = 1
where p(xi) denotes the probability that X takes the value xi, (i.e.,) P(X = xi) = pi (or) p(xi).

• The set of values (xi, p(xi)) is called the discrete probability distribution of the random variable X.

• The cumulative distribution function F(x) of a discrete random variable X with probability
distribution p(x) is
F(x) = P(X ≤ x) = ∑_{t ≤ x} p(t), −∞ < x < ∞.
• If X is a discrete random variable having a probability mass function p(x), then the
expectation, or the expected value of X, denoted by E[X], is defined by

• E[X] = ∑_{i=1}^{∞} xi p(xi) (Mean μ)

• If X is a random variable with mean µ, then the variance of X, denoted by Var(X), is defined
by
• Var(X) = E[(X − µ)²] = E[X²] − (E[X])² ( Variance σ² = ∑_{i=1}^{∞} (xi − μ)² p(xi) )

• Var(X) = ∑_{i=1}^{∞} xi² p(xi) − [∑_{i=1}^{∞} xi p(xi)]²

• The square root of Var(X) is called the standard deviation of X, and we denote it by
SD(X) (i.e.,)
• SD(X) = √Var(X) (Standard deviation σ)

 Question: Find E[X], Var(X), where X is the outcome when we roll a fair die.
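One way to work this question is to apply the definitions of E[X] and Var(X) directly to the p.m.f. of a fair die, where each face 1 to 6 has probability 1/6. The helper names `mean` and `variance` below are assumptions introduced for this sketch.

```python
from fractions import Fraction

# p.m.f. of a fair die: each face 1..6 has probability 1/6.
pmf = {x: Fraction(1, 6) for x in range(1, 7)}

def mean(pmf):
    """E[X] = sum of x * p(x)."""
    return sum(x * p for x, p in pmf.items())

def variance(pmf):
    """Var(X) = E[X^2] - (E[X])^2."""
    mu = mean(pmf)
    return sum(x * x * p for x, p in pmf.items()) - mu ** 2

die_mean = mean(pmf)
die_var = variance(pmf)
print(die_mean, die_var)
```

By hand, E[X] = (1+2+3+4+5+6)/6 = 7/2 and E[X²] = 91/6, so Var(X) = 91/6 − 49/4 = 35/12.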
Question 1: A random variable X has the following probability function:

Solution:
• Since ∑_i p(xi) = 1, 0+k+2k+2k+3k+k²+2k²+(7k²+k) = 1, which implies 10k²+9k−1 = 0 ⇒ (10k−1)(k+1) = 0

• k = 1/10 (or) k = −1

• If k = −1, the condition p(xi) ≥ 0 fails. Therefore, k = 1/10.

x 0 1 2 3 4 5 6 7
P(x) 0 1/10 1/5 1/5 3/10 1/100 1/50 17/100
 P(X<6)=P(X=0)+P(X=1)+P(X=2)+P(X=3)+P(X=4)+P(X=5) = 0+1/10+1/5+1/5+3/10+1/100 = 0.81

 P(X≥6)=P(X=6)+P(X=7)=1/50+17/100=0.19

 P(3<X≤6)=P(X=4)+P(X=5)+P(X=6)=3/10+1/100+1/50=0.33

• Mean μ = ∑_{i=1}^{8} xi p(xi) = 0(0) + 1(1/10) + 2(1/5) + 3(1/5) + 4(3/10) + 5(1/100) + 6(1/50) + 7(17/100) = 3.66

• Var σ² = ∑_{i=1}^{8} xi² p(xi) − [∑_{i=1}^{8} xi p(xi)]² = ∑_{i=1}^{8} xi² p(xi) − μ²

= 0 + 1(1/10) + 4(1/5) + 9(1/5) + 16(3/10) + 25(1/100) + 36(1/50) + 49(17/100) − (3.66)² = 16.8 − 13.3956 = 3.40

• Cumulative distribution function F(x) = P(X ≤ x) = ∑_{xi ≤ x} p(xi)

x 0 1 2 3 4 5 6 7
F(x) 0 0.1 0.3 0.5 0.8 0.81 0.83 1
Question 2: If random variable X take the value 1,2,3,4 such that 2P{X = 1} = 3P{X = 2} = P{X = 3} = 5P{X = 4}, find
the probability distribution and cumulative distribution of X.

Solution:
 Let the distribution (x,p(x)) be

x 1 2 3 4
p(x) p1 p2 p3 p4

 Given : 2p1=3p2=p3=5p4 ⇒p2=2/3(p1) ; p3=2(p1) ; p4=2/5(p1)


 Since σ𝑖 𝑝(xi)=1, p1+p2+p3+p4=1,
 p1+2/3(p1)+2(p1)+2/5(p1) =1 ⇒ p1=15/61
 Then, p2=10/61 ; p3=30/61 ; p4=6/61

x 1 2 3 4
p(x) 15/61 10/61 30/61 6/61

F(x) 15/61 25/61 55/61 61/61=1


Question 3: Let X be a random variable giving the number of heads minus number of tails in three tosses of a
coin. List the elements of the sample space S for the three tosses of the coin and to each sample point assign a
value 𝑥 of X. Compute E[X] and var(X).

Solution: The sample space S and the values of random variables X are as follows:

S HHH HHT HTH HTT THH THT TTH TTT


𝑋=𝑥 3 1 1 -1 1 -1 -1 -3

The probability distribution of X:


• X assumes the values −3, −1, 1, 3.
• P(X = −3) = 1/8, P(X = −1) = 3/8, P(X = 1) = 3/8, P(X = 3) = 1/8

x -3 -1 1 3
P(x) 1/8 3/8 3/8 1/8

• Mean μ = ∑_{i=1}^{4} xi p(xi) = (−3)(1/8) + (−1)(3/8) + (1)(3/8) + (3)(1/8) = 0

• Var σ² = ∑_{i=1}^{4} xi² p(xi) − μ² = 9(1/8) + 1(3/8) + 1(3/8) + 9(1/8) − 0 = 3 − 0 = 3
Question 4: A random experiment of tossing a ’die’ twice is performed. Let X denotes the random variable of
sum of two numbers turning up on the toss. Compute E[X] and standard deviation.
Solution:
• S={(x,y): x=1,2,…,6 and y=1,2,…,6} and the number of elements in S is o(S)=36
• Range of X={2,3,4,5,6,…,12}
• p(x1) = o(Event 1)/o(S) = 1/36 ; p(x2) = o(Event 2)/o(S) = 2/36 and so on.
• The discrete probability distribution for X is given by

x 2 3 4 5 6 7 8 9 10 11 12
p(x) 1/36 2/36 3/36 4/36 5/36 6/36 5/36 4/36 3/36 2/36 1/36

• Mean μ = ∑_{i=1}^{11} xi p(xi)
• = 2(1/36) + 3(2/36) + 4(3/36) + 5(4/36) + 6(5/36) + 7(6/36) + 8(5/36) + 9(4/36)
+ 10(3/36) + 11(2/36) + 12(1/36) = 252/36 = 7

• Var σ² = ∑_{i=1}^{11} xi² p(xi) − [∑_{i=1}^{11} xi p(xi)]² = ∑_{i=1}^{11} xi² p(xi) − μ²

• = 4(1/36) + 9(2/36) + 16(3/36) + 25(4/36) + 36(5/36) + 49(6/36) + 64(5/36) + 81(4/36) + 100(3/36) + 121(2/36)
+ 144(1/36) − 49 = 1974/36 − 49 = (1974 − 1764)/36 = 210/36 = 35/6

• Standard deviation: σ = √Var(X) = √(35/6) ≈ 2.415
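The distribution of the sum of two dice can also be built by enumerating all 36 outcomes, which gives a quick check on the mean and variance. This is a sketch only; the variable names are assumptions.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Count how many of the 36 equally likely outcomes give each sum.
counts = Counter(a + b for a, b in product(range(1, 7), repeat=2))
pmf = {s: Fraction(c, 36) for s, c in sorted(counts.items())}

mu = sum(s * p for s, p in pmf.items())                 # E[X]
var = sum(s * s * p for s, p in pmf.items()) - mu ** 2  # E[X^2] - (E[X])^2
sd = float(var) ** 0.5                                  # standard deviation

print(mu, var, round(sd, 3))
```

Enumeration avoids having to write out the table of probabilities by hand before summing.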
Question 5: From a sealed box containing a dozen apples it was found that 3 apples have perished.
Obtain the probability distribution of the number of perished apples when 2 apples are drawn at
random. Also find the mean and variance of this distribution.

Solution:

• 3 perished + 9 good (G) = 12

• No. of possible ways of choosing 2 from 12 is 12C2.

• Let X denote the number of perished apples. Then X can take the values 0,1,2.
• P(X=0) = P(0 perished and 2 good) = (3C0 × 9C2) / 12C2 = 6/11

• P(X=1) = P(1 perished and 1 good) = (3C1 × 9C1) / 12C2 = 9/22

• P(X=2) = P(2 perished and 0 good) = (3C2 × 9C0) / 12C2 = 1/22
• The probability distribution is given by

X=xi 0 1 2
p(xi) 6/11 9/22 1/22

• Mean μ = ∑_{i=1}^{3} xi p(xi) = 0(6/11) + 1(9/22) + 2(1/22) = 11/22 = 1/2

• Var σ² = ∑_{i=1}^{3} xi² p(xi) − [∑_{i=1}^{3} xi p(xi)]² = ∑_{i=1}^{3} xi² p(xi) − μ²
= 0(6/11) + 1(9/22) + 4(1/22) − (1/2)² = 13/22 − 1/4 = 15/44
Bernoulli trial

• A random experiment with only two possible outcomes, categorized as success and failure, is called a
Bernoulli trial, where the probability of success ‘p’ is the same for each trial.

• If we let X=1 when the outcome is a success and X=0 when it is a failure,
then the probability mass function of X is given by
• P(X=0)=1-p and P(X=1)=p ------- (1)
where p is the probability that the trial is a success.

• A random variable X is said to be a Bernoulli random variable if its
probability mass function is given by equation (1) for some p ∈ (0,1).
• If p is the probability of success and q is the probability of failure, the probability of
‘x’ successes out of ‘n’ trials is given by nCx p^x q^(n−x)
Binomial Distribution
 Binomial distribution is a discrete probability distribution.
 Suppose an experiment is repeated ‘n’ number of times (‘n’ trials) which results in success
with probability p or in a failure with probability 1-p. If X represents the number of
successes that occur in ‘n’ trials, then X is a said to be a binomial random variable.
• Definition: The probability mass function of a binomial random variable X with parameters
(n,p), where X is the number of successes in ‘n’ Bernoulli trials, is given by
P(x) = nCx p^x q^(n−x), x=0,1,2,…,n (Probability distribution of Binomial random variable X)
Note:
• ∑_{x=0}^{n} p(x) = 1
• since ∑_{x=0}^{n} p(x) = ∑_{x=0}^{n} nCx p^x (1 − p)^(n−x) = [p + (1 − p)]^n = 1.
• Mean of the binomial distribution μ = np
• Variance of the binomial distribution σ² = npq = np(1−p)
• SD of the binomial distribution σ = √Var = √(npq)

Exercise: If X is a binomial random variable with parameters n and p, then prove that E[X] = np
and Var[X] = np(1 − p)
Example: Five fair coins are flipped. If the outcomes are assumed independent, find the
probability mass function of the number of heads obtained.
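For the five-coin example, the number of heads is binomial with n = 5 and p = 1/2, so the p.m.f. follows from P(x) = nCx p^x q^(n−x). A minimal sketch, with `binomial_pmf` as a helper name assumed for illustration:

```python
from fractions import Fraction
from math import comb

def binomial_pmf(n, p):
    """Return {x: P(X = x)} for a binomial random variable with parameters (n, p)."""
    q = 1 - p
    return {x: comb(n, x) * p ** x * q ** (n - x) for x in range(n + 1)}

# Five fair coins: n = 5, p = 1/2.
pmf = binomial_pmf(5, Fraction(1, 2))
for x, px in pmf.items():
    print(x, px)
```

The six probabilities are 1/32, 5/32, 10/32, 10/32, 5/32 and 1/32, which sum to 1.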
Question 1: In a quiz contest of answering ’Yes’ or ’No’, (i) What is the probability of guessing at least six answers
correctly out of ten questions asked? (ii) Find the probability of guessing at least six answers correctly if there are
four options for a correct answer.
Solution: Let x denote the no. of correct answers, p denote the probability of guessing a correct
answer and q denote the probability of guessing a wrong answer.
• (i) 2 options: yes or no answer : p = 1/2 and q = 1/2 ; n = 10
• The Binomial probability function P(x) = nCx p^x q^(n−x) = 10Cx (1/2)^x (1/2)^(10−x) = 10Cx (1/2)^10
• To find : P(x≥6) = P(x=6)+P(x=7)+P(x=8)+P(x=9)+P(x=10)
• = (1/2^10)(10C6 + 10C7 + 10C8 + 10C9 + 10C10) = (1/2^10)(210+120+45+10+1) = 386/1024 = 0.377
• P(x≥6) = 0.377
• (ii) 4 options : p = 1/4 ; q = 3/4 ; n = 10
• The Binomial probability function P(x) = nCx p^x q^(n−x) = 10Cx (1/4)^x (3/4)^(10−x) = (1/4^10) 10Cx 3^(10−x)
• To find : P(x≥6) = P(x=6)+P(x=7)+P(x=8)+P(x=9)+P(x=10)
• = (1/4^10)(10C6(3^4) + 10C7(3^3) + 10C8(3^2) + 10C9(3^1) + 10C10(3^0)) = (1/4^10)(17010+3240+405+30+1)
• = 20686/1048576 = 0.0197
• P(x≥6) = 0.0197
Question 2: A large chain retailer purchases a certain kind of electronic device from a manufacturer. The
manufacturer indicates that the defective rate of the device is 3%.
(a) The inspector randomly picks 20 items from a shipment. What is the probability that there will be at least
one defective item among these 20?
(b) Suppose that the retailer receives 10 shipments in a month and the inspector randomly tests 20 devices per
shipment. What is the probability that there will be exactly 3 shipments each containing at least one
defective device among the 20 that are selected and tested from the shipment?
Solution:
(a) Let X denotes the number of defective devices among the 20, p denote the probability of a defective item and q
denote the probability of non defective item.
• n = 20 ; p = 3/100 = 0.03 and q = 1 − 3/100 = 97/100 = 0.97
• Then the probability distribution of X is P(x) = nCx p^x q^(n−x) = 20Cx (0.03)^x (0.97)^(20−x)

• P(X≥1) = 1 − P(X = 0) = 1 − 20C0 (0.03)^0 (0.97)^20 = 0.4562

(b) In this case, each shipment can either contain at least one defective item or not. Hence, testing of each
shipment can be viewed as a Bernoulli trial with probability p = 0.4562 (from part (a)).
• Let Y denote the number of shipments containing at least one defective item. Then, n = 10, p = 0.4562,
q = 1 − 0.4562 = 0.5438
• Then the probability distribution of Y is P(y) = nCy p^y q^(n−y) = 10Cy (0.4562)^y (0.5438)^(10−y)

• P(Y = 3) = 10C3 (0.4562)^3 (0.5438)^7 = 0.1602.


Question 3: The probability that a pen manufactured by a factory be defective is 1/10. If 12 such pens
are manufactured, what is the probability that (i) exactly 3 are defective (ii) at least 4 are defective (iii)
at most 2 are defective (iv)none of them are defective (v) two or more are defective.

Solution: Let x denote the no. of defective pens, p denote the probability of a defective pen
and q denote the probability of non defective pen.

• n = 12 ; p = 1/10 and q = 1 − 1/10 = 9/10

• The Binomial probability function P(x) = nCx p^x q^(n−x) = 12Cx (1/10)^x (9/10)^(12−x)
• (i) P(x=3) = 12C3 (1/10)^3 (9/10)^9 = (220)(0.001)(0.3874) = 0.085

• P(x=3) = 0.085
• (ii) P(x≥4) = 1 − {P(x=0)+P(x=1)+P(x=2)+P(x=3)}

• = 1 − {12C0 (1/10)^0 (9/10)^12 + 12C1 (1/10)^1 (9/10)^11 + 12C2 (1/10)^2 (9/10)^10 + 12C3 (1/10)^3 (9/10)^9}

• = 1 − 0.9744 = 0.0256

• (iii) P(x≤2) = P(x=0)+P(x=1)+P(x=2) = 12C0 (1/10)^0 (9/10)^12 + 12C1 (1/10)^1 (9/10)^11 + 12C2 (1/10)^2 (9/10)^10
• = 0.8891

• (iv) P(x=0) = 12C0 (1/10)^0 (9/10)^12 = 0.2824

• (v) P(x≥2) = 1 − {P(x=0)+P(x=1)} = 1 − {12C0 (1/10)^0 (9/10)^12 + 12C1 (1/10)^1 (9/10)^11} = 1 − 0.6590 = 0.341
Question 4: Find the binomial distribution which has mean 2 and variance 4/3.

Solution:
• Mean np = 2 and variance npq = 4/3
• (2)q = 4/3 ⇒ q = 2/3 ; p = 1 − q = 1 − 2/3 = 1/3
• n(1/3) = 2 ⇒ n = 6.
Then, P(x) = nCx p^x q^(n−x) = 6Cx (1/3)^x (2/3)^(6−x)

x 0 1 2 3 4 5 6
P(x) 6C0(1/3)^0(2/3)^6 6C1(1/3)^1(2/3)^5 6C2(1/3)^2(2/3)^4 6C3(1/3)^3(2/3)^3 6C4(1/3)^4(2/3)^2 6C5(1/3)^5(2/3)^1 6C6(1/3)^6(2/3)^0
Question 5: Out of 800 families with 5 children each, how many families would you expect to
have (a) three boys (b) five girls (c) either two or three boys (d) at most two girls. Assume
equal probabilities for boys and girls.

Solution:
• Let x denote the number of boys in a family.
Then, P(x) = nCx p^x q^(n−x) = 5Cx (1/2)^x (1/2)^(5−x) = (1/32) 5Cx
• (a) Probability of a family having 3 boys P(x=3) = (1/32) 5C3 = 10/32 = 5/16
• Expected number of families having 3 boys out of 5 children = 800(5/16) = 250

• (b) Probability of a family having 5 girls P(x=0) = (1/32) 5C0 = 1/32
• Expected number of families having 5 girls out of 5 children = 800(1/32) = 25
• (c) Probability of a family having either 2 or 3 boys P(x=2)+P(x=3)
= (1/32) 5C2 + (1/32) 5C3 = 10/32 + 10/32 = 20/32
• Expected number of families = 800(20/32) = 500

• (d) Probability of a family having at most 2 girls (i.e., at least 3 boys) P(x=5)+P(x=4)+P(x=3)

= (1/32) 5C5 + (1/32) 5C4 + (1/32) 5C3 = 16/32 = 1/2

• Expected number of families having at most 2 girls = 800(1/2) = 400
Question 6: The probability of a shooter hitting a target is 1/3. How many times he should
shoot so that the probability of hitting the target at least once is more than ¾.

Solution:
• Let p denote the probability of hitting a target, p = 1/3. Then, q = 1 − (1/3) = 2/3
• Then, P(x) = nCx p^x q^(n−x) = nCx (1/3)^x (2/3)^(n−x)
• To find ‘n’ such that P(x ≥ 1) > 3/4
• (1 − P(x<1)) > 3/4 ⇒ (1 − P(x=0)) > 3/4
• 1 − nC0 (1/3)^0 (2/3)^n > 3/4 ⇒ 1 − (2/3)^n > 3/4 ⇒ (2/3)^n < 1/4.
• n=1: (2/3) = 0.667 > 0.25 ; n=2: (2/3)² = 0.444 > 0.25 ; n=3: (2/3)³ = 0.296 > 0.25 ;
• n=4: (2/3)⁴ = 0.198 < 0.25.
• Required value is n=4.
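The trial-and-error search for n can be written as a short loop that stops at the first n with 1 − q^n above the target. The function name `min_shots` is an assumption made for this sketch.

```python
from fractions import Fraction

def min_shots(p_hit, target):
    """Smallest n with P(at least one hit in n shots) = 1 - q**n strictly above target."""
    q = 1 - p_hit
    n = 1
    while 1 - q ** n <= target:
        n += 1
    return n

n_required = min_shots(Fraction(1, 3), Fraction(3, 4))
print(n_required)
```

Using exact fractions avoids any rounding question at the boundary of the inequality.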
Question 7: Six dice are thrown 729 times. How many times do you expect at least three dice to
show a five or six?

Solution:
• Here, getting a five or six is considered a success.
• Then, p = 1/6 + 1/6 = 1/3 ; q = 1 − p = 1 − 1/3 = 2/3 ; n = 6
• Let x denote the number of dice that show a five or six.
Then, P(x) = nCx p^x q^(n−x) = 6Cx (1/3)^x (2/3)^(6−x)

• P(x ≥ 3) = 1 − [P(x=0) + P(x=1) + P(x=2)]
= 1 − [6C0 (1/3)^0 (2/3)^6 + 6C1 (1/3)^1 (2/3)^5 + 6C2 (1/3)^2 (2/3)^4] = 233/729

• Expected number of times = 729(233/729) = 233
Poisson Distribution
 Experiments yielding numerical values of a random variable X, the number of outcomes occurring during a
given time interval or in a specified region, are called Poisson experiments.
 The number of outcomes, X, occurring during a Poisson experiments is called a Poisson random variable,
and its probability distribution is called the Poisson distribution.
 Example:
 The number of telephone calls received per hour by an office (specified time interval)
 The number of days school is closed due to rain during winter season (specified time interval)
 The number of people in a community who survive to age 100 (specified region)
 The number of typing errors per page of a book (specified region)
• A random variable X that takes on one of the values 0,1,2,… is said to be a Poisson random variable with
parameter λ (average number of outcomes per unit time) if, for some λ>0,
p(i) = P(X = i) = e^(−λ) λ^i / i!, i = 0,1,2,3,… (Probability distribution of Poisson random variable X)
and e ≈ 2.71828.
• The above equation defines a probability mass function, i.e., ∑_{i=0}^{∞} p(i) = 1, since

∑_{i=0}^{∞} p(i) = e^(−λ) ∑_{i=0}^{∞} λ^i / i! = e^(−λ) e^(λ) = 1.
Note:
• Mean of the Poisson distribution μ = E[X] = λ
• Variance of the Poisson distribution σ² = λ
• SD of the Poisson distribution σ = √Var = √λ

Exercise: If X is a Poisson random variable with parameters λ, then prove that E[X] = λ and
Var[X] = λ

Approximation of Binomial distribution by a Poisson Distribution

 If ‘n’ independent trials, each of which results in a success with probability ‘p’, are
performed, then when ‘n’ is large and ‘p’ is close to 0, the number of successes
occurring is approximately a Poisson random variable with parameter λ=np (i.e.,)
Poisson distribution can be used to approximate binomial probabilities.
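To see how close the approximation is, the sketch below compares binomial and Poisson probabilities for one illustrative case. The values n = 1000 and p = 0.002 are assumptions chosen here for the demonstration, not taken from the notes.

```python
from math import comb, exp, factorial

def binomial_pmf(x, n, p):
    """P(X = x) for a binomial random variable with parameters (n, p)."""
    return comb(n, x) * p ** x * (1 - p) ** (n - x)

def poisson_pmf(x, lam):
    """P(X = x) for a Poisson random variable with parameter lam."""
    return exp(-lam) * lam ** x / factorial(x)

# Illustrative values only: large n, small p.
n, p = 1000, 0.002
lam = n * p  # Poisson parameter used for the approximation

for x in range(5):
    b = binomial_pmf(x, n, p)
    q = poisson_pmf(x, lam)
    print(x, round(b, 5), round(q, 5))
```

For these values the two columns agree to about three decimal places; the agreement generally gets worse as p grows.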
Question 1: Suppose that the number of typographical errors on a single page of a book has a Poisson
distribution with parameter λ = 0.5. Calculate the probability that there is at least one error on a page.
Solution:
 Let X denote the number of errors on a page.
• To find P{X ≥ 1} = 1 − P{X = 0} = 1 − e^(−0.5) = 0.393.
Question 2: Three is the average number of oil tankers arriving each day at a certain port. The facilities
at the port can handle at most 4 tankers per day. What is the probability that on a given day tankers
have to be turned away?
Solution:
 Let X denote the number of tankers arriving each day.
• Given : average number of arrivals, λ = 3.
• Poisson distribution, P(x) = e^(−λ) λ^x / x! = e^(−3) 3^x / x!
• Tankers are turned away when more than 4 arrive in a day, so we need P(x > 4).

P(x > 4) = 1 − [P(x=0) + P(x=1) + P(x=2) + P(x=3) + P(x=4)]

= 1 − [e^(−3) 3^0/0! + e^(−3) 3^1/1! + e^(−3) 3^2/2! + e^(−3) 3^3/3! + e^(−3) 3^4/4!] = 1 − 0.8153 = 0.1847
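Tail probabilities such as P(x > 4) come up repeatedly in the Poisson questions, so a small helper is convenient. This is a sketch; `poisson_pmf` and `poisson_tail` are names assumed for illustration.

```python
from math import exp, factorial

def poisson_pmf(x, lam):
    """P(X = x) = e^(-lam) * lam^x / x!"""
    return exp(-lam) * lam ** x / factorial(x)

def poisson_tail(k, lam):
    """P(X > k) = 1 - [P(0) + P(1) + ... + P(k)]."""
    return 1 - sum(poisson_pmf(x, lam) for x in range(k + 1))

# Oil tankers: mean 3 arrivals per day, port handles at most 4.
p_turned_away = poisson_tail(4, 3)
print(round(p_turned_away, 4))
```

The same call with `poisson_tail(4, 2.5)` corresponds to the generator-hire question with mean demand 2.5.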
Question 3: A shop has 4 diesel generator sets which it hires out every day. The demand for a
generator set on an average is a Poisson variate with the value 5/2. Obtain the probability that
on a particular day (i) there was no demand (ii) a demand had to be refused.

Solution:
• Let the Poisson random variable x be the demand for a generator set.
• Given that the mean demand for a generator set is λ = 5/2 = 2.5
• Poisson distribution, P(x) = e^(−λ) λ^x / x! = e^(−2.5) (2.5)^x / x!
• (i) P(x=0) = e^(−2.5) (2.5)^0 / 0! = e^(−2.5) = 0.0821
• (ii) If a demand has to be refused, there should have been a demand for more than 4
generators (i.e.,) we need to find P(x > 4)
P(x > 4) = 1 − [P(x=0) + P(x=1) + P(x=2) + P(x=3) + P(x=4)]

= 1 − [e^(−2.5)(2.5)^0/0! + e^(−2.5)(2.5)^1/1! + e^(−2.5)(2.5)^2/2! + e^(−2.5)(2.5)^3/3! + e^(−2.5)(2.5)^4/4!]
• = 1 − e^(−2.5)[1 + 2.5 + (2.5)²/2! + (2.5)³/3! + (2.5)⁴/4!] = 1 − e^(−2.5)(10.8568)
• = 1 − 0.8912 = 0.1088
Question 4: The number of accidents in a year attributed to taxi drivers in a city follows Poisson
distribution with mean 3. Out of 1000 taxi drivers of that city, find approximately the number of drivers
with (i) no accidents in a year (ii) more than 3 accidents in a year.

Solution:
• Given that the mean number of accidents is λ = 3
• Poisson distribution, P(x) = e^(−λ) λ^x / x! = e^(−3) 3^x / x!
• (i) P(x=0) = e^(−3) 3^0 / 0! = 0.0498
• Number of drivers with no accidents = 1000(0.0498) = 49.8 ≈ 50
• (ii) P(x > 3) = 1 − [P(x=0) + P(x=1) + P(x=2) + P(x=3)]

= 1 − [e^(−3) 3^0/0! + e^(−3) 3^1/1! + e^(−3) 3^2/2! + e^(−3) 3^3/3!]

• = 1 − e^(−3)[1 + 3 + 3²/2! + 3³/3!] = 1 − e^(−3)(1 + 3 + 4.5 + 4.5) = 1 − e^(−3)(13)
• = 1 − 0.6472 = 0.3528
• Number of drivers with more than 3 accidents = 1000(0.3528) = 352.8 ≈ 353
Question 5: In a manufacturing process where glass products are made, defects occur occasionally rendering
the piece undesirable for marketing. It is known that, probability of any product to be defective is 0.001.
What is the probability that a random sample of 8000 will yield (i) exactly 2 defectives? (ii) fewer than 4
defectives?

Solution:
 This is a binomial experiment with n=8000 and p=0.001. Since p is very close to 0 and n is quite
large, we shall approximate with Poisson distribution.
 Let p be the probability that the product is defective p = 0.001.
 Given : n=8000 (no. of items in the sample), then Mean number of defective product
λ =np=(8000)(0.001)=8
• If X represents the number of defective products, by the Poisson distribution,
P(x) = e^(−λ) λ^x / x! = e^(−8) 8^x / x!
• (i) P(x = 2) = e^(−8) 8² / 2! = 0.01073

• (ii) P(X < 4) = P(x=0) + P(x=1) + P(x=2) + P(x=3)

= e^(−8) 8^0/0! + e^(−8) 8^1/1! + e^(−8) 8^2/2! + e^(−8) 8^3/3! = 0.04238
Question 6: In a certain factory producing blades there is a small probability of 1/500 for any blade to be
defective. The blades are supplied in packets of 10. Use Poisson distribution to calculate the approximate
number of packets containing (i) no defective (ii) one defective (iii) two defective blades in a consignment of
10,000 packets.
Solution: Let p be the probability that a blade is defective, p = 1/500 = 0.002 (as the probability of
occurrence p is small, the Poisson distribution can be used as an approximation)
• Given that n = 10 (no. of blades in each packet), then the mean number of defective blades per packet is
λ = np = (10)(0.002) = 0.02
• By Poisson distribution, P(x) = e^(−λ) λ^x / x! = e^(−0.02) (0.02)^x / x!
• (i) P(x=0) = e^(−0.02) (0.02)^0 / 0! = 0.9802
• Number of packets containing no defective = 10000(0.9802) = 9802

• (ii) P(x=1) = e^(−0.02) (0.02)^1 / 1! = 0.0196
• Number of packets containing one defective = 10000(0.0196) = 196
• (iii) P(x=2) = e^(−0.02) (0.02)^2 / 2! = 0.000196
• Number of packets containing two defectives = 10000(0.000196) = 1.96 ≈ 2
Question 7: If the probability of a bad reaction from a certain injection is 0.001, determine
the chance that out of 2000 individuals, more than two will get a bad reaction.
Solution:
 Let p be the probability of bad reaction p = 0.001(as probability of occurrence p is small, it
follows Poisson distribution)
• Given that n = 2000, then Mean λ = np = (2000)(0.001) = 2
• By Poisson distribution, P(x) = e^(−λ) λ^x / x! = e^(−2) 2^x / x!, where x denotes the number of people who have a
bad reaction from the injection.
• To find P(x > 2)

P(x > 2) = 1 − [P(x=0) + P(x=1) + P(x=2)]

= 1 − [e^(−2) 2^0/0! + e^(−2) 2^1/1! + e^(−2) 2^2/2!]

• = 1 − e^(−2)[1 + 2 + 2] = 1 − 5e^(−2) = 0.323



CONTINUOUS PROBABILITY DISTRIBUTIONS


Definitions:
 Recall: If a random variable takes uncountable number of possible values, then it is continuous. Here,
we shall concern ourselves with computing probabilities for various intervals of continuous random
variables.
• The function f(x) is a probability density function (pdf) for the continuous random variable X,
defined over the set of real numbers, if
1) f(x) ≥ 0
2) ∫_{−∞}^{∞} f(x) dx = 1
3) P(a < X < b) = ∫_{a}^{b} f(x) dx
• Note : (i) When X is continuous, P(X = a) = ∫_{a}^{a} f(x) dx = 0 (i.e.,) the probability that a continuous
random variable will assume any fixed value is zero, and hence
• P(a < X ≤ b) = P(a < X < b) + P(X = b) = P(a < X < b).
(i.e.,) it does not matter whether we include an end point of the interval or not. This is
not true, though, when X is discrete.
(ii) Since areas are used to represent probabilities and probabilities are non-negative, the density
function must lie entirely on or above the x-axis.
(iii) The probability density function is constructed so that the area under its curve bounded by the x-axis is
equal to 1 when computed over the range of X for which f(x) is defined.
(iv) The probability that X assumes a value between a and b is equal to the shaded area under the density
function between the ordinates at x = a and x = b. (P(a < X < b) = ∫_{a}^{b} f(x) dx)

(v) For a continuous random variable X and real number ‘a’,

• P(X ≥ a) = ∫_{a}^{∞} f(x) dx

• P(X < a) = 1 − P(X ≥ a) = 1 − ∫_{a}^{∞} f(x) dx
• Definition: For a continuous random variable X,
F(x) = P(X ≤ x) = ∫_{−∞}^{x} f(t) dt, for −∞ < x < ∞, is called the cumulative distribution function (c.d.f) of X.
• Note: (i) d/dx F(x) = f(x), if the derivative exists (ii) P(a < X < b) = F(b) − F(a)

• If X is a continuous random variable having a probability density function f(x), the expectation, or
the expected value of X, denoted by E[X], is defined by

• E[X] = ∫_{−∞}^{∞} x f(x) dx (Mean μ)

• If X is a random variable with mean µ, then the variance of X, denoted by Var(X), is defined by

• Var(X) = σ² = E[X²] − (E[X])² = ∫_{−∞}^{∞} x² f(x) dx − μ²
• Var(X) = ∫_{−∞}^{∞} (x − μ)² f(x) dx
• Standard deviation of X, SD(X) = σ = √Var(X)
Question 1

Solution:

• (a) Since f is a probability density function, ∫_{−∞}^{∞} f(x) dx = 1
• ∫_{−∞}^{0} f(x) dx + ∫_{0}^{2} f(x) dx + ∫_{2}^{∞} f(x) dx = ∫_{−∞}^{0} 0 dx + ∫_{0}^{2} c(4x − 2x²) dx + ∫_{2}^{∞} 0 dx = 1
• ∫_{0}^{2} c(4x − 2x²) dx = 1 ⇒ c[4·(x²/2) − 2·(x³/3)]_{x=0}^{x=2} = 1 ⇒ c(8/3) = 1 ⇒ c = 3/8
• f(x) = (3/8)(4x − 2x²) for 0 < x < 2 ; f(x) = 0 otherwise
• (b) P(X > 1) = ∫_{1}^{∞} f(x) dx = ∫_{1}^{2} (3/8)(4x − 2x²) dx + ∫_{2}^{∞} 0 dx = 1/2
Question 2: The total number of hours, measured in units of 100 hours, that a family runs a vacuum cleaner over a period of one year is a continuous random variable X that has the density function
$f(x) = \begin{cases} x, & 0 < x < 1 \\ 2 - x, & 1 \le x < 2 \\ 0, & \text{elsewhere.} \end{cases}$
Find the probability that over a period of one year, a family runs their vacuum cleaner (a) less than 120 hours (b) between 50 and 100 hours.

Solution: The total number of hours is measured in units of 100 hours, i.e., 1 unit = 100 hours.
• (a) To find: P(vacuum cleaner runs less than 120 hours) = $P(X < 1.2)$ (120 hours = 1.2 units)
• Formula: $P(X < a) = \int_{-\infty}^{a} f(x)\,dx$
• $P(X < 1.2) = \int_0^{1.2} f(x)\,dx = \int_0^1 x\,dx + \int_1^{1.2} (2 - x)\,dx$
• $= \left[\frac{x^2}{2}\right]_{x=0}^{x=1} + \left[2x - \frac{x^2}{2}\right]_{x=1}^{x=1.2} = \frac{1}{2} + 1.68 - 1.5 = 0.68$

• (b) To find: P(vacuum cleaner runs between 50 and 100 hours) = $P(0.5 < X < 1)$ (50 hours = 0.5 units and 100 hours = 1 unit)
• Formula: $P(a < X < b) = \int_a^b f(x)\,dx$
• $P(0.5 < X < 1) = \int_{0.5}^{1} f(x)\,dx = \int_{0.5}^{1} x\,dx$
• $= \left[\frac{x^2}{2}\right]_{x=0.5}^{x=1} = \frac{1}{2} - \frac{0.5^2}{2} = 0.375$
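The two probabilities in Question 2 can be cross-checked numerically. The sketch below is only an illustration: the helper names `f` and `integrate` are not from the notes, and Simpson's rule is one of several integration rules that could be used. The integral is split at x = 1 because the formula for the density changes there.

```python
def f(x):
    """Density from Question 2 (x measured in units of 100 hours)."""
    if 0 < x < 1:
        return x
    if 1 <= x < 2:
        return 2 - x
    return 0.0

def integrate(g, a, b, n=1000):
    """Composite Simpson's rule on [a, b] with n (even) subintervals."""
    h = (b - a) / n
    total = g(a) + g(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * g(a + i * h)
    return total * h / 3

# Split at x = 1, where the formula for f changes.
p_less_120 = integrate(f, 0, 1) + integrate(f, 1, 1.2)   # P(X < 1.2)
p_50_to_100 = integrate(f, 0.5, 1)                       # P(0.5 < X < 1)
print(round(p_less_120, 4), round(p_50_to_100, 4))
```

Because the density is piecewise linear, Simpson's rule reproduces the hand-computed values 0.68 and 0.375 up to floating-point error.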
Question 3

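Solution:
• By definition, $F(x) = P(X \le x) = \int_{-\infty}^{x} f(t)\,dt$, $-\infty < x < \infty$, where $f(t)$ is the p.d.f. of X.
• Use: $\frac{d}{dx} F(x) = f(x)$, where $f(x)$ is the p.d.f.
• Differentiating F(x) with respect to x, $f(x) = \begin{cases} 0, & x < 1 \\ 4c(x - 1)^3, & 1 \le x \le 3 \\ 0, & x > 3 \end{cases}$
• Since f is a probability density function, $\int_{-\infty}^{\infty} f(x)\,dx = 1$, i.e., $\int_1^3 4c(x - 1)^3\,dx = 1$
• $4c\left[\frac{(x-1)^4}{4}\right]_{x=1}^{x=3} = 1 \Rightarrow 16c = 1 \Rightarrow c = \frac{1}{16}$
• Then the p.d.f. is $f(x) = \begin{cases} 0, & x < 1 \\ \frac{1}{4}(x - 1)^3, & 1 \le x \le 3 \\ 0, & x > 3 \end{cases}$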
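Question 4

• Since f is a probability density function, $\int_{-\infty}^{\infty} f(x)\,dx = 1$.
• $\int_{-\infty}^{0} 0\,dx + \int_0^3 kx^2\,dx + \int_3^{\infty} 0\,dx = 1$
• $\int_0^3 kx^2\,dx = 1 \Rightarrow k\left[\frac{x^3}{3}\right]_{x=0}^{x=3} = 1 \Rightarrow 9k = 1 \Rightarrow k = \frac{1}{9}$
• $f(x) = \begin{cases} \frac{1}{9}x^2, & 0 < x < 3 \\ 0, & \text{otherwise} \end{cases}$
• (i) $P(1 < X < 2) = \int_1^2 \frac{1}{9}x^2\,dx = \frac{1}{9}\left[\frac{x^3}{3}\right]_{x=1}^{x=2} = \frac{7}{27}$
• (ii) $P(X \le 1) = \int_{-\infty}^{0} f(x)\,dx + \int_0^1 f(x)\,dx = \int_0^1 \frac{1}{9}x^2\,dx = \frac{1}{9}\left[\frac{x^3}{3}\right]_{x=0}^{x=1} = \frac{1}{27}$
• $P(X > 1) = \int_1^3 \frac{1}{9}x^2\,dx + \int_3^{\infty} 0\,dx = \frac{1}{9}\left[\frac{x^3}{3}\right]_{x=1}^{x=3} = \frac{1}{27}(27 - 1) = \frac{26}{27}$
• Mean: $\mu = \int_{-\infty}^{\infty} x f(x)\,dx = \int_0^3 x \cdot \frac{1}{9}x^2\,dx = \frac{1}{9}\int_0^3 x^3\,dx = \frac{1}{9}\left[\frac{x^4}{4}\right]_{x=0}^{x=3} = \frac{9}{4}$
• Variance: $\sigma^2 = \int_{-\infty}^{\infty} x^2 f(x)\,dx - \mu^2 = \int_0^3 x^2 \cdot \frac{1}{9}x^2\,dx - \left(\frac{9}{4}\right)^2 = \frac{1}{9}\left[\frac{x^5}{5}\right]_{x=0}^{x=3} - \frac{81}{16} = \frac{27}{5} - \frac{81}{16} = \frac{27}{80}$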
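Since the density in Question 4 is a monomial, every integral above can be evaluated exactly with rational arithmetic. The following is a minimal sketch; the helper `integrate_monomial` is a hypothetical name introduced here for illustration, not part of the notes.

```python
from fractions import Fraction

def integrate_monomial(coeff, power, a, b):
    """Exact integral of coeff * x**power over [a, b]."""
    n = power + 1
    return coeff * (Fraction(b) ** n - Fraction(a) ** n) / n

k = 1 / integrate_monomial(Fraction(1), 2, 0, 3)   # normalisation constant
p_1_2 = integrate_monomial(k, 2, 1, 2)             # P(1 < X < 2)
mean = integrate_monomial(k, 3, 0, 3)              # E[X]: integrand x * k x^2
second = integrate_monomial(k, 4, 0, 3)            # E[X^2]: integrand x^2 * k x^2
variance = second - mean ** 2
print(k, p_1_2, mean, variance)
```

Working with `Fraction` avoids rounding, so the results can be compared directly with the fractions 1/9, 7/27, 9/4 and 27/80 obtained by hand.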
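Question 5: A random variable X has the density function
$f(x) = \frac{k}{1 + x^2}$, $-\infty < x < \infty$, where $k$ is a constant. Determine $k$ and hence evaluate (i) $P(X \ge 0)$ (ii) $P(0 < X < 1)$.
Solution:
• Since f is a probability density function, $\int_{-\infty}^{\infty} f(x)\,dx = 1$.
• Since f(x) is an even function, $\int_{-\infty}^{\infty} f(x)\,dx = 2\int_0^{\infty} f(x)\,dx$.
• $2\int_0^{\infty} \frac{k}{1 + x^2}\,dx = 1 \Rightarrow 2k\left[\tan^{-1} x\right]_{x=0}^{x=\infty} = 1 \Rightarrow 2k \cdot \frac{\pi}{2} = 1 \Rightarrow k = \frac{1}{\pi}$
• $f(x) = \frac{1}{\pi(1 + x^2)}$, $-\infty < x < \infty$
• (i) $P(X \ge 0) = \int_0^{\infty} \frac{1}{\pi(1 + x^2)}\,dx = \frac{1}{\pi}\left[\tan^{-1} x\right]_{x=0}^{x=\infty} = \frac{1}{\pi} \cdot \frac{\pi}{2} = \frac{1}{2}$
• (ii) $P(0 < X < 1) = \int_0^1 \frac{1}{\pi(1 + x^2)}\,dx = \frac{1}{\pi}\left[\tan^{-1} x\right]_{x=0}^{x=1} = \frac{1}{\pi} \cdot \frac{\pi}{4} = \frac{1}{4}$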
Question 6: The lifetime (in hours) of a certain kind of radio tube is a random variable having a probability density function given by
$f(x) = \begin{cases} 0, & x \le 100 \\ \frac{100}{x^2}, & x > 100 \end{cases}$
What is the probability that exactly 2 of 5 such tubes in a radio set will have to be replaced within the first 150 hours of operation? Assume that the events $E_i$, i = 1, 2, 3, 4, 5, that the i-th such tube will have to be replaced within this time are independent.

Solution:
• The probability that a tube has to be replaced within the first 150 hours is $P(X < 150) = \int_0^{150} f(x)\,dx$
• $= \int_0^{100} 0\,dx + \int_{100}^{150} \frac{100}{x^2}\,dx = 100\int_{100}^{150} x^{-2}\,dx = 100\left[-\frac{1}{x}\right]_{x=100}^{x=150} = \frac{1}{3}$
• Let Y denote the number of tubes to be replaced within 150 hours of operation. Then Y can be considered as a binomial random variable with n = 5, y = 2 (two tubes to be replaced), p = 1/3, q = 1 - 1/3 = 2/3.
• $P(Y = 2) = \binom{5}{2}\left(\frac{1}{3}\right)^2\left(\frac{2}{3}\right)^3 = \frac{80}{243}$
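The two-step pattern used here (a continuous probability feeding a binomial count) can be sketched in a few lines. The function name `binomial_pmf` is an assumption made for this illustration; the value of `p_replace` uses the closed form $1 - 100/x$ of the integral of $100/t^2$ from 100 to $x$.

```python
from fractions import Fraction
from math import comb

def binomial_pmf(k, n, p):
    """P(Y = k) for a binomial variable with n trials and success probability p."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)

p_replace = 1 - Fraction(100, 150)        # P(X < 150) = 1 - 100/150
p_two_of_five = binomial_pmf(2, 5, p_replace)
print(p_replace, p_two_of_five)
```

Using exact fractions makes it easy to confirm the values 1/3 and 80/243 from the solution.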
Question 7
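Question 8

• Since f is a probability density function, $\int_{-\infty}^{\infty} f(x)\,dx = 1$.
• $\int_0^1 (a + bx^2)\,dx = 1 \Rightarrow a\left[x\right]_{x=0}^{x=1} + b\left[\frac{x^3}{3}\right]_{x=0}^{x=1} = 1 \Rightarrow a + \frac{b}{3} = 1 \Rightarrow 3a + b = 3$ ---- (1)
• Given $E[X] = \mu = \int_{-\infty}^{\infty} x f(x)\,dx = \int_0^1 x f(x)\,dx = \int_0^1 (ax + bx^3)\,dx = \frac{3}{5}$
• $\Rightarrow \frac{a}{2} + \frac{b}{4} = \frac{3}{5} \Rightarrow 4a + 2b = \frac{24}{5}$ ---- (2)
• Solving (1) and (2), a = 3/5 and b = 6/5.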


Normal Distribution
Definitions:
• X is a normal random variable, or X is normally distributed, with parameters $\mu$ and $\sigma^2$ if the density of X is given by
  $f(x) = \frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}}$, $-\infty < x < \infty$
• This density function $f(x)$ is a bell-shaped curve that is symmetric about $\mu$ (the graph is called the normal curve).
• The normal distribution is also known as the Gaussian distribution.

Note:
• $\int_{-\infty}^{\infty} f(x)\,dx = \frac{1}{\sqrt{2\pi}\,\sigma}\int_{-\infty}^{\infty} e^{-\frac{(x-\mu)^2}{2\sigma^2}}\,dx = 1$ (for a normal random variable)
• For the normal random variable X: Mean = E(X) = $\mu$, Variance = Var(X) = $\sigma^2$, Standard Deviation = $\sigma$.
Normal density functions
(a) $\mu = 0$ and $\sigma = 1$: $f(x) = \frac{1}{\sqrt{2\pi}}\, e^{-\frac{x^2}{2}}$
(b) arbitrary $\mu$, $\sigma^2$: $f(x) = \frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}}$

Note:
• The line $x = \mu$ divides the total area under the curve, which is equal to 1, into two equal parts.
• The area to the right as well as to the left of the line $x = \mu$ is 0.5.
Standard normal distribution
• If X is normally distributed with parameters $\mu$ and $\sigma^2$, then $Z = \frac{X - \mu}{\sigma}$ is normally distributed with parameters 0 and 1. Such a random variable is said to be a standard or a unit normal random variable.
• If X is a normal variate, then $P(a \le X \le b) = \int_a^b f(x)\,dx = \frac{1}{\sqrt{2\pi}\,\sigma}\int_a^b e^{-\frac{(x-\mu)^2}{2\sigma^2}}\,dx$ ---- (1)
• Substituting $Z = \frac{X - \mu}{\sigma}$ and changing the limits to $z_1 = \frac{a - \mu}{\sigma}$ and $z_2 = \frac{b - \mu}{\sigma}$ in Equation (1),
  $P(a \le X \le b) = P(z_1 \le Z \le z_2) = \frac{1}{\sqrt{2\pi}}\int_{z_1}^{z_2} e^{-\frac{z^2}{2}}\,dz$
• The standard normal probability density function $f(z) = \frac{1}{\sqrt{2\pi}}\, e^{-\frac{z^2}{2}}$ is also called the standard normal curve, which is symmetrical about the line z = 0.
Note:
• $\int_{-\infty}^{\infty} f(z)\,dz = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty} e^{-\frac{z^2}{2}}\,dz = 1$ (for a standard normal random variable)
• $\int_{-\infty}^{0} f(z)\,dz = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{0} e^{-\frac{z^2}{2}}\,dz = \frac{1}{2}$ and $\int_0^{\infty} f(z)\,dz = \frac{1}{\sqrt{2\pi}}\int_0^{\infty} e^{-\frac{z^2}{2}}\,dz = \frac{1}{2}$
• For the standard normal variable Z: Mean = E[Z] = 0, Variance = Var(Z) = 1, Standard Deviation = 1.
• Define $\phi(a) = \frac{1}{\sqrt{2\pi}}\int_0^a e^{-\frac{z^2}{2}}\,dz$ (this represents the area under the standard normal curve from Z = 0 to Z = a).
• The table which gives this area for different values of z is called the normal probability table.
Normal probability table
[Table: the area under the standard normal curve from Z = 0 to Z = a]
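When a printed table is not at hand, the tabulated area can be reproduced from the error function, using the identity $\phi(a) = \frac{1}{2}\,\mathrm{erf}(a/\sqrt{2})$. The sketch below assumes only Python's standard `math` module; the function name `phi` mirrors the notation of these notes and is not a library function.

```python
from math import erf, sqrt

def phi(a):
    """Area under the standard normal curve from z = 0 to z = a."""
    return 0.5 * erf(a / sqrt(2))

# A few arguments that appear in the worked problems of these notes.
for a in (0.33, 0.67, 1, 2):
    print(a, round(phi(a), 4))
```

By symmetry of the curve, `phi(-a)` equals `-phi(a)`, so $P(Z \le z) = 0.5 + \phi(z)$ holds for both positive and negative z.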
Note:

 For 𝒛𝟏 and 𝒛𝟐 > 𝟎,


Question 1: If X is a normal random variable with parameters $\mu = 3$ and $\sigma^2 = 9$, find (a) P{2 < X < 5} (b) P{X > 0} (c) P{|X − 3| > 6} (d) P{|X − 3| ≤ 6}.
Given: $\phi(1) = 0.3413$, $\phi(2) = 0.4772$, $\phi(0.67) = 0.2486$, $\phi(0.33) = 0.1293$.
Solution:
• Given: $\mu = 3$, $\sigma^2 = 9 \Rightarrow \sigma = 3$.
• Standard normal variate: $z = \frac{x - \mu}{\sigma} = \frac{x - 3}{3}$
• (a) P(2 < X < 5):
  - When x = 2, $z = \frac{2 - 3}{3} = -0.33$, and when x = 5, $z = \frac{5 - 3}{3} = 0.67$
  - P(2 < X < 5) = P(−0.33 < Z < 0.67) = P(−0.33 < Z < 0) + P(0 < Z < 0.67)
  - = P(0 < Z < 0.33) + P(0 < Z < 0.67) (by symmetry of the standard normal curve)
  - = $\phi(0.33) + \phi(0.67)$ (recall: by definition, $\phi(a) = \frac{1}{\sqrt{2\pi}}\int_0^a e^{-\frac{z^2}{2}}\,dz$)
  - = 0.1293 + 0.2486 = 0.3779
• (b) P(X > 0):
  - When x = 0, z = −1
  - P(X > 0) = P(Z > −1) = P(−1 < Z < 0) + P(Z ≥ 0)
  - = P(0 < Z < 1) + 0.5 (by symmetry of the standard normal curve)
  - = $\phi(1) + 0.5$ = 0.3413 + 0.5 = 0.8413
• (c) P{|X − 3| > 6}:
  - $|x - 3| = \begin{cases} x - 3, & x - 3 > 0 \\ -(x - 3), & x - 3 < 0 \end{cases}$
  - $|x - 3| > 6 \Rightarrow x - 3 > 6$ or $-(x - 3) > 6 \Rightarrow x > 9$ or $x < -3$
  - P(|X − 3| > 6) = P(X < −3) + P(X > 9)
  - When x = −3, $z = \frac{-3 - 3}{3} = -2$: P(X < −3) = P(Z < −2) = P(Z > 2) (by symmetry) = P(Z ≥ 0) − P(0 < Z < 2) = 0.5 − $\phi(2)$ = 0.5 − 0.4772 = 0.0228
  - When x = 9, $z = \frac{9 - 3}{3} = 2$: P(X > 9) = P(Z > 2) = P(Z ≥ 0) − P(0 < Z < 2) = 0.5 − $\phi(2)$ = 0.5 − 0.4772 = 0.0228
  - P(|X − 3| > 6) = P(X < −3) + P(X > 9) = 0.0228 + 0.0228 = 0.0456
• (d) P{|X − 3| ≤ 6} = 1 − P{|X − 3| > 6} = 1 − 0.0456 = 0.9544


Question 2: The marks of 1000 students in an examination follow a normal distribution with mean 70 and standard deviation 5. Find the number of students whose marks will be (i) less than 65, (ii) more than 75, (iii) between 65 and 75. Given: $\phi(1) = 0.3413$
Solution:
• Let X represent the marks of a student.
• Given: $\mu = 70$, $\sigma = 5$.
• Standard normal variate: $Z = \frac{X - \mu}{\sigma} = \frac{X - 70}{5}$
• (i) To find P(X < 65): when x = 65, $z = \frac{65 - 70}{5} = -1$
  - P(X < 65) = P(Z < −1) = P(Z ≤ 0) − P(−1 < Z < 0)
  - = 0.5 − P(0 < Z < 1) = 0.5 − $\phi(1)$ (by definition, $\phi(a) = \frac{1}{\sqrt{2\pi}}\int_0^a e^{-\frac{z^2}{2}}\,dz$)
  - = 0.5 − 0.3413 = 0.1587
  - Number of students whose marks will be less than 65 = 1000 × 0.1587 = 158.7 ≈ 159
• (ii) To find P(X > 75): when x = 75, $z = \frac{75 - 70}{5} = 1$
  - P(X > 75) = P(Z > 1) = P(1 < Z < ∞)
  - = P(Z ≥ 0) − P(0 < Z < 1) = 0.5 − $\phi(1)$ = 0.5 − 0.3413 = 0.1587
  - Number of students whose marks will be more than 75 = 1000 × 0.1587 = 158.7 ≈ 159
• (iii) To find P(65 < X < 75): when x = 65, z = −1, and when x = 75, z = 1
  - P(65 < X < 75) = P(−1 < Z < 1) = 2 P(0 < Z < 1) = 2$\phi(1)$ = 2(0.3413) = 0.6826
  - Number of students scoring marks between 65 and 75 = 1000 × 0.6826 = 682.6 ≈ 683
Question 3: A lawyer commutes daily from his suburban home to his midtown office. The average time for a one-way trip is 24 minutes, with a standard deviation of 3.8 minutes. Assume the distribution of trip times to be normally distributed.
(i) What is the probability that a trip will take at least ½ hour?
(ii) Find the probability that 2 of the next 3 trips will take at least ½ hour.
(iii) If the office opens at 9:00 AM and the lawyer leaves his house at 8:45 AM daily, what percentage of the time is he late for work?
(iv) If he leaves the house at 8:35 AM and coffee is served at the office from 8:50 AM until 9:00 AM, what is the probability that he misses coffee?
Given: P(0 < Z < 1.58) = $\phi(1.58)$ = 0.4429, P(0 < Z < 2.37) = $\phi(2.37)$ = 0.4911, P(0 < Z < 0.26) = $\phi(0.26)$ = 0.1026.
Solution: Let X represent the trip time in minutes.
• Given: $\mu = 24$, $\sigma = 3.8$. Standard normal variate: $Z = \frac{X - \mu}{\sigma} = \frac{X - 24}{3.8}$
• (i) P(trip takes at least ½ hour) = P(X ≥ 30):
  - When x = 30, $z = \frac{30 - 24}{3.8} = 1.58$
  - P(X ≥ 30) = P(Z ≥ 1.58) = P(Z ≥ 0) − P(0 < Z < 1.58)
  - = 0.5 − $\phi(1.58)$ (by definition, $\phi(a) = \frac{1}{\sqrt{2\pi}}\int_0^a e^{-\frac{z^2}{2}}\,dz$)
  - = 0.5 − 0.4429 = 0.0571
• (ii) P(2 of the next 3 trips will take at least ½ hour):
  - Let Y denote the number of trips that will take at least ½ hour. Then Y can be considered as a binomial random variable.
  - Then n = 3, y = 2 (two trips will take at least ½ hour), p = 0.0571, q = 1 − 0.0571 = 0.9429
  - $P(Y = 2) = \binom{3}{2}(0.0571)^2(0.9429)^1 = 0.0092$
• (iii) What percentage of the time is he late for work?
  - P(he is late for work) = P(travel time exceeds 15 minutes) = P(X > 15)
  - When x = 15, $z = \frac{15 - 24}{3.8} = -2.37$ (rounded to two decimal places)
  - P(X > 15) = P(Z > −2.37) = P(−2.37 < Z < 0) + P(Z > 0)
  - = P(0 < Z < 2.37) + 0.5 = 0.5 + $\phi(2.37)$ = 0.5 + 0.4911 = 0.9911
  - Conclusion: he is late for work 99.11% of the time.
• (iv) Probability that he misses coffee:
  - P(he misses coffee) = P(travel time exceeds 25 minutes) = P(X > 25)
  - When x = 25, $z = \frac{25 - 24}{3.8} = 0.26$ (rounded to two decimal places)
  - P(X > 25) = P(Z > 0.26) = P(Z ≥ 0) − P(0 < Z < 0.26) = 0.5 − $\phi(0.26)$ = 0.5 − 0.1026 = 0.3974
Question 4: An electrical firm manufactures light bulbs that have a life, before burn-out, that is normally distributed with mean equal to 2040 hours and standard deviation of 60 hours. In a test on 2000 bulbs, estimate the number of bulbs likely to last for (i) more than 2150 hours (ii) less than 1950 hours (iii) more than 1920 hours but less than 2160 hours.
Given: $\phi(1.83) = 0.4664$, $\phi(1.5) = 0.4332$, $\phi(2) = 0.4772$.
Solution:
• Let X represent the lifetime of a bulb.
• Given: $\mu = 2040$, $\sigma = 60$.
• Standard normal variate: $z = \frac{x - \mu}{\sigma} = \frac{x - 2040}{60}$
• (i) To find P(X > 2150): when x = 2150, $z = \frac{2150 - 2040}{60} = \frac{11}{6} = 1.83$
  - P(X > 2150) = P(Z > 1.83) = P(Z ≥ 0) − P(0 < Z < 1.83)
  - = 0.5 − $\phi(1.83)$ (by definition, $\phi(a) = \frac{1}{\sqrt{2\pi}}\int_0^a e^{-\frac{z^2}{2}}\,dz$)
  - = 0.5 − 0.4664 = 0.0336
  - Number of bulbs to last more than 2150 hours = 2000 × 0.0336 = 67.2 ≈ 67
• (ii) To find P(X < 1950): when x = 1950, $z = \frac{1950 - 2040}{60} = -1.5$
  - P(X < 1950) = P(Z < −1.5) = P(Z > 1.5) = P(Z ≥ 0) − P(0 < Z < 1.5) = 0.5 − $\phi(1.5)$ = 0.5 − 0.4332 = 0.0668
  - Number of bulbs to last less than 1950 hours = 2000 × 0.0668 = 133.6 ≈ 134
• (iii) To find P(1920 < X < 2160): when x = 1920, z = −2, and when x = 2160, z = 2
  - P(1920 < X < 2160) = P(−2 < Z < 2) = 2 P(0 < Z < 2) = 2$\phi(2)$ = 2(0.4772) = 0.9544
  - Number of bulbs to last more than 1920 hours but less than 2160 hours = 2000 × 0.9544 = 1908.8 ≈ 1909
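The bulb counts can also be estimated without a table by evaluating the normal c.d.f. directly. This is a sketch under the assumption that the standard-library identity $P(X \le x) = \frac{1}{2}\left[1 + \mathrm{erf}\left(\frac{x-\mu}{\sigma\sqrt{2}}\right)\right]$ is acceptable in place of the table lookup; `normal_cdf` is an illustrative helper name.

```python
from math import erf, sqrt

def normal_cdf(x, mu, sigma):
    """P(X <= x) for a normal variable with mean mu and standard deviation sigma."""
    return 0.5 * (1 + erf((x - mu) / (sigma * sqrt(2))))

mu, sigma, n_bulbs = 2040, 60, 2000
more_than_2150 = n_bulbs * (1 - normal_cdf(2150, mu, sigma))
less_than_1950 = n_bulbs * normal_cdf(1950, mu, sigma)
between = n_bulbs * (normal_cdf(2160, mu, sigma) - normal_cdf(1920, mu, sigma))
print(round(more_than_2150), round(less_than_1950), round(between))
```

The unrounded z-values give slightly different decimals than the two-decimal table entries, but the expected counts round to the same whole numbers of bulbs as in the solution.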
Question 5: In a normal distribution, 31% of the items are under the value 45 and 8% of the items are over the value 64. Find the mean and variance of the distribution. Given that P(Z < −1.4) = 0.08 and P(Z > 0.5) = 0.31.
Solution:
• Let X be the normal random variable with mean $\mu$ and standard deviation $\sigma$.
• Given: P(X < 45) = 0.31 and P(X > 64) = 0.08.
• Position of 45: there are two possible cases, 45 > $\mu$ or 45 < $\mu$. If 45 > $\mu$, then P(X < 45) = P(X ≤ $\mu$) + P($\mu$ < X < 45) = 0.31, which is not possible because P(X ≤ $\mu$) = 0.5. Therefore 45 < $\mu$.
• Position of 64: there are two possible cases, 45 < 64 < $\mu$ or 45 < $\mu$ < 64. If 45 < 64 < $\mu$, then P(X > 64) = P(64 < X < $\mu$) + P(X ≥ $\mu$) = 0.08, which is not possible because P(X ≥ $\mu$) = 0.5. Therefore 45 < $\mu$ < 64.
• Standard normal variate: $z = \frac{x - \mu}{\sigma}$
• When x = 45, $z = \frac{45 - \mu}{\sigma}$, and when x = 64, $z = \frac{64 - \mu}{\sigma}$
• $P(X < 45) = P\left(Z < \frac{45 - \mu}{\sigma}\right) = 0.31$ ---- (1)
• $P(X > 64) = P\left(Z > \frac{64 - \mu}{\sigma}\right) = 0.08$ ---- (2)
• Given: P(Z < −1.4) = 0.08 ⇒ P(Z > 1.4) = 0.08 ---- (3)
• and P(Z > 0.5) = 0.31 ⇒ P(Z < −0.5) = 0.31 ---- (4)
• Comparing (2) and (3), $\frac{64 - \mu}{\sigma} = 1.4 \Rightarrow \mu + 1.4\sigma = 64$ ---- (5)
• Comparing (1) and (4), $\frac{45 - \mu}{\sigma} = -0.5 \Rightarrow \mu - 0.5\sigma = 45$ ---- (6)
• Solving (5) and (6), we get mean $\mu = 50$ and standard deviation $\sigma = 10$, so the variance is $\sigma^2 = 100$.
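Question 6: If $f(x) = c\,e^{-\frac{x^2 - 6x + 4}{24}}$ is the probability density function of a normal variate, then find c, the mean and the variance.

Solution:
• The probability density function of a normal variable X is given by $f(x) = \frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}}$ ---- (1)
• Given $f(x) = c\,e^{-\frac{x^2 - 6x + 4}{24}}$
• Rewriting the given $f(x)$ by completing the square:
  $c\,e^{-\frac{1}{24}(x^2 - 2(3)x + 9 - 9 + 4)} = c\,e^{-\frac{1}{24}((x-3)^2 - 5)} = c\,e^{5/24}\,e^{-\frac{(x-3)^2}{24}}$
  $= c\,e^{5/24}\,e^{-\frac{(x-3)^2}{2(\sqrt{12})^2}}$ ---- (2)
• Comparing the exponents in (1) and (2), mean $\mu = 3$ and standard deviation $\sigma = \sqrt{12}$, i.e., variance $\sigma^2 = 12$.
• Comparing the constants, $c\,e^{5/24} = \frac{1}{\sqrt{2\pi}\sqrt{12}} = \frac{1}{\sqrt{24\pi}}$, so $c = \frac{e^{-5/24}}{\sqrt{24\pi}}$.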
[Figure: Percentages of the Area Under the Normal Curve]
[Figure: Percentages of the Area Under the Standard Normal Curve]
Exponential Distribution
• The exponential distribution often arises, in practice, as the distribution of the amount of time until some specific event occurs.
  E.g.: Time between arrivals at a congested intersection during rush hour in a large city.
  Amount of time until a phone call you receive turns out to be a wrong number.
  Amount of time before a certain type of component in a system fails (time to failure).
• Definition: A continuous random variable whose probability density function is given by
  $f(x) = \begin{cases} \alpha e^{-\alpha x}, & x \ge 0 \\ 0, & \text{otherwise} \end{cases}$ for some $\alpha > 0$, is said to be an exponential random variable with parameter $\alpha$.
• The cumulative distribution function F(a) of an exponential random variable is given by
  $F(a) = P(X \le a) = 1 - e^{-\alpha a}$, $a \ge 0$.
• $P(X > a) = 1 - P(X \le a) = 1 - (1 - e^{-\alpha a}) = e^{-\alpha a}$, $a \ge 0$.
• $P(a < X < b) = F(b) - F(a) = (1 - e^{-\alpha b}) - (1 - e^{-\alpha a}) = e^{-\alpha a} - e^{-\alpha b}$
Note:
• $\int_{-\infty}^{\infty} f(x)\,dx = \int_0^{\infty} \alpha e^{-\alpha x}\,dx = 1$
• Mean ($\mu$): $E(X) = \frac{1}{\alpha}$ (the mean of the exponential is the reciprocal of its parameter $\alpha$),
• Variance ($\sigma^2$): $\mathrm{Var}(X) = \frac{1}{\alpha^2}$, Standard Deviation ($\sigma$) = $\frac{1}{\alpha}$.
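The closed-form c.d.f. makes exponential probabilities easy to evaluate programmatically. The sketch below uses hypothetical helper names (`exp_cdf`, `exp_between`) and a mean of 5, chosen purely for illustration, so that $\alpha = 1/5$.

```python
from math import exp

def exp_cdf(a, alpha):
    """F(a) = P(X <= a) for an exponential variable with parameter alpha."""
    return 1 - exp(-alpha * a) if a >= 0 else 0.0

def exp_between(a, b, alpha):
    """P(a < X < b) = F(b) - F(a)."""
    return exp_cdf(b, alpha) - exp_cdf(a, alpha)

alpha = 1 / 5                              # mean 5  =>  alpha = 1 / mean
p_0_1 = exp_between(0, 1, alpha)           # P(0 < X < 1)
p_lt_10 = exp_cdf(10, alpha)               # P(X < 10)
p_ge_1 = 1 - exp_cdf(1, alpha)             # P(X >= 1)
print(round(p_0_1, 4), round(p_lt_10, 4), round(p_ge_1, 4))
```

Because X is continuous, strict and non-strict inequalities give the same probabilities, so a single c.d.f. function covers all three cases.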
Question 1: If a random variable X follows an exponential distribution with mean 5, find (i) P{0 < X < 1} (ii) P{X < 10} (iii) P{X ≤ 0 or X ≥ 1}
Solution:
• Given Mean = $\frac{1}{\alpha} = 5 \Rightarrow \alpha = \frac{1}{5}$.
• $f(x) = \frac{1}{5} e^{-\frac{x}{5}}$, $x \ge 0$, is the probability density function of the exponential random variable X.
• (i) $P(0 < X < 1) = \int_0^1 f(x)\,dx = \frac{1}{5}\int_0^1 e^{-\frac{x}{5}}\,dx = 1 - e^{-\frac{1}{5}} = 0.1813$ (OR)
  $P(0 < X < 1) = F(1) - F(0) = e^{-\alpha(0)} - e^{-\alpha(1)}$ (since $P(a < X < b) = F(b) - F(a)$) $= 1 - e^{-\frac{1}{5}} = 0.1813$
• (ii) $P(X < 10) = \int_{-\infty}^{0} f(x)\,dx + \int_0^{10} f(x)\,dx = \frac{1}{5}\int_0^{10} e^{-\frac{x}{5}}\,dx = 1 - \frac{1}{e^2} = 0.8647$ (OR)
  $P(X < 10) = P(X \le 10) = F(10) = 1 - e^{-\alpha(10)}$ (since $F(a) = P(X \le a) = 1 - e^{-\alpha a}$, $a \ge 0$) $= 1 - e^{-\frac{10}{5}} = 1 - e^{-2} = 0.8647$
• (iii) $P\{X \le 0 \text{ or } X \ge 1\} = P(X \le 0) + P(X \ge 1) = 0 + \int_1^{\infty} f(x)\,dx = \frac{1}{5}\int_1^{\infty} e^{-\frac{x}{5}}\,dx = e^{-\frac{1}{5}} = 0.8187$ (OR)
  $P(X \le 0) + P(X \ge 1) = 0 + (1 - P(X < 1)) = 1 - F(1) = 1 - (1 - e^{-\alpha(1)}) = e^{-\frac{1}{5}} = 0.8187$
Question 2: Suppose that a system contains a certain type of component whose time to failure (in years) is given by T. The random variable T follows an exponential distribution with mean time to failure 5. If 5 of these components are installed in different systems, what is the probability that at least 2 are still functioning at the end of 8 years?
Solution:
• Let T denote the time to failure. As it follows an exponential distribution, $f(t) = \alpha e^{-\alpha t}$, $t \ge 0$.
• Given Mean = $\frac{1}{\alpha} = 5 \Rightarrow \alpha = \frac{1}{5}$, so $f(t) = \frac{1}{5} e^{-\frac{t}{5}}$, $t \ge 0$, is the probability density function of T.
• P(a given component is still functioning after 8 years) = P(time to failure > 8)
• $P(T > 8) = \int_8^{\infty} f(t)\,dt = \frac{1}{5}\int_8^{\infty} e^{-\frac{t}{5}}\,dt = e^{-\frac{8}{5}} = 0.2019$
• Let Y denote the number of components functioning after 8 years. Then Y can be considered as a binomial random variable with n = 5, y ≥ 2 (at least two), p = 0.2019, q = 1 − 0.2019 = 0.7981.
• $P(Y \ge 2) = 1 - [P(Y = 0) + P(Y = 1)]$
  $= 1 - \left\{\binom{5}{0}(0.2019)^0(0.7981)^5 + \binom{5}{1}(0.2019)^1(0.7981)^4\right\} = 1 - 0.7333 = 0.2667$
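As a cross-check of this calculation, the survival probability and the binomial tail can be chained in a short script. The names `p` and `pmf` are illustrative only, and the script keeps the unrounded value of $e^{-8/5}$ rather than 0.2019.

```python
from math import comb, exp

p = exp(-8 / 5)                 # P(T > 8) when the mean time to failure is 5 years

def pmf(k, n=5):
    """P(Y = k): exactly k of the n components still function after 8 years."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)

at_least_two = 1 - pmf(0) - pmf(1)   # complement of "none or exactly one"
print(round(p, 4), round(at_least_two, 4))
```

Since no intermediate rounding is applied here, the last decimal place may differ slightly from the hand calculation, which rounds p to four decimals before substituting.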
Question 3: Based on extensive testing, it is determined that the time X, in years, before a major repair is required for a certain washing machine is characterized by the density function
$f(x) = \begin{cases} \frac{1}{4} e^{-\frac{x}{4}}, & x \ge 0 \\ 0, & \text{otherwise.} \end{cases}$
The machine is considered a good purchase if it is unlikely to require a major repair before the sixth year. Conclude whether or not the washing machine is a good purchase.

Solution: Let X be the time before a major repair is required.
• Given: X is exponentially distributed with parameter $\alpha = 1/4$ (so the mean is $1/\alpha = 4$ years).
• The machine is a good purchase if the probability that it will require a major repair after the sixth year is more than the probability that it will require a repair before six years.
• P(it requires a major repair after the sixth year) = $P(X > 6) = \frac{1}{4}\int_6^{\infty} e^{-\frac{x}{4}}\,dx = e^{-\frac{3}{2}} = 0.2231$.
• P(it requires a major repair before the sixth year) = P(X < 6) = 1 − 0.2231 = 0.7769
• Conclusion: The machine is not really a good purchase.


Question 4: The length of a phone call (in minutes) arriving at a particular center follows an exponential distribution with an average of 5 minutes. Find the probability that a random call made to this center (i) ends in less than 5 minutes (ii) ends between 5 and 10 minutes.

Solution:
• Let X denote the length of the call. As it follows an exponential distribution, $f(x) = \alpha e^{-\alpha x}$, $x \ge 0$.
• Given Mean = $\frac{1}{\alpha} = 5 \Rightarrow \alpha = \frac{1}{5}$, so $f(x) = \frac{1}{5} e^{-\frac{x}{5}}$, $x \ge 0$, is the probability density function of X.
• (i) $P(X < 5) = \int_0^5 f(x)\,dx = \frac{1}{5}\int_0^5 e^{-\frac{x}{5}}\,dx = 1 - \frac{1}{e} = 0.6321$
• (ii) $P(5 < X < 10) = \int_5^{10} f(x)\,dx = \frac{1}{5}\int_5^{10} e^{-\frac{x}{5}}\,dx = \frac{1}{e} - \frac{1}{e^2} = 0.2325$
JOINT PROBABILITY DISTRIBUTIONS

Definitions: (Discrete Case)

• If X and Y are two discrete random variables, we define the joint probability function (or joint probability mass function) of X and Y by $P(X = x, Y = y) = p(x, y)$, where $p(x, y)$ satisfies the conditions $p(x, y) \ge 0$ and $\sum_x \sum_y p(x, y) = 1$, the summation being taken over all the values of x and y. That is, the values $p(x, y)$ give the probability that outcomes x and y occur at the same time.
• Suppose X takes the values $\{x_1, x_2, \ldots, x_m\}$ and Y takes the values $\{y_1, y_2, \ldots, y_n\}$; then $P(X = x_i, Y = y_j) = p(x_i, y_j)$ is denoted by $J_{ij}$.
• The set of values of the function $p(x_i, y_j) = J_{ij}$, $i = 1, 2, \ldots, m$, $j = 1, 2, \ldots, n$, is called the joint probability distribution of X and Y. These values are presented in the form of a two-way table called the joint probability table.
• Note: The function $p$ is defined on the set $X \times Y = \{(x_1, y_1), (x_1, y_2), \ldots, (x_m, y_n)\}$ (the cartesian product of the sets X and Y).
• In the joint probability table, $f(x_1), f(x_2), \ldots, f(x_m)$ respectively represent the sum of all the entries in the first row, second row, …, m-th row, and $g(y_1), g(y_2), \ldots, g(y_n)$ respectively represent the sum of all the entries in the first column, second column, …, n-th column, i.e., $f(x_i) = \sum_{j=1}^{n} J_{ij}$ and $g(y_j) = \sum_{i=1}^{m} J_{ij}$.
• $\{f(x_1), f(x_2), \ldots, f(x_m)\}$ and $\{g(y_1), g(y_2), \ldots, g(y_n)\}$ are called the marginal probability distributions of X alone and Y alone respectively.
• Note:
  - $f(x_1) + f(x_2) + \cdots + f(x_m) = 1$ and $g(y_1) + g(y_2) + \cdots + g(y_n) = 1$
  - In other words, $\sum_{i=1}^{m}\sum_{j=1}^{n} p(x_i, y_j) = \sum_{i=1}^{m}\sum_{j=1}^{n} J_{ij} = 1$, i.e., the total of all entries in the joint probability table is equal to 1.
• The discrete random variables X and Y are said to be independent random variables if $P(X = x, Y = y) = P(X = x) \cdot P(Y = y)$, and conversely.
• $P(X = x_i, Y = y_j) = P(X = x_i) \cdot P(Y = y_j) \Rightarrow f(x_i) \cdot g(y_j) = J_{ij}$ in the joint probability table, i.e., X and Y are independent if each entry $J_{ij}$ in the table is equal to the product of its marginal entries. Otherwise, X and Y are said to be dependent.

• If X and Y are two discrete random variables having the joint probability function $p(x, y)$, then the expectations of X and Y are defined as
  - $\mu_X = E(X) = \sum_x \sum_y x\,p(x, y) = \sum_i x_i f(x_i)$
  - $\mu_Y = E(Y) = \sum_x \sum_y y\,p(x, y) = \sum_j y_j g(y_j)$
  - $E(XY) = \sum_{i,j} x_i y_j J_{ij}$
• If $Z = \phi(X, Y)$ and $p(x, y)$ is the joint distribution of X and Y, the expectation of Z in the joint distribution of X, Y is defined as $E(Z) = \sum_{i,j} \phi(x_i, y_j) J_{ij}$.
• If X and Y are two discrete random variables having means $\mu_X$ and $\mu_Y$ respectively, then the covariance of X and Y, denoted by $cov(X, Y)$, is defined as
  - $cov(X, Y) = \sum_i \sum_j (x_i - \mu_X)(y_j - \mu_Y) J_{ij} = E[(X - \mu_X)(Y - \mu_Y)]$
  - $cov(X, Y) = \sum_i \sum_j x_i y_j J_{ij} - \mu_X \mu_Y = E(XY) - \mu_X \mu_Y$
• Correlation of X and Y: $\rho(X, Y) = \frac{cov(X, Y)}{\sigma_X \sigma_Y}$, where $\sigma_X$ and $\sigma_Y$ denote the standard deviations of X and Y respectively ($\sigma_X^2 = E(X^2) - \mu_X^2$ and $\sigma_Y^2 = E(Y^2) - \mu_Y^2$).
• Note:
1) If X and Y are independent random variables, then
  - $E(XY) = E(X)E(Y)$
  - $cov(X, Y) = 0$ and $\rho(X, Y) = 0$
  - $\sigma_{X+Y}^2 = \sigma_X^2 + \sigma_Y^2$
2) $cov(X, X) = E[(X - \mu_X)^2] = V(X) = \sigma_X^2$
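Question 1: The joint distribution of two random variables X and Y is as follows.

  X \ Y     -4      2      7
  1         1/8    1/4    1/8
  5         1/4    1/8    1/8

Compute the following:
(a) E(X) and E(Y) (b) E(XY) (c) $\sigma_X$ and $\sigma_Y$ (d) $cov(X, Y)$ (e) $\rho(X, Y)$
Solution:
• Given: $x_1 = 1$, $x_2 = 5$, $y_1 = -4$, $y_2 = 2$, $y_3 = 7$, $J_{11} = \frac{1}{8}$, $J_{12} = \frac{1}{4}$, $J_{13} = \frac{1}{8}$, $J_{21} = \frac{1}{4}$, $J_{22} = \frac{1}{8}$, $J_{23} = \frac{1}{8}$
• The marginal distributions of X and Y:
  - Sum of entries in each row: $f(x_1) = J_{11} + J_{12} + J_{13} = 1/2$ and $f(x_2) = J_{21} + J_{22} + J_{23} = 1/2$
  - Sum of entries in each column: $g(y_1) = J_{11} + J_{21} = 3/8$, $g(y_2) = J_{12} + J_{22} = 3/8$, $g(y_3) = J_{13} + J_{23} = 1/4$

Distribution of X:
  x_i       1      5
  f(x_i)    1/2    1/2

Distribution of Y:
  y_j       -4     2      7
  g(y_j)    3/8    3/8    1/4

• (a) $E(X) = \mu_X = \sum_{i=1}^{2} x_i f(x_i) = 1\cdot\frac{1}{2} + 5\cdot\frac{1}{2} = 3$ and $E(Y) = \mu_Y = \sum_{j=1}^{3} y_j g(y_j) = -4\cdot\frac{3}{8} + 2\cdot\frac{3}{8} + 7\cdot\frac{1}{4} = 1$
• (b) $E(XY) = \sum_{i=1}^{2}\sum_{j=1}^{3} x_i y_j J_{ij} = (1)(-4)\frac{1}{8} + (1)(2)\frac{1}{4} + (1)(7)\frac{1}{8} + (5)(-4)\frac{1}{4} + (5)(2)\frac{1}{8} + (5)(7)\frac{1}{8} = \frac{3}{2}$
• (c) $\sigma_X^2 = E(X^2) - \mu_X^2$
  $E(X^2) = \sum_{i=1}^{2} x_i^2 f(x_i) = 1\cdot\frac{1}{2} + 25\cdot\frac{1}{2} = 13$ and $\mu_X^2 = 3^2 = 9$
  $\sigma_X^2 = 13 - 9 = 4$, so $\sigma_X = 2$
• $\sigma_Y^2 = E(Y^2) - \mu_Y^2$
  $E(Y^2) = \sum_{j=1}^{3} y_j^2 g(y_j) = 16\cdot\frac{3}{8} + 4\cdot\frac{3}{8} + 49\cdot\frac{1}{4} = \frac{79}{4}$ and $\mu_Y^2 = 1$
  $\sigma_Y^2 = \frac{79}{4} - 1 = \frac{75}{4}$, so $\sigma_Y = \sqrt{75/4}$
• (d) $cov(X, Y) = E(XY) - \mu_X \mu_Y = \frac{3}{2} - (3)(1) = -\frac{3}{2}$
• (e) $\rho(X, Y) = \frac{cov(X, Y)}{\sigma_X \sigma_Y} = \frac{-3/2}{(\sqrt{4})(\sqrt{75/4})} = -0.1732$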
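The whole computation in Question 1 follows one pattern: row and column sums give the marginals, and weighted sums give the moments. A minimal sketch with exact fractions is shown below; the variable names (`J`, `f`, `g`, `EX`, and so on) simply mirror the notation of these notes and are otherwise arbitrary.

```python
from fractions import Fraction as F
from math import sqrt

xs, ys = [1, 5], [-4, 2, 7]
J = [[F(1, 8), F(1, 4), F(1, 8)],      # row for x = 1
     [F(1, 4), F(1, 8), F(1, 8)]]      # row for x = 5

f = [sum(row) for row in J]            # marginal of X: row sums
g = [sum(col) for col in zip(*J)]      # marginal of Y: column sums
EX = sum(x * p for x, p in zip(xs, f))
EY = sum(y * p for y, p in zip(ys, g))
EXY = sum(x * y * J[i][j] for i, x in enumerate(xs) for j, y in enumerate(ys))
var_x = sum(x * x * p for x, p in zip(xs, f)) - EX ** 2
var_y = sum(y * y * p for y, p in zip(ys, g)) - EY ** 2
cov = EXY - EX * EY
rho = float(cov) / sqrt(var_x * var_y)
print(EX, EY, EXY, var_x, var_y, cov, round(rho, 4))
```

Replacing `xs`, `ys` and `J` with another joint probability table reuses the same steps unchanged.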
Question 2: A fair coin is tossed thrice. The random variables X and Y are defined as follows:
X = 0 or 1 according as head or tail occurs on the first toss.
Y = Number of heads.
a) Determine the marginal distributions of X and Y
b) Determine the joint distribution of X and Y
c) Obtain the expectations of X, Y and XY. Also find the standard deviations of X and Y
d) Compute the covariance and correlation of X and Y
Solution: The sample space S and the values of the random variables X and Y are as follows:

  S    HHH   HHT   HTH   HTT   THH   THT   TTH   TTT
  X    0     0     0     0     1     1     1     1
  Y    3     2     2     1     2     1     1     0

a) The probability distributions of X and Y:
• X = {0, 1}, Y = {0, 1, 2, 3}
• $P(X = 0) = \frac{4}{8} = \frac{1}{2}$, $P(X = 1) = \frac{4}{8} = \frac{1}{2}$
• $P(Y = 0) = \frac{1}{8}$, $P(Y = 1) = \frac{3}{8}$, $P(Y = 2) = \frac{3}{8}$, $P(Y = 3) = \frac{1}{8}$

Distribution of X:
  x_i       0      1
  f(x_i)    1/2    1/2

Distribution of Y:
  y_j       0      1      2      3
  g(y_j)    1/8    3/8    3/8    1/8

b) The joint distribution of X and Y:

  X \ Y      y_1 = 0   y_2 = 1   y_3 = 2   y_4 = 3   f(x_i)
  x_1 = 0    0         1/8       1/4       1/8       1/2
  x_2 = 1    1/8       1/4       1/8       0         1/2
  g(y_j)     1/8       3/8       3/8       1/8       1

c) Expectations of X, Y, XY and standard deviations of X and Y:
• $E(X) = \mu_X = \sum_{i=1}^{2} x_i f(x_i) = 0\cdot\frac{1}{2} + 1\cdot\frac{1}{2} = \frac{1}{2}$
• $E(Y) = \mu_Y = \sum_{j=1}^{4} y_j g(y_j) = 0\cdot\frac{1}{8} + 1\cdot\frac{3}{8} + 2\cdot\frac{3}{8} + 3\cdot\frac{1}{8} = \frac{12}{8} = \frac{3}{2}$
• $E(XY) = \sum_{i=1}^{2}\sum_{j=1}^{4} x_i y_j J_{ij} = 0 + (1)(1)\frac{1}{4} + (1)(2)\frac{1}{8} = \frac{1}{2}$
• $\sigma_X^2 = E(X^2) - \mu_X^2 = \sum_{i=1}^{2} x_i^2 f(x_i) - \mu_X^2 = 0 + 1\cdot\frac{1}{2} - \left(\frac{1}{2}\right)^2 = \frac{1}{4}$, so $\sigma_X = \frac{1}{2}$
• $\sigma_Y^2 = E(Y^2) - \mu_Y^2 = \sum_{j=1}^{4} y_j^2 g(y_j) - \mu_Y^2 = 0 + 1\cdot\frac{3}{8} + 4\cdot\frac{3}{8} + 9\cdot\frac{1}{8} - \left(\frac{3}{2}\right)^2 = \frac{3}{4}$, so $\sigma_Y = \frac{\sqrt{3}}{2}$

d) $cov(X, Y) = E(XY) - \mu_X \mu_Y = \frac{1}{2} - \frac{3}{4} = -\frac{1}{4}$
$\rho(X, Y) = \frac{cov(X, Y)}{\sigma_X \sigma_Y} = \frac{-1/4}{\sqrt{3}/4} = -\frac{1}{\sqrt{3}}$
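The joint table in this problem comes from listing all eight equally likely outcomes, which is straightforward to automate. The sketch below enumerates the sample space with `itertools.product`; the dictionary name `joint` is an illustrative choice.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

joint = Counter()
for tosses in product("HT", repeat=3):
    x = 0 if tosses[0] == "H" else 1      # X: 0 for a head, 1 for a tail on the first toss
    y = tosses.count("H")                 # Y: number of heads in the three tosses
    joint[(x, y)] += Fraction(1, 8)       # each outcome has probability 1/8

EX = sum(x * p for (x, y), p in joint.items())
EY = sum(y * p for (x, y), p in joint.items())
EXY = sum(x * y * p for (x, y), p in joint.items())
cov = EXY - EX * EY
print(dict(joint))
print(EX, EY, EXY, cov)
```

Pairs that never occur, such as (0, 0), are simply absent from `joint`, which corresponds to the zero entries of the joint probability table.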
Question 3: Suppose X and Y are independent random variables with the following respective distributions. Find the joint distribution of X and Y. Also verify that $cov(X, Y) = 0$.

  x_i       1      2
  f(x_i)    0.7    0.3

  y_j       -2     5      8
  g(y_j)    0.3    0.5    0.2

Solution:
• Since X and Y are independent, $f(x_i) \cdot g(y_j) = J_{ij}$ ($i = 1, 2$ and $j = 1, 2, 3$), i.e., $J_{ij}$ is obtained by multiplying the marginal entries.
• $J_{11} = f(x_1) g(y_1) = (0.7)(0.3) = 0.21$, $J_{12} = f(x_1) g(y_2) = (0.7)(0.5) = 0.35$,
  $J_{13} = f(x_1) g(y_3) = (0.7)(0.2) = 0.14$, $J_{21} = f(x_2) g(y_1) = (0.3)(0.3) = 0.09$,
  $J_{22} = f(x_2) g(y_2) = (0.3)(0.5) = 0.15$, $J_{23} = f(x_2) g(y_3) = (0.3)(0.2) = 0.06$.

  X \ Y      y_1 = -2   y_2 = 5   y_3 = 8   f(x_i)
  x_1 = 1    0.21       0.35      0.14      0.7
  x_2 = 2    0.09       0.15      0.06      0.3
  g(y_j)     0.3        0.5       0.2       1

• $cov(X, Y) = E(XY) - \mu_X \mu_Y$
• $E(X) = \mu_X = \sum_{i=1}^{2} x_i f(x_i) = (1)(0.7) + (2)(0.3) = 1.3$
• $E(Y) = \mu_Y = \sum_{j=1}^{3} y_j g(y_j) = (-2)(0.3) + (5)(0.5) + (8)(0.2) = 3.5$
• $E(XY) = \sum_{i=1}^{2}\sum_{j=1}^{3} x_i y_j J_{ij} = (1)(-2)(0.21) + (1)(5)(0.35) + (1)(8)(0.14) + (2)(-2)(0.09) + (2)(5)(0.15) + (2)(8)(0.06) = 4.55$
• $cov(X, Y) = E(XY) - \mu_X \mu_Y = 4.55 - (1.3)(3.5) = 0$
Question 4: Let X and Y be independent random variables. X takes the values 2, 5, 7 with probabilities 1/2, 1/4, 1/4 respectively. Y takes the values 3, 4, 5 with probabilities 1/3, 1/3, 1/3.
(i) Find the joint probability distribution of X and Y.
(ii) Show that $cov(X, Y) = 0$.

Solution: (i)
• Given data:

  x_i       2      5      7
  f(x_i)    1/2    1/4    1/4

  y_j       3      4      5
  g(y_j)    1/3    1/3    1/3

• Since X and Y are independent, $f(x_i) \cdot g(y_j) = J_{ij}$ ($i = 1, 2, 3$ and $j = 1, 2, 3$), i.e., $J_{ij}$ is obtained by multiplying the marginal entries.

  X \ Y      y_1 = 3   y_2 = 4   y_3 = 5   f(x_i)
  x_1 = 2    1/6       1/6       1/6       1/2
  x_2 = 5    1/12      1/12      1/12      1/4
  x_3 = 7    1/12      1/12      1/12      1/4
  g(y_j)     1/3       1/3       1/3       1

• (ii) $cov(X, Y) = E(XY) - \mu_X \mu_Y$
• $E(X) = \mu_X = \sum_{i=1}^{3} x_i f(x_i) = 2\cdot\frac{1}{2} + 5\cdot\frac{1}{4} + 7\cdot\frac{1}{4} = 4$
• $E(Y) = \mu_Y = \sum_{j=1}^{3} y_j g(y_j) = 3\cdot\frac{1}{3} + 4\cdot\frac{1}{3} + 5\cdot\frac{1}{3} = 4$
• $E(XY) = \sum_{i=1}^{3}\sum_{j=1}^{3} x_i y_j J_{ij} = (2)(3)\frac{1}{6} + (2)(4)\frac{1}{6} + (2)(5)\frac{1}{6} + (5)(3)\frac{1}{12} + (5)(4)\frac{1}{12} + (5)(5)\frac{1}{12} + (7)(3)\frac{1}{12} + (7)(4)\frac{1}{12} + (7)(5)\frac{1}{12} = 16$
• $cov(X, Y) = E(XY) - \mu_X \mu_Y = 16 - (4)(4) = 0$
5 Module 3 - Class notes
Over 50 MB - See link https://tinyurl.com/2phchbzc

Unit 4 - Statistical Hypothesis Testing - 4th Sem, B.Tech. 2024
Department of Mathematics, School of Engineering, Dayananda Sagar University

Contents
1 Introduction
1.1 Prerequisites from Probability Theory
1.2 Prerequisites from Mathematical Statistics
1.3 Key notions in Hypothesis Testing
1.4 A toy example
2 Single Sample - Single mean (variance known)
2.1 Overview
2.2 Examples
3 Single Sample - Single mean (variance unknown)
3.1 Overview
3.2 Examples
4 Two Samples - Tests on means (unknown equal variances)
4.1 Overview
4.2 Examples
5 Single Sample - Test for variance
5.1 Overview
5.2 Examples
6 Exercises
References

1 Introduction
Hypothesis testing serves as a cornerstone of statistical inference, providing a systematic framework for
making decisions based on sample data and evaluating hypotheses about population parameters. Grounded
in probability theory and statistical principles, hypothesis testing enables researchers to draw meaningful
conclusions and make informed decisions in various fields.

In contemporary hypothesis testing, the focus is on controlling the Type I error rate (α) while balancing it
with the Type II error rate (β)¹. This approach allows researchers to specify the desired level of significance
and assess the trade-off between detecting true effects and minimizing the risk of false positives.
Hypothesis testing provides a structured approach to making decisions and drawing conclusions based
on sample data and statistical inference. By understanding the key concepts and modern approaches to
hypothesis testing, researchers can effectively evaluate hypotheses, control error rates, and make informed
decisions in scientific research, business, and beyond.

1.1 Prerequisites from Probability Theory


Before delving into hypothesis testing, we recall briefly the following notions from the previous three
modules, [Nar24a], [Nar24c], [Nar24b], [Ank24], [Pra24] of the course. The primary reference text for the
course is [Wal12]. An overarching, mathematically rigorous and powerful foundation to the subject is the
so-called Measure-theoretic probability theory. The gold standard reference for this is the classic [Chu01].

1.1.1 Sample Space and Events


Probability theory begins with defining the Sample Space Ω, representing all possible outcomes of a random
experiment. Events are subsets of Ω that describe specific outcomes or combinations of outcomes.
Mathematically, a sample space in Probability theory is simply a measure space Ω with a σ-algebra F and
a measure P on it such that P(Ω) = 1. An event is any element of the σ-algebra F.

1.1.2 Random Variables and Distributions


Random Variables are functions of the outcomes of random experiments, and their probability distributions
describe the likelihood of these outcomes. Understanding the behavior of random variables and their
corresponding distributions is crucial for hypothesis testing.
Mathematically, a random variable X is a measurable real/complex function on the sample space.

$X : \Omega \to \mathbb{C}$ such that $X^{-1}(U) \in \mathcal{F}$ for all open (or Borel) $U \subseteq \mathbb{C}$

1.1.3 Expectation and Variance


Expectation (mean) and Variance quantify the central tendency and spread of random variables, providing
insights into their behavior and variability. For a random variable X with probability density function
f(x), the expectation of X, denoted by E(X), and the variance of X, denoted by Var(X), are defined as
follows, wherein µ is the mean of X.

$E(X) = \int_{-\infty}^{\infty} x \cdot f(x)\,dx$

$\mathrm{Var}(X) = E((X - \mu)^2) = \int_{-\infty}^{\infty} (x - \mu)^2 \cdot f(x)\,dx$

¹ See 1.3.2 for rough definitions.

In our modern measure-theoretic language we can equivalently define these notions without (explicitly)
invoking the density function and regardless of whether the random variable is discrete, continuous, or
neither. The integrals below are called "Lebesgue Integrals" and they generalize the usual
integration over euclidean spaces to arbitrary (measure) spaces.

$E(X) = \int_{\Omega} X \, dP$

$\mathrm{Var}(X) = \int_{\Omega} (X - \mu)^2 \, dP$
1.1.4 Central Limit Theorem (CLT)


The Central Limit Theorem (CLT) states that the distribution of the sample mean approaches a normal
distribution as the sample size increases, regardless of the population distribution.
More precisely, if $X_1, X_2, \ldots, X_n$ are independent and identically distributed random variables with mean
µ and finite variance $\sigma^2$, then the distribution of the sample mean $\bar{X}$ approaches a normal distribution
with mean µ and standard deviation $\sigma/\sqrt{n}$ as n approaches infinity.

$\sqrt{n}\,(\bar{X} - \mu) \xrightarrow{d} N(0, \sigma^2)$

This theorem underpins many hypothesis testing procedures, particularly those involving sample means.
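A small simulation can make the statement of the theorem concrete. The sketch below is an illustration only: the choice of a Uniform(0, 1) population (mean 1/2, variance 1/12), the sample size n = 30, the 2000 replications and the seed are all assumptions made here, not part of the course material.

```python
import random
from statistics import mean, stdev

rng = random.Random(0)          # fixed seed so the run is repeatable
n, replications = 30, 2000

# Each entry is the mean of n draws from Uniform(0, 1): mu = 1/2, sigma^2 = 1/12.
sample_means = [mean(rng.random() for _ in range(n)) for _ in range(replications)]

centre = mean(sample_means)               # should be close to mu = 0.5
spread = stdev(sample_means)              # should be close to sigma / sqrt(n)
predicted = (1 / 12) ** 0.5 / n ** 0.5    # sigma / sqrt(n) from the theorem
print(centre, spread, predicted)
```

Even though the population is uniform rather than normal, the simulated sample means cluster around µ with a spread close to $\sigma/\sqrt{n}$, as the theorem predicts.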

1.2 Prerequisites from Mathematical Statistics


We also recall briefly the following notions that we shall use from mathematical statistics.

1.2.1 Population
In statistics, a population is the entire pool from which a statistical sample is drawn. It consists of all the
possible elements or outcomes that could be observed in a study. Populations can be finite or infinite and
may follow specific probability distributions.

1.2.2 Random Sample


A random sample from a population is a subset of observations selected in such a way that each member
of the population has an equal chance of being chosen. Formally, a random sample is an independent and
identically distributed (IID) collection of random variables drawn from the population distribution.

1.2.3 Population Mean and Variance
The population mean (µ) and the population variance (σ 2 ) are measures corresponding to the entire
population. We typically assume that the population follows a specific probability distribution which
comes with these parameters.

1.2.4 Confidence Interval Estimates for Mean


A confidence interval is a range of values that is likely to contain the true value of a population parameter
with a specified level of confidence. For the population mean, confidence intervals are constructed based
on sample data and are used to estimate the true mean.
• For the population mean with known variance, confidence intervals are typically constructed using
the normal distribution.
• For the population mean with unknown variance, confidence intervals are constructed using the
Student’s t-distribution.
• For the population variance, confidence intervals are constructed using the χ2 -distribution.
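For the first case (known variance), the interval has the familiar form x̄ ± z_{α/2} σ/√n. A minimal Python sketch follows; the function name and the numbers (sample mean 50, σ = 6, n = 36) are hypothetical and serve only as an illustration.

```python
from statistics import NormalDist

def z_confidence_interval(x_bar, sigma, n, confidence=0.95):
    """Two-sided interval x_bar +/- z_{alpha/2} * sigma / sqrt(n), with sigma known."""
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)   # about 1.96 for a 95% interval
    margin = z * sigma / n ** 0.5
    return x_bar - margin, x_bar + margin

# Hypothetical numbers: sample mean 50, known sigma 6, sample size 36.
low, high = z_confidence_interval(50.0, 6.0, 36)
print(round(low, 2), round(high, 2))
```

The t- and χ2-based intervals follow the same pattern but need quantiles that the Python standard library does not provide, so they would rely on table values.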

1.2.5 Sample mean and Sample variance


In statistics, when we collect data from a population, we often take a subset of that population called a
sample. Let {X1 , X2 , ..., Xn } be an (IID) random sample. We then define the random variables X̄ and S 2
as follows.
• The sample mean:

\[ \bar{X} = \frac{1}{n} \sum_{i=1}^{n} X_i \]

• The sample variance:

\[ S^2 = \frac{1}{n-1} \sum_{i=1}^{n} (X_i - \bar{X})^2 \]

Accordingly, if x1 , x2 , ..., xn denote the observed values of the random sample, then the observed sample
mean x̄ and observed sample variance s2 are the observed values of these random variables. They serve as point
estimates of µ and σ 2 and are defined as follows. The sample standard deviation is \( s = \sqrt{s^2} \).

\[ \bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i \]

\[ s^2 = \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})^2 \]

It’s important to note that we use lowercase letters (e.g., x̄ and s2 ) to represent observed values from the
sample, while uppercase letters (e.g., X̄ and S 2 ) denote the random variables themselves.
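The observed quantities x̄, s2 and s can be computed directly from their definitions. The sketch below uses a small made-up data set (an assumption for illustration only) and compares the hand-written formulas with Python's `statistics` module, which also uses the n − 1 divisor.

```python
import statistics

def sample_mean_and_variance(xs):
    """Observed sample mean and sample variance (divisor n - 1, not n)."""
    n = len(xs)
    x_bar = sum(xs) / n
    s2 = sum((x - x_bar) ** 2 for x in xs) / (n - 1)
    return x_bar, s2

data = [4.0, 7.0, 1.0, 8.0]          # hypothetical observations
x_bar, s2 = sample_mean_and_variance(data)
s = s2 ** 0.5                         # sample standard deviation

print(x_bar, s2, s)
print(statistics.mean(data), statistics.variance(data))  # same convention
```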

1.3 Key notions in Hypothesis Testing


1.3.1 Null and Alternative Hypotheses (H0 and H1 or Ha )
• The null hypothesis (H0 ) represents the default assumption.
• The statement that is being tested against the null hypothesis is the alternative hypothesis. It is
often denoted as H1 or Ha .
As a basic example, if statistical hypothesis testing is thought of as a judgement in a court trial, the null
hypothesis corresponds to the position of the defendant (the defendant is innocent) while the alternative
hypothesis is in the rival position of prosecutor (the defendant is guilty).

1.3.2 Type I and Type II Errors


• Type I error refers to the situation when the null hypothesis is rejected incorrectly.
• Type II error refers to when the null hypothesis is false but is not rejected.

Table 1: Type I and Type II errors in Hypothesis Testing

| Decision | Null Hypothesis is true | Null Hypothesis is false |
|---|---|---|
| Do not reject H0 | Correct decision | Type II Error |
| Reject H0 | Type I Error | Correct decision |

1.3.3 Significance Level


• The significance level (α) is the threshold for rejecting the null hypothesis. It represents the maximum
allowable probability of Type I error, the risk of incorrectly rejecting a true null hypothesis.
• The Type II error rate (β) is the probability of failing to reject the null hypothesis when it is actually
false. In other words, it represents the likelihood of not detecting a true effect or difference when one
exists. It is complementary to the statistical power of a test, which is the probability of correctly
rejecting the null hypothesis when it is false.
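To make the relation between α, β and power concrete, the sketch below computes β for an upper-tailed test on a mean with known σ, using β = Φ(z_α − (µ1 − µ0)/(σ/√n)). This formula assumes a normally distributed sample mean, and the numbers (µ0 = 100, true mean µ1 = 103, σ = 10, n = 49) are hypothetical values chosen for illustration.

```python
from statistics import NormalDist

def type_ii_error_upper_tailed(mu0, mu1, sigma, n, alpha):
    """P(fail to reject H0: mu = mu0) when the true mean is mu1, for H1: mu > mu0."""
    std_normal = NormalDist()
    z_alpha = std_normal.inv_cdf(1 - alpha)
    shift = (mu1 - mu0) / (sigma / n ** 0.5)   # true mean in standard-error units
    return std_normal.cdf(z_alpha - shift)

beta = type_ii_error_upper_tailed(100, 103, 10, 49, 0.05)
print(beta, 1 - beta)   # beta is about 0.32, so the power is about 0.68

# Sanity check: if the true mean equals mu0, H0 is (correctly) kept with probability 1 - alpha.
print(type_ii_error_upper_tailed(100, 100, 10, 49, 0.05))
```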

1.4 A toy example


We illustrate the main steps by means of the following example.
A manufacturer claims that a new process improves the mean weight of their product packaging from
the previous mean of 100 grams. A sample of 49 packages is taken, and the mean weight is found to be
103 grams. Assume a population standard deviation of 10 grams and test whether the new process has
improved the mean weight of the packaging at a significance level of 0.05.

1.4.1 Formulate Hypotheses


• Null Hypothesis (H0 ): The mean weight of the packaging remains 100 grams (µ = 100).
• Alternative Hypothesis (H1 ): The mean weight of the packaging has increased (µ > 100).
The alternate hypothesis here is chosen without regard to the possibility of µ < 100. It is a matter of the
background information, choice and other uncertainties that dictate the exact alternate hypothesis. In
summary, choose a one-tailed test when you have a specific directional hypothesis and want to maximize
statistical power, and choose a two-tailed test when you want to be more conservative or when there is no
specific directional hypothesis.

1.4.2 Consider the Significance Level (α)


We’re given α = 0.05, representing the maximum acceptable probability of Type I error. A Type I error
occurs when we reject the null hypothesis even though it is true. We are willing to risk such a mistake only when
the observed sample mean would be unlikely if H0 were true. The significance level α = 0.05 simply means
that we reject H0 if we observe a sample mean in a range of values, far from
µ = 100, whose probability of being observed under H0 is at most 5%.
In other words, assuming that H0 is true, i.e., µ = 100, we wish to find a (critical) value c (of observed
sample mean packaging weight) such that P (X̄ ≥ c) = α = 0.05.

1.4.3 Collect and represent data


Sample mean (x̄) = 103 grams, sample size (n) = 49, hypothesized population mean (µ0 ) = 100 grams, population
standard deviation (σ) = 10 grams.

1.4.4 Test Statistic and Critical values


We will use the (standard normal) Z-distribution to test the population mean, as the sample size is
sufficiently large and the population standard deviation is known.

\[ Z = \frac{\bar{X} - \mu_0}{\sigma / \sqrt{n}} \implies z = \frac{103 - 100}{10 / \sqrt{49}} = 2.1 \]

At a significance level of 0.05, the (one-tailed) critical Z-value is approximately 1.645, i.e.,

\[ P(Z \ge 1.645) = 0.05 \]

Also,

\begin{align*}
P(Z \ge 1.645) &= P\left( \frac{\bar{X} - 100}{10/\sqrt{49}} \ge 1.645 \right) \\
&= P\left( \bar{X} - 100 \ge 1.645 \times \frac{10}{\sqrt{49}} \right) \\
&= P(\bar{X} - 100 \ge 2.35) \\
&= P(\bar{X} \ge 102.35)
\end{align*}

1.4.5 Draw Conclusions


As the calculated Z-value equals 2.1 which is higher than the critical Z-value 1.645, or equivalently, the
observed sample mean exceeds the critical X̄-value, i.e., 103 > 102.35, we reject the null hypothesis. The
evidence suggests that the new process has indeed improved the mean weight of the packaging.
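The arithmetic of this toy example can be reproduced with Python's standard library. The sketch below uses only the numbers given above; the P-value line is an extra that the example itself does not compute.

```python
from statistics import NormalDist

mu0, sigma, n, x_bar, alpha = 100.0, 10.0, 49, 103.0, 0.05

std_error = sigma / n ** 0.5                 # sigma / sqrt(n) = 10/7
z = (x_bar - mu0) / std_error                # observed test statistic
z_crit = NormalDist().inv_cdf(1 - alpha)     # one-tailed critical value, about 1.645
x_crit = mu0 + z_crit * std_error            # same cut-off on the sample-mean scale
p_value = 1 - NormalDist().cdf(z)            # P(Z >= z) under H0

print(round(z, 2), round(z_crit, 3), round(x_crit, 2))
print(p_value)        # about 0.018, below alpha
print(z > z_crit)     # True: reject H0
```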

2 Single Sample - Single mean (variance known)


2.1 Overview
In this section, we consider tests for the population mean assuming that the population variance is known.
It must be emphasized that this is completely unrealistic but is useful to know later when one deals with
power of a test (or Type II error rate).
The underlying situation is that of having assumed/considered a random sample, i.e., a collection of
IID X1 , X2 , X3 , ..., Xn . As they’re identically distributed the random variables Xi all have the same
(probability distribution and thus) mean (µ) and variance (σ 2 ). We assume in this section that we’re
provided with σ 2 and a conjectural mean µ0 . The hypothesis testing procedure outlined below is referred
to as “Single sample test on a single mean - with significance level α and known variance".
1. Null hypothesis H0 : µ = µ0
2. We choose one of the following alternate hypotheses.
(a) H1 : µ ̸= µ0
(b) H1 : µ < µ0
(c) H1 : µ > µ0
3. Calculate the required sample mean x̄ and the test statistic \( z = \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}} \).

4. In this final step we choose the critical region as dictated by the choice of alternate hypothesis in
the second step.

(a) Reject H0 if z < −zα/2 or z > zα/2
(b) Reject H0 if z < −zα
(c) Reject H0 if z > zα
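The four steps above can be sketched as one small Python function. The function name and the labels 'two-sided', 'less' and 'greater' for cases (a), (b) and (c) are naming assumptions made here; the critical values come from the standard normal quantile function instead of a table.

```python
from statistics import NormalDist

def z_test(x_bar, mu0, sigma, n, alpha, alternative):
    """Single-sample z-test (variance known). Returns (z, reject_H0)."""
    z = (x_bar - mu0) / (sigma / n ** 0.5)
    std_normal = NormalDist()
    if alternative == "two-sided":        # case (a): H1: mu != mu0
        reject = abs(z) > std_normal.inv_cdf(1 - alpha / 2)
    elif alternative == "less":           # case (b): H1: mu < mu0
        reject = z < -std_normal.inv_cdf(1 - alpha)
    elif alternative == "greater":        # case (c): H1: mu > mu0
        reject = z > std_normal.inv_cdf(1 - alpha)
    else:
        raise ValueError("alternative must be 'two-sided', 'less' or 'greater'")
    return z, reject

# Data of the two examples in Section 2.2.
z1, reject1 = z_test(71.8, 70, 8.9, 100, 0.05, "greater")
z2, reject2 = z_test(7.8, 8, 0.5, 50, 0.01, "two-sided")
print(round(z1, 2), reject1)
print(round(z2, 2), reject2)
```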

2.2 Examples
2.2.1 [Wal12, Example 10.3, p. 338]
A random sample of 100 recorded deaths in the United States during the past year showed an average
life span of 71.8 years. Assuming a population standard deviation of 8.9 years, does this seem to indicate
that the mean life span today is greater than 70 years? Use a 0.05 level of significance. You may assume
z0.05 = 1.645.
Solution:
1. H0 : µ = 70 years.
2. H1 : µ > 70 years.
3. α = 0.05.
4. One-tailed critical region: z > 1.645, where \( z = \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}} \).
5. Computations: x̄ = 71.8 years, σ = 8.9 years, and hence \( z = \frac{71.8 - 70}{8.9/\sqrt{100}} = 2.02 \).

6. Decision: Reject H0 and conclude that the mean life span today is greater than 70 years.

2.2.2 [Wal12, Example 10.4, p. 338]


A manufacturer of sports equipment has developed a new synthetic fishing line that the company claims
has a mean breaking strength of 8 kilograms with a standard deviation of 0.5 kilogram. Test the hypothesis
that µ = 8 kilograms against the alternative that µ ̸= 8 kilograms if a random sample of 50 lines is tested
and found to have a mean breaking strength of 7.8 kilograms. Use a 0.01 level of significance. You may
assume z0.005 = 2.575.
Solution:
1. H0 : µ = 8 kilograms.
2. H1 : µ ̸= 8 kilograms.
3. α = 0.01.
4. Critical region: z < −2.575 or z > 2.575, where \( z = \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}} \).
5. Computations: x̄ = 7.8 kilograms, n = 50, and hence \( z = \frac{7.8 - 8}{0.5/\sqrt{50}} = -2.83 \).
6. Decision: Reject H0 and conclude that the average breaking strength is not equal to 8 but is, in fact,
less than 8 kilograms.

3 Single Sample - Single mean (variance unknown)


3.1 Overview
In this section, we consider tests for the population mean assuming that the population variance is unknown.
We shall thus invoke the Student’s t-distribution (with n − 1 degrees of freedom) which is precisely
the probability distribution followed by the random variable \( \frac{\bar{X} - \mu}{S/\sqrt{n}} \).

The underlying situation is that of having assumed/considered a random sample that follows normal
distribution, i.e., a collection of IID X1 , X2 , X3 , ..., Xn in N (µ, σ 2 ). As they’re identically distributed
the random variables Xi all have the same (probability distribution and thus) mean (µ) and variance (σ 2 ).
Also in this section we’re not provided with σ 2 but we do have a conjectural mean µ0 . The hypothesis
testing procedure outlined below is referred to as “Single sample test on a single mean - with significance
level α and unknown variance".
1. Null hypothesis H0 : µ = µ0
2. We choose one of the following alternate hypotheses.
(a) H1 : µ ̸= µ0
(b) H1 : µ < µ0
(c) H1 : µ > µ0
3. Calculate the sample mean x̄, the sample standard deviation s and the test statistic \( t = \frac{\bar{x} - \mu_0}{s/\sqrt{n}} \).

4. In this final step we choose the critical region as dictated by the choice of alternate hypothesis in
the second step. The t-values \( t_{\cdot,(n-1)} \) below are those corresponding to n − 1 degrees
of freedom.
(a) Reject H0 if t < −tα/2,(n−1) or t > tα/2,(n−1)
(b) Reject H0 if t < −tα,(n−1)
(c) Reject H0 if t > tα,(n−1)
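The procedure above can be sketched in Python as follows. Since the Python standard library has no t-distribution, the sketch takes the critical value from a table, as the examples in these notes do. The function names and the labels 'two-sided', 'less' and 'greater' are assumptions made for illustration.

```python
def one_sample_t(x_bar, mu0, s, n):
    """Test statistic t = (x_bar - mu0) / (s / sqrt(n)), with n - 1 degrees of freedom."""
    return (x_bar - mu0) / (s / n ** 0.5)

def in_critical_region(t, t_crit, alternative):
    """t_crit is the positive table value: t_{alpha/2, n-1} for 'two-sided', t_{alpha, n-1} otherwise."""
    if alternative == "two-sided":    # case (a)
        return t < -t_crit or t > t_crit
    if alternative == "less":         # case (b)
        return t < -t_crit
    if alternative == "greater":      # case (c)
        return t > t_crit
    raise ValueError("alternative must be 'two-sided', 'less' or 'greater'")

# Vacuum-cleaner data of Section 3.2.1, with t_{0.05,11} = 1.796 taken from the table.
t = one_sample_t(42, 46, 11.9, 12)
print(round(t, 2), in_critical_region(t, 1.796, "less"))
```

Passing the table value in as an argument keeps the sketch free of third-party dependencies; a statistics package could supply the quantile instead.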

3.2 Examples
3.2.1 [Wal12, Example 10.5, p. 340]
The Edison Electric Institute has published figures on the number of kilowatt hours used annually by
various home appliances. It is claimed that a vacuum cleaner uses an average of 46 kilowatt hours per
year. If a random sample of 12 homes included in a planned study indicates that vacuum cleaners use an
average of 42 kilowatt hours per year with a standard deviation of 11.9 kilowatt hours, does this suggest at
the 0.05 level of significance that vacuum cleaners use, on average, less than 46 kilowatt hours annually?
Assume the population of kilowatt hours to be normal. You may assume t0.05,11 = 1.796.
Solution:
1. H0 : µ = 46 kilowatt hours.
2. H1 : µ < 46 kilowatt hours.
3. α = 0.05.
4. Critical region: t < −1.796, where \( t = \frac{\bar{x} - \mu_0}{s/\sqrt{n}} \) with 11 degrees of freedom.
5. Computations: x̄ = 42 kilowatt hours, s = 11.9 kilowatt hours, and n = 12. Hence,

\[ t = \frac{42 - 46}{11.9/\sqrt{12}} = -1.16. \]

6. Decision: Do not reject H0 and conclude that the average number of kilowatt hours used annually by
home vacuum cleaners is not significantly less than 46. In fact, the P -value equals P (T < −1.16) ≈
0.135.

3.2.2 [Ex. 10.2, https://online.stat.psu.edu/stat415/lesson/10/10.2]


It is assumed that the mean systolic blood pressure is µ = 120 mm Hg. In the Honolulu Heart Study, a
sample of n = 100 people had an average systolic blood pressure of 130.1 mm Hg with a standard deviation
of 21.21 mm Hg. Test whether the group is significantly different from the regular population. You may
assume t0.025,99 = 1.9842.
Solution: Let µ be the population mean systolic blood pressure, assumed to be µ = 120 mm Hg.
The data is of a given sample of n = 100 people with an average systolic blood pressure of x̄ = 130.1 mm
Hg with a standard deviation of s = 21.21 mm Hg.
The null hypothesis is H0 : µ = 120 mm Hg, and the (two-tailed) alternative hypothesis is Ha : µ ̸= 120
mm Hg.
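The test statistic for a one-sample t-test is given by:

\[ t = \frac{\bar{x} - \mu_0}{s/\sqrt{n}} \]

where µ0 = 120 mm Hg is the hypothesized mean. Substituting the given values, we get:

\[ t = \frac{130.1 - 120}{21.21/\sqrt{100}} \approx 4.76 \]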
The degrees of freedom for the t-distribution is v = n − 1 = 100 − 1 = 99.
Using a significance level of α = 0.05 (assuming a two-tailed test), the critical t-value, obtained from the
t-table or calculator, is t0.025,99 = 1.9842.
We therefore reject the null hypothesis, as our t-value 4.76 exceeds 1.9842 and hence belongs to the critical region.

4 Two Samples - Tests on means (unknown equal variances)


4.1 Overview
The experimental setting in this section is that of having drawn two random samples, of sizes n1 and n2 ,
respectively, from two normal populations with means µ1 and µ2 , respectively, and unknown but equal
variances σ 2 . The procedure/situation outlined here is referred to as "Two sample pooled t-Test". We
denote the (conjectural) difference of the population means as d0 .
1. Null hypothesis H0 : µ1 − µ2 = d0 .
2. We choose one of the following alternate hypotheses.
(a) H1 : µ1 − µ2 ̸= d0
(b) H1 : µ1 − µ2 < d0
(c) H1 : µ1 − µ2 > d0
3. Calculate the sample means x̄1 , x̄2 , the sample standard deviations s1 , s2 , and the test statistic

\[ t = \frac{\bar{x}_1 - \bar{x}_2 - d_0}{s_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}} \]

wherein \( s_p \) denotes the pooled standard deviation and equals

\[ s_p = \sqrt{\frac{s_1^2 (n_1 - 1) + s_2^2 (n_2 - 1)}{n_1 + n_2 - 2}}. \]

4. In this final step we choose the critical region as dictated by the choice of alternate hypothesis in the
second step. The t-values \( t_{\cdot,(n_1+n_2-2)} \) below are those corresponding to (n1 + n2 − 2)
degrees of freedom.
(a) Reject H0 if t < −tα/2,(n1 +n2 −2) or t > tα/2,(n1 +n2 −2)
(b) Reject H0 if t < −tα,(n1 +n2 −2)
(c) Reject H0 if t > tα,(n1 +n2 −2)
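The pooled statistic is easy to get wrong by hand, so a short Python sketch of step 3 may help. The function name `pooled_t` is an assumption; the numbers in the usage lines are those of Example 4.2.1 below, and the critical value t0.05,20 = 1.725 is taken from the table as in the notes.

```python
def pooled_t(x1, s1, n1, x2, s2, n2, d0=0.0):
    """Pooled two-sample t statistic under the equal-variance assumption.

    Returns (pooled standard deviation, t statistic, degrees of freedom).
    """
    df = n1 + n2 - 2
    sp = (((n1 - 1) * s1 ** 2 + (n2 - 1) * s2 ** 2) / df) ** 0.5
    t = (x1 - x2 - d0) / (sp * (1 / n1 + 1 / n2) ** 0.5)
    return sp, t, df

# Abrasive-wear data: H0: mu1 - mu2 = 2 against H1: mu1 - mu2 > 2.
sp, t, df = pooled_t(85, 4, 12, 81, 5, 10, d0=2)
print(round(sp, 3), round(t, 2), df)
print(t > 1.725)   # False: do not reject H0
```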

4.2 Examples
4.2.1 [Wal12, Example 10.6, p. 344]
An experiment was performed to compare the abrasive wear of two different laminated materials. Twelve
pieces of material 1 were tested by exposing each piece to a machine measuring wear. Ten pieces of material
2 were similarly tested. In each case, the depth of wear was observed. The samples of material 1 gave
an average (coded) wear of 85 units with a sample standard deviation of 4, while the samples of material
2 gave an average of 81 with a sample standard deviation of 5. Can we conclude at the 0.05 level of
significance that the abrasive wear of material 1 exceeds that of material 2 by more than 2 units? Assume
the populations to be approximately normal with equal variances. You may assume t0.05,20 = 1.725.
Solution: Let µ1 and µ2 represent the population means of the abrasive wear for material 1 and material
2, respectively.
1. H0 : µ1 − µ2 = 2.
2. H1 : µ1 − µ2 > 2.
3. α = 0.05.
4. Critical region: t > 1.725, where

\[ t = \frac{(\bar{x}_1 - \bar{x}_2) - d_0}{s_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}} \]

with v = 20 degrees of freedom.
5. Computations:

x̄1 = 85, s1 = 4, n1 = 12,

x̄2 = 81, s2 = 5, n2 = 10.

Hence

\[ s_p = \sqrt{\frac{(11)(16) + (9)(25)}{12 + 10 - 2}} = 4.478, \]

\[ t = \frac{(85 - 81) - 2}{4.478 \sqrt{\frac{1}{12} + \frac{1}{10}}} = 1.04. \]

6. Decision: Do not reject H0 . We are unable to conclude that the abrasive wear of material 1 exceeds
that of material 2 by more than 2 units.

4.2.2 [Ex. 11.1, https://online.stat.psu.edu/stat415/lesson/11/11.1]


A psychologist wished to check whether the mean fastest speed driven by male college students was different
than the mean fastest speed driven by female college students.
She conducted a survey of n1 = 34 random male college students and n2 = 29 random female college
students. The mean and standard deviation observed for the males was x̄1 = 105.5 mph and s1 = 20.1,
respectively. The mean and standard deviation observed for the females was x̄2 = 90.9 mph and s2 = 12.2,
respectively.
Assuming the populations to be normal with equal variances is there sufficient evidence at the α = 0.05
level to conclude that the mean fastest speed driven by male college students differs from the mean fastest
speed driven by female college students? You may assume t0.025,61 = 1.9996.
Solution:

1. H0 : µ1 = µ2 .
2. H1 : µ1 ̸= µ2 .
3. α = 0.05.
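4. Critical region: t < −1.9996 or t > 1.9996 (here \( t_{0.025,v} = 1.9996 \)), where

\[ t = \frac{(\bar{x}_1 - \bar{x}_2) - d_0}{s_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}} \]

with v = 34 + 29 − 2 = 61 degrees of freedom and d0 = 0.
5. Computations:

x̄1 = 105.5, s1 = 20.1, n1 = 34,

x̄2 = 90.9, s2 = 12.2, n2 = 29.

Hence

\[ s_p = \sqrt{\frac{(33)(20.1^2) + (28)(12.2^2)}{61}} = 16.9, \]

\[ t = \frac{(105.5 - 90.9) - 0}{16.9 \sqrt{\frac{1}{34} + \frac{1}{29}}} = 3.42. \]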

6. Decision: We reject H0 because the test statistic t = 3.42 falls in the rejection region.

5 Single Sample - Test for variance


5.1 Overview
In this section, as preparation for the next section, we briefly go through a test for variance for a single
sample. The assumption is that we have a normal population with unknown mean µ and a conjectural
variance σ02 . As usual, we’re provided with a random sample of size n from this population. We use the
familiar χ2 -distribution to test for the authenticity of our conjectural variance with significance level α.
1. Null hypothesis H0 : σ 2 = σ02
2. We choose one of the following alternate hypotheses.
(a) H1 : σ 2 ̸= σ02
(b) H1 : σ 2 < σ02
(c) H1 : σ 2 > σ02
3. Calculate the required sample variance s2 and the test statistic \( \chi^2 = \frac{(n-1)s^2}{\sigma_0^2} \).

4. In this final step we choose the critical region as dictated by the choice of alternate hypothesis in the
second step. The χ2 values below are those corresponding to (n − 1) degrees of freedom.

(a) Reject H0 if \( \chi^2 < \chi^2_{1-\alpha/2} \) or \( \chi^2 > \chi^2_{\alpha/2} \)
(b) Reject H0 if \( \chi^2 < \chi^2_{1-\alpha} \)
(c) Reject H0 if \( \chi^2 > \chi^2_{\alpha} \)
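The statistic and the three decision rules can be sketched in a few lines of Python. The function name is an assumption, and the χ2 critical values are table values quoted in the examples of Section 5.2, since the Python standard library has no χ2 quantile function.

```python
def chi_square_statistic(s, n, sigma0):
    """Test statistic (n - 1) s^2 / sigma0^2, with n - 1 degrees of freedom."""
    return (n - 1) * s ** 2 / sigma0 ** 2

# Battery example: s = 1.2, n = 10, sigma0 = 0.9, upper-tailed with chi2_{0.05,9} = 16.919.
battery = chi_square_statistic(1.2, 10, 0.9)
print(round(battery, 1), battery > 16.919)          # statistic and case (c) decision

# Helmet example: s = 48.5, n = 40, sigma0 = 40.
helmets = chi_square_statistic(48.5, 40, 40)
one_tailed = helmets > 54.572                        # case (c), chi2_{0.05,39}
two_tailed = helmets < 23.654 or helmets > 58.120    # case (a), chi2_{0.975,39} and chi2_{0.025,39}
print(round(helmets, 3), one_tailed, two_tailed)
```

The helmet data show that the one-tailed and two-tailed rules can disagree at the same α, because the two-tailed rule splits α between both tails.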

5.2 Examples
5.2.1 [Wal12, Example 10.12, p. 366]
A manufacturer of car batteries claims that the life of the company’s batteries is approximately normally
distributed with a standard deviation equal to 0.9 year. If a random sample of 10 of these batteries has a
standard deviation of 1.2 years, do you think that σ > 0.9 year? Use a 0.05 level of significance. You may
assume χ20.05,9 = 16.919.
Solution:
1. H0 : σ 2 = 0.81.
2. H1 : σ 2 > 0.81.
3. α = 0.05.
4. Critical region: We know that the null hypothesis is rejected when χ2 > 16.919, where \( \chi^2 = \frac{(n-1)s^2}{\sigma_0^2} \),
with v = 9 degrees of freedom.
5. Computations: s2 = 1.44, n = 10, and \( \chi^2 = \frac{(9)(1.44)}{0.81} = 16.0 \).
6. Decision: The χ2 -statistic is not significant at the 0.05 level. However, based on the P -value (≈ 0.07)
there is evidence that σ > 0.9.

5.2.2 [Ex. 12.1, https://online.stat.psu.edu/stat415/lesson/12/12.1]


A manufacturer of hard safety hats for construction workers is concerned about the mean and the variation
of the forces its helmets transmit to wearers when subjected to an external force. The manufacturer has
designed the helmets so that the mean force transmitted by the helmets to the workers is 800 pounds (or
less) with a standard deviation of less than 40 pounds. Assume that the population is normal. Tests were run
on a random sample of n = 40 helmets, and the sample mean and sample standard deviation were found
to be 825 pounds and 48.5 pounds, respectively. You may assume \( \chi^2_{0.05,39} = 54.572 \), \( \chi^2_{0.025,39} = 58.120 \) and
\( \chi^2_{0.975,39} = 23.654 \).
(a) Do the data provide sufficient evidence, at the α = 0.05 level, to conclude that the population
standard deviation exceeds 40 pounds?
(b) Do the data provide sufficient evidence, at the α = 0.05 level, to conclude that the population
standard deviation differs from 40 pounds?
Solution:

1. H0 : σ 2 = 402 = 1600
2. The two parts of the question lead to the following distinct alternate hypotheses.
(a) Ha : σ 2 > 1600
(b) Hb : σ 2 ̸= 1600
3. The test statistic value remains the same for both cases.
\[ \chi^2 = \frac{(40 - 1)(48.5^2)}{40^2} = 57.336 \]

4. The degrees of freedom being 40 − 1 = 39, the one-tailed case has as critical value \( \chi^2_{0.05,39} = 54.572 \),
and the two-tailed case has the lower and upper critical values \( \chi^2_{0.975,39} = 23.654 \) and \( \chi^2_{0.025,39} = 58.120 \).
5. As 23.654 < 54.572 < 57.336 < 58.120, we are led to reject H0 in the one-tailed case but
not in the two-tailed case.
As a side remark, we see that the above example illustrates that the conclusion for the one-sided test
does not always agree with the conclusion for the two-sided test. If we have reason to believe that the
parameter will differ from the null value in a particular direction, then we may conduct the one-sided test.

6 Exercises
6.1 [Wal12, Ex. 10.20, p. 356]
A random sample of 64 bags of white cheddar popcorn weighed, on average, 5.23 ounces with a standard
deviation of 0.24 ounce. Test the hypothesis that µ = 5.5 ounces against the alternative hypothesis,
µ < 5.5 ounces, at the 0.05 level of significance. You may assume t0.05,63 = 1.669.
(t = −9, Reject µ = 5.5.)

6.2 [Wal12, Ex. 10.21, p. 356]


An electrical firm manufactures light bulbs that have a lifetime that is approximately normally distributed
with a mean of 800 hours and a standard deviation of 40 hours. Test the hypothesis, at significance level
α = 0.05, that µ = 800 hours against the alternative, µ ̸= 800 hours, if a random sample of 30 bulbs has
an average life of 788 hours. You may assume z0.025 = 1.96.
(z = −1.64, Not enough evidence to reject µ = 800.)

6.3 [Wal12, Ex. 10.24, p. 356]


The average height of females in the freshman class of a certain college has historically been 162.5 centimeters
with a standard deviation of 6.9 centimeters. Is there reason to believe, at significance level α = 0.05,
that there has been a change in the average height if a random sample of 50 females in the present freshman
class has an average height of 165.2 centimeters? Assume the standard deviation remains the same.
You may assume z0.025 = 1.96.
(z = 2.766, Reject µ = 162.5.)

6.4 [Wal12, Ex. 10.26, p. 356]


According to a dietary study, high sodium intake may be related to ulcers, stomach cancer, and migraine
headaches. The human requirement for salt is only 220 milligrams per day, which is surpassed in most
single servings of ready-to-eat cereals. If a random sample of 20 similar servings of a certain cereal has
a mean sodium content of 244 milligrams and a standard deviation of 24.5 milligrams, does this suggest
at the 0.05 level of significance that the average sodium content for a single serving of such cereal is
greater than 220 milligrams? Assume the distribution of sodium contents to be normal. You may assume
t0.05,19 = 1.729.
(t = 4.38, enough available evidence to reject that average sodium for that cereal equals 220 mg.)

6.5 [Wal12, Ex. 10.28, p. 357]


According to Chemical Engineering, an important property of fiber is its water absorbency. The average
percent absorbency of 25 randomly selected pieces of cotton fiber was found to be 20 with a standard
deviation of 1.5. A random sample of 25 pieces of acetate yielded an average percent of 12 with a standard
deviation of 1.25. Is there strong evidence that the population mean percent absorbency is significantly
higher for cotton fiber than for acetate? Assume that the percent absorbency is approximately normally
distributed and that the population variances in percent absorbency for the two fibers are the same. Use
a significance level of 0.05. You may assume t0.05,48 = 1.677.
(t = 20.485, enough evidence to reject equality of average absorbency of the two materials.)

6.6 [Wal12, Ex. 10.35 , p. 357]


To find out whether a new serum will arrest leukemia, 9 mice, all with an advanced stage of the disease,
are selected. Five mice receive the treatment and 4 do not. Survival times, in years, from the time the
experiment commenced are as follows:
Treatment
2.1, 5.3, 1.4, 4.6, 0.9
No Treatment
1.9, 0.5, 2.8, 3.1
At the 0.05 level of significance, can the serum be said to be effective? Assume the two populations to be
normally distributed with equal variances. You may assume t0.05,7 = 1.895.

(t = 0.7, not enough evidence to conclude that the serum is effective.)

6.7 [Wal12, Ex. 10.69, p. 369]


Aflatoxins produced by mold on peanut crops in Virginia must be monitored. A sample of 64 batches of
peanuts reveals levels of 24.17 ppm, on average, with a variance of 4.25 ppm. Test the hypothesis, with
α = 0.05, that σ 2 = 4.2 ppm against the alternative that σ 2 ̸= 4.2 ppm. You may assume \( \chi^2_{0.025,63} = 86.828 \)
and \( \chi^2_{0.975,63} = 42.949 \).
(χ2 = 63.75, not enough evidence to reject σ 2 = 4.2.)

6.8 [Wal12, Ex. 10.68, p. 369]


Past experience indicates that the time required for high school seniors to complete a standardized test is
a normal random variable with a standard deviation of 6 minutes. Test the hypothesis that σ = 6 against
the alternative that σ < 6 if a random sample of the test times of 20 high school seniors has a standard
deviation s = 4.51. Use a 0.05 level of significance. You may assume χ20.95,19 = 10.117.
(χ2 = 10.735, Can’t reject σ = 6.)

References
[Ank24] Ankita C., Dept. of Math, DSU, Introduction to Statistics, 2024.
[Chu01] K.L. Chung, A Course in Probability Theory, Elsevier Science, 2001.
[Nar24a] Narayani G., Dept. of Math, DSU, Introduction to Probability theory, 2024.
[Nar24b] Narayani G., Dept. of Math, DSU, Continuous RVs and Joint Prob. distributions, 2024.
[Nar24c] Narayani G., Dept. of Math, DSU, Random variables and Probability distributions, 2024.
[Pra24] Pratik Mehta, Dept. of Math, DSU, Introduction to Statistics, 2024.
[Wal12] R.E. Walpole, Probability and Statistics for Engineers and Scientists, Prentice Hall, 2012.

Assignment - Unit 4 and Unit 5
1. cf. [Walpole’s text, Ex. 10.40, §10.7, p. 358]
In a study conducted at Virginia Tech, the plasma ascorbic acid levels of pregnant women were compared
for smokers versus nonsmokers. Thirty-two women in the last three months of pregnancy, free of major
health disorders and ranging in age from 15 to 32 years, were selected for the study. Prior to the collection
of 20 ml of blood, the participants were told to avoid breakfast, forgo their vitamin supplements, and
avoid foods high in ascorbic acid content. From the blood samples, the following plasma ascorbic acid
values were determined, in milligrams per 100 milliliters.

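Table 2: Plasma Ascorbic Acid Values

| Nonsmokers | Nonsmokers (continued) | Smokers |
|---|---|---|
| 0.97 | 1.16 | 0.48 |
| 0.72 | 0.86 | 0.71 |
| 1.00 | 0.85 | 0.98 |
| 0.81 | 0.58 | 0.68 |
| 0.62 | 0.57 | 1.18 |
| 1.32 | 0.64 | 1.36 |
| 1.24 | 0.98 | 0.78 |
| 0.99 | 1.09 | 1.64 |
| 0.90 | 0.92 | |
| 0.74 | 0.78 | |
| 0.88 | 1.24 | |
| 0.94 | 1.18 | |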

Is there sufficient evidence, at significance level 0.05, to conclude that there is a difference between plasma
ascorbic acid levels of smokers and nonsmokers? Assume that the two sets of data came from normal
populations with equal variances.

2. [Walpole’s text, Ex. 11.1, §11.3, p. 398]


A study was conducted at Virginia Tech to determine if certain static arm-strength measures have an
influence on the “dynamic lift” characteristics of an individual. Twenty-five individuals were subjected to
strength tests and then were asked to perform a weightlifting test in which weight was dynamically lifted
overhead. The data are given in Table 3.
1. Estimate β0 and β1 for the linear regression curve µY |x = β0 + β1 x.
2. Find a point estimate of µY |x=30 .
3. Determine the residual values in the last column of Table 3.

Table 3: Dynamic Lift and Arm Strength Data
Individual Arm Strength (x) Dynamic Lift (y) Residuals
1 17.3 71.7
2 19.3 48.3
3 19.5 88.3
4 19.7 75.0
5 22.9 91.7
6 23.1 100.0
7 26.4 73.3
8 26.8 65.0
9 27.6 75.0
10 28.1 88.3
11 28.2 68.3
12 28.7 96.7
13 29.0 76.7
14 29.6 78.3
15 29.9 60.0
16 29.9 71.7
17 30.3 85.0
18 31.3 85.0
19 36.0 88.3
20 39.5 100.0
21 40.4 100.0
22 44.3 100.0
23 44.6 91.7
24 50.4 100.0
25 55.9 71.7

Dayananda Sagar University
School of Engineering
Devarakaggalahalli, Harohalli, Ramanagara District, Karnataka-562 112

DEPARTMENT OF MATHEMATICS

Course Title: Probability & Statistics


Module 5 - Statistical Hypothesis Testing – Sample tests 2

Goodness of fit test


In this section, we consider a test to determine if a population has a specified theoretical
distribution. The test is based on how good a fit we have between the frequency of
occurrence of observations in an observed sample and the expected frequencies
obtained from the hypothesized distribution.
Consider an experiment in which there are 𝑘 mutually exclusive possible outcomes
𝐴1 , 𝐴2 , … , 𝐴𝑘 . Let 𝑝𝑖 be the probability that event 𝐴𝑖 will occur at a trial of the
experiment and let 𝑛 trials be made. The number of trials producing outcome 𝐴𝑖 will be
denoted by 𝑜𝑖 (observed frequency) and the expected frequency (expected value) of the
outcome 𝐴𝑖 is given by 𝑒𝑖 = 𝑛𝑝𝑖 . In terms of this notation, the problem is to determine
whether the observed frequencies 𝑜1 , 𝑜2 , … 𝑜𝑘 are compatible with the expected
frequencies 𝑒1 , 𝑒2 , … , 𝑒𝑘 . A goodness-of-fit test between observed and expected
frequencies is based on the quantity
\[ \chi^2 = \sum_{i=1}^{k} \frac{(o_i - e_i)^2}{e_i} \]

where 𝜒 2 is a value of a random variable whose sampling distribution is approximated
very closely by the chi-squared distribution with ν = 𝑘 − 1 degrees of freedom. It is
common practice to refer to each possible outcome of an experiment as a cell. The
symbols 𝑜𝑖 and 𝑒𝑖 represent the observed and expected frequencies, respectively, for the
𝑖𝑡ℎ cell. If the observed frequencies are close to the corresponding expected frequencies,
the 𝜒 2 - value will be small, indicating a good fit. If the observed frequencies differ
considerably from the expected frequencies, the 𝜒 2 - value will be large and the fit is
poor. A good fit leads to the acceptance of 𝐻0 , whereas a poor fit leads to its rejection.
The critical region will, therefore, fall in the right tail of the chi-squared distribution.
For a level of significance equal to α, we find the critical value \( \chi^2_{\alpha} \), and then \( \chi^2 > \chi^2_{\alpha} \) constitutes the
critical region. If the calculated value satisfies \( \chi^2 > \chi^2_{\alpha} \), then reject the null hypothesis 𝐻0 .

Example:
We consider the tossing of a die experiment. We hypothesize that the die is honest,
which is equivalent to testing the hypothesis that the distribution of outcomes is the
discrete uniform distribution with probability 𝑝𝑖 = 1/6, 𝑖 = 1,2, … ,6, i.e.,
• Null hypothesis 𝐻0 : 𝑝1 = 𝑝2 = 𝑝3 = 𝑝4 = 𝑝5 = 𝑝6 = 1/6
• Alternate hypothesis 𝐻1 : at least one 𝑝𝑖 is not as specified in 𝐻0 .
Suppose that the die is tossed 120 times and each outcome is recorded. The results are
given in the table below. Theoretically, if the die is balanced, we would expect each
face to occur 20 times since 𝑒𝑖 = 𝑛𝑝𝑖 = 120 × 1/6 = 20.

By comparing the observed frequencies with the corresponding expected frequencies,
we must decide whether these discrepancies are likely to occur as a result of sampling
fluctuations and the die is balanced or whether the die is not honest and the distribution
of outcomes is not uniform.
• Calculate the test statistic: \( \chi^2 = \sum_{i=1}^{k} \frac{(o_i - e_i)^2}{e_i} \).

Here, k=6.

At α = 0.05 level of significance, the critical value is \( \chi^2_{0.05} = 11.070 \) for ν = k-1 = 5
degrees of freedom.
• Since the computed value 𝜒 2 = 1.7 is less than the critical value, we fail to reject 𝐻0 .
• We conclude that there is insufficient evidence that the die is not
balanced.

Question 1: The following table gives the number of aircraft accidents that occur
during the various days of the week.

| Day | Sunday | Monday | Tuesday | Wednesday | Thursday | Friday | Saturday |
|---|---|---|---|---|---|---|---|
| No. of accidents | 14 | 16 | 8 | 12 | 11 | 9 | 14 |

Use a 0.05 level of significance and test the hypothesis that the accidents are uniformly
distributed over the week. Given that \( \chi^2_{0.05} = 12.592 \) for 6 degrees of freedom.
Solution: This is a problem of testing the hypothesis
• $H_0$: $p_1 = p_2 = p_3 = p_4 = p_5 = p_6 = p_7 = \frac{1}{7}$
for a uniform distribution involving 7 cells (k = 7).
• Alternate hypothesis $H_1$: at least one $p_i$ is not as specified in $H_0$.
• To calculate the expected frequencies:
Under the null hypothesis $H_0$, the expected frequencies are $e_i = n p_i = 84 \cdot \frac{1}{7} = 12$ for i = 1, 2, …, 7, where n = 84 is the total number of accidents.
Day             Sunday  Monday  Tuesday  Wednesday  Thursday  Friday  Saturday
Observed (o_i)  14      16      8        12         11        9       14
Expected (e_i)  12      12      12       12         12        12      12
• $\chi^2 = \sum_{i=1}^{7} \frac{(o_i - e_i)^2}{e_i} = \frac{4 + 16 + 16 + 0 + 1 + 9 + 4}{12} = \frac{50}{12} \approx 4.17$
• Since $\chi^2 = 4.17 < 12.592$, we fail to reject $H_0$ at the 0.05 level of significance, i.e., we do not have enough evidence to conclude that the distribution of observed accidents differs significantly from the expected uniform distribution.
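
The arithmetic of Question 1 can be checked with a few lines of Python. This is only a minimal sketch and not part of the original notes: the function name `chi_square_statistic` is our own, and the critical value is simply the one quoted in the question.

```python
def chi_square_statistic(observed, expected):
    """Return the goodness-of-fit statistic sum((o - e)^2 / e)."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Aircraft-accident counts from Question 1 (Sunday to Saturday).
observed = [14, 16, 8, 12, 11, 9, 14]
n = sum(observed)                                  # total number of accidents
expected = [n / len(observed)] * len(observed)     # equal share per day under H0

chi2 = chi_square_statistic(observed, expected)
critical_value = 12.592        # quoted in the question: alpha = 0.05, 6 d.f.
reject_h0 = chi2 > critical_value

print(round(chi2, 2), reject_h0)
```

The same function applies to any of the problems in this module once the expected frequencies have been worked out.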

Question 2: A sample analysis of examination results of 500 students was made. It was found that 220 students had failed, 170 had secured third class, 90 had secured second class and 20 had secured first class. Do these figures support the general examination result which is in the ratio 4:3:2:1 for the respective categories at α = 0.05 level of significance? Given that $\chi^2_{0.05} = 7.81$ for 3 degrees of freedom.
Solution:
This is a problem of testing the hypothesis
• $H_0$: $p_1 = \frac{4}{4+3+2+1} = \frac{4}{10}$, $p_2 = \frac{3}{10}$, $p_3 = \frac{2}{10}$, $p_4 = \frac{1}{10}$
for a multinomial distribution involving four cells (k = 4), i.e., the general examination result is in the ratio 4:3:2:1.
• $H_1$: at least one $p_i$ is not as specified in $H_0$, i.e., the general examination result is not in the ratio 4:3:2:1.
• To calculate the expected frequencies:
Under the null hypothesis $H_0$, the expected frequencies are $e_i = n p_i$, where n = 500 is the total number of students.
$e_1 = n p_1 = 500 \cdot \frac{4}{10} = 200$; $e_2 = n p_2 = 500 \cdot \frac{3}{10} = 150$;
$e_3 = n p_3 = 500 \cdot \frac{2}{10} = 100$; $e_4 = n p_4 = 500 \cdot \frac{1}{10} = 50$.

Categories      Fail  Third class  Second class  First class
Observed (o_i)  220   170          90            20
Expected (e_i)  200   150          100           50

• $\chi^2 = \sum_{i=1}^{4} \frac{(o_i - e_i)^2}{e_i} = 23.67$.
• Since $\chi^2 = 23.67 > 7.81$, we reject $H_0$ at the 0.05 level of significance. We can conclude that the ratio is not 4:3:2:1.
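
When the null hypothesis is stated as a ratio, the expected frequencies follow mechanically from the ratio and the sample size. The sketch below illustrates this with the data of Question 2; the helper name `expected_from_ratio` is an assumption of ours, not something used in the notes.

```python
def expected_from_ratio(ratio, n):
    """Split a total of n observations in the given ratio (e.g. 4:3:2:1)."""
    total = sum(ratio)
    return [n * r / total for r in ratio]


# Question 2: fail, third class, second class, first class.
observed = [220, 170, 90, 20]
expected = expected_from_ratio([4, 3, 2, 1], sum(observed))

# Goodness-of-fit statistic: sum of (o - e)^2 / e over the four cells.
chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

print(expected, round(chi2, 2))
```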

Question 3: A survey of 320 families with 5 children each revealed the following distribution:
No. of boys      5   4   3    2   1   0
No. of girls     0   1   2    3   4   5
No. of families  14  56  110  88  40  12
Assuming that the probabilities of a male child and a female child are equal, test the hypothesis that the number of boys in families with exactly 5 children follows a binomial distribution. Use a 0.05 level of significance. Given that $\chi^2_{0.05} = 11.07$ for 5 degrees of freedom.
Solution:
• Let X be the number of boys in a family. Then, by the binomial distribution,
$P(x) = \binom{n}{x} p^x q^{n-x} = \binom{5}{x} \left(\frac{1}{2}\right)^x \left(\frac{1}{2}\right)^{5-x} = \frac{1}{32} \binom{5}{x}$, where n = 5 is the total number of children in a family.
• $H_0$: $p_1 = P(X = 5) = \frac{1}{32}$; $p_2 = P(X = 4) = \frac{5}{32}$; $p_3 = P(X = 3) = \frac{5}{16}$; $p_4 = P(X = 2) = \frac{5}{16}$; $p_5 = P(X = 1) = \frac{5}{32}$; $p_6 = P(X = 0) = \frac{1}{32}$
• $H_1$: at least one $p_i$ is not as specified in $H_0$.
• To calculate the expected frequencies: $e_i = N p_i$, where N = 320 is the total number of families.
$e_1 = 320 \cdot \frac{1}{32} = 10$; $e_2 = 320 \cdot \frac{5}{32} = 50$; $e_3 = 320 \cdot \frac{5}{16} = 100$;
$e_4 = 320 \cdot \frac{5}{16} = 100$; $e_5 = 320 \cdot \frac{5}{32} = 50$; $e_6 = 320 \cdot \frac{1}{32} = 10$.

Category        5 boys &  4 boys &  3 boys &  2 boys &  1 boy &  0 boys &
of families     0 girls   1 girl    2 girls   3 girls   4 girls  5 girls
Observed (o_i)  14        56        110       88        40       12
Expected (e_i)  10        50        100       100       50       10

• $\chi^2 = \sum_{i=1}^{6} \frac{(o_i - e_i)^2}{e_i} = 7.16$
• Since $\chi^2 = 7.16 < 11.07$, we fail to reject $H_0$ at the 0.05 level of significance, i.e., the data are consistent with the binomial distribution with p = 1/2.
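
For a binomial null hypothesis the cell probabilities come from the binomial formula, so the expected frequencies of Question 3 can be generated rather than listed by hand. The following is a rough sketch under the assumptions of the question (n = 5 children, p = 1/2); the function name `binomial_expected_counts` is hypothetical. Note that the list is indexed by the number of boys from 0 to 5, which is the reverse of the order used in the table.

```python
from math import comb


def binomial_expected_counts(n_trials, p, n_families):
    """Expected number of families with x = 0, 1, ..., n_trials boys."""
    return [
        n_families * comb(n_trials, x) * p ** x * (1 - p) ** (n_trials - x)
        for x in range(n_trials + 1)
    ]


# Observed families with 0, 1, 2, 3, 4, 5 boys (table read right to left).
observed = [12, 40, 88, 110, 56, 14]
expected = binomial_expected_counts(5, 0.5, sum(observed))

chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
print(expected, round(chi2, 2))
```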

Question 4: A machine is supposed to mix peanuts, hazelnuts, cashews, and almonds in the ratio 5:2:2:1. A can containing 500 of these mixed nuts was found to have 269 peanuts, 112 hazelnuts, 74 cashews, and 45 almonds. At the 0.05 level of significance, test the hypothesis that the machine is mixing the nuts in the ratio 5:2:2:1. Given that $\chi^2_{0.05} = 7.81$ for 3 degrees of freedom.
Solution: This is a problem of testing the hypothesis
• $H_0$: $p_1 = \frac{5}{10}$, $p_2 = \frac{2}{10}$, $p_3 = \frac{2}{10}$, $p_4 = \frac{1}{10}$
for a multinomial distribution involving four cells (k = 4), i.e., the machine is mixing the nuts in the ratio 5:2:2:1.
• $H_1$: at least one $p_i$ is not as specified in $H_0$, i.e., the machine is not mixing the nuts in the ratio 5:2:2:1.
• To calculate the expected frequencies:
Under the null hypothesis $H_0$, the expected frequencies are $e_i = n p_i$, where n = 500 is the total number of nuts.
$e_1 = n p_1 = 500 \cdot \frac{5}{10} = 250$; $e_2 = n p_2 = 500 \cdot \frac{2}{10} = 100$; $e_3 = n p_3 = 500 \cdot \frac{2}{10} = 100$; $e_4 = n p_4 = 500 \cdot \frac{1}{10} = 50$.

Categories      Peanuts  Hazelnuts  Cashews  Almonds
Observed (o_i)  269      112        74       45
Expected (e_i)  250      100        100      50

• $\chi^2 = \sum_{i=1}^{4} \frac{(o_i - e_i)^2}{e_i} = 10.144$
• Since $\chi^2 = 10.144 > 7.81$, we reject $H_0$ at the 0.05 level of significance. We can conclude that the ratio is not 5:2:2:1.

Practice Problems:
1) The grades in a statistics course for a particular semester were as follows:
Grade      A   B   C   D   F
Frequency  14  18  32  20  16
Test the hypothesis, at the 0.01 level of significance, that the distribution of grades is uniform. Given that $\chi^2_{0.01} = 13.277$ for 4 degrees of freedom.
2) A die is tossed 180 times with the following results:
Face       1   2   3   4   5   6
Frequency  28  36  36  30  27  23
Is this a balanced die? Use a 0.01 level of significance. $\chi^2_{0.01} = 15.086$ for 5 degrees of freedom.
3) In experiments on the breeding of flowers of a certain species, a researcher obtained 120 magenta flowers with a green stigma, 48 magenta flowers with a red stigma, 36 red flowers with a green stigma and 13 red flowers with a red stigma. Theory predicts that flowers of these types should be obtained in the ratios 9:3:3:1. Are these experimental results compatible with theory? Given that $\chi^2_{0.05} = 7.815$ for 3 degrees of freedom.
4) A car manufacturer expects the orders placed for different colours of their SUV car model to be distributed as follows:
White  Black  Silver  Other
28%    25%    16%     31%
A random sample of 140 orders revealed the following:
White  Black  Silver  Other
39     29     24      48
Test at α = 0.05 to determine if the observed colours differ significantly from the manufacturer’s expectation. Given that $\chi^2_{0.05} = 7.815$ for 3 degrees of freedom.
Unit 5 - Simple Linear Regression - 4th Sem, B.Tech. 2024
Department of Mathematics, School of Engineering, Dayananda Sagar University

Contents
1 Introduction
2 Simple Linear Regression
  2.1 The "best fitting" line
  2.2 The Regression model
  2.3 Notation
3 The method of least squares
4 Examples
5 Practice Questions
References

1 Introduction
Linear regression is a foundational statistical technique used for modeling and analyzing the relationships
between a dependent variable and one or more independent variables. It provides a method for predicting
the value of a dependent variable based on the values of independent variables and helps in understanding
the strength and nature of these relationships. We aim to develop here a first introduction to simple linear
regression, its assumptions and basic applications in exercises as laid out in [Wal12, §11.1 - §11.3]. The
notes here are intended to be a continuation of the previously explored parts [Nar24a], [Nar24c], [Nar24b],
[Pra24a], [Pra24b] and [Nar24d] of the course.

Simple Linear Regression: Simple linear regression models the relationship between a single indepen-
dent variable X and a dependent variable Y . We discuss more details for this in the next section. Briefly
though, the model is expressed as follows.

Y = β0 + β1 x + ϵ

• Y is the dependent variable (or Response). It is a random variable.

• x is a value of the independent variable (often called Regressor/Predictor).
• β0 is called the intercept.
• β1 is called the slope.
• ϵ is the notation for the error term (denoted here using the Greek letter "epsilon"). It is a random variable which depends on x, i.e., given a value of x, say x = 4, we have a random variable ϵ4 .
Example 1.1. Consider predicting the sales revenue (Y ) based on advertising expenditure (X). Here,
the goal is to determine how changes in advertising spend influence sales.

Multiple Linear Regression: Multiple linear regression extends the simple linear regression model by
incorporating multiple independent variables. The model is expressed as below.

Y = β0 + β1 x1 + β2 x2 + · · · + βp xp + ϵ

• Y is the dependent variable.


• x1 , x2 , . . . , xp are values of the independent variables.
• β0 is the intercept.
• β1 , β2 , . . . , βp are the slopes corresponding to each independent variable.
• ϵ is the error term.
Example 1.2. Consider predicting house prices (Y ) based on factors such as square footage (X1 ), number
of bedrooms (X2 ), and age of the house (X3 ). Here, the model aims to capture the combined influence of
these factors on house prices.
In summary, while simple linear regression provides a straightforward method for analyzing the relationship
between two variables, multiple linear regression offers a more comprehensive approach by considering the
simultaneous effect of several predictors. This makes multiple linear regression particularly powerful for
real-world applications where outcomes are influenced by multiple factors.

2 Simple Linear Regression


In this section we discuss a commonly preferred statistical model/scenario that is referred to as a "Simple
linear regression model". The first subsection here motivates this discussion while the next two form the
technical heart of these notes.

2.1 The "best fitting" line


We start with a finite set of data points of two variables {(x1 , y1 ), (x2 , y2 ), ...} that comprises our set of
observed data. The intention in simple linear regression is to find a line that "best" fits the data points.

In the context of linear regression, the term "best fit" can have several interpretations depending on the criteria used to measure the goodness of fit. What follows are a few commonly considered ones, wherein yi denotes the ith observed value, ŷi denotes the ith predicted value and the "best fit" line is ŷ = b0 + b1 x, so that, in particular, ŷi = b0 + b1 xi . We thus only discuss the values of b0 and b1 in order to describe the possible interpretations of a "best" fitting line below.
• Least Squares Criterion: The most common interpretation is based on the method of least
squares. Here, the "best fit" line is the one that minimizes the sum of the squared differences
between the observed values yi and the predicted values yˆi .
Find $b_0, b_1 \in \mathbb{R}$ such that the sum $\sum_{i=1}^{n} (y_i - (b_0 + b_1 x_i))^2$ is minimum.

• Least Absolute Deviations: One may just as well choose to minimize the sum of the absolute
differences between the observed values and the predicted values.
$\sum_{i=1}^{n} |y_i - (b_0 + b_1 x_i)|$

This method is more robust to outliers compared to the least squares criterion.
• Maximum Likelihood Estimation (MLE): This approach coincides with the least squares cri-
terion for the assumptions that we shall work with.
• R-Squared (Coefficient of Determination): The "best fit" can also be interpreted in terms of
R2 , which measures the proportion of variance in the dependent variable that is predictable from
the independent variable(s). A higher R2 value indicates a better fit.
$R^2 = 1 - \frac{\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}{\sum_{i=1}^{n} (y_i - \bar{y})^2}$.

• Akaike Information Criterion (AIC) and Bayesian Information Criterion (BIC): These
criteria are used to compare different models. The "best fit" model is the one with the lowest AIC
or BIC value, which balances model fit with model complexity.

2.2 The Regression model


It is indeed simpler to state the "least sum of squares" computations and simply, shallowly and intimi-
datingly declare the resulting line as the "best fit". However, it is far more illuminating to describe our
desired interpretation of "best fit" in a roundabout manner. We set up a precise meaning of what it means
to say that a set of data points {(x1 , y1 ), (x2 , y2 ), ...} (follows or) is expected to follow a "Simple Linear Regression Model" and later show^1 that the "least sum of squares" is indeed an "unbiased" procedure to uncover^2 the true parameters of the regression model if the data were expected to follow one.
We accomplish this in a series of equivalent but progressively precise statements that culminate in our
desired definition of a "Simple linear regression model" as follows.
Mod 2.2.1. The data points {(x1 , y1 ), (x2 , y2 ), ..., (xn , yn )} are, in a statistical relationship study, the
observed y−values for the x−values x1 , x2 , ..., xn that ought to fall on a line.
Mod 2.2.2. We have random variables Y1 , Y2 ,..., Yn for the corresponding x1 , x2 , ..., xn and the data points
{(x1 , y1 ), (x2 , y2 ), ..., (xn , yn )} are obtained from the observed Yi −values for the x−values x1 , x2 , ..., xn and
are expected to all be on a line.
Mod 2.2.3. We have random variables Y1 , Y2 ,..., Yn for the corresponding x1 , x2 , ..., xn such that if
y1 , y2 , ..., yn are their most probable or expected values then the data points {(x1 , y1 ), (x2 , y2 ), ..., (xn , yn )}
should likely be on a line.
Mod 2.2.4. We have random variables Y1 , Y2 ,..., Yn such that if µY1 , µY2 , ..., µYn denote their mean or
expectations then the points {(x1 , µY1 ), (x2 , µY2 ), ..., (xn , µYn )} are all collinear.
We now take a leap of faith, abandon strict mathematical equivalence, and assume the Yi to be independent and, empowered with the central limit theorem, to have normal distributions with equal variances. With these normality and independence assumptions in place, the method of least sum of squares is an "unbiased" procedure to uncover the line (i.e., β0 and β1 ).
We also denote µYi with µY |xi . Also, we denote by β0 the y−intercept and by β1 the slope of the line
formed by the points {(x1 , µY |x1 ), (x2 , µY |x2 ), ..., (xn , µY |xn )}.
Mod 2.2.5. We have for x1 , x2 , ..., xn the independent collection of random variables Yi ∼ N (β0 +β1 xi , σ 2 ).
Mod 2.2.6. Yi = β0 + β1 xi + ϵi , wherein ϵi are random variables (called error terms), β0 and β1 ∈ R
such that the following hold.
1. The ϵi are independent random variables.
2. ϵi are normally distributed with mean zero and constant variance σ 2 , i.e.,

ϵi ∼ N (0, σ 2 ).

The figures 1, 2 of this section are a visual summary of the model we consider here.
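
To make Mod 2.2.6 concrete, one can simulate data from the model. The following is a minimal sketch, not part of the original notes: the parameter values β0 = 2.0, β1 = 0.5 and σ = 1.0 are hypothetical and chosen only for illustration, and the function name `simulate_responses` is our own.

```python
import random


def simulate_responses(xs, beta0, beta1, sigma, seed=0):
    """Draw one observation y_i = beta0 + beta1 * x_i + eps_i for each x_i,
    with independent error terms eps_i ~ N(0, sigma^2)."""
    rng = random.Random(seed)  # fixed seed so that the run is repeatable
    return [beta0 + beta1 * x + rng.gauss(0.0, sigma) for x in xs]


# Hypothetical true parameters; in practice these are never known.
xs = [1, 2, 3, 4, 5]
ys = simulate_responses(xs, beta0=2.0, beta1=0.5, sigma=1.0)

for x, y in zip(xs, ys):
    # The observed y scatters around the true mean beta0 + beta1 * x.
    print(x, round(y, 3), "true mean:", 2.0 + 0.5 * x)
```

Setting `sigma` to 0 removes the error terms, so the simulated points fall exactly on the true regression line.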
^1 i.e., claim and cite literature.
^2 We can always only estimate the true parameters and never really uncover them just as, for instance, we never know the true mean of a statistical population but we assume it exists and estimate it in an "unbiased" manner using sample means.
2.3 Notation
As we never know the true mean of the Yi , we never know the line intercept and slope β0 and β1 . It therefore makes sense to have the following notations/conventions in addition to those already mentioned in the previous subsection.
1. Estimates for β0 and β1 shall be denoted by b0 and b1 , respectively.
2. The observed Yi −values shall be denoted by yi whereas, using the estimates above, the predicted
Yi −values will be denoted by yˆi (i.e., yˆi = b0 + b1 xi ).
3. The prediction/predicted data, yˆi = b0 + b1 xi , forms a line called the fitted regression line.
4. The difference, yi − yˆi , i.e., observed minus predicted, is called the ith −Residual and denoted ei .
5. The true regression line is the line formed by the points {(x1 , µY |x1 ), (x2 , µY |x2 ), ..., (xn , µY |xn )}
and is denoted µY |x = β0 + β1 x. We never get to know this line as we need at least two points on it
for the equation and for that we need at least two of the means µY |xi and µY |xj .

Figure 1: The never known or hypothetical true regression line

Figure 2: The (hypothetical) true errors vs residuals

3 The method of least squares


The method of Least squares involves finding the values of the coefficients b0 and b1 in the linear equation
ŷ = b0 + b1 x that minimize the sum of the squared differences (residuals) between the observed values yi
and the predicted values yˆi .
Now, for the given problem, assume the data set consists of n observations (x1 , y1 ), (x2 , y2 ), ..., (xn , yn ).
The goal is to find the line ŷ = b0 + b1 x that minimizes the sum of squared residuals below.
$SSE = \sum_{i=1}^{n} e_i^2 = \sum_{i=1}^{n} (y_i - \hat{y}_i)^2 = \sum_{i=1}^{n} (y_i - b_0 - b_1 x_i)^2$

We think of this above quantity as a function of b0 and b1 so that setting the corresponding partial
derivatives to zero and rearranging/simplifying the equations we obtain the following desired point of
extremum (See, for instance, [Wal12, p. 396, §11.3]).
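$b_1 = \frac{n \sum_{i=1}^{n} x_i y_i - \left(\sum_{i=1}^{n} x_i\right)\left(\sum_{i=1}^{n} y_i\right)}{n \sum_{i=1}^{n} x_i^2 - \left(\sum_{i=1}^{n} x_i\right)^2} = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n} (x_i - \bar{x})^2}$

$b_0 = \frac{\sum_{i=1}^{n} y_i - b_1 \sum_{i=1}^{n} x_i}{n} = \bar{y} - b_1 \bar{x}$
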
These formulae provide the least squares estimates of the intercept and slope for the (best?) fitted
regression line in simple linear regression. As for unbiasedness, it so happens that the expectations of
the random variables, of which the above bi are point estimates, equal the required βi [Wal12, p. 401,
§11.4].
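
The two formulae translate directly into code. The sketch below is our own illustration (the function name `least_squares_line` is not from the notes) and uses the midterm and final grades of Example 4.1 as input, so the result can be compared with the hand computation there.

```python
def least_squares_line(xs, ys):
    """Return (b0, b1) for the fitted line y_hat = b0 + b1 * x."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equally long samples with at least 2 points")
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xx = sum(x * x for x in xs)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    denominator = n * sum_xx - sum_x ** 2
    if denominator == 0:
        raise ValueError("all x values are equal; the slope is undefined")
    b1 = (n * sum_xy - sum_x * sum_y) / denominator
    b0 = (sum_y - b1 * sum_x) / n
    return b0, b1


# Data of Example 4.1: midterm report (x) and final examination (y).
midterm = [77, 50, 71, 72, 81, 94, 96, 99, 67]
final = [82, 66, 78, 34, 47, 85, 99, 99, 68]

b0, b1 = least_squares_line(midterm, final)
print(round(b0, 4), round(b1, 4))
print(round(b0 + b1 * 85))  # predicted final grade for a midterm grade of 85
```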

4 Examples
Example 4.1. The grades of a class of 9 students on a midterm report (x) and on the final examination
(y) are as follows:
x y
77 82
50 66
71 78
72 34
81 47
94 85
96 99
99 99
67 68
1. Estimate the linear regression line.
2. Estimate the final examination grade of a student who received a grade of 85 on the midterm report.
Solution: 1. From the given data:
$\sum_i x_i = 707$, $\sum_i y_i = 658$, $\sum_i x_i^2 = 57557$, $\sum_i x_i y_i = 53258$

$b_1 = \frac{9(53258) - (707)(658)}{9(57557) - (707)^2} = 0.7771$

$b_0 = \frac{658 - (707)(0.7771)}{9} = 12.0623$
∴ ŷ = 12.0623 + 0.7771x, which is the required linear regression line.

2. For x = 85, ŷ = 12.0623 + (0.7771)(85) ≈ 78
Example 4.2. A mathematics placement test is given to all entering freshmen at a small college. A
student who receives a grade below 35 is denied admission to the regular mathematics course and placed
in a remedial class. The placement test scores and the final grades for 20 students who took the regular
course were recorded.
Placement Test Course grade
50 53
35 41
35 61
40 56
55 68
65 36
35 11
60 70
90 79
35 59
90 54
80 91
60 48
60 71
60 71
40 47
55 53
50 68
65 57
50 79
1. Plot a scatter diagram.
2. Find the equation of the regression line to predict course grades from placement test scores.
3. Graph the line on the scatter diagram.
4. If 60 is the minimum passing grade, below which placement test score should students in the future
be denied admission to this course?

Solution: (1)

Figure 3: Scatterplot of the given data

(2) From the given data:
$\sum_i x_i = 1110$, $\sum_i y_i = 1173$, $\sum_i x_i^2 = 67{,}100$, $\sum_i x_i y_i = 67{,}690$

$b_1 = \frac{20(67{,}690) - (1110)(1173)}{20(67{,}100) - (1110)^2} = 0.4711$

$b_0 = \frac{1173 - (1110)(0.4711)}{20} = 32.5059$
∴ ŷ = 32.5059 + 0.4711x, which is the required linear regression line.

(3)

Figure 4: The regression line on the scatter plot

(4) When ŷ = 60, we get 60 = 32.5059 + 0.4711x, which gives x = (60 − 32.5059)/0.4711 ≈ 58.36. Therefore, a student who scores below 59 would be denied admission in future.
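
Part (4) inverts the fitted line: instead of predicting ŷ from x, we solve ŷ = b0 + b1 x for x. A small sketch of that step, using the coefficients quoted in part (2); the function name `x_for_target` is our own.

```python
import math


def x_for_target(b0, b1, y_target):
    """Solve y_target = b0 + b1 * x for x (requires a non-zero slope)."""
    if b1 == 0:
        raise ValueError("a horizontal line cannot be inverted")
    return (y_target - b0) / b1


cutoff = x_for_target(32.5059, 0.4711, 60)
# Placement scores are whole numbers, so round the cut-off up.
minimum_score = math.ceil(cutoff)
print(round(cutoff, 2), minimum_score)
```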
Example 4.3. In a certain type of metal test specimen, the normal stress on a specimen is known to be
functionally related to the shear resistance. The following is a set of coded experimental data on the two
variables:
Normal Stress, x Shear Resistance, y
26.8 26.5
25.4 27.3
28.9 24.2
23.6 27.1
27.7 23.6
23.9 25.9
24.7 26.3
28.1 22.5
26.9 21.7
27.4 21.4
22.6 25.8
25.6 24.9
1. Estimate the regression line µY |x = β0 + β1 x
2. Estimate the shear resistance for a normal stress of 24.5.

Solution: 1. From the given data:
$\sum_i x_i = 311.6$, $\sum_i y_i = 297.2$, $\sum_i x_i^2 = 8134.26$, $\sum_i x_i y_i = 7687.76$

which gives
b1 = −0.6861 and b0 = 42.582

∴ ŷ = 42.582 − 0.6861x, which is the required linear regression line.

2. For x = 24.5, ŷ = 42.582 − (0.6861)(24.5) = 25.772
Example 4.4. A professor in the School of Business in a university polled a dozen colleagues about the
number of professional meetings they attended in the past five years (x) and the number of papers they
submitted to refereed journals (y) during the same period. The summary data are given as follows:

n = 12, x̄ = 4, ȳ = 12, $\sum_i x_i^2 = 232$, $\sum_i x_i y_i = 318$

Fit a simple linear regression model between x and y by finding out the estimates of intercept and slope.
Comment on whether attending more professional meetings would result in publishing more papers.
Solution: From the given data:
n = 12, x̄ = 4, ȳ = 12
which gives
$\sum_i x_i = n\bar{x} = 12(4) = 48$ and $\sum_i y_i = n\bar{y} = 12(12) = 144$

Then,
$b_1 = \frac{12(318) - (12)(4)(12)(12)}{12(232) - ((12)(4))^2} = \frac{-3096}{480} = -6.45$

$b_0 = \frac{12(12) - (-6.45)\,12(4)}{12} = 37.8$
∴ ŷ = 37.8 − 6.45x, which is the required linear regression line. Since the estimated slope is negative, it appears that attending more professional meetings would not result in publishing more papers.
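
When only summary statistics are available, as in Example 4.4, the estimates can be computed from n, x̄, ȳ, Σx² and Σxy alone. The sketch below divides the numerator and denominator of the b1 formula by n, which is algebraically equivalent; the function name `line_from_summary` is an assumption of ours.

```python
def line_from_summary(n, x_bar, y_bar, sum_xx, sum_xy):
    """Least squares estimates (b0, b1) from summary statistics only."""
    s_xy = sum_xy - n * x_bar * y_bar   # sum of (x - x_bar)(y - y_bar)
    s_xx = sum_xx - n * x_bar ** 2      # sum of (x - x_bar)^2
    if s_xx == 0:
        raise ValueError("no spread in x; the slope is undefined")
    b1 = s_xy / s_xx
    b0 = y_bar - b1 * x_bar
    return b0, b1


# Summary data of Example 4.4.
b0, b1 = line_from_summary(n=12, x_bar=4, y_bar=12, sum_xx=232, sum_xy=318)
print(round(b0, 2), round(b1, 2))
print("slope is negative:", b1 < 0)
```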

5 Practice Questions
1. A study of the amount of rainfall and the quantity of air pollution removed produced the following
data:
Daily rainfall, x (0.01 cm)    Particulate removed, y (µg/m³)
4.3 126
4.5 121
5.9 116
5.6 118
6.1 114
5.2 118
3.8 132
2.1 141
7.5 108
(a) Find the equation of the regression line to predict the particulate removed from the amount of
daily rainfall.
(b) Estimate the amount of particulate removed when the daily rainfall is x = 4.8 units.
[Ans: ŷ = 153.175 − 6.324x; for x = 4.8, ŷ ≈ 123 µg/m³]
2. In the regression model ŷ = b0 + b1 x, x̄ = 2.50, ȳ = 5.50 and b0 = 1.50 (where x̄ and ȳ denote the means of the variables x and y). Find the value of the slope of the regression line.
[Ans: 1.60]
3. A study was done to study the effect of ambient temperature x on the electric power consumed by a
chemical plant y. Other factors were held constant, and the data were collected from an experimental
pilot plant.

y (BTU)    x (°F)
250 27
285 45
320 72
295 58
265 31
298 60
267 34
321 74
(a) Estimate the slope and intercept in a simple linear regression model.
(b) Predict power consumption for an ambient temperature of 65 °F.
[Ans: ŷ = 218.26 + 1.3839x; For x = 65, ŷ = 308.21]
4. The following data were collected to determine the relationship between pressure and the correspond-
ing scale reading for the purpose of calibration.
Pressure, x (lb/sq in.) Scale reading, y
10 13
10 18
10 16
10 15
10 20
50 90
50 88
50 88
50 92
50 86
(a) Find the equation of the regression line.
(b) The purpose of calibration in this application is to estimate pressure from an observed scale reading. Estimate the pressure for a scale reading of 54 using $\hat{x} = \frac{54 - b_0}{b_1}$.

[Ans: ŷ = −1.70 + 1.81x; x̂ ≈ 30.77]

References
[Nar24a] Narayani G., Dept. of Math, DSU, Introduction to Probability theory, 2024.
[Nar24b] Narayani G., Dept. of Math, DSU, Continuous RVs and Joint Prob. distributions, 2024.
[Nar24c] Narayani G., Dept. of Math, DSU, Random variables and Probability distributions, 2024.
[Nar24d] Narayani G., Dept. of Math, DSU, Chi-squared Goodness-of-fit test, 2024.
[Pra24a] Pratik Mehta, Dept. of Math, DSU, Introduction to Statistics, 2024.
[Pra24b] Pratik Mehta, Dept. of Math, DSU, Statistical Hypothesis Testing, 2024.
[Wal12] R.E. Walpole, Probability and Statistics for Engineers and Scientists, Prentice Hall, 2012.


MODULE 5: SAMPLE TEST 2


Test For Regression Coefficient
Dr. SRIKUMAR

Regression Lines and Regression Coefficients:


We know that the relationship between the covariance Cov(X, Y) and the correlation ρ(X, Y) of two random variables X and Y is given by
$\rho(X, Y) = \frac{\mathrm{Cov}(X, Y)}{\sigma_X \sigma_Y}$ …… (1)

The above expression is also called Karl Pearson’s coefficient of correlation.


Regression: It is the measure of the average relationship between two or more variables in
terms of original units of the data.
Regression Line: It is a graphical technique to show the functional relationship between two
variables ‘x’ and ‘y’ i.e dependent and independent variables.
It is a line which shows the average relationship between the two variables ‘x’ and ‘y’ .
Thus this is a line of average.

The regression equation of ‘x’ on ‘y’ is x = a + by …… (2) (or) x = ay + b, where in eqn. (2) ‘x’ is the dependent variable and ‘y’ is the independent variable. It shows the variation in the values of ‘x’ for given changes in ‘y’.
Similarly, the regression equation of ‘y’ on ‘x’ is y = a + bx …… (3) (or) y = ax + b, where in eqn. (3) ‘y’ is the dependent variable and ‘x’ is the independent variable. It shows the variation in the values of ‘y’ for given changes in ‘x’.
The point of intersection of the regression lines x = a + by and y = a + bx gives the means of ‘x’ and ‘y’, denoted by $\bar{x}$ and $\bar{y}$ respectively.

The objective of these regression lines is to fit the data on the lines. For this we need to estimate the unknown parameters ‘a’ and ‘b’ (by the method of least squares, by establishing the normal equations for the regression lines).
Regression Coefficient ($b_{yx}$ and $b_{xy}$)

It refers to the slope of the regression lines.

The slope of the regression line of ‘y’ on ‘x’, i.e., y = a + bx, is ‘b’. It is called the regression coefficient of ‘y’ on ‘x’, is denoted by $b_{yx}$, and is given by $b_{yx} = \rho \frac{\sigma_y}{\sigma_x}$ …… (4)

Similarly, the slope of the regression line of ‘x’ on ‘y’, i.e., x = a + by, is ‘b’. It is called the regression coefficient of ‘x’ on ‘y’, is denoted by $b_{xy}$, and is given by $b_{xy} = \rho \frac{\sigma_x}{\sigma_y}$ …… (5)
where $\sigma_x$ and $\sigma_y$ are the S.D. of ‘x’ and the S.D. of ‘y’ respectively.

The relationship between ρ, $b_{yx}$ and $b_{xy}$ is given by $\rho = \pm\sqrt{b_{yx} \cdot b_{xy}}$ …… (6), where the sign is the common sign of $b_{yx}$ and $b_{xy}$.

Note: 1) The value of ρ lies in [−1, 1], so that $b_{yx} \cdot b_{xy} \le 1$.

2) ρ, $b_{yx}$ and $b_{xy}$ must all have the same sign.

3) If one of the regression coefficients is greater than 1 in magnitude, then the other must be less than 1 in magnitude (both can be less than 1).
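
Relations (4), (5) and (6) can be checked numerically. The sketch below is our own illustration with made-up inputs (ρ = 0.8, σx = 14, σy = 20, and a second case with a negative ρ); the function names are hypothetical.

```python
import math


def regression_coefficients(rho, sigma_x, sigma_y):
    """Return (b_yx, b_xy) from relations (4) and (5)."""
    b_yx = rho * sigma_y / sigma_x
    b_xy = rho * sigma_x / sigma_y
    return b_yx, b_xy


def correlation_from_coefficients(b_yx, b_xy):
    """Recover rho from relation (6); rho takes the sign of the coefficients."""
    if b_yx * b_xy < 0:
        raise ValueError("the two regression coefficients must share a sign")
    return math.copysign(math.sqrt(b_yx * b_xy), b_yx)


b_yx, b_xy = regression_coefficients(0.8, 14, 20)
print(round(b_yx, 6), round(b_xy, 2))
print(round(correlation_from_coefficients(b_yx, b_xy), 6))

# With a negative correlation both coefficients are negative as well.
print(correlation_from_coefficients(*regression_coefficients(-0.5, 2, 4)))
```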


Worked Example Problems:


1) In a partially destroyed laboratory record relating to correlation data, the following results are legible: $\sigma_x^2 = 9$ and the regression equations 8x − 10y + 66 = 0; 40x − 18y = 214. Find (i) which line is ‘x’ on ‘y’ and which is ‘y’ on ‘x’ (ii) the means of ‘x’ and ‘y’ (iii) the coefficient of correlation between ‘x’ and ‘y’ (iv) the S.D. of ‘y’.

Solution: By data, $\sigma_x^2 = 9$ ∴ $\sigma_x = 3$ (an S.D. is non-negative) …… (1)

By data,
8x − 10y + 66 = 0 …… (2) & 40x − 18y = 214 …… (3)
Suppose first that eqn. (2) is the line of ‘x’ on ‘y’ and eqn. (3) is the line of ‘y’ on ‘x’:
8x = 10y − 66 ⇒ x = 1.25y − 8.25 (x = a + by)
18y = 40x − 214 ⇒ y = 2.22x − 11.89 (y = a + bx)
Here, $b_{xy} = 1.25 > 1$ & $b_{yx} = 2.22 > 1$, so that $b_{yx} \cdot b_{xy} > 1$ (against the rule), so this choice is ruled out.

Taking eqn. (2) and eqn. (3) the other way round,
10y = 8x + 66 ⇒ y = 0.8x + 6.6 (y = a + bx)
40x = 18y + 214 ⇒ x = 0.45y + 5.35 (x = a + by)
Here, $b_{yx} = 0.8 < 1$ …… (4) & $b_{xy} = 0.45 < 1$ …… (5)

(i) Thus the regression line of ‘y’ on ‘x’ is 10y = 8x + 66 …… (6)
and that of ‘x’ on ‘y’ is 40x = 18y + 214 …… (7)
(ii) Means of ‘x’ and ‘y’:
Solving eqn. (2) and eqn. (3) (a system of two linear equations in the two unknowns ‘x’ and ‘y’, using a calculator) we get x = 13 and y = 17.
Thus, the means of ‘x’ and ‘y’ are $\bar{x} = 13$ and $\bar{y} = 17$.
(iii) Coefficient of correlation between ‘x’ and ‘y’:
We know that $\rho = \sqrt{b_{yx} \cdot b_{xy}}$ (positive here, since both coefficients are positive). Using eqns. (4) & (5) we get
$\rho = \sqrt{(0.8)(0.45)} = 0.6 \in [-1, 1]$ …… (8)

(iv) S.D. of ‘y’:
We know that $b_{yx} = \rho \frac{\sigma_y}{\sigma_x}$ ⇒ $\sigma_y = \frac{b_{yx} \cdot \sigma_x}{\rho} = \frac{(0.8)(3)}{0.6}$
∴ $\sigma_y = 4$ …… (9)
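
The reasoning of this worked example (try both assignments of the two lines, keep the one with $b_{yx} \cdot b_{xy} \le 1$, then intersect the lines to get the means) can be sketched in code. This is an illustration under our own conventions, not the method prescribed in the notes: each line is stored as a hypothetical triple (a, b, c) meaning ax + by = c, and all function names are ours.

```python
import math

# Each line is stored as (a, b, c), meaning a*x + b*y = c.
line_2 = (8, -10, -66)    # 8x - 10y + 66 = 0
line_3 = (40, -18, 214)   # 40x - 18y = 214


def slope_y_on_x(line):
    a, b, _ = line
    return -a / b          # from y = (-a/b) x + c/b


def slope_x_on_y(line):
    a, b, _ = line
    return -b / a          # from x = (-b/a) y + c/a


def identify_lines(first, second):
    """Return (y_on_x_line, x_on_y_line) with 0 <= b_yx * b_xy <= 1."""
    for y_on_x, x_on_y in ((first, second), (second, first)):
        product = slope_y_on_x(y_on_x) * slope_x_on_y(x_on_y)
        if 0 <= product <= 1:
            return y_on_x, x_on_y
    raise ValueError("no assignment gives a valid correlation coefficient")


def intersection(first, second):
    """Solve the two linear equations by Cramer's rule."""
    a1, b1, c1 = first
    a2, b2, c2 = second
    det = a1 * b2 - a2 * b1
    return (c1 * b2 - c2 * b1) / det, (a1 * c2 - a2 * c1) / det


y_on_x, x_on_y = identify_lines(line_2, line_3)
b_yx = slope_y_on_x(y_on_x)
b_xy = slope_x_on_y(x_on_y)
x_bar, y_bar = intersection(line_2, line_3)
rho = math.copysign(math.sqrt(b_yx * b_xy), b_yx)
sigma_x = 3.0                       # from sigma_x^2 = 9
sigma_y = b_yx * sigma_x / rho

print(b_yx, b_xy, x_bar, y_bar, round(rho, 4), round(sigma_y, 4))
```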


2) For 50 students of a class, the regression equation of marks in statistics (X) on marks in mathematics (Y) is 3Y − 5X + 180 = 0. The mean mark in mathematics is 44 and the variance of marks in statistics is (9/16)th of the variance of marks in mathematics. Find the mean marks in statistics and the coefficient of correlation between the marks in the two subjects.
Solution: By data, it is given that
the regression equation for the line of X on Y is 3Y − 5X + 180 = 0 …… (1)

$\bar{Y} = 44$ …… (2)
$\sigma_X^2 = \frac{9}{16} \sigma_Y^2$ ∴ $\sigma_X = \frac{3}{4} \sigma_Y$ ⇒ $\frac{\sigma_X}{\sigma_Y} = \frac{3}{4}$ …… (3)

(or) $\frac{\sigma_Y}{\sigma_X} = \frac{4}{3}$ …… (4)

To find: $\bar{X}$ and ρ.
From eqn. (1): 5X = 3Y + 180 ⇒ X = 0.6Y + 36 (X = a + bY)
$b_{xy} = 0.6$, the regression coefficient of X on Y …… (5)
We know that $b_{xy} = \rho \frac{\sigma_X}{\sigma_Y}$ ∴ $\rho = b_{xy} \cdot \frac{\sigma_Y}{\sigma_X} = 0.6 \cdot \frac{4}{3}$
∴ ρ = 0.8 …… (6)

As the line of regression given by eqn. (1) passes through the mean point $(\bar{X}, \bar{Y})$, we have

$3\bar{Y} - 5\bar{X} + 180 = 0$ ⇒ $3(44) - 5\bar{X} + 180 = 0$

∴ $\bar{X} = 62.4$

3) Consider the following information about the series X and Y. The coefficient of correlation between X and Y is 0.80. Find the most probable value of X if Y is 90 and the value of Y if X is 70.

        Series X  Series Y
Mean    18        100
S.D.    14        20

Solution: Given ρ(X, Y) = 0.80, $\bar{X} = 18$, $\sigma_X = 14$, $\bar{Y} = 100$, $\sigma_Y = 20$ …… (1)

We know that $b_{xy} = \rho \frac{\sigma_X}{\sigma_Y} = 0.80 \cdot \frac{14}{20}$ ∴ $b_{xy} = 0.56 < 1$ …… (2)
$b_{yx} = \rho \frac{\sigma_Y}{\sigma_X} = 0.80 \cdot \frac{20}{14}$ ∴ $b_{yx} = 1.142857 > 1$ …… (3)

The regression line of Y on X is $y = a + b_{yx} x$ (or) $Y - \bar{Y} = b_{yx} (X - \bar{X})$. Using eqn. (1),

Y − 100 = 1.142857 (X − 18); given X = 70, ∴ Y = 159.42857 …… (4)

The regression line of X on Y is $x = a + b_{xy} y$ (or) $X - \bar{X} = b_{xy} (Y - \bar{Y})$. Using eqn. (1),

X − 18 = 0.56 (Y − 100); given Y = 90, ∴ X = 12.4 …… (5)
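
The two predictions of this example can be reproduced with a short sketch. It simply evaluates the point-slope forms of the two regression lines; the function names `predict_y` and `predict_x` are our own and not part of the notes.

```python
def predict_y(x, mean_x, mean_y, rho, sigma_x, sigma_y):
    """Regression of Y on X: Y - mean_y = b_yx * (X - mean_x)."""
    b_yx = rho * sigma_y / sigma_x
    return mean_y + b_yx * (x - mean_x)


def predict_x(y, mean_x, mean_y, rho, sigma_x, sigma_y):
    """Regression of X on Y: X - mean_x = b_xy * (Y - mean_y)."""
    b_xy = rho * sigma_x / sigma_y
    return mean_x + b_xy * (y - mean_y)


# Data of worked example 3.
summary = dict(mean_x=18, mean_y=100, rho=0.80, sigma_x=14, sigma_y=20)

y_at_70 = predict_y(70, **summary)
x_at_90 = predict_x(90, **summary)
print(round(y_at_70, 5), round(x_at_90, 5))
```

Note that the two lines are different: predicting X from Y uses $b_{xy}$, not the reciprocal of $b_{yx}$.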


Practice Problems:
1) Given that the means of X and Y are 65 & 67 and their S.Ds are 2.5 & 3.5 respectively and the correlation coefficient between them is 0.80.
(a) Write down the regression lines.
(ans: $b_{xy} = 0.57142 < 1$, $b_{yx} = 1.12 > 1$, Y = 1.12X − 5.8, X = 0.57142Y + 26.714)
(b) Obtain the estimate of X when Y = 70. (ans: X = 66.71426)
(c) Using the estimated value of X as the given value of X, estimate the corresponding value of Y. (ans: Y = 68.91997)
2) Find $\sigma_y$ and ρ from the following data: 4y = 3x; 3x = y; $\sigma_x = 2$. Also find the means of x and y.
(ans: $b_{xy} = \frac{1}{3} < 1$, $b_{yx} = \frac{3}{4} < 1$, $\rho = \frac{1}{2}$, $\sigma_y = 3$, $\bar{x} = 0 = \bar{y}$)

3) The following data on sales and advertisement expenditure of a firm are given in the table. The coefficient of correlation between them is 0.90.

        Sales (Crores)  Advertisement Expenditure (Crores)
Mean    40              6
S.D.    10              1.5

(a) Estimate the likely sales from a proposed advertisement expenditure of ₹ 10 crores.
(b) What should be the advertisement expenditure if the firm proposes a sales target of ₹ 60 crores?
(ans: $b_{xy} = 6 > 1$, $b_{yx} = 0.135 < 1$ (a) ₹ 64 crores (b) ₹ 8.7 crores)

4) Given that the regression equations for ‘y’ on ‘x’ and ‘x’ on ‘y’ are respectively y = x and 4x − y = 3, find the correlation coefficient between ‘x’ and ‘y’.
(ans: $b_{xy} = 0.25 < 1$, $b_{yx} = 1$, $\rho = 0.5 \in [-1, 1]$)

5) From the following data 𝑥 = 0.845𝑦 , 𝑦 = 0.89𝑥 . Calculate (a) correlation coefficient between ‘x’ and ‘y’ (b) S.D of ‘y’
(ans: 𝑏𝑥𝑦 = 0.854 < 1, 𝑏𝑦𝑥 = 0.89 < 1, 𝜌 = 0.87 ∈ [−1, 1], 𝜎𝑦 = 3.06 )


Course Title: Probability and Statistics

MODULE 1: PROBABILITY

Practice Problems:
1) There are 10 students of which 3 are graduates. If a committee of five is to be formed,
what is the probability that there are (i) only 2 graduates (ii) at least 2 graduates.
Ans: (i) 5/12 (ii) ½
2) From 6 positive and 8 negative numbers, 4 numbers are chosen at random and
multiplied. What is the probability that the product is a positive number?
Ans: 505/1001 ≈ 0.5045
3) The probability that a person A solves the problem is 1/3, that of B is 1/2 and that of C
is 3/5. If the problem is simultaneously assigned to all of them what is the probability
that the problem is solved?
Ans: 13/15
4) A shooter can hit a target in 3 out of 5 shots and another shooter can hit the target in 2
out of 4 shots. Find the probability that
(a) the target is being hit when both of them try
(b) the target is hit by only one shooter.
Ans: (a) 4/5 (or) 0.8 (b) ½ (or) 0.5
5) Three students A, B, C write an entrance examination. Their chances of passing are 1/2,
1/3, and 1/4 respectively. Find the probability that
(a) at least one of them passes
(b) all of them pass
(c) at least two of them pass.
Ans: (a) 3/4 or 0.75 (b) 1/24 (c) 7/24

6) Two cards are drawn in succession from a pack of 52 cards. Find the probability that the
first is king and the second is queen if the first card is (a) replaced (b) not replaced.
7) Two dice are thrown. Find the probability of (a) getting an odd number on one die and a
multiple of 3 on the other (b) one of the dice showing 3 and the sum on the two dice being 9
(c) the sum on the two dice being 9 (d) the sum on the two dice being 13.
8) A student takes an examination with four subjects: Maths, Physics, Chemistry and English.
She estimates her chances of passing in Maths as 4/5, in Physics as 3/4, in Chemistry as
5/6 and in English as 2/3. To qualify, she must pass in Maths and at least two other
subjects. What is the probability that she qualifies?
9) 5 balls are drawn at random from a bag of 6 white and 4 black balls. What is the chance
that 3 of them are white and 2 are black?
10) A bag contains 2 white and 4 red balls while another bag contains 5 white and 7 red balls.
A ball is drawn at random from the first bag and put in the second bag. Then a ball is
drawn at random from the second bag. What is the probability that it is a white ball?
11) Two coins are flipped. Assume that all four points in the sample space are equally likely.
Let E1 denote the event that the first lands on heads and E2 denote the event that the
second lands on tails. Determine if E1 and E2 are independent.
12) At Karnataka Middle School, the probability that a student takes Technology and Science
is 0.087. The probability that a student takes Technology is 0.68. What is the probability
that a student takes Science given that the student is taking Technology?
13) In a school, 25% of the students failed in first language, 15% of the students failed in
second language and 10% of the students failed in both. If a student is selected at random
find the probability that
(i) He failed in first language if he had failed in second language
(ii) He failed in second language if he had failed in the first language
(iii) He failed in either of the two languages.
14) The probability that a new car battery functions for more than 10,000 miles is 0.8, the
probability that it functions for more than 20,000 miles is 0.4 and the probability that it
functions for more than 30,000 miles is 0.1. If a new car battery is still working after
10,000 miles, what is the probability that
(i) its total life will exceed 20,000 miles?
(ii) its additional life will exceed 20,000 miles?
15) Suppose that an urn contains 8 red balls and 4 white balls. We draw 2 balls from the urn
without replacement.
(a) If we assume that at each draw, each ball in the urn is equally likely to be chosen, what
is the probability that both balls drawn are red?
(b) Now suppose that the balls have different weights, with each red ball having weight r and each
white ball having weight w. Suppose that the probability that a given ball in the urn is
the next one selected is its weight divided by the sum of the weights of all balls currently
in the urn. Now what is the probability that both balls are red?
16) The probability that the regularly scheduled flight departs on time is 0.83, arrives on
time is 0.82 and departs and arrives on time is 0.78. Find the probability that the plane
(a) arrives on time given that it departed on time (Ans: 0.94)
(b) departed on time given that it arrived on time (Ans: 0.95)
17) The probability that an automobile being filled with gasoline also needs an oil change is
0.25, the probability that it needs a new oil filter is 0.40 and the probability that it needs
both an oil change and a new oil filter is 0.14.
(a) If the oil has to be changed, what is the probability that a new oil filter is needed?
(ans: 0.14/0.25 = 0.56)
(b) If a new oil filter is needed, what is the probability that the oil has to be changed?
(ans: 0.14/0.40 = 0.35)
18) The probability that a married man watches a certain television show is 0.40 and that of
a married woman is 0.5. The probability that the man watches the show given that his
wife watches is 0.7.
(a) What is the probability that the woman watches the show given that her husband watches
the show? (ans: 0.875, using P(A and B) = 0.35)
(b) What is the probability that at least one member of the married couple will watch the show?
(ans: 0.55)
19) Three major parties A, B, C are contending for power in the elections of a state and the
chance of their winning the election is in the ratio 1: 3: 5. The parties A, B, C respectively
have probabilities of banning the online lottery 2/3, 1/3, 3/5. What is the probability that
there will be a ban on the online lottery in the state? What is the probability that the ban
is from the party C?
20) A laboratory blood test is 95% effective in detecting a certain disease when it is, in fact,
present. However, the test also yields a "false positive" result for 1% of the healthy persons
tested. If 0.5% of the population actually has the disease, what is the probability that a
person has the disease given that the test result is positive?
21) In answering a question on a multiple-choice test, a student either knows the answer or
guesses. Let p be the probability that the student knows the answer and 1 − p be the
probability that the student guesses. Assume that a student who guesses at the answer
will be correct with probability 1/m, where m is the number of multiple-choice
alternatives. What is the conditional probability that a student knew the answer to a
question given that he or she answered it correctly?
22) Suppose a drug test is 99% sensitive and 99% specific. That is, the test will produce 99%
true positive results for drug users and 99% true negative results for non-drug users.
Suppose that 0.5% of people are users of the drug. If a randomly selected individual tests
positive, what is the probability that he is a user?
23) An insurance company believes that people can be divided into two classes: those who are
accident prone and those who are not. The company’s statistics show that an accident-prone
person will have an accident at some time within a fixed 1-year period with
probability 0.4, whereas this probability decreases to 0.2 for a person who is not accident
prone. If we assume that 30 percent of the population is accident prone, what is the
probability that a new policyholder will have an accident within a year of purchasing a
policy? Also, given that a new policyholder has an accident within a year of purchasing a
policy, what is the probability that he or she is accident prone?
24) In a certain day care, 30% of the children have grey eyes, 50% have blue eyes and 20% have
eyes of other colours. One day they play a game together. In the first run, 65% of the grey-eyed
children, 82% of the blue-eyed children and 50% of the children with other eye colours were
selected. Now if a child is selected randomly from the class and we know that the child was
not selected in the first run, what is the probability that the child has blue eyes?
25) Suppose there are coloured balls distributed in three boxes in quantities as given by the
table below. A box is selected at random. From that box a ball is selected at random. How
likely is it that a red ball is drawn? What is the probability that the red ball is from box 3?

26) A paint store chain produces and sells latex and semigloss paints. Based on a long range
sales, the probability that a customer will purchase latex paint is 0.75. Of those that
purchase latex paint 60% also purchase rollers. But only 30% of semigloss paint buyers
purchase rollers. A randomly selected buyer purchases a roller and a can of paint. What
is the probability that the paint is latex? (ans: 0.525 and 0.8571)
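
Problems 19 to 26 all combine the total probability rule with Bayes' theorem. A minimal Python sketch of that pattern is given below, using exact fractions so the results can be compared with the stated answers. The function name `total_and_posterior` is a hypothetical helper introduced only for this illustration.

```python
from fractions import Fraction


# Hypothetical helper: priors[i] = P(H_i), likelihoods[i] = P(E | H_i).
def total_and_posterior(priors, likelihoods, index):
    """Return P(E) by total probability and P(H_index | E) by Bayes' theorem."""
    joint = [p * l for p, l in zip(priors, likelihoods)]  # P(H_i and E)
    total = sum(joint)                                    # P(E)
    return total, joint[index] / total


# Problem 26: latex vs semigloss buyers, event E = "buys a roller".
p_roller, p_latex_given_roller = total_and_posterior(
    [Fraction(3, 4), Fraction(1, 4)],    # P(latex), P(semigloss)
    [Fraction(3, 5), Fraction(3, 10)],   # P(roller | latex), P(roller | semigloss)
    0,
)

# Problem 20: diseased vs healthy, event E = "test is positive".
p_positive, p_disease_given_positive = total_and_posterior(
    [Fraction(5, 1000), Fraction(995, 1000)],
    [Fraction(95, 100), Fraction(1, 100)],
    0,
)

print(float(p_roller), float(p_latex_given_roller))
print(float(p_positive), float(p_disease_given_positive))
```

Working with `Fraction` avoids rounding until the final conversion to a decimal.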

Department of Mathematics
Course Title: PROBABILITY AND STATISTICS
Practice Sheet
Module 2: Random Variables and their Properties and Probability Distributions

Discrete Random Variables

1. Find the value of k such that the following distribution represents a finite/discrete probability distribution.
Hence find its mean and standard deviation. Also find P{X ≤ 1}, P{X > 1} and P{1 < X ≤ 2}.

2. A random variable X takes the values −3, −2, −1, 0, 1, 2, 3 such that P{X = 0} = P{X < 0} and
P{X = −3} = P{X = −2} = P{X = −1} = P{X = 1} = P{X = 2} = P{X = 3}. Find the probability distribution and cumulative distribution.
3. Determine the value $c$ so that the following function can serve as a probability distribution of the discrete
random variable X: $p(X) = c(X^2 + 4)$, $X = 0, 1, 2, 3$.
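
Finding a normalising constant, as in problem 3, amounts to forcing the probabilities to sum to 1. The short Python sketch below illustrates this for $p(X) = c(X^2 + 4)$; the variable names are chosen only for this example.

```python
from fractions import Fraction

support = [0, 1, 2, 3]

# Unnormalised weights x^2 + 4 for each value in the support.
weights = {x: x * x + 4 for x in support}

# The probabilities must sum to 1, so c is the reciprocal of the total weight.
c = Fraction(1, sum(weights.values()))
pmf = {x: c * w for x, w in weights.items()}

# Mean and variance from the definitions E(X) and E(X^2) - E(X)^2.
mean = sum(x * p for x, p in pmf.items())
variance = sum(x * x * p for x, p in pmf.items()) - mean ** 2

print(c, mean, variance)
```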
Binomial Distribution
1) The probability that a certain kind of component will survive a shock test is ¾. Find the probability
that (a) exactly 2 of the next 4 components tested survive (b) at least 1 of the 4 components
tested survives.
2) The probability that a patient recovers from a rare blood disease is 0.4. If 15 people are known to
have contracted this disease, what is the probability that (a) at most 3 survive (b) at least 3 survive
(c) from 5 to 8 survive, and (d) exactly 5 survive?
3) A lot contains 1% of defective items. What should be the number (n) of items in a random sample
so that the probability of finding at least one defective in it is at least 0.75?
Ans: n=138
4) The number of telephone lines busy at an instant of time is a binomial variate with probability 0.1
that a line is busy. If 10 lines are chosen at random, what is the probability that (a) all lines are busy
(b) no lines are busy (c) at least one is busy (d) at most two lines are busy (e) exactly two are busy.
Ans: (a) $(0.1)^{10}$ (b) 0.3487 (c) 0.6513 (d) 0.9298 (e) 0.1937
5) From a lot of 10 missiles, 4 are selected at random and fired. If the lot contains 3 defective
missiles that will not fire, what is the probability that (a) all 4 will fire? (b) at most 2 will not fire?
6) The probability that a team wins a match is 3/5. If this team plays 3 matches in a
tournament, what is the probability that the team (a) wins all the matches (b) wins at least one
match (c) wins at most one match (d) loses all the matches?
Ans: 27/125, 117/125, 44/125, 8/125

7) The probability that a person aged 60 years will live up to 70 is 0.65. What is the probability
that out of 10 persons aged 60 at least 7 of them will live up to 70.
Ans: 0.5138

8) The mean and variance of the number of correctly answered questions in a test given to
4096 students are 2.5 and 1.875. Find an estimate of the number of candidates answering
correctly (i) 8 or more questions (ii) 2 or fewer questions (iii) exactly 5 questions.
Ans: (i) 2 (ii) 2153 (iii) 239
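
The binomial answers above can be checked numerically. The following is a small Python sketch using only the standard library; `binom_pmf` and `binom_cdf` are illustrative names, not functions from the notes.

```python
from math import comb


def binom_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p)."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(binom_pmf(i, n, p) for i in range(k + 1))


# Problem 4: ten telephone lines, each busy with probability 0.1.
none_busy = binom_pmf(0, 10, 0.1)
at_most_two = binom_cdf(2, 10, 0.1)
exactly_two = binom_pmf(2, 10, 0.1)

# Problem 3: smallest sample size n with P(at least one defective) >= 0.75.
n = 1
while 1 - binom_pmf(0, n, 0.01) < 0.75:
    n += 1

# Problem 8: mean np = 2.5 and variance npq = 1.875 give q = 0.75, p = 0.25, n = 10.
expected_five = 4096 * binom_pmf(5, 10, 0.25)

print(none_busy, at_most_two, exactly_two, n, round(expected_five))
```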
Poisson Distribution
1) X is a Poisson variate such that P{X = 2} = 9P{X = 4} + 90P{X = 6}. Compute the mean and variance.
Ans: mean = variance = 1
2) The probability that a news reader commits no mistake in reading the news is $1/e^3$. Find the
probability that on a particular news broadcast he commits (i) only 3 mistakes (ii) more than 3
mistakes (iii) at most 3 mistakes. Ans: 0.22404, 0.3528, 0.6472
3) The probabilities of a Poisson variate taking the values 3 and 4 are equal. Calculate the
probabilities of the variate taking the values 0 and 1. Ans: $\frac{1}{e^4}$, $\frac{4}{e^4}$
4) A device can handle only 4 telephone calls per minute. If the incoming calls per minute follow a
Poisson distribution with parameter 3, find the probability that the device is overloaded in any
one minute.
5) The number of accidents in a year attributed to taxi drivers in a city follows a Poisson distribution
with mean 2. Out of 500 taxi drivers of that city, find approximately the number of drivers with
(i) no accidents in a year (ii) at least 3 accidents in a year.
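
The Poisson problems above reduce to evaluating $P(X = k) = e^{-\lambda}\lambda^k/k!$ for a few values of $k$. A minimal Python sketch follows, with `poisson_pmf` and `poisson_cdf` as illustrative helper names.

```python
from math import exp, factorial


def poisson_pmf(k, lam):
    """P(X = k) for X ~ Poisson(lam)."""
    return exp(-lam) * lam ** k / factorial(k)


def poisson_cdf(k, lam):
    """P(X <= k) for X ~ Poisson(lam)."""
    return sum(poisson_pmf(i, lam) for i in range(k + 1))


# Problem 2: P(X = 0) = 1/e^3 means lam = 3.
only_three = poisson_pmf(3, 3)
at_most_three = poisson_cdf(3, 3)
more_than_three = 1 - at_most_three

# Problem 4: the device is overloaded when more than 4 calls arrive.
overloaded = 1 - poisson_cdf(4, 3)

print(only_three, more_than_three, at_most_three, overloaded)
```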
CONTINUOUS RANDOM VARIABLE:

1) The probability density function of X, the lifetime of a certain type of electronic device (measured in hours), is given by
$$f(x) = \begin{cases} 0, & x \le 10, \\ \dfrac{10}{x^2}, & x > 10. \end{cases}$$
Find (i) P(X > 20). (ii) What is the cumulative distribution function of X? (iii) What is the probability that of 6 such
devices, at least 3 will function for at least 15 hours?

2) Consider the function
$$f(x) = \begin{cases} C(2x - x^2), & 0 < x < 5/2, \\ 0, & \text{otherwise.} \end{cases}$$
Could f be a probability density function? If so, determine C.

3) The time t (in years) required to complete a software project has probability density function
$$f(t) = \begin{cases} k\,t(1 - t), & 0 \le t \le 1, \\ 0, & \text{otherwise.} \end{cases}$$
Find k and also the probability that the project will be completed in less than 4 months.
4) The shelf life, in days, for bottles of a certain prescribed medicine is a random variable having the density
function
$$f(x) = \begin{cases} \dfrac{20{,}000}{(x + 100)^3}, & x > 0, \\ 0, & \text{elsewhere.} \end{cases}$$
Find the probability that a bottle of this medicine will have a shelf life of (i) at least 200 days (ii) anywhere from
80 to 120 days.

5) The proportion of people who respond to a certain phone call is a continuous random variable X that has the
density function
$$f(x) = \begin{cases} \dfrac{2(x + 2)}{5}, & 0 < x < 1, \\ 0, & \text{otherwise.} \end{cases}$$
(i) Show that P(0 < X < 1) = 1. (ii) Find the probability that more than ¼ but fewer than ½ of the people contacted
will respond to this type of call.

6) Consider the probability density function
$$f(x) = \begin{cases} k\,x, & 0 < x < 1, \\ 0, & \text{otherwise.} \end{cases}$$
(i) Find k. (ii) Find F(x) and use it to evaluate P(0.3 < X < 0.6).
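
For the continuous problems above, a probability is the integral of the density over the interval of interest. The Python sketch below checks problem 5 numerically with composite Simpson's rule; the integrator `simpson` is a generic illustration written for this example, and an exact antiderivative works equally well.

```python
def simpson(f, a, b, n=1000):
    """Approximate the integral of f over [a, b] with composite Simpson's rule."""
    if n % 2:          # Simpson's rule needs an even number of subintervals
        n += 1
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3


def pdf(x):
    """Density of problem 5 on its support 0 < x < 1 (zero elsewhere)."""
    return 2 * (x + 2) / 5


# Integrate only inside the support, where the formula above applies.
total_mass = simpson(pdf, 0.0, 1.0)      # should be 1 for a valid density
p_quarter_to_half = simpson(pdf, 0.25, 0.5)

print(total_mass, p_quarter_to_half)
```

Simpson's rule is exact for polynomials up to degree three, so the linear density here is integrated without discretisation error.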
NORMAL DISTRIBUTION:
1) Given a standard normal distribution, find the area under the curve that lies
(a) To the right of 𝑧 = 1.84 (b) between 𝑧 = −1.97 and 𝑧 = 0.86.
2) Given a random variable X having a normal distribution with μ=50 and σ=10, find the probability that X
assumes a value (a) between 45 and 62 (b) greater than 65.
3) A certain type of storage battery lasts, on average, 3 years with a standard deviation of 0.5 years. Assuming
the battery life is normally distributed, find the probability that a given battery will last (a) less than 2.3 years
(b) more than 2.5 years (c) between 3 and 3.5 years
4) A soft-drink machine is regulated so that it discharges an average of 200 milliliters per cup. If the amount
of drink is normally distributed with a standard deviation equal to 15 milliliters, (a) What fraction of the cups
will contain more than 224 milliliters? (b) What is the probability that a cup contains between 191 and 209
milliliters? (c) How many cups will probably overflow if 230-milliliter cups are used for the next 1000 drinks?
5) In an industrial process, the diameter of a ball bearing is an important measurement. The buyer sets
specifications for the diameter to be 3 ± 0.01 cm. The implication is that no part falling outside these
specifications will be accepted. It is known that in the process the diameter of a ball bearing has a normal
distribution with mean μ = 3 and standard deviation σ = 0.005. On average, how many manufactured ball
bearings will be scrapped? (Express the answer as a percentage.)
6) In an exam, 7% of students score less than 35% marks and 89% of students score less than 60% marks. Find the
mean and standard deviation if the marks are normally distributed. Given P(z > 1.47) = 0.07, P(z < 1.2) = 0.89.
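
The normal-distribution problems above all standardise with $z = (x - \mu)/\sigma$ and then read an area from the table. As a rough cross-check, the Python sketch below computes the same areas from the error function; `normal_cdf` is an illustrative name.

```python
from math import erf, sqrt


def normal_cdf(x, mu=0.0, sigma=1.0):
    """P(X <= x) for X ~ Normal(mu, sigma^2), via the error function."""
    z = (x - mu) / sigma
    return 0.5 * (1 + erf(z / sqrt(2)))


# Problem 1(a): area to the right of z = 1.84.
right_of_184 = 1 - normal_cdf(1.84)

# Problem 2: mu = 50, sigma = 10.
between_45_and_62 = normal_cdf(62, 50, 10) - normal_cdf(45, 50, 10)
greater_than_65 = 1 - normal_cdf(65, 50, 10)

# Problem 5: fraction of ball bearings outside 3 +/- 0.01 cm.
scrapped = 1 - (normal_cdf(3.01, 3, 0.005) - normal_cdf(2.99, 3, 0.005))

print(right_of_184, between_45_and_62, greater_than_65, scrapped)
```

Values obtained this way may differ from table-based answers in the fourth decimal place because tables round z to two decimals.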
EXPONENTIAL DISTRIBUTION:

1) The number of years a radio functions is exponentially distributed with parameter α = 1/8.
If a person buys a radio, what is the probability that it will be working after 8 years?
2) The length of time for one individual to be served at a cafeteria is a random variable
having an exponential distribution with a mean of 4 minutes. What is the probability
that a person is served in less than 3 minutes on at least 4 of the next 6 days?
3) The lifetime (in years) of a certain type of electrical switch has an exponential
distribution with an average life 2. If 10 of these switches are installed in different
systems, what is the probability that at most 3 fail during the first year?
4) The time (in hours) required to repair a machine is an exponentially distributed random
variable with parameter α = 1/2. What is the probability that a repair time exceeds 2
hours? What is the probability that a repair takes at least 10 hours?
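
For an exponential random variable with parameter α (mean 1/α), the survival probability is $P(X > x) = e^{-\alpha x}$. A minimal Python sketch of the calculations used in the problems above is given below; the function names are assumptions made for this illustration.

```python
from math import exp


def expon_survival(x, rate):
    """P(X > x) for an exponential variable with the given rate (mean = 1/rate)."""
    return exp(-rate * x) if x > 0 else 1.0


def expon_cdf(x, rate):
    """P(X <= x) for the same variable."""
    return 1.0 - expon_survival(x, rate)


# Problem 1: rate 1/8 per year, radio still working after 8 years.
radio_after_8 = expon_survival(8, 1 / 8)

# Problem 2 (single day): mean service time 4 minutes, served in under 3 minutes.
served_under_3 = expon_cdf(3, 1 / 4)

# Problem 4 (first part): rate 1/2 per hour, repair time exceeds 2 hours.
repair_over_2 = expon_survival(2, 1 / 2)

print(radio_after_8, served_under_3, repair_over_2)
```

Problems 2 and 3 then treat the single-trial probability as the success probability of a binomial distribution over the 6 days or 10 switches.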
JOINT PROBABILITY DISTRIBUTION:

1) The joint probability distribution of two discrete random variables X and Y is given by
$p(x, y) = k(2x + 3y)$, where x and y are integers such that 0 ≤ x ≤ 2, 0 ≤ y ≤ 3.

(a) Find the value of the constant k. (Hint: X = {0, 1, 2}, Y = {0, 1, 2, 3}, $\sum_{i=1}^{m} \sum_{j=1}^{n} J_{ij} = 1$)
(b) Find the marginal probability distributions of X and Y.
(c) Show that X and Y are dependent random variables. (Hint: show $f(x_i)\, g(y_j) \ne J_{ij}$)
(d) Find $E(X)$, $E(Y)$, $E(XY)$, $E(X^2)$, $E(Y^2)$
(e) Find $\sigma_X$ and $\sigma_Y$
2) Two pens are selected at random from a box that contains 3 blue pens, 2 red pens, and 3 green pens. If X is
the number of blue pens selected and Y is the number of red pens selected,
(a) Find the joint probability distribution of X and Y.
(b) Find the marginal distributions of X and Y.
(c) E(X), E(Y), E(XY)
(d) Find σX and σY
3) Given the following joint distribution of two random variables X and Y, find the corresponding marginal
distribution. Also compute the covariance and the correlation of the random variables X and Y.

X \ Y     1       3       9
2         1/8     1/24    1/12
4         1/4     1/4     0
6         1/8     1/24    1/12
4) The joint probability distribution of two discrete random variables X and Y is given by

X \ Y     -2      -1      4       5
1         0.1     0.2     0       0.3
2         0.2     0.1     0.1     0

Determine the marginal probability distributions of X and Y. Also, compute (a) $E(X)$ and $E(Y)$ (b) $E(XY)$ (c) $\sigma_X$ and $\sigma_Y$
(d) $\mathrm{cov}(X, Y)$ (e) $\rho(X, Y)$ (f) Verify that X and Y are dependent random variables.
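
The quantities asked for in the joint-distribution problems (marginals, expectations, covariance and correlation) follow mechanically from the joint table. The Python sketch below works through the table of problem 4; the dictionary layout and variable names are choices made only for this illustration.

```python
from math import sqrt

# Joint table of problem 4: keys are (x, y), values are P(X = x, Y = y).
joint = {
    (1, -2): 0.1, (1, -1): 0.2, (1, 4): 0.0, (1, 5): 0.3,
    (2, -2): 0.2, (2, -1): 0.1, (2, 4): 0.1, (2, 5): 0.0,
}

# Marginal distributions: sum the joint probabilities over the other variable.
marg_x, marg_y = {}, {}
for (x, y), p in joint.items():
    marg_x[x] = marg_x.get(x, 0.0) + p
    marg_y[y] = marg_y.get(y, 0.0) + p

e_x = sum(x * p for x, p in marg_x.items())
e_y = sum(y * p for y, p in marg_y.items())
e_xy = sum(x * y * p for (x, y), p in joint.items())

var_x = sum(x * x * p for x, p in marg_x.items()) - e_x ** 2
var_y = sum(y * y * p for y, p in marg_y.items()) - e_y ** 2

cov_xy = e_xy - e_x * e_y
rho = cov_xy / sqrt(var_x * var_y)

# X and Y are independent only if every cell equals the product of its marginals.
independent = all(
    abs(p - marg_x[x] * marg_y[y]) < 1e-12 for (x, y), p in joint.items()
)

print(e_x, e_y, e_xy, cov_xy, rho, independent)
```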
School Of Engineering,
Dayananda Sagar University,
4th Sem
Module 3 : Estimation and testing of
hypothesis

1 Practice Questions
Ques 1. A random sample of n = 50 males showed a mean average daily intake of dairy
products equal to 756 grams with a standard deviation of 35 grams. Find 95% and
99% confidence intervals for the population average µ.

(Hint: For the 95% confidence interval, the estimate is

$756 - 1.96\,\frac{35}{\sqrt{50}} < \mu < 756 + 1.96\,\frac{35}{\sqrt{50}}$

For the 99% confidence interval, the estimate is

$756 - 2.58\,\frac{35}{\sqrt{50}} < \mu < 756 + 2.58\,\frac{35}{\sqrt{50}}$ )

Ques 2. An electrical firm manufactures light bulbs that have a length of life with mean
µ and a standard deviation of 40 hours. If a sample of 100 bulbs has an average life of
780 hours, find a 95% confidence interval for the population mean of all bulbs produced
by this firm.

(Hint: For the 95% confidence interval, the estimate is

$780 - 1.96\,\frac{40}{\sqrt{100}} < \mu < 780 + 1.96\,\frac{40}{\sqrt{100}}$ )

Ques 3. A researcher took a sample of 30 students' test scores with an average score of
85 and a standard deviation of 5. What is the 95% confidence interval for the mean test score?

(Hint: For the 95% confidence interval, the estimate is

$85 - 1.96\,\frac{5}{\sqrt{30}} < \mu < 85 + 1.96\,\frac{5}{\sqrt{30}}$ )

Ques 4. A study measures the heights of 50 people, finding an average height of 170
cm with a standard deviation of 10 cm. What is the 99% confidence interval for the
population's mean height?

(Hint: For the 99% confidence interval, the estimate is

$170 - 2.58\,\frac{10}{\sqrt{50}} < \mu < 170 + 2.58\,\frac{10}{\sqrt{50}}$ )
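
Ques 1 to 4 all apply the large-sample interval $\bar{x} \pm z_{\alpha/2}\, s/\sqrt{n}$. A minimal Python sketch of that computation follows; the function name `z_interval` is hypothetical, and the critical values 1.96 and 2.58 are the ones used in the hints.

```python
from math import sqrt


def z_interval(sample_mean, sd, n, z):
    """Large-sample confidence interval: mean +/- z * sd / sqrt(n)."""
    margin = z * sd / sqrt(n)
    return sample_mean - margin, sample_mean + margin


# Ques 1: n = 50, mean 756 g, standard deviation 35 g.
ci_95 = z_interval(756, 35, 50, 1.96)
ci_99 = z_interval(756, 35, 50, 2.58)

# Ques 2: n = 100, mean 780 h, standard deviation 40 h.
ci_bulbs = z_interval(780, 40, 100, 1.96)

print(ci_95, ci_99, ci_bulbs)
```

The 99% interval is wider than the 95% interval for the same data, since a higher confidence level requires a larger critical value.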

Ques 5. A random sample of 16 Americans yielded the following data on the number
of pounds of beef consumed per year: 118, 115, 125, 110, 112, 130, 117, 112, 115, 120,
113, 118, 119, 122, 123, 126. Estimate, with a 95% confidence interval, the average number of
pounds of beef consumed each year per person in the United States. (Given, $t_{0.025} = 2.1314$
for 15 degrees of freedom.)

(Ans. (115.42, 121.46))

Ques 6. A random sample of 20 students yielded the following data on the number of
books read in a month: 5, 7, 4, 6, 8, 9, 5, 6, 7, 8, 6, 5, 7, 6, 8, 4, 5, 6, 7, 8. Estimate, with a
95% confidence interval, the average number of books read per month per student.
(Given, $t_{0.025} = 2.093$ for 19 degrees of freedom.)

(Ans. (5.68, 7.02))

Ques 7. A random sample of 15 employees yielded the following data on the number
of hours worked overtime in a month: 10, 15, 12, 8, 20, 17, 14, 11, 13, 18, 9, 16, 19, 10, 15.
Estimate the average number of hours worked overtime per month per employee. (Given,
$t_{0.025} = 2.1448$ for 14 degrees of freedom.)

(Ans. (11.72, 15.88))

Ques 8. A survey was conducted to estimate the average amount of time spent commuting
to work per day. A random sample of 18 participants yielded the following data on
commute times (in minutes): 25, 30, 35, 20, 40, 45, 30, 35, 30, 25, 20, 35, 40, 45, 30, 35, 40, 25.
Estimate the average commute time per day. (Given, $t_{0.025} = 2.1098$ for 17 degrees of
freedom.)

(Ans. (28.66, 36.34))
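
Ques 5 to 8 use the small-sample interval $\bar{x} \pm t_{\alpha/2}\, s/\sqrt{n}$, where $s$ is the sample standard deviation with divisor $n - 1$. The Python sketch below carries out the computation for the data of Ques 5 with the standard library; `t_interval` is an illustrative name, and the critical value is passed in from the table rather than computed.

```python
from math import sqrt
from statistics import mean, stdev


def t_interval(data, t_crit):
    """Small-sample confidence interval for the mean, given a tabulated t value."""
    n = len(data)
    x_bar = mean(data)
    s = stdev(data)                 # sample standard deviation (divisor n - 1)
    margin = t_crit * s / sqrt(n)
    return x_bar - margin, x_bar + margin


# Ques 5: beef consumption of 16 people, t_0.025 = 2.1314 for 15 degrees of freedom.
beef = [118, 115, 125, 110, 112, 130, 117, 112,
        115, 120, 113, 118, 119, 122, 123, 126]
low, high = t_interval(beef, 2.1314)

print(f"mean = {mean(beef):.4f}, interval = ({low:.2f}, {high:.2f})")
```

The same function applies to Ques 6 to 8 once the data list and the tabulated t value for $n - 1$ degrees of freedom are swapped in.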

Ques 9. A coffee shop sells bags of coffee beans with a mean weight of 500 grams and a
standard deviation of 20 grams. If a random sample of 25 bags is selected, what is the
probability that the average weight of the bags is more than 510 grams?

(Ans. 0.0062)

Ques 10. A car manufacturer claims that the average mileage of its vehicles is 30 miles
per gallon (mpg), with a standard deviation of 2 mpg. If a random sample of 36 cars is
tested, what is the probability that the average mileage of the sample is less than 29 mpg?

Ques 11. A factory produces light bulbs with a mean lifespan of 1000 hours and a
standard deviation of 50 hours. If a random sample of 64 bulbs is selected, what is the
probability that the average lifespan of the sample is between 980 and 1020 hours?

Ques 12. A bakery produces loaves of bread with a mean weight of 1.2 kg and a standard
deviation of 0.1 kg. If a random sample of 49 loaves is selected, what is the probability
that the average weight of the sample is at least 1.18 kg?
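
Ques 9 to 12 rely on the sampling distribution of the mean: for a sample of size $n$, $\bar{X}$ is approximately normal with mean $\mu$ and standard error $\sigma/\sqrt{n}$. A minimal Python sketch under that assumption is shown below; the helper names are chosen for this example only.

```python
from math import erf, sqrt


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1 + erf(z / sqrt(2)))


def sample_mean_cdf(x, mu, sigma, n):
    """P(sample mean <= x), treating the mean as Normal(mu, sigma^2 / n)."""
    standard_error = sigma / sqrt(n)
    return normal_cdf((x - mu) / standard_error)


# Ques 9: mu = 500 g, sigma = 20 g, n = 25, P(mean > 510).
p_coffee = 1 - sample_mean_cdf(510, 500, 20, 25)

# Ques 11: mu = 1000 h, sigma = 50 h, n = 64, P(980 < mean < 1020).
p_bulbs = sample_mean_cdf(1020, 1000, 50, 64) - sample_mean_cdf(980, 1000, 50, 64)

print(p_coffee, p_bulbs)
```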

Ques 13. A nutritionist wants to estimate the variance of the sugar content in a certain
type of cereal. A sample of 12 boxes of cereal is selected, and the sugar content (in grams
per serving) is measured as follows: 8.5, 9.0, 8.7, 8.8, 9.2, 8.6, 8.9, 8.3, 8.8, 9.1, 8.7, 8.4.
Assuming a normal population, find a 99% confidence interval for the variance of the
sugar content. (Given $\chi^2_{0.005} = 26.757$ and $\chi^2_{0.995} = 2.603$ for 11 degrees of freedom.)

(Ans. (0.0310, 0.3189))

Ques 14. A botany researcher is studying the variance in the heights of a certain species
of plants. A random sample of 16 plants is selected, and their heights (in centimeters) are
recorded as follows: 30, 32, 31, 29, 33, 31, 30, 34, 32, 31, 30, 33, 32, 33, 31, 30. Assuming
a normal population, find a 95% confidence interval for the variance of the heights of this
species of plants. (Given $\chi^2_{0.025} = 27.488$ and $\chi^2_{0.975} = 6.262$ for 15 degrees of freedom.)

(Ans. (1.0823, 4.7509))

Ques 15. The standard deviation of a random sample of size 20 drawn from a normal
population is 3.8. Calculate the 90% confidence limits for the standard deviation in the
population. (Given $\chi^2_{0.05} = 30.144$ and $\chi^2_{0.95} = 10.117$ for 19 degrees of freedom.)

(Ans. (3.02, 5.21))

Ques 16. The standard deviation of a random sample of size 25 drawn from a normal
population is 6.2. Calculate the 99% confidence limits for the standard deviation in the
population. (Given $\chi^2_{0.005} = 45.559$ and $\chi^2_{0.995} = 9.886$ for 24 degrees of freedom.)
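
Ques 13 to 16 use the interval $\frac{(n-1)s^2}{\chi^2_{\alpha/2}} < \sigma^2 < \frac{(n-1)s^2}{\chi^2_{1-\alpha/2}}$ for a normal population, with square roots taken for limits on $\sigma$. The Python sketch below is one way to carry this out. The function name `variance_interval` is hypothetical, and the chi-square values in the example are standard table values for 19 degrees of freedom, supplied by hand as an assumption of this sketch.

```python
from math import sqrt


def variance_interval(sample_variance, n, chi2_upper_tail, chi2_lower_tail):
    """Confidence interval for a normal variance.

    chi2_upper_tail is the larger table value (e.g. chi^2_{0.05}) and
    chi2_lower_tail the smaller one (e.g. chi^2_{0.95}), both for n - 1 d.f.
    """
    scaled = (n - 1) * sample_variance
    return scaled / chi2_upper_tail, scaled / chi2_lower_tail


# Example in the style of Ques 15: n = 20, s = 3.8, 90% limits.
# Table values assumed for 19 d.f.: chi^2_{0.05} = 30.144, chi^2_{0.95} = 10.117.
var_low, var_high = variance_interval(3.8 ** 2, 20, 30.144, 10.117)
sd_low, sd_high = sqrt(var_low), sqrt(var_high)

print(f"variance: ({var_low:.3f}, {var_high:.3f})")
print(f"standard deviation: ({sd_low:.2f}, {sd_high:.2f})")
```

Note that the larger chi-square value produces the lower limit, because it appears in the denominator.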

Department of Mathematics, SoE, DSU, Harohalli-562 112

ASSIGNMENT-I
Course: Probability & Statistics
Course Code: 22AM/AS/CS/CT/CY/DS/EC/ME2401
Answer All The Questions:
Module 1: Probability Theory
1) A lot of components contains 0.6% defectives. Each component is subjected to a test that
correctly identifies a defective, but about 2 in every 100 good components are also
indicated as defective. Given that a randomly chosen component is declared defective by the
tester, compute the probability that it is actually defective by applying Bayes' theorem in
probability theory.
2) The sum of two non-negative quantities is equal to 2n. Find the chance that their product
is not less than 3/4 times their greatest product.
3) Three groups of children contain respectively 3 girls and 1 boy, 2 girls and 2 boys and 1
girl and 3 boys. One child is selected at random from each group. What is the probability
of selecting three children consisting of 1 girl and 2 boys?
4) In 1989 there were three candidates for the position of Principal – Dr. Chatterji, Dr. Iyer
and Dr. Gupta whose chances of getting appointment are in the ratio of 4: 2: 3 respectively.
The probability that Dr. Chatterji, if selected, would introduce co-education in the
institution is 0.30. The probabilities of Dr. Iyer and Dr. Gupta doing the same are 0.5 and
0.8 respectively. Given that co-education was introduced in the institution, what is the
probability that it was due to Dr. Iyer or Dr. Gupta?
Module 2: Random Variables & Probability Distribution.

5) By investing in a particular stock, a person can make a profit in one year of
Rs.40000 with probability 0.3 or take a loss of Rs.10000 with probability 0.7.
What is this person’s expected gain?
6) An exam consisting of 20 true/false questions is taken in the last five minutes of
the exam period by a student who accidentally slept through most of the exam
period. The student, given the small amount of time available to complete the
exam randomly guesses on all 20 questions. Find the mean and standard
deviation for the number of questions correctly answered by such a student.
Assuming a minimum of 8 correct answers are necessary to pass the exam, what
is this student's probability of passing?
7) A shop sells a particular make of mobile phone. Assuming that the weekly
demand for the mobile phone is a Poisson variable with mean 2,

(a) Find the probability that the shop sells at most 4 in a week
(b) Find the probability that the shop sells more than 9 in a month (4 weeks).
(c) Fresh stocks arrive at the shop only at the beginning of each month. Find the
minimum number that should be in stock at the beginning of a month so that
the shop can be at least 95% sure of being able to meet the demands during
the month.
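
Problem 7 combines the additivity of Poisson means (a weekly mean of 2 gives a four-week mean of 8) with a search for the smallest stock level whose cumulative probability reaches 95%. A minimal Python sketch of that reasoning follows; the function names are assumptions introduced for illustration.

```python
from math import exp, factorial


def poisson_cdf(k, lam):
    """P(X <= k) for X ~ Poisson(lam)."""
    return sum(exp(-lam) * lam ** i / factorial(i) for i in range(k + 1))


def min_stock(lam, confidence):
    """Smallest k with P(demand <= k) >= confidence for Poisson(lam) demand."""
    k = 0
    while poisson_cdf(k, lam) < confidence:
        k += 1
    return k


weekly_mean = 2
monthly_mean = 4 * weekly_mean   # sum of four independent weekly Poisson demands

p_at_most_4_week = poisson_cdf(4, weekly_mean)          # part (a)
p_more_than_9_month = 1 - poisson_cdf(9, monthly_mean)  # part (b)
stock_needed = min_stock(monthly_mean, 0.95)            # part (c)

print(p_at_most_4_week, p_more_than_9_month, stock_needed)
```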

IV Semester B. Tech Assignment-II
Department of Mathematics
Course Title: Probability & Statistics
Course Code: 22CS/AM/CY/CT/DS/ECE/ME/AS 2401

MODULE-3,4 and 5

Ques 1. An electrical firm manufactures light bulbs that have a length of life with mean µ and a
standard deviation of 40 hours. If a sample of 100 bulbs has an average life of 780 hours, find a
95% confidence interval for the population mean of all bulbs produced by this firm.

Ques 2. A survey was conducted to estimate the average amount of time spent commuting to
work per day. A random sample of 18 participants yielded the following data on commute times
(in minutes): 25, 30, 35, 20, 40, 45, 30, 35, 30, 25, 20, 35, 40, 45, 30, 35, 40, 25. Estimate the
average commute time per day. (Given, t0.025=2.1098 for 17 degrees of freedom.)

Ques 3. A nutritionist wants to estimate the variance of the sugar content in a certain type of
cereal. A sample of 12 boxes of cereal is selected, and the sugar content (in grams per serving) is
measured as follows: 8.5, 9.0, 8.7, 8.8, 9.2, 8.6, 8.9, 8.3, 8.8, 9.1, 8.7, 8.4. Assuming a normal
population, find a 99% confidence interval for the variance of the sugar content.
(Given $\chi^2_{0.005} = 26.757$ and $\chi^2_{0.995} = 2.603$ for 11 degrees of freedom.)

Ques 4. A coffee shop sells bags of coffee beans with a mean weight of 500 grams and a
standard deviation of 20 grams. If a random sample of 25 bags is selected, what is the probability
that the average weight of the bags is more than 510 grams?
