Chapter 6

Chapter 6 of the Introduction to Statistics course covers probability distributions, defining random variables and their types: discrete and continuous. It discusses properties of discrete and continuous probability distributions, expectation, and variance, along with common distributions such as binomial, Poisson, and normal distributions. The chapter includes examples and formulas for calculating probabilities, expected values, and variances for different scenarios.
Introduction to Statistics STAT 173

Chapter 6
Probability Distribution
6.1 Definition of random variables and probability distributions
Random variable: a numerical-valued function defined on the sample space.
It assigns a real number to each element of the sample space. Random variables
are generally denoted by capital letters, and the values they take are denoted
by small letters.
Example 6.1: Consider the experiment of tossing a fair coin three times. Let
the random variable X be the number of heads in the three tosses. Find the possible values of X.
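A minimal Python sketch of Example 6.1, assuming the sample space is written as strings of H and T (the names `sample_space` and `X` are illustrative, not from the notes). It lists every outcome and the value that X assigns to it:

```python
from itertools import product

# Sample space of three tosses of a fair coin: 8 equally likely outcomes.
sample_space = ["".join(t) for t in product("HT", repeat=3)]

# The random variable X assigns to each outcome its number of heads.
X = {outcome: outcome.count("H") for outcome in sample_space}

for outcome, x in X.items():
    print(outcome, "->", x)

# The distinct values X can take.
possible_values = sorted(set(X.values()))
print(possible_values)  # [0, 1, 2, 3]
```

The dictionary makes the "function on the sample space" idea concrete: the keys are outcomes and the values are real numbers.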
Random variables are of two types:
1. Discrete random variables: variables that can assume only a finite or
countable number of values; their values can be counted.
Examples:
• Number of heads when a coin is tossed n times.
• Number of children in a family.
• Number of car accidents per week.
• Number of defective items produced by a given company.
• Number of bacteria per two cubic centimeters of water.
2. Continuous random variables: variables that can assume all values
between any two given values.
Examples:
• Height of students at a certain college.
• Mark of a student.
• Lifetime of light bulbs.
• Length of time required to complete a given training.
Probability distribution: consists of the values a random variable can assume
and the corresponding probabilities of those values; equivalently, it is a function that assigns
a probability to each value of the random variable.
A probability distribution can be discrete or continuous.
A) Discrete probability distribution: a formula, table, graph, or other
device used to specify all possible values of a discrete random variable (R.V) X
along with their respective probabilities.


Example 6.2: 1) Consider the experiment of tossing a coin three times. Let X be
the number of heads. Construct the probability distribution of X.
2) A balanced die is tossed twice. Construct the probability distribution if
A) X is the sum of the number of spots in the two trials.
B) X is the absolute difference of the number of spots in the two trials.
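The distributions asked for in Example 6.2 can be built by counting equally likely outcomes. The sketch below assumes a fair coin and a balanced die; the helper `distribution` is a hypothetical name introduced only for this illustration:

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def distribution(outcomes, rv):
    """Map each value of rv to its probability, assuming equally likely outcomes."""
    outcomes = list(outcomes)
    counts = Counter(rv(o) for o in outcomes)
    return {x: Fraction(c, len(outcomes)) for x, c in sorted(counts.items())}


# 1) Number of heads in three tosses of a fair coin.
heads = distribution(product("HT", repeat=3), lambda o: o.count("H"))

# 2) Two tosses of a balanced die: sum and absolute difference of the spots.
rolls = list(product(range(1, 7), repeat=2))
spot_sum = distribution(rolls, lambda o: o[0] + o[1])
spot_diff = distribution(rolls, lambda o: abs(o[0] - o[1]))

for name, dist in [("heads", heads), ("sum", spot_sum), ("|difference|", spot_diff)]:
    print(name, {x: str(p) for x, p in dist.items()})
```

Exact fractions are used so that the probabilities in each table can be checked to add up to exactly 1.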
Properties of discrete probability distribution
1) $\sum_{i=1}^{n} P(X = x_i) = 1$

2) $P(X = x_i) \ge 0$, or $0 \le P(X = x_i) \le 1$
3) If X is a discrete random variable, then
$$P(a < X < b) = \sum_{x=a+1}^{b-1} P(x)$$
$$P(a \le X < b) = \sum_{x=a}^{b-1} P(x)$$
$$P(a < X \le b) = \sum_{x=a+1}^{b} P(x)$$
$$P(a \le X \le b) = \sum_{x=a}^{b} P(x)$$

B) Continuous probability distribution


Definition: a non-negative function f(x) is called the probability distribution of a
continuous R.V X if the total area bounded by its curve and the x-axis is 1, and if
the area under the curve between the perpendiculars erected at any two points a and b
gives the probability that X lies between a and b.
Properties of continuous probability distribution

a) The total area under the curve is one, i.e. $\int_{-\infty}^{\infty} f(x)\,dx = 1$

b) $P(a \le X \le b)$ = the area under the curve between the points a and b.

c) $P(X) \ge 0$
d) $P(X = a) = 0$
e) $P(a < X < b) = P(a \le X < b) = P(a < X \le b) = P(a \le X \le b)$


6.2 Introduction to expectation


Definition:
1. Let a discrete random variable X assume the values X1, X2, …, Xn with the
probabilities P(X1), P(X2), …, P(Xn) respectively. Then the expected value of X,
denoted by E(X), is defined as:
$$E(X) = X_1 P(X_1) + X_2 P(X_2) + \dots + X_n P(X_n) = \sum_{i=1}^{n} X_i P(X_i)$$

2. Let X be a continuous random variable assuming values in the interval (a, b)
such that $\int_a^b f(x)\,dx = 1$. Then
$$E(X) = \int_a^b x f(x)\,dx$$

Mean and Variance of a random variable


Let X be a given random variable.
1. The expected value of X is its mean:
Mean of X = E(X)
2. The variance of X is given by:
$$\mathrm{Var}(X) = E(X^2) - [E(X)]^2$$
where
$E(X^2) = \sum_{i=1}^{n} X_i^2 P(X_i)$ if X is discrete,
$E(X^2) = \int x^2 f(x)\,dx$ if X is continuous.
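As a small illustration of these formulas for the discrete case, the sketch below computes E(X) and Var(X) from a probability table. The function name `mean_and_variance` is an assumption of this example, and the table used is the number of heads in three tosses of a fair coin:

```python
from fractions import Fraction


def mean_and_variance(dist):
    """dist maps each value x_i to P(X = x_i); returns (E(X), Var(X))."""
    ex = sum(x * p for x, p in dist.items())        # E(X)
    ex2 = sum(x**2 * p for x, p in dist.items())    # E(X^2)
    return ex, ex2 - ex**2                          # Var(X) = E(X^2) - [E(X)]^2


# Number of heads in three tosses of a fair coin.
heads = {0: Fraction(1, 8), 1: Fraction(3, 8), 2: Fraction(3, 8), 3: Fraction(1, 8)}
mean, var = mean_and_variance(heads)
print(mean, var)  # 3/2 3/4
```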

Rule of Expectation
1) Let X be a R.V and k be a real number. Then
a) $E(kX) = kE(X)$ and $\mathrm{var}(kX) = k^2\,\mathrm{var}(X)$
b) $E(X + k) = E(X) + k$ and $\mathrm{var}(X + k) = \mathrm{var}(X)$
2) Let X and Y be R.Vs on the same sample space. Then
a) $E(X \pm Y) = E(X) \pm E(Y)$
b) $\mathrm{var}(X \pm Y) = \mathrm{var}(X) + \mathrm{var}(Y) \pm 2\,\mathrm{cov}(X, Y)$
where cov(X, Y) is the covariance between X and Y: $\mathrm{cov}(X, Y) = E(XY) - E(X)E(Y)$

3) Let X and Y be independent R.Vs. Then
a) $E(XY) = E(X)E(Y)$
b) $\mathrm{var}(X \pm Y) = \mathrm{var}(X) + \mathrm{var}(Y)$
c) $\mathrm{cov}(X, Y) = 0$
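The rules of expectation can be checked exactly on a small case. The sketch below assumes X and Y are the results of two independent fair dice (an illustrative choice, not from the notes) and uses exact fractions; `E`, `var`, and `k` are names introduced only for this check:

```python
from fractions import Fraction
from itertools import product

# Joint distribution of two independent fair dice: each pair has probability 1/36.
pairs = list(product(range(1, 7), repeat=2))
p = Fraction(1, len(pairs))


def E(g):
    """Expected value of g(x, y) under the joint distribution."""
    return sum(g(x, y) * p for x, y in pairs)


def var(g):
    """Variance of g(x, y), computed as E(g^2) - [E(g)]^2."""
    return E(lambda x, y: g(x, y) ** 2) - E(g) ** 2


X = lambda x, y: x
Y = lambda x, y: y
k = 3
cov = E(lambda x, y: x * y) - E(X) * E(Y)

print(E(lambda x, y: k * x) == k * E(X))                  # rule 1a, mean
print(var(lambda x, y: k * x) == k**2 * var(X))           # rule 1a, variance
print(var(lambda x, y: x + k) == var(X))                  # rule 1b, variance
print(var(lambda x, y: x + y) == var(X) + var(Y) + 2 * cov)  # rule 2b
print(cov == 0)                                           # rule 3c (independence)
```

Each line prints `True` for this example; a single example illustrates the rules but does not prove them.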
Example 6.3:
1. What are the expected value and variance of a random variable X obtained by
tossing a coin three times, where X is the number of heads?
2. Let X be a continuous R.V with distribution
$$f(x) = \begin{cases} \frac{1}{2}x, & 0 \le x \le 2 \\ 0, & \text{otherwise} \end{cases}$$
Then find a) P(1 < X < 1.5)
b) E(X)
c) Var(X)
d) $E(3X^2 - 2X)$
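For the continuous part of Example 6.3, the integrals can be approximated numerically. This is a rough sketch assuming the density is f(x) = x/2 on [0, 2]; the midpoint-rule helper `integrate` is a simple stand-in for exact integration, not a method prescribed by the notes:

```python
def f(x):
    """Density assumed in Example 6.3: x/2 on [0, 2], zero elsewhere."""
    return x / 2 if 0 <= x <= 2 else 0.0


def integrate(g, a, b, n=10_000):
    """Midpoint-rule approximation of the integral of g over [a, b]."""
    h = (b - a) / n
    return h * sum(g(a + (i + 0.5) * h) for i in range(n))


total = integrate(f, 0, 2)                       # total area, should be 1
prob = integrate(f, 1, 1.5)                      # P(1 < X < 1.5), exact value 5/16
mean = integrate(lambda x: x * f(x), 0, 2)       # E(X), exact value 4/3
ex2 = integrate(lambda x: x**2 * f(x), 0, 2)     # E(X^2), exact value 2
var = ex2 - mean**2                              # Var(X), exact value 2/9

print(round(total, 6), round(prob, 6), round(mean, 6), round(var, 6))
```

The exact values in the comments come from integrating by hand, for example P(1 < X < 1.5) = [x²/4] from 1 to 1.5 = 0.3125.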
Common Discrete Probability Distributions
1. Binomial Distribution
A binomial experiment is a probability experiment that satisfies the following
four requirements called assumptions of a binomial distribution.
1. The experiment consists of n identical trials.
2. Each trial has only one of the two possible mutually exclusive outcomes,
success or a failure.
3. The probability of each outcome does not change from trial to trial, and
4. The trials are independent, thus we must sample with replacement.
Examples of binomial experiments
• Tossing a coin 20 times to see how many tails occur.
• Asking 200 people if they watch BBC news.
• Registering a newly produced product as defective or non defective.
• Asking 100 people if they favor the ruling party.
• Rolling a die to see if a 5 appears.
Definition: The outcomes of the binomial experiment and the corresponding
probabilities of these outcomes are called Binomial Distribution.
Let p = probability of success and q = 1 − p = probability of failure on any given trial.
Then the probability of getting x successes in n trials is
$$P(X = x) = \begin{cases} \binom{n}{x} p^x q^{n-x}, & x = 0, 1, 2, \dots, n \\ 0, & \text{otherwise} \end{cases}$$
This is sometimes written as
$X \sim \mathrm{Bin}(n, p)$
When using the binomial formula to solve problems, we have to identify three
things:
• The number of trials (n),
• The probability of a success on any one trial (p), and
• The number of successes desired (x).
Remark: If X is a binomial random variable with parameters n and p, then
E(X) = np and var(X) = npq.
Example 6.4:
1. What is the probability of getting three heads when tossing a fair coin four times?
2. Suppose that an examination consists of six true-or-false questions, and assume
that a student has no knowledge of the subject matter. The probability that the
student will guess the correct answer to the first question is 30%. Likewise, the
probability of guessing each of the remaining questions correctly is also 30%.
a) What is the probability of getting more than three correct answers?
b) What is the probability of getting at least two correct answers?
c) What is the probability of getting at most three correct answers?
d) What is the probability of getting less than five correct answers?
3. The probability that a patient contracting IB will recover from the disease under
medical treatment is 0.6. Out of 15 patients contracting the disease:
a) What is the probability that exactly 10 will recover?
b) What is the expected number of patients who will recover?
c) What is the variance of the number of patients who will recover?
Assume that all the patients are subjected to the same medical treatment.
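The questions in Example 6.4 all reduce to evaluating the binomial formula and adding up the relevant terms. A possible sketch, where `binomial_pmf` is a helper name chosen for this illustration:

```python
from math import comb


def binomial_pmf(x, n, p):
    """P(X = x) for X ~ Bin(n, p), with x an integer in 0..n."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


# 1. Three heads in four tosses of a fair coin.
p_three_heads = binomial_pmf(3, 4, 0.5)

# 2. Six questions, each guessed correctly with probability 0.3.
n, p = 6, 0.3
pmf = [binomial_pmf(x, n, p) for x in range(n + 1)]
more_than_three = sum(pmf[4:])      # P(X > 3)
at_least_two = 1 - sum(pmf[:2])     # P(X >= 2)
at_most_three = sum(pmf[:4])        # P(X <= 3)
less_than_five = sum(pmf[:5])       # P(X < 5)

# 3. Fifteen patients, each recovering with probability 0.6.
n3, p3 = 15, 0.6
exactly_ten = binomial_pmf(10, n3, p3)
expected = n3 * p3                  # E(X) = np
variance = n3 * p3 * (1 - p3)       # var(X) = npq

print(round(p_three_heads, 4), round(more_than_three, 4), round(at_least_two, 4))
print(round(at_most_three, 4), round(less_than_five, 4), round(exactly_ten, 4))
print(round(expected, 4), round(variance, 4))
```

Note how "at least", "at most", "more than", and "less than" translate into different slices of the same probability list.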
2. Poisson Distribution
- A random variable X is said to have a Poisson distribution if its probability
distribution is given by:
$$P(X = x) = \begin{cases} \dfrac{\lambda^x e^{-\lambda}}{x!}, & x = 0, 1, 2, \dots \\ 0, & \text{otherwise} \end{cases}$$
where λ is the average number of occurrences of an event in a unit length of
interval or distance, and x is the number of occurrences in a Poisson process.
- The Poisson distribution depends only on the average number of occurrences
per unit of time or space.
- The Poisson distribution is used as a distribution of rare events, such as:
• Number of misprints per page.
• Natural disasters like earthquakes.
• Accidents.
• Hereditary.
• Arrivals.
- The process that gives rise to such events is called a Poisson process.
- If X is a Poisson random variable with parameter λ, then
E(X) = λ and var(X) = λ.
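A short numerical check of the Poisson formula and of E(X) = var(X) = λ. The value λ = 2 is an arbitrary illustrative choice, and the infinite sums are truncated at x = 99, where the remaining terms are negligible:

```python
from math import exp, factorial


def poisson_pmf(x, lam):
    """P(X = x) for a Poisson random variable with mean lam."""
    if x < 0:
        return 0.0
    return lam**x * exp(-lam) / factorial(x)


lam = 2.0  # illustrative value, not taken from the notes
pmf = [poisson_pmf(x, lam) for x in range(100)]

total = sum(pmf)                                         # close to 1
mean = sum(x * px for x, px in enumerate(pmf))           # close to lam
var = sum(x**2 * px for x, px in enumerate(pmf)) - mean**2  # close to lam

print(round(total, 6), round(mean, 6), round(var, 6))
```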
Note: The Poisson probability distribution provides a close approximation to the
binomial probability distribution when n is large and p is quite small or quite
large, with λ = np:
$$P(X = x) = \frac{(np)^x e^{-np}}{x!}, \quad x = 0, 1, 2, \dots$$
where λ = np is the average number of occurrences.
Usually we use this approximation if np ≤ 5. In other words, if n > 20 and np < 5 or
n(1 − p) ≤ 5, then we may use the Poisson distribution as an approximation to the binomial
distribution.
Example 6.5: 1. If 1.6 accidents can be expected at an intersection on any given
day, what is the probability that there will be 3 accidents on any given day?
2. If there are 200 typographical errors randomly distributed in a 500-page
manuscript, find the probability that a given page contains exactly 3 errors.
3. A sales firm receives, on average, 3 calls per hour on its toll-free number.
For any given hour, find the probability that it will receive the following:
a) At most 3 calls
b) At least 3 calls
c) Five or more calls
4. If approximately 2% of people are left-handed, find the probability that in
a room of 200 people there are exactly 5 people who are left-handed.
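One way to work through Example 6.5 in Python is sketched below. The helper names are assumptions of this illustration. For question 4, the exact binomial probability is computed alongside the Poisson approximation with λ = np = 4 so the two can be compared:

```python
from math import comb, exp, factorial


def poisson_pmf(x, lam):
    """P(X = x) for a Poisson random variable with mean lam."""
    return lam**x * exp(-lam) / factorial(x)


def binomial_pmf(x, n, p):
    """P(X = x) for X ~ Bin(n, p)."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


# 1. On average 1.6 accidents per day; probability of exactly 3.
q1 = poisson_pmf(3, 1.6)

# 2. 200 errors over 500 pages gives lam = 0.4 errors per page.
q2 = poisson_pmf(3, 200 / 500)

# 3. On average 3 calls per hour.
at_most_3 = sum(poisson_pmf(x, 3) for x in range(4))
at_least_3 = 1 - sum(poisson_pmf(x, 3) for x in range(3))
five_or_more = 1 - sum(poisson_pmf(x, 3) for x in range(5))

# 4. n = 200 people, p = 0.02: Poisson approximation versus exact binomial.
approx = poisson_pmf(5, 200 * 0.02)
exact = binomial_pmf(5, 200, 0.02)

print(round(q1, 4), round(q2, 5))
print(round(at_most_3, 4), round(at_least_3, 4), round(five_or_more, 4))
print(round(approx, 4), round(exact, 4))
```

In question 4 the approximate and exact values differ by less than 0.005, which illustrates why the approximation is acceptable for large n and small p.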
Common Continuous Probability Distributions
1. Normal Distribution
A random variable X is said to have a normal distribution if its probability density
function is given by
$$f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2}, \quad -\infty < x < \infty,\ -\infty < \mu < \infty,\ \sigma > 0$$
where $\mu = E(X)$ and $\sigma^2 = \mathrm{Var}(X)$ are the parameters of the normal distribution.

Properties of Normal Distribution:


1. It is bell shaped, symmetrical about its mean, and mesokurtic. The
maximum ordinate is at x = μ and is given by
$$f(\mu) = \frac{1}{\sigma\sqrt{2\pi}}$$
2. It is asymptotic to the x-axis, i.e., it extends indefinitely in either direction from the
mean.
3. It is a continuous distribution, i.e., there are no gaps or holes.
4. It is a family of curves, i.e., every unique pair of mean and standard deviation
defines a different normal distribution. Thus, the normal distribution is completely
described by two parameters: mean and standard deviation.
5. The total area under the curve is 1, and the area on each side of
the mean is 0.5: $\int_{-\infty}^{\infty} f(x)\,dx = 1$
6. It is unimodal, i.e., values mound up only in the center of the curve.
7. Mean = Median = Mode = μ, located at the center of the distribution.
8. The probability that a random variable will have a value between any two points is
equal to the area under the curve between those points.
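Several of these properties can be checked numerically from the density formula. The sketch below uses μ = 80 and σ = 4.8 purely as illustrative values, and a simple midpoint rule (the helper `integrate` is an assumption of this example) over μ ± 10σ in place of the infinite range:

```python
from math import exp, pi, sqrt


def normal_pdf(x, mu, sigma):
    """Normal density with mean mu and standard deviation sigma."""
    z = (x - mu) / sigma
    return exp(-0.5 * z * z) / (sigma * sqrt(2 * pi))


def integrate(g, a, b, n=20_000):
    """Midpoint-rule approximation of the integral of g over [a, b]."""
    h = (b - a) / n
    return h * sum(g(a + (i + 0.5) * h) for i in range(n))


mu, sigma = 80.0, 4.8  # illustrative parameters


def pdf(x):
    return normal_pdf(x, mu, sigma)


peak = pdf(mu)                                            # maximum ordinate at x = mu
total_area = integrate(pdf, mu - 10 * sigma, mu + 10 * sigma)
left_area = integrate(pdf, mu - 10 * sigma, mu)           # area to the left of the mean

print(round(peak, 6), round(1 / (sigma * sqrt(2 * pi)), 6))  # the two values agree
print(round(total_area, 6), round(left_area, 6))
print(abs(pdf(mu + 3) - pdf(mu - 3)) < 1e-12)             # symmetry about the mean
```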
Note: To facilitate the use of the normal distribution, the following distribution, known as
the standard normal distribution, was derived by using the transformation
$$Z = \frac{X - \mu}{\sigma} \quad\Rightarrow\quad f(z) = \frac{1}{\sqrt{2\pi}}\, e^{-\frac{1}{2}z^2}$$
i.e. if $X \sim N(\mu, \sigma^2)$ then $Z \sim N(0, 1)$.
Properties of the Standard Normal Distribution:
Same as a normal distribution, but also...
• Mean is zero
• Variance is one
• Standard Deviation is one
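Standardizing turns any normal probability into a standard normal one. As a sketch, the standard normal cumulative area can be written with the error function from Python's `math` module; the names `phi` and `normal_interval_prob`, and the values μ = 50, σ = 10, are assumptions made for this illustration:

```python
from math import erf, sqrt


def phi(z):
    """Standard normal cumulative area P(Z <= z), via the error function."""
    return 0.5 * (1 + erf(z / sqrt(2)))


def normal_interval_prob(a, b, mu, sigma):
    """P(a < X < b) for X ~ N(mu, sigma^2), computed by standardizing a and b."""
    return phi((b - mu) / sigma) - phi((a - mu) / sigma)


# Area between Z = 0 and a positive z, as given in most printed tables.
area_0_to_1 = phi(1.0) - phi(0.0)

# Illustrative values: mu = 50, sigma = 10, so 40 and 60 standardize to -1 and 1.
prob = normal_interval_prob(40, 60, 50, 10)

print(round(area_0_to_1, 4), round(prob, 4))  # 0.3413 0.6827
```

Because the curve is symmetric, the probability between −1 and 1 is twice the tabulated area between 0 and 1.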
- Areas under the standard normal distribution curve have been tabulated in various
ways. The most common ones are the areas between Z=0 and a positive value of Z.
- Given a normally distributed random variable X with mean μ and standard
deviation σ,
$$P(a < X < b) = P\left(\frac{a-\mu}{\sigma} < \frac{X-\mu}{\sigma} < \frac{b-\mu}{\sigma}\right) = P\left(\frac{a-\mu}{\sigma} < Z < \frac{b-\mu}{\sigma}\right)$$
Example 6.6:
1. Find the area under the standard normal distribution which lies
a) Between Z = 0 and Z = 0.96
b) Between Z = −1.45 and Z = 0
c) To the right of Z = −0.35
d) To the left of Z = 0.35
e) Between Z = −0.67 and Z = 0.75
f) Between Z = 0.25 and Z = 1.25
2. Find the value of Z if
a) The normal curve area between 0 and Z (positive) is 0.4726
b) The area to the left of Z is 0.9868
3. A random variable X has a normal distribution with mean 80 and standard
deviation 4.8. What is the probability that it will take a value
a) Less than 87.2
b) Greater than 76.4
c) Between 81.2 and 86.0
4. A normal distribution has mean 62.4. Find its standard deviation if 20.05% of
the area under the normal curve lies to the right of 72.9.


5. A random variable has a normal distribution with σ = 5. Find its mean if the
probability that the random variable will assume a value less than 52.5 is
0.6915.
6. Of a large group of men, 5% are less than 60 inches in height and 40% are
between 60 & 65 inches. Assuming a normal distribution, find the mean and
standard deviation of heights.
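Questions 3 to 6 of Example 6.6 can be checked with `statistics.NormalDist` from the Python standard library (Python 3.8 or later), which provides `cdf` and `inv_cdf` in place of a printed table. The variable names are illustrative, and the results differ slightly from table-based answers because the table rounds z to two decimals:

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal, mean 0 and standard deviation 1

# 3. X ~ N(80, 4.8^2)
X = NormalDist(mu=80, sigma=4.8)
less_than = X.cdf(87.2)                 # P(X < 87.2)
greater_than = 1 - X.cdf(76.4)          # P(X > 76.4)
between = X.cdf(86.0) - X.cdf(81.2)     # P(81.2 < X < 86.0)

# 4. Mean 62.4 and 20.05% of the area to the right of 72.9:
#    (72.9 - 62.4) / sigma equals the z value with area 0.7995 to its left.
sigma4 = (72.9 - 62.4) / Z.inv_cdf(1 - 0.2005)

# 5. sigma = 5 and P(X < 52.5) = 0.6915: mu = 52.5 - z * sigma.
mu5 = 52.5 - 5 * Z.inv_cdf(0.6915)

# 6. P(X < 60) = 0.05 and P(X < 65) = 0.05 + 0.40 = 0.45:
#    two equations, 60 = mu + z1 * sigma and 65 = mu + z2 * sigma.
z1, z2 = Z.inv_cdf(0.05), Z.inv_cdf(0.45)
sigma6 = (65 - 60) / (z2 - z1)
mu6 = 60 - z1 * sigma6

print(round(less_than, 4), round(greater_than, 4), round(between, 4))
print(round(sigma4, 2), round(mu5, 2), round(mu6, 2), round(sigma6, 2))
```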
2. Student’s t Distribution
In statistics, as long as the sample size is large enough, the distribution of most
sample statistics can be approximated by the standard normal distribution. But when the sample size is small,
statisticians rely on the distribution of the t statistic (also known as the t score),
whose value is given by:
$$t = \frac{\bar{x} - \mu}{s / \sqrt{n}}$$
where $\bar{x}$ is the sample mean, μ is the population mean, s is the standard
deviation of the sample, and n is the sample size.
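A minimal sketch of the t statistic, assuming a small made-up sample and a hypothesized population mean of 12 (both are hypothetical values chosen so the arithmetic is easy to follow):

```python
from math import sqrt
from statistics import mean, stdev


def t_statistic(sample, mu):
    """t = (sample mean - mu) / (s / sqrt(n)), with s the sample standard deviation."""
    n = len(sample)
    return (mean(sample) - mu) / (stdev(sample) / sqrt(n))


sample = [12, 15, 11, 14, 13]   # hypothetical data: mean 13, s^2 = 2.5
t = t_statistic(sample, mu=12)
df = len(sample) - 1            # degrees of freedom = n - 1

print(round(t, 4), df)  # 1.4142 4
```

Here `statistics.stdev` divides by n − 1, which matches the sample standard deviation s used in the formula.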
The distribution of the t statistic is called the t distribution or the Student t
distribution. The particular form of the t distribution is determined by its
degrees of freedom (df). The degrees of freedom refer to the number of
independent observations in a set of data. When estimating a mean score or a
proportion from a single sample, the number of independent observations is
equal to the sample size minus one. The t distribution can be used with any
statistic having a bell-shaped distribution (i.e., approximately normal).
The t distribution has the following properties:
• The mean of the distribution is equal to 0.
• The variance is equal to v / (v − 2), where v is the degrees of freedom and v > 2.
• With infinite degrees of freedom, the t distribution is the same as the standard
normal distribution.
• The t distribution is similar to the standard normal distribution in the
following ways:
  - It is bell-shaped.
  - It is symmetric about the mean.
  - The mean, median, and mode are equal to zero and located at the
    center of the distribution.
  - The curve never touches the x-axis.
• The t distribution differs from the standard normal distribution in the
following ways:
  - The variance is greater than one.
  - The t distribution is actually a family of curves based on the concept of
    degrees of freedom, which is related to sample size.
  - As the sample size increases, the t distribution approaches the standard
    normal distribution.
3. Chi-Square Distribution
• The chi-square variable is similar to the t variable in that its distribution is a
family of curves based on the number of degrees of freedom. The symbol
for chi-square is $\chi^2$ (Greek letter chi, pronounced “ki”). The chi-square
distribution is obtained from the values of $\frac{(n-1)s^2}{\sigma^2}$ when random samples
are selected from a normally distributed population whose variance is $\sigma^2$.
A chi-square variable cannot be negative, and the distributions are
positively skewed. At about 100 degrees of freedom, the chi-square
distribution becomes somewhat symmetrical. The area under each chi-square
distribution is equal to 1.00 or 100%.
• In order to find the area under the chi-square distribution,
there are three cases to consider:

1) Find the chi-square critical value for a specific α when the hypothesis test is
one-tailed right. In this case, find the α value at the top of the $\chi^2$ table and the
corresponding degrees of freedom in the left column. The critical value is
located where the row and the column meet.

Example: the critical chi-square value for 15 degrees of freedom when α = 0.05
and the test is one-tailed right ($\chi^2_{0.05,\,15}$) is 24.996.


2) Find the chi-square critical value for a specific α when the hypothesis test is
one-tailed left. In this case, the α value must be subtracted from one. The
left side of the table is then used, because the $\chi^2$ table gives the area to the right of the
critical value and the $\chi^2$ statistic cannot be negative.

Example: The critical $\chi^2$ value for 10 df when α = 0.05 and the test is one-tailed
left is 3.940.
3) Find the chi-square critical values for a specific α when the hypothesis test is
two-tailed. When a two-tailed test is conducted, the area must be split. For
example, to find the critical chi-square values for 22 degrees of freedom when α
= 0.05, we use the area 0.025 (0.05/2) to the right of the larger value and the
area 0.975 (1 − 0.05/2) to the right of the smaller value. Hence, one must use the α values
0.025 and 0.975 in the table; with 22 degrees of freedom, the critical values are
36.781 and 10.982 respectively.
Note that after the degrees of freedom reach 30, the chi-square table only gives
values for multiples of 10 (40, 50, 60, etc.). When the exact degrees of freedom one
is seeking are not listed in the table, the closest smaller value should be used.
The chi-square distribution has the following properties:
• The mean of the distribution is equal to the number of degrees of freedom (v): μ = v.
• The variance is equal to two times the number of degrees of freedom: σ² = 2v.
• When the degrees of freedom are greater than or equal to 2, the maximum
value of the density curve occurs at χ² = v − 2.
• As the degrees of freedom increase, the chi-square curve approaches a
normal distribution.
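The first two properties can be illustrated by simulation. The sketch below draws many samples of size n = 6 from a normal population, computes (n − 1)s²/σ² for each, and compares the simulated mean and variance with v and 2v for v = n − 1 = 5. The population mean 10, σ = 2, the seed, and the number of repetitions are all arbitrary choices for this illustration, and simulated values only approximate the theoretical ones:

```python
import random

random.seed(0)  # fixed seed so the run is repeatable


def sample_variance(xs):
    """Sample variance s^2 with divisor n - 1."""
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)


def chi_square_stat(sample, sigma):
    """(n - 1) s^2 / sigma^2 for one sample from a normal population."""
    return (len(sample) - 1) * sample_variance(sample) / sigma**2


sigma, n = 2.0, 6
df = n - 1
values = [
    chi_square_stat([random.gauss(10.0, sigma) for _ in range(n)], sigma)
    for _ in range(20_000)
]

sim_mean = sum(values) / len(values)   # should be near df
sim_var = sample_variance(values)      # should be near 2 * df

print(round(sim_mean, 2), df)
print(round(sim_var, 2), 2 * df)
print(min(values) >= 0)                # chi-square values are never negative
```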
