
Engineering Data Analysis Lecture 3

This document covers discrete probability distributions in engineering data analysis, focusing on discrete random variables, their probability mass functions (PMFs), and cumulative distribution functions (CDFs). It explains key concepts such as mean and variance, along with specific distributions like binomial and Poisson distributions. Examples are provided to illustrate the application of these concepts in various scenarios.

ENGINEERING

DATA ANALYSIS

LECTURE 3: DISCRETE
PROBABILITY DISTRIBUTION
TOPICS TO BE COVERED:

• DISCRETE RANDOM VARIABLE


• PROBABILITY DISTRIBUTIONS AND PROBABILITY MASS FUNCTIONS
• CUMULATIVE DISTRIBUTION FUNCTIONS
• MEAN AND VARIANCE OF DISCRETE RANDOM VARIABLE
• BINOMIAL PROBABILITY DISTRIBUTION
• POISSON PROBABILITY DISTRIBUTION
RANDOM VARIABLES

● In Engineering Data Analysis (EDA), a random variable is a variable whose value is subject to
uncertainty or variability due to inherent randomness in the system or process being studied.
● It's a mathematical representation of a quantity that can take on different values with varying
probabilities.
● A random variable is denoted with a capital letter, such as X.
● The probability distribution of a random variable X tells what the possible values of X are and
how probabilities are assigned to those values.
● A random variable can be discrete or continuous.
TYPES OF RANDOM VARIABLES

● Discrete Random Variables - Can only take on specific, distinct values (often integers).
○ Examples: number of product defects, number of machine failures, number of customer arrivals.
● Continuous Random Variables - Can take on any value within a certain range.
○ Examples: temperature readings, material strength measurements, vibration levels, time to failure.
DISCRETE RANDOM VARIABLES

● A discrete random variable is a numerical representation of the outcome of a random event where the
possible results are distinct and can be counted. It serves as the foundation for defining and
understanding discrete probability distributions.
● Each possible value of a discrete random variable has a specific probability associated with it. This
collection of probabilities for all possible values forms the discrete probability distribution
(specifically, the Probability Mass Function, or PMF).

● Finite or Countably Infinite Values:

○ Finite: There's a limited, definite number of possible values (e.g., the number of heads in 3 coin tosses:
0, 1, 2, 3).
○ Countably Infinite: The values can be listed in a sequence, but the list never ends (e.g., the number of
coin tosses until you get the first head: 1, 2, 3, ... – it could theoretically take forever).
EXAMPLES OF DISCRETE RANDOM VARIABLES
PROBABILITY DISTRIBUTIONS
The probability distribution of a random variable X is a description of the
probabilities associated with the possible values of X.
DISCRETE PROBABILITY DISTRIBUTION

• For a discrete random variable, the distribution is often specified by just a list of the possible
values along with the probability of each. In some cases, it is convenient to express the
probability in terms of a formula.
• In essence, it tells you how the total probability of 1 is "distributed" across the distinct values
that a random variable can take.
DISCRETE PROBABILITY DISTRIBUTION

• Key Characteristics:
• Discrete Random Variable (X): The underlying variable must be discrete, meaning it can only take on a
finite or countably infinite number of specific, separate values. These values are often integers, representing
counts or categories (e.g., number of heads in coin tosses, number of defective items, results of a die roll).
• Listable Outcomes: Unlike continuous distributions, the possible outcomes can be listed or enumerated
(e.g., {0, 1, 2, 3} or {1, 2, 3, ...}).
• Probabilities for Each Outcome: For each possible value x that the random variable X can take, there is an
associated probability 𝑃(𝑋 = 𝑥).
• Properties of Probabilities:
• Non-negative: The probability of any single outcome must be between 0 and 1, inclusive:
0 ≤ 𝑃(𝑋 = 𝑥) ≤ 1.
• Sum to One: The sum of the probabilities of all possible outcomes must equal 1: Σₓ 𝑃(𝑋 = 𝑥) = 1. This ensures
that all possible outcomes are accounted for and that some outcome must occur.
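The two properties above can be checked mechanically. The following is a minimal sketch, not part of the lecture: the helper name is_valid_pmf and both example distributions are assumptions made for illustration (a fair die, and a deliberately incomplete candidate). Exact fractions are used so that the sum-to-one test is not affected by floating-point rounding.

```python
from fractions import Fraction

def is_valid_pmf(pmf):
    """Return True if every probability lies in [0, 1] and the total is exactly 1."""
    probs = list(pmf.values())
    return all(0 <= p <= 1 for p in probs) and sum(probs) == 1

# Fair six-sided die (assumed fair for illustration): each face has probability 1/6.
die_pmf = {face: Fraction(1, 6) for face in range(1, 7)}

# A deliberately broken candidate: its probabilities sum to 9/10, not 1.
bad_pmf = {0: Fraction(1, 2), 1: Fraction(2, 5)}

print(is_valid_pmf(die_pmf))  # True
print(is_valid_pmf(bad_pmf))  # False
```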
EXAMPLE
• A coin is flipped twice. Let X be the number of heads that are observed. Determine the
probability distribution of X and find the probability that at least one head is observed.

The possible values that X can take are 0, 1, and 2. Each of these numbers corresponds to an
event in the sample space S = {hh, ht, th, tt} of equally likely outcomes for this experiment:
X = 0 to {tt}, X = 1 to {ht, th}, and X = 2 to {hh}. The probability of each of these events,
hence of the corresponding value of X, can be found simply by counting, to give

x       0      1      2
𝑃(𝑥)    0.25   0.5    0.25

This table is the probability distribution of X. The probability that at least one head is
observed is 𝑃(𝑋 ≥ 1) = 𝑃(1) + 𝑃(2) = 0.5 + 0.25 = 0.75.
EXAMPLE

Two dice are rolled. Let X be the sum of the values of the two dice. Construct the
probability distribution of X.
1. Determine the Sample Space: When two dice are rolled, each die has 6 possible outcomes
(1, 2, 3, 4, 5, 6). Since the rolls are independent, the total number of possible outcomes in
the sample space is 6×6=36.
2. Identify the Possible Values of the Random Variable X (the Sum): The minimum sum is
1+1=2. The maximum sum is 6+6=12. So, the possible values for X are
{2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12}.
EXAMPLE

Two dice are rolled. Let X be the sum of the values of the two dice. Construct the
probability distribution of X.

Sum (X)    Probability P(X=x)
2          1/36
3          2/36
4          3/36
5          4/36
6          5/36
7          6/36
8          5/36
9          4/36
10         3/36
11         2/36
12         1/36
Total      36/36 = 1
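The distribution of the sum of two dice can also be reproduced by enumerating the 36 equally likely outcomes and counting how many give each sum. This is a small sketch of that counting argument, not code from the lecture. The variable names are illustrative, and the event P(X ≤ 3) is chosen only as an example of adding up PMF values.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# All 36 equally likely ordered outcomes (first die, second die).
outcomes = list(product(range(1, 7), repeat=2))

# Count how many outcomes produce each sum, then divide by 36.
counts = Counter(a + b for a, b in outcomes)
pmf = {s: Fraction(c, len(outcomes)) for s, c in sorted(counts.items())}

for s in pmf:
    print(s, f"{counts[s]}/36")

# The probabilities must add up to 1.
assert sum(pmf.values()) == 1

# Probability of an event = sum of the PMF over the values in the event.
p_at_most_3 = sum(p for s, p in pmf.items() if s <= 3)
print(p_at_most_3)  # 1/12, i.e. 3/36
```

Any event about the sum can be evaluated the same way, by changing the condition inside the final sum.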
BOARDWORK!!!

• Question 1: What is the probability that the sum of the two dice rolls is less than 5?
• Question 2: What is the probability that the sum of the two dice rolls is exactly 7 or exactly
11?
• Question 3: What is the probability that the sum of the two dice rolls is greater than or equal
to 9?
PROBABILITY MASS FUNCTIONS

• The Probability Mass Function (PMF) is the formal mathematical function that defines a
discrete probability distribution. It directly assigns a probability to each specific, distinct value
that a discrete random variable can take.
• The PMF tells you the probability at a specific point.
PROBABILITY MASS FUNCTIONS
EXAMPLES: PROBABILITY MASS FUNCTION

• Verify that the following functions are probability mass functions, and determine the requested
probabilities
EXAMPLE
• A disk drive manufacturer sells storage devices with capacities of one terabyte, 500 gigabytes,
and 100 gigabytes with probabilities 0.5, 0.3, and 0.2, respectively. The revenues associated with
the sales in that year are estimated to be $50 million, $25 million, and $10 million, respectively.
Let X denote the revenue of storage devices during that year. Determine the probability mass
function of X.
• Actual lengths of stay at a hospital’s emergency department in 2009 are shown in the following
table (rounded to the nearest hour). Length of stay is the total of wait and service times. Some
longer stays are also approximated as 15 hours in this table. Calculate the probability mass
function of the wait time for service.
BOARD WORK!!!

• The data from 200 endothermic reactions involving sodium bicarbonate are summarized as
follows. Calculate the probability mass function of final temperature.
CUMULATIVE DISTRIBUTION
FUNCTIONS
The cumulative distribution function gives the probability that a discrete random variable X takes
on a value less than or equal to a specific value x. It "accumulates" the probabilities of all
values up to and including x.
Continuous Probability Distribution: For outcomes within
a range, defined by a Probability Density Function (PDF).

CUMULATIVE DISTRIBUTION FUNCTIONS


• While a discrete distribution uses a Probability Mass Function (PMF) to give probabilities of individual
points, it also has a Cumulative Distribution Function (CDF). The CDF, denoted F(x), gives the
probability that the random variable X takes on a value less than or equal to x.
• The CDF for a discrete variable is a step function. It looks like a series of steps, where the function jumps
up at each possible value of X by an amount equal to the PMF at that value. It is defined for all real
numbers, but its value only changes at the discrete points where the PMF is non-zero.
EXAMPLE:
EXAMPLE

• A disk drive manufacturer sells storage devices with capacities of one terabyte, 500 gigabytes,
and 100 gigabytes with probabilities 0.5, 0.3, and 0.2, respectively. The revenues associated
with the sales in that year are estimated to be $50 million, $25 million, and $10 million,
respectively. Let X denote the revenue of storage devices during that year. Determine the
cumulative distribution function for the random variable X.
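One way to work through this example is to sort the revenue values, accumulate their probabilities, and read F(x) off as a step function. The sketch below is an illustration under that approach, not the lecture's own solution. The names revenue_pmf and cdf are assumptions, and revenues are in millions of dollars as stated in the problem.

```python
from bisect import bisect_right
from itertools import accumulate

# PMF of revenue X in $ millions, taken from the problem statement.
revenue_pmf = {10: 0.2, 25: 0.3, 50: 0.5}

values = sorted(revenue_pmf)                                      # [10, 25, 50]
cumulative = list(accumulate(revenue_pmf[v] for v in values))     # running totals

def cdf(x):
    """F(x) = P(X <= x): sum of the PMF over all values not exceeding x."""
    k = bisect_right(values, x)        # number of possible values <= x
    return 0.0 if k == 0 else cumulative[k - 1]

# F is flat between the possible values and jumps at 10, 25, and 50.
for x in (5, 10, 24.9, 25, 50, 80):
    print(x, cdf(x))
```

The function is defined for every real x, but its value only changes at the three revenue levels, which matches the step-function description of a discrete CDF.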
EXAMPLE

• Errors in an experimental transmission channel are found when the transmission is checked by
a certifier that detects missing pulses. The number of errors found in an eight-bit byte is a
random variable with the following distribution:
BOARDWORK!!!!
The data from 200 endothermic reactions
involving sodium bicarbonate are summarized
as follows. Determine the cumulative
distribution function for the random variable
of the final temperature.
MEAN AND VARIANCE OF
DISCRETE RANDOM VARIABLE
Two numbers are often used to summarize a probability distribution for a random
variable X. The mean is a measure of the center or middle of the probability
distribution, and the variance is a measure of the dispersion, or variability in the
distribution.
MEAN OF A DISCRETE RANDOM VARIABLE (EXPECTED
VALUE)
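For a discrete random variable X with PMF f(x), the standard definitions are μ = E(X) = Σ x·f(x) for the mean and σ² = V(X) = Σ (x − μ)²·f(x) for the variance, with both sums taken over all possible values x. As a minimal sketch of these two sums, the code below applies them to the two-coin distribution from the earlier example (0, 1, 2 heads with probabilities 0.25, 0.5, 0.25). The function names are illustrative and not from the lecture.

```python
def mean(pmf):
    """E(X): probability-weighted average of the possible values."""
    return sum(x * p for x, p in pmf.items())

def variance(pmf):
    """V(X): probability-weighted average squared deviation from the mean."""
    mu = mean(pmf)
    return sum((x - mu) ** 2 * p for x, p in pmf.items())

# Number of heads in two coin flips, from the earlier example.
coin_pmf = {0: 0.25, 1: 0.5, 2: 0.25}

print(mean(coin_pmf))      # 1.0
print(variance(coin_pmf))  # 0.5
```

The standard deviation, if needed, is the square root of the variance.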
EXAMPLE

• There is a chance that a bit transmitted through a digital transmission channel is received in
error. Let X equal the number of bits in error in the next four bits transmitted. The possible
values for X are {0, 1, 2, 3, 4}. Based on the model for the errors presented in the following
section, probabilities for these values will be determined. Suppose that the probabilities are
BOARD WORK!!!
BOARDWORK

• Determine the mean and variance of the random variable in Exercise 3-16.
BINOMIAL PROBABILITY
DISTRIBUTION
A trial with only two possible outcomes is used so frequently as a building block of a random
experiment that it is called a Bernoulli trial. It is usually assumed that the trials that constitute
the random experiment are independent. This implies that the outcome from one trial has no
effect on the outcome to be obtained from any other trial. Furthermore, it is often reasonable to
assume that the probability of a success in each trial is constant.
BINOMIAL DISTRIBUTION

• A binomial random variable is the number of successes x in n repeated trials of a binomial
experiment. The probability distribution of a binomial random variable is called a binomial
distribution.
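For reference, the standard form of the binomial PMF and its mean and variance is given below. The symbols are the usual ones and are assumed here rather than taken from the slides: n is the number of independent Bernoulli trials, p is the constant probability of success in each trial, and x is the number of successes.

```latex
f(x) = P(X = x) = \binom{n}{x} \, p^{x} (1 - p)^{n - x}, \qquad x = 0, 1, \ldots, n

\mu = E(X) = np, \qquad \sigma^{2} = V(X) = np(1 - p)
```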
CUMULATIVE BINOMIAL PROBABILITY

• A cumulative binomial probability refers to the probability that the binomial random variable
falls within a specified range (e.g., is greater than or equal to a stated lower limit and less than
or equal to a stated upper limit).
EXAMPLE
• Each sample of water has a 10% chance of
containing a particular organic pollutant.
Assume that the samples are independent
with regard to the presence of the pollutant.
Find the probability that in the next 18
samples, exactly 2 contain the pollutant.
EXAMPLE
• Each sample of water has a 10% chance of
containing a particular organic pollutant.
Assume that the samples are independent
with regard to the presence of the pollutant.
Find the probability that in the next 18
samples, at least four samples contain the
pollutant.
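Both water-sample questions can be checked numerically with the binomial PMF, using n = 18 and p = 0.10 from the problem statement. This is a sketch for verification, not the lecture's worked solution, and the helper name binomial_pmf is an assumption. The "at least four" case uses the complement, 1 − P(X ≤ 3), to avoid summing fifteen terms.

```python
from math import comb

def binomial_pmf(x, n, p):
    """P(X = x) for a binomial random variable with n trials and success probability p."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)

n, p = 18, 0.10  # 18 independent samples, 10% chance of pollutant in each

# Exactly 2 of the 18 samples contain the pollutant.
p_exactly_2 = binomial_pmf(2, n, p)

# At least 4 samples contain the pollutant: 1 - P(X <= 3).
p_at_least_4 = 1 - sum(binomial_pmf(x, n, p) for x in range(4))

print(round(p_exactly_2, 3))   # 0.284
print(round(p_at_least_4, 3))  # 0.098
```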
BOARDWORK!!!!

• A multiple-choice test contains 25 questions, each with four answers. Assume a student just
guesses on each question. What is the probability the student answers less than five questions
correctly?
• Because not all airline passengers show up for their reserved seat, an airline sells 125 tickets for
a flight that holds only 120 passengers. The probability that a passenger does not show up is
0.10, and the passengers behave independently. What is the probability that every passenger
who shows up can take the flight?
POISSON PROBABILITY
DISTRIBUTION
A widely-used distribution emerges as the number of trials in a binomial
experiment increases to infinity while the mean of the distribution remains
constant.
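The limiting distribution described here is the Poisson distribution, whose PMF in standard form is f(x) = e^(−λ)·λ^x / x! for x = 0, 1, 2, ..., where λ is the constant mean. The sketch below illustrates the limit numerically: it holds the binomial mean np fixed at λ while n grows and compares the binomial probability with the Poisson probability. The values λ = 2 and x = 3 are hypothetical and were chosen only for illustration.

```python
from math import comb, exp, factorial

def binomial_pmf(x, n, p):
    """P(X = x) for a binomial random variable with n trials and success probability p."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)

def poisson_pmf(x, lam):
    """P(X = x) for a Poisson random variable with mean lam."""
    return exp(-lam) * lam**x / factorial(x)

lam, x = 2.0, 3                      # hypothetical mean and point of comparison
target = poisson_pmf(x, lam)

errors = {}
for n in (10, 100, 1000, 10000):
    p = lam / n                      # keeps the binomial mean n * p equal to lam
    errors[n] = abs(binomial_pmf(x, n, p) - target)
    print(n, errors[n])
```

The printed gap shrinks as n increases, which is the behavior the limit statement describes.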
