DA Unit-2 Probability and Statistical Methods

data analytics

Uploaded by

keerthi2005srinivas


UNIT: 2

 Sample space is the universal set that consists of all possible outcomes of
an experiment. Sample space is usually represented using the letter ‘S’
and individual outcomes are called the elementary events.
 The sample space can be finite or infinite.

 Definition: A sample space is the set of all possible outcomes of a random experiment.
 Example: Sample space = S = {all people in class}
A few random experiments and their sample spaces are discussed below:
Experiment 1 : Outcome of a football match
 Sample Space = S = {Win, Draw, Lose}
Experiment 2 : Predicting customer churn at an individual customer
level
 Sample Space = S = {Churn, No Churn}
Experiment 3: Predicting percentage of customer churn
 Sample Space = S = {X | X ∈ R, 0 ≤ X ≤ 100}, that is, X is a real number that can take any value between 0 and 100 percent.
Experiment 4: Life of a turbine blade used in an aircraft engine
 Sample Space = S = {X | X ∈ R, 0 ≤ X < ∞}, that is X is a real
number that can take any value between 0 and ∞.
 Example: When we flip a coin, the sample space is S = {H, T}, where
 H denotes that the coin lands "heads up" and
 T denotes that the coin lands "tails up".
 For a "fair coin" we expect H and T to have the same chance of occurring, i.e., if we flip the coin many times then about 50% of the outcomes will be H.
 We say that the probability of H is 0.5 (or 50%). The probability of T is then also 0.5.
Problem 1:
 When we roll a fair die, the sample space is S = {1, 2, 3, 4, 5, 6}.
 The probability that the die lands with k up is 1/6 for each k ∈ {1, 2, 3, 4, 5, 6}, so when we roll it 1200 times we expect a 5 up about 200 times.
 The probability that the die lands with an even number up is 1/6 + 1/6 + 1/6 = 1/2.
Problem 2:
 When we toss a coin 3 times and record the results in the sequence in which they occur, the sample space is
 S = {HHH, HHT, HTH, HTT, THH, THT, TTH, TTT}.
 Elements of S are "vectors", "sequences", or "ordered outcomes".
 We may expect each of the 8 outcomes to be equally likely. Thus the probability of the sequence HTT is 1/8.
 The probability that a sequence contains precisely two Heads is 1/8 + 1/8 + 1/8 = 3/8.
 Problem 3: When we toss a coin 3 times and record the results without paying attention to the order in which they occur, e.g., if we only record the number of Heads, then the sample space is
S = { {H, H, H}, {H, H, T}, {H, T, T}, {T, T, T} }
The outcomes in S are now sets; i.e., order is not important.
 Recall that the ordered outcomes are
 {HHH, HHT, HTH, HTT, THH, THT, TTH, TTT}.
Note that
 {H, H, H} corresponds to one of the ordered outcomes,
{H, H, T} corresponds to three of the ordered outcomes,
{H, T, T} corresponds to three of the ordered outcomes,
{T, T, T} corresponds to one of the ordered outcomes.
Thus {H, H, H} and {T, T, T} each occur with probability 1/8, while {H, H, T} and {H, T, T} each occur with probability 3/8.
 Definition: A probability event can be defined as a set of outcomes of an
experiment. In other words, an event in probability is the subset of the
respective sample space.
 Pick a person in this class at random.
 Sample space: = {all people in class}
 Event A: A = {all males in class}.
 Event B: B = {all females in class}.
 Thus, an event is a subset of the sample space, i.e., E is a subset of S.
 In the example above, event A occurs if the person we pick is male.
 The set of all possible outcomes of a random experiment is the sample space. The likelihood of occurrence of an event is known as its probability.
 The probability of occurrence of any event lies between 0 and 1.
Example:
 The sample space for the tossing of three coins simultaneously is given by:

 S = {(T , T , T) , (T , T , H) , (T , H , T) , (T , H , H ) , (H , T , T ) , (H , T , H) ,
(H , H, T) ,(H , H , H)}
 Suppose we want only the outcomes which have at least two heads; then the set of all such outcomes is:
 E = { (H , T , H) , (H , H ,T) , (H , H ,H) , (T , H , H)}

 Thus, an event is a subset of the sample space, i.e., E is a subset of S.

What is the Probability of Occurrence of an Event?

 The probability of occurrence of an event is defined as the ratio of the number of favorable outcomes to the total number of outcomes, when all outcomes are equally likely. So, the probability that an event will occur is given as:
 P(E) = Number of Favorable Outcomes / Total Number of Outcomes
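As an illustration of the ratio of favorable to total outcomes, here is a small Python sketch under the assumption that all outcomes are equally likely. The helper name `probability` is hypothetical and is used only for this example.

```python
from fractions import Fraction
from itertools import product

def probability(event, sample_space):
    """P(E) = favorable outcomes / total outcomes (equally likely outcomes)."""
    favorable = [o for o in sample_space if o in event]
    return Fraction(len(favorable), len(sample_space))

# Three coins tossed together: S has 8 outcomes such as ('H', 'T', 'H').
S = list(product("HT", repeat=3))

# Event E: at least two heads.
E = {o for o in S if o.count("H") >= 2}

p_E = probability(E, S)
print(len(E), "favorable outcomes out of", len(S), "->", p_E)
```

The same helper applies to a die: `probability({2, 4, 6}, [1, 2, 3, 4, 5, 6])` gives the probability of an even number.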
1. Simple Events
 Any event consisting of a single point of the sample space is known as a simple
event in probability. For example, if S = {56 , 78 , 96 , 54 , 89} and E = {78}
then E is a simple event.
2. Compound Events
 If an event consists of more than one point of the sample space, then it is called a compound event. Considering the same example again, if S = {56, 78, 96, 54, 89}, E1 = {56, 54} and E2 = {78, 56, 89}, then E1 and E2 represent two compound events.
3. Independent Events and Dependent Events
 If the occurrence of an event is completely unaffected by the occurrence of any other event, such events are known as independent events in probability, and events which are affected by other events are known as dependent events.
Examples of Independent Events :
 Tossing a coin twice
 Sample Space (S) for each toss = {H, T}
 Getting H on the first toss and getting H on the second toss are independent events, because the first result does not change the probabilities for the second. (Getting H and getting T on the same toss are mutually exclusive, not independent.)
4. Mutually Exclusive Events
 If the occurrence of one event excludes the occurrence of another event,
such events are mutually exclusive events i.e. two events don’t have any
common point.
 For example, if S = {1 , 2 , 3 , 4 , 5 , 6} and E1, E2 are two events such
that E1 consists of numbers less than 3 and E2 consists of numbers greater
than 4.
 So, E1 = {1,2} and E2 = {5,6} .
 Then, E1 and E2 are mutually exclusive.
5. Exhaustive Events
 A set of events is called exhaustive if all the events together cover the entire sample space.
Ex: Let us consider the experiment of throwing a die.
 Sample space S = {1, 2, 3, 4, 5, 6}
 Let A be the event of getting a number greater than 3,
 B be the event of getting a number greater than 2 but less than 5, and
 C be the event of getting a number less than 3.
 We can write these events as:
 A = {4, 5, 6}
 B = {3, 4}
 and C = {1, 2}
 We observe that
 A ⋃ B ⋃ C = {4, 5, 6} ⋃ {3, 4} ⋃ {1, 2} = {1, 2, 3, 4, 5, 6} = S
 Therefore, A, B, and C are called exhaustive events.
6. Complementary Events
 For any event E1 there exists another event E1′ which represents the remaining elements of the sample space S.
E1′ = S − E1
 If a die is rolled then the sample space S is given as S = {1, 2, 3, 4, 5, 6}. If event E1 represents all the outcomes which are greater than 4, then E1 = {5, 6} and E1′ = {1, 2, 3, 4}.
 Thus E1′ is the complement of the event E1.

Events Associated with “OR”

 If two events E1 and E2 are associated with OR, it means that either E1 or E2 or both occur. The union symbol (∪) is used to represent OR in probability.
 Thus, the event E1 ∪ E2 denotes E1 OR E2.
Events Associated with “AND”
 If two events E1 and E2 are associated with AND, it means the intersection of the elements common to both events. The intersection symbol (∩) is used to represent AND in probability.
 Thus, the event E1 ∩ E2 denotes E1 AND E2.
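The event types above map directly onto set operations. The sketch below uses Python's built-in `set` with the die examples from this section; it is an illustration only, and the variable names are not from the original notes.

```python
# Sample space for one roll of a die.
S = {1, 2, 3, 4, 5, 6}

# Mutually exclusive: E1 (less than 3) and E2 (greater than 4) share no outcome.
E1 = {x for x in S if x < 3}
E2 = {x for x in S if x > 4}
mutually_exclusive = E1.isdisjoint(E2)

# Exhaustive: A, B and C together cover the whole sample space.
A, B, C = {4, 5, 6}, {3, 4}, {1, 2}
exhaustive = (A | B | C) == S

# Complement: everything in S that is not in the event.
E2_complement = S - E2

# OR is the union, AND is the intersection.
A_or_B = A | B
A_and_B = A & B

print(mutually_exclusive, exhaustive, E2_complement, A_or_B, A_and_B)
```

Note that A and B are exhaustive together with C but are not mutually exclusive, since both contain 4.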
Measures of probability :
 A probability measure assigns probabilities to sets of experimental outcomes (events). It is a function on a collection of events that assigns a probability between 0 and 1 to every event, subject to certain conditions.
Probability Measure Examples
 For a roll of one six-faced die, the sample space is S = {1, 2, 3, 4, 5, 6}.
 If A = {1, 3, 5} is the event that the roll is odd, then P(A) = 1/2.
1. AXIOMS OF PROBABILITY:
According to axiomatic theory of probability, the probability of an event
E satisfies the following axioms:
1. The probability of event E always lies between 0 and 1. That is, 0 ≤
P(E) ≤ 1.
2. The probability of the universal set S is 1. That is, P(S) = 1.
3. P(X ∪Y) = P(X) + P(Y), where X and Y are two mutually exclusive
events.
 The following elementary rules of probability are directly deduced from the
original three axioms of probability, using the set theory relationships:
1. For any event A, the probability of the complementary event, written A^C, is given by
P(A^C) = 1 − P(A)
 If P(A) is the probability of observing a fraudulent transaction at an e-commerce portal, then P(A^C) is the probability of observing a genuine transaction.
2. The probability of an empty or impossible event, ∅, is zero:
 P(∅) = 0
3. If the occurrence of an event A implies that an event B also occurs, so that A is a subset of B, then the probability of A is less than or equal to the probability of B:
 P(A) ≤ P(B)
 If A is the set of students with a CGPA (cumulative grade point average) of more than 3.5 out of 4 and B is the set of students with a CGPA of more than 3.0, then P(A) ≤ P(B).
4. The probability that either event A or event B occurs, or both occur, is given by
P(A ∪ B) = P(A) + P(B) − P(A ∩ B)
5. If A and B are mutually exclusive events, so that P(A ∩ B) = 0, then
P(A ∪ B) = P(A) + P(B)
6. If A1, A2, …, An are n events that form a partition of the sample space S, then their probabilities must add up to 1:
P(A1) + P(A2) + … + P(An) = 1
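The elementary rules can be verified numerically on a small example. The sketch below assumes a fair die; the events A (even number) and B (number greater than 3) are hypothetical choices made for this illustration and do not appear in the original notes.

```python
from fractions import Fraction

S = {1, 2, 3, 4, 5, 6}

def P(event):
    """Probability of an event for a fair die (equally likely outcomes)."""
    return Fraction(len(event & S), len(S))

A = {2, 4, 6}  # hypothetical event: even number
B = {4, 5, 6}  # hypothetical event: number greater than 3

complement_rule = P(S - A) == 1 - P(A)
empty_rule = P(set()) == 0
subset_rule = P({6}) <= P(A)  # {6} is a subset of A
addition_rule = P(A | B) == P(A) + P(B) - P(A & B)
partition_rule = P({1, 2}) + P({3, 4}) + P({5, 6}) == 1

print(complement_rule, empty_rule, subset_rule, addition_rule, partition_rule)
```

Each flag evaluates one rule; a check on one example is not a proof, only a sanity test of the formulas.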

Joint Probability :
Let A and B be two events in a sample space. Then the joint probability of the two events, written as P(A ∩ B), is given by
$$P(A \cap B) = \frac{\text{number of observations in } A \cap B}{\text{total number of observations}}$$
 For example, with 1000 observations in total:
P(Divorced ∩ Default) = 13/1000 = 0.013
P(Single ∩ Default) = 42/1000 = 0.042
P(Divorced) = 50/1000 = 0.05
P(Single) = 300/1000 = 0.3
1. A bag contains 5 white and 4 red balls. Two balls are drawn from the bag one after the other without replacement. Consider the following events.
A = Drawing a white ball in the first draw
B = Drawing a red ball in the second draw
Sol: P(B/A) = probability of drawing a red ball in the second draw given that a white ball has already been drawn in the first draw
= probability of drawing a red ball from a bag containing 4 white and 4 red balls
= 4/8 = 1/2
For this random experiment P(A/B) has no direct sequential interpretation, because A cannot occur after the occurrence of event B; it can still be computed from the definition as P(A ∩ B)/P(B) = (5/18)/(4/9) = 5/8.
2. A die is thrown twice and the sum of the numbers appearing is observed to be 6. What is the conditional probability that the number 4 has appeared at least once?
Sol: Let
A = the sum of the numbers appearing is 6
 = {(1,5), (2,4), (3,3), (4,2), (5,1)}, so n(A) = 5 and P(A) = 5/36
B = the number 4 has appeared at least once
 = {(1,4), (4,1), (2,4), (4,2), (3,4), (4,3), (4,4), (4,5), (5,4), (4,6), (6,4)}
A ∩ B = {(2,4), (4,2)}, so n(A ∩ B) = 2 and P(A ∩ B) = 2/36
Required probability = P(B/A) = P(A ∩ B)/P(A) = (2/36)/(5/36) = 2/5
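The two-dice conditional probability can be confirmed by listing all 36 outcomes. This is a minimal sketch, assuming both dice are fair; the names are illustrative.

```python
from fractions import Fraction
from itertools import product

# All 36 ordered outcomes of throwing a die twice.
S = list(product(range(1, 7), repeat=2))

A = [o for o in S if sum(o) == 6]    # sum of the two numbers is 6
B = [o for o in S if 4 in o]         # the number 4 appears at least once
A_and_B = [o for o in A if 4 in o]   # both conditions hold

p_A = Fraction(len(A), len(S))
p_A_and_B = Fraction(len(A_and_B), len(S))
p_B_given_A = p_A_and_B / p_A

print(len(A), len(B), len(A_and_B), p_B_given_A)
```

Dividing the two probabilities gives the same value as dividing the counts n(A ∩ B) / n(A), because the common denominator 36 cancels.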
Question 3:
 Fifteen cards are numbered from 1 to 15, and two cards are chosen at random such that the sum of the numbers on both the cards is even. Find the probability that the chosen cards are odd-numbered.
 Let A ≡ event of selecting two odd-numbered cards
 B ≡ event of selecting cards whose sum is even.
Sol: The sum is even when both cards are odd (8 odd cards) or both are even (7 even cards). Then,
 n(B) = number of ways of choosing two numbers whose sum is even = 8C2 + 7C2 = 28 + 21 = 49.
 n(A ∩ B) = number of ways of choosing odd-numbered cards such that their sum is even = 8C2 = 28.
 Now, P(A|B) = P(A ∩ B)/P(B) = n(A ∩ B)/n(B)
 = 8C2 / (8C2 + 7C2) = 28/49 = 4/7.
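A brute-force check of the card problem is sketched below. It assumes the cards are numbered 1 to 15, which is what the counts 8C2 and 7C2 in the solution imply.

```python
from fractions import Fraction
from itertools import combinations

cards = range(1, 16)  # assumed: cards numbered 1 to 15 (8 odd, 7 even)

# Every unordered pair of distinct cards.
pairs = list(combinations(cards, 2))

B = [p for p in pairs if sum(p) % 2 == 0]                   # sum is even
A_and_B = [p for p in B if p[0] % 2 == 1 and p[1] % 2 == 1]  # both cards odd

p_A_given_B = Fraction(len(A_and_B), len(B))
print(len(B), len(A_and_B), p_A_given_B)
```

Because every pair is equally likely, the conditional probability reduces to a ratio of counts.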
 Bayes’ theorem is one of the most important concepts in analytics since several problems are solved using Bayesian statistics. Consider two events A and B. We can write the following two conditional probabilities:
$$P(A \mid B) = \frac{P(A \cap B)}{P(B)}, \qquad P(B \mid A) = \frac{P(A \cap B)}{P(A)}$$
 Using the two equations, we can show that
$$P(B \mid A) = \frac{P(A \mid B)\,P(B)}{P(A)}$$
 Bayes’ theorem helps data scientists to update the probability of an event (B) when additional information is provided.
 The following terminologies are used to describe various
components:
1. P(B) is called the prior probability (estimate of the probability
without any additional information).
2. P(B|A) is called the posterior probability (that is, given that the
event A has occurred, what is the probability of occurrence of event
B). That is, post the additional information (or additional evidence)
that A has occurred, what is estimated probability of occurrence of B.
3. P(A|B) is called the likelihood of observing evidence A if B is true.
4. P(A) is the prior probability of A.
 A great example of how poorly humans reason about probability when making decisions is the famous Monty Hall problem, in which the contestants of a game show are shown three doors. Behind one of the doors is an expensive item (such as a car or gold), while there are inexpensive items (such as a goat) behind the remaining two doors.
 The contestant is asked to choose one of the doors. Assume that the
contestant chooses door 1; the game host would then open one of
the remaining two doors. Assume that the game host opens door 2,
which has a goat behind it. Now the contestant is given a chance to
change his initial choice (from door 1 to door 3).
 In this problem, the contestant — the decision maker — has two
choices: he/she can either change his/her initial choice or stick with
his/her initial choice.
 Let C1 , C2 , and C3 be the events that the car is behind door 1, 2, and 3,
respectively. Let D1 , D2 , and D3 be the events that Monty opens door 1, 2,
and 3, respectively.
 Prior probabilities of C1 , C2 , and C3 are P(C1 ) = P(C2 ) = P(C3 ) = 1/3
 Assume that the player has chosen door 1 and Monty opens door 2 to reveal a
goat.
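 The posterior probability P(C1|D2) follows from Bayes’ theorem. Given that the player chose door 1, and assuming Monty picks at random when both remaining doors hide goats, P(D2|C1) = 1/2, P(D2|C2) = 0 and P(D2|C3) = 1, so
$$P(C_1 \mid D_2) = \frac{P(D_2 \mid C_1)\,P(C_1)}{\sum_{i=1}^{3} P(D_2 \mid C_i)\,P(C_i)} = \frac{\tfrac{1}{2}\cdot\tfrac{1}{3}}{\tfrac{1}{2}\cdot\tfrac{1}{3} + 0\cdot\tfrac{1}{3} + 1\cdot\tfrac{1}{3}} = \frac{1}{3}$$
 Similarly P(C3|D2) = 2/3, so switching to door 3 doubles the chance of winning the car.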
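The Monty Hall posterior can be computed exactly with Bayes' theorem. The sketch below assumes the player has chosen door 1 and that Monty, who never opens the player's door or the car's door, picks at random when he has two goat doors to choose from. That tie-breaking rule is an assumption of this example.

```python
from fractions import Fraction

# Prior probabilities P(C1), P(C2), P(C3) that the car is behind each door.
prior = {1: Fraction(1, 3), 2: Fraction(1, 3), 3: Fraction(1, 3)}

# Likelihoods P(D2 | Ci): chance Monty opens door 2 when the car is behind door i.
likelihood_D2 = {
    1: Fraction(1, 2),  # car behind the player's door: Monty picks door 2 or 3 at random
    2: Fraction(0),     # Monty never reveals the car
    3: Fraction(1),     # door 2 is the only goat door left to open
}

# Total probability of the evidence D2.
evidence = sum(likelihood_D2[c] * prior[c] for c in prior)

# Posterior P(Ci | D2) for each door.
posterior = {c: likelihood_D2[c] * prior[c] / evidence for c in prior}
print(posterior)
```

The posterior for door 3 is twice that for door 1, which is why switching is the better strategy under these assumptions.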
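 Generalization of Bayes’ Theorem: If B1, B2, …, Bn are mutually exclusive and exhaustive events and A is any event with P(A) > 0, then
$$P(B_i \mid A) = \frac{P(A \mid B_i)\,P(B_i)}{\sum_{j=1}^{n} P(A \mid B_j)\,P(B_j)}$$
 Three machines A, B and C produce identical items. Of their respective outputs, 5%, 4% and 3% of the items are defective. On a certain day A has produced 25% of the total output, B has produced 30% and C the balance. An item is selected at random and is found defective. What is the probability that it was produced by the machine with the greatest output?
Sol: Let E1, E2, E3 denote the events that an item selected at random is manufactured by machines A, B and C respectively, and let D be the event of its being defective. Then we have P(E1) = 25/100, P(E2) = 30/100, P(E3) = 45/100.
The probability of drawing a defective item manufactured by machine A is
P(D/E1) = 5/100 = 0.05
Similarly P(D/E2) = 4% = 0.04 and P(D/E3) = 3% = 0.03.
The machine with the greatest output is C, so the required probability is
P(E3/D) = P(D/E3)P(E3) / [P(D/E1)P(E1) + P(D/E2)P(E2) + P(D/E3)P(E3)]
= 0.0135 / (0.0125 + 0.0120 + 0.0135) = 0.0135/0.038 = 27/76 ≈ 0.355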
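The machine problem is a direct application of the generalized Bayes formula. Here is a minimal sketch with exact fractions; the dictionary names are illustrative and not from the original notes.

```python
from fractions import Fraction

# Share of the day's output per machine: P(E1), P(E2), P(E3).
share = {"A": Fraction(25, 100), "B": Fraction(30, 100), "C": Fraction(45, 100)}

# Defect rate per machine: P(D | Ei).
defect_rate = {"A": Fraction(5, 100), "B": Fraction(4, 100), "C": Fraction(3, 100)}

# Total probability that a randomly selected item is defective.
p_defective = sum(share[m] * defect_rate[m] for m in share)

# Posterior probability that a defective item came from each machine.
posterior = {m: share[m] * defect_rate[m] / p_defective for m in share}

largest = max(share, key=share.get)  # machine with the greatest output
print(largest, posterior[largest], float(posterior[largest]))
```

Although machine C has the lowest defect rate, it still accounts for the largest share of defective items because it produces the most.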
 A random variable is any function that assigns a numerical value to each possible
outcome.
 The numerical value should be chosen to quantify an important characteristic of the
outcome. Random variables are denoted by capital letters X,Y, and so on, to
distinguish them from their possible values given in lowercase x, y.
Ex:
 Suppose that a coin is tossed twice so that the sample space is S = {HH, HT, TH, TT}. Let X represent the number of heads that come up. For example, in the case of HH (i.e., 2 heads), X = 2, while for TH (1 head), X = 1. It follows that X is a random variable.

Outcome    HH  HT  TH  TT
X           2   1   1   0
 Random variables can be classified as discrete or continuous depending on the values that
the random variable can take.
Discrete Random Variables :
 A random variable that takes a finite or countably infinite number of values is known as a discrete random variable.
 Ex: i) Marks obtained by a student in a test
ii) Number of defective nuts in a lot
iii) The number of cars that pass through a given intersection in an hour
iv) Number of errors on a page of a book
v) Number of accidents taking place on a busy road
 For the roll of a die, X is the number shown, so X takes values in {1, 2, 3, 4, 5, 6}.
 Another popular example of a discrete random variable is the number of heads when tossing two coins. In this case, the random variable X can take only one of three values: 0, 1, or 2.
Continuous Random variable :
 A random variable which can take any value in an interval is called a continuous random variable.
 Examples: i) Waiting time for a bus
ii) Weight and height of students
Generally, discrete random variables represent counted data while continuous random variables represent measured data.
Probability Mass Function and Cumulative Distribution Function of a
Discrete Random Variable :
 For a discrete random variable, the probability that the random variable X takes a specific value xi, P(X = xi), is called the probability mass function P(xi).
Probability Mass Function :
 For the number of heads X in two coin tosses, P(X = 0) = 1/4, P(X = 1) = 1/2 and P(X = 2) = 1/4, so the probabilities add up to 1:
P(X = 0) + P(X = 1) + P(X = 2)
= 1/4 + 1/2 + 1/4
= 1
 The cumulative distribution function, F(xi), is the probability that the random variable X takes a value less than or equal to xi. That is, F(xi) = P(X ≤ xi).
 From the above problem, P(X ≤ 1), the probability that the number of heads is less than or equal to one, is
 F(1) = P(X = 0) + P(X = 1)
= 1/4 + 1/2
= 0.75
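The PMF and CDF of the number of heads in two coin tosses can be built by enumeration. This is a minimal sketch assuming a fair coin; the names `pmf` and `cdf` are illustrative.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Sample space for two tosses: HH, HT, TH, TT.
outcomes = list(product("HT", repeat=2))

# X = number of heads; count how many outcomes give each value of X.
counts = Counter(o.count("H") for o in outcomes)

# Probability mass function P(X = x).
pmf = {x: Fraction(c, len(outcomes)) for x, c in sorted(counts.items())}

def cdf(x):
    """Cumulative distribution function F(x) = P(X <= x)."""
    return sum(p for value, p in pmf.items() if value <= x)

print(pmf)
print(cdf(0), cdf(1), cdf(2))
```

The CDF is a running total of the PMF, so it never decreases and reaches 1 at the largest possible value of X.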
 Example 2:
 The Cumulative Distribution Function (CDF) is another important concept in
probability theory and statistics, especially when dealing with random variables, whether
discrete or continuous. The CDF provides the probability that a random variable X takes
on a value less than or equal to a specific point x.
 The cumulative distribution function is denoted by F(x) and its formula is given by:
 F(x)=P(X≤x)
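 Probability Density Function and Cumulative Distribution Function of a Continuous Random Variable :
 For a continuous random variable, probabilities are given by areas under the probability density function f(x), and F(x) = P(X ≤ x) is the area under f up to x.
(Figure: PDF and CDF of a continuous random variable; not reproduced here.)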

 A probability distribution is a mathematical function that describes the probabilities of the possible values of a random variable. Probability distributions are often depicted using graphs or probability tables.
 Example: Probability distribution
 We can describe the probability distribution of one coin flip using a probability table:
Outcome    Probability
Heads      0.5
Tails      0.5
Probability distributions are classified into two types.
1) Discrete probability distribution
2) Continuous probability distribution
 A distribution is said to be discrete if the values taken by the corresponding random variable are discrete, whereas a distribution is said to be continuous if the random variable can take any value in a specified interval.
 In this chapter we discuss the following distributions:
1) Discrete probability distributions
a) Binomial Distribution
b) Poisson Distribution
c) Geometric Distribution
d) Bernoulli Distribution
2) Continuous probability distributions
a) Normal Distribution
b) Exponential Distribution
c) Weibull Distribution
Binomial Distribution :
 Binomial distribution is one of the most important discrete probability distributions due to its applications in several contexts. A random variable X is said to follow a binomial distribution when
1. Each trial can have only two outcomes, success and failure (such trials are known as Bernoulli trials).
2. The objective is to find the probability of getting k successes out of n trials.
3. The probability of success is p and thus the probability of failure is (1 − p).
4. The probability p is constant and does not change between trials.
5. Success and failure are generic terminologies used in binomial distribution;
based on the context, the interpretation will change (winning a lottery can be
considered as success and not winning as failure).
Probability Mass Function (PMF) of Binomial Distribution : The PMF of the binomial distribution (the probability that the number of successes will be exactly x out of n trials) is given by
$$P(X = x) = \binom{n}{x} p^x (1-p)^{n-x}, \quad x = 0, 1, 2, \ldots, n$$
where
$$\binom{n}{x} = \frac{n!}{x!\,(n-x)!}$$
 In Microsoft Excel, the function ‘BINOM.DIST(x, n, p, false)’ can be used for calculating the probability mass function of a binomial distribution.
Cumulative Distribution Function (CDF) of Binomial Distribution : The CDF of a binomial distribution, F(a), representing the probability that the random variable X takes a value less than or equal to a, is given by
$$F(a) = P(X \le a) = \sum_{x=0}^{a} \binom{n}{x} p^x (1-p)^{n-x}$$
 In Microsoft Excel, the function ‘BINOM.DIST(x, n, p, true)’ can be used for calculating the cumulative distribution function of a binomial distribution.
Mean and Variance of Binomial Distribution:
 The mean of a binomial distribution is given by
$$E(X) = np$$
 The variance of a binomial distribution is given by
$$\mathrm{Var}(X) = np(1-p)$$
 Approximation of Binomial Distribution using Normal Distribution: If the number of trials (n) in a binomial distribution is large, then it can be approximated by a normal distribution with mean np and variance npq, where q = 1 − p.
Binomial Probability:
 Let X be a binomial random variable. Then, its probability mass function is:
$$P(X = x) = \binom{n}{x} p^x (1-p)^{n-x}$$
 for x = 0, 1, 2, . . . , n. The values of n and p are called the parameters of the distribution.
Ex:
Consider an exam that contains 10 multiple-choice questions
with 4 possible choices for each question, only one of which
is correct.
 Suppose a student is to select the answer for every question randomly.
Let X be the number of questions the student answers correctly. Then,
X has a binomial distribution with parameters n = 10 and p = 0.25.
(Convince yourself that all assumptions for a binomial distribution are
reasonable in this setting.)
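 What is the probability for the student to get no answer correct?
Answer: P(X = 0) = 10C0 (0.25)^0 (0.75)^10 ≈ 0.0563
 What is the probability for the student to get two answers correct?
Answer: P(X = 2) = 10C2 (0.25)^2 (0.75)^8 ≈ 0.2816
 What is the probability for the student to fail the test (i.e., to have less than 6 correct answers)?
Answer: P(X ≤ 5) = P(X = 0) + P(X = 1) + … + P(X = 5) ≈ 0.9803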
 Binomial Mean and Variance:
 Mean= np
 Variance=np(1-p)
Binomial Mean E(X) = 10 * 0.25 = 2.5.
Variance V (X) = 10 * (0.25) * (1 − 0.25) = 1.875.
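The exam example can be reproduced with a few lines of Python. This is a minimal sketch using only the standard library; `binom_pmf` and `binom_cdf` are hypothetical helper names, not functions from any package.

```python
from math import comb

def binom_pmf(x, n, p):
    """P(X = x) for a binomial random variable with parameters n and p."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)

def binom_cdf(a, n, p):
    """P(X <= a): sum of the PMF from 0 to a."""
    return sum(binom_pmf(x, n, p) for x in range(a + 1))

n, p = 10, 0.25  # 10 questions, 4 choices each, random guessing

p_none = binom_pmf(0, n, p)   # no answer correct
p_two = binom_pmf(2, n, p)    # exactly two answers correct
p_fail = binom_cdf(5, n, p)   # fewer than 6 correct answers

mean = n * p
variance = n * p * (1 - p)

print(round(p_none, 4), round(p_two, 4), round(p_fail, 4), mean, variance)
```

These helpers play the same role as Excel's `BINOM.DIST` with the last argument set to false (PMF) or true (CDF).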
 Poisson Distribution
 Poisson Distribution is a Probability distribution that is used to show how many times
an event occurs over a specific period.
 It is the discrete probability distribution of the number of events occurring in a given
time period, given the average number of times the event occurs over that time
period. It is the distribution related to probabilities of events that are extremely rare
but have a large number of independent opportunities for occurrence.
 Poisson Distribution Definition
 Poisson distribution is used to model the number of events that occur in a fixed
interval of time or space, given the average rate of occurrence, assuming that the
events happen independently and at a constant rate
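Poisson distribution formula
 The probability of observing exactly k events when the average number of events in the interval is λ is
$$P(X = k) = \frac{e^{-\lambda}\,\lambda^{k}}{k!}, \quad k = 0, 1, 2, \ldots$$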
Mean and Variance of Poisson distribution:
 The Poisson distribution has only one parameter, called λ.

 The mean of a Poisson distribution is λ or (µ)

 The variance of a Poisson distribution is also λ or (σ²)

 In most distributions, the mean is represented by µ (mu) and the variance
is represented by σ² (sigma squared). Because these two parameters are
the same in a Poisson distribution, we use the λ symbol to represent
both.
1. An average of 0.61 soldiers died by horse kicks per year in each Prussian army corps. You want to calculate the probability that exactly two soldiers died in the VII Army Corps in 1898, assuming that the number of horse kick deaths per year follows a Poisson distribution.
Sol: Here λ = 0.61 and k = 2, so
$$P(X = 2) = \frac{e^{-0.61}\,(0.61)^{2}}{2!} \approx 0.101$$
2. The number of typographical errors in a “big” textbook is Poisson distributed with a mean of 1.5 per 100 pages.
 Suppose 100 pages of the book are randomly selected. What is the probability that there are no typos?
Sol: Here λ = 1.5, so P(X = 0) = e^(−1.5) ≈ 0.2231.
 Suppose 400 pages of the book are randomly selected. What are the probabilities for having no typos and for having five or fewer typos?
Sol: For 400 pages the mean is λ = 4 × 1.5 = 6, so P(X = 0) = e^(−6) ≈ 0.0025 and
$$P(X \le 5) = \sum_{k=0}^{5} \frac{e^{-6}\,6^{k}}{k!} \approx 0.4457$$
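The Poisson examples can be evaluated numerically as follows. This is a minimal standard-library sketch; `poisson_pmf` and `poisson_cdf` are hypothetical helper names.

```python
from math import exp, factorial

def poisson_pmf(k, lam):
    """P(X = k) for a Poisson random variable with mean lam."""
    return exp(-lam) * lam**k / factorial(k)

def poisson_cdf(k, lam):
    """P(X <= k): sum of the PMF from 0 to k."""
    return sum(poisson_pmf(i, lam) for i in range(k + 1))

# Horse kicks: lambda = 0.61 deaths per corps per year, exactly two deaths.
p_two_deaths = poisson_pmf(2, 0.61)

# Typos: 1.5 per 100 pages, so 400 pages have mean 4 * 1.5 = 6.
p_no_typos_100 = poisson_pmf(0, 1.5)
p_no_typos_400 = poisson_pmf(0, 6.0)
p_at_most_5_400 = poisson_cdf(5, 6.0)

print(round(p_two_deaths, 4), round(p_no_typos_100, 4),
      round(p_no_typos_400, 4), round(p_at_most_5_400, 4))
```

Scaling the mean with the size of the interval (here, the number of pages) relies on the constant-rate assumption stated in the definition.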
NORMAL DISTRIBUTION (GAUSSIAN DISTRIBUTION) :
 The normal distribution is the most widely known and used of all
distributions. Because the normal distribution approximates many natural
phenomena so well, it has developed into a standard of reference for many
probability problems.
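 Let X be a continuous random variable. X is said to follow a normal distribution if its probability density function is given by
$$f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{(x-\mu)^{2}}{2\sigma^{2}}}, \quad -\infty < x < \infty$$
 Here μ and σ are the mean and standard deviation of X.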

Properties Of Normal Distribution :
 It is a two-parameter distribution, where the parameter μ is the mean (location parameter) and the parameter σ is the standard deviation (scale parameter).
1. Normal curve is always centered at mean
2. Mean, median and mode coincide (i.e., equal)
3. It is unimodal.
4. It is a symmetrical, bell-shaped curve.
5. The x-axis is an asymptote to the normal curve.
6. The total area under the normal curve from −∞ to ∞ is 1.
7. The points of inflection of the normal curve are at μ ± σ.
8. The area under the normal curve between
μ − σ and μ + σ = 68.27%
μ − 2σ and μ + 2σ = 95.45%
μ − 3σ and μ + 3σ = 99.73%
 Standard Normal Variable: A normal random variable with mean 0 and variance 1 is called a standard normal variable.
Standard Normal Distribution :
 The normal distribution with mean 0 and variance 1 is called the standard normal distribution, and its probability density function is defined by
$$\phi(z) = \frac{1}{\sqrt{2\pi}}\, e^{-z^{2}/2}, \quad -\infty < z < \infty$$
 By using the following transformation, any normal random variable X can be converted into a standard normal variable:
$$Z = \frac{X - \mu}{\sigma}$$
 Equivalently, the random variable X can be written in terms of a standard normal random variable using the relationship
$$X = \mu + \sigma Z$$
 Thus, any normal random variable X can be expressed using the standard normal random variable Z.
Solved Examples
1. Calculate the value of the normal probability density function at x = 5 for a normal distribution with population mean 2 and standard deviation 3.
Solution:
x = 5
Mean = μ = 2
Standard Deviation = σ = 3
Substituting into the normal probability density formula:
$$f(5) = \frac{1}{3\sqrt{2\pi}}\, e^{-\frac{(5-2)^{2}}{2\cdot 3^{2}}} = \frac{1}{3\sqrt{2\pi}}\, e^{-1/2} \approx 0.0807$$
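The solved example and the area percentages for the normal curve can be checked in Python. The sketch below writes the density formula by hand and compares it with `statistics.NormalDist` from the standard library (Python 3.8 or later); `normal_pdf` and `z_score` are illustrative helper names.

```python
from math import exp, pi, sqrt
from statistics import NormalDist

def normal_pdf(x, mu, sigma):
    """Density of a normal distribution with mean mu and standard deviation sigma."""
    return exp(-((x - mu) ** 2) / (2 * sigma**2)) / (sigma * sqrt(2 * pi))

def z_score(x, mu, sigma):
    """Standardize x: Z = (X - mu) / sigma."""
    return (x - mu) / sigma

# Solved example: x = 5, mean 2, standard deviation 3.
density_at_5 = normal_pdf(5, mu=2, sigma=3)
library_density = NormalDist(mu=2, sigma=3).pdf(5)
z = z_score(5, mu=2, sigma=3)

# Area within k standard deviations of the mean, from the standard normal CDF.
std = NormalDist()  # mean 0, standard deviation 1
within = {k: std.cdf(k) - std.cdf(-k) for k in (1, 2, 3)}

print(round(density_at_5, 4), z, {k: round(v, 4) for k, v in within.items()})
```

The value at x = 5 is a density, not a probability; probabilities for a continuous variable come from areas, as in the `within` dictionary.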
SAMPLING
 Definition: A portion of the population which is examined with a
view to determining the population characteristics is called a
sample.
 In other words, sample is a subset of population. Size of the sample
is denoted by n. The process of selection of a sample is called
Sampling.
 There are different methods of sampling
Probability Sampling Methods
Non-Probability Sampling Methods
Probability Sampling Methods :
a) Random Sampling (Probability Sampling): It is the process of drawing a sample from a
population in such a way that each member of the population has an equal chance of being included in
the sample.
Example: A hand of cards from a well shuffled pack of cards is a random sample.
Note: If N is the size of the population and n is the size of the sample, then
The number of samples with replacement = N^n
The number of samples without replacement = NCn
b) Stratified Sampling : In this method, the population is first divided into several smaller groups called strata according to some relevant characteristic.
 From each stratum a sample is selected at random, and all these samples are combined to form the stratified sample.
c) Cluster Sampling :
 In cluster sampling, the population is divided into mutually exclusive clusters.
 For example, assume that a researcher is interested in analyzing life of smart phone batteries from a
specific manufacturer. The manufacturer may have different models (each model in this case will be a
cluster).
d) Systematic Sampling (Quasi Random Sampling): In this method, all the units of the population are arranged in some order. If the population size is N and the sample size is n, then we first define the sampling interval k = N/n. A starting unit is chosen at random from the first k units, and every k-th unit after it is included in the sample.
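The probability sampling methods can be sketched with Python's `random` module. Everything specific here is an assumption made for illustration: the population of 100 numbered units, the two strata, the proportional allocation rule and the fixed seed.

```python
import random

rng = random.Random(0)            # fixed seed so the sketch is repeatable
population = list(range(1, 101))  # hypothetical unit labels 1..100 (N = 100)
n = 10                            # desired sample size

# a) Simple random sampling without replacement.
simple_sample = rng.sample(population, n)

# b) Stratified sampling: sample within each stratum, then combine.
strata = {"low": population[:60], "high": population[60:]}  # hypothetical strata
stratified_sample = []
for units in strata.values():
    share = n * len(units) // len(population)  # proportional allocation (assumed)
    stratified_sample += rng.sample(units, share)

# d) Systematic sampling: interval k = N/n, random start, then every k-th unit.
k = len(population) // n
start = rng.randrange(k)
systematic_sample = population[start::k]

print(simple_sample)
print(stratified_sample)
print(systematic_sample)
```

Cluster sampling would follow the same pattern as the stratified case, except that whole clusters are selected at random instead of sampling inside every group.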
Non Probability Sampling Methods:
 Sample units are selected based on convenience and/or on voluntary basis.
Ex: Assume that a data scientist is interested in studying attrition and factors
influencing attrition. For this study, he/she may collect data from his friends and
colleagues which may not be true representation of the population. Such
sampling procedures come under the category of non-probability sampling.
Convenience Sampling :
Convenience sampling is a non-probability sampling technique in which the sample units are not selected according to a probability distribution. For example, a researcher may collect data from his/her school or workplace and from friends, since the cost of data collection in such cases is minimal. Convenience sampling is not recommended since it is likely to result in biased estimates.
Voluntary Sampling : Under voluntary sampling the data is collected from people who volunteer for such data collection. For example, customer feedback in many contexts falls under this sampling procedure. Many organizations such as Amazon and Trip Advisor collect customer feedback. There could be bias in voluntary sampling: often the feedback is provided by customers who had a bad experience with the product/service, while many customers who were happy with the product/service may not give feedback.
Purposive (Judgment) Sampling : In this method, the members constituting the sample are chosen not according to some definite scientific procedure, but according to the convenience and personal choice of the individual who selects the sample. The choice of the individual items of a sample depends entirely on the judgment of the investigator.
Sequential Sampling: It consists of a sequence of samples drawn one after another from the population. If the result of the first sample is not acceptable, then a second sample is drawn, and the process continues until a proper decision can be taken. If the first sample is acceptable, then no new sample is drawn.
Classification of Samples:
 Large Samples : If the size of the sample n ≥ 30 , then it is said to
be large sample.
 Small Samples : If the size of the sample n < 30 ,then it is said to
be small sample or exact sample.
Parameters and Statistics:
 Parameter is a statistical measure based on all the units of a
population.
 Statistic is a statistical measure based on only the units selected in a
sample.
 Note: In this unit, Parameter refers to the population and Statistic
refers to sample.
SAMPLING DISTRIBUTION
 Sampling distribution refers to the probability distribution of a
statistic such as sample mean and sample standard deviation
computed from several random samples of same size.
 Understanding the sampling distribution is important for
hypothesis testing. Test statistic in hypothesis testing is derived
based on the knowledge of sampling distribution.
 In this example, the population is the weight of six pumpkins (in pounds) displayed in a carnival "guess the weight" game booth. You are asked to guess the average weight of the six pumpkins by taking a random sample without replacement from the population.
Since we know the weights from the population, we can find the population
mean.

To demonstrate the sampling distribution, let’s start with obtaining all of the
possible samples of size n=2 from the populations, sampling without
replacement. The table below shows all the possible samples, the weights for the
chosen pumpkins, the sample mean and the probability of obtaining each sample.
 The mean of the sample means is:
 9.5(1/15) + 11.5(1/15) + 12(2/15) + 12.5(1/15) + 13(1/15) + 13.5(1/15) + 14(1/15) + 14.5(2/15) + 15.5(1/15) + 16(1/15) + 16.5(1/15) + 17(1/15) + 18(1/15)
 = 14
 Now, let's do the same thing as above but with sample size n=5
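The sampling distribution of the mean can be enumerated exactly for a small population. The table of pumpkin weights is not reproduced in these notes, so the six weights below are an assumption: they were chosen because they reproduce the sample means listed for n = 2 (9.5, 11.5, ..., 18) and a mean of 14. The sketch handles both n = 2 and n = 5.

```python
from fractions import Fraction
from itertools import combinations

# ASSUMED weights in pounds; not given in the notes, chosen to match
# the sample means listed for n = 2.
weights = [19, 14, 15, 9, 10, 17]

def sample_means(values, n):
    """Means of every possible sample of size n drawn without replacement."""
    return [Fraction(sum(s), n) for s in combinations(values, n)]

population_mean = Fraction(sum(weights), len(weights))

means_n2 = sample_means(weights, 2)  # 15 equally likely samples
means_n5 = sample_means(weights, 5)  # 6 equally likely samples

mean_of_means_n2 = sum(means_n2) / len(means_n2)
mean_of_means_n5 = sum(means_n5) / len(means_n5)

print(population_mean, mean_of_means_n2, mean_of_means_n5)
print(sorted(float(m) for m in set(means_n5)))
```

For both sample sizes the mean of the sample means equals the population mean, while the sample means for n = 5 are spread over a much narrower range than those for n = 2.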
 Central Limit Theorem: If x̄ is the mean of a random sample of size n drawn from a population having mean μ and standard deviation σ, then the sampling distribution of the sample mean x̄ is approximately a normal distribution with mean μ and SD = S.E. of x̄ = σ/√n, provided the sample size n is large.
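The central limit theorem can be illustrated by simulation. The sketch below is only a demonstration under assumed settings: the population is a fair die (sampling with replacement), the sample size is 36, there are 5000 replications, and the seed is fixed. None of these choices come from the notes.

```python
import random
from math import sqrt
from statistics import mean, pstdev

rng = random.Random(42)  # fixed seed so the sketch is repeatable

faces = [1, 2, 3, 4, 5, 6]  # assumed population: outcomes of a fair die
mu = mean(faces)            # population mean, 3.5
sigma = pstdev(faces)       # population standard deviation, about 1.708

n = 36               # sample size
replications = 5000  # number of samples drawn

# Draw many samples of size n and record each sample mean.
sample_means = [mean(rng.choices(faces, k=n)) for _ in range(replications)]

observed_mean = mean(sample_means)
observed_se = pstdev(sample_means)
theoretical_se = sigma / sqrt(n)

print(round(observed_mean, 3), round(observed_se, 3), round(theoretical_se, 3))
```

The observed mean and spread of the sample means should land close to μ and σ/√n; a histogram of `sample_means` would look approximately bell-shaped even though a single die roll is uniformly distributed.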
 Estimate : An estimate is a statement made to find an unknown population
parameter.
 Estimator : The procedure or rule to determine an unknown population
parameter is called estimator.
Example: The sample proportion is an estimator of the population proportion, because the value of the sample proportion gives an estimate of the population proportion.
Types of Estimation:
 Point Estimation: If the estimate of the population parameter is given by a
single value, then the estimate is called a point estimate of the parameter.
 Interval Estimation: If the estimate of the population parameter is given by
two different values between which the parameter is expected to lie, then the
estimate is called an interval estimate of the parameter.
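The two kinds of estimate can be illustrated for a population proportion. The sketch below assumes a hypothetical sample (120 churned customers out of 400) and uses the usual normal-approximation interval at 95% confidence; neither the data nor the choice of interval comes from the notes.

```python
import math
from statistics import NormalDist

# Hypothetical sample: 120 of 400 surveyed customers churned.
successes, n = 120, 400

p_hat = successes / n                         # point estimate of the proportion
z = NormalDist().inv_cdf(0.975)               # about 1.96 for 95% confidence
margin = z * math.sqrt(p_hat * (1 - p_hat) / n)
interval = (p_hat - margin, p_hat + margin)   # interval estimate

print(p_hat, tuple(round(v, 3) for v in interval))
```

The single number `p_hat` is the point estimate, while `interval` gives the two values between which the population proportion is expected to lie.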
INTRODUCTION TO HYPOTHESIS TESTING:
 A hypothesis is a claim or belief. Hypothesis testing is a statistical process of
either rejecting or retaining a claim, belief or association related to a
business context, product, service, process, etc.
 Hypothesis testing consists of two complementary statements called the null
hypothesis and the alternative hypothesis, and only one of them is true.
 The null hypothesis, usually denoted as H0, is the claim that is assumed to be
true initially. That is, at the beginning we assume that the null hypothesis is
true and retain it unless there is strong evidence against it.
 The alternative hypothesis, usually denoted as HA (or H1), is the complement
of the null hypothesis. It is what the researcher believes to be true, and the
researcher would therefore like to reject the null hypothesis.
 Hypothesis testing is an integral part of many predictive analytics
techniques such as multiple linear regression and logistic regression.
 In business, many claims are made by organizations. A few examples of such
claims are listed below:
 1. Children who drink the health drink Complan (a health drink owned by
the company Heinz in India) are likely to grow taller.
 2. If you drink Horlicks, you can grow taller, stronger, and sharper (3 in 1).
 3. Using Fair and Lovely (Fair and Handsome) cream can make one fair and
lovely (fair and handsome).
 4. Wearing perfume (such as Axe) will help to attract the opposite gender
(known as the Axe effect).
 5. Women use camera phones more than men (Freier, 2016).
 There are many such claims and beliefs; many business rules and strategies
are generated based on these hypotheses. The question is how we can check
whether they are actually true. Hypothesis testing is used for checking the
validity of a claim using evidence found in sample data.
 Decide the criterion for rejection and retention of the null hypothesis. This is
called the significance value, traditionally denoted by the symbol α. The value
of α will depend on the context; usually 0.1, 0.05, and 0.01 are used.
 Calculate the p-value (probability value), which is the conditional probability
of observing a test statistic value at least as extreme as the one obtained from
the sample when the null hypothesis is true. In simple terms, a large p-value
means the sample is consistent with the null hypothesis, and a small p-value is
evidence against it.
 Take the decision to reject or retain the null hypothesis based on the p-value
and the significance value α. The null hypothesis is rejected when the p-value is
less than α and retained when the p-value is greater than or equal to α.
 Equivalently, if the calculated statistic value is more extreme than the critical
value (the p-value will be less than α), we reject the null hypothesis, whereas if
the statistic value is less extreme than the critical value (the p-value will be
greater than α), we retain the null hypothesis.
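The decision rule can be sketched for a two-tailed Z-test. The test statistic value 2.1 below is hypothetical, and the helper names `two_tailed_p_value` and `decide` are illustrative only.

```python
from statistics import NormalDist

def two_tailed_p_value(z_stat: float) -> float:
    """P(|Z| >= |z_stat|) when the null hypothesis is true."""
    return 2 * (1 - NormalDist().cdf(abs(z_stat)))

def decide(p_value: float, alpha: float) -> str:
    """Reject H0 when the p-value is below alpha, otherwise retain it."""
    return "reject H0" if p_value < alpha else "retain H0"

z_stat = 2.1   # hypothetical test statistic
p = two_tailed_p_value(z_stat)
for alpha in (0.10, 0.05, 0.01):
    print(alpha, round(p, 4), decide(p, alpha))
```

The same p-value can lead to different decisions depending on α, which is why the significance value has to be fixed before looking at the data.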
TYPE I ERROR, TYPE II ERROR
 In hypothesis test we end up with the following two decisions:
1. Reject null hypothesis.
2. Fail to reject (or retain) null hypothesis.
 Type I Error: Conditional probability of rejecting a null hypothesis
when it is true is called Type I Error or False Positive (falsely believing
that the claim made in alternative hypothesis is true).
 A type I error (false-positive) occurs if an investigator rejects a null
hypothesis that is actually true in the population.
 The significance value α is the value of Type I error.
 Type I Error = α = P(Rejecting null hypothesis | H0 is true)
 Probability value (p-value) is the evidence for the null hypothesis
whereas significance value α is the error based on repetitive sampling.
 Type II Error: Conditional probability of failing to reject a null
hypothesis (or retaining a null hypothesis) when the alternative hypothesis
is true is called Type II Error or False Negative (falsely believing that there
is no relationship).
 A type II error (false-negative) occurs if the investigator fails to reject a
null hypothesis that is actually false in the population.
 Usually Type II error is denoted by the symbol ß.
 Type II Error = ß = P(Retain null hypothesis | H0 is false)
 The value (1 − ß ) is known as the power of hypothesis test.
 Power of the test = 1 − ß = 1 − P(Retain null hypothesis | H0 is false)
 Alternatively, the power of the test = 1 − ß = P(Reject null hypothesis | H0 is
false).
 False-positive and false-negative results can also occur because of bias.
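The relationship between α, ß and power can be worked through numerically. The sketch below assumes a right-tailed Z-test with known standard deviation; the hypothesized mean, the true mean, 𝜎, n and α are all hypothetical values chosen for illustration.

```python
import math
from statistics import NormalDist

# Hypothetical right-tailed Z-test: H0: mu = 50 versus HA: mu > 50.
mu0, mu_true = 50.0, 52.0   # mu_true is the assumed actual mean under HA
sigma, n, alpha = 5.0, 25, 0.05

se = sigma / math.sqrt(n)
# Reject H0 when the sample mean exceeds this critical value.
critical = NormalDist(mu0, se).inv_cdf(1 - alpha)

# Type II error: the sample mean stays below the critical value although mu = mu_true.
beta = NormalDist(mu_true, se).cdf(critical)
power = 1 - beta

print(round(critical, 3), round(beta, 3), round(power, 3))
```

Lowering α moves the critical value further from the hypothesized mean, which increases ß and reduces the power for the same sample size.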
T-test :
 The t-test is used when the population follows a normal distribution and the population standard
deviation σ is unknown and is estimated from the sample. The t-test is robust to violations of
normality as long as the data are close to symmetric and there are no outliers.
 Let S be the standard deviation estimated from a sample of size n, X̄ the sample mean and μ the
hypothesized population mean. Then the statistic
$$t = \frac{\bar{X} - \mu}{S / \sqrt{n}}$$
will follow a t-distribution with (n − 1) degrees of freedom if the sample is drawn from a
population that follows a normal distribution. Here 1 degree of freedom is lost since the standard
deviation is estimated from the sample. Thus, we use the t-statistic (hence the test is called the
t-test) to test the hypothesis when the population standard deviation is unknown.
 The t-test is a statistical test procedure that tests whether there is a
significant difference between the means of two groups.
EX: The two groups could be, for example, patients who received drug
A once and drug B once, and you want to know if there is a difference in
blood pressure between these two groups.
Types of t-test :
 There are three different types of t-tests.
One-sample t-test
 We use the one-sample t-test when we want to compare the mean of a sample with a known
reference mean.
 Example : A manufacturer of chocolate bars claims that its chocolate bars weigh 50 grams on
average. To verify this, a sample of 30 bars is taken and weighed. The mean value of this sample is
48 grams.
Independent-sample t-test
 We use the t-test for independent samples when we want to compare the means of two
independent groups or samples. We want to know if there is a significant difference between these
means.
 Example : We would like to compare the effectiveness of two painkillers, drug A and drug B.
Paired-sample t-test
 The t-test for dependent samples is used to compare the means of two dependent groups.
Example : We want to know how effective a diet is. To do this, we weigh 30 people before the diet
and exactly the same people after the diet.
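The one-sample example (chocolate bars) can be sketched as follows. The sample mean, the claimed mean and n come from the example; the sample standard deviation is not given there, so s = 4 grams is an assumption, and the critical value 2.045 is the two-tailed t-table value for α = 0.05 and 29 degrees of freedom.

```python
import math

def one_sample_t(sample_mean: float, mu0: float, s: float, n: int):
    """Return the t-statistic and its degrees of freedom."""
    t_stat = (sample_mean - mu0) / (s / math.sqrt(n))
    return t_stat, n - 1

# Sample mean, claimed mean and n come from the example; s = 4 g is assumed.
t_stat, df = one_sample_t(sample_mean=48, mu0=50, s=4, n=30)

# Two-tailed critical value for alpha = 0.05 and 29 degrees of freedom (t-table).
t_critical = 2.045
reject_h0 = abs(t_stat) > t_critical

print(round(t_stat, 3), df, reject_h0)
```

Under the assumed standard deviation the statistic falls beyond the critical value, so the claim of a 50 gram average would be rejected; a larger s could change that conclusion.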
Chi-Square Goodness of Fit Tests
 Goodness of fit tests are hypothesis tests that are used for comparing the
observed distribution of data with expected distribution of the data to
decide whether there is any statistically significant difference between the
observed distribution and a theoretical distribution based on comparison
of observed frequencies in the data and the expected frequencies if the data
follows a specified theoretical distribution.
 The null and alternative hypotheses in chi-square goodness of fit tests are
H0 : There is no statistically significant difference between the observed
frequencies and the expected frequencies from a hypothesized
distribution.
HA: There is a statistically significant difference between the observed
frequencies and the expected frequencies from a hypothesized
distribution.
 Let Z be a standard normal random variable. Then Z² follows a chi-square
distribution with 1 degree of freedom.
 If we have k independent standard normal random variables, namely, X1, X2, …, Xk,
then a chi-square distribution with k degrees of freedom is given by
$$\chi^2_k = X_1^2 + X_2^2 + \cdots + X_k^2$$
 Consider a binomial random variable X with parameter p (probability of
success) and number of trials n.
 Then for a large sample, the standardized random variable below
follows an approximate standard normal distribution (central limit theorem for
proportions):
$$Z = \frac{X - np}{\sqrt{np(1 - p)}}$$
 Note that np and n(1 − p) are the expected values of the two categories (success
and failure) of the binomial distribution.
 Thus, the chi-square statistic for goodness of fit test is given by
$$\chi^2 = \sum_{i} \sum_{j} \frac{(O_{ij} - E_{ij})^2}{E_{ij}}$$
 where Oij is the observed frequency in category (i, j) and Eij is the expected
frequency in category (i, j). Because the statistic is a sum of squared deviations,
only large values indicate a poor fit; thus, the chi-square test is always a
right-tailed test.
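A goodness of fit calculation can be sketched with a single index over categories instead of the double index (i, j). The observed counts below are hypothetical (120 rolls of a die tested against a fair-die distribution), and 11.070 is the chi-square table value for α = 0.05 and 5 degrees of freedom.

```python
def chi_square_statistic(observed, expected):
    """Sum of (O - E)^2 / E over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Hypothetical counts from 120 rolls of a die; H0: the die is fair.
observed = [18, 22, 16, 25, 20, 19]
expected = [sum(observed) / len(observed)] * len(observed)   # 20 per face

chi2 = chi_square_statistic(observed, expected)
df = len(observed) - 1

# Right-tail critical value for alpha = 0.05 and 5 degrees of freedom (chi-square table).
chi2_critical = 11.070
reject_h0 = chi2 > chi2_critical

print(chi2, df, reject_h0)
```

Here the statistic is well below the critical value, so the null hypothesis of no difference between observed and expected frequencies is retained.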
INTRODUCTION TO ANALYSIS OF VARIANCE (ANOVA)
 The objective of ANOVA is to check simultaneously whether the population
means of more than two populations are different.
 ANOVA stands for Analysis of Variance. It is a statistical method used to
analyze the differences between the means of two or more groups or
treatments.
 It is often used to determine whether there are any statistically significant
differences between the means of different groups.
 ANOVA is used to compare treatments, analyze the impact of factors on a
variable, or compare means across multiple groups.
 Types of ANOVA include one-way (for comparing means of groups) and
two-way (for examining effects of two independent variables on a
dependent variable).
 One-way analysis of variance (ANOVA) : It is a statistical method
for testing for differences in the means of three or more groups.
 In statistics, ANOVA also uses a Null hypothesis and an Alternate
hypothesis.
 The null hypothesis in ANOVA states that all the group (population) means are
equal, i.e. they do not have any significant difference.
 On the other hand, the alternate hypothesis states that at least one of
the group means is different from the rest. In mathematical form, they can be
represented as:
$$H_0: \mu_1 = \mu_2 = \cdots = \mu_k$$
$$H_A: \text{not all } \mu_i \text{ are equal}$$
 where μi is the mean of the i-th level of the factor.
Ex for One –way ANOVA:
 Suppose you are studying the effectiveness of three different drugs (Drug
A, Drug B, and Drug C) in reducing blood pressure.You randomly assign
90 patients to one of the three drug groups and measure their blood
pressure after one month of treatment. The blood pressure measurements
(in mmHg) for each patient are observed and prepared as a dataset.
 In this dataset, each drug group represents a separate treatment or
condition, and the blood pressure measurements for each patient in that
group are recorded.
 To analyze this dataset using ANOVA, you would compare the means of
the blood pressure measurements among the three drug groups to
determine if there is a statistically significant difference.
Two-Way ANOVA: The two-way ANOVA technique is used
when the data are classified based on two factors.
 Ex: Agricultural output may be classified on the basis of different
varieties of seeds and also on the basis of the different varieties of
fertilizers used.
 It is a statistical test used to determine the effect of two nominal
predictor variables on a continuous outcome variable.
 The two-way ANOVA test analyzes the effect of the independent variables
on the expected outcome along with their relationship to the
outcome itself.
Ex for TWO –way ANOVA
 Two-way (or two factor) analysis of variance tests whether there is a
difference between more than two independent samples split between
two variables or factors.
 A factor is, for example, the gender of a person with the characteristics
male and female, the form of therapy used for a disease with therapy A,
B and C or the field of study with, for example, medicine, business
administration, psychology and math.
 In addition to gender, the highest level of education also has an influence
on salary.
 Besides therapy, gender also has an influence on blood pressure.
 In addition to the field of study, the university attended also has an
influence on the duration of studies.
Now in all three cases you would not have one factor, but two factors
each. And since you now have two factors, you use the two-way
analysis of variance.
Formulas of ANOVA:
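In the formulas below, $k$ is the number of groups, $n_i$ the number of observations in
group $i$, $n$ the total number of observations, $Y_{ij}$ the $j$-th observation in group $i$,
$\bar{Y}_i$ the mean of group $i$ and $\bar{Y}$ the overall mean.
 Sum of Squares of Total Variation (SST):
$$SST = \sum_{i=1}^{k} \sum_{j=1}^{n_i} (Y_{ij} - \bar{Y})^2$$
 Mean Square Total (MST) variation is given by
$$MST = \frac{SST}{n - 1}$$
 Sum of Squares of Between (SSB) Group Variation:
$$SSB = \sum_{i=1}^{k} n_i (\bar{Y}_i - \bar{Y})^2$$
 Mean square between variation (MSB) is given by
$$MSB = \frac{SSB}{k - 1}$$
 Sum of Squares of Within (SSW) Group Variation:
$$SSW = \sum_{i=1}^{k} \sum_{j=1}^{n_i} (Y_{ij} - \bar{Y}_i)^2$$
 The mean square of variation within the group is
$$MSW = \frac{SSW}{n - k}$$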
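The sums of squares and mean squares can be computed directly for a small data set. The three groups below are hypothetical, and the last line forms the usual F ratio MSB/MSW, which is the statistic compared against the F-distribution in a one-way ANOVA (it is not listed among the formulas above).

```python
from statistics import fmean

# Hypothetical measurements for three treatment groups.
groups = {
    "A": [4, 5, 6],
    "B": [6, 7, 8],
    "C": [8, 9, 10],
}

all_values = [y for values in groups.values() for y in values]
n, k = len(all_values), len(groups)
grand_mean = fmean(all_values)

sst = sum((y - grand_mean) ** 2 for y in all_values)
ssb = sum(len(v) * (fmean(v) - grand_mean) ** 2 for v in groups.values())
ssw = sum((y - fmean(v)) ** 2 for v in groups.values() for y in v)

mst = sst / (n - 1)
msb = ssb / (k - 1)
msw = ssw / (n - k)
f_stat = msb / msw   # F ratio used in the ANOVA test

print(sst, ssb, ssw, f_stat)
```

The total variation splits exactly into the between-group and within-group parts (SST = SSB + SSW), which is a useful check on any hand calculation.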
Correlation Analysis

1.Simple Correlation Coefficient

Interpretation, Scatter plot.
Correlation:
 Correlation is a statistical measure of an association
relationship between two random variables.
 A correlation coefficient is a statistical measure of the degree
to which changes to the value of one variable predict change
to the value of another.
 "Correlation means that between two series or groups of
data, there exists some causal connection."
 Ex: Mobile service providers collect data on variables
such as call duration, number of calls, numbers to which the calls are
made, number of calls received, the device that was used to make the
call, location (and mobile tower that the phone was attached to), time
between calls, last recharge (in case of pre-paid mobile services),
recharge amount, service plan (in case of post-paid connection),
number of messages sent, number of messages received, apps
downloaded, time spent on surfing the internet, and so on. The number of
variables collected and new variables generated may exceed several
thousand. The idea behind collecting all these variables is to find
answers to questions such as:
 1. Which customer is likely to churn?
 2. What is the customer lifetime value?
 3. What is the best service plan for a customer?
 4. What recommendations can be made to a customer?
Importance of Correlation:
 The study of correlation shows the direction and degree of relationship
between the variables.
 It is very helpful in understanding economic behaviour.
 The study of correlation reduces the range of uncertainty in matters of
prediction.
 It is helpful in investigation and research.
 It is also helpful in policy formulation.
Types of Correlation:
Correlation can be:
 Positive and Negative Correlation
 Linear and Non-Linear Correlation
 Simple, Multiple and Partial Correlation
Positive Correlation :
 When two variables X and Y move in the same direction, i.e. when one
increases the other also increases and when one decreases the other also
decreases, the correlation between the two is positive.
Negative Correlation:
 If the two variables vary in opposite directions, the correlation is said to be
negative. It means that if one variable increases, the other variable
decreases, and if one variable decreases, the other variable increases.
Linear Correlation :
 If the ratio of change between two variables is uniform, it is called Linear
Correlation. If the changes are plotted on graph paper, their relationship
will be indicated by a straight line.
Non-Linear Correlation:
 If the ratio of change between two variables is not uniform, it is called
Non-Linear Correlation. If these changes are plotted on graph paper,
they will not form a straight line but a curve.
Simple Correlation:
 The relationship between two variables is known as Simple Correlation. For
example, the relationship between the price and demand of a commodity.
 Ex 2: Yield of paddy and the use of fertilizers is an example of simple
correlation, as the yield of paddy depends on the use of fertilizers, i.e. the
presence of one variable affects another variable.
Multiple Correlation:
 When the relationship among three or more variables is studied
simultaneously, it is called Multiple Correlation. For example, agricultural
production depends on rainfall, amount of manure, seeds, etc.
Partial Correlation:
 The relationship between two variables is established keeping other variables
constant. For example, if we study the relationship between the amount of
rainfall and agricultural production assuming the amount of fertilizers and the
quality of seeds as constant, it is known as Partial Correlation.
 Degree Of Correlation :
Karl Pearson’s Coefficient of Correlation:
 A mathematical method for measuring the linear relationship
between the variable X and Y was suggested by the great
biologist and statistician Karl Pearson.
 This method is also called Product Moment Method.
 The coefficient of correlation is denoted by the symbol “r”.
 If the two variables under study are X and Y, the following
formula suggested by Karl Pearson can be used for measuring
the degree of correlation:
$$r = \frac{\sum (X - \bar{X})(Y - \bar{Y})}{\sqrt{\sum (X - \bar{X})^2 \, \sum (Y - \bar{Y})^2}}$$
Here, r = coefficient of correlation, and $\bar{X}$ and $\bar{Y}$ are the means of X and Y.
 Karl Pearson's correlation coefficient lies between −1 and +1, i.e., −1 ≤ r ≤ +1.
 If r = 0, there is no linear correlation between the variables.
 If r = +1, the correlation is perfect positive.
 If r = −1, the correlation is perfect negative.
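The product moment formula can be sketched in a few lines. The paired observations below are hypothetical, and the function name `pearson_r` is illustrative only.

```python
import math

def pearson_r(x, y):
    """Karl Pearson's product-moment correlation coefficient."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical paired observations.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

r = pearson_r(x, y)
print(round(r, 4))
```

Passing the same series twice gives r = +1, and passing a series with its negation gives r = −1, matching the limiting cases listed above.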
MERITS :
 Practical and popular method.
 Meaningful conclusion.
 Measurement of degree and direction simultaneously.
DEMERITS:
 Greater influence of extreme values.
 Calculation process is long and time consuming.
 Possibility of wrong interpretation.
 Assumption of Linear relationship between the variables.
 Example – Correlation of statistics and science tests
1. A study is conducted involving 10 students to investigate the association
between statistics and science tests. The question is: is there a
relationship between the marks obtained by the 10 students in the statistics and
science tests?
Sol:
As per the above calculation, the correlation coefficient is r = 0.761,
so it is a high degree of positive correlation.
Spearman’s Rank Coefficient of Correlation:
 When quantification of variables is difficult, as with attributes such as
beauty, leadership ability or knowledge of a person, the method of
rank correlation is useful. It was developed by the
British psychologist Charles Edward Spearman in 1904. In this
method, ranks are allotted to each element in either ascending or
descending order.
 To find the correlation under this method, the following
formula is used:
$$R = 1 - \frac{6 \sum D^2}{N(N^2 - 1)}$$
 Here, R = rank coefficient of correlation, ΣD² = the total of
squares of differences of corresponding ranks.
 N = number of pairs of observations.
 As in the case of r, −1 ≤ R ≤ +1.
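The rank correlation formula can be applied to a small set of ranks. The two rankings below are hypothetical (two judges ranking five candidates), and the sketch assumes there are no tied ranks.

```python
def spearman_rank(rank_x, rank_y):
    """Rank correlation R = 1 - 6 * sum(D^2) / (N * (N^2 - 1)), assuming no tied ranks."""
    n = len(rank_x)
    sum_d2 = sum((a - b) ** 2 for a, b in zip(rank_x, rank_y))
    return 1 - 6 * sum_d2 / (n * (n ** 2 - 1))

# Hypothetical ranks given by two judges to five candidates.
judge_1 = [1, 2, 3, 4, 5]
judge_2 = [2, 1, 4, 3, 5]

R = spearman_rank(judge_1, judge_2)
print(R)
```

Identical rankings give R = +1 and exactly reversed rankings give R = −1, consistent with the bounds stated above.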
MERITS:
 Its calculation is easier as compared to Karl Pearson’s Method.
 This method can be used as a measure of degree of association
between qualitative variables.
DEMERITS:
 This method is not suitable for calculating coefficient of
correlation of grouped frequency distribution.
 If the number of items is large, this method becomes difficult and
unsuitable.
Kendall's Tau rank correlation:
 Kendall rank correlation is a non-parametric test that measures the
strength of dependence between two variables. If we consider two
samples, x and y, where each sample size is n, the total number of
pairings of observations is n(n − 1)/2.
 The following formula is used to calculate the value of Kendall
rank correlation:
$$\tau = \frac{C - D}{C + D}$$
where C is the number of concordant pairs and D is the number of discordant
pairs. When there are no ties, C + D = n(n − 1)/2.
Refer Datatab website for problems

Problem:
• Suppose two doctors rank 6 patients by descending physical health. One of the
two doctors, in this case the female doctor, is defined as the reference, and the
patients are sorted from 1 to 6 according to her ranking.
• Now it is possible to compare the sorted ranks with the ranks of the second
doctor, e.g. the patient who is ranked 3 by the female doctor is ranked 4 by the
male doctor.
• We want to know if there is a correlation between the two assessments using
Kendall's Tau. To calculate it, we only need the ranks on the right-hand side,
i.e. the ones from the male doctor.
• We now look at each rank and note whether the values below it are smaller
or larger than itself.
• As can be seen in the figure above, we start with the first rank, corresponding
to the value 3. 1 is smaller than 3, so it gets a minus, 4 is larger, so it gets a
plus, 2 is smaller, so it gets a minus, 6 is larger, so it gets a plus, and 5 is also
larger, so it also gets a plus.
• The same procedure is repeated for 1, 4, 2, 6 and 5.
 We get the number of concordant pairs by counting all the "+" signs: C = 11.
 We get the number of discordant pairs by counting all the "-" signs: D = 4.
 C is 11 and D is 4, so Kendall's Tau is (11 − 4) divided by (11 + 4),
resulting in a value of 0.47.
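The counting procedure in the problem can be reproduced in code. The sketch below uses the second doctor's ranks in the order described (3, 1, 4, 2, 6, 5) and assumes there are no ties; the function name `kendall_tau` is illustrative only.

```python
from itertools import combinations

def kendall_tau(ranks):
    """Kendall's Tau for ranks already sorted by the reference ranking (no ties)."""
    concordant = discordant = 0
    for earlier, later in combinations(ranks, 2):
        if later > earlier:
            concordant += 1   # a "+" in the description above
        else:
            discordant += 1   # a "-" in the description above
    tau = (concordant - discordant) / (concordant + discordant)
    return concordant, discordant, tau

# Ranks from the second doctor, in the order used in the worked problem.
second_doctor = [3, 1, 4, 2, 6, 5]

C, D, tau = kendall_tau(second_doctor)
print(C, D, round(tau, 2))
```

`combinations` visits each pair once with the earlier rank first, which mirrors comparing every rank with the values below it.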
Scatter Diagram Method:
 The existence of correlation between variables can be shown graphically by
means of a scatter diagram.
 It is obtained by plotting the values on graph paper.
 The chart is prepared by measuring the X variable on the horizontal axis and the
Y variable on the vertical axis, and all the observations are plotted on the graph.
 The cluster of points so obtained on the graph paper is called the scatter
diagram or dot diagram. By observing the points we can know the degree
and direction of correlation.
 If the trend of the plotted points is upward, rising from the bottom left and
going up towards the top right, the correlation is positive.
 On the other hand, if the plotted points show a downward trend from the
top left to the bottom right, the correlation is negative.
 If the plotted points do not show any trend, the two variables are not
correlated.
 Closeness of the dots to each other in a particular direction indicates a
higher degree of correlation.
Probable error of Coefficient correlation and interpretation.
Regression
• Regression Model/Analysis:
• Regression analysis is a predictive modelling technique that analyses the relation between the target or
dependent variable and the independent variables in a dataset. It is used mainly to determine predictor
strength, to forecast trends and time series, and to study cause-and-effect relations.
• Example 1: Examine the relationship between sales and advertising expenditures for a corporation.
• Purpose of a regression model: Regression analysis is used for one of two purposes: 1. Predicting the value
of the dependent variable when information about the independent variables is known (forecasting). 2. Estimating
the effect of an independent variable on the dependent variable.
• Types of Regression Models: Popularly used regression models are Linear Regression, Logistic Regression and
Polynomial Regression.
1. Linear Regression: The most extensively used modelling technique is linear regression, which assumes a
linear connection between a dependent variable (Y) and an independent variable (X). It employs a regression line,
also known as a best-fit line.
• Y = c + m*X + e, where c denotes the intercept (a regression coefficient),
• m denotes the regression coefficient (the slope of the line), and e is the error term or residual.
• c = Y' − m*X' and m = Σ(x − X')(y − Y') / Σ(x − X')², where Y' = mean of Y and X' = mean of X.
1. Simple Linear Regression: Here we have one dependent variable and one independent variable, so
the formula is
Y = c + m*X, where c = intercept value and m = slope value.
Simple linear regression can be used:
• To find the strength of dependency between two variables, such as the rate of carbon emission and
global warming.
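The slope and intercept formulas can be applied to a small data set. The values below are hypothetical (they could stand for advertising spend and sales), and the function name `fit_line` is illustrative only.

```python
def fit_line(x, y):
    """Least-squares intercept c and slope m for Y = c + m * X."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    m = (sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
         / sum((a - mean_x) ** 2 for a in x))
    c = mean_y - m * mean_x
    return c, m

# Hypothetical data, e.g. advertising spend (x) and sales (y).
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

c, m = fit_line(x, y)
predicted = c + m * 6   # prediction for a new value x = 6
print(round(c, 2), round(m, 2), round(predicted, 2))
```

Once c and m are known, predicting the dependent variable for a new value of X is a single evaluation of the line.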
2. Multiple Linear Regression: Here we have one dependent variable and more than one
independent variable, so the formula is
Y = c + m1*X1 + m2*X2 + m3*X3 + … + mn*Xn.
Multiple linear regression can be used to estimate how strongly two or more independent
variables influence the single dependent variable, such as how location, time, condition, and area can
influence the price of a property.
Note that linear regression deals with dependent variables that are continuous numeric data.
• 2. Logistic Regression: Logistic regression is useful when the dependent variable
is categorical, i.e. yes/no, true/false or valid/invalid kind of data, which is
discrete in nature. For this kind of data, linear regression cannot be used to
determine the output of the dependent variable. Hence logistic regression is used
to predict a categorical dependent variable with the help of the
independent variables.
• The overall aim of logistic regression is to classify outputs; the predicted value
can only lie between 0 and 1.
• In logistic regression, a sigmoid curve (S-curve) represents the connection to the
independent variable, and the probability has a value between 0 and 1.
• The weighted sum of inputs is passed through an activation function called the sigmoid
function, which maps values to the range between 0 and 1. The formula for the sigmoid
function is:
$$\sigma(z) = \frac{1}{1 + e^{-z}}$$
where z is the weighted sum of the inputs.
• A change in the regression coefficients (attached to the independent variables) has an
impact on the direction and steepness of the curve. A positive slope results in an
S-shaped curve, and a negative slope gives a Z-shaped curve.
• To classify Y-values into two categories, you need to set a threshold value (for example 0.5)
between 0 and 1. Values of Y above this threshold are classified as category 1, and
values below the threshold as category 0.
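The sigmoid mapping and the threshold rule can be sketched together. The coefficients c = −4 and m = 1 are hypothetical, and the function names are illustrative; values exactly at the threshold are assigned to category 0 here, which is a choice made for this sketch.

```python
import math

def sigmoid(z: float) -> float:
    """Map any real number to the open interval (0, 1)."""
    return 1 / (1 + math.exp(-z))

def classify(x: float, c: float, m: float, threshold: float = 0.5) -> int:
    """Return category 1 if the predicted probability exceeds the threshold, else 0."""
    probability = sigmoid(c + m * x)   # weighted sum passed through the sigmoid
    return 1 if probability > threshold else 0

# Hypothetical coefficients; a positive slope gives an S-shaped curve.
c, m = -4.0, 1.0
for x in (1, 4, 7):
    print(x, round(sigmoid(c + m * x), 3), classify(x, c, m))
```

Changing the sign of m flips the curve, so large x values would then map to probabilities near 0 instead of near 1.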
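• R-Squared: It is important to know how strong the relationship between the
values of the x-axis and the y-axis is; if there is no relationship, polynomial
regression cannot be used to predict anything. The relationship is measured
with a value called R-squared.
• The formula is:
$$R^2 = 1 - \frac{\sum (y_i - \hat{y}_i)^2}{\sum (y_i - \bar{y})^2}$$
where $y_i$ are the observed values, $\hat{y}_i$ the values predicted by the model and
$\bar{y}$ the mean of the observed values.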
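R-squared can be computed from observed and predicted values using the common definition 1 − SS_res / SS_tot. The data and the fitted line Y = 2.2 + 0.6*X below are hypothetical and serve only to illustrate the calculation.

```python
def r_squared(y, y_pred):
    """R^2 = 1 - SS_res / SS_tot."""
    mean_y = sum(y) / len(y)
    ss_res = sum((a - p) ** 2 for a, p in zip(y, y_pred))
    ss_tot = sum((a - mean_y) ** 2 for a in y)
    return 1 - ss_res / ss_tot

# Hypothetical data and a fitted line Y = 2.2 + 0.6 * X (assumed coefficients).
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
y_pred = [2.2 + 0.6 * v for v in x]

r2 = r_squared(y, y_pred)
print(round(r2, 3))
```

A value near 1 means the model explains most of the variation in y, while a value near 0 means the predictions are little better than always using the mean of y.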
ProbabilityLectureNotes (MSD) 1
No ratings yet
ProbabilityLectureNotes (MSD) 1
9 pages
Probability Modified PDF
No ratings yet
Probability Modified PDF
23 pages
Introduction To Probabilistic and Statistical Methods With Examples in R - 9783030457990 PDF
100% (2)
Introduction To Probabilistic and Statistical Methods With Examples in R - 9783030457990 PDF
163 pages
Probability
100% (2)
Probability
39 pages
Probability and Event Space-2-8
No ratings yet
Probability and Event Space-2-8
7 pages
2.2 probability
No ratings yet
2.2 probability
31 pages
57probability and Statistics
No ratings yet
57probability and Statistics
31 pages
Eda Midterms-Compilation
No ratings yet
Eda Midterms-Compilation
12 pages
Chapter One
No ratings yet
Chapter One
34 pages
Chapter 16 (Philoid-In)
No ratings yet
Chapter 16 (Philoid-In)
19 pages
Lec 17 Probability 2 1
No ratings yet
Lec 17 Probability 2 1
13 pages
UNIT II Notes
No ratings yet
UNIT II Notes
31 pages
CH 6 - Probability
No ratings yet
CH 6 - Probability
7 pages
Some Types of Events in Probability
No ratings yet
Some Types of Events in Probability
6 pages
NCERT Exempler Maths Class 11
100% (1)
NCERT Exempler Maths Class 11
266 pages
College of Engineering Department of Electrical and Computer Engineering (Electronics and Communication Stream)
No ratings yet
College of Engineering Department of Electrical and Computer Engineering (Electronics and Communication Stream)
41 pages
Probability
No ratings yet
Probability
30 pages
CHAPTER 2-Probability PDF
No ratings yet
CHAPTER 2-Probability PDF
51 pages
Probabilitytheory
No ratings yet
Probabilitytheory
10 pages
Probability
No ratings yet
Probability
16 pages
kemh114
No ratings yet
kemh114
25 pages
PROB - 12th (2018C) - E
No ratings yet
PROB - 12th (2018C) - E
73 pages
chapter_1 Background
No ratings yet
chapter_1 Background
75 pages
PCS NOTES M1 (1)
No ratings yet
PCS NOTES M1 (1)
17 pages
Statistics (1)
No ratings yet
Statistics (1)
38 pages
Probability and Statistics
100% (1)
Probability and Statistics
35 pages
Notes 5
No ratings yet
Notes 5
4 pages
Random ExperimentsForBAHon
No ratings yet
Random ExperimentsForBAHon
44 pages
lecture -1
No ratings yet
lecture -1
9 pages
Probability
No ratings yet
Probability
26 pages
Probability SLM
No ratings yet
Probability SLM
8 pages
1.3.1-Sample Space-Event
No ratings yet
1.3.1-Sample Space-Event
3 pages
UNIT3
No ratings yet
UNIT3
5 pages
Probability
No ratings yet
Probability
15 pages
MAT 3103: Computational Statistics and Probability Chapter 3: Probability
No ratings yet
MAT 3103: Computational Statistics and Probability Chapter 3: Probability
23 pages
Unit 1_Probability
No ratings yet
Unit 1_Probability
31 pages
Probability and Prob - Dist.
No ratings yet
Probability and Prob - Dist.
37 pages
Statistical Methods Ecourse (ICAR)
No ratings yet
Statistical Methods Ecourse (ICAR)
76 pages
Probability
No ratings yet
Probability
36 pages
Adobe Scan Oct 30, 2023
No ratings yet
Adobe Scan Oct 30, 2023
16 pages
Probablity
100% (1)
Probablity
312 pages
CUF Feb16 PDF
No ratings yet
CUF Feb16 PDF
3 pages
Probability
No ratings yet
Probability
27 pages
Probability and Statistics: To P, or Not To P?: Module Leader: DR James Abdey
No ratings yet
Probability and Statistics: To P, or Not To P?: Module Leader: DR James Abdey
5 pages
Basic Probability
No ratings yet
Basic Probability
14 pages
Tugas Proses Stokastik
No ratings yet
Tugas Proses Stokastik
32 pages
Probability
No ratings yet
Probability
7 pages
Notes of St. and Pro.
No ratings yet
Notes of St. and Pro.
35 pages
Lec 02
No ratings yet
Lec 02
18 pages
Chapter 1 Probability
No ratings yet
Chapter 1 Probability
13 pages
Probability
No ratings yet
Probability
49 pages
Module 1 (3)
No ratings yet
Module 1 (3)
12 pages
Probability
No ratings yet
Probability
12 pages
Probability
No ratings yet
Probability
30 pages
Important RGPV Question
No ratings yet
Important RGPV Question
23 pages
Probability
No ratings yet
Probability
2 pages
01 Principles of Probablity 1
No ratings yet
01 Principles of Probablity 1
39 pages
Enma104 Lessson2 Probability
No ratings yet
Enma104 Lessson2 Probability
9 pages
M6-Discrete Probability
No ratings yet
M6-Discrete Probability
38 pages
Probability Theory: A Concise Course
From Everand
Probability Theory: A Concise Course
Y. A. Rozanov
4/5 (2)
A System of Legal Logic: Using Aristotle, Ayn Rand, and Analytical Philosophy to Understand the Law, Interpret Cases, and Win in Litigation
From Everand
A System of Legal Logic: Using Aristotle, Ayn Rand, and Analytical Philosophy to Understand the Law, Interpret Cases, and Win in Litigation
Russell Hasan
No ratings yet
BAYES Theorem
From Everand
BAYES Theorem
Jeffery Short
2/5 (5)
Software Programmer - HTML5
100% (1)
Software Programmer - HTML5
276 pages
CN Lab Manual
No ratings yet
CN Lab Manual
68 pages
Data Analytics Lab Keerthi
No ratings yet
Data Analytics Lab Keerthi
71 pages
Backward Chaining
No ratings yet
Backward Chaining
16 pages
Akshay 45
No ratings yet
Akshay 45
6 pages
Big Data
No ratings yet
Big Data
18 pages
Mean, Median, Mode, Variance & Standard Deviation: Subject: Statistics Created By: Marija Stanojcic Revised: 10/9/2018
No ratings yet
Mean, Median, Mode, Variance & Standard Deviation: Subject: Statistics Created By: Marija Stanojcic Revised: 10/9/2018
3 pages
powerBI 1 ST
