Statistics and Probability

Lesson 1: Introduction to Statistics

Why study statistics?
1. Data are everywhere.
2. Statistical techniques are used to make many decisions that affect our lives.
3. No matter what your career, you will make professional decisions that involve data. An understanding of statistical methods will help you make these decisions effectively.

Applications of Statistical Concepts in the Business World
❖ Finance – correlation and regression, index numbers, time series analysis
❖ Marketing – hypothesis testing, chi-square tests, nonparametric statistics
❖ Personnel – hypothesis testing, chi-square tests, nonparametric tests
❖ Operating management – hypothesis testing, estimation, analysis of variance, time series analysis

Statistics
❖ The science of collecting, organizing, presenting, analyzing, and interpreting data to assist in making more effective decisions.
❖ Statistical analysis – used to manipulate, summarize, and investigate data so that useful decision-making information results.

Types of statistics
❖ Descriptive statistics – Methods of organizing, summarizing, and presenting data in an informative way.
❖ Inferential statistics – The methods used to determine something about a population on the basis of a sample.
▪ Population – The entire set of individuals or objects of interest, or the measurements obtained from all individuals or objects of interest
▪ Sample – A portion, or part, of the population of interest

Inferential Statistics
❖ Estimation
e.g., Estimate the population mean weight using the sample mean weight
❖ Hypothesis testing
e.g., Test the claim that the population mean weight is 70 kg
NOTE: Inference is the process of drawing conclusions or making decisions about a population based on sample results.

Sampling
A sample should have the same characteristics as the population it represents.
Sampling can be:
▪ with replacement: a member of the population may be chosen more than once (e.g., picking a candy from the bowl and putting it back)
▪ without replacement: a member of the population may be chosen only once (e.g., drawing a lottery ticket)
Sampling methods can be:
▪ random – each member of the population has an equal chance of being selected
▪ nonrandom
The actual process of sampling causes sampling errors. For example, the sample may not be large enough or representative of the population. Factors not related to the sampling process cause nonsampling errors. For example, a defective counting device can cause a nonsampling error.

Random Sampling Methods
❖ Simple random sample (each sample of the same size has an equal chance of being selected)
❖ Stratified sample (divide the population into groups called strata and then take a sample from each stratum)
❖ Cluster sample (divide the population into groups called clusters and then randomly select some of the clusters; all the members of the selected clusters are in the cluster sample)
❖ Systematic sample (randomly select a starting point and take every n-th piece of data from a listing of the population)

Descriptive Statistics
❖ Collect data
e.g., Survey
❖ Present data
e.g., Tables and graphs
❖ Summarize data
e.g., Sample mean $\bar{X} = \frac{\sum X_i}{n}$

Data
❖ Data is the body of information or observations being considered by the researcher. When the data is processed, it produces information, which is the basis for decision making.
❖ A variable is used to define certain observable values or characteristics. It is called a variable because the characteristics vary from one individual to another. The values of the variable are its possible observable values or characteristics. These values are the data to be processed.

Statistical Data
❖ The collection of data that are relevant to the problem being studied is commonly the most difficult, expensive, and time-consuming part of the entire research project.
❖ Statistical data are usually obtained by counting or measuring items.
▪ Primary data are collected specifically for the analysis desired.
▪ Secondary data have already been compiled and are available for statistical analysis.
❖ A variable is an item of interest that can take on many different numerical values.
❖ A constant has a fixed numerical value.

Data (According to Nature)
Statistical data are usually obtained by counting or measuring items. Most data can be put into the following categories:
▪ Qualitative or Categorical – data are measurements that each fall into one of several categories (hair color, ethnic group, and other attributes of the population)
▪ Quantitative or Numerical – data are observations that are measured on a numerical scale (distance traveled to college, number of children in a family, etc.)

Qualitative Data
Qualitative data are generally described by words or letters. They are not as widely used as quantitative data because many numerical techniques do not apply to them. For example, it does not make sense to find an average hair color or blood type.
Qualitative data can be separated into two subgroups:
▪ dichotomic (if it takes the form of a word with two options, e.g., gender: male or female)
▪ polynomic (if it takes the form of a word with more than two options, e.g., education: primary school, secondary school, and university)

Quantitative Data (According to Measurement)
Quantitative data are always numbers and are the result of counting or measuring attributes of a population. Quantitative data can be separated into two subgroups:
▪ discrete (if it is the result of counting, e.g., the number of students of a given ethnic group in a class, the number of books on a shelf, ...)
▪ continuous (if it is the result of measuring, e.g., distance traveled, weight of luggage, ...)

Numerical Scale of Measurement:
❖ Nominal: consists of categories in each of which the number of respective observations is recorded. The categories are in no logical order and have no particular relationship. The categories are said to be mutually exclusive since an individual, object, or measurement can be included in only one of them.
❖ Ordinal: contains more information. It consists of distinct categories in which order is implied. Values in one category are larger or smaller than values in other categories (e.g., a rating of excellent, good, fair, or poor).
❖ Interval: a set of numerical measurements in which the distance between numbers is of a known, constant size.
❖ Ratio: consists of numerical measurements where the distance between numbers is of a known, constant size and, in addition, there is a nonarbitrary zero point.

Lesson 2: Random Variables
❖ You might recall that a statistical experiment is any process by which observations are made and data are collected.
❖ The result of an experiment is known as an outcome.
❖ Statistical experiments can have a finite or infinite number of outcomes.
❖ The collection of all possible outcomes is known as the sample space, which is typically denoted by S.
❖ When one or more outcomes in the sample space are considered, this is referred to as an event.
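As a rough illustration of the terms experiment, outcome, sample space, and event, the sketch below enumerates the sample space for rolling a die twice and picks out one event. The experiment and the event "the two rolls add up to 7" are assumptions chosen only for illustration; they are not taken from the lesson.

```python
from itertools import product

# Experiment (assumed for illustration): roll a six-sided die twice.
# Each outcome is an ordered pair (first roll, second roll).
sample_space = list(product(range(1, 7), repeat=2))

# Event (assumed for illustration): the two rolls add up to 7.
event_sum_is_7 = [outcome for outcome in sample_space if sum(outcome) == 7]

print(len(sample_space))   # number of outcomes in S
print(event_sum_is_7)      # the outcomes that make up the event
```

The sample space here is finite, so it can be listed in full; an event is simply a subset of that list.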
Random Variables
In some experiments, such as tossing a coin three times, rolling a die twice, or drawing two balls from an urn, we are often not concerned with every detail of the outcomes.
We are usually interested in some number associated with the outcomes.
For example, if a coin is tossed twice, the set of all possible outcomes of the experiment is S = {HH, HT, TH, TT}.
If we are interested in the number of heads that came out in the experiment, then we can assign the numbers 0, 1, and 2 to each of the 4 possible outcomes. Thus, we can write

Outcome    Number of Heads
HH         2
HT         1
TH         1
TT         0

From the table above, instead of writing Number of Heads, we can denote it as the set X whose elements are x1 = 0, x2 = 1, and x3 = 2. In symbols, X = {0, 1, 2}.
Then X is called a random variable.

Random Variable [Definition]
A variable whose assigned value is determined by the outcome of a random experiment or procedure is known as a random variable.
▪ It is usually denoted by an uppercase letter such as X, whose values are denoted by lowercase letters x1, x2, x3, and so on.
▪ Understanding the concepts of sample space and random variables is important in the study of probability.

Classification of Random Variables
❖ Discrete Random Variables are random variables that can take on a finite or countably infinite number of distinct values. Each value can be described by an integer value.
❖ Continuous Random Variables are random variables that take an uncountably infinite number of possible values, typically measurable quantities. The values are obtained by measurement and may assume all values in the interval between any two given values along a number line.

Examples:
1. A study on the number of customers served by a restaurant on a particular day was conducted. If the random variable X denotes the number of customers served on that day, then X can take one of the values X = 0, 1, 2, 3, 4, ...
Answer: In this case the number of customers may increase indefinitely, and each number is a distinct, specific value. We call X a discrete random variable.
2. Suppose we are interested in looking at statistics test scores from a sample of 40 students. The random variable would be the test scores, which would range from 0% to 100%. In this case we will use intervals to denote the various values of the random variable.
Answer: When we use intervals for our random variable, all values in the interval are possible values of the random variable. We call this kind of variable a continuous random variable.

Probability Distributions of Discrete Random Variables

Probability Notation for a Discrete Random Variable
❖ The probability of an event E, denoted by P(E), is the chance or likelihood of that event occurring.
❖ To be more precise and consistent when describing the probability of an event, a numerical measure is applied.
❖ The numerical systems used to describe probability assign values ranging from zero (0) for impossible events up to one (1) for events that are certain to occur.
❖ The probability of all other events lies between these two extreme values.
❖ In any experiment with equally likely outcomes, the probability that an event E occurs is given by
$P(E) = \frac{\text{number of times } E \text{ occurs}}{\text{number of possible outcomes}}$
If E is certain to occur, P(E) = 1.
If E is impossible, P(E) = 0.
❖ For a discrete random variable, we may either create a table or use a formula to give probabilities for each possible value. The correspondence between each possible value and its probability is known as the probability distribution function (pdf) for the variable.
❖ In other words, for a discrete random variable X, the probability distribution function can be given as a table or rule that assigns a probability to each possible value of the variable X.

Properties of the Probabilities of Discrete Random Variables
1. The probability of each value is between 0 and 1.
2. The sum of all the probabilities is equal to 1.

Lesson 3: Mean of the Discrete Random Variable

Mean
The mean is one of the most important probabilistic concepts in statistics. The mean of the discrete random variable X, denoted by $\mu$ or $E(X)$, is the weighted average of all possible values of X. It does not have to be a value that the discrete random variable can assume.

$E(X) = \mu = \sum x P(x)$
E(X) = the mean of the outcomes x
$\mu$ = the mean
$\sum x P(x)$ = the sum of each random variable value x multiplied by its own probability P(x)

Example: Let X be the number of cakes sold in a certain store during Valentine's Day. The values of X and their corresponding probabilities are given in the table below. Solve for the mean of X.
[Table: values of X and their probabilities P(X)]
The value obtained is called the mean of the random variable or the mean of the probability distribution of X. The mean tells us that the average number of cakes sold by the store during Valentine's Day is 23.95. Since we are referring to a number of cakes, the mean is approximately 24 cakes.
NOTE: The mean of a random variable is also referred to as the expected value, denoted by $\mu_X = E[X]$.

Lesson 4: Variance and Standard Deviation of Discrete Random Variable

Variance and Standard Deviation of a Probability Distribution [Long Method]
The mean µ of the random variable X provides us with a measure of the central location of the distribution of X, but it does not give us information on how the various values are dispersed from µ.
Variance – combines all the values in a data set to produce a measure of spread. It is used to measure how far the data values are dispersed from the mean.
Standard Deviation – the measure of spread most commonly used in statistical practice. It is the square root of the variance and is used to measure the amount of dispersion of the given data set values.
NOTE: Standard deviation is also useful when comparing the spread of two separate data sets that have approximately the same mean. Generally, the more widely spread the values are, the larger the standard deviation is.
The variance of a discrete probability distribution is given by the formula:
$\sigma^2 = \sum [(X - \mu)^2 \cdot P(X)]$
The formula used to compute the standard deviation of the discrete probability distribution is:
$\sigma = \sqrt{\sum [(X - \mu)^2 \cdot P(X)]}$
where:
X = value of the random variable
P(X) = probability of the random variable X
µ = mean of the probability distribution
Example: Find the mean
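A minimal sketch of the formulas for the mean, variance, and standard deviation of a discrete probability distribution. It assumes a small made-up distribution (the number of heads in two tosses of a fair coin) rather than the cake-sales table, and the function names are hypothetical.

```python
import math

def is_valid_distribution(dist):
    """Check the two properties: each P(x) is in [0, 1] and the total is 1."""
    probs = dist.values()
    return all(0 <= p <= 1 for p in probs) and math.isclose(sum(probs), 1.0)

def mean(dist):
    """E(X) = sum of x * P(x)."""
    return sum(x * p for x, p in dist.items())

def variance(dist):
    """sigma^2 = sum of (x - mu)^2 * P(x)."""
    mu = mean(dist)
    return sum((x - mu) ** 2 * p for x, p in dist.items())

def std_dev(dist):
    """sigma = square root of the variance."""
    return math.sqrt(variance(dist))

# Assumed example: X = number of heads in two tosses of a fair coin.
heads = {0: 0.25, 1: 0.50, 2: 0.25}

print(is_valid_distribution(heads))
print(mean(heads), variance(heads), std_dev(heads))
```

Storing the distribution as a mapping from each value x to P(x) mirrors the two-column table used in the lessons, so each formula becomes a single sum over the rows.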
Variance and Standard Deviation of a Probability Distribution [Short Method]
Variance
$\sigma^2 = \sum [X^2 \cdot P(X)] - \mu^2 = \sum [(X - \mu)^2 \cdot P(X)] = 0.56$
Standard Deviation
$\sigma = \sqrt{\sum [(X - \mu)^2 \cdot P(X)]} = \sqrt{0.56} \approx 0.75$

Lesson 5: Normal Probability Distributions

Properties of Normal Distributions
A continuous random variable has an infinite number of possible values that can be represented by an interval on the number line.
[Figure: number line marked from 0 to 24 in steps of 3]
The probability distribution of a continuous random variable is called a continuous probability distribution.
The most important probability distribution in statistics is the normal distribution.
A normal distribution is a continuous probability distribution for a random variable, x. The graph of a normal distribution is called the normal curve.

Properties of a Normal Distribution
1. The mean, median, and mode are equal.
2. The normal curve is bell-shaped and symmetric about the mean.
3. The total area under the curve is equal to one.
4. The normal curve approaches, but never touches, the x-axis as it extends farther and farther away from the mean.
5. Between μ − σ and μ + σ (in the center of the curve), the graph curves downward. The graph curves upward to the left of μ − σ and to the right of μ + σ. The points at which the curve changes from curving upward to curving downward are called the inflection points.
A normal distribution can have any mean and any positive standard deviation.
The standard deviation describes the spread of the data.

Empirical Rule
The empirical rule is better known as the 68%-95%-99.70% rule.
This rule states that approximately 68%, 95%, and 99.70% of the data in a normal distribution lie within one (1), two (2), and three (3) standard deviations of the mean, respectively.
Since the area under a normal curve is equal to 1 or 100%, as stated in its properties, only about 0.30% of the data fall more than 3 standard deviations from the mean.
The distribution can be summarized in the following percentages (the grade intervals correspond to a mean of 87 and a standard deviation of 4):
▪ 68% of the data lie within 1 standard deviation of the mean: grades of 83 to 91
▪ 95% of the data lie within 2 standard deviations of the mean: grades of 79 to 95
▪ 99.70% of the data lie within 3 standard deviations of the mean: grades of 75 to 99

The Standard Normal Distribution
The standard normal distribution is a normal distribution with a mean of 0 and a standard deviation of 1.
Any value can be transformed into a z-score by using the formula:
$z = \frac{\text{Value} - \text{Mean}}{\text{Standard deviation}} = \frac{x - \mu}{\sigma}$
If each data value of a normally distributed random variable x is transformed into a z-score, the result will be the standard normal distribution.
After the formula is used to transform an x-value into a z-score, the Standard Normal Table in Appendix A is used to find the cumulative area under the curve.

Properties of the Standard Normal Distribution
1. The cumulative area is close to 0 for z-scores close to z = −3.49.
2. The cumulative area increases as the z-scores increase.
3. The cumulative area for z = 0 is 0.5000.
4. The cumulative area is close to 1 for z-scores close to z = 3.49.

Guidelines for Finding Areas
Finding Areas Under the Standard Normal Curve
1. Sketch the standard normal curve and shade the appropriate area under the curve.
2. Find the area by following the directions for each case shown.
a. To find the area to the left of z, find the area that corresponds to z in the Standard Normal Table.
b. To find the area to the right of z, use the Standard Normal Table to find the area that corresponds to z. Then subtract the area from 1.
c. To find the area between two z-scores, find the area corresponding to each z-score in the Standard Normal Table. Then subtract the smaller area from the larger area.

Lesson 6: Converting a Normal Random Variable to a Standard Normal Variable and Vice-Versa
Here, you convert the raw score (x) into the standard score (z) using the formula:
$z = \frac{X - \mu}{\sigma}$
You substitute the given values and use the properties of equality and algebraic rules to obtain the needed value. This procedure is known as "standardizing" or "standardization" of a random variable, where a standardized value is called a z-score.
A z-score is a measure of the number of standard deviations (σ) a particular data value is away from the mean (μ).
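The standardization formula and the three area cases (left of z, right of z, between two z-scores) can be sketched as follows. This is only an illustration: it assumes Python's `statistics.NormalDist` as a stand-in for the printed Standard Normal Table, and the function names and the sample numbers are hypothetical.

```python
from statistics import NormalDist

def z_score(x, mu, sigma):
    """Standardize a raw score: z = (x - mu) / sigma."""
    return (x - mu) / sigma

def raw_score(z, mu, sigma):
    """Undo the standardization: x = z * sigma + mu."""
    return z * sigma + mu

# Standard normal distribution: mean 0, standard deviation 1.
standard_normal = NormalDist(mu=0, sigma=1)

def area_left(z):
    """Cumulative area to the left of z (the value a table would list)."""
    return standard_normal.cdf(z)

def area_right(z):
    """Area to the right of z: subtract the left area from 1."""
    return 1 - area_left(z)

def area_between(z1, z2):
    """Area between two z-scores: larger left area minus smaller left area."""
    low, high = sorted((z1, z2))
    return area_left(high) - area_left(low)

# Hypothetical values, chosen only for illustration.
z = z_score(x=75, mu=70, sigma=10)
print(z, raw_score(z, mu=70, sigma=10))
print(round(area_left(0), 4))          # 0.5, the cumulative area at z = 0
print(round(area_between(-1, 1), 4))
```

Computed areas may differ from table values in the last digit, because a printed table rounds z to two decimal places and areas to four.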
Example: Suppose your score on a test in Probability and Statistics is 39 and the scores are normally distributed with a mean of 33 and a standard deviation of 3. Then your score is exactly 2 standard deviations above the mean, since (39 − 33)/3 = 2.
If you scored 30, then your score is exactly 1 standard deviation below the mean. All values that are above the mean have positive z-scores and all values that are below the mean have negative z-scores.
If you obtained a z-score of −3, this means that your score is 3 standard deviations below the mean, that is, 33 − 3(3) = 33 − 9 = 24.

Z-Score
Given any value x from a normal distribution with mean 𝜇 and standard deviation 𝜎, to convert x to a z-score (standard normal score), you need to:
a. Subtract the mean 𝜇 from x.
b. Divide this quantity, x − 𝜇, by the standard deviation 𝜎.
The formula used in converting a random variable x to a standard normal variable z is:
$z = \frac{x - \mu}{\sigma}$
where:
z – standard normal score or z-score
x – any data value in a normal distribution
𝜇 – mean
𝜎 – standard deviation
To solve for the normal random variable x, multiply the z-score (z) by the standard deviation 𝜎, then add the mean 𝜇:
$x = z\sigma + \mu$

Lesson 7: Computing Probabilities and Percentiles Using the Standard Normal Table
A normal distribution curve can be used as a probability distribution curve for normally distributed variables.
The area under the standard normal distribution curve can also be thought of as a probability.
That is, if it is possible to select any z value at random, the probability of choosing one below, say, 1.45 would be the same as the area under the curve to the left of 1.45.
In this case, the area is 0.9265.
Therefore, the probability of randomly selecting a z value below 1.45 is 0.9265 or 92.65%.
Problems involving probabilities and percentiles are solved in the same manner as finding areas under a normal curve.
In finding probabilities, the following notations will be used:
a. P(Z < z) – probability to the left of z
b. P(Z > z) = 1 − P(Z < z) – probability to the right of z
c. P(a < Z < b) – probability that Z is between two z values, say a and b
d. P(X < x) – probability to the left of a value x of a normal random variable
e. P(X > x) – probability to the right of a value x of a normal random variable
f. P(a < X < b) – probability that a normal random variable X is between two values, say a and b
g. P(X < a) + P(X > b) – probability that X lies outside the interval between two values, say a and b

Example: To pass the Accreditation and Equivalency (A&E) test, ALS students must score in the top 15% in general ability tests. The test has a mean of 200 and a standard deviation of 20. Find the lowest possible score to pass the test, assuming the test scores are normally distributed.
The lowest possible score is the value of the normal random variable corresponding to the z value that cuts off an area of 0.15 in the right tail of the normal curve. To solve for it, we are given that P(Z > z) = 0.15.
NOTE: Remember that P(Z < z) + P(Z > z) = 1.
P(Z > z) = 0.15
P(Z < z) = 1 − P(Z > z) = 1 − 0.15 = 0.85, since the table lists areas to the left and the given area is in the right tail.
From the table, look for the z value corresponding to this area. The closest table entry is 0.8508, at z = 1.04.
Thus, z = 1.04. Convert this to a normal random variable x:
x = (1.04)(20) + 200 = 220.8 → 221

Lesson 8: Locating Percentiles Under the Normal Curve

Percentile
❖ For any set of measurements (arranged in ascending or descending order), a percentile (or a centile) is a point in the distribution such that a given percentage of the cases is below it.
❖ A percentile is a measure of relative standing. It is a descriptive measure of the relationship of a measurement to the rest of the data.

Percentile and Z-Scores
❖ A probability value corresponds to an area under the normal curve.
❖ In the Table of Areas Under the Normal Curve, the numbers in the extreme left column and across the top row are z-scores, which are the distances along the horizontal scale. The numbers in the body of the table are areas or probabilities.
❖ The z-scores to the left of the mean are negative values.

Lesson 9: Sampling Distributions and the Central Limit Theorem
A sampling distribution is the probability distribution of a sample statistic that is formed when samples of size n are repeatedly taken from a population.
If the sample statistic is the sample mean, then the distribution is the sampling distribution of sample means.
The sampling distribution consists of the values of the sample means of these samples.
The totality of subjects (people, animals, or objects) under consideration is called the population. The portion chosen from a population is called a sample, and the process of taking samples is called sampling.
Random sampling refers to the technique in which each member of the population is given an equal chance to be chosen as part of the sample. The lottery method, drawing lots, or the use of random numbers can be used to accomplish random sampling.

Types of Random Sampling
A. Simple Random Sampling
It is the most basic sampling technique. In this sampling technique, every member of the population has an equal chance of being chosen to be a part of the sample. One way to do simple random sampling is by using the Table of Random Numbers or by using the lottery method.
B. Systematic Sampling
A starting point is selected at random, and then every k-th member is taken from a listing of the population.
C. Stratified Sampling
In stratified sampling, the population is partitioned into several subgroups called strata, which are based on some characteristics like year level, gender, age, ethnicity, etc.
D. Cluster or Area Sampling
The population is divided into clusters. From these clusters, a random sample of clusters is drawn. All the elements of the sampled clusters make up the sample.

Parameters and Statistic
The measurement or quantity that describes the population is called a PARAMETER, while the measurement or quantity that describes the sample is called a STATISTIC.

Lesson 10: Mean and Variance of the Sampling Distribution of the Sample Mean

Constructing a sampling distribution of the sample mean
Population: {1, 3, 5}
Samples of size 2 drawn with replacement
Step 1. Use the Fundamental Counting Principle to get the number of possible outcomes [3 x 3 = 9].
Step 2. Solve for the mean of each sample.
Step 3. Construct the sampling distribution of the sample mean by preparing another table with the means in the first column and their probabilities in the second column.
Step 4. Add a column for $\bar{X} \cdot P(\bar{X})$.
Step 5. Get the sum of the entries for $\bar{X} \cdot P(\bar{X})$:
$E(\bar{X}) = \sum_{i} \bar{X}_i \cdot P(\bar{X}_i)$
where the sum runs over the distinct values of the sample mean.
Step 6. Add a column for $\bar{X}^2$.
Step 7. Add a column for $\bar{X}^2 \cdot P(\bar{X})$.
Step 8. Get the sum of the entries for $\bar{X}^2 \cdot P(\bar{X})$:
$E(\bar{X}^2) = \sum_{i} \bar{X}_i^2 \cdot P(\bar{X}_i)$
Step 9. Solve for the variance:
$Var(\bar{X}) = E(\bar{X}^2) - [E(\bar{X})]^2$
NOTE: The mean of the sampling distribution of the sample mean is equal to the population mean.
NOTE: The variance of the sampling distribution of the sample mean is equal to the population variance divided by the sample size.

Theorem
If all possible random samples of size n are taken with replacement from a population with mean $\mu$ and variance $\sigma^2$, then the mean $\mu_{\bar{X}}$, variance $\sigma_{\bar{X}}^2$, and standard error $\sigma_{\bar{X}}$ of the sampling distribution of the sample mean are:
$\mu_{\bar{X}} = \mu$ (mean)
$\sigma_{\bar{X}}^2 = \frac{\sigma^2}{n}$ (variance)
$\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}}$ (standard error)
❖ Standard error – standard deviation of the sampling distribution of the sample mean
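Steps 1 to 9 for the population {1, 3, 5} with samples of size 2 drawn with replacement can be sketched as follows. This is a minimal illustration, not the lesson's worked tables: the variable names are hypothetical, and exact fractions are used (an implementation choice) so that the comparison with the theorem is free of rounding.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

population = [1, 3, 5]
n = 2  # sample size

# Step 1: list all ordered samples drawn with replacement (3 x 3 = 9).
samples = list(product(population, repeat=n))

# Step 2: compute the mean of each sample.
sample_means = [Fraction(sum(s), n) for s in samples]

# Step 3: sampling distribution, mapping each mean to its probability.
counts = Counter(sample_means)
distribution = {m: Fraction(c, len(samples)) for m, c in sorted(counts.items())}

# Steps 4-5: E(X-bar) = sum of X-bar * P(X-bar).
expected_mean = sum(m * p for m, p in distribution.items())

# Steps 6-8: E(X-bar^2) = sum of X-bar^2 * P(X-bar).
expected_square = sum(m ** 2 * p for m, p in distribution.items())

# Step 9: Var(X-bar) = E(X-bar^2) - [E(X-bar)]^2.
variance_of_mean = expected_square - expected_mean ** 2

# Population parameters, for comparison with the theorem.
mu = Fraction(sum(population), len(population))
sigma_sq = sum((x - mu) ** 2 for x in population) / len(population)

print(distribution)
print(expected_mean, mu)                 # both equal the population mean
print(variance_of_mean, sigma_sq / n)    # both equal sigma^2 / n
```

For this population the sketch gives a mean of 3 and a variance of 4/3 for the sampling distribution, which agrees with the population mean of 3 and the population variance of 8/3 divided by n = 2.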
Multi-Rule Map
Title: Finding the mean and variance of the sampling distribution of the sample mean

Theorem 1
If all possible random samples of size n are taken with replacement (infinite population) from a population with mean $\mu$ and variance $\sigma^2$, then the mean $\mu_{\bar{X}}$, variance $\sigma_{\bar{X}}^2$, and standard error $\sigma_{\bar{X}}$ of the sampling distribution of the sample mean are:
(mean) $\mu_{\bar{X}} = \mu$
(variance) $\sigma_{\bar{X}}^2 = \frac{\sigma^2}{n}$
(standard error) $\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}}$
❖ Standard error – standard deviation of the sampling distribution of the sample mean

Theorem 2
If all possible random samples of size n are taken without replacement (dependent) from a finite population of size N with mean $\mu$ and variance $\sigma^2$, then the mean $\mu_{\bar{X}}$, variance $\sigma_{\bar{X}}^2$, and standard deviation $\sigma_{\bar{X}}$ of the sampling distribution of the sample mean are:
(mean) $\mu_{\bar{X}} = \mu$
(variance) $\sigma_{\bar{X}}^2 = \frac{\sigma^2}{n} \left( \frac{N - n}{N - 1} \right)$
(standard deviation or standard error) $\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}} \sqrt{\frac{N - n}{N - 1}}$
Note: The factor $\frac{N - n}{N - 1}$ is called the finite population correction factor. It will be close to 1 and can be safely ignored when n is small compared to N.
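A small check of Theorem 2, sketched under the assumption that the same population {1, 3, 5} from Lesson 10 is sampled without replacement with n = 2. The variable names are hypothetical, and unordered samples are used because counting ordered samples instead gives the same distribution of means.

```python
from fractions import Fraction
from itertools import combinations
from math import sqrt

population = [1, 3, 5]
N, n = len(population), 2

# All samples of size n drawn without replacement (order ignored).
samples = list(combinations(population, n))
sample_means = [Fraction(sum(s), n) for s in samples]

# Every sample is equally likely, so average over the list directly.
mean_of_means = sum(sample_means) / len(sample_means)
var_of_means = sum((m - mean_of_means) ** 2 for m in sample_means) / len(sample_means)

# Population parameters.
mu = Fraction(sum(population), N)
sigma_sq = sum((x - mu) ** 2 for x in population) / N

# Theorem 2: variance with the finite population correction factor.
correction = Fraction(N - n, N - 1)
theorem_variance = sigma_sq / n * correction
standard_error = sqrt(theorem_variance)

print(mean_of_means, mu)
print(var_of_means, theorem_variance)
print(standard_error)
```

Here n is not small compared to N, so the correction factor is 1/2 and cannot be ignored: the variance is 2/3 instead of the 4/3 obtained with replacement.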
Formulas to Remember!
1. Sample mean
$\bar{X} = \frac{\sum X_i}{n}$
2. Probability – P(E)
$P(E) = \frac{\text{number of times } E \text{ occurs}}{\text{number of possible outcomes}}$
NOTE:
The probability of each value is between 0 and 1.
The sum of all the probabilities is equal to 1.
3. Mean of the probability distribution
$E(X) = \mu = \sum x P(x)$
E(X) = the mean of the outcomes x
$\mu$ = the mean
$\sum x P(x)$ = the sum of each random variable value x multiplied by its own probability P(x)
4. Formula for Variance [L.M]
$\sigma^2 = \sum [(X - \mu)^2 \cdot P(X)]$
X = value of the random variable
P(X) = probability of the random variable X
µ = mean of the probability distribution
5. Formula for Standard Deviation [L.M]
$\sigma = \sqrt{\sum [(X - \mu)^2 \cdot P(X)]}$
X = value of the random variable
P(X) = probability of the random variable X
µ = mean of the probability distribution
6. Formula for Variance [S.M]
$\sigma^2 = \sum [X^2 \cdot P(X)] - \mu^2 = \sum [(X - \mu)^2 \cdot P(X)]$
7. Standard Deviation [S.M]
$\sigma = \sqrt{\sum [X^2 \cdot P(X)] - \mu^2} = \sqrt{\sum [(X - \mu)^2 \cdot P(X)]}$
8. Empirical Rule
This rule states that approximately 68%, 95%, and 99.70% of the data in a normal distribution lie within one (1), two (2), and three (3) standard deviations of the mean, respectively.
68% of the data lie within 1 standard deviation of the mean: grades of 83 to 91
95% of the data lie within 2 standard deviations of the mean: grades of 79 to 95
99.70% of the data lie within 3 standard deviations of the mean: grades of 75 to 99
9. The area under a normal curve is equal to 1 or 100%
10. Z-Score
$z = \frac{\text{Value} - \text{Mean}}{\text{Standard deviation}} = \frac{x - \mu}{\sigma}$
11. Above/Right the curve
1 − (table area for the z-score)
12. Below/Left the curve
Table area for the z-score
14. Between the curve
(Table area for the larger z-score) − (table area for the smaller z-score)
15. Random Variable X
$X = z\sigma + \mu$
16. Looking for Percentiles
a. P(Z < z) – probability to the left of z
b. P(Z > z) = 1 − P(Z < z) – probability to the right of z
c. P(a < Z < b) – probability that Z is between two z values, say a and b
d. P(X < x) – probability to the left of a value x of a normal random variable
e. P(X > x) – probability to the right of a value x of a normal random variable
f. P(a < X < b) – probability that a normal random variable X is between two values, say a and b
g. P(X < a) + P(X > b) – probability that X lies outside the interval between two values, say a and b
17. Slovin's Formula [Simple Random Sampling]
$n = \frac{N}{1 + N e^2}$
N = number of elements in the population, e = margin of error
18. Systematic Sampling
$k = \frac{N}{n}$
N = number of elements in the population
n = number of elements in the sample
19. Step 5 in constructing sampling distribution
$E(\bar{X}) = \sum_{i} \bar{X}_i \cdot P(\bar{X}_i)$
20. Step 8 in constructing sampling distribution
$E(\bar{X}^2) = \sum_{i} \bar{X}_i^2 \cdot P(\bar{X}_i)$
21. Formula for the Variance of Sampling Distribution
$Var(\bar{X}) = E(\bar{X}^2) - [E(\bar{X})]^2$
22. Formula for the mean of sampling distribution
$\mu_{\bar{X}} = \mu$
23. Formula for the variance of the sampling distribution
$\sigma_{\bar{X}}^2 = \frac{\sigma^2}{n}$
24. Formulas for Theorem 1
$\mu_{\bar{X}} = \mu$ (mean)
$\sigma_{\bar{X}}^2 = \frac{\sigma^2}{n}$ (variance)
$\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}}$ (standard error)
25. Formulas for Theorem 2
$\mu_{\bar{X}} = \mu$ (mean)
$\sigma_{\bar{X}}^2 = \frac{\sigma^2}{n} \left( \frac{N - n}{N - 1} \right)$ (variance)
$\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}} \sqrt{\frac{N - n}{N - 1}}$ (standard error)
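Slovin's formula and the systematic sampling interval k = N/n can be tried out with the sketch below. Several details are assumptions rather than anything stated in these notes: the sample size is rounded up, k is rounded down, n is assumed to be no larger than N, and the population of 1000 numbered IDs, the margin of error of 0.05, and the function names are hypothetical.

```python
import math
import random

def slovin_sample_size(N, e):
    """Slovin's formula n = N / (1 + N * e^2), rounded up (assumed convention)."""
    return math.ceil(N / (1 + N * e ** 2))

def systematic_sample(population, n, seed=None):
    """Take every k-th element after a random start, with k = N // n (assumed convention)."""
    k = len(population) // n
    start = random.Random(seed).randrange(k)
    return population[start::k][:n]

# Hypothetical population of 1000 numbered IDs.
population = list(range(1, 1001))
n = slovin_sample_size(len(population), e=0.05)
sample = systematic_sample(population, n, seed=1)

print(n, len(sample))
print(sample[:5])
```

Passing a seed makes the random starting point reproducible; leaving it out gives a different start on each run.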