Statistics

explanation
Copyright
© © All Rights Reserved
Statistics for Business

Principles of Statistics for Admin. (15060105)


Statistics: is the science of conducting studies to collect, organize,
summarize, analyze, and draw conclusions from data.

Statistics has two main branches:

Descriptive statistics consists of the collection, organization, summarization,
and presentation of data.

Inferential statistics consists of generalizing from samples to populations, performing
estimations and hypothesis tests, determining relationships among variables,
and making predictions.

Note: Inferential statistics uses probability.
A variable is a characteristic or attribute that can assume different values.

Data are the values (measurements or observations) that the variables can assume.

Variables whose values are determined by chance are called random variables.

A population consists of all subjects (human or otherwise) that are being studied.

A sample is a group of subjects selected from a population.


Chapter 2 / Section 2.1:
Summarizing Data for a Categorical Variable

Frequency Distribution
A frequency distribution is a tabular summary of data showing the number (frequency)
of observations in each of several nonoverlapping categories or classes.
Relative Frequency and Percent Frequency Distributions

A relative frequency distribution gives a tabular summary of data showing the relative frequency
for each class. A percent frequency distribution summarizes the percent frequency of the data for
each class.
Bar Charts and Pie Charts
A bar chart is a graphical display for depicting
categorical data summarized in a frequency, relative frequency, or percent
frequency distribution. On one axis of the chart (usually the horizontal axis),
we specify the labels that are used for the classes (categories).
The pie chart provides another graphical display for
presenting relative frequency and percent frequency distributions
for categorical data.
Example: because a circle contains 360 degrees and Coca-Cola shows a
relative frequency of 0.38, the sector of the pie chart labeled Coca-Cola
consists of 0.38(360) = 136.8 degrees.
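The definitions above can be illustrated with a minimal Python sketch. The purchase list is hypothetical (it is not the data set behind the Coca-Cola example); only the 0.38 relative frequency in the last line comes from the text.

```python
from collections import Counter

# Hypothetical sample of soft drink purchases (illustrative data only).
purchases = ["Coca-Cola", "Pepsi", "Coca-Cola", "Sprite", "Pepsi",
             "Coca-Cola", "Diet Coke", "Coca-Cola", "Pepsi", "Sprite"]

frequency = Counter(purchases)   # frequency of each category
n = len(purchases)

summary = {}
for label, count in frequency.items():
    relative = count / n                 # relative frequency
    summary[label] = {
        "frequency": count,
        "relative": relative,
        "percent": relative * 100,       # percent frequency
        "pie_degrees": relative * 360,   # sector angle in a pie chart
    }

for label, row in summary.items():
    print(label, row)

# Worked example from the text: relative frequency 0.38 gives 136.8 degrees.
coca_cola_degrees = round(0.38 * 360, 1)
```

The relative frequencies always sum to 1, so the sector angles sum to 360 degrees.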
Chapter 2 / Section 2.2:
Summarizing Data for a Quantitative Variable:

* Frequency Distribution
Note:
This definition holds for quantitative as well as categorical data. However, with quantitative data we
must be more careful in defining the nonoverlapping classes to be used in the frequency distribution.

Steps:
1. determine the number of nonoverlapping classes.
2. determine the width of each class.
3. determine the class limits.
Number of classes: Classes are formed by specifying ranges that will be used to group the data. As
a general guideline, we recommend using between 5 and 20 classes.

Width of the classes: As a general guideline, we recommend that the width be the same for each class.
The approximate class width given by equation (2.2), that is,
(largest data value − smallest data value) / number of classes, can be rounded up.
Example: 9.28 might be rounded to 10.

Class limits: Class limits must be chosen so that each data item belongs to one and only one class.

Example
class width = (33 − 12)/5 = 4.2, rounded up to 5

The smallest value is 12, so we can start the first class from 10.

Note:
class width=15-10=5
class width=20-15=5
class width=25-20=5
class width=30-25=5
class width=35-30=5
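The three steps above can be sketched in Python as follows. The helper name `class_limits` and its `start` parameter are hypothetical, and the sketch assumes whole-number data, so each upper limit is one unit below the next lower limit.

```python
import math

def class_limits(smallest, largest, num_classes, start=None):
    """Return (width, [(lower, upper), ...]) for whole-number data."""
    # Approximate class width, rounded up to a whole number.
    width = math.ceil((largest - smallest) / num_classes)
    lower = smallest if start is None else start
    limits = []
    for _ in range(num_classes):
        # The upper limit is one unit below the next class's lower limit.
        limits.append((lower, lower + width - 1))
        lower += width
    return width, limits

# Values from the example: smallest 12, largest 33, 5 classes, starting at 10.
width, limits = class_limits(12, 33, 5, start=10)
print(width)    # 5
print(limits)   # [(10, 14), (15, 19), (20, 24), (25, 29), (30, 34)]
```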
Class midpoint: the value halfway between the lower and upper class limits.

Audit Time (days)   Frequency   Relative Frequency   Percent Frequency   Class midpoint
10–14               4           4/20 = 0.20          20%                 (10+14)/2 = 12
15–19               8           8/20 = 0.40          40%                 (15+19)/2 = 17
20–24               5           5/20 = 0.25          25%                 (20+24)/2 = 22
25–29               2           2/20 = 0.10          10%                 (25+29)/2 = 27
30–34               1           1/20 = 0.05          5%                  (30+34)/2 = 32
Total               20          1.00                 100%

The class midpoint is the same whether it is obtained from the class limits or from the class boundaries.
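As a rough check of the audit time table, the following Python sketch recomputes each column from the class limits and frequencies. It assumes whole-number data, so the class boundaries lie half a unit outside the class limits.

```python
classes = [(10, 14), (15, 19), (20, 24), (25, 29), (30, 34)]
frequencies = [4, 8, 5, 2, 1]
n = sum(frequencies)

rows = []
for (lower, upper), f in zip(classes, frequencies):
    boundaries = (lower - 0.5, upper + 0.5)   # assumed half-unit boundaries
    rows.append({
        "class": f"{lower}-{upper}",
        "frequency": f,
        "relative": f / n,
        "percent": 100 * f / n,
        "midpoint_from_limits": (lower + upper) / 2,
        "midpoint_from_boundaries": sum(boundaries) / 2,
    })

for row in rows:
    print(row)
```

Both midpoint columns agree for every class, which is the point made in the note above.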
Histogram
A common graphical display of quantitative data is a histogram. This graphical display can
be prepared for data previously summarized in either a frequency, relative frequency, or
percent frequency distribution

Class boundaries are used to construct a histogram.

Audit Time (days)   Frequency   Class boundaries
10–14               4           9.5–14.5
15–19               8           14.5–19.5
20–24               5           19.5–24.5
25–29               2           24.5–29.5
30–34               1           29.5–34.5
Total               20

A histogram contains no natural separation between the rectangles of adjacent classes.
This is achieved by using class boundaries, so that the upper boundary of each class equals
the lower boundary of the next class.

In our example, the audit time classes are stated as 10–14, 15–19, 20–24, 25–29, and 30–34,
which leaves one-unit spaces from 14 to 15, 19 to 20, 24 to 25, and 29 to 30.
The class boundaries 14.5, 19.5, 24.5, and 29.5 eliminate these spaces.
Histogram types: Symmetric, Skewed to the left and Skewed to the right
Frequency polygon: Symmetric, Skewed to the left and Skewed to the right
Frequency Ogives: Symmetric, Skewed to the left and Skewed to the right
Cumulative Distributions
Exercise:
Consider the following frequency distribution (the given columns are Class Limit and Frequency).
Construct the Relative Frequency, Percent Frequency, Cumulative Frequency,
Relative Cumulative Frequency, Percent Cumulative Frequency, Class midpoint,
and Class Width.

Class    Frequency   Relative    Percent     Cumulative   Relative Cumulative   Percent Cumulative   Class      Class
Limit                Frequency   Frequency   Frequency    Frequency             Frequency            midpoint   Width
10–19    10          0.20        20%         10           0.20                  20%                  14.5       10
20–29    14          0.28        28%         24           0.48                  48%                  24.5       10
30–39    17          0.34        34%         41           0.82                  82%                  34.5       10
40–49    7           0.14        14%         48           0.96                  96%                  44.5       10
50–59    2           0.04        4%          50           1.00                  100%                 54.5       10
Total    50
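The exercise table can be reproduced with a short Python sketch. Only the class limits and frequencies are taken as given; the class width is computed here as the difference between consecutive lower limits, which is an assumption consistent with the width of 10 shown in the table.

```python
from itertools import accumulate

classes = [(10, 19), (20, 29), (30, 39), (40, 49), (50, 59)]
frequencies = [10, 14, 17, 7, 2]
n = sum(frequencies)

relative = [f / n for f in frequencies]
cumulative = list(accumulate(frequencies))          # running total of frequencies
relative_cumulative = [c / n for c in cumulative]
percent_cumulative = [100 * c / n for c in cumulative]
midpoints = [(lo + hi) / 2 for lo, hi in classes]
# Width taken as the gap between consecutive lower class limits.
widths = [classes[i + 1][0] - classes[i][0] for i in range(len(classes) - 1)]

print(cumulative)   # [10, 24, 41, 48, 50]
print(midpoints)    # [14.5, 24.5, 34.5, 44.5, 54.5]
print(widths)       # [10, 10, 10, 10]
```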
Chapter 3:
Section: 3.1 Measures of Location page (103-

Measures of Central Tendency
(Mean, Weighted mean, Median, Mode)

Measures of Location
(Percentiles and Quartiles)

We will learn how to compute these measures for the case of values
and the case of grouped data (frequency table).
Statistic: if the measures are computed for data from a sample, they are called sample statistics.
Parameter: if the measures are computed for data from a population, they are called population parameters.

In statistical inference, a sample statistic is referred to as the point estimator of the
corresponding population parameter.

A statistic is a characteristic or measure obtained by using the data values from a sample.
A parameter is a characteristic or measure obtained by using all the data values from a specific population.
Case of values
Note: The mean is a central tendency measure. Thus, the mean value
must be between the lowest and highest values.
The mean is sometimes referred to as the arithmetic mean.

$\bar{x} = \frac{\sum x_i}{n}$

Example: Compute the mean of: 46, 54, 42, 46, 32
Example: Compute the mean of: 46, 114, 42, 46, 32

Note: The mean is a measure of central tendency that is affected by outliers (a disadvantage).
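A minimal Python sketch of the two mean exercises shows the effect of an outlier. The second list is the first one with 54 replaced by 114.

```python
def mean(values):
    """Arithmetic mean: the sum of the values divided by their count."""
    return sum(values) / len(values)

values = [46, 54, 42, 46, 32]
with_outlier = [46, 114, 42, 46, 32]   # 54 replaced by the outlier 114

print(mean(values))         # 44.0
print(mean(with_outlier))   # 56.0
```

A single extreme value moves the mean from 44 to 56, above four of the five observations.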
Weighted Mean (case of values)
In the formulas for the sample mean and population mean, each x is given equal importance
or weight:

$\bar{x} = \frac{\sum x_i}{n}$

The weighted mean is computed as follows, where $w_i$ is the weight of observation $x_i$:

$\bar{x} = \frac{\sum w_i x_i}{\sum w_i}$

Example: Compute the mean of a student's marks:

Mark (x)   Weight (w)   w*x
70         0.3          21
60         0.1          6
70         0.1          7
80         0.1          8
90         0.4          36
Total      1            78

Weighted mean = 78/1 = 78

Example: Suppose that a manager wanted to know
the mean cost per pound of the raw material.
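The student marks example can be checked with a few lines of Python, a sketch that divides the sum of w*x by the sum of the weights.

```python
marks = [70, 60, 70, 80, 90]
weights = [0.3, 0.1, 0.1, 0.1, 0.4]

weighted_sum = sum(w * x for w, x in zip(weights, marks))   # sum of w*x
weighted_mean = weighted_sum / sum(weights)                 # divide by sum of weights

print(round(weighted_mean, 2))   # 78.0
```

Because the weights here sum to 1, the weighted mean equals the total of the w*x column.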
The median (Med), case of values
The median is another measure of central tendency.
The median is the value in the middle when the data are arranged in
ascending order (smallest value to largest value).

With an odd number of observations, the median is the middle value.

An even number of observations has no single middle value. In this
case, we follow convention and define the median as the average of
the values for the middle two observations.

Example: Compute the median of the values: 46, 54, 42, 32, 46
Start by arranging the values in ascending order:

value   32    42    46    46    54
order   1st   2nd   3rd   4th   5th

Median = 46

Example: Compute the median of the values: 13, 8, 44, 32, 34, 10
Start by arranging the values in ascending order:

value   8     10    13    32    34    44
order   1st   2nd   3rd   4th   5th   6th

Median = $\frac{13 + 32}{2} = \frac{45}{2} = 22.5$
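The odd and even cases of the median rule can be sketched in Python as follows. The function name `median_of_values` is illustrative.

```python
def median_of_values(values):
    """Median of raw values: middle value, or average of the two middle values."""
    ordered = sorted(values)        # arrange in ascending order
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:                  # odd count: single middle value
        return ordered[mid]
    # even count: average of the two middle observations
    return (ordered[mid - 1] + ordered[mid]) / 2

print(median_of_values([46, 54, 42, 32, 46]))       # 46
print(median_of_values([13, 8, 44, 32, 34, 10]))    # 22.5
```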
Mode (M), case of values
The mode is another measure of central tendency.
The mode is the value that occurs with the greatest frequency
(the value that occurs most often in a data set is called the mode).

Example: Compute the mode of the values: 46, 54, 42, 32, 46
Mode = 46 (unimodal)

Example: Compute the mode of the values: 46, 54, 42, 32, 46, 54
Mode = 46 and 54 (bimodal)

Note: The median and the mode are two measures of central tendency
that are not affected by outliers (an advantage).
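A small Python sketch of the mode, returning every value that shares the greatest frequency so that the bimodal case is handled. The function name `modes` is illustrative.

```python
from collections import Counter

def modes(values):
    """Return all values that occur with the greatest frequency."""
    counts = Counter(values)
    top = max(counts.values())
    return sorted(v for v, c in counts.items() if c == top)

print(modes([46, 54, 42, 32, 46]))        # [46]       (unimodal)
print(modes([46, 54, 42, 32, 46, 54]))    # [46, 54]   (bimodal)
```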
Case of values

The mean
The mean of values is the average: $\bar{x} = \frac{\sum x_i}{n}$
The mode
The mode is the value that occurs with greatest frequency

The median
The median is the value in the middle when the data are arranged in ascending order (smallest value
to largest value).

Weighted Mean
The mean, case of frequency table

$n = \sum f_i$, $\bar{x} = \frac{\sum x_i f_i}{n}$
$x_i$: midpoint of the i-th class
$f_i$: frequency of the i-th class

Example:
Compute the mean of student marks

Class mark   Frequency (fi)   Midpoint (xi)   xi fi
1-8          4                4.5             18
9-16         6                12.5            75
17-24        2                20.5            41
25-32        7                28.5            199.5
33-40        1                36.5            36.5
Total        20                               370

$\bar{x} = \frac{\sum x_i f_i}{n} = \frac{370}{20} = 18.5$
Case of frequency table
The mean

Example:
Compute the mean of student marks

Class mark   Frequency (fi)   Midpoint (xi)   xi fi
0-10         3                5               15
10-20        8                15              120
20-30        6                25              150
30-40        5                35              175
40-50        3                45              135
Total        25                               595

$\bar{x} = \frac{\sum x_i f_i}{n} = \frac{595}{25} = 23.8$
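Both grouped mean examples can be verified with a short Python sketch. The helper name `grouped_mean` is illustrative; it takes class limits and frequencies and uses class midpoints as the representative values.

```python
def grouped_mean(classes, frequencies):
    """Mean of a frequency table: sum(x_i * f_i) / sum(f_i), x_i = class midpoint."""
    midpoints = [(lo + hi) / 2 for lo, hi in classes]
    n = sum(frequencies)
    return sum(x * f for x, f in zip(midpoints, frequencies)) / n

mean_1 = grouped_mean([(1, 8), (9, 16), (17, 24), (25, 32), (33, 40)],
                      [4, 6, 2, 7, 1])
mean_2 = grouped_mean([(0, 10), (10, 20), (20, 30), (30, 40), (40, 50)],
                      [3, 8, 6, 5, 3])

print(mean_1)   # 18.5
print(mean_2)   # 23.8
```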
Case of frequency table
The Mode (M)
The mode is the midpoint of the class that has the greatest frequency.

Example: Compute the mode of student marks

Class mark   Frequency (fi)   Midpoint (xi)
1-8          4                4.5
9-16         6                12.5
17-24        2                20.5
25-32        7                28.5
33-40        1                36.5
Total        20

Mode = 28.5

Example: Compute the mode of student marks

Class mark   Frequency (fi)   Midpoint (xi)
1-8          4                4.5
9-16         7                12.5
17-24        2                20.5
25-32        7                28.5
33-40        1                36.5
Total        21

Mode = 12.5 and 28.5
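The grouped mode rule used in these two examples can be sketched in Python. The helper name `grouped_modes` is illustrative; it returns the midpoint of every class that ties for the greatest frequency.

```python
def grouped_modes(classes, frequencies):
    """Midpoint(s) of the class(es) with the greatest frequency."""
    top = max(frequencies)
    return [(lo + hi) / 2
            for (lo, hi), f in zip(classes, frequencies) if f == top]

classes = [(1, 8), (9, 16), (17, 24), (25, 32), (33, 40)]

print(grouped_modes(classes, [4, 6, 2, 7, 1]))   # [28.5]
print(grouped_modes(classes, [4, 7, 2, 7, 1]))   # [12.5, 28.5]
```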


Case of frequency table
The median (Med)
is the value in the middle when the data are arranged in ascending
order (smallest value to largest value).

1) Rank of median: $R = \frac{\sum f_i}{2}$

2) Construct the cumulative frequency table (using all class boundaries)

3) Apply the proportion or the rule:

$\text{Median} = L + (U - L) \times \frac{R - R_L}{R_U - R_L}$

where $L$ and $U$ are the lower and upper boundaries of the median class, and $R_L$ and $R_U$ are the
lower and upper ranks (cumulative frequencies at those boundaries) of the median class.
Example 1: Case of frequency table
Compute the median of student marks

Rank of median = $\frac{\sum f_i}{2} = \frac{20}{2} = 10$

Marks   Frequency   Class boundaries
1-8     4           0.5-8.5
9-16    6           8.5-16.5
17-24   2           16.5-24.5
25-32   7           24.5-32.5
33-40   1           32.5-40.5
Total   20

Cumulative class              Cumulative frequency (Rank)
less than or equal to 0.5     0
less than or equal to 8.5     4
less than or equal to 16.5    10
less than or equal to 24.5    12
less than or equal to 32.5    19
less than or equal to 40.5    20

The rank 10 is reached exactly at the boundary 16.5, so

Median = 16.5
Example 2:
Compute the median of student marks

Rank of median = $\frac{\sum f_i}{2} = \frac{18}{2} = 9$

Marks   Frequency   Class boundaries
1-8     2           0.5-8.5
9-16    3           8.5-16.5
17-24   8           16.5-24.5
25-32   1           24.5-32.5
33-40   4           32.5-40.5
Total   18
Another way to solve Example 2:
Compute the median of student marks

Rank of median = $\frac{\sum f_i}{2} = \frac{18}{2} = 9$

Marks   Frequency   Class boundaries
1-8     2           0.5-8.5
9-16    3           8.5-16.5
17-24   8           16.5-24.5
25-32   1           24.5-32.5
33-40   4           32.5-40.5
Total   18

$\text{Median} = L + (U - L) \times \frac{R - R_L}{R_U - R_L}$

where $L$ and $U$ are the lower and upper boundaries of the median class, $R$ is the rank of the median,
and $R_L$ and $R_U$ are the lower and upper ranks of the median class.

$\text{Median} = 16.5 + (24.5 - 16.5) \times \frac{9 - 5}{13 - 5} = 16.5 + 8 \times \frac{4}{8} = 16.5 + 4 = 20.5$
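The interpolation rule can be sketched in Python as below. The helper name `grouped_median` is illustrative, and the sketch assumes whole-number class limits so that boundaries lie half a unit outside the limits, as in the tables above.

```python
def grouped_median(classes, frequencies):
    """Median of a frequency table by linear interpolation in the median class."""
    rank = sum(frequencies) / 2          # rank of the median
    cumulative = 0
    for (lo, hi), f in zip(classes, frequencies):
        lower_rank, upper_rank = cumulative, cumulative + f
        if rank <= upper_rank:           # first class whose upper rank reaches the rank
            lower_b, upper_b = lo - 0.5, hi + 0.5   # class boundaries
            share = (rank - lower_rank) / (upper_rank - lower_rank)
            return lower_b + (upper_b - lower_b) * share
        cumulative = upper_rank
    raise ValueError("frequency table is empty")

classes = [(1, 8), (9, 16), (17, 24), (25, 32), (33, 40)]

print(grouped_median(classes, [4, 6, 2, 7, 1]))   # Example 1: 16.5
print(grouped_median(classes, [2, 3, 8, 1, 4]))   # Example 2: 20.5
```

In Example 1 the rank falls exactly on the upper rank of the second class, so the interpolation returns that class's upper boundary.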
Percentiles (case of frequency table)
* The p-th percentile is the value such that approximately p% of
the observations are less than it
* and approximately (100 – p)% of the observations are greater than it.
Note: The 50th percentile is also the median.
To find the p-th percentile, begin by arranging the sample values in ascending order, then locate it
using the corresponding rank:

1) Rank (location) of the p-th percentile: $R = \frac{p}{100} \times \sum f_i$
2) Construct the cumulative frequency table (using all class boundaries)

3) Apply the proportion or the rule:

$P_p = L + (U - L) \times \frac{R - R_L}{R_U - R_L}$

where $L$ and $U$ are the lower and upper boundaries of the class containing the p-th percentile, and
$R_L$ and $R_U$ are the lower and upper ranks (cumulative frequencies at those boundaries) of that class.
Example 3:
Compute the 20th percentile of student marks (i.e. p = 20)

Rank of 20th percentile = $\frac{20}{100} \times \sum f_i = \frac{20}{100} \times 18 = 3.6$

Marks   Frequency   Class boundaries
1-8     2           0.5-8.5
9-16    3           8.5-16.5
17-24   8           16.5-24.5
25-32   1           24.5-32.5
33-40   4           32.5-40.5
Total   18
Another way to solve Example 3:
Compute the 20th percentile of student marks (i.e. p = 20)

Rank of 20th percentile = $\frac{20}{100} \times \sum f_i = \frac{20}{100} \times 18 = 3.6$

Marks   Frequency   Class boundaries
1-8     2           0.5-8.5
9-16    3           8.5-16.5
17-24   8           16.5-24.5
25-32   1           24.5-32.5
33-40   4           32.5-40.5
Total   18

$P_p = L + (U - L) \times \frac{R - R_L}{R_U - R_L}$

where $L$ and $U$ are the lower and upper boundaries of the class containing the p-th percentile, $R$ is its rank,
and $R_L$ and $R_U$ are the lower and upper ranks of that class.

$P_{20} = 8.5 + (16.5 - 8.5) \times \frac{3.6 - 2}{5 - 2} = 8.5 + 8 \times \frac{1.6}{3} = 8.5 + 4.27 = 12.77$
Example 4:
Compute the 90th percentile of student marks (i.e. p = 90)

Rank of 90th percentile = $\frac{90}{100} \times \sum f_i = \frac{90}{100} \times 18 = 16.2$

Marks   Frequency   Class boundaries
1-8     2           0.5-8.5
9-16    3           8.5-16.5
17-24   8           16.5-24.5
25-32   1           24.5-32.5
33-40   4           32.5-40.5
Total   18

$P_p = L + (U - L) \times \frac{R - R_L}{R_U - R_L}$

where $L$ and $U$ are the lower and upper boundaries of the class containing the p-th percentile, $R$ is its rank,
and $R_L$ and $R_U$ are the lower and upper ranks of that class.

$P_{90} = 32.5 + (40.5 - 32.5) \times \frac{16.2 - 14}{18 - 14} = 32.5 + 8 \times \frac{2.2}{4} = 32.5 + 4.4 = 36.9$
Quartiles Case of frequency table
Q1 = first quartile, or 25th percentile

Q2 = second quartile, or 50th percentile (also the median)

Q3 = third quartile, or 75th percentile

25th percentile = 1st quartile

50th percentile = 2nd quartile = median

75th percentile = 3rd quartile

Exercise:
Compute the 25th, 50th, and 75th percentiles for the student marks

Marks   Frequency   Class boundaries
1-8     2           0.5-8.5
9-16    3           8.5-16.5
17-24   8           16.5-24.5
25-32   1           24.5-32.5
33-40   4           32.5-40.5
Total   18

25th percentile = 15.17 = P25 = Q1
50th percentile = 20.5 = P50 = Q2 = Median
75th percentile = 28.5 = P75 = Q3
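The percentile rule, which also covers the quartiles and the median, can be sketched as one Python function. The name `grouped_percentile` is illustrative, and half-unit class boundaries are assumed as in the tables above. The sketch recomputes the 20th and 90th percentiles and the three quartiles for the student marks table.

```python
from itertools import accumulate

def grouped_percentile(classes, frequencies, p):
    """p-th percentile of a frequency table by linear interpolation."""
    n = sum(frequencies)
    rank = p / 100 * n                            # rank (location) of the percentile
    upper_ranks = list(accumulate(frequencies))   # cumulative frequency at upper boundary
    lower_ranks = [0] + upper_ranks[:-1]          # cumulative frequency at lower boundary
    for (lo, hi), r_low, r_up in zip(classes, lower_ranks, upper_ranks):
        if rank <= r_up:
            lower_b, upper_b = lo - 0.5, hi + 0.5   # class boundaries
            return lower_b + (upper_b - lower_b) * (rank - r_low) / (r_up - r_low)
    raise ValueError("p must be between 0 and 100")

classes = [(1, 8), (9, 16), (17, 24), (25, 32), (33, 40)]
frequencies = [2, 3, 8, 1, 4]

for p in (20, 25, 50, 75, 90):
    print(p, round(grouped_percentile(classes, frequencies, p), 2))
```

Rounded to two decimals, the 20th percentile comes out as 12.77, since 8 × 1.6 / 3 = 4.267.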
Case of frequency table
The mean
The mean of values is the average: $\bar{x} = \frac{\sum x_i f_i}{n} = \frac{\sum x_i f_i}{\sum f_i}$
The mode
The mode is the midpoint of the class having the greatest frequency
The median
The median is the value in the middle of the data (50% of the data are less than it and 50% are greater than it)
when the data are arranged in ascending order (smallest value to largest value).
1) Rank of median = $\frac{\sum f_i}{2}$
2) Construct the cumulative frequency table (using all class boundaries)
3) Apply the proportion or the rule

p-th Percentile
* is the value such that approximately p% of the observations are less than it
* and approximately (100 – p)% of the observations are greater than it.
1) Rank of percentile P = $\frac{P}{100} \times \sum f_i$
2) Construct the cumulative frequency table (using all class boundaries)
3) Apply the proportion or the rule

Quartiles
Q1 = first quartile, or 25th percentile (Q1 = P25)
Q2 = second quartile, or 50th percentile, also the median (Q2 = P50 = Median)
Q3 = third quartile, or 75th percentile (Q3 = P75)
Chapter 3 / Section 3.2 Measures of variability
(Dispersion measure): Range, Interquartile range, Variance,
Standard Deviation, coefficient of Variation
Range
Range= Largest value - Smallest value
Interquartile Range (IQR):
IQR = Q3 – Q1
Variance:
Sample variance ($S^2$)
Standard Deviation:
Sample standard deviation $S = \sqrt{\text{Variance}}$

The mean absolute error (MAE)

Coefficient of variation

Sample Variance Formula
For ungrouped data: $S^2 = \frac{\sum (x - \bar{x})^2}{n - 1} = \frac{\sum x_i^2 - n\bar{x}^2}{n - 1}$
For grouped data [case of frequency table]: $S^2 = \frac{\sum (x_i - \bar{x})^2 f_i}{\sum f_i - 1} = \frac{\sum x_i^2 f_i - (\sum f_i)\bar{x}^2}{\sum f_i - 1}$

MAE (MD):
mean absolute error = Mean Deviation
For ungrouped data: $MAE(MD) = \frac{\sum |x - \bar{x}|}{n}$
For grouped data: $MAE(MD) = \frac{\sum |x_i - \bar{x}| f_i}{\sum f_i}$
Case of values:
Example:
For the sample values 46, 52, 42, 48, 32 (mean $\bar{x} = 44$), compute:

1) Range = 52 − 32 = 20
2) IQR = Q3 − Q1 = 50 − 37 = 13
3) Variance $S^2 = \frac{\sum (x - \bar{x})^2}{n - 1} = \frac{(46-44)^2 + (52-44)^2 + (42-44)^2 + (48-44)^2 + (32-44)^2}{5 - 1} = \frac{232}{4} = 58$
4) Standard deviation $S = \sqrt{58} = 7.62$
5) Coefficient of variation $CV = \frac{S}{\bar{x}} \times 100\% = \frac{7.62}{44} \times 100\% = 17.3\%$
6) Mean absolute error $MAE = \frac{\sum |x - \bar{x}|}{n} = \frac{|46-44| + |52-44| + |42-44| + |48-44| + |32-44|}{5} = \frac{28}{5} = 5.6$
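The six results of this example can be recomputed with a Python sketch. The quartile convention used here (medians of the lower and upper halves, with the middle value excluded when n is odd) is an assumption; it is chosen because it reproduces Q1 = 37 and Q3 = 50, and other conventions give different quartiles.

```python
import math
from statistics import median

values = [46, 52, 42, 48, 32]
n = len(values)
mean = sum(values) / n

value_range = max(values) - min(values)

# Assumed quartile convention: medians of the lower and upper halves.
ordered = sorted(values)
half = n // 2
q1 = median(ordered[:half])
q3 = median(ordered[-half:])
iqr = q3 - q1

# Sample variance by the definition and by the shortcut formula.
variance = sum((x - mean) ** 2 for x in values) / (n - 1)
variance_shortcut = (sum(x * x for x in values) - n * mean ** 2) / (n - 1)

std_dev = math.sqrt(variance)
cv_percent = std_dev / mean * 100
mae = sum(abs(x - mean) for x in values) / n

print(value_range, iqr, variance, round(std_dev, 2), round(cv_percent, 1), mae)
# 20 13.0 58.0 7.62 17.3 5.6
```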
Case of frequency table
Range (R):
R = Upper boundary of the last class − Lower boundary of the first class

Interquartile Range (IQR):
IQR = Q3 – Q1

Variance

Standard deviation

MAE

Example: For the sample of student marks, with $\bar{x} = \frac{\sum x_i f_i}{n} = 18.5$, compute:
1) Range = 40.5 − 0.5 = 40
2) IQR = Q3 − Q1 = 27.93 − 9.83 = 18.1
3) Variance $S^2 = \frac{\sum (x_i - \bar{x})^2 f_i}{\sum f_i - 1} = \frac{2032}{19} = 106.95$, or equivalently $S^2 = \frac{\sum x_i^2 f_i - (\sum f_i)\bar{x}^2}{\sum f_i - 1}$
4) Standard deviation $S = \sqrt{106.95} = 10.34$
5) MAE = $\frac{184}{20} = 9.2$
6) Coefficient of variation $CV = \frac{S}{\bar{x}} \times 100\% = \frac{10.34}{18.5} \times 100\% = 55.9\%$

Student   # of Students   Class Midpoint   xi fi   xi^2 fi   (xi − x̄)^2 fi   |xi − x̄| fi
Marks     (fi)            (xi)
1-8       4               4.5              18      81        784             56
9-16      6               12.5             75      937.5     216             36
17-24     2               20.5             41      840.5     8               4
25-32     7               28.5             199.5   5685.75   700             70
33-40     1               36.5             36.5    1332.25   324             18
Total     20                               370     8877      2032            184
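The column totals and the derived measures for the grouped example can be checked with a Python sketch. It follows the formula CV = S / x̄ × 100%, so the coefficient of variation is computed from the standard deviation S rather than from the MAE.

```python
import math

classes = [(1, 8), (9, 16), (17, 24), (25, 32), (33, 40)]
frequencies = [4, 6, 2, 7, 1]
midpoints = [(lo + hi) / 2 for lo, hi in classes]
n = sum(frequencies)

mean = sum(x * f for x, f in zip(midpoints, frequencies)) / n

# Column totals of the working table.
sum_x2f = sum(x * x * f for x, f in zip(midpoints, frequencies))
sum_sq_dev = sum((x - mean) ** 2 * f for x, f in zip(midpoints, frequencies))
sum_abs_dev = sum(abs(x - mean) * f for x, f in zip(midpoints, frequencies))

# Range uses the outer class boundaries (half a unit beyond the limits).
value_range = (classes[-1][1] + 0.5) - (classes[0][0] - 0.5)
variance = sum_sq_dev / (n - 1)
variance_shortcut = (sum_x2f - n * mean ** 2) / (n - 1)
std_dev = math.sqrt(variance)
mae = sum_abs_dev / n
cv_percent = std_dev / mean * 100   # uses S, not the MAE

print(value_range, round(variance, 2), round(std_dev, 2), mae, round(cv_percent, 1))
```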
z-Score: $z_i = \frac{x_i - \bar{x}}{S}$

Used to determine the relative location of any observation.

The z-score is often called the standardized value.

The z-score, $z_i$, can be interpreted as the number of standard
deviations $x_i$ is from the mean $\bar{x}$.

Example: $z_1 = 1.2$ would indicate that $x_1$ is 1.2 standard deviations
greater than the sample mean.

Similarly, $z_2 = -0.5$ would indicate that $x_2$ is 0.5 (half) of a standard
deviation less than the sample mean.

Example:
Compute the z-score of the following values:

mean = 44, Variance = 64, Std. Dev. = 8
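A minimal Python sketch of the z-score formula follows. The mean of 44 and standard deviation of 8 come from the example above; the observations 52, 40, and 44 are hypothetical values chosen only to illustrate the calculation.

```python
def z_score(x, mean, std_dev):
    """Number of standard deviations that x lies from the mean."""
    return (x - mean) / std_dev

mean, std_dev = 44, 8          # from the example (variance 64)
observations = [52, 40, 44]    # hypothetical values for illustration

for x in observations:
    print(x, z_score(x, mean, std_dev))
# 52 1.0
# 40 -0.5
# 44 0.0
```

A positive z-score marks a value above the mean, a negative one a value below it, and zero a value equal to the mean.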
