CHAPTER 2: Basic Summary Statistics
• Measures of Central Tendency (or Location)
  Mean – mode – median
• Measures of Dispersion (or Variation)
  Variance – standard deviation – coefficient of variation
2.1. Introduction:
For the population of interest, there is a population of values of
the variable of interest.
(i) A parameter is a measure (or number) obtained from
the population values X1, X2, …, XN
(parameters are unknown in general).
(ii) A statistic is a measure (or number) obtained from the sample
values x1, x2, …, xn
(statistics are known in general).
Let X1, X2, …, XN be the population values (in general, they are
unknown) of the variable of interest. The population size = N.
Let x1, x2, …, xn be the sample values (these values are known).
The sample size = n.
2.2. Measures of Central Tendency (Location):
• The values of a variable often tend to be concentrated around
the center of the data.
• Some measures of central tendency are: the mean, the median, and the mode.
• These measures are considered as representatives (or typical values)
of the data.
Mean:
(1) Population mean $\mu$:
If X1, X2, …, XN are the population values of the variable of
interest, then the population mean is:
$$\mu = \frac{X_1 + X_2 + \cdots + X_N}{N} = \frac{\sum_{i=1}^{N} X_i}{N} \quad \text{(unit)}$$
 The population mean is a parameter (it is usually unknown)

(2) Sample mean $\bar{x}$:
If x1, x2, …, xn are the sample values, then the sample mean is:
$$\bar{x} = \frac{x_1 + x_2 + \cdots + x_n}{n} = \frac{\sum_{i=1}^{n} x_i}{n} \quad \text{(unit)}$$
• The sample mean $\bar{x}$ is a statistic (it is known).
• The sample mean $\bar{x}$ is used to approximate (estimate) the population
mean $\mu$.
Example:
Consider the following population values:
X1 = 30, X2 = 22, X3 = 35, X4 = 27, X5 = 41.
Suppose that the sample values obtained are:
x1 = 30, x2 = 35, x3 = 27.
Then:
$$\mu = \frac{30 + 22 + 35 + 27 + 41}{5} = \frac{155}{5} = 31 \quad \text{(unit)}$$
$$\bar{x} = \frac{30 + 35 + 27}{3} = \frac{92}{3} \approx 30.67 \quad \text{(unit)}$$
Notes:
• The mean is simple to calculate.
• There is only one mean for a given set of sample data.
• The mean can be distorted by extreme values.
• The mean can only be found for quantitative variables.
Median:
The median of a finite set of numbers is the value that
divides the ordered set into two equal parts.
Let x1, x2, …, xn be the sample values. We have two cases:
(1) If the sample size, n, is odd:
• The median is the middle value of the ordered
observations.
• The middle observation is the ((n+1)/2)-th ordered observation.
• The median = the ((n+1)/2)-th ordered observation.

Ordered set (smallest to largest):   *   *   …   Middle value = MEDIAN   …   *
Rank (or order):                     1   2   …   (n+1)/2                 …   n
Example:
Find the median for the sample values: 10, 54, 21, 38, 53.
Solution:
n = 5 (odd number)
The rank of the middle value (median) = (n+1)/2 = (5+1)/2 = 3

Ordered set:       10   21   38   53   54
Rank (or order):    1    2    3    4    5

The median = 38 (unit)
(2) If the sample size, n, is even:
• The median is the mean (average) of the two middle values of the
ordered observations.
• The middle two values are the (n/2)-th and ((n/2)+1)-th ordered observations.
• The median = [ (n/2)-th ordered observation + ((n/2)+1)-th ordered observation ] / 2
Example:
Find the median for the sample values: 10, 35, 41, 16, 20, 32.
Solution:
n = 6 (even number)
The ranks of the middle values are:
n/2 = 6/2 = 3
(n/2) + 1 = (6/2) + 1 = 4

Ordered set:       10   16   20   32   35   41
Rank (or order):    1    2    3    4    5    6

The median = (20 + 32)/2 = 52/2 = 26 (unit)
Note:
• The median is simple to calculate.
• There is only one median for given data.
• The median is not affected too much by extreme values.
• The median can only be found for quantitative variables.
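As an illustrative sketch (not part of the original lecture), the mean and the two median rules above can be written in Python as follows. The function names `mean` and `median` and the extreme value 1000 are assumptions made for the illustration; the two samples are the ones used in the median examples.

```python
from typing import Sequence


def mean(values: Sequence[float]) -> float:
    """Arithmetic mean: the sum of the values divided by their count."""
    if not values:
        raise ValueError("mean of an empty data set is undefined")
    return sum(values) / len(values)


def median(values: Sequence[float]) -> float:
    """Median by the rank rule for odd and even sample sizes."""
    if not values:
        raise ValueError("median of an empty data set is undefined")
    ordered = sorted(values)  # ordered set, smallest to largest
    n = len(ordered)
    if n % 2 == 1:
        # Rank (n + 1) / 2, shifted by one because Python indexes from 0.
        return ordered[(n + 1) // 2 - 1]
    # Ranks n / 2 and (n / 2) + 1, again shifted to 0-based indexes.
    return (ordered[n // 2 - 1] + ordered[n // 2]) / 2


odd_sample = [10, 54, 21, 38, 53]
even_sample = [10, 35, 41, 16, 20, 32]
print(median(odd_sample))   # 38
print(median(even_sample))  # 26.0

# Hypothetical extreme value added to the odd sample (not from the lecture).
with_extreme = odd_sample + [1000]
print(mean(odd_sample), median(odd_sample))
print(mean(with_extreme), median(with_extreme))
```

In this made-up case the single extreme value pulls the mean far away from most of the data, while the median moves only to the average of the two new middle values.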
Mode:
The mode of a set of values is that value which occurs with the
highest frequency.
If all values are different or have the same frequency, there is no
mode.
 A set of data may have more than one mode.
Note:
• The mode is simple to calculate, but it is generally not a good measure of central tendency.
• The mode is not affected too much by extreme values.
• The mode may be found for both quantitative and qualitative
variables.
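A minimal Python sketch of the mode rules above is given here for illustration. The function name `modes` and all of the data sets are hypothetical (they are not taken from the lecture). The sketch also assumes that a data set containing a single distinct value reports that value as its mode, a case the rules above do not address.

```python
from collections import Counter
from typing import Hashable, List, Sequence


def modes(values: Sequence[Hashable]) -> List[Hashable]:
    """Return every mode of the data; an empty list means 'no mode'."""
    counts = Counter(values)
    if not counts:
        return []
    highest = max(counts.values())
    # All values different, or all equally frequent: no mode.
    # Assumption: a data set with one distinct value keeps that value as its mode.
    if len(counts) > 1 and all(c == highest for c in counts.values()):
        return []
    return [value for value, c in counts.items() if c == highest]


# Hypothetical data sets, chosen only to exercise each rule.
print(modes([26, 25, 25, 34]))            # [25]
print(modes([3, 7, 12, 6, 19]))           # []
print(modes([3, 3, 12, 6, 8, 8, 13]))     # [3, 8]
print(modes(["B", "A", "B", "O", "AB"]))  # ['B']
```

The last call uses qualitative labels, which works because the mode only needs counts, not arithmetic.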
2.3. Measures of Dispersion (Variation):
The variation or dispersion in a set of values refers to how
spread out the values are from each other.
• The variation is small when the values are close together.
• There is no variation if the values are the same.
Some measures of dispersion:
Range – Variance – Standard deviation – Coefficient of variation
Range:
The range is the difference between the largest (Max) and smallest
(Min) values.
Range = Max - Min
[Figure: two sets of values, one with smaller variation and one with larger variation]
Example:
Find the range for the sample values: 26, 25, 35, 27, 29, 29.
Solution:
Range = 35 - 25 = 10 (unit)
Note:
The range is a poor measure of variation because it only
takes into account two of the values (the largest and the smallest).
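The range calculation can be sketched in Python as below. The function name `value_range` and the two extra data sets are assumptions added for illustration; they show why using only two values can hide differences in spread.

```python
from typing import Sequence


def value_range(values: Sequence[float]) -> float:
    """Range = largest value minus smallest value."""
    if not values:
        raise ValueError("range of an empty data set is undefined")
    return max(values) - min(values)


sample = [26, 25, 35, 27, 29, 29]
print(value_range(sample))  # 10

# Hypothetical data sets: same range, but the values are spread differently.
clustered = [25, 30, 30, 30, 35]
spread_out = [25, 25, 30, 35, 35]
print(value_range(clustered), value_range(spread_out))  # 10 10
```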
Variance:
The variance is a measure of variation that uses the mean as a point of
reference.
The variance is small when all values are close to the mean. The
variance is large when the values are spread out from the mean.
[Figure: deviations of the values from the mean]
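(1) Population variance:
Let X1, X2, …, XN be the population values.
The population variance is defined by
$$\sigma^2 = \frac{\sum_{i=1}^{N} (X_i - \mu)^2}{N} = \frac{(X_1 - \mu)^2 + (X_2 - \mu)^2 + \cdots + (X_N - \mu)^2}{N} \quad \text{(unit)}^2$$
where $\mu = \frac{\sum_{i=1}^{N} X_i}{N}$ is the population mean.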
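(2) Sample variance:
Let x1, x2, …, xn be the sample values.
The sample variance is defined by:
$$S^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n-1} = \frac{(x_1 - \bar{x})^2 + (x_2 - \bar{x})^2 + \cdots + (x_n - \bar{x})^2}{n-1} \quad \text{(unit)}^2$$
where $\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n}$ is the sample mean.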
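Example:
We want to compute the sample variance of the following sample
values: 10, 21, 33, 53, 54.
Solution: n = 5
$$\bar{x} = \frac{10 + 21 + 33 + 53 + 54}{5} = \frac{171}{5} = 34.2$$
$$S^2 = \frac{(10-34.2)^2 + (21-34.2)^2 + (33-34.2)^2 + (53-34.2)^2 + (54-34.2)^2}{5-1} = \frac{1506.8}{4} = 376.7$$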
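Another method:
Calculating formula for $S^2$:
$$S^2 = \frac{\sum_{i=1}^{n} x_i^2 - n\bar{x}^2}{n-1}$$
Note:
To calculate $S^2$ we need:
• n = sample size
• The sum of the values, $\sum x_i$
• The sum of the squared values, $\sum x_i^2$
For the sample values 10, 21, 33, 53, 54 we have $\sum x_i = 171$, $\bar{x} = 34.2$, and $\sum x_i^2 = 7355$, so:
$$S^2 = \frac{7355 - 5(34.2)^2}{5-1} = \frac{1506.8}{4} = 376.7$$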
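Both ways of computing the sample variance, together with the population variance, can be sketched in Python as follows. The function names are assumptions made for this illustration. The sample is the one from the variance example, and the population values are the five values from the earlier mean example (the lecture does not compute their variance).

```python
from typing import Sequence


def sample_variance(values: Sequence[float]) -> float:
    """Definition: squared deviations from the sample mean, divided by n - 1."""
    n = len(values)
    if n < 2:
        raise ValueError("sample variance needs at least two values")
    x_bar = sum(values) / n
    return sum((x - x_bar) ** 2 for x in values) / (n - 1)


def sample_variance_shortcut(values: Sequence[float]) -> float:
    """Calculating formula: (sum of squares - n * mean^2) / (n - 1)."""
    n = len(values)
    if n < 2:
        raise ValueError("sample variance needs at least two values")
    x_bar = sum(values) / n
    sum_of_squares = sum(x * x for x in values)
    return (sum_of_squares - n * x_bar ** 2) / (n - 1)


def population_variance(values: Sequence[float]) -> float:
    """Squared deviations from the population mean, divided by N."""
    big_n = len(values)
    if big_n == 0:
        raise ValueError("population variance of no values is undefined")
    mu = sum(values) / big_n
    return sum((x - mu) ** 2 for x in values) / big_n


sample = [10, 21, 33, 53, 54]
print(round(sample_variance(sample), 1))           # 376.7
print(round(sample_variance_shortcut(sample), 1))  # 376.7

population = [30, 22, 35, 27, 41]
print(round(population_variance(population), 1))   # 42.8
```

The two sample formulas agree up to floating-point rounding. The shortcut subtracts two large numbers, so it can lose precision when the values are large relative to their spread.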
Standard Deviation:
• The standard deviation is another measure of variation.
• It is the square root of the variance.
(1) Population standard deviation is: $\sigma = \sqrt{\sigma^2}$ (unit)
(2) Sample standard deviation is: $S = \sqrt{S^2}$ (unit)
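For readers using Python, the standard library's `statistics` module provides both versions. The sketch below assumes that `statistics.variance` and `statistics.stdev` (divisor n - 1) match the sample definitions above and that `statistics.pstdev` (divisor N) matches the population definition. The data are the sample from the variance example and the population from the mean example.

```python
import math
import statistics

sample = [10, 21, 33, 53, 54]
s_squared = statistics.variance(sample)  # sample variance, divisor n - 1
s = statistics.stdev(sample)             # sample standard deviation
print(round(s_squared, 1))  # 376.7
print(round(s, 2))          # 19.41

# The standard deviation is the square root of the variance.
print(math.isclose(s, math.sqrt(s_squared)))  # True

population = [30, 22, 35, 27, 41]
sigma = statistics.pstdev(population)    # population standard deviation, divisor N
print(round(sigma, 2))      # 6.54
```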
Coefficient of Variation (C.V.):
• The variance and the standard deviation are useful as
measures of variation of the values of a single variable for a
single population (or sample).
• If we want to compare the variation of two variables we
cannot use the variance or the standard deviation because:
1. The variables might have different units.
2. The variables might have different means.
• We need a measure of the relative variation that
does not depend on either the units or on how large the
values are. This measure is the coefficient of variation
(C.V.), which is defined by:
$$C.V. = \frac{S}{\bar{x}} \times 100\% \quad \text{(free of unit, or unitless)}$$

               Mean          St.dev.   C.V.
1st data set   $\bar{x}_1$   $S_1$     $C.V_1 = \frac{S_1}{\bar{x}_1} \times 100\%$
2nd data set   $\bar{x}_2$   $S_2$     $C.V_2 = \frac{S_2}{\bar{x}_2} \times 100\%$

• The relative variability in the 1st data set is larger than
the relative variability in the 2nd data set if $C.V_1 > C.V_2$ (and vice
versa).
Example:
1st data set: $\bar{x}_1 = 66$ kg, $S_1 = 4.5$ kg
$$C.V_1 = \frac{4.5}{66} \times 100\% = 6.8\%$$
2nd data set: $\bar{x}_2 = 36$ kg, $S_2 = 4.5$ kg
$$C.V_2 = \frac{4.5}{36} \times 100\% = 12.5\%$$
Since $C.V_1 < C.V_2$, the relative variability in the 2nd data set is
larger than the relative variability in the 1st data set.
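The comparison in this example can be reproduced with a short Python sketch. The function name `coefficient_of_variation` is an assumption made for illustration, as is the choice to reject a zero mean, for which the ratio is undefined.

```python
def coefficient_of_variation(mean: float, std_dev: float) -> float:
    """C.V. = (S / mean) * 100, returned as a percentage."""
    if mean == 0:
        raise ValueError("C.V. is undefined when the mean is zero")
    return std_dev / mean * 100.0


cv1 = coefficient_of_variation(mean=66, std_dev=4.5)  # 1st data set (kg)
cv2 = coefficient_of_variation(mean=36, std_dev=4.5)  # 2nd data set (kg)
print(f"C.V1 = {cv1:.1f}%")  # C.V1 = 6.8%
print(f"C.V2 = {cv2:.1f}%")  # C.V2 = 12.5%

# Equal standard deviations, yet the 2nd data set varies more relative to its mean.
print(cv2 > cv1)  # True
```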
Notes: (Some properties of $\bar{x}$, $S$, and $S^2$)
Sample values are: x1, x2, …, xn
a and b are constants

Sample data               Sample mean      Sample st.dev.   Sample variance
x1, x2, …, xn             $\bar{x}$        $S$              $S^2$
ax1, ax2, …, axn          $a\bar{x}$       $|a|S$           $a^2 S^2$
x1 + b, …, xn + b         $\bar{x} + b$    $S$              $S^2$
ax1 + b, …, axn + b       $a\bar{x} + b$   $|a|S$           $a^2 S^2$

Absolute value:
$$|a| = \begin{cases} a & \text{if } a \ge 0 \\ -a & \text{if } a < 0 \end{cases}$$
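A short derivation of the last row of the table may help explain where $|a|$ and $a^2$ come from. The notation $y_i$, $\bar{y}$, and $S_y$ for the transformed values is an assumption introduced here; it does not appear in the lecture.

```latex
\begin{align*}
y_i &= a x_i + b, \qquad i = 1, \dots, n \\
\bar{y} &= \frac{1}{n} \sum_{i=1}^{n} (a x_i + b) = a\bar{x} + b \\
y_i - \bar{y} &= a (x_i - \bar{x}) \\
S_y^2 &= \frac{\sum_{i=1}^{n} (y_i - \bar{y})^2}{n-1}
       = a^2 \, \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n-1}
       = a^2 S^2 \\
S_y &= \sqrt{a^2 S^2} = |a| \, S
\end{align*}
```

Setting $b = 0$ gives the second row of the table, and setting $a = 1$ gives the third row: adding a constant shifts the mean but cancels out of every deviation.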
Example:
       Sample         Sample mean   Sample st.dev.   Sample variance
       1, 3, 5        3             2                4
(1)    -2, -6, -10    -6            4                16
(2)    11, 13, 15     13            2                4
(3)    8, 4, 0        4             4                16

Data (1): -2x1, -2x2, -2x3 (a = -2)
Data (2): x1 + 10, x2 + 10, x3 + 10 (b = 10)
Data (3): -2x1 + 10, -2x2 + 10, -2x3 + 10 (a = -2, b = 10)
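The table can be checked numerically with a small Python sketch. The helper name `summarize` is hypothetical, and the sketch relies on the standard library's `statistics.stdev` and `statistics.variance`, which use the n - 1 divisor. The values a = -2 and b = 10 are the ones implied by the three transformed data sets.

```python
import statistics
from typing import List, Tuple


def summarize(values: List[float]) -> Tuple[float, float, float]:
    """Return (sample mean, sample st.dev., sample variance)."""
    return (
        statistics.mean(values),
        statistics.stdev(values),
        statistics.variance(values),
    )


x = [1, 3, 5]
a, b = -2, 10

data1 = [a * xi for xi in x]      # a*x:     -2, -6, -10
data2 = [xi + b for xi in x]      # x + b:   11, 13, 15
data3 = [a * xi + b for xi in x]  # a*x + b:  8,  4,  0

for label, data in [("x", x), ("(1)", data1), ("(2)", data2), ("(3)", data3)]:
    mean, st_dev, variance = summarize(data)
    print(label, data, mean, st_dev, variance)
```

Multiplying by a = -2 flips the sign of the mean but not of the standard deviation, which is scaled by |a| = 2.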

INTRODUCTION TO STATISTICAL VARIANCE LECTURE.ppt

  • 26. Example: Sample Sample mean Sample St..dev. Sample Variance 1,3,5 3 2 4 (1) (2) (3) -2, -6, -10 11, 13, 15 8, 4, 0 -6 13 4 4 2 4 16 4 16 Data (1) (a = 2) (2) (b = 10) (3) (a = 2, b = 10) 3 2 1 2 , 2 , 2 x x x    10 , 10 , 10 3 2 1    x x x 10 2 , 10 2 , 10 2 3 2 1       x x x