EDA_W3_Obtaining-Data

The document outlines various statistical measures for data analysis, including measures of central tendency (mean, median, mode), measures of variation (range, variance, standard deviation), and measures of shape (skewness, kurtosis). It provides definitions, computational procedures, and examples for each measure, emphasizing their importance in understanding data distribution and variability. Additionally, it includes exercises for calculating these measures using grouped data.

Uploaded by

Grizelle Mae

ENGINEERING DATA ANALYSIS

Obtaining Data

Editha A. Macorol
Measures of Describing Data

• Measure of Central Tendency
- Also known as Measure of Central Location
- Describes the center of the dataset using the mean, median, or mode

• Measure of Position
- Locates the kth element of the distribution (for example, a quartile, decile, or percentile)

• Measure of Variation
- Describes how the data are distributed about the mean

• Measure of Shape
- Describes the degree of symmetry of a distribution
The Mean
Weighted Mean
Example:
1. The Carter Construction Company pays its hourly
employees $16.50, $19.00, or $25.00 per hour. There
are 26 hourly employees: 14 are paid at the
$16.50 rate, 10 at the $19.00 rate, and 2 at the $25.00
rate. What is the mean hourly rate paid to the 26
employees?
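The Carter Construction example can be checked with a short Python sketch. It assumes the usual weighted-mean definition, sum(w·x) / sum(w), with the number of employees at each rate as the weight; the function name `weighted_mean` is illustrative and not part of the original slides.

```python
# Hourly rates and the number of employees paid at each rate (from the example).
rates = [16.50, 19.00, 25.00]
counts = [14, 10, 2]

def weighted_mean(values, weights):
    """Return sum(w * x) / sum(w) for paired values and weights."""
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("weights must not sum to zero")
    return sum(w * x for x, w in zip(values, weights)) / total_weight

mean_rate = weighted_mean(rates, counts)   # (231 + 190 + 50) / 26 = 471 / 26
print(f"Mean hourly rate: ${mean_rate:.2f}")
```

Because most employees are paid at the lowest rate, the weighted mean (about $18.12) sits well below the unweighted average of the three rates.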
The Median
Characteristics
• There is a unique median for each data set.
• It is not affected by extremely large or small values and
is therefore a valuable measure of central tendency
when such values occur.
• It can be computed for ratio-level, interval-level, and
ordinal-level data.
• It can be computed for an open-ended frequency
distribution if the median does not lie in an open-
ended class.
The Median
• The midpoint of the values after they have been
ordered from the smallest to largest
• There are as many values above the median as below it
in the data array.
• For an even set of values, the median will be the
arithmetic average of the two middle numbers.
Median: Computational Procedure
First Procedure
Arrange the observations in an ordered array.
If there is an odd number of terms, the median is the middle term of
the ordered array.
If there is an even number of terms, the median is the average of the
middle two terms.
Second Procedure
The median’s position in an ordered array is given by (n+1)/2.
The Median
Example:

Ordered Array
3 4 5 7 8 9 11 14 15 16 16 17 19 19 20 21 22

There are 17 terms in the ordered array.

Position of median = (n+1)/2 = (17+1)/2 = 9
The median is the 9th term, 15.
If the 22 is replaced by 100, the median is 15.
If the 3 is replaced by -103, the median is 15.
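The procedure and the example above can be sketched in Python as follows. This is a minimal illustration of the "sort, then take the middle term (or average the two middle terms)" rule; the helper name `median` is chosen here for illustration.

```python
def median(values):
    """Middle term of the ordered array, or the average of the two middle terms."""
    ordered = sorted(values)
    n = len(ordered)
    if n == 0:
        raise ValueError("median of an empty data set is undefined")
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]                       # position (n + 1) / 2, 1-based
    return (ordered[mid - 1] + ordered[mid]) / 2  # average of the two middle terms

data = [3, 4, 5, 7, 8, 9, 11, 14, 15, 16, 16, 17, 19, 19, 20, 21, 22]

# Replace the largest value by 100 and the smallest by -103, as in the example.
with_large_outlier = data[:-1] + [100]
with_small_outlier = [-103] + data[1:]

print(median(data), median(with_large_outlier), median(with_small_outlier))
```

All three calls return 15, which illustrates why the median is described as unaffected by extremely large or small values.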
The Mode
• The value of the observation that appears most
frequently
The Mode
Class Interval Frequency
25-29 1
30-34 1
35-39 5
40-44 8
45-49 15
50-54 4
55-59 4
60-64 3
65-69 4
70-74 3
75-79 2
Mean of Grouped Data

$\bar{x} = \dfrac{\sum f x}{n}$

• $n$ – sample size
• $x$ – class mark
• $f$ – frequency
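A minimal Python sketch of the grouped-data mean, using the frequency table above. It assumes the class mark is the midpoint of the class limits; the names `classes` and `grouped_mean` are illustrative.

```python
# (lower limit, upper limit, frequency) for each class in the frequency table.
classes = [
    (25, 29, 1), (30, 34, 1), (35, 39, 5), (40, 44, 8), (45, 49, 15),
    (50, 54, 4), (55, 59, 4), (60, 64, 3), (65, 69, 4), (70, 74, 3),
    (75, 79, 2),
]

def grouped_mean(classes):
    """Approximate the mean as sum(f * x) / n, where x is the class mark."""
    n = sum(f for _, _, f in classes)
    total = sum(f * (lower + upper) / 2 for lower, upper, f in classes)
    return total / n

n = sum(f for _, _, f in classes)
mean = grouped_mean(classes)   # 2545 / 50
print(f"n = {n}, mean = {mean:.2f}")
```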
Median of Grouped Data

$mdn = x_{lb} + \left( \dfrac{\frac{n}{2} - cf}{f_m} \right) i$

• $x_{lb}$ - lower boundary of the median class
• $cf$ - cumulative frequency for the class interval preceding the median class
• $f_m$ - frequency in the median class
• $i$ – class width or the interval size
• $n$ – sample size
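The grouped-data median can be sketched in Python as below. The sketch assumes integer class limits that are one unit apart, so the lower boundary is the lower limit minus 0.5 and the class width is `upper - lower + 1`; the function name `grouped_median` is illustrative.

```python
# (lower limit, upper limit, frequency) for each class in the frequency table.
classes = [
    (25, 29, 1), (30, 34, 1), (35, 39, 5), (40, 44, 8), (45, 49, 15),
    (50, 54, 4), (55, 59, 4), (60, 64, 3), (65, 69, 4), (70, 74, 3),
    (75, 79, 2),
]

def grouped_median(classes):
    """Lower boundary + ((n/2 - cf) / f_m) * i for the median class."""
    n = sum(f for _, _, f in classes)
    half = n / 2
    cf = 0  # cumulative frequency of the classes before the current one
    for lower, upper, f in classes:
        if f > 0 and cf + f >= half:      # first class whose cumulative frequency reaches n/2
            x_lb = lower - 0.5            # assumed boundary convention
            i = upper - lower + 1         # class width
            return x_lb + (half - cf) / f * i
        cf += f
    raise ValueError("frequency table is empty")

mdn = grouped_median(classes)   # 44.5 + ((25 - 15) / 15) * 5
print(f"median = {mdn:.2f}")
```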
Mode of Grouped Data

$mo = x_{lb} + \left( \dfrac{d_1}{d_1 + d_2} \right) i$

• $x_{lb}$ - lower boundary of the modal class
• $d_1$ - difference between the frequency in the modal class and the frequency in the preceding class interval
• $d_2$ - difference between the frequency in the modal class and the frequency in the succeeding class interval
• $i$ – class width or the interval size
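A short Python sketch of the grouped-data mode follows. The variable names `d1` and `d2` stand for the two frequency differences described above and are naming assumptions; the sketch also assumes a single modal class, integer class limits, and a lower boundary of lower limit minus 0.5.

```python
# (lower limit, upper limit, frequency) for each class in the frequency table.
classes = [
    (25, 29, 1), (30, 34, 1), (35, 39, 5), (40, 44, 8), (45, 49, 15),
    (50, 54, 4), (55, 59, 4), (60, 64, 3), (65, 69, 4), (70, 74, 3),
    (75, 79, 2),
]

def grouped_mode(classes):
    """Lower boundary + (d1 / (d1 + d2)) * i for the modal class."""
    freqs = [f for _, _, f in classes]
    m = freqs.index(max(freqs))                  # first class with the highest frequency
    lower, upper, f_m = classes[m]
    f_prev = freqs[m - 1] if m > 0 else 0        # no preceding class: treat as 0
    f_next = freqs[m + 1] if m + 1 < len(freqs) else 0
    d1 = f_m - f_prev                            # positive because m is the first maximum
    d2 = f_m - f_next
    x_lb = lower - 0.5                           # assumed boundary convention
    i = upper - lower + 1                        # class width
    return x_lb + d1 / (d1 + d2) * i

mo = grouped_mode(classes)   # 44.5 + (7 / (7 + 11)) * 5
print(f"mode = {mo:.2f}")
```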

Measures of Location
Quartiles
• Divide the ordered dataset into 4 equal groups.

Deciles
• Divide the ordered dataset into 10 equal groups.

Percentiles
• Divide the ordered dataset into 100 equal groups.
Measures of Location
• Quartile – One fourth
First (1/4), Second (1/2), Third (3/4)
Quartile locator (Lq):
• Decile – One tenth
10%, 20%, …, 90%
Decile locator (Ld):
• Percentile − One hundredth
1%, 2%, …, 99%
Measures of Variation (Dispersion)
Why study dispersion?
• A measure of location, such as the mean or the median, does not
tell us anything about the spread of the data.
• For example, if your nature guide told you that the river averaged
3 feet in depth, would you want to wade across on foot without
additional information? Probably not. You would want to know
something about the variation in depth.
• A second reason is to compare the spread in two or
more distributions.
• Measures of dispersion describe the average distance of each
observation from the center of the distribution.
• They measure the homogeneity or heterogeneity of a
particular group.
Measures of Variation
• Range
- The difference between the largest and smallest numbers in
the set
• Interquartile Range
- Range of values between the first and third quartiles
- Range of the “middle half”
• Mean Deviation
- The average of the unsigned (absolute) deviations from the mean
• Variance
- The average of the squared deviations from the mean
• Standard Deviation (SD)
- The population/sample standard deviation is the
positive square root of the population/sample variance
• Coefficient of Variation (CV)
- The ratio of the standard deviation to the mean, expressed
as a percentage
Range
$R = H - L$, where $H$ is the highest value and $L$ is the lowest value.
Consider the following data.
Grades in Statistics
Males          Females
Jon    100     Ann    84
Ron    65      Ria    86
Dan    75      Let    85
Tom    85      Bel    82
Bob    95      Nel    83
Range  35      Range  4
Range
Conclusion: Grades of males are more scattered while
grades of females are more compressed. Females are
more homogeneous in their statistics grades.

Disadvantages of the range:

1. Unstable for a very large class
2. Unreliable since only two values are taken into
account
3. Ranges of two data sets with unequal numbers of
scores are not directly comparable
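The two ranges in the grades example can be reproduced with a few lines of Python. The helper name `value_range` is illustrative.

```python
# Grades in Statistics from the example.
males = [100, 65, 75, 85, 95]
females = [84, 86, 85, 82, 83]

def value_range(values):
    """Range R = H - L: highest value minus lowest value."""
    return max(values) - min(values)

range_males = value_range(males)      # 100 - 65
range_females = value_range(females)  # 86 - 82
print(range_males, range_females)
```

Only the two extreme values enter the calculation, which is the second disadvantage listed above.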
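Variance and Standard Deviation
• Sample variance ($s^2$)

$s^2 = \dfrac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}$

• Sample standard deviation ($s$)

- Positive square root of $s^2$

The quantity $n - 1$ is often called the degrees of freedom associated with
the variance estimate.
• Mean Deviation
- The average of the absolute deviations $|x_i - \bar{x}|$ from the mean
Variance and Standard Deviation
• Population variance ($\sigma^2$)

$\sigma^2 = \dfrac{\sum_{i=1}^{N} (x_i - \mu)^2}{N}$

• Population standard deviation ($\sigma$)

- Positive square root of $\sigma^2$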
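Variance
Determine the variance in the previous example, treating
the data first as a population and then as a sample.
Grades in Statistics
Males          Females
Jon    100     Ann    84
Ron    65      Ria    86
Dan    75      Let    85
Tom    85      Bel    82
Bob    95      Nel    83
Mean   84      Mean   84
Variance
Males
$\sum (x - 84)^2 = 256 + 361 + 81 + 1 + 121 = 820$
Population: $\sigma^2 = 820 / 5 = 164$
Sample: $s^2 = 820 / 4 = 205$
Variance
Females
$\sum (x - 84)^2 = 0 + 4 + 1 + 4 + 1 = 10$
Population: $\sigma^2 = 10 / 5 = 2$
Sample: $s^2 = 10 / 4 = 2.5$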
Variance
Conclusion: Males showed more variability. The higher
the variance, the more variable or far apart the values are
from each other.

Remark: Since the variance is in squared units, it is not
directly interpretable in the units of the data being measured.
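Standard Deviation
Males
Population: $\sigma = \sqrt{164} \approx 12.81$
Sample: $s = \sqrt{205} \approx 14.32$

Females
Population: $\sigma = \sqrt{2} \approx 1.41$
Sample: $s = \sqrt{2.5} \approx 1.58$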
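The variance and standard deviation of the two grade lists can be checked with Python's standard `statistics` module, which provides both the population versions (`pvariance`, `pstdev`, dividing by N) and the sample versions (`variance`, `stdev`, dividing by n - 1). This is a verification sketch rather than part of the original slides.

```python
import statistics

# Grades in Statistics from the example; both groups have a mean of 84.
males = [100, 65, 75, 85, 95]
females = [84, 86, 85, 82, 83]

results = {}
for label, grades in (("males", males), ("females", females)):
    results[label] = {
        "pop_var": statistics.pvariance(grades),   # divides by N
        "samp_var": statistics.variance(grades),   # divides by n - 1
        "pop_sd": statistics.pstdev(grades),
        "samp_sd": statistics.stdev(grades),
    }
    print(label, {k: round(v, 2) for k, v in results[label].items()})
```

The males' values are far larger than the females' under either convention, which matches the conclusion that the males' grades show more variability.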
Measures of Variation
Example:

Consider the following test scores:

Test            1   2   3   4   5   6   7   8   9   10
First student   12  6   13  2   5   0   9   6   10  7
Second student  8   10  9   12  5   1   4   7   9   3
a. Who performed better?
b. Who is more consistent?
Measures of Variation
a. Compute the average score of each student.

First student: $\bar{x} = 70 / 10 = 7.0$
Second student: $\bar{x} = 68 / 10 = 6.8$

The first student performed better because of the higher
computed average.
Measures of Variation
b. Compute the sample standard deviations.

First student: $s = \sqrt{154 / 9} \approx 4.14$
Second student: $s = \sqrt{107.6 / 9} \approx 3.46$

The second student is more consistent because of the lower standard
deviation.
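The two questions in the test-score example can be answered with a short Python check. The labels `first` and `second` simply refer to the two rows of the score table in the order they appear; `statistics.stdev` computes the sample standard deviation (divisor n - 1).

```python
import statistics

# Scores on tests 1 to 10, one list per row of the table.
first = [12, 6, 13, 2, 5, 0, 9, 6, 10, 7]
second = [8, 10, 9, 12, 5, 1, 4, 7, 9, 3]

mean_first, mean_second = statistics.mean(first), statistics.mean(second)
sd_first, sd_second = statistics.stdev(first), statistics.stdev(second)

better = "first" if mean_first > mean_second else "second"
more_consistent = "first" if sd_first < sd_second else "second"

print(f"means: {mean_first:.1f} vs {mean_second:.1f} -> {better} student performed better")
print(f"SDs:   {sd_first:.2f} vs {sd_second:.2f} -> {more_consistent} student is more consistent")
```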
Measures of Variation
Remark: Standard deviation and variance are both reliable
but cannot be used in comparing two sets of data of
different units.

Example: Comparing the consistency of a player in making assists versus scoring points

Measures of Variation

• Interquartile Range (IR)

$IR = Q_3 - Q_1$

• Quartile Deviation (QD)

$QD = \dfrac{Q_3 - Q_1}{2}$

• Coefficient of Variation

$CV = \dfrac{s}{\bar{x}} \, (100\%)$
Coefficient of Variation
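As an illustration, the coefficient of variation can be computed for the two grade lists used earlier in this deck. The sketch assumes the sample standard deviation is used in CV = (s / mean) × 100%; the function name is illustrative.

```python
import statistics

def coefficient_of_variation(values):
    """CV = (sample standard deviation / mean) * 100, in percent."""
    mean = statistics.mean(values)
    if mean == 0:
        raise ValueError("CV is undefined when the mean is zero")
    return statistics.stdev(values) / mean * 100

males = [100, 65, 75, 85, 95]
females = [84, 86, 85, 82, 83]

cv_males = coefficient_of_variation(males)
cv_females = coefficient_of_variation(females)
print(f"CV males = {cv_males:.2f}%, CV females = {cv_females:.2f}%")
```

Because the CV is a unit-free percentage, it can also be used to compare data sets measured in different units, which the standard deviation alone cannot do.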
Measures of Shape
• Skewness
- Degree of asymmetry of a distribution about its mean. It
is a measure of how far the data depart from being
symmetrical
- Can be interpreted as symmetric, positively skewed, or
negatively skewed

• Kurtosis
- The degree of peakedness exhibited by the distribution
- Computed from the fourth moment about the
mean
Skewness
Pearsonian Coefficient of Skewness (Pearson’s Coefficient
of Skewness)

$Sk = \dfrac{3(\bar{x} - mdn)}{s}$

Interpretation of values:
1. Sk < 0, “negatively skewed” or “skewed to the left”
2. Sk = 0, symmetrical
3. Sk > 0, “positively skewed” or “skewed to the right”
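Pearson's coefficient, Sk = 3(mean − median) / s, and its interpretation can be sketched in Python. The input values below are the mean, median, and standard deviation from the grouped-data exercise later in this deck; the function names are illustrative.

```python
def pearson_skewness(mean, median, sd):
    """Pearson's coefficient of skewness: 3 * (mean - median) / sd."""
    if sd <= 0:
        raise ValueError("standard deviation must be positive")
    return 3 * (mean - median) / sd

def interpret_skewness(sk, tol=1e-12):
    """Map the sign of Sk to the interpretation listed above."""
    if sk > tol:
        return "positively skewed (skewed to the right)"
    if sk < -tol:
        return "negatively skewed (skewed to the left)"
    return "symmetrical"

sk = pearson_skewness(mean=50.9, median=47.83, sd=11.96)
print(f"Sk = {sk:.2f}: {interpret_skewness(sk)}")
```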
Skewness
• A measure of the asymmetry of the frequency distribution

a. Positive skewness: mode < median < mean

b. Symmetrical: mode = median = mean
c. Negative skewness: mode > median > mean
Skewness
Other formulas

Interpretation of values from formulas above:

1. Sk < 0, “negatively skewed” or “skewed to the left”
2. Sk = 0, symmetrical
3. Sk > 0, “positively skewed” or “skewed to the right”
Kurtosis
• A measure of the degree to which a uni-modal
distribution is peaked
• The state or quality of flatness or peakedness of the
curve describing a frequency distribution about its
mode

[Figure: leptokurtic, mesokurtic, and platykurtic curves]
Kurtosis
Moment Based Coefficient of Kurtosis

Interpretation of values from formulas above:
1. K < 3, “platykurtic”
2. K = 3, “mesokurtic”
3. K > 3, “leptokurtic”
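The slide's own kurtosis formula is not preserved in this copy, so the sketch below assumes one common moment-based definition, K = m4 / m2², where m2 and m4 are the second and fourth moments about the mean computed with divisor n. The small data set is hypothetical and only illustrates the K < 3, K = 3, K > 3 interpretation.

```python
def moment_kurtosis(values):
    """Assumed definition: K = m4 / m2**2, using moments about the mean (divisor n)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((x - mean) ** 2 for x in values) / n
    m4 = sum((x - mean) ** 4 for x in values) / n
    if m2 == 0:
        raise ValueError("kurtosis is undefined for constant data")
    return m4 / m2 ** 2

def interpret_kurtosis(k, tol=1e-9):
    """Classify K against the reference value 3."""
    if k < 3 - tol:
        return "platykurtic"
    if k > 3 + tol:
        return "leptokurtic"
    return "mesokurtic"

sample = [1, 2, 3, 4, 5]        # hypothetical, evenly spread data
k = moment_kurtosis(sample)     # m2 = 2, m4 = 6.8, so K = 6.8 / 4
print(f"K = {k:.2f}: {interpret_kurtosis(k)}")
```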
SOLVE THE FOLLOWING:
1. Mean
2. Median
3. Mode
4. 1st quartile
5. 3rd Quartile
6. 35th Percentile
7. 67th Percentile
8. IQR
9. Mean Deviation
10. Standard Deviation
11. Skewness
12. Kurtosis
Given data:

Class Interval  Frequency
25-29           1
30-34           1
35-39           5
40-44           8
45-49           15
50-54           4
55-59           4
60-64           3
65-69           4
70-74           3
75-79           2

Working table (f = frequency, x = class mark, cf = cumulative frequency, x̄ = 2545/50 = 50.9):

Class Interval  f   x   cf  fx    f|x - x̄|  f(x - x̄)^2  f(x - x̄)^4
25-29           1   27  1   27    23.9       571.21       326280.86
30-34           1   32  2   32    18.9       357.21       127598.98
35-39           5   37  7   185   69.5       966.05       186650.52
40-44           8   42  15  336   71.2       633.68       50193.79
45-49           15  47  30  705   58.5       228.15       3470.16
50-54           4   52  34  208   4.4        4.84         5.86
55-59           4   57  38  228   24.4       148.84       5538.34
60-64           3   62  41  186   33.3       369.63       45542.11
65-69           4   67  45  268   64.4       1036.84      268759.30
70-74           3   72  48  216   63.3       1335.63      594635.83
75-79           2   77  50  154   52.2       1362.42      928094.13
Total           50          2545  484        7014.5       2536769.88
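Mean

$\bar{x} = \dfrac{\sum f x}{n} = \dfrac{2545}{50} = 50.9$

$n$ – sample size
$x$ – class mark
$f$ – frequency

Median

$x_{lb}$ - lower boundary of the median class
$cf$ - cumulative frequency for the class interval preceding the median class
$f_m$ - frequency in the median class
$i$ – class width or the interval size
$n$ – sample size

$mdn = x_{lb} + \left( \dfrac{\frac{n}{2} - cf}{f_m} \right) i = 44.5 + \left( \dfrac{25 - 15}{15} \right) 5$
$mdn = 47.83$

Mode

$x_{lb}$ - lower boundary of the modal class
$d_1$ - difference between the frequency in the modal class and the frequency in the preceding class interval
$d_2$ - difference between the frequency in the modal class and the frequency in the succeeding class interval
$i$ – class width or the interval size

$mo = x_{lb} + \left( \dfrac{d_1}{d_1 + d_2} \right) i = 44.5 + \left( \dfrac{7}{7 + 11} \right) 5$
$mo = 46.44$

Quartiles, Deciles, and Percentiles

$x_{lb}$ - lower boundary of the class containing the measure (for example, the 1st quartile class)
$cf$ - cumulative frequency for the class interval preceding that class
$f_m$ - frequency in that class
$i$ – class width or the interval size
$n$ – sample size

$Q_1 = 39.5 + \left( \dfrac{12.5 - 7}{8} \right) 5 = 42.94$

$Q_3 = 54.5 + \left( \dfrac{37.5 - 34}{4} \right) 5 = 58.88$

$D_3 = 39.5 + \left( \dfrac{15 - 7}{8} \right) 5 = 44.5$

$D_8 = 59.5 + \left( \dfrac{40 - 38}{3} \right) 5 = 62.83$

$P_{35} = x_{lb} + \left( \dfrac{\frac{35n}{100} - cf}{f_m} \right) i = 44.5 + \left( \dfrac{17.5 - 15}{15} \right) 5 = 45.33$

$P_{67} = x_{lb} + \left( \dfrac{\frac{67n}{100} - cf}{f_m} \right) i = 49.5 + \left( \dfrac{33.5 - 30}{4} \right) 5 = 53.88$

Interquartile Range

$IR = Q_3 - Q_1 = 58.88 - 42.94 = 15.94$

Mean Deviation

$md = \dfrac{484}{49} = 9.88$

Standard Deviation

$s = \sqrt{\dfrac{7014.5}{49}} = 11.96$

Skewness

$Sk = \dfrac{3(50.9 - 47.83)}{11.96} = 0.77$

Kurtosis

$k = \dfrac{2536769.88}{50\,[\ldots]} = 1.12$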
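The quartile, decile, and percentile results for the grouped data can be checked with one general function. This sketch assumes the grouped-data locator k·n/100 (so Q1 = P25, Q3 = P75, D3 = P30, D8 = P80), integer class limits, and a lower boundary of lower limit minus 0.5; the function name `grouped_percentile` is illustrative.

```python
# (lower limit, upper limit, frequency) for each class in the frequency table.
classes = [
    (25, 29, 1), (30, 34, 1), (35, 39, 5), (40, 44, 8), (45, 49, 15),
    (50, 54, 4), (55, 59, 4), (60, 64, 3), (65, 69, 4), (70, 74, 3),
    (75, 79, 2),
]

def grouped_percentile(classes, k):
    """k-th percentile: x_lb + ((k*n/100 - cf) / f_m) * i for the containing class."""
    if not 0 < k <= 100:
        raise ValueError("k must satisfy 0 < k <= 100")
    n = sum(f for _, _, f in classes)
    target = k * n / 100                  # locator for grouped data
    cf = 0                                # cumulative frequency before the current class
    for lower, upper, f in classes:
        if f > 0 and cf + f >= target:
            x_lb = lower - 0.5            # assumed boundary convention
            i = upper - lower + 1         # class width
            return x_lb + (target - cf) / f * i
        cf += f
    raise ValueError("frequency table is empty")

q1, q3 = grouped_percentile(classes, 25), grouped_percentile(classes, 75)
d3, d8 = grouped_percentile(classes, 30), grouped_percentile(classes, 80)
p35, p67 = grouped_percentile(classes, 35), grouped_percentile(classes, 67)
iqr = q3 - q1

for name, value in [("Q1", q1), ("Q3", q3), ("D3", d3), ("D8", d8),
                    ("P35", p35), ("P67", p67), ("IR", iqr)]:
    print(f"{name} = {value:.2f}")
```

Writing every measure as a percentile keeps the locator arithmetic in one place, so quartiles and deciles need no separate formulas.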