Classification & Presentation
Introduction
Drawing conclusions from raw data, also called ungrouped data, is technically very difficult.
Through statistics, the data can be grouped in many meaningful ways. Data presented in a
frequency distribution is known as grouped data.
Once the data is grouped, it becomes easier to pick out patterns and to draw logical
conclusions from it.
In simple words, a frequency distribution is a tabular summary showing the frequency of
observations in each of several non-overlapping classes.
Classification
Classification is a process of arranging the data according to some common
characteristic possessed by the facts constituting the data. The facts having a
common characteristic are termed as one class or group. Thus classification is
the grouping of the related facts into different classes.
Purpose of Classification
To condense the mass of data in such a manner that similarities and dissimilarities are readily
apprehended and relationships can be studied.
To facilitate comparison.
To have a bird’s eye view of the significant features of the data.
To highlight the important information while giving less prominence to insignificant items.
To utilize the data for tabulation and further statistical analysis.
To eliminate unnecessary details contained in raw data.
To present the complex, scattered data in a concise, logical and understandable form.
Essentials of a good classification
The classification should cover the entire data so that not a single item is left unclassified.
Each item of the data should belong to only one class; classes must not overlap.
It should facilitate comparison.
Class intervals should be of equal length.
It should conform to the objectives of the investigation.
It should be flexible.
Items constituting a group should be homogeneous.
Kinds of Classification
Types of classification:
Quantitative
Chronological
Geographical
Qualitative (Simple or Manifold)
Quantitative Classification
If the data is classified on the basis of some quantitative information, the classification is known as
quantitative classification.
Example:
Percentage Marks No. of Students
0-20 10
20-40 20
40-60 35
60-80 32
80-100 3
Chronological Classification
When the data is classified on the basis of time, it is known as chronological classification.
Such data is also known as a time series.
Example:
Year 1931 1941 1951 1961 1971 1981 1991 2001
Population 27.9 31.8 35.7 43.8 54.6 68.4 84.6 102.7
Geographical Classification
When data is classified on the basis of geographical information, it is known as geographical
classification.
S.No. State/ UT Wheat (Th. tonnes)
1 Uttar Pradesh 25220
2 Punjab 15783
3 Madhya Pradesh 14182
4 Haryana 11856
5 Rajasthan 9869
Qualitative Classification
If the data are classified on the basis of some attribute or quality (a descriptive
characteristic) such as gender, literacy, beauty, honesty, intelligence, religion,
education, or color of hair, the classification is called qualitative classification.
“In this type of classification, the attribute under study cannot be measured, but its
presence or absence can be found or felt.”
When the data are divided into two classes according to the presence or absence of a
single attribute, the classification is called Simple, Dichotomous, or Two-fold
classification. When more than one attribute is considered, it is called Manifold
classification.
Two-fold Classification
Population
  Males
    Employed: Married, Unmarried
    Unemployed: Married, Unmarried
  Females
    Employed: Married, Unmarried
    Unemployed: Married, Unmarried
Tabulation
Frequency Distribution
A frequency distribution is any device, such as a graph or table, that displays the values
a variable can assume along with the frequency of occurrence of these values, either
individually or as they are grouped into a set of mutually exclusive and exhaustive intervals.
“Frequency distribution is a method of organizing the raw and unorganized data.”
The data can be organized in the following three steps:
Step 1: Find the range of the given data.
Step 2: Decide the number of class intervals.
Step 3: Determine the width of each class.
Class Intervals
Class intervals are contiguous non-overlapping intervals selected in such a way that they
are mutually exclusive and exhaustive.
Formation of Frequency Distribution
The number of times a value occurs in a series is called the frequency of that value and the
arrangement obtained by mentioning the frequency against each value in the series is called
frequency distribution.
Value 3 5 6 8 7
Frequency 2 3 2 6 4
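The tallying described above can be sketched in a few lines of Python. This is only an illustration: the raw series `raw_values` is an assumption, built so that its counts match the Value/Frequency table, since the original observations are not given.

```python
from collections import Counter

# Illustrative raw series (an assumption): constructed so that its
# counts reproduce the Value/Frequency table above.
raw_values = [3] * 2 + [5] * 3 + [6] * 2 + [8] * 6 + [7] * 4

# Count how many times each value occurs in the series.
frequency = Counter(raw_values)

# Print the frequency distribution in ascending order of value.
for value in sorted(frequency):
    print(value, frequency[value])
```

Sorting the values before printing is a presentation choice; the counts themselves do not depend on the order of the raw data.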
Types of Frequency Distributions
Discrete Frequency Distribution
Grouped Frequency Distribution
Discrete Frequency Distribution
Number of Cars Sold Frequency
3 1
4 1
6 2
8 3
1 3
Grouped Frequency Distribution
In this, the various items of a series are classified into groups or classes. The
lowest and highest values that can be included in a class or group are called
class limits. The lowest value is known as the lower limit and the highest value
is known as the upper limit.
The width of the class is known as the class interval, and the number of items
falling within the range of the class interval is called the frequency of that
class.
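A grouped frequency distribution can be built by counting the items that fall between each pair of class limits. The sketch below assumes equal-width classes in which the upper limit is not counted in the class; the function name `grouped_frequency` and the marks are hypothetical, invented for illustration.

```python
def grouped_frequency(data, lower, width, n_classes):
    """Tally data into n_classes contiguous classes of equal width.

    Each class covers lower_limit <= x < upper_limit (upper limit excluded).
    Returns a list of (lower_limit, upper_limit, frequency) tuples.
    """
    classes = []
    for k in range(n_classes):
        lower_limit = lower + k * width
        upper_limit = lower_limit + width
        freq = sum(1 for x in data if lower_limit <= x < upper_limit)
        classes.append((lower_limit, upper_limit, freq))
    return classes

# Hypothetical marks, invented for illustration only.
marks = [5, 12, 18, 25, 27, 33, 38, 39, 41, 47]
table = grouped_frequency(marks, lower=0, width=10, n_classes=5)
for lower_limit, upper_limit, freq in table:
    print(f"{lower_limit}-{upper_limit}: {freq}")
```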
Open and Close Ended Classes
An open-end or undetermined class is a class in which either the lower limit or the upper limit
is missing. In general it arises in "more than" or "less than" type classifications.
In practice, open-end classes are generally avoided because they make it difficult to
calculate certain statistical measures, such as the arithmetic mean.
For example
Returns (INR 0000) No of times
Less than 10 5
10 – 20 12
Above 20 3
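One way to see the difficulty: the arithmetic mean of grouped data is usually computed from class midpoints, and an open-end class has no midpoint. The following is a minimal sketch using the classes from the example above; `class_midpoint` is a hypothetical helper, and `None` is used here to mark a missing limit.

```python
def class_midpoint(lower, upper):
    """Return the midpoint of a class, or None if either limit is missing."""
    if lower is None or upper is None:
        return None
    return (lower + upper) / 2

# (lower limit, upper limit, frequency); None marks a missing limit.
classes = [(None, 10, 5), (10, 20, 12), (20, None, 3)]

midpoints = [class_midpoint(lower, upper) for lower, upper, _ in classes]
print(midpoints)  # [None, 15.0, None]

# The grouped mean, sum(f * midpoint) / sum(f), cannot be computed
# while any midpoint is undefined.
can_compute_mean = all(m is not None for m in midpoints)
print(can_compute_mean)  # False
```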
Exclusive and Inclusive Classes
In exclusive classes, the upper limit of a class is excluded from that
class; in inclusive classes, the upper limit of a class is included in
that class.
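The difference between the two conventions shows up only at the class limits. The sketch below assumes classes are stored as (lower, upper) pairs; `find_class` is a hypothetical helper written for this illustration.

```python
def find_class(value, classes, inclusive):
    """Return the (lower, upper) class that contains value, or None.

    inclusive=False: lower <= value < upper   (upper limit excluded)
    inclusive=True:  lower <= value <= upper  (upper limit included)
    """
    for lower, upper in classes:
        if inclusive and lower <= value <= upper:
            return (lower, upper)
        if not inclusive and lower <= value < upper:
            return (lower, upper)
    return None

exclusive_classes = [(0, 10), (10, 20), (20, 30)]
inclusive_classes = [(0, 9), (10, 19), (20, 29)]

# Under the exclusive convention, 10 belongs to 10-20, not to 0-10.
print(find_class(10, exclusive_classes, inclusive=False))  # (10, 20)
# Under the inclusive convention, the upper limits 9 and 19 stay in their classes.
print(find_class(9, inclusive_classes, inclusive=True))    # (0, 9)
print(find_class(19, inclusive_classes, inclusive=True))   # (10, 19)
```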
Exclusive Type Interval Arrangement
Marks (percentage) No. of Students
0-10 15
10-20 17
20-30 22
30-40 23
40-50 30
50-60 39
Exclusive Class Distribution
Marks (percentage) No. of Students
0 and above but below 10 15
10 and above but below 20 17
20 and above but below 30 22
30 and above but below 40 23
40 and above but below 50 30
50 and above but below 60 39
Inclusive class distribution
Marks (percentage) No. of Students
0-9 15
10-19 17
20-29 22
30-39 23
40-49 30
50-59 39
As far as possible, inclusive class intervals should be avoided, as they can at
times be difficult for respondents to understand. Every inclusive distribution can be
converted into an exclusive one as follows:
Converting Inclusive Intervals to Exclusive
Step 1: Find the difference between the upper limit of any class and the lower limit
of the next class.
Step 2: Divide the difference found in step 1 by 2.
Step 3: Subtract the fraction obtained in step 2 from the lower limit of all the classes
and add the same fraction to the upper limit of all the classes.
In the previous example, the difference between the upper limit of the first class and the lower limit
of the second class is 10 – 9 = 1. Half of this difference is ½ = 0.5. Hence the previous data can be
modified as:
Marks (percentage) No. of Students
-0.5 – 9.5 15
9.5 – 19.5 17
19.5 – 29.5 22
29.5 – 39.5 23
39.5 – 49.5 30
49.5 – 59.5 39
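The conversion can be sketched in Python as follows. This is a minimal illustration, assuming at least two classes and the same gap between every pair of neighbouring classes; the function name `inclusive_to_exclusive` is hypothetical.

```python
def inclusive_to_exclusive(classes):
    """Convert inclusive (lower, upper) classes to exclusive class boundaries.

    Assumes at least two classes and a constant gap between neighbours.
    """
    # Step 1: gap between the first upper limit and the next lower limit.
    gap = classes[1][0] - classes[0][1]
    # Step 2: halve the gap.
    half = gap / 2
    # Step 3: move every lower limit down and every upper limit up by half.
    return [(lower - half, upper + half) for lower, upper in classes]

inclusive = [(0, 9), (10, 19), (20, 29), (30, 39), (40, 49), (50, 59)]
exclusive = inclusive_to_exclusive(inclusive)
print(exclusive[0])   # (-0.5, 9.5)
print(exclusive[-1])  # (49.5, 59.5)
```

After the conversion, each class's upper boundary equals the next class's lower boundary, so the classes are contiguous.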
Determination of Number of Classes
Sturges has given a formula to determine the number of classes:
$K = 1 + 3.322 \log_{10} N$
where K represents the number of classes and N the number of observations.
Determination of Magnitude of Class Intervals
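The magnitude of the class interval is given by:
$i = \dfrac{L - S}{K}$
where i is the magnitude (width) of the class interval, L is the largest observation,
S is the smallest observation, and K is the number of classes.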
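The two calculations can be sketched together in Python. The sketch assumes Sturges' rule in its usual form, K = 1 + 3.322 log10(N), and takes the class width as the range divided by the number of classes. Rounding K to the nearest integer is an assumption (some texts round up instead), and the function names and data are hypothetical.

```python
import math

def sturges_classes(n_observations):
    """Number of classes K = 1 + 3.322 * log10(N), rounded to the nearest integer."""
    return round(1 + 3.322 * math.log10(n_observations))

def class_width(data, n_classes):
    """Width = (largest value - smallest value) / number of classes."""
    return (max(data) - min(data)) / n_classes

# Hypothetical data, invented for illustration only.
data = [5, 12, 18, 25, 27, 33, 38, 39, 41, 47]

k = sturges_classes(len(data))   # N = 10 gives 1 + 3.322 = 4.322, rounded to 4
width = class_width(data, k)     # (47 - 5) / 4
print(k, width)  # 4 10.5
```

In practice the computed width is often rounded to a convenient figure, such as 10 here, so that the class limits are easy to read.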
A Perfect Table
Table -1
Marks of Students in Statistics (Section A)
Marks of Students Frequency
00 – 10 5
10 – 20 12
20 – 30 22
30 – 40 11
40 – 50 3
Source: Statistics Test
Charts and Graphs
Pie-Chart
A pie chart is used to show proportions.
A pie chart should contain data in percentages only.
A complete pie chart represents 100% of the data.
It is most effective for proportions of 2 to 4 categories.
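Since a pie chart works on percentages, frequencies have to be converted to shares of the total first. A minimal sketch, using the three-class returns example from earlier in the document as input; the function name `to_percentages` is hypothetical.

```python
def to_percentages(frequencies):
    """Convert a dict of category -> frequency into category -> percentage."""
    total = sum(frequencies.values())
    return {label: 100 * f / total for label, f in frequencies.items()}

# Frequencies from the open-end returns example (5 + 12 + 3 = 20 observations).
returns = {"Less than 10": 5, "10 - 20": 12, "Above 20": 3}
shares = to_percentages(returns)
print(shares)  # {'Less than 10': 25.0, '10 - 20': 60.0, 'Above 20': 15.0}
```

The shares add up to 100, which corresponds to the complete pie.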
Bar and Column Charts
These charts are used for comparison.
Data may be shown as percentages or frequencies.
Column Chart
Frequency Polygon
Figure -1
(IDBI Bank Stock Price Movement on 09.07.2019)
Pareto Chart
Used to show individual observations (as bars, in descending order) and their
cumulative total (as a line) together.
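The numbers behind a Pareto chart are the individual values sorted from largest to smallest, plus a running cumulative percentage. The sketch below uses the wheat figures from the geographical classification table; the percentages are therefore shares of the five listed states only, and the function name `pareto_table` is hypothetical.

```python
def pareto_table(values):
    """Return rows of (label, value, cumulative percent), largest value first."""
    total = sum(values.values())
    rows = []
    running = 0
    for label, value in sorted(values.items(), key=lambda kv: kv[1], reverse=True):
        running += value
        rows.append((label, value, 100 * running / total))
    return rows

# Wheat production (thousand tonnes) from the geographical table.
wheat = {
    "Uttar Pradesh": 25220,
    "Punjab": 15783,
    "Madhya Pradesh": 14182,
    "Haryana": 11856,
    "Rajasthan": 9869,
}

rows = pareto_table(wheat)
for label, value, cumulative in rows:
    print(f"{label}: {value} ({cumulative:.1f}% cumulative)")
```

The bars of the chart would be drawn from the second column and the cumulative line from the third.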