Collection of data
• Types of data
• Primary data: data that are
collected fresh and for the first time
• Secondary data: data that have already
been collected by someone else and
have already passed through
the statistical process
Method of Collection:
Primary data
• 1. Observation method: observation by an
observer
• “Observation may be defined as systematic
viewing, coupled with consideration of seen
phenomenon.” - P.V. Young
• 2. Interview method : Oral-verbal responses
• a. Direct Personal interviews
• b. Indirect personal interviews
• c. Telephonic interviews
• d. Group interviews
• g. Postal interviews
Constructing a frequency distribution involves:
1. Determining the question to be addressed
2. Collecting raw data
3. Organizing data (frequency distribution)
4. Presenting data (graph)
5. Drawing conclusions
A Frequency Distribution is a
grouping of data into mutually exclusive
categories showing the number of
observations in each class.
Uses of tabulation:
It facilitates the process of comparison
Statistical tables save space by reducing explanatory
and descriptive statements to a minimum.
It makes the data easier to remember.
It facilitates the summary of items and the detection of
errors and omissions.
Finally, it provides a basis for statistical computations.
Definitions
Class Midpoint: A point that divides a class
into two equal parts. This is the average of the upper
and lower class limits.
Class Frequency: The number of observations in each class.
Class Interval: The class interval is obtained by subtracting
the lower limit of a class from the lower limit of the next class.
The class intervals should be equal.
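The definitions of class interval and class midpoint can be checked with a few lines of Python. This is a minimal sketch: the class limits (0, 7, ..., 49) and the helper names `class_intervals` and `class_midpoints` are assumptions chosen for illustration.

```python
# Lower limits of seven classes; the last value (49) closes the final class.
# These limits are assumed for illustration only.
lower_limits = [0, 7, 14, 21, 28, 35, 42, 49]

def class_intervals(limits):
    """Class interval: lower limit of the next class minus lower limit of this class."""
    return [b - a for a, b in zip(limits, limits[1:])]

def class_midpoints(limits):
    """Class midpoint: average of the lower and upper limits of each class."""
    return [(a + b) / 2 for a, b in zip(limits, limits[1:])]

widths = class_intervals(lower_limits)
midpoints = class_midpoints(lower_limits)
print(widths)     # [7, 7, 7, 7, 7, 7, 7]
print(midpoints)  # [3.5, 10.5, 17.5, 24.5, 31.5, 38.5, 45.5]
```

All widths being equal confirms that the class intervals satisfy the "should be equal" requirement.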
Step One: Decide on the number of classes using the
formula $2^k > n$
where k = number of classes
n = number of observations
Step Two: Determine the class interval
or width using the formula
$i > \frac{H - L}{k}$
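Steps One and Two can be sketched in Python as follows. The function names `number_of_classes` and `class_width` are hypothetical, and the width is assumed to be rounded up to a whole number; H and L stand for the highest and lowest observed values.

```python
import math

def number_of_classes(n):
    """Smallest k such that 2**k > n."""
    k = 1
    while 2 ** k <= n:
        k += 1
    return k

def class_width(high, low, k):
    """Smallest whole-number width i with i > (high - low) / k."""
    return math.floor((high - low) / k) + 1

k = number_of_classes(50)   # 2**5 = 32 is too small, 2**6 = 64 > 50
i = class_width(45, 4, k)   # (45 - 4) / 6 is about 6.83, rounded up
print(k, i)                 # 6 7
```

The values n = 50, H = 45 and L = 4 are used here only as sample inputs.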
The survival times in months of 50 patients
suffering from acute myeloblastic leukemia (AML)
are given below.
a. Draw a frequency distribution table with proper
class intervals.
b. Create a histogram to show the distribution of
survival time.
18 31 28 36 05 39 20 04 45 23
36 22 08 07 27 05 23 32 29 22
37 07 24 18 08 04 14 43 13 42
10 12 24 13 17 28 08 09 16 18
44 25 15 04 34 28 32 17 20 19
EXAMPLE 1
• There are 50 observations, so n = 50.
• Two raised to the fifth power is 32, which is less than 50;
two raised to the sixth power is 64, which exceeds 50.
• Therefore, we should have at least
6 classes, i.e., k = 6.
$i > \frac{H - L}{k} = \frac{45 - 4}{6} \approx 6.83$
where H = highest value, L = lowest value
Round up for an interval of 7.
Set the lower limit of the first class at 0, giving a
total of 7 classes.
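Example
Step Three: Set the individual class limits.
Steps Four and Five: Tally and count the number of
items in each class.
Each class includes its lower limit and excludes its upper limit
(so a survival time of 7 months falls in the class 7 - 14).
Class      Frequency
 0 - 7      5
 7 - 14    10
14 - 21    11
21 - 28     8
28 - 35     8
35 - 42     4
42 - 49     4
Total      50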
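The tally in Steps Four and Five can be reproduced in code. The following is a minimal Python sketch, assuming seven classes of width 7 starting at 0, with each class including its lower limit and excluding its upper limit; `frequency_table` is a hypothetical helper name.

```python
# Survival times in months for the 50 AML patients, as listed above.
survival_months = [
    18, 31, 28, 36,  5, 39, 20,  4, 45, 23,
    36, 22,  8,  7, 27,  5, 23, 32, 29, 22,
    37,  7, 24, 18,  8,  4, 14, 43, 13, 42,
    10, 12, 24, 13, 17, 28,  8,  9, 16, 18,
    44, 25, 15,  4, 34, 28, 32, 17, 20, 19,
]

def frequency_table(values, start, width, n_classes):
    """Count values per class [lower, upper); returns (lower, upper, count) rows."""
    counts = [0] * n_classes
    for v in values:
        index = (v - start) // width
        if not 0 <= index < n_classes:
            raise ValueError(f"value {v} falls outside the classes")
        counts[index] += 1
    return [(start + j * width, start + (j + 1) * width, c)
            for j, c in enumerate(counts)]

table = frequency_table(survival_months, start=0, width=7, n_classes=7)
for lower, upper, count in table:
    print(f"{lower:>2} - {upper:<2}  {count:>2}")
# Class frequencies, in order: 5, 10, 11, 8, 8, 4, 4 (total 50)
```

Raising an error for out-of-range values is a design choice of this sketch: it guards against classes that fail to cover the whole data set.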
Presentation of data
The objectives of presenting data are to:
Be concise without losing the details
Arouse interest in the reader
Make it simple to form impressions
Define the problem and suggest the solution too
Be helpful in further analysis
The three commonly used graphic forms are
Histograms, Frequency Polygons, and a
Cumulative Frequency distribution.
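A cumulative frequency distribution is obtained by keeping a running total of the class frequencies. A minimal Python sketch, assuming the seven class frequencies obtained by tallying the 50 survival times into classes of width 7 (lower limit included, upper limit excluded):

```python
from itertools import accumulate

# Class frequencies for the classes 0-7, 7-14, ..., 42-49 (assumed convention:
# lower limit included, upper limit excluded).
class_frequencies = [5, 10, 11, 8, 8, 4, 4]

# Running total: number of observations below each upper class limit.
cumulative = list(accumulate(class_frequencies))
print(cumulative)  # [5, 15, 26, 34, 42, 46, 50]
```

The last cumulative value always equals the total number of observations, which is a quick check on the tally.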
A Histogram is a graph in which the class
midpoints or limits are marked on the horizontal
axis and the class frequencies on the vertical axis.
The class frequencies are represented by the heights
of the bars and the bars are drawn adjacent to each
other.
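The idea of a histogram can be illustrated without a plotting library. The sketch below prints a text histogram rotated by 90 degrees, so classes run down the page and bar length stands for class frequency. The classes and frequencies are those of the survival-time data under the assumed convention that each class excludes its upper limit; `text_histogram` is a hypothetical helper name.

```python
classes = [(0, 7), (7, 14), (14, 21), (21, 28), (28, 35), (35, 42), (42, 49)]
frequencies = [5, 10, 11, 8, 8, 4, 4]

def text_histogram(classes, frequencies, symbol="#"):
    """Return one line per class; bar length equals the class frequency."""
    lines = []
    for (lower, upper), f in zip(classes, frequencies):
        lines.append(f"{lower:>2}-{upper:<2} | {symbol * f} {f}")
    return lines

histogram_lines = text_histogram(classes, frequencies)
for line in histogram_lines:
    print(line)
```

Because consecutive classes share a limit, the rows have no gaps between them, which mirrors the adjacent bars of a drawn histogram.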
Fig 1: Histogram showing the height of the students (vertical axis: number of cases, 0 to 20; horizontal axis: 140 to 175 in steps of 5)
Graphic Presentation of a Frequency
Distribution
A Frequency Polygon consists of
line segments connecting the points
formed by the class midpoint and the
class frequency.
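The points of a frequency polygon can be computed directly from the class limits and frequencies. A minimal Python sketch under stated assumptions: the classes and frequencies are those of the survival-time data (upper limits excluded), `polygon_points` is a hypothetical helper, and closing the polygon with zero-frequency points one class width beyond each end is a common convention rather than something stated in this text.

```python
def polygon_points(lower_limits, frequencies):
    """Pair each class midpoint with its frequency, anchored to the axis at both ends."""
    width = lower_limits[1] - lower_limits[0]
    midpoints = [(a + b) / 2 for a, b in zip(lower_limits, lower_limits[1:])]
    points = list(zip(midpoints, frequencies))
    # Assumed convention: add zero-frequency points one width before and after.
    return [(midpoints[0] - width, 0)] + points + [(midpoints[-1] + width, 0)]

limits = [0, 7, 14, 21, 28, 35, 42, 49]
freqs = [5, 10, 11, 8, 8, 4, 4]
points = polygon_points(limits, freqs)
print(points[:3])  # [(-3.5, 0), (3.5, 5), (10.5, 10)]
```

Joining consecutive points with straight line segments gives the polygon.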
Fig: Frequency polygon for hours spent studying (vertical axis: frequency, 0 to 14; horizontal axis: hours spent studying, 10 to 35)
Line diagram
Qualitative data:
• Bar charts: A bar chart can be used to depict any level of
measurement (nominal, ordinal, interval or ratio).
• The bar chart is a popular and easy method for visual
comparison of the magnitude of different frequencies
in discrete data, such as morbidity, mortality, or vaccination status
of a population by age, sex, profession or place.
Different types of bar diagrams are:
• 1. Simple bar diagram
• 2. Multiple bar diagram
• 3. Component bar diagram
• 4. Percentage bar diagram
Type of leprosy    Number of patients
Tuberculoid        148
Lepromatous         64
Indeterminate       18
Borderline          10
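A simple bar diagram of the leprosy data can be sketched as text, with each category's share of the total shown alongside (the figures a percentage bar diagram would use). This is an illustrative Python sketch; `simple_bar_chart` and the scale of one symbol per two patients are assumptions.

```python
leprosy_cases = {
    "Tuberculoid": 148,
    "Lepromatous": 64,
    "Indeterminate": 18,
    "Borderline": 10,
}

def simple_bar_chart(data, patients_per_symbol=2):
    """Return one text bar per category, with the count and percentage share."""
    total = sum(data.values())
    lines = []
    for label, count in data.items():
        bar = "#" * (count // patients_per_symbol)
        share = 100 * count / total
        lines.append(f"{label:<14}{bar} {count} ({share:.1f}%)")
    return lines

bar_lines = simple_bar_chart(leprosy_cases)
for line in bar_lines:
    print(line)
```

Unlike a histogram, the bars here represent separate categories, so their order carries no numerical meaning.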
Multiple bar chart:
INCOME STATUS
Component Bar chart:
Association between the exposure & the disease
Percentage bar chart:
BURDEN SCALE
Pie Chart
Fig: Pie chart of five groups (Group A to Group E) with slices of 6.50%, 18.50%, 4.50%, 24.50% and 46%
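To draw a pie chart by hand, each percentage is converted to a sector angle out of 360 degrees. A minimal Python sketch using the five percentages shown on the chart; the helper name `sector_angles` is hypothetical, and the percentages are deliberately not assigned to groups.

```python
# Slice percentages as shown on the pie chart (order as listed).
slice_percentages = [6.5, 18.5, 4.5, 24.5, 46.0]

def sector_angles(percentages):
    """Convert percentages of the whole into sector angles in degrees."""
    return [round(p * 360 / 100, 1) for p in percentages]

angles = sector_angles(slice_percentages)
print(angles)       # [23.4, 66.6, 16.2, 88.2, 165.6]
print(sum(slice_percentages))  # 100.0
```

The percentages sum to 100, so the sector angles together make up the full circle.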
Ex. 1
• A psychologist estimates the IQ of 30 students. The values are as
follows:
• 86 98 78 96 79 81 103 85
• 94 100 103 112 76 95 98 94
• 101 99 83 94 64 78 122
• 105 115 68 84 90 100 96
• Form a frequency distribution table with proper class intervals. Draw
a suitable diagram.
Ex 2
• Represent the following data on the religion-wise
breakdown:
• Hindu 297
• Muslim 99
• Christian 29
• Others 15