Data organization and Presentation
Biostatistics lecture note
Uquba G (MPH, MSc EMCC)

Learning objectives
At the end of this lecture, students will be able to:
• Identify different ways of organizing and presenting data
• Construct the different forms of data organization and presentation

Methods of data organization
• The data collected in a survey are called raw data
• Information is not immediately evident from a mass of unsorted raw data
• Raw data need to be organized so as to condense the information and show patterns and variations
• Techniques of data organization & presentation
  - Ordered array
  - Tables
  - Graphs

Ordered array
• A serial arrangement of numerical data in ascending or descending order
• Tells us the range of the data and their general distribution
• Appropriate only for small data sets (fewer than 20 values)
• For larger data sets we need to use frequency distributions or tables

Frequency distributions
• A frequency distribution is a summary of how often each value occurs in a dataset.
• It is a table that shows data classified into a number of classes, with the corresponding number of observations falling in each class (frequency)
• Frequency is the number of times a certain value of the variable is repeated in a given class.
• Two types
  - Categorical frequency distribution
  - Numerical frequency distribution

Categorical frequency distribution
• Used for data that can be placed in specific categories
• Used for nominal & ordinal data
• E.g. blood type, marital status, etc.
• Example: A health worker collected data on the blood type of 30 individuals and recorded them as follows (hypothetical):
  O, A, AB, B, O, O, O, A, B, O, AB, B, B, A, AB, O, O, O, B, AB, O, A, AB, B, O, O, O, A, B, O

Procedures to construct the frequency distribution
There are 4 blood groups, so we have four classes
Step 1: Make a table
Step 2: Tally the data & place the result in the Tally column
Step 3: Count the tallies and place the result in the Frequency column
Step 4: Calculate the percentage for each class
% = (f/n) * 100
where f = frequency of the class &
n = total number of values
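The four steps above can be sketched in a few lines of Python. This is only an illustrative sketch using the 30 hypothetical blood types from the example; the function name `categorical_frequency_table` is an assumption made for this illustration, and counting with `Counter` stands in for the manual tally.

```python
from collections import Counter

# The 30 hypothetical blood types from the example
blood_types = ("O, A, AB, B, O, O, O, A, B, O, AB, B, B, A, AB, O, O, O, B, AB, "
               "O, A, AB, B, O, O, O, A, B, O").split(", ")

def categorical_frequency_table(values):
    """Return {category: (frequency, percent)} using % = f / n * 100."""
    n = len(values)                 # total number of values
    counts = Counter(values)        # replaces the tally column
    return {cat: (f, round(f / n * 100, 1)) for cat, f in counts.items()}

table = categorical_frequency_table(blood_types)
for cat in ("A", "B", "AB", "O"):
    f, pct = table[cat]
    print(f"{cat:<4}{f:>4}{pct:>7.1f}")
```

The percentages are rounded to one decimal place, so they may not sum to exactly 100.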

Numerical frequency distribution
Here the classification criterion is quantitative
It has two forms
• Ungrouped frequency distribution
  - For discrete quantitative data
• Grouped frequency distribution
  - For continuous quantitative data

Ungrouped frequency distribution
• A table of all the raw score values that could possibly occur in the data, along with the number of times each actually occurred
• Often used for small sets of data on discrete variables

Constructing ungrouped freq. distr.
• First, find the smallest & the largest values in the data
• Arrange the data in order of magnitude and count the frequency
• To facilitate counting, one may include a column of tallies.
• Steps in constructing
  - Step 1: Make the table
  - Step 2: Tally the data
  - Step 3: Count the frequency
  - Step 4: Compute the percentage
• E.g. the following hypothetical data represent the family size of 50 households:
4, 6, 4, 3, 5, 2 , 8, 10, 4, 4, 5, 3, 5, 8, 4, 4, 6, 2, 6, 4, 3, 5, 2 , 8, 10, 4, 4, 5, 3, 5, 8,
4, 4, 6, 2, 5, 2 , 8, 10, 4, 4, 5, 3, 10, 4, 5, 6, 3, 5, 6
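As a rough sketch of these steps, the family-size data can be tabulated as follows. The function name `ungrouped_frequency_table` is hypothetical. In line with the definition above, every whole value between the smallest and the largest is listed, including values that never occur.

```python
from collections import Counter

# Family size of the 50 hypothetical households
family_sizes = [
    4, 6, 4, 3, 5, 2, 8, 10, 4, 4, 5, 3, 5, 8, 4, 4, 6, 2, 6, 4, 3, 5, 2, 8, 10,
    4, 4, 5, 3, 5, 8, 4, 4, 6, 2, 5, 2, 8, 10, 4, 4, 5, 3, 10, 4, 5, 6, 3, 5, 6,
]

def ungrouped_frequency_table(values):
    """Return rows (value, frequency, percent) from the smallest to the largest value."""
    counts = Counter(values)
    n = len(values)
    rows = []
    for value in range(min(values), max(values) + 1):
        f = counts[value]           # 0 for values that could occur but did not
        rows.append((value, f, round(f / n * 100, 1)))
    return rows

for value, f, pct in ungrouped_frequency_table(family_sizes):
    print(f"{value:>3}{f:>5}{pct:>7.1f}")
```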

Grouped frequency distribution (GFD)
• A frequency distribution in which several values are grouped into one class
• Usually used when the range of the data is large
• Two types
  - Inclusive
    the upper limit of one class does not coincide with the lower limit of the next class (e.g. 12-17, 18-23)
  - Exclusive
    the upper limit of one class coincides with the lower limit of the next class (e.g. 10-20, 20-30)

Grouped freq. distr.…
• Example: Consider the following ungrouped marks of 30 students in the AMC Biostatistics course (out of 50%)
24 30 36 35 42 40 26 23
36 36 12 45 29 21 34 40
16 47 28 32 33 44 19 34
30 36 35 47 20 14
• Construct a grouped frequency distribution for the above data

Guidelines for creating classes
1. There should be between 6 and 20 classes
2. The classes must be mutually exclusive,
   i.e. no data value falls into two different classes
3. The classes must be all-inclusive (exhaustive), i.e. all data values must be included
4. The classes must be continuous,
   i.e. no gaps in the frequency distribution
5. The classes must be equal in width.
   • The exception is the first or the last class
   • It is possible to have an open-ended 'Below…' or '…and above' class.
   • Often used for ages.

Steps in constructing grouped freq. distr.
1. Find the largest & smallest values
2. Compute the range: R = Maximum - Minimum
   • From the above example, R = 47 - 12 = 35
3. Select the number of classes (usually 6-20) or use
   Sturges' rule: k = 1 + 3.322 log10(n)
   where k is the desired number of classes &
   n is the total number of observations
   k is rounded up if it has a fractional part
   From the example above (n = 30)
   k = 1 + 3.322 log10(30) (log10(30) ≈ 1.48)
   k = 1 + 3.322(1.48) ≈ 5.9, rounded up to 6
   So we need 6 classes
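Steps 1 to 3 can be checked numerically. The sketch below assumes base-10 logarithms, as in the worked example; the function names `data_range` and `sturges_classes` are made up for this illustration.

```python
import math

def data_range(values):
    """R = maximum - minimum."""
    return max(values) - min(values)

def sturges_classes(n):
    """Number of classes k = 1 + 3.322 * log10(n), rounded up."""
    return math.ceil(1 + 3.322 * math.log10(n))

# Marks of the 30 students from the example
marks = [24, 30, 36, 35, 42, 40, 26, 23, 36, 36, 12, 45, 29, 21, 34,
         40, 16, 47, 28, 32, 33, 44, 19, 34, 30, 36, 35, 47, 20, 14]

R = data_range(marks)
k = sturges_classes(len(marks))
print(R, k)
```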

Steps…
4. Find the class width (w) by dividing the range by the number of classes, and round up (not round off).
   From the example above, w = R/k = 35/6 ≈ 5.8, rounded up to 6
5. Choose a suitable starting point, equal to the minimum value.
   • The starting point is the lower limit of the 1st class
   • Keep adding the class width to this lower limit to get the remaining lower limits.

Steps…
• From the above example, the lower class limits (LCL) will be:
• The starting point is the smallest value = 12, so
  - 1st lower limit = 12
  - 2nd lower limit = 12 + 6 = 18
  - 3rd lower limit = 18 + 6 = 24
  - 4th lower limit = 24 + 6 = 30
  - 5th lower limit = 30 + 6 = 36
  - 6th lower limit = 36 + 6 = 42

Steps…
6. Find the upper class limits (UCL):
   UCL = LCL + (w - 1)
   From the above example, w = 6, so w - 1 = 5
   • 1st UCL = 12 + 5 = 17
   • 2nd UCL = 18 + 5 = 23
   • 3rd UCL = 24 + 5 = 29
   • 4th UCL = 30 + 5 = 35
   • 5th UCL = 36 + 5 = 41
   • 6th UCL = 42 + 5 = 47

Classes   Tally   Freq   %
12-17
18-23
24-29
30-35
36-41
42-47
Total

Steps…
7. Make the tally
8. Count the tallies & fill in the frequencies
9. Calculate & fill in the percentages
10. Find the relative frequency (rf):
    rf = f/n
11. Find the cumulative frequency (cf):
    • Lcf: less-than cumulative frequency (number of values < UCB)
    • Gcf: greater-than cumulative frequency (number of values > LCB)

By combining all the steps
Classes   Tally      Freq   %       rf     Cf (less than)   Cf (greater than)
12-17     ///        3      10.0    0.10   3                30
18-23     ////       4      13.3    0.13   7                27
24-29     ////       4      13.3    0.13   11               23
30-35     //// ///   8      26.7    0.27   19               19
36-41     //// /     6      20.0    0.20   25               11
42-47     ////       5      16.7    0.17   30               5
Total                30     100.0   1.00
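The whole construction can be reproduced with a short Python sketch. It assumes whole-number data (unit of measurement 1), so that UCL = LCL + (w - 1), and takes the number of classes k as given. The function name `grouped_frequency_table` is hypothetical, and the sketch does not handle the case where the range divides evenly by k and the maximum would fall outside the last class.

```python
import math
from itertools import accumulate

# Marks of the 30 students from the example
marks = [24, 30, 36, 35, 42, 40, 26, 23, 36, 36, 12, 45, 29, 21, 34,
         40, 16, 47, 28, 32, 33, 44, 19, 34, 30, 36, 35, 47, 20, 14]

def grouped_frequency_table(values, k):
    """Rows of (LCL, UCL, f, %, rf, less-than cf, greater-than cf)."""
    n = len(values)
    low, high = min(values), max(values)
    w = math.ceil((high - low) / k)           # class width, rounded up
    lcls = [low + i * w for i in range(k)]    # lower class limits
    ucls = [lcl + (w - 1) for lcl in lcls]    # upper class limits (unit = 1)
    freqs = [sum(lcl <= v <= ucl for v in values)
             for lcl, ucl in zip(lcls, ucls)]
    less_cf = list(accumulate(freqs))         # values below each UCB
    greater_cf = [n - c + f for c, f in zip(less_cf, freqs)]  # values above each LCB
    return [(lcl, ucl, f, round(f / n * 100, 1), round(f / n, 2), lcf, gcf)
            for lcl, ucl, f, lcf, gcf in zip(lcls, ucls, freqs, less_cf, greater_cf)]

rows = grouped_frequency_table(marks, k=6)
for row in rows:
    print(row)
```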

Common terms used in grouped freq. distr. (GFD)
• Class interval: the range of scores grouped together in a GFD
• Class limits: the first & the last elements of a given class interval
• Unit of measurement (U): the distance between two consecutive measures
  - U = LCL of the (n+1)th class - UCL of the nth class
    Example: for classes 12-17 and 18-23, U = 18 - 17 = 1
  - U is usually taken as 1, 0.1, 0.01, 0.001, …
Example 1: Unit of measurement U = 0.1
• Intervals:
  - 12.0 - 12.1
  - 12.1 - 12.2
• Calculation:
  - U = 12.1 - 12.0 = 0.1
Example 2: Unit of measurement U = 0.01
• Intervals:
  - 12.00 - 12.01
  - 12.01 - 12.02
• Calculation:
  - U = 12.01 - 12.00 = 0.01
Example 3: Unit of measurement U = 0.001
• Intervals:
  - 12.000 - 12.001
  - 12.001 - 12.002
• Calculation:
  - U = 12.001 - 12.000 = 0.001

Terms…
12. Class boundaries: separate one class in a GFD from another
    • The boundaries have one more decimal place than the raw data and therefore do not appear in the data
    • There is no gap between the upper boundary of one class and the lower boundary of the next class
    • LCB = LCL - U/2
    • UCB = UCL + U/2
    • E.g. 12-17, 18-23: U = 18 - 17 = 1
      - LCB for 18-23: 18 - 1/2 = 18 - 0.5 = 17.5
      - UCB for 18-23: 23 + 1/2 = 23 + 0.5 = 23.5

Terms…
Classes   Class boundaries   Freq   %
12-17     11.5-17.5
18-23     17.5-23.5
24-29     23.5-29.5
30-35     29.5-35.5
36-41     35.5-41.5
42-47     41.5-47.5
Total
• Class width (w) = UCB - LCB
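A minimal sketch of the boundary and width formulas, applied to the six classes of the marks example. The helper name `class_boundaries` and its `unit` parameter (the unit of measurement U, assumed to be 1 here) are illustrative choices.

```python
def class_boundaries(lcl, ucl, unit=1):
    """LCB = LCL - U/2 and UCB = UCL + U/2."""
    return lcl - unit / 2, ucl + unit / 2

# Class limits from the marks example
limits = [(12, 17), (18, 23), (24, 29), (30, 35), (36, 41), (42, 47)]

boundaries = [class_boundaries(lcl, ucl) for lcl, ucl in limits]
widths = [ucb - lcb for lcb, ucb in boundaries]   # w = UCB - LCB

for (lcl, ucl), (lcb, ucb), w in zip(limits, boundaries, widths):
    print(f"{lcl}-{ucl}   {lcb}-{ucb}   w = {w}")
```

Each upper boundary equals the next lower boundary, which is the "no gap" property stated above.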

Terms…
13. Class mark (Xc)
    • The midpoint of the class
    • The average of LCL & UCL, or the average of LCB & UCB
    • Xc = (LCL + UCL) / 2
      E.g. Xc1 = (12 + 17)/2 = 29/2 = 14.5

Classes   Class marks (Xc)
12-17     14.5
18-23     20.5
24-29     26.5
30-35     32.5
36-41     38.5
42-47     44.5
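The class marks in the table can be verified with a short sketch; `class_mark` is a name chosen for this illustration. It also checks that averaging the class boundaries gives the same midpoint as averaging the class limits.

```python
def class_mark(lower, upper):
    """Midpoint of a class: Xc = (lower + upper) / 2."""
    return (lower + upper) / 2

limits = [(12, 17), (18, 23), (24, 29), (30, 35), (36, 41), (42, 47)]

marks_from_limits = [class_mark(lcl, ucl) for lcl, ucl in limits]
# Same result from the class boundaries (LCL - 0.5, UCL + 0.5)
marks_from_boundaries = [class_mark(lcl - 0.5, ucl + 0.5) for lcl, ucl in limits]

print(marks_from_limits)
```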

-------------------------------------------------------------------------------
Cholesterol level   Freq    Relative freq   Cum. freq   Cum. rel. freq
(mg/100 ml)                 (%)                         (%)
-------------------------------------------------------------------------------
80-119              13      1.2             13          1.2
120-159             150     14.1            163         15.3
160-199             442     41.4            605         56.7
200-239             299     28.0            904         84.7
240-279             115     10.8            1019        95.5
280-319             34      3.2             1053        98.7
320-359             9       0.8             1062        99.5
360-399             5       0.5             1067        100
-------------------------------------------------------------------------------
Total               1067    100
Table xx. Frequencies of serum cholesterol levels for 1067 US males aged 25-34, 1976-1980

Example: the following data represent the serum urea level (mg/dl) of some patients
15, 18, 22, 27, 30, 35, 40, 18, 22, 24, 26, 45, 30, 17, 12, 32, 27, 41,
25, 15, 10, 12, 27, 26, 21, 20, 30, 28, 22, 25, 28, 42, 44, 48, 37,
25, 23, 24, 19, 20, 17, 13, 11, 15, 17, 32, 12, 14, 18, 11
i. Represent the above observations in a frequency table
ii. Find the number of persons who have S. urea of 25 and above
iii. Find the percentage of persons who have S. urea less than 30.
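One possible way to approach parts ii and iii directly from the raw data is sketched below; part i follows the grouped-frequency steps described earlier. The variable and function names are hypothetical, and this is a sketch rather than a model answer.

```python
# Serum urea (mg/dl) of the patients in the example
urea = [15, 18, 22, 27, 30, 35, 40, 18, 22, 24, 26, 45, 30, 17, 12, 32, 27, 41,
        25, 15, 10, 12, 27, 26, 21, 20, 30, 28, 22, 25, 28, 42, 44, 48, 37,
        25, 23, 24, 19, 20, 17, 13, 11, 15, 17, 32, 12, 14, 18, 11]

def count_at_least(values, threshold):
    """Number of observations greater than or equal to the threshold."""
    return sum(v >= threshold for v in values)

def percent_below(values, threshold):
    """Percentage of observations strictly less than the threshold."""
    return sum(v < threshold for v in values) / len(values) * 100

n_25_and_above = count_at_least(urea, 25)   # part ii
pct_below_30 = percent_below(urea, 30)      # part iii
print(n_25_and_above, pct_below_30)
```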

Rules in constructing tables
1. Tables should be as simple as possible (6-20 categories)
2. Tables should be self-explanatory
   • The title should be clear and to the point (it answers: what, when, where, how classified)
     e.g. Table 1: Marks of 30 medical students of AAU, March 2011, AA, Ethiopia.
   • The title is placed above the table
3. Each row & column should be labelled
4. Numerical entries of zero should be written explicitly rather than indicated by a dash, as dashes are reserved for missing or unobserved data.
5. Totals should be indicated (last row, last column)
6. If the data are not original, their source should be given in a footnote.

Types of tables
• There are three different types of tables, based on the number of variables included
1. Simple or one-way table
   - A single variable involved
2. Two-way table
   - Two variables cross-tabulated
3. Higher-order table
   - Three or more variables involved
In brief:
1. Immunization status of children in xxx woreda
2. Immunization status by sex of children in xxx woreda
3. Immunization status by sex and residence of children in xxx woreda

E.g. One-way table
• Table 2: Immunization status of children in xxx woreda, 2010 (hypothetical)
Immunization status   Number   Percent
Immunized             135      64.3
Not immunized         75       35.7
Total                 210      100.0

E.g. Two-way table
• Table 3: Immunization status by sex of children in xxx woreda, 2010 (hypothetical)
                  Immunization status
Sex of children   Immunized       Not immunized   Total
                  N      %        N      %        N      %
Male              85     65.4     45     34.6     130    100.0
Female            50     62.5     30     37.5     80     100.0
Total             135    64.3     75     35.7     210    100.0
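The percentages in Table 3 are row percentages: each count is divided by its row total. A small sketch of that calculation follows; the nested-dictionary layout and the function names `with_totals` and `row_percentages` are assumptions made for this illustration.

```python
# Counts from Table 3 (hypothetical data)
table3 = {
    "Male":   {"Immunized": 85, "Not immunized": 45},
    "Female": {"Immunized": 50, "Not immunized": 30},
}

def with_totals(table):
    """Add a 'Total' row holding the column sums."""
    columns = list(next(iter(table.values())))
    total_row = {c: sum(row[c] for row in table.values()) for c in columns}
    return {**table, "Total": total_row}

def row_percentages(table):
    """Percentage of each cell within its own row, to one decimal place."""
    result = {}
    for label, cells in table.items():
        row_total = sum(cells.values())
        result[label] = {c: round(v / row_total * 100, 1) for c, v in cells.items()}
    return result

percentages = row_percentages(with_totals(table3))
for label, cells in percentages.items():
    print(label, cells)
```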

E.g. Higher-order table
• Table 4: Immunization status by sex and residence of children in xxx woreda, 2010 (hypothetical)
                     Immunization status
Sex      Residence   Immunized       Not immunized   Total
                     N      %        N      %        N      %
Male     Urban       55     68.7     25     31.3     80     100.0
         Rural       30     60.0     20     40.0     50     100.0
Female   Urban       40     66.7     20     33.3     60     100.0
         Rural       10     50.0     10     50.0     20     100.0
Total                135    64.3     75     35.7     210    100.0

Diagrammatic/Graphical presentation of data
Objectives
At the end of the class, students will be able to:
• Identify the different types of graphs
• Choose among the graphs based on the data
• Construct the different types of graphs
• Identify the importance and limitations of using graphs
Graphical presentation of data
• Techniques for presenting data in visual displays using geometric figures and pictures.
• Importance
  - Greater attraction
  - Easily understandable
  - Facilitates comparison
  - May reveal unsuspected patterns in a complex set of data
  - Greater memorizing value
• Limitations
  - Used only for purposes of comparison
  - Not an alternative to tabulation
  - Can give only an approximate idea
  - They fail to bring small differences to light
Graphs
• A graph should be consistent
  - With the size of the paper on which the diagram is to be drawn
  - With the values of the variable to be presented
  - So the scale is chosen such that the diagram looks neither too small nor too big
• It should be neat and clean
• Include an index to explain the symbols, colours and lines
• Include explanatory notes at the bottom or corner of the diagram to explain the important points.
Types of graphs
• For qualitative & quantitative discrete data
  - Bar chart
  - Pie chart
• For quantitative continuous data
  - Histogram
  - Frequency polygon
  - Cumulative frequency polygon (ogive)
  - Stem-and-leaf plot
  - Scatter diagram
  - Line graph
Bar chart
• A series of equally spaced bars of equal width (base), where the height of each bar represents the frequency (or amount) associated with each category.
• It can be either vertical or horizontal
• Three types, based on the number of variables involved
  - Simple bar chart
  - Multiple bar chart
  - Component bar chart
Simple bar chart
• From our previous example
Table 2: Immunization status of children in xxx woreda, 2010 (hypothetical)
Immunization status   Freq   %
Immunized             135    64.3
Not immunized         75     35.7
Total                 210    100.0
[Figure 1: Immunization status of children in xxx woreda, 2010 (hypothetical). Simple bar chart; x-axis: immunization status (Immunized, Not immunized); y-axis: number of children.]
Multiple bar chart
• From the previous example
Table 3: Immunization status by sex of children in xxx woreda, 2010 (hypothetical)
                  Immunization status
Sex of children   Immunized       Not immunized   Total
                  N      %        N      %        N      %
Male              85     65.4     45     34.6     130    100.0
Female            50     62.5     30     37.5     80     100.0
Total             135    64.3     75     35.7     210    100.0
Multiple bar chart…
[Figure 2: Immunization status by sex of children in xxx woreda, 2010 (hypothetical). Two multiple bar charts; x-axis: immunization (Immunized, Not immunized); y-axis: number of children in the first panel, % of children in the second; legend: Male, Female.]
Component bar chart
• We can also construct a component bar chart for the above table
[Figure 3: Immunization status by sex of children in xxx woreda, 2010 (hypothetical). Two component bar charts; x-axis: sex (Male, Female); y-axis: % of children in the first panel, number of children in the second.]
Bar charts showing frequency distribution of the variable 'BWT' described in Table
[Figure: two bar charts; x-axis: BWT (Very low, Low, Normal, Big); y-axis: frequency in the first panel, relative frequency in the second.]
Bar charts for comparison
• In order to compare the distribution of a variable for two or more groups, bars are often drawn alongside each other for the groups being compared in a single bar chart
[Figure: Bar chart indicating categories of birth weight of 9975 newborns grouped by antenatal follow-up of the mothers. Two panels (percent and frequency) by BWT (Low, Normal, Big); legend: antenatal care (Yes, No); percentage labels: 9, 88.9, 2.1 and 7.9, 89, 3.1.]
Pie chart
• A circle divided into sectors so that the areas of the sectors are proportional to the frequencies.
• The 360° of the circle is distributed according to each frequency's share of the total number of observations.
• Angle for class i = (fi/n) * 360°, or equivalently (% of the class / 100) * 360°
Pie chart…
Example:
Table 4: Blood type of 30 individuals in xxx woreda, 2010 (hypothetical)
Blood type   Freq.   %
A            5       16.7
B            7       23.3
AB           5       16.7
O            13      43.3
Total        30      100
A = 5/30 * 360° = 60°
B = 7/30 * 360° = 84°
AB = 5/30 * 360° = 60°
O = 13/30 * 360° = 156°
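The sector angles can be computed with a few lines of Python. This is a minimal sketch; `pie_angles` is a hypothetical helper name, and the multiplication is done before the division only to keep the floating-point results exact for these values.

```python
# Blood type frequencies from the table above
blood_type_freq = {"A": 5, "B": 7, "AB": 5, "O": 13}

def pie_angles(freqs):
    """Sector angle in degrees for each class: f / n * 360."""
    n = sum(freqs.values())
    return {label: f * 360 / n for label, f in freqs.items()}

angles = pie_angles(blood_type_freq)
for label, angle in angles.items():
    print(f"{label:<3}{angle:>6.0f} degrees")
```

The angles always add up to 360 degrees, which is a quick check on the arithmetic.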
[Figure 4: Blood type of 30 individuals in XXXX Woreda, 2010 (hypothetical). Pie chart; A 17%, B 23%, AB 17%, O 43%.]
Line graph (diagram)
• Usually used to represent time series data, where time is plotted on the horizontal axis (x-axis) and the value of the variable at each time is plotted on the vertical axis (y-axis) with an appropriate scale
• Example: the following data represent the number of dead patients of a hospital in different years
Year                 2006   2007   2008   2009   2010   2011
# of dead patients   725    680    550    650    540    500
Histograms
• A graph consisting of a series of rectangles whose bases are equal to the class widths of the corresponding classes & whose heights are proportional to the class frequencies
• Used for quantitative continuous data
1. The horizontal axis is a continuous scale running from one extreme end to the other
   - It should be labelled with the name of the variable & the units of measurement
2. For each class in the distribution, a vertical rectangle is drawn such that:
   - There are never gaps between the rectangles of the histogram
   - The base of each rectangle is determined by the class width
E.g. Consider the data on student marks
Classes   Class marks (Xc)   Frequency
12-17     14.5               3
18-23     20.5               4
24-29     26.5               4
30-35     32.5               8
36-41     38.5               6
42-47     44.5               5
Total                        30
Table 5: Marks of 30 students, AMC Biostatistics course (out of 50%), Ethiopia, 2021 (hypothetical data)
[Figure 4: Histogram showing students' marks, AMC, 2021 (hypothetical data)]
Frequency polygon
• Join the midpoints of the tops of adjacent histogram rectangles with line segments
• When the polygon is closed on the x-axis, the area under the polygon is equal to the area under the histogram.
• The horizontal scale should be marked with the numerical values of the class midpoints (Xc)
• The length of each ordinate represents the class frequency.
[Figure 5: Frequency polygon showing marks of 30 students, AAU, 2010 (hypothetical data)]
Cumulative frequency polygon (Ogive)
• A line graph obtained by plotting the cumulative frequency (y-axis) against the class boundaries (x-axis)
• Two types
  - Less-than ogive: cumulative frequency less than the UCB (Lcf), or
  - More-than ogive: cumulative frequency more than the LCB (Mcf)
• We can also use the intersection of the two, which locates the median.
Construct an ogive using the table from the above example
Classes   Class boundaries   Freq   Less than cf   More than cf
12-17     11.5-17.5          3      3              30
18-23     17.5-23.5          4      7              27
24-29     23.5-29.5          4      11             23
30-35     29.5-35.5          8      19             19
36-41     35.5-41.5          6      25             11
42-47     41.5-47.5          5      30             5
Total                        30
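The points to plot for each ogive can be derived from the class boundaries and frequencies in the table. The sketch below only computes the (x, y) pairs and leaves the drawing to any plotting tool; the function names are hypothetical. The less-than curve is assumed to start at 0 at the first lower boundary, and the more-than curve to end at 0 at the last upper boundary.

```python
from itertools import accumulate

# Class boundaries and frequencies from the table above
boundaries = [11.5, 17.5, 23.5, 29.5, 35.5, 41.5, 47.5]
freqs = [3, 4, 4, 8, 6, 5]

def less_than_ogive(boundaries, freqs):
    """Points (boundary, number of values below that boundary)."""
    return list(zip(boundaries, [0, *accumulate(freqs)]))

def more_than_ogive(boundaries, freqs):
    """Points (boundary, number of values above that boundary)."""
    n = sum(freqs)
    return list(zip(boundaries, [n - c for c in [0, *accumulate(freqs)]]))

less_points = less_than_ogive(boundaries, freqs)
more_points = more_than_ogive(boundaries, freqs)
print(less_points)
print(more_points)
```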
[Figure 6: Less-than ogive showing marks of 30 students, AAU, 2010 (hypothetical data)]
[Figure 7: More-than ogive showing marks of 30 students, AAU, 2010 (hypothetical data)]
Scatter plot
• Used to show the relationship between two variables.
• A symbol, usually a dot, is used to show each data pair.
• X-axis: independent variable (factor, cause)
• Y-axis: dependent variable (outcome)
Summary
Data type                 Graph type          Description                                      Example
Qualitative               Bar chart           Frequency of categories                          Types of fruits sold
Qualitative               Pie chart           Proportions of categories                        Market share of smartphone brands
Qualitative               Stacked bar chart   Composition of categories                        Students by major and year
Quantitative discrete     Bar chart           Counts of discrete numeric values                Number of students in grades
Quantitative discrete     Histogram           Distribution of discrete values                  Scores in a test
Quantitative discrete     Dot plot            Individual occurrences of values                 Books read by each student
Quantitative continuous   Histogram           Distribution of continuous data                  Heights of students
Quantitative continuous   Line graph          Trends over time                                 Temperature changes over a week
Quantitative continuous   Box plot            Summary of data distribution                     Test scores across classes
Quantitative continuous   Scatter plot        Relationship between two continuous variables    Hours studied vs. exam scores
Quiz
What is the most appropriate graphical method to display each of the following data?
1. The distribution of pain level status
2. The height measurements of 20 individuals
3. The age of pregnant women attending ANC
4. Treatment outcome among RTA patients
More Related Content

PPTX
Basic statistics for marketing management
PDF
2 Methods of Data Presentation print.pdf
PPT
data presentation....................ppt
PPTX
data organization and presentation.pptx
PPTX
3 Frequency Distribution biostatistics wildlife
PPTX
Methods of data presention
PDF
Biostatistics methods of data organisation and presentation
PPT
Classidication and Tabulation
Basic statistics for marketing management
2 Methods of Data Presentation print.pdf
data presentation....................ppt
data organization and presentation.pptx
3 Frequency Distribution biostatistics wildlife
Methods of data presention
Biostatistics methods of data organisation and presentation
Classidication and Tabulation

Similar to 2. Data organization and presentaion.ppt (20)

PPT
statistic.ppt
PPTX
Chapter 3: Prsentation of Data
PPTX
Frequency Distribution (Class-interval- Tally).pptx
PPTX
Biostatistics mean median mode unit 1.pptx
PPTX
Tabulation of Data, Frequency Distribution, Contingency table
PPTX
Chapter 2 Descriptive statistics for pedatric.pptx
PDF
Frequency distribution explanation PPT.pdf
PPTX
Frequency Distributions for Organizing and Summarizing
PPTX
Chapter-5-Frequency-Distribution Mathematics in the modern World.pptx
PPTX
Basic statics
PPT
Chapter 2
PPT
Chapter 2
PPT
CLASSIFICATION AND TABULATION in Biostatic
PPTX
Biostatistics Lecture 1 8th Sem B.Pharm AKTU
PPTX
lesson-data-presentation-tools-1.pptx
PPT
Frequency distribution of a continuous variable. in Biostatic
PPT
Frequency distribution and graphs statistics.ppt
PPTX
2.1 frequency distributions for organizing and summarizing data
PPTX
ORGANISATION OF DATA.pptx
PPTX
3_-frequency_distribution.pptx
statistic.ppt
Chapter 3: Prsentation of Data
Frequency Distribution (Class-interval- Tally).pptx
Biostatistics mean median mode unit 1.pptx
Tabulation of Data, Frequency Distribution, Contingency table
Chapter 2 Descriptive statistics for pedatric.pptx
Frequency distribution explanation PPT.pdf
Frequency Distributions for Organizing and Summarizing
Chapter-5-Frequency-Distribution Mathematics in the modern World.pptx
Basic statics
Chapter 2
Chapter 2
CLASSIFICATION AND TABULATION in Biostatic
Biostatistics Lecture 1 8th Sem B.Pharm AKTU
lesson-data-presentation-tools-1.pptx
Frequency distribution of a continuous variable. in Biostatic
Frequency distribution and graphs statistics.ppt
2.1 frequency distributions for organizing and summarizing data
ORGANISATION OF DATA.pptx
3_-frequency_distribution.pptx
Ad

More from oumer5 (11)

PPTX
Viral Disease of reproductive safrica co
PPTX
Chapter 1 General pharmacology afr.pptx
PDF
ART pharmacology africa medical coll.pdf
PDF
Antifungal agents Africa college pre.pdf
PPT
Healing 2024 Africa medical collage 2101
PPTX
hemodynamics 221 African medical collage
PPTX
1. General pathology africa medical coll
PPTX
GENERAL CHEMISTRY OF CHEMICAL REACTION 1
PPTX
Histology of epitlial tissue of skin and
PPT
Common terminology associated with drug, fluid and electrolytes
PPT
Muscular system class at addis ababa cba
Viral Disease of reproductive safrica co
Chapter 1 General pharmacology afr.pptx
ART pharmacology africa medical coll.pdf
Antifungal agents Africa college pre.pdf
Healing 2024 Africa medical collage 2101
hemodynamics 221 African medical collage
1. General pathology africa medical coll
GENERAL CHEMISTRY OF CHEMICAL REACTION 1
Histology of epitlial tissue of skin and
Common terminology associated with drug, fluid and electrolytes
Muscular system class at addis ababa cba
Ad

Recently uploaded (20)

PPTX
migraine heaEDDDDDDDADFAAAAAAFdache (1).pptx
PPTX
Skeletal System presentation for high school
PPTX
Maternal and child health. The normal new born.pptx
PDF
Updates In Managing Cholesterol - Dr Matthew Liew
PDF
mycobacterial infection tuberculosis (TB)
DOCX
Advanced Nursing Procedures.....realted to advance nursing practice M.Sc. 1st...
PPTX
CASE PRESENTATION ON BIRTHAPHYXIA ,PPT PRESENTATION
PPTX
Pharmaco vigilance for BAMS according to NCISM
PPTX
A med nursing, GRP 4-SIKLE CELL DISEASE IN MEDICAL NURSING
PPTX
MEDICAL NURSING. Endocrine Disorder.pptx
PPTX
Management Basics Applied to Nursing.pptx
PDF
WHO Global TUBERCULOSIS Report 2018-2019
PPTX
Direct ELISA - procedure and application.pptx
PPTX
1-back pain presentation presentation .pptx
PPTX
RENAL IMAGING MODALITIES-RENAL NURSING.pptx
PPTX
case study of ischemic stroke for nursing
PPTX
PPTX
osteoporosis in menopause...............
PPT
Doppler - 5.ppt .........................
PPTX
OccupationalhealthPPT1Phealthinindustriesandsafety.pptx
migraine heaEDDDDDDDADFAAAAAAFdache (1).pptx
Skeletal System presentation for high school
Maternal and child health. The normal new born.pptx
Updates In Managing Cholesterol - Dr Matthew Liew
mycobacterial infection tuberculosis (TB)
Advanced Nursing Procedures.....realted to advance nursing practice M.Sc. 1st...
CASE PRESENTATION ON BIRTHAPHYXIA ,PPT PRESENTATION
Pharmaco vigilance for BAMS according to NCISM
A med nursing, GRP 4-SIKLE CELL DISEASE IN MEDICAL NURSING
MEDICAL NURSING. Endocrine Disorder.pptx
Management Basics Applied to Nursing.pptx
WHO Global TUBERCULOSIS Report 2018-2019
Direct ELISA - procedure and application.pptx
1-back pain presentation presentation .pptx
RENAL IMAGING MODALITIES-RENAL NURSING.pptx
case study of ischemic stroke for nursing
osteoporosis in menopause...............
Doppler - 5.ppt .........................
OccupationalhealthPPT1Phealthinindustriesandsafety.pptx

2. Data organization and presentaion.ppt

  • 2. 2 Learning objectives Learning objectives  At the end of this lecture the students will be able to:  Identify different ways of data organization & presentation  Familiar with constructing different methods of data organization and presentation
  • 3. 3 Methods of data organization Methods of data organization  The data collected in a survey is called raw data  Information is not immediately evident from the mass of unsorted raw data  Needs to be organized in such a way as to condense information to show patterns and variations  Techniques of data organization & presentation  Ordered array  Tables &  Graphs
  • 4. 4 Ordered array Ordered array  A serial arrangement of numerical data in an ascending or descending order  Tells as the ranges of data and their general distributions  Appropriate only for small data (<20)  If it is beyond 20 we need to use frequency distributions or Tables
  • 5. 5 Frequency distributions Frequency distributions  A frequency distribution is a summary of how often each value occurs in a dataset.  Is a table that shows data classified in to a number of classes with a corresponding number of times falling in each categories (frequency)  Frequency is the number of times a certain value of the variable is separated in a given class.  Two types  Categorical frequency distribution  Numerical frequency distribution
  • 6. 6 Categorical frequency distribution Categorical frequency distribution  Used for data that can be placed in specific categories  Used for nominal & Ordinal  E.g. blood type, marital status etc.  Example: A health worker collected data on blood type of 30 individuals and recorded as follows (Hypothetical)  O, A, AB, B, O, O, O, A, B, O, AB, B, B, A, AB, O, O, O, B, AB, O, A, AB, B, O, O, O, A, B, O
  • 7. 7 Procedures to construct the frequency distribution Procedures to construct the frequency distribution There are 4 types of blood group, so we have four classes Step 1: Make a table Step 2: Tally the data & place the result in Tally column Step 3: count the tally and Place the result in frequency column Step 4: calculate the % for each class % = f/n*100 Where f= frequency of the class & n= total number of values
  • 8. 8
  • 9. 9 Numerical frequency distribution Numerical frequency distribution Here the classification criterion is quantitative It has two forms  Ungrouped frequency distribution  For discrete quantitative data  Grouped frequency distribution  For continues quantitative data
  • 10. 10 Ungrouped frequency distribution Ungrouped frequency distribution  Is a table of all the potential raw score values that could possibly occur in the data along with the number of times each actually occurred  Often used for small set of data on discrete variables
  • 11. 11 Constructing ungrouped freq. distri. Constructing ungrouped freq. distri.  1st find the smallest & the largest values in the data  Arrange the data in order of magnitude and count the frequency  To facilitate counting one may include column of tallies.  Steps in constructing  Step 1: make the table  Step 2: Tally the data  Step 3: Count the frequency  Step 4: compute the percentage  E.g. the following hypothetical data represent family size of 50 households. 4, 6, 4, 3, 5, 2 , 8, 10, 4, 4, 5, 3, 5, 8, 4, 4, 6, 2, 6, 4, 3, 5, 2 , 8, 10, 4, 4, 5, 3, 5, 8, 4, 4, 6, 2, 5, 2 , 8, 10, 4, 4, 5, 3, 10, 4, 5, 6, 3, 5, 6
  • 12. 12
  • 13. 13 Grouped frequency distribution (GFD) Grouped frequency distribution (GFD)  A frequency distribution when several numbers are grouped in one class  Usually used when the range of the data is large  Two types  Inclusive  the upper limit of one class coincides with the lower limit of the next class  Exclusive  the upper limit of one class does not coincides with the lower limit of the next class
  • 14. 14 Grouped freq. distr….  Example: Consider the following ungrouped marks of 30 students of AMC Biostatistics course (out of 50%) 24 30 36 35 42 40 26 23 36 36 12 45 29 21 34 40 16 47 28 32 33 44 19 34 30 36 35 47 20 14 • Construct grouped frequency distribution for the above data
  • 15. 15 Guidelines for creating classes Guidelines for creating classes 1. There should be b/n 6-20 classes 2. The classes must be mutually exclusive. i.e. no data value fall into two d/t classes 3. The classes must be all inclusive or exhaustive. i.e. all data values must be included 4. The classes must be continues. i.e. No gaps in a frequency distribution 5. The classes must be equal in width. • The exception here is the first or the last classes • Possible to have ‘Below…’ or ‘…and above’ class. • Often used in ages.
  • 16. 16 Steps in constructing Grouped freq. distr Steps in constructing Grouped freq. distr. . 1. Find the largest & smallest value 2. Compute range (R) = Maximum –Minimum • From above example R = 47-12 =35 3. Select number of classes (usually 6-20) or use  Sturge’s rule k = 1+ 3.322 logn Where k is desired number of classes & n is total number of observations K will be round up if there are values after decimal From the example above (n =30) K = 1+ 3.322 log30 (log30 = 1.48) K = 1 + 3.322(1.48) = 5.9, round up to 6 So we need to have 6 classes
  • 17. 17 Steps… Steps… 4. Find the class width (w) by dividing the range by the number of classes and roundup not round off. From ex. Above w = R/k = 35/6 = 5.8, rounded to 6 5. Form a suitable starting point which is equal to the minimum value.  Starting point is called the lower limit of the 1st class  Continue to add the class width to this lower limit to get the rest of lower limits.
  • 18. 18 Steps.. Steps..  From the above example the lower class limits (LCL) will be:  The starting point is Small value = 12, so,  1st lower limit = 12  2nd lower limit = 12 +6 =18  3rd lower limit = 18+6 = 24  4th lower limit = 24+6 = 30  5th lower limit= 30+6 = 36  6th lower limit = 36+6 = 42
  • 19. 19 Steps… Steps…  1st UCL = 12 + 5 = 17  2nd UCL = 18 + 5 = 23  3rd UCL = 24 + 5 = 29  4th UCL = 30 + 5 = 35  5th UCL = 36 + 5 = 41  6th UCL = 42 + 5 = 47 Classes Tally Freq % 12-17 18-23 24-29 30-35 36-41 42-47 Total 6. Find the upper class limit (UCL), UCL= LCL + (w-1) From the above ex. W= 6, so, W-1 = 5
  • 20. 20 Steps … Steps … 7. Make tally 8. Count the tally & fill frequency 9. Calculate & fill percentages 10. Find relative frequency (rf) Rf=f/n 11. Find cumulative frequency (cf):  Lcf : Less than cumulative frequency (<UCB)  Gcf: Greater than cumulative frequency (>LCB)
  • 21. By combining all the steps

       Classes | Tally    | Freq | %     | rf   | Cf (less than) | Cf (greater than)
       --------|----------|------|-------|------|----------------|------------------
       12-17   | ///      | 3    | 10.0  | 0.10 | 3              | 30
       18-23   | ////     | 4    | 13.3  | 0.13 | 7              | 27
       24-29   | ////     | 4    | 13.3  | 0.13 | 11             | 23
       30-35   | //// /// | 8    | 26.7  | 0.27 | 19             | 19
       36-41   | //// /   | 6    | 20.0  | 0.20 | 25             | 11
       42-47   | ////     | 5    | 16.7  | 0.17 | 30             | 5
       Total   |          | 30   | 100.0 | 1.00 |                |
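The percentage, relative frequency and cumulative frequency columns can be recomputed from the frequency column alone. The sketch below assumes only the six frequencies shown in the table; the variable names are illustrative.

```python
from itertools import accumulate

classes = ["12-17", "18-23", "24-29", "30-35", "36-41", "42-47"]
freq = [3, 4, 4, 8, 6, 5]          # frequencies from the combined table
n = sum(freq)

rel_freq = [f / n for f in freq]                      # rf = f / n
percent = [100 * rf for rf in rel_freq]               # percentage of each class
less_than_cf = list(accumulate(freq))                 # running total from the first class
greater_than_cf = list(accumulate(freq[::-1]))[::-1]  # running total from the last class

for c, f, p, rf, lcf, gcf in zip(classes, freq, percent, rel_freq,
                                 less_than_cf, greater_than_cf):
    print(f"{c:>6} {f:>3} {p:6.1f} {rf:5.2f} {lcf:>3} {gcf:>3}")
```

The last less-than value and the first greater-than value both equal n, which is a quick check that no observation was lost.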
  • 22. Common terms used in grouped frequency distributions (GFD)
    - Class interval: the range of scores grouped together in a GFD.
    - Class limits: the first and last elements of a given class interval.
    - Unit of measurement (U): the distance between two consecutive possible measurements.
       - U = LCL of the (n+1)th class - UCL of the nth class
       - Example: for classes 12-17 and 18-23, U = 18 - 17 = 1.
       - U is usually 1, 0.1, 0.01, 0.001, ...
    - Example 1: unit of measurement U = 0.1
       - Consecutive measurements: 12.0, 12.1, 12.2
       - Calculation: U = 12.1 - 12.0 = 0.1
  • 23. Common terms used in grouped frequency distributions (GFD) (continued)
    - Example 2: unit of measurement U = 0.01
       - Consecutive measurements: 12.00, 12.01, 12.02
       - Calculation: U = 12.01 - 12.00 = 0.01
    - Example 3: unit of measurement U = 0.001
       - Consecutive measurements: 12.000, 12.001, 12.002
       - Calculation: U = 12.001 - 12.000 = 0.001
  • 24. Terms (continued)
    12. Class boundaries: the values that separate one class in a GFD from the next.
       - The boundaries have one more decimal place than the raw data and therefore do not appear in the data.
       - There is no gap between the upper boundary of one class and the lower boundary of the next class.
       - LCB = LCL - U/2
       - UCB = UCL + U/2
       - E.g. for classes 12-17 and 18-23, U = 18 - 17 = 1.
       - LCB for 18-23: 18 - 1/2 = 18 - 0.5 = 17.5
       - UCB for 18-23: 23 + 1/2 = 23 + 0.5 = 23.5
  • 25. Terms (continued)

       Classes | Class boundaries | Freq | %
       --------|------------------|------|---
       12-17   | 11.5-17.5        |      |
       18-23   | 17.5-23.5        |      |
       24-29   | 23.5-29.5        |      |
       30-35   | 29.5-35.5        |      |
       36-41   | 35.5-41.5        |      |
       42-47   | 41.5-47.5        |      |
       Total   |                  |      |

    - Class width (w) = UCB - LCB
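As a small sketch, the class boundaries and the class width can be derived from the class limits of the example. The list `limits` simply restates the six classes; nothing else is assumed.

```python
limits = [(12, 17), (18, 23), (24, 29), (30, 35), (36, 41), (42, 47)]

# Unit of measurement: LCL of the second class minus UCL of the first class
u = limits[1][0] - limits[0][1]

# LCB = LCL - U/2 and UCB = UCL + U/2
boundaries = [(lcl - u / 2, ucl + u / 2) for lcl, ucl in limits]

# Class width: w = UCB - LCB
widths = [ucb - lcb for lcb, ucb in boundaries]

for (lcl, ucl), (lcb, ucb) in zip(limits, boundaries):
    print(f"{lcl}-{ucl}: {lcb}-{ucb}")
```

Each upper boundary coincides with the next lower boundary, which is what removes the gaps between classes.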
  • 26. Terms (continued)
    13. Class mark (Xc)
       - The midpoint of the class.
       - The average of LCL and UCL, or the average of LCB and UCB.
       - Xc = (LCL + UCL) / 2
       - E.g. Xc1 = (12 + 17) / 2 = 29/2 = 14.5

       Classes | Class marks (Xc)
       --------|-----------------
       12-17   | 14.5
       18-23   | 20.5
       24-29   | 26.5
       30-35   | 32.5
       36-41   | 38.5
       42-47   | 44.5
       Total   |
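A short sketch of the class-mark calculation, which also checks that averaging the class limits and averaging the class boundaries give the same midpoint. It assumes U = 1, as in the example.

```python
limits = [(12, 17), (18, 23), (24, 29), (30, 35), (36, 41), (42, 47)]
u = 1  # unit of measurement in the example

# Class mark from the class limits: Xc = (LCL + UCL) / 2
marks_from_limits = [(lcl + ucl) / 2 for lcl, ucl in limits]

# Class mark from the class boundaries: Xc = (LCB + UCB) / 2
marks_from_boundaries = [((lcl - u / 2) + (ucl + u / 2)) / 2 for lcl, ucl in limits]

print(marks_from_limits)
print(marks_from_limits == marks_from_boundaries)
```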
  • 27. Table xx. Frequencies of serum cholesterol levels for 1067 US males of ages 25-34, 1976-1980

       Cholesterol level (mg/100 ml) | Freq | Relative freq (%) | Cum freq | Cum. rel. freq (%)
       ------------------------------|------|-------------------|----------|-------------------
       80-119                        | 13   | 1.2               | 13       | 1.2
       120-159                       | 150  | 14.1              | 163      | 15.3
       160-199                       | 442  | 41.4              | 605      | 56.7
       200-239                       | 299  | 28.0              | 904      | 84.7
       240-279                       | 115  | 10.8              | 1019     | 95.5
       280-319                       | 34   | 3.2               | 1053     | 98.7
       320-359                       | 9    | 0.8               | 1062     | 99.5
       360-399                       | 5    | 0.5               | 1067     | 100
       Total                         | 1067 | 100               |          |
  • 28. Example: the following data represent the serum (S.) urea level (mg/dl) of some patients:
    15, 18, 22, 27, 30, 35, 40, 18, 22, 24, 26, 45, 30, 17, 12, 32, 27, 41, 25, 15, 10, 12, 27, 26, 21, 20, 30, 28, 22, 25, 28, 42, 44, 48, 37, 25, 23, 24, 19, 20, 17, 13, 11, 15, 17, 32, 12, 14, 18, 11
    i. Represent the above observations in a frequency table.
    ii. Find the number of persons who have an S. urea of 25 and above.
    iii. Find the percentage of persons who have an S. urea of less than 30.
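One possible way to work through this exercise in Python is sketched below. It is not an official solution: it assumes Sturges' rule is used for the number of classes, that the first class starts at the minimum value, and that the data are whole numbers (U = 1). A different choice of classes gives a different but equally valid table.

```python
import math

urea = [15, 18, 22, 27, 30, 35, 40, 18, 22, 24, 26, 45, 30, 17, 12, 32, 27,
        41, 25, 15, 10, 12, 27, 26, 21, 20, 30, 28, 22, 25, 28, 42, 44, 48,
        37, 25, 23, 24, 19, 20, 17, 13, 11, 15, 17, 32, 12, 14, 18, 11]

n = len(urea)
lo, hi = min(urea), max(urea)
k = math.ceil(1 + 3.322 * math.log10(n))   # Sturges' rule, rounded up
w = math.ceil((hi - lo) / k)               # class width, rounded up

# Part i: grouped frequency table (whole-number data, so UCL = LCL + w - 1)
table = []
for i in range(k):
    lcl = lo + i * w
    ucl = lcl + w - 1
    f = sum(lcl <= x <= ucl for x in urea)
    table.append((lcl, ucl, f))

for lcl, ucl, f in table:
    print(f"{lcl}-{ucl}: {f}")

# Part ii: number of persons with S. urea of 25 and above
at_least_25 = sum(x >= 25 for x in urea)

# Part iii: percentage of persons with S. urea less than 30
pct_below_30 = 100 * sum(x < 30 for x in urea) / n

print(at_least_25, pct_below_30)
```

Parts ii and iii are counted from the raw values rather than from the grouped table, because 25 and 30 do not fall on class limits for this choice of classes.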
  • 29. Rules in constructing tables
    1. Tables should be as simple as possible (6-20 categories).
    2. Tables should be self-explanatory.
       - The title should be clear and to the point (it answers: what, when, where, how classified), e.g. Table 1: Marks of 30 Medical students of AAU, March 2011, AA, Ethiopia.
       - The title is placed above the table.
    3. Each row and column should be labelled.
    4. Numerical entries of zero should be written explicitly rather than indicated by a dash, as dashes are reserved for missing or unobserved data.
    5. Totals should be indicated (last row, last column).
    6. If the data are not original, their source should be given in footnotes.
  • 30. Types of tables
    There are three different types of tables, based on the number of variables included:
    1. Simple or one-way table: a single variable is involved.
    2. Two-way table: two variables are cross-tabulated.
    3. Higher-order table: three or more variables are involved.
  • 31. In brief, one example of each type:
    1. Immunization status of children in xxx woreda
    2. Immunization status by sex of children in xxx woreda
    3. Immunization status by sex and residence of children in xxx woreda
  • 32. E.g. One-way table
    Table 2: Immunization status of children in xxx woreda, 2010 (hypothetical)

       Immunization status | Number | Percent
       --------------------|--------|--------
       Immunized           | 135    | 64.3
       Not immunized       | 75     | 35.7
       Total               | 210    | 100.0
  • 33. E.g. Two-way table
    Table 3: Immunization status by sex of children in xxx woreda, 2010 (hypothetical)

       Sex of children | Immunized N | Immunized % | Not immunized N | Not immunized % | Total N | Total %
       ----------------|-------------|-------------|-----------------|-----------------|---------|--------
       Male            | 85          | 65.4        | 45              | 34.6            | 130     | 100.0
       Female          | 50          | 62.5        | 30              | 37.5            | 80      | 100.0
       Total           | 135         | 64.3        | 75              | 35.7            | 210     | 100.0
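The percentages in Table 3 are row percentages: each count is divided by its row total. A minimal sketch of that calculation follows; the dictionary layout and the helper name `row_percentages` are illustrative choices, not part of the lecture.

```python
# Counts from Table 3 (hypothetical data)
counts = {
    "Male": {"Immunized": 85, "Not immunized": 45},
    "Female": {"Immunized": 50, "Not immunized": 30},
}

def row_percentages(row):
    """Percentage of each immunization status within one row, to 1 decimal."""
    total = sum(row.values())
    return {status: round(100 * c / total, 1) for status, c in row.items()}

# Column totals, which form the "Total" row of the table
totals = {}
for row in counts.values():
    for status, c in row.items():
        totals[status] = totals.get(status, 0) + c

for sex, row in {**counts, "Total": totals}.items():
    print(sex, sum(row.values()), row_percentages(row))
```

Row percentages make the two sexes comparable even though the groups differ in size (130 boys and 80 girls).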
  • 34. E.g. Higher-order table
    Table 4: Immunization status by sex and residence of children in xxx woreda, 2010 (hypothetical)

       Sex    | Residence | Immunized N | Immunized % | Not immunized N | Not immunized % | Total N | Total %
       -------|-----------|-------------|-------------|-----------------|-----------------|---------|--------
       Male   | Urban     | 55          | 68.7        | 25              | 31.3            | 80      | 100.0
       Male   | Rural     | 30          | 60.0        | 20              | 40.0            | 50      | 100.0
       Female | Urban     | 40          | 66.7        | 20              | 33.3            | 60      | 100.0
       Female | Rural     | 10          | 50.0        | 10              | 50.0            | 20      | 100.0
       Total  |           | 135         | 64.3        | 75              | 35.7            | 210     | 100.0
  • 36. Objectives
    At the end of the class the students will be able to:
       - Identify the different types of graphs
       - Choose among the graphs based on the data
       - Be familiar with constructing the different types of graphs
       - Identify the importance and limitations of using graphs
  • 37. Graphical presentation of data
    - Techniques for presenting data in visual displays using geometric figures and pictures.
    - Importance:
       - Greater attraction
       - Easily understandable
       - Facilitates comparison
       - May reveal unsuspected patterns in a complex set of data
       - Greater memorizing value
    - Limitations:
       - Used only for the purpose of comparison
       - Not an alternative to tabulation
       - Can give only an approximate idea
       - Fail to bring small differences to light
  • 38. Graphs
    - A graph should be CONSISTENT:
       - with the size of the paper on which the diagram is to be drawn;
       - with the values of the variable to be presented;
       - so the scale is chosen such that the diagram looks neither too small nor too big.
    - It should be neat and clean.
    - It should have an index to explain the symbols, colours and lines.
    - Explanatory notes on the important points go at the bottom or corner of the diagram.
  • 39. Types of graphs
    - For qualitative and quantitative discrete data:
       - Bar chart
       - Pie chart
    - For quantitative continuous data:
       - Histogram
       - Frequency polygon
       - Cumulative frequency polygon (ogive)
       - Stem-and-leaf plot
       - Scatter diagram
       - Line graph
  • 40. Bar chart
    - A series of equally spaced bars of equal width (base), where the height of each bar represents the frequency (amount) associated with the category.
    - Bars can be either vertical or horizontal.
    - Three types, based on the number of variables involved:
       - Simple bar chart
       - Multiple bar chart
       - Component bar chart
  • 41. Simple bar chart
    From our previous example:
    Table 2: Immunization status of children in xxx woreda, 2010 (hypothetical)

       Immunization status | Freq | %
       --------------------|------|------
       Immunized           | 135  | 64.3
       Not immunized       | 75   | 35.7
       Total               | 210  | 100.0

    [Figure 1: Immunization status of children in xxx woreda, 2010 (hypothetical). Simple bar chart; x-axis: immunization status (Immunized, Not immunized); y-axis: number of children.]
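  • 42. Multiple bar chart
    From the previous example:
    Table 3: Immunization status by sex of children in xxx woreda, 2010 (hypothetical)

       Sex of children | Immunized N | Immunized % | Not immunized N | Not immunized % | Total N | Total %
       ----------------|-------------|-------------|-----------------|-----------------|---------|--------
       Male            | 85          | 65.4        | 45              | 34.6            | 130     | 100.0
       Female          | 50          | 62.5        | 30              | 37.5            | 80      | 100.0
       Total           | 135         | 64.3        | 75              | 35.7            | 210     | 100.0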
  • 43. Multiple bar chart (continued)
    [Figure 2: Immunization status by sex of children in xxx woreda, 2010 (hypothetical). Two multiple bar charts with separate bars for Male and Female; x-axis: immunization (Immunized, Not immunized); y-axis: number of children in the first panel, % of children in the second.]
  • 44. Component bar chart
    We can also construct a component bar chart for the above table.
    [Figure 3: Immunization status by sex of children in xxx woreda, 2010 (hypothetical). Two component bar charts; x-axis: sex (Male, Female); y-axis: % of children in the first panel, number of children in the second.]
  • 45. [Figure: Bar charts showing frequency distribution of the variable 'BWT' described in Table. Two panels; x-axis: BWT (Very low, Low, Normal, Big); y-axis: frequency in the first panel, relative frequency in the second.]
  • 46. Bar charts for comparison
    - To compare the distribution of a variable across two or more groups, the bars for the groups being compared are often drawn alongside each other in a single bar chart.
    [Figure: Bar chart indicating categories of birth weight of 9975 newborns grouped by antenatal follow-up of the mothers. Two panels with bars for antenatal care Yes and No; x-axis: BWT (Low, Normal, Big); y-axis: percent in the first panel (data labels 9, 88.9, 2.1 and 7.9, 89, 3.1), frequency in the second.]
  • 47. Pie chart
    - A circle divided into sectors so that the areas of the sectors are proportional to the frequencies.
    - The 360° of the circle are shared out according to each frequency's proportion of the total number of observations.
    - Angle of a sector = (fi/n) × 360°, or equivalently (% of the class / 100) × 360°
  • 48. Pie chart (continued)
    Example: Table 4: Blood type of 30 individuals in xxx woreda, 2010 (hypothetical)

       Blood type | Freq. | %
       -----------|-------|------
       A          | 5     | 16.7
       B          | 7     | 23.3
       AB         | 5     | 16.7
       O          | 13    | 43.3
       Total      | 30    | 100

    - A = 5/30 × 360° = 60°
    - B = 7/30 × 360° = 84°
    - AB = 5/30 × 360° = 60°
    - O = 13/30 × 360° = 156°
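The sector angles can be computed directly from the frequencies, as in this small sketch. Only the four frequencies from the blood-type table are assumed; the variable names are illustrative.

```python
# Frequencies from the blood-type table (hypothetical data)
freq = {"A": 5, "B": 7, "AB": 5, "O": 13}
n = sum(freq.values())

# Angle of each sector in degrees: (f / n) * 360
angles = {blood_type: f * 360 / n for blood_type, f in freq.items()}

for blood_type, angle in angles.items():
    print(f"{blood_type}: {angle:.0f} degrees")

print(sum(angles.values()))  # the sectors must add up to the full circle
```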
  • 49. Pie chart (continued)
    [Figure 4: Blood type of 30 individuals in XXXX Woreda, 2010 (hypothetical). Pie chart with sectors A 17%, B 23%, AB 17%, O 43%.]
  • 50. Line graph (diagram)
    - Usually used to represent time series data: time is plotted on the horizontal axis (x-axis) and the value of the variable at each time is plotted on the vertical axis (y-axis), with an appropriate scale.
    - Example: the following data represent the number of dead patients of a hospital in different years.

       Year               | 2006 | 2007 | 2008 | 2009 | 2010 | 2011
       -------------------|------|------|------|------|------|-----
       # of dead patients | 725  | 680  | 550  | 650  | 540  | 500
  • 51. Histograms
    - A graph consisting of a series of rectangles whose bases are equal to the class widths of the corresponding classes and whose heights are proportional to the class frequencies.
    - Used for quantitative continuous data.
    1. The horizontal axis is a continuous scale running from one extreme end of the distribution to the other.
       - It should be labelled with the name of the variable and the units of measurement.
    2. For each class in the distribution, a vertical rectangle is drawn such that:
       - the base of the rectangle is determined by the class width;
       - there are never gaps between adjacent rectangles.
  • 52. E.g. Consider the data on student marks
    Table 5: Marks of 30 students, AMC Biostatistics course (out of 50%), Ethiopia, 2021 (hypothetical data)

       Classes | Class marks (Xc) | Frequency
       --------|------------------|----------
       12-17   | 14.5             | 3
       18-23   | 20.5             | 4
       24-29   | 26.5             | 4
       30-35   | 32.5             | 8
       36-41   | 38.5             | 6
       42-47   | 44.5             | 5
       Total   |                  | 30
  • 53. Histograms Figure 4: Histograms showing students’ mark, AMC, 2021 (hypothetical data)
  • 54. Frequency polygon
    - Join the midpoints of the tops of adjacent histogram rectangles with straight line segments.
    - When the polygon is closed onto the x-axis at both ends, the area under the polygon is equal to the area under the histogram.
    - The horizontal scale should be marked with the numerical values of the midpoints (Xc).
    - The length of each ordinate represents the class frequency.
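The equal-area property can be checked numerically for the student-marks example. This sketch assumes the polygon is closed on the x-axis one class width beyond the first and last class marks (a common convention that the slide does not spell out) and that the histogram heights equal the class frequencies.

```python
class_marks = [14.5, 20.5, 26.5, 32.5, 38.5, 44.5]
freq = [3, 4, 4, 8, 6, 5]
w = 6  # class width

# Polygon vertices: (class mark, frequency), closed on the x-axis at both ends
points = ([(class_marks[0] - w, 0)]
          + list(zip(class_marks, freq))
          + [(class_marks[-1] + w, 0)])

# Area under the polygon, summing one trapezium per line segment
polygon_area = sum((x2 - x1) * (y1 + y2) / 2
                   for (x1, y1), (x2, y2) in zip(points, points[1:]))

# Area of the histogram: each rectangle is class width times frequency
histogram_area = sum(w * f for f in freq)

print(points)
print(polygon_area, histogram_area)
```

With equal class widths every frequency is counted once in total across its two neighbouring trapezia, which is why the two areas agree.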
  • 56. Figure 5: Frequency polygon showing marks of 30 students, AAU, 2010 (hypothetical data)
  • 57. Cumulative frequency polygon (Ogive)
    - A line graph obtained by plotting the cumulative frequency (y-axis) against the class boundaries (x-axis).
    - Two types:
       - Less-than ogive: cumulative frequency less than the UCB (Lcf), or
       - More-than ogive: cumulative frequency more than the LCB (Mcf)
    - We can also use the intersection of the two.
  • 58. Construct an ogive using the table from the above example

       Classes | Class boundaries | Freq | Less than cf | More than cf
       --------|------------------|------|--------------|-------------
       12-17   | 11.5-17.5        | 3    | 3            | 30
       18-23   | 17.5-23.5        | 4    | 7            | 27
       24-29   | 23.5-29.5        | 4    | 11           | 23
       30-35   | 29.5-35.5        | 8    | 19           | 19
       36-41   | 35.5-41.5        | 6    | 25           | 11
       42-47   | 41.5-47.5        | 5    | 30           | 5
       Total   |                  | 30   |              |
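The points to plot for the two ogives can be listed from the class boundaries and frequencies in the table. In this sketch the less-than curve is started at 0 on the first LCB and the more-than curve is ended at 0 on the last UCB; those two end points are a common plotting convention and an assumption here, not something stated in the table.

```python
from itertools import accumulate

boundaries = [(11.5, 17.5), (17.5, 23.5), (23.5, 29.5),
              (29.5, 35.5), (35.5, 41.5), (41.5, 47.5)]
freq = [3, 4, 4, 8, 6, 5]
n = sum(freq)
cum = list(accumulate(freq))   # less-than cumulative frequencies

# Less-than ogive: cumulative frequency plotted against each UCB
less_than = [(boundaries[0][0], 0)] + [
    (ucb, cf) for (_, ucb), cf in zip(boundaries, cum)
]

# More-than ogive: observations remaining at or above each LCB
before = [0] + cum[:-1]        # observations below each LCB
more_than = [
    (lcb, n - b) for (lcb, _), b in zip(boundaries, before)
] + [(boundaries[-1][1], 0)]

print(less_than)
print(more_than)
```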
  • 59. Less than Ogive Figure 6: Less than Ogive showing mark of 30 students, AAU, 2010, (Hypothetical data)
  • 60. More than Ogive Figure 7: More than Ogive showing mark of 30 students, AAU, 2010, (Hypothetical data)
  • 61. Scatter plot
    - Used to show the relationship between two variables.
    - A symbol, usually a dot, is used to show each data pair.
    - X-axis: independent variable (factor, cause)
    - Y-axis: dependent variable (outcome)
  • 63. Summary

       Data type               | Graph type        | Description                                   | Example
       ------------------------|-------------------|-----------------------------------------------|----------------------------------
       Qualitative             | Bar chart         | Frequency of categories                       | Types of fruits sold
       Qualitative             | Pie chart         | Proportions of categories                     | Market share of smartphone brands
       Qualitative             | Stacked bar chart | Composition of categories                     | Students by major and year
       Quantitative discrete   | Bar chart         | Counts of discrete numeric values             | Number of students in grades
       Quantitative discrete   | Histogram         | Distribution of discrete values               | Scores in a test
       Quantitative discrete   | Dot plot          | Individual occurrences of values              | Books read by each student
       Quantitative continuous | Histogram         | Distribution of continuous data               | Heights of students
       Quantitative continuous | Line graph        | Trends over time                              | Temperature changes over a week
       Quantitative continuous | Box plot          | Summary of data distribution                  | Test scores across classes
       Quantitative continuous | Scatter plot      | Relationship between two continuous variables | Hours studied vs. exam scores
  • 64. Quiz
    What is the most appropriate graphical method to display each of the following data?
    1. The distribution of pain level status
    2. The height measurements of 20 individuals
    3. The age of pregnant women attending ANC
    4. Treatment outcome among RTA patients