Biostatistics Lecture - 1 - Introduction
Biostatistics
Introduction
2019 - 2020
Biostatistics
(a portmanteau word made from biology and statistics)
The application of statistics to a wide range of topics in
biology.
Collection of data.
Presentation of the collected data.
Analysis and interpretation of the results.
Making decisions on the basis of such analysis.
Other Definitions for “Statistics”
Comprehensive Sample
Records:
Examples:
- Hospital medical records contain immense amounts of
information on patients.
- Hospital accounting records contain a wealth of data on the
facility’s business activities.
- The data needed to answer a question may already exist in the
form of published reports, commercially available data banks, or
the research literature, i.e. someone else has already asked the
same question.
Surveys:
Constant Variables
Types of Variables
Quantitative: continuous, discrete
Qualitative: nominal, ordinal
Quantitative Variables:
Characteristics that can be measured.
For example:
- the heights of adult males,
- the weights of preschool children,
- the ages of patients seen in a dental clinic.
Qualitative Variables
Many characteristics are not capable of being measured. Some
of them can be ordered or ranked.
For example:
- classification of people into socio-economic groups,
- social classes based on income, education, etc.
A discrete variable:
Is characterized by gaps or interruptions in the values that it can
assume.
For example:
- The number of daily admissions to a general hospital.
- The number of decayed, missing or filled teeth per child in an
elementary school.
A continuous variable:
can assume any value within a specified relevant interval of
values assumed by the variable.
For example:
Height, weight, skull circumference.
No matter how close together the observed heights of two people, we
can find another person whose height falls somewhere in between.
Nominal data:
It assigns names to each data point without placing it in some
sort of order.
For example:
The results of a test could each be classified nominally as
"pass" or "fail".
Ordinal data:
It groups data according to some sort of ranking system: it
orders the data.
For example:
Test results could be grouped in descending order by grade: A,
B, C, D, E and F.
A population:
It is the largest collection of values of a random variable for which
we have an interest at a particular time.
For example:
The weights of all the children enrolled in a certain elementary
school.
Populations may be finite or infinite.
A sample:
It is a part of a population.
For example:
The weights of only a fraction of these children.
Methods of Presentation of Data
Numerical presentation
Graphical presentation
Mathematical presentation
1- Numerical presentation
Tabular presentation (Simple – Complex)
Simple frequency distribution Table (S.F.D.T.)
Title
Name of variable (units of variable)    Frequency    %
- Categories
Total
Table (I): Distribution of 50 patients at the surgical
department of (XYZ) hospital in April 2018 according
to their ABO blood groups
Age (years)    Frequency      %
20 - <30          19         38
30 - <40          11         22
40 - <50           3          6
50+               17         34
Total             50        100
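The percentage column of a simple frequency distribution table is each frequency divided by the total, times 100. A minimal sketch in Python, using the frequencies from the age table above (the variable names are illustrative, not part of the lecture):

```python
# Frequencies copied from the age table above; the % column is recomputed.
age_groups = ["20 - <30", "30 - <40", "40 - <50", "50+"]
frequencies = [19, 11, 3, 17]

total = sum(frequencies)
# Percentage of each category = 100 * frequency / total
percentages = [100 * f / total for f in frequencies]

print(f"{'Age (years)':<12}{'Frequency':>10}{'%':>6}")
for group, f, p in zip(age_groups, frequencies, percentages):
    print(f"{group:<12}{f:>10}{p:>6.0f}")
print(f"{'Total':<12}{total:>10}{sum(percentages):>6.0f}")
```

The percentages should always add up to 100 (allowing for rounding), which is a quick check on the table.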
Complex frequency distribution Table
Table (III): Distribution of 20 lung cancer patients at the chest
department of (XYZ) hospital and 40 controls in May 2018 according to
smoking
                  Lung cancer
Smoking        Cases           Control         Total
               No.    %        No.    %        No.    %
Smoker         15     75       8      20       23     38.33
Non-smoker     5      25       32     80       37     61.67
Total          20     100      40     100      60     100
Complex frequency distribution Table
Table (IV): Distribution of 60 patients at the chest department of
(XYZ) hospital in May 2018 according to smoking & lung cancer
                  Lung cancer
Smoking        Positive        Negative        Total
               No.    %        No.    %        No.    %
Smoker         15     65.2     8      34.8     23     100
Non-smoker     5      13.5     32     86.5     37     100
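Tables (III) and (IV) hold the same counts but take percentages in different directions: Table (III) uses column totals (cases, controls), while Table (IV) uses row totals (smokers, non-smokers). A minimal sketch in Python of both calculations; the function and key names are illustrative assumptions, not part of the lecture:

```python
# Counts shared by Tables (III) and (IV): rows are smoking status,
# columns are lung cancer status (cases/positive, controls/negative).
counts = {
    "Smoker": {"cases": 15, "controls": 8},
    "Non-smoker": {"cases": 5, "controls": 32},
}

def column_percentages(table):
    """Percent of each column total (the layout of Table III)."""
    col_totals = {}
    for row in table.values():
        for col, n in row.items():
            col_totals[col] = col_totals.get(col, 0) + n
    return {
        name: {col: round(100 * n / col_totals[col], 1) for col, n in row.items()}
        for name, row in table.items()
    }

def row_percentages(table):
    """Percent of each row total (the layout of Table IV)."""
    return {
        name: {col: round(100 * n / sum(row.values()), 1) for col, n in row.items()}
        for name, row in table.items()
    }

col_pct = column_percentages(counts)
row_pct = row_percentages(counts)
print(col_pct)
print(row_pct)
```

Column percentages answer "what share of cases smoke?", while row percentages answer "what share of smokers have lung cancer?", so the choice depends on the question being asked.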
2- Graphical presentation
• Line graph
• Frequency polygon
• Frequency curve
• Histogram
• Bar graph
• Scatter plot
• Pie chart
• Statistical maps
Line Graph
[Figure: line graph; horizontal axis: Age]
[Figure: frequency by age in years (20- to 60-69), with separate series for Female and Male]
Histogram
Distribution of a group of cholera patients by age
[Figure: histogram; horizontal axis: Age (years)]
[Figure: chart by Marital status (Single, Married, Divorced, Widowed)]
Pie chart
[Figure: pie chart with sectors Deletion, Inversion and Translocation; sector shares 3%, 18% and 79%]
Doughnut chart
[Figure: doughnut chart for Hospital A and Hospital B; categories: DM, IHD, Renal]
3- Mathematical presentation
Summary statistics
Measures of location
1- Measures of central tendency
2- Measures of non-central location
(Quartiles, Percentiles)
Measures of dispersion
Summary statistics
1- Measures of central tendency (averages)
Midrange
(Smallest observation + Largest observation) / 2
Mode
the value which occurs with the greatest
frequency i.e. the most common value
Summary statistics
1- Measures of central tendency (cont.)
Median
The observation that lies in the middle of
the ordered observations.
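The three measures of central tendency defined in this lecture (midrange, mode, median) can be computed with Python's standard library. This is a minimal sketch; the ages are hypothetical values chosen for illustration and are not data from the lecture:

```python
import statistics

# Hypothetical ages (years); illustrative values, not data from the lecture.
ages = [23, 25, 25, 31, 38, 42, 57]

midrange = (min(ages) + max(ages)) / 2   # (smallest + largest) / 2
mode = statistics.mode(ages)             # value with the greatest frequency
median = statistics.median(ages)         # middle value of the ordered data

print(midrange, mode, median)            # prints: 40.0 25 31
```

With an even number of observations there is no single middle value; `statistics.median` then returns the mean of the two middle observations.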
Range
Variance
Standard deviation
Semi-interquartile range
Coefficient of variation
“Standard error”
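The lecture lists these measures without formulas, so the following Python sketch rests on several assumptions: the data are hypothetical, the variance is the sample variance (dividing by n - 1), the quartiles use the default method of `statistics.quantiles` (Python 3.8+), and "standard error" is read as the standard error of the mean:

```python
import math
import statistics

# Hypothetical ages (years); illustrative values, not data from the lecture.
ages = [23, 25, 25, 31, 38, 42, 57]

data_range = max(ages) - min(ages)            # largest minus smallest
variance = statistics.variance(ages)          # sample variance (divides by n - 1)
std_dev = statistics.stdev(ages)              # square root of the sample variance

# Quartiles with the library's default ("exclusive") method; other
# textbooks may use slightly different quartile conventions.
q1, q2, q3 = statistics.quantiles(ages, n=4)
semi_iqr = (q3 - q1) / 2                      # semi-interquartile range

coef_var = 100 * std_dev / statistics.mean(ages)   # SD as a % of the mean
std_error = std_dev / math.sqrt(len(ages))         # standard error of the mean

print(data_range, variance, std_dev, semi_iqr, coef_var, std_error)
```

The coefficient of variation is unit-free, which makes it useful for comparing the spread of variables measured on different scales.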
Dr. Mahmoud Al-Naimi