BIOSTATISTICS
IV
DR. RAIMA ASIF
Let's recap
Read the following on attendance and grades, and answer the questions.
A study conducted at a university revealed that students who attended class 95 to 100% of
the time usually received an A in the class. Students who attended class 80 to 90% of the
time usually received a B or C in the class. Students who attended class less than 80% of
the time usually received a D or an F or eventually withdrew from the class.
Based on this information, attendance and grades are related. The more you attend class,
the more likely you will receive a higher grade. If you improve your attendance, your
grades will probably improve. Many factors affect your grade in a course. One factor that
you have considerable control over is attendance. You can increase your opportunities for
learning by attending class more often.
1. What are the variables under study?
2. What are the data in the study?
3. Are descriptive, inferential, or both types of statistics used?
4. What is the population under study?
5. From the information given, comment on the relationship between the variables.
Learning outcomes
Types of distributions
Characteristics of Normal distribution curve
Concept of standardized normal distribution
Concept of skewness and kurtosis
Types of distributions
• Bernoulli Distribution
• Binomial Distribution
• Negative Binomial Distribution
• Poisson Distribution
• Geometric Distribution
• Multinomial Distribution
• Normal Distribution
• Log Normal Distribution
• Gamma Distribution
• Chi Square Distribution
• F Distribution
• t Distribution
• Weibull Distribution
• Extreme Value Distribution (Type I and II)
Types of Distributions
Data can be "distributed" (spread out) in different
ways
Introduction to Normal Distribution
Introduction to Normal Distributions
The normal distribution is the most important and most widely used distribution
in statistics.
It is also called the "bell curve" or the "Gaussian curve", after the
mathematician Carl Friedrich Gauss.
Many natural phenomena are at least approximately normally distributed.
Characteristics of Normal Distribution curve
1. A normal distribution curve is smooth and bell-shaped.
2. It is a theoretical probability distribution representing an
infinite number of observations.
3. The mean, median, and mode are equal and are
located at the center of the distribution.
4. A normal distribution curve is unimodal (i.e., it has only
one mode).
5. The curve is symmetric about the mean.
Characteristics/Properties…
6. The curve never touches the x-axis; it is asymptotic to the
x-axis.
7. The shape of the curve depends upon the mean and standard
deviation.
8. The area within one standard deviation on either
side of the mean includes about 68% of the values in the distribution,
within 2 standard deviations about 95%, and within 3 standard
deviations about 99.7%.
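The 68%, 95% and 99.7% areas quoted above can be checked numerically. The following is a minimal sketch using Python's standard library; the helper name area_within is an assumption made for illustration and does not come from the lecture.

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, standard deviation 1.
std_normal = NormalDist(mu=0.0, sigma=1.0)

def area_within(k: float) -> float:
    """Area under the normal curve between -k and +k standard deviations."""
    return std_normal.cdf(k) - std_normal.cdf(-k)

for k in (1, 2, 3):
    # Prints roughly 0.6827, 0.9545 and 0.9973
    print(f"within {k} SD of the mean: {area_within(k):.4f}")
```

The same areas hold for any mean and standard deviation, because only the distance from the mean in units of standard deviation matters.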
Empirical rule…(normal rule)
Empirical rule..
Scores on a national achievement exam have a mean of 480
and a standard deviation of 90. If these scores are normally
distributed, then
Approximately 68% of the scores will fall between _____ and _____
Approximately 95% of the scores will fall between _____ and _____
Approximately 99.7% of the scores will fall between _____ and _____
K Park 25th Ed Ilyas Ansari 8th Ed
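One way to work out the intervals in the exercise above is to take the mean plus or minus 1, 2 and 3 standard deviations. The sketch below does this arithmetic; the function name empirical_intervals is hypothetical and used only for illustration.

```python
def empirical_intervals(mean: float, sd: float) -> dict:
    """Return the mean ± 1, 2 and 3 SD intervals, keyed by approximate coverage."""
    coverage = {1: "68%", 2: "95%", 3: "99.7%"}
    return {label: (mean - k * sd, mean + k * sd) for k, label in coverage.items()}

# Exam scores from the exercise: mean 480, standard deviation 90.
for label, (low, high) in empirical_intervals(480, 90).items():
    print(f"about {label} of scores fall between {low} and {high}")
# about 68% of scores fall between 390 and 570
# about 95% of scores fall between 300 and 660
# about 99.7% of scores fall between 210 and 750
```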
Empirical rule…knowledge check
Scores on a national achievement exam taken by 5000
students have a mean of 480 and a standard deviation of
90. If these scores are normally distributed, then
a. How many students will have marks above mean?
b. What % of students will have marks below mean?
c. How many students will have marks between 390 and
570?
d. What % of students will have marks between 480 and
570?
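The four questions above can be checked against the normal curve directly. This is a sketch under the stated assumption that the scores are exactly normal; the variable names are chosen for illustration. It uses the exact area (about 68.27%) rather than the rounded 68% of the empirical rule, so the count for part c comes out slightly above the 3400 that the rule of thumb gives.

```python
from statistics import NormalDist

scores = NormalDist(mu=480, sigma=90)   # exam scores from the knowledge check
n_students = 5000

above_mean = 1 - scores.cdf(480)                       # a. proportion above the mean
below_mean = scores.cdf(480)                           # b. proportion below the mean
between_390_570 = scores.cdf(570) - scores.cdf(390)    # c. within 1 SD of the mean
between_480_570 = scores.cdf(570) - scores.cdf(480)    # d. mean to 1 SD above

print(f"a. students above the mean: {round(n_students * above_mean)}")
print(f"b. percent below the mean: {below_mean:.0%}")
print(f"c. students between 390 and 570: {round(n_students * between_390_570)}")
print(f"d. percent between 480 and 570: {between_480_570:.1%}")
```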
Summary
Two samples means and S
Standard normal curve
All normal curves can be standardized to
one standard curve
Smooth, bell-shaped, perfectly
symmetrical curve
Mean = 0, standard deviation = 1
Total area under the curve is 1
Standard normal curve…
A normal distribution with mean (μ)=0 and standard
deviation (σ) = 1 is called standard normal
distribution.
• It is also known as the S.N(0,1) distribution.
Standard normal curve…
It may be obtained from the normal density function by
creating a random variable
Z = (X – μ)/σ.
A Z-table (normal table) can be used to calculate the
following area under the curve
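The transformation Z = (X – μ)/σ can be illustrated on a small sample. The sketch below is only an illustration: the pulse values are hypothetical, and the helper z_scores is a name chosen here, not something from the lecture. After standardizing, the values have mean 0 and standard deviation 1.

```python
from statistics import mean, pstdev

def z_scores(values):
    """Standardize values: subtract the mean, divide by the standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)  # population standard deviation
    return [(x - mu) / sigma for x in values]

pulses = [68, 70, 72, 74, 76]   # hypothetical pulse readings
z = z_scores(pulses)

print([round(v, 3) for v in z])
print("mean of z:", round(mean(z), 6))
print("SD of z:", round(pstdev(z), 6))
```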
Suppose the mean pulse of a group of normal healthy males was 72, with a
standard deviation of 2. What is the probability that a male chosen at
random would be found to have a pulse of 80 or more?
The relative deviate is z = (X – μ)/σ = (80 – 72)/2 = 4.
The area of the normal curve between the mean and a deviate of 4 is 0.49997,
so the probability of a pulse of 80 or more is 0.5 – 0.49997 = 0.00003.
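The pulse example can be reproduced without a printed Z-table. The following sketch uses Python's standard library in place of the table lookup; the variable names are assumptions made for illustration.

```python
from statistics import NormalDist

mu, sigma, x = 72, 2, 80        # mean pulse, standard deviation, observed pulse

z = (x - mu) / sigma            # relative deviate
std_normal = NormalDist()       # standard normal: mean 0, SD 1

area_mean_to_z = std_normal.cdf(z) - 0.5   # area between the mean and z
p_80_or_more = 1 - std_normal.cdf(z)       # area in the upper tail beyond z

print(f"z = {z}")
print(f"area between mean and z = {area_mean_to_z:.5f}")
print(f"P(pulse >= 80) = {p_80_or_more:.5f}")
```

The tail area is what answers the question: a pulse of 80 or more is expected in roughly 3 out of 100,000 such males.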
Importance of standardized scoring
Because most measurements follow a theoretical
distribution, standardized scores are useful for comparison and for
identifying the position of an individual in the distribution of
that characteristic.
Example: anthropometric measures
Skewed Data
Values not evenly distributed around mean
Lacks symmetry
Skewness….tells about symmetry
Positively skewed
Extreme values to the right,
peak on left,
tail towards right,
mean greater than median
Negatively skewed
Extreme values to the left,
peak on right,
tail towards left,
mean less than median
Types of Skewed Distribution
The mode is the most frequently occurring score, the highest point
on the curve.
The median lies between the mode and the mean.
The mean is pulled towards the right in positively skewed distributions
and towards the left in negatively skewed distributions.
The tail of the curve shows the direction of skewness.
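The ordering of mean, median and mode described above can be seen on a small data set. The sketch below uses made-up values (the incomes list is hypothetical, not data from the lecture): a few large values pull the mean to the right, and mirroring the data reverses the ordering.

```python
from statistics import mean, median, mode

# Hypothetical right-skewed sample: most values are low, one is very high.
incomes = [20, 20, 20, 25, 25, 30, 35, 40, 60, 125]

print("right-skewed:", mean(incomes), median(incomes), mode(incomes))
# right-skewed: 40 27.5 20   (mean > median > mode)

# Mirroring the values produces a left-skewed sample with the tail on the left.
mirrored = [145 - v for v in incomes]

print("left-skewed:", mean(mirrored), median(mirrored), mode(mirrored))
# left-skewed: 105 117.5 125   (mean < median < mode)
```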
Knowledge check
a. Identify the type of skewed distribution
b. Label A, B and C
c. Any example of such a distribution
[Figure: skewed curve with three points labelled A, B and C]
Mean > median > mode
Knowledge check!
Variables such as income happen to be extremely high
for a small proportion of people. In such cases, we
describe distribution of data as
Positively skewed
Negatively skewed
Normally distributed
Binomial distribution
Knowledge check!
What is the most likely relationship between mean,
median and mode of the distribution shown in the figure
below?
• Mean < median < mode
• Mean = median = mode
• Mean > median > mode
• Mode > mean > median
How can you measure skewness?
SD with respect to mean
Histogram
Software uses “tests of normality”
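Besides the approaches listed above, skewness can be summarized as a single number. The sketch below uses the moment coefficient of skewness (the average cubed z-score), which is one common choice and an assumption here, since the slides do not name a formula. A value near 0 suggests symmetry, a positive value a tail to the right, and a negative value a tail to the left. The sample values are hypothetical.

```python
from statistics import mean, pstdev

def skewness(values):
    """Moment coefficient of skewness: the average of the cubed z-scores."""
    mu = mean(values)
    sigma = pstdev(values)
    return sum(((x - mu) / sigma) ** 3 for x in values) / len(values)

symmetric = [1, 2, 3, 4, 5]
right_skewed = [20, 20, 20, 25, 25, 30, 35, 40, 60, 125]   # hypothetical values
left_skewed = [145 - v for v in right_skewed]              # mirror image

print("symmetric:", round(skewness(symmetric), 3))
print("right-skewed:", round(skewness(right_skewed), 3))
print("left-skewed:", round(skewness(left_skewed), 3))
```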
Knowledge check!
A researcher found his data to be skewed on analysis.
Which measures of central tendency and variation
should he use in his data analysis?
Knowledge check!
In cricket, some players score below the
average, some get out for zero, some score
very few runs, and only one or two players
make the highest scores.
What is the type of distribution?
Kurtosis…talks about spread
Degree to which distribution/values are clustered
around a small portion of scale or spread out across
entire distribution.
The three types of kurtosis:
Mesokurtosis, Leptokurtosis, Platykurtosis
Types of kurtosis
Leptokurtic
Piling up of values in the center
The coefficient of kurtosis is more than 3.
Platykurtic
Values are spread throughout distribution
The coefficient of kurtosis is usually less than 3.
Mesokurtic
Same as the normal distribution.
The coefficient of kurtosis is equal to 3
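The coefficient of kurtosis referred to above can be computed as the average fourth power of the z-scores. This definition is an assumption here, chosen because it gives 3 for a normal distribution, which matches the cut-off used in the slides. The sketch below applies it to two hypothetical samples, one spread evenly and one piled up in the center.

```python
from statistics import mean, pstdev

def kurtosis(values):
    """Coefficient of kurtosis: the average of the z-scores raised to the 4th power."""
    mu = mean(values)
    sigma = pstdev(values)
    return sum(((x - mu) / sigma) ** 4 for x in values) / len(values)

flat = list(range(1, 11))                     # values spread evenly from 1 to 10
peaked = [0, 5, 5, 5, 5, 5, 5, 5, 5, 10]      # values piled up in the center

print("flat sample:", round(kurtosis(flat), 2))      # below 3, platykurtic
print("peaked sample:", round(kurtosis(peaked), 2))  # above 3, leptokurtic
```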
Types of kurtosis
Comment on this picture …
3 curves …
Normally distributed but varied spread
Same measures of central tendency,
different standard deviation
'SYMMETRY' & 'SPREAD'
A distribution may be deviated from “normality” in two
different contexts:
Deviation in Symmetry
Deviation in Spread
Symmetry pertains to:
Mean = Median = Mode
Spread pertains to standard deviation:
µ ± 1 σ ≈ 68%
µ ± 2 σ ≈ 95%
µ ± 3 σ ≈ 99.7%
KURTOSIS
This is the deviation concerning spread
• Meso-Kurtosis
• Lepto-Kurtosis: (+ve Kurtosis)
• Platy-Kurtosis: (-ve Kurtosis)
SKEWNESS
This is the deviation concerning symmetry
Normal Distribution :
Mode = Mean = Median
Positive / Right Skewed :
Mean > Median
Negative / Left Skewed :
Mean < Median
SKEWNESS
RIGHT (+VE) SKEWNESS
Mean > Median
[Figure: right-skewed curve with the mode, median and mean marked]
LEFT (-VE) SKEWNESS
Mean < Median
[Figure: left-skewed curve with the mode, median and mean marked]
Summary
The use of the normal distribution in statistical inference is
important because
It is a good empirical distribution of many biological
variables
It involves healthy people
It occupies a central role in statistical analysis
It forms the basis of the methodology in statistics
Thank you