Epidemiological study design
Hiwot M. (MPH)
1
Learning objectives
At the end of this session, you will be able to:
1. Describe all types of epidemiological study designs
2. Characterize the difference between descriptive and analytic study
designs
3. Explain the uses of the various descriptive study designs
2
In broad terms the purpose of Epidemiology is to answer questions like:
How big is the problem (magnitude)?
• Prevalence, incidence, mortality
What, who and where of any health problem?
• Person characteristic of affected population
• Place characteristics (locality)
What factors are associated with a given disease?
• Specific factors related to causation
To evaluate interventions
• Which drug is best for patients with X disease
• To evaluate any program
There are many approaches to addressing these issues; they are called
epidemiological study designs.
3
Definition
※ Study design is a detailed plan to enroll subjects, collect data, perform
analysis and interpret findings.
※ The purpose of study design is to transform the conceptual hypothesis
into an operational hypothesis that can be empirically tested.
4
Categories of Epidemiological studies
According to their focus of investigation:
Descriptive Epidemiology - Defines the amount and distribution of
health problems in relation to person, place and time. It answers the
questions who, where and when.
Analytic Epidemiology – involves explicit comparison of groups of
individuals to identify determinants of health and diseases. It answers
the questions why and how.
5
6
Overview of epidemiologic design strategies
• Descriptive studies
   – Populations (correlational studies)
   – Individuals
        Case report
        Case series
        Cross-sectional studies
• Analytic studies
   – Observational
        Case control
        Cohort (retrospective, prospective)
   – Interventional/Experimental
        Randomized controlled trial
        Field trial
        Clinical trial
Descriptive studies
Defines amount and distribution of health problem
Person: describe disease occurrence by personal characteristics
“Who is getting the disease?” Age, race, sex
Place: provides information on geographic distribution of the disease
“Where are the rates of disease highest/ lowest?”
Time: provides information organized by time.
“When does the disease occur commonly/ rarely?”
8
Purpose of descriptive study design
⸙ To evaluate trends in health and disease and allow comparisons among
countries or subgroups within countries
⸙To provide a basis for planning and provision of services
⸙To identify problems to be studied by analytic methods and generate
hypotheses related to those problems
⸙To describe the general characteristics of the distribution of a disease in relation
to person, place and time
⸙To allocate resources efficiently and plan effective prevention or education
programs
1. Correlational / Ecological study
Uses data from entire populations to compare disease frequencies (i.e.
it doesn’t need data from individuals):
• Between different groups during the same period of time, or
• In the same population at different points in time
The unit of data source and the unit of analysis is the population.
These studies use aggregate data and do not measure outcome and exposure at
the individual level.
10
Correlational…..
E.g. Colon cancer rates are higher in U.S. counties that use mostly surface
water and in countries with high per capita meat consumption.
These relationships suggest that something about surface water, e.g.
chlorination, and something about meat consumption, e.g. saturated fat
intake, might be factors in the development of colon cancer.
However, since exposure is not known at the individual level, it is possible
that the people who developed colon cancer are not themselves the people who
drink chlorinated water or eat meat. Assuming that they are is the
“ecological fallacy” (the attempt to infer individual characteristics or
relationships from group-level measures).
11
Cont’d…
Strengths
⸙Can be done quickly
⸙Inexpensive
⸙Can be done using available data (routine records, reports)
⸙May be the best design to study health effects of environmental exposures
⸙E.g. Does soft drink consumption increase heart disease?
Limitations
⸙Unable to link an exposure to occurrence of disease in a single individual
(prone to ecological fallacy)
⸙Lack of ability to control for effects of potential confounding factors
12
2. Case report and case series
These studies are generally used to report
New disease
An unexpected association between disease and symptoms.
An unexpected event in the course of observing or treating a patient.
Unique or rare features of a disease
Unique therapeutic approaches.
“Something unusual”
13
14
Case report
⸎It is the study of the health profile of a single individual, presented as a
careful and detailed report by one or more clinicians.
⸎A report is usually written when there is an unusual medical
occurrence; thus it may be the first clue to the identification of a new
disease.
⸎It is useful in constructing the natural history of an individual disease.
15
Case series
⸎Individual case report can be expanded to a case series, which
describes characteristics of a number of patients (usually 5-12) with a
similar disease.
⸎Similar to case report, it is usually made on cases having new and/ or
unusual disease (giving interest to clinicians)
⸎It is often used to detect the emergence of a new disease or an
epidemic.
⸎ E.g. The first five AIDS cases in USA.
16
...
Example:
⸙ Five young, previously healthy homosexual men were diagnosed as
having Pneumocystis carinii pneumonia at Los Angeles hospitals during a
six-month period from 1980 to 1981.
⸙ This form of pneumonia had been seen almost exclusively among older
men and women whose immune systems were suppressed.
⸙ This unusual circumstance suggested that these individuals were
actually suffering from a previously unknown disease, which was subsequently
named AIDS.
17
Case report and case series
Strengths
⸙Very useful in hypothesis generation
⸙Useful for studying signs and symptoms (new syndromes) and creating case
definitions for epidemiological studies
Limitations
⸙Prone to atomistic fallacy
⸙Reports are based on a single individual or a few patients, so the findings
could have happened by coincidence
⸙Lack of an appropriate comparison group, so statistical association cannot
be tested
18
3. Cross sectional
⁜It is also called a prevalence study or survey.
⁜Measure disease and exposure status simultaneously among
individuals at the same point in time.
⁜Could provide information about the frequency of a disease by
furnishing a ‘snapshot’ at a specified time.
19
Characteristics of cross sectional
⸙Assess both exposure and outcome simultaneously.
⸙Sample without knowledge of exposure or disease status, then
classify after the results are obtained.
⸙Calculate prevalence, but not incidence.
⸙Association is measured using the odds ratio (OR).
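The prevalence calculation can be sketched in a few lines. This is a minimal illustration only: the survey counts and the function name `prevalence` are hypothetical and are not taken from any real study.

```python
def prevalence(existing_cases, people_surveyed):
    """Point prevalence: existing cases divided by everyone surveyed at that time."""
    if people_surveyed <= 0:
        raise ValueError("people_surveyed must be positive")
    return existing_cases / people_surveyed


# Hypothetical one-day survey (illustrative numbers only).
exposed_surveyed, exposed_cases = 400, 60
unexposed_surveyed, unexposed_cases = 600, 30

overall = prevalence(exposed_cases + unexposed_cases,
                     exposed_surveyed + unexposed_surveyed)
prev_exposed = prevalence(exposed_cases, exposed_surveyed)
prev_unexposed = prevalence(unexposed_cases, unexposed_surveyed)

print(f"Overall prevalence:        {overall:.1%}")
print(f"Prevalence if exposed:     {prev_exposed:.1%}")
print(f"Prevalence if not exposed: {prev_unexposed:.1%}")
```

Because exposure and disease are recorded at the same moment, these figures describe who has the disease now; they say nothing about which came first.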
20
21
Odds ratio = cross product ratio
⸙In case control studies, RR cannot be calculated directly to determine the
association between exposure and disease.
⸙The risk of disease among the exposed and non-exposed is not known,
since we start by recruiting cases and controls.
⸙We use the odds ratio to measure the association between exposure and disease.
⸙The odds of an event is the ratio of the number of ways the event can
occur to the number of ways the event cannot occur.
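The definition of odds and the cross product ratio can be checked with a small sketch. The 2x2 counts below are hypothetical, and the cell labels a, b, c, d follow a common textbook convention that is assumed here rather than taken from the slides.

```python
def odds(times_event_occurs, times_event_does_not_occur):
    """Odds = ways the event can occur / ways it cannot occur."""
    return times_event_occurs / times_event_does_not_occur


def odds_ratio(a, b, c, d):
    """Cross product ratio for a 2x2 table.

    a = exposed cases,   b = exposed controls,
    c = unexposed cases, d = unexposed controls.
    """
    return (a * d) / (b * c)


# Hypothetical counts (illustrative only).
a, b, c, d = 40, 20, 60, 80

odds_exposure_cases = odds(a, c)        # odds of exposure among cases
odds_exposure_controls = odds(b, d)     # odds of exposure among controls
or_from_odds = odds_exposure_cases / odds_exposure_controls
or_cross_product = odds_ratio(a, b, c, d)

print(f"OR from the two odds:  {or_from_odds:.2f}")
print(f"OR from cross product: {or_cross_product:.2f}")
```

Both routes give the same number, which is why the odds ratio is also called the cross product ratio.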
22
11/26/2025
Strengths of cross-sectional studies
⸎Less expensive
⸎As it involves a large sample size, it is more likely to provide generalizable
findings
⸎Provides much information useful for planning health services and medical
programs
⸎Useful to compare prevalence of disease in different populations
⸎Can examine trends in disease prevalence or severity over time
⸎Yields prevalence
Limitations of cross-sectional studies
Since exposure and disease status are assessed at a single point in time,
the temporal relationship between exposure and disease cannot be clearly
determined.
Chicken-and-egg dilemma (it is difficult to know which occurred
first, the exposure or the outcome)
Potential bias in measuring exposure
Not feasible for rare diseases
Doesn’t yield incidence or true relative risk
24
Analytic Epidemiological study designs
Goal: to determine the relationship between exposure and disease:
– Assess determinants of disease
– Focus on risk factors, causes
Used for
– Testing hypotheses
– Looking for / quantifying associations
The hallmark feature that distinguishes an analytic study from a
descriptive study is the presence of a comparison group.
25
Types of analytical study
1. Observational studies
– Case-control studies
– Cohort studies
2. Experimental studies
– Clinical trials
– Community trials
26
1.1. Case control
Subjects are selected with respect to presence or absence of the
outcome of interest and then inquiries are made about past exposure
to the factors of interest.
Case: those who have the outcome of interest
Controls: those who do not have the outcome of interest
The exposure histories of cases and controls are then obtained and
compared.
27
cont’d…
28
Features of case control
⸎Identify group of cases and group of controls
⸎Question both groups for possible exposure
⸎Measure the frequency of exposure occurrence in both groups
⸎Compare the frequency of exposure between cases and controls
⸎Calculate the odds ratio and interpret the result
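The steps above can be sketched end to end. The records are hypothetical, and the 95% confidence interval uses Woolf's log-based approximation, which is an assumption added here for illustration and is not described in the slides.

```python
import math
from collections import Counter

# Hypothetical study records: (is_case, was_exposed) for each subject.
records = (
    [(True, True)] * 30 + [(True, False)] * 20
    + [(False, True)] * 15 + [(False, False)] * 35
)

# Tabulate the 2x2 table from the individual records.
counts = Counter(records)
a = counts[(True, True)]     # exposed cases
c = counts[(True, False)]    # unexposed cases
b = counts[(False, True)]    # exposed controls
d = counts[(False, False)]   # unexposed controls

# Frequency of exposure in each group.
exposure_in_cases = a / (a + c)
exposure_in_controls = b / (b + d)

odds_ratio = (a * d) / (b * c)

# Approximate 95% confidence interval on the log scale (Woolf's method).
se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
ci_low = math.exp(math.log(odds_ratio) - 1.96 * se_log_or)
ci_high = math.exp(math.log(odds_ratio) + 1.96 * se_log_or)

print(f"Exposed among cases:    {exposure_in_cases:.0%}")
print(f"Exposed among controls: {exposure_in_controls:.0%}")
print(f"Odds ratio: {odds_ratio:.2f} (95% CI {ci_low:.2f} to {ci_high:.2f})")
```

An odds ratio above 1 with an interval that excludes 1 suggests that exposure is more common among cases than among controls; chance, bias and confounding still need to be considered before interpreting it.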
29
cont…
• Generates many insights into the aetiology of chronic diseases such as
lung and cervical cancers, congenital defects, etc.
• Important for the study of rare diseases
• Suitable for diseases with a long latent period or duration
• Often uses convenience sampling
• Short study period
• Less expensive than cohort studies
30
Selection of cases
Ideally, cases are a random sample of all cases of interest in the source
population
Sources for cases can be:
hospital or clinic patient rosters
death certificates
special surveys
reporting systems
Care is needed that bias does not arise from the way in which cases are
selected (i.e. selection bias)
31
Cont…
Selection of controls
⸙Controls should have the same exposure distribution as the source
population from which cases are drawn
⸙The controls should be similar to the cases in all respects other
than having the disease in question or
⸙Should be representative of all persons without the disease in the
population from which the cases are selected
32
Sources of cases and controls
CASES                                              CONTROLS
All cases diagnosed in the population              Sample of general population
All cases diagnosed in a sample of the population  Non-cases in a sample of the population
All cases diagnosed in all hospitals               Sample of patients in all hospitals who do not have the disease
All cases diagnosed in a single hospital           Sample of patients in the same hospital who do not have the disease
Any of the above methods                           Spouses, siblings or associates of cases
33
Cont’d…
Epidemiologists use several sources for identifying controls, such as:
Individuals from the general population
Individuals attending a hospital or clinic
Friends or relatives identified by the cases
Dead controls
34
Analysis in case control
35
Advantage of case control
Optimal for the evaluation of rare disease
Suitable for diseases with long induction period
Can examine multiple etiologic/risk factors for a single disease
Quick and inexpensive as compared with cohort study
Needs a relatively small sample size
Few ethical problems, because no exposure is assigned by the investigator
36
Disadvantage of case control
It is restricted to a single outcome
Usually cannot measure disease risk
Prone to information bias, e.g. recall bias, non-response bias
Prone to selection bias, e.g.
• survivor bias
• using different criteria to select cases and controls
• differing probability of selecting true cases and true controls
Misclassification of exposures and outcomes
Selection of controls is difficult
Inefficient for rare exposures
37
1.2. Cohort study
Cohort:
The term cohort has military, not medical, roots.
The term “cohort” derived from the Latin word “cohors”.
A cohort was a 300–600-man unit in an ancient Roman army
A group of individuals who have the same ideas or beliefs or who are pursuing
the same activity together
Figure: An early cohort in search of favorable outcomes
38
Cohort study
⸙Involves following two or more groups for a specific period of time to
observe the occurrence of the outcome
⸙Measures and compares the incidence of the outcome between exposed and
unexposed groups
⸙Asks: who will develop the outcome, when and why?
⸙Looks for an uncertain outcome
⸙Starts by identifying a group of exposed subjects and a group of
unexposed subjects
39
Cohort study cont…
[Figure: exposed and unexposed groups followed over the follow-up period.
Incidence among exposed = 7/15; incidence among unexposed = 3/15.]
40
Purpose of cohort study
Cohort studies have two primary purposes:
• Descriptive (measure of frequency)
   – To describe the incidence rates of an outcome over time, or simply
   describe the natural history of disease
• Analytic (measure of association)
   – To analyze associations between the rates of the outcomes and risk
   factors
41
Types of cohort study
Depending on the temporal relationship between the initiation of the
study and the occurrence of the disease, there are two types of cohort
study:
1. Prospective (classical) cohort study: at the beginning of the study
the outcome has not yet occurred
2. Retrospective (historical) cohort study: both exposure and
outcome have already occurred at the beginning of the study
42
Prospective cohort study…
43
Retrospective cohort study…
44
Closed and open cohort
There are two types of cohorts that epidemiologists follow:
1. Closed cohorts
A closed cohort is one with a fixed membership
Once it is defined and follow-up begins, no one can be added to a closed
cohort
2. Open cohorts
An open cohort, which is also referred to as a dynamic cohort, can take on
new members as time passes.
45
Incidence cohort vs. prognostic (clinical) cohort
Incidence cohort study
• To assess incidence of disease
• To identify risk factors for disease onset
• Is incidence greater in the exposed than in the non-exposed?
Prognostic cohort study
• Follow a diseased cohort to assess factors associated with outcome
(recovery or death)
• Goal is to identify explanatory/prognostic factors, i.e. factors that
influence the course of the disease once it has developed
46
Analysis of cohort studies
Compare groups to check similarity at baseline
Calculate incidence of the outcome for exposed and non-exposed
group
Incidence rates are usually calculated by dividing the number of
new events in the follow up period by the appropriate denominator,
based on the size of the population at risk.
Estimate risk (relative risk)
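As a worked sketch of these steps, the code below uses the counts from the earlier cohort figure (7 of 15 exposed and 3 of 15 unexposed developed the outcome). The function names are hypothetical, and cumulative incidence is used as the measure of incidence, which assumes everyone is followed for the whole period.

```python
def cumulative_incidence(new_cases, people_at_risk):
    """New cases during follow-up divided by people at risk at the start."""
    return new_cases / people_at_risk


def relative_risk(exposed_cases, exposed_total, unexposed_cases, unexposed_total):
    """Incidence among the exposed divided by incidence among the unexposed."""
    return (cumulative_incidence(exposed_cases, exposed_total)
            / cumulative_incidence(unexposed_cases, unexposed_total))


incidence_exposed = cumulative_incidence(7, 15)
incidence_unexposed = cumulative_incidence(3, 15)
rr = relative_risk(7, 15, 3, 15)

print(f"Incidence among exposed:   {incidence_exposed:.2f}")
print(f"Incidence among unexposed: {incidence_unexposed:.2f}")
print(f"Relative risk:             {rr:.2f}")
```

A relative risk above 1 means the outcome was more frequent in the exposed group; a value of 1 would indicate no association.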
47
Incidence rate in cohort study
48
49
Advantage of cohort study
⸙Helps to understand the natural history of a disease
⸙Provides a clear temporal relationship between exposure and outcome
⸙Since information is collected longitudinally it is less prone to bias
⸙Optimal for the study of rare exposures
⸙Can examine multiple effects of a single exposure
⸙Minimizes bias in ascertainment of exposure and outcome
50
Disadvantage of cohort study
⸙Inefficient in the evaluation of rare disease
⸙Can’t study the effect of multiple exposures at a time
⸙Inefficient in the evaluation of disease with long latent period (for the
prospective type)
⸙Expensive and time consuming (for the prospective type)
⸙Loss to follow-up (for the prospective type)
⸙Exposure status may change during follow-up (the exposure might not be a ‘fixed exposure’)
51
Comparison of case control and cohort
Case control Cohort
52
2. Experimental/interventional study
⸎ An experimental study (trial) investigates the role of some
agent/intervention in the prevention or treatment of a disease.
⸎ The investigator assigns individuals to two or more groups that either
receive or don’t receive the agent/intervention.
⸎ The group that is allocated the agent under study is called the
treatment/intervention group, and the group that is not allocated the
agent under study is called the comparison group.
53
Cont’d…
The quality of "gold standard" in intervention studies can be achieved
through :
• Randomization
• Use of placebo
• Double Blinding
Randomization: random allocation of participants to the intervention and
control groups, e.g. by lottery or a random number table.
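Random allocation can be sketched with a seeded shuffle standing in for the lottery. This is a minimal illustration of simple randomization into two equal arms; the participant IDs, the function name `randomize` and the seed are all hypothetical.

```python
import random


def randomize(participant_ids, seed=None):
    """Shuffle participants and split them into two arms of (near) equal size."""
    rng = random.Random(seed)          # seeded so the allocation can be reproduced
    shuffled = list(participant_ids)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return {
        "intervention": sorted(shuffled[:half]),
        "control": sorted(shuffled[half:]),
    }


# Hypothetical participant IDs.
ids = [f"P{i:02d}" for i in range(1, 21)]
groups = randomize(ids, seed=42)

print("Intervention:", groups["intervention"])
print("Control:     ", groups["control"])
```

The point of the design is that neither the investigator nor the participant can influence or predict who ends up in which group.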
54
Cont’d…
Blinding - is when the observers and/or subjects are kept ignorant as
to the group to which the subjects are assigned
⸙Open/unblinded = everyone knows which intervention a patient is receiving.
⸙Single blind = subjects are ignorant
⸙Double blind = health care giver and subject are ignorant
⸙Triple blind = health care giver, subject and analyst are ignorant
55
Cont’d…
Placebo - an inert agent indistinguishable from the active treatment.
※ Use of placebo minimizes bias in the ascertainment of both subjective
disease outcomes and side effects.
※ Receiving any treatment can have a psychological benefit (the placebo
effect), which can be powerful
※ When to use placebo as comparison:
   • If there is no standard treatment for the condition being studied
56
Classification of experimental study
Based on population
Clinical trial - usually performed in clinical setting and the subjects
are patients.
Field trial- used in testing medicine for preventive purpose and the
subjects are healthy people.
Community trial - unit of the study is group of people/community.
E.g. fluoridation of water to prevent dental caries.
57
Based on design:
1. Uncontrolled trial: no control group; the comparison is with past experience
(history)
2. Non-randomized controlled trial: There is a control group but allocation into
either group is not random
3. Randomized controlled trial: There is control group and allocation into
either group is randomized
• Individual randomized controlled trial: Randomization is at individual
level
• Cluster randomized controlled trial: The randomization is at cluster level
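The difference between the two levels can be made concrete with a sketch of cluster randomization, in which whole clusters are allocated and every member inherits the arm of their cluster. The village names, member IDs and the function name `cluster_randomize` are hypothetical.

```python
import random


def cluster_randomize(clusters, seed=None):
    """Allocate whole clusters to arms; each person gets their cluster's arm."""
    rng = random.Random(seed)
    names = sorted(clusters)           # fixed starting order before shuffling
    rng.shuffle(names)
    half = len(names) // 2
    arm_of_cluster = {
        name: ("intervention" if position < half else "control")
        for position, name in enumerate(names)
    }
    return {
        person: arm_of_cluster[name]
        for name, members in clusters.items()
        for person in members
    }


# Hypothetical villages and residents.
villages = {
    "Village A": ["A1", "A2", "A3"],
    "Village B": ["B1", "B2"],
    "Village C": ["C1", "C2", "C3", "C4"],
    "Village D": ["D1", "D2"],
}
allocation = cluster_randomize(villages, seed=1)

for village, members in villages.items():
    print(village, "->", allocation[members[0]])
```

In an individual randomized trial the shuffle would be applied to the people themselves, so neighbours in the same village could end up in different arms.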
58
Cont’d…
Based on level (phase I-IV) for new drug trials only:
Phase I:
• Trial on few subjects (20-80)
• 1st experiment in human for new drug, schedule, or combination
• Primary concern: Safety
• Goal: define the maximum tolerated dose (MTD) in a dose-escalation
study
Phase II:
• Small randomized, controlled, blinded (100-300)
• Tests tolerability and different doses
• E.g., optimal dosage without side effects
• Applied to patients with relevant illness
• Goal - Identify suitable formulation of drug
Cont’d…
Phase III
• Referred to as clinical trial
• Trial on large number of subjects (1000+) with therapeutic dose to
evaluate efficacy of drug and side effects
• Usually randomized, blinded, controlled trial
• If successful, licensed and marketed
Phase IV
• Post marketing surveillance for long term effects.
• Large studies after approval of drug
• Often observational, study long-term effects
60
Fig. Schematic display of experimental study
Cont…
Uncontrolled trials
• The new intervention is studied without any direct comparison with a
similar group of patients on more standard therapy
• Problems of uncontrolled studies are:
-biased selection of participants
-biased assessment of outcomes
Cont…
Non randomized concurrent controls
• Non randomized methods are used to allocate participants to
intervention or control group
• Systematic allocation - e.g. date of birth, date of
presentation
• Judgement allocation – investigator or participant is allowed
to exercise judgement in allocation
• Major problem is predictability of allocation group,
especially by investigator
Cont…
Randomized concurrent controls
• An experimental design used to evaluate drugs and
health interventions
• Subjects should be allocated to treatment group
according to some chance mechanism
• RCT is a research activity that involves the
administration of a test regimen to humans to evaluate
its safety, efficacy, effectiveness
Main steps in RCT
1. Identify new drug/intervention/prevention
2. Identify comparison –
Example: standard treatment versus placebo (control v
intervention group)
3. Define eligible patient population/exclusions (i.e. the
sampling frame)
4. Define the outcomes and how to assess them
5. Write the protocol
cont…
6. Obtain research ethics committee approval
7. Recruit and consent required sample of patients
8. Randomize to treatment, then treat
9. Follow-up and compare/analyze outcome data
10. Publish/disseminate findings
66
Challenges of experimental study
⸙Ethical issues: Ethical considerations prevent evaluation of many
treatments or procedures using an interventional strategy.
⸙Feasibility/practicability: Getting adequate subjects to be enrolled into
the study is not easy.
⸙Cost: In most cases, experimental studies are very expensive.
⸙Non-compliance: When study subjects are expected to take medication
for long period of time non-compliance can be a challenge to the study.
67
Analysis in Interventional studies
⸎Similar to cohort studies
⸎Compare the rates of the outcome of interest in the treated group(s) and
the corresponding rates in the comparison group(s)
⸎The roles of chance, bias, and confounding must be evaluated
⸎Compare the relevant characteristics of the randomized treatment and
comparison groups to assess whether balance was achieved
⸎This comparison should always be presented as one of the first tables in
the report of the study findings
cont…
⸎The exclusion of any randomized patients from the analysis can lead to
biased results
⸎Noncompliance may be related to factors that also affect the risk of the
outcome under study
⸎Thus, analyze by intention to treat; in other words, “once randomized,
always analyzed”
⸎Those who are no longer complying with the study regimen should
continue to provide all follow-up information whenever possible, or at the
very least, their vital status should be ascertained
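A small sketch can show why excluding randomized participants biases the result. The trial data below are entirely hypothetical and are constructed so that the non-compliers in the treatment arm have worse outcomes; the "compliers only" comparison is included only as a contrast to the intention-to-treat analysis.

```python
from collections import namedtuple

Participant = namedtuple("Participant", "assigned complied had_outcome")

# Hypothetical trial: 10 participants randomized to each arm.
trial = (
    [Participant("treatment", True, True)] * 1
    + [Participant("treatment", True, False)] * 7
    + [Participant("treatment", False, True)] * 2     # non-compliers
    + [Participant("control", True, True)] * 4
    + [Participant("control", True, False)] * 6
)


def risk_by_arm(rows, arm):
    """Proportion with the outcome among those ASSIGNED to the given arm."""
    in_arm = [p for p in rows if p.assigned == arm]
    return sum(p.had_outcome for p in in_arm) / len(in_arm)


# Intention to treat: everyone is analyzed in the arm they were randomized to.
itt_rr = risk_by_arm(trial, "treatment") / risk_by_arm(trial, "control")

# Contrast: dropping non-compliers before the comparison.
compliers = [p for p in trial if p.complied]
compliers_only_rr = (risk_by_arm(compliers, "treatment")
                     / risk_by_arm(compliers, "control"))

print(f"Intention-to-treat relative risk: {itt_rr:.2f}")
print(f"Compliers-only relative risk:     {compliers_only_rr:.2f}")
```

In this made-up example, dropping the non-compliers makes the treatment look far more protective than the intention-to-treat comparison does, because the people who stopped complying were also the ones at higher risk of the outcome.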
70
Thank you !!!
71