Cohort Studies:
Design, Conduct, Analysis and Interpretation
• Cohort studies are also called “prospective”,
“longitudinal” or “forward looking” studies.
• A cohort study is a major type of observational analytic
study used to investigate the cause of an outcome
• Incidence of disease and relative risk (RR) can be
measured from this study design
• RR is the direct measure of risk and indicates the
strength of association between an exposure and
outcome
• The p-value indicates how strong the evidence for an
association is
Three conditions for the calculation of incidence
• Subjects must be free from the disease of interest at the
beginning
• Subjects must be at risk of developing the disease
• Subjects must be followed up for a certain period of time to
observe the outcome
Design
• Cohort studies begin with exposure to a suspected
risk factor and follow subjects forward to observe the outcome
• Basic difference between a cohort and a case control
study:
• Cohort study starts with exposure, while case-control
study begins with the outcome
Types of Cohort Studies
• Prospective or concurrent cohort studies
• Retrospective or non-concurrent or historical cohort
studies
• Ambidirectional cohort studies
Advantages and limitations
Advantages:
• The temporal relationship between the exposure and the disease can be
clearly established
• This design is the most suitable for studying rare exposures
• Multiple outcomes of a single exposure can be examined
• It tends to minimise the potential for selection bias
Limitations:
• It is necessary to follow up a large number of individuals for many
years
• It is generally time consuming and thus expensive
• It is prone to bias associated with loss to follow-up
Assembling
• Exposure status is known: Select groups on the basis of
already-known exposure status (exposed and unexposed), e.g.,
sampling from industries. Suitable when the prevalence of
exposure in the community is low.
• Exposure status is unknown: Select a population in which some
are exposed and some are not, without knowing their
exposure status before the sample is selected. Suitable when the
prevalence of exposure is reasonably high.
• Before the subjects are exposed: Sometimes, a defined population
is selected before they are exposed, but their exposure status is
determined during the initial follow up period (e.g., school,
childcare)
Analysis
• Incidence (both among exposed and unexposed)
• Relative risk (RR) & its 95% CI
• Attributable risk
• Proportional or % attributable risk
• Population attributable risk
Cohort studies
• Association between Hepatitis B (HB) infection and liver cancer (LCa)
• The exposed and unexposed groups were assembled and followed for 5 years
• The risk of developing LCa is 1.5 times as high in persons infected
with HB as in those who do not have HB infection
• RR=1: No association
• RR>1: Exposure is a risk factor (associated with higher incidence of
disease)
• RR<1: Exposure is a protective factor (exposure is associated with lower
incidence of disease compared to the unexposed)
Ca Liver
+ - Total
HB+ 21 (a) 479 (b) 500 (a+b)
HB- 14 (c) 486 (d) 500 (c+d)
Total 35 (a+c) 965 (b+d) 1,000
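The incidence in each group and the relative risk can be checked directly from the 2x2 table. The following is a minimal Python sketch; the function and variable names are illustrative and do not come from the slides.

```python
def cumulative_incidence(cases: int, total: int) -> float:
    """Proportion of an initially disease-free group that developed the outcome."""
    return cases / total


# Cell counts from the 2x2 table (a, b, c, d as labelled there)
a, b = 21, 479  # HB+: with / without Ca liver
c, d = 14, 486  # HB-: with / without Ca liver

inc_exposed = cumulative_incidence(a, a + b)    # incidence among HB+
inc_unexposed = cumulative_incidence(c, c + d)  # incidence among HB-
relative_risk = inc_exposed / inc_unexposed

print(f"Incidence among HB+: {inc_exposed:.1%}")    # 4.2%
print(f"Incidence among HB-: {inc_unexposed:.1%}")  # 2.8%
print(f"Relative risk: {relative_risk:.2f}")        # 1.50
```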
We shall calculate
• Incidence of Ca Liver among:
• HB +ve
• HB –ve
• Relative risk (RR)
• 95% CI of RR
• P-value
We can also calculate
• Attributable risk (AR)
• Proportional (or Percent) Attributable Risk (%AR)
• Population Attributable Risk (PAR)
Calculate:
• IC among HB +ve: 4.2%
• IC among HB –ve: 2.8%
• Relative risk (RR): 1.5
• 95% CI of RR:
• P-value:
• RR in our study is 1.5. Could this be due to chance?
• To show that the RR in the population is not equal to 1, we need to calculate
the 95% CI for RR
• SE of LnRR (natural log) = ???
• If the 95% CI includes 1, the RR is not statistically significant: in the
population, RR may be equal to 1
• If both limits of the CI are >1 (e.g., 1.2-2.9), the RR is statistically
significant and the exposure is a risk factor
• If both limits of the CI are <1 (e.g., 0.7-0.9), the RR is statistically
significant and the exposure is a protective factor
Calculate:
• 95% CI of RR: 0.77-2.92
• 2 Steps:
• Step 1: Calculation of SE of LnRR
• Step 2: Calculation of 95% CI
Ca Liver
+ - Total
HB+ 21 (a) 479 (b) 500 (a+b)
HB- 14 (c) 486 (d) 500 (c+d)
Total 35 (a+c) 965 (b+d) 1,000
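The two steps can be sketched in Python as follows. The slides do not show the standard-error formula, so this sketch assumes the commonly used log-method formula SE(ln RR) = sqrt(1/a - 1/(a+b) + 1/c - 1/(c+d)). With the table's counts it gives an interval of about 0.77 to 2.92, in line with the calculated values reported in the slides. The function name is illustrative.

```python
import math


def rr_confidence_interval(a, b, c, d, z=1.96):
    """Relative risk with a log-method confidence interval (z=1.96 for 95%).

    Assumes the standard formula for the SE of ln(RR); the slides
    do not show which formula was used.
    """
    rr = (a / (a + b)) / (c / (c + d))
    # Step 1: standard error of the natural log of RR
    se_ln_rr = math.sqrt(1 / a - 1 / (a + b) + 1 / c - 1 / (c + d))
    # Step 2: build the interval on the log scale, then back-transform
    lower = math.exp(math.log(rr) - z * se_ln_rr)
    upper = math.exp(math.log(rr) + z * se_ln_rr)
    return rr, se_ln_rr, lower, upper


rr, se, lower, upper = rr_confidence_interval(21, 479, 14, 486)
print(f"RR = {rr:.2f}, SE(ln RR) = {se:.3f}, 95% CI = {lower:.2f} to {upper:.2f}")
```

Because the interval includes 1, the RR of 1.5 is not statistically significant.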
To get the p-value, use the Chi-square test:
$$\chi^2 = \frac{n\,(ad - bc)^2}{(a + b)(c + d)(a + c)(b + d)}$$
where n = a + b + c + d.
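As a check on the arithmetic, here is a minimal Python sketch of the chi-square statistic for a 2x2 table, without continuity correction. The p-value step is an addition not shown in the slides: it uses the fact that, for 1 degree of freedom, the upper-tail probability equals erfc(sqrt(chi2 / 2)). Function names are illustrative.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square for a 2x2 table (no continuity correction)."""
    n = a + b + c + d
    numerator = n * (a * d - b * c) ** 2
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    return numerator / denominator


def p_value_1df(chi2):
    """Upper-tail probability of a chi-square variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(chi2 / 2))


chi2 = chi_square_2x2(21, 479, 14, 486)
p = p_value_1df(chi2)
print(f"Chi-square = {chi2:.2f}, p = {p:.3f}")
```

The statistic is about 1.45, below the critical value of 3.84, so p is greater than 0.05.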
Calculated values:
• IC among HB +ve: 4.2%
• IC among HB –ve: 2.8%
• Relative risk (RR): 1.5
• 95% CI of RR: 0.77 to 2.92
• Chi-square calculated value: 1.45 [< 3.84, the tabulated value for 1 degree
of freedom at the 0.05 level]
• p-value: >0.05
Conclusion
• In this sample, persons with HB infection had 1.5 times the
risk of developing liver cancer compared with those who
do not have HB infection, but this association is not statistically
significant at the 95% confidence level (RR: 1.5; 95% CI: 0.77-
2.92; p>0.05).
Conclusion (if RR is <1)
• Suppose RR is 0.76 and 95% CI: 0.67-0.82
• Risk reduction: (1-0.76)=0.24 or 24%
• Exposure provides 24% protection from developing the
disease compared with the unexposed, which is statistically
significant at the 95% confidence level (RR: 0.76; 95% CI:
0.67-0.82; p<0.05).
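The interpretation rules for RR and its 95% CI can be summarised in a small helper. This is an illustrative sketch only; the function name and the wording of the messages are assumptions, not part of the slides.

```python
def interpret_rr(rr, ci_lower, ci_upper):
    """Describe an RR using the rule 'does the 95% CI include 1?'."""
    ci = f"95% CI: {ci_lower}-{ci_upper}"
    if ci_lower <= 1 <= ci_upper:
        return f"RR {rr} is not statistically significant ({ci} includes 1)"
    if ci_lower > 1:
        return f"RR {rr}: exposure is a significant risk factor ({ci})"
    # Both limits are below 1: protective factor; protection = (1 - RR) x 100
    protection = (1 - rr) * 100
    return f"RR {rr}: exposure gives {protection:.0f}% protection ({ci})"


print(interpret_rr(1.5, 0.77, 2.92))
print(interpret_rr(0.76, 0.67, 0.82))
```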
Calculation, if the measurement is Person-time
Smoking Cancer Person-years
Yes 90 (a) 20,000 (n1)
No 35 (c) 22,000 (n2)
Total 125 (D) 42,000 (N)
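The slides do not show the calculation for person-time data. A minimal sketch, assuming the usual approach, follows: compute an incidence rate per person-year in each group and take their ratio. The confidence interval uses the common approximation SE(ln rate ratio) = sqrt(1/a + 1/c), which is an assumption added here, not something stated in the source.

```python
import math

# Counts from the person-time table
cases_exposed, py_exposed = 90, 20_000      # smokers: a, n1
cases_unexposed, py_unexposed = 35, 22_000  # non-smokers: c, n2

rate_exposed = cases_exposed / py_exposed        # cases per person-year
rate_unexposed = cases_unexposed / py_unexposed
rate_ratio = rate_exposed / rate_unexposed

# Assumed large-sample approximation for the SE of ln(rate ratio)
se_ln_ratio = math.sqrt(1 / cases_exposed + 1 / cases_unexposed)
lower = math.exp(math.log(rate_ratio) - 1.96 * se_ln_ratio)
upper = math.exp(math.log(rate_ratio) + 1.96 * se_ln_ratio)

print(f"Rate among smokers: {rate_exposed * 1000:.2f} per 1,000 person-years")
print(f"Rate among non-smokers: {rate_unexposed * 1000:.2f} per 1,000 person-years")
print(f"Rate ratio: {rate_ratio:.2f} (95% CI {lower:.2f} to {upper:.2f})")
```

Under these assumptions the rate ratio is about 2.83, and the interval excludes 1.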
Potential bias
• Bias in assessment of outcome
• Information bias (especially in non-concurrent cohort
studies)
• Bias due to non-response and loss to follow-up
Attributable risk (AR)
• Attributable risk is another measure of risk in epidemiology
• This measurement is important in public health
• AR provides information about the expected amount of reduction of
disease incidence in the population if the risk factor is eliminated
IC among HB +ve: 4.2%
IC among HB –ve: 2.8%
Relative risk (RR): 1.5
• Attributable risk (AR)
• AR indicates the excess risk attributable to the exposure
• (i.e., the exposed group experiences an incidence X percentage points
higher than the unexposed group).
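The slides do not show the formula. Assuming the usual definition of AR as the risk difference, with I_e and I_u (symbols introduced here) denoting the incidence among the exposed and the unexposed, the worked example gives:

```latex
\[
AR = I_e - I_u = 4.2\% - 2.8\% = 1.4\%
\]
```

That is, about 1.4 additional cases of Ca liver per 100 HB-positive persons over the follow-up period.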
IC among HB +ve: 4.2%
IC among HB –ve: 2.8%
Relative risk (RR): 1.5
• Proportional or % Attributable Risk (%AR)
• Indicates the proportion of total incidence of the disease among the
exposed group, which is attributable to the risk factor.
• What % of the disease among the exposed group is because of the
exposure
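Assuming the usual definition of %AR as the risk difference divided by the incidence among the exposed (I_e and I_u are symbols introduced here for the two incidences), the worked example gives:

```latex
\[
\%AR = \frac{I_e - I_u}{I_e} \times 100
     = \frac{4.2 - 2.8}{4.2} \times 100 \approx 33.3\%
\]
```

The same value follows from the relative risk as (RR - 1)/RR = 0.5/1.5.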
IC among HB +ve: 4.2%
IC among HB –ve: 2.8%
Relative risk (RR): 1.5
• Population AR or PAR
• Indicates what proportion of the disease incidence (e.g., Ca liver) in
the total population could be prevented, if it is possible to eliminate
exposure in the total population
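The slides do not show the PAR formula. A minimal sketch follows, assuming the common definition PAR% = (I_t - I_u) / I_t x 100, where I_t is the incidence in the total population and I_u the incidence among the unexposed. Note that the example cohort has equal exposed and unexposed groups, so using its total incidence implicitly assumes 50% of the population is exposed; this is for illustration only. Levin's formula, which takes the exposure prevalence explicitly, is included for comparison. Function names are illustrative.

```python
def par_percent(inc_total, inc_unexposed):
    """Population attributable risk as a percentage of total incidence."""
    return (inc_total - inc_unexposed) / inc_total * 100


def par_percent_levin(prevalence_exposed, rr):
    """Levin's formula: PAR% from exposure prevalence and relative risk."""
    excess = prevalence_exposed * (rr - 1)
    return excess / (1 + excess) * 100


inc_total = 35 / 1000     # all Ca liver cases / all subjects
inc_unexposed = 14 / 500  # incidence among HB-

par = par_percent(inc_total, inc_unexposed)
par_levin = par_percent_levin(0.5, 1.5)  # assumes 50% of the population is exposed

print(f"PAR% (from incidences): {par:.1f}%")
print(f"PAR% (Levin's formula): {par_levin:.1f}%")
```

Both routes give about 20% under the stated assumption; with a lower real-world prevalence of HB infection the PAR would be smaller.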