Epidemiology MCQ

Uploaded by

ሌናፍ ኡሉም

Department of Epidemiology

Fundamentals of Epidemiology (EPID 168)

Midterm Examination, Fall 1999

Instructions:
o Write the last 4 digits of your ID number in the space provided on each page (top right).
o Write clearly and legibly; avoid writing on the back of these pages.
o Show all your work and include units where appropriate.
o Write all answers and computations on these pages.

1. Which of the following best describes the retrospective design where subjects are sampled
by disease status and is often used when the investigator is interested in rare diseases? (4
pts)

A. intervention trial
B. case control study
C. retrospective cohort
D. ecologic study
E. none of the above

2. Which of the following best describes the study design that can be either retrospective or
prospective and is often used when the investigators are interested in rare exposures? (4
pts)

A. intervention trials
B. cohort studies
C. prevalence studies
D. case control study
E. none of the above

3. The strength of an association is one of the criteria for evaluating the cause and effect
relationship between an exposure and outcome. Which of the following is a measure of the
strength of association? (Choose one best answer). (4 pts)

A. incidence rate among the exposed

B. cumulative incidence among the exposed
C. the ratio of odds of exposure among cases to the odds of exposure among the non-
cases
D. odds of disease among exposed relative to the prevalence of exposure in the source
population
E. none of the above

4. Incidence rates of a disease are often referred to as direct measures of risk. Can incidence
rates be calculated from case-control studies? Briefly explain in 1-2 sentences why they can
or cannot be calculated. (4 pts)
5. For each of the following epidemiological measures, indicate whether it is a rate, a
proportion or that it is neither a rate nor a proportion. Circle the best answer. (1 pt each)

a. Population attributable risk RATE PROPORTION NEITHER

b. Incidence density (ID) RATE PROPORTION NEITHER

c. Prevalence RATE PROPORTION NEITHER

d. Relative risk RATE PROPORTION NEITHER

6. Indicate true or false next to each of the following. (2 pt each)

____ ____ a. A "J" or "U" shaped relationship of a continuous risk factor and continuous measure of
disease suggests a Pearson product-moment correlation coefficient of near plus one or
minus one.
____ ____ b. A risk ratio measure and a correlation coefficient are both measures of association.
____ ____ c. A population attributable risk proportion depends on the prevalence of exposure and is
not directly related to the strength of an association.
____ ____ d. The study base for a case-control study consists of those people who if they developed
the disease could have been counted as cases.
____ ____ e. The Bradford Hill criterion "coherence" means that the association has been observed
repeatedly in different places, by different observers, and at different times.
____ ____ f. If an exposure is a cause of a disease, then "temporality" is the Bradford Hill criterion
for causal inference that must hold true between exposure and disease.

7. The death rates from various conditions are often compared across geographic areas. These
comparisons are usually based on directly age-standardized mortality rates. Which of the
following best describes what is meant by an age-standardized rate created by the direct
method? (Choose one best answer). (4 pts)

A. The number of events in each age stratum of a standard population is used to create
a weighted average rate.
B. The event rates in each age stratum in the standard population are used to create a
weighted average rate.
C. The event rates in the geographic area of interest are applied to the age-stratum
sizes of a standard population to create a rate that is a weighted average.
D. The event rates in the geographic area of interest are compared to the event rates of
a standard population to create a summary rate that is a weighted average.
8. In order to estimate counts and rates of work-related fatalities, the National Traumatic
Occupational Fatality system has introduced a tick-box on the death certificate to indicate
"injury at work." Kraus et al. (Am J Epidemiol 1995; 141: 973-9) attempted to validate this
"injury at work" classification system against a gold standard [International Classification of
Diseases (ICD) death certificate codes designating deaths that occurred during work-related
activities]. After reviewing a sample of 100,000 death certificates, the authors reported the
following: 1,195 true positives; 788 false positives; 97,672 true negatives; 345 false
negatives. ("Positive" indicates that the tick-box was checked; "negative" indicates that it
was not checked; "true" indicates agreement between the tick-box and the ICD code).

a. Using the counts provided above, complete the 2x2 table below: (2 pts)

ICD Classification

Death Certificate Work-related Not work-related TOTAL

Work-related

Not work-related

TOTAL

b. What are the sensitivity and specificity of the "injury at work" classification system?
(4 pts)
c. What is the positive predictive value? In your own words, how would you interpret
this value? (3 pts)
d. Based on these data is the death certificate "injury at work" classification system
likely to underestimate or overestimate the true number of work-related fatal
injuries? (2 pts)
e. The use of data from the "tick-box" on the death certificates to track work-related
mortality trends is an example of which kind of surveillance system? (Choose one
best answer). (4 pts)

A. Active surveillance
B. Passive surveillance
C. Retrospective cohort surveillance
D. Cross-sectional survey surveillance

f. The sensitivity and specificity computed above are quantitative measures of which
of the following aspects of death certificate classification of work-related fatalities?
(Choose one best answer). (4 pts)

A. Reliability of death certificate classification

B. Repeatability of death certificate classification
C. Validity of death certificate classification
D. Attributable risk of work-related classification on death certificates
E. None of the above
9. Age-related maculopathy is a leading cause of blindness among people 65 and older in the
United States, and is estimated to affect between 16 and 26% of people in this age group. In
a recent study by Klein, residents aged 43 to 86 years in the town of Beaver Dam, Wisconsin
were asked to participate in a study to determine whether cigarette smoking was related to
age-related maculopathy. At a baseline examination, participants were asked to report their
lifetime smoking habits. After 5 years, participants had an examination to determine
whether they had developed age-related maculopathy. The following table presents the
number of cases of age-related maculopathy measured at the follow-up examination among
the 1232 male participants ages 43-86 who did not have age related maculopathy (ARM) at
the baseline examination

Smoking status N Cases of ARM

Never smokers 368 26

Ever smokers 864 79

a. Which of the following best describes the research design used in this study?
(choose one best answer) (3 pts)

A. Population based cross-sectional study

B. Case cohort study
C. Nested case control study
D. Prospective cohort study
E. None of the above

b. Create a 2 x 2 table where one axis is smoking status and the other is age-related
maculopathy status. (4 pts)
c. Calculate the 5-year cumulative incidence of age-related maculopathy in ever
smokers, and in never smokers. Show your work. (4 pts)

d. Calculate the cumulative incidence ratio comparing the incidence of age-related
maculopathy in ever smokers with that in never smokers. Show your work. (4 pts)
e. Assuming causality, what is the proportion of cases of age-related maculopathy that
could have been prevented in the population of males ages 43-86 in Beaver Dam if
the smokers had never smoked? Show your work. (4 pts)

10. The following data come from a national survey of the occurrence of back pain. A case of
low back pain was defined as having at least one episode of severe back pain occurring over
a period of 6 months. The number of cases was obtained from surveys of different
occupation groups as well as a national random sample.

        Cell phone manufacturing    Textile manufacturing      National random sample
Age     Persons  Cases  Rate        Persons  Cases  Rate       Persons  Cases  Rate
25-39   1000     2      .002        100      2      .02        10,000   30     .003
40-55   700      25     .036        500      30     .06        15,000   900    .06
55+     50       15     .300        1500     150    .100       15,000   1200   .08
Total   1750     42     .024        2100     182    .087       40,000   2130   .053

a. Compute a standardized event ratio (similar to a standardized mortality ratio (SMR)
except the episodes of back pain aren’t mortal events) of back pain for the cell
phone-manufacturing employees. Briefly state in one sentence the interpretation of
this measure in this case. (3 pts)
b. Compute a standardized event ratio (similar to a standardized mortality ratio (SMR)
except the episodes of back pain aren’t mortal events) of back pain for the textile-
manufacturing employees. Briefly state in one sentence the interpretation of this
measure in this case. (3 pts)
c. Can these two ratios in part (a) and (b) be compared? Briefly explain why or why
not. (3 pts)

11. The evidence supporting obesity as a risk factor for colon cancer remains inconclusive,
especially among women. A recent study (Am J Epidemiol 1999; 150:390-398) reported the
association between obesity (measured at baseline) and colon cancer morbidity as
determined from review of medical records and death certificates in a nationally
representative cohort of men and women age 25-74 years who participated in the First
National Health and Nutrition Examination Survey from 1971 to 1975 and were
subsequently followed up through 1992. The following table is from this study for men and
women combined.

Baseline body   Number of incident cases   Person-years   Crude incidence
mass index*     of colon cancer            of follow up   rate/100,000 PY
<22             28                         53,475
22 - <24        41                         38,919
24 - <26        36                         36,610
26 - <28        40                         32,635
28 - <30        35                         21,122
30+             42                         34,904

* Kg body weight per height in meters squared

a. Which of the following best describes the research design used in this study? (Choose
one best answer). (2 pts)

A. Cross-sectional survey
B. Ecological study
C. Population based case control study
D. Cohort study
E. None of the above

b. Complete the table by calculating the crude body mass index-specific incidence rates. (3
pts)
c. Calculate the relative risk (RR) of colon cancer associated with a BMI of 28-<30. Use the
lowest BMI category as referent. In one sentence interpret your answer. (2 pts)
d. Calculate the attributable risk proportion of those in the 28-<30 BMI category. In one
sentence interpret your answer. (the attributable risk formulas provided in class can be
used even though the data provided are for rates) (2 pts)

12. Analyses of data from cohort studies often have to deal with the reality that participants
have unequal lengths of follow up. Given the data below, calculate the (a) total person time
(month) of follow up, (b) the overall incidence density rate, (c) 13 month cumulative
incidence, and (d) the product limit estimate of failure. Each horizontal line represents a
cohort participant. Each vertical line represents one month. Arrows indicate time of loss to
follow up. Black boxes indicate onset of disease (failure). (2 pts each)

a. ______________
b. ______________
c. ______________
d. ______________
University of North Carolina at Chapel Hill
School of Public Health
Department of Epidemiology
Fundamentals of Epidemiology (EPID 168)

Midterm Examination, Fall 1999

Answer Guide

1. B. Case-control studies are said to use sampling by disease and are suited for studying rare
diseases.
2. B. Cohort studies can be either retrospective or prospective and are often used to study rare
exposures.
3. C. The ratio of odds of exposure among cases to odds of exposure among noncases is the odds
ratio, which is a measure of association.
4. Incidence rates cannot be estimated from case-control studies without additional
information. In the case-control design selection of subjects is based on disease status, so
the number of cases is under the control of the investigator. If the investigator has access to
all cases and knows the size of the population from which they arise s/he can estimate
incidence, but knowledge of the population size is not available from the case-control
design.
5.

a. Population attributable risk (PARP)

Both "proportion" and "neither" received credit, since this is a subtle distinction.
According to Regina Elandt-Johnson (Am J Epidemiol 1975;102:267-271), a proportion is a
type of ratio in which the numerator is included in the denominator [p=a/(a+b)]. Since
PARP can be expressed as ("attributable" cases / all cases), it is indeed a proportion.
However, it can also be expressed as a difference of two proportions (I - I_0) or the product
of a proportion (prevalence) and the difference of two proportions [p(I_1 - I_0)], so it is easy
to be misled about its mathematical form (indeed, the "official" answer to this question
could not explain why it is a proportion!).
b. Incidence density (ID) is a RATE.
c. Prevalence is a PROPORTION.
d. Relative risk is NEITHER a rate nor a proportion.
6. Indicate true or false next to each of the following. (2 pt each)

a. FALSE – A Pearson product-moment correlation coefficient measures the extent to
which a relationship is linear, so a value of plus one or minus one corresponds to a
straight line.
b. TRUE – A risk ratio measure and a correlation coefficient are both measures of
association.
c. FALSE – A population attributable risk proportion depends on the prevalence of
exposure and is ALSO directly related to the strength of an association.
d. TRUE – The study base for a case-control study consists of those people who if they
developed the disease could have been counted as cases.
e. FALSE – The Bradford Hill criterion "coherence" means that all of the known facts
about the relationship fit into place; the criterion of "consistency" means that the
association has been observed repeatedly in different places, by different observers,
and at different times.
f. TRUE – "Temporality" is the one Bradford Hill criterion for causal inference that must
hold true between exposure and disease.

7. C. "The event rates in the geographic area of interest are applied to the age-stratum sizes of
a standard population to create a rate that is a weighted average" describes a directly-
standardized rate.
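As an illustration of the direct method described in answer 7, the sketch below applies the stratum-specific back-pain rates from the question 10 table to the age-stratum sizes of the national random sample, used here as the standard population. This calculation is not part of the answer key; the function and variable names are illustrative assumptions.

```python
# Direct age standardization, sketched with the back-pain table from question 10.
# Each group's stratum-specific rates are applied to the age-stratum sizes of a
# standard population (assumed here to be the national random sample).

standard_sizes = [10_000, 15_000, 15_000]  # national sample: 25-39, 40-55, 55+

# (cases, persons) per age stratum, copied from the question 10 table
cell_phone = [(2, 1000), (25, 700), (15, 50)]
textile = [(2, 100), (30, 500), (150, 1500)]


def directly_standardized_rate(strata, weights):
    """Weighted average of stratum-specific rates, weighted by standard sizes."""
    expected_events = sum(
        (cases / persons) * w for (cases, persons), w in zip(strata, weights)
    )
    return expected_events / sum(weights)


cell_rate = directly_standardized_rate(cell_phone, standard_sizes)
textile_rate = directly_standardized_rate(textile, standard_sizes)
print(f"cell phone: {cell_rate:.3f}  textile: {textile_rate:.3f}")
```

Because both groups are weighted by the same standard age distribution, the two resulting rates can be compared with each other. The cell phone figure leans heavily on a stratum of only 50 persons, so it is unstable.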
8. a.

ICD Classification

Death Certificate Work-related Not work-related TOTAL

Work-related 1195 788 1,983

Not work-related 345 97,672 98,017

TOTAL 1,540 98,460 100,000

b. Sensitivity = 1,195/1,540 = 78% Specificity = 97,672/98,460 = 99%

c. Positive predictive value = 1,195/1,983 = 60%
d. Based on these data the death certificate "injury at work" classification system will
overestimate the true number of work-related fatal injuries, since more non-work-
related injuries will be classified as work-related than vice-versa.
e. B. Passive surveillance – the reports are submitted by health care workers in
conformance with a general obligation rather than in response to a specific request
from the surveillance organization.
f. C. Sensitivity and specificity are measures of validity, since there is a standard for
"truth".

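The arithmetic in answers 8a through 8d can be checked with a short sketch. The counts are those reported in the question; the variable names are illustrative.

```python
# Validity measures for the "injury at work" tick-box against the ICD codes
# (counts from question 8).
tp, fp, tn, fn = 1195, 788, 97672, 345

sensitivity = tp / (tp + fn)   # tick-box positive among ICD work-related deaths
specificity = tn / (tn + fp)   # tick-box negative among ICD non-work-related deaths
ppv = tp / (tp + fp)           # ICD work-related among tick-box positives

flagged_by_tick_box = tp + fp  # deaths the tick-box counts as work-related
work_related_by_icd = tp + fn  # deaths the gold standard counts as work-related

print(f"sensitivity = {sensitivity:.1%}")
print(f"specificity = {specificity:.1%}")
print(f"PPV         = {ppv:.1%}")
print(f"tick-box count {flagged_by_tick_box} vs ICD count {work_related_by_icd}")
```

The tick-box count exceeds the ICD count because false positives (788) outnumber false negatives (345), which is the reasoning behind answer 8d.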
9. a. D. Prospective cohort, since the investigators monitored people without the condition over
time to detect its development.
b.
                      Cigarette smoking status
Case status           Ever smokers   Never smokers   Total
ARM cases             79             26              105
Non-cases             785            342             1127
Total                 864            368             1232

c. CI in ever smokers = # new cases / population at risk = 79/864 = 0.091 in 5 years
CI in never smokers = # new cases / population at risk = 26/368 = 0.071 in 5 years
d. Cumulative incidence ratio (CIR) = CI in ever smokers / CI in never smokers
= (79/864) / (26/368) = 1.29
e. PARP = (overall incidence – incidence in never smokers) / overall incidence of ARM
= (0.0852 – 0.0707) / 0.0852 = 17%
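The cumulative incidence, cumulative incidence ratio, and PARP calculations for question 9 can be reproduced with the following sketch, using the counts from the Beaver Dam table. Variable names are illustrative.

```python
# 5-year cumulative incidence (CI), CI ratio, and population attributable
# risk proportion (PARP) for age-related maculopathy (counts from question 9).
ever_cases, ever_n = 79, 864
never_cases, never_n = 26, 368

ci_ever = ever_cases / ever_n
ci_never = never_cases / never_n
cir = ci_ever / ci_never

# Overall incidence in the whole cohort (ever and never smokers combined)
ci_overall = (ever_cases + never_cases) / (ever_n + never_n)
parp = (ci_overall - ci_never) / ci_overall

print(f"CI ever = {ci_ever:.3f}, CI never = {ci_never:.3f}")
print(f"CIR = {cir:.2f}")
print(f"PARP = {parp:.0%}")
```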

10.

a. Standardized event ratio (for cell phones) = SMR (cell phone) = observed/expected
= 42/{(.003)(1000) + (.06)(700) + (.08)(50)} = 42/49 = 0.86
b. Standardized event ratio (textiles) = SMR (textile) = observed/expected
= 182/{(.003)(100) + (.06)(500) + (.08)(1500)} = 182/150.3 = 1.2
c. These two ratios cannot be compared directly. An SMR is a weighted average where
the weights (e.g., age structure) come from the population for which indirect
standardization is being carried out. So SMRs for two populations use different
weights. Unless the populations have identical age structures, the stratum-specific
rates are the same for all strata, or the stratum-specific rates for one population are a
constant multiple of those for the second population, the comparison is invalid. With
indirect standardization, it is actually the "standard population" rates that are being
"standardized" to the age distribution of the study population.

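The indirect standardization in answers 10a and 10b can be sketched as follows. The national random sample rates serve as the standard rates, as in the answer guide; the helper name `standardized_event_ratio` is an illustrative assumption.

```python
# Indirect standardization: observed events divided by the events expected if
# the national (standard) age-specific rates applied to each group's own
# age structure. Data are from the question 10 table.
standard_rates = [0.003, 0.06, 0.08]  # national sample: 25-39, 40-55, 55+


def standardized_event_ratio(observed, persons, rates):
    """Return (ratio, expected) for one study group."""
    expected = sum(n * r for n, r in zip(persons, rates))
    return observed / expected, expected


ser_cell, expected_cell = standardized_event_ratio(42, [1000, 700, 50], standard_rates)
ser_textile, expected_textile = standardized_event_ratio(182, [100, 500, 1500], standard_rates)

print(f"cell phone: expected {expected_cell:.1f}, ratio {ser_cell:.2f}")
print(f"textile:    expected {expected_textile:.1f}, ratio {ser_textile:.2f}")
```

Each ratio uses its own group's age structure as weights, which is why the two ratios are not directly comparable, as answer 10c explains.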
11.
Baseline body   Number of incident      Person-years   Incidence
mass index*     cases of colon cancer   of follow up   rate/100,000 PY
<22             28                      53,475         52.4
22 - <24        41                      38,919         105.3
24 - <26        36                      36,610         98.3
26 - <28        40                      32,635         122.6
28 - <30        35                      21,122         165.7
30+             42                      34,904         120.3

* kg body weight per height in meters squared

a. D. Cohort study
b. See the completed table above.
c. RR of colon cancer for BMI 28-<30 kg/m² vs. lowest = 165.7/52.4 = 3.16
d. ARP for BMI 28-<30 kg/m² vs. lowest = (3.16 – 1) / 3.16 = 68%

The ARP of 68% means that 68% of the incidence in the 28-<30 kg/m² group is
attributable to elevated BMI.
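The crude incidence rates, rate ratio, and attributable risk proportion for question 11 can be recomputed from the case counts and person-years with the sketch below. The dictionary layout and names are illustrative.

```python
# Crude BMI-specific incidence rates per 100,000 person-years, then the rate
# ratio and attributable risk proportion for BMI 28-<30 vs. the <22 referent.
# (cases, person-years) are taken from the question 11 table.
bmi_table = {
    "<22": (28, 53_475),
    "22-<24": (41, 38_919),
    "24-<26": (36, 36_610),
    "26-<28": (40, 32_635),
    "28-<30": (35, 21_122),
    "30+": (42, 34_904),
}

rates = {
    bmi: cases / person_years * 100_000
    for bmi, (cases, person_years) in bmi_table.items()
}

rr = rates["28-<30"] / rates["<22"]  # ratio of rates, referent = lowest BMI
arp = (rr - 1) / rr                  # attributable risk proportion in the exposed

for bmi, rate in rates.items():
    print(f"{bmi:>7}: {rate:6.1f} per 100,000 PY")
print(f"RR = {rr:.2f}, ARP = {arp:.0%}")
```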

12.

a. 43 person-months
b. 3 cases/43 person-months = 7.0 cases per 100 person-months
c. 13-month CI = 3/7 = 0.43
d. Product-limit estimate of failure = 1-[(6-1)/6 x (5-1)/5 x (3-1)/3] = 1-0.444 =
0.556
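The follow-up diagram for question 12 is not reproduced here, so the sketch below starts from the summary figures in the answer key: 43 person-months, 3 failures among 7 participants, and risk sets of 6, 5, and 3 at the three failure times. It shows how the incidence density, cumulative incidence, and product-limit estimate follow from those numbers.

```python
# Question 12 measures, computed from the answer key's summary figures.
person_months = 43
failures = 3
participants = 7

incidence_density = failures / person_months   # cases per person-month
cumulative_incidence = failures / participants # 13-month cumulative incidence

# (number at risk just before the failure, number failing) at each failure time
risk_sets = [(6, 1), (5, 1), (3, 1)]

survival = 1.0
for at_risk, failed in risk_sets:
    survival *= (at_risk - failed) / at_risk   # conditional survival at this time

product_limit_failure = 1 - survival

print(f"ID = {incidence_density * 100:.1f} per 100 person-months")
print(f"CI = {cumulative_incidence:.2f}")
print(f"product-limit failure = {product_limit_failure:.3f}")
```

The product-limit estimate is larger than the simple 3/7 proportion because participants lost to follow-up leave the risk set instead of being counted as disease-free for the full 13 months.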

University of North Carolina at Chapel Hill
School of Public Health
Department of Epidemiology

Fundamentals of Epidemiology (EPID 168)

Final Examination, Fall 1999

The questions on this examination are largely based on Cantor KP, Lynch CF, Hildesheim ME,
Dosemeci M, Lubin J, Alavanja M, Craun G. Drinking water source and chlorination byproducts in
Iowa. III. Risk of brain cancer. Am J Epidemiol 1999;150:552-60. You may refer to an unannotated
copy of this article during the examination.

1. Briefly discuss two reasons why a case-control study is (or is not) well suited to examine risk
factors for brain cancer. (3 pts)

2. The authors describe the study design they used as a "population-based case-control study".
Briefly explain how this is different than a non-population based case-control study. Include in
your answer issues regarding the selection of cases, selection of controls, and validity. (3 pts)
3. Cases were identified by the State Health Registry of Iowa. Which of the following categories of
study design best describes this method of case finding? Choose one best answer. (3 pts)

A. Prospective follow-up
B. Passive surveillance
C. Cross-sectional survey
D. Community-based screening
E. Hospital-based surveillance

4. The authors state that cases had to be newly diagnosed with histologically confirmed glioma
without previous diagnosis of a malignant neoplasm. Which of the following best describes an
advantage of using incident cases instead of prevalent cases? Choose one best answer. (3 pts)

A. Using incident cases allows the investigators to directly compute relative risks.
B. Using incident cases reduces the non-systematic error of case-control studies.
C. Estimates of exposure from incident cases may be less influenced by disease status.
D. Using incident cases allows for the investigation of effects on risk versus those affecting
duration.
E. Incident cases are less likely to be lost to follow up than prevalent cases.

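5. Even if the investigators are careful in the selection of cases and controls, selection bias can
make interpretation of results difficult. Which of the following is NOT a situation that can
produce selection bias? Choose one best answer. (3 pts)

A. The exposure has some influence on the process by which controls are selected.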
B. The exposure has some influence on the process of case ascertainment.
C. The disease status has some influence on the recall of exposures.
D. The exposed cases are reported to registries more than unexposed.
E. All of the above will produce selection bias.
6. In this study, exposure information for many of the brain cancer cases was provided by proxy
respondents. The authors did not have information from independent sources that could be
used to directly verify information provided by these surrogates. However, suppose a follow-up
questionnaire was administered to cases, and for 85 of the cases, the investigators were able to
obtain information about whether or not they used a private well directly from the cases (self
report). Assuming that self report is the best available assessment of whether they used a
private well or not, complete the table below so that it reflects a sensitivity, specificity, and
positive predictive value of a proxy response of 77%, 75%, and 57%, respectively. Assume that
26 of the cases reported that they used private wells. Show your calculation. (6 pts)

Proxy report    Self report = YES    Self report = NO

YES

NO

7. Cases in this study were histologically confirmed. This is an example of which of the following
disease classification criteria? Choose one best answer. (3 pts)

A. Causal criteria
B. Ecologic criteria
C. Manifestational criteria
D. Etiologic criteria
E. None of the above

8. Consider the data presented in Table 1 of this article. Which of the following best represents the
proportion of the risk of brain cancer in the population that is attributable to working on a farm
(farm occupation)? Assume that a farm occupation is causally related to brain cancer risk.
Choose one best answer. (4 pts)

A. 33%
B. 57%
C. 10%
D. 29%
E. Cannot be calculated from case-control studies

9. A case-control study like the one described in this paper is most useful when it helps us
understand what is happening in the study base (underlying population). Which of the
following best describes the study base in this article? Choose one best answer. (3 pts)

A. The study base is those who if they developed brain cancer could have been selected as a
case.
B. The study base is those who have an equal probability to be selected as a case or control.
C. The study base is those who are identified as cases or controls after excluding non-
responders.
D. The study base is those who if exposed would have been identified as exposed.
E. None of the above.
10. In Table 3 the odds ratios for incident brain cancer by duration of chlorinated surface water
exposure are given. The odds ratio (95% confidence interval) in men estimating the risk of
brain cancer with 1-19 years of exposure is 1.3 (0.8, 2.1) and 2.5 (1.2, 5.0) for 40 years or more
of exposure. Which of the following best describes the role of chance in observing these two
estimates? Choose one best answer. (3 pts).

A. The odds ratio for ≥40 years exposure is more likely due to chance because it is based on
fewer cases and controls.
B. The odds for 1-19 years of exposure are more likely due to chance because the point
estimate is closer to the null value (1.0).
C. The odds ratio for ≥40 years exposure is more likely due to chance because the
confidence interval is so wide.
D. The odds ratio for 1-19 years of exposure is less likely due to chance because the
confidence interval is narrower.
E. The odds ratio for ≥40 years exposure is less likely due to chance because the confidence
interval does not include 1.0.

11. Table 3 presents odds ratios for the association of incident brain cancer with various levels of
lifetime average THM exposure. The odds ratio (95% confidence interval) for lifetime average
THM concentration of 0.8-2.2 µg/liter for men was 0.9 (0.6, 1.6). The odds ratio (95%
confidence interval) for lifetime average THM concentration of ≥32.6 µg/liter for women was
0.9 (0.4, 1.8). Which of the following best describes the precision of these two estimates of
risk? Choose one best answer. (3 pts)

A. The estimate is equal because the point estimates are the same.
B. The estimate is equal because neither confidence interval excludes 1.0.
C. The estimate in men is slightly more precise because the confidence interval is narrower.
D. The estimate in women is slightly more precise because the exposure level is much higher.
E. The precision of the estimates cannot be compared because they are from different
exposure groups.

12. Using the data in Table 4, which of the following best describes the crude unadjusted odds
ratio estimating the risk of brain cancer associated with ≥40 years exposure to chlorinated surface
water in men with above median tap water intake? Use the category of 0 years exposure to
chlorinated surface water as the reference group. Choose one best answer. (4 pts)

A. 4.0
B. 1.5
C. 3.6
D. 2.6
E. Cannot be computed from data in Table 4.

13. Table 1 shows the adjusted odds ratio estimating the risk of brain cancer by population size.
Using the  25,000 population sizes as a reference calculate the crude (unadjusted) odds ratio
associated with the > 50,100 population. In 2 sentences or less explain why the two estimates
agree or disagree. (4 pts)
14. The authors state that they "found a dose-response relationship among men between brain
cancer and duration of consuming drinking water from chlorinated surface water…” Using 3
Bradford Hill criteria, in 3-4 sentences, address causality (or the lack of causality) of the
relationship of drinking water to brain cancer. (4 pts)
15. An early study of drinking water and brain cancer was an ecological study conducted by the
lead author of the present article. In this study, brain cancer mortality rates in 923 U.S. counties
were compared with average levels of THM measured in the drinking water supplies of those
counties. For counties in which the sampled water supply served at least 85% of the residents
of that county, the correlation coefficient between county-specific mortality rates from brain
cancer and trihalomethane levels was 0.24 in White men and 0.19 in White women. After
reviewing this paper, your colleague concluded that THM in drinking water are causally related
to brain cancer. However, you are more cautious in your interpretation, citing the "ecological
fallacy." Please define the ecologic fallacy (2 pts) and describe why it limits the causal
inferences that can be made from the ecological study described above (2 pts).
16. The authors used information provided by cases and controls on place of residence, primary
source of drinking water, and tap water and total fluid consumption to create an index of
cumulative lifetime exposure. However, the natural history of cancer (initiation, promotion,
conversion, and progression) may encompass many years. If drinking water is involved at the
earliest stages of brain cancer (initiation), then drinking water exposures in the recent past may
be more important than present exposures or those in the distant past (e.g., in childhood). As
defined in class, which of the following periods would be important in defining the minimal and
maximal length of time expected between drinking water exposure and diagnosis with
histologically confirmed glioma? Choose one best answer. (3 pts)

A. Induction period
B. One year case fatality
C. Latent period
D. Both A and C
E. None of above

17. The authors included all cases of histologically confirmed malignant brain cancers, including
glioblastoma, fibrillary and gemistocytic astrocytoma, and mixed glioma. If authors suspected
that drinking water exposure was associated with only certain subtypes of brain cancer (i.e.,
disease heterogeneity), which of the following strategies could they employ at the analysis
stage? (3 pts)

A. Adjustment for cancer type using mathematical modeling (e.g., logistic regression)
B. Stratification of cases by brain cancer type
C. Direct standardization by brain cancer type
D. Indirect standardization by brain cancer type
E. Matching cases and control by brain cancer type

18. The authors restricted their analysis to those cases and controls with at least 70 percent of
their lifetime years with a known source of drinking water. This approach was used to reduce
which type of bias? Choose one best answer (3 pts)

A. Confounding bias
B. Selection bias
C. Information bias
D. Random error
E. None of the above
19. (question was not asked)

20.
a. Using the data in Table 3, label and complete a 2x2 table for the association between brain
cancer and >=40 years’ residence with a chlorinated surface water source (versus 0
years), collapsing over sex (i.e., combine the data for men and women). (4 pts)
b. Calculate the odds ratio for your 2x2 table in part a. Show your work. (3 pts)
c. Suppose that the sex-adjusted OR for the relationship between brain cancer and >=40
years’ residence with a chlorinated surface water source is 1.1. Is sex a confounder of this
relationship? Justify your answer. (3 pts)
d. Is sex an effect modifier (assuming a multiplicative model for joint effects) of the
relationship between brain cancer and >=40 years’ residence with a chlorinated
surface water source? Justify your answer. (3 pts)
e. According to Table 1, having a farming occupation (ever vs. never) is a risk factor for
brain cancer (OR=1.5). Assume that among the controls, farming occupation is associated
with duration of residence with a chlorinated surface water source. Could farming
occupation be a confounder of the associations reported under the Total column in Table
3? Explain your answer. (3 pts)

21. Characteristics of cases and controls included in this study are shown in Table 1. Using this
information answer the following questions.

a. Calculate the appropriate crude (unadjusted) measure of association between farm
occupation and brain cancer. Consider those ever working on a farm as sufficient to be
classified as having a farm occupation. In 2 sentences or less interpret what this odds ratio
means. (4 pts)

Farm Occupation    CASE    CONTROL

YES

NO

b. Assume that 10% of the cases that were labeled as never having worked on a farm truly had
worked in such an environment. Furthermore assume that 15% of the controls that were
labeled as having ever worked on a farm, in fact never really did work on a farm. What
would the true association be between farm occupation and brain cancer? Assume that the
classification of disease status is valid. (4 pts)
c. Which of the following best describes a comparison of the odds ratios you computed in
parts (a) and (b)? Choose one best answer. (3 pts)

A. The odds ratios are different as a result of differential misclassification of exposure.

B. The odds ratios are different as a result of non differential misclassification of exposure.
C. The odds ratios are different as a result of differential misclassification of disease status.
D. The odds ratios are different as a result of non differential misclassification of disease status.
E. The odds ratios are different as a result of random variation in the exposure assessment.
22. Which of the following is a measure of the validity of methods used to classify exposures
such as having worked on a farm? Choose one best answer. (3 pts)

A. interclass correlation coefficient

B. kappa statistic
C. standard error
D. sensitivity
E. none of the above

23.
a. Using data in Table 1, assess whether the crude OR of brain cancer associated with
farm occupation is confounded by age and/or sex. Support your answer with
relevant calculations. Table 1 shows the adjusted odds ratios estimating the risk of
brain cancer due to having farm occupation. (2 pts)

b. What feature of the study design could have contributed to the crude OR’s in Table 1
being confounded by age and/or sex? (2 pts)
University of North Carolina at Chapel Hill
School of Public Health
Department of Epidemiology

Fundamentals of Epidemiology (EPID 168)

Final Examination, Fall 1999

The questions on this examination are largely based on Cantor KP, Lynch CF, Hildesheim ME,
Dosemeci M, Lubin J, Alavanja M, Craun G. Drinking water source and chlorination byproducts in
Iowa. III. Risk of brain cancer. Am J Epidemiol 1999; 150:552-60. You may refer to an unannotated
copy of this article during the examination.

1. Briefly discuss two reasons why a case-control study is (or is not) well suited to examine risk
factors for brain cancer. (3 pts)

A. Prospective follow-up
B. Passive surveillance
C. Cross-sectional survey
D. Community-based screening
E. Hospital-based surveillance

4. The authors state that cases had to be newly diagnosed with histologically confirmed glioma
without previous diagnosis of a malignant neoplasm. Which of the following best describes an
advantage of using incident cases instead of prevalent cases? Choose one best answer. (3 pts)

5. Even if the investigators are careful in the selection of cases and controls, selection bias can
make interpretation of results difficult. Which of the following is NOT a situation that can
produce selection bias? Choose one best answer. (3 pts)
A. The exposure has some influence on the process by which controls are selected.
B. The exposure has some influence on the process of case ascertainment.
C. The disease status has some influence on the recall of exposures.
D. The exposed cases are reported to registries more than unexposed.
E. All of the above will produce selection bias.

6. In this study, exposure information for many of the brain cancer cases was provided by proxy
respondents. The authors did not have information from independent sources that could be
used to directly verify information provided by these surrogates. However, suppose a follow-up
questionnaire was administered to cases, and for 85 of the cases, the investigators were able to
obtain information about whether or not they used a private well directly from the cases (self
report). Assuming that self report is the best available assessment of whether they used a
private well or not, complete the table below so that it reflects a sensitivity, specificity, and
positive predictive value of a proxy response of 77%, 75%, and 57%, respectively. Assume that
26 of the cases reported that they used private wells. Show your calculation. (6 pts)

Proxy report Self Report = YES Self report = NO

YES

7. Cases in this study were histologically confirmed. This is an example of which of the following
disease classification criteria? Choose one best answer. (3 pts)

A. Causal criteria
B. Ecologic criteria
C. Manifestational criteria
D. Etiologic criteria
E. None of the above

8. Consider the data presented in Table 1 of this article. Which of the following best represents
the proportion of the risk of brain cancer in the population that is attributable to working
on a farm (farm occupation). Assume that a farm occupation is causally related to brain
cancer risk. Choose one best answer. (4 pts)

A. 33%
B. 57%
C. 10%
D. 29%
E. Cannot be calculated from case-control studies

10. In Table 3 the odds ratios for incident brain cancer by duration of chlorinated surface water
exposure are given. The odds ratio (95% confidence interval) in men estimating the risk of
brain cancer with 1-19 years of exposure is 1.3 (0.8, 2.1) and 2.5 (1.2, 5.0) for 40 years or
more of exposure. Which of the following best describes the role of chance in observing
these two estimates? Choose one best answer. (3 pts).

A. The odds ratio for ≥ 40 years exposure is more likely due to chance because it is
based on fewer cases and controls.
B. The odds ratio for 1-19 years of exposure is more likely due to chance because the point
estimate is closer to the null value (1.0).
C. The odds ratio for ≥ 40 years exposure is more likely due to chance because the
confidence interval is so wide.
D. The odds ratio for 1-19 years of exposure is less likely due to chance because the
confidence interval is narrower.
E. The odds ratio for ≥ 40 years exposure is less likely due to chance because the
confidence interval does not include 1.0.

11. Table 3 presents odds ratios for the association of incident brain cancer with various levels
of lifetime average THM exposure. The odds ratio (95% confidence interval) for lifetime
average THM concentration of 0.8-2.2 µg/liter for men was 0.9 (0.6, 1.6). The odds ratio
(95% confidence interval) for lifetime average THM concentration of ≥ 32.6 µg/liter for
women was 0.9 (0.4, 1.8). Which of the following best describes the precision of these two
estimates of risk? Choose one best answer. (3 pts)

A. The estimate is equal because the point estimates are the same.
B. The estimate is equal because neither confidence interval excludes 1.0.
C. The estimate in men is slightly more precise because the confidence interval is
narrower.
D. The estimate in women is slightly more precise because the exposure level is much
higher.
E. The precision of the estimates cannot be compared because they are from different
exposure groups.

12. Using the data in Table 4, which of the following best describes the crude unadjusted odds
ratio estimating the risk of brain cancer associated with ≥ 40 years exposure to chlorinated
surface water in men with above median tap water intake? Use the category of 0 years
exposure to chlorinated surface water as the reference group. Choose one best answer. (4
pts)

A. 4.0
B. 1.5
C. 3.6
D. 2.6
E. Cannot be computed from data in Table 4.

13. Table 1 shows the adjusted odds ratio estimating the risk of brain cancer by population size.
Using the  25,000 population size as a reference calculate the crude (unadjusted) odds
ratio associated with the > 50,100 population. In 2 sentences or less explain why the two
estimates agree or disagree. (4 pts)

14. The authors state that they "found a dose-response relationship among men between brain
cancer and duration of consuming drinking water from chlorinated surface water…". Using
3 Bradford Hill criteria, in 3-4 sentences, address causality (or the lack of causality) of the
relationship of drinking water to brain cancer. (4 pts)

15. An early study of drinking water and brain cancer was an ecological study conducted by the
lead author of the present article. In this study, brain cancer mortality rates in 923 U.S.
counties were compared with average levels of THM measured in the drinking water
supplies of those counties. For counties in which the sampled water supply served at least
85% of the residents of that county, the correlation coefficient between county-specific
mortality rates from brain cancer and trihalomethane levels was 0.24 in White men and
0.19 in White women. After reviewing this paper, your colleague concluded that THM in
drinking water are causally related to brain cancer. However, you are more cautious in your
interpretation, citing the "ecological fallacy." Please define the ecologic fallacy (2 pts) and
describe why it limits the causal inferences that can be made from the ecological study
described above (2 pts).

16. The authors used information provided by cases and controls on place of residence, primary
source of drinking water, and tap water and total fluid consumption to create an index of
cumulative lifetime exposure. However, the natural history of cancer (initiation, promotion,
conversion, and progression) may encompass many years. If drinking water is involved at
the earliest stages of brain cancer (initiation), then drinking water exposures in the recent
past may be more important than present exposures or those in the distant past (e.g., in
childhood). As defined in class, which of the following periods would be important in
defining the minimal and maximal length of time expected between drinking water
exposure and diagnosis with histologically confirmed glioma? Choose one best answer. (3
pts)

A. Induction period
B. One year case fatality
C. Latent period
D. Both A and C
E. None of the above

17. The authors included all cases of histologically confirmed malignant brain cancers,
including glioblastoma, fibrillary and gemistocytic astrocytoma, and mixed glioma. If
authors suspected that drinking water exposure was associated with only certain subtypes
of brain cancer (i.e., disease heterogeneity), which of the following strategies could they
employ at the analysis stage? (3 pts)

18. The authors restricted their analysis to those cases and controls with at least 70 percent of
their lifetime years with a known source of drinking water. This approach was used to
reduce which type of bias? Choose one best answer (3 pts)

A. Confounding bias
B. Selection bias
C. Information bias
D. Random error
E. None of the above

19. (question was not asked)

b. Calculate the odds ratio for your 2x2 table in part a. Show your work. (3 pts)

c. Suppose that the sex-adjusted OR for the relationship between brain cancer and
>=40 years’ residence with a chlorinated surface water source is 1.1. Is sex a
confounder of this relationship? Justify your answer. (3 pts)

d. Is sex an effect modifier (assuming a multiplicative model for joint effects) of the
relationship between brain cancer and >=40 years’ residence with a chlorinated
surface water source? Justify your answer. (3 pts)

e. According to Table 1, having a farming occupation (ever vs. never) is a risk factor for
brain cancer (OR=1.5). Assume that among the controls, farming occupation is
associated with duration of residence with a chlorinated surface water source.
Could farming occupation be a confounder of the associations reported under the
Total column in Table 3? Explain your answer. (3 pts)

21. Characteristics of cases and controls included in this study are shown in Table 1. Using this
information answer the following questions.

a. Calculate the appropriate crude (unadjusted) measure of association between farm
occupation and brain cancer. Consider those ever working on a farm as sufficient to be
classified as having a farm occupation. In 2 sentences or less interpret what this odds ratio
means. (4 pts)

Farm Occupation CASE CONTROL

YES

c. Which of the following best describes a comparison of the odds ratios you computed in
parts (a) and (b)? Choose one best answer. (3 pts)

A. The odds ratios are different as a result of differential misclassification of exposure.

B. The odds ratios are different as a result of nondifferential misclassification of
exposure.
C. The odds ratios are different as a result of differential misclassification of disease
status.
D. The odds ratios are different as a result of nondifferential misclassification of
disease status.
E. The odds ratios are different as a result of random variation in the exposure
assessment.

22. Which of the following is a measure of the validity of methods used to classify exposures
such as having worked on a farm? Choose one best answer. (3 pts)
A. interclass correlation coefficient
B. kappa statistic
C. standard error
D. sensitivity
E. none of the above

b. What feature of the study design could have contributed to the crude OR’s in Table 1
being confounded by age and/or sex? (2 pts)

University of North Carolina at Chapel Hill
School of Public Health
Department of Epidemiology

Fundamentals of Epidemiology (EPID 168)

Midterm Examination, Fall 1998

1. a. Briefly summarize two criteria on which disease classifications are based. Discuss
a reason why these two criteria do not always correspond with one another. (3 pts)

1. b. List two examples of each of the two types of criteria you mentioned in 1A. (2 pts)

2. Cohort studies can form the framework for efficient substudies, using nested case-control
and case-cohort study designs. Which of the following best compares and
contrasts nested case-control studies and case-cohort studies? (3 pts)

A. Both nested case control and case-cohort studies select controls that are
matched on time of case development but only case-cohort studies allow for
multiple comparisons with different case groups.
B. Both nested case control and case-cohort studies select controls from the
entire baseline cohort, but in case-cohort studies the selection is done at
random.
C. In case-cohort studies a single group of controls can be used for comparison
with several case groups.
D. In nested case control studies, cases are selected entirely from the non-
exposed cohort group.
E. both C and D

3. Name the three component parts of any kind of incidence measure. (3 pts)

4. Over a ten-year period the number of bicycle injury events in a population increases
even as the age adjusted bicycle injury rate decreases in the population. Describe
two conditions that could cause this outcome (assume the definition of a bicycle
injury and the quality of the data remain constant over the 10 year period) (3 pts)

5. Which of the following best describes the condition(s) that are required for the odds
ratio (OR) to estimate the risk ratio (RR) in a case-control study? (Choose one best
answer.) (3 pts)

A. Incident cases are identified for a defined population at risk.

B. The controls represent the base population that gave rise to the cases.
C. The disease outcome is rare in the base population at risk.
D. All of the above.

6. The association between induced abortion and breast cancer has been the subject of
previous epidemiological studies. Cohort studies have found no association, while at
least one case-control study has found a positive association. Possible explanations
for the different results in case-control and cohort studies of this topic include
(choose single best answer). (3 pts)

A. Case-control studies are prone to selection bias, whereas cohort studies are
not vulnerable to selection bias.
B. Recall bias might explain the association observed in a case-control study,
but this would not be a problem in prospective cohort studies.
C. The method of disease classification is different in case-control and cohort
studies.
D. All of the above

7. Swaen et al (1998) conducted a study of 6,803 males who worked for at least six
months before 1/1/80 at one of nine chemical plants in the Netherlands. The
workers were followed for mortality from 1/1/56 until 1/1/96. Before 1/1/80,
2,842 of the workers were occupationally exposed to acrylonitrile and the other
3,961 workers were not exposed to acrylonitrile. After 1/1/80, there was no
exposure to acrylonitrile. To measure the association between occupational
exposure to acrylonitrile and several outcomes, the investigators calculated
standardized mortality ratios (SMRs) for both the exposed and the unexposed
workers. Age-interval-specific person-years were generated for specific exposure
groups and were multiplied by the mortality rates for the total male population of
the Netherlands to generate expected numbers of cause specific deaths.

a. What study design did the investigators use? (2 pts)

b. What was the (crude) cumulative incidence ratio (CIR) for mortality
comparing the exposed to the unexposed men? What are two reasons why
this measure is problematic with these data?
c. For brain cancer, the SMR for the exposed workers (SMR=173.9) was more
than twice the SMR for the unexposed workers (SMR=85.7). Why are these
two SMRs not strictly comparable? (3 pts)

d. There were 290 deaths due to all causes among the exposed group and 983
deaths due to all causes among the unexposed group. What measure of effect
could be calculated to strictly compare all-cause mortality between the
exposed and the unexposed group. (2 pts)

8. The issue of classification of disease is fundamental to epidemiological
investigations. The degree to which we correctly separate cases of disease from non-cases
can be quantified in terms of specificity and sensitivity. The issue of correct
classification is important in research involving cerebrovascular disease (stroke).
Generally speaking there are two kinds of strokes: ischemic (blood flow to brain tissue is
restricted because of a blocked artery in or leading to the brain) and
hemorrhagic (a vessel in the brain ruptures, causing bleeding in the brain). These two
pathologic processes are quite different.

Background information:

A panel of experts reviewed the medical records of 525 patients discharged from the
hospital with diagnosis codes indicative of a stroke (ICD 430-438). The panel
classified strokes as either ischemic or not ischemic. Assume the diagnosis reached
by the panel is the most accurate classification possible. Of the 525 cases, 325 had a
discharge diagnosis code for ischemic stroke (ICD code 434). Of these 325 patients,
85 were determined by the panel not to be ischemic strokes. All but 20 of the
patients with discharge diagnosis codes other than 434 were determined by the
panel to have non-ischemic strokes.

Given the background information, compute the sensitivity, specificity, and positive
predictive value of a hospital discharge code for ischemic stroke (ICD code 434) in
classifying a patient as truly having an ischemic stroke.

a. sensitivity of a 434 code: (2 pts)

b. specificity of a 434 code: (2 pts)

c. positive predictive value of a 434 code: (2 pts)

d. Constructing a receiver/response operating characteristic (ROC) curve may
be useful in understanding the implications of using different case
definitions. Briefly explain what a ROC curve is and what information it
provides. (2 pts)

e. If you were to use a 434 discharge code to identify a group of cases with
ischemic stroke and the sensitivity was 99% but the specificity was 40%,
which of the following would best describe your resulting case group.
(Choose one best answer). (2 pts)

A. The case group would be highly homogeneous with respect to
pathophysiology of stroke.
B. The case group would be highly heterogeneous with respect to
pathophysiology of stroke.
C. The case group would have many false negative ischemic
strokes.
D. The case group would represent the source population of
cases.

f. What two factors influence the positive predictive value of a screening
test in most situations? (2 pts)

9. Suppose that a study was conducted to compare the rates of automobile
collisions in two cities. The researchers were impressed with studies that
suggest that the use of cell phones and pagers contributes to auto collisions.
They wanted to adjust (standardize) the rates of auto collisions in the two
cities for cell phone and pager use. Data on cell phone use and auto collisions
in the two cities were collected and are presented in the table below.

Cell phone and    Corona del Mar, California         Boulder, Colorado
pager use         # persons  # accidents  Rate*      # persons  # accidents  Rate*

Heavy                  4479          293                   100            2
Moderate                974           27                   300            6
Never                  1106           15                  8293          145
Total                  6559          335                  8693          153

* per 1000 persons

a. Calculate the crude total and cell phone/pager use specific rates for
Corona del Mar and Boulder. How do these two cities compare in
crude prevalence of auto accidents? (2 pts)

b. Using the combined number of persons in both areas as a standard,
calculate a standardized rate (standardized for cell phone/pager use)
for each of the cities. Use the direct standardization method. Briefly
describe how these standardized rates compare with each other and
with the crude rates. Briefly describe any meaningful differences. (4
pts)

c. In general, which of the following best describes a major weakness of
both crude and adjusted rates? (2 pts)

A. Both measures hide or obscure the heterogeneity in the population.

B. Both measures are only estimates of the true population rate.
C. Neither measure can be used to determine the magnitude of disease
burden in the population.
D. None of the above.

10. In a community intervention study, like the Minnesota Heart Health Program,
the effectiveness of an educational intervention program was evaluated.
Which of the following best describes the unit of assignment, the unit of
observation, and the unit of analysis in these types of studies (in this order)?
(2 pts)

A. group, person, group

B. person, group, group
C. group, group, group
D. none of the above

11. Indicate next to each statement below whether you consider it to be TRUE,
FALSE, or if you are NOT SURE. A correct answer receives 2 points, an
incorrect one zero.

a. An advantage of cohort designs compared to the pure case control
designs is that cohort studies can directly estimate risks.
b. The temporal sequence of exposure and disease can be directly
addressed in a cohort design as well as in a case control study.
c. A disadvantage of the cohort design compared to a case control study
design is that in a cohort study one cannot address multiple outcomes.
d. As described in class, a randomized clinical trial is an example of a
prospective dynamic cohort study.
e. A disadvantage of the cohort design compared to a case control study
is that in a cohort study one needs to follow a large number of
participants if the disease is rare.
f. Ecological studies cannot directly assess causal inference because
they measure disease and exposure in a person at the same point in
time.
g. Correlation studies can be quick, inexpensive, and allow for
multinational comparisons.
h. A case report is a type of descriptive study that is commonly
conducted, partially because an appropriate control group is easily
defined.
i. Cross-sectional studies are limited by their lack of generalizability, but
are powerful in that they directly measure risk.
j. The study of person, place, and time helps to understand the natural
history of a disease.
k. A risk difference is determined by the absolute difference in two
incidence rates, whereas the relative difference is considered an
attributable risk.
l. A correlation coefficient measures the degree of linear or monotonic
relationship between two variables and is therefore suitable for
determining the epidemiologic strength of association between them.
m. As an estimate of a relative risk, an odds ratio is a measure of
association that can be used to determine the magnitude of an
association between exposure and an outcome.
n. An attributable risk proportion is a measure of impact assessing
how much risk results from exposure levels. An attributable risk that
adjusts for the prevalence of the causal factor in a population is called a
population attributable risk.
o. Case control studies have several crucial advantages that relate to
their efficiency for studying rare conditions and those with prolonged
induction and their efficiency in examining many exposures and
outcomes.
p. Incidence density is a proportion where the units of time are
specified.
q. The decision to use an incidence density measure or a cumulative
incidence as a measure of the strength of association may depend on
the objectives of the study. Cumulative incidence is preferred if
estimating individual risk is the main objective.
r. A standardized mortality ratio (SMR) can be determined using
indirect adjustment. Because rates from a standard population are
used, SMR’s from two study populations can be compared as long as
the rates in the standard population are stable.
s. Comparability between cases and controls is an important step in
constructing a case-control study. It should be possible to detect
exposure in controls to the same extent as in cases. It is also critical
that controls have similar motivation and availability as cases. These
two conditions are best met when controls are selected from the
general population.

12. Attributable measures are used by researchers to assess the public health
impact of a detrimental exposure, assuming causality. Given data from a
cohort study on the incidence of stroke (see below), estimate the attributable
risk proportion among the exposed (physically inactive). Explain your
answer in one sentence. Assume that physical activity is causally related to
stroke risk.

Physical          Did develop   Did not develop   Person years   Incidence per
activity level    a stroke      a stroke          (PY)           1,000 PY

ACTIVE                 45            5,955            43,200
INACTIVE              135           13,865           100,800
Total                 180           19,820           144,000

a. Attributable risk proportion (INACTIVITY) (3 pts)

Explain:

b. Additional data from the National Health and Nutrition Examination Survey
(NHANES) suggest the prevalence of a physically active lifestyle (at least 30
minutes of moderate activity 3 days per week) is 27%. Using this information
and your answer to part (A), estimate what we can hope to accomplish with
programs to get people to be physically active in the total population. In one
sentence explain your answer. (3 pts)

Explain:

13. Suppose that in 1998 researchers hypothesized that communication ability
and skill in young adulthood was related to Alzheimer’s Disease. To test this
they evaluated hand written essays completed by a group of 350 nuns joining
a single religious sect in 1930. By careful review of these writing samples, the
researchers categorized all 350 as either having a high error profile (N=150)
or a low error profile (N=200). Using surveillance of death certificates and
other methods the researchers verified vital status of each nun through 1998.
An accounting of all deaths produced the table below.

Cause of Death and Year by Handwriting Profile Status

High error profile                                  Low error profile
Cause of Death        # of Deaths  Year of Death    Cause of Death        # of Deaths  Year of Death

Alzheimer’s Disease         2          1980         Alzheimer’s Disease         1          1985
Alzheimer’s Disease         5          1985         Alzheimer’s Disease         3          1990
Alzheimer’s Disease         6          1990         Alzheimer’s Disease         4          1995
Alzheimer’s Disease         5          1995         Heart Disease               8          1980
Heart Disease              10          1980         Heart Disease              10          1990
Heart Disease              15          1995         Other                      20          1960
Other                      25          1960         Other                      10          1970
Other                      30          1970

a. Describe the type of study design used in this example. (2 pts)

b. Compute the incidence density rate of Alzheimer’s disease death for those
with a high error profile and for those with a low error profile. (3 pts) Show
your work.

c. Compute the incidence density ratio for the risk of Alzheimer’s disease death
associated with a high error communication profile. Explain, in two
sentences or less, what this value means. (3 pts)
d. Using data from this study compute an odds ratio for the association of a high
error communication profile with death from Alzheimer’s disease. Show a
clearly labeled 2x2 table. (2 pts)

e. Compare the odds ratio with the incidence density ratio computed in part c
and explain why they are similar or different.

University of North Carolina at Chapel Hill

School of Public Health, Department of Epidemiology
Epidemiology 168, Fall 1998

Midterm Exam Answer Guide

1. a. Manifestation criteria: disease definition and classification based on observable
characteristics, such as symptoms, signs, history, laboratory findings, response to
treatment, prognosis.

Causal criteria: disease definition and classification based on the cause of the
condition,

b. Manifestation criteria: Examples are cancers, arthritis, cholecystitis,
schizophrenia, depression, addiction, insomnia, . . .

Causal criteria: microbial diseases for which the pathogen has been identified
(syphilis, TB, malaria, yellow fever, influenza, etc.), lead poisoning, birth trauma,

2. (C) - Other choices are incorrect because controls in case-cohort studies are not
matched to cases (A), controls are selected at random with both designs (B), and
cases must be selected without regard to exposure (D).
3. New cases or events, population at risk or source population, passage of time
4. The size of the population may have grown (number increases even though rate
does not); the age distribution of the population may have changed (e.g., influx of
families with small children, outmigration of families with older children), so that
age-standardized rate may not change but a greater proportion of the population
may be in the higher risk age range (assuming that younger children have higher
injury rates).
5. (D)- All of the above - use of prevalent cases requires that duration is not related to
exposure, controls should provide estimate of exposure in study base, and rare
disease assumption is required for OR to estimate RR (though not for OR to estimate
IDR).
6. (B)- In a prospective cohort study, information on exposure is obtained before the
outcome (breast cancer, in this case) has occurred. Therefore recall bias - different
recall by cases and non-cases - is not an issue. In a case-control study, cases and
non-cases may recall and report exposure with different degrees of accuracy.
7. a. A (retrospective) cohort study.

b. CIR = (290/2,842) / (983/3,961) = 0.411

A cumulative measure ignores possible differences in length of follow-up between
groups being compared. A crude measure ignores possible differences in the age
distributions between men who have been exposed and men who have not.

c. SMRs are an indirect method of standardization, since they are based on weighted
averages for which the weights are taken from the population whose SMR is being
computed rather than from a "standard" population. Unless the age (and in this case,
age-calendar year interval) distributions for the populations whose SMR's are being
computed are the same, then the weighted averages that make up the SMR's are
based on different sets of weights and are not strictly comparable. Since age-interval
distributions of exposed and unexposed workers may differ, their SMR's are not
strictly comparable.

d. Mortality rates computed with person-time denominators can be compared
between exposed and unexposed person-time. These will take into account the
varying amounts of follow-up for workers in different categories. Unless the person-
years at risk for exposed and unexposed workers have the same age distribution,
which we do not know, then adjustment for age is needed. Since there are ample
numbers of deaths from any cause, mortality rates can be directly-standardized
using any reasonable set of weights. Since directly-standardized rates are "strictly
comparable", a ratio or difference of directly standardized rates would be a suitable
measure of association.
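
The crude cumulative incidence ratio in 7b can be checked with a few lines of Python. This is only a minimal sketch: the counts are the ones quoted in question 7, and the variable and function names are illustrative, not part of the original guide.

```python
# Counts quoted in question 7 (all-cause deaths and cohort sizes).
exposed_deaths, exposed_total = 290, 2842
unexposed_deaths, unexposed_total = 983, 3961


def cumulative_incidence(events, population):
    """Proportion of the starting population that experienced the event."""
    return events / population


ci_exposed = cumulative_incidence(exposed_deaths, exposed_total)
ci_unexposed = cumulative_incidence(unexposed_deaths, unexposed_total)

# Crude cumulative incidence ratio: exposed relative to unexposed.
cir = ci_exposed / ci_unexposed

print(f"CI exposed   = {ci_exposed:.3f}")
print(f"CI unexposed = {ci_unexposed:.3f}")
print(f"CIR          = {cir:.3f}")
```

Because the calculation uses only counts of deaths and starting cohort sizes, it carries the same limitations noted in 7b: it ignores both length of follow-up and age distribution.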
8. All but 85 of the 325 code 434's were correct classifications, so there were 240
(=325-85) ischemic stroke patients correctly classified by discharge code. All but 20
of the patients without code 434 were judged not to have had an ischemic stroke,
meaning that 20 were judged to have had an ischemic stroke. Thus, there were 260
(240+20) ischemic stroke patients, of whom 240 were identified by discharge code
(sensitivity=240/260). The remaining 265 (=525-260) patients did not have an
ischemic stroke, and 180 of them were in fact not given a code 434
(specificity=180/265). Of the 325 code 434's, 240 had had an ischemic stroke
(PPV=240/325). These data are summarized in the following table:

Comparison of discharge code 434 and classification by expert panel

                     Expert panel
Discharge code    Ischemic   Not ischemic   Total
Code 434             240          85         325
Other                 20         180         200
Total                260         265         525

a. Sensitivity = (325-85) / (325-85+20) = 240 / 260 = 92.3%

b. Specificity = (200-20) / (525-260) = 180 / 265 = 68%

c. Positive predictive value of a 434 code = (325-85) / 325 = 73.8%

d. An ROC curve plots the value of sensitivity and specificity for each case definition
or cutpoint. Examining the ROC curve shows the trade-off between sensitivity and
specificity that is available for the diagnostic test or measurement method. [The
area between the identity diagonal (slope = 1.0) and the ROC curve serves as a
measure of accuracy that takes into account both sensitivity and specificity, with the
assumption that the costs of false negatives and false positives are the same.]
e. (B) - Due to the low specificity (40%), 60% of the non-ischemic (e.g., hemorrhagic)
strokes in the patient group will be classified as ischemic strokes.

f. Specificity and prevalence of the condition
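
The sensitivity, specificity, and positive predictive value in parts a-c can be checked with a short Python sketch. The function and argument names are illustrative only; the four counts come from the table comparing discharge code 434 with the expert panel.

```python
def diagnostic_measures(tp, fp, fn, tn):
    """Return (sensitivity, specificity, positive predictive value)."""
    sensitivity = tp / (tp + fn)   # true positives / all with the condition
    specificity = tn / (tn + fp)   # true negatives / all without the condition
    ppv = tp / (tp + fp)           # true positives / all test positives
    return sensitivity, specificity, ppv


# Counts from the table: code 434 versus expert panel classification
sens, spec, ppv = diagnostic_measures(tp=240, fp=85, fn=20, tn=180)
print(f"Sensitivity = {sens:.1%}")
print(f"Specificity = {spec:.1%}")
print(f"PPV         = {ppv:.1%}")
```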

9. a. Corona del Mar has a 2.9 times higher crude accident rate than Boulder.

Corona del Mar = 51.1/1000 and Boulder = 17.6/1000. Ratio = 2.9

b. Adjusted rates -

Corona del Mar: [(4579 x .0654) + (1274 x .0277) + (9399 x .0136)] / 15,252 = 29.9/1000

Boulder: [(4579 x .0200) + (1274 x .0200) + (9399 x .0178)] / 15,252 = 18.6/1000

The cell phone/pager adjusted auto accident rate for Corona del Mar was 1.6 times
that of Boulder. A portion of the difference seen in the crude rates was due to
differences in the distribution of use of cell phones and pagers between the two
cities.

The standard weights are the sum of the population sizes for the two cities. The
weighted rates are the rates for each city, weighted (multiplied) by the standard
weights. The total of the weighted rates is the directly standardized rate. A problem
in using the directly standardized rates is that there are small numbers of cellular
phone and pager users in Boulder.

The higher crude rate in Corona del Mar reflects the much higher use of cellular
phones and pagers, which is associated with a much higher accident rate. The
difference is reduced for the standardized rates, since these control for the different
distributions of cellular phones and pagers between the two cities. However, this is
a situation where it is essential to examine the specific rates, since Boulder has
lower accident rates among cellular phone and pager users but a higher rate among
never-users.

Since the rates in never users are quite similar, Corona del Mar is likely to make its
greatest impact on accident rates by getting motorists to reduce cellular phone and
pager use while driving or finding some way to make such use safer (promote the use of
"designated drivers"!?).

c. (A) Both measures obscure heterogeneity (variation) in rates across subgroups.
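
The direct standardization in part b can be sketched in Python as follows. The function name is an assumption made for illustration; the weights and stratum-specific rates are the rounded values printed in the answer, so the Corona del Mar result may differ slightly in the last digit from the figure reported there.

```python
def directly_standardized_rate(weights, stratum_rates):
    """Weighted average of stratum-specific rates using the standard weights."""
    total_weight = sum(weights)
    weighted_sum = sum(w * r for w, r in zip(weights, stratum_rates))
    return weighted_sum / total_weight


# Standard weights: combined population of the two cities in each stratum
weights = [4579, 1274, 9399]

# Stratum-specific accident rates as printed in the answer (rounded)
corona_del_mar = [0.0654, 0.0277, 0.0136]
boulder = [0.0200, 0.0200, 0.0178]

rate_cdm = directly_standardized_rate(weights, corona_del_mar)
rate_boulder = directly_standardized_rate(weights, boulder)

print(f"Corona del Mar: {rate_cdm * 1000:.1f} per 1,000")
print(f"Boulder:        {rate_boulder * 1000:.1f} per 1,000")
print(f"Standardized rate ratio: {rate_cdm / rate_boulder:.1f}")
```

Because both cities are weighted by the same standard population, the ratio of the two results is not affected by differences in the distribution of the stratifying variable.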

10. (A) Community intervention trials of this type assign groups to treatments and
collect measurements from individuals. The unit of analysis must be the same as the
unit of assignment (GROUP), or the analysis must account for both levels (e.g., using
mixed models).
11. a. T – a cohort study enrolls people who are free of the outcome and monitors them
for the development of the outcome, so the cohort design can be used to estimate
risk of the event;

b. Not sure – the temporal sequence of exposure and disease typically cannot be
established in a case-control study, though in some cases it can (e.g., for a genetic
characteristic or other "exposure" that can be definitively assigned to a time prior to
disease onset);

c. F – a cohort design can readily be used to study multiple outcomes; a case-control
design can readily be used to study multiple exposures;

d. T – a randomized clinical trial often enrolls participants over a period of time,
with follow-up time measured from the time of randomization;

e. T – a cohort study begins with disease-free subjects and monitors them for
development of the outcome; if the outcome is rare, many subjects must be followed
to obtain an adequate number of cases;

f. F – ecological studies use group-level variables (e.g., per capita meat consumption)
and relate them to disease rates; direct assessment at the individual level is NOT
made, which is the basis for the ecological fallacy (where the group data are used to
infer a link at the individual level);

g. T – correlational studies (another term for ecological studies) are often used to
compare disease rates across geopolitical entities using available data;

h. F – a case report does not involve a control group;

i. F – cross-sectional studies measure prevalence, not risk (of a future event); they
are the most statistically generalizable type of study when, as is often the case, the
study population is obtained through population-sampling;

j. F – the natural history of a disease is the process by which it develops over time;
descriptive information relating to person, place, and time can at best provide only
indirect information;

k. F – as used in class, the term "attributable risk" refers to the risk difference;

l. F – strength of association as used in epidemiology refers to the degree of change
in the one variable with respect to changes in the other variable; two variables can
be very strongly correlated (vary linearly or monotonically) yet a large change in one
may be associated with only a small change in the other (e.g., a straight line with a
modest slope has a high correlation but a small degree of change in the ordinate
variable for a given change in the variable on the abscissa);
m. T – for a rare outcome, the odds ratio (OR) closely approximates the cumulative
incidence ratio (CIR) and incidence density ratio (IDR), so it indicates strength of
association in the epidemiologic sense; when the outcome is not rare, the OR does
not approximate but does vary with the CIR and IDR, so the OR still gives an
indication of the strength of association;

n. T – an attributable risk proportion estimates the proportion of risk that is
associated with an exposure in people who are exposed; attributable risk (as used in
this course) is the risk difference, which indicates the amount of risk associated with
an exposure in people who are exposed; attributable risk must be adjusted for the
prevalence of the exposure in order to estimate the amount of risk associated with
exposure in the population as a whole;

o. F – since case-control studies begin with people who are already cases, they avoid
having to study a large number of people for a long time in order to accumulate
enough cases; they can also compare cases and controls in respect to many
exposures; HOWEVER, they cannot readily study many outcomes, since to do so
requires enrolling cases for each of the outcomes to be studied (i.e., equivalent to
conducting several case-control studies that share the same control group);

p. F – incidence density is a (relative) rate; cumulative incidence is a proportion;

q. F – incidence density and cumulative incidence are measures of frequency of
occurrence, not of strength of association;

r. F – comparability of standardized rates and ratios across study populations
requires that the standardized measures be constructed using the same set of
weights; indirect standardization (e.g., via a SMR) employs the weights (the number
of people in each stratum) from the study population, so measures standardized
using this method are, strictly speaking, useful only for comparing a study
population with the standard population used in the standardization;

s. F – typically, general population controls will be less motivated than cases and
sources of medical information for them will not be comparable to those for cases.

12. a. ARP = (I1 - I0) / I1 = (RR-1) / RR = (1.34-1.04) / 1.34 = 0.30 / 1.34 = 22% (after
rounding)

The "I can't remember formulas" method:

ARP = attributable cases / all exposed cases = attributable cases / 135
Attributable cases = attributable risk * Exposed PY = (1.34-1.04)*100,800 =
30.24
ARP = 30/135 = 22% (after rounding)
Interpretation: Based on these data, 22% (about one in five) of strokes in people who
are physically inactive can be attributed to their physical inactivity; in other words,
if physically inactive people became active early enough in their lives, their stroke
incidence would decrease by 22%.

b. A key point here is that 27% is the prevalence of physically active people, whereas
the exposure is physical inactivity, whose prevalence is therefore 100% - 27% =
73%

PARP = p1(RR-1) / [1 + p1(RR-1)] = 0.73(1.286-1) / [1 + 0.73(1.286-1)]

= (0.73 x 0.286) / (1 + 0.73 x 0.286) = 0.209 / 1.209 = 17%

(The formula PARP = (I - I0) / I can also be used by first estimating the crude
population incidence, I, as a weighted average of the incidences in exposed and
unexposed, weighting by the prevalence of exposure, e.g.: I = (0.73)(1.34) + (0.27)(1.04)
= 1.259, so PARP = (1.259 - 1.04) / 1.259 = 17%.)

The "I can't remember formulas" method:

PARP = Attributable cases / All cases

Attributable cases are (1.34-1.04) x number of exposed person-years. Since we do
not know the population size, represent it by n. Based on the NHANES data, 27% of
people are physically active, so there are 0.73n physically inactive people (in one
year, 0.73n person-years). So: Attributable cases = (1.34-1.04)(0.73n) = 0.219n.

All cases are exposed cases + unexposed cases. With 0.73n physically inactive and
0.27n physically active people (or person-years, if we assume a one-year period),
the total number of cases = exposed cases + unexposed cases
= 0.73n(1.34) + 0.27n(1.04) = 1.259n.

Therefore, PARP = 0.219n / 1.259n = 17%, since the unknown n cancels.

Note that these measures can be computed more precisely by using the original
number of cases and person-years and not rounding intermediate results, but two
significant figures is adequate for the actual result, and in this case the answer does
not change.

Explanation: Seventeen percent of all strokes in the population are attributable to
physical inactivity; if everyone were physically active, there would be 17% fewer
strokes.

c. Attributable risk measures assume that the relationship is causal (i.e., that
physical inactivity does in fact cause an increase in stroke risk). Some of the above
interpretations may also require that the process be reversible, so that changing to a
physically active lifestyle brings risk down to the level of someone who was not
inactive. Another assumption is that the rates and rate ratio observed in the cohort
study hold for the entire population. Also, we have ignored the effects of other
factors, most notably age.
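
The attributable risk proportion (part a) and population attributable risk proportion (part b) can be reproduced with a minimal Python sketch. Variable names are illustrative; the rates are taken as per 1,000 person-years, as in the answer, and the exposure prevalence is the 73% derived in part b.

```python
# Incidence rates per 1,000 person-years, as given in the answer
i_exposed = 1.34     # physically inactive (exposed)
i_unexposed = 1.04   # physically active (unexposed)
p_exposed = 0.73     # prevalence of physical inactivity (100% - 27%)

# Attributable risk proportion among the exposed: (I1 - I0) / I1
arp = (i_exposed - i_unexposed) / i_exposed

# Crude population rate as a prevalence-weighted average of the two rates
i_population = p_exposed * i_exposed + (1 - p_exposed) * i_unexposed

# Population attributable risk proportion: (I - I0) / I
parp = (i_population - i_unexposed) / i_population

# Equivalent form based on the rate ratio: p1(RR-1) / [1 + p1(RR-1)]
rr = i_exposed / i_unexposed
parp_from_rr = p_exposed * (rr - 1) / (1 + p_exposed * (rr - 1))

print(f"ARP  = {arp:.0%}")
print(f"PARP = {parp:.0%} (rate-ratio form: {parp_from_rr:.0%})")
```

The two PARP expressions are algebraically identical, so any difference between them in a hand calculation comes from rounding the rate ratio.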

13. a. This is a retrospective cohort study (researchers developed the hypothesis in
1998).

b. High error profile: (2 + 5 + 6 + 5)/8021 = 2.24 per 1,000 women-years.

Low error profile: (1+3+4) / 12,287 = 0.651 per 1,000 wy

Women-years (WY) are computed as follows:

End      Start    Years    Women    WY
1980     1930     50       2        100
1985     1930     55       5        275
1990     1930     60       6        360
1995     1930     65       5        325
1980     1930     50       10       500
1995     1930     65       15       975
1960     1930     30       25       750
1970     1930     40       30       1,200
1998     1930     68       52       3,536
Totals                     150      8,021

c. IDR= ID High / ID low = 2.24/0.651 = 3.4. Nuns with a high error communications
profile are 3.4 times more likely to die from Alzheimer's Disease than nuns with a
low error profile.

d. Alzheimer’s Disease

Handwriting Profile    AD Yes    AD No
High error               18       132
Low error                 8       192

Odds ratio = [(18)(192)] / [(8)(132)] = 3.27

e. The two are similar because the condition is fairly rare.
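
The women-years, incidence densities, incidence density ratio, and odds ratio in answer 13 can be checked with the following Python sketch. Variable names are illustrative; the low-error women-years total (12,287) is taken as given in the answer rather than derived here.

```python
# (follow-up years, number of women) for the high-error group, from the table
high_error_rows = [
    (50, 2), (55, 5), (60, 6), (65, 5),
    (50, 10), (65, 15), (30, 25), (40, 30), (68, 52),
]
women_years_high = sum(years * women for years, women in high_error_rows)

cases_high = 2 + 5 + 6 + 5   # numerator for the high-error profile
cases_low = 1 + 3 + 4        # numerator for the low-error profile
women_years_low = 12287      # given in the answer, not derived here

# Incidence densities per 1,000 women-years and their ratio
id_high = cases_high / women_years_high * 1000
id_low = cases_low / women_years_low * 1000
idr = id_high / id_low

# Odds ratio from the 2x2 table: (18 x 192) / (8 x 132)
odds_ratio = (18 * 192) / (8 * 132)

print(f"Women-years (high error): {women_years_high:,}")
print(f"ID high = {id_high:.2f}, ID low = {id_low:.3f} per 1,000 WY")
print(f"IDR = {idr:.1f}, OR = {odds_ratio:.2f}")
```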

University of North Carolina at Chapel Hill

School of Public Health
Department of Epidemiology
Fundamentals of Epidemiology (EPID 168)

Final Examination, Fall 1998

Most of the questions on this examination relate to the article "Individual risk factors for
hip osteoarthritis: obesity, hip injury, and physical activity" (Cyrus Cooper, Hazel Inskip,
Peter Croft, Lesley Campbell, Gillian Smith, Magnus McLaren, and David Coggon. Am J
Epidemiol 1998; 147:516-22). You may refer to this article during the examination.

1. Briefly list two reasons why a case control study is (or is not) appropriate to
examine individual risk factors for hip osteoarthritis. (2 pts)

2. The authors state that their cases come from a defined population.
List four features of the population or the study design that support this statement
or helped the authors to achieve it? (4 pts)

3. Considering the study population, study design, and other information in the
article, which of the following statements is (are) TRUE and which is (are) FALSE. (2
pts each)

a. In these two health districts, the incidence density of symptomatic hip
osteoarthritis of sufficient severity to warrant hip arthroplasty exceeds 40
per 100,000 person-years.

b. If about 12% of the population was age 65 years or older, then about
12,000 people age 65 years or older in the two districts have radiographic
evidence of hip osteoarthritis.

c. The data in Table 1 demonstrate that women are 1.9 times as likely to
develop severe symptomatic hip osteoarthritis as are men.

d. The data in Table 2 indicate that female gender is not a risk factor for hip
osteoarthritis.
e. In this study, matching the control group to the cases on age, as opposed to
a random sample of the general adult population, probably resulted in
greater statistical power and precision.

4. The case identification process was based on a register in each district made up of
persons on a waiting list for a total hip arthroplasty (surgical reformation of the hip
joint). Waiting lists for procedures are common in societies with a national or social
medicine system. In the United States, a region-wide waiting list for a hip
arthroplasty is unlikely, as the availability of this procedure would be more
related to insurance status or ability to afford such a procedure. Explain how using
the register system in the United Kingdom to select cases either increases or
decreases the possibility of selection bias as compared to a study conducted in the
United States. (4 pts)

5. How was the diagnosis of hip osteoarthritis made in this study? Was this based on
manifestational or causal criteria? Explain your answer. (3 pts)

6. According to the authors: "For each case, a control of the same sex and age was
selected from the list of the same general practice held by the county Family Health
Service Association". State in one sentence the rationale for using a list from general
practitioners. (3 pts)

7. Eighty-four percent of the patients listed for total hip arthroplasty fulfilled the
criteria for entry into the study as cases. Which of the following best describes the
criteria: (3 pts)

a. age > 45 years, being on the waiting list for hip arthroplasty, and the
presence of Heberden’s nodes.

b. age > 45 years, pain duration at least for 36 months, and presence of
Heberden’s nodes.

c. history of hip fracture within the past year, being on the waiting list for hip
arthroplasty and reside in the study area.

d. presence of Heberden’s nodes, history of hip fracture within the past year,
and reside in the study area.
e. being on the waiting list for hip arthroplasty, reside in the study area, and
age > 45 years

8. The authors report that 89% of the eligible cases agreed to participate and 60% of
the 1060 controls approached agreed to participate. Which of the following best
states a condition regarding the non-responders that could lead to an odds ratio
reported for the risk of osteoarthritis associated with previous hip injury that is
biased away from the null (>1). Choose one best answer. (3 pts)

a. control non-responders are more likely to have a history of hip injury
compared to case non-responders.

b. control non-responders are less likely to have a history of hip injury
compared to case non-responders.

c. being a non-respondent is not related to previous hip injury.

d. none of the above

9. What was accomplished by replacing controls who refused to participate?

(Choose one best answer) (3 pts)

If controls who refused had not been replaced:

a. selection bias would have been greater;

b. the control group would have been less representative of the study base;

c. probability of a Type I error would have been greater;

d. probability of a Type II error would have been greater;

e. nondifferential misclassification bias would have been greater.

f. it would have been necessary to control for age and sex in the analysis.

10. The authors selected controls who were individually matched to cases by age,
gender, and family practitioner. Matching in the design stage is usually considered
only for those variables that are known to be confounders. Under which of the
following circumstances could gender be a confounder of the association between a
risk factor (obesity) and the outcome (hip osteoarthritis)? Circle all that apply. (4
pts)

a. the prevalence of obesity and the prevalence of hip osteoarthritis are both
higher in men than in women

b. the prevalence of obesity is lower in men than women, but the prevalence
of hip osteoarthritis is higher in men than women.

c. the prevalence of obesity is higher in men than women, but the prevalence
of hip osteoarthritis is the same in men and women.

d. the prevalence of obesity is the same in men and women, but the
prevalence of hip osteoarthritis is higher in men than women.

11. The odds ratios in Table 2 are "mutually adjusted for the other two variables" by
logistic regression. The following questions concern the models used to estimate the
odds ratios in the table (ignore the fact that it was "conditional" logistic regression
and ignore the middle categories for body mass index and presence of Heberden’s
nodes) (2 pts each):

a. How many logistic models were necessary to estimate the odds ratios for
body mass index >28.0, definite Heberden’s nodes, and previous hip injury
among women.

b. The odds ratio estimate for hip injury in women was 2.8. What must the
logistic coefficient have been?

c. From this table, estimate the odds ratio for women who had both definite
Heberden’s nodes and previous hip injury compared to women who had
neither.

12. In this study, information on medical history, life style, and leisure time physical
activities was obtained through a "structured interviewer-administered
questionnaire". (page 517). It is possible that persons on a waiting list for a hip
arthroplasty would be more keenly aware of hip injuries they may have had in the
past than controls. If true, this is an example of which of the following? Choose one
best answer. (3 pts)

a. differential case ascertainment bias

b. differential misclassification bias

c. differential selection bias

d. differential precision bias

e. none of the above

13. Among women, the odds of previous hip injury is higher among cases than
controls (Table 2; OR=2.8). As indicated in the footnotes for Table 2, the odds ratio
for previous hip injury is adjusted or controlled for the other two variables in the
Table (body mass index and Heberden’s nodes). Using the counts shown in Table 2,
calculate an unadjusted (crude) odds ratio for previous hip injury in women. (3 pts)

Unadjusted (crude) odds ratio = _________

14. Which of the following conclusions can be made from the above results? (choose
one best answer) (3 pts)

a. the unadjusted (crude) association between hip injury and hip
osteoarthritis in women is completely confounded by body mass index and
Heberden’s nodes.

b. since the unadjusted and adjusted odds ratios are similar, the risk factor
(hip injury) must not be associated with the adjustment variables (body mass
index and Heberden’s nodes)

c. since the unadjusted and adjusted odds ratios are similar, there is no
effect-measure modification of the association between hip injury and hip
osteoarthritis.

d. none of the above

15. The odds ratios presented in Table 5 are adjusted for previous hip injury. Why
might they still be confounded by hip injury? (3 pts)

16. In Table 6, is the crude association between previous hip injury and risk of
unilateral hip osteoarthritis biased towards the null or away from the null? (2 pts)

17. Based on the data in Table 3, what is the odds ratio for Heberden's nodes
(definite versus none) for persons in the Upper tertile of body mass index? (3 pts)

18. Rothman has proposed that "public health synergism" is present when an
observed joint effect exceeds that expected under the additive model. Do the odds
ratios in Table 3 indicate the presence of "public health synergism" for the effect of
Heberden's nodes and elevated body mass index on hip osteoarthritis? If not, do
the odds ratios conform to a multiplicative model? Include in your answer a 1-2
sentence assessment of whether these data indicate "public health synergism". (For
this question, ignore the row for "Possible" Heberden's nodes and the column for
the middle tertile of body mass index, and assume that both Heberden’s nodes and
elevated BMI reflect causal risk factors for hip osteoarthritis. Note: do not
necessarily rely on the authors' description of this table.) (6 pts)

19. The authors investigated the association of specific sporting activities with risk
of hip osteoarthritis. Their data are presented in Table 5. Using their data, compute
separately the unadjusted (crude) odds ratio for osteoarthritis associated with playing
golf and with swimming in men and women combined. Consider those who do not
participate in any sport as the reference group and assume no missing data. Show
two appropriate 2x2 tables and your calculations. (4 pts)

19a. Compare these unadjusted (crude) odds ratios with the ones presented in
Table 3. Briefly describe and explain the comparison. (3 pts)

19b. Consider the possibility that golfers who have hip osteoarthritis are reluctant
to seek medical attention for their condition for fear it will mean the end of their
ability to play golf. Therefore, cases who golf are less likely to be selected for this
study than cases who do not golf. If the true OR associated with golf is 2.0, then
which of the following best describes the selection bias and its impact on the odds
ratio you computed. (3 pts)

a. non-differential selection bias resulting in an odds ratio biased toward the
null.

b. non-differential selection bias resulting in an odds ratio biased away from
the null.

c. differential selection bias resulting in an odds ratio biased away from the
null.

d. differential selection bias resulting in an odds ratio biased toward the null.

e. none of the above

19c. The authors state that "...the association with swimming may have arisen
because patients with hip osteoarthritis were advised to swim..." (page 521).
Suppose that 25% of the cases had been incorrectly classified as swimmers and
assume that the misclassified cases had not participated in any other sporting
activity, either. Re-compute the odds ratio for the association of hip osteoarthritis
and swimming, after re-classifying these individuals, using the number from the 2x2
table in question 19 above. Briefly discuss how your conclusion about the role of
swimming does (or does not) change. In what direction did misclassification bias
the study OR? (3 pts)

20. The odds ratio (95% confidence interval) estimating the risk of osteoarthritis
associated with a previous hip injury was 24.8 (3.1-199.3) in men and 2.8 (1.4-5.8)
in women (see Table 2).

a. Which estimate indicates a stronger association? (2 pts)

b. Which estimate is more precise? (2 pts)

c. Which estimate is more compatible with a population odds ratio of 4.0? (2 pts)

21. Which one of the statements best interprets the following passage? (3 pts)

"In a previous case-control study (17) of men aged 60-76 years, we observed
a doubling of risk for hip osteoarthritis among those in the highest third of
body mass index distribution, as compared with those in the lowest third,
although the increased risk was not statistically significant." (p519 bottom of
right column)

a. Hip osteoarthritis is not as significant when it occurs in obese older
patients, because it is expected that overweight that lasts for many years will
lead to damage to the joints.

b. A doubling of risk is not significant from a statistical perspective, because
it represents only a moderate association.

c. The doubling of risk was not statistically significant because a p-value was
not computed, so it is not possible for the authors to know whether the
increased risk was due to chance.

d. If 1,000 independent random samples the same size as that study
population were drawn from a population with no increased risk of hip
osteoarthritis, fewer than 950 would have an OR between 0.5 and 2.0.

e. If 1,000 independent random samples the same size as that study
population were drawn from a population with a doubling of risk of hip
osteoarthritis for the highest third of the body mass distribution, as
compared with the lowest third, more than 5% of the samples would display
no elevation in risk.

f. If 1,000 independent random samples the same size as that study
population were drawn from a population with a doubling of risk of hip
osteoarthritis for the highest third of the body mass distribution, as
compared with the lowest third, fewer than 80% would display an
association of that magnitude.

22. A medical journalist, confused by the thrust of this article, comes to you and
says: "I've read this article several times, but I can't figure out what it shows about
the relationship of body mass index, Heberden's nodes, and hip osteoarthritis. The
authors explain that 'two broad mechanisms are believed to underlie the
pathogenesis of osteoarthritis at any joint site: mechanical stress and a generalized
predisposition to the disorder' as indexed by Heberden’s nodes [p519 right column].
That seems straightforward enough, and they later conclude that the analysis
'supports the notion that this condition arises through an interaction between a
generalized predisposition to the disorder and specific mechanical insults to the hip'
[p521]. Yet on page 518 [right column], the authors state that there was 'no
statistically significant interaction' between body mass index and Heberden's nodes,
and on page 519 [left column] they refer to obesity and a tendency to polyarticular
involvement as 'independent risk factors for hip osteoarthritis'. Would you please
assess for me what this article shows about the relationship among body mass
index, Heberden's nodes, and hip osteoarthritis? I have room for 40-60 words.
Thanks!" (6 pts)

23. Write a brief statement for or against a causal relationship between hip injury
and risk of osteoarthritis. Comment specifically on at least two of Bradford Hill’s
criteria for causal inference. Support your conclusion with data or statements from
the article. (4 pts)

University of North Carolina at Chapel Hill

School of Public Health, Department of Epidemiology

Fundamentals of Epidemiology (EPID 168)

Final Examination, Fall 1998 - Answer Guide

1. Briefly list two reasons why a case control study is (or is not) appropriate to examine
individual risk factors for hip osteoarthritis. (2 pts)

Condition rare, faster to complete than cohort study, wide range of exposures of
interest.

2. The authors state that their cases come from a defined population. List four features of
the population or the study design that support this statement or helped the authors to
achieve it? (4 pts)

1. The two health districts had a centralized orthopedic facility for assessment and
treatment of hip osteoarthritis;

2. Local orthopedic surgeons were willing to enter all patients into the study;

3. All men and women 45 years and older who were placed on the waiting list for
primary total hip arthroplasty were considered for the study;

4. The authors included patients who consulted orthopedic surgeons privately.

5. The study excluded patients who lived outside the two districts.

The diverse socioeconomic profile was an advantage for generalizability but does not
make this a defined population.

3. Considering the study population, study design, and other information in the article,
which of the following statements are TRUE and which are FALSE. (2 pts each)

a. In these two health districts, the incidence density of symptomatic hip
osteoarthritis of sufficient severity to warrant hip arthroplasty exceeds 40 per
100,000 person-years.

[TRUE - 726 eligible cases / (1 million population x 1.5 years) = 48.4 per
100,000 person-years]

b. If about 12% of the population was age 65 years or older, then about 12,000
people age 65 years or older in the two districts have radiographic evidence of hip
osteoarthritis.

[TRUE - 10% population prevalence in age 65 years and older * 12% of one
million]

c. The data in Table 1 demonstrate that women are 1.9 times as likely to develop
severe symptomatic hip osteoarthritis as are men.

[FALSE - the data in Table 1 cannot demonstrate this female excess, since there
is no information about the sex ratio in the older population; this ratio may
well reflect a greater incidence of severe symptomatic hip osteoarthritis in
women, but some of the excess presumably derives from greater mortality
among men.]

d. The data in Table 2 indicate that female gender is not a risk factor for hip
osteoarthritis.

[FALSE - controls were matched to cases on gender (and age), so the sex ratio
in the controls must match that in the cases]

e. In this study, matching the control group to the cases on age, as opposed to a
random sample of the general adult population, probably resulted in greater
statistical power and precision.

[TRUE - the mean age of the cases is 70 years old, with the majority older than
60; thus, the use of general population controls without regard to age would
result in relatively little overlap between the age distributions of cases and
controls on this very important variable.]
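
The arithmetic behind answers 3a and 3b can be laid out in a brief Python sketch. Variable names are illustrative; the inputs (726 eligible cases, a population of about one million, 18 months of ascertainment, 12% aged 65 or older, 10% radiographic prevalence) are the figures quoted in the bracketed answers.

```python
eligible_cases = 726
population = 1_000_000
years = 1.5  # 18 months of case ascertainment

# 3a: incidence density per 100,000 person-years
incidence_density = eligible_cases / (population * years) * 100_000

# 3b: expected number aged 65+ with radiographic hip osteoarthritis
share_65_plus = 0.12
prevalence_65_plus = 0.10
prevalent_cases = prevalence_65_plus * share_65_plus * population

print(f"Incidence density: {incidence_density:.1f} per 100,000 person-years")
print(f"Prevalent cases aged 65+: {prevalent_cases:,.0f}")
```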

4. The case identification process was based on a register in each district made up of
persons on a waiting list for a total hip arthroplasty (surgical reformation of the hip joint).
Waiting lists for procedures are common in societies with a national or social medicine
system. In the United States, a region-wide waiting list for a hip arthroplasty is unlikely, as
the availability of this procedure would be more related to insurance status or
ability to afford such a procedure. Explain how using the register system in the United
Kingdom to select cases either increases or decreases the possibility of selection bias as
compared to a study conducted in the United States. (4 pts)

Using the registry may reduce selection bias if affluence or ability to pay for a hip
replacement is associated with exposures like BMI, physical activity, Heberden’s nodes.
Cases selected from surgery lists in the United States system may have a differential
association with a risk factor as compared with cases not receiving this procedure, so
measures of association may be more biased in a U.S. study.

5. How was the diagnosis of hip osteoarthritis made in this study? Was this based on
manifestational or causal criteria? Explain your answer. (3 pts)

(page 517, left column, 2nd paragraph): Diagnosis of hip osteoarthritis in this study
was based on pelvic radiographs. This is based on manifestational criteria.

6. According to the authors: "For each case, a control of the same sex and age was selected
from the list of the same general practice held by the county Family Health Service
Association". State in one sentence the rationale for using a list from general practitioners.
(3 pts)

(page 517, left column, 3rd paragraph): In England and Wales, almost everyone is
registered with a general practitioner so that these lists essentially provide an
enumeration of the general population.

7. Eighty-four percent of the patients listed for total hip arthroplasty fulfilled the criteria
for entry into the study as cases. Which of the following best describes the criteria: (3 pts)

a. age > 45 years, being on the waiting list for hip arthroplasty, and the presence of
Heberden’s nodes.

b. age > 45 years, pain duration at least for 36 months, and presence of Heberden’s
nodes.

c. history of hip fracture within the past year, being on the waiting list for hip
arthroplasty and reside in the study area.

d. presence of Heberden’s nodes, history of hip fracture within the past year, and
reside in the study area.

e. being on the waiting list for hip arthroplasty, reside in the study area, and age > 45
years (answer)

8. The authors report that 89% of the eligible cases agreed to participate and 60% of the
1060 controls approached agreed to participate. Which of the following best states a
condition regarding the non-responders that could lead to an odds ratio reported for the
risk of osteoarthritis associated with previous hip injury that is biased away from the null
(>1). Choose one best answer. (3 pts)

a. control non-responders are more likely to have a history of hip injury compared to
case non-responders. (answer)
b. control non-responders are less likely to have a history of hip injury compared to
case non-responders.

c. being a non-respondent is not related to previous hip injury.

d. none of the above

9. What was accomplished by replacing controls who refused to participate? (Choose one
best answer) (3 pts)

If controls who refused had not been replaced:

a. selection bias would have been greater;

b. the control group would have been less representative of the study base;

c. probability of a Type I error would have been greater;

d. probability of a Type II error would have been greater; (answer)

e. nondifferential misclassification bias would have been greater.

f. it would have been necessary to control for age and sex in the analysis.

Answer: d. Failure to replace controls who refused would have reduced both the
number of controls and of cases (due to the matching), with a loss of statistical power
and increase in the probability of a type II error.

10. The authors selected controls who were individually matched to cases by age, gender,
and family practitioner. Matching in the design stage is usually considered only for those
variables that are known to be confounders. Under which of the following circumstances
could gender be a confounder of the association between a risk factor (obesity) and the
outcome (hip osteoarthritis)? Circle all that apply. (4 pts)

a. the prevalence of obesity and the prevalence of hip osteoarthritis are both higher in
men than in women (true)

b. the prevalence of obesity is lower in men than women, but the prevalence of hip
osteoarthritis is higher in men than women. (true)

c. the prevalence of obesity is higher in men than women, but the prevalence of hip
osteoarthritis is the same in men and women.

d. the prevalence of obesity is the same in men and women, but the prevalence of
hip osteoarthritis is higher in men than women.

11. The odds ratios in Table 2 are "mutually adjusted for the other two variables" by
logistic regression. The following questions concern the models used to estimate the odds
ratios in the table (ignore the fact that it was "conditional" logistic regression and ignore the
middle categories for body mass index and presence of Heberden’s nodes) (2 pts each):

a. How many logistic models were necessary to estimate the odds ratios for body
mass index >28.0, definite Heberden’s nodes, and previous hip injury among
women.

"Mutually adjusted" means that each odds ratio comes from a model that
includes the other two factors, which therefore means that all three factors are
included in the same model. So one model yields an adjusted odds ratio for each
variable. So one model was used.

b. The odds ratio estimate for hip injury in women was 2.8. What must the logistic
coefficient have been?

The OR for a dichotomous or indicator variable is exp(beta), where beta
is the logistic coefficient. Therefore the coefficient was ln(2.8) = 1.0296.

c. From this table, estimate the odds ratio for women who had both
definite Heberden’s nodes and previous hip injury compared to
women who had neither.

The logistic model is based on additivity of the logit or
multiplicativity of the odds. Therefore the odds ratio for the
double exposure is the product of the odds ratios for each of the
risk factors: 1.5*2.8=4.2.
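
The arithmetic in 11b and 11c can be checked with a short sketch. The variable names are illustrative only; the odds ratios 2.8 and 1.5 are the values used in the answers above.

```python
import math

# Adjusted odds ratios for women, as used in the answers above.
or_hip_injury = 2.8
or_heberden_definite = 1.5

# 11b: for an indicator variable, OR = exp(beta), so beta = ln(OR).
beta_hip_injury = math.log(or_hip_injury)

# 11c: logits add, so odds ratios multiply for the joint exposure.
beta_heberden = math.log(or_heberden_definite)
or_joint = math.exp(beta_hip_injury + beta_heberden)

print(f"logistic coefficient for hip injury: {beta_hip_injury:.4f}")
print(f"joint odds ratio (nodes and hip injury): {or_joint:.1f}")
```

Adding the two coefficients and exponentiating gives the same result as multiplying the two odds ratios directly.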

12. In this study, information on medical history, life style, and leisure time
physical activities was obtained through a "structured interviewer-
administered questionnaire". (page 517). It is possible that persons on a
waiting list for a hip arthroplasty would be more keenly aware of hip injuries
they may have had in the past than controls. If true, this is an example of
which of the following? Choose one best answer. (3 pts)

a. differential case ascertainment bias

b. differential misclassification bias (answer)

c. differential selection bias

d. differential precision bias

e. none of the above

13. Among women, the odds of previous hip injury is higher among cases
than controls (Table 2; OR=2.8). As indicated in the footnotes for Table 2, the
odds ratio for previous hip injury is adjusted or controlled for the other two
variables in the Table (body mass index and Heberden’s nodes). Using the
counts shown in Table 2, calculate an unadjusted (crude) odds ratio for
previous hip injury in women. (3 pts)

Unadjusted (crude) odds ratio = __________ 2.9

14. Which of the following conclusions can be made from the above results?
(choose one best answer) (3 pts)

a. the unadjusted (crude) association between hip injury and hip

osteoarthritis in women is completely confounded by body mass
index and Heberden’s nodes.

b. since the unadjusted and adjusted odds ratios are similar, the risk
factor (hip injury) must not be associated with the adjustment
variables (body mass index and Heberden’s nodes)

c. since the unadjusted and adjusted odds ratios are similar, there is
no effect-measure modification of the association between hip injury
and hip osteoarthritis.

d. none of the above (answer)

15. The odds ratios presented in Table 5 are adjusted for previous hip injury.
Why might they still be confounded by hip injury? (3 pts)

There may be residual confounding by type of hip injury or by how long
ago the hip injury occurred, or imperfect recall of hip injury
(non-differential misclassification).

16. In Table 6, is the crude association between previous hip injury and risk
of unilateral hip osteoarthritis biased towards the null or away from the null?
(2 pts)

Towards the null (crude OR = 7.6 vs. adjusted OR = 10.6)

17. Based on the data in Table 3, what is the odds ratio for Heberden's nodes
(definite versus none) for persons in the Upper tertile of body mass index? (3
pts)

OR for Definite Heberden's nodes / none = 3.2 / 1.6 = 2.0

18. Rothman has proposed that "public health synergism" is present when an
observed joint effect exceeds that expected under the additive model. Do the
odds ratios in Table 3 indicate the presence of "public health synergism" for
effect of Heberden's nodes and elevated body mass index on hip
osteoarthritis? If not, do the odds ratios conform to a multiplicative model?
Include in your answer a 1-2 sentence assessment of whether these data
indicate "public health synergism". (For this question, ignore the row for
"Possible" Heberden's nodes and the column for the middle tertile of body
mass index, and assume that both Heberden’s nodes and elevated BMI reflect
causal risk factors for hip osteoarthritis. Note: do not necessarily rely on the
authors' description of this table.) (6 pts)

Odds ratios for hip osteoarthritis, by Heberden's nodes and body mass index

                      Body mass index
Heberden's nodes      Lowest third     Middle third      Highest third

None                  1.0              1.1 (0.7-1.8)*    1.6 (1.0-2.7)
Possible              1.5 (0.8-2.7)    1.5 (0.8-2.6)     2.0 (1.1-3.6)
Definite              1.4 (0.9-2.3)    2.2 (1.4-3.7)     3.2 (1.9-5.4)

* Numbers in parentheses, 95% confidence interval.

Ignoring the intermediate categories for Heberden's nodes and body

mass index gives the following expression for the additive model:

Expected joint excess risk = excess risk for factor 1 + excess risk for factor 2

= excess risk for Heberden's nodes + excess risk for Body mass index

Since hip osteoarthritis of this severity is rare, the following

approximate expressions are appropriate:

Expected excess risk = (OR for Heberden's nodes - 1) + (OR for Body mass
index - 1)

Expected joint excess risk = (1.4 - 1) + (1.6 - 1) = 1.0

Observed joint excess risk = (3.2 - 1) = 2.2

The substantial difference between 2.2 and 1.0 indicates that the odds
ratios in this table do not conform to an additive model for expected
joint effect.

The odds ratios do not conform to a multiplicative model, either:

Expected joint OR = (OR for Heberden's nodes) * (OR for Body mass index )

= 1.4 * 1.6 = 2.24, vs. 3.2 observed

Thus, the relationship is "supramultiplicative", though not greatly so.

Since these odds ratios indicate a joint effect greater than that
expected under an additive model, "public health synergism" is
present, to a moderate degree (we expect a 100% increase in risk but
observe a 220% increase in risk)
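
The comparison of the observed joint effect with the additive and multiplicative expectations can be sketched as follows. The variable names are illustrative; the three odds ratios are taken from the corner cells of Table 3, and the odds ratio is treated as an approximation to the risk ratio, as in the answer above.

```python
# Odds ratios from Table 3, relative to no nodes and lowest BMI third.
or_nodes_only = 1.4   # definite Heberden's nodes, lowest BMI third
or_bmi_only = 1.6     # no nodes, highest BMI third
or_both = 3.2         # definite nodes, highest BMI third

# Additive model: excess risks (OR - 1) add.
expected_excess_additive = (or_nodes_only - 1) + (or_bmi_only - 1)
observed_excess = or_both - 1

# Multiplicative model: odds ratios multiply.
expected_or_multiplicative = or_nodes_only * or_bmi_only

print(f"additive: expected excess {expected_excess_additive:.2f}, "
      f"observed excess {observed_excess:.2f}")
print(f"multiplicative: expected OR {expected_or_multiplicative:.2f}, "
      f"observed OR {or_both:.2f}")
```

The observed joint effect exceeds both expectations, which matches the "supramultiplicative" description above.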

19. The authors investigated the association of specific sporting activities

with risk of hip osteoarthritis. Their data are presented in Table 5. Using
their data, compute separately the unadjusted (crude) risk of osteoarthritis
associated with playing golf and for swimming in men and women combined.
Consider those who do not participate in any sport as the reference group
and assume no missing data. Show two appropriate 2x2 tables and your
calculations. (4 pts)

Golfers      Cases    Controls

YES            51        34
NO            140       162

OR = (51 x 162) / (34 x 140) = 1.7

Swimming     Cases    Controls

YES           156       110
NO            140       162

OR = (156 x 162) / (110 x 140) = 1.6
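
As a check on the two crude odds ratios, here is a minimal sketch using the cross-product formula. The function name `odds_ratio` and its argument names are illustrative; the counts are those in the 2x2 tables above.

```python
def odds_ratio(exposed_cases, exposed_controls,
               unexposed_cases, unexposed_controls):
    """Crude odds ratio from a 2x2 table (cross-product ratio)."""
    return (exposed_cases * unexposed_controls) / (
        exposed_controls * unexposed_cases)

# Reference group: those who do not participate in any sport.
or_golf = odds_ratio(51, 34, 140, 162)
or_swim = odds_ratio(156, 110, 140, 162)

print(f"golf OR: {or_golf:.2f}")
print(f"swimming OR: {or_swim:.2f}")
```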

19a. Compare these unadjusted (crude) odds ratios with the ones presented
in Table 3. Briefly describe and explain the comparison. (3 pts)

The table shows 1.4 and 1.5, respectively. This suggests that BMI, nodes, and
hip injury explain very little of the association of these two sports with
hip osteoarthritis.

19b. Consider the possibility that golfers who have hip osteoarthritis are
reluctant to seek medical attention for their condition for fear it will mean
the end of their ability to play golf. Therefore, cases who golf are less likely to
be selected for this study than cases who do not golf. If the true OR associated
with golf is 2.0, then which of the following best describes the selection bias
and its impact on the odds ratio you computed. (3 pts)

a. non-differential selection bias resulting in an odds ratio biased
toward the null.

b. non-differential selection bias resulting in an odds ratio biased
away from the null.

c. differential selection bias resulting in an odds ratio biased away
from the null.

d. differential selection bias resulting in an odds ratio biased toward the
null. (answer)

e. none of the above

19c. The authors state that "...the association with swimming may have
arisen because patients with hip osteoarthritis were advised to swim..." (page
521). Suppose that 25% of the cases had been incorrectly classified as
swimmers and assume that the misclassified cases had not participated in
any other sporting activity, either. Re-compute the odds ratio for the
association of hip osteoarthritis and swimming, after re-classifying these
individuals, using the number from the 2x2 table in question 19 above.
Briefly discuss how your conclusion about the role of swimming does (or
does not) change. In what direction did misclassification bias the study OR?
(3 pts)

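Swimming     Cases               Controls

YES          156 - 39 = 117      110
NO           140 + 39 = 179      162

(25% of the 156 swimming cases = 39 cases reclassified as non-swimmers)

OR = (117 x 162) / (110 x 179) = 0.96: The misclassification was
differential and biased the odds ratio upward.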
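
The reclassification can be reproduced with a short sketch. The function and variable names are illustrative; the counts and the 25% figure come from the question above.

```python
def odds_ratio(exposed_cases, exposed_controls,
               unexposed_cases, unexposed_controls):
    """Crude odds ratio from a 2x2 table (cross-product ratio)."""
    return (exposed_cases * unexposed_controls) / (
        exposed_controls * unexposed_cases)

swim_cases, swim_controls = 156, 110
no_sport_cases, no_sport_controls = 140, 162

# 25% of the cases recorded as swimmers are moved to the "no sport" row.
moved = round(0.25 * swim_cases)

or_reported = odds_ratio(swim_cases, swim_controls,
                         no_sport_cases, no_sport_controls)
or_corrected = odds_ratio(swim_cases - moved, swim_controls,
                          no_sport_cases + moved, no_sport_controls)

print(f"cases reclassified: {moved}")
print(f"OR as reported: {or_reported:.2f}, after correction: {or_corrected:.2f}")
```

Only cases are reclassified, which is why the misclassification is differential.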

20. The odds ratio (95% confidence interval) estimating the risk of
osteoarthritis associated with a previous hip injury was 24.8 (3.1-199.3) in
men and 2.8 (1.4-5.8) in women (see Table 2).

a. Which estimate indicates a stronger association? (2 pts) 24.8 (3.1-199.3)

b. Which estimate is more precise? (2 pts) 2.8 (1.4-5.8)

c. Which estimate is more compatible with a population odds ratio of
4.0? (2 pts) 2.8 (1.4-5.8)
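
One way to make "precision" and "compatibility" concrete is sketched below. This is an illustration, not the method of the answer guide: it assumes the reported intervals are Wald-type 95% intervals symmetric on the log scale, and the helper names `ci_ratio` and `z_against` are hypothetical.

```python
import math

def ci_ratio(lower, upper):
    """Ratio of upper to lower confidence limit; smaller means more precise."""
    return upper / lower

def z_against(estimate, lower, upper, hypothesized):
    """Distance of the estimate from a hypothesized OR, in standard errors
    on the log scale (assumes a Wald-type 95% interval)."""
    se = (math.log(upper) - math.log(lower)) / (2 * 1.96)
    return (math.log(estimate) - math.log(hypothesized)) / se

men = (24.8, 3.1, 199.3)    # OR, lower limit, upper limit
women = (2.8, 1.4, 5.8)

ratio_men = ci_ratio(men[1], men[2])
ratio_women = ci_ratio(women[1], women[2])
z_men = z_against(*men, hypothesized=4.0)
z_women = z_against(*women, hypothesized=4.0)

print(f"CI ratio: men {ratio_men:.1f}, women {ratio_women:.1f}")
print(f"z versus OR = 4.0: men {z_men:.2f}, women {z_women:.2f}")
```

Under these assumptions the estimate for women has the narrower interval and lies fewer standard errors from 4.0, consistent with answers b and c.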

21. Which one of the statements best interprets the following passage? (3 pts)

"In a previous case-control study (17) of men aged 60-76 years, we

observed a doubling of risk for hip osteoarthritis among those in the
highest third of body mass index distribution, as compared with those
in the lowest third, although the increased risk was not statistically
significant." (p519 bottom of right column)

a. Hip osteoarthritis is not as significant when it occurs in obese older

patients, because it is expected that overweight that lasts for many
years will lead to damage to the joints.

b. A doubling of risk is not significant from a statistical perspective,

because it represents only a moderate association.

c. The doubling of risk was not statistically significant because a p-

value was not computed, so it is not possible for the authors to know
whether the increased risk was due to chance.

d. If 1,000 independent random samples the same size as that study

population were drawn from a population with no increased risk of hip
osteoarthritis, fewer than 950 would have an OR between 0.5 and 2.0.
(answer)

e. If 1,000 independent random samples the same size as that study

population were drawn from a population with a doubling of risk of
hip osteoarthritis for the highest third of the body mass distribution,
as compared with the lowest third, more than 5% of the samples
would display no elevation in risk.

f. If 1,000 independent random samples the same size as that study

population were drawn from a population with a doubling of risk of
hip osteoarthritis for the highest third of the body mass distribution,
as compared with the lowest third, fewer than 80% would display an
association of that magnitude.

Answer: d. "Statistically significant", as conventionally used, means that

in the absence of any true association a model based on chance would
yield an association as strong or stronger than the observed value less
than 5% of the time.

22. A medical journalist, confused by the thrust of this article, comes to you
and says: "I've read this article several times, but I can't figure out what it
shows about the relationship of body mass index, Heberden's nodes, and hip
osteoarthritis. The authors explain that 'two broad mechanisms are believed
to underlie the pathogenesis of osteoarthritis at any joint site: mechanical
stress and a generalized predisposition to the disorder' as indexed by
Heberden’s nodes [p519 right column]. That seems straightforward enough,
and they later conclude that the analysis 'supports the notion that this
condition arises through an interaction between a generalized predisposition
to the disorder and specific mechanical insults to the hip' [p521]. Yet on page
518 [right column], the authors state that there was 'no statistically
significant interaction' between body mass index and Heberden's nodes, and
on page 519 [left column] they refer to obesity and a tendency to
polyarticular involvement as 'independent risk factors for hip osteoarthritis'.
Would you please assess for me what this article shows about the
relationship among body mass index, Heberden's nodes, and hip
osteoarthritis? I have room for 40-60 words. Thanks!" (6 pts)

Points to include:

1. Both body mass index and presence of Heberden's nodes were

associated with greater risk of hip osteoarthritis, even when the other is
absent.

2. People with both elevated BMI and Heberden's nodes have a greater
risk for hip osteoarthritis than people with only one of these risk factors
and even greater than would be expected from adding or multiplying
their individual effects (i.e., greater than expected by either additive or
multiplicative models).

3. The authors seem to believe and the study does not show otherwise
that most cases of hip osteoarthritis in their study result from a
combination of mechanical stress (which could be something other than
obesity) and biologic predisposition (which might not yet have
manifested in other joints).

4. The paper presents no biological theory or other information

suggesting a mechanistic interaction between obesity and osteoarthritis
at other sites in regard to hip osteoarthritis, but rather discusses a
possible etiologic role for each individually.

Grading: 6 points for 3 of these, 5 points for two of them, 3 points for
one. If none was mentioned then 1-2 points awarded depending upon
the relevance and accuracy of what was written.

23. Write a brief statement for or against a causal relationship between hip
injury and risk of osteoarthritis. Comment specifically on at least two of
Bradford Hill’s criteria for causal inference. Support your conclusion with
data or statements from the article. (4 pts)

(You're on your own here!)

University of North Carolina at Chapel Hill

School of Public Health
Department of Epidemiology

Fundamentals of Epidemiology (EPID 168)

Midterm Examination, Fall 1997


NOTE: This exam is illustrative only. It proved somewhat on the easy side, and a number
of the questions were problematic.

1. Match the term from column A with the most appropriate topic or
concept from column B (use each term only once and each topic only
once). (1 pt each = 12 pts)

Column A - Terms Column B - Topics

____ cumulative incidence 1. Case-control studies

____ incidence density 2. Causal inference

____ prevalence 3. Confounds cross-sectional data

____ dose response 4. Death certificate

____ induction period 5. Descriptive epidemiology

____ odds ratio 6. Diagnostic tests

____ preventive fraction in the exposed 7. Estimates risk

____ underlying cause of death 8. Measures impact

____ positive predictive value 9. Natural history of disease

____ detectable, pre-clinical phase 10. Population screening

____ migrant studies 11. Proportion

____ cohort effect 12. Relative rate

2. Which of the following best describes the basis of the diagnosis of

myocardial infarction? (Choose one best answer) (4 pts)

____ a. manifestational criteria

____ b. Bradford criteria

____ c. causal criteria

____ d. etiologic criteria

3. In the Minnesota Heart Health Program (as described in class) and many
other community intervention studies, the effectiveness of an
educational intervention program is evaluated. Which of the following
selections best describes the unit of assignment, the unit of
observation, and the unit of analysis (in this order) in studies of
these types? (Choose one best answer) (4 pts)

____ a. community, person, community

____ b. person, community, community

____ c. community, community, community

____ d. none of the above

-2- ID Number - __

4. In a hypothetical clinical trial, a new drug was compared with

"standard therapy" treatment. The endpoint was myocardial infarction.
Which of the following best describes the primary reason to randomize
patients to treatments? (Choose one best answer) (4 pts)

____ a. to create two treatment groups that are similar at baseline on

both known and unknown factors associated with myocardial
infarction.

____ b. prevent bias introduced when the patients know what type of
treatment they are receiving

____ c. prevent bias introduced when the investigators know what type of
treatment the patients are receiving

____ d. b and c

5. Indicate TRUE or FALSE next to each of the following statements.

(2 pts each)

____ a. The indirect method of age standardization applies stratum-

specific rates from an external population to the age distribution
of the study population.

____ b. A standardized mortality ratio is an example of a stratum-specific

crude rate.

____ c. Standardized mortality ratios are preferred for making comparisons

among multiple populations.

____ d. Direct age standardization can be characterized as applying the

same set of weights to the age-specific rates of populations to be
compared.

6. 200 women with a history of chest pain were assessed by an exercise

tolerance test (ETT). Compared with coronary angiography (the "gold
standard"), ETT had a sensitivity of 68% for detecting coronary artery
disease, with specificity 61%. The predictive value of a negative ETT
was higher in younger women (less than 52 years old) and in women with
no more than one risk factor (i.e., family history, hypertension, high
cholesterol, smoking, or diabetes). If sensitivity and specificity do
not vary by age or risk factor status, why is the higher negative
predictive value expected? (3 pts)

_______________________________________________________________
_______________________________________________________________

_______________________________________________________________

_______________________________________________________________

7. A randomized trial studied 242 HIV-seropositive, 2nd-trimester

pregnant women to assess the efficacy of zidovudine (AZT) in
preventing perinatal HIV transmission. Results were:

Results from a randomized trial of the efficacy of
zidovudine in preventing perinatal HIV transmission
___________________________________________________________________

                              Zidovudine    Placebo     All

Births (no.)                      121         121       242

Infection status of infant
   Non-infected                   112          90       202
   HIV-infected                     9          31        40

Transmission rate (%)             7.4        25.6      16.5
___________________________________________________________________

7A. Which one answer best describes the transmission rate in the table?
(4 pts)

____ a. proportion

____ b. relative rate

____ c. absolute rate

____ d. odds

7B. Using the data in the table, estimate the relative risk of HIV
infection for infants whose mothers took zidovudine relative to
infants of mothers who took placebo. Show formula and calculations.
(4 pts)

_______________________________________________________________


7C. Based on the data in the above table, estimate the proportion of
potential cases of perinatal HIV transmission that could be prevented
by providing zidovudine to HIV-positive, 2nd trimester pregnant women
who would otherwise not receive the drug. (Assume all women take the
medication and consider only singleton births.) Show formula or
diagram and calculations. (4 pts)

7D. Zidovudine is now routinely offered in association with all

pregnancies to known HIV-seropositive mothers in the United States.
However, growth of resistant strains will reduce the drug's
effectiveness in preventing perinatal HIV transmission. Observational
studies for assessing zidovudine's effectiveness have serious
methodologic problems, but which of the following case-control designs
would be the most nearly valid? (Choose one best answer.) (4 pts)

____ a. Cases are HIV-infected infants; controls are uninfected infants.

____ b. Cases are HIV-infected infants; controls are uninfected infants of

HIV-seropositive mothers.

____ c. Cases are HIV-infected infants; controls are infants whose mothers
should have received zidovudine but did not.

____ d. Cases are HIV-infected infants whose mothers received zidovudine;

controls are uninfected infants whose mothers received zidovudine.

8. The following is background information for questions 8A-8E.

Objective: To determine the prevalence of sexually transmitted

diseases (STD) and high risk sexual behavior for STD among adolescent
males admitted to a juvenile detention facility.

Methods: Data were obtained from interview, exam, and lab tests.

Results:

Table 1. Behavioral variables in 966 subjects
___________________________________________________________________

Variable                       Mean (SD)      Range     Median

Age at first coitus            12.3 (2.0)     5-17        13
No. lifetime partners          13.7 (16.8)    1-100        8
No. partners past 4 months      2.9 (3.4)     0-30         2
No. weeks since last sex        5.8 (15.1)    1-260        2
___________________________________________________________________

SD = standard deviation

8A. Which of the descriptive statistics in Table 1 (mean, SD, range,

median) is most susceptible to being influenced by a single extreme
value? (Choose one best answer.) (4 pts)

a. mean
b. SD

c. range

d. median

8B. Of the four variables in Table 1, which has the most symmetrical
(normal-like) distribution? (Choose one best answer.) (4 pts)

a. age at first coitus

b. number of lifetime partners

c. number of partners in the past 4 months

d. number of weeks since last sex

Table 2. Sexually transmitted diseases in adolescent males
admitted to a juvenile detention facility.
______________________________________________________

Disease                No. positive / tested

Syphilis                      7/930
Gonorrhea                    42/940
Chlamydia                    66/957
Any of the above            109/908
______________________________________________________

8C. Based on the above data and assuming that the two diseases have
the same average duration, how do their incidence rates compare in
this population? (Choose the one correct answer.) (3 pts)

a. Incidence of gonorrhea is lower than that of chlamydia.

b. Incidence of gonorrhea is the same as that of chlamydia.

c. Incidence of gonorrhea is higher than that of chlamydia.


8D. Based on the above data but this time assuming that the two diseases
have the same incidence, how do their average durations compare in
this population? (Choose the one correct answer.) (2 pts)

a. Duration of gonorrhea is shorter than that of chlamydia.

b. Duration of gonorrhea is longer than that of chlamydia.

8E. Elaborate on your answer to the preceding question by deriving an
estimate of the duration of gonorrhea relative to that of chlamydia.
Show the basis for your answer. (3 pts)

_______________________________________________________________

9. The following is background information for questions 9A-9D.

In a large urban school district, among 8,000 middle-school
youth who were well at the beginning of the school year, 400 were
absent for 10 days or longer due to acute asthma ("AA-10") during the
first nine-week quarter. Based on a survey believed accurate for the
period, 15% of middle-school youth in the county middle schools smoke
cigarettes. Interviews with the youth who were absent for 10 days or
longer revealed that 100 of them were cigarette smokers. Assume that
the school enrollment does not change during the quarter.

9A. Show these data in the form of a 2 x 2 table. Include an appropriate

title, labels that identify each row and column, and row and column
totals. (4 pts)

9B. What is the cumulative incidence (CI) of AA-10 (10+ absent days due to
acute asthma), in:

a. the cohort of 8,000 youth? (1 pt)

b. youth who smoke cigarettes? (1 pt)

c. youth who do NOT smoke cigarettes? (1 pt)


9C. What measure would you use to quantify the strength of association
between cigarette smoking and AA-10? Show the formula for this
measure, substitute the appropriate numbers for that formula, compute
the result, and state its meaning in one sentence. (4 pts)

a. Formula

b. Substitution

c. Result

d. Meaning ____________________________________________________

_______________________________________________________________

_______________________________________________________________
9D. Assuming that cigarette smoking is responsible for the observed excess
in AA-10, how many cases of AA-10 during the quarter are attributable
to cigarette smoking? Show a relevant formula or diagram,
intermediate computation, and result, and give a sentence stating the
meaning of the result. (4 pts)

a. Formula or diagram

b. Substitution

c. Result

d. Meaning ____________________________________________________

_______________________________________________________________


10. Suppose that 900 of the subjects in question #8 consent to regular STD
screening following release from detention. Subjects are counseled
about preventive measures and screened every three months for two
years. All cases are treated and cured.

Table 3. Numbers of cases of three sexually transmitted diseases
in adolescent males discharged from a juvenile detention facility
____________________________________________________________________

                              Follow-up time (months)
                           3     6     9    12    15    18    21    24

Syphilis                   0     1     0     3     1     2     3     4
Gonorrhea                 10     8    15    21    11    12    19    24
Chlamydia                 15    23     8    18    17    17    14    11

Dropouts (cumulative)     10    30    50    90   120   140   190   270
Number tested            890   870   850   810   780   760   710   630
____________________________________________________________________

(Subjects can become infected with the same organism more than once
and/or become co-infected with more than one organism.)

10A. What is the prevalence of chlamydia at the 12 month follow-up? (3 pts)

10B. What is the average incidence density (per 100 person months or per
100 person years) of chlamydia for the two years of follow up? Assume
that: dropouts contribute no time to follow up after the last time
they are tested; subjects remain at risk even while infected. (3 pts)

10C. Give two reasons for preferring incidence density over cumulative
incidence for assessing frequency of infection in this cohort. (6 pts)

i. ___________________________________________________________

_______________________________________________________________

ii. ___________________________________________________________

_______________________________________________________________

11. A study of alcoholism and major depressive disorder recruited 100

consecutive patients in a Veterans Administration hospital in Urbana,
Illinois. All patients had been diagnosed as being alcohol abusers.
An equal number of non-abusers were selected randomly from the same VA
hospital. 76 of the participants identified as being abusers
fulfilled criteria for major depression, as did 20 of the non-abusers.
Evaluate the evidence provided by this study for the inference that
alcohol abuse causes depression in relation to the following aspects:

11A. What is an inherent weakness in this design that makes it susceptible

to obtaining inaccurate data? (3 pts)

_______________________________________________________________

11B. Many of the criteria for causal inference pertain to the evaluation of
evidence from multiple studies, but several can also apply to a single
study. Name two (2) such criteria and use them to evaluate
(quantitatively where possible) the evidence from the above study.
(6 pts)

i. ___________________________________________________________

_______________________________________________________________

ii. ___________________________________________________________

_______________________________________________________________
_______________________________________________________________

_______________________________________________________________

University of North Carolina School of Public Health

Department of Epidemiology

EPID 168 - Fundamentals of Epidemiology

Copyright, 1997, Victor Schoenbach and Wayne Rosamond


Note: The scores on this examination were on the high side, and some of the questions on
this exam were problematic.

MIDTERM EXAMINATION, Fall 1997 -- Answer Guide

1. Matching (1 pt each):
Column A - Terms Column B - Topics
7 cumulative incidence (11 is ok) 1. Case-control studies
12 incidence density 2. Causal inference
11 prevalence (7 is ok) 3. Confounds cross-sectional data
2 dose response 4. Death certificate
9 induction period 5. Descriptive epidemiology
1 odds ratio 6. Diagnostic tests
8 preventive fraction in the exposed 7. Estimates risk
4 underlying cause of death 8. Measures impact
6 positive predictive value 9. Natural history of disease
10 detectable, pre-clinical phase 10. Population screening
5 migrant studies 11. Proportion
3 cohort effect 12. Relative rate
(Credit was also given for some other pairings.)

2. Diagnosis of myocardial infarction is based on manifestational
criteria. (4 pts)

3. a. community, person, community (units of assignment, observation,
analysis, respectively, in the Minnesota Heart Health Program). (4 pts)

4. a. to create two treatment groups that are similar at baseline on both
known and unknown factors associated with myocardial infarction. (4 pts)

5. Age standardization, True or False (2 pts each):

T a. The indirect method of age standardization applies the
stratum-specific rates from an external population to the
age distribution of the study population.

F b. A standardized mortality ratio is an example of a stratum specific

crude rate.

F c. Standardized mortality ratios are useful when the number of events

is small and multiple comparisons among populations are to be
made.

T d. Direct age standardization can be characterized as applying the

same set of weights to the age-specific rates between populations
to be compared.

6. Predictive value depends both on specificity and on prevalence. For a

given specificity, higher prevalence means higher positive predictive
value, lower prevalence means higher negative predictive value.
Prevalence of coronary artery disease is lower in women who are
younger and have few risk factors, so negative predictive value is
higher in this group. (3 pts)
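
The dependence of predictive value on prevalence follows from Bayes' rule. Below is a minimal Python sketch; the 80% sensitivity and 80% specificity and the two prevalence values are hypothetical illustration values, not figures from the exam.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) for a test applied at the given disease prevalence."""
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    true_neg = specificity * (1 - prevalence)
    false_neg = (1 - sensitivity) * prevalence
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv

# Hypothetical test characteristics held fixed while prevalence varies.
low = predictive_values(0.80, 0.80, prevalence=0.05)
high = predictive_values(0.80, 0.80, prevalence=0.50)

print(f"prevalence  5%: PPV = {low[0]:.2f}, NPV = {low[1]:.2f}")
print(f"prevalence 50%: PPV = {high[0]:.2f}, NPV = {high[1]:.2f}")
```

With the test characteristics fixed, the lower-prevalence group has the lower positive predictive value and the higher negative predictive value.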


7A. a. proportion -- The "transmission rate" is the number of HIV-infected

infants divided by the total number of births in that group. The
proportion estimates the prevalence of HIV infection in these infants.
The proportion also estimates cumulative incidence of HIV-infected
babies among 2nd trimester, HIV-infected pregnant women. Cumulative
incidence measures for birth outcomes are a complex matter, because of
the great opportunity for selection bias due to impaired fecundity and
fertility, and unrecognized pregnancy loss. In this case, however,
the exposure occurs after the pregnancy has been recognized. (4 pts)

7B. Relative risk of HIV infection for zidovudine vs. placebo:

Relative risk (RR) = CI1 / CI0 = 7.4% / 25.6% = 0.29

The transmission rates serve as estimates of CI1 and CI0 (the

incidences can be estimated from the transmission rates even if the
former are regarded as prevalences, since there is a restricted risk
period and duration is not a factor). (4 pts)
7C. Proportion of potential cases of perinatal HIV transmission that could
be prevented by zidovudine, i.e., the preventive fraction in the
exposed, PF1 (all women take zidovudine, so all are exposed) (4 pts):

PF1 = 1 - RR = 1 - 0.29 = 0.71 or 71%

By diagram:

H _ _ _ _ _ _ _ _ _ _ _ _ 25.6% transmission rate in women

I | who do not take zidovudine (based on
V | ^ the placebo group)
| |
T | | Amount of the transmission rate that
r | | is prevented by zidovudine
a | v
n |_______________________ 7.4% transmission rate in women
m | who took zidovudine
i |
s |_______________________ 0
.

(25.6% - 7.4%) / 25.6% = 1 - 7.4% / 25.6% = 0.71 (= 1 - RR)
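
As a quick check of 7B and 7C, the same arithmetic in Python, using the two transmission rates quoted above (the variable names are chosen here for illustration):

```python
ci_zidovudine = 0.074  # transmission rate in the zidovudine group (CI1)
ci_placebo = 0.256     # transmission rate in the placebo group (CI0)

relative_risk = ci_zidovudine / ci_placebo
preventive_fraction = 1 - relative_risk

# Equivalent form: the prevented share of the placebo-group rate.
pf_from_difference = (ci_placebo - ci_zidovudine) / ci_placebo

print(f"RR  = {relative_risk:.2f}")
print(f"PF1 = {preventive_fraction:.2f}")
```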

7D. b. Cases are HIV-infected infants; controls are uninfected infants of

HIV-seropositive mothers. Using all uninfected infants as controls
would make zidovudine appear to be a risk factor for HIV transmission,
since most mothers do not have HIV so their infants will be
uninfected. Choices c. and d. choose the control and/or case group
partly on the basis of exposure, which completely undermines a case-
control design. (4 pts)

8A. c. Range -- the range is in fact completely determined by the highest

and lowest values. (4 pts)

8B. a. Age at first coitus -- its mean and median are close together
and not very far from the middle of the range. Although the mean and
median are also close together for the number of partners in the past
4 months, they are nowhere near the middle of the range. (4 pts)


8C. a. Incidence of gonorrhea is lower than that of chlamydia -- if

duration is the same for both diseases, the prevalence odds are
proportional to the incidence density, so gonorrhea's smaller
prevalence (42/940 vs. 66/957) implies a lower incidence. (3 pts)
8D. a. Duration of gonorrhea is shorter than that of chlamydia -- if
incidence rates are the same, chlamydia must last longer in order for
its prevalence to be higher. (2 pts)

8E. (3 pts) Prevalence odds = duration x incidence density. Therefore:

   prevalence odds (gonorrhea)     duration(G) x incidence density(G)
   ---------------------------  =  ----------------------------------
   prevalence odds (chlamydia)     duration(C) x incidence density(C)

Since both diseases have the same incidence, the ratio of their
durations equals the ratio of their prevalence odds:

   prev. odds for gonorrhea     42 / 898     0.0468
   ------------------------  =  --------  =  ------  =  0.63
   prev. odds for chlamydia     66 / 891     0.0741

(Credit was also given for "prevalence = incidence x duration", though
this is true only approximately.)
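
The calculation in 8E can be reproduced with a short Python sketch that uses the counts from 8C (42 of 940 for gonorrhea, 66 of 957 for chlamydia). The helper name is an assumption of this sketch. The ratio of simple prevalences is included to show how close the approximate relation comes.

```python
def prevalence_odds(cases, tested):
    """Odds of being a prevalent case: cases / non-cases."""
    return cases / (tested - cases)

odds_gonorrhea = prevalence_odds(42, 940)  # 42 / 898
odds_chlamydia = prevalence_odds(66, 957)  # 66 / 891

# With equal incidence densities, this ratio estimates duration(G) / duration(C).
duration_ratio = odds_gonorrhea / odds_chlamydia

# Approximation based on "prevalence = incidence x duration".
prevalence_ratio = (42 / 940) / (66 / 957)

print(f"ratio of prevalence odds = {duration_ratio:.2f}")
print(f"ratio of prevalences     = {prevalence_ratio:.2f}")
```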

9A. School absence from acute asthma and cigarette smoking (4 pts):

        School absence due to acute asthma in middle school
                    by cigarette smoking status

                          Smokers   Nonsmokers    Total
                          -------   ----------    -----
   AA-10*                    100        300         400
   Absent fewer than       1,100      6,500       7,600
     10 days
                          -------   ----------    -----
   Total                   1,200**    6,800       8,000

 * AA-10 refers to absence of 10+ days due to acute asthma.
** Based on 15% smoking prevalence

9B. Cumulative incidence of AA-10:

a. Crude CI = 400 / 8,000 = 50 per 1,000 or 5%

b. CI in smokers = 100 / 1,200 = 83 per 1,000 or 8.3%

c. CI in nonsmokers = 300 / 6,800 = 44 per 1,000 or 4.4%

9C. Strength of association (4 pts):

CI in smokers 8.3%
Cumulative incidence ratio = ----------------- = ------ = 1.89
CI in nonsmokers 4.4%

d. The cumulative incidence ratio (CIR) of 1.9 indicates a moderate

association between cigarettes and extended school absence.

9D. Number of cases of excessive absence due to acute asthma (AA-10) that
(assuming causation) are attributable to smoking.

This question asks for the size of the shaded box in the diagram in
the "evolving text". That diagram, with numbers instead of variables
is:
          |
     8.3% |              _______________      8.3% = incidence
          |             |XXXXXXXXXXXXXXX|       in exposed persons
Incidence |             | 3.9% x 1,200  |
          |             |     = 47      |     3.9% = "attributable risk"
     4.4% |_____________|XXXXXXXXXXXXXXX|
          |             |\\\\\\\\\\\\\\\|
          |     300     | 4.4% x 1,200  |     4.4% = incidence
        0 |_____________|\\\\ = 53 \\\\\|       in unexposed persons
               6,800       1,200 (15%)
            Nonsmokers      Smokers

So the number of cases attributable is 47 (after rounding). This

number can be obtained in various ways:

Number of cases in smokers - "expected" cases in smokers

100 - 1,200 x 4.4%

Attributable risk x Number of smokers

(I1 - I0) x 1,200
(8.3% - 4.4%) x 1,200

Number of cases in smokers x Attributable risk proportion (ARP)

100 x (1.89 - 1 ) / 1.89

Overall number of cases x Pop. attributable risk proportion (PARP)

400 x (I - I0) / I
400 x (5% - 4.4%) / 5%
400 x 12%

All these methods come up with approximately the same answer, the
differences being due to the rounding of intermediate results in
obtaining some of the incidences and the CIR. When the numbers
from the table are used and intermediate results not rounded, the
number of cases attributable to smoking is 47.0588.

Assuming causation, cigarette smoking is responsible for heavy absence

(10 days or more during the fall quarter) due to acute asthma in about
47 middle schoolers in the district, or 12% of all students with heavy
absence due to acute asthma.
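
The four routes to the attributable number of cases listed in 9D can be checked against each other with unrounded intermediate values. This is a small Python sketch using the counts from the 9A table; the variable names are chosen here for illustration.

```python
cases_smokers, n_smokers = 100, 1200
cases_nonsmokers, n_nonsmokers = 300, 6800
all_cases = cases_smokers + cases_nonsmokers

i1 = cases_smokers / n_smokers              # incidence in smokers
i0 = cases_nonsmokers / n_nonsmokers        # incidence in nonsmokers
i_all = all_cases / (n_smokers + n_nonsmokers)  # crude incidence
cir = i1 / i0                               # cumulative incidence ratio

# 1. Observed cases in smokers minus "expected" cases in smokers
observed_minus_expected = cases_smokers - n_smokers * i0
# 2. Attributable risk x number of smokers
ar_times_exposed = (i1 - i0) * n_smokers
# 3. Cases in smokers x attributable risk proportion (ARP)
cases_times_arp = cases_smokers * (cir - 1) / cir
# 4. All cases x population attributable risk proportion (PARP)
all_cases_times_parp = all_cases * (i_all - i0) / i_all

for value in (observed_minus_expected, ar_times_exposed,
              cases_times_arp, all_cases_times_parp):
    print(f"{value:.4f}")
```

Without intermediate rounding, all four expressions give the same number.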

10A Prevalence of chlamydia at the 12-month follow-up (3 pts):

Cases 18 cases found at 12-month follow-up

Prevalence = ----- = -------------------------------------- = 2.2%
PAR 810 youth tested at 12-month follow-up


10B Average incidence density of chlamydia (average simply means one

number that applies to the entire two-year interval, rather than one
rate for each three-month interval - if you compute the latter rates,
however, and take the average, you should obtain the same result as
the overall incidence density) (3 pts):

                       (Total) cases
Incidence density = ---------------------
                     (Total) person-time

(15 + 23 + 8 + 18 + 17 + 17 + 14 + 11) cases

= ------------------------------------------------------------
(890 + 870 + 850 + 820 + 780 + 760 + 710 + 630) x 3 months

123 cases
= ------------------ = 0.65/100 person-months = 7.8/100 person-yrs
18,930 person-months
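
The overall incidence density in 10B can be recomputed from the interval counts shown above. A minimal Python sketch, assuming (as the answer does) that each person at risk contributes the full 3 months of an interval:

```python
cases_per_interval = [15, 23, 8, 18, 17, 17, 14, 11]
at_risk_per_interval = [890, 870, 850, 820, 780, 760, 710, 630]
months_per_interval = 3

total_cases = sum(cases_per_interval)
person_months = sum(at_risk_per_interval) * months_per_interval

rate_per_100_pm = 100 * total_cases / person_months  # per 100 person-months
rate_per_100_py = rate_per_100_pm * 12               # per 100 person-years

print(f"{total_cases} cases / {person_months:,} person-months")
print(f"{rate_per_100_pm:.2f} per 100 person-months")
print(f"{rate_per_100_py:.1f} per 100 person-years")
```
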
10C Reasons for preferring incidence density in this case (6 pts):

These diseases have an extended risk period (i.e., one longer than the
period of observation)

People can acquire these diseases more than once

Different lengths of follow-up time per subject

11A. Inherent weaknesses in this design that make it susceptible to

obtaining inaccurate data are the potential for problems of recall,
reporting, and recording in medical records; also, there is
considerable opportunity for alcohol abuse status to influence
diagnosis of depression. (3 pts)

11B. Criteria for causal inference (6 pts)

Strength of association -- in this regard the study provides strong

evidence of causation due to its very high odds ratio
([(76)(80)]/[(20)(24)] = 12.7 -- assuming for this discussion that the
OR is not biased by design problems)

Temporality (antecedent-consequent) -- there is no indication here

that alcohol abuse preceded major depression, and the reverse seems
just as possible.

Other criteria (e.g., dose-response, biological plausibility,

experiment, analogy, consistency, coherence) either do not apply to a
single study or cannot be evaluated with the information provided.

School of Public Health

Department of Epidemiology

Fundamentals of Epidemiology (EPID 168)

Final Examination, Fall 1997

The following exam questions relate to the article: Freudenheim J et al.

Exposure to breastmilk in infancy and the risk of breast
cancer. Epidemiology 1994;5:324-331. You may refer to this article during the
examination.
NOTE:

o Write all answers on the answer sheets provided.

o You may keep the examination questions.
o Write the last five digits of your student id number in the upper right-
hand corner of each page of your answer sheets.
o This examination is closed book. However, you may use a calculator,
English, foreign language, or medical dictionary.
o When you finish please sign your name on the sign-out sheet under the
pledge:

"I have neither given nor received help from others in completing this examination."

o Good luck and happy holidays.

______________________________________________________________________________________

1. Which of the following best characterizes the present study as presented in

the article (2 pts):

A. Analytic study to investigate the hypothesized relationship in an

available dataset
B. Descriptive study using available data
C. Analytic study of data collected to investigate the hypothesized
relationship
D. A post-hoc analysis of data collected primarily for another study (i.e., of
secondary data)

2. Find an example from the paper for each of the following (give the page
number and quote enough of the words to identify the point or passage; the
same point or phrase cannot be used more than once) (2 pts each)

a. A finding from a migrant study or studies;

b. A finding from descriptive epidemiology;
c. An association from an ecologic study.

3. Several previous studies of exposure to breastmilk and risk of breast cancer in

adulthood reported little association in crude analyses (p. 324). The authors
suggest that the absence of an association could have resulted from a failure
to adjust for age. Which of the following best explains why failure to adjust for
age could have obscured an underlying true association. (Choose one best
answer.) (2 pts)

A. Age is causally related to breast cancer risk and an infant’s age is

related to her exposure to breastmilk.
B. Age is causally related to breast cancer risk and infant feeding practices
have changed over time.
C. Age is causally related to breast cancer risk but not associated with
breast feeding practices.
D. Age is causally related to breast cancer risk but is causally related to
breast feeding practices.

4. The authors describe their study as a case-control study of dietary and

reproductive factors for breast cancer (p. 324). Which of the following best
describes the type of situation for which case-control studies are most
advantageous compared to other designs. (choose one best answer). (2 pts)

A. rare exposure, common endemic disease.

B. rare exposure, rare endemic disease.
C. common exposure, common endemic disease
D. common exposure, rare endemic disease

5. The authors used the term "cohort effects" in regard to results from
previously reported studies. Which of the following best describes what is
meant by cohort effects in this context? (choose one best answer). (2 pts)

A. Breast cancer cases are heterogeneous with respect to known factors.

B. Secular changes in infant feeding practices result in an association
between age and exposure to breastmilk.
C. Breast cancer and control subjects come from nonoverlapping birth
cohorts.
D. Recall accuracy of breastmilk exposure may differ by birth cohort.

6. Cases in this study were incident cases of confirmed cancer of the breast (p.
325). Which of the following best describes the advantage of selecting incident
cases over prevalent cases (choose one best answer) (2 pts)

A. selecting from a pool of prevalent cases would make separation of

factors associated with risk and those with survival more difficult.
B. selecting from a pool of prevalent cases would make exposure
assessment more difficult because of pre-existing disease status.
C. selecting from a pool of incident cases creates a more homogenous case
group with regard to unknown confounding factors.
D. selecting from a pool of incident cases reduces misclassification bias.

7. The authors characterize this study as a case-control study of primary and

histologically confirmed cancer of the breast in women. For each of the two
key terms in this phrase, briefly explain its meaning and significance for the
study: (2 pts each)

a. primary
b. histologically-confirmed

8. In this study, controls were selected by a random process from residents of

the two counties and were frequency age matched to cases (p. 325). Which of
the following best describes a reason for preferring community controls over
hospital-based controls for this study? (choose one best answer). (2 pts)

A. the random selection of controls from the community usually produces

groups of cases and controls that are similar in known and unknown
confounding variables.
B. the random selection of controls from the community provides a better
estimate of breastmilk exposure among the source population.
C. the random selection of controls from the community ensures that the
subsequent odds ratio is not an overestimation of the association of
breast feeding and adult breast cancer.
D. The random selection of controls from the community reduces the
likelihood of differential misclassification of exposure in cases and
controls.

9. Information on breastmilk exposure was based on subject's self-report (p.

325). If exposure information could also be obtained from an independent
source (such as physician records, or reports from parents), then the
agreement between these two methods could be compared. Which of the
following measures would be most appropriate to quantify the reliability
between the two methods? (choose one best answer). (2 pts)

A. kappa coefficient
B. correlation coefficient of reproducibility
C. intraclass correlation coefficient
D. product-moment correlation
E. A or B
F. A, B, or C

10. In a hypothetical validation study of self-report of being breastfed as an infant,
the presence of a newly discovered antibody that could serve as a "gold
standard" indicator of being breastfed as an infant was compared to
self-report. Testing for the presence of this new antibody is very expensive and was
done only on the 204 cases age 40-50 (see table 1). The following data from
the validation study were compiled. Calculate the (a) sensitivity, (b)
specificity, (c) positive predictive value, and (d) predictive value of a negative
test. Construct an appropriate 2x2 table and show your work. (6 pts)

Data from validation study:

1. the breastfed antibody was found in 73.5% of cases.

2. 80 self-reports were false negative

11. Using the data presented in Table 1, answer the following:

a. For premenopausal women with greater than a high school education,

compute and interpret the odds ratio for having breastfed as an infant
and breast cancer as an adult. (2 pts)
b. Referring to your analysis in part (a), assume now that 20% of controls
who gave a positive history of having been breastfed had not in fact
been breastfed, but that all other data were correct. Compute and
interpret the odds ratio for having breastfed as an infant and breast
cancer as an adult under this assumption. (2 pts)
c. Which of the following best describes the type of misclassification
illustrated in part (b) above. (2 pts)

A. differential misclassification of disease and exposure status

B. differential misclassification of exposure
C. nondifferential misclassification of exposure
D. nondifferential misclassification of disease and exposure status
E. none of the above

12. For each of the following statements, indicate if it is TRUE or FALSE: (1 pt
each)
a. By matching the controls to the cases on age, the authors have ensured
that age will not be a confounder.
b. The procedure for identifying cases is essentially one of active
surveillance.
c. The difference between the proportion of cases interviewed and the
proportion of controls interviewed will cause selection bias.
d. The fact that premenopausal controls who had been breastfed were
somewhat older than controls who had not (page 325, bottom of col. 2)
indicates frequency matching by age did not "work."
e. The absence of an association between age and breast cancer in tables 1
and 2 is likely to be a reflection of selection bias from the low response
rates for cases and controls.
f. In postmenopausal women there appears to be a "dose response"
relationship between body mass index and the association between
having been breastfed.
g. A case-control study design is often the design of choice in outbreak
investigations.
h. For a factor under study to be considered an effect modifier it must be
an independent risk factor for the outcome of interest.

13. A list of control variables for use in the logistic regression models appears on
page 325, middle of column 2. These variables have been chosen because they
(choose one best answer): (2 pts)

A. are likely to be associated with breast cancer risk in the bottle-fed

women.
B. are known or suspected risk factors for breast cancer, or at least
proxies for such factors
C. are likely to be associated with infant feeding history in the controls
D. are likely to be associated with infant feeding history in the cases

14. The presentation of data in Table 2 can be used to examine a number of
relationships. Using these data give a numerical example of each of the
following (show your work and in one sentence explain what the number
means): (2 pts each)

a. An association between breast cancer risk and having zero pregnancies.

Use > 3 pregnancies as a reference.
b. An association between having been breastfed and being over 165 cm
in height. Use <160 cm as a reference.
c. An association between breast cancer and having been breastfed,
overall.

15. On page 326, 2nd column, the authors state "As shown in Table 3, the risk of
breast cancer associated with having been breastfed, was about 0.7 for both
pre- and postmenopausal women." In this context, to which of the following
epidemiologic measures does the term "risk" refer? Choose one best answer.
(2 pts)

A. Cumulative incidence
B. Incidence density
C. Attributable risk
D. Odds ratio

16. Using the data in Table 3, estimate AND state the meaning of the following

measures (for this question you may ignore the possibility of selection bias in
cases and controls):

a. Attributable Risk Proportion (ARP) for NOT having been breastfed for

all breast cancer (both premenopausal and postmenopausal breast
cancer, combined). Note that an ARP is also known as the etiologic
fraction in the exposed. (3 pts)
b. Population Attributable Risk Proportion (PARP) for NOT having been
breastfed for premenopausal and for postmenopausal breast cancer,
separately (i.e., 2 PARP's). Note that the PARP is also known as the
etiologic fraction. (4 pts)
c. Why would you or would you not expect the PARP to be different for
premenopausal breast cancer compared to the PARP for
postmenopausal breast cancer case in this investigation (part b)? (2
pts)

17. In the multiple logistic model referred to as Model 2 in Table 3, what was the
coefficient for the variable not-having-been-breastfed among all breast cancer
cases? (2 pts)

Which of the following assumptions is involved in that model? Indicate True

or False for each assumption. (1 pt each)

a. The odds of breast cancer vary as the product of the odds for age and
the odds for education.
b. The odds of breast cancer vary as the sum of the odds for age and the
odds for education.
c. Age, education, and not having been breastfed were independent of
(i.e., uncorrelated with) each other.
d. Breast cancer is a rare disease.

18. Suppose that cases who refused to participate in this study were less likely to
have been breastfed as infants than those who participated in the study.
Which of the following best describes what this fact would imply for the
observed relative risk associated with being breastfed compared with what would
have been observed had all persons participated in the study? (choose one best
answer). (2 pts)

A. the observed relative risk would be biased away from the null.
B. the observed relative risk would be subject to selection bias and the
direction of the bias can not be estimated.
C. the observed relative risk would be biased toward the null.
D. the observed relative risk would be subject to misclassification bias and
the direction of the bias can not be estimated.

19. In table 3, the confidence intervals for the OR's for all women do not include
the value 1.0, whereas all but one of the OR's for premenopausal breast cancer
and postmenopausal breast cancer do. Mathematically, what does this pattern
reflect? (2 pts)

20. On page 324, 2nd column, the authors offer a possible explanation of why two
previous studies of breastfeeding and breast cancer found little crude
association, observing that the result may have been "confounded by a failure
to adjust for age, because of cohort effects with regard to breastfeeding
frequency". The following stratified analysis has been constructed to illustrate
a situation where cohort effects with regard to breastfeeding completely
obscure a true protective association seen when age is controlled.

                  Age < 60                Age > 60                  Total
            Breastfed  Bottlefed    Breastfed  Bottlefed    Breastfed  Bottlefed

Cases           24        40           256       100           280       140

Controls        79        86           204        54           280       140

OR               0.653                   0.678                    1.0

Based on these hypothetical data:

a. demonstrate that there is a cohort effect for breastfeeding, (2 pts)

b. briefly explain (1-2 sentences referring to specific numbers or calculations

for these tables) how failure to adjust for age interferes with finding a
protective effect of breastfeeding. (2 pts)

21. An epidemiology graduate student finds evidence in the literature that
childhood sunlight exposure may affect adult breast cancer risk. To explore
this hypothesis, she obtains from the authors the place of birth for all of the
subjects in the present study and constructs a sunlight exposure variable
("high" or "low") based on geologic and meteorologic data for the years of the
subject's childhood. Her data show that 56.2% of the 219 premenopausal
women who were not breastfed as infants grew up with "high" sunlight
exposure. Based on this fact and the partially-completed tables below, (a)
calculate the odds ratio of breast cancer with respect to breastmilk exposure
within each of the two sunlight exposure strata, and (b) briefly describe the
relationship of the sunlight exposure variable to the association between
breast cancer and breastmilk exposure (i.e., in relation to confounding and
effect modification). (4 pts)

High sunlight Low sunlight

Cases Controls Total Cases Controls Total

Breastfed 24 67

Bottlefed 81 36

Total 191 284

22. Use the data from Table 2 (Distribution of Characteristics of Postmenopausal
Cases and Controls) to draw separate 2 x 2 tables for women who have had: 0
pregnancies, 1-2 pregnancies, and >=3 pregnancies. (5 pts)

a. Calculate odds ratios for each of these three categories.

b. Assuming no effects of confounding, interpret your findings in part (a).

23. A hypothetical cross-sectional ancillary study to this report was conducted. In
that study a survey of breast cancer annual incidence rates in geographically
distinct areas was completed: Region A in the upper Midwest, where breast
cancer mortality is high, and Region B in the Southeast, where mortality from
breast cancer is low. The following data were obtained.

                                   Region A                          Region B
                         ----------------------------     ----------------------------
Education         Age    No. of  Population  Rate/1,000   No. of  Population  Rate/1,000
                          cases                            cases

< High School    40-50      10      7,000       1.4          10     15,000       0.7
Education        51-60      15     10,000       1.5          20      5,000       4.0
                 61-        30      3,000      10           600     55,000      10.9
                 Total      55     20,000                   630     75,000

>= High School   40-50       5      1,000       5.0           6      2,000       3.0
Education        51-60       5      2,000       2.5          10     15,000       0.7
                 61-         4        500       8.0           4      1,000       4.0
                 Total      14      3,500                    20     18,000

Grand total                 69     23,500                   650     93,000

Crude                                           2.9

Compute the following (for adjusted rates use the direct method and the total
population as a standard):

a. the overall region B crude event rate. (1 pt)

b. Age and educational achievement adjusted rate for Region A: (2 pts)
c. Age and educational achievement adjusted rate for Region B: (2 pts)
d. Compare the overall crude rates with the age and educational
achievement adjusted rates. Briefly explain your findings. (2 pts)

24. Make a brief statement for or against a causal relationship between
breastfeeding in infancy and risk of breast cancer as an adult. Comment
specifically on at least two of Bradford Hill's criteria for causal inference.
Include in your comments data or statements from the article. (5 pts)

25. Assuming that this relationship is causal, why might a similar study, 50 years
from now, fail to find as strong a relationship? (2 pts)

University of North Carolina at Chapel Hill

School of Public Health
Department of Epidemiology

Fundamentals of Epidemiology (EPID 168)

Final Examination, Fall 1997

Answer Guide

1. C. Analytic study of data collected to investigate the

hypothesized relationship

2. a. A finding from a migrant study or studies: "Studies of

migrants provide some evidence; for example, migrants to the
United States from Japan experienced a rate of breast cancer
intermediate between the lower rate in Japan and the higher
rate in the U.S."

b. A finding from descriptive epidemiology: *Many possibilities,

including either of these sentences:
"This finding implies a possible connection between the
trend toward increasing bottlefeeding in the postwar
period and current trends toward increasing incidence of
breast cancer. Furthermore, it offers a partial
explanation of the international variation in breast
cancer rates, with rates considerably lower in less
developed than in developed nations."

c. An association from an ecologic study: *"Micozzi found mean

adult height and breast cancer incidence in 30 countries to be
highly correlated (r=0.8)."

3. B. Age is causally related to breast cancer risk and infant feeding

practices have changed over time.

4. D. Common exposure, rare endemic disease.

5. B. Secular changes in infant feeding practices result in an association

between age and exposure to breastmilk.

6. A. selecting from a pool of prevalent cases would make separation of

factors associated with risk and those with survival more difficult.

7. a. Primary -- Primary breast cancer is a tumor that originates in the

breast, rather than a tumor in the breast that is the result of
metastasis from a tumor that originated in another location or
tissue. In general, tumors originating in the same organ and
tissue are more likely to have similar etiologies than are
tumors that originate in different organs.

b. Histologically-confirmed -- histological confirmation

refers to the verification of the diagnosis (of breast cancer)
through laboratory examination of tumor tissue. Microscopic
examination of tumor cells establishes the existence and type
of tumor with a greater degree of certainty than does a
clinical diagnosis alone. Counting only histologically confirmed
cases reduces the potential for false positive breast cancer
diagnoses and the misclassification bias they would cause.

8. B. The random selection of controls from the community provides a

better estimate of breastmilk exposure among the source population.

9. A. Kappa coefficient

10. Table:
Biomarker validation of women's self-report of having been breastfed

                       Breastfeeding biomarker found

Self-report              Yes         No        Total
--------------------------------------------------------
Breastfed                 70         26          96

Not breastfed             80         28         108
--------------------------------------------------------
Total                    150         54         204

Derivation: 204 cases tested (overall total), 73.5% (=150) have

the marker (so 54=204-150 do not), 80 are false negatives by
self-report (so 80 = "yes" biomarker, "no" self-report), and the
remaining cells and marginals are obtained from these numbers.

a. Sensitivity = 70 / 150 = 47% (Answers the question, "Of women
who truly were breastfed, as demonstrated by the presence of the
biomarker for having been breastfed, what % were correctly
classified by self-report?")

b. Specificity = 28 / 54 = 52% (Answers the question, "Of women

who were not breastfed, as demonstrated by the absence of the
biomarker, what % were correctly classified by self-report?")

c. Positive predictive value (PPV) = 70 / 96 = 73% (Answers the

question, "Of women classified, on the basis of their self-report,
as 'having been breastfed', what % were correctly classified?")

d. Negative predictive value (NPV) = 28 / 108 = 26% (Answers the

question, "Of women classified, on the basis of their self-report,
as 'not having been breastfed', what % were correctly classified?")
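
The four measures in answer 10 can be recomputed directly from the cell counts of the 2x2 table. A minimal Python sketch follows; the short variable names are chosen here for illustration.

```python
# Cell counts from the 2x2 table (self-report vs. biomarker "gold standard").
tp, fp = 70, 26  # self-report "breastfed":     biomarker yes / biomarker no
fn, tn = 80, 28  # self-report "not breastfed": biomarker yes / biomarker no

sensitivity = tp / (tp + fn)  # correct among biomarker-positive women
specificity = tn / (tn + fp)  # correct among biomarker-negative women
ppv = tp / (tp + fp)          # correct among those reporting "breastfed"
npv = tn / (tn + fn)          # correct among those reporting "not breastfed"

for name, value in [("Sensitivity", sensitivity), ("Specificity", specificity),
                    ("PPV", ppv), ("NPV", npv)]:
    print(f"{name}: {value:.0%}")
```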

11. a. Table:
Adult breast cancer by having been breastfed as an infant,
among premenopausal women with education beyond high school

Case Control Total

------------------------
Breastfed 61 93 154

Not breastfed 69 61 130

-------------------------
Total 130 154 284

OR = (61 x 61) / (93 x 69) = 0.58.

Interpretation: having been breastfed appears to be protective

against female adult breast cancer, with a reduction in risk of
approximately 40%.
b. Table:

Adult breast cancer by having been breastfed as an infant,

among premenopausal women with education beyond high school,
assuming that 20% of controls who reported having been
breastfed had in fact not been

Cases Controls Total

-------------------------
Breastfed 61 74 135

Not breastfed 69 80 149

-------------------------
Total 130 154 284

Derivation: 20% of the 93 controls who reported having been

breastfed had not been, so 20% of 93 (=18.6->19) are switched from
"Breastfed" to "Not breastfed", being added to the 61 who reported
not having been breastfed. The remaining 80% of 93 (=74.4->74)
remain in the upper row.

OR = (61 x 80) / (74 x 69) = 1.0, i.e. no association.

c. B. differential misclassification of exposure
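
The effect of the misclassification in 11(b) on the odds ratio can be reproduced with a short Python sketch. The helper function and variable names are assumptions of this sketch; the counts are those from the two tables above.

```python
def odds_ratio(exp_cases, exp_controls, unexp_cases, unexp_controls):
    """Cross-product odds ratio for a 2x2 case-control table."""
    return (exp_cases * unexp_controls) / (exp_controls * unexp_cases)

# Part (a): data as reported.
or_reported = odds_ratio(61, 93, 69, 61)

# Part (b): 20% of the 93 "breastfed" controls are moved to "not breastfed".
moved = round(0.20 * 93)  # 18.6 -> 19
or_corrected = odds_ratio(61, 93 - moved, 69, 61 + moved)

print(f"OR as reported         = {or_reported:.2f}")
print(f"OR after reclassifying = {or_corrected:.2f}")
```

Only the controls are reclassified, which is why the error counts as differential with respect to disease status.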

12. TRUE or FALSE

a. False - matching controls to cases does not prevent the

matching variable (age) from being associated with the exposure
(having been breastfed), so the matching cannot prevent
confounding. (See also d. and e.)

b. True - The nurse telephoned hospitals on a frequent, regular

basis, to identify all breast cancer cases.

c. False - The difference in the proportions interviewed among

cases and among controls provides a great deal of potential for
selection bias, but if nonparticipation was not related to having
been breastfed then selection bias will not occur.

d. False - The matching caused cases and controls to have the same
age distribution, so it did "work"; matching would not be expected
to eliminate an association between age and the exposure, since
exposure status was not known when controls were being selected and
in any case would not have been used in the matching procedure.
e. False - The matching procedure prevented an association.

f. False - The association between body mass index and breast

cancer can be assessed by estimating odds ratios from Table 2. To
avoid confounding by infant feeding history we should preferably
assess the association separately in breastfed women and in women
who have not been breastfed (omitting the complexities from
considering body mass to be an intervening variable in the effect
of infant feeding history). To avoid being misled by a possible
"synergism" involving infant feeding and body mass, ideally we
would look in the "unexposed" group. However, although this study
focuses on breastfeeding, one can also consider "formula feeding"
as an exposure that might be "synergistic" with body mass. So we
can choose either exposure group (or both).

Here are the computations:

From Table 2:
                          Cases                      Controls
                 -------------------------  -------------------------
Body mass        Breastfed  Not breastfed   Breastfed  Not breastfed
index (kg/m sq)  ---------  -------------   ---------  -------------

16-22                48          15             89          19
23-27               103          26            125          16
>27                  90          17             91          16

To show the details, here is a table for estimating OR's for body mass index and breast
cancer:

                    Breastfed        Not breastfed          Total
Body mass        ---------------    ---------------    ---------------
index (kg/m sq)  Cases  Controls    Cases  Controls    Cases  Controls
16-22              48      89         15      19         63     108
23-27             103     125         26      16        129     141
>27                90      91         17      16        107     107

and the resulting OR's are [e.g., (90 * 89) / (48 * 91) = 1.83]:

                     Breastfed    Not breastfed    Total
Body mass            ---------    -------------    -----
index (kg/m sq)
16-22 (ref. level)      1.0            1.0          1.0
23-27                   1.83           2.06         1.57
>27                     1.83           1.34         1.71

The OR's in the total column are shown to illustrate that in this
case there is some confounding by breastfeeding history, at body
mass index level 23-27 kg/m sq. Within either breastfed or not
breastfed group there is no "dose-response" relationship.
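As a rough check, the odds ratios for each body mass index level against the 16-22 reference level can be recomputed from the counts as printed. This is a minimal sketch; the helper name and dictionary layout are illustrative assumptions.

```python
def odds_ratio(cases_level, controls_level, cases_ref, controls_ref):
    """OR for one body mass index level versus the reference level."""
    return (cases_level * controls_ref) / (cases_ref * controls_level)

# (cases, controls) by body mass index level, copied from the table above.
table = {
    "Breastfed":     {"16-22": (48, 89),  "23-27": (103, 125), ">27": (90, 91)},
    "Not breastfed": {"16-22": (15, 19),  "23-27": (26, 16),   ">27": (17, 16)},
    "Total":         {"16-22": (63, 108), "23-27": (129, 141), ">27": (107, 107)},
}

results = {}
for group, levels in table.items():
    ref_cases, ref_controls = levels["16-22"]
    results[group] = {
        level: round(odds_ratio(c, k, ref_cases, ref_controls), 2)
        for level, (c, k) in levels.items()
    }
    print(group, results[group])
```

Run on the counts as printed, this gives about 1.53 (not 1.83) for the breastfed 23-27 cell and 1.35 for the not-breastfed >27 cell, so those printed values may reflect a transcription or rounding difference. The other cells agree with the table.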

g. True - Generally, an outbreak investigation begins
after the outbreak has begun and the investigation seeks to
determine what characteristics of cases might have been responsible
for their disease. If the cases happened to be part of an existing
cohort for which the requisite exposure information was already
available in some form, then a retrospective cohort study would be
another possibility. If cases are still occurring a prospective
cohort study might be initiated, but the better an idea the
investigators have about which exposures to assess, the more they
should intervene to minimize the occurrence of additional cases.

h. False - for a factor to be considered a confounder, it must be
an independent risk factor for the outcome, but this requirement
does not pertain to effect modification. For example, genital
ulcers cannot cause HIV by themselves, but in conjunction with a
sex partner who is HIV infected, genital ulcers can increase
(modify) the risk of HIV infection.

13. Potential confounders are factors that are known or suspected risk
factors for breast cancer or its detection, or at least proxies for
such factors.

14. a. Breast cancer risk and no previous pregnancies

                      Cases   Controls   Total
                     -------------------------------
No pregnancies          50       38        88
>= 3 pregnancies       167      216       383
                     -------------------------------
Total                  217      254       471

OR = (50 x 216) / (38 x 167) = 1.7 (for zero vs. >= 3 pregnancies)

Interpretation: having never been pregnant was associated with an
increased breast cancer rate, with an apparent 70% greater rate
among nulligravidae (women who have never been pregnant).

Other choices of a reference level produce similar results, e.g.,
1-2 pregnancies as the reference level:

OR = (50 x 102) / (38 x 82) = 1.6.

If both groups, 1-2 pregnancies and 3+ pregnancies, are combined
and used as the reference group, then:

OR = (50 x 318) / (38 x 249) = 1.7
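The effect of the choice of reference level can be sketched as follows. The counts are those used in the formulas above; the function and variable names are illustrative assumptions.

```python
def odds_ratio(exp_cases, exp_controls, ref_cases, ref_controls):
    """OR for the index group (no pregnancies) versus a reference group."""
    return (exp_cases * ref_controls) / (exp_controls * ref_cases)

# (cases, controls) by number of pregnancies, from the figures above.
counts = {"0": (50, 38), "1-2": (82, 102), ">=3": (167, 216)}

nulligravid = counts["0"]
# Combine 1-2 and >=3 pregnancies into one reference group: (249, 318).
combined = (counts["1-2"][0] + counts[">=3"][0],
            counts["1-2"][1] + counts[">=3"][1])

or_vs_3plus = odds_ratio(*nulligravid, *counts[">=3"])
or_vs_1to2 = odds_ratio(*nulligravid, *counts["1-2"])
or_vs_any = odds_ratio(*nulligravid, *combined)

for label, value in [(">=3", or_vs_3plus), ("1-2", or_vs_1to2), ("1+", or_vs_any)]:
    print(f"0 vs. {label} pregnancies: OR = {value:.1f}")
```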

b. Height above 165 centimeters and having been breastfed

Height             > 165 cm   < 160 cm   Total
                  -----------------------------------
Breastfed             148        183       331
Not breastfed          41         25        66
                  -----------------------------------
Total                 189        208       397

OR = (148 x 25) / (183 x 41) = 0.49.

Interpretation: Women who were breastfed were less likely
to be over 165 cm. tall.

Other possible OR's --

> 165 vs. 160-165: OR = (148 x 43) / (213 x 41) = 0.73

> 165 vs. all others: OR = (148 x 68) / (396 x 41) = 0.62

c. Breast cancer and having been breastfed (crude)

                   Cases   Controls   Total
                  ----------------------------------
Breastfed           241      305       546
Not breastfed        58       51       109
                  ----------------------------------
Total               299      356       655

OR = (241 x 51) / (305 x 58) = 0.69

Interpretation: having been breastfed was associated with lower
risk of breast cancer
15. D. The statement refers to the (relative) risk of breast cancer
between women who were and were not breastfed, estimated using
the odds ratio.

16. a. Estimate RR for Not breastfed as 1/OR for Breastfed: 1 / 0.69 = 1.45

ARP = (RR - 1) / RR = (1.45 - 1) / 1.45 = 0.45/1.45 = 0.31

Interpretation: Some 31% of breast cancer in women who were not
breastfed was attributable to their having not been breastfed.

b. If you know the formula (or can derive it from the diagram and the
"grand synthesis"):

PARP = P(E|D) x (RR - 1) / RR, and since breast cancer is rare, use OR.

Premenopausal:  P(E|D) = 117 / (117 + 112) = 0.51
                PARP = (0.51) (1.47 - 1) / 1.47 = (0.51) (0.47) / 1.47 = 0.16

AND

Postmenopausal: P(E|D) = 58 / (58 + 241) = 0.19
                PARP = (0.19) (1.45 - 1) / 1.45 = (0.19) (0.45) / 1.45 = 0.06

Meaning: Some 16% of all premenopausal breast cancer and some 6% of
all postmenopausal breast cancer were attributable to the women's not
having been breastfed.

OR, reason as follows:

1. The proportion of exposed (Not breastfed) cases that is attributable
to not having been breastfed is:
ARP = (RR-1)/RR
Since breast cancer is rare, we can estimate with
(OR-1)/OR = (1.47-1) / 1.47 = 0.3197 for premenopausal.

2. However, this proportion applies only to cases who are exposed
(because ARP is "proportion of exposed cases . . ."). So estimate
proportion of all cases who are exposed:
= Pr(Exposed|Case) = 117 / (117+112) = 0.51 for premenopausal

Multiplying 1. by 2., 0.51 x 0.3197 = 16% for premenopausal

c. The PARP for premenopausal breast cancer is expected to be
greater due to the secular decrease in breastfeeding during the
decades when these women were infants. Thus, the proportion
exposed to not having been breastfed is substantially greater for
the premenopausal breast cancer cases. Hence, their PARP is
greater.
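The ARP and PARP arithmetic in answer 16 can be sketched as below. "Exposed" means not breastfed, and the odds ratios (1.47 and 1.45) stand in for relative risks under the rare-disease assumption stated in the answer; function names are illustrative.

```python
def arp(rr):
    """Attributable risk proportion among the exposed: (RR - 1) / RR."""
    return (rr - 1) / rr

def parp(exposed_cases, unexposed_cases, rr):
    """Population attributable risk proportion: P(E|D) * (RR - 1) / RR."""
    p_exposed_given_case = exposed_cases / (exposed_cases + unexposed_cases)
    return p_exposed_given_case * arp(rr)

# Case counts and ORs taken from answer 16 (OR used in place of RR).
parp_pre = parp(exposed_cases=117, unexposed_cases=112, rr=1.47)
parp_post = parp(exposed_cases=58, unexposed_cases=241, rr=1.45)

print(f"ARP (postmenopausal): {arp(1.45):.2f}")
print(f"PARP premenopausal:   {parp_pre:.2f}")
print(f"PARP postmenopausal:  {parp_post:.2f}")
```

The larger premenopausal PARP comes almost entirely from the larger share of exposed cases (about 0.51 versus 0.19), since the two ORs are nearly equal.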

17. Logistic model coefficients for risk factor variables are natural
logarithms of odds ratios per one unit change in the variable.
So the coefficient was ln(0.70) = -0.3567

Assumptions:
a. True - The odds of breast cancer vary as the product of the odds
for age and the odds for education.

b. False - Only in a few special cases will the product of two odds
equal their sum (e.g., both odds equal zero or both odds equal two).
The logistic model is additive in the logit (logarithm of odds),
multiplicative in the odds.

c. False - One of the reasons for using mathematical modeling is
that the risk factors (exposures and potential confounders) ARE
associated (i.e., not independently distributed)

d. True - Breast cancer is a rare disease.
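A small sketch of the relationship between a logistic coefficient and an odds ratio, and of why the model is additive in the logit but multiplicative in the odds. The age coefficient below is a hypothetical value chosen only for illustration.

```python
import math

# Coefficient corresponding to an odds ratio of 0.70 (from answer 17).
odds_ratio = 0.70
coefficient = math.log(odds_ratio)
print(round(coefficient, 4))

# Hypothetical second coefficient, for illustration only.
beta_age = 0.5
beta_exposure = coefficient

# Adding coefficients on the logit scale ...
combined_or = math.exp(beta_age + beta_exposure)
# ... is the same as multiplying odds ratios on the odds scale.
product_of_ors = math.exp(beta_age) * math.exp(beta_exposure)
print(math.isclose(combined_or, product_of_ors))
```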

18. C. The observed relative risk would be biased toward the null.

19. Smaller sample sizes produce wider confidence intervals, so if the
point estimates for the crude and stratum-specific measures are about
the same, then the confidence intervals for the latter will be wider.

20.
                AGE < 60           AGE > 60            TOTAL
             ----------------   ----------------   ----------------
             Breast   Bottle    Breast   Bottle    Breast   Bottle
             ------   ------    ------   ------    ------   ------
Cases          24       40       256      100       280      140
Controls       79       86       204       54       280      140
             ------------------------------------------------------
OR               0.653              0.678               1.0

a. Control women in older stratum are more likely to have been
breastfed than control women in the younger stratum, e.g., odds of
having been breastfed are 0.9 (79/86) among younger women and 3.8
for AGE > 60.

b. Age is a strong risk factor for breast cancer, so if breastfed
women were older than bottle-fed women, then a possible protective
effect of breastfeeding could have been offset by the greater risk
associated with older age.
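The figures in answer 20 can be reproduced with the sketch below, which uses the counts exactly as printed in the table (including the printed TOTAL column). Names are illustrative.

```python
def odds_ratio(case_bf, case_bottle, control_bf, control_bottle):
    """Odds ratio for having been breastfed, cases versus controls."""
    return (case_bf * control_bottle) / (case_bottle * control_bf)

# Counts copied from the table for question 20.
younger = dict(case_bf=24, case_bottle=40, control_bf=79, control_bottle=86)
older = dict(case_bf=256, case_bottle=100, control_bf=204, control_bottle=54)
total = dict(case_bf=280, case_bottle=140, control_bf=280, control_bottle=140)

or_younger = odds_ratio(**younger)
or_older = odds_ratio(**older)
or_crude = odds_ratio(**total)

# Part a: odds of having been breastfed among controls, by age stratum.
control_odds_younger = younger["control_bf"] / younger["control_bottle"]
control_odds_older = older["control_bf"] / older["control_bottle"]

print(f"OR age < 60: {or_younger:.3f}, OR age > 60: {or_older:.3f}, crude: {or_crude:.1f}")
print(f"Control odds of breastfeeding: {control_odds_younger:.1f} vs {control_odds_older:.1f}")
```

Both stratum-specific odds ratios are below 1 while the crude odds ratio is 1.0, which is the pattern expected when age masks a protective association.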

21. An epidemiology graduate student finds evidence in the literature
that childhood sunlight exposure may affect adult breast cancer risk.
To explore this hypothesis, she obtains from the authors the place of
birth for all of the subjects in the present study and constructs a
sunlight exposure variable ("high" or "low") based on geologic and
meteorologic data for the years of the subject's childhood. Her data
show that 56.2% of the 219 premenopausal women who were NOT breastfed
as infants grew up with "high" sunlight exposure. Based on this fact
and the partially-completed tables below, (a) calculate the odds ratio
of breast cancer with respect to breastmilk exposure within each of the
two sunlight exposure strata, and (b) briefly describe the relationship
of the sunlight exposure variable to the association between breast
cancer and breastmilk exposure (i.e., in relation to confounding and
effect modification). (4 pts)

High Sunlight      Cases   Controls   Total

Breastfed Yes        44       24        68
Breastfed No         81      *42       123
Total               125       66       191

Low Sunlight       Cases   Controls   Total

Breastfed Yes        67     *120       187
Breastfed No         36      *61        97
Total               103      181       284

* crude from Table 1 or Table 3 = 0.68

High sunlight OR = (44x42)/(24x81) = 0.95
Low sunlight OR = (67x61)/(120x36) = 0.95.
Sunlight is a confounder of the protective effect of breastfeeding
as an infant. It is not an effect modifier.
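The comparison of crude and stratum-specific odds ratios in answer 21 can be sketched as follows. The crude table is obtained here by summing the two sunlight strata, which is an assumption about how the 0.68 was derived; names are illustrative.

```python
def odds_ratio(cases, controls):
    """cases and controls are (breastfed, not breastfed) pairs."""
    (a, b), (c, d) = cases, controls
    return (a * d) / (b * c)

# Counts from the completed tables for question 21.
strata = {
    "high": {"cases": (44, 81), "controls": (24, 42)},
    "low":  {"cases": (67, 36), "controls": (120, 61)},
}

stratum_or = {name: odds_ratio(s["cases"], s["controls"])
              for name, s in strata.items()}

# Collapse over sunlight exposure to get the crude table.
crude_cases = tuple(sum(s["cases"][i] for s in strata.values()) for i in range(2))
crude_controls = tuple(sum(s["controls"][i] for s in strata.values()) for i in range(2))
crude_or = odds_ratio(crude_cases, crude_controls)

print({k: round(v, 2) for k, v in stratum_or.items()}, round(crude_or, 2))
```

Equal stratum-specific odds ratios that differ from the crude value point to confounding without effect modification, as the answer states.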

22. Use the data from Table 2 (Distribution of Characteristics of
Postmenopausal Cases and Controls) to draw separate 2 x 2 tables for
women who have had: a. 0 pregnancies, b. 1-2 pregnancies, c. >=3
pregnancies. Be sure to include appropriate labels. (5 pts)

           0 pregnancies      1-2 pregnancies     >=3 pregnancies

          Cases  Controls     Cases  Controls     Cases  Controls
Breast      34      35          71      90         136     180
Bottle      16       3          11      12          31      36
Total       50      38          82     102         167     216

a) Calculate odds ratios for each of these three categories.

0 pregnancies: OR = (34 x 3) / (16 x 35) = 0.18
1-2 pregnancies: OR = (71 x 12) / (11 x 90) = 0.86
>=3 pregnancies: OR = (136 x 36) / (31 x 180) = 0.88

b) Assuming no effects of confounding, interpret your findings in
part (a).
There is effect modification. The magnitude of the protective
effect of having been breast-fed on development of breast cancer
is dependent on pregnancy history. Having been breast-fed is a
stronger protective factor for those women who never had a pregnancy.
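A minimal sketch of the stratum-specific calculation in question 22. The ratio of the largest to the smallest odds ratio is included only as an informal indicator of heterogeneity; it is not a formal test and is not part of the original answer.

```python
def odds_ratio(bf_cases, bf_controls, bottle_cases, bottle_controls):
    """Odds ratio for having been breastfed within one pregnancy stratum."""
    return (bf_cases * bottle_controls) / (bottle_cases * bf_controls)

# (breastfed cases, breastfed controls, bottle-fed cases, bottle-fed controls)
strata = {
    "0 pregnancies": (34, 35, 16, 3),
    "1-2 pregnancies": (71, 90, 11, 12),
    ">=3 pregnancies": (136, 180, 31, 36),
}

stratum_or = {name: odds_ratio(*cells) for name, cells in strata.items()}
for name, value in stratum_or.items():
    print(f"{name}: OR = {value:.2f}")

# Informal heterogeneity indicator (an illustrative addition).
heterogeneity_ratio = max(stratum_or.values()) / min(stratum_or.values())
print(f"largest / smallest OR = {heterogeneity_ratio:.1f}")
```

Note that the 0-pregnancies odds ratio rests on a cell of only 3 bottle-fed controls, so it is imprecise.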

23. A hypothetical cross-sectional ancillary study to this report was
conducted. In that study a survey of breast cancer annual incidence
rates in geographically distinct areas was completed, Region A in the
upper midwest where breast cancer mortality is high, and Region B the
Southeast where mortality from breast cancer is low. The following
data were obtained.

                        Region A                        Region B
               Cases  Population  Rate/1000    Cases  Population  Rate/1000
< High School Education
  Age
  40-50          10      7,000       1.4         10     15,000       0.7
  51-60          15     10,000       1.5         20      5,000       4.0
  61-65          30      3,000      10.0        600     55,000      10.9

  Total          55     20,000                  630     75,000

High School Education
  Age
  40-50           5      1,000       5.0          6      2,000       3.0
  51-60           5      2,000       2.5         10     15,000       0.7
  61-65           4        500       8.0          4      1,000       4.0

  Total          14      3,500                   20     18,000

Grand Total      69     23,500                  650     93,000

Crude                                2.9

a. Compute the overall Region B crude event rate: (1 pt) = 7.0/1000

Using the total population as a standard compute the following by the
direct method of adjustment:

b. Age and educational achievement adjusted rate for Region A (2 pts)

= 6.0/1000
c. Age and educational achievement adjusted rate for Region B (2 pts)
= 6.3/1000
d. Comparison of the overall crude rates with the age and educational
achievement adjusted rates.

Briefly explain your findings. (2 pts): Much of the difference
between the crude rates of the two regions is due to the different
distributions of age and educational achievement.
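The direct adjustment in question 23 can be sketched as follows. The standard population is taken to be the sum of the two regions in each education/age stratum, which is one reading of "the total population as a standard"; names and data layout are illustrative.

```python
# (cases, population) for each (education, age) stratum, from the table above.
region_a = {
    ("<HS", "40-50"): (10, 7000), ("<HS", "51-60"): (15, 10000),
    ("<HS", "61-65"): (30, 3000),
    ("HS", "40-50"): (5, 1000), ("HS", "51-60"): (5, 2000),
    ("HS", "61-65"): (4, 500),
}
region_b = {
    ("<HS", "40-50"): (10, 15000), ("<HS", "51-60"): (20, 5000),
    ("<HS", "61-65"): (600, 55000),
    ("HS", "40-50"): (6, 2000), ("HS", "51-60"): (10, 15000),
    ("HS", "61-65"): (4, 1000),
}

def crude_rate(region):
    """Crude rate per 1000: total cases over total population."""
    cases = sum(c for c, _ in region.values())
    population = sum(p for _, p in region.values())
    return 1000 * cases / population

def adjusted_rate(region, standard):
    """Directly standardized rate per 1000 using the standard's stratum sizes."""
    expected = sum(c / p * standard[s] for s, (c, p) in region.items())
    return 1000 * expected / sum(standard.values())

# Assumed standard: both regions combined, stratum by stratum.
standard = {s: region_a[s][1] + region_b[s][1] for s in region_a}

for name, region in [("A", region_a), ("B", region_b)]:
    print(f"Region {name}: crude {crude_rate(region):.1f}, "
          f"adjusted {adjusted_rate(region, standard):.1f} per 1000")
```

The crude rates differ by more than a factor of two, while the adjusted rates are nearly equal.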

24. Causal relationship - Comment specifically on at least two of Bradford
Hill's criteria for causal inference. Include in your comments data or
statements from the article. (5 pts)

25. Assuming that this relationship is causal, why might a similar study,
50 years from now, fail to find as strong a relationship? (2 pts)

Formula changes (less fat), overfeeding reduced reflecting recent trends.

University of North Carolina at Chapel Hill
School of Public Health
Department of Epidemiology
Fundamentals of Epidemiology (EPID 168)

Midterm Exam, Fall 1996

EPID 168

Most of the questions in this examination are based on the article:

Garry VM, Schreinemachers D, Harkins ME, Griffith J. Pesticide appliers,
biocides, and birth defects in rural Minnesota. Environ Health Perspect
1996;104:394-399.

A copy of this article was provided to you before this examination and can be
used in answering the following questions.
1. Briefly state the primary study question of this report. Identify the
main exposure and outcome of interest. (3 pts)

2. Briefly explain the difference between disease classification based on
manifestational criteria and disease classification based on causal
criteria. What is the logic for analyzing the data in relation to
categories of anomalies grouped by organ system? (4 pts)

___________________________________________________________

3. As discussed in class, epidemiologic studies often have both
descriptive and analytic characteristics. State one way in which this
study is descriptive and one way in which it is analytic? (4 pts)

4. The reporting of birth defects was provided in accord with state
statutes, and grouping of birth defects categories followed the
National Centers for Health Statistics guidelines (page 394 second
paragraph - methods). This reporting of birth defects is an example
of which of the following types of data collection methods. Choose
one best answer. (4 pts)

A. Active surveillance
B. Ongoing cross-sectional survey
C. Passive surveillance
D. Follow up study of dynamic population

5. This study determined exposure and outcomes using data from "a list of
all members of the agricultural community who were certified to apply
restricted-use pesticides in 1991" (p. 394-methods) and from "all in-
wedlock live births recorded in the state for the years 1989 through
1992" (p. 394-methods). Briefly assess the strength of these data
sources in establishing the temporal sequence of pesticide exposure
and birth defects and provide support for your assessment. (4 pts)

6. For each of the following epidemiologic measures, indicate whether it
is a rate, a proportion, or a ratio that is neither a rate nor a
proportion, or none of these. Circle the best answer (4 pts)

A. Population attributable risk (PAR) rate proportion ratio neither

B. Incidence density (ID) rate proportion ratio neither
C. Prevalence rate proportion ratio neither
D. Relative risk rate proportion ratio neither

7. The use of the term "rate" is not an infallible guide to the specific
epidemiologic measure being presented. Which one of the following
epidemiologic measures best characterizes the measure that the authors
refer to as the "rate of anomalies per 1000 live births" (Table 2 -
footnote)? Choose one best answer. (4 pts)

A. cumulative incidence (CI)

B. incidence density (ID)
C. prevalence
D. attributable risk proportion

8. The authors indicate that table 1 supports their statement...

"pesticide appliers had significantly more children with an anomaly
than did nonappliers" (p.395 results first paragraph). This
statement is readily understood but not literally correct. Which one
of the following states the finding more precisely? Choose one best
answer. (4 pts)

A. pesticide appliers had 1.37 times more births with anomalies than
did the general population.

B. pesticide appliers had more children with birth anomalies than did
the general population.

C. pesticide appliers had a greater proportion of births with
anomalies as compared to the general population.

D. Pesticide appliers accounted for more births with anomalies than
did the general population.

9. Table 1 presents both crude and age-adjusted odds ratios. In the
table, the age adjusted odds ratio for gastrointestinal anomalies is
slightly larger than the crude estimate, as is the case for most of
the odds ratios presented. If the difference between the crude and
age-adjusted odds ratios had been large, explain in general terms what
this would mean regarding the respective ages of the pesticide
appliers and the general population. Assume the maternal age
structure of the combined population was used as the standard. (3 pts)

___________________________________________________________________
10. Using data in Table 1:

a. Compute an estimate of the potential impact of pesticides on birth
anomalies (in wedlock, all types together) to fathers who are
certified pesticide appliers. State the assumption required to
interpret this estimate. (4 pts)

b. Compute an estimate of the potential impact of pesticides on birth
anomalies (in wedlock, all types together) in the Minnesota
population as a whole. (3 pts).

11. Using the data presented in Table 1, recalculate the crude odds ratio
for all births with anomalies assuming that all musculoskeletal birth
anomalies occurring among those with maternal age greater than 30 and
the "other" anomalies among maternal age > 35 were later found to
actually have occurred among persons incorrectly classified as
appliers. Explain what implications this new calculation would have
on the conclusions of the study. (3 pts)

___________________________________________________________________

12. It is possible that the pesticides examined in this study might have
reduced fecundity or increased the proportion of conceptions not
resulting in live births. Assume that both of these effects (lower
fecundity, more spontaneous abortions, and more still births) have in
fact occurred in the pesticide applier population studied here, so
that the number of live births to pesticide applier fathers is smaller
than it would have been in the absence of pesticide exposure. Which of
the following statements is (are) TRUE and which is (are) FALSE? (2
pts each)

TRUE FALSE
____ ____ A. Since all births would be affected equally, effects on
fecundity and spontaneous abortion WOULD NOT have influenced
the size of the odds ratio presented in this study. [This
question is problematic.]

____ ____ B. If pesticides were equally likely to cause fetal loss and birth
anomalies, then the odds ratios would strongly understate the
harmful effects of pesticides.

13. Table 4 shows the frequency per 1000 births of major anomalies for the
general population by region. Which of the following best describes
the study design from which these data were obtained. (4 pts)

A. ecologic study
B. prospective cohort study
C. retrospective cohort study
D. region-specific case control study

14. The authors begin their discussion section by stating that this report
"is an initial step in the evaluation of the possible relationships
between the frequency of birth anomalies and pesticide use". They
conclude, however by saying that these data "signify a clear-cut need
for comprehensive examination of the health issues involved". This
latter statement seems to indicate that the authors suspect a causal
relationship. Identify and describe three criteria for causal
inference for which at least some information is present in the
article. Give specific examples from the article to support your
selection. (9 pts)

___________________________________________________________________

15. Suppose that after this publication came out, another study was
conducted in Illinois to investigate the hypothesis that birth defects
occurred more often in Illinois as compared to Minnesota. However,
in this new study the authors thought that the type of water consumed
could be related to birth defects. They wanted to adjust
(standardize) the rates of defects in the two states for water type.
Data from the two studies are compared as below.

Births by state and water type

Minnesota Pesticide Appliers Illinois Pesticide Appliers

Normal With anomalies Normal With anomalies
Water Type (#) (#) rate* (#) (#) rate*

Well water only 3379 93 26.8 100 2 ____

City water only 874 27 30.0 200 6 ____
Bottled water only 206 5 23.7 7293 145 ____
Total 4456 125 28.0 7593 153 ____

* per 1000 live births

a. calculate the crude rate and the water-type specific rates for
Illinois. Briefly describe how these two states compare in crude
rates of birth anomalies. (4 pts)

b. Using the combined number of live births as a standard, calculate
a standardized rate (standardized for water type) for each of the
states. Briefly describe how these standardized rates compare
with each other and reasons why they may or may not agree with the
crude rates. (6 pts)

16. Would an inference of causality based on the data in Table 4 be
subject to criticism based on the ecologic fallacy concept? Briefly
explain your answer. (2 pts)

17. Which of the following statements about the present study are (is)
TRUE and which are (is) FALSE. Indicate TRUE or FALSE for each
statement. (2 pts each)

TRUE FALSE
____ ____ A. Subjects used in the analyses for Table 1 of this study were
selected on the basis of their exposure status.

____ ____ B. Table 4 in this study supplied dose response evidence to
support an inference of a causal relationship between
pesticides and birth defects.

____ ____ C. The age-adjusted odds ratio for all birth anomalies of 1.41 is
considered a modest association.

____ ____ D. Since birth defects of these types are rare in the general
population, a cohort study could be designed to efficiently
examine further the relationship of pesticides and birth
anomalies.

____ ____ E. Exposure status in this study was randomized resulting in an
equal distribution of known and unknown confounding variables
between pesticide appliers and the general population.

____ ____ F. A correlation coefficient is a measure of association but is
not useful in assessing the dichotomous outcomes measured in
this study.

____ ____ G. Table 1 used stratified analyses to adjust for a confounding
effect of maternal age on the association between
musculoskeletal/integumental anomalies and pesticide exposure.

[question #18 has been removed, 10/7/97]

19. Succinctly evaluate whether or not, on the basis of the information in
the article (including information that the authors cite to other
work), further measures are warranted now to prevent birth defects
caused by chlorophenoxy herbicides. (5 pts)

University of North Carolina at Chapel Hill

School of Public Health
Department of Epidemiology
Fundamentals of Epidemiology (EPID 168)

Midterm Exam, Fall 1996

Answer Key - REVISED

Note: this answer guide is especially detailed in order to provide thorough
explanations of the many concepts that the exam touched on (including a few it
touched on unintentionally!).

1. The primary study question for this investigation concerns the
relationship, suggested by previous studies, between exposure to
pesticides and risk of birth anomalies in offspring. The main exposure
is pesticides (assessed by the surrogate measure of being licensed to
apply certain pesticides). The main outcome is birth anomalies in
offspring, as recorded in birth records.

2. Classification of disease using manifestational criteria means grouping
disorders on the basis of their having similar observable
characteristics, e.g., symptoms, signs, behavior, laboratory findings,
onset, course, prognosis, response to treatment. Classification using
causal criteria means grouping disorders on the basis of their having the
same primary etiologic agent, which, of course, must have been previously
identified. The logic for analyzing the data in terms of organ systems
(a manifestational criterion) is that anomalies occurring in the same
organ system may be more likely to have the same (or closely related)
etiology and therefore should exhibit stronger associations with the
relevant exposure than would the more general category of all birth
anomalies.

3. The presentation of data concerning the occurrence of birth defects with
regard to place (crop region) and time (seasons) is basic descriptive
epidemiology. The fact that the study was designed with a view to
examining specific relationships of interest, which were then assessed
with measures of association and statistical tests, derives from an
analytic perspective.

4. C. Passive surveillance

5. This study cannot really establish the temporal sequence of pesticide
exposure and birth defects because a) half of births occurred before the
date used for the pesticide certification (1991); and b) the time of
actual exposure cannot be determined, since exposure is measured so
indirectly and without the ability to establish when it occurred.

6. A. Any answer can be defended - the population attributable risk (PAR) is
equal to the attributable risk multiplied by exposure prevalence or,
equivalently, the crude incidence minus the incidence in unexposed
persons. When incidence is measured as a rate (i.e., ID), then the PAR
is the difference of two rates. When incidence is measured as a
proportion (i.e., CI), then PAR is the difference of two proportions and
therefore cannot exceed 1.0. The resulting value is typically expressed
as a rate or a proportion. So this question is ambiguous -- apologies!

B. Rate - by the definition of ID

C. Proportion - by the definition of prevalence
D. Ratio - relative risk is a ratio of independently-derived risks (or
rates, if "relative risk" is interpreted as applying to the concept,
rather than specifically to the risk ratio).

7. C. prevalence - Although a birth with an anomaly is an "event", there is
no way to establish the population at risk (denominator) for these
events. For example, would the denominator population be couples, fecund
couples, fecund couples trying to conceive, embryos, recognized
pregnancies? Birth anomalies do not arise out of "live births", since
the anomalies already exist in the fetus. Therefore the "rate of
anomalies per 1000 live births" is simply the proportion of live births
in which a birth defect is present.

8. C. Pesticide appliers had a greater proportion of births with anomalies
as compared to the general population.

9. Assuming that prevalence of birth anomalies increases with increasing
maternal age, an increase in the odds ratio due to age-adjustment
indicates that the maternal age distribution in the general population is
shifted toward older ages relative to that distribution in pesticide
applier spouses. The basis for this conclusion is the following. Birth
defect prevalence was greater for pesticide applier couples. If some of
that excess were due to greater age among pesticide applier mothers, then
age-adjustment would diminish the excess, thereby decreasing the odds
ratios. Since instead, age-adjustment increased the odds ratios, then
the older ages of general population mothers must have offset some of the
excess risk associated with pesticide exposure.

10A. Since the question does not specify absolute or relative impact, either
attributable risk (AR) or attributable risk proportion (ARP) is correct
(actually, attributable prevalence, but the term attributable risk is
typically applied to rates and prevalences as well as risks).

AR = P1 - P0 = [125 / (125 + 4456)] - [3666 / (3666 + 179,265)]

= 0.02728 - 0.02004 = 0.0072466 = 0.0072, or
7.2 per 1000 total live births

Meaning: 7.2 births with anomalies per 1000 live births fathered by
pesticide appliers are attributable to pesticide exposure.

Attributable Risk proportion (ARP) = (RR-1) / RR (using OR for RR)

= (OR - 1) / OR = (1.37 - 1) / 1.37 = 0.270 = 27%
or
ARP = AR / P1 = (0.027283 - 0.02004)/0.027283 = 0.26548 = 27%

Meaning: 27% of the prevalence of births with anomalies among all
live births fathered by pesticide appliers is attributable to
pesticide exposure.

To attribute cases to exposure requires the assumption of a causal
relationship between pesticide exposure and birth defects.

10B. Again, either population attributable risk (PAR) or population
attributable risk proportion (PARP) provides an answer.

Prevalence of paternal exposure among all live births is:

Pe = 4456 / (4456 + 179,265) = 0.02425 = 2.4% of live births

So PAR = AR x Pe = 0.0072466 x 0.02425 = 0.000176

= 1.8 per 10,000 live births.

or PCrude - P0 = 0.020217 - 0.02004 = 0.000177 = 1.8 / 10,000

Meaning: 1.8 births with anomalies per 10,000 live births to the general
(married) population are attributable to pesticide exposure in pesticide
appliers.

PARP = [Pe (RR-1) ] / [1 + Pe (RR-1)] (using OR for RR)

= [(0.02425) (1.37-1)] / [1+0.02425(1.37-1)] = 0.0089
= 1% (approximately)

Or, using the case-control formulation,

Pe|d = 125 / ( 125 + 3666 ) = .032973

PARP = Pe|d (OR-1) / OR = (.032973) (1.37-1) / 1.37 = 0.008905

= 1% (approximately)

Or, PARP = Pe x ARP = 0.02425 x 0.26548 = 0.00644, using the ARP
from part a.

Meaning: Approximately 1% of all Minnesota live births with anomalies
are attributable to pesticide exposure in pesticide appliers.

(Note: small differences among the results from the various methods are
primarily due to the fact that the OR of 1.37 has been rounded to fewer
significant digits than are the prevalences computed above.)
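The measures in answers 10A and 10B can be recomputed from the Table 1 counts quoted above. This is a sketch under the same assumptions as the answer key (OR of 1.37 used in place of RR, causal relationship assumed); variable names are illustrative.

```python
# Counts quoted in answers 10A and 10B.
applier_anomalies, applier_normal = 125, 4456
general_anomalies, general_normal = 3666, 179265
odds_ratio = 1.37  # crude OR as used in the answer key, standing in for RR

p1 = applier_anomalies / (applier_anomalies + applier_normal)   # exposed
p0 = general_anomalies / (general_anomalies + general_normal)   # unexposed

ar = p1 - p0                      # attributable risk (prevalence difference)
arp = ar / p1                     # attributable risk proportion
pe = applier_normal / (applier_normal + general_normal)  # as in the key
par = ar * pe                     # population attributable risk

# PARP two ways: exposure prevalence form and case-control form.
parp_prevalence = pe * (odds_ratio - 1) / (1 + pe * (odds_ratio - 1))
pe_given_d = applier_anomalies / (applier_anomalies + general_anomalies)
parp_case_control = pe_given_d * (odds_ratio - 1) / odds_ratio

print(f"AR = {1000 * ar:.1f} per 1000, ARP = {arp:.0%}")
print(f"PAR = {10000 * par:.1f} per 10,000")
print(f"PARP = {parp_prevalence:.4f} or {parp_case_control:.4f}")
```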

11. OR = 1.04 (Derivation:

"Corrected" cases in exposed = 127 - (19 + 12) = 96
Proportion in exposed = 96 / (4456 + 96) = 0.0211
"Corrected" cases in control = 3666 + 31 = 3697;
Proportion in control = 3697 / (3697 + 179,265) = 0.0202
0.0211 / 0.0202 = 1.04 = new odds ratio)

Thus, incorrectly classifying those anomalies into the exposed group

overestimates the strength of association.
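The recalculation in answer 11 can be sketched as below, using the counts given in the derivation. The derivation takes a ratio of proportions; the cross-product odds ratio is also shown, as an illustrative addition, to confirm that the two are nearly identical when the outcome is this rare.

```python
# Counts as used in the derivation for answer 11.
misclassified = 19 + 12                      # anomalies moved out of appliers
exposed_anomalies = 127 - misclassified      # 96
exposed_normal = 4456
control_anomalies = 3666 + misclassified     # 3697
control_normal = 179265

p_exposed = exposed_anomalies / (exposed_normal + exposed_anomalies)
p_control = control_anomalies / (control_anomalies + control_normal)
ratio_of_proportions = p_exposed / p_control

# Cross-product odds ratio from the same corrected 2x2 table.
cross_product_or = (exposed_anomalies * control_normal) / (
    exposed_normal * control_anomalies)

print(round(ratio_of_proportions, 2), round(cross_product_or, 2))
```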

12. A. False - there is no basis for assuming that all births would be
affected equally.

B. True - The total proportion of harm, including fetal loss, is:

(lost fetuses + birth anomalies)

-----------------------------------------------------
(lost fetuses + birth anomalies + normal live births)

This proportion exceeds the prevalence of birth anomalies among live
births, potentially by a substantial amount.

13. A. ecologic study - exposure is assessed at the community (region) level,
and exposure of persons is inferred based on residence in a geographic
region where pesticides are heavily used.

14. 1) Strength of association, estimated using odds ratios, is modest, and
therefore does not provide strong evidence on which to infer causal
relationships.

2) Biological plausibility - various laboratory studies and a clinical
epidemiologic study show that active ingredients and contaminants in
pesticides can be teratogenic and/or spermatotoxic. Also, several
compounds in the pesticides are endocrine disrupters.

3) Consistency (the authors cite epidemiologic studies [in Iowa,
Nebraska, Colorado] that have found similar relationships).

15. This question underwent a revision to simplify it, but unfortunately some
parts of the previous version remained. The columns labelled
"# live births" should have included the qualifier "Normal", and the
rates for Minnesota needed to be re-computed accordingly. Due to this
problem, two alternate solutions are completely acceptable, one in which
the denominators are the numbers in the "# live births" column and one in
which the denominators equal the sum of these numbers plus the numbers of
births with anomalies. In addition, full credit is given if the rates
for Minnesota were recomputed. Here is the version in which the stated
rates were used and the # of live births column was treated as if it
meant "Total live births":

Birth anomaly prevalences for Illinois, by water type:

Well water: 2/100 = 20.0 per 1000 live births
City water: 6/200 = 30.0 per 1000 live births
Bottled water: 145/7293 = 19.9 per 1000 live births
Overall (crude): 153/7593 = 20.2 per 1000 live births
Thus, the crude prevalence is higher in Minnesota than in Illinois.

Number of live births (both states combined)

--------------------------------------------
Well water 3479
City water 1074
Bottled water 7499
Total 12,052

Standardized prevalence for MN:

3479 x 26.8 + 1074 x 30.0 + 7499 x 23.7

---------------------------------------- = 25.2 per 1,000
12,052 x 1000

Standardized prevalence for IL:

3479 x 20.0 + 1074 x 30.0 + 7499 x 19.9

---------------------------------------- = 20.8 per 1,000
12,052 x 1000

The standardized prevalence for Minnesota also exceeds that for
Illinois, though by a smaller amount than the difference in the crude
prevalences. The difference has been slightly reduced because the
standardized prevalence for Minnesota gives somewhat greater weight to
the prevalence for bottled water (23.7/1000) and less to the
prevalence for well water (26.8/1000) than did the crude prevalence.

16. Yes - it is not clear from these data whether birth anomalies occurred
in people with or without exposure because exposure information was
based on group data.

17. A. False - subjects were selected from birth records for live births
B. False
C. True
D. False
E. False
F. True - (however, a correlation coefficient indicates the extent of
association in the sense of two variables moving in tandem; it does
not indicate the strength of association in the epidemiologic sense
of how great a change occurs in the response variable for a change
of a given size in the exposure variable)
G. True

18. [Question removed, 10/7/97]

19. Points in favor of action at this time are the evidence that the
relationship is causal (biological plausibility, consistency between
results of ecologic [by crop-region] and individual-based [pesticide
applier] analyses, pattern of findings [season of conception], and
consistency across several epidemiologic studies) and the high
attributable risk percent (27%) among babies with birth anomalies born
to pesticide applier couples. In addition, the substantially
increased prevalences of birth anomalies among all live births in
county clusters with high use of chlorophenoxy herbicides/fungicides
(Table 4), consistent across the four regions, suggest that anomalies
due to pesticides (assuming that the relationship is causal) occur
throughout areas where these pesticides are used. Even though the
population attributable risk proportion is very small (about 1%) for
exposure due to being a pesticide applier, the proportion of all
Minnesota birth anomalies potentially attributable to residence in a
county cluster with high pesticide use is 27% [overall prevalence of
birth anomalies for all Minnesota in-wedlock births was 3791 / 183,721
= 20.63 per 1000 live births (Table 1), prevalence of birth anomalies
in low-pesticide county clusters ("unexposed") was 15 per 1000 (Table
4), so PARP = (PCrude - P0) / PCrude = (20.63 - 15) / 20.63 = 0.27].
The effects seem to be strongest for chlorophenoxy pesticides,
suggesting that at least this category should be restricted.
Moreover, there are powerful arguments for reducing pesticide use for
environmental reasons as well.
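The population attributable risk proportion quoted above can be reproduced as follows. This is a minimal sketch using only the figures cited in this answer; the function and variable names are illustrative.

```python
def population_attributable_risk_proportion(p_crude, p_unexposed):
    """PARP = (P_crude - P_0) / P_crude, with both prevalences in the same units."""
    return (p_crude - p_unexposed) / p_crude


# Overall prevalence among Minnesota in-wedlock births, per 1,000 live births.
p_crude = 3791 / 183_721 * 1000

# Prevalence in low-pesticide ("unexposed") county clusters, per 1,000 live births.
p_unexposed = 15.0

parp = population_attributable_risk_proportion(p_crude, p_unexposed)
print(f"Crude prevalence: {p_crude:.2f} per 1,000")
print(f"PARP: {parp:.2f}")
```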

Against taking action other than continuing research are that the
evidence is still not very strong (biological mechanisms not yet
elucidated, relationship is not highly specific, epidemiologic studies
limited and not entirely consistent, experimental evidence not
available), the potential impact on agriculture and therefore food
prices is considerable, and the costs to industry and commerce from
restrictions on a major product are substantial. Moreover, the
relative weakness of the odds ratios (below 2.0) indicates a
significant possibility that other factors could be responsible for
the increase in birth anomaly prevalence seen in association with
pesticide exposure, a possibility whose investigation requires better
data on exposure and other factors that may lead to birth anomalies.

Grading of this question is based on the clarity and support for your
evaluation and recommendation.
University of North Carolina
School of Public Health
Department of Epidemiology
Fundamentals of Epidemiology (EPID 168)
Victor J. Schoenbach and Wayne D. Rosamond

Fall 1996 Final Exam (Tuesday 10 Dec 1996)

This examination is based on Per-Gunnar Persson, Anders Ahlbom, Goran
Hellers. Diet and inflammatory bowel disease: a case-control study.
Epidemiology 1992;3:47-52.

NOTE: For simplicity, ignore the requirement that this study was
restricted to those persons with a telephone number.

1. Which of the following best describes the primary objective of this
study? (Choose one best answer) (3 pts)

A. To test the hypothesis that persons with inflammatory bowel disease
are more likely to have been exposed to certain dietary factors than
those without inflammatory bowel disease.

B. To test the hypothesis that the risk of having inflammatory bowel
disease given that you have certain dietary exposures is greater than
the risk of not having inflammatory bowel disease.

C. To test the hypothesis that the increase in inflammatory bowel
disease in the population is attributed to certain dietary exposures.

D. To test the hypothesis that the average consumption of certain
dietary factors increases as the proportion of a group of people with
inflammatory bowel disease increases.

2. Designation as a case of ulcerative colitis was based on which of the
following classification models? (Choose one best answer) (3 pts)

A. Manifestational criteria

B. Causal criteria

C. Both manifestational and causal criteria

D. Neither

3. Medical records were used to validate the hospital diagnoses of Crohn's
disease and ulcerative colitis. By using this validation process
instead of relying on hospital discharge coding alone, the authors are
reducing which of the following sources of error? (Choose one best
answer) (3 pts)

A. Selection bias

B. Prevalence-incidence bias

C. Information bias

D. Surveillance bias

4. Controls were selected as a random sample using the population register
of Stockholm County Council. Which of the following best describes the
primary purpose of using a random sample in this study? (Choose one
best answer) (3 pts)

A. Maximize generalizability by obtaining a statistically representative
sample.

B. Select a control group that was as similar as possible to the case
group except for dietary exposures.

C. Provide an estimate of the dietary exposure in the source population
from which the cases arose.

D. Select a control group with dietary habits similar to those in the
population of cases.

5. Dietary exposures were assessed using a questionnaire with
"retrospective questions aimed at a period of time 5 years in the past"
(page 48). Which of the following situations of misclassification would
make sucrose appear more harmful than it really was? (Choose one best
answer) (3 pts)

A. Controls underreported sucrose intake but cases did not.

B. Cases underreported sucrose intake but controls did not.

C. Both cases and controls underreported sucrose intake.

D. Both cases and controls overreported sucrose intake.

6. Suppose that cases excluded due to administrative delay problems were
more likely to have daily soft drink exposure than less than daily.
Which of the following best describes the impact this would have on the
odds ratio presented in Table 3? (Choose one best answer) (3 pts)

A. Without the exclusion the odds ratio would be closer to the null.

B. Without the exclusion the odds ratio would be larger.

C. The exclusion did not affect the odds ratio.

D. Cannot determine on the basis of this information.

7. Diagnoses of disease were verified in this study. Define validity and
compare and contrast this concept with reliability. (4 pts)

8. This study uses a case control design with a population based control
group. Which of the following, in general, is a strength of this
design? (Choose one best answer) (3 pts)

A. Allows examination of rare diseases.

B. Allows examination of rare exposures.

C. Good for establishing temporality.

D. Good for equalizing on known and unknown confounders.


9. Items on the food frequency questionnaire were mostly in a format with
six response options that ranged from twice per day or more often to
less frequently than once every 2 weeks (pg 48). In deriving values for
daily energy intake, the authors treated the food frequency responses as
which level of measurement? (Choose one best answer) (3 pts)

A. Nominal

B. Ordinal

C. Interval

D. Ratio

10. Control for age in the analyses presented in Table 2 was accomplished
through which of the following methods? (Choose one best answer)
(3 pts)

A. Stratified analysis plus matching.

B. Matching plus mathematical modeling.

C. Restriction without stratification

D. Mathematical modeling and stratification.

11. Based on the data presented in Table 2, is ulcerative colitis associated
with fat intake among men? Give a brief statement to support your
answer. (4 pts)

12. The authors state on page 49 that after controlling for smoking, the
relative risk for Crohn's disease among men was 1.9 for a high
consumption of sucrose and 0.7 for a high consumption of fiber. Briefly
explain why based on these data the authors state that smoking did not
confound these associations. (3 pts)

13. The data presented in Table 3 indicate that Crohn's disease is
associated with the consumption of fast foods. Suppose that when
stratified by educational attainment, the resulting data were as
follows:

                        Educational attainment

                      High                  Low

Fast foods      Controls   Cases     Controls   Cases

1+ times/wk        12        10          8        14

None              150       100        135        28

a. Calculate the crude and stratum-specific odds ratios. (3 pts)

b. Is this association between fast food and Crohn's disease confounded
by education level? Quantify and briefly explain your answer. (3
pts)

c. Briefly explain in 2 sentences or a diagram how education might fit
into a conceptual model consisting of fast food, education, and risk
of Crohn's disease. (3 pts)

14. In the discussion (page 50), the authors state that "if the change in
diet is the same in cases as in controls, then the relative risk
estimates would be biased toward unity". This is an example of which of
the following? (Choose one best answer) (3 pts)

A. Non differential misclassification bias

B. Non differential selection bias
C. Differential information bias
D. Differential misclassification bias

15. This article does not present p-values yet reports 95% confidence
intervals for all odds ratios. Which of the following best describes
what information a confidence interval conveys that a p-value does not?
(Choose one best answer) (3 pts)

A. A confidence interval puts the observed point estimate in the context
of randomness.

B. A confidence interval provides information on the precision of the
point estimate.

C. A confidence interval includes an estimate of the statistical power
of the study.

D. A confidence interval reflects the clinical significance of the point
estimate.

16. The study describes the association of consumption of Muesli-type
breakfast cereal and Crohn's disease (Table 3). Briefly state and
evaluate the strength of the numerical evidence for the association
between Muesli-type breakfast cereals and Crohn's disease. (3 pts)

17. Briefly present the evidence for or against the role of fiber as a
confounder of the association of sucrose intake and Crohn's disease. (3
pts)

18. Suppose a follow-up to this study was done to estimate the rate (per
10,000 person years) of ulcerative colitis among a large sample in the
Swedish population. The table below summarizes the results.

                         Fast food intake

Soft drink intake      ≥2/week       None

Daily                    18.0         9.1

Less frequently           6.8         3.7

a. Which model for the joint effect of these two food items, the
additive model or the multiplicative model, better fits the data?
Your answer should give the formula for each model and show how to
evaluate it with the above data. (5 pts)

b. Do these data, assuming that they accurately reflect causal effects,
indicate a synergistic effect from a public health perspective?
Justify your answer and state an appropriate public health
implication if any. (2 pts)

19. This study did not differentiate between caffeinated and decaffeinated
coffee. Using the data presented in Table 4 and applying the
assumptions below, calculate the odds ratio (heavy versus no use)
associated with caffeinated coffee consumption and determine if it is
protective against ulcerative colitis. Describe in 2 sentences or less
the interpretation of this new odds ratio, ignoring issues of random
error. (4 pts)

Assumptions:

1. 20% of the heavy coffee drinkers (≥3 cups per day) among cases drink
only decaffeinated coffee.

2. 90% of heavy coffee drinkers among controls drink only decaffeinated
coffee.

20. Which of the following variables was NOT in the multiple logistic model
that was used to estimate the relative risk for sucrose intake in
relation to ulcerative colitis in women? (Choose best answer) (3 pts)

A. Age

B. Gender

C. Total energy intake

D. Ulcerative colitis

21. In the multiple logistic model that yielded the relative risk estimate
of 0.7 for Ulcerative colitis in relation to daily vegetable consumption
(Table 4), what was the value of the coefficient for the vegetable
consumption variable assuming that it was coded as 1=daily, 0=less
frequently? Write the conversion equation of coefficient to relative
risk estimate. (3 pts)

22. Assume that the population of Stockholm County in the age range covered
by this study was 1,000,000 in 1980 and remained constant throughout the
decade. What was the average annual incidence of hospital-diagnosed
Crohn's disease during that period regardless of when their medical
record became available? (3 pts)

23. Using the data in Table 2, for which of the following two associations
is there more of an indication of confounding by age and total energy
intake in WOMEN? Support your answer with relevant data and/or
computations. (3 pts)

a. Crohn's disease and sucrose intake (highest versus lowest level)

b. Crohn's disease and disaccharide intake (highest versus lowest level)

24. Briefly state one major strength and one major limitation of this study
(2 pts)

25. List two Bradford Hill criteria for evaluating whether dietary sucrose
intake is causally related to inflammatory bowel disease. Evaluate each
using specific facts from the article. (4 pts)

26. Which of the following statements about the data in Tables 1 and 2 are
TRUE and which are FALSE (answer TRUE or FALSE for each statement). (2
pts each)

a. In women, the rate of (hospitalized) ulcerative colitis was higher
than that of (hospitalized) Crohn's disease.

b. The similarity in age distribution between the case groups and
controls indicates that the rates of these diseases are fairly uniform
between the ages of 15 and 79 years.

c. Reporting of dietary intake by the Crohn's disease cases involved
recall over longer periods of time, on the average, than was the case
for the ulcerative colitis cases.

d. The proportion of controls with high dietary fat intake was higher
for men than for women.

27. A Swedish friend of yours who lives in Stockholm has an identical twin
sister who is anything but identical in terms of her diet. Your friend,
like other health conscious Swedes, avoids fast foods and soft drinks, and
eats whole grain bread and muesli-type cereals daily. Her twin sister,
like many Swedes, often consumes fast foods and soft drinks, but never
touches whole grain bread or muesli.

Your friend comes to visit with you over the holidays, and while you are
sleeping late one morning she comes across your class notes from EPID
168. At breakfast, where she has been busily scribbling on her napkin,
she asks you this question.

"Suppose that fast foods, soft drinks, whole grain bread, and muesli-
type cereal affect Crohn's disease risk independently, and that I can
ignore other risk factors. Suppose also that the excess risks are
additive. Is my twin sister's risk of Crohn's disease 10 times my own?"

She shows you how she used the information in Table 3 to obtain that
estimate:

(3.4 - 1) + (2.8 - 1) + ((1/0.4) - 1) + ((1/0.2) - 1) + 1 = 10.7

She goes on to explain "(3.4 -1) is the excess risk from fast foods, and
((1/0.4) - 1) is the excess risk from eating bread that is not whole
grain."

Even though you're not quite fully awake, you feel justifiable pride in
your command of epidemiologic concepts and explain to her the one big
mistake she has made. You say, " . . . ". Write a brief statement of
what you would say. (4 pts)

University of North Carolina at Chapel Hill

School of Public Health
Department of Epidemiology
Fundamentals of Epidemiology (EPID 168)

Final Examination, Fall 1996

Answer Guide

1. A. To test the hypothesis that persons with inflammatory bowel disease
are more likely to have been exposed to certain dietary factors than
those without inflammatory bowel disease.

2. A. Manifestational criteria

3. C. Information bias

4. C. Provide an estimate of the dietary exposure in the source population
from which the cases arose.

5. A. Controls underreported sucrose intake but cases did not.

6. B. This differential selection bias would underestimate the odds ratio.

7. Validity refers to accuracy, or how well an instrument or method measures
what it purports to measure. Reliability refers to repeatability: does
an instrument or method give the same result or answer consistently,
regardless of whether the reading is correct?

8. A. allows examination of rare diseases.

9. D. Ratio (The response scale for each item was ordinal, but in order to
create the total energy variable the authors had to convert each
response into calories.)

10. B. Matching plus mathematical modeling.

11. The odds ratio for 80 to 104 grams per day was 1.4, and for intakes of
greater than 105 grams per day the odds ratio was 1.3. This suggests a
tendency for cases to have a greater proportion of high fat eaters than
controls. However, the confidence intervals are broad, extending as low
as 0.4 and 0.6. Furthermore, there is no suggestion of a dose response.
This is at most weak evidence of a relationship between fat intake and
ulcerative colitis.

12. a. The crude (with respect to smoking) and adjusted odds ratios are the
same. If smoking had been a confounder in the relationship between
sucrose and Crohn's disease or between fiber and Crohn's disease the
adjusted odds ratio would have been meaningfully different from the
values in Table 2.

13. a. Odds ratios: Crude = (24 x 285) / (20 x 128) = 6840 / 2560 = 2.7
among High education = (10 x 150) / (12 x 100) = 1.3
among Low education = (14 x 135) / (28 x 8) = 8.4

b. The stratum-specific odds ratios are quite different from each other,
suggesting some degree of effect modification. The crude odds ratio
is within the range of the two stratum-specific odds ratios, which
suggests that education is not so much a confounder as an effect
modifier.

c. Three conceptual models of the relationship among fast food, education
and Crohn's disease could be:

education --> lower fast food --> lower Crohn's disease (i.e., higher
educational status could lead to lower fast food consumption which
could then lead to reduced association with Crohn's disease)

education --> (lower fast food + education) --> lower Crohn's disease
(i.e., education also has an interactive effect with fast food
consumption to lead to an association with Crohn's disease)

[education --> lower Crohn's disease] AND [lower fast foods --> lower
Crohn's disease risk] (i.e., lower fast food intake and education act
as independent main effects to influence Crohn's disease risk).
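The odds ratios in 13a, and the crude-versus-stratum comparison used in 13b, can be reproduced with a few lines of code. This is a sketch only: the cell counts come from the table in question 13, and the function and key names are illustrative.

```python
def odds_ratio(exposed_cases, unexposed_cases, exposed_controls, unexposed_controls):
    """Cross-product (exposure) odds ratio for a 2x2 case-control table."""
    return (exposed_cases * unexposed_controls) / (exposed_controls * unexposed_cases)


# Fast food (1+ times/wk = exposed) by case status, within each education stratum.
strata = {
    "high": {"exposed_cases": 10, "unexposed_cases": 100,
             "exposed_controls": 12, "unexposed_controls": 150},
    "low": {"exposed_cases": 14, "unexposed_cases": 28,
            "exposed_controls": 8, "unexposed_controls": 135},
}

stratum_or = {name: odds_ratio(**cells) for name, cells in strata.items()}

# The crude table collapses the strata by summing each cell across education levels.
crude_cells = {key: sum(cells[key] for cells in strata.values())
               for key in strata["high"]}
crude_or = odds_ratio(**crude_cells)

print(f"Crude OR: {crude_or:.2f}")
for name, value in stratum_or.items():
    print(f"OR among {name} education: {value:.2f}")
```

Two decimals are printed on purpose: the high-education odds ratio is exactly 1.25, which the answer guide reports as 1.3, whereas Python's one-decimal formatting would show it as 1.2.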

14. A. Nondifferential misclassification bias

15. B. A confidence interval provides information on the precision of the
point estimate.

16. There appears to be a strong protective association between daily
consumption of Muesli-type breakfast cereals and Crohn's disease (odds
ratio = 0.2 [0.1-0.7]). The association is considerably weaker for
weekly consumption of these cereals (odds ratio = 0.8). There is
evidence of a dose response relationship, even though the OR for weekly
consumption was not statistically significant. One should also consider
that the absolute number of cases with daily consumption of Muesli-type
cereals is small (n=4).

17. The authors state that sucrose and fiber intake could be associated with
one another as well as with Crohn's disease and thus each factor might be
a confounder of the association between Crohn's disease and the other
("mutual confounding"). The odds ratio was 2.6 for a high sucrose intake
(bottom of page 48). When adjusted for fiber the sucrose odds ratio
changed only slightly, to 2.5. Therefore, fiber was at most a slight
confounder of the sucrose and Crohn's disease relationship.

18. a. Under the additive model, we expect the joint excess rate of the two
factors to be equal to the sum of the excess rates from each factor
separately. The additive model can also be written in terms of rates:
expected rate of ulcerative colitis with both daily soft drinks and ≥2
fast foods per week = rate (daily soft drinks, without fast food) +
rate (less freq. soft drinks, ≥2 fast foods per week) - rate (neither).

If Ri,j is the rate for exposures i and j = 1 (present) or 0 (absent),
then the additive model is: R1,1 = R1,0 + R0,1 - R0,0. This equation
expressed with numbers from the table is: Expected joint rate = 9.1
+ 6.8 - 3.7 = 12.2. The observed rate with both factors was 18.0.
Therefore, the additive model does not explain the full amount of the
observed joint rate.

Under the multiplicative model, we expect the joint rate ratio of the
two factors to be equal to the product of the rate ratios for each
factor separately. In the above notation, the model can be expressed
as: R1,1 = (R1,0 x R0,1)/R0,0. This equation expressed with numbers
from the table is: (9.1 x 6.8) / 3.7 = 16.7. The observed rate is
18.0. The close agreement between the observed joint rate and that
expected under the multiplicative model suggests that the relationship
among daily soft drink consumption, frequent fast food exposure, and
ulcerative colitis is closer to multiplicative than to additive.

b. Generally, synergism from a public health perspective is equated with
a joint effect that is greater than expected under an additive model.
Therefore, the joint effect of fast foods and soft drinks is
synergistic, implying that the exposure group to target for maximum
reduction in ulcerative colitis rates per person year is people who
consume both fast foods and soft drinks. One could propose posting
warning signs in fast food establishments and on soft drink vending
machines, beverage containers, etc.
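The comparison in question 18 can be laid out numerically. The sketch below uses the four rates from the table in question 18 and the R1,1 / R1,0 / R0,1 / R0,0 notation of the answer; the variable names are illustrative.

```python
# Rates of ulcerative colitis per 10,000 person years, from the table in question 18.
r11 = 18.0  # daily soft drinks and fast food >= 2/week
r10 = 9.1   # daily soft drinks, no fast food
r01 = 6.8   # less frequent soft drinks, fast food >= 2/week
r00 = 3.7   # neither exposure

# Additive model: excess rates add, so R11 = R10 + R01 - R00.
expected_additive = r10 + r01 - r00

# Multiplicative model: rate ratios multiply, so R11 = (R10 x R01) / R00.
expected_multiplicative = r10 * r01 / r00

# Observed joint rate beyond what additivity predicts (the public health "synergy").
excess_over_additive = r11 - expected_additive

print(f"Observed joint rate:            {r11:.1f}")
print(f"Expected under additive model:  {expected_additive:.1f}")
print(f"Expected under multiplicative:  {expected_multiplicative:.1f}")
print(f"Excess over additive:           {excess_over_additive:.1f}")
```

The observed rate lies above both expectations, but much closer to the multiplicative one, which is the basis for the conclusions in parts a and b.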

19. Odds ratio for ≥3 cups caffeinated coffee = (56 x 36) / (18 x 36) = 3.1.
Heavy caffeinated coffee drinking now appears to be a risk factor for
ulcerative colitis, whereas before coffee drinking appeared to be
protective. An alternative approach would be to include the
decaffeinated coffee drinkers in the "No" (caffeinated) coffee group.
Under this model the odds ratio for ≥3 cups caffeinated coffee, relative
to none or only decaffeinated = (56 x 201) / (50 x 18) = 12.5.

20. B. Gender -- all subjects in this analysis are women.

21. Regression coefficient = log (OR) = log (0.7) = -0.36
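The conversion between a logistic regression coefficient and an odds ratio can be verified directly. This is a small sketch, assuming that "log" in the answer means the natural logarithm (as is standard for logistic models) and that the variable is coded 1 = daily, 0 = less frequently, as stated in the question.

```python
import math

# Reported relative risk (odds ratio) estimate for daily vegetable consumption.
reported_or = 0.7

# Coefficient for a 0/1 variable is the natural log of the odds ratio.
coefficient = math.log(reported_or)

# Conversion back: OR = exp(coefficient).
recovered_or = math.exp(coefficient)

print(f"coefficient = ln({reported_or}) = {coefficient:.2f}")
print(f"OR = exp({coefficient:.2f}) = {recovered_or:.1f}")
```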

22. 236 cases / 5,000,000 person years = 4.72 cases/100,000 person years.
Full credit was given for 236 cases / 4,000,000 person years = 5.9
cases/100,000 person years. Note that the incidence is obtained from all
cases (or at least all confirmed cases), rather than from only
consenting cases.

23. There is more confounding for sucrose:

For sucrose:
Crude OR = (34 x 67) / (27 x 38) = 2.22, versus adjusted OR of 3.6

For disaccharides:
Crude OR = (30 x 66) / (35 x 45) = 1.26, versus adjusted OR of 1.2
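One way to make the comparison in answer 23 explicit is to compute how far each adjusted odds ratio moves from its crude value. The sketch below uses the cell counts and adjusted estimates quoted above; the relative-change measure is a common heuristic for judging confounding and is an added assumption here, not something taken from the article.

```python
def cross_product_ratio(a, b, c, d):
    """Crude odds ratio written as (a x d) / (b x c), as in the answer above."""
    return (a * d) / (b * c)


# Cells (a, b, c, d) and the adjusted OR for the highest versus lowest intake level.
comparisons = {
    "sucrose": {"cells": (34, 27, 38, 67), "adjusted": 3.6},
    "disaccharides": {"cells": (30, 35, 45, 66), "adjusted": 1.2},
}

crude = {}
relative_change = {}
for name, info in comparisons.items():
    crude[name] = cross_product_ratio(*info["cells"])
    # Change in the estimate after adjustment, relative to the crude value.
    relative_change[name] = (info["adjusted"] - crude[name]) / crude[name]
    print(f"{name}: crude OR {crude[name]:.2f}, adjusted OR {info['adjusted']:.1f}, "
          f"change {relative_change[name]:+.0%}")
```

The sucrose estimate shifts far more after adjustment than the disaccharide estimate, which is why the answer identifies more confounding for sucrose.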

24. Strengths could include attempts to evaluate dose response, population-
based case and control selection, validation of case status, large study
population. Weaknesses include potential for recall bias, information
bias in diet assessment.

25. Strength of association (This study assessed the strength of association
by calculating odds ratios. These measures of strength were also put in
context by providing confidence intervals. Some stratum-specific odds
ratios were strong while others were very weak.), dose response,
consistency across studies (limited).

26. a. F
b. F
c. T
d. F

27. Models of joint effects combine effects of "pure" exposures, i.e., in the
absence of other exposures. But the excess risk for each food item in
Table 3 is estimated without controlling for the effects of the others.
For example, since people who eat fast foods are also likely to consume
soft drinks and not to eat whole grain bread, the relative risk estimates
for fast food 2+ times/week probably already reflect frequent soft drink
consumption and low whole grain bread consumption. In order to add up
the excess risk for each food item, we need to know the excess risks for
exposure to that item in the absence of the others.

4 pages
Vitamin B12 and D Test Results Report
No ratings yet
Vitamin B12 and D Test Results Report
3 pages