Beyond Traditional Surveillance: Harnessing Expert Knowledge for Public Health Forecasting

Garrik Hoyt1, Eleanor Bergren2, Gabrielle String3,4, and Thomas McAndrew5

Abstract— Downsizing the US public health workforce have a potentially immense impact on US public health [6],
throughout 2025 amplifies potential risks during public health [7]. The reductions significantly diminished the public health
crises. Expert judgment from public health officials represents a workforce of practitioners, scientists, and epidemiologists,
vital information source, distinct from traditional surveillance
infrastructure, that should be valued—not discarded. Under- redirecting resources from long-term priorities, such as data
arXiv:2508.15623v1 [[Link]] 21 Aug 2025

standing how expert knowledge functions under constraints collection, to immediate needs [2], [8].
is essential for understanding the potential impact of reduced Effective decisions combine data, models, and, perhaps
capacity. most importantly, the experience of epidemiologists, infec-
To explore expert forecasting capabilities, 114 public health tious disease modelers, and public health officials (who we
officials at the 2024 CSTE workshop generated 103 predictions
plus 102 rationales of peak hospitalizations and 114 predictions call experts in this work) [9]. The importance of this collab-
of influenza H3 versus H1 dominance in Pennsylvania for the orative expertise becomes clear when examining successful
2024/25 season. We compared expert predictions to computa- international responses to health emergencies. Following the
tional models and used rationales to analyze reasoning patterns 2010 earthquake in Haiti, the Centers for Disease Control
using Latent Dirichlet Allocation. Experts better predicted and Prevention (CDC) supported the Haitian Health Min-
H3 dominance and assigned lower probability to implausible
scenarios than models. Expert rationales drew on historical istry (MSPP) to strengthen disease surveillance systems and
patterns, pathogen interactions, vaccine data, and cumulative laboratory testing at health facilities already supported by the
experience. US President’s Emergency Plan for AIDS Relief [10]. When
Expert public health knowledge constitutes a critical data cholera was detected nine months later, the US Office of
source that should be valued equally with traditional datasets. Foreign Disaster Assistance, MSPP, CDC, and local public
We recommend developing a national toolkit to systematically
collect and analyze expert predictions and rationales, treating health officials quickly coordinated the distribution of cholera
human judgment as quantifiable data alongside surveillance treatment supplies to hospitals, promoted point-of-use water
systems to enhance crisis response capabilities. treatment and sanitation in displaced person camps and
communities, and established a national cholera surveillance
I. INTRODUCTION
system still in use today [11], [12].
Effective public health decision-making requires rigorous International responses like the above illustrate how expert
training in population-level epidemiology and biostatistics, judgment enables rapid decision-making under uncertainty—
adherence to evidence-based decision making rather than officials quickly assessed risk patterns, anticipated resource
anecdotal reasoning, and collaborative leadership among needs, and coordinated interventions. However, given the
networks of health institutions and officials [1]. Starting constraints imposed by reduced workforce capacity, main-
February 14, 2025, the United States Government has— taining the collaborative expertise demonstrated in past crisis
via the canceling of COVID-era grants, policy shifts, and responses will require innovative approaches to optimize
rebudgeting—reduced staff positions related to public health what remains, and a deeper understanding of how expert
services [2], [3]. The US health secretary terminated 10,000 judgment functions under pressure. To maximize the effec-
positions within the Department of Health and Human tiveness of smaller teams, one must examine the complemen-
Services. All seventeen experts who were members of the tary strengths of expert judgment and computational model-
Advisory Council On Immunization Practices were removed ing, identifying how these approaches can be systematically
from their position [4]. The Director of the National Institute combined to enhance forecasting accuracy and decision-
of Allergy and Infectious Diseases was put on leave [5]. A making quality.
consistent decline in practitioners, scientists, and epidemiol- Past work has investigated the strengths and weaknesses of
ogists associated with public health has been suggested to the ability of experts to produce well-performing predictions
1 Department of Computer Science and Engineering, PC Rossin College about the future [13]. Past studies have shown that experts
of Engineering and Applied Sciences, Lehigh University, Bethlehem, PA, excel in assessing factors and mechanisms that are linked
USA to worse health outcomes for a population [14], [15]. In
2 Council of State and Territorial Epidemiologists, Atlanta, Georgia, USA
3 Department of Population Health, College of Health, Lehigh University, addition to assessing the link between factors and health
Bethlehem, PA, USA conditions, when faced with a challenge (e.g. outbreak),
4 Department of Civil and Environmental Engineering P.C. Rossin College
experts typically make better decisions compared to a novice.
of Engineering and Applied Sciences, Lehigh University, Bethlehem, PA, This decision making ability is termed recognition-primed
USA
5 Department of Biostatistics and Health Data Science, College of Health, decision making. Given a challenge, the expert can quickly
Lehigh University, Bethlehem, PA, USA recognize past, similar experiences and select from a set of
Fig. 1. Forecasts generated by a model (orange) and by a group of public health officials (blue) for two questions related to the 2024/25 influenza season in PA: (Top panel) would the majority of confirmed influenza cases be classified as H3 (vs H1), and (Bottom panel) what would be the peak number of incident hospitalizations. The height of the bars in both panels denotes the probability assigned to each potential future observation. The bars enclosed by dashed lines identify the ground truth, which was computed at the close of the season. Public health officials generated forecasts of these two seasonal targets that were competitive with a traditional computational model.

decisions that performed well [16], [17]. That said, experts do not need to be the primary decision makers and can supplement statistical models. Experts have been asked to: design probability densities for models with sparse data, make predictions about specific aspects of an infectious outbreak (e.g. peak intensity, weekly cases, etc.), and make long-range predictions about vaccine efficacy [18], [19], [20]. However, expert judgment, like all human judgment, is susceptible to biases like group-think and anchoring [21]. Expert predictions are not always as accurate as we would expect and may be overly confident [22]. Importantly, expert judgment is crucial in the communication of information to the public.

In what follows, we contribute a case study comparing expert forecasts to two computational models, an assessment of the ability of experts to generate reasonable mechanisms for their predictions, and a recommendation of the necessary properties of a toolkit that could be used to collect expert predictions and rationales similar to how surveillance data is collected.

Our case study on expert forecasting of seasonal influenza examines how expert judgment functions in practice under a time constraint. On November 20th, 2024, at the Council of State and Territorial Epidemiologists workshop on infectious disease forecasting, we posed two questions to public health officials, epidemiologists, and infectious disease modelers who attended the conference: in the state of Pennsylvania (PA), during the 2024/25 season, (1) will the majority of lab-confirmed cases of influenza be classified as H1 or H3 (H3 often results in more severe symptoms)? (2) what will be the peak number of confirmed hospitalizations due to influenza, and why (e.g. a rationale for their prediction)?

We received 217 responses from 114 experts (raw data provided in the Supplement). Experts at the conference were provided brief background information to help them form a prediction and were given just five minutes to answer each question (the format for the questions can be found in the Supplement). The short timeframe to answer questions was meant to emphasize the role of public health experience over a more laborious study of background data. In addition to quantitative predictions, we analyzed expert rationales using Latent Dirichlet Allocation to identify themes in their reasoning.

II. RESULTS

On May 31, 2025, at the close of the 2024/25 influenza season, the reported percent of H3 influenza cases in Pennsylvania was 40.0% and in the US was 47.3% (see Figure 1A). The percent of H3 influenza cases in Pennsylvania was on average 49% over the past three seasons (84% in season 2021/22, 53% in 2022/23, and 20% in 2023/24). Experts (vs the model) assigned a probability of 0.25 (vs 0.06) to the true percent of H3 influenza cases this season. In addition, experts (vs model) assigned a probability of 0.01 (vs 0.38) to an unlikely percent of H3 below 20%. The model, though, did report a larger variance when compared to experts (model variance 11% vs 2% for experts).

In Pennsylvania, the peak number of incident hospitalizations for the 2024/25 season was 4,318 (see Figure 1B). This number of incident hospitalizations was the highest value compared to the past three seasons (200 in season 2021/22; 1,299 in 2022/23; and 933 in 2023/24). The model assigned a higher probability to the observed peak number of hospitalizations compared to the aggregate prediction of experts (model prob. = 0.07 vs experts prob. = 0.05). That said, the model also assigned high probability to implausibly small peaks of less than 200 peak hospitalizations (model prob. = 0.21 vs experts = 0.03).

When asked to provide a rationale for their prediction of peak incident hospitalizations, expert rationales focused on: historical flu patterns; how other pathogens like COVID-19 may modulate influenza intensity; scenarios such as whether the season is (or is not) an H3-dominant season; and
Fig. 2. Latent Dirichlet Allocation identified five topics from 102 expert responses, with prevalence shown as percentage of corpus. Topics demonstrate
experts’ consideration of historical flu patterns, COVID-19 interactions, recent trends, comparative assessments, and vaccination data—illustrating the
diverse reasoning approaches that complement computational models in infectious disease forecasting.

vaccine efficacy and uptake data. These topics were identified through Latent Dirichlet Allocation analysis of the rationales provided by experts (see Figure 2).

Several experts noted the interplay between COVID-19 and influenza, adding rationales like "with only two seasons without extreme COVID-19 interference, I inferred it may be time in a similar range to those two for the upcoming season" and "Prior seasons with less covid activity have coincided with stronger flu activity. [...] I'd guess that we won't encounter a very high burden covid variant that displaces flu virus." Similar to the model, experts also considered previous peak data for influenza: "Looking at recent flu seasons, peaks have gone from about 1200 to about 1000. I am assuming there is a trend towards returning to lower peak incidence after reporting changes" and "Prior flu seasons have peaks around 1K in January". Unlike the model, though, experts were able to tap a large breadth of knowledge on causal mechanisms that impact influenza: "Poor vaccine strain match will lead to a more severe influenza season than observed last year.", "Decreased flu vaccine uptake year by year since the onset of pandemic; similar or slightly higher combined respiratory burden this year.", and "last years average + bird flu". Some experts even tapped, in real-time, a math model, writing "Math model run in real time on the fly now". Notably, some experts were able to qualify their own potential biases when making predictions. One expert commented "When I saw the historical data from the link, I instinctively leaned [sic. towards this data] because my data scientist eye loves patterns, but : what if I'm wrong?" as did another who wrote "I am heavily biased to the most recent season which happened to fall in between the range of the previous ones."

III. DISCUSSION

Our case study suggests complementary strengths between expert judgment and computational modeling in infectious disease forecasting. Public health experts' probability estimates for H3 influenza dominance aligned more closely with the observed outcome. While the forecasts generated by the model and experts performed similarly for peak hospitalization predictions, the model assigned substantial probability to implausible scenarios that experts correctly identified as unlikely, such as peak hospitalizations below 200. A major advantage to collecting predictions from experts is their ability to describe their reasoning, while computational models only offer numerical output. These findings suggest that optimal forecasting approaches should collect both computational and expert forecasts (plus rationales), potentially leveraging natural language processing tools like small language models to systematically incorporate expert rationales into computational frameworks. Given just five minutes to produce a prediction and rationale, this work highlights the value of experience in public health practice.
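One simple way to combine computational and expert forecasts defined over the same discrete bins is a linear opinion pool (a weighted mixture of the two probability vectors). The sketch below is a minimal illustration of that idea; the bin probabilities and the weight are assumptions made for the example, not quantities estimated in this study.

```python
# Combine a computational-model forecast with an aggregate expert forecast
# over the same discrete bins via a linear opinion pool.
# All probabilities below are illustrative assumptions, not study data.

def linear_pool(expert_probs, model_probs, w=0.5):
    """Mixture w * expert + (1 - w) * model, renormalized for safety."""
    assert len(expert_probs) == len(model_probs)
    mixed = [w * e + (1 - w) * m for e, m in zip(expert_probs, model_probs)]
    total = sum(mixed)
    return [p / total for p in mixed]

# Hypothetical probabilities over eight peak-hospitalization intervals.
experts = [0.03, 0.05, 0.10, 0.15, 0.20, 0.22, 0.15, 0.10]
model   = [0.21, 0.14, 0.13, 0.12, 0.12, 0.11, 0.10, 0.07]

pooled = linear_pool(experts, model, w=0.5)
print([round(p, 3) for p in pooled])
```

In practice the weight w could itself be fit to historical performance, in the spirit of chimeric forecasting approaches that combine computational and human-judgment predictions [20].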
This study has several important limitations. We lacked a direct comparison with non-expert predictions, limiting our ability to quantify the specific value of public health expertise versus general forecasting ability. Though the majority of attendees participated, not all conference attendees did, which could bias predictions toward individuals more confident or skilled in forecasting tasks. Our analysis focused on only two forecasting targets for a single influenza season in one geographic location, which limited the scope for assessing expert performance across diverse scenarios or pathogens. Additionally, our simple computational model does not represent the full spectrum of sophisticated modeling approaches available, such as ensemble methods or machine learning techniques that might perform differently against expert judgment.

The strengths and susceptibilities of human judgment in forecasting align with established literature on expert decision-making under uncertainty [13]. As demonstrated in our results and supported by prior research, experts excel at recognition-primed decision making, rapidly drawing on accumulated experience to assess plausible scenarios and causal mechanisms that purely data-driven models may miss [16], [17]. However, expert forecasts remain vulnerable to cognitive biases, overconfidence, and anchoring effects that can compromise accuracy. This was exemplified in our work as smaller predictive variance for experts compared to a computational model. These biases may be particularly salient when experts are asked to make decisions in novel or rapidly evolving situations where past experience may be less applicable [21], [22].

Past forecasting systems that aggregate predictions and rationales from humans have found success that public health practice should explore. Modern forecasting increasingly relies on prediction platforms and specialized tools that leverage collective intelligence to enhance decision-making across diverse domains. Metaculus, an online forecasting platform, directly solicits probabilistic forecasts from its global community and aggregates these individual probabilities using sophisticated machine-learning-weighted predictions that outperform traditional prediction markets. Specialized tools like IDEAcology streamline rigorous expert elicitation through protocols designed for quantitative and probabilistic estimates in fields such as ecology and biosecurity [23].

The success of existing forecasting platforms underscores the value of establishing formal systems for expert knowledge collection. For public health specifically, developing a dedicated web-based toolkit to coordinate national collection and analysis of expert rationales would treat human judgment as a critical data source alongside traditional surveillance systems. We posit a dedicated database of expert predictions, rationales, and the context in which they were made could improve crisis response.

To support public health response, especially during times of resource constraints, we recommend that future work address the construction of tools to amplify expert judgment, expertise, the decisions that were made, and the results of those decisions. Expert reasoning should be considered a quantifiable data source on par with traditional data sources collected via surveillance systems.

IV. RECOMMENDATIONS

We recommend the development of a toolkit that addresses the challenge of systematically capturing and leveraging expert judgment in public health forecasting. Although computational models provide quantitative predictions, they often lack the contextual reasoning and domain expertise that human forecasters bring to complex epidemiological scenarios. The proposed approach should combine expert forecasts with their underlying reasoning to create both improved predictive models and insights into decision-making processes.

Such a toolkit would require three components. First, the system should systematically collect forecasts from public health experts alongside structured documentation of their reasoning, assumptions, and the ground truth. Second, this system should employ a language model to identify characteristics of effective versus ineffective reasoning based on collected data from the first component. Third, a dashboard would present ensemble predictions, historical data, and a visual of common reasoning patterns across experts.

A toolkit such as the one we propose should be revised iteratively, with feedback collected from public health officials. This user-centered approach will ensure that data collection procedures, survey instruments, and dashboard interfaces align with existing workflows and decision-making needs in public health practice.

ACKNOWLEDGMENTS

The authors thank Dr. Tomás Martín León (Modeling Section Chief, California Department of Public Health) and Justin Crow, MPA (Foresight & Analytics Coordinator, Virginia Department of Health) for their valuable feedback.

REFERENCES

[1] "Hank Aaron, Robert F. Kennedy Jr, and the Public's Health," AJPH, vol. 115, no. 2. Accessed July 23, 2025. [Link]
[2] S. H. Woolf, S. Galea, and D. R. Williams, "The potential impact of the Trump administration policies on health research in the USA," The Lancet, vol. 405, no. 10495, pp. 2114-2116, 2025.
[3] Y. Takakazu, "The Trump Administration's Domestic Health Policy and Global Health," Asia-Pacific Review, vol. 32, no. 1, pp. 35-53, 2025.
[4] C. J. R. Daval and A. S. Kesselheim, "The Advisory Committee on Immunization Practices—Legal Roles, Challenges, and Guardrails," JAMA, June 26, 2025.
[5] "Trump administration purges U.S. health agency leaders," Accessed July 8, 2025. [Link]
[6] J. Liu and K. Eggleston, "The Association between Health Workforce and Health Outcomes: A Cross-Country Econometric Study," Soc Indic Res, vol. 163, no. 2, pp. 609-632, 2022.
[7] T. McAndrew, A. A. Lover, G. Hoyt, and M. S. Majumder, "When data disappear: public health pays as US policy strays," The Lancet Digital Health, vol. 0, no. 0, 2025.
[8] C. P. Duggan and Z. A. Bhutta, "'Putting America First' — Undermining Health for Populations at Home and Abroad," New England Journal of Medicine, vol. 392, no. 18, pp. 1769-1771, 2025.
[9] R. C. Brownson, J. G. Gurney, and G. H. Land, "Evidence-based decision making in public health," J Public Health Manag Pract, vol. 5, no. 5, pp. 86-97, 1999.
[10] S. Juin, N. Schaad, D. Lafontant, et al., "Strengthening National Disease Surveillance and Response—Haiti, 2010–2015," Am J Trop Med Hyg, vol. 97, no. 4 Suppl, pp. 12-20, 2017.
[11] E. J. Barzilay, N. Schaad, R. Magloire, et al., "Cholera Surveillance during the Haiti Epidemic — The First 2 Years," New England Journal of Medicine, vol. 368, no. 7, pp. 599-609, 2013.
[12] J. W. Tappero and R. V. Tauxe, "Lessons Learned during Public Health Response to Cholera Epidemic in Haiti and the Dominican Republic," Emerg Infect Dis, vol. 17, no. 11, pp. 2087-2093, 2011.
[13] M. Zellner, A. E. Abbas, D. V. Budescu, and A. Galstyan, "A survey of human judgement and quantitative forecasting methods," Royal Society Open Science, vol. 8, no. 2, 201187, 2021.
[14] A. Verwiel and W. Rish, "Multidisciplinary perspectives on cumulative impact assessment for vulnerable communities: expert elicitation using a Delphi method," Integrated Environmental Assessment and Management, vol. 21, no. 2, pp. 301-313, 2025.
[15] C. C. Hammer, J. Brainard, and P. R. Hunter, "Risk factors for communicable diseases in humanitarian emergencies and disasters: Results from a three-stage expert elicitation," Global Biosecurity, vol. 1, 2019.
[16] P. R. Falzer, "Naturalistic Decision Making and the Practice of Health Care," Journal of Cognitive Engineering and Decision Making, vol. 12, no. 3, pp. 178-193, 2018.
[17] "Collaborative Activities During an Outbreak Early Warning Assisted by a Decision-Supported System (ASTER)," International Journal of Human–Computer Interaction, vol. 26, no. 2-3. Accessed July 9, 2025. [Link]
[18] C. J. Cadham, M. Knoll, L. M. Sánchez-Romero, et al., "The Use of Expert Elicitation among Computational Modeling Studies in Health Research: A Systematic Review," Med Decis Making, vol. 42, no. 5, pp. 684-703, 2022.
[19] T. McAndrew, J. Cambeiro, and T. Besiroglu, "Aggregating human judgment probabilistic predictions of the safety, efficacy, and timing of a COVID-19 vaccine," Vaccine, vol. 40, no. 15, pp. 2331-2341, 2022.
[20] T. McAndrew, A. Codi, J. Cambeiro, et al., "Chimeric forecasting: combining probabilistic predictions from computational models and human judgment," BMC Infect Dis, vol. 22, no. 1, 833, 2022.
[21] A. Tversky and D. Kahneman, "The Framing of Decisions and the Psychology of Choice," Science, vol. 211, no. 4481, pp. 453-458, 1981.
[22] G. Recchia, A. L. J. Freeman, and D. Spiegelhalter, "How well did experts and laypeople forecast the size of the COVID-19 pandemic?" PLOS ONE, vol. 16, no. 5, e0250935, 2021.
[23] S. K. Courtney Jones, S. R. Geange, A. Hanea, et al., "IDEAcology: An interface to streamline and facilitate efficient, rigorous expert elicitation in ecology," Methods in Ecology and Evolution, vol. 14, no. 8, pp. 2019-2028, 2023.
[24] "Respiratory Virus Dashboard," Pennsylvania Department of Health. Accessed July 25, 2025. [Link]
[25] "Weekly Hospital Respiratory Data (HRD) Metrics by Jurisdiction, National Healthcare Safety Network (NHSN)," Data, Centers for Disease Control and Prevention. Accessed July 25, 2025. [Link]

V. METHODS

A. Data Collection and determination of ground truth

Participants were conference attendees at the Infectious Disease Forecasting Workshop hosted by the Council of State and Territorial Epidemiologists (CSTE) and Centers for Disease Control and Prevention (CDC). The conference period was November 19-21, 2024. Conference attendees were experts in public health, epidemiology, and infectious disease modeling. Most, if not all, states were represented at the conference by active public health officials. We denote this group as experts.

On November 20, 2024, we posed two questions to experts during an invited talk. Question one asked "Please assign a probability to this upcoming season being characterized as an H3 season in Pennsylvania". Experts were able to provide the following answers: 10%, 20%, ..., 100% probability that the season would be H3 dominant. Question two asked "What will be the peak number of influenza hospitalizations in PA for the 2024/25 season?". Experts were able to choose from a set of ranges: [0-200], [201-400], [401-600], [601-800], [801-1000], [1001-1200], [1201-1400], [1401-]. Brief background information was given to experts to aid them in forming predictions (see Supplemental).

Ground truth for both questions was determined on May 31, 2025, after the conclusion of the typical influenza season in the northern hemisphere (after MMWR week 2025W22). Ground truth on an H1 vs H3 dominant season was collected from the Pennsylvania Department of Health's Respiratory dashboard [24], and ground truth about the peak incident hospitalizations due to influenza was collected from the Weekly Hospital Respiratory Dataset, which is hosted by the National Healthcare Safety Network as part of CDC [25].

B. Evaluation

Aggregate predictions from experts were compared to corresponding computational models that were trained on historical, observed data. The premise for this comparison is that if aggregate predictions from experts outperform reasonable computational models, then these models, and the data that they are trained on, are not capturing information used by experts. This "unknown" information may include expertise developed over years in the field.

C. H1 vs H3 dominant influenza season

Because the question asked experts to assign a probability to an H3 (vs H1) season, we can construct an aggregate density that assigns probability values to the future proportion of influenza cases that are typed as H3. Given N predictions, the expert aggregate forecast assigned a probability to x% of cases typed as H3 equal to the number of experts who answered x% divided by N, which we denote px. We can build a probability density by assigning px to the interval [x, x + 10%). Our proposed computational model was a kernel density estimate that was trained on the past proportions of cases that were typed as H3 from the Pennsylvania Department of Health Respiratory Dashboard for seasons 2021/22 up until 2022/23 (see Supplemental for data set).

D. Peak intensity of confirmed influenza hospitalizations

The aggregate forecast from experts was a probability assignment to the eight intervals outlined in the section titled Data Collection and determination of ground truth. The probability assigned to interval I was computed as the number of experts who selected that interval divided by all experts who participated. Our proposed computational model was a kernel density estimate that was trained on the past number of peak hospitalizations in Pennsylvania for seasons 2021/22 up until 2022/23 (see Supplemental for data set).

The rationales provided for predictions (N = 102) were analyzed using Latent Dirichlet Allocation to identify common themes. Text preprocessing included removing stopwords,
lemmatization, and filtering words appearing in <3 or >80% of responses. After preprocessing, there were 61 unique words across the 102 rationales. Model selection involved training the LDA model for 2-8 topics and choosing the number of topics that resulted in the highest coherence score. The highest coherence score (0.38) was achieved with 5 topics. The final model used α = 0.083, β = 0.01, trained for 100 iterations using Gensim 4.3.3. One author (GH) analyzed the topics and drafted interpretations independently, based on the words included and example responses. These interpretations were then reviewed and confirmed with a co-author (TM).
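The expert aggregation described in Methods C and D reduces to counting how many experts chose each allowed answer and dividing by N, with that mass attached to the corresponding interval. The sketch below illustrates this scheme; the list of answers is an assumption made for the example, not the elicited data.

```python
# Build the expert aggregate forecast described in Methods C:
# p_x = (# experts answering x%) / N, assigned to the interval [x, x + 10%).
# The answer list is an illustrative assumption, not the elicited data.
from collections import Counter

def aggregate_density(answers, choices=range(10, 101, 10)):
    """Map each allowed answer x (in percent) to p_x = count(x) / N."""
    n = len(answers)
    counts = Counter(answers)
    return {x: counts.get(x, 0) / n for x in choices}

# Ten hypothetical expert answers to "probability this is an H3 season".
answers = [30, 40, 40, 50, 50, 50, 60, 60, 70, 90]
density = aggregate_density(answers)
print(density[50])  # 3 of 10 experts answered 50%, so p_50 = 0.3
```

The peak-hospitalization aggregate in Methods D follows the same pattern with the eight elicitation intervals in place of the percentage choices; the comparison kernel density estimate over past seasons is not sketched here.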
VI. ETHICS APPROVAL STATEMENT

In consultation with the Lehigh University Institutional Review Board (IRB), this work was deemed not to involve human subjects and not to need formal evaluation.
VII. DATA AVAILABILITY AND ANALYSIS REPRODUCIBILITY

All data and code used to conduct the above analysis are available at [Link]. In particular, a Makefile is provided that formats the data and then runs the analysis for this work.
VIII. SUPPLEMENTAL MATERIALS

1. Line list of predictions and rationales from experts
2. Forms used for data collection (i.e. expert elicitation)
