The Generative Era of Medical AI Cell

The document discusses the transformative impact of generative AI and large language models (LLMs) on the medical field, enhancing diagnostics, patient interactions, and personalized healthcare. It highlights the integration of multimodal AI, which combines diverse data types for improved medical screening and decision-making, while also addressing challenges such as bias and privacy. The review emphasizes the potential for AI to revolutionize healthcare delivery by enabling proactive management and personalized patient care through advanced screening and continuous monitoring technologies.


Leading Edge

Review
The generative era of medical AI
L. John Fahrner,1,3 Emma Chen,1,3 Eric Topol,2,4,5 and Pranav Rajpurkar1,4,5,*
1Department of Biomedical Informatics, Harvard Medical School, Cambridge, MA, USA
2Scripps Research, La Jolla, CA, USA
3These authors contributed equally
4These authors contributed equally
5Senior author

*Correspondence: [email protected]
https://doi.org/10.1016/j.cell.2025.05.018

SUMMARY

Rapid advancements in artificial intelligence (AI), particularly large language models (LLMs) and multimodal AI, are transforming medicine through enhancements in diagnostics, patient interaction, and medical forecasting. LLMs enable conversational interfaces, simplify medical reports, and assist clinicians with decision making. Multimodal AI integrates diverse data like images and genetic data for superior performance in pathology and medical screening. AI-driven tools promise proactive, personalized healthcare through continuous monitoring and multiscale forecasting. However, challenges like bias, privacy, regulatory hurdles, and integration into healthcare systems must be addressed for widespread clinical adoption.

INTRODUCTION

Technological innovation in biomedicine has directly contributed to improved quality of life and extended healthspan. Historically, advances in drug development, surgical techniques, understanding of biological pathways, imaging techniques, and other areas have propelled this progress. Now we are on the verge of a new phase of growth with the recent progress in artificial intelligence (AI), which we will attempt to summarize here. The weekly Doctor Penguin newsletter has tracked novel developments in medical and health AI since 2019 and serves as a source of material for this review (https://doctorpenguin.substack.com). From a technical perspective, modern AI advancements have been enabled by several key architectural innovations, including the Transformer architecture, generative adversarial networks, and diffusion models, which together have powered the development of increasingly sophisticated generative AI systems. Research has shown the potential for transformative change in three areas: large language models (LLMs) and multimodal AI, changing medical practice, and multiscale medical forecasting. This review aims to summarize this seemingly exponential progress over the last 3 years. We will discuss the background, implementation, implications, and some of the persistent challenges associated with these new technologies.

LLMs AND THE PATH TO MULTIMODAL MEDICINE

The promise of AI in healthcare dates to the 1960s, when Joseph Weizenbaum developed ELIZA, one of the first chatbots.1 ELIZA simulated a Rogerian psychotherapist, engaging in simple dialogue with users. Subsequent efforts to create conversational AI for medicine were hindered by the limited capabilities of early AI systems, which were not reliable enough for real-world clinical applications.
Today, LLMs like ChatGPT, Gemini, Claude, and Llama have captured the attention of the world. These models exemplify two important paradigms in modern AI: foundation models (large-scale, general-purpose AI systems trained on vast datasets that can be adapted to numerous downstream tasks) and generative AI, which enables the creation of novel content such as text, images, or molecular designs by modeling complex data distributions. Unlike traditional AI, which predominantly focused on discriminative classification tasks and relied on specialized architectures for specific domains, generative AI learns to generate outputs that statistically resemble the training data. This capability stems primarily from the Transformer architecture, introduced in 2017, which has redefined scalability and performance in AI.2
The Transformer's core innovation is self-attention, a mechanism that dynamically weighs the relevance of different input elements, allowing the model to capture long-range dependencies in data, such as relationships across sentences or protein sequences. In generative tasks, the decoder component of the Transformer is critical; it generates outputs sequentially (e.g., one word or token at a time) by attending to both the input context and previously generated elements. This architecture powers LLMs, such as those driving chatbots or text synthesis tools, which are trained on vast datasets to learn intricate patterns in language or other domains. Earlier approaches like recurrent neural networks (RNNs) and convolutional neural networks (CNNs) faced fundamental scaling bottlenecks: RNNs struggled with parallelization and long-range dependencies, while CNNs had locality biases that limited their effectiveness for sequential data.
The key discovery that Transformer models consistently improve as they grow larger accelerated recent AI development.
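The self-attention mechanism described above can be illustrated with a minimal sketch. It assumes a single attention head, uses toy two-dimensional token embeddings directly as queries, keys, and values (real Transformers apply learned projection matrices and many heads), and applies a causal mask so that each position attends only to itself and earlier positions, as a decoder does during generation. The function names and the toy data are hypothetical and are not taken from any of the models cited in this review.

```python
import math


def softmax(scores):
    """Turn raw scores into weights that are positive and sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def causal_self_attention(queries, keys, values):
    """Scaled dot-product attention; position i sees only positions <= i."""
    dim = len(keys[0])
    outputs, all_weights = [], []
    for i, query in enumerate(queries):
        # Relevance of each visible position j to the current position i.
        scores = [
            sum(q * k for q, k in zip(query, keys[j])) / math.sqrt(dim)
            for j in range(i + 1)
        ]
        weights = softmax(scores)
        # Output is the attention-weighted average of the visible values.
        mixed = [
            sum(w * values[j][d] for j, w in enumerate(weights))
            for d in range(len(values[0]))
        ]
        outputs.append(mixed)
        # Pad with zeros for the masked (future) positions.
        all_weights.append(weights + [0.0] * (len(queries) - i - 1))
    return outputs, all_weights


# Three toy tokens with 2-dimensional embeddings (illustrative values only).
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
outputs, weights = causal_self_attention(tokens, tokens, tokens)

for position, row in enumerate(weights):
    print(position, [round(w, 3) for w in row])
```

Each printed row shows how strongly one position attends to every position in the sequence. The first token can attend only to itself, and later tokens spread their weight across everything generated so far, which is how the mechanism captures long-range dependencies without processing the sequence step by step.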

3648 Cell 188, July 10, 2025 © 2025 Elsevier Inc.


All rights are reserved, including those for text and data mining, AI training, and similar technologies.

Researchers found that simply increasing the model size, training data, and computing power leads to predictable gains in performance, a property not seen with previous approaches, where improvements would eventually plateau.3 This mathematical predictability, coupled with advances in specialized computing hardware and the availability of petabyte-scale datasets, established the precise conditions necessary for the current AI revolution. The convergence of these factors has positioned generative AI as a transformative tool for applications like drug discovery, clinical decision support, and automated analysis of medical literature, offering unprecedented opportunities to accelerate biomedical research.
When applied to direct patient interaction, LLMs are charting a path toward meaningful conversational AI in medicine. In this application, LLMs can provide patients with accessible conversational interfaces to interact directly with their own individual health data in the electronic health record (EHR) and also with general medical information.4–6 For example, LLM agents have been used to reduce the complexity of pathology reports and to transform hospital discharge summaries into a patient-friendly format.7,8 When mental health chatbots are made available to patients, they have shown potential in reducing stigma about mental health care and have demonstrated increased referral rates, most significantly for traditionally underrepresented groups.9 These results are notable, as the first steps of seeking care and receiving an appropriate referral are common barriers in the mental health pathway. These conversational agents can assist patients in navigating their healthcare course, providing personalized information and support. While many LLM tools await medical approval, early reports suggest patients are already testing their benefits; in one example, a mother was able to diagnose her young son's tethered cord after multiple fruitless physician visits.10
In addition to conversational and summarization agents, LLMs can be tools for clinicians.11 Models in the research setting have demonstrated performance at least comparable to clinicians in history-taking, following diagnostic pathways, communication, and empathy.11–14 LLMs can also serve as medical knowledge resources for clinicians. Dedicated LLMs have been developed for specialized fields to give clinicians access to expert knowledge and to assist with decision-making and guideline adherence, and these have already gained certification.15–17 Models now routinely achieve passing scores on medical licensing exams, showcasing their potential to provide up-to-date and comprehensive medical information.18–21 By leveraging the vast knowledge encapsulated within LLMs, these diagnostic tools promise to aid clinicians in making accurate and timely diagnoses and guiding management decisions.
Beyond these applications, LLMs could be integrated into healthcare delivery to automate documentation tasks and improve clinician efficiency. AI-powered "scribes" are capable of recording patient histories; creating medical notes; handling pre-authorization requests for medications or tests; scheduling follow-up appointments; and managing lab test results, scans, procedures, billing, and more. They are already being used clinically.11,22–25 Notably, LLMs have shown the ability to summarize medical information as effectively as human experts. Much of the emerging research on LLMs focuses on "agentic" environments, in which AI systems dynamically complete complex tasks by interacting with existing systems, humans, or other agents. Agentic systems promise to automate workflows, validate AI safety and reduce errors, aid in managing disparate AI tools, and provide outcome predictions, among other skills.26 Polaris AI exemplifies this agentic approach through its "constellation architecture," where a primary conversational agent works in concert with specialized LLM agents (including medication specialists that verify dosages, labs specialists that analyze test results, and nutrition specialists that provide tailored dietary guidance), enabling the system to maintain both engaging conversation and medical accuracy while ensuring built-in safety redundancies for healthcare interactions.27
Recent advancements in chain-of-thought prompting and reasoning techniques have addressed the challenge of ensuring accurate and clinically relevant outputs from LLMs.28,29 These approaches have facilitated the development of datasets optimized for reasoning and, more recently, specialized reasoning models.30–32 As these techniques evolve, large reasoning models are poised to become increasingly prevalent in clinical applications.

Multimodal AI and foundation models
Early medical AI systems were dedicated single-task models predominantly trained on specific medical datasets, which required tedious manual labeling. This burden was slightly reduced by techniques like self-supervised learning (automatic interpretation of training data without explicit human labeling) and few-shot learning (more efficient learning using fewer curated examples).33,34
Medicine is inherently a multimodal domain, where clinical insights arise from combining radiology scans, patient records, genomic sequences, and spoken consultations.35 Traditional AI models, often limited to single modalities, struggled to capture this complexity. Multimodal generative AI overcomes these limitations by learning unified representations across modalities, enabling a deeper understanding of medical data. A pivotal advancement in this field is the Contrastive Language-Image Pretraining (CLIP) model (introduced in 2021 by OpenAI), which uses contrastive learning to align vector representations of images, text, and potentially other modalities (e.g., audio spectrograms) into a shared latent space.36 CLIP trains on paired data (e.g., images and captions) by maximizing the similarity between matched pairs while minimizing similarity between unmatched pairs, creating a unified space where related concepts across modalities (like a medical image of a tumor and its textual description) are positioned close together. This alignment enables generative models to process and generate multimodal outputs, such as synthesizing medical reports from imaging data or answering clinical queries by combining visual and textual inputs. Theoretically, this integration mimics human reasoning, where physicians synthesize diverse information to form diagnoses, making multimodal AI a fundamental breakthrough for medicine. By modeling statistical relationships across modalities, multimodal generative AI transforms medical applications; it enhances diagnostic accuracy by correlating imaging and clinical notes, accelerates drug discovery by integrating molecular structures with textual annotations, and personalizes treatment plans by combining patient histories with real-time sensor data. Unlike unimodal models, which risk
fragmented insights, multimodal AI captures the holistic nature of medical data, driving progress in precision medicine and clinical decision support. The scalability of models like CLIP, which improve with diverse and large-scale datasets, further amplifies their impact, positioning multimodal generative AI as a cornerstone of the generative AI era in medical research.37–44 Current multimodal AI research is focused on incorporating more modalities into a single model and on incorporating video and volumetric datasets like magnetic resonance imaging (MRI) and computed tomography (CT).
Multimodal AI is rapidly progressing in the field of pathology. Large pathology training sets can incorporate standardized images of slides and specimens with text reports, genomic data, and EHR data, providing a prime target for AI research.45 By applying the Transformer architecture, researchers have created "vision-language" models, which incorporate these text and image-analysis components into a unified model. These models are capable of advanced tasks like image captioning and answering questions about an image. An early model, named pathology language-image pretraining (PLIP), was trained on publicly available social media pathology images and captions.46 Later models used larger datasets; PathChat applied an LLM to the large UNI pathology image model to create a vision-language model, while contrastive learning from captions for histopathology (CONCH) trained natively on around 1 million image-text pairs from diverse sources.34,47 Both approaches resulted in accurate multimodal pathology models. As foundation models become more capable, it is likely that these single larger models will continue to replace dedicated smaller models.

CHANGING MEDICAL PRACTICE

Reviewing the extensive recent research in biomedical AI allows us to anticipate the future direction of medical care. State-of-the-art AI developments point to a new model of health management in which patients have more regular and detailed insight into their health. Patients are empowered to manage their own health with AI tools that provide more timely and personalized feedback. Traditional screening programs can become more tailored and personalized. AI has the potential to enable a transformation in the delivery of healthcare by shifting more care from reactive, hospital-centric treatment to proactive, personalized, and accessible health management. Patients can be triaged to more fine-tuned levels of care, with progressive escalation as needed. Earlier diagnosis and intervention reduce the reliance on acute care resources and can lead to improved outcomes. Finally, these new AI-powered tools promise to revolutionize medical forecasting with multiscale capabilities; predictions can be made at molecular, cellular, individual, and population scales, completely rethinking the standard model of care. Powerful AI models will enable more accurate and dynamic short-term and long-term risk assessment. These new AI-powered tools promise to improve the accessibility of care, improve decision-making, and provide more targeted management.

Continuous monitoring and patient agency
Traditional wearable physical sensors such as fitness trackers have been isolated from the rest of the health ecosystem. Now, newer AI-enabled smartwatches have demonstrated the ability to identify individuals at risk of atrial fibrillation, screen for left ventricular systolic dysfunction, and monitor cardiac function post-COVID-19 vaccination.48–51 Devices in research settings include implantable temperature sensors for early detection of acute kidney rejection in transplant patients, sensors for continuous cortisol level detection in sweat, and wearable ultrasound devices for physiologic monitoring that use machine learning to maintain high-quality images during movement.52–54 Some AI-powered sensors do not provide continuous monitoring but do allow for more accessible diagnosis. For example, smartphone-based diagnostic tools, like dermoscopy lenses coupled with AI models, have demonstrated high accuracy in diagnosing suspicious skin lesions, reducing the need for in-person consultations.55 Similarly, smartphones with endoscope or otoscope attachments, combined with AI models, have been shown to assist with accurate remote diagnosis of diseases such as acute otitis media.56 By enabling individuals to take control of their health and well-being, these novel AI-enabled sensors have the potential to regularly monitor health in the home setting, to identify developing issues earlier, and to reduce the burden on healthcare systems.

Advanced medical screening
AI medical screening tools can detect disease earlier and more efficiently than traditional methods. Screening programs have traditionally targeted large populations with broad inclusion criteria. AI enables more accurate targeting of high-risk individuals to realize individual and societal benefits. For example, researchers have recalibrated low-dose lung cancer screening recommendations using AI analysis to prioritize workup for higher-risk patients and to decrease screening frequency for those at lower risk.57
2D and 3D mammography-based AI interpretation algorithms have matched human abilities in real-world clinical scenarios while clinical rollout continues.58,59 More advanced models that combine lesion detection and texture analysis to determine short-term and long-term breast cancer risks have been shown to improve overall risk assessment.60 AI applied to traditional mammography has also been shown to help determine which patients would most benefit from supplemental MRI, reducing missed cancers without a large increase in the MRI screening burden.61
Photography- and video-based screening tools have been shown to provide affordable and fast analysis of complex neurological disorders, and they have also been able to go a step further and provide clinical predictions about disease progression.62–64 For example, a retinal image-based system was developed to predict myocardial infarction, providing a less invasive screening option.65
Next-generation AI-powered screening promises to more accurately triage patients, improve screening efficiency, and improve predictive analytics with accessible technology (Figure 1). The final large component of this new AI-enabled paradigm in healthcare is multiscale medical forecasting.

Multiscale medical forecasting
Medical forecasting encompasses a broad set of integrated ideas in which the pattern-matching strengths of computers


Figure 1. Transformation of medical practice


AI-enabled medical practice transforms clinical care from sporadic interactions to continuous monitoring and regular check-ins. Rather than reactive hospital-
based management of more advanced diseases, medical events can be constantly addressed in familiar settings at an earlier stage. New medical knowledge can
more easily be integrated into care models, while new medications are created using new AI-enabled techniques.

are paired with traditional and new data inputs to enable earlier and more precise, accurate, personalized, affordable, efficient, equitable, and convenient medical diagnosis.66 AI algorithms are being used in medical forecasting to predict future events or outcomes based on personalized patient information after training on large datasets. Forecasting applies to the entire context of health, from the molecular level to the cellular level, the organ system level, the individual level, and the population and global levels (Figure 2).

Molecular-level progress
The field of protein science has been revolutionized by the development of an AI model called AlphaFold2, developed by DeepMind in 2020, which achieved unprecedented accuracy in protein structure prediction.67 This breakthrough, along with the independently developed RoseTTAFold, used attention mechanisms to predict 3D protein structures from amino acid sequences with near-experimental precision.68 AlphaFold2's success stemmed from its innovative use of multiple sequence alignments and its ability to learn spatial relationships between amino acids.67,69 This advancement not only accelerated structural biology research but also catalyzed developments in protein design, function prediction, and drug discovery; the developers of AlphaFold were awarded the 2024 Nobel Prize in Chemistry for this work.70
The impact of these advances extends beyond proof of concept. AlphaFold2 quickly expanded to include protein folding predictions for more than 200 million of the most common proteins found in over 1 million species.71 AlphaFold3 includes more complex biomolecular structures beyond individual proteins, expanding structure prediction capabilities to protein complexes and protein-ligand interactions.69 The pace of development of new AI tools has been staggering: within just a few weeks near the end of 2024, 10 major molecular-level research projects were released.72 These projects included a foundation model for DNA, a tool to predict protein-protein interactions, and a Human Cell Atlas, among others.73–75 These tools are crucial for understanding biological processes and for designing new therapeutics.70
However, although these models excel at predicting static structures, a significant challenge remains in capturing protein dynamics and flexibility.76 Current research is focused on extending these models to predict not just a single structure but also the range of conformations a protein might adopt under different conditions.69 This is particularly important as subtle changes in protein folding can lead to significant physiologic differences


Figure 2. Multiscale medical forecasting


AI algorithms can be used in medical forecasting to predict future medical events, based on various dynamic inputs. These algorithms can be applied at multiple
levels, from the molecular level to the population level.

and outcomes, with misfolded proteins implicated in diseases such as cystic fibrosis and Huntington disease.69 Currently these generative AI tools can be used to predict clinical outcomes of various cystic fibrosis mutations.77 AlphaFold2 has been helpful for predicting antigen proteins in pathogens like rotavirus and other infections.78 It has also been used in immunology research to predict antibody structure and assist with vaccine development, predict membrane protein structure and interactions to assist with drug development, characterize enzyme activity for diseases like porphyria, assist with research on drug resistance, and predict outcomes for certain acute lymphoblastic leukemia (ALL) subtypes, among many other applications.78 From a clinical perspective, these models promise to enable personalized prediction of the impact of genetic mutations on protein function and disease pathways, as well as help us understand the nature of cancer.
Building on the foundations of protein structure prediction, the field of protein generation and design has seen significant advancements. Tools like RFdiffusion and FrameDiff use generative techniques to generate 3D structural protein backbones.79,80 Sequence generation tools like ProGen, ProteinMPNN, and Evo can output amino acid sequences based on various inputs and allow researchers to create novel proteins with specific structures or functions. Evo 2 also represents a milestone: a multimodal foundation model incorporating DNA, RNA, and proteins in one large model.81 This shift from prediction to design represents a new frontier in protein engineering, opening possibilities for creating proteins tailored to specific tasks or environments.82–86 These models are helping to elucidate the mechanisms of biological action from DNA to RNA to protein, and they enable researchers to efficiently manipulate steps in these processes.
As AI continues to reshape protein science, the field is moving toward more integrated, multiscale approaches. The next frontier will likely involve developing AI systems that can not only design individual proteins but can also engineer entire protein networks or cellular systems, pushing the boundaries of synthetic biology and potentially enabling new approaches to treating complex diseases.

Cellular, organ-system, and individual-level forecasting
Cardiology
A wide range of experimental tools in cardiology can benefit from the pattern-matching capabilities of AI techniques. In the acute setting, AI models have the potential to alert clinicians to developing decompensation. For example, Lin et al. developed an electrocardiogram (ECG) model that monitored 12-lead ECGs of hospitalized patients and was able to alert providers of impending decompensation and to improve clinical outcomes.87 Other models have been able to identify patients at risk of hypotension, tachycardia, or hypoxia, based on standard vital sign monitors. Researchers were able to use ECG data to detect a pattern of occlusion myocardial infarction even without ST elevation, surpassing human abilities and allowing for earlier intervention.88 Sundrani et al. developed a bimodal model to predict tachycardia, hypotension, or hypoxia in the emergency department (ED), based on triage data and ECG/pulse plethysmograph (PPG) waveforms.89
AI-powered models have been developed to forecast the risk of future cardiovascular disease events such as heart attacks or strokes, based on various factors such as age, gender, and medical history. By identifying a unique set of patient variables from a potential list of thousands of variables, models have been shown to more accurately predict coronary artery disease (CAD) risk than was previously possible.90 In another model, researchers were able to identify 27 specific proteins in blood samples that can be used to create a personalized survival model that is more accurate than previous methods.91 ECG analysis can be used to predict the risk of future atrial fibrillation or left ventricular (LV) dysfunction after percutaneous coronary intervention (PCI), which in turn predicts which patients would most benefit from medical intervention.92,93


Cardiovascular disease risks can be estimated using imaging techniques. An AI tool called EchoCLIP was able to characterize subtle clinically significant changes over time on echocardiograms, which would be difficult for a human interpreter.94 The timing of future arrhythmic sudden death can be predicted based on myocardial scarring seen on MRI.95 Coronary CT angiography (CTA) studies are time-consuming to interpret manually, but Lin et al. were able to automate the process and show prognostic value for predicting future myocardial infarction.96 Coronary CTA can also show perivascular fat inflammation, allowing researchers to create an AI algorithm to estimate the risk of future cardiac events even when there is no obstructive coronary disease.97
Radiology, oncology, and other fields
In radiology, AI algorithms have been applied to standard MRI or CT studies to identify subtle image texture patterns that are not detectable by human clinicians. For example, researchers were able to use MRI data to reliably classify pediatric medulloblastoma into four subtypes based on image characteristics alone, facilitating the development of treatment regimens when there is no access to molecular testing.98 AI-based tools have similarly been developed for classifying lung cancer, breast cancer, neuroendocrine tumor, gastrointestinal stromal tumor, colorectal cancer, and other tumors, and they can be used to predict histopathology, grading, metastatic potential, and other clinically useful characteristics.99–103
Research in the field of oncology has made significant strides in leveraging large multimodal AI models for automated analysis of whole-slide pathology images. AI models have been shown to help determine susceptibility to chemotherapy agents in pancreatic adenocarcinoma by analyzing subtle morphological features in the tumor microenvironment, ultimately informing clinical outcomes.104,105 AI has facilitated the development of tools like tumor origin differentiation using cytological histology (TORCH), which can more reliably identify the origin of cancers with unknown primary sites, using cytological samples from pleural and peritoneal fluid.106 Models have been developed that can predict the risk of a patient developing pancreatic adenocarcinoma based on the patient's historical diagnoses and trajectory of diseases.107 An AI algorithm trained on pancreatic cancer patients was able to predict future complications following pancreatic resection, and its use reduced mortality by approximately 50% at 90 days when compared with the usual care in which the clinician did not have access to the algorithm.108
Other AI-based tools have been able to extract additional clinically relevant information from traditional sources. For example, a tool called RETFound used fundus photography and retinal optical coherence tomography to predict the presence of systemic

declines in gait speed in Parkinson disease patients, providing clinically helpful information about progression of the disease. Other contactless sensors such as cameras have been used to track neurodegenerative disease progression and even to identify the likely molecular etiology in patients with Friedreich's ataxia62 and Duchenne muscular dystrophy,112 allowing for earlier diagnosis, intervention, and personalized treatment plans.
AI models based on EHR data are capable of predicting readmission, mortality, and length-of-stay.89,113 An AI model trained on EHR data was able to predict the International Classification of Diseases (ICD) codes of a patient's next visit, increasing the ability to predict uncommon outcomes like pancreatic cancer and self-harm.114 Models have been able to predict seizure recurrence risk in pediatric patients, based on routine clinical notes, chart messages, and diagnostic studies.115

Population-level forecasting
Medical resources are fundamentally limited in our current medical system. Global forecasting allows for the optimal distribution of resources in order to provide the most benefit. For example, by modeling brain aging across populations, researchers are able to identify and address geographic, socioeconomic, and health factors that are associated with increased risks of dementia.116 Modeling the global spread of infections can provide information about where a disease may next present. And by modeling weather and population data, governments can anticipate impending heatstroke events.117

CHALLENGES AND LIMITATIONS

AI has the potential to transform healthcare delivery, but multiple challenges and limitations need to be addressed before its potential can be realized.
LLM development and use present challenges. Much of what has been shown is not based on real-world, prospective studies, but it is instead simulated with patient actors and theoretical cases. As the infrastructure supporting LLM deployment evolves, with the development of foundation models and standardized benchmarking,19 challenges related to accuracy, bias, privacy, and ethics persist.24 Earlier LLMs were prone to "hallucinating" information, although recent efforts have shown promise in mitigating this issue.118,119 Bias in training data and the tendency of LLMs to accept input text as truthful can also limit output accuracy.24,120 According to Han et al., medical LLMs currently do not meet standards of general or medical safety, although efforts to improve safety have been promising.121 Combining human skills with AI tools has the potential to improve care.122,123 However, more research is needed to understand how to effec­
conditions like heart failure and myocardial infarction in addition tively integrate these tools into medical workflows.120,124
to more predictably identifying sight-threatening diseases of the Another fundamental challenge lies in the development and
retina.109 validation of AI models. Older models were relatively small and
Harnessing contactless sensor data through the application of used a smaller set of curated training data. Determining how to
AI allows ‘‘ambient intelligence,’’ in which the patterns of sensors regulate these tools is and was a challenge; these models do
are interpreted by AI algorithms to learn about the surrounding not necessarily generalize well across different populations,
environment and patient movement; this has the potential to again raising concerns about bias. Significant research and re­
improve patient safety and clinical efficiency.110 For example, sources will be required to ensure that AI models are appropriate
Liu et al. developed a low-power radio-based sensor to monitor for any given situation or population. Larger and more diverse
gait.111 This device was able to identify statistically significant datasets may be required for training to ensure accurate

Cell 188, July 10, 2025 3653



Figure 3. Medical AI implementation roadmap


Basic science research slowly led to proof-of-concept models. Larger models and early clinical deployment can open the door to eventual clinical deployment
and optimization.

performance. To address the complexity of regulating regularly updated AI tools, the Food and Drug Administration (FDA) has developed the Predetermined Change Control Plan (PCCP) framework, in which a vendor undergoes initial certification for a product but then is able to update the product within certain boundaries.125 This acknowledges the benefits of product updates while maintaining safety standards and avoiding overwhelming FDA resources. Multiple other frameworks and entities can be used for regulation of AI products.

Generative AI models present significantly more challenges from a regulatory perspective.125 It becomes less clear what data were used for training, how the model performs on any one task or population, and how the output varies given the non-deterministic nature of generative AI tools. Additionally, any updates to a model or training data can significantly alter the model’s output. Medical device companies have begun to integrate generative AI features into their products, although more integral use and approval of generative AI remain in question.126 As AI tools become more generally capable and more unknowable, evaluating them may begin to look more like the evaluation of physicians, incorporating licensing exams and monitoring.125,127

Most AI models today are not transparent enough about their design and datasets to allow for recreations of the model. The black-box nature of these tools reduces understanding of the mechanisms of the model and of the uses and limitations of the outputs. Additionally, current AI models may not be truly reasoning but rather repeating trained information. This limitation raises questions about their reliability in novel clinical scenarios.128

Many prediction and forecasting studies are retrospective and may not be rigorous enough for clinical implementation. It may be years before widespread clinical benefit can be realized or proven for some of these tools, as the scientific and healthcare communities adapt to the new capabilities. Furthermore, when implementing a new model, questions arise about how to validate, test, and continuously update it with new information or advancements.

The pathway to integrating AI tools into existing healthcare systems presents challenges. Healthcare systems worldwide are complex, often outdated, and heterogeneous. Data integration between devices and into preexisting IT systems will require significant effort. One solution is unlikely to work everywhere. The infrastructure must be in place to seamlessly incorporate this information into EHRs and decision-making processes. Incorporating AI models into practice remains a long and arduous process (Figure 3).

The role of physicians is likely to evolve. In the future, physicians may become orchestrators or directors, managing the most complex cases and overseeing other providers who review the more routine cases. Physicians could manage departmental operations, quality control, tumor boards, and procedures, while also bearing legal liability. Research has shown that different physicians can respond to AI tools variably, highlighting the need for additional research into understanding the use of AI tools.124,129 Healthcare providers must be trained to interpret and act upon the data generated by these tools. Skepticism or hesitancy about new AI tools is inevitable and will need to be addressed, particularly when considering the perception of potential job loss or loss of autonomy. Resources such as the AI for Medicine Specialization courses on Coursera (https://www.coursera.org/specializations/ai-for-medicine), Udacity courses, and books like Co-Intelligence: Living and Working with AI by Mollick and The AI Revolution in Medicine by Lee can serve as starting points.130,131

The costs of implementing new AI tools must be addressed to ensure accessibility and prevent exacerbation of healthcare disparities. It is unclear where the expenses for AI tools will fall: on patients, populations, or third parties. In the US, the dominant fee-for-service model leaves open the question of reimbursement by the government or private insurers. Most reimbursement for medical care is currently initiated by Current Procedural Terminology (CPT) or diagnosis-related group (DRG) codes; these codes generally do not cover AI tools in their current form. A few early AI tools received temporary coverage in part because of their novelty, but a more robust and predictable system of reimbursement must be developed, tested, and implemented if fee-for-service coverage continues. Generalist medical AI (GMAI) tools may not fit well into the traditional fee-for-service model, however.


Possible alternative reimbursement models for GMAI include assigning an overarching care management coordinated activity code for assisting with existing clinical services or value-based reimbursement. Proving clinical and economic value for AI tools is more complicated and multifaceted than would be initially expected, however. Advanced statistical analysis techniques may be required to assess the value or return on investment of a given AI project or tool. Survival, quality of life, and other variables can be weighted to form a proxy for the “value” of a high-investment technology. Finally, the realized value of a new tool often lags the implementation, as users learn how to incorporate the new information and optimize patient selection. Health systems and physicians are unlikely to implement AI tools on a larger scale if there is no quantifiable benefit. The overall impact on the healthcare system remains uncertain.

Privacy and data security are paramount as the medical system relies on increasingly connected technology. AI tools trained on real-world data carry the additional risk of exposing patient information from the training set. As these technologies become more powerful, the scientific and clinical communities are actively developing guidelines for responsible use.

A significant challenge lies in the disparity between the pace of AI development and traditional medical progress. While medical advancements often occur slowly and methodically, AI research has progressed at a whirlwind pace. This disconnect poses challenges for integration and regulation. Regulatory agencies, often under-resourced, may struggle to adapt quickly enough to this rapidly evolving field.

CONCLUSION

Over the past few years substantial progress has been made in the use of AI in health and medicine. The future of medicine incorporates tools that can process vast amounts of information on every scale and has the potential to meaningfully improve diagnostic accuracy and patient outcomes. AI advancements like advanced screenings, innovative imaging technologies, predictive analytics in medical forecasting, and personalized management plans promise to transform patient care from a reactive hospital-based model to a proactive health-optimizing system with fine-tuned levels of intervention.

Despite these promises, full clinical acceptance and regular sanctioned use of AI tools are not imminent. Serious challenges remain and will need to be addressed before there is widespread adoption of AI in clinical practice. Most AI tools are still in developmental phases. While some show clinical benefit in a controlled setting, few can claim to unequivocally improve health for all users. Also, few can claim to clearly reduce costs in all settings, and few can claim to have a clear path to implementation in the current medical systems. Clinical implementation remains the main hurdle to more widespread use of AI tools by health professionals.

ACKNOWLEDGMENTS

We thank Sid Dogra, Vish Rao, Rohit Reddy, and Hong-Yu Zhou for their feedback. E.T. was funded by the National Institutes of Health National grant UL1 TR001114.

AUTHOR CONTRIBUTIONS

L.J.F. wrote the original draft and edited the work. E.C. performed conceptualization and reviewed and edited the work. E.T. performed conceptualization, investigation, data curation, review and editing, and project administration and provided supervision. P.R. performed conceptualization and review and editing and provided supervision.

DECLARATION OF INTERESTS

P.R. is a co-founder, part-time employee, and equity holder of a2z Radiology AI.

DECLARATION OF GENERATIVE AI AND AI-ASSISTED TECHNOLOGIES IN THE WRITING PROCESS

During the preparation of this work, the author(s) used ChatGPT o4-mini and Grok 3 in order to improve the readability and language of the manuscript. After using this tool, the authors reviewed and edited the content as needed and take full responsibility for the content of the published article.

REFERENCES

1. Weizenbaum, J. (1966). ELIZA—a computer program for the study of natural language communication between man and machine. Commun. ACM 9, 36–45. https://doi.org/10.1145/365153.365168.
2. Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, L., and Polosukhin, I. (2023). Attention is all you need. Preprint at arXiv. https://doi.org/10.48550/arXiv.1706.03762.
3. Kaplan, J., McCandlish, S., Henighan, T., Brown, T.B., Chess, B., Child, R., Gray, S., Radford, A., Wu, J., and Amodei, D. (2020). Scaling laws for neural language models. Preprint at arXiv. https://doi.org/10.48550/arXiv.2001.08361.
4. Liu, S., McCoy, A.B., Wright, A.P., Carew, B., Genkins, J.Z., Huang, S.S., Peterson, J.F., Steitz, B., and Wright, A. (2024). Leveraging large language models for generating responses to patient messages—a subjective analysis. J. Am. Med. Inform. Assoc. 31, 1367–1379. https://doi.org/10.1093/jamia/ocae052.
5. Bernstein, I.A., Zhang, Y.V., Govil, D., Majid, I., Chang, R.T., Sun, Y., Shue, A., Chou, J.C., Schehlein, E., Christopher, K.L., et al. (2023). Comparison of Ophthalmologist and Large Language Model Chatbot Responses to Online Patient Eye Care Questions. JAMA Netw. Open 6, e2330320. https://doi.org/10.1001/jamanetworkopen.2023.30320.
6. Mika, A.P., Martin, J.R., Engstrom, S.M., Polkowski, G.G., and Wilson, J.M. (2023). Assessing ChatGPT Responses to Common Patient Questions Regarding Total Hip Arthroplasty. J. Bone Joint Surg. Am. 105, 1519–1526. https://doi.org/10.2106/JBJS.23.00209.
7. Steimetz, E., Minkowitz, J., Gabutan, E.C., Ngichabe, J., Attia, H., Hershkop, M., Ozay, F., Hanna, M.G., and Gupta, R. (2024). Use of Artificial Intelligence Chatbots in Interpretation of Pathology Reports. JAMA Netw. Open 7, e2412767. https://doi.org/10.1001/jamanetworkopen.2024.12767.
8. Zaretsky, J., Kim, J.M., Baskharoun, S., Zhao, Y., Austrian, J., Aphinyanaphongs, Y., Gupta, R., Blecker, S.B., and Feldman, J. (2024). Generative Artificial Intelligence to Transform Inpatient Discharge Summaries to Patient-Friendly Language and Format. JAMA Netw. Open 7, e240357. https://doi.org/10.1001/jamanetworkopen.2024.0357.
9. Habicht, J., Viswanathan, S., Carrington, B., Hauser, T.U., Harper, R., and Rollwage, M. (2024). Closing the accessibility gap to mental health treatment with a personalized self-referral chatbot. Nat. Med. 30, 595–602. https://doi.org/10.1038/s41591-023-02766-x.
10. Holohan, M. (2023). A boy saw 17 doctors over 3 years for chronic pain. ChatGPT found the diagnosis. https://www.today.com/health/mom-chatgpt-diagnosis-pain-rcna101843.

11. Van Veen, D., Van Uden, C., Blankemeier, L., Delbrouck, J.-B., Aali, A., Bluethgen, C., Pareek, A., Polacin, M., Reis, E.P., Seehofnerová, A., et al. (2024). Adapted large language models can outperform medical experts in clinical text summarization. Nat. Med. 30, 1134–1142. https://doi.org/10.1038/s41591-024-02855-5.
12. Tu, T., Schaekermann, M., Palepu, A., Saab, K., Freyberg, J., Tanno, R., Wang, A., Li, B., Amin, M., Cheng, Y., et al. (2025). Towards conversational diagnostic artificial intelligence. Nature. https://doi.org/10.1038/s41586-025-08866-7.
13. Johri, S., Jeong, J., Tran, B.A., Schlessinger, D.I., Wongvibulsin, S., Cai, Z.R., Daneshjou, R., and Rajpurkar, P. (2023). Guidelines For Rigorous Evaluation of Clinical LLMs For Conversational Reasoning. Preprint at medRxiv. https://doi.org/10.1101/2023.09.12.23295399.
14. Li, J., Guan, Z., Wang, J., Cheung, C.Y., Zheng, Y., Lim, L.-L., Lim, C.C., Ruamviboonsuk, P., Raman, R., Corsino, L., et al. (2024). Integrated image-based deep learning and language models for primary diabetes care. Nat. Med. 30, 2886–2896. https://doi.org/10.1038/s41591-024-03139-8.
15. Huang, A.S., Hirabayashi, K., Barna, L., Parikh, D., and Pasquale, L.R. (2024). Assessment of a Large Language Model’s Responses to Questions and Cases About Glaucoma and Retina Management. JAMA Ophthalmol. 142, 371–375. https://doi.org/10.1001/jamaophthalmol.2023.6917.
16. Ferber, D., Wiest, I.C., Wölflein, G., Ebert, M.P., Beutel, G., Eckardt, J.-N., Truhn, D., Springfeld, C., Jäger, D., and Kather, J.N. (2024). GPT-4 for Information Retrieval and Comparison of Medical Oncology Guidelines. NEJM AI 1, AIcs2300235. https://doi.org/10.1056/AIcs2300235.
17. Koch, M.-C. (2025). Clinical co-pilot receives first approval for Class IIb medical device. Heise. https://www.heise.de/en/news/Clinical-co-pilot-receives-first-approval-for-Class-IIb-medical-device-10348301.html.
18. Kung, T.H., Cheatham, M., Medenilla, A., Sillos, C., De Leon, L., Elepaño, C., Madriaga, M., Aggabao, R., Diaz-Candido, G., Maningo, J., et al. (2023). Performance of ChatGPT on USMLE: Potential for AI-assisted medical education using large language models. PLOS Digit. Health 2, e0000198. https://doi.org/10.1371/journal.pdig.0000198.
19. Singhal, K., Azizi, S., Tu, T., Mahdavi, S.S., Wei, J., Chung, H.W., Scales, N., Tanwani, A., Cole-Lewis, H., Pfohl, S., et al. (2023). Large language models encode clinical knowledge. Nature 620, 172–180. https://doi.org/10.1038/s41586-023-06291-2.
20. Lee, P., Bubeck, S., and Petro, J. (2023). Benefits, Limits, and Risks of GPT-4 as an AI Chatbot for Medicine. N. Engl. J. Med. 388, 1233–1239. https://doi.org/10.1056/NEJMsr2214184.
21. Saab, K., Tu, T., Weng, W.-H., Tanno, R., Stutz, D., Wulczyn, E., Zhang, F., Strother, T., Park, C., Vedadi, E., et al. (2024). Capabilities of Gemini models in medicine. Preprint at arXiv. https://doi.org/10.48550/arXiv.2404.18416.
22. Liu, S., McCoy, A.B., Wright, A.P., Carew, B., Genkins, J.Z., Huang, S.S., Peterson, J.F., Steitz, B., and Wright, A. (2023). Leveraging Large Language Models for Generating Responses to Patient Messages. Preprint at medRxiv, 2023.07.14.23292669. https://doi.org/10.1101/2023.07.14.23292669.
23. Tierney, A.A., Gayre, G., Hoberman, B., Mattern, B., Ballesca, M., Kipnis, P., Liu, V., and Lee, K. (2024). Ambient Artificial Intelligence Scribes to Alleviate the Burden of Clinical Documentation. NEJM Catal. 5. https://doi.org/10.1056/CAT.23.0404.
24. Omiye, J.A., Gui, H., Rezaei, S.J., Zou, J., and Daneshjou, R. (2024). Large Language Models in Medicine: The Potentials and Pitfalls: A Narrative Review. Ann. Intern. Med. 177, 210–220. https://doi.org/10.7326/M23-2772.
25. Grewal, H., Dhillon, G., Monga, V., Sharma, P., Buddhavarapu, V.S., Sidhu, G., and Kashyap, R. (2023). Radiology Gets Chatty: The ChatGPT Saga Unfolds. Cureus 15, e40135. https://doi.org/10.7759/cureus.40135.
26. Qiu, J., Lam, K., Li, G., Acharya, A., Wong, T.Y., Darzi, A., Yuan, W., and Topol, E.J. (2024). LLM-based agentic systems in medicine and healthcare. Nat. Mach. Intell. 6, 1418–1420. https://doi.org/10.1038/s42256-024-00944-1.
27. Mukherjee, S., Gamble, P., Ausin, M.S., Kant, N., Aggarwal, K., Manjunath, N., Datta, D., Liu, Z., Ding, J., Busacca, S., et al. (2024). Polaris: A safety-focused LLM constellation architecture for healthcare. Preprint at arXiv. https://doi.org/10.48550/arXiv.2403.13313.
28. Wei, J., Wang, X., Schuurmans, D., Bosma, M., Ichter, B., Xia, F., Chi, E., Le, Q., and Zhou, D. (2023). Chain-of-thought prompting elicits reasoning in large language models. Preprint at arXiv. https://doi.org/10.48550/arXiv.2201.11903.
29. Yao, S., Zhao, J., Yu, D., Du, N., Shafran, I., Narasimhan, K., and Cao, Y. (2023). ReAct: synergizing reasoning and acting in language models. Preprint at arXiv. https://doi.org/10.48550/arXiv.2210.03629.
30. Xu, F., Hao, Q., Zong, Z., Wang, J., Zhang, Y., Wang, J., Lan, X., Gong, J., Ouyang, T., Meng, F., et al. (2025). Towards large reasoning models: A survey of reinforced reasoning with large language models. Preprint at arXiv. https://doi.org/10.48550/arXiv.2501.09686.
31. Savage, T., Nayak, A., Gallo, R., Rangan, E., and Chen, J.H. (2024). Diagnostic reasoning prompts reveal the potential for large language model interpretability in medicine. npj Digit. Med. 7, 20. https://doi.org/10.1038/s41746-024-01010-1.
32. Wu, J., Deng, W., Li, X., Liu, S., Mi, T., Peng, Y., Xu, Z., Liu, Y., Cho, H., Choi, C.-I., et al. (2025). MedReason: eliciting factual medical reasoning steps in LLMs via knowledge graphs. Preprint at arXiv. https://doi.org/10.48550/arXiv.2504.00993.
33. Moor, M., Huang, Q., Wu, S., Yasunaga, M., Zakka, C., Dalmia, Y., Reis, E.P., Rajpurkar, P., and Leskovec, J. (2023). Med-flamingo: a multimodal medical few-shot learner. Preprint at arXiv. https://doi.org/10.48550/arXiv.2307.15189.
34. Lu, M.Y., Chen, B., Williamson, D.F.K., Chen, R.J., Liang, I., Ding, T., Jaume, G., Odintsov, I., Le, L.P., Gerber, G., et al. (2024). A visual-language foundation model for computational pathology. Nat. Med. 30, 863–874. https://doi.org/10.1038/s41591-024-02856-4.
35. Acosta, J.N., Falcone, G.J., Rajpurkar, P., and Topol, E.J. (2022). Multimodal biomedical AI. Nat. Med. 28, 1773–1784. https://doi.org/10.1038/s41591-022-01981-2.
36. Radford, A., Kim, J.W., Hallacy, C., Ramesh, A., Goh, G., Agarwal, S., Sastry, G., Askell, A., Mishkin, P., Clark, J., et al. (2021). Learning transferable visual models From natural language supervision. Preprint at arXiv. https://doi.org/10.48550/arXiv.2103.00020.
37. Bommasani, R., Hudson, D.A., Adeli, E., Altman, R., Arora, S., Arx, S. von, Bernstein, M.S., Bohg, J., Bosselut, A., Brunskill, E., et al. (2022). On the opportunities and risks of foundation models. Preprint at arXiv. https://doi.org/10.48550/arXiv.2108.07258.
38. Zhang, K., Zhou, R., Adhikarla, E., Yan, Z., Liu, Y., Yu, J., Liu, Z., Chen, X., Davison, B.D., Ren, H., et al. (2024). A generalist vision–language foundation model for diverse biomedical tasks. Nat. Med. 30, 3129–3141. https://doi.org/10.1038/s41591-024-03185-2.
39. Zhou, H.-Y., Yu, Y., Wang, C., Zhang, S., Gao, Y., Pan, J., Shao, J., Lu, G., Zhang, K., and Li, W. (2023). A transformer-based representation-learning model with unified processing of multimodal input for clinical diagnostics. Nat. Biomed. Eng. 7, 743–755. https://doi.org/10.1038/s41551-023-01045-x.
40. Khader, F., Müller-Franzes, G., Wang, T., Han, T., Tayebi Arasteh, S., Haarburger, C., Stegmaier, J., Bressem, K., Kuhl, C., Nebelung, S., et al. (2023). Multimodal Deep Learning for Integrating Chest Radiographs and Clinical Parameters: A Case for Transformers. Radiology 309, e230806. https://doi.org/10.1148/radiol.230806.
41. Chen, R.J., Lu, M.Y., Williamson, D.F.K., Chen, T.Y., Lipkova, J., Noor, Z., Shaban, M., Shady, M., Williams, M., Joo, B., et al. (2022). Pan-cancer

integrative histology-genomic analysis via multimodal deep learning. Cancer Cell 40, 865–878.e6. https://doi.org/10.1016/j.ccell.2022.07.004.
42. Minoura, K., Abe, K., Nam, H., Nishikawa, H., and Shimamura, T. (2021). A mixture-of-experts deep generative model for integrated analysis of single-cell multiomics data. Cell Rep. Methods 1, 100071. https://doi.org/10.1016/j.crmeth.2021.100071.
43. Vanguri, R.S., Luo, J., Aukerman, A.T., Egger, J.V., Fong, C.J., Horvat, N., Pagano, A., Araujo-Filho, J.A.B., Geneslaw, L., Rizvi, H., et al. (2022). Multimodal integration of radiology, pathology and genomics for prediction of response to PD-(L)1 blockade in patients with non-small cell lung cancer. Nat. Cancer 3, 1151–1164. https://doi.org/10.1038/s43018-022-00416-8.
44. Rao, V.M., Hla, M., Moor, M., Adithan, S., Kwak, S., Topol, E.J., and Rajpurkar, P. (2025). Multimodal generative AI for medical image interpretation. Nature 639, 888–896. https://doi.org/10.1038/s41586-025-08675-y.
45. Chen, R.J., Ding, T., Lu, M.Y., Williamson, D.F.K., Jaume, G., Song, A.H., Chen, B., Zhang, A., Shao, D., Shaban, M., et al. (2024). Towards a general-purpose foundation model for computational pathology. Nat. Med. 30, 850–862. https://doi.org/10.1038/s41591-024-02857-3.
46. Huang, Z., Bianchi, F., Yuksekgonul, M., Montine, T.J., and Zou, J. (2023). A visual–language foundation model for pathology image analysis using medical Twitter. Nat. Med. 29, 2307–2316. https://doi.org/10.1038/s41591-023-02504-3.
47. Lu, M.Y., Chen, B., Williamson, D.F.K., Chen, R.J., Zhao, M., Chow, A.K., Ikemura, K., Kim, A., Pouli, D., Patel, A., et al. (2024). A multimodal generative AI copilot for human pathology. Nature 634, 466–473. https://doi.org/10.1038/s41586-024-07618-3.
48. Gadaleta, M., Harrington, P., Barnhill, E., Hytopoulos, E., Turakhia, M.P., Steinhubl, S.R., and Quer, G. (2023). Prediction of atrial fibrillation from at-home single-lead ECG signals without arrhythmias. npj Digit. Med. 6, 229. https://doi.org/10.1038/s41746-023-00966-w.
49. Gavidia, M., Zhu, H., Montanari, A.N., Fuentes, J., Cheng, C., Dubner, S., Chames, M., Maison-Blanche, P., Rahman, M.M., Sassi, R., et al. (2024). Early warning of atrial fibrillation using deep learning. Patterns (N Y) 5, 100970. https://doi.org/10.1016/j.patter.2024.100970.
50. Attia, Z.I., Harmon, D.M., Dugan, J., Manka, L., Lopez-Jimenez, F., Lerman, A., Siontis, K.C., Noseworthy, P.A., Yao, X., Klavetter, E.W., et al. (2022). Prospective evaluation of smartwatch-enabled detection of left ventricular dysfunction. Nat. Med. 28, 2497–2503. https://doi.org/10.1038/s41591-022-02053-1.
51. Guan, G., Mofaz, M., Qian, G., Patalon, T., Shmueli, E., Yamin, D., and Brandeau, M.L. (2022). Higher sensitivity monitoring of reactions to COVID-19 vaccination using smartwatches. npj Digit. Med. 5, 140. https://doi.org/10.1038/s41746-022-00683-w.
52. Madhvapathy, S.R., Wang, J.-J., Wang, H., Patel, M., Chang, A., Zheng, X., Huang, Y., Zhang, Z.J., Gallon, L., and Rogers, J.A. (2023). Implantable bioelectronic systems for early detection of kidney transplant rejection. Science 381, 1105–1112. https://doi.org/10.1126/science.adh7726.
53. Torrente-Rodríguez, R.M., Tu, J., Yang, Y., Min, J., Wang, M., Song, Y., Yu, Y., Xu, C., Ye, C., IsHak, W.W., et al. (2020). Investigation of Cortisol Dynamics in Human Sweat Using a Graphene-Based Wireless mHealth System. Matter 2, 921–937. https://doi.org/10.1016/j.matt.2020.01.021.
54. Lin, M., Zhang, Z., Gao, X., Bian, Y., Wu, R.S., Park, G., Lou, Z., Zhang, Z., Xu, X., Chen, X., et al. (2024). A fully integrated wearable ultrasound system to monitor deep tissues in moving subjects. Nat. Biotechnol. 42, 448–457. https://doi.org/10.1038/s41587-023-01800-0.
55. Menzies, S.W., Sinz, C., Menzies, M., Lo, S.N., Yolland, W., Lingohr, J., Razmara, M., Tschandl, P., Guitera, P., Scolyer, R.A., et al. (2023). Comparison of humans versus mobile phone-powered artificial intelligence for the diagnosis and management of pigmented skin cancer in secondary care: a multicentre, prospective, diagnostic, clinical trial. Lancet Digit. Health 5, e679–e691. https://doi.org/10.1016/S2589-7500(23)00130-9.
56. Shaikh, N., Conway, S.J., Kovačević, J., Condessa, F., Shope, T.R., Haralam, M.A., Campese, C., Lee, M.C., Larsson, T., Cavdar, Z., et al. (2024). Development and Validation of an Automated Classifier to Diagnose Acute Otitis Media in Children. JAMA Pediatr. 178, 401–407. https://doi.org/10.1001/jamapediatrics.2024.0011.
57. Landy, R., Wang, V.L., Baldwin, D.R., Pinsky, P.F., Cheung, L.C., Castle, P.E., Skarzynski, M., Robbins, H.A., and Katki, H.A. (2023). Recalibration of a Deep Learning Model for Low-Dose Computed Tomographic Images to Inform Lung Cancer Screening Intervals. JAMA Netw. Open 6, e233273. https://doi.org/10.1001/jamanetworkopen.2023.3273.
58. Lång, K., Josefsson, V., Larsson, A.-M., Larsson, S., Högberg, C., Sartor, H., Hofvind, S., Andersson, I., and Rosso, A. (2023). Artificial intelligence-supported screen reading versus standard double reading in the Mammography Screening with Artificial Intelligence trial (MASAI): a clinical safety analysis of a randomised, controlled, non-inferiority, single-blinded, screening accuracy study. Lancet Oncol. 24, 936–944. https://doi.org/10.1016/S1470-2045(23)00298-X.
59. Ng, A.Y., Oberije, C.J.G., Ambrózay, É., Szabó, E., Serfőző, O., Karpati, E., Fox, G., Glocker, B., Morris, E.A., Forrai, G., et al. (2023). Prospective implementation of AI-assisted screen reading to improve early detection of breast cancer. Nat. Med. 29, 3044–3049. https://doi.org/10.1038/s41591-023-02625-9.
60. Lauritzen, A.D., Von Euler-Chelpin, M.C., Lynge, E., Vejborg, I., Nielsen, M., Karssemeijer, N., and Lillholm, M. (2023). Assessing Breast Cancer Risk by Combining AI for Lesion Detection and Mammographic Texture. Radiology 308, e230227. https://doi.org/10.1148/radiol.230227.
61. Salim, M., Liu, Y., Sorkhei, M., Ntoula, D., Foukakis, T., Fredriksson, I., Wang, Y., Eklund, M., Azizpour, H., Smith, K., et al. (2024). AI-based selection of individuals for supplemental MRI in population-based breast cancer screening: the randomized ScreenTrustMRI trial. Nat. Med. 30, 2623–2630. https://doi.org/10.1038/s41591-024-03093-5.
62. Kadirvelu, B., Gavriel, C., Nageshwaran, S., Chan, J.P.K., Nethisinghe, S., Athanasopoulos, S., Ricotti, V., Voit, T., Giunti, P., Festenstein, R., et al. (2023). A wearable motion capture suit and machine learning predict disease progression in Friedreich’s ataxia. Nat. Med. 29, 86–94. https://doi.org/10.1038/s41591-022-02159-6.
63. Mekkes, N.J., Groot, M., Hoekstra, E., De Boer, A., Dagkesamanskaia, E., Bouwman, S., Wehrens, S.M.T., Herbert, M.K., Wever, D.D., Rozemuller, A., et al. (2024). Identification of clinical disease trajectories in neurodegenerative disorders with natural language processing. Nat. Med. 30, 1143–1153. https://doi.org/10.1038/s41591-024-02843-9.
64. Dingemans, A.J.M., Hinne, M., Truijen, K.M.G., Goltstein, L., Van Reeuwijk, J., De Leeuw, N., Schuurs-Hoeijmakers, J., Pfundt, R., Diets, I.J., Den Hoed, J., et al. (2023). PhenoScore quantifies phenotypic variation for rare genetic diseases by combining facial analysis with other clinical features using a machine-learning framework. Nat. Genet. 55, 1598–1607. https://doi.org/10.1038/s41588-023-01469-w.
65. Diaz-Pinto, A., Ravikumar, N., Attar, R., Suinesiaputra, A., Zhao, Y., Levelt, E., Dall’armellina, E., Lorenzi, M., Chen, Q., Keenan, T.D.L., et al. (2022). Predicting myocardial infarction through retinal scans and minimal personal information. Nat. Mach. Intell. 4, 55–61. https://doi.org/10.1038/s42256-021-00427-7.
66. Topol, E.J. (2024). Medical forecasting. Science 384, eadp7977. https://doi.org/10.1126/science.adp7977.
67. Jumper, J., Evans, R., Pritzel, A., Green, T., Figurnov, M., Ronneberger, O., Tunyasuvunakool, K., Bates, R., Žídek, A., Potapenko, A., et al. (2021). Highly accurate protein structure prediction with AlphaFold. Nature 596, 583–589. https://doi.org/10.1038/s41586-021-03819-2.
68. Baek, M., DiMaio, F., Anishchenko, I., Dauparas, J., Ovchinnikov, S., Lee, G.R., Wang, J., Cong, Q., Kinch, L.N., Schaeffer, R.D., et al. (2021). Accurate prediction of protein structures and interactions using a three-track neural network. Science 373, 871–876. https://doi.org/10.1126/science.abj8754.

69. Abramson, J., Adler, J., Dunger, J., Evans, R., Green, T., Pritzel, A., Ronneberger, O., Willmore, L., Ballard, A.J., Bambrick, J., et al. (2024). Accurate structure prediction of biomolecular interactions with AlphaFold 3. Nature 630, 493–500. https://doi.org/10.1038/s41586-024-07487-w.
70. Yang, Z., Zeng, X., Zhao, Y., and Chen, R. (2023). AlphaFold2 and its applications in the fields of biology and medicine. Signal Transduct. Target. Ther. 8, 115. https://doi.org/10.1038/s41392-023-01381-z.
71. Varadi, M., Anyango, S., Deshpande, M., Nair, S., Natassia, C., Yordanova, G., Yuan, D., Stroe, O., Wood, G., Laydon, A., et al. (2022). AlphaFold Protein Structure Database: massively expanding the structural coverage of protein-sequence space with high-accuracy models. Nucleic Acids Res. 50, D439–D444. https://doi.org/10.1093/nar/gkab1061.
72. Topol, E. (2024). Learning the Language of Life with A.I.: A Hyper-Accelerated Phase of New Foundation Models. Ground Truths. https://erictopol.substack.com/p/learning-the-language-of-life-with.
73. Nguyen, E., Poli, M., Durrant, M.G., Kang, B., Katrekar, D., Li, D.B., Bartie, L.J., Thomas, A.W., King, S.H., Brixi, G., et al. (2024). Sequence modeling and design from molecular to genome scale with Evo. Science 386, eado9336. https://doi.org/10.1126/science.ado9336.
74. Rood, J.E., Wynne, S., Robson, L., Hupalowska, A., Randell, J., Teichmann, S.A., and Regev, A. (2025). The Human Cell Atlas from a cell census to a unified foundation model. Nature 637, 1065–1071. https://doi.org/10.1038/s41586-024-08338-4.
75. Xiong, D., Qiu, Y., Zhao, J., Zhou, Y., Lee, D., Gupta, S., Torres, M., Lu, W., Liang, S., Kang, J.J., et al. (2024). A structurally informed human protein–protein interactome reveals proteome-wide perturbations caused by disease mutations. Nat. Biotechnol., 1–15. https://doi.org/10.1038/s41587-024-02428-4.
76. Karelina, M., Noh, J.J., and Dror, R.O. (2023). How accurately can one predict drug binding modes using AlphaFold models? eLife 12, RP89386. https://doi.org/10.7554/eLife.89386.
77. Drysdale, E. (2023). A multitask neural network trained on embeddings from ESMFold can accurately rank order clinical outcomes for different cystic fibrosis mutations. Preprint at bioRxiv. https://doi.org/10.1101/2023.10.26.564274.
78. Zhang, H., Lan, J., Wang, H., Lu, R., Zhang, N., He, X., Yang, J., and Chen, L. (2024). AlphaFold2 in biomedical research: facilitating the development of diagnostic strategies for disease. Front. Mol. Biosci. 11, 1414916. https://doi.org/10.3389/fmolb.2024.1414916.
79. Watson, J.L., Juergens, D., Bennett, N.R., Trippe, B.L., Yim, J., Eisenach, H.E., Ahern, W., Borst, A.J., Ragotte, R.J., Milles, L.F., et al. (2023). De novo design of protein structure and function with RFdiffusion. Nature 620, 1089–1100. https://doi.org/10.1038/s41586-023-06415-8.
80. Yim, J., Trippe, B.L., Bortoli, V.D., M, E., Doucet, A., Barzilay, R., and Jaakkola, T. (2023). SE(3) diffusion model with application to protein backbone generation. Preprint at arXiv. https://doi.org/10.48550/arXiv.2302.02277.
81. Brixi, G., Durrant, M.G., Ku, J., Poli, M., Brockman, G., Chang, D., Gonzalez, G.A., King, S.H., Li, D.B., Merchant, A.T., et al. (2025). Genome modeling and design across all domains of life with Evo 2. Preprint at bioRxiv. https://doi.org/10.1101/2025.02.18.638918.
82. Ni, B., Kaplan, D.L., and Buehler, M.J. (2023). Generative design of de novo proteins based on secondary structure constraints using an attention-based diffusion model. Chem 9, 1828–1849. https://doi.org/10.1016/j.chempr.2023.03.020.
83. Lutz, I.D., Wang, S., Norn, C., Courbet, A., Borst, A.J., Zhao, Y.T., Dosey, A., Cao, L., Xu, J., Leaf, E.M., et al. (2023). Top-down design of protein architectures with reinforcement learning. Science 380, 266–273. https://doi.org/10.1126/science.adf6591.
84. Madani, A., Krause, B., Greene, E.R., Subramanian, S., Mohr, B.P., Holton, J.M., Olmos, J.L., Xiong, C., Sun, Z.Z., Socher, R., et al. (2023). Large language models generate functional protein sequences across diverse families. Nat. Biotechnol. 41, 1099–1106. https://doi.org/10.1038/s41587-022-01618-2.
85. Altman, R.B. (2023). A Holy Grail — The Prediction of Protein Structure. N. Engl. J. Med. 389, 1431–1434. https://doi.org/10.1056/NEJMcibr2307735.
86. Anishchenko, I., Pellock, S.J., Chidyausiku, T.M., Ramelot, T.A., Ovchinnikov, S., Hao, J., Bafna, K., Norn, C., Kang, A., Bera, A.K., et al. (2021). De novo protein design by deep network hallucination. Nature 600, 547–552. https://doi.org/10.1038/s41586-021-04184-w.
87. Lin, C.-S., Liu, W.-T., Tsai, D.-J., Lou, Y.-S., Chang, C.-H., Lee, C.-C., Fang, W.-H., Wang, C.-C., Chen, Y.-Y., Lin, W.-S., et al. (2024). AI-enabled electrocardiography alert intervention and all-cause mortality: a pragmatic randomized clinical trial. Nat. Med. 30, 1461–1470. https://doi.org/10.1038/s41591-024-02961-4.
88. Al-Zaiti, S.S., Martin-Gill, C., Zègre-Hemsey, J.K., Bouzid, Z., Faramand, Z., Alrawashdeh, M.O., Gregg, R.E., Helman, S., Riek, N.T., Kraevsky-Phillips, K., et al. (2023). Machine learning for ECG diagnosis and risk stratification of occlusion myocardial infarction. Nat. Med. 29, 1804–1813. https://doi.org/10.1038/s41591-023-02396-3.
89. Sundrani, S., Chen, J., Jin, B.T., Abad, Z.S.H., Rajpurkar, P., and Kim, D. (2023). Predicting patient decompensation from continuous physiologic monitoring in the emergency department. npj Digit. Med. 6, 60. https://doi.org/10.1038/s41746-023-00803-0.
90. Agrawal, S., Klarqvist, M.D.R., Emdin, C., Patel, A.P., Paranjpe, M.D., Ellinor, P.T., Philippakis, A., Ng, K., Batra, P., and Khera, A.V. (2021). Selection of 51 predictors from 13,782 candidate multimodal features using machine learning improves coronary artery disease prediction. Patterns (N Y) 2, 100364. https://doi.org/10.1016/j.patter.2021.100364.
91. Williams, S.A., Ostroff, R., Hinterberg, M.A., Coresh, J., Ballantyne, C.M., Matsushita, K., Mueller, C.E., Walter, J., Jonasson, C., Holman, R.R., et al. (2022). A proteomic surrogate for cardiovascular outcomes that is sensitive to multiple mechanisms of change in risk. Sci. Transl. Med. 14, eabj9625. https://doi.org/10.1126/scitranslmed.abj9625.
92. Khurshid, S., Friedman, S., Reeder, C., Di Achille, P., Diamant, N., Singh, P., Harrington, L.X., Wang, X., Al-Alusi, M.A., Sarma, G., et al. (2022). ECG-Based Deep Learning and Clinical Risk Factors to Predict Atrial Fibrillation. Circulation 145, 122–133. https://doi.org/10.1161/CIRCULATIONAHA.121.057480.
93. Jeon, K.-H., Lee, H.S., Kang, S., Jang, J.-H., Jo, Y.-Y., Son, J.M., Lee, M.S., Kwon, J.-M., Kwun, J.-S., Cho, H.-W., et al. (2024). AI-enabled ECG index for predicting left ventricular dysfunction in patients with ST-segment elevation myocardial infarction. Sci. Rep. 14, 16575. https://doi.org/10.1038/s41598-024-67532-6.
94. Christensen, M., Vukadinovic, M., Yuan, N., and Ouyang, D. (2024). Vision–language foundation model for echocardiogram interpretation. Nat. Med. 30, 1481–1488. https://doi.org/10.1038/s41591-024-02959-y.
95. Popescu, D.M., Shade, J.K., Lai, C., Aronis, K.N., Ouyang, D., Moorthy, M.V., Cook, N.R., Lee, D.C., Kadish, A., Albert, C.M., et al. (2022). Arrhythmic sudden death survival prediction using deep learning analysis of scarring in the heart. Nat CardioVasc Res 1, 334–343. https://doi.org/10.1038/s44161-022-00041-9.
96. Lin, A., Manral, N., McElhinney, P., Killekar, A., Matsumoto, H., Kwiecinski, J., Pieszko, K., Razipour, A., Grodecki, K., Park, C., et al. (2022). Deep learning-enabled coronary CT angiography for plaque and stenosis quantification and cardiac risk prediction: an international multicentre study. Lancet Digit. Health 4, e256–e265. https://doi.org/10.1016/S2589-7500(22)00022-X.
97. Chan, K., Wahome, E., Tsiachristas, A., Antonopoulos, A.S., Patel, P., Lyasheva, M., Kingham, L., West, H., Oikonomou, E.K., Volpe, L., et al. (2024). Inflammatory risk and cardiovascular events in patients without obstructive coronary artery disease: the ORFAN multicentre, longitudinal cohort study. Lancet 403, 2606–2618. https://doi.org/10.1016/S0140-6736(24)00596-8.

98. Zhang, M., Wong, S.W., Wright, J.N., Wagner, M.W., Toescu, S., Han, M., Tam, L.T., Zhou, Q., Ahmadian, S.S., Shpanskaya, K., et al. (2022). MRI Radiogenomics of Pediatric Medulloblastoma: A Multicenter Study. Radiology 304, 406–416. https://doi.org/10.1148/radiol.212137.
99. Rengo, M., Onori, A., Caruso, D., Bellini, D., Carbonetti, F., De Santis, D., Vicini, S., Zerunian, M., Iannicelli, E., Carbone, I., et al. (2023). Development and Validation of Artificial-Intelligence-Based Radiomics Model Using Computed Tomography Features for Preoperative Risk Stratification of Gastrointestinal Stromal Tumors. J. Pers. Med. 13, 717. https://doi.org/10.3390/jpm13050717.
100. Li, L., Zhou, X., Cui, W., Li, Y., Liu, T., Yuan, G., Peng, Y., and Zheng, J. (2023). Combining radiomics and deep learning features of intra-tumoral and peri-tumoral regions for the classification of breast cancer lung metastasis and primary lung cancer with low-dose CT. J. Cancer Res. Clin. Oncol. 149, 15469–15478. https://doi.org/10.1007/s00432-023-05329-2.
101. Granata, V., Fusco, R., Setola, S.V., Galdiero, R., Maggialetti, N., Silvestro, L., De Bellis, M., Di Girolamo, E., Grazzini, G., Chiti, G., et al. (2023). Risk Assessment and Pancreatic Cancer: Diagnostic Management and Artificial Intelligence. Cancers 15, 351. https://doi.org/10.3390/cancers15020351.
102. Chiti, G., Grazzini, G., Flammia, F., Matteuzzi, B., Tortoli, P., Bettarini, S., Pasqualini, E., Granata, V., Busoni, S., Messserini, L., et al. (2022). Gastroenteropancreatic neuroendocrine neoplasms (GEP-NENs): a radiomic model to predict tumor grade. Radiol. Med. 127, 928–938. https://doi.org/10.1007/s11547-022-01529-x.
103. Swanson, K., Wu, E., Zhang, A., Alizadeh, A.A., and Zou, J. (2023). From patterns to patients: Advances in clinical machine learning for cancer diagnosis, prognosis, and treatment. Cell 186, 1772–1791. https://doi.org/10.1016/j.cell.2023.01.035.
104. Nimgaonkar, V., Krishna, V., Krishna, V., Tiu, E., Joshi, A., Vrabac, D., Bhambhvani, H., Smith, K., Johansen, J.S., Makawita, S., et al. (2023). Development of an artificial intelligence-derived histologic signature associated with adjuvant gemcitabine treatment outcomes in pancreatic cancer. Cell Rep. Med. 4, 101013. https://doi.org/10.1016/j.xcrm.2023.101013.
105. Sorin, M., Rezanejad, M., Karimi, E., Fiset, B., Desharnais, L., Perus, L.J.M., Milette, S., Yu, M.W., Maritan, S.M., Doré, S., et al. (2023). Single-cell spatial landscapes of the lung tumour immune microenvironment. Nature 614, 548–554. https://doi.org/10.1038/s41586-022-05672-3.
106. Tian, F., Liu, D., Wei, N., Fu, Q., Sun, L., Liu, W., Sui, X., Tian, K., Nemeth, G., Feng, J., et al. (2024). Prediction of tumor origin in cancers of unknown primary origin with cytology-based deep learning. Nat. Med. 30, 1309–1319. https://doi.org/10.1038/s41591-024-02915-w.
107. Placido, D., Yuan, B., Hjaltelin, J.X., Zheng, C., Haue, A.D., Chmura, P.J., Yuan, C., Kim, J., Umeton, R., Antell, G., et al. (2023). A deep learning algorithm to predict risk of pancreatic cancer from disease trajectories. Nat. Med. 29, 1113–1122. https://doi.org/10.1038/s41591-023-02332-5.
108. Smits, F.J., Henry, A.C., Besselink, M.G., Busch, O.R., Van Eijck, C.H., Arntz, M., Bollen, T.L., Van Delden, O.M., Van Den Heuvel, D., Van Der Leij, C., et al. (2022). Algorithm-based care versus usual care for the early recognition and management of complications after pancreatic resection in the Netherlands: an open-label, nationwide, stepped-wedge cluster-randomised trial. Lancet 399, 1867–1875. https://doi.org/10.1016/S0140-6736(22)00182-9.
109. Zhou, Y., Chia, M.A., Wagner, S.K., Ayhan, M.S., Williamson, D.J., Struyven, R.R., Liu, T., Xu, M., Lozano, M.G., Woodward-Court, P., et al. (2023). A foundation model for generalizable disease detection from retinal images. Nature 622, 156–163. https://doi.org/10.1038/s41586-023-06555-x.
110. Haque, A., Milstein, A., and Fei-Fei, L. (2020). Illuminating the dark spaces of healthcare with ambient intelligence. Nature 585, 193–202. https://doi.org/10.1038/s41586-020-2669-y.
111. Liu, Y., Zhang, G., Tarolli, C.G., Hristov, R., Jensen-Roberts, S., Waddell, E.M., Myers, T.L., Pawlik, M.E., Soto, J.M., Wilson, R.M., et al. (2022). Monitoring gait at home with radio waves in Parkinson’s disease: A marker of severity, progression, and medication response. Sci. Transl. Med. 14, eadc9669. https://doi.org/10.1126/scitranslmed.adc9669.
112. Ricotti, V., Kadirvelu, B., Selby, V., Festenstein, R., Mercuri, E., Voit, T., and Faisal, A.A. (2023). Wearable full-body motion tracking of activities of daily living predicts disease trajectory in Duchenne muscular dystrophy. Nat. Med. 29, 95–103. https://doi.org/10.1038/s41591-022-02045-1.
113. Jiang, L.Y., Liu, X.C., Nejatian, N.P., Nasir-Moin, M., Wang, D., Abidin, A., Eaton, K., Riina, H.A., Laufer, I., Punjabi, P., et al. (2023). Health system-scale language models are all-purpose prediction engines. Nature 619, 357–362. https://doi.org/10.1038/s41586-023-06160-y.
114. Yang, Z., Mitra, A., Liu, W., Berlowitz, D., and Yu, H. (2023). TransformEHR: transformer-based encoder-decoder generative model to enhance prediction of disease outcomes using electronic health records. Nat. Commun. 14, 7857. https://doi.org/10.1038/s41467-023-43715-z.
115. Beaulieu-Jones, B.K., Villamar, M.F., Scordis, P., Bartmann, A.P., Ali, W., Wissel, B.D., Alsentzer, E., De Jong, J., Patra, A., and Kohane, I. (2023). Predicting seizure recurrence after an initial seizure-like episode from routine clinical notes using large language models: a retrospective cohort study. Lancet Digit. Health 5, e882–e894. https://doi.org/10.1016/S2589-7500(23)00179-6.
116. Moguilner, S., Baez, S., Hernandez, H., Migeot, J., Legaz, A., Gonzalez-Gomez, R., Farina, F.R., Prado, P., Cuadros, J., Tagliazucchi, E., et al. (2024). Brain clocks capture diversity and disparities in aging and dementia across geographically diverse populations. Nat. Med. 30, 3646–3657. https://doi.org/10.1038/s41591-024-03209-x.
117. Ogata, S., Takegami, M., Ozaki, T., Nakashima, T., Onozuka, D., Murata, S., Nakaoku, Y., Suzuki, K., Hagihara, A., Noguchi, T., et al. (2021). Heatstroke predictions by machine learning, weather information, and an all-population registry for 12-hour heatstroke alerts. Nat. Commun. 12, 4575. https://doi.org/10.1038/s41467-021-24823-0.
118. Zakka, C., Shad, R., Chaurasia, A., Dalal, A.R., Kim, J.L., Moor, M., Fong, R., Phillips, C., Alexander, K., Ashley, E., et al. (2024). Almanac — Retrieval-Augmented Language Models for Clinical Medicine. NEJM AI 1, AIoa2300068. https://doi.org/10.1056/AIoa2300068.
119. Menz, B.D., Kuderer, N.M., Bacchi, S., Modi, N.D., Chin-Yee, B., Hu, T., Rickard, C., Haseloff, M., Vitry, A., McKinnon, R.A., et al. (2024). Current safeguards, risk mitigation, and transparency measures of large language models against the generation of health disinformation: repeated cross sectional analysis. BMJ 384, e078538. https://doi.org/10.1136/bmj-2023-078538.
120. (2024). How to support the transition to AI-powered healthcare. Nat. Med. 30, 609–610. https://doi.org/10.1038/s41591-024-02897-9.
121. Han, T., Kumar, A., Agarwal, C., and Lakkaraju, H. (2024). MedSafetyBench: evaluating and improving the medical safety of large language models. Preprint at arXiv. https://doi.org/10.48550/arXiv.2403.03744.
122. Shah, N.H., Entwistle, D., and Pfeffer, M.A. (2023). Creation and Adoption of Large Language Models in Medicine. JAMA 330, 866–869. https://doi.org/10.1001/jama.2023.14217.
123. Sharma, A., Lin, I.W., Miner, A.S., Atkins, D.C., and Althoff, T. (2023). Human–AI collaboration enables more empathic conversations in text-based peer-to-peer mental health support. Nat. Mach. Intell. 5, 46–57. https://doi.org/10.1038/s42256-022-00593-2.
124. Yu, F., Moehring, A., Banerjee, O., Salz, T., Agarwal, N., and Rajpurkar, P. (2024). Heterogeneity and predictors of the effects of AI assistance on radiologists. Nat. Med. 30, 837–849. https://doi.org/10.1038/s41591-024-02850-w.
125. Blumenthal, D., and Patel, B. (2024). The Regulation of Clinical Artificial Intelligence. NEJM AI 1, AIpc2400545. https://doi.org/10.1056/AIpc2400545.

126. Capoot, A. (2024). Dexcom’s over-the-counter glucose monitor now offers users an AI summary of how sleep, meals and more impact sugar levels. CNBC. https://www.cnbc.com/2024/12/17/dexcom-launches-generative-ai-platform-for-stelo-users.html.
127. Rajpurkar, P., and Topol, E.J. (2025). A clinical certification pathway for generalist medical AI systems. Lancet 405, 20. https://doi.org/10.1016/S0140-6736(24)02797-1.
128. Kim, W. (2024). Seeing the Unseen: Advancing Generative AI Research in Radiology. Radiology 311, e240935. https://doi.org/10.1148/radiol.240935.
129. Togher, D., Dean, G., Moon, J., Mayola, R., Medina, A., Repec, J., Meheux, M., Mather, S., Storey, M., Rickaby, S., et al. (2025). Evolution of radiology staff perspectives during artificial intelligence (AI) implementation for expedited lung cancer triage. Clin. Rad. 81, 106704. https://doi.org/10.1016/j.crad.2024.09.010.
130. Mollick, E. (2024). Co-Intelligence: Living and Working with AI (Portfolio).
131. Lee, P., Goldberg, C., and Kohane, I. (2023). The AI Revolution in Medicine: GPT-4 and Beyond (Pearson).


Common questions


AI plays a significant role in oncology by enabling automated analysis of whole-slide pathology images and predictive modeling of cancer susceptibility to treatment. AI can identify subtle morphological tumor features to inform clinical outcomes, and tools like TORCH accurately determine cancer origins. AI models predict complications after pancreatic resection, significantly reducing mortality compared with standard care. Multimodal AI models in genomics, like those for PD-(L)1 blockade response in non-small cell lung cancer, also demonstrate AI's impact in predicting treatment outcomes.

AI algorithms facilitate tumor classification by analyzing medical imaging to detect histopathological features undetectable by humans, enabling the categorization of lung, breast, neuroendocrine, gastrointestinal stromal, and colorectal cancers. By predicting tumor histopathology, grading, and metastatic potential, AI informs the development of personalized treatment strategies. This results in more precise treatments, early interventions, and tailored therapeutic regimens, improving clinical outcomes and treatment efficiency in oncology.

AI algorithms have significantly advanced medical forecasting by using dynamic inputs across different levels, from molecular to population. Tools like EchoCLIP detect changes in echocardiograms that are difficult for humans to perceive, aiding individualized treatment prediction. At the population level, AI models forecast global infection spread and heatstroke events, guiding resource distribution. These capabilities improve clinical outcomes, patient management, and resource allocation, offering predictive insight for effective prevention strategies nationally and globally.

AI models enhance cardiovascular disease prediction by identifying unique patient variables, leading to more accurate risk assessment of coronary artery disease (CAD). Tools like EchoCLIP characterize subtle echocardiogram changes over time, and ML models identify myocardial infarction risks from coronary artery CTA data, discovering predictive inflammation even without obstructive disease. AI is used to forecast heart disease events and improve risk prediction beyond human capability.

Integrating contactless sensor data with AI models can transform patient care by providing continuous, non-invasive monitoring of neurological conditions, significantly aiding neurodegenerative disease management. For example, sensors track gait speed in Parkinson's patients to monitor disease progression. Cameras assess disease progression and molecular etiologies in conditions like Friedreich’s ataxia, enabling earlier diagnosis and personalized treatment planning. This ambient intelligence enhances patient monitoring accuracy and improves clinical interventions without patient discomfort or interruption.

AI-based multimodal models significantly influence computational pathology by integrating different data modes to improve image interpretation and diagnosis. Innovations include visual-language models that facilitate pathology image analysis and multimodal generative AI frameworks that function as co-pilots for pathologists. These advancements help in more reliably identifying cancer types, predicting treatment responses, and leveraging social media data for image insights, contributing to informed medical decisions and tailored therapies.

AlphaFold and similar tools have significantly advanced drug discovery by accurately predicting protein structures, facilitating the design of diagnostic strategies for diseases. They aid in identifying drug binding modes, accelerating the discovery of therapeutically relevant targets. These tools lead to improved understanding and manipulation of molecular interactions, enhancing the efficiency of biomedical research and opening new possibilities for innovative drug development.

Advanced AI techniques, such as those used in RFdiffusion, FrameDiff for 3D structure generation, and sequence generation tools like ProGen, ProteinMPNN, and Evo, allow for the prediction and design of proteins based on various inputs. This enables the creation of novel proteins with specific structures or functions, opening possibilities for designing proteins tailored to specific tasks or environments. These models help elucidate biological mechanisms and facilitate the manipulation of biological processes, advancing synthetic biology and offering new therapeutic approaches for complex diseases. The integration of multimodal foundation models like Evo 2 further enhances the ability to understand and manipulate the central dogma from DNA to protein.

AI advancements in predicting patient risks utilize EHR data by developing models that foresee readmission, mortality, and length-of-stay. For example, an AI model predicts International Classification of Diseases (ICD) codes for future visits, increasing detection capabilities for rare outcomes like pancreatic cancer. AI also predicts seizure recurrence risks in pediatric patients by analyzing routine clinical data, indicating improved outcome forecasting through EHR data synthesis.

Challenges in deploying AI in healthcare include issues of data accuracy, bias, privacy, and ethics, particularly related to large language models (LLMs). LLMs have been known to "hallucinate" or generate incorrect information, though recent efforts have aimed to mitigate this. Bias from training data persists, potentially impacting the fairness and reliability of these models. Validating AI applications requires real-world, prospective studies rather than simulations with patient actors.
