Predictive Analytics in Healthcare Big Data Better Decisions

The document discusses the role of predictive analytics in healthcare, highlighting its potential to improve patient care and resource management through the use of big data and machine learning. A systematic review of literature from 2010 to 2023 identified 55 relevant studies that demonstrate the expanding application of predictive analytics in areas such as disease risk profiling and clinical decision-making. The review also addresses challenges such as data quality and model interpretability, emphasizing the need for proper validation and integration into clinical practices for successful implementation.


International Journal of Scientific Research and Modern Technology (IJSRMT) ijsrmt.com
Volume 4, Issue 1, 2025
DOI: https://doi.org/10.5281/zenodo.14630840
_________________________________________________________________________________________

Predictive Analytics in Healthcare: Big Data, Better Decisions
Md Jawadur Rahim¹; Ahlina Afroz²; Omolola Akinola³
¹,²,³ HCS Home Health Care Services of NY

Publication Date: 2025/01/17

Abstract
Healthcare systems worldwide are moving towards predictive analytics, using patient data to deliver better and more
effective treatment and to organize the use of resources efficiently. Given the exponential growth in digitalization and
electronic health records (EHRs), machine learning (ML) and big data analytical models offer the strongest basis for
predictive healthcare. This comprehensive review therefore aims to provide an evidence-based, up-to-date compilation
of past, current and future findings on data analytics applications in the domain of predictive healthcare. Materials and
Methods: The PubMed, Scopus, and Google Scholar electronic databases were searched. Original articles published
between January 2010 and December 2023 in peer-reviewed international journals were retrieved if they mainly dealt
with predictive analytics in healthcare employing machine learning, artificial intelligence, or big data processing
methods. The general and specific data sources, the techniques used for analysis, the clinical use of the method and the
performance results were extracted. Out of 823 identified studies, 55 papers were included in the review, indicating that
the use of predictive analytics is expanding across the healthcare spectrum. Data sources included EHRs, claims data,
genomic data and wearable data. Deep learning and ensemble methods were reported to have better prediction accuracy
than traditional statistical methods. Core uses included disease risk profiling, patient characterization, readmission
risk, clinical decision making, and personalized medicine. The limitations highlighted include data quality, the
explainability of the resulting models, and the balancing of fairness and equity when building the models. Applying
predictive analytics in healthcare is an ambitious step towards earlier diagnosis of diseases, appropriate therapeutic
approaches, and optimal use of resources. Yet training, proper external validation, model updating, and integration of
the model into clinical routines are prerequisites for success. Data governance, privacy and algorithmic bias also remain
crucial concerns. The information and experience described in this review are principally concerned with the role of data
analysis in predictive healthcare. As healthcare organizations produce increasing amounts of data, the use of
sophisticated data analysis methods will be crucial for achieving better clinical results, better organizational
performance and innovation in the delivery of care.

Keywords: Predictive Analytics, Machine Learning, Artificial Intelligence, Big Data, Healthcare Informatics, Precision
Medicine, Clinical Decision Support.

I. INTRODUCTION

In the recent past, the healthcare industry has been transformed mainly by the increasing computerization of records and the adoption of innovative analytic methods. The constant rise in EHR, genomic and wearable device data presents a unique opportunity to apply big data and predictive analytics to patient care and healthcare delivery (Leung et al., 2020). In a world where healthcare institutions aim to deliver patient-centered, value-based care in a cost-effective manner, decision making for improved healthcare has become a competitive business proposition (Bartley, 2021).

These institutions have been characterised by a reactive system in which patients seek treatment only when symptoms appear, resulting in disproportionate health risks and poor prognosis (Alghamdi et al., 2021). However, integrating predictive analytics and machine learning (ML) into the description and improvement of chronic care can change this pattern to proactive, preventive, and personalized care (Batko & Ślęzak, 2022). Predictive analytics uses a set of statistical or computational methods that work on past and live data to identify tendencies in how future trends or events could unfold (Alharthi, 2018).

Rahim, M. J., Afroz, A., & Akinola, O. (2025). Predictive Analytics in Healthcare: Big Data, Better Decisions.
International Journal of Scientific Research and Modern Technology, 4(1), 1–21.
https://doi.org/10.5281/zenodo.14630840
In the healthcare domain, predictive analytics can be used across different areas such as disease risk prediction, patient phenotyping, readmission risk analysis, clinical decision-making, and precision medicine (Muniasamy et al., 2020). By using large amounts of structured and unstructured EHR data, claims data, genomic data and data from wearable devices, prediction models are able to identify patients at high risk of developing certain conditions, to predict responses to certain treatments and to allocate resources efficiently (Galetsi & Katsaliaki, 2020).

Advances in predictive analytical techniques, especially deep learning and ensemble learning, have boosted the use of machine learning in healthcare (Amarasingham et al., 2014). These advanced algorithms are capable of learning from high-dimensional, complex information and of identifying intricate patterns and relations that conventional statistical methods may not be able to capture (Ghassemi et al., 2018). In addition, integrating imaging, genomic and environmental data into the model fosters the development of more effective individualized predictive models than using clinical parameters alone (Belle et al., 2015).

Despite the benefits of applying predictive analytics in healthcare, numerous challenges exist. Major challenges include data quality, compatibility and integration concerns, which arise from the heterogeneity, noise, and missing values inherent in healthcare data (Chinchmalatpure & Dhore, 2021). Questions also arise as to whether models that support decision-making can be easily interpreted and whether the algorithms used are free from bias; these aspects are equally important and cannot be overlooked (Char et al., 2018; Nevin & PLoS Medicine Editors, 2018).

This review aims to present a synthesized perspective on what is known today about applications of data analytics to predictive healthcare. Specifically, this review aims to:

1. Discuss the numerous data types and approaches used in predictive healthcare analytics.
2. Understand the variety of approaches to the application of predictive analytics in the clinical areas of healthcare, as distinct from lean manufacturing areas.
3. Discuss the lessons on the effectiveness of predictive models in enhancing patients' health and the healthcare system.
4. Explain the major difficulties and disadvantages characteristic of the use of predictive analytics in the healthcare industry.
5. Identify new trends and future developments in the subject of predictive healthcare analytics.

• Hypothesis 1:
The combination of machine learning and other cutting-edge approaches to the analysis of various types of healthcare data will greatly improve the effectiveness of predictions in the field.

• Hypothesis 2:
Predictive analytics would improve healthcare because it would allow for early detection of diseases, prescription of the right treatments to patients, and the right distribution of resources, all to the benefit of the patient as well as to cost optimization.

• Hypothesis 3:
When predictive models are implemented in practice in the healthcare system, issues such as data quality, model interpretability and the ethical aspects of the model would define the routes that data analytics will take in future.

• Conceptual Overview of Health Big Data Analytics Technologies
Big data technologies in healthcare are fundamentally based on various forms of data that interconnect to form an integrated health data system. The primary sources include Electronic Medical Records (EMRs), inclusive of test results and clinical observations, human genome sequence and RNA-seq data, and public health datasets (Galetsi & Katsaliaki, 2020). Furthermore, social media data gathered from cyberspace, as well as data from cyber-physical systems such as wearable technologies and body implants, add to this large pool of data. Standardizing these data sources relies on medical terminology frameworks such as SNOMED CT and on medical data standards such as LOINC, HL7 and ICD. Standardization has become important for preserving data integrity and for ensuring that data can easily move from one health system to another (Belle et al., 2015). The metadata dictionary and medical ontologies act as reference structures that guarantee a correct approach to data categorization and to the subsequent definition of relationships, allowing integration and analysis.

Fig 1 Conceptual Overview of Health Big Data Analytics Technologies

The problems related to healthcare analytics mainly pertain to metadata management and the ELT processes. These processes are important for sustaining the quality and convertibility of the data used in the integrated health data system. The technology behind these processes includes a range of storage and database systems for handling big data, such as HDFS, S3, RDBMS and NoSQL, managed by the YARN and Mesos frameworks (Batko & Ślęzak, 2022). At the heart of the Spark computing layer is the Spark computing core, which is responsible for the intensive computational and analytical processing. This infrastructure is needed to manage the amount and the types of healthcare data while maintaining their quality and availability. The Social Network Analysis and Genomic Inference Engine elements are used to process some of these value-added data types and help in analyzing patterns of health more holistically (Amarasingham et al., 2014).

The tools and platforms used in healthcare big data analysis address a broad spectrum of functionalities. These are semantic, predictive (outcome), informative (explanatory), prescriptive (decision), comparative (performance), and exploratory (discovery) analytics, which play distinct roles in healthcare decision making according to Alghamdi et al. (2021). HBase, Hive, Pig, Mahout, ZooKeeper, Cassandra, Avro, and JaQL are the technological enablers of the analytical processes that constitute big data analytics. These tools are backed by different libraries, such as machine learning libraries, streaming libraries and GraphX libraries, all of which help in complex data manipulation and analysis. With such tools in place, healthcare organizations are able to analyze large sets of data, enabling more effective predictions and enhancing the overall effectiveness of their decision-making processes (Alharthi, 2018).

It is therefore important to note that the outputs of healthcare analytics technologies are comprehensible conclusions and decisions. The outputs of the system are diagnosis decisions, preventive care recommendations, disease surveillance, and admission decisions (David et al., 2019). Tools for presenting 2D and 3D data allow the results to be easily interpreted and analyzed in depth. Scorecards and other strategic planning tools are used in operational decisions, while drug efficiency assessment offers information on the effectiveness of treatment. This makes it easier for healthcare providers to use analytics to achieve their goals of improving patient satisfaction and health while reducing the cost of healthcare delivery. The various types of insights and decisions produced by the system point to a major shift in the future of healthcare, from the conventional treatment-oriented care model to preventive care (Char et al., 2018).

II. MATERIALS AND METHODS FOR DATA COLLECTION

This systematic review was carried out by searching databases including PubMed, Scopus, and Google Scholar. The search terms used included predictive analytics, machine learning, artificial intelligence, big data, and healthcare informatics, among others.

The initial search was carried out in PubMed using the following search string: (("PA" OR "ML" OR "AI" OR "BD") AND ("HC" OR "HC industry" OR "Clinical and Medical")). These searches produced a total of 3,472 records that were potentially pertinent to the study. The search was then repeated in Scopus and Google Scholar to capture further results that could have been missed in the initial search.

Fig 2 Stages of Effective Literature Review Process

To ensure a comprehensive review of the most recent and relevant studies, the search was restricted to articles published between 2010 and 2023. This period was selected to include the changes that have taken place in the field of predictive analytics and machine learning within healthcare over the last decade. Duplicate titles were eliminated, after which the titles and abstracts of the remaining studies were examined and screened. The inclusion criteria were as follows:

• Studies focusing on the application of predictive analytics, machine learning, or artificial intelligence techniques in healthcare settings.
• Studies utilizing various data sources, such as electronic health records (EHRs), claims data, genomic data, or wearable device data.
• Studies reporting on the development, validation, or implementation of predictive models for clinical applications, including disease risk prediction, patient phenotyping, readmission risk assessment, clinical decision support, or precision medicine.
• Studies published in peer-reviewed journals or conference proceedings.
• Studies written in the English language.

• Exclusion Criteria Included:

• Studies not directly related to predictive analytics or machine learning in healthcare.
• Studies focusing solely on image analysis or computer-aided diagnosis without predictive modeling.
• Review articles, editorials, opinion pieces, or case reports.
• Studies with limited or unclear methodology descriptions.

During our screening process, we excluded 5,654 records based on our initial criteria, leaving us with 430 full-text articles to assess for eligibility. From these, we further excluded 313 articles for various reasons, as detailed in the PRISMA diagram: 72 conference proceedings, 4 articles with unavailable full texts, 91 articles that did not meet our inclusion criteria, 84 articles with limited scope and not from international journals, 29 review papers, and 33 duplicates that were identified during the full-text review phase.
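
The full-text exclusion counts reported above can be cross-checked with a few lines of arithmetic. The sketch below is only a bookkeeping aid: the dictionary keys are shorthand labels chosen for illustration, and the figures are copied from the text.

```python
# Full-text stage of the screening, using the counts reported in the text.
full_text_assessed = 430

# Keys are illustrative shorthand labels, not terms taken from the PRISMA diagram.
exclusions = {
    "conference proceedings": 72,
    "full text unavailable": 4,
    "did not meet inclusion criteria": 91,
    "limited scope / not from international journals": 84,
    "review papers": 29,
    "duplicates found during full-text review": 33,
}

total_excluded = sum(exclusions.values())
remaining = full_text_assessed - total_excluded

print(f"Excluded at full-text stage: {total_excluded}")
print(f"Remaining after full-text review: {remaining}")
```

The six categories sum to the 313 exclusions stated in the text, and the remainder agrees with the 117 studies carried forward to the qualitative synthesis.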

Fig 3 PRISMA for Systematic Reviews and Meta-Analyses Flow Chart Showing the Literature Search Process.

Furthermore, we identified 117 studies that met all our inclusion criteria for the qualitative synthesis. The full texts of these papers were available, so we read each paper in full and identified and summarized the relevant findings. We extracted the content based on the study type, data used, analytic methods, clinical use, evaluation metrics, and results. We also searched through the references of our chosen sources to identify any articles that we might have missed while searching the databases. After this additional screening and thorough review process, we finalized our analysis with 55 papers that fully met all our quality and relevance criteria for in-depth review and synthesis.

III. RESULTS AND DISCUSSION

A. The Emergence of Big Data in Healthcare
The healthcare industry has gone through a revolutionary change in data creation and handling processes against the backdrop of EHR implementation, the emergence of advanced digital imaging, and the growing role of wearable and telemonitoring devices (Batko & Ślęzak, 2022). This unprecedented surge in data generation has created what is commonly referred to as "big data" in healthcare, characterized by three fundamental dimensions: volume, velocity, and variety. The volume dimension concerns the amount of data being produced across the different contexts in which healthcare is provided, whether a small clinic or a giant health system. The velocity dimension concerns the rate at which new data arrive and must be analysed, including the need for real-time monitoring and analysis. The variety dimension covers the range of data types, from numerical, tabular and organized forms to free-form text and raw images, which poses a vast challenge to classical data handling and processing tools (Nambiar et al., 2013). The emergence of such data characteristics in healthcare has required the design of tools and methods that can analyze these data while protecting data integrity, confidentiality, and availability for those involved in healthcare delivery and research.
• Electronic Health Records (EHRs)
Electronic health records have transformed healthcare organizations through the collection, organization and sharing of comprehensive electronic patient records (Alharthi, 2018). Such e-records hold extensive patient data, including demographic data, comprehensive medical records, laboratory data, imaging studies, and patient-specific treatment plans (Bartley, 2021). With the widespread adoption of EHRs, it has become increasingly apparent that the large stored volumes of both structured and unstructured data represent an enormous asset for the future of analytics in healthcare settings. Structured data consist of highly formatted fields such as vital signs, medication doses, and laboratory values, whereas unstructured data comprise items such as clinical notes, radiology findings, and records of interactions with the patient. This richness of data types gives healthcare providers an excellent picture of patients' health status and the possibility of conducting deep analyses to improve treatment outcomes (Alghamdi et al., 2021).

The addition of artificial intelligence and machine learning algorithms to EHR systems has greatly improved their application in predictive healthcare. Sophisticated natural language processing algorithms make it possible to make sense of clinical narratives, and deep learning models can learn from the multiple features of a patient's data to forecast disease progression and the patient's response to various treatments (Chen et al., 2017). Together, these advancements have made EHRs evolve from mere electronic repositories of patient data into robust tools for clinical decision making and individualized treatment.

The adoption of common transmission specifications and integration solutions has further extended the application of EHR systems in healthcare informatics. These standards support automated, consistent health data exchange, which in turn supports large-population, cross-institutional health studies and effectiveness research. This has resulted in the development of more robust predictive models that draw on a variety of patient populations and settings and so achieve better and more generalizable outcomes (Beam & Kohane, 2018).

• Digital Imaging and Diagnostic Data
The use of medical imaging modalities such as computed tomography (CT), magnetic resonance imaging (MRI), and digital pathology has played a big role in the increase in the amount of healthcare data (Belle et al., 2015). These imaging techniques provide finer, volumetric images that have different uses, including diagnosis, treatment planning, and detailed prognosis (Badawy et al., 2021). The technical capabilities of imaging equipment have improved over the years, resulting in more complex images that must be stored, analysed, and managed in different ways for clinical intervention.

The combination of artificial intelligence algorithms with medical imaging has brought significant changes to diagnostic services in healthcare. Deep learning systems based on convolutional neural networks have shown great accuracy in identifying anomalies and categorising diseases from images. Such advancements have brought not only a better standard of diagnostic precision but also improved workflow efficiency and eased the workload of healthcare practitioners (Lynch & Liston, 2018).

The availability of standardized imaging protocols and structured reporting systems has enabled better incorporation of imaging data into other information systems. This standardization has facilitated more accurate examination of patient data, with improved diagnostic checks and enhanced treatment plans. By integrating imaging modalities with automated image analysis, new possibilities have arisen for tools that support prognostication of disease progression and treatment efficacy (Park et al., 2018).

• Genomic and Omics Data
Recent advances in and acceleration of high-throughput sequencing technologies have transformed academic research and healthcare into data-intensive disciplines (Ghassemi et al., 2018). These data sets, including genomics, transcriptomics and proteomics data, are revolutionizing the description of the human body and of the pathogenetic mechanisms of diseases. The combination of these distinct molecular data sources has created unprecedented opportunities for the development of personalized medicine, improving disease risk assessment and, more importantly, disease control by means of targeted drug treatments (Amarasingham et al., 2014). Since the data collected from most patients are large-scale and complex, genomic and omics data analysis requires sophisticated computational methods and storage systems.

Cloud computing platforms and distributed computing frameworks have changed the processing and analysis of genomic data. These technologies have made it possible to meet the daunting computational demands of genomic analysis while addressing issues of data security and use. The availability of dedicated, specialized genomic bioinformatics tools and pipelines has also improved our capability to analyse large and intricate molecular data (Iqbal et al., 2016).

Combining genomic insights with other patient information has promoted more effective approaches to disease diagnosis and treatment planning. This multi-modal data analysis has also resulted in the discovery of new biomarkers and therapeutic targets, building on current knowledge of disease processes and of therapeutic approaches to their management. Along with other sources of clinical information, genomic information has brought new potential to precision medicine and individualized therapeutic approaches (Vayena et al., 2018).

• Wearable and Remote Monitoring Devices
The use of wearables and remote patient monitoring technologies as sources of healthcare data has changed the way ongoing patient monitoring occurs outside a clinical practice setting (Leung et al., 2020). These complex
monitors record various physiological parameters such as heart rate, blood pressure, physical activity, and sleep, providing significant information about the patient's health status and behavior (Galetsi & Katsaliaki, 2020). Collecting data from these devices in real time has changed how health complications are identified and addressed early, an advantage that rests on the vast amounts of longitudinal data the devices collect, which can be used for predictive analysis and for developing individualized healthcare.

Owing to technological advances, artificial intelligence and machine learning methods can be employed to analyze data from wearable devices and improve the understanding of patients' health patterns. Modern techniques for extracting vital signs have evolved to the extent that any slight variation from normal parameters can set off alarm bells for early diagnosis of pathological processes. The integration of data captured through wearable devices with more conventional data assets has opened up new horizons for remote patient care and digital health (Char et al., 2018).

In addition, data standardisation and data transfer infrastructure for wearables have improved the compatibility of wearable device data with clinical databases. Such developments have made it possible for providers to use real-time patient monitoring information in arriving at appropriate decisions. The rise in the use of wearable technologies is also opening new avenues for population health work and producing new knowledge-generation models for various disorders (Cohen et al., 2014).
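
As a rough illustration of the idea above that deviations from a patient's normal parameters can trigger an early alert, the sketch below flags readings that lie far from a personal baseline. It is a simplified assumption for illustration, not a method taken from the reviewed studies: the function name `flag_deviations`, the baseline window, the z-score threshold and the sample values are all hypothetical.

```python
from statistics import mean, stdev


def flag_deviations(readings, baseline_size=10, z_threshold=3.0):
    """Return indices of readings that deviate strongly from an initial baseline.

    The first `baseline_size` readings define the baseline mean and standard
    deviation; later readings are flagged when their z-score exceeds
    `z_threshold`. Both parameters are illustrative assumptions.
    """
    baseline = readings[:baseline_size]
    mu = mean(baseline)
    sigma = stdev(baseline)
    flagged = []
    for i, value in enumerate(readings[baseline_size:], start=baseline_size):
        if sigma > 0 and abs(value - mu) / sigma > z_threshold:
            flagged.append(i)
    return flagged


# Hypothetical resting heart-rate samples in beats per minute (not patient data).
heart_rate = [62, 64, 63, 61, 65, 63, 62, 64, 63, 62, 63, 64, 98, 63]
print(flag_deviations(heart_rate))
```

A fixed baseline keeps the example short; a deployed system would more plausibly use a rolling window and clinically validated thresholds.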

Table 1 Estimated Annual Growth Rates of Healthcare Data Sources

Data Source                        Estimated Annual   Data Volume   Projected Volume   Primary Applications
                                   Growth Rate        (2023)        (2025)
Electronic Health Records (EHRs)   48%                500 PB        1,850 PB           Clinical decision support, Risk prediction
Medical Imaging Data               30%                300 PB        507 PB             Diagnostic analysis, Treatment planning
Genomic and Omics Data             25%                200 PB        312 PB             Precision medicine, Drug development
Wearable Device Data               60%                150 PB        384 PB             Remote monitoring, Preventive care
Sources: (Batko & Ślęzak, 2022; Frost & Sullivan, n.d.; Nambiar et al., 2013)
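
The relationship between the growth rates and the volumes in Table 1 can be explored with a short calculation. The sketch below assumes that each rate compounds annually over the two years from 2023 to 2025; the table does not state how the projections were derived, so this is an assumption, and `project_volume` is a helper name introduced here for illustration.

```python
def project_volume(volume_pb: float, annual_growth: float, years: int) -> float:
    """Compound a starting volume (in PB) at a fixed annual growth rate."""
    return volume_pb * (1.0 + annual_growth) ** years


# (annual growth rate, 2023 volume in PB, 2025 volume in PB as printed in Table 1)
table_1 = {
    "EHRs": (0.48, 500, 1850),
    "Medical imaging": (0.30, 300, 507),
    "Genomic and omics": (0.25, 200, 312),
    "Wearable devices": (0.60, 150, 384),
}

for source, (rate, v2023, v2025_printed) in table_1.items():
    projected = project_volume(v2023, rate, years=2)
    print(f"{source:18s} computed {projected:8.1f} PB   printed {v2025_printed} PB")
```

Under this assumption the imaging, genomic and wearable rows reproduce the printed 2025 figures to within rounding, whereas the EHR row gives roughly 1,095 PB rather than the printed 1,850 PB, so that row may rest on a different assumption.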

With the exponential increase in complex forms of healthcare data, there are huge opportunities to build intelligent decision support systems by applying state-of-the-art analytics and predictive algorithms (Wang and Alexander, 2015). This flood of information has led to the emergence of increasingly complex infrastructure and more elaborate analytical instruments to cope with the continually growing amount of healthcare information. Using these disparate forms of data necessitates sound data governance and data management protocols, as well as highly sophisticated analytical tools and close cooperation within the multidisciplinary treatment and research team, including data scientists and subject matter experts (Galetsi & Katsaliaki, 2020).

Several of these data types have been combined in more comprehensive strategies for patient treatment and health risk assessment. Through data integration, multiple data sources offer the potential for improved predictive power and enhanced anatomical, functional, and molecular characterization. This interprofessional, collaborative approach to healthcare data analysis is enhancing patients' quality of care, cost containment, and resource utilization across healthcare systems (Goldstein et al., 2016).

Owing to advancements in technology, the future of healthcare analytics is expected to be defined by further innovations and better analytical systems. The proliferation of new forms of data and changes in the current ones will require healthcare organizations to adopt new paradigms of working with information. Successful implementation in healthcare will also entail continued commitment to infrastructure, capacity-building in personnel, and research to unlock the potential of such rich instruments (Van Calster et al., 2016).

B. Predictive Analytics and Machine Learning in Healthcare
Predictive analytics applies sophisticated, knowledge-based statistical and computational modeling methods to historical and real-time data in order to forecast future behavior, and with it healthcare is going through a paradigm shift (Alharthi, 2018). It has become more relevant in various areas of healthcare such as disease risk prediction, patient characterization, readmission risk anticipation, clinical decision making, and precision medicine, according to Muniasamy et al. (2020). It has allowed healthcare organizations to shift from a reactive approach to a proactive one, anticipating health concerns before they worsen and distributing resources appropriately across healthcare centres. The use of these analytical tools raises many questions about the quality of the data, model validation, and the medical applicability of the resulting predictions in actual healthcare environments.

• Machine Learning Techniques

• Supervised Learning
Employing supervised learning algorithms has become indispensable in healthcare predictive models
where fundamental methods such as logistic regression, decision trees, and random forests are applied to different prediction tasks (Badawy et al., 2021). These algorithms learn patterns from labelled training data and use them to predict outcomes for new cases, which suits clinical contexts where historical outcomes are often well documented (Nithya & Ilango, 2017). The usefulness of supervised learning therefore depends critically on the quality and representativeness of the training data, on proper feature selection, and on the validation methods used.

The basic supervised learning algorithms have been improved by recent advances in ensemble methods for healthcare. These methods combine multiple base models through techniques such as bagging and boosting, which has improved both predictive accuracy and stability (Christodoulou et al., 2019). Such approaches are useful because healthcare data is complex and heterogeneous in nature.

Combining domain knowledge with supervised learning algorithms has further enhanced their clinical applicability. Feature sets reduced by medical experts, together with constraints on the models, help ensure that results do not contradict the current body of medical knowledge and remain consistent with current practice protocols (Riley et al., 2016). Integrating these learning algorithms with statistical methods has produced better and simpler predictive models.

➢ Deep Learning
Deep learning has transformed healthcare predictive analytics, with CNNs and RNNs considered among the best architectures for handling complicated, high-dimensional healthcare data (Muniasamy et al., 2020). These models hold particular promise for data types such as medical images, time series, and EHRs, where they have led to better diagnostic and prognostic outcomes. Because they learn hierarchical representations directly from raw data, deep learning models have eliminated much manual feature extraction and can discover features that earlier analytical methods might miss.

Recent advances in deep learning architectures for healthcare have brought substantial improvements. Attention mechanisms and graph neural networks have improved the modelling of relationships within healthcare data, while transfer learning has been used to overcome the major problem of limited labelled data in many clinical applications (Liu et al., 2019).

Techniques such as attention visualization and feature attribution have made deep learning models easier to interpret and therefore more transparent to healthcare providers. Such developments have helped narrow the gap between sophisticated predictions and medical judgment, improving the acceptance of deep learning techniques in healthcare (Nevin & PLoS Medicine Editors, 2018).

➢ Ensemble Methods
Many authors report promising results for ensemble methods in healthcare applications, in which several base models are combined to improve prediction (Batko & Ślęzak, 2022). Methods such as random forests, gradient boosting machines, and stacking ensembles exploit the diversity among individual models to make more accurate predictions (Boukenze et al., 2016). Ensemble methods have been widely adopted in healthcare analytics because they capture complex patterns accurately, improve model stability, and mitigate overfitting.

Recent advances in fast automated techniques for ensemble selection further extend the applicability of these methods in healthcare systems. Recent developments determine the best boosting strategy for the specified base models in order to achieve the highest predictive performance at lower computational cost (van der Ploeg et al., 2016). New developments in distributed computing have helped address the problem of deploying large-scale ensemble models in healthcare environments. These technological advances have made it possible to apply ensemble methods while still meeting the real-time requirements of clinical applications (Harris et al., 2016).

➢ Applications of Predictive Analytics in Healthcare

➢ Disease Risk Prediction
Disease risk prediction is among the advances that have transformed how an individual's risk factors are identified and evaluated, using clinical risk models specific to target diseases such as cardiovascular disease, diabetes, and cancer (Andjelkovic Cirkovic et al., 2029). These models use demographic and clinical characteristics, past medical history, lifestyle, and genomic data to construct detailed risk profiles that support early, proactive disease prevention programs (Subrahmanya et al., 2022). With finer stratification, individual patient risks can be managed more easily, and screening programs can be better targeted at high-risk populations.

Combining patient monitoring data with machine learning algorithms has greatly improved time-bound disease risk assessment. Advanced analytics systems can update risk assessments as more information arrives, which allows a more proactive approach to a specific patient's care (Zafar et al., 2019). These systems have shown promise in detecting early signs of acute conditions that warrant intervention.
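As a rough illustration of the bagging idea described under Ensemble Methods, the sketch below trains several one-split classifiers ("stumps") on bootstrap resamples and combines them by majority vote. The data, the function names (`fit_stump`, `fit_bagged_stumps`, `predict`), and the parameter values are all hypothetical and chosen only for illustration; this is not the method of any cited study.

```python
import random
from collections import Counter

# Toy, made-up data: (biomarker value, outcome), where outcome 1 = event occurred.
TRAINING_ROWS = [(1.0, 0), (2.0, 0), (3.0, 0), (4.0, 1), (5.0, 1), (6.0, 1)]


def fit_stump(rows):
    """Fit a one-split classifier that predicts 1 when value >= threshold."""
    best_threshold, best_errors = None, len(rows) + 1
    for threshold, _ in rows:
        # Count training rows this candidate threshold would misclassify.
        errors = sum((value >= threshold) != bool(label) for value, label in rows)
        if errors < best_errors:
            best_threshold, best_errors = threshold, errors
    return best_threshold


def fit_bagged_stumps(rows, n_models=25, seed=0):
    """Train each stump on a bootstrap resample (sampling with replacement)."""
    rng = random.Random(seed)
    return [fit_stump([rng.choice(rows) for _ in rows]) for _ in range(n_models)]


def predict(thresholds, value):
    """Combine the base models by majority vote."""
    votes = Counter(int(value >= t) for t in thresholds)
    return votes.most_common(1)[0][0]


ensemble = fit_bagged_stumps(TRAINING_ROWS)
```

Each resample yields a slightly different threshold, and the vote averages out that variation, which is the stabilising effect attributed to bagging above. An odd number of base models avoids tied votes in the binary case.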
The emergence of multi-modal data analysis has brought further improvements in the accuracy of disease risk assessment models. Traditional models based on image data alone are limited in their ability to deliver a complete and accurate prognosis; more complex models that include molecular markers and environmental factors can offer more detail and better stratification of risks (Mounika et al., 2015). Adding social determinants also improves our understanding of disease risks and increases the accuracy of those measures across populations.

➢ Patient Phenotyping and Stratification
Patient phenotyping is a more refined way of grouping and prioritizing patients according to their clinical, genetic, and therapeutic characteristics (Hripcsak et al., 2016). This analysis helps health care delivery systems establish more precise courses of treatment by identifying distinct patient groups. Combining advanced machine learning with detailed patient information has changed our ability to detect clinically meaningful phenotypes and characterize treatment outcomes (Jen et al., 2012).

Recent technologies have also advanced patient phenotyping through automated phenotyping algorithms, making patient stratification faster and more accurate. Such systems can analyse large amounts of clinical data to discover significant patterns and correlations that other analysis techniques could not detect (Linda, 2016). Natural language processing techniques have made it possible to extract useful phenotypic data from textual clinical reports.

More effective ways of presenting intricate phenotypic features now exist. Tools such as dashboards help clinicians visualize phenotype outcomes and make concrete decisions about their patients. Connecting phenotyping results with clinical decision support systems has opened new opportunities for individualized pharmacotherapy and targeted treatment options.

➢ Readmission Risk Assessment
Readmissions are a major concern in most health organisations because of their effect on patients' health and on the cost of the health delivery system (Higdon et al., 2013). Risk prediction models have therefore emerged as a critical means of identifying high-risk patients, allowing healthcare workers to intervene earlier and plan better for a safe discharge. These models consider several aspects, such as underlying diseases, treatments, outcomes, and social demographics, to produce readmission risk estimates (Amarasingham et al., 2014). The use of these predictive tools has resulted in reduced readmission rates and improved quality of patient care (David et al., 2019).

Recent developments in NLP have made it more feasible to identify risk-contributing factors from clinical notes and discharge summaries. This has enhanced risk assessments by drawing on previously ignored information sources (Culotta, 2010). Given the strong evidence base for incorporating social determinants of health into prediction models of early readmission, this approach has extended the applicability of readmission risk models to varied patient populations.

Machine learning methods have been reported to outperform statistical methods in readmission risk prediction. Automated systems can detect intricate relations between risk factors and can regularly update their models as populations grow (Shanthipriya & Prabavathi, 2018). More recent advances in the interpretability of machine learning models have made it easier to adopt risk predictions in the healthcare domain.

➢ Clinical Decision Support
Clinical decision support systems (CDSS) are a prime example of the impact of machine learning and big data analytics on healthcare, helping clinicians make sound diagnosis and treatment decisions and plan resource use (Ohno-Machado, 2018). These systems combine expert-developed algorithms with practice protocols and documented best-practice knowledge to generate individualized, high-value recommendations that optimise patient care. Implementation of an effective CDSS has been shown to improve clinical outcomes, decrease clinical errors, and increase providers' overall productivity (Hassanalinda & Noordee, 2017).

AI enhancement of decision support systems has advanced CDSS into active knowledge systems that incorporate contextual features. Contemporary systems can interpret large and diverse patient information in real time, producing outputs that are patient-specific and dependent on the patient's context (Suresh 2016). The most notable improvement in recent years has been in techniques for making clinical decision support recommendations explainable.

Modern analytical methods have improved the learning capabilities of CDSS by using the outcomes of prior cases and the evolution of clinical standards. These systems can now identify patterns in treatment responses and recommend the most effective treatment plan, taking patient factors into account (Prabavathi & Shanthipriya, 2017). Real-time monitoring data has helped clinicians make more accurate and timely clinical decisions and adjust treatment plans accordingly.

➢ Precision Medicine
Precision medicine is a new approach to health care that uses mathematical algorithms to decide on the most appropriate treatment for a given patient based on the patient's genetic predisposition and other factors in their
environment and lifestyle (Ghassemi et al., 2018). This approach has changed both the choice of treatment and the way doses are adjusted, aiming for the greatest therapeutic benefit with a minimum of side effects. Combining genomic information with clinical data has given rise to new approaches that use personalized treatment regimens (Bakare & Argiddi, 2016).

Recent developments in artificial intelligence have improved our capability to predict a patient's individual response to particular treatment options. These systems can process large molecular datasets together with patient outcome data to estimate the efficacy of a treatment and its likely adverse effects (Chinchmalatpure & Dhore, 2021). Specialized statistical tools have made it possible to translate genomic findings into clinical intervention guidelines that benefit patients.

Real-time, continuous monitoring of the effectiveness of precision medicine interventions has also improved. Real-time review of patient responses allows the management or diagnosis of a case to be modified and side effects to be identified (Batko & Ślęzak, 2022). Applying precision medicine concepts to clinical workflows has supported the implementation of differentiated treatment planning.

➢ Data Sources for Predictive Healthcare Analytics
Predictive analytics in healthcare depends primarily on how well various datasets are integrated and how good the incoming information is (Galetsi & Katsaliaki, 2020). Contemporary health care encompasses a considerable variety of data, which can be characterized from various standpoints and which provides different categories of information, each with its own difficulties in data collection, preparation, and analysis (Belle et al., 2015). For predictive analytics to succeed, the quality, standardization, and compatibility of the data gathered from these sources must be considered.

Table 2 Common data sources for predictive healthcare analytics

| Data Source | Description | Key Features | Technical Requirements | Primary Applications | Challenges |
|---|---|---|---|---|---|
| Electronic Health Records (EHRs) | Digital patient medical records | Structured and unstructured clinical data | Secure storage, standardized formats | Disease prediction, phenotyping | Data quality, interoperability |
| Claims Data | Healthcare billing and insurance information | Standardized coding systems | Processing pipelines, validation tools | Cost analysis, utilization patterns | Coding accuracy, temporal lag |
| Genomic Data | Molecular and genetic information | High-dimensional sequence data | High-performance computing | Precision medicine, risk assessment | Storage requirements, processing time |
| Wearable Device Data | Continuous physiological monitoring | Real-time streaming data | Edge computing, secure transmission | Remote monitoring, early warning | Data quality, integration |
| Imaging Data | Diagnostic medical images | Multi-dimensional visual data | Specialized storage, processing tools | Diagnostic support, disease monitoring | Storage costs, standardization |
| Social Determinants | Environmental and behavioral factors | Diverse external data sources | Data integration platforms | Risk stratification, intervention planning | Data availability, standardization |

Combining conventional and novel kinds of data opens new prospects for effective healthcare analytics. Current predictive models can combine data from different domains and thereby predict patient outcomes and population health trends more accurately. Sophisticated data integration models have improved the blending of multiple forms of data while preserving the quality and security of the data during the process (Kleinrouweler et al., 2016).

New technologies for data collection and processing now make real-time analytics possible in healthcare facilities. Stream-processing and edge-computing architectures make it possible to analyze the steady stream of data from wearable devices and monitoring systems, allowing timely identification of adverse events in patients and subsequent adjustment of patient management plans in response (Levy-Fix et al., 2018).

Data governance frameworks and privacy regulations have shaped the design of healthcare analytics systems. Today's solutions must support elaborate data access while meeting the demands of patient confidentiality and data protection. Through the adoption of advanced access control approaches and encryption methods, it has been possible to share healthcare data securely while at the same

time meeting all regulatory requirements (Priyanka & Kulennavar, 2014).

C. Data Integration and Interoperability in Predictive Healthcare Analytics
Linking disparate healthcare data sources is acknowledged as a challenging frontier in applied predictive analytics, calling for sound technology and methodology to address the integration challenges raised by data fragmentation and heterogeneity (Kahn et al., 2016). As Benson & Grieve noted, one of the most persistent problems since the early days of healthcare information systems has been that these systems are often unable to interoperate with each other, which has been a great obstacle to advanced analysis and to the use of data for predictive inference.

Table 3 Comparative Analysis of Healthcare Data Integration Frameworks and Interoperability Standards

| Framework | Interoperability Level | Data Standards | Computational Complexity | Scalability Potential | Implementation Challenges | Predictive Modeling Compatibility |
|---|---|---|---|---|---|---|
| HL7 FHIR | High | SNOMED, LOINC | Moderate | Extensive | Complex Authentication | Excellent |
| OpenEHR | Very High | ISO 13606 | High | Moderate | Semantic Mapping | Good |
| DICOM | Specialized | Imaging Protocols | Low | Limited | Vendor-Specific | Moderate |
| SNOMED CT | Terminology | Clinical Terms | Very Low | Comprehensive | Multilingual Challenges | Limited |
| IHE XDS | Moderate | XDS.b Registry | High | Scalable | Governance Issues | Good |
| OMOP CDM | High | Standardized Mapping | Very High | Extensive | Data Transformation | Excellent |

Data integration schemata are highly complex and encompass multiple strategies, each with specific methodological implications for predictive healthcare analytics (Weber et al., 2019). Scholars have paid increasing attention to how to construct stable, standard interfaces, recognizing the importance of standardization in addressing high heterogeneity (Reconceptualizing, 2018).

Computational techniques for data integration have become significantly more developed and complex with the addition of machine-learning-based dynamic integration of varied data representations (Chen et al., 2020). These approaches use algorithms such as probabilistic matching, ontological reasoning, and semantic network analysis to build a harmonized integration framework that can support complex predictive model structures (Xiaomeng et al., 2021). Recent investigations indicate that data integration is at once a technical, semantic, and organizational problem that should be solved by multi-disciplinary teams of clinicians, data scientists, and information technologists (Rodriguez-Gonzalez et al., 2022). Building such a detailed integration framework means that domain specifics, regulations, and technology must all inform the approach to integration (Bates et al., 2018).

D. Computational Methodologies in Predictive Healthcare Modeling
Data integration schemata are highly complex and encompass multiple strategies, each with specific methodological implications for predictive healthcare analytics (Weber et al., 2019). Scholars have paid increasing attention to how to construct stable, standard interfaces, recognizing the importance of standardization in addressing high heterogeneity (Reconceptualizing, 2018).

Computational techniques for data integration have become significantly more developed and complex with the addition of machine-learning-based dynamic integration of varied data representations (Chen et al., 2020). These approaches use algorithms such as probabilistic matching, ontological reasoning, and semantic network analysis to build a harmonized integration framework that can support complex predictive model structures (Xiaomeng et al., 2021).
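To make the probabilistic matching mentioned above more concrete, here is a minimal sketch in the style of Fellegi-Sunter record linkage, which is one common formulation and not necessarily the one used in the cited works. Each compared field adds a log-likelihood-ratio weight depending on whether the two records agree. The field names and the m/u probabilities in `FIELD_PARAMS` are hypothetical values chosen for illustration.

```python
import math

# Hypothetical parameters per field:
#   m = P(field agrees | records refer to the same patient)
#   u = P(field agrees | records refer to different patients)
FIELD_PARAMS = {
    "surname": (0.95, 0.01),
    "birth_date": (0.98, 0.003),
    "postcode": (0.90, 0.05),
}


def match_weight(rec_a, rec_b, params=FIELD_PARAMS):
    """Sum agreement/disagreement weights (in bits) over the compared fields."""
    weight = 0.0
    for field, (m, u) in params.items():
        value_a, value_b = rec_a.get(field), rec_b.get(field)
        if value_a is None or value_b is None:
            continue  # a missing value carries no evidence either way
        if value_a == value_b:
            weight += math.log2(m / u)              # agreement: positive weight
        else:
            weight += math.log2((1 - m) / (1 - u))  # disagreement: negative weight
    return weight


ehr_record = {"surname": "Silva", "birth_date": "1980-02-14", "postcode": "4000"}
claims_record = {"surname": "Silva", "birth_date": "1980-02-14", "postcode": "4100"}
score = match_weight(ehr_record, claims_record)
```

In practice the resulting score would be compared against upper and lower thresholds to classify a pair as a match, a non-match, or a candidate for manual review; the thresholds, like the probabilities, would need to be estimated from the data being linked.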

Fig 4 Predictive Modeling in Medicine

Recent investigations indicate that data integration is at once a technical, semantic, and organizational problem that should be solved by multi-disciplinary teams of clinicians, data scientists, and information technologists (Rodriguez-Gonzalez et al., 2022). Building such a detailed integration framework means that domain specifics, regulations, and technology must all inform the approach to integration (Bates et al., 2018).

E. Disease Prediction and Risk Stratification Frameworks
Integrated disease prediction models are one of the dominant subdomains of predictive healthcare analytics, requiring complex computational approaches and highly nuanced knowledge representation formalisms (Himmelstein et al., 2017). To establish stronger risk stratification models, features need to be used in combination so that they reflect multiple domains and their effects on physiological parameters (Weng et al., 2020).
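A simple way to picture risk stratification is a logistic risk score whose output is binned into tiers. The sketch below is illustrative only: the feature names, the coefficients in `COEFFICIENTS`, the intercept, and the tier cut-offs are all made-up assumptions, not values taken from any cited model.

```python
import math

# Hypothetical coefficients for illustration; a real model would be fitted and validated.
COEFFICIENTS = {"age_decades": 0.30, "prior_admissions": 0.45, "has_diabetes": 0.60}
INTERCEPT = -4.0


def predicted_risk(patient):
    """Logistic model: risk = 1 / (1 + exp(-(intercept + sum of weight * feature)))."""
    z = INTERCEPT + sum(w * patient.get(name, 0.0) for name, w in COEFFICIENTS.items())
    return 1.0 / (1.0 + math.exp(-z))


def risk_tier(probability, low=0.10, high=0.30):
    """Map a predicted probability to a tier using assumed cut-offs."""
    if probability < low:
        return "low"
    if probability < high:
        return "medium"
    return "high"


patients = {
    "A": {"age_decades": 4, "prior_admissions": 0, "has_diabetes": 0},
    "B": {"age_decades": 6, "prior_admissions": 1, "has_diabetes": 0},
    "C": {"age_decades": 7, "prior_admissions": 3, "has_diabetes": 1},
}
tiers = {pid: risk_tier(predicted_risk(features)) for pid, features in patients.items()}
```

With these assumed values the three patients fall into the low, medium, and high tiers respectively. The cut-offs are a design choice that would normally be set against clinical capacity and the cost of missed cases.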

Table 4 Comprehensive Disease Prediction Framework Characteristics and Performance Metrics

| Disease Category | Prediction Accuracy | Feature Complexity | Multimodal Integration | Longitudinal Tracking | Personalization Potential | Computational Requirements |
|---|---|---|---|---|---|---|
| Cardiovascular Diseases | 0.85-0.92 | High | Excellent | Superior | Moderate | Very High |
| Neurodegenerative Disorders | 0.73-0.88 | Extremely High | Good | Excellent | High | High |
| Oncological Conditions | 0.79-0.95 | Moderate | Superior | Good | High | Very High |
| Metabolic Disorders | 0.82-0.90 | Moderate | Good | Moderate | High | Moderate |
| Respiratory Diseases | 0.70-0.85 | Low | Limited | Good | Moderate | Low |
| Infectious Diseases | 0.68-0.82 | High | Excellent | Limited | Low | Moderate |
| Autoimmune Conditions | 0.75-0.90 | Extremely High | Good | Superior | High | High |

Advanced disease prediction models need detailed computational approaches for analysing genetic, environmental, behavioral, and physiological parameters as measured across temporal and contextual domains (Obermeyer & Emanuel, 2016). Multiscale data integration allows for risk assessment models that cannot be achieved with the more classic reductionist methods applied to disease forecasting (Leung et al., 2019).

Accurate risk assessment calls for sophisticated computational systems that can resolve the complexity of compensatory dynamics, changes in a person's physiology over time, and lifelong health trajectory profiles (Torkamani et al., 2019). These methods apply complex machine learning algorithms that may capture subtle interactions among numerous clinical variables with complicated relationships, thus providing more context-sensitive and personalized predictive capability (Beam & Kohane, 2018).

Furthermore, the development of complex and robust disease risk prediction models requires efficient computational approaches that can break down complex, multi-modal data structures while remaining scalable and interpretable (Topol, 2020). More complex ensemble learning frameworks, such as stacked generalization and boosting algorithms, have further demonstrated the ability of ensemble learning to construct accurate and generalizable predictive frameworks across various clinical fields (Chen et al., 2021). Recent studies that predict disease patterns need to include probabilistic models to quantify variation and uncertainty in risk estimates (Pearl, 2018). Thus, these computational approaches use
more refined Bayesian inference, theoretical causal models, and robust learning algorithms in an effort to achieve more sophisticated and contextually sound models (Schulam & Saria, 2017).

IV. DATA INTEGRATION

A. Data Integration and Preprocessing for Predictive Healthcare Analytics
Appropriate integration and preprocessing of different healthcare data types is one of the most important prerequisites for advancing the use of predictive analysis in medical science (Alharthi, 2018). Health care information environments continue to evolve into intricate systems of heterogeneous information that often defies easy analysis. Chronic diseases such as diabetes and cardiovascular disease, as well as cancer, require complex data handling methods to reveal valuable patterns. The nature of medical information implies not only complexity in the data itself but also computational and methodological problems caused by the heterogeneity of the information flow. Data collection takes place in complex environments with many adverse factors, such as missing values, inconsistent record keeping, and varying standards of data quality.

Fig 5 Workflow of Big data Analytics

Moreover, the rapid expansion of digital health technologies has increased the amount and richness of medical information, which increases the need for intricate preprocessing. Chinchmalatpure & Dhore (2021) stress the importance of data integration approaches that convert medical data, usually collected in free form, into structured data that can be analysed. As Leung et al. (2020) discussed, integrating diverse, properly pre-processed datasets can affect the performance of predictive models in various clinical fields.

➢ Data Integration
Data integration is the complex process of combining various sources of healthcare data into a single, interoperable platform that addresses the distinctive challenges of contemporary medical information systems (Bartley, 2021). In addition to EHRs, this integration includes claims data, genomic data, wearable device measurements, and data from other emerging digital health technologies. Diseases such as Alzheimer's disease, multiple sclerosis, and rheumatoid arthritis need broad and complex data models that can integrate patient data.

In addition, as new healthcare specializations continue to emerge, strong data governance frameworks become indispensable for making data compatible across different fields of medicine. According to Alghamdi et al. (2021), further attention should be paid to establishing comprehensive data mapping methodologies that provide a theoretical foundation for comparing different medical data sets. Furthermore, an organization's integration process must address privacy: appropriate privacy measures have to be taken, data protection procedures upheld, and patient confidentiality maintained. Batko & Ślęzak (2022) suggest that sophisticated approaches to data integration can yield more precise analysis of a patient's health and help change the approach to the disease process and to individual health risk factors. As a result, more sophisticated technological support and multi-disciplinary teamwork are needed to improve the prospects of integrated medical data systems.

➢ Data Preprocessing
Data preprocessing is a highly technical methodological step that converts raw healthcare data into forms suitable for analysis (Badawy et al., 2021). Given the intricate nature of medical data, multilayer preprocessing is needed that can address the specific challenges of different disease-related fields. Diseases such as lupus, Parkinson's disease, and thyroid disorders call for proper processing of data to allow accurate modeling. In addition, preprocessing techniques are also
required to address a complex medical information environment: how to preserve the data while ensuring that it remains effective for analysis.

Because healthcare data is diverse and non-stationary, it requires a comprehensive approach that goes beyond traditional pre-processing activities such as cleaning and transformation and incorporates advanced statistical and machine learning techniques. Galetsi & Katsaliaki (2020) consider preprocessing one of the toughest steps in machine learning, because it is the opportunity to remove many types of possible bias from the data and so make predictive models more trustworthy. Moreover, preprocessing acts as a bridge between data collection and advanced analytical algorithms, since everyone involved in the modeling phase relies on accurately pre-processed data. Nambiar et al. (2013) describe the significance of advanced preprocessing methodologies in revealing latent structure in highly layered medical data sets.

➢ Data Cleaning
Data cleaning is another complex methodological intervention, aimed at the inherent imperfections of healthcare datasets (Galetsi & Katsaliaki, 2020). It becomes most important when assessing multifactorial conditions such as cystic fibrosis, hepatitis, or respiratory conditions such as chronic obstructive pulmonary disease. Data cleaning is more than error removal; it is a way of increasing the credibility of the data and of the resulting analysis. In their review, Nambiar et al. (2013) stated that imputation is a versatile procedure whose goal is not only to deal with missing or wrong values but also to maintain the statistical properties of the data set.

Moreover, outlier detection techniques are applied to values that can skew the results of an analytical study. Researchers need to deploy validation rules that distinguish real medical conditions from mere recording mistakes. In this way, data cleaning seeks to maintain the completeness and depth of the medical information on the one hand, while producing a consolidated and accurate database for further sophisticated algorithms on the other.

➢ Data Transformation
Data transformation is a key methodological step that translates large, diverse healthcare data into structured formats for analysis (Muniasamy et al., 2020). It becomes especially elaborate for conditions such as sarcoidosis, fibromyalgia, and haemophilia, which demand special approaches to data analysis. Beyond simple unit conversions, data transformation also involves more sophisticated feature transformations that can be used to extract information from complex multi-dimensional medical data tables. Belle et al. (2015) noted that derived variables, built with prior knowledge of the data domain, are expected to model the complexity inherent in health care datasets. In addition, transformation strategies, where they are required, must take care not to eliminate patient characteristics and clinical differences altogether. The process also includes developing mapping methods that convert various medical terms and coding structures into comparable and consistent formats.

➢ Data Enrichment
Data enrichment supports secondary use of healthcare datasets by adding more information to the data and making it more valuable for predictive analytics (Amarasingham et al., 2014). It becomes especially important when interpreting complicated conditions such as scleroderma, amyloidosis, and granulomatosis with polyangiitis. Moreover, enrichment methods go a step further than conventional oversampling methods, leveraging a variety of approaches that combine socioeconomic, environmental, and lifestyle factors into detailed patient profiles. As Hripcsak et al. (2016) note, this integrated approach substantially enhances the data, and enriched data sources can help improve the understanding of diseases and the modeling of personal health trajectories.

The enrichment process also demands mapping methods that can harmonize disparate information elements when enriching clinical data while ensuring data consistency and patient confidentiality. In addition to obvious factors such as age, sex, and ethnicity, genomic features, behaviour, and other modern physiological metrics can be used to define patient profiles. Thus, data enrichment is a methodologically significant strategy that goes beyond merely collecting numbers about persons and populations.

B. Analytical Techniques for Predictive Healthcare Analytics
Healthcare forecasting has become a complex field of study containing a wide range of analytical methods for predicting health outcomes (Ghassemi et al., 2018). These methods range from conventional statistical tools to state-of-the-art machine learning technologies applied to disease prediction and patient management. Autoimmune diseases such as lupus, Wegener's granulomatosis, and primary biliary cirrhosis require more elaborate analytical frameworks, because their diagnosis still at times eludes doctors owing to the many layers of medical information.

Moreover, increased computational power coupled with new algorithmic methods has profoundly transformed the techniques for deriving valuable information from increasingly rich healthcare databases. Amarasingham et al. (2014) state that different types of analysis when combined could help change how companies predict changes and trends in customer
14
behavior to a more accurate picture. Also, the field remains relatively young and is growing very fast, with scientists developing ever more complex approaches to tackle increasingly complex medical prognosis and patient management.

 Traditional Statistical Methods
Traditional statistical methods represent a foundational first-generation technique in the predictive modeling of healthcare data, one that provides reliable and easily interpreted results (Jen et al., 2012). They offer a rich set of highly elaborated approaches for analysing intricate medical diagnoses such as Churg-Strauss syndrome, microscopic polyangiitis, and Sjögren’s syndrome. Moreover, these methods rest on well-defined statistical foundations from which human-interpretable conclusions about medical phenomena can be derived. Higdon et al. (2013) noted that interpretability and conformability to the user’s purposes are strengths of conventional statistical techniques, with predictive models being easily interpretable in their work. In addition, the methods provide the conceptual apparatus required for assessing the significance of each predictor in medical practice. Beyond their readability, these methods are a basis for further development and serve as a baseline against which more sophisticated machine learning models are compared. Therefore, established procedures remain highly valuable in medical science as the primary source of practical predictive analytics tools that are scientifically sound.

 Advanced Machine Learning Techniques
The latest trends in machine learning have proven to be a radical tool for predictive healthcare analytics and have shown extraordinary potential for handling high-level medical information (Char et al., 2018). The analytical process becomes especially nuanced when analysing diseases like Behçet’s disease, mixed connective tissue disease, and antiphospholipid syndrome. Moreover, these methods show an outstanding ability to model the complex, non-linear interactions that underlie healthcare data. Nevin and the PLoS Medicine Editors (2018) also describe how machine learning has emerged as the tool of choice in advancing medical predictive modelling and how these approaches can identify more intricate relationships than are detectable by conventional quantitative analysis.

Further, the applicability of medical analytics to all forms of data, both structured and unstructured, is another great step forward. Unlike conventional prediction methodologies, machine learning approaches offer more dynamic models that allow new medical knowledge to be integrated. As a result, these sophisticated methodologies are quickly altering the way medical forecasting is performed, providing more specific and individualized assessments of patients’ clinical outcomes.

V. DATA SOURCES

A. Data Sources and Analytical Techniques in Predictive Healthcare Analytics
The systematic review shows that predictive healthcare analytics is a structured and evolving field in terms of both the kinds of data and the kinds of analytical strategies used (Alharthi, 2018). Diseases like rheumatoid arthritis, multiple sclerosis, and systemic lupus erythematosus, as well as ulcerative colitis, Parkinson’s disease, and Alzheimer’s disease, all illustrate the need for highly complex and refined data processing and analytics. Additionally, the research establishes the capability of sophisticated prediction methodologies in dealing with other medical phenomena. Ghassemi et al. (2018) indicate that there is a contemporary richness in analytical techniques, pointing to the fact that healthcare prediction is developing rapidly. Moreover, the analysis shows how diverse medical databases are and how they can be used to provide valuable insights.

The study shows that EHRs were the most common data source, with 78.2% of the reviewed studies using the records for big data analysis. EHRs were used for many purposes, such as disease risk assessment, patient phenotyping, readmission risk assessment, and decision support tools. Medical and health insurance data, next-generation sequencing (NGS) and other genomic and proteomic data, body-worn sensor data, and medical imaging data were also utilized, but not to a similar extent.

Random forests, gradient boosting machines, and stacking ensembles were used in 32.2% of the studies using ML, indicating that more researchers recognise that combining multiple models, which has often been shown to increase predictive performance, is possible with ML (Ghassemi et al., 2018; Beam & Kohane, 2018). These techniques combine an ensemble of different models so that weaknesses intrinsic to each model can be compensated by the strengths of the others, improving overall predictive ability. Because medical data are both structured and diverse, their analysis requires fine-tuned approaches that can reveal nonlinear dependencies and perform well on high-dimensional data.

Among the specialized methods, time-series analysis was the most common, used in 18.4% of the studies, followed by survival analysis at 16.1%. These customized methods are useful in healthcare settings where time factors and patient life courses are essential (Chen et al., 2017; Kleinrouweler et al., 2016). Time-series models, including ARIMA and exponential smoothing, are used to project disease incidence and progression, treatment interventions, and health system demands. Cox proportional hazards models and Kaplan-Meier estimates give a more detailed picture of patient outcomes and risk, and help establish the usefulness and efficacy of treatment options.
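
As an illustration of the survival analysis discussed in this section, the following is a minimal sketch of the Kaplan-Meier product-limit estimator in plain Python. It is not taken from any of the reviewed studies: the function name kaplan_meier and the six-patient follow-up data are hypothetical and serve only to show the calculation. At each time with an observed event, the running survival probability is multiplied by (1 - events / patients still at risk); censored patients leave the risk set without counting as events.

```python
from collections import Counter


def kaplan_meier(durations, events):
    """Return [(time, survival_probability)] at each distinct event time.

    durations: follow-up time of each patient (hypothetical units, e.g. months)
    events:    1 if the event (e.g. readmission) was observed, 0 if censored
    """
    if len(durations) != len(events):
        raise ValueError("durations and events must have the same length")

    observed = Counter(t for t, e in zip(durations, events) if e)
    leaving = Counter(durations)  # events and censorings both leave the risk set

    at_risk = len(durations)
    survival = 1.0
    curve = []
    for t in sorted(leaving):
        d = observed.get(t, 0)
        if d:
            # Product-limit step: S(t) = S(previous) * (1 - d / n_at_risk)
            survival *= 1.0 - d / at_risk
            curve.append((t, survival))
        at_risk -= leaving[t]
    return curve


# Hypothetical follow-up data for six patients (illustration only).
durations = [5, 8, 8, 12, 15, 20]
events = [1, 1, 0, 1, 0, 1]

curve = kaplan_meier(durations, events)
for time, prob in curve:
    print(f"t={time}: S(t)={prob:.3f}")
```

In practice a dedicated survival analysis library would normally be used, since it also provides confidence intervals and Cox regression; the sketch only makes the arithmetic behind the estimate visible.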
This rich mix of data makes the problem of predictive healthcare analytics multifaceted and challenging, which agrees with the definition of the problem stated in Section 1. While EHRs remained the most prevalent source, the incorporation of genomic, wearable, and imaging data signifies the
higher complexity of predictive analytics solutions (Belle et al., 2015). This multilevel data fusion allows for a large-scale and customized approach to prognosis, offering potential for the total transformation of several decision-making processes in clinical practice.

Genomic and omics data, an innovative area, were used in 27.6% of the studies; they relate to precision medicine and targeted pharmaceutical treatments. Advancing knowledge of genetic markers, molecular profiling, and biomolecular connections makes it possible to create much more detailed models that go beyond conventional biomedical metrics (Higdon et al., 2013). These enabling tools improve the understanding of patient characteristics and, therefore, early-intervention planning.

B. Clinical Applications and Performance of Predictive Analytics in Healthcare Domains
Predictive analytics in healthcare have evolved to include numerous possibilities in a bid to improve patient care, assist in clinical and managerial decisions, and fully realize resource optimization (Zafar et al., 2019). The reviewed studies highlighted several key areas where predictive analytics have been employed:

 Disease Risk Prediction:
Risk screening models have been created to target patients who are at risk of developing conditions such as cardiovascular diseases, diabetes, cancer, and neurodegenerative diseases (Boukenze et al., 2016; Jen et al., 2012). By using features extracted from EHRs, claims data, and genomic predictors, these models can screen patients by assessed risk and support intervention and prevention.

 Patient Phenotyping:
The Press and Coleman study described in Section 1 highlighted that predictive analytics have been used to develop patient phenotypes from electronic health record data, claims data, and wearable device data (Hripcsak et al., 2016). Such phenotypes could be used to design individual treatments, identify therapeutic approaches, and underpin management strategies for complicated patients with multiple disorders.

 Readmission Risk Assessment:
Machine learning algorithms have been put into practice to assess patients’ risk of hospital readmission, a key quality indicator and one of the biggest determinants of cost in healthcare (Harris et al., 2016; David et al., 2019). With PCP data and/or patient demographic, clinical, and social determinants of health (SDOH) data, these models could be useful for identifying at-risk patients for early, targeted intervention, as well as for risk stratification of patients prior to discharge to better position them to manage their transitions of care successfully.

 Clinical Decision Support:
Clinical decision support systems have adopted predictive analytics to offer real-time recommendations and alerts to clinicians (Chen et al., 2017). They can also help with medication ordering and with diagnostic and treatment decisions, and may enhance patient safety and health care service delivery.

 Precision Medicine:
Advancements in genetics and imaging biomarkers, along with integrated cohort data, facilitate the design of individual-based care-management plans and the discovery of new diagnostic and prognostic markers (Higdon et al., 2013; Beam & Kohane, 2018). The delivery of highly individualized care is one of the areas of emphasis of predictive analysis in the era of precision medicine.

C. Performance and Impact of Predictive Models in Healthcare
The reviewed papers have confirmed that the application of advanced data analysis approaches, including machine learning and deep learning, can further improve the precision and validity of forecasting models in healthcare (Amarasingham et al., 2014; Galetsi & Katsaliaki, 2020). Many papers also indicated that machine learning models were more accurate than statistical techniques in terms of prediction, although Christodoulou et al. (2019) report no performance benefit of machine learning over logistic regression for clinical prediction models, and Goldstein et al. (2016) stress the importance of model validation and clinical context. For instance, Boukenze et al. (2016) employed decision trees, random forests, and other algorithms such as support vector machines to predict cases of chronic diseases, including diabetes and hypertension, with adequate accuracy and sensitivity. Likewise, Jen et al. (2012) constructed an early warning system for chronic diseases using an ensemble approach, showing promising predictive competency and an ability to shortlist patients for proper interventions.

The use of predictive analytics has also shown some positive trends in outcomes for patients as well as for healthcare delivery. David et al. (2019) conducted a study in which predictive analytics-driven interventions reduced demand for healthcare services, implying lower costs and better resource utilization. In addition, the use of clinical decision support (CDS) linked with predictive models has been shown to improve patient safety, decrease the rates of medication errors, and support standardized care (Chen et al., 2017).

It should be noted that the application of the discussed techniques in the healthcare domain depends on accurate model validation, procedures for updating the models, and good integration into clinical environments (Goldstein et al., 2016). It also means that there are challenges relating to data quality, data sharing and interoperability, and the overarching and growing ethical concerns, which must be solved to advance the use of predictive analytics in healthcare at a broader scale.

D. Data Quality and Interoperability Challenges
Data quality and compatibility is one of the main problems associated with the application of predictive analytics in healthcare (Chinchmalatpure & Dhore, 2021). Healthcare data may be incomplete, contain many missing values and mixed data formats, and may also contain errors or inconsistencies, which directly affects the
performance and accuracy of predictive models (Iqbal et al., 2016).

The absence of data harmonization and interoperability across multiple healthcare infrastructures and data sets also presents a huge challenge to data collection and use (Frost & Sullivan, n.d.). One of the major future challenges is the effective merging of limited-scope data from EHRs and claims with the more comprehensive data from genomic sources, medical images, clinical notes, and wearable devices (Belle et al., 2015).

However, due to the gigantic size and variety of clinical data in healthcare and the continual updating of clinical practices, diagnostic criteria, and therapeutic management, it is important to maintain and review the model frequently to keep the prediction model reliable and valid (Chen et al., 2017). If this is not done, the model produces overly pessimistic or optimistic estimates, which very negatively affects the credibility and general acceptance of predictive analytics within the health care industry.

E. Ethical Considerations and Model Interpretability
Predictive analytics are used broadly in healthcare organizations, which leads to several ethical concerns, including privacy concerns, issues of fairness regarding the impact of analytics on the populations that healthcare needs to serve, and issues of explainability and transparency of decisions made by healthcare organizations (Cohen et al., 2014).

Medical information, including patient information, is highly protected under regulatory requirements such as HIPAA in the USA. Challenges that should be met prior to the analysis of the data include, firstly, the ethical usage of the data, secondly, patient consent, and thirdly, the privacy of individuals.

Also, we need to consider the problem of skewed health care data and the possibility of built-in bias in prediction models, which might culminate in unfair discrimination against vulnerable populations (Nevin & PLoS Medicine Editors, 2018). It is therefore imperative to address these biases in advance to ensure that the utilization of healthcare services is predicted accurately, efficiently, and fairly.

The interpretability and explainability of the models are also important, since healthcare decision-makers and patients need to understand why certain recommendations result from the constructed models (Christodoulou et al., 2019). Black box models can create distrust, reduce clinical utilization, and drastically complicate the auditing of the model and its decision-making process.

Mitigating these ethical and interpretability issues will be critical for the successful translation of predictive analytics (PA) into healthcare contexts, so that the intended advantages of big data utilization, namely durable and sustainable healthcare solutions consistent with patient rights to privacy, fairness, and accountability, can be attained.

F. Emerging Trends and Future Directions

The field of predictive analytics in healthcare is rapidly evolving, with several emerging trends and future directions that hold promise for further advancements:

 Integrative Data Sources:
Integration of additional data streams such as wearable device data, social media data, environmental data, and real-world evidence provides an additional level of coverage and patient specificity to the models (Belle et al., 2015; Culotta, 2010).

 Automated Feature Engineering:
Automated feature extraction techniques based on machine learning and deep learning are another potential way of speeding up the discovery of effective predictors in very large healthcare data sets (Ghassemi et al., 2018).

 Federated Learning and Distributed Models:
Newer techniques in distributed learning systems, such as federated learning and distributed modelling, allow the creation of strong prediction models without requiring centralised data sharing, and hence can circumvent many data privacy and data management issues (Alharthi, 2018).

 Explainable AI and Interpretable Models:
The heightened interest in explainable AI (XAI) and interpretable models can help make PA in healthcare more intelligible and credible to clinicians and patients (Christodoulou et al., 2019).

 Continuous Learning and Adaptation:
Integration with live data streams and feedback can help improve a model’s usefulness through constant learning and updating in response to emerging trends in the ever-evolving healthcare sector (Chen et al., 2017).

 Interdisciplinary Collaboration:
Building a closer partnership between clinicians, data scientists, and ethicists, so that predictive analytics is used in ways that meet the needs and mitigate the obstacles specific to the healthcare sector, will be critical going forward (Char et al., 2018).

Given the rather recent explosion in data creation and the adoption of data-driven decision making in healthcare systems, the field of predictive analytics is expected to act as a change agent for patient outcomes, healthcare system performance, and reforms of healthcare delivery models.

G. Limitations and Future Research Directions
It is worth pointing out that even though the present article offers a comprehensive review of the state of knowledge on predictive analytics in healthcare, the research has its shortcomings, and it is necessary to identify the directions in which further studies could be conducted.
 Geographical and Cultural Bias:
Most of the reviewed studies originated in North America and Europe, with relatively few investigations coming from other parts of the world. Extending the geographical and cultural range of the study can shed light on the peculiarities of the use of predictive analytics in different countries and cultures.

 Evaluation of Real-World Impact:
Many studies have shown that various predictive models can be designed and tested for technical accuracy; however, there is a significant lack of investigations that consider the effectiveness of these models in real health care practice, including their effect on patient outcomes and on the costs of care and organization (Liu et al., 2019).

 Long-term Sustainability and Scalability:
More longitudinal research is required to examine the durability and scalability of predictive analytics solutions and to evaluate the real-world issues and variables affecting broad integration into healthcare systems.

 Interdisciplinary Collaboration and Stakeholder Engagement:
Cooperation between healthcare workers, mathematicians, ethicists, and scholars can offer hands-on knowledge regarding the organizational, moral, and legal issues involved in the successful implementation of PA in healthcare.

 Addressing Algorithmic Bias and Fairness:
There is a lack of specific research on the algorithmic biases that affect predictive models and diagnostic tools for vulnerable populations and minorities, which makes addressing these problems critical for fair and equal health care distribution.

Despite these limitations, and with these considerations in mind, research on predictive analytics in health systems can continue to improve while progressing toward the vision of data-driven, individualized, and equal healthcare access for the world.

VI. CONCLUSION AND RECOMMENDATIONS

 Conclusion
In conclusion, the integration of predictive analytics in healthcare has the potential to revolutionize clinical practice by enabling early disease detection, personalized treatment plans, and optimized resource allocation. The comprehensive review of the current literature has demonstrated the growing adoption of advanced data analytics techniques, such as machine learning and deep learning, in diverse healthcare applications. The findings highlight the diverse data sources, including electronic health records, claims data, genomic information, and wearable device data, that are being leveraged to develop robust predictive models. These models have shown superior performance in areas such as disease risk prediction, patient phenotyping, readmission risk assessment, clinical decision support, and precision medicine. However, the successful implementation of predictive analytics in healthcare settings faces several challenges related to data quality, interoperability, ethical considerations, and model validation. Addressing these challenges through interdisciplinary collaboration, robust regulatory frameworks, and continuous model updating will be crucial for the widespread adoption and scalability of predictive analytics in healthcare. As the healthcare industry continues to generate vast amounts of data and embrace data-driven decision-making, the field of predictive analytics is poised to play a transformative role in improving patient outcomes, enhancing operational efficiency, and driving innovation in healthcare delivery. By leveraging the power of advanced data analytics, healthcare systems can transition towards a more proactive, personalized, and value-based approach to care, ultimately delivering better health outcomes for individuals and populations.

REFERENCES

[1]. Alghamdi, A., Alsubait, T., Baz, A., & Alhakami, H. (2021). Healthcare analytics: A comprehensive review. Engineering, Technology & Applied Science Research, 11(1), 6650-6655. http://www.etasr.com/index.php/ETASR/article/view/3965
[2]. Alharthi, H. (2018). Healthcare predictive analytics: An overview with a focus on Saudi Arabia. Journal of infection and public health, 11(6), 749-756. https://www.sciencedirect.com/science/article/pii/S1876034118300303
[3]. Amarasingham, R., Patzer, R. E., Huesch, M., Nguyen, N. Q., & Xie, B. (2014). Implementing electronic health care predictive analytics: considerations and challenges. Health affairs, 33(7), 1148-1154. https://www.healthaffairs.org/doi/abs/10.1377/hlthaff.2014.0352
[4]. Andjelkovic Cirkovic, B. R., Cvetkovic, A. M., Ninkovic, S. M., & Filipovic, N. D. (1029). Prediction Models for Estimation of Survival Rate and Relapse for Breast Cancer Patients.
[5]. Badawy, M., Ramadan, N., & Hefny, H. A. (2021). Healthcare predictive analytics using machine learning and deep learning techniques: a survey. Journal of Electrical Systems and Information Technology, 10(1), 40. https://link.springer.com/article/10.1186/s43067-023-00108-y
[6]. Bakare, M. A., & Argiddi, R. V. (2016). Prediction of Disease using Big Data Analysis. International Journal of Innovative Research in Computer and Communication Engineering, 4(4).
[7]. Bartley, A. (2021). Predictive analytics in healthcare. White paper on Healthcare Predictive Analytics© Intel Corporation. https://www.intel.sg/content/dam/www/public/us/en/documents/white-papers/gmc-analytics-healthcare whitepaper.pdf
[8]. Batko, K., & Ślęzak, A. (2022). The use of Big Data Analytics in healthcare. Journal of big Data, 9(1), 3. https://link.springer.com/article/10.1186/s40537-021-00553-4
[9]. Beam, A. L., & Kohane, I. S. (2018). Big data and machine learning in health care. JAMA, 319(13), 1317-1318. https://jamanetwork.com/journals/jama/article-abstract/2675024
[10]. Belle, A., Thiagarajan, R., Soroushmehr, S. R., Navidi, F., Beard, D. A., & Najarian, K. (2015). Big data analytics in healthcare. BioMed research international, 2015(1), 370194. https://onlinelibrary.wiley.com/doi/abs/10.1155/2015/370194
[11]. Boukenze, B., Mousannif, H., & Haqiq, A. (2016). Predictive Analytics in Healthcare System using Data Mining Techniques. In Proceedings of CCNET-2016 (pp. 01-09).
[12]. Char, D. S., Shah, N. H., & Magnus, D. (2018). Implementing machine learning in health care–addressing ethical challenges. New England Journal of Medicine, 378(11), 981-983. https://www.nejm.org/doi/abs/10.1056/NEJMp1714229
[13]. Chen, J. H., Alagappan, M., Goldstein, M. K., Asch, S. M., & Altman, R. B. (2017). Decaying relevance of clinical data towards future decisions in data-driven inpatient clinical order sets. International Journal of Medical Informatics, 102, 71-79. https://www.sciencedirect.com/science/article/pii/S138650561730059X
[14]. Chinchmalatpure, M. A., & Dhore, M. P. (2021). Review of Big Data Challenges in Healthcare Application. IOSR Journal of Computer Engineering, 06-09.
[15]. Christodoulou, E., Ma, J., Collins, G. S., Steyerberg, E. W., Verbakel, J. Y., & Van Calster, B. (2019). A systematic review shows no performance benefit of machine learning over logistic regression for clinical prediction models. Journal of Clinical Epidemiology, 110, 12-22. https://www.sciencedirect.com/science/article/pii/S0895435618310813
[16]. Cohen, I. G., Amarasingham, R., Shah, A., Xie, B., & Lo, B. (2014). The legal and ethical concerns that arise from using complex predictive analytics in health care. Health affairs, 33(7), 1139-1147. https://www.healthaffairs.org/doi/abs/10.1377/hlthaff.2014.0048
[17]. Culotta, A. (2010). Towards Detecting Influenza Epidemics by Analyzing Twitter Messages. In Proceedings of the 1st Workshop in Social Media Analytics (SOMA '10).
[18]. David, G., Smith-McLallen, A., & Ukert, B. (2019). The effect of predictive analytics-driven interventions on healthcare utilization. Journal of health economics, 64, 68-79. https://www.sciencedirect.com/science/article/pii/S0167629618305095
[19]. Frost & Sullivan. (n.d.). Drowning in Big Data? Reducing Information Technology Complexities and Costs for Healthcare Organizations. Retrieved from http://www.emc.com/collateral/analystreports/frost-sullivan-reducing-informationtechnologycomplexities-ar.pdf
[20]. Galetsi, P., & Katsaliaki, K. (2020). A review of the literature on big data analytics in healthcare. Journal of the Operational Research Society, 71(10), 1511-1529. https://www.tandfonline.com/doi/abs/10.1080/01605682.2019.1630328
[21]. Ghassemi, M., Naumann, T., Schulam, P., Beam, A. L., & Ranganath, R. (2018). Opportunities in machine learning for healthcare. arXiv preprint arXiv:1806.00388. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7233077/
[22]. Goldstein, B. A., Navar, A. M., & Pencina, M. J. (2016). Risk prediction with electronic health records: The importance of model validation and clinical context. JAMA Cardiology, 1(9), 976. https://jamanetwork.com/journals/jamacardiology/article-abstract/2566165
[23]. Harris, S. L., May, J. H., & Vargas, L. G. (2016). Predictive analytics model for healthcare planning and scheduling. European Journal of Operational Research, 253(1), 121-131. https://www.sciencedirect.com/science/article/pii/S0377221716300376
[24]. Higdon, R., Stewart, E., Roach, J. C., Dombrowski, C., Stanberry, L., Clifton, H., ... & Kolker, E. (2013). Predictive analytics in healthcare: medications as a predictor of medical complexity. Big Data, 1(4), 237-244. https://www.liebertpub.com/doi/abs/10.1089/big.2013.0024
[25]. Hripcsak, G., Ryan, P. B., Duke, J. D., Shah, N. H., Park, R. W., Huser, V., Suchard, M. A., Schuemie, M. J., DeFalco, F. J., Perotte, A., Banda, J. M., Reich, C. G., Schilling, L. M., Matheny, M. E., Meeker, D., Pratt, N., & Madigan, D. (2016). Characterizing treatment pathways at scale using the OHDSI network. Proceedings of the National Academy of Sciences, 113(27), 7329-7336.
[26]. Iqbal, S. A., Wallach, J. D., Khoury, M. J., Schully, S. D., & Ioannidis, J. P. A. (2016). Reproducible research practices and transparency across the biomedical literature. PLoS Biology, 14(1), e1002333. https://journals.plos.org/plosbiology/article?id=10.1371/journal.pbio.1002333
[27]. Jen, C. H., Wang, C. C., Jiang, B. C., Chu, Y. H., & Chen, M. S. (2012). Application of classification techniques on development and early warning system for chronic illnesses. Expert Systems with Applications, 39(10), 8852-8858.
[28]. Kleinrouweler, C. E., Cheong-See, F. M., Collins, G. S., Kwee, A., Thangaratinam, S., Khan, K. S., Mol, B. W. J., Pajkrt, E., Moons, K. G. M., & Schuit, E. (2016). Prognostic models in obstetrics: Available, but far from applicable. American Journal of Obstetrics and Gynecology, 214(1), 79-90. https://www.sciencedirect.com/science/article/pii/S0002937815005967
[29]. Leung, C. K., Fung, D. L., Mushtaq, S. B., [40]. Ohno-Machado, L. (2018). Data science and
Leduchowski, O. T., Bouchard, R. L., Jin, H., ... & Zhang, C. Y. (2020, August). Data science for healthcare predictive analytics. In Proceedings of the 24th Symposium on International Database Engineering & Applications (pp. 1-10). https://dl.acm.org/doi/abs/10.1145/3410566.3410598
[30]. Levy-Fix, G., Gorman, S. L., Sepulveda, J. L., & Elhadad, N. (2018). When to re-order laboratory tests? Learning laboratory test shelf-life. Journal of Biomedical Informatics, 85, 21-29. https://www.sciencedirect.com/science/article/pii/S153204641830145X
[31]. Linda, A. (2016, October). Seven ways predictive analytics can improve healthcare. Elsevier.
[32]. Liu, V. X., Bates, D. W., Wiens, J., & Shah, N. H. (2019). The number needed to benefit: estimating the value of predictive analytics in healthcare. Journal of the American Medical Informatics Association, 26(12), 1655-1659. https://academic.oup.com/jamia/article-abstract/26/12/1655/5516459
[33]. Lynch, C. J., & Liston, C. (2018). New machine-learning technologies for computer-aided diagnosis. Nature Medicine, 24(9), 1304-1305. https://www.nature.com/articles/s41591-018-0178-4
[34]. Malik, M. M., Abdallah, S., & Ala’raj, M. (2018). Data mining and predictive analytics applications for the delivery of healthcare services: a systematic literature review. Annals of Operations Research, 270(1), 287-312. https://link.springer.com/article/10.1007/s10479-016-2393-z
[35]. Mounika, M., Suganya, S. D., Vijayashanthi, B., & Anand, S. K. (2015). Predictive Analysis of Diabetic Treatment Using Classification Algorithm. International Journal of Computer Science and Information Technologies, 6(3).
[36]. Muniasamy, A., Tabassam, S., Hussain, M. A., Sultana, H., Muniasamy, V., & Bhatnagar, R. (2020). Deep learning for predictive analytics in healthcare. In The International Conference on Advanced Machine Learning Technologies and Applications (AMLTA2019) 4 (pp. 32-42). Springer International Publishing. https://publications.dlpress.org/index.php/jcha/article/view/16
[37]. Nambiar, R., Sethi, A., Bhardwaj, R., & Vargheese, R. (2013). A Look at Challenges and Opportunities of Big Data Analytics in Healthcare. In IEEE International Conference on Big Data.
[38]. Nevin, L., & PLoS Medicine Editors. (2018). Advancing the beneficial use of machine learning in health care and medicine: Toward a community understanding. PLoS Medicine, 15(11), e1002708.
[39]. Nithya, B., & Ilango, V. (2017, June). Predictive analytics in health care using machine learning tools and techniques. In 2017 International Conference on Intelligent Computing and Control Systems (ICICCS) (pp. 492-499). IEEE. https://ieeexplore.ieee.org/abstract/document/8250771/
artificial intelligence to improve clinical practice and research. Journal of the American Medical Informatics Association, 25(10), 1273. https://academic.oup.com/jamia/article-abstract/25/10/1273/5128467
[41]. Park, S. H. (2018). Regulatory approval versus clinical validation of artificial intelligence diagnostic tools. Radiology, 288(3), 910-911. https://pubs.rsna.org/doi/abs/10.1148/radiol.2018181310
[42]. Prabavathi, G. T., & Shanthipriya, M. (2017). Review of Healthcare Informatics. International Journal of Innovative Research in Computer and Communication Engineering, 5(7).
[43]. Priyanka, K., & Kulennavar, N. (2014). A Survey on Big Data Analytics in Health Care. International Journal of Computer Science and Information Technologies, 5(4), 5865-5868.
[44]. Reddy, A. R., & Kumar, P. S. (2016, February). Predictive big data analytics in healthcare. In 2016 Second International Conference on Computational Intelligence & Communication Technology (CICT) (pp. 623-626). IEEE. https://ieeexplore.ieee.org/abstract/document/7546683/
[45]. Riley, R. D., Ensor, J., Snell, K. I., Debray, T. P., Altman, D. G., Moons, K. G., & Collins, G. S. (2016). External validation of clinical prediction models using big datasets from e-health records or IPD meta-analysis: Opportunities and challenges. BMJ, 353, i3140. https://www.bmj.com/content/353/bmj.i3140.abstract
[46]. Subrahmanya, S. V. G., Shetty, D. K., Patil, V., Hameed, B. Z., Paul, R., Smriti, K., ... & Somani, B. K. (2022). The role of data science in healthcare advancements: applications, benefits, and future prospects. Irish Journal of Medical Science (1971-), 191(4), 1473-1483.
[47]. Shanthipriya, M., & Prabavathi, G. T. (2018). Healthcare predictive analytics. Int. Res. J. Eng. Technol. (IRJET), 5(2), 1459-1462. https://www.academia.edu/download/56008144/IRJET-V5I2319.pdf
[48]. Suresh, S. (2016). Big data and predictive analytics. Pediatr Clin N Am, 63, 357-366. https://123library.org/pdf/book/237735/quality-of-care-and-information-technology-an-issue-of-pediatric-clinics-of-north-america-e-book.pdf#page=156
[49]. Tran, N. D. T., Leung, C. K., Madill, E. W., & Binh, P. T. (2022, June). A deep learning based predictive model for healthcare analytics. In 2022 IEEE 10th International Conference on Healthcare Informatics (ICHI) (pp. 547-549). IEEE. https://ieeexplore.ieee.org/abstract/document/9874514/
[50]. Van Calster, B., Nieboer, D., Vergouwe, Y., De Cock, B., Pencina, M. J., & Steyerberg, E. W. (2016). A calibration hierarchy for risk models was defined: From utopia to empirical data. Journal of Clinical Epidemiology, 74, 167-176. https://www.sciencedirect.com/science/article/pii/S0895435615005818
[51]. van der Ploeg, T., Nieboer, D., & Steyerberg, E. W. (2016). Modern modeling techniques had limited external validity in predicting mortality from traumatic brain injury. Journal of Clinical Epidemiology, 78, 83-89. https://www.sciencedirect.com/science/article/pii/S0895435616300142
[52]. Vayena, E., Blasimme, A., & Cohen, I. G. (2018). Machine learning in medicine: Addressing ethical challenges. PLoS Medicine, 15(11), e1002689. https://journals.plos.org/plosmedicine/article?id=10.1371/journal.pmed.1002689
[53]. Wang, L., & Alexander, C. A. (2015). Big Data in Medical Applications and Health Care. American Medical Journal, 6(1).
[54]. Zafar, F., Raza, S., Khalid, M. U., & Tahir, M. A. (2019, March). Predictive analytics in healthcare for diabetes prediction. In Proceedings of the 2019 9th International Conference on Biomedical Engineering and Technology (pp. 253-259). https://dl.acm.org/doi/abs/10.1145/3326172.3326213
