Heart Failure Patients Classification Using ML Algos
ABSTRACT Heart failure is a critical condition with a high mortality rate, making accurate survival
prediction essential for timely interventions. This study proposes an optimized machine learning approach
using Gradient Boosting Machine (GBM) and Adaptive Inertia Weight Particle Swarm Optimization (AIW-PSO)
to predict heart failure survival. The dataset, sourced from Kaggle, includes clinical features such as
age, ejection fraction, and serum creatinine levels for 299 heart failure patients. To address the imbalance
in survival outcomes, Synthetic Minority Over-sampling Technique (SMOTE) was employed to balance the
dataset, followed by SelectKBest and Chi-square feature selection methods to retain the most significant
predictors. The optimized hyperparameters for the GBM model were identified using the AIW-PSO
algorithm, which effectively balanced exploration and exploitation by adaptively adjusting inertia weights.
Model selection was further refined using information criteria, including Akaike Information Criterion (AIC)
and Bayesian Information Criterion (BIC), ensuring that the best-performing model was chosen based on
both predictive accuracy and model complexity. The optimized GBM model achieved a test accuracy of
94%, demonstrating superior performance compared to traditional machine learning models. The study
underscores the importance of hyperparameter tuning through metaheuristic algorithms and highlights the
potential of AIW-PSO in enhancing model performance for clinical prediction tasks. These findings have
significant implications for clinical decision-making, offering a reliable and interpretable tool for predicting
patient outcomes in heart failure management.
INDEX TERMS Heart failure survival prediction, machine learning algorithms, hyperparameter
optimization, class imbalance handling, AIW-PSO optimization.
Organization (WHO) reports that CVDs are the leading cause of death globally, accounting for approximately 17.9 million deaths each year, which represents nearly 32% of all deaths worldwide.
Heart failure (HF) remains a formidable challenge in cardiovascular medicine, affecting an estimated 64.3 million people worldwide and accounting for a substantial portion of global healthcare expenditure [1]. This chronic, progressive condition is characterized by the heart’s inability to pump blood efficiently, leading to a cascade of symptoms that significantly impair quality of life and elevate mortality risk. It is frequently caused by underlying conditions like diabetes, hypertension, or other heart diseases [2]. Despite advancements in therapeutic interventions, the 5-year mortality rate for HF patients hovers around 50%, underscoring the urgent need for improved prognostic tools [3].
In recent years, the intersection of machine learning (ML) and clinical medicine has opened new avenues for enhancing patient care through data-driven decision support systems [4], [34]. The application of ML algorithms to predict HF outcomes has shown promise, yet challenges persist in model accuracy and generalizability [5]. A critical bottleneck in leveraging ML for clinical prediction tasks lies in the optimization of model hyperparameters, a process that can significantly influence predictive performance [6].
The advent of metaheuristic optimization algorithms, such as Particle Swarm Optimization (PSO), has provided a powerful framework for navigating the complex hyperparameter landscape of ML models [7]. However, the classical PSO algorithm often struggles with the delicate balance between exploration and exploitation, potentially leading to suboptimal solutions [8]. To address this limitation, we propose the application of Adaptive Inertia Weight Particle Swarm Optimization (AIW-PSO), an enhanced variant that dynamically adjusts its search behavior, to optimize the hyperparameters of a Gradient Boosting Machine (GBM) model for HF survival prediction.
Our study leverages a curated dataset of 299 HF patients, encompassing a rich tapestry of clinical features including left ventricular ejection fraction, serum creatinine levels, and comorbidities [9]. To mitigate the inherent class imbalance typical of survival data, we employ the Synthetic Minority Over-sampling Technique (SMOTE), ensuring a balanced representation of outcomes [10]. Feature selection is performed using the SelectKBest algorithm in conjunction with Chi-square statistical tests, distilling the most salient predictors from the feature space [11].
The novelty of our approach lies in the synergistic integration of AIW-PSO with GBM, a powerful ensemble learning method known for its robustness in handling complex, non-linear relationships [12]. By harnessing the adaptive capabilities of AIW-PSO, we aim to fine-tune the GBM model’s hyperparameters, potentially unlocking superior predictive performance compared to traditional, manually-tuned ML models.
This research is particularly timely given the growing emphasis on personalized medicine and the need for accurate risk stratification in HF management [13]. As healthcare systems worldwide grapple with resource allocation and treatment prioritization, especially in the wake of global health crises, refined prognostic tools could play a pivotal role in optimizing patient care pathways [14].

A. RESEARCH QUESTIONS
Our investigation seeks to address the following key questions:
1) How effective are optimized machine learning algorithms in predicting the survival of heart failure patients?
2) Which clinical and demographic factors most significantly impact the survival predictions for heart failure patients?
3) Does optimization improve the performance of machine learning models in heart failure survival prediction?
By exploring these questions, we aim to contribute to the evolving landscape of ML-assisted clinical decision support, potentially offering clinicians a more refined tool for prognostication in heart failure management.

B. KEY CONTRIBUTIONS
The key contributions of this work are summarized as follows:
1) Introduction of AIW-PSO and GBM Combination: This study introduces the novel combination of AIW-PSO and GBM for optimizing heart failure prediction models, demonstrating its potential to improve model performance by effectively tuning hyperparameters.
2) Performance Across Balanced and Imbalanced Datasets: The model’s performance is explored across both balanced and imbalanced datasets, showcasing its practical utility in real-world applications, particularly in dealing with class imbalance issues that are common in medical datasets.
3) Identification of Key Predictors: Through feature selection techniques, the study identifies critical predictors, such as ejection fraction and serum creatinine, which provide valuable insights for clinical decision-making and contribute to the model’s high accuracy.
Heart failure (HF) is a major public health concern that affects millions of people worldwide, resulting in high morbidity and mortality. The accurate prediction of survival in patients with heart failure is critical for guiding clinical practice. This study demonstrates a novel application of adaptive inertia weight particle swarm optimization (AIW-PSO) in conjunction with a Gradient Boosting Machine (GBM) for model performance improvement in prediction tasks. The proposed methodology shows the potential for improving
the accuracy of models and providing meaningful clinical applications.

II. LITERATURE REVIEW
Heart failure (HF) is a critical public health issue, affecting approximately 26 million people worldwide and contributing significantly to morbidity and mortality rates [2]. The complex nature of HF, characterized by various etiologies and comorbidities, necessitates advanced predictive modeling to enhance clinical outcomes and guide treatment strategies [15]. The evolution of predictive modeling in healthcare, particularly in the realm of heart failure, has been marked by a shift from traditional statistical methods to more sophisticated machine learning approaches [16]. Accurate survival prediction in HF patients can enable clinicians to stratify risk effectively, optimize treatment plans, and allocate resources more efficiently, potentially improving patient outcomes and quality of life [17].
Additionally, reinforcement learning (RL) has emerged as a powerful approach for optimizing decision-making in healthcare. Recent studies have demonstrated the potential of RL in learning treatment policies for critical conditions such as sepsis. However, deploying RL in healthcare settings requires robust model selection frameworks to address challenges like overfitting and computational complexity, particularly in offline settings. A recent work [36] investigates a practical model selection pipeline for offline RL using off-policy evaluation (OPE) methods. The study highlights Fitted Q Evaluation (FQE) as the most effective method for validation ranking, albeit at high computational costs, and proposes a two-stage approach to balance ranking accuracy and efficiency.
The relevance of such RL-based approaches aligns with this journal’s focus on advancing machine learning applications for impactful real-world problems. While our work focuses on supervised learning using Gradient Boosting Machines (GBM) and AIW-PSO to classify heart failure patients, incorporating RL approaches like OPE could enable the development of adaptive and dynamic treatment policies in future studies. These advancements would not only expand the scope of predictive modeling but also enhance the practical utility of ML frameworks in clinical settings.
The application of machine learning (ML) algorithms in healthcare, especially in predicting patient survival and disease outcomes, has gained substantial momentum in recent years. Various studies have demonstrated the potential of these techniques to outperform traditional statistical methods in predicting heart failure outcomes. Awan et al. [18] utilized a combination of ML algorithms, including Support Vector Machines (SVM) and Random Forests (RF), to predict mortality in heart failure patients. Their study demonstrated superior predictive performance compared to conventional risk scores, highlighting the potential of ML in capturing complex interactions between clinical variables. In another significant study, Panahiazar et al. [19] developed a deep learning model for predicting 30-day readmission in heart failure patients. Their approach, which incorporated temporal trends in lab results and vital signs, achieved higher accuracy than traditional logistic regression models, underscoring the ability of advanced ML techniques to leverage complex, time-dependent data. Mortazavi et al. [20] compared various machine learning algorithms for predicting 1-year mortality in heart failure patients, finding that ensemble methods like Random Forests and Gradient Boosting Machines outperformed both traditional regression models and individual ML algorithms. This study emphasized the importance of algorithm selection and feature engineering in developing effective predictive models.
As the complexity of machine learning models increases, the need for effective optimization strategies [37], [38] becomes more pronounced. Hyperparameter optimization plays a crucial role in enhancing the performance of these models, and various techniques have been explored in the context of heart failure prediction. Bagheri et al. [21] employed a Genetic Algorithm (GA) for feature selection and hyperparameter tuning in their heart failure prediction model. Their approach, which combined GA with a Support Vector Machine classifier, demonstrated how optimization techniques can significantly improve model performance by identifying the most relevant predictors and optimal model configurations. In a novel approach, Beunza et al. [22] utilized Bayesian optimization for hyperparameter tuning in their ensemble model for predicting in-hospital mortality in heart failure patients. Their study showcased how advanced optimization techniques can enhance model performance while reducing computational overhead compared to traditional grid search methods. Alaa et al. [23] introduced an automated machine learning framework for clinical prediction tasks, including heart failure outcomes. Their approach, which used multi-armed bandits for model selection and Bayesian optimization for hyperparameter tuning, demonstrated state-of-the-art performance across various clinical prediction tasks.
Class imbalance presents a significant challenge in predictive modeling for heart failure, as mortality and adverse events are often relatively rare occurrences in clinical datasets. This imbalance can lead to biased models that perform poorly on the minority class, which is typically the class of greatest clinical interest. Choi et al. [24] addressed class imbalance in their study of heart failure readmission prediction by employing the Synthetic Minority Over-sampling Technique (SMOTE). Their approach, which combined SMOTE with ensemble learning methods, demonstrated improved predictive performance for the minority class without sacrificing overall model accuracy. Zahid et al. [25] explored various resampling techniques, including SMOTE and Adaptive Synthetic (ADASYN) sampling, in conjunction with different machine learning algorithms for heart failure prediction. Their comprehensive comparison provided insights into the effectiveness of different approaches to handling class imbalance in clinical prediction tasks. In a
VOLUME 13, 2025 30557
M. Ahmed et al.: Predicting the Classification of HF Patients Using Optimized ML Algorithms
different approach, Guidi et al. [26] utilized cost-sensitive learning to address class imbalance in their study on predicting heart failure decompensation. By assigning higher misclassification costs to the minority class, they were able to improve the model’s performance on this clinically important group without explicit resampling of the dataset.

III. METHOD
This study focuses on developing a robust machine learning model to predict heart failure survival, integrating various advanced techniques, including data preprocessing, feature selection, handling imbalanced data, and hyperparameter optimization using an improved, nature-inspired variant of the Particle Swarm Optimization algorithm. Metaheuristic optimization algorithms, such as AIW-PSO (Adaptive Inertia Weight Particle Swarm Optimization), have proven to be highly effective [29] in improving the performance of machine learning models. These optimizers work by fine-tuning hyperparameters, which are often difficult to adjust manually, to enhance model accuracy, reduce overfitting, and improve overall model generalization. AIW-PSO and similar algorithms explore the search space intelligently to find good solutions while reducing the risk of getting trapped in local optima. These steps are crucial for improving the accuracy and reliability of predictions. Figure 1 explains the overall architecture of the study.

A. DATA COLLECTION AND TRANSFORMATION
The dataset used in this study is the Heart Failure Clinical Records obtained from the Kaggle platform, containing 299 records with 13 clinical features. The primary objective is to predict the target variable DEATH_EVENT, which signifies whether a patient survived heart failure during the follow-up period. Features such as age, ejection fraction, serum creatinine, and high blood pressure are considered crucial indicators for survival prediction.
Data preprocessing included handling missing values, standardizing features using Standard Scaler, and splitting the dataset into training and testing sets with an 80:20 ratio. Standardization was applied to bring all features to a uniform scale, ensuring that algorithms sensitive to feature scaling, such as Support Vector Machines (SVM), perform optimally. Given the imbalanced nature of the dataset, with the majority of patients surviving, the Synthetic Minority Over-sampling Technique (SMOTE) was implemented. SMOTE works by generating synthetic samples for the minority class, in this case, death events, through interpolation between nearest neighbors in the feature space. This process ensures a balanced training dataset, improving the model’s ability to generalize to unseen data while preventing bias toward the majority class.
The goal of our research was to ascertain how long heart failure patients may survive. We have used a number of well-known machine learning techniques that have been enhanced by AIW-PSO (Adaptive Inertia Weight Particle Swarm Optimization) to achieve this. The outcomes of our computations utilizing the 13 and 7 best attributes are displayed. Furthermore, our findings suggest that the SMOTE method is the most beneficial remedy for the class imbalance. Table 1 shows a summarized overview of the dataset.

B. FEATURE SELECTION
Feature selection was performed using SelectKBest and Chi-square tests to reduce dimensionality and retain only the most relevant characteristics for the prediction of heart failure survival. SelectKBest selects the best k features based on the ANOVA F value, while the Chi-square test evaluates the relationship between each feature and the target variable. The Chi-Square Test for Independence is used to determine whether there is a significant association between two categorical variables in a dataset. The Chi-square statistic is computed using the following formula:
$$\chi^2 = \sum_{i=1}^{n} \frac{(O_i - E_i)^2}{E_i}, \tag{1}$$
where $O_i$ and $E_i$ are the observed and expected frequencies, respectively.
The results revealed that features such as age, serum creatinine, and ejection fraction were among the most predictive of patient survival. Reducing the feature space through this process not only improves the computational efficiency of the models, but also enhances interpretability by focusing on the most significant predictors.

C. CORRELATION ANALYSIS OF FEATURES
To ensure an accurate correlation analysis between variables, we distinguished between continuous-continuous variable pairs and categorical-continuous variable pairs.
1) For continuous-continuous pairs, we employed the Pearson correlation coefficient, which measures the linear association between two continuous variables and is computed as:
$$r = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2 \sum_{i=1}^{n} (y_i - \bar{y})^2}}, \tag{2}$$
where $x_i$ and $y_i$ are the values of the two continuous variables, and $\bar{x}$ and $\bar{y}$ are their respective means.
2) For categorical-continuous pairs (e.g., binary variables like anaemia, diabetes, sex, and smoking), we applied the Point-Biserial correlation coefficient, which is appropriate for the association between a binary categorical variable and a continuous variable. The Point-Biserial correlation is computed as:
$$r_{pb} = \frac{\bar{X}_1 - \bar{X}_0}{s} \sqrt{\frac{n_1 n_0}{n^2}}, \tag{3}$$
where:
• $\bar{X}_1$ and $\bar{X}_0$ are the means of the continuous variable for the two binary classes (1 and 0),
• $s$ is the standard deviation of the continuous variable,
• $n_1$ and $n_0$ are the number of observations in each class,
• $n$ is the total number of observations.
The correlation analysis results were visualized in a heatmap (Figure 2), where Pearson correlation was used for continuous pairs and Point-Biserial correlation for categorical-continuous pairs. This method ensures that the relationships between features are accurately measured based on their respective types, avoiding incorrect statistical assumptions.
The heatmap highlights the key relationships between the characteristics and the target variable (DEATH_EVENT). For example, age, serum creatinine, and ejection fraction show strong associations with the target variable. This approach not only enhances the rigor of the analysis, but also ensures methodological correctness.

D. CLASS ENCODING
Since some features in the data set were categorical, such as sex, One-Hot Encoding [29] was applied to convert these categorical variables into numerical binary columns. For instance, the sex variable was split into two binary columns: sex_male and sex_female. This encoding ensures that machine learning models can correctly interpret categorical features without introducing ordinal biases. One-Hot Encoding is particularly effective when dealing with non-ordinal features and ensures compatibility with algorithms such as Random Forest and Support Vector Machines (SVM).

E. HANDLING IMBALANCED DATA
The dataset presented a significant class imbalance, with the majority of patients surviving and a smaller proportion experiencing death events. To address this, SMOTE (Synthetic Minority Over-sampling Technique) [30], [31], [32] was employed. SMOTE generates synthetic samples for the minority class by selecting examples from the minority class and interpolating between their nearest neighbors in the feature space. This technique effectively balances the dataset, enabling the model to learn more effectively from the minority class. After applying SMOTE, the class distribution was more balanced, leading to improved generalization and prediction performance on the minority class (death events).
Each feature set (13 or 7 attributes) is therefore evaluated under two conditions: one with an imbalanced representation of the two target classes and one with a balanced distribution of the two classes, as Figure 3 explains.

F. HYPERPARAMETER OPTIMIZATION
To improve the performance of the machine learning models, hyperparameter optimization was conducted using the Mealpy [27] framework. Instead of using traditional
optimization methods, this study employed Adaptive Inertia Weight Particle Swarm Optimization (AIW-PSO) [28], a variant of the standard Particle Swarm Optimization (PSO) algorithm. AIW-PSO is known for its enhanced exploration-exploitation balance through adaptive adjustments of the inertia weight, which improves convergence

FIGURE 2. Feature Correlation Heatmap showing Pearson correlation for continuous-continuous pairs and Point-Biserial correlation for binary categorical-continuous pairs.
FIGURE 3. Proposed types of predictive model designs.

G. AIW-PSO ALGORITHM
The Adaptive Inertia Weight Particle Swarm Optimization (AIW-PSO) algorithm enhances the classical PSO by adaptively adjusting the inertia weight w over iterations, effectively balancing exploration and exploitation. The update equation for each particle’s velocity and position is given by:
Velocity update:
$$v_i^{t+1} = w \cdot v_i^{t} + c_1 \cdot r_1 \cdot (p_{best,i} - x_i^{t}) + c_2 \cdot r_2 \cdot (g_{best} - x_i^{t}) \tag{4}$$
where:
• $v_i^{t+1}$ is the velocity of particle i at time step t + 1,
• $w$ is the inertia weight, which is adaptively adjusted using AIW strategy,
• $c_1$ and $c_2$ are acceleration coefficients,
• $r_1$ and $r_2$ are random values in the range [0,1],
• $p_{best,i}$ is the personal best position of particle i,
• $g_{best}$ is the global best position,
• $x_i^{t}$ is the position of particle i at time step t.
AIW-PSO was defined as minimizing classification error (1 - accuracy). The AIW-PSO algorithm explores potential hyperparameter solutions by simulating particles moving through a solution space, updating their positions based on both individual and global best solutions. By adaptively adjusting the inertia weight, AIW-PSO was able to strike an optimal balance between exploration and exploitation, leading to superior hyperparameter tuning results compared to the original PSO. The hyperparameter optimization process led to substantial performance improvements across all models, with the gradient boosting model achieving the highest accuracy (93.84%). AIW-PSO’s adaptive behavior allowed it to converge more quickly and find better solutions, resulting in more efficient model training and higher predictive accuracy.

IV. MODEL EVALUATION AND RESULTS ANALYSIS
Following hyperparameter optimization, the models were evaluated using key performance metrics, including accuracy, precision, recall, F1-score, and the area under the ROC curve (AUC).
In machine learning classification tasks, selecting appropriate evaluation metrics is critical to accurately assess model performance. While Information Criteria (AIC, BIC, HQIC, AICc) are well-suited for statistical model selection, their utility is more aligned with parametric models such as regression or likelihood-based models. In contrast, this study focuses on evaluating and optimizing the predictive performance of machine learning classifiers, which require metrics that are tailored for classification outcomes.

A. RATIONALE FOR ACCURACY, PRECISION, RECALL, AND F1-SCORE
To evaluate the proposed Gradient Boosting Machine (GBM) model optimized using Adaptive Inertia-Weight Particle Swarm Optimization (AIW-PSO), we employed the following widely accepted metrics:
• Accuracy: Represents the proportion of correctly predicted instances among the total instances. Accuracy is a fundamental measure for assessing overall model performance.
• Precision: Evaluates the ratio of true positive predictions to the total positive predictions, reflecting the ability of the model to avoid false positives.
• Recall (Sensitivity): Measures the ratio of true positives to all actual positives, indicating the model’s ability to identify all positive cases.
• F1-Score: The harmonic mean of Precision and Recall, providing a balanced metric when both false positives and false negatives are of concern.
These metrics are particularly relevant for imbalanced datasets, such as the heart failure survival dataset used in this study. Accuracy alone may fail to provide an accurate assessment in imbalanced scenarios; therefore, we incorporated Precision, Recall, and F1-Score to provide a more comprehensive evaluation of the model’s performance.

B. INTEGRATION OF INFORMATION CRITERIA (IC) VALUES FOR CLASSIFICATION
To enhance the model’s performance and ensure minimized Information Criterion (IC) values, several strategies can be employed. First, feature selection and regularization techniques can be utilized to reduce dataset dimensionality and limit model complexity, which directly impacts IC calculations by lowering the number of parameters (k). Regularization methods such as L1 (Lasso) or L2 (Ridge) can be introduced during training to mitigate overfitting, further contributing to robustness.
Second, hyperparameter tuning should be performed with greater granularity, particularly refining ranges and step sizes for parameters like n_estimators, learning_rate, and others. This includes tuning additional regularization parameters like min_samples_leaf, subsample, and max_features. Incorporating log-loss as an optimization target can also prove beneficial, as it is inherently linked to IC computations; minimizing log-loss directly reduces IC values.
Advanced optimizers, such as hybrid approaches combining AIW-PSO with local search methods, can further refine parameter optimization for precision. Lastly, dataset balancing techniques, such as SMOTE, should be revisited to avoid introducing synthetic noise that could inflate IC values, ensuring a cleaner dataset for robust model evaluation.
The calculation of IC values used in this study is shown in Algorithm 1, and the explanation is given in Table 3. By including IC values in the analysis, researchers can assess model robustness more comprehensively, as these metrics combine model fit and complexity, offering insights into the trade-offs between predictive performance and overfitting.

Algorithm 1 Calculation of Information Criteria (IC) Values
Input: n: Number of data points, k: Number of model parameters, LL: Log-likelihood value.
Output: IC values (AIC, BIC, HQIC, AICc).
begin
    Calculate AIC: AIC = 2k − 2LL;
    Calculate BIC: BIC = k log(n) − 2LL;
    Calculate HQIC: HQIC = 2k log(log(n)) − 2LL;
    Calculate AICc: AICc = AIC + 2k(k + 1) / (n − k − 1);
    return AIC, BIC, HQIC, AICc;

C. PERFORMANCE EVALUATION
In this study, the performance of various machine learning algorithms for heart failure survival prediction was evaluated, categorizing the results into two types based on the number of features used. Type I refers to the analysis using 13 features under both imbalanced and balanced conditions, while Type II refers to the results obtained using 7 features under the same conditions. The models assessed include Random
Forest (RF), Support Vector Classifier (SVC), AdaBoost, Gradient Boosting Machine (GBM), and Stochastic Gradient Descent (SGD).

1) TYPE I: 13 FEATURES - BALANCED AND IMBALANCED
For Type I, the results in Figure 6 indicated that using all 13 features with SMOTE significantly improved the model performance metrics. The Gradient Boosting Machine (GBM) achieved an accuracy of 87.79% and an F1 score of 0.89. The Random Forest (RF) model performed competitively with an accuracy of 84.34% and an F1 score of 0.85. In contrast, SVC exhibited lower performance, attaining an accuracy of 81.12%. Notably, when comparing imbalanced and balanced data, the implementation of SMOTE markedly enhanced the models’ performance, showcasing its effectiveness in addressing the class imbalance issue prevalent in the dataset.
Table 4 presents a comprehensive performance comparison of optimized machine learning algorithms, incorporating AIW-PSO, on both imbalanced and balanced datasets using
various metrics, including Accuracy, F1 Score, Precision, Recall, and Information Criteria (IC) values (AIC, BIC, HQIC, and AICc). Across the imbalanced dataset, the GBM-AIW_PSO algorithm achieved the highest accuracy (83.58%) and F1 Score (0.85), coupled with the lowest AIC (9.33), BIC (21.83), and HQIC (13.73), demonstrating superior robustness and predictive performance. Conversely, the SGD-AIW_PSO algorithm showed significantly lower accuracy (42.67%) and higher IC values, indicating poorer performance. On the balanced dataset, GBM-AIW_PSO continued to excel with the highest accuracy (87.79%), F1 Score (0.89), and the lowest AIC (9.23), highlighting its robustness in handling balanced datasets. Other algorithms, such as Adaboost-AIW_PSO and RF-AIW_PSO, also performed competitively but were outperformed by GBM-AIW_PSO in most metrics. The balanced dataset notably improved overall performance metrics across all algorithms, emphasizing the importance of dataset balancing in predictive modeling. These results underscore the robustness and reliability of GBM-AIW_PSO as a state-of-the-art approach for both imbalanced and balanced datasets.

2) TYPE II: 7 FEATURES - BALANCED AND IMBALANCED
For Type II, shown in Figure 7 and in tabular format in Table 5, utilizing a reduced set of 7 features yielded even more promising results. The GBM model achieved the highest accuracy of 93.84% with a robust F1 score of 0.95, indicating its superior predictive capability when fewer features were employed. Similarly, the Random Forest (RF) model demonstrated a significant improvement, reaching an accuracy of 88.11%. The results from the reduced feature set further highlight the importance of feature selection in improving model performance, as evidenced by the substantial gains in accuracy across all algorithms tested.
The implementation of AIW-PSO for hyperparameter optimization played a crucial role in enhancing the performance of the machine learning models in this study. By optimizing parameters such as n_estimators and learning_rate for GBM, and C and gamma for SVC, the models could achieve higher accuracy levels and better generalization. For instance, using AIW-PSO optimization led to the GBM model’s impressive accuracy of 93.84% in Figure 9, illustrating how effective hyperparameter tuning is in optimizing machine learning algorithms. The results underscore the significance of employing advanced optimization techniques like AIW-PSO to boost predictive performance and ensure more reliable survival analysis in clinical applications. All comparisons are shown as a bar chart in Figure 8.

V. DISCUSSION
This study effectively addresses the critical research questions regarding the survival prediction of heart failure
patients through the application of optimized machine learning algorithms. The results demonstrate that optimized models, particularly Gradient Boosting Machine (GBM) and Random Forest (RF), significantly enhance predictive accuracy, achieving up to 93.84% (by GBM), which is superior to the accuracy reported in [33], when utilizing advanced optimization techniques such as AIW-PSO. The analysis identifies key clinical and demographic factors that influence survival predictions, including age, serum creatinine levels, and ejection fraction, thereby providing valuable insights into the attributes that clinicians should monitor closely. Furthermore, the optimization process, which fine-tunes hyperparameters of various algorithms like SVC, AdaBoost, and Stochastic Gradient Descent (SGD), has been shown to improve model performance consistently. The comprehensive analyses, including the handling of class imbalance using SMOTE and feature selection methods, underscore the importance of employing sophisticated machine learning techniques to achieve reliable survival predictions, ultimately contributing to better clinical decision-making and patient outcomes in heart failure management.

TABLE 4. Type I performance comparison of optimized ML algorithms on imbalanced and balanced datasets with IC values.
TABLE 5. Type II performance comparison of optimized ML algorithms on imbalanced and balanced datasets.

Recent advancements in machine learning, such as the development of FLUID-GPT, have demonstrated the potential of transformer-based architectures for predictive modeling in complex systems. FLUID-GPT, a hybrid model combining Generative Pre-Trained Transformer 2 (GPT-2) with a Convolutional Neural Network (CNN), has been applied to predict particle trajectories and surface erosion patterns in industrial-scale systems [35]. By leveraging information from initial conditions such as particle size, inlet speed, and pressure, FLUID-GPT achieves a 54% reduction in mean squared error and 70% faster training times compared to traditional BiLSTM approaches. These advancements illustrate the growing role of generative transformer-based models in replacing computationally expensive simulations, particularly for dynamic and time-series predictions. While this study focuses on GBM with AIW-PSO for the classification of heart failure patients, the integration of transformer-based architectures like FLUID-GPT could be a valuable direction for future exploration, especially for clinical datasets requiring sophisticated time-series modeling. Future work may consider adapting such architectures to clinical datasets, enabling the incorporation of contextual and sequential data for more accurate predictions in healthcare applications.

VI. CONCLUSION AND FUTURE WORK
This study evaluates the performance of various machine learning models for the prediction of heart failure survival, incorporating feature selection and class balancing using SMOTE. Among the models tested, namely Random Forest (RF), Support Vector Classifier (SVC), AdaBoost, Gradient Boosting Machine (GBM), and Stochastic Gradient Descent (SGD), GBM achieved the highest accuracy of 93.84% with SMOTE, demonstrating its robustness in handling imbalanced datasets. Feature selection significantly impacted model performance, improving some models while reducing the effectiveness of others. The study highlights the critical role of hyperparameter tuning through AIW-PSO, which enhanced model performance.

Additionally, information criteria such as Akaike Information Criterion (AIC) and Bayesian Information Criterion (BIC) were incorporated to refine model selection, ensuring an optimal balance between predictive accuracy and complexity. GBM exhibited the lowest AIC and BIC values, reinforcing its superiority. The findings emphasize that integrating advanced optimization techniques and addressing class imbalance can significantly enhance prediction outcomes. The inclusion of information criteria provides a rigorous model evaluation framework, contributing valuable insights into heart failure survival analysis and improving clinical decision-making.

Future research could focus on three key areas to improve predictive modeling for heart failure survival analysis. Firstly, expanding the dataset to include additional demographic and clinical variables, along with longitudinal data, could provide a more comprehensive understanding of the factors that influence survival, potentially improving the accuracy and interpretability of the model. Secondly, exploring advanced ensemble methods or hybrid models that combine the strengths of various algorithms, such as stacking or blending multiple machine learning techniques, could yield even better predictive performance. Lastly, investigating alternative optimization algorithms beyond AIW-PSO, such as Genetic Algorithms or Differential Evolution, may uncover novel hyperparameter configurations that enhance model
robustness and adaptability to different datasets, ensuring that the predictions remain accurate in diverse clinical settings.

REFERENCES
[1] T. A. McDonagh et al., “2021 ESC guidelines for the diagnosis and treatment of acute and chronic heart failure,” Eur. Heart J., vol. 42, no. 36, pp. 3599–3726, Sep. 2021, doi: 10.1093/eurheartj/ehab368.
[2] G. Savarese and L. H. Lund, “Global public health burden of heart failure,” Cardiac Failure Rev., vol. 3, no. 1, pp. 7–11, 2017, doi: 10.15420/cfr.2016:25:2.
[3] S. S. Virani et al., “Heart disease and stroke statistics—2021 update: A report from the American heart association,” Circulation, vol. 143, no. 8, pp. e254–e743, Feb. 2021, doi: 10.1161/cir.0000000000000950.
[4] A. Rajkomar, J. Dean, and I. Kohane, “Machine learning in medicine,” New England J. Med., vol. 380, no. 14, pp. 1347–1358, Apr. 2019, doi: 10.1056/nejmra1814259.
[5] S. Angraal, B. J. Mortazavi, A. Gupta, R. Khera, T. Ahmad, N. R. Desai, D. L. Jacoby, F. A. Masoudi, J. A. Spertus, and H. M. Krumholz, “Machine learning prediction of mortality and hospitalization in heart failure with preserved ejection fraction,” JACC, Heart Failure, vol. 8, no. 1, pp. 12–21, Jan. 2020, doi: 10.1016/j.jchf.2019.06.013.
[6] M. Feurer and F. Hutter, “Hyperparameter optimization,” in Automated Machine Learning, F. Hutter, L. Kotthoff, and J. Vanschoren, Eds., Cham, Switzerland: Springer, 2019, pp. 3–33, doi: 10.1007/978-3-030-05318-5_1.
[7] D. Wang, D. Tan, and L. Liu, “Particle swarm optimization algorithm: An overview,” Soft Comput., vol. 22, no. 2, pp. 387–408, Jan. 2018, doi: 10.1007/s00500-016-2474-6.
[8] Y. Xue, J. Jiang, B. Zhao, and T. Ma, “A self-adaptive artificial bee colony algorithm based on global best for global optimization,” Soft Comput., vol. 22, no. 9, pp. 2935–2952, May 2018, doi: 10.1007/s00500-017-2547-1.
[9] T. Ahmad, A. Munir, S. H. Bhatti, M. Aftab, and M. A. Raza, “Survival analysis of heart failure patients: A case study,” PLoS ONE, vol. 12, no. 7, Jul. 2017, Art. no. e0181001, doi: 10.1371/journal.pone.0181001.
[10] A. Fernandez, S. Garcia, F. Herrera, and N. V. Chawla, “SMOTE for learning from imbalanced data: Progress and challenges, marking the 15-year anniversary,” J. Artif. Intell. Res., vol. 61, pp. 863–905, Apr. 2018, doi: 10.1613/jair.1.11192.
[11] J. Li, K. Cheng, S. Wang, F. Morstatter, R. P. Trevino, J. Tang, and H. Liu, “Feature selection: A data perspective,” ACM Comput. Surv., vol. 50, no. 6, pp. 1–45, Nov. 2017, doi: 10.1145/3136625.
[12] G. Ke, Q. Meng, T. Finley, T. Wang, W. Chen, W. Ma, Q. Ye, and T. Liu, “LightGBM: A highly efficient gradient boosting decision tree,” in Proc. Adv. Neural Inf. Process. Syst., vol. 30, 2017, pp. 3146–3154.
[13] E. Bjornson, J. Born, and A. Mardinoglu, “Personalized cardiovascular disease prediction and treatment - a review of existing strategies and novel systems medicine tools,” Frontiers Physiol., vol. 11, p. 705, Jan. 2020, doi: 10.3389/fphys.2020.00705.
[14] M. Cikes and S. D. Solomon, “Beyond ejection fraction: An integrative approach for assessment of cardiac structure and function in heart failure,” Eur. Heart J., vol. 42, no. 6, pp. 657–670, 2021, doi: 10.1093/eurheartj/ehaa731.
[15] P. Ponikowski et al., “2016 ESC guidelines for the diagnosis and treatment of acute and chronic heart failure,” Eur. Heart J., vol. 37, no. 27, pp. 2129–2200, Jul. 2016, doi: 10.1093/eurheartj/ehw128.
[16] B. A. Goldstein, A. M. Navar, and R. E. Carter, “Moving beyond regression techniques in cardiovascular risk prediction: Applying machine learning to address analytic challenges,” Eur. Heart J., vol. 38, no. 23, pp. 1805–1814, 2017, doi: 10.1093/eurheartj/ehw302.
[17] A. C. Alba, T. Agoritsas, and M. Walsh, “Discrimination and calibration of clinical prediction models: Users’ guides to the medical literature,” JAMA, vol. 310, no. 19, pp. 2501–2502, 2013, doi: 10.1001/jama.2017.12126.
[18] S. E. Awan, F. Sohel, F. M. Sanfilippo, M. Bennamoun, and G. Dwivedi, “Machine learning in heart failure: Ready for prime time,” Current Opinion Cardiol., vol. 33, no. 2, pp. 190–195, 2018, doi: 10.1097/HCO.0000000000000507.
[19] M. Panahiazar, V. Taslimitehrani, N. L. Pereira, and J. Pathak, “Using EHRs for heart failure therapy recommendation using multidimensional patient similarity analytics,” Stud. Health Technol. Informat., vol. 210, p. 369, Jun. 2015, doi: 10.3233/978-1-61499-512-8-369.
[20] B. J. Mortazavi, N. S. Downing, E. M. Bucholz, K. Dharmarajan, A. Manhapra, S.-X. Li, S. N. Negahban, and H. M. Krumholz, “Analysis of machine learning techniques for heart failure readmissions,” Circulat., Cardiovascular Quality Outcomes, vol. 9, no. 6, pp. 629–640, Nov. 2016, doi: 10.1161/CIRCOUTCOMES.116.003039.
[21] A. Bagheri, Q. Gao, and P. Rafi, “Genetic algorithm-based heart failure prediction and diagnosis system,” J. Intell. & Fuzzy Syst., vol. 37, no. 6, pp. 7649–7660, 2019, doi: 10.3233/JIFS-179230.
[22] J.-J. Beunza, E. Puertas, E. García-Ovejero, G. Villalba, E. Condes, G. Koleva, C. Hurtado, and M. F. Landecho, “Comparison of machine learning algorithms for clinical event prediction (risk of coronary heart disease),” J. Biomed. Informat., vol. 97, Sep. 2019, Art. no. 103257, doi: 10.1016/j.jbi.2019.103257.
[23] A. M. Alaa, T. Bolton, E. Di Angelantonio, J. H. F. Rudd, and M. van der Schaar, “Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 U.K. biobank participants,” PLoS ONE, vol. 14, no. 5, May 2019, Art. no. e0213653, doi: 10.1371/journal.pone.0213653.
[24] E. Choi, A. Schuetz, W. F. Stewart, and J. Sun, “Using recurrent neural network models for early detection of heart failure onset,” J. Amer. Med. Inform. Assoc., vol. 24, no. 2, pp. 361–370, 2017, doi: 10.1093/jamia/ocw179.
[25] F. M. Zahid, S. Faisal, S. Ramzan, and I. Hussain, “A novel heart disease prediction method based on machine learning techniques and feature selection,” Healthcare Informat. Res., vol. 25, no. 2, pp. 124–135, 2019, doi: 10.4258/hir.2019.25.2.124.
[26] G. Guidi, M. C. Pettenati, P. Melillo, and E. Iadanza, “A machine learning system to improve heart failure patient assistance,” IEEE J. Biomed. Health Informat., vol. 18, no. 6, pp. 1750–1756, Nov. 2014, doi: 10.1109/JBHI.2014.2341731.
[27] N. Van Thieu and S. Mirjalili, “MEALPY: An open-source library for latest meta-heuristic algorithms in Python,” J. Syst. Archit., vol. 139, Jun. 2023, Art. no. 102871, doi: 10.1016/j.sysarc.2023.102871.
[28] Z. Qin, F. Yu, Z. Shi, and Y. Wang, “Adaptive inertia weight particle swarm optimization,” in Artificial Intelligence and Soft Computing–ICAISC (Lecture Notes in Computer Science), vol. 4029, L. Rutkowski, R. Tadeusiewicz, L. A. Zadeh, and J. M. Zurada, Eds., Berlin, Germany: Springer, 2006, pp. 450–459, doi: 10.1007/11785231_48.
[29] R. Karthiga, G. Usha, N. Raju, and K. Narasimhan, “Transfer learning based breast cancer classification using one-hot encoding technique,” in Proc. Int. Conf. Artif. Intell. Smart Syst. (ICAIS), Mar. 2021, pp. 115–120, doi: 10.1109/ICAIS50930.2021.9395930.
[30] W. Satriaji and R. Kusumaningrum, “Effect of synthetic minority oversampling technique (SMOTE), feature representation, and classification algorithm on imbalanced sentiment analysis,” in Proc. 2nd Int. Conf. Informat. Comput. Sci. (ICICoS), Oct. 2018, pp. 1–5, doi: 10.1109/ICICOS.2018.8621648.
[31] R. Blagus and L. Lusa, “Joint use of over- and under-sampling techniques and cross-validation for the development and assessment of prediction models,” BMC Bioinf., vol. 16, no. 1, pp. 1–10, Dec. 2015, doi: 10.1186/s12859-015-0784-9.
[32] N. V. Chawla, “Data mining for imbalanced datasets: An overview,” in Data Mining and Knowledge Discovery Handbook. Cham, Switzerland: Springer, 2009, pp. 875–886, doi: 10.1007/978-0-387-09823-4_45.
[33] A. Ishaq, S. Sadiq, M. Umer, S. Ullah, S. Mirjalili, V. Rupapara, and M. Nappi, “Improving the prediction of heart failure patients’ survival using SMOTE and effective data mining techniques,” IEEE Access, vol. 9, pp. 39707–39716, 2021, doi: 10.1109/ACCESS.2021.3064084.
[34] S. M. Alhashmi, M. S. I. Polash, A. Haque, F. Rabbe, S. Hossen, N. Faruqui, I. Abaker Targio Hashem, and N. Fathima Abubacker, “Survival analysis of thyroid cancer patients using machine learning algorithms,” IEEE Access, vol. 12, pp. 61978–61990, 2024, doi: 10.1109/ACCESS.2024.3392275.
[35] S. D. Yang, Z. A. Ali, and B. M. Wong, “Fluid-gpt (fast learning to understand and investigate dynamics with a generative pre-trained transformer): Efficient predictions of particle trajectories and erosion,” Ind. & Eng. Chem. Res., vol. 62, no. 37, pp. 15278–15289, 2023.
[36] S. Tang and J. Wiens, “Model selection for offline reinforcement learning: Practical considerations for healthcare settings,” in Proc. Mach. Learn. Healthcare Conf., 2021, pp. 2–35.
[37] M. Ahmed, M. H. Sulaiman, M. M. Hassan, M. A. Rahaman, and M. Abdullah, “Selective opposition based constrained barnacle mating optimization: Theory and applications,” Results Control Optim., vol. 17, Dec. 2024, Art. no. 100487.
[38] M. Ahmed, M. H. Sulaiman, A. J. Mohamad, and M. Rahman, “Gooseneck barnacle optimization algorithm: A novel nature inspired optimization theory and application,” Math. Comput. Simul., vol. 218, pp. 248–265, Apr. 2024.

MARZIA AHMED received the Ph.D. degree from the Faculty of Electrical and Electronics Engineering Technology, Universiti Malaysia Pahang Al-Sultan Abdullah (UMPSA), in 2024. She is an Assistant Professor with the Department of Software Engineering, Daffodil International University, Dhaka, Bangladesh. She is an inventor of the Gooseneck Barnacle Optimization Algorithm (GBOA), which has been applied in various optimization problems. She has authored more than 20 research papers published in international journals and conferences. Her research focuses on swarm intelligence, optimization algorithms, artificial intelligence, and the Internet of Things (IoT).

MD MARUF HASSAN (Member, IEEE) was born in Dhaka, Bangladesh. He received the bachelor’s degree in information systems from Australian Catholic University, Australia, the master’s degree in computer science and engineering from East West University, Bangladesh, and the Ph.D. degree in computer engineering from the Universiti Malaysia Perlis (UniMAP), Malaysia. He has amassed over 17 years of experience in both academic and professional roles, primarily in Australia and Bangladesh. He holds multiple industry certifications, including a Certified Information Systems Auditor (CISA), a Computer Hacking Forensic Investigator (CHFI), and a Certified Ethical Hacker (CEH). Currently, he is an Associate Professor with Southeast University, Bangladesh. Prior to this role, he was an Associate Professor with Daffodil International University, Bangladesh, where he was the Director of the Cyber Security Centre and coordinated cybersecurity programs. He has designed the curriculum and syllabus for several courses related to cybersecurity and has played a key role in developing the M.Sc. degree in cyber security syllabus for multiple universities. He has authored more than 48 research articles in various internationally indexed journals, book chapters, and conference proceedings. He is a member of several professional societies and has been actively involved in conducting workshops, seminars, and training sessions in the field of cybersecurity.