AI-Enhanced Sepsis Detection
Corresponding Author:
Smitha N.
Research Scholar, Department of Computer Science and Engineering
University Visvesvaraya College of Engineering, Bangalore University
Bengaluru, India
Email: [email protected]
1. INTRODUCTION
Sepsis is a medical condition that requires immediate attention to avoid a life-threatening situation.
It arises from the body's response to an infection and, if not treated at the initial stage, can
lead to inflammation, organ dysfunction, and finally death. Sepsis can be triggered by various infectious
agents such as bacteria, viruses, fungi, or parasites, and it can affect individuals of any age or health status.
The immune system, designed to defend against infections, can go into overdrive during sepsis, releasing an
excessive amount of inflammatory chemicals. Early identification and intervention are required to
keep sepsis from progressing to severe stages such as severe sepsis and septic shock [1]. Without early
treatment, it can lead to organ failure and death, thereby increasing the mortality rate. Timely intervention
includes antibiotic treatment to stabilize the initial symptoms. Sepsis remains a significant global health
challenge. Machine learning plays a crucial role in detecting sepsis by leveraging computational algorithms
and statistical models to analyze large datasets and identify patterns associated with the condition [2].
Predictive analytics can assist clinicians in identifying high-risk patients and implementing
preventive measures. Machine learning algorithms can be integrated into clinical decision support systems
to assist healthcare professionals in diagnosing sepsis. These systems analyze real-time patient data and
provide recommendations or alerts based on established sepsis detection criteria. The current state of the art in
sepsis detection and management emphasizes the importance of early identification and prompt intervention
to prevent the progression of sepsis to severe stages, including septic shock. In this work, a feed-forward
neural network model is developed and its performance is evaluated using accuracy, precision,
recall, and F1-score. To address class imbalance in the dataset, techniques such as random oversampling, synthetic
minority over-sampling technique (SMOTE), and random sampling are employed, significantly improving the
performance of the feed-forward neural network in detecting sepsis. Hyperparameter optimization methods, such
as RandomizedSearchCV and GridSearchCV, are used to fine-tune these models, leading to improved accuracy
and performance metrics. This work underscores the importance of robust hyperparameter tuning methods in
optimizing models for imbalanced datasets for detecting sepsis.
2. LITERATURE SURVEY
According to Lydia et al. [3], after evaluating multiple machine learning models, XGBoost stood
out with a remarkable accuracy of 0.98, surpassing other models. Opting for efficiency, the study suggests
using the lazy predict library and applies random search for hyper-parameter tuning of the XGBoost classifier.
Introducing a lazy classifier to address classification challenges, the research achieves an impressive score of
0.99 using random search for optimal parameters, emphasizing the effectiveness of this approach with the
XGBoost algorithm [3].
Kijpaisalratana et al. [4] performed hyperparameter tuning using grid search on a logistic regression
model. A 5-fold cross-validation technique was used to minimize the model's bias. The authors compared
the performance of the proposed model with existing traditional scoring systems. They used the Shapley additive
explanations (SHAP) approach to explain the role of each feature in the prediction of sepsis. A limitation of
this work is that it relies on International Classification of Diseases (ICD) codes, which can be misclassified
due to different types of error [4].
Ghias et al. [5] used machine learning algorithms to predict sepsis in patients
admitted to the intensive care unit (ICU). The features are the vital signs of patients aged 18
and above. The authors used the MissForest algorithm for data imputation. No hyperparameter
tuning was performed, as the authors relied on random forest, which can handle the missing values. The XGBoost
model performed best, with 0.98 accuracy [5].
Shashikumar et al. [6] developed a recurrent neural network (RNN) model called deep artificial
intelligence sepsis expert (DeepAISE) for early sepsis prediction. This model achieved an area under the
curve (AUC) ranging from 0.87 to 0.90. Schamoni et al. [7] developed an ensemble of neural networks
(NN) to analyse medical data, thereby reducing error. The approach is demonstrated on the early prediction of
sepsis using real intensive care unit data. The authors created a pool of individual learners. Although they
did not evaluate attribute inference attacks, by preventing membership inference they were able to prevent
attribute inference as well [7].
A novel hybrid metaheuristic algorithm combining human mental search (HMS) and particle swarm optimization
(PSO) was used to optimize the deep neural network weights for early sepsis diagnosis. Here, HMS and PSO
reduce the risk of the search getting trapped in local minima. The system achieves a mean square error (MSE)
of 0.22 [8]. The proposed system addresses the need to optimize the feed-forward neural network using
hyperparameter tuning, thereby showcasing the effect of an imbalanced dataset on model performance and the need
for hyperparameter tuning to improve the overall performance of the model in sepsis detection [9].
3. METHOD
A feed-forward neural network is a type of artificial neural network in which information flows in one
direction: forward, from the input layer to the output layer. It consists of multiple layers of interconnected
nodes or neurons, organized into three main types of layers:
− Input layer: The input layer takes the input data and has a neuron corresponding to each input, as shown in (1) and (2).
Enhancing sepsis detection using feed-forward neural networks with hyperparameter ... (Smitha N.)
1254 ❒ ISSN: 2252-8938
− Hidden layers: These are the middle layers of the network. Each neuron in a hidden layer is connected
to every neuron in the previous layer.
− Output layer: The output layer consists of a number of neurons that depends on the
nature of the problem being solved.
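The layer structure above can be illustrated with a minimal forward pass. This is only a sketch under stated assumptions: the layer sizes, the random weights, and the choice of a logistic activation in every layer are hypothetical and are not taken from the model trained in this work.

```python
import numpy as np

def sigmoid(z):
    """Logistic activation, applied element-wise."""
    return 1.0 / (1.0 + np.exp(-z))

def forward(x, layers):
    """Propagate x through a list of (weights, bias) pairs, one pair per layer."""
    activation = x
    for weights, bias in layers:
        # Every neuron receives all activations of the previous layer.
        activation = sigmoid(activation @ weights + bias)
    return activation

rng = np.random.default_rng(0)
n_features, n_hidden, n_outputs = 4, 3, 1  # hypothetical layer sizes
layers = [
    (rng.normal(size=(n_features, n_hidden)), np.zeros(n_hidden)),  # input -> hidden
    (rng.normal(size=(n_hidden, n_outputs)), np.zeros(n_outputs)),  # hidden -> output
]
batch = rng.normal(size=(5, n_features))  # five made-up input rows
probabilities = forward(batch, layers)
print(probabilities.shape)
```

With a single output neuron and a logistic activation, each row of `probabilities` can be read as a score between 0 and 1 for the positive class.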
A feed-forward neural network, at its core, is akin to a single-layer perceptron. It processes a sequence
of inputs by multiplying them with weights. These weighted inputs are then summed to produce a total. If this
sum surpasses a predetermined threshold (often set at zero), the output is typically 1; otherwise, it is -1. The
single-layer perceptron is widely used for classification and can also integrate machine learning features [10].
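The thresholding rule described above can be written in a few lines. This is a minimal sketch: the weights and inputs are made-up values chosen only to show both outcomes, and the function name is hypothetical.

```python
def perceptron_output(inputs, weights, threshold=0.0):
    """Return 1 if the weighted sum of the inputs exceeds the threshold, otherwise -1."""
    total = sum(x * w for x, w in zip(inputs, weights))
    return 1 if total > threshold else -1

weights = [0.5, -1.0, 0.25]  # hypothetical weights
first = perceptron_output([1.0, 0.2, 2.0], weights)   # 0.5 - 0.2 + 0.5 = 0.8
second = perceptron_output([0.1, 1.0, 0.0], weights)  # 0.05 - 1.0 + 0.0 = -0.95
print(first, second)  # → 1 -1
```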
3.1.1. GridSearchCV
GridSearchCV is a hyperparameter tuning technique provided by scikit-learn that exhaustively
searches through a predefined set of hyperparameter values for a given machine learning model. It performs
a cross-validated grid search over a parameter grid, evaluating the model’s performance for each combination
of hyperparameters. Hyperparameter tuning finds the best possible settings to optimize the model’s predictive
performance. In hyperparameter tuning, you define a grid of hyperparameter values that you want to explore.
This grid is essentially a set of points in the hyperparameter space [15], [6]. The grid search algorithm aims
to find the optimal hyperparameters (h∗) by evaluating the model's performance across all combinations of
hyperparameter values in the grid (H).
GridSearchCV systematically explores this grid by training and evaluating the model at each point.
It covers all possible combinations of hyperparameters specified in the grid. Cross-validation is an essential
part of the process. The model is trained on some folds and tested on others, iteratively. GridSearchCV can be
computationally expensive, especially when the hyperparameter space is large [16].
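A cross-validated grid search of this kind might be set up as follows. This is a sketch, not the configuration used in this work: the synthetic imbalanced dataset, the hidden layer size, the candidate values in `param_grid`, and the use of a `make_pipeline` pipeline (which produces the `mlpclassifier__` parameter prefix) are all assumptions made for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the clinical data: roughly 90% negative, 10% positive.
X, y = make_classification(n_samples=400, n_features=10, weights=[0.9, 0.1], random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, stratify=y, random_state=0)

# Scaling plus a small feed-forward network (hypothetical architecture).
pipeline = make_pipeline(
    StandardScaler(),
    MLPClassifier(hidden_layer_sizes=(16,), max_iter=300, random_state=0),
)

# Every combination in this grid is evaluated: 3 alphas x 2 activations = 6 candidates.
param_grid = {
    "mlpclassifier__alpha": [0.0001, 0.001, 0.01],
    "mlpclassifier__activation": ["logistic", "relu"],
}
search = GridSearchCV(pipeline, param_grid, cv=5, scoring="accuracy")
search.fit(X_train, y_train)

n_candidates = len(search.cv_results_["params"])
print("candidates:", n_candidates, "model fits:", n_candidates * 5)
print("best parameters:", search.best_params_)
print("best cross-validated accuracy:", search.best_score_)
print("test accuracy of best model:", search.score(X_test, y_test))
```

The cost grows multiplicatively: here 6 candidates with 5 folds already require 30 model fits, plus one final refit on the full training split.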
3.1.2. RandomizedSearchCV
Like GridSearchCV, the purpose of RandomizedSearchCV is hyperparameter tuning. Instead of
exhaustively searching through all possible combinations of hyperparameters, RandomizedSearchCV randomly
samples a specified number of combinations from the hyperparameter space. As with GridSearchCV, you
define a hyperparameter space to explore. However, instead of specifying exact values, you provide distributions
from which values are randomly drawn [17].
The mathematical representation can be simplified as follows: Let H be the set of hyperparameters,
M be the machine learning model, and P be the performance metric. The randomized search algorithm aims
to find the optimal hyperparameters (h∗) by randomly sampling hyperparameter values from the distributions
defined in the search space, as shown in (3).

$$h^{*} = \arg\max_{h \in H} P(M(h)) \tag{3}$$

where M(h) represents the machine learning model with hyperparameters h, P(·) is the performance metric
used for evaluation, and arg max over h ∈ H denotes the hyperparameter values that maximize the model's
performance.
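The sampling-based search can be sketched with scikit-learn's `RandomizedSearchCV`. As before, this is an illustration under assumptions: the synthetic data, the network size, the log-uniform range for `alpha`, the list of activations, and the budget of 8 sampled candidates are hypothetical choices, not the settings used in this work.

```python
from scipy.stats import loguniform
from sklearn.datasets import make_classification
from sklearn.model_selection import RandomizedSearchCV, train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the clinical data: roughly 90% negative, 10% positive.
X, y = make_classification(n_samples=400, n_features=10, weights=[0.9, 0.1], random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, stratify=y, random_state=0)

pipeline = make_pipeline(
    StandardScaler(),
    MLPClassifier(hidden_layer_sizes=(16,), max_iter=300, random_state=0),
)

# alpha is drawn from a continuous distribution; activation is drawn from a list.
param_distributions = {
    "mlpclassifier__alpha": loguniform(1e-5, 1e-1),
    "mlpclassifier__activation": ["logistic", "tanh", "relu"],
}
search = RandomizedSearchCV(
    pipeline,
    param_distributions,
    n_iter=8,          # number of sampled candidates h, fixed in advance
    cv=3,
    scoring="accuracy",
    random_state=0,
)
search.fit(X_train, y_train)

print("sampled candidates:", len(search.cv_results_["params"]))
print("best parameters:", search.best_params_)
print("best cross-validated accuracy:", search.best_score_)
print("test accuracy of best model:", search.score(X_test, y_test))
```

Unlike the grid search, the number of model fits is controlled directly by `n_iter` and does not grow with the size of the search space.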
Hyperparameter tuning with RandomizedSearchCV is often more suitable when computational
resources are limited or when you want to quickly get a sense of the hyperparameter space [18]. In summary,
RandomizedSearchCV efficiently explores the hyperparameter space through random sampling, making
it particularly useful when the search space is extensive and an exhaustive search is not feasible. It offers
a good compromise between exploration and computational cost, providing an effective alternative to
GridSearchCV. Figure 1 shows the proposed system architecture, in which the feed-forward neural network is
optimized using SMOTE, random sampling, GridSearchCV, and RandomizedSearchCV.
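The resampling step of such an architecture can be illustrated with plain random oversampling. The sketch below is not the implementation used in this work; the function name and the toy data are hypothetical. It duplicates randomly chosen minority-class rows until every class has as many rows as the largest one.

```python
import numpy as np

def random_oversample(X, y, seed=0):
    """Duplicate randomly chosen rows of smaller classes until all classes are the same size."""
    rng = np.random.default_rng(seed)
    classes, counts = np.unique(y, return_counts=True)
    target = counts.max()
    selected = []
    for label, count in zip(classes, counts):
        label_idx = np.flatnonzero(y == label)
        # Draw the missing rows with replacement (zero draws for the largest class).
        extra = rng.choice(label_idx, size=target - count, replace=True)
        selected.append(np.concatenate([label_idx, extra]))
    indices = np.concatenate(selected)
    return X[indices], y[indices]

X = np.arange(20, dtype=float).reshape(10, 2)  # ten made-up samples
y = np.array([0] * 8 + [1] * 2)                # 8 negatives, 2 positives
X_res, y_res = random_oversample(X, y)
print(np.bincount(y_res))  # → [8 8]
```

Resampling is normally applied to the training split only, so that the test set keeps the original class distribution.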
Table 1. Best parameters and accuracy for feed-forward neural network with hyperparameter tuning
Best parameters               {'mlpclassifier__alpha': 0.01, 'mlpclassifier__activation': 'logistic'}
Best accuracy                 0.561
Test accuracy of best model   0.575
vided features. The test accuracy of the best model is 51.5%, which reflects how well the tuned neural network
generalizes to new data for sepsis detection. With the specified hyperparameters, the feed-forward neural
network therefore reaches approximately 51.5% accuracy on unseen data.
5. CONCLUSION
The feed-forward neural network model achieved 83% accuracy on the test set with random oversampling,
with a substantial improvement in the precision, recall, and F1 score of the positive class compared to the
original imbalanced dataset. Hyperparameter tuning with RandomizedSearchCV identified an alpha value of 0.01
and the logistic activation function as the best parameters, with 57.5% accuracy, whereas GridSearchCV
yielded a lower accuracy of 51.5%. These findings underscore the role of robust hyperparameter tuning
methods in optimizing the feed-forward neural network for imbalanced datasets in detecting sepsis.
REFERENCES
[1] M. Moor, B. Rieck, M. Horn, C. R. Jutzeler, and K. Borgwardt, “Early prediction of sepsis in the ICU using machine learning: a
systematic review,” Frontiers in Medicine, vol. 8, 2021, doi: 10.3389/fmed.2021.607952.
[2] S. M. Lauritsen et al., “Early detection of sepsis utilizing deep learning on electronic health record event sequences,” Artificial
Intelligence in Medicine, vol. 104, 2020, doi: 10.1016/j.artmed.2020.101820.
[3] E. L. Lydia, S. A. Althubiti, C. S. S. Anupama, and K. V. Kumar, “Prediction of sepsis disease using random search to optimize
hyperparameter tuning based on lazy predict model,” Smart Innovation, Systems and Technologies, vol. 371, pp. 351–367, 2023,
doi: 10.1007/978-981-99-6706-3_31.
[4] N. Kijpaisalratana, D. Sanglertsinlapachai, S. Techaratsami, K. Musikatavorn, and J. Saoraya, “Machine learning algorithms for
early sepsis detection in the emergency department: a retrospective study,” International Journal of Medical Informatics, vol. 160,
2022, doi: 10.1016/j.ijmedinf.2022.104689.
[5] N. Ghias et al., “Using machine learning algorithms to predict sepsis and its stages in ICU patients,” medRxiv, pp. 1–9, 2022, doi:
10.1101/2022.03.15.22271655.
[6] S. P. Shashikumar, C. Josef, A. Sharma, and S. Nemati, “DeepAISE–an end-to-end development and deployment of a recurrent
neural survival model for early prediction of sepsis,” arXiv-Computer Science, pp. 1–45, 2019, doi: 10.48550/arXiv.1908.04759.
[7] S. Schamoni, M. Hagmann, and S. Riezler, “Ensembling neural networks for improved prediction and privacy in early diagnosis of
sepsis,” Proceedings of Machine Learning Research, vol. 182, pp. 123–145, 2022.
[8] U. Kaya, A. Yılmaz, and S. Aşar, “Sepsis prediction by using a hybrid metaheuristic algorithm: a novel approach for optimizing
deep neural networks,” Diagnostics, vol. 13, no. 12, 2023, doi: 10.3390/diagnostics13122023.
[9] R. R. Kanna, T. M. Priya, V. A. Immanuel, V. B. Kirubanand, T. Senthilnathan, and V. Rohini, “An novel cutting edge ANN
machine learning algorithm for sepsis early prediction and diagnosis,” AIP Conference Proceedings, vol. 2909, no. 1, 2023, doi:
10.1063/5.0181885.
[10] G. Bebis and M. Georgiopoulos, “Feed-forward neural networks,” IEEE Potentials, vol. 13, no. 4, pp. 27–31, 2002, doi:
10.1109/45.329294.
[11] E. Junter, “Predicting sepsis in the intensive care unit using machine learning,” M.Sc. Thesis, School of Engineering Science, LUT
University, Lappeenranta, Finland, 2020.
[12] Z. Wang and B. Yao, “Multi-branching temporal convolutional network for sepsis prediction,” IEEE Journal of Biomedical and
Health Informatics, vol. 26, no. 2, pp. 876–887, 2022, doi: 10.1109/JBHI.2021.3092835.
[13] R. Bardenet, M. Brendel, B. Kégl, and M. Sebag, “Collaborative hyperparameter tuning,” in 30th International Conference on
Machine Learning, ICML 2013, 2013, pp. 858–866.
[14] T. Van Steenkiste et al., “Accurate prediction of blood culture outcome in the intensive care unit using long short-term memory
neural networks,” Artificial Intelligence in Medicine, vol. 97, pp. 38–43, 2019, doi: 10.1016/j.artmed.2018.10.008.
[15] B. H. Shekar and G. Dagnew, “Grid search-based hyperparameter tuning and classification of microarray cancer data,” in 2019
Second International Conference on Advanced Computational and Communication Paradigms (ICACCP), IEEE, 2019, pp. 1–8,
doi: 10.1109/ICACCP.2019.8882943.
[16] S. Kumar, H. Tiwari, and M. Jaiswal, “Diabetes prediction using optimisation techniques with machine learning algorithms,”
International Journal of Electronic Healthcare, vol. 13, no. 2, pp. 158–168, 2023, doi: 10.1504/IJEH.2023.130515.
[17] T. N. Pham, L. Van Tran, and S. V. T. Dao, “Early disease classification of mango leaves using feed-forward neural network and
hybrid metaheuristic feature selection,” IEEE Access, vol. 8, pp. 189960–189973, 2020, doi: 10.1109/ACCESS.2020.3031914.
[18] D. L. Shanthi and N. Chethan, “Genetic algorithm based hyper-parameter tuning to improve the performance of machine learning
models,” SN Computer Science, vol. 4, no. 2, Dec. 2022, doi: 10.1007/s42979-022-01537-8.
[19] W. G. Cochran, Sampling techniques. Hoboken, USA: John Wiley & Sons, 1977, doi: 10.1002/9781118445112.stat01272.
[20] A. Fernández, S. García, F. Herrera, and N. V. Chawla, “SMOTE for learning from imbalanced data: progress and challenges,
marking the 15-year anniversary,” Journal of Artificial Intelligence Research, vol. 61, pp. 863–905, 2018, doi: 10.1613/jair.1.11192.
[21] J. Y. Park et al., “Predicting sepsis mortality in a population based national database: machine learning approach,” Journal of Medical
Internet Research, vol. 24, no. 4, 2022.
[22] M. A. Reyna, C. S. Josef, R. Jeter, S. P. Shashikumar, M. B. Westover, S. Nemati, G. D. Clifford, and A. Sharma, “Early prediction
of sepsis from clinical data: the PhysioNet/Computing in Cardiology Challenge 2019,” Critical Care Medicine, vol. 48, no. 2, pp. 210–217,
2020, doi: 10.1097/CCM.0000000000004145.
[23] P. Nejedly, F. Plesinger, I. Viscor, J. Halamek, and P. Jurak, “Prediction of sepsis using LSTM with hyperparameter optimization with
a genetic algorithm,” in 2019 Computing in Cardiology (CinC), Dec. 2019, pp. 1–4, doi: 10.22489/CinC.2019.022.
[24] R. Hassanzadeh, M. Farhadian, and H. Rafieemehr, “Hospital mortality prediction in traumatic injuries patients: comparing different
SMOTE-based machine learning algorithms,” BMC Medical Research Methodology, vol. 23, no. 1, 2023,
doi: 10.1186/s12874-023-01920-w.
[25] I. Persson, A. Östling, M. Arlbrandt, J. Söderberg, and D. Becedas, “A machine learning sepsis prediction algorithm for intended
intensive care unit use (NAVOY Sepsis): proof-of-concept study,” JMIR Formative Research, vol. 5, no. 9, 2021, doi: 10.2196/28000.
[26] E. K. Sari and D. P. Sari, “The effect of mobile learning application in enhancing students’ knowledge of caring sepsis patients,”
International Journal of Public Health Science, vol. 12, no. 2, pp. 787–793, 2023, doi: 10.11591/ijphs.v12i2.22206.
BIOGRAPHIES OF AUTHORS