Precision Non-Alcoholic Fatty Liver Disease NAFLD
Article
Precision Non-Alcoholic Fatty Liver Disease (NAFLD) Diagnosis:
Leveraging Ensemble Machine Learning and Gender Insights for
Cost-Effective Detection
Azadeh Alizargar 1, Yang-Lang Chang 1, Mohammad Alkhaleefah 1 and Tan-Hsu Tan 1,2,*
Abstract: Non-Alcoholic Fatty Liver Disease (NAFLD) is characterized by the accumulation of excess
fat in the liver. If left undiagnosed and untreated during the early stages, NAFLD can progress to
more severe conditions such as inflammation, liver fibrosis, cirrhosis, and even liver failure. In this
study, machine learning techniques were employed to predict NAFLD using affordable and accessible
laboratory test data, while the conventional technique hepatic steatosis index (HSI) was calculated for
comparison. Six algorithms (random forest, K-nearest Neighbors, Logistic Regression, Support Vector
Machine, extreme gradient boosting, decision tree), along with an ensemble model, were utilized
for dataset analysis. The objective was to develop a cost-effective tool for enabling early diagnosis,
leading to better management of the condition. The issue of imbalanced data was addressed using
the Synthetic Minority Oversampling Technique Edited Nearest Neighbors (SMOTEENN). Various
evaluation metrics including the F1 score, precision, accuracy, recall, confusion matrix, the mean
absolute error (MAE), receiver operating characteristics (ROC), and area under the curve (AUC) were
employed to assess the suitability of each technique for disease prediction. Experimental results
using the National Health and Nutrition Examination Survey (NHANES) dataset demonstrated that
the ensemble model achieved the highest accuracy (0.99) and AUC (1.00) compared to the machine
learning techniques that we used and HSI. These findings indicate that the ensemble model holds
potential as a beneficial tool for healthcare professionals to predict NAFLD, leveraging accessible and
cost-effective laboratory test data.

Keywords: machine learning techniques; NAFLD; ensemble model; HSI; AUC; SMOTEENN

Citation: Alizargar, A.; Chang, Y.-L.; Alkhaleefah, M.; Tan, T.-H. Precision Non-Alcoholic Fatty Liver
Disease (NAFLD) Diagnosis: Leveraging Ensemble Machine Learning and Gender Insights for
Cost-Effective Detection. Bioengineering 2024, 11, 600.
https://doi.org/10.3390/bioengineering11060600
                               drawbacks that limit its routine use in public health check-up studies [8,9]. The procedure
                               is invasive, posing a risk of bleeding, and may not be suitable for widespread application.
                               To address these concerns, alternative diagnostic techniques have been explored, such
                               as ultrasonography, magnetic resonance imaging, and computed tomography, which can
                               also detect NAFLD. However, these imaging methods come with their own limitations,
                               including being time-consuming, expensive, and often not easily accessible, particularly in
                               remote regions. Abdominal ultrasound is non-invasive, relatively affordable, and widely
                               used for the assessment of NAFLD, but it has limitations when applied to NAFLD
                               diagnosis and characterization. One of the primary challenges is its low sensitivity to
                               mild levels of fat infiltration, which often results in false negatives in cases of early
                               stage NAFLD. The presence of
                               obesity can also hinder the clarity of ultrasound images, making it challenging to accurately
                               assess liver fat content and inflammation. Abdominal ultrasound for NAFLD presents
                               additional complexities that demand specialized expertise. Interpreting ultrasound images
                               for NAFLD requires skilled radiologists or sonographers familiar with the specific nuances
                               of liver appearance associated with fatty infiltration. However, not all regions have access
                               to such specialized medical professionals, exacerbating the diagnostic challenge [10–12].
                               Due to the invasiveness and cost associated with a liver biopsy, it is not routinely per-
                               formed, prompting the need for non-invasive and more practical diagnostic approaches to
                               effectively identify and manage NAFLD [13].
                                     Recent advancements in the field of Non-Alcoholic Fatty Liver Disease (NAFLD) have
                               led to the development of several new indices for its diagnosis and assessment, including
                               the hepatic steatosis index (HSI) and fatty liver index (FLI). These novel indices integrate
                               various non-invasive parameters, such as clinical, biochemical, and imaging data, to offer
                               a more comprehensive and accurate evaluation of NAFLD. Unlike traditional methods
                               that require invasive liver biopsies, these new indices provide a safer and less burdensome
                               approach to diagnosing and monitoring NAFLD. These innovative indices can be easily
                               implemented in routine clinical practice, enabling earlier detection, intervention, and
                               improved management of NAFLD to mitigate its potential complications. As research
                               continues in this area, FLI, HSI, and other emerging indices hold great potential to
                               revolutionize the diagnosis and management of NAFLD, ultimately contributing to better
                               patient outcomes and reducing the disease’s global burden [5,14,15].
                                     HSI is a non-invasive index used to assess the presence of hepatic steatosis, also known
                               as fatty liver disease. It is based on easily obtainable clinical and biochemical parameters,
                               making it a practical tool for diagnosing and monitoring NAFLD. HSI incorporates factors
                               such as body mass index (BMI) and the levels of certain liver enzymes, including aspartate
                               aminotransferase (AST) and alanine aminotransferase (ALT). The index is calculated using
                               a specific formula. The simplicity and effectiveness of HSI make it a valuable screening
                               tool, particularly in settings where more sophisticated imaging or liver biopsy options
                               might be limited. However, like other non-invasive indices, HSI may not provide a defini-
                               tive diagnosis and is often used in conjunction with other diagnostic tools to assess the
                               severity and progression of hepatic steatosis. Ongoing research and validation studies are
                               further refining the utility of HSI in clinical practice, offering a valuable contribution to the
                               management of NAFLD.
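The text above does not spell out the HSI formula. As a minimal sketch, assuming the widely used published definition (8 times the ALT/AST ratio, plus BMI, plus 2 for female sex, plus 2 for diabetes mellitus), the index could be computed as follows. The function name, argument names, and patient values are hypothetical and chosen only for illustration; the article may have used a variant of this formula.

```python
def hepatic_steatosis_index(alt, ast, bmi, is_female=False, has_diabetes=False):
    """Compute HSI under the commonly cited definition (an assumption here):
    HSI = 8 * (ALT / AST) + BMI, plus 2 if female, plus 2 if diabetic."""
    if ast <= 0:
        raise ValueError("AST must be positive")
    score = 8.0 * (alt / ast) + bmi
    if is_female:
        score += 2.0
    if has_diabetes:
        score += 2.0
    return score


# Hypothetical patient values, for illustration only.
example = hepatic_steatosis_index(alt=40.0, ast=20.0, bmi=30.0, is_female=True)
print(example)
```

Because the index needs only two enzyme levels, BMI, sex, and diabetes status, it can be evaluated from routine laboratory data without imaging.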
                                     Indices used for NAFLD diagnosis, including HSI, show varying degrees of
                               accuracy and area under the curve (AUC), which underscores an ongoing need to explore
                               new approaches for detecting NAFLD early and with greater precision. A notable
                               drawback of these indices is their limited ability to achieve this goal consistently.
                               This drawback motivates more sophisticated and precise methodologies, such as
                               machine learning techniques, which can refine NAFLD prediction by encompassing
                               a wider array of clinical and biochemical variables. Machine learning
                               harnesses advanced algorithms to scrutinize intricate patterns and relationships within
                               extensive datasets, thereby fostering the development of more refined and personalized
                               diagnostic models. By integrating diverse clinical, imaging, and biochemical data points,
                               machine learning algorithms can heighten the precision of identifying hepatic steatosis,
                               thereby contributing to the early recognition of individuals at risk. This proactive strategy
                               holds significant promise in enhancing patient outcomes by enabling timely interventions
                               and tailored management strategies.
                                     The field of medicine is abundant with data, yet physicians may inadvertently over-
                               look crucial information necessary for accurate disease diagnosis and treatment. Machine
                               learning techniques, widely employed across diverse areas of health sciences, offer a poten-
                               tial solution to this problem. Leveraging large datasets, these techniques can effectively
                               address the challenges of information extraction and analysis. In fact, machine learning has
                               already been successfully utilized in numerous medical disciplines for disease prediction.
                               Predicting NAFLD plays a crucial role in early detection, resource allocation, and public
                               health planning, ultimately leading to improved management. However, the available data
                               on the use of machine learning models for NAFLD prediction worldwide have been limited.
                                     Machine learning models have been employed for several years to predict
                               NAFLD [16–18]. Weidong Ji et al. [16] employed four machine learning algorithms to
                               predict NAFLD in 304,145 adults, with XGBoost showing the best accuracy (0.880) and
                               AUC (0.951). Xu et al. [17] used 11 techniques on a dataset of 2,522 individuals to achieve
                               an 83% accuracy for NAFLD prediction. Liu et al. [18] found XGBoost to have the highest
                               accuracy (0.795) among seven models for diagnosing NAFLD in 15,315 Chinese subjects.
                                     The aim of our study is to develop an ensemble machine learning model for the
                               prediction of NAFLD and compare its performance with HSI, using laboratory tests that
                               are easily obtainable and cost-effective. By utilizing such data, we seek to enhance the
                               accuracy and performance of NAFLD prediction, which could have important implications
                               for both diagnosing and treating NAFLD.
                               To achieve a more reliable and accurate outcome, the ensemble aggregates individual
                               predictions and makes the final prediction based on a majority vote, thereby leveraging the
                               collective knowledge of the models. The selection of the base models was carefully executed
                               to create an ensemble that outperforms each individual model in isolation, achieving a
                               superior performance.
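The majority-vote aggregation described above can be sketched in a few lines. This is an illustrative hard-voting helper, not the authors' implementation; the model names and predictions below are hypothetical, and the tie-breaking rule is an arbitrary choice made for this sketch.

```python
from collections import Counter


def majority_vote(predictions_per_model):
    """Combine class labels from several base models by hard (majority) voting.

    predictions_per_model: list of equal-length label lists, one per model.
    Ties go to the label that appears first among the models' votes
    (an arbitrary rule chosen for this sketch).
    """
    n_samples = len(predictions_per_model[0])
    if any(len(p) != n_samples for p in predictions_per_model):
        raise ValueError("all models must predict the same number of samples")
    combined = []
    for votes in zip(*predictions_per_model):
        # most_common(1) returns the label with the highest vote count.
        combined.append(Counter(votes).most_common(1)[0][0])
    return combined


# Hypothetical predictions from three base models on four subjects.
rf_pred = [1, 0, 1, 1]
xgb_pred = [1, 0, 0, 1]
svm_pred = [0, 0, 1, 1]
ensemble_pred = majority_vote([rf_pred, xgb_pred, svm_pred])
print(ensemble_pred)
```

With an odd number of base models and binary labels, ties cannot occur, which is one practical reason to prefer an odd ensemble size for hard voting.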
                               3. Experimental Results
                                     The dataset used in this study comprised a total of 2505 individuals, including 1310
                                females and 1195 males. It encompassed 23 features, thoroughly detailed in Table 1, along
                                with additional analysis and information.
                                   std: Standard deviation; GGT: γ-glutamyl transferase; AST: aspartate aminotransferase;
                                   ALT: alanine aminotransferase; ALP: Alkaline phosphatase.
                                        After performing the necessary data preprocessing steps, such as elimination of
                                   duplicates, the treatment of missing values, and converting specific string attributes
                                   into numerical formats, the datasets underwent standardization. Following this, a
                                   comprehensive analysis of correlations was conducted, resulting in the production of
                                   heat maps that visually illustrated the relationships between the dataset's features.
                                   This visualization is presented in Figure 1, providing valuable insights into the
                                   correlations between different variables.
                                   Figure 1. Heat map illustrating the interrelationships among various attributes.
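The standardization and correlation steps can be illustrated with a small sketch. It assumes z-score standardization and the Pearson coefficient, neither of which is named explicitly in the text, and the feature values are made up for the example.

```python
import math


def standardize(values):
    """Z-score a feature: subtract the mean, divide by the population std."""
    n = len(values)
    mean = sum(values) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    if std == 0:
        raise ValueError("feature is constant; cannot standardize")
    return [(v - mean) / std for v in values]


def pearson(x, y):
    """Pearson correlation: the mean product of the two z-scored features."""
    zx, zy = standardize(x), standardize(y)
    return sum(a * b for a, b in zip(zx, zy)) / len(zx)


# Hypothetical feature columns, for illustration only.
weight = [60.0, 72.0, 85.0, 90.0]
waist = [70.0, 82.0, 95.0, 100.0]
r = pearson(weight, waist)
print(round(r, 3))
```

Computing this coefficient for every pair of features gives the matrix that a heat map such as Figure 1 displays.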
                                   Table 2. Rank of feature importance.
                                 Features                                             Rank
                                 ALT                                                  0.0302475
                                 Albumin                                              0.03000044
                                 ALP                                                  0.03011238
                                 AST                                                  0.03231525
                                 Creatinine                                           0.03287693
                                 GGT                                                  0.02722758
                                 Triglycerides                                        0.04470876
                                 LDL                                                  0.03276697
                                 Age                                                  0.05981952
                                 Total Cholesterol                                    0.03232777
                                 Fasting Glucose                                      0.02749087
                                 HDL                                                  0.04328872
                                 SBP                                                  0.0530115
                                 DBP                                                  0.03660032
                                 Hemoglobin                                           0.02463633
                                 Weight                                               0.02756716
                                 Height                                               0.03652343
                                 BMI                                                  0.03185026
                                 Waist Circumference                                  0.04027026
                                 WBC                                                  0.02362244
                                 RBC                                                  0.02518571
                                 Gender                                               0.27754989
                                    The datasets utilized in this study were divided into a testing set, accounting for 20%
                               of the data, and a training set, representing the remaining 80%. Table 3 displays the number
                               of subjects in both the training and testing sets, along with the counts of individuals testing
                               positive and negative for NAFLD.
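An 80/20 split of this kind can be sketched as below. The text does not say whether the split was shuffled or stratified, so the shuffling and the fixed seed are assumptions, and the function name is hypothetical.

```python
import random


def train_test_split_indices(n_samples, test_fraction=0.2, seed=0):
    """Shuffle row indices and reserve a fraction of them for testing."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # seeded for reproducibility
    n_test = round(n_samples * test_fraction)
    return indices[n_test:], indices[:n_test]


# 2505 is the number of individuals reported for the dataset,
# before any resampling is applied.
train_idx, test_idx = train_test_split_indices(2505)
print(len(train_idx), len(test_idx))
```

Keeping the test indices fixed before any resampling ensures that synthetic samples never leak into the evaluation data.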
                                    The objective was to determine the most appropriate technique for predicting Non-
                               Alcoholic Fatty Liver Disease (NAFLD), so six distinct machine learning techniques were
                               developed and implemented. Additionally, an ensemble model was constructed to further
                               enhance prediction accuracy. The performance of these models was then evaluated.
                                     After calculating the HSI using a specific formula, the best cutoff (found to be
                                46.47) was determined by maximizing AUC and accuracy. The primary goal of this evaluation was
                               to assess the effectiveness of HSI in predicting NAFLD. Following this evaluation, the two
                               sets of results were compared to determine their relative performance. The comparative
                               results of these models are presented in Table 4.
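A cutoff search of this kind can be sketched as a scan over candidate thresholds. This simplified version uses accuracy alone as the selection criterion, which is an assumption; the scores and labels are hypothetical and do not come from the NHANES data.

```python
def best_cutoff(scores, labels):
    """Return (cutoff, accuracy) for the rule 'score >= cutoff means positive',
    choosing the cutoff with the highest accuracy (first one wins on ties)."""
    best = (None, -1.0)
    for cutoff in sorted(set(scores)):
        correct = sum((s >= cutoff) == bool(y) for s, y in zip(scores, labels))
        accuracy = correct / len(labels)
        if accuracy > best[1]:
            best = (cutoff, accuracy)
    return best


# Hypothetical HSI scores and NAFLD labels (1 = positive), for illustration.
hsi_scores = [30.0, 35.0, 40.0, 45.0, 50.0, 55.0]
nafld = [0, 0, 1, 0, 1, 1]
cutoff, accuracy = best_cutoff(hsi_scores, nafld)
print(cutoff, round(accuracy, 3))
```

Only observed score values need to be tried as cutoffs, because accuracy can change only when the threshold crosses one of them.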
Table 4. Comparison of Results: Ensemble Model and Machine Learning Algorithms and HSI index.
 Machine Learning Techniques     Accuracy       Precision   Recall   F1-Score    Sensitivity      Specificity   Area under the Curve
 Decision Tree                   0.92           0.92        0.92     0.92        0.97             0.85          0.915
 KNN                             0.88           0.90        0.87     0.87        1                0.74          0.968
 Random Forest                   0.98           0.98        0.98     0.98        0.99             0.96          0.999
 SVM                             0.97           0.97        0.97     0.97        1                0.93          1.000
 XGBoost                         0.98           0.98        0.98     0.98        0.99             0.96          1.000
 Logistic Regression             0.84           0.85        0.84     0.84        0.91             0.76          0.907
 Ensemble Model                  0.99           0.99        0.99     0.99        1                0.97          1.000
 HSI index                       0.74           0.77        0.62     0.68        0.62             0.72          0.73
                                     Figure 2. AUC and ROC curve obtained by Ensemble Model and Machine Learning Algorithms.
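The AUC summarized by an ROC curve can be read as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case. The sketch below computes it directly from that pairwise definition; the scores and labels are hypothetical, and this quadratic-time version is meant for illustration rather than for large datasets.

```python
def roc_auc(scores, labels):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.
    Tied pairs count as half a correct ranking."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical model scores and true labels, for illustration only.
auc = roc_auc([0.1, 0.4, 0.35, 0.8], [0, 0, 1, 1])
print(auc)
```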
                                     Table 5. Numbers of TP, FN, TN, and FP cases of Ensemble Model and Machine Learning Algorithms.
                                      Machine Learning Techniques                TP      TN      FN      FP
                                      Decision Tree                              454     381     13      63
                                      Random Forest                              466     430     1       14
                                      K-nearest Neighbors classifier (KNN)       467     331     0       113
                                      Support Vector Machine (SVM)               467     415     0       29
                                      Logistic Regression                        428     339     39      105
                                      Extreme Gradient Boosting (XGBoost)        466     429     1       15
                                      Ensemble Model                             467     433     0       11
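The counts in Table 5 determine the accuracy, sensitivity, and specificity of each model. As a minimal sketch using the standard definitions (the function name is hypothetical), the ensemble row gives an accuracy of about 0.99, a sensitivity of 1, and a specificity of about 0.975, in line with the values reported in Table 4.

```python
def confusion_metrics(tp, tn, fn, fp):
    """Derive standard metrics from confusion-matrix counts."""
    total = tp + tn + fn + fp
    return {
        "accuracy": (tp + tn) / total,
        "sensitivity": tp / (tp + fn),  # true positive rate
        "specificity": tn / (tn + fp),  # true negative rate
    }


# Ensemble Model row of Table 5.
ensemble = confusion_metrics(tp=467, tn=433, fn=0, fp=11)
for name, value in ensemble.items():
    print(name, round(value, 3))
```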
Table 6. MAE in testing and training phases of Ensemble Model and Machine Learning Algorithms.
                                   Figure 3. AUC and ROC curve obtained by Ensemble Model and Machine Learning Algorithms after
                                   applying SMOTEENN exclusively to the training set.
                                   Table 7. Comparison of results after applying SMOTEENN exclusively to the training set.
                                   Machine Learning Technique   Accuracy   Precision   Recall   F1-Score   Sensitivity   Specificity   AUC
                                   Decision Tree                0.89       0.94        0.89     0.91       0.85          0.89          0.901
                                   KNN                          0.94       0.96        0.94     0.94       0.95          0.93          0.964
                                   Random Forest                0.99       0.99        0.99     0.99       0.92          1             0.975
                                   SVM                          0.96       0.96        0.96     0.95       0.47          1             0.993
                                   XGBoost                      0.97       0.97        0.97     0.97       0.67          1             0.965
                                   Logistic Regression          0.86       0.91        0.86     0.88       0.60          0.88          0.854
                                   Ensemble Model               0.99       0.99        0.99     0.99       0.85          1             0.981
                                   HSI index                    0.74       0.77        0.62     0.68       0.62          0.72          0.73
Bioengineering 2024, 11, 600                                                                                                  10 of 14
                               Table 8. Numbers of TP, FN, TN, and FP cases after applying SMOTEENN exclusively to the
                               training set.
Table 9. MAE in testing and training phases after applying SMOTEENN exclusively to the training set.
                               Table 10. Number of subjects in the training and testing sets after applying SMOTEENN exclusively
                               to the training set.
                               4. Discussion
                                      Based on the experimental results, an ensemble model incorporating random forest,
                                XGBoost, and SVM can serve as an effective tool for healthcare professionals in predicting
                                NAFLD from affordable laboratory test data. This ensemble model demonstrates superior
                                performance compared to the HSI, an index that doctors have used for evaluating and
                                screening NAFLD.
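For reference, the HSI used as the baseline is defined by Lee et al. [21] as 8 × (ALT/AST) + BMI, plus 2 for women and plus 2 for patients with diabetes mellitus. The following minimal sketch computes it. The function names, argument names, and the example subject are illustrative assumptions, and the cutoffs of 30 and 36 are those proposed in [21], not necessarily the threshold applied in this study.

```python
def hepatic_steatosis_index(alt, ast, bmi, female, diabetes):
    """HSI = 8 * (ALT / AST) + BMI, +2 if female, +2 if diabetes mellitus."""
    if ast <= 0:
        raise ValueError("AST must be positive")
    score = 8.0 * (alt / ast) + bmi
    if female:
        score += 2.0
    if diabetes:
        score += 2.0
    return score


def classify_hsi(score, low=30.0, high=36.0):
    """Map an HSI score to a screening category (cutoffs taken from [21])."""
    if score < low:
        return "NAFLD unlikely"
    if score > high:
        return "NAFLD likely"
    return "indeterminate"


# Hypothetical subject: ALT 40 U/L, AST 25 U/L, BMI 29 kg/m^2, female, no diabetes.
example_score = hepatic_steatosis_index(alt=40, ast=25, bmi=29.0,
                                        female=True, diabetes=False)
print(round(example_score, 1), classify_hsi(example_score))
```

Because the index is a fixed linear formula over four inputs, it cannot adapt to interactions among the other laboratory features that the machine learning models can use.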
                                     Conventional statistical techniques have limitations in directly predicting NAFLD due
                               to their reliance on selecting potential risk factors from data. To overcome these limitations,
                               this study introduces machine learning techniques, which leverage statistical methods for
                               data analysis and evaluation. Machine learning algorithms offer powerful tools for disease
                               study and diagnosis, with the advantage of simultaneously considering multiple features
                               without the need for variable selection. By applying machine learning techniques, this
                               research aims to enhance NAFLD prediction by comprehensively analyzing diverse factors
                               and their complex relationships, providing a data-driven and multivariate approach to
                               improve diagnostic accuracy.
                                     Furthermore, in our study, it was observed that HSI exhibited lower accuracy and AUC
                               compared to the machine learning models utilized. This indicates that HSI was limited in
                               its ability to accurately identify individuals with NAFLD, as it could only detect a small
                               number of patients with the condition. Conversely, all of the machine learning models
                               demonstrated higher F1 scores than HSI. This indicates that the machine learning models
                               outperformed HSI in terms of precision and recall, achieving a better balance between
                               correctly identifying true positives and minimizing false positives and false negatives. The
                               higher F1 scores achieved by the machine learning models provide supporting evidence
                               that they exhibited superior performance in predicting NAFLD compared to HSI.
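To make the relationship between these metrics concrete, the sketch below derives accuracy, precision, recall (sensitivity), specificity, and F1 from the four confusion-matrix counts. The counts are hypothetical and are not taken from Table 8. As a rough check, the harmonic mean of the HSI precision and recall reported in Table 7 (0.77 and 0.62) is about 0.69, close to the reported F1 of 0.68; the small gap is presumably due to rounding or averaging, which is an assumption.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def classification_metrics(tp, fn, tn, fp):
    """Derive common binary-classification metrics from confusion-matrix counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0       # also called sensitivity
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "specificity": specificity,
        "f1": f1_score(precision, recall),
    }


# Hypothetical counts, for illustration only.
metrics = classification_metrics(tp=80, fn=20, tn=90, fp=10)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")

# Harmonic mean of the HSI precision and recall listed in Table 7.
hsi_f1 = f1_score(0.77, 0.62)
print(f"HSI F1 from Table 7 precision/recall: {hsi_f1:.3f}")
```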
                                     SMOTEENN was also exclusively applied to the training set after the dataset was
                               split. This adjustment was made to ensure the absence of potential data leakage and to
                               ensure the robustness of our model evaluation. Our evaluation metrics, including accuracy,
                               precision, recall, F1-score, and AUC, remained consistently high. Minor variations observed
                               in certain metrics were within an acceptable range and did not compromise the overall
                               performance of the model. Furthermore, the Mean Absolute Error (MAE) for both the
                               training and testing datasets remained very low, indicating a strong fit of the model to the
                               data. Confusion matrices were generated for both scenarios to provide a comprehensive
                                assessment of our model’s performance. Given these results, we are confident that data
                                leakage did not occur. Additionally, we can confirm that no subject contributed data
                                to both the training and test sets. In conclusion, applying SMOTEENN exclusively
                               to the training set after splitting the dataset maintains the integrity of our findings while
                               addressing potential concerns regarding data leakage. The consistency of our evaluation
                               metrics reaffirms the robustness of our approach and the validity of our results.
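The order of operations described above (split first, then resample only the training partition) can be sketched as follows. This is a minimal, standard-library illustration under stated assumptions: simple random oversampling stands in for SMOTEENN, and the record layout, function names, and synthetic data are hypothetical rather than the authors' pipeline.

```python
import random
from collections import Counter


def split_by_subject(records, test_fraction=0.2, seed=0):
    """Split records so that each subject ID lands in exactly one partition."""
    subject_ids = sorted({r["subject_id"] for r in records})
    rng = random.Random(seed)
    rng.shuffle(subject_ids)
    n_test = max(1, round(len(subject_ids) * test_fraction))
    test_ids = set(subject_ids[:n_test])
    train = [r for r in records if r["subject_id"] not in test_ids]
    test = [r for r in records if r["subject_id"] in test_ids]
    return train, test


def oversample_minority(train, seed=0):
    """Duplicate minority-class records until classes are balanced.

    Stand-in for SMOTEENN: it only touches the training partition it is given.
    """
    rng = random.Random(seed)
    counts = Counter(r["label"] for r in train)
    majority_size = max(counts.values())
    balanced = list(train)
    for label, count in counts.items():
        pool = [r for r in train if r["label"] == label]
        balanced.extend(rng.choice(pool) for _ in range(majority_size - count))
    return balanced


# Hypothetical imbalanced data: 100 subjects, 20% labelled positive.
records = [
    {"subject_id": i, "feature": float(i % 7), "label": 1 if i % 5 == 0 else 0}
    for i in range(100)
]

train, test = split_by_subject(records)      # 1) split first
train_balanced = oversample_minority(train)  # 2) resample the training set only

print("train labels before:", Counter(r["label"] for r in train))
print("train labels after: ", Counter(r["label"] for r in train_balanced))
print("test size (untouched):", len(test))
```

Resampling after the split keeps synthetic or duplicated records out of the test partition, so the test metrics reflect the original class distribution.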
                                      Machine learning techniques for disease prediction have been widely explored.
                                Ji et al. [16] employed four machine learning algorithms to predict NAFLD using a
                                dataset of 304,145 adults; among these, XGBoost demonstrated the highest performance,
                                with an accuracy of 0.880 and an AUC of 0.951. Ma et al. [17] evaluated the optimal
                                clinical predictive model for NAFLD using machine learning techniques on a dataset of
                                2,522 individuals who met the diagnostic criteria for NAFLD. Among the 11 techniques
                                employed, the best-performing one yielded an accuracy of 83%. Liu et al. [18] explored
                                the predictive capabilities of seven machine learning tools for NAFLD in 15,315 Chinese
                                subjects. Among these models, XGBoost exhibited the highest accuracy, 0.795, and the
                                best prediction ability of all the models constructed in that study.
                                      Atsawarungruangkit et al. [25] created machine learning models for predicting
                                NAFLD using data from the NHANES 1988–1994 dataset, comprising 3235 participants and
                                sourced from the National Center for Health Statistics (NCHS). They reported an accuracy
                                of 71.1% for the ensemble of random undersampling (RUS) boosted trees. Interestingly, a
                                simpler model, referred to as “coarse trees,” achieved a higher accuracy of 74.9%.
                                      These previous studies support the effectiveness of machine learning tools in
                                screening for and identifying individuals at risk of NAFLD. Consistent with them, the
                                ensemble model developed in our study achieved an accuracy of 99% and an AUC of 100%,
                                which further supports the utility of machine learning models in predicting NAFLD and
                                highlights the potential of an ensemble approach to enhance prediction accuracy. It is
                                crucial to emphasize that, while the ensemble model shows promising results, its
                                effectiveness and integration into clinical practice require further research and
                                validation studies. Nonetheless, the current findings indicate its potential as a valuable
                                tool for healthcare professionals, utilizing affordable laboratory test data to aid in the
                                diagnosis of NAFLD.
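As an illustration of how an ensemble of random forest, XGBoost, and SVM can combine its members' outputs, the sketch below applies hard majority voting to per-model class predictions. The voting rule is an assumption, since this section does not state how the three models were combined, and the prediction lists are hypothetical placeholders rather than outputs of trained models.

```python
from collections import Counter


def majority_vote(*model_predictions):
    """Combine per-model label lists into one list by majority vote per sample."""
    if len({len(p) for p in model_predictions}) != 1:
        raise ValueError("need at least one model, all with the same number of predictions")
    combined = []
    for votes in zip(*model_predictions):
        # most_common(1) returns the label with the highest vote count.
        combined.append(Counter(votes).most_common(1)[0][0])
    return combined


# Hypothetical binary predictions (1 = NAFLD, 0 = no NAFLD) for five subjects.
rf_pred = [1, 0, 1, 1, 0]
xgb_pred = [1, 0, 0, 1, 0]
svm_pred = [0, 0, 1, 1, 1]

ensemble_pred = majority_vote(rf_pred, xgb_pred, svm_pred)
print(ensemble_pred)
```

With three binary classifiers a tie cannot occur, so each ensemble label is the one chosen by at least two of the three models.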
                                     Gender has emerged as a critically important factor in NAFLD, which is currently the
                               most common liver disorder worldwide. Sexual dimorphism is evident in NAFLD, with
                               notable disparities in both prevalence and severity based on gender. These differences are
                               not solely influenced by sociocultural factors or lifestyle variations, but also attributed to
                               biological disparities resulting from chromosomal makeup and sex hormone levels [26].
                               Numerous studies have demonstrated the impact of gender on NAFLD prevalence. For
                               instance, a Japanese study conducted over 12 years found that the average prevalence
                               of fatty liver in men was double that observed in women (26% vs. 13%). Interestingly,
                               women exhibited a gradual increase in prevalence with age, whereas men demonstrated a
                               relatively stable prevalence across all age groups. In the 70–79 age group, the prevalence
                               of NAFLD was higher in females compared to males. Similarly, a study conducted in
                               South China revealed a significantly higher prevalence of NAFLD in men compared to
                               women below the age of 50 (22.4% vs. 7.1%). However, the prevalence reversed among
                                individuals over the age of 50, with higher rates observed in women (27.6% vs. 20.6%). In
                                humans, NAFLD predominantly affects men, while premenopausal women are protected from
                                NAFLD much as they are from cardiovascular disease [27]. In our study, we computed an
                                importance value for each feature and ranked the features accordingly. Gender received
                                the highest feature importance score, reinforcing its significance in predicting NAFLD.
                                This finding aligns with previous studies that have demonstrated the prominent role of
                                gender in the prevalence and severity of NAFLD. Understanding
                               the mechanisms underlying gender disparities in NAFLD is crucial for developing targeted
                               therapeutic interventions. Further research is needed to explore the complex interplay
                               between gender, sex hormones, and NAFLD to optimize management strategies and
                               improve patient outcomes.
                                      This study has notable strengths that contribute to its robustness and generalizability.
                                Firstly, it utilizes the NHANES dataset, which includes individuals from diverse racial and
                                ethnic backgrounds in the United States. The dataset is specifically designed to represent
                                the non-institutionalized civilian population and oversamples specific demographic groups
                                to ensure adequate representation, which helps minimize potential bias. Additionally, this
                                study leverages large-scale datasets for training and testing the algorithms, enabling a
                                comprehensive evaluation across different models. Notably, an ensemble model is developed
                                to enhance accuracy and optimize algorithm performance. These strengths collectively
                                enhance the reliability and applicability of the study’s results.
                                     Nevertheless, our study has certain limitations, notably the utilization of restricted
                               datasets, the absence of clinical data, and the lack of a clinical trial. For future endeavors,
                               our goal is to integrate further attributes pertaining to NAFLD, which will aid in the
                               creation of more dependable and effective machine learning methods. Additionally, we
                                strongly advocate for the implementation of a clinical trial to validate the efficacy of
                               these techniques in real-world situations.
                               5. Conclusions
                                     The results of our study highlight the effectiveness of the ensemble model, combining
                               random forest, XGBoost, and SVM techniques, in accurately diagnosing NAFLD. The
                               ensemble model achieved a remarkable performance, with an accuracy of 0.99 and an AUC
                               of 1.00, indicating its high precision and reliability. Notably, our analysis identified gender
                                as the most important feature in predicting NAFLD, further emphasizing its significance in
                                this disease. Compared with conventional indices such as HSI, our ensemble model
                                demonstrated superior diagnostic capabilities and yielded substantially better results.
                                These findings underscore the potential of the ensemble model for the early detection and
                                diagnosis of NAFLD and its usefulness in screening.
                                   Author Contributions: Conceptualization, A.A.; Data curation, A.A.; Formal analysis, A.A.; Investi-
                                   gation, A.A.; Software, A.A.; Supervision, Y.-L.C., M.A., and T.-H.T.; Writing—original draft, A.A.;
                                   Writing—review and editing, Y.-L.C., M.A., and T.-H.T. All authors have read and agreed to the
                                   published version of the manuscript.
                                   Funding: This work was supported by the National Science and Technology Council of the Republic
                                   of China (Taiwan) under Contract NSTC 113-2119-M-027-003, 113-2119-M-006-007, 112-2221-E-027-097
                                   and 112-2221-E-027-107.
                                   Institutional Review Board Statement: Not applicable.
                                   Informed Consent Statement: Not applicable.
                                   Data Availability Statement: The NHANES dataset is provided by the Centers for Disease
                                   Control (CDC) of the United States as a part of the National Health and Nutrition
                                   Examination Survey (NHANES) and is available online at
                                   https://wwwn.cdc.gov/nchs/nhanes/continuousnhanes/default.aspx?BeginYear=2017
                                   (accessed on 15 February 2020).
                                   Conflicts of Interest: The authors declare no conflict of interest.
References
1.    Su, P.-Y.; Chen, Y.-Y.; Lin, C.-Y.; Su, W.-W.; Huang, S.-P.; Yen, H.-H. Comparison of Machine Learning Models and the Fatty Liver
      Index in Predicting Lean Fatty Liver. Diagnostics 2023, 13, 1407. [CrossRef] [PubMed]
2.    Younossi, Z.M.; Koenig, A.B.; Abdelatif, D.; Fazel, Y.; Henry, L.; Wymer, M. Global Epidemiology of Nonalcoholic Fatty Liver
      Disease-Meta-Analytic Assessment of Prevalence, Incidence, and Outcomes. Hepatology 2016, 64, 73–84. [CrossRef]
3.    Castellana, M.; Donghia, R.; Guerra, V.; Procino, F.; Lampignano, L.; Castellana, F.; Zupo, R.; Sardone, R.; De Pergola, G.;
      Romanelli, F.; et al. Performance of Fatty Liver Index in Identifying Non-Alcoholic Fatty Liver Disease in Population Studies. A
      Meta-Analysis. J. Clin. Med. 2021, 10, 1877. [CrossRef] [PubMed]
4.    Fan, J.-G.; Kim, S.-U.; Wong, V.W.-S. New Trends on Obesity and NAFLD in Asia. J. Hepatol. 2017, 67, 862–873. [CrossRef]
      [PubMed]
5.    Chen, L.-D.; Huang, J.-F.; Chen, Q.-S.; Lin, G.-F.; Zeng, H.-X.; Lin, X.-F.; Lin, X.-J.; Lin, L.; Lin, Q.-C. Validation of Fatty Liver
      Index and Hepatic Steatosis Index for Screening of Non-Alcoholic Fatty Liver Disease in Adults with Obstructive Sleep Apnea
      Hypopnea Syndrome. Chin. Med. J. 2019, 132, 2670–2676. [CrossRef] [PubMed]
6.    Fan, J.G.; Wei, L.; Zhuang, H. Guidelines of Prevention and Treatment of Nonalcoholic Fatty Liver Disease (2018, China). J. Dig.
      Dis. 2019, 20, 163–173. [CrossRef] [PubMed]
7.    Younossi, Z.; Stepanova, M.; Ong, J.P.; Jacobson, I.M.; Bugianesi, E.; Duseja, A.; Eguchi, Y.; Wong, V.W.; Negro, F.; Yilmaz, Y.; et al.
      Nonalcoholic Steatohepatitis Is the Fastest Growing Cause of Hepatocellular Carcinoma in Liver Transplant Candidates. Clin.
      Gastroenterol. Hepatol. Off. Clin. Pract. J. Am. Gastroenterol. Assoc. 2019, 17, 748–755.e3. [CrossRef] [PubMed]
8.    Chalasani, N.; Younossi, Z.; Lavine, J.E.; Charlton, M.; Cusi, K.; Rinella, M.; Harrison, S.A.; Brunt, E.M.; Sanyal, A.J. The Diagnosis
      and Management of Nonalcoholic Fatty Liver Disease: Practice Guidance from the American Association for the Study of Liver
      Diseases. Hepatology 2018, 67, 328–357. [CrossRef]
9.    Sîrbu, O.; Floria, M.; Dăscălița, P.; Şorodoc, V.; Şorodoc, L. Non-Alcoholic Fatty Liver Disease-From the Cardiologist Perspective.
      Anatol. J. Cardiol. 2016, 16, 534–541. [CrossRef]
10.   Pirmoazen, A.M.; Khurana, A.; El Kaffas, A.; Kamaya, A. Quantitative ultrasound approaches for diagnosis and monitoring
      hepatic steatosis in nonalcoholic fatty liver disease. Theranostics 2020, 10, 4277–4289. [CrossRef] [PubMed] [PubMed Central]
11.   Zhang, Y.N.; Fowler, K.J.; Hamilton, G.; Cui, J.Y.; Sy, E.Z.; Balanay, M.; Hooker, J.C.; Szeverenyi, N.; Sirlin, C.B. Liver fat imaging-a
      clinical overview of ultrasound, CT, and MR imaging. Br. J. Radiol. 2018, 91, 20170959. [CrossRef] [PubMed] [PubMed Central]
12.   Petzold, G. Role of Ultrasound Methods for the Assessment of NAFLD. J. Clin. Med. 2022, 11, 4581. [CrossRef] [PubMed]
      [PubMed Central]
13.   Decharatanachart, P.; Chaiteerakij, R.; Tiyarattanachai, T.; Treeprasertsuk, S. Application of Artificial Intelligence in Non-
      Alcoholic Fatty Liver Disease and Liver Fibrosis: A Systematic Review and Meta-Analysis. Therap. Adv. Gastroenterol. 2021, 14,
      17562848211062808. [CrossRef] [PubMed]
14.   Bedogni, G.; Bellentani, S.; Miglioli, L.; Masutti, F.; Passalacqua, M.; Castiglione, A.; Tiribelli, C. The Fatty Liver Index: A Simple
      and Accurate Predictor of Hepatic Steatosis in the General Population. BMC Gastroenterol. 2006, 6, 33. [CrossRef] [PubMed]
15.   Kahl, S.; Straßburger, K.; Nowotny, B.; Livingstone, R.; Klüppelholz, B.; Keßel, K.; Hwang, J.-H.; Giani, G.; Hoffmann, B.; Pacini,
      G.; et al. Comparison of Liver Fat Indices for the Diagnosis of Hepatic Steatosis and Insulin Resistance. PLoS ONE 2014, 9, e94059.
      [CrossRef] [PubMed]
16.   Ji, W.; Xue, M.; Zhang, Y.; Yao, H.; Wang, Y. A Machine Learning Based Framework to Identify and Classify Non-Alcoholic Fatty
      Liver Disease in a Large-Scale Population. Front. Public Health 2022, 10, 846118. [CrossRef] [PubMed]
17.   Ma, H.; Xu, C.; Shen, Z.; Yu, C.; Li, Y. Application of Machine Learning Techniques for Clinical Predictive Modeling: A
      Cross-Sectional Study on Nonalcoholic Fatty Liver Disease in China. BioMed Res. Int. 2018, 2018, 4304376. [CrossRef] [PubMed]
18.   Liu, Y.-X.; Liu, X.; Cen, C.; Li, X.; Liu, J.-M.; Ming, Z.-Y.; Yu, S.-F.; Tang, X.-F.; Zhou, L.; Yu, J.; et al. Comparison and Development
      of Advanced Machine Learning Tools to Predict Nonalcoholic Fatty Liver Disease: An Extended Study. Hepatobiliary Pancreat.
      Dis. Int. 2021, 20, 409–415. [CrossRef] [PubMed]
19.   CDC Database. Available online: https://wwwn.cdc.gov/nchs/nhanes/continuousnhanes/default.aspx?BeginYear=2017
      (accessed on 15 February 2020).
20.   Kapoor, S.; Narayanan, A. Leakage and the reproducibility crisis in machine-learning-based science. Patterns 2023, 4, 100804.
      [CrossRef]
21.   Lee, J.-H.; Kim, D.; Kim, H.J.; Lee, C.-H.; Yang, J.I.; Kim, W.; Kim, Y.J.; Yoon, J.-H.; Cho, S.-H.; Sung, M.-W.; et al. Hepatic Steatosis
      Index: A Simple Screening Tool Reflecting Nonalcoholic Fatty Liver Disease. Dig. Liver Dis. Off. J. Ital. Soc. Gastroenterol. Ital.
      Assoc. Study Liver 2010, 42, 503–508. [CrossRef]
22.   Lee, Y.; Bang, H.; Park, Y.M.; Bae, J.C.; Lee, B.-W.; Kang, E.S.; Cha, B.S.; Lee, H.C.; Balkau, B.; Lee, W.-Y.; et al. Non-Laboratory-
      Based Self-Assessment Screening Score for Non-Alcoholic Fatty Liver Disease: Development, Validation and Comparison with
      Other Scores. PLoS ONE 2014, 9, e107584. [CrossRef] [PubMed]
23.   Chon, Y.E.; Jung, K.S.; Kim, S.U.; Park, J.Y.; Park, Y.N.; Kim, D.Y.; Ahn, S.H.; Chon, C.Y.; Lee, H.W.; Park, Y.; et al. Controlled
      Attenuation Parameter (CAP) for Detection of Hepatic Steatosis in Patients with Chronic Liver Diseases: A Prospective Study of a
      Native Korean Population. Liver Int. Off. J. Int. Assoc. Study Liver 2014, 34, 102–109. [CrossRef] [PubMed]
24.   Shih, K.-L.; Su, W.-W.; Chang, C.-C.; Kor, C.-T.; Chou, C.-T.; Chen, T.-Y.; Wu, H.-M. Comparisons of Parallel Potential Biomarkers
      of 1H-MRS-Measured Hepatic Lipid Content in Patients with Non-Alcoholic Fatty Liver Disease. Sci. Rep. 2016, 6, 24031.
      [CrossRef] [PubMed]
25.   Atsawarungruangkit, A.; Laoveeravat, P.; Promrat, K. Machine learning models for predicting non-alcoholic fatty liver disease in
      the general United States population: NHANES database. World J. Hepatol. 2021, 13, 1417–1427. [CrossRef] [PubMed]
26.   Nagral, A.; Bangar, M.; Menezes, S.; Bhatia, S.; Butt, N.; Ghosh, J.; Manchanayake, J.H.; Mahtab, M.A.; Singh, S.P. Gender
      Differences in Nonalcoholic Fatty Liver Disease. Euroasian J. Hepato-Gastroenterol. 2022, 12 (Suppl. S1), S19–S25. [CrossRef]
27.   Ballestri, S.; Nascimbeni, F.; Baldelli, E.; Marrazzo, A.; Romagnoli, D.; Lonardo, A. NAFLD as a Sexual Dimorphic Disease:
      Role of Gender and Reproductive Status in the Development and Progression of Nonalcoholic Fatty Liver Disease and Inherent
      Cardiovascular Risk. Adv. Ther. 2017, 34, 1291–1326. [CrossRef]