Ijst 2023 3151
RESEARCH ARTICLE
https://www.indjst.org/ 702
Patel & Shah / Indian Journal of Science and Technology 2024;17(8):702–712
1 Introduction
Lung diseases, including pneumonia, tuberculosis, lung opacity, and COVID-19, are a
worldwide health issue. The WHO reported in 2020 that COVID-19 had increased the
urgency of addressing respiratory illnesses (1). Traditional methods for identifying lung
diseases include skin tests, blood tests, sputum sample analysis, chest X-rays, and computed
tomography (CT) scans. Radiologists and doctors use X-ray and CT images to
diagnose lung diseases. X-rays are popular in radiology because they are non-invasive,
cost-effective, distant, and portable (2). Despite advances in medical imaging, interclass
similarities make lung disease interpretation difficult, which calls for more
computer-aided diagnosis (CAD) systems.
Recent advances in GPU technology and deep learning have greatly improved CAD
system performance. Although machine learning algorithms such as support vector machines
(SVM), K-nearest neighbors (KNN), and decision trees (DT) have contributed to
disease prognosis, their performance depends heavily on feature extraction (3). Deep
learning techniques, notably convolutional neural networks (CNNs), have
gained popularity in medical imaging to address these limitations (4,5). However, the
categorization of multiclass lung diseases remains a challenging task.
Several studies have focused on binary classification of lung diseases using deep
learning (DL), and recent work has made progress in multiclass lung disease
detection from chest X-rays. Ozturk et al. (6) demonstrated the success
of their 17-layer Darknet model in identifying COVID-19 in both binary
and multi-class classification tasks. The lightweight DCNN model designed by Sultana
et al. (7) for the identification of lung diseases outperformed other pretrained CNN
architectures. Hussain et al. (8) introduced the CoroDet CNN model, emphasizing
the importance of activation functions and various network layers in detecting lung
diseases from X-ray and CT scan images. COVIDiagnosis-Net, proposed by Ucar and
Korkmaz (9), addressed class imbalance in classifying chest X-ray images into Covid,
Pneumonia, and Normal categories; the researchers incorporated offline augmentation
techniques to enhance the model's performance. ChestX-ray6 (10), despite its lightweight
architecture, reported high accuracy in detecting pneumonia, COVID-19,
and other diseases. Alshmrani et al. (11) proposed a model using VGG19 followed by
CNN layers trained from scratch to classify six different diseases from chest X-rays.
Al-Sheikh et al. (12) introduced an image enhancement algorithm based on the
k-symbol Lerch transcendent functions model, applied in the pre-processing phase to
enhance images based on pixel probability; for classification, a customized CNN
architecture and two pre-trained CNN models (AlexNet and VGG16Net) were employed.
D3SENet (13) is a hybrid feature extraction network that combines multiple architectures
and uses traditional machine learning methods, including SVMs, for classification.
Farhan and Yang (14) proposed a hybrid network for lung disease classification, in which
a 2D CNN was designed to extract features from X-ray images, and the features were
optimized using min-max scaling and classified using various ML algorithms.
Despite the promise shown by CAD systems, multiclass lung disease categorization
using chest X-ray images faces obstacles, particularly in improving image quality and
addressing data balancing. Most studies in this domain have focused on individual
pretrained architectures or CNN models developed from scratch, with a notable lack
of exploration into combining these models to enhance detection capabilities. The
existence of interclass similarities and the relatively underexplored nature of multiclass
lung disease classification highlight the need for further advancement in this area.
To address these problems, we propose a deep ensemble learning-based multiclass classification method that assigns chest X-ray
images to one of four classes: COVID-19, pneumonia, lung opacity, and normal.
2 Methodology
The proposed methodology for the multi-class classification of lung diseases using chest X-ray images is shown in Figure 1.
It comprises several stages, including data acquisition, image pre-processing (involving image resizing, normalization,
enhancement, and noise removal), data resampling, feature extraction, and classification. For feature extraction, we
employed fine-tuned pre-trained models, namely VGG16, MobileNetV2, and InceptionV3. The resulting feature vectors were
concatenated and passed to ML classifiers (SVM, random forest (RF), KNN, DT), whose predictions were combined through
a majority voting system to determine the final classification. Performance parameters of the trained model were then measured.
Fig 1. Proposed Methodology for lung diseases prediction using Chest X-rays
2.1 Dataset
The chest X-ray datasets utilized in this research were sourced from the COVID-19 Radiography Database, which received the
COVID-19 Dataset Award from the Kaggle Community. These datasets are freely accessible on Kaggle and can be downloaded
from the link given in reference (15). The database is continually updated with contributions from researchers across different regions. The dataset
comprises a total of 18,953 images distributed across four different classes: COVID-19, Lung Opacity, Normal, and Viral
Pneumonia, with respective counts of 3616, 6012, 7980, and 1345. The distribution of these images is unbalanced, and they
exhibit noise as well as a distinct difference in contrast.
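The degree of imbalance can be checked with a few lines of arithmetic. The sketch below only restates the class counts reported above; the variable names are illustrative and not taken from the original work.

```python
# Class counts as reported in the dataset description above.
counts = {
    "COVID-19": 3616,
    "Lung Opacity": 6012,
    "Normal": 7980,
    "Viral Pneumonia": 1345,
}

total = sum(counts.values())
# Share of the whole dataset held by each class.
shares = {name: n / total for name, n in counts.items()}
# Ratio between the largest and the smallest class.
imbalance_ratio = max(counts.values()) / min(counts.values())

for name, n in counts.items():
    print(f"{name:16s} {n:5d}  {shares[name]:6.1%}")
print(f"total = {total}, largest/smallest = {imbalance_ratio:.2f}")
```

The counts sum to 18,953, and the Normal class is roughly 5.9 times larger than the Viral Pneumonia class, which is the imbalance that the resampling step is meant to reduce.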
Where I2 is the output of the median filtering, and w is the window size. Figure 2 illustrates the effect of image quality
enhancement after applying histogram equalization and median filtering techniques.
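As an illustration of the two enhancement steps named above, the following is a minimal pure-Python sketch of textbook histogram equalization and a w x w median filter. It is not the authors' implementation: the function names, the replicated-border handling, and the toy images are assumptions made here for the example, and a practical pipeline would normally rely on an image-processing library.

```python
from statistics import median


def equalize_histogram(image, levels=256):
    """Histogram equalization for a grayscale image given as rows of ints in [0, levels)."""
    flat = [p for row in image for p in row]
    hist = [0] * levels
    for p in flat:
        hist[p] += 1
    # Cumulative distribution function of the gray levels.
    cdf, running = [], 0
    for count in hist:
        running += count
        cdf.append(running)
    cdf_min = next(c for c in cdf if c > 0)
    n = len(flat)
    if n == cdf_min:  # constant image: nothing to stretch
        return [row[:] for row in image]
    scale = (levels - 1) / (n - cdf_min)
    return [[round((cdf[p] - cdf_min) * scale) for p in row] for row in image]


def median_filter(image, w=3):
    """Median filter with a w x w window (w odd); borders reuse the nearest edge pixel (assumption)."""
    height, width = len(image), len(image[0])
    r = w // 2
    out = []
    for y in range(height):
        row = []
        for x in range(width):
            window = [
                image[min(max(y + dy, 0), height - 1)][min(max(x + dx, 0), width - 1)]
                for dy in range(-r, r + 1)
                for dx in range(-r, r + 1)
            ]
            row.append(int(median(window)))
        out.append(row)
    return out


# Toy inputs (hypothetical): an isolated bright spike and a low-contrast patch.
noisy = [[10, 10, 10], [10, 255, 10], [10, 10, 10]]
filtered = median_filter(noisy, w=3)

low_contrast = [[100, 110], [120, 130]]
equalized = equalize_histogram(low_contrast)
```

In the toy example the median filter removes the isolated spike (every output pixel is 10), and equalization stretches the four gray levels 100 to 130 across the full range as 0, 85, 170, 255.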
The issue of class imbalance in medical image classification poses a challenge, leading to bias that favors the majority class
and compromises classifier accuracy. To tackle this problem, the Near-Miss under-sampling technique is utilized, selectively
eliminating samples from the majority class while retaining essential learning-related data. This approach alleviates bias by
identifying samples based on their shortest average distance to their K nearest neighbors.
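The selection rule can be sketched as follows. The text does not state which Near-Miss variant was used, so this example assumes version 1, which keeps the majority-class samples whose mean distance to their K nearest minority-class samples is smallest. The function name, the Euclidean distance, and the toy points are illustrative assumptions.

```python
import math


def near_miss_1(majority, minority, n_keep, k=3):
    """Keep the n_keep majority samples with the smallest mean distance
    to their k nearest minority samples (Near-Miss version 1)."""

    def mean_knn_distance(sample):
        distances = sorted(math.dist(sample, m) for m in minority)
        nearest = distances[:k]
        return sum(nearest) / len(nearest)

    ranked = sorted(majority, key=mean_knn_distance)
    return ranked[:n_keep]


# Toy 2-D feature points (hypothetical); real inputs would be image feature vectors.
majority = [(0.0, 0.0), (1.0, 1.0), (5.0, 5.0), (6.0, 6.0), (9.0, 9.0)]
minority = [(1.0, 2.0), (2.0, 1.0)]

# Undersample the majority class down to the size of the minority class.
kept = near_miss_1(majority, minority, n_keep=len(minority), k=2)
print(kept)  # [(1.0, 1.0), (0.0, 0.0)]
```

The two retained points are the majority samples closest to the minority class, so the samples near the class boundary are the ones that survive undersampling.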
1. Transfer learning
2. TL with fine-tuning
3. Fixed feature extraction
Below is a brief explanation of the TL, TL with fine-tuning, and fixed feature extraction techniques.
The proposed modified architectures are employed to extract features from the processed chest X-ray image dataset, with all
layers frozen at their ImageNet weights. The three models produce feature vectors F1, F2, and F3, respectively. These feature vectors
are subsequently concatenated and fed into the ML classifier to obtain the classification score. In this study, the
selected machine learning classifier is SVM. It is chosen for its effectiveness, especially with relatively small to moderately-sized
datasets, and its capability to handle both linear and non-linear separations. The two-level ensemble method is depicted in
Figure 5.
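The feature-combination step amounts to joining, for each image, the vectors produced by the three extractors end to end. The sketch below shows this with toy numbers. The helper name, the vector lengths, and the mapping of F1, F2, and F3 to specific networks are assumptions for illustration only.

```python
def concatenate_features(*feature_sets):
    """Join per-image feature vectors from several extractors end to end."""
    n_images = len(feature_sets[0])
    if any(len(fs) != n_images for fs in feature_sets):
        raise ValueError("every extractor must return one vector per image")
    return [
        [value for vector in per_image for value in vector]
        for per_image in zip(*feature_sets)
    ]


# Toy feature vectors for two images; real vectors would be far longer.
F1 = [[0.1, 0.2], [0.3, 0.4]]            # stand-in for the first extractor
F2 = [[1.0, 1.1, 1.2], [1.3, 1.4, 1.5]]  # stand-in for the second extractor
F3 = [[2.0], [2.1]]                      # stand-in for the third extractor

combined = concatenate_features(F1, F2, F3)
print(combined[0])  # [0.1, 0.2, 1.0, 1.1, 1.2, 2.0]
```

Each combined row has length 2 + 3 + 1 = 6 here; the combined rows are what a downstream classifier such as an SVM would be trained on.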
Table 1. Training accuracy and validation accuracy of various pretrained networks without any processing
Network Architecture           Training Accuracy   Validation Accuracy
CNN from scratch (6 layers)    0.42                0.42
ResNet50                       0.60                0.32
MobileNetV2                    0.81                0.78
InceptionV3                    0.82                0.77
VGG-16                         0.87                0.76
As indicated in Table 1, the VGG-16, MobileNetV2, and InceptionV3 architectures demonstrated better performance than
the other models on the dataset. These models operated independently as feature extractors, and the extracted features
were subsequently classified using a single ML classifier. In this proposed method, the performance of SVM and RF classifiers
was compared, with the RF using 42 decision trees and a random state value of 32. Table 2 presents the performance metrics of
the various pretrained architectures.
As presented in Table 2, the SVM classifier consistently outperforms the RF classifier across all pretrained architectures.
Moreover, MobileNetV2 demonstrates faster feature extraction times compared to VGG-16 and InceptionV3, likely owing to
its lighter architecture. Notably, the combined use of the modified MobileNetV2 pretrained architecture with the SVM classifier
achieves the best performance among the evaluated architectures for the dataset.
In the first approach, an ensemble was created by concatenating feature vectors extracted from the pretrained VGG-16, InceptionV3, and
MobileNetV2 models. The concatenated features were passed to an SVM classifier for prediction. The accuracy, precision,
recall, and F1 score all reached 0.93. Accuracy, a key metric in classification problems, is the fraction of
predictions that match the true labels. The
outcomes indicate that the two-level ensemble approach raises accuracy from 0.92 to 0.93.
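For reference, the four reported measures can all be derived from a confusion matrix. The sketch below uses macro averaging over classes, which is an assumption since the averaging scheme is not stated here, and the 2 x 2 matrix is a hypothetical toy example rather than a result from this study.

```python
def classification_report(confusion):
    """Compute accuracy and macro-averaged precision, recall, and F1.

    confusion[i][j] is the number of samples of true class i predicted as class j.
    """
    n_classes = len(confusion)
    total = sum(sum(row) for row in confusion)
    accuracy = sum(confusion[i][i] for i in range(n_classes)) / total

    precisions, recalls, f1s = [], [], []
    for c in range(n_classes):
        tp = confusion[c][c]
        predicted = sum(confusion[i][c] for i in range(n_classes))  # column sum
        actual = sum(confusion[c])                                  # row sum
        p = tp / predicted if predicted else 0.0
        r = tp / actual if actual else 0.0
        f1 = 2 * p * r / (p + r) if (p + r) else 0.0
        precisions.append(p)
        recalls.append(r)
        f1s.append(f1)

    return {
        "accuracy": accuracy,
        "precision": sum(precisions) / n_classes,
        "recall": sum(recalls) / n_classes,
        "f1": sum(f1s) / n_classes,
    }


# Hypothetical two-class confusion matrix used only to exercise the function.
toy_confusion = [
    [8, 2],
    [1, 9],
]
report = classification_report(toy_confusion)
print({name: round(value, 3) for name, value in report.items()})
```

On the toy matrix, 17 of 20 predictions are correct, so the accuracy is 0.85; the same function applies unchanged to a four-class matrix.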
In the second approach, feature vectors were generated by combining pretrained networks VGG-16, InceptionV3, and
MobileNetV2. Prediction was carried out using various machine learning classifiers such as SVM, RF, DT, and KNN, followed
by majority voting for the final prediction. Table 3 presents the performance measures for the two- and three-level ensemble
approaches.
With the three-level ensemble approach, the performance measures improved from 0.93 to 0.94. The confusion matrices for the two
approaches are presented in Figure 7.
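The majority-voting step of the three-level ensemble can be sketched in a few lines. The tie-breaking rule (the earliest classifier in the list wins) and the example predictions are assumptions made for illustration, since the text does not specify how ties among four classifiers are resolved.

```python
from collections import Counter


def majority_vote(predictions_per_classifier):
    """Combine label predictions from several classifiers, one list per classifier."""
    final = []
    for votes in zip(*predictions_per_classifier):
        counts = Counter(votes)
        top = max(counts.values())
        # Ties go to the label chosen by the earliest classifier in the list (assumption).
        winner = next(label for label in votes if counts[label] == top)
        final.append(winner)
    return final


# Hypothetical predictions for three images from the four classifiers.
svm = ["COVID-19", "Normal", "Lung Opacity"]
rf = ["COVID-19", "Viral Pneumonia", "Lung Opacity"]
dt = ["Normal", "Viral Pneumonia", "Normal"]
knn = ["COVID-19", "Normal", "Lung Opacity"]

final = majority_vote([svm, rf, dt, knn])
print(final)  # ['COVID-19', 'Normal', 'Lung Opacity']
```

The second image shows why a tie rule is needed with an even number of voters: two classifiers vote Normal and two vote Viral Pneumonia, and this sketch falls back on the first classifier's choice.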
To enable remote access to the model for classification, a shareable web link and a user interface were developed using Python
Gradio. Figure 8 showcases the predicted result within the user interface. This user-friendly interface allows users to interact
with the model through a web browser, making it accessible and convenient for remote classification of lung diseases from chest
X-ray images.
Table 4 continued
References  No. of Classes                                Architecture                  Accuracy
(19)        4 (Covid-19, Viral Pneumonia, Bacterial       Xception                      89%
            Pneumonia, Normal)
(20)        3 (Normal, Covid-19, SARS)                    DeTraC                        93%
(21)        4 (normal, pneumonia, and pneumothorax,       EfficientNetV2                82%
            normal, pneumonia, and pneumothorax)
(22)        3 (Covid-19, Viral and Bacterial Pneumonia)   MobileNetV2 fine-tuned        92%
(23)        4 (normal, viral, bacterial pneumonia and     Ensemble of ResNet50 with     94%
            COVID-19)                                     MobileNet_V2 with
                                                          InceptionResNet_V2
Proposed    4 (normal, viral pneumonia, lung opacity      Ensemble of VGG16,            94%
            and COVID-19)                                 InceptionV3 and
                                                          MobileNetV2 with fixed
                                                          feature extraction and ML
                                                          classifiers
4 Conclusion
The proposed convolutional deep learning-based technique utilizes an ensemble fixed feature extraction approach for classifying
various lung disorders from chest X-ray images. This fully automated and end-to-end model eliminates the need for manual
feature extraction. In the multiclass classification of COVID, viral pneumonia, normal, and lung opacity, the ensemble model
achieves a classification accuracy of 94%. The user-friendly interface further enhances convenience for remote classification.
Future work may expand the database to include more classes so that other lung disorders can be classified.
Additionally, we may incorporate X-ray images with additional metadata, such as age, gender, region, smoking habits, and other
physical symptoms, for training. Furthermore, testing the proposed models in clinical practice and consulting with medical
professionals would help validate their practical use in diagnosing lung diseases.
5 Acknowledgement
The authors would like to acknowledge the COVID-19 radiography database, which is freely available on Kaggle, for making
this research possible.
References
1) Coronavirus overview, prevention and symptoms. 2020. Available from: https://www.who.int/emergencies/diseases/novel-coronavirus-2019.
2) Rehman A, Saba T, Tariq U, Ayesha N. Deep Learning-Based COVID-19 Detection Using CT and X-Ray Images: Current Analytics and Comparisons.
IT Professional. 2021;23(3):63–68. Available from: https://doi.org/10.1109/MITP.2020.3036820.
3) Zak M, Krzyżak A. Classification of Lung Diseases Using Deep Learning Models. In: International Conference on Computational Science - ICCS 2020; vol.
12139 of Lecture Notes in Computer Science. Springer, Cham. 2020;p. 621–634. Available from: https://doi.org/10.1007/978-3-030-50420-5_47.
4) Alzubaidi L, Zhang J, Humaidi AJ, Al-Dujaili A, Duan Y, Al-Shamma O, et al. Review of deep learning: concepts, CNN architectures, challenges,
applications, future directions. Journal of Big Data. 2021;8(1):1–74. Available from: https://doi.org/10.1186/s40537-021-00444-8.
5) Goyal S, Singh R. Detection and classification of lung diseases for pneumonia and Covid-19 using machine and deep learning techniques. Journal of
Ambient Intelligence and Humanized Computing. 2023;14(4):3239–3259. Available from: https://doi.org/10.1007/s12652-021-03464-7.
6) Ozturk T, Talo M, Yildirim EA, Baloglu UB, Yildirim OA, Acharya UR. Automated detection of COVID-19 cases using deep neural networks with X-ray
images. Computers in Biology and Medicine. 2020;121:1–11. Available from: https://doi.org/10.1016/j.compbiomed.2020.103792.
7) Sultana S, Pramanik A, Rahman MS. Lung Disease Classification Using Deep Learning Models from Chest X-ray Images. In: 2023 3rd International
Conference on Intelligent Communication and Computational Techniques (ICCT). IEEE. 2023;p. 1–7. Available from:
https://doi.org/10.1109/ICCT56969.2023.10075968.
8) Hussain E, Hasan M, Rahman MA, Lee I, Tamanna T, Parvez MZ. CoroDet: A deep learning based classification for COVID-19 detection using chest
X-ray images. Chaos, Solitons & Fractals. 2021;142:1–12. Available from: https://doi.org/10.1016/j.chaos.2020.110495.
9) Ucar F, Korkmaz D. COVIDiagnosis-Net: Deep Bayes-SqueezeNet based diagnosis of the coronavirus disease 2019 (COVID-19) from X-ray images.
Medical Hypotheses. 2020;140:1–12. Available from: https://doi.org/10.1016/j.mehy.2020.109761.
10) Nahiduzzaman M, Islam MR, Hassan R. ChestX-Ray6: Prediction of multiple diseases including COVID-19 from chest X-ray images using convolutional
neural network. Expert Systems with Applications. 2023;211:1–14. Available from: https://doi.org/10.1016/j.eswa.2022.118576.
11) Alshmrani GMM, Ni Q, Jiang R, Pervaiz H, Elshennawy NM. A deep learning architecture for multi-class lung diseases classification using chest X-ray
(CXR) images. Alexandria Engineering Journal. 2023;64:923–935. Available from: https://doi.org/10.1016/j.aej.2022.10.053.
12) Al-Sheikh MH, Dandan OA, Al-Shamayleh AS, Jalab HA, Ibrahim RW. Multi-class deep learning architecture for classifying lung diseases from chest
X-Ray and CT images. Scientific Reports. 2023;13(1):1–14. Available from: https://doi.org/10.1038/s41598-023-46147-3.
13) Kaya M, Eris M. D3SENet: A hybrid deep feature extraction network for Covid-19 classification using chest X-ray images. Biomedical Signal Processing
and Control. 2023;82:1–10. Available from: https://doi.org/10.1016/j.bspc.2022.104559.
14) Farhan AMQ, Yang S. Automatic lung disease classification from the chest X-ray images using hybrid deep learning algorithm. Multimedia Tools and
Applications. 2023;82(25):38561–38587. Available from: https://doi.org/10.1007/s11042-023-15047-z.
15) COVID-19 Radiography Database. Available from: https://www.kaggle.com/datasets/tawsifurrahman/covid19-radiography-database.
16) Asuntha A, Srinivasan A. Deep learning for lung Cancer detection and classification. Multimedia Tools and Applications. 2020;79:7731–7762. Available
from: https://doi.org/10.1007/s11042-019-08394-3.
17) Bokade AN, Shah A. Classification of Mammography Images Using Deep-CNN based Feature Ensemble Approach and its Implementation on a Low-Cost
Raspberry Pi. International Journal of Computing and Digital Systems. 2023;13(1):1451–1463. Available from: http://dx.doi.org/10.12785/ijcds/1301117.
18) Patel M, Shah M. Transfer learning with fine-tuned deep CNN model for COVID-19 diagnosis from chest X-ray images. International Journal of Advanced
Technology and Engineering Exploration. 2023;10(103):720–740. Available from: https://doi.org/10.19101/IJATEE.2022.10100044.
19) Khan AI, Shah JL, Bhat MM. CoroNet: A deep neural network for detection and diagnosis of COVID-19 from chest x-ray images. Computer Methods
and Programs in Biomedicine. 2020;196:1–9. Available from: https://doi.org/10.1016/j.cmpb.2020.105581.
20) Abbas A, Abdelsamea MM, Gaber MM. Classification of COVID-19 in chest X-ray images using DeTraC deep convolutional neural network. Applied
Intelligence. 2021;51(2):854–864. Available from: https://doi.org/10.1007/s10489-020-01829-7.
21) Kim S, Rim B, Choi S, Lee A, Min S, Hong M. Deep Learning in Multi-Class Lung Diseases’ Classification on Chest X-ray Images. Diagnostics. 2022;12(4).
Available from: https://doi.org/10.3390/diagnostics12040915.
22) Velu S. An efficient, lightweight MobileNetV2-based fine-tuned model for COVID-19 detection using chest X-ray images. Mathematical Biosciences and
Engineering. 2023;20(5):8400–8427. Available from: https://doi.org/10.3934/mbe.2023368.
23) Asnaoui KE. Design ensemble deep learning model for pneumonia disease classification. International Journal of Multimedia Information Retrieval.
2021;10(1):55–68. Available from: https://doi.org/10.1007/s13735-021-00204-7.