
Journal of Biomolecular Structure and Dynamics
Journal homepage: www.tandfonline.com/journals/tbsd20

To cite this article: Meena L. C & Joe Prathap P. M (27 Nov 2024): An optimal deep learning approach for breast cancer detection and classification with pre-trained CNN-based feature learning mechanism, Journal of Biomolecular Structure and Dynamics, DOI: 10.1080/07391102.2024.2430454

To link to this article: https://doi.org/10.1080/07391102.2024.2430454

Published online: 27 Nov 2024.


An optimal deep learning approach for breast cancer detection and classification with pre-trained CNN-based feature learning mechanism

Meena L. C and Joe Prathap P. M
Department of Computer Science and Engineering, R.M.D. Engineering College, Tiruvallur, India

ABSTRACT
Breast cancer (BC) is the most dominant kind of cancer, which grows continuously and serves as the second highest cause of death for women worldwide. Early BC prediction helps decrease the BC mortality rate and improve treatment plans. Ultrasound is a popular and widely used imaging technique to detect BC at an earlier stage. Segmenting and classifying the tumors from ultrasound images is difficult. This paper proposes an optimal deep learning (DL)-based BC detection system with effective pre-trained transfer learning models-based segmentation and feature learning mechanisms. The proposed system comprises five phases: preprocessing, segmentation, feature learning, selection, and classification. Initially, the ultrasound images are collected from the breast ultrasound images (BUSI) dataset, and the preprocessing operations, such as noise removal using the Wiener filter and contrast enhancement using histogram equalization, are performed on the collected data to improve the dataset quality. Then, the segmentation of cancer-affected regions from the preprocessed data is done using a dilated convolution-based U-shaped network (DCUNet). The features are extracted or learned from the segmented images using a spatial and channel attention-included densely connected convolutional network-121 (SCADN-121). Afterwards, the system applies an enhanced cuckoo search optimization (ECSO) algorithm to select the features from the extracted feature set optimally. Finally, the ECSO-tuned long short-term memory (ECSO-LSTM) is utilized to classify BC into '3' classes, namely normal, benign, and malignant. The experimental outcomes proved that the proposed system attains 99.86% accuracy for BC classification, which is superior to the existing state-of-the-art methods.

ARTICLE HISTORY
Received 10 November 2023; Accepted 16 April 2024

KEYWORDS
Breast cancer detection; breast cancer segmentation; deep learning; DenseNet; feature selection; long short-term memory

1. Introduction

An uncontrolled development of cells called cancer can spread to any organ quickly and kills 90% of cancer patients (Nassif et al., 2022). Among all types of cancer, breast, skin, and lung cancers are considered the most significant (Botlagunta et al., 2023). BC is one of the most common types of cancer in women compared to others, and its prevalence is increasing equally in developing and industrial countries (Khan et al., 2022; Zewdie et al., 2021). The type of BC is identified based on the cells' ability to grow in the breast because the tumor develops in several areas of the breast. Connective tissue, ducts, and lobes are the three major structural elements of the breast (Saber et al., 2021). The World Health Organization forecasted that BC was the most dominant sort of cancer worldwide, and in the previous 5 years, 7.8 million cases of BC were identified (Kumar et al., 2022). Based on the properties of the breast cells, BC is categorized as malignant or benign (Masud et al., 2021). It is essential to detect BC earlier because that lowers mortality rates and treatment costs and improves the quality of the patients' lives (Michael et al., 2021). The healthcare industry widely adopts several non-invasive detection methods, i.e. image modalities such as X-rays, magnetic resonance imaging, ultrasound imaging, and others, to identify BC (Das et al., 2021). Compared to other modalities, ultrasound is a non-radiation, low-cost, and real-time imaging model for BC detection (Liu et al., 2022). However, healthcare professionals face difficulty accurately identifying the tumor parts in ultrasound due to the poor illumination of ultrasound photographs and the variety of tumor forms. As a result, much attention has been given to computer-aided diagnosis (CAD) systems to automatically identify the tumor regions from breast ultrasound images (BUI) by applying robust artificial intelligence-based prediction approaches (Tong et al., 2021).

Radiotherapists strongly advise the artificial intelligence-based CAD system to improve clinical procedures (Madani et al., 2022; Ragab et al., 2022). Machine learning (ML) is a kind of CAD system that has attracted researchers in recent years for prediction-related tasks (Priyanka, 2021); it imitates intelligent human behaviour to learn and classify tumor cells into benign or malignant. Some commonly applied ML algorithms are support vector machines, naive Bayes, decision trees, k-nearest neighbours, random forests, and gradient boosting (Sharma & Mishra, 2022).


However, these models also suffer from the following drawbacks: they require manual feature engineering with human visualization, they struggle when the dataset size is large, they are prone to underfitting and overfitting, and they demand considerable computational resources. So, DL, which is a subset of ML, is preferred because it improves the model performance by integrating the following advantages: automated feature learning, dealing with complex and large data, generalization, scalability, and enhanced performance (Chugh et al., 2021). Also, the models generate higher classification outcomes when trained on large amounts of highly annotated data (Çayır et al., 2022; Lotter et al., 2021). The widely used DL models in prediction-related tasks include the convolutional neural network (CNN), recurrent neural network (RNN), deep neural network (DNN), long short-term memory (LSTM), etc.

Amongst all, CNN is more accurate when imaging modalities detect cancer. It automatically learns the hierarchical features from the images by applying different sizes of convolution kernels. The convolved features are given to pooling layers to reduce the dimensions of the extracted feature sets. Finally, the reduced feature sets are inputted into a fully connected layer for classification. However, CNN requires lots of data to achieve target accuracy, which leads to a higher training time for classifying the tumors. Transfer learning, on the other hand, improves classification performance by only requiring a minimal amount of data to attain target accuracy. The models are trained on large datasets (such as ImageNet) containing millions of images. Instead of training a CNN from scratch for a specific dataset, the pre-trained weights can reduce the required amount of labelled data for training and the training time for the specific task. Also, the pre-trained CNN models learn deep hierarchical features from diverse datasets. The knowledge is transferred to the specific task by tuning these models on the specific dataset, which results in better generalization performance, mainly when limited data is utilized. Some commonly used pre-trained CNN models are UNet, DenseNet, ResNet, AlexNet, GoogleNet, etc. In our proposed system, we are using two pre-trained models, UNet and DenseNet, for two different purposes: segmentation and feature learning. Like feature learning, segmentation of breast lesions is essential to achieve remarkable performance in tumor detection (Vigil et al., 2022). The improved version of UNet-based image segmentation avoids the need for handcrafted features or intermediate processing steps used in traditional segmentation models, making the model less prone to errors and more efficient. Also, the UNet supports pixel-level segmentation that results in precise and detailed segmentation, whereas the conventional segmentation techniques perform region- or boundary-based segmentation (Rezaei, 2021).

Also, our study chose an improved version of DenseNet to perform feature learning, which learns high-level and abstract features from the segmented cancer lesions compared to shallower connections. Also, the densely connected layers of the model encourage feature reuse and propagation, resulting in an overfitting reduction, particularly when the dataset has limited training data. So, in our study, we are using these two variants of pre-trained mechanisms for segmentation and feature learning. After feature learning, the learned features are given to the classifier to detect the cancer level of the patients. In our study, we chose the optimal LSTM network for classification, which allows the system to learn temporal dependencies or sequential information from the input data because the traditional classification systems fail to learn sequential information from the input data. The optimal version of LSTM helps improve classification performance for BC detection by minimizing the loss function. Additionally, our model uses an optimal feature selection system to select essential features from the extracted feature set, which diminishes the network training time and improves the prediction performance of the classifier by eliminating irrelevant features for classification. The significant contributions of the current research work are listed as follows:

• The system uses DCUNet to segment the cancer-affected regions from the BUI. Combined with the advantages of DC, more image feature information can be extracted, and segmentation accuracy can be improved.
• We are using the SCADN-121 to perform feature learning that helps to attain higher classification performance.
• The proposed system uses the ECSO algorithm to obtain the most optimal features, which helps the classifier make more accurate predictions and removes irrelevant data from the extracted feature sets.
• The ECSO-LSTM plays a pivotal role in our system, classifying the image into three categories: benign, malignant, and normal. The network parameters, including weights and bias, are finely tuned using the ECSO algorithm, ensuring optimal prediction outcomes. The utilization of comprehensive multi-phase techniques, such as preprocessing, segmentation, feature learning, selection, and classification, helps to attain robust performance in BC classification.

The rest of the manuscript is organized as follows: Section 2 surveys recent works regarding BC segmentation and classification. Section 3 presents a detailed explanation of the proposed system. Section 4 compares the outcomes of the proposed and existing works and discusses the proposed system's superiority over existing works. Finally, in Section 5, the conclusion of the proposed system is given, along with future research challenges.

2. Literature survey

This section surveys the recent methodologies of various authors for BC segmentation and classification using several machine and DL frameworks. It addresses the limitations of the surveyed models and discusses the solutions offered by the proposed system to overcome those challenges.

Ali et al. (2023) presented a CNN-based BC classification system with the help of a meta learner. The system collected the data from the BUSI dataset and then performed preprocessing of the collected data samples by carrying out the noise removal and contrast enhancement processes that improved the quality of the data for classification. The preprocessed data was given to the CNN and meta learner for detecting and classifying the BCs as benign, malignant and normal. The model achieved 90% accuracy, which was better than previous schemes.

Balaha et al. (2022) recommended a BC detection framework using hybrid DL and a genetic algorithm. The system collected data from BUSI and then CNN was utilized for learning the features from the collected data, and the parameters of the CNN were optimized using the genetic algorithm. Finally, the BC was classified using the transfer learning model, and the system attained a maximum area under the curve (AUC) of 0.89 and an accuracy of 89.52% for the tested datasets.

Pathan et al. (2022) suggested a multi-headed CNN for BC classification. Initially, the system used the WDBC dataset to collect the input breast images and preprocessing was performed on the collected data to invert and reshape the input images. The preprocessed data was given to the CNN for final classification. The method attained an accuracy of 78.97% when using the raw images directly and 81.02% when using the masked images of the collected dataset. Arooj et al. (2022) presented a transfer learning model called AlexNet for BC detection and classification. The trained model was stored in the cloud if the learning conditions were met; otherwise, it was retrained. The experiments were carried out on two different datasets (one contains ultrasound and the other contains histopathology images), and the system achieved 99% as the maximum accuracy in BC classification.

Podda et al. (2022) developed a segmentation and classification scheme for BUI using a fully-automated DL approach. Initially, the segmentation of the tumor lesions from the collected ultrasound images was carried out using the benign-malignant and lesion-normal ensemble methods. Finally, CNN was utilized to classify the segmented tumor lesions into normal, benign, and malignant cancers. The system attained 91% accuracy in classification and an 82% dice score in segmentation on the tested image dataset, which was competitive with the existing schemes. Cruz-Ramos et al. (2023) presented a DL model for BC detection and classification from mammogram and ultrasound images. Initially, tumor lesions were segmented from the mammography and ultrasound images using a manual segmentation procedure. Then deep and hand-crafted features were extracted from the segmented images using DenseNet and the breast imaging reporting and database system. The extracted features were fed into classifiers such as the multilayer perceptron, XGBoost, and AdaBoost for classification. The system attained a precision, recall, f-score, and accuracy of 98%, 98%, 98% and 97.6% for the screening mammography (DDSM) and BUSI datasets.

Ayana et al. (2022) introduced an ultrasound BC classification system using a transfer learning approach. Initially, transfer learning was applied to the cancer cell line microscopic images, which learned features similar to the ultrasound images to change the natural domain into a microscopic domain. Finally, CNN was utilized to perform the task of classification on the extracted feature sets. The system was tested on the MT-small-dataset obtained from the BUSI and achieved an accuracy of 98.7%, which was higher than the existing methods. Preetha and Jinny (2021) proffered a BC classification system using an adaptive neuro inference system (ANFIS). The data was collected from the Wisconsin-diagnosis-BC (WDBC) dataset, and noise removal of the collected images was done to improve the data quality. Then the feature extraction process was performed manually, and the principal-component-analysis and linear discriminant-analysis (PCA-LDA) model was utilized to perform the feature reduction. The reduced features were given to the ANFIS for cancer classification, and the system achieved 98.6% accuracy for the tested dataset.

Hirra et al. (2021) suggested a patch-based deep belief network (DBN) model for BC classification in histopathological images. The data was collected from the whole slide histopathology image dataset and then preprocessing was applied to the collected dataset for removing the noises and extra backgrounds in the input data. Next, the system performed a feature extraction process and the learned features were inputted to the backpropagation neural network for cancer detection and classification. The system attained a maximum of 86% accuracy, which was satisfactory compared to the previous approaches. Alhussan et al. (2023) presented a CNN by combining GoogleNet and dynamic dipper-throated optimization for BC classification. The system initially performed preprocessing on the BUSI dataset and then the features were extracted using the GoogleNet model. The network was fine-tuned using the dipper-throated optimization to get optimal feature extraction outcomes. The most relevant features were selected from the extracted feature sets using the probabilistic method. Finally, the system used CNN for tumor classification and it attained 98.1% accuracy for cancer classification.

2.1. Problem statement

The above-mentioned surveys provide satisfactory results, but they have some limitations.

• The traditional image segmentation approaches used for the classification of BC result in a high level of complexity, minimal robustness, and a lack of adaptability because of the use of handcrafted features or intermediate processing in their steps (Cruz-Ramos et al., 2023). Recently, pre-trained CNN models like UNet have been widely used in medical image segmentation tasks to produce more accurate segmentation outcomes than the conventional segmentation approaches regarding end-to-end learning capability, adaptability, and accuracy. Also, the pre-trained model performs pixel-level segmentation, which helps to attain detailed and accurate segmentation, whereas the traditional algorithms perform segmentation at the region or boundary level.
• Only a few researchers focus on the ML algorithm for BC classification (Cruz-Ramos et al., 2023; Preetha & Jinny, 2021). The ML system accurately categorizes BC, but it suffers from computational overhead because an enormous quantity of data is required for training. Additionally, it employs hand-engineered features, i.e. subject-matter specialists are required for feature engineering, leading to performance variability and a lack of consistency (Hirra et al., 2021; Preetha & Jinny, 2021).

• To overcome the limitations of ML, DL models (Alhussan et al., 2023; Ali et al., 2023; Ayana et al., 2022; Balaha et al., 2022; Hirra et al., 2021; Pathan et al., 2022; Podda et al., 2022) are proposed that offer several advantages over traditional ML models, such as automatic learning of hierarchical and complex features from the raw data, adaptability to several data types, scalability, flexibility, and state-of-the-art performance on complex and high-dimensional data. However, the performance of the DL models depends on the parameters (weight, bias, and learning rate) used for backpropagation training. If these parameters are not assigned properly, this leads to slower convergence, a lack of generalization, and lower performance. So, tuning the network parameters to attain optimal and improved performance in classification is essential.
• The existing BC classification methods primarily use CNN-based learning models for segmentation and classification, which provide satisfactory outcomes for imaging modalities. However, they require a large amount of training data to attain target accuracy, which increases the training time and degrades the network performance when the dataset is small. As a result, pre-trained CNN models are developed and widely used in image-related prediction tasks, which learn deep and complex features from the raw data without requiring lots of training data. However, improvement is still possible by including attention mechanisms to attain higher prediction performance in segmentation and classification.
• Also, none of the works focus on feature selection processes that help to avoid irrelevant and redundant features from the extracted feature set. Feature selection leads to improved classification accuracy and reduces the computation burden of the classifier for BC detection. The existing works either focus solely on classification using CNN or pre-trained models, or on segmentation using traditional or CNN variants. Developing practical approaches for both segmentation and classification is essential to performing BC detection accurately.

So, considering the existing drawbacks, this paper develops effective pre-trained CNN models and DL-based segmentation and classification systems for BC detection from ultrasound images. The system initially performs preprocessing steps to improve the dataset's quality and then applies an improved UNet model for performing semantic segmentation of the tumor lesions at the pixel level. Then, attention-based DenseNet is utilized as a feature learner that learns the profound and abstract-level features from the segmented data using its dense connections, leading to higher feature learning capability. The optimal features are selected from the extracted feature set using ECSO to diminish the network training time and avoid irrelevant features. Finally, optimal LSTM is used for classification, which takes advantage of combined feature learning with pre-trained CNN for contextual and temporal information gathering and leads to higher prediction performance. The summarization of the related works is given in Table 1.

3. Proposed methodology

The proposed system comprises '5' steps, say, data preprocessing, segmentation, feature learning, feature selection, and classification. Initially, the BUI are collected from the publicly available BUSI dataset and the preprocessing of the collected images is done by performing noise removal and contrast enhancement using the Wiener filter (WF) and histogram equalization (HE). Then the segmentation of the preprocessed image is done using the DCUNet model. Next, the features from the segmented lesions are extracted using the SCADN-121 network. The relevant features are optimally selected from the extracted feature sets using the ECSO. Finally, the ECSO-tuned LSTM is used for classification, in which the parameters of the LSTM are tuned using the ECSO (Figure 1).

3.1. Image preprocessing

The BUI is initially collected from the publicly available BUSI dataset. Image preprocessing improves the quality of the dataset, thereby boosting the prediction network's performance. In this proposed work, preprocessing involves two processes, noise removal and contrast enhancement, which are explained briefly in the subsections below. The quality and informativeness of the input data given to the DL model are increased when incorporating these preprocessing techniques into the breast cancer detection system. This results in improved detection performance by enabling more accurate feature extraction, enhancing the perceptibility of abnormalities, diminishing the impact of artefacts and noise, and improving the model's capability to generalize to new and unseen cases of BC. So, preprocessing significantly improves the input data for DL models, which ultimately causes more reliable and effective detection of BC from ultrasound images.

3.1.1. Noise removal

Image denoising is challenging in image preprocessing because several noises corrupt BUI. The proposed system uses a widely used image filtering method called WF due to its simplicity and speed. The weights of a set of ideal filters are estimated using WF, which diminishes the noise in the image using a system of linear equations. The mathematical formulation of WF is defined as follows:

\hat{A}_n'''(x, y) = \frac{\hat{B}^{*}(x, y)}{\left|\hat{B}(x, y)\right|^{2} + \hat{P}_a(x, y)/\hat{P}_b(x, y)}   (1)

where \hat{A}_n'''(x, y) indicates the denoised image, \hat{B}^{*}(x, y) refers to the complex conjugate of the input image \hat{B}(x, y), \hat{P}_a(x, y) signifies the noise's power spectrum, and \hat{P}_b(x, y) proffers the input image's power spectrum.

3.1.2. Contrast enhancement

The contrast enhancement is applied using HE after the noise removal process to enhance the contrast level of the data for further analysis. The method is simple and invertible.

Table 1. Summarization of related works.

Author name and year | Methodology used | Findings | Research gap
Ali et al. (2023) | CNN | Accuracy: 90% | Requires segmentation and an effective feature learning mechanism to produce more desirable results.
Balaha et al. (2022) | HMBDLGAHA | Accuracy: 89.52%; AUC: 0.89 | The system had higher computational complexity.
Pathan et al. (2022) | CNN | Accuracy: 78.97%; Precision: 81.02% | Achieved lower results than other models. Needs improvement in overall system design.
Arooj et al. (2022) | AlexNet | Accuracy: 99% | AlexNet was not deep enough to learn the features from the data compared to later models.
Podda et al. (2022) | CNN | Dice coefficient: 82%; Accuracy: 91% | Absence of preprocessing. If unclear tissue structures or noise were present in the input image, it was hard to distinguish the tumor from healthy tissue.
Cruz-Ramos et al. (2023) | XGBoost, Multilayer Perceptron, and AdaBoost | Accuracy: 97.6%; precision: 98%; recall: 98%; f1-score: 98% | Causes overfitting in classification. The models require hyperparameter tuning to get optimized performance in classification.
Ayana et al. (2022) | CNN | Accuracy: 98.7% | A smaller number of data samples is used for training, so the results are likely to be biased by outliers.
Preetha and Jinny (2021) | PCA-LDA-ANFIS | Accuracy: 98.6% | Higher computational cost because of complex network design.
Hirra et al. (2021) | DBN | Accuracy: 86% | Needs more training data to provide higher detection results.
Alhussan et al. (2023) | CNN | Accuracy: 98.1% | Needs a feature selection module to make the prediction process more accurate.

Figure 1. Workflow of the proposed methodology.

It enhances the intensity level of the images by separating the dominant intensity levels from the images. Let us assume \hat{A}_n'''(x, y) and \ddot{H}'' are the denoised image and the normalized histogram of \hat{A}_n'''(x, y), where the intensity of the pixel values ranges from 0 to \hat{L} - 1. Here \hat{L} indicates the number of possible intensity values, which is frequently 256. The computation of \ddot{H}'' is mathematically given as follows:

\ddot{H}''_c = \frac{NP_{wi}}{TN_p}   (2)

where c = 0, 1, \ldots, \hat{L} - 1, NP_{wi} indicates the number of pixels with intensity c, and TN_p refers to the total number of pixels. Thus, the histogram-equalized image CE_{IM} is mathematically computed by

CE_{IM}(x, y) = \mathrm{floor}\left(\left(\hat{L} - 1\right) \sum_{c=0}^{\hat{A}_n'''(x, y)} \ddot{H}''_c\right)   (3)

where floor() rounds down to the nearest integer. The filtered and enhanced image quality improves the diagnosis accuracy by offering richer visualization of functional structures and abnormalities, diminishing noise and artefacts, and enhancing the performance of numerical analysis techniques. It also influences image segmentation by allowing more detailed boundary detection, improved feature extraction, minimal variability, and better incorporation with diagnostic procedures.
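As a concrete illustration, the two preprocessing operations can be combined as in the short Python sketch below. It is only a minimal sketch: it assumes grayscale uint8 BUSI images read from disk, uses scipy.signal.wiener and OpenCV's equalizeHist as stand-ins for Equations (1)-(3), and the window size and helper name preprocess_busi are illustrative choices rather than values taken from this work.

import cv2
import numpy as np
from scipy.signal import wiener

def preprocess_busi(image_path, wiener_window=5):
    img = cv2.imread(image_path, cv2.IMREAD_GRAYSCALE).astype(np.float64)
    denoised = wiener(img, mysize=wiener_window)           # Wiener filtering (Equation (1))
    denoised = np.nan_to_num(denoised)                     # guard against flat regions
    denoised = np.clip(denoised, 0, 255).astype(np.uint8)
    enhanced = cv2.equalizeHist(denoised)                  # histogram equalization (Equations (2)-(3))
    return enhanced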

3.2. Image segmentation

The segmentation of cancerous cells from medical images is a necessary process, as it isolates the objects from the background and partitions the image into non-overlapping regions. Our study develops DCUNet to segment breast tumor portions from the preprocessed data. The UNet is a two-dimensional CNN architecture consisting of three blocks: the encoder (downsampling), decoder (upsampling), and skip-connection. The encoder block diminishes the possibility of overfitting, expands the receptive field, and speeds up the computation by enhancing the model's robustness. The decoder performs the re-decoding of the abstract features to the original size of the image. Skip connections, conversely, improve segmentation accuracy by connecting every layer in a feed-forward manner. Each layer receives the feature maps of the previous layers. This information is passed to the subsequent layers to perform feature reuse, which avoids the problem of vanishing gradients by offering multiple paths for gradient flow in training. However, the max pooling layer used in the encoder block to reduce the spatial dimensions of the extracted feature sets after the convolution process sometimes leads to spatial information loss of the input data. This limitation of the UNet model is avoided with the help of skip connections that combine the encoder blocks' high-level features with the decoder block's low-level features. So, the network preserves the spatial information of the target masks. However, the UNet convolutions have a small receptive field, and encoder down-sampling may cause the pixels' correlation to deteriorate. Our study includes dilated convolution (DC) in UNet to obtain more expansive receptive fields without lowering the resolution. Thus, the system is named DCUNet. The structural design of the DCUNet is shown in Figure 2.

Figure 2. Structure diagram of the DCUNet model.

Step 1: Encoder
Initially, the preprocessed image is passed into the convolutional layer in the encoder part. The encoder is a convolutional block containing two 3 × 3 convolutional kernels, which move across the input image's receptive field to retrieve the image features. The kernel moves throughout the entire image iteratively to generate the feature maps. After that, the obtained feature maps from the convolution block are passed to the DC layer, which performs image sampling at an interval given by the dilation rate and incorporates the expansion coefficient into the ordinary convolution kernel. A filter k is applied according to Equation (4) for every location d on the output \hat{F}''_{DI} and each kernel position ks when DC is executed on a 2-dimensional feature map \hat{F}''_{CL}:

\hat{F}''_{DI}[d] = \sum_{ks} \hat{F}''_{CL}\left[d + \tilde{R}_{DR} \cdot ks\right] k[ks]   (4)

where the dilation rate \tilde{R}_{DR} is equal to the stride at which the input image is sampled. This process is similar to convolving the input \hat{F}''_{CL} with the up-sampled filters attained by inserting (\tilde{R}_{DR} - 1) zeroes between two consecutive filter values along each spatial dimension. Each DC is followed by a ReLU activation function \eta_{AF} that conveys non-linearity into the network for image generalization in training.

\eta_{AF}(PR_{IM}) = \max(0, PR_{IM})   (5)

where PR_{IM} refers to the preprocessed image. After the DC process, the network uses a 2 × 2 max pooling operation to reduce the feature maps' spatial dimensions (height and breadth). After each max pooling operation, the convolution layer's filters are doubled, starting from an initial filter count of 32. This process is repeated four times to extract the features from the imaging regions.

Step 2: Decoder
The decoder up-samples the feature maps with the help of a 2 × 2 transposed convolution and two 3 × 3 convolutional operations, which are repeated four times. The decoder uses skip connections to link the layers of the decoder blocks with the preceding outputs to prevent data loss from the preceding levels. The skip connections also improve results and model convergence. The final decoder output is passed through a 1 × 1 convolution to obtain the segmented lesions. To reduce the impact of interclass similarity while encoding breast images, the loss is computed using the dice coefficient, which measures the overlap between two masks using the equation below (Kumar et al., 2022).



DC_{Loss} = 1 - \frac{1 + 2 \times S_{PI} \times S_{AI}}{1 + S_{PI} + S_{AI}}   (6)

where S_{PI} indicates the predicted segmented image and S_{AI} refers to the actual image.
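A hedged Keras sketch of one DCUNet encoder stage and the dice loss of Equation (6) is given below. The filter counts, dilation rate and smoothing constant are assumptions for illustration, and the helper names dc_encoder_block and dice_loss are not part of the original work.

import tensorflow as tf
from tensorflow.keras import layers

def dc_encoder_block(x, filters, dilation_rate=2):
    # two 3x3 convolutions followed by a dilated 3x3 convolution (Equation (4))
    x = layers.Conv2D(filters, 3, padding="same", activation="relu")(x)
    x = layers.Conv2D(filters, 3, padding="same", activation="relu")(x)
    x = layers.Conv2D(filters, 3, padding="same",
                      dilation_rate=dilation_rate, activation="relu")(x)
    skip = x                                  # retained for the decoder skip connection
    x = layers.MaxPooling2D(2)(x)             # 2x2 max pooling halves height and width
    return x, skip

def dice_loss(y_true, y_pred, smooth=1.0):
    # smoothed dice loss in the form of Equation (6)
    intersection = tf.reduce_sum(y_true * y_pred)
    return 1.0 - (smooth + 2.0 * intersection) / (
        smooth + tf.reduce_sum(y_true) + tf.reduce_sum(y_pred))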
3.3. Feature extraction

The proposed system uses the SCADN-121 network to extract the segmented image's features effectively. DenseNet-121 is a popular pre-trained CNN model consisting of 4 dense blocks, three transition layers, and 121 layers (117 convolutional, 3 transition, and 1 fully connected). The convolutional layer applies varying sizes of convolution filters for feature extraction with batch normalization and ReLU activation. The dense block receives the feature maps from the preceding layers and forwards them to the subsequent layers for reusing the features, which permits the network to integrate and combine features learned at different network depths. Using the dense connections, the network solves the problem of gradient saturation, improves feature propagation, and significantly diminishes the number of network parameters. The network uses the transition layer between consecutive dense blocks to diminish the feature maps' dimensionality. It naturally comprises convolutional, pooling, and normalization layers. Using these components together increases the effectiveness of the network. However, it is still challenging to extract deep spatial information from the images, such as position, orientation, posture, and angular value, and the information from various regions of the image is ignored, resulting in a lower capability of feature learning. So, our study suggests a spatial and channel attention (SCA) module to address these problems by extracting temporal and spatial elements of the segmented images profoundly and richly. This SCA is included in the conventional DenseNet-121, which is termed SCADN-121. The structure of SCADN-121 is shown in Figure 3.

Step 1: Convolutional block
Initially, the segmented image is passed to the SCADN-121's convolutional block, which extracts the features from the segmented data using a set of convolutional kernels or filters with a set of weights. The system applies a ReLU activation function to all local weight values, increasing the nonlinearity. The convolution process can be expressed as follows:

\ddot{D}_m = \ddot{\varpi}_m \ast \eta_{AF}\left(\ddot{D}_{m-1}\right) + \ddot{\jmath}_m   (7)

where \ddot{D}_m indicates the m-th convolutional layer status, \eta_{AF} refers to the ReLU activation function computed using Equation (5), and \ddot{\varpi}_m and \ddot{\jmath}_m refer to the weights and bias from the (m - 1)-th to the m-th convolution layer, respectively.

Step 2: Dense block
Next, the features obtained from the convolutional block are passed to the dense block, improving information transmission between layers. It performs batch normalization (BN), ReLU activation, and 3 × 3 convolution. It is mathematically expressed as follows:

\ddot{G}_m = \ddot{E}\ddot{C}_m\left(\ddot{D}_0, \ddot{D}_1, \ddot{D}_2, \ldots, \ddot{D}_{m-1}\right)   (8)

where \ddot{G}_m indicates the dense block's feature maps, (\ddot{D}_0, \ddot{D}_1, \ddot{D}_2, \ldots, \ddot{D}_{m-1}) indicates the feature maps from the convolutional layers, and \ddot{E}\ddot{C}_m refers to the composite function of the three repeated operations BN, ReLU, and convolution on the input of the m-th layer.

Step 3: Transitional block
Next, the feature maps from the dense block are fed into the transitional block. The transitional block changes the feature maps' size between dense blocks. It comprises BN, a ReLU, a 1 × 1 convolution and a 2 × 2 average pooling layer. Herein, the feature maps are first passed to the convolution and the output from the convolution is given to the SCA to extract complex spatial and temporal information from the data more efficiently. In this SCA mechanism, initially, the convoluted features \bar{q} are passed to the spatial attention to extract spatial features. Spatial attention is the opposite of channel attention; it generates the inter-spatial interaction of features and focuses on where there is an informative element. It is mathematically expressed as follows:

\ddot{S}\ddot{A}_{HW} = \ddot{A}\ddot{P}\left(\bar{q}_{HW}\right) = \frac{1}{cd_s} \sum_{n=1}^{cd_s} \bar{q}_{HW}(n)   (9)

where \bar{q}_{HW} refers to the input feature maps from the convolution at the spatial position (H, W), \ddot{A}\ddot{P} refers to the global average pooling over the channel dimension, and cd_s denotes the channel dimension. Then, the convoluted features are passed to the channel attention to extract the temporal features. Channel attention focuses on the significant parts of the input data since every channel of a feature map is thought of as a feature detector. It generates the global distributions of the channel features \ddot{C}\ddot{A}_f and it is mathematically defined as follows:

\ddot{C}\ddot{A}_f = \ddot{A}\ddot{P}\left(\bar{q}_f\right) = \frac{1}{H \times W} \sum_{u=1}^{H} \sum_{v=1}^{W} \bar{q}_f(u, v)   (10)

where \bar{q}_f indicates the local feature of the channel f and \ddot{A}\ddot{P} refers to the global average pooling over the spatial dimensions. Finally, the combined feature maps are attained by concatenating the spatial attention and channel attention blocks' output features, and it is mathematically expressed as follows:

d_{OF} = \ddot{S}\ddot{A} \oplus \ddot{C}\ddot{A}   (11)

Following that, the pooling in the transitional layer receives the output features of the SCA block, which uses a stride of 2 to reduce the dimensionality of each feature map. The features from the pooling are given to feature selection to select relevant features from them. The layer-by-layer settings of the proposed SCADN-121 are given in Table 2.

Figure 3. Architecture of the proposed SCADN-121 network.
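The SCA computation of Equations (9)-(11) can be sketched as follows. This is an illustrative Keras-style interpretation in which the two attention maps rescale the convolved features; the fusion by element-wise scaling, the 7 × 7 spatial-attention kernel and the function name sca_block are assumptions rather than settings reported by the paper.

import tensorflow as tf
from tensorflow.keras import layers

def sca_block(x):
    channels = x.shape[-1]
    # channel attention: global average pooling over the spatial dimensions (Equation (10))
    ca = layers.GlobalAveragePooling2D(keepdims=True)(x)      # shape (batch, 1, 1, C)
    ca = layers.Dense(channels, activation="sigmoid")(ca)
    # spatial attention: average over the channel dimension (Equation (9))
    sa = tf.reduce_mean(x, axis=-1, keepdims=True)            # shape (batch, H, W, 1)
    sa = layers.Conv2D(1, 7, padding="same", activation="sigmoid")(sa)
    # fuse both attention maps with the convolved features (Equation (11))
    return x * ca * sa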

Table 2. Parameter settings of the SCADN-121 network.

Layers | Output size | Settings
Convolution | 112 × 112 | 3 × 3 conv, stride 2
Pooling | 56 × 56 | 3 × 3 max pool, stride 2
Dense Block 1 | 56 × 56 | [1 × 1 conv, 3 × 3 conv] × 6
Transition layer 1 with SCA | 56 × 56 | 1 × 1 conv, 3 × 3 conv, 3 × 3 conv, 3 × 3 conv, 1 × 1 conv, 2 × 2 average pool, stride 2
Dense Block 2 | 28 × 28 | [1 × 1 conv, 3 × 3 conv] × 12
Transition layer 2 with SCA | 28 × 28 | 1 × 1 conv, 3 × 3 conv, 3 × 3 conv, 3 × 3 conv, 1 × 1 conv, 2 × 2 average pool, stride 2
Dense Block 3 | 14 × 14 | [1 × 1 conv, 3 × 3 conv] × 24
Transition layer 3 with SCA | 14 × 14 | 1 × 1 conv, 3 × 3 conv, 3 × 3 conv, 3 × 3 conv, 1 × 1 conv, 2 × 2 average pool, stride 2
Dense Block 4 | 7 × 7 | [1 × 1 conv, 3 × 3 conv] × 32
Prediction Layer | 15 × 1 | ReLU activation, global average pooling 2D, SoftMax layer

3.4. Feature selection

The system uses ECSO to choose the optimal set of features from the extracted feature maps, avoiding overfitting in BC classification by eliminating the irrelevant features extracted from SCADN-121. The CSO is a novel population-based metaheuristic search method that mimics cuckoo birds' reproductive behaviour and uses random search, evaluation, and replacement iteratively to enhance solutions to optimization problems. The algorithm provides numerous benefits over other optimization systems, which makes it a valuable model for parameter optimization in LSTM. One of the significant benefits of using CSO is its simple implementation process compared to complex optimization techniques. Also, the method requires fewer resources to fine-tune the network parameters, reducing computation overhead. The following are the central principles of CSO for performing optimization.

• Each cuckoo only lays one egg at a time, and the egg is dropped into another bird's nest.
• The nest which has high-quality eggs is utilized for the next generation.
• The number of available host nests is fixed, and there is a probability that the host bird will find a cuckoo egg.

Apart from its advantages, the model suffers from premature convergence problems and local optimal solutions. So, our study uses a tent chaotic map (TCM)-based initialization strategy to improve the diversity of the algorithm. The small changes in the initial conditions of the algorithm using TCM lead to meaningfully different outcomes, which is helpful for performing exploration in CSO because it aids in covering a widespread range of the solution space. In addition, boundary mutation (BOM) is included in CSO to prevent the algorithm from falling into local optimal solutions by expanding solutions, promoting exploration of the solution space, preventing early convergence, and balancing exploration and exploitation. This eventually increases the probability of discovering better solutions to optimization problems. So, the TCM and BOM improvisations in conventional CSO are termed ECSO. The algorithmic procedure for feature selection is explained as follows:

Step 1: Population initialization
Initially, the generation of the population is done using TCM for n hosts, which enhances population diversity and decreases the impact of the initial population distribution. It is mathematically expressed as follows:

\dddot{Z}_{l+1} = \begin{cases} 2\dddot{Z}_l, & 0 \le \dddot{Z}_l \le 0.5 \\ 2\left(1 - \dddot{Z}_l\right), & 0.5 < \dddot{Z}_l \le 1 \end{cases}   (12)

where \dddot{Z}_{l+1} refers to the initial population of the individual using TCM at iteration l.

Step 2: Fitness calculation
Compute each individual's fitness in the population by considering the classifier's accuracy, computed using Equation (13). The individual attaining higher accuracy (i.e. minimal loss) is the best in the current iteration.

Fitness = \max\left(Y''_{acc}\right)   (13)

where Y''_{acc} indicates the accuracy, which is computed by dividing the number of accurate forecasts by the total number of forecasts, and it is evaluated as:

Y''_{acc} = \frac{VR^{+} + VR^{-}}{T_{pn}}   (14)

where T_{pn} indicates the total count of samples, and VR^{+} and VR^{-} denote the true positive and true negative rates of the classifier, respectively.

Step 3: Position updating
The position updating of CSO is generally done using the levy flight, which results in individuals' poor searching ability due to the utilization of a unified boundary limitation to control boundary violations of constraints. To solve this problem, the proposed system uses BOM to update the position \dddot{Z}^{p}_{l+1} of the cuckoo p, which is expressed as

\dddot{Z}^{p}_{l+1} = \begin{cases} \dddot{Z}^{p}_l + \dddot{Z}_{\max, j} - v \cdot \mathrm{Rand}(0, 1), & \text{if } \dddot{Z}^{p}_l > \dddot{Z}_{\max, j} \\ \dddot{Z}^{p}_l + \dddot{Z}_{\min, j} - v \cdot \mathrm{Rand}(0, 1), & \text{if } \dddot{Z}^{p}_l < \dddot{Z}_{\min, j} \end{cases}   (15)

where \dddot{Z}_{\max, j} indicates the decision variable's upper bound with dimension j, \dddot{Z}_{\min, j} refers to the decision variable's lower bound with dimension j, and v indicates the control parameter. The above steps are continued until the optimum solutions are found (i.e. optimal features). The pseudocode of the ECSO is given in Figure 4.

Figure 4. Pseudocode of the ECSO algorithm.

Figure 5. Structure of the LSTM model.

Table 3. Dataset description.

Number of samples | Normal | Benign | Malignant | Total
Training | 92 | 311 | 135 | 538
Testing | 40 | 118 | 75 | 233
Total | 132 | 429 | 210 | 771
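A condensed NumPy sketch of the ECSO feature-selection loop is shown below. The tent-chaotic-map initialization follows Equation (12) and the boundary mutation loosely mirrors Equation (15), while the heavy-tailed step (standing in for the levy flight), the abandonment probability, the 0.5 selection threshold and the helper names are illustrative assumptions; fitness_fn is assumed to take a column subset of the feature matrix and return the classifier accuracy of Equation (13).

import numpy as np

def tent_map_population(n_hosts, dim, seed=0.37):
    z = np.empty((n_hosts, dim))
    x = seed
    for i in range(n_hosts):
        for j in range(dim):
            x = 2 * x if x <= 0.5 else 2 * (1 - x)   # tent chaotic map, Eq. (12)
            if x < 1e-12:                            # avoid the numerical fixed point at 0
                x = np.random.rand()
            z[i, j] = x
    return z

def boundary_mutation(pos, lower, upper, v=0.5):
    rnd = v * np.random.rand(*pos.shape)
    pos = np.where(pos > upper, pos + upper - rnd, pos)   # upper-bound case of Eq. (15)
    pos = np.where(pos < lower, pos + lower - rnd, pos)   # lower-bound case of Eq. (15)
    return np.clip(pos, lower, upper)

def ecso_select(features, fitness_fn, n_hosts=20, iters=50, pa=0.25):
    dim = features.shape[1]
    nests = tent_map_population(n_hosts, dim)             # positions in [0, 1]

    def score(nest):
        mask = nest > 0.5                                  # > 0.5 means "keep this feature"
        return fitness_fn(features[:, mask]) if mask.any() else 0.0

    best = nests[0].copy()
    for _ in range(iters):
        step = 0.01 * np.random.standard_cauchy(nests.shape)       # heavy-tailed flight step
        nests = boundary_mutation(nests + step * (nests - best), 0.0, 1.0)
        abandon = np.random.rand(n_hosts) < pa                      # a fraction pa of nests is re-seeded
        nests[abandon] = np.random.rand(abandon.sum(), dim)
        fitness = np.array([score(n) for n in nests])               # Eq. (13): classifier accuracy
        best = nests[np.argmax(fitness)].copy()
    return best > 0.5                                               # boolean mask of selected features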

Figure 6. Sample images from the dataset.

3.5. Classification

Finally, the optimally selected features are fed into the ECSO-LSTM to classify the cancers into three class levels: normal, benign, and malignant. The feature maps from the pre-trained CNN model are changed into a sequential format by considering each feature map as a time step in a sequence that represents the temporal evolution of features across the ultrasound images. LSTM processes this sequence and learns the temporal dependencies and patterns within the feature maps. After processing all feature maps, the network outputs a sequence of hidden states or the final hidden state, which encodes the temporal information of the feature maps. The network finally includes a fully connected layer to perform the final classification task to obtain the benign, malignant and normal tumor classes. However, initializing the network's parameters (weights and bias) is crucial since randomly initialized parameters affect the classifier's performance and speed of convergence during network training. The random initialization of network parameters also lengthens the training time. So, our study uses the ECSO algorithm to optimally select the parameters (weight and bias) of the LSTM model, which improves the network performance by minimizing the classification loss. Thus, the optimal parameter selection in the classical LSTM using ECSO is named ECSO-LSTM. The structure of the LSTM model is shown in Figure 5.

The LSTM includes four gates: input, memory, forget, and output gates. Let us assume \breve{C}G_k is the LSTM's memory unit, which memorizes information of all sequences up to time k, \widetilde{CG}_k refers to the update value of the memory cell, OF^{s}_k signifies the input selected feature set at time k, and \hat{H}^{0}_k indicates the hidden layer output. The input gate \breve{I}G_k regulates the stored information, the output gate \breve{O}G_k regulates the output information, and the forget gate \breve{F}G_k regulates the forgetting of previous information. The mathematical formulations of the LSTM information flow are given by:

\breve{F}G_k = \eta_{AF}\left(\breve{W}_{FG} \cdot \left[\hat{H}^{0}_{k-1}, OF^{s}_k\right] + B''_{FG}\right)   (16)

\breve{I}G_k = \eta_{AF}\left(\breve{W}_{IG} \cdot \left[\hat{H}^{0}_{k-1}, OF^{s}_k\right] + B''_{IG}\right)   (17)

\widetilde{CG}_k = \forall_{AF}\left(\breve{W}_{CG} \cdot \left[\hat{H}^{0}_{k-1}, OF^{s}_k\right] + B''_{CG}\right)   (18)

\breve{C}G_k = \breve{F}G_k \cdot \breve{C}G_{k-1} + \widetilde{CG}_k \cdot \breve{I}G_k   (19)

\breve{O}G_k = \eta_{AF}\left(\breve{W}_{OG} \cdot \left[\hat{H}^{0}_{k-1}, OF^{s}_k\right] + B''_{OG}\right)   (20)

\hat{H}^{0}_k = \breve{O}G_k \cdot \forall_{AF}\left(\breve{C}G_k\right)   (21)

where \breve{W}_{FG}, \breve{W}_{IG}, \breve{W}_{OG}, \breve{W}_{CG} and B''_{FG}, B''_{IG}, B''_{OG}, B''_{CG} indicate the optimal weights and biases for the input matrices of the forget gate, input gate, output gate, and cell state, respectively, selected by the ECSO algorithm, and \eta_{AF} and \forall_{AF} refer to the sigmoid and tanh activation functions, in which the sigmoid controls the flow of information via the gates and tanh regulates the input modulation and output generation.

Finally, the weighted focal loss (Equation (22)) is used to find the prediction loss, which solves the class imbalance problem of the BUSI dataset by combining the ideas of class weighting and focal loss. Using this function, higher weights are assigned to minority classes and the focal loss is applied to emphasize hard-to-classify examples within each class, which solves class imbalance. It makes the network learn better representations for all classes, causing enhanced performance on imbalanced datasets.

W_{FL} = -\breve{W}\, PC^{k}_{class}\left(1 - AC_{class}\right)\log\left(1 - PC_{class}\right) - \breve{B}\left(1 - PC_{class}\right)^{k} AC_{class}\log PC_{class}   (22)

where PC_{class} refers to the predicted class, AC_{class} indicates the actual class, and k signifies the focusing parameter.
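A hedged Keras sketch of the classification stage is given below: the selected feature maps are treated as a sequence, passed to an LSTM, and trained with a class-weighted focal loss in the spirit of Equation (22). The unit count, class weights and focusing parameter are placeholders, since the actual values are tuned by ECSO in the proposed system, and the helper names are illustrative.

import tensorflow as tf
from tensorflow.keras import layers, models

def weighted_focal_loss(class_weights, gamma=2.0):
    w = tf.constant(class_weights, dtype=tf.float32)
    def loss(y_true, y_pred):
        y_pred = tf.clip_by_value(y_pred, 1e-7, 1.0 - 1e-7)
        cross_entropy = -y_true * tf.math.log(y_pred)             # per-class cross entropy
        focal = w * tf.pow(1.0 - y_pred, gamma) * cross_entropy   # down-weights easy examples
        return tf.reduce_sum(focal, axis=-1)
    return loss

def build_ecso_lstm(seq_len, feat_dim, n_classes=3):
    inp = layers.Input(shape=(seq_len, feat_dim))     # selected feature maps as a sequence
    x = layers.LSTM(128)(inp)                          # units would be tuned by ECSO in the paper
    out = layers.Dense(n_classes, activation="softmax")(x)
    model = models.Model(inp, out)
    model.compile(optimizer="adam",
                  loss=weighted_focal_loss([1.5, 1.0, 1.2]),      # illustrative class weights
                  metrics=["accuracy"])
    return model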

Figure 7. Results of the proposed system.

4. Results and discussion

This section discusses and compares the outcomes of the proposed and existing models for BC segmentation and classification when tested on the BUSI dataset. The suggested work is implemented in the Python programming language with a 64-bit Windows 10 OS and an Intel (R) Xeon (R) Silver 4210 CPU @ 2.20 GHz (2 processors) with 128 GB RAM. The proposed system uses the BUSI dataset to train and test the efficacy of the proposed work, which is publicly available and accessed via https://www.kaggle.com/datasets/aryashah2k/breast-ultrasound-images-dataset. The breast ultrasound images were obtained from 600 female patients aged between 25 and 75, and the data was collected in 2018. The description of the dataset with the training and testing ratio is given in Table 3. The sample dataset images are shown in Figure 6, and the output attained by the proposed system in preprocessing and segmentation for both benign (a) and malignant cancers (b) is shown in Figure 7.

4.1. Performance analysis

Here, the outcomes of the proposed ECSO-LSTM are investigated against the existing classifiers such as LSTM, deep neural network (DNN), recurrent neural network (RNN), and random forest (RF) approaches regarding the classification metrics, namely accuracy, precision, recall, f-measure, false positive rate (FPR), false negative rate (FNR), area under the curve (AUC), receiver operating characteristic curve (ROC) and classification time. These metrics are computed based on the true positive, true negative, false positive and false negative rates of the classifier obtained for predicting BC from the dataset images. The time complexity and convergence performance of our proposed system are also discussed at the end. The confusion matrix of the proposed system for the three class labels in the dataset is given in Figure 8, which shows that the model effectively predicts cancers with a higher prediction rate.

Figure 8. Confusion matrix.

Table 4 shows the outcomes attained by the proposed and existing models for BC classification regarding accuracy, precision, recall, and f-measure. The existing LSTM offers 97.98% accuracy, 97.28% precision, 98.04% recall, and 97.79% f-measure; the existing DNN provides 95.57% accuracy, 95.17% precision, 95.67% recall, and 95.36% f-measure; the existing RNN proffers 93.78% accuracy, 93.26% precision, 93.82% recall, and 93.68% f-measure; and the existing RF provides 91.67% accuracy, 91.18% precision, 91.78% recall, and 91.58% f-measure. These existing LSTM, DNN, RNN, and RF models provide lower prediction outcomes than the proposed one because the proposed one achieves 99.86% accuracy, 99.57% precision, 99.96% recall, and 99.79% f-measure for BC classification. Thus, the overall outcomes show that the proposed method achieves superior performance compared to the existing methods. The diagrammatic representation of Table 4 is given in Figure 9.

Table 4. Analysis of the proposed and existing methods.

Techniques/Metrics (%) | Accuracy | Precision | Recall | F-measure
Proposed ECSO-LSTM | 99.86 | 99.57 | 99.96 | 99.79
LSTM | 97.98 | 97.28 | 98.04 | 97.79
DNN | 95.57 | 95.17 | 95.67 | 95.36
RNN | 93.78 | 93.26 | 93.82 | 93.68
RF | 91.67 | 91.18 | 91.78 | 91.58

Figure 9. Evaluation based on precision, recall, f-measure and accuracy.

Next, the outcomes of the proposed work and the existing methods in terms of FPR and FNR are shown in Figure 10. FPR is the percentage of the total predictions incorrectly labelled as positive despite being negative cases. The ratio of false negatives to the total of false negatives and true positives is known as FNR. Concerning the FPR metric, the existing LSTM, DNN, RNN, and RF have 0.085, 0.108, 0.642, and 0.903, which are higher than the proposed one. A system is considered better if it shows lower FPR and FNR values. Herein, the proposed system has a lower FPR value of 0.021. Likewise, concerning the FNR metric, the proposed one has an FNR of 0.065, which is lower than the existing methods because the existing methods have FNRs of 0.326, 0.261, 0.299, and 0.342. Thus, Figure 10 concludes that the proposed one performs better than the existing models in detecting and classifying breast tumors from BUI. Next, the outcomes are compared based on AUC and classification time, shown in Figure 11.
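For reference, the reported metrics can be derived from a confusion matrix as in the short sketch below; macro-averaging the per-class scores over the three classes is an assumption about how the values are aggregated, and the function name summarize is illustrative.

import numpy as np
from sklearn.metrics import confusion_matrix

def summarize(y_true, y_pred, labels=(0, 1, 2)):
    cm = confusion_matrix(y_true, y_pred, labels=labels)
    tp = np.diag(cm).astype(float)
    fp = cm.sum(axis=0) - tp
    fn = cm.sum(axis=1) - tp
    tn = cm.sum() - (tp + fp + fn)
    precision = np.mean(tp / (tp + fp))
    recall = np.mean(tp / (tp + fn))                 # recall is also the true positive rate
    return {
        "accuracy": tp.sum() / cm.sum(),
        "precision": precision,
        "recall": recall,
        "f_measure": 2 * precision * recall / (precision + recall),
        "fpr": np.mean(fp / (fp + tn)),
        "fnr": np.mean(fn / (fn + tp)),
    }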



Figure 10. FPR and FNR analysis.

Figure 11. AUC and classification time analysis.

A higher AUC shows that the system performs better at differentiating between the positive and negative classes. When the classifier receives an AUC score of 1, it can differentiate between the positive and the negative class points. The proposed one achieves a high AUC of 0.995, which is higher than the existing methods because the existing LSTM, DNN, RNN, and RF attained AUCs of 0.976, 0.943, 0.918, and 0.894, which are lower than the proposed method. Next, considering the classification time metric, the existing LSTM, DNN, RNN, and RF take 1.01, 1.95, 2.28, and 2.99 m to classify the classes as normal, benign, and malignant, which is higher than the proposed one, because the proposed one takes just 0.85 m to complete the classification process. Figure 12 shows the ROC analysis of the proposed method for 5-fold cross-validation, which shows that the method attains higher true positive rates in predicting BCs. Thus, the overall performance analysis shows that the proposed method achieves outstanding outcomes compared to the existing methods.

4.2. Comparative analysis

Here, the proposed work is compared with related works in the literature which used the BUSI dataset. The models are compared based on the accuracy they achieved for cancer classification, which is shown in Table 5.

The proposed one provides more promising results for BC classification than the existing frameworks. For example, the existing ML technique (Cruz-Ramos et al., 2023) proffers 97.6% accuracy, and the existing works (Ayana et al., 2022; Balaha et al., 2022; Podda et al., 2022) and (Alhussan et al., 2023) used CNN-based DL approaches that provide 89.52%, 91%, 98.7%, and 98.1% accuracy for cancer detection, which are lower when compared to the proposed method, because the proposed one achieves a maximum accuracy of 99.86%. Thus, the overall experimental results show that the proposed one achieves better outcomes than the existing methods. The reason is that the proposed one initially performs preprocessing on the collected images rather than using raw images directly for segmentation or classification. Next, the proposed system uses DCUNet for segmentation purposes, which accurately segments the lesions from the preprocessed data using DC and pixel-level segmentation maps.

Figure 12. ROC analysis.

Table 5. Comparative analysis using BUSI dataset.


Reference Techniques used Accuracy (%)
Proposed ECSO-LSTM 99.86
(Balaha et al., 2022) CNN 89.52
(Podda et al., 2022) CNN 91
(Cruz-Ramos et al., 2023) ML techniques 97.6
(Ayana et al., 2022) CNN 98.7
(Alhussan et al., 2023) CNN 98.1

Table 6. Time complexity analysis.


Phases Time complexity
Noise removal using the WF and contrast enhancement using HE O(N), where N represents the number of pixels in the images.
Segmentation of cancer affected regions using DCUNet O (N log N)
Feature extraction using SCADN-121 O(N^2)
Feature selection using ECSO O (G � P), where G is the number of generations and P is the population size
ECSO-LSTM based classification (O (M � T)), where M is the number of features and T is the sequence length.

Instead, it uses the SCADN-121 network to extract the features from the segmented image. Moreover, it uses the ECSO approach for feature selection, which results in higher accuracy, shorter training time, and less overfitting in classification. Finally, the optimal LSTM helps to produce optimal results for the classification of BCs by learning the long-term dependencies in the input data and modelling the complex sequential data in the images, which results in improved prediction performance. The time complexity analysis of the proposed system is shown in Table 6.

Like time complexity, convergence is an essential factor in analysing performance. The proposed system's convergence rate is influenced by the convergence behaviour of each process, including U-Net for segmentation, DenseNet for feature extraction, ECSO for feature selection, and the optimal LSTM for classification. The proposed method can achieve rapid and accurate detection of BC from BUSI images by guaranteeing effective convergence of the individual components and integrating them effectively. U-Net naturally converges quickly in training because of its symmetric architecture and skip connections. Likewise, DenseNet typically trains efficiently because of its dense connectivity, which enables feature reuse and propagation. The features selected using ECSO contribute to quicker convergence and enhanced classification performance when fed into the optimal LSTM. In classification, monitoring convergence through measures such as training loss, validation accuracy, and early stopping criteria helps guarantee that the model converges to an optimal solution without overfitting or underfitting. Integrating these practical components in all phases of the proposed system therefore ensures faster convergence of our model for BC detection.
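As a simplified illustration of the convergence monitoring described above, the sketch below trains a small LSTM classifier with Keras while tracking training loss and validation accuracy and applying an early-stopping criterion. The synthetic data, layer sizes, and patience value are assumptions for demonstration; they do not reproduce the paper's optimal LSTM configuration.

```python
# Sketch: monitoring training loss / validation accuracy with early stopping,
# so training halts once validation performance stops improving.
# Synthetic data and layer sizes are placeholders, not the paper's settings.
import numpy as np
import tensorflow as tf

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 10, 32)).astype("float32")   # (samples, timesteps, features)
y = rng.integers(0, 2, size=(500,)).astype("float32")  # benign / malignant labels

model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(10, 32)),
    tf.keras.layers.LSTM(64),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])

early_stop = tf.keras.callbacks.EarlyStopping(
    monitor="val_accuracy", patience=5, restore_best_weights=True)

history = model.fit(X, y, validation_split=0.2, epochs=100,
                    batch_size=32, callbacks=[early_stop], verbose=0)
print("epochs run:", len(history.history["loss"]),
      "best val accuracy:", max(history.history["val_accuracy"]))
```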

5. Conclusion

This paper proposes an optimal DL-based BC detection system with efficient segmentation and feature learning mechanisms. The system uses the BUSI dataset to train and test the proposed work, and the performance of the proposed ECSO-LSTM is compared against the existing LSTM, DNN, RNN, and RF approaches in terms of accuracy, precision, recall, f-measure, FPR, FNR, AUC, and classification time. Herein, the proposed system achieves 99.86% accuracy, 99.57% precision, 99.96% recall, 99.79% f-measure, 0.021 FPR, 0.065 FNR, 0.995 AUC, and a classification time of 0.85 m, which are better outcomes than those of the existing methods. The proposed work is also compared against the existing methods from the literature review, where it likewise achieves better outcomes. Our model is a comprehensive approach that combines improved versions of pre-trained models, namely DCUNet and SCADN-121, for segmentation and feature learning, ECSO for feature selection, and an optimal LSTM for classification, which learns spatial and temporal information from the input data effectively and leads to higher accuracy and utility in BC diagnosis tasks. Although our model offers substantial potential for enhancing BC detection accuracy, it also presents challenges associated with interpretability, complexity, and parameter sensitivity. These drawbacks will be addressed in future work by considering a resource-efficient approach for BC detection. This work will also be extended to other imaging modalities, such as MRI and mammography, using other advanced DL models such as transformers or graph-based neural networks, to further improve system performance.

Disclosure statement

No potential conflict of interest was reported by the author(s).

Funding

The author(s) reported there is no funding associated with the work featured in this article.

References

Alhussan, A. A., Eid, M. M., Towfek, S. K., & Khafaga, D. S. (2023). Breast cancer classification depends on the dynamic dipper throated optimization algorithm. Biomimetics, 8(2), 163. https://doi.org/10.3390/biomimetics8020163
Ali, M. D., Saleem, A., Elahi, H., Khan, M. A., Khan, M. I., Yaqoob, M. M., Farooq Khattak, U., & Al-Rasheed, A. (2023). Breast cancer classification through meta-learning ensemble technique using convolution neural networks. Diagnostics, 13(13), 2242. https://doi.org/10.3390/diagnostics13132242
Arooj, S., Atta-Ur-Rahman, Zubair, M., Khan, M. F., Alissa, K., Khan, M. A., & Mosavi, A. (2022). Breast cancer detection and classification empowered with transfer learning. Frontiers in Public Health, 10, 924432. https://doi.org/10.3389/fpubh.2022.924432
Ayana, G., Park, J., Jeong, J. W., & Choe, S. W. (2022). A novel multistage transfer learning for ultrasound breast cancer image classification. Diagnostics, 12(1), 135. https://doi.org/10.3390/diagnostics12010135
Balaha, H. M., Saif, M., Tamer, A., & Abdelhay, E. H. (2022). Hybrid deep learning and genetic algorithms approach (HMB-DLGAHA) for the early ultrasound diagnoses of breast cancer. Neural Computing and Applications, 34(11), 8671–8695. https://doi.org/10.1007/s00521-021-06851-5
Botlagunta, M., Botlagunta, M. D., Myneni, M. B., Lakshmi, D., Nayyar, A., Gullapalli, J. S., & Shah, M. A. (2023). Classification and diagnostic prediction of breast cancer metastasis on clinical data using machine learning algorithms. Scientific Reports, 13(1), 485. https://doi.org/10.1038/s41598-023-27548-w
Çayır, S., Solmaz, G., Kusetogullari, H., Tokat, F., Bozaba, E., Karakaya, S., Iheme, L. O., Tekin, E., Yazıcı, Ç., Ozsoy, G., Ayaltı, S., Kayhan, C. K., İnce, Ü., Uzel, B., & Kılıç, O. (2022). MITNET: A novel dataset and a two-stage deep learning approach for mitosis recognition in whole slide images of breast cancer tissue. Neural Computing and Applications, 34(20), 17837–17851. https://doi.org/10.1007/s00521-022-07441-9
Chugh, G., Kumar, S., & Singh, N. (2021). Survey on machine learning and deep learning applications in breast cancer diagnosis. Cognitive Computation, 13(6), 1451–1470. https://doi.org/10.1007/s12559-020-09813-6
Cruz-Ramos, C., García-Avila, O., Almaraz-Damian, J. A., Ponomaryov, V., Reyes-Reyes, R., & Sadovnychiy, S. (2023). Benign and malignant breast tumor classification in ultrasound and mammography images via fusion of deep learning and handcraft features. Entropy, 25(7), 991. https://doi.org/10.3390/e25070991
Das, A., Mohanty, M. N., Mallick, P. K., Tiwari, P., Muhammad, K., & Zhu, H. (2021). Breast cancer detection using an ensemble deep learning method. Biomedical Signal Processing and Control, 70, 103009. https://doi.org/10.1016/j.bspc.2021.103009
Hirra, I., Ahmad, M., Hussain, A., Ashraf, M. U., Saeed, I. A., Qadri, S. F., Alghamdi, A. M., & Alfakeeh, A. S. (2021). Breast cancer classification from histopathological images using patch-based deep learning modeling. IEEE Access, 9, 24273–24287. https://doi.org/10.1109/ACCESS.2021.3056516
Khan, S. I., Shahrior, A., Karim, R., Hasan, M., & Rahman, A. (2022). MultiNet: A deep neural network approach for detecting breast cancer through multi-scale feature fusion. Journal of King Saud University - Computer and Information Sciences, 34(8), 6217–6228. https://doi.org/10.1016/j.jksuci.2021.08.004
Kumar, M., Singhal, S., Shekhar, S., Sharma, B., & Srivastava, G. (2022). Optimized stacking ensemble learning model for breast cancer detection and classification using machine learning. Sustainability, 14(21), 13998. https://doi.org/10.3390/su142113998
Liu, H., Cui, G., Luo, Y., Guo, Y., Zhao, L., Wang, Y., Subasi, A., Dogan, S., & Tuncer, T. (2022). Artificial intelligence-based breast cancer diagnosis using ultrasound images and grid-based deep feature generator. International Journal of General Medicine, 15, 2271–2282. https://doi.org/10.2147/IJGM.S347491
Lotter, W., Diab, A. R., Haslam, B., Kim, J. G., Grisot, G., Wu, E., Wu, K., Onieva, J. O., Boyer, Y., Boxerman, J. L., Wang, M., Bandler, M., Vijayaraghavan, G. R., & Gregory Sorensen, A. (2021). Robust breast cancer detection in mammography and digital breast tomosynthesis using an annotation-efficient deep learning approach. Nature Medicine, 27(2), 244–249. https://doi.org/10.1038/s41591-020-01174-9
Madani, M., Behzadi, M. M., & Nabavi, S. (2022). The role of deep learning in advancing breast cancer detection using different imaging modalities: A systematic review. Cancers, 14(21), 5334. https://doi.org/10.3390/cancers14215334
Masud, M., Hossain, M. S., Alhumyani, H., Alshamrani, S. S., Cheikhrouhou, O., Ibrahim, S., Muhammad, G., Rashed, A. E. E., & Gupta, B. B. (2021). Pre-trained convolutional neural networks for breast cancer detection using ultrasound images. ACM Transactions on Internet Technology, 21(4), 1–17. https://doi.org/10.1145/3418355
Michael, E., Ma, H., Li, H., Kulwa, F., & Li, J. (2021). Breast cancer segmentation methods: Current status and future potentials. BioMed Research International, 2021, 9962109–9962129. https://doi.org/10.1155/2021/9962109
Nassif, A. B., Talib, M. A., Nasir, Q., Afadar, Y., & Elgendy, O. (2022). Breast cancer detection using artificial intelligence techniques: A systematic literature review. Artificial Intelligence in Medicine, 127, 102276. https://doi.org/10.1016/j.artmed.2022.102276
Pathan, R. K., Alam, F. I., Yasmin, S., Hamd, Z. Y., Aljuaid, H., Khandaker, M. U., & Lau, S. L. (2022). Breast cancer classification by using multi-headed convolutional neural network modeling. Healthcare, 10(12), 2367. https://doi.org/10.3390/healthcare10122367
Podda, A. S., Balia, R., Barra, S., Carta, S., Fenu, G., & Piano, L. (2022). Fully-automated deep learning pipeline for segmentation and classification of breast ultrasound images. Journal of Computational Science, 63, 101816. https://doi.org/10.1016/j.jocs.2022.101816
Preetha, R., & Jinny, S. V. (2021). Early diagnose breast cancer with PCA-LDA based FER and neuro-fuzzy classification system. Journal of Ambient Intelligence and Humanized Computing, 12(7), 7195–7204. https://doi.org/10.1007/s12652-020-02395-z

Priyanka, K. S. (2021). A review paper on breast cancer detection using deep learning. In IOP Conference Series: Materials Science and Engineering (Vol. 1022, No. 1, p. 012071). IOP Publishing. https://doi.org/10.1088/1757-899X/1022/1/012071
Ragab, M., Albukhari, A., Alyami, J., & Mansour, R. F. (2022). Ensemble deep-learning-enabled clinical decision support system for breast cancer diagnosis and classification on ultrasound images. Biology, 11(3), 439. https://doi.org/10.3390/biology11030439
Rezaei, Z. (2021). A review on image-based approaches for breast cancer detection, segmentation, and classification. Expert Systems with Applications, 182, 115204. https://doi.org/10.1016/j.eswa.2021.115204
Saber, A., Sakr, M., Abo-Seida, O. M., Keshk, A., & Chen, H. (2021). A novel deep-learning model for automatic detection and classification of breast cancer using the transfer-learning technique. IEEE Access, 9, 71194–71209. https://doi.org/10.1109/ACCESS.2021.3079204
Sharma, A., & Mishra, P. K. (2022). Performance analysis of machine learning based optimized feature selection approaches for breast cancer diagnosis. International Journal of Information Technology, 14(4), 1949–1960. https://doi.org/10.1007/s41870-021-00671-5
Tong, Y., Liu, Y., Zhao, M., Meng, L., & Zhang, J. (2021). Improved U-net MALF model for lesion segmentation in breast ultrasound images. Biomedical Signal Processing and Control, 68, 102721. https://doi.org/10.1016/j.bspc.2021.102721
Vigil, N., Barry, M., Amini, A., Akhloufi, M., Maldague, X. P. V., Ma, L., Ren, L., & Yousefi, B. (2022). Dual-intended deep learning model for breast cancer diagnosis in ultrasound imaging. Cancers, 14(11), 2663. https://doi.org/10.3390/cancers14112663
Zewdie, E. T., Tessema, A. W., & Simegn, G. L. (2021). Classification of breast cancer types, sub-types and grade from histopathological images using deep learning technique. Health and Technology, 11(6), 1277–1290. https://doi.org/10.1007/s12553-021-00592-0
