Breast Cancer Image Classification Method Based on Deep Transfer Learning
Weimin WANG, Min GAO, Mingxuan XIAO, Xu YAN, Yufeng LI
April 16, 2024
Deep learning can automatically extract features from images, thereby avoiding the complexity and limitations of manually extracting features in traditional algorithms. Moreover, deep learning is widely applied in fields such as natural language processing, object recognition, and image classification, laying the foundation for its application to breast cancer pathological images.

For instance, Pawer et al. [PPPea22] proposed a multi-scale multi-channel feature network (MuSCF-Net) that combines ResNet with attention mechanisms and employs a knowledge-sharing strategy, achieving a classification accuracy of up to 98.85% on the binary classification task for breast pathological tissue images. Kavitha et al. [KMKea22] introduced a breast cancer diagnostic model based on optimal multi-level threshold segmentation and capsule networks (OMLTS-DLCN), achieving accuracies of 98.50% on the Mini-MIAS dataset and 97.55% on the DDSM dataset.

Recent studies have demonstrated that using deep learning methods to classify breast cancer tissue pathological images can significantly improve classification accuracy, thereby assisting physicians in diagnosis and ensuring timely treatment for patients [SCPea96], [KGK00]. However, deep learning is highly data-dependent: the larger the dataset, the better the network's classification accuracy, yet in practice acquiring large medical image datasets is challenging. Additionally, increasing the depth of a neural network does not necessarily improve classification accuracy; instead, it may lead to performance degradation. To address these limitations, this study proposes a breast cancer pathological image classification method based on deep transfer learning.

2 Deep Transfer Learning

2.1 Transfer Learning

In medical image classification, data dependency is a crucial concern. Due to the uniqueness of medical images, data available for research purposes is often limited and scarce. Training deep neural network architectures such as convolutional neural networks (CNNs) on a small amount of data can lead to overfitting, thereby affecting experimental results. Therefore, this study introduces transfer learning.

Transfer learning leverages knowledge learned in one domain (the source domain) to aid learning in another domain (the target domain) by exploiting similarities between data, tasks, or models [11]. We can use transfer learning to transfer trained model parameters to a new model to aid its training. Specifically, by training a network on a very large dataset and then transferring its pre-trained parameters, especially the weights, to the target network model, we can provide the target model with powerful feature extraction capabilities while reducing computation time and storage requirements. Transfer learning has been widely applied in medical imaging, showing significant benefits in terms of accuracy, training time, and error rates [DYO19].

In transfer learning, fine-tuning the parameters addresses the mismatch between the feature parameters of a pre-trained neural network and the task in the target domain, which is the most crucial step. In this study, since the target-domain dataset is relatively small and differs significantly from the source-domain dataset in image characteristics, the primary approach adopted is to freeze part of the model and fine-tune the rest [BSS+18].

2.2 Deep Transfer Learning

In this study, a deep transfer learning method is proposed. When the ImageNet dataset is used as the source domain for transfer learning, its large size and lack of relevance to cancer, compared with the limited amount of cancer-related data, produce a significant dissimilarity between the two domains, which can cause negative transfer and degrade classification accuracy.

In simple terms, in convolutional neural networks the shallow layers are responsible for extracting basic features, while the deep layers extract abstract features. Since basic image features are universal, the shallow layers are pre-trained on the ImageNet dataset. After the shallow layers are trained, the deep layers are further pre-trained on other cancer-related datasets. Exploiting the distinct roles of shallow and deep layers in this way avoids the negative transfer caused by low dataset similarity.

This study proposes a breast cancer medical image classification algorithm based on deep transfer learning. Firstly, the DenseNet network is selected as the network architecture and is improved by integrating attention mechanisms to enhance its performance. Then, the improved DenseNet undergoes a first transfer learning step on the ImageNet natural image dataset. After the first transfer, the network is further fine-tuned through a second transfer learning step on the LC25000 lung cancer dataset. Subsequently, the network is fine-tuned on the preprocessed and augmented BreakHis breast cancer dataset. Finally, the fine-tuned network classifies the BreakHis pathological images into benign and malignant classes.
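To make the staged freeze-and-fine-tune procedure concrete, the sketch below shows one way it could be set up in PyTorch, the framework used in Section 5. The choice of DenseNet-121, the freezing cut-off at the last dense block, and the SGD optimizer are illustrative assumptions; the paper specifies freezing for fine-tuning but not these exact details.

```python
import torch
import torch.nn as nn
from torchvision import models

# Stage 1: start from DenseNet weights pre-trained on ImageNet;
# the shallow layers capture universal low-level features.
model = models.densenet121(weights=models.DenseNet121_Weights.IMAGENET1K_V1)

# Freeze the shallow layers and leave the deep layers trainable.
# The cut-off at the last dense block is an assumption; the paper
# freezes layers for fine-tuning without specifying the boundary.
for name, param in model.features.named_parameters():
    if not name.startswith(("denseblock4", "norm5")):
        param.requires_grad = False

# Stage 2 fine-tunes the deep layers on a related cancer dataset
# (LC25000 in this study); stage 3 then fine-tunes on BreakHis.
# Replace the classifier head for the binary benign/malignant task.
model.classifier = nn.Linear(model.classifier.in_features, 2)

optimizer = torch.optim.SGD(
    (p for p in model.parameters() if p.requires_grad), lr=0.01
)
```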
Magnification   Benign   Malignant   Total
40×             625      1370        1995
100×            644      1437        2081
200×            623      1390        2013
400×            588      1232        1820
Total           2480     5429        7909
Cases           24       58          82
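The table above summarizes the BreakHis breast cancer dataset used for the final fine-tuning stage: 7,909 images from 82 patients at four magnification factors. The images are preprocessed and augmented before training; the torchvision pipeline below is only a hypothetical example of such augmentation, as every transform and parameter here is an assumption rather than the paper's documented configuration.

```python
from torchvision import transforms

# Hypothetical preprocessing/augmentation pipeline for BreakHis images;
# the specific transforms and values are assumptions.
train_transform = transforms.Compose([
    transforms.Resize((224, 224)),        # standard DenseNet input size
    transforms.RandomHorizontalFlip(),
    transforms.RandomVerticalFlip(),      # slide orientation is arbitrary
    transforms.RandomRotation(degrees=15),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],  # ImageNet statistics
                         std=[0.229, 0.224, 0.225]),
])
```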
Figure 4: DenseNet network structure
Figure 5: Squeeze-and-Excitation module

First, the squeeze operation compresses the spatial information of each feature map by employing global average pooling across spatial dimensions. This produces a channel-wise statistic: a vector whose elements represent global receptive fields for the corresponding channels. Second, the excitation operation involves learning a non-linear, channel-specific gating mechanism. Utilizing the squeezed information, it employs a simple gating mechanism with a sigmoid activation to capture channel-wise dependencies. The resulting weights are employed to adaptively recalibrate the original feature maps by rescaling them with the learned activations. Subsequently, it extracts feature information from channels based on these weight coefficients, enabling feature fusion across channels and thereby enhancing network performance.

By introducing the squeeze-and-excitation (SE) operation on top of the DenseNet architecture, the network achieves both spatial feature fusion and learning of the relationships between feature channels, further enhancing network performance.

The modified network structure, as depicted in Figure 6, incorporates the SE module into the DenseNet network's dense block sub-modules and behind the transition layers.
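A minimal PyTorch sketch of the squeeze-and-excitation block just described is given below; the two-layer gating MLP and the reduction ratio of 16 follow the common SE design and are assumptions rather than the paper's stated configuration.

```python
import torch
import torch.nn as nn

class SEBlock(nn.Module):
    """Squeeze-and-excitation: global average pooling (squeeze) followed
    by a two-layer sigmoid-gated MLP (excitation) that rescales channels."""

    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        self.gate = nn.Sequential(
            nn.Linear(channels, channels // reduction, bias=False),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels, bias=False),
            nn.Sigmoid(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, _, _ = x.shape
        s = x.mean(dim=(2, 3))              # squeeze: channel-wise statistic
        w = self.gate(s).view(b, c, 1, 1)   # excitation: per-channel weights
        return x * w                        # recalibrate the feature maps

# e.g. y = SEBlock(64)(torch.randn(2, 64, 32, 32))
```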
5 Experiments

5.1 Evaluation Metrics and Experimental Environment

5.1.1 Experimental Environment

The experimental setup includes a computer with the following specifications: 64-bit Windows 10 operating system, Intel Core i9-9900 CPU, 64 GB RAM, and NVIDIA GeForce RTX 4090 GPU. The experiments were conducted using the PyTorch deep learning framework.
5.1.2 Evaluation Metrics

The classification accuracy from the perspective of patients is calculated as shown in Equations 1 and 2:

\[
P_{rp} = \frac{N_{rp}}{N_{np}} \tag{1}
\]

\[
P_{arp} = \frac{\sum P_{rp}}{N_p} \tag{2}
\]

where:
$N_{np}$ represents the total number of a patient's pathological images;
$N_{rp}$ represents the number of correctly classified images for that patient;
$P_{rp}$ represents the classification accuracy over all pathological images of a single patient;
$P_{arp}$ represents the average classification accuracy across all patients in the dataset;
$N_p$ represents the total number of patients in the dataset.

The classification accuracy from the perspective of breast cancer pathological images is calculated as shown in Equation 3:

\[
P_{img} = \frac{\sum N_r}{N_{all}} \tag{3}
\]

where:
$P_{img}$ represents the classification accuracy over all pathological images;
$N_r$ represents the number of correctly classified pathological images;
$N_{all}$ represents the total number of images in the dataset after augmentation.
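Both metrics can be computed directly from per-image predictions; the helper below is a small sketch of Equations 1-3, where the record format (one (patient_id, correct) pair per image) is an assumption made for illustration.

```python
from collections import defaultdict

def accuracy_metrics(records):
    """Compute (P_arp, P_img) from per-image results, following Eqs. 1-3.
    `records` is an iterable of (patient_id, correct) pairs, one per image."""
    counts = defaultdict(lambda: [0, 0])      # patient_id -> [N_rp, N_np]
    for pid, correct in records:
        counts[pid][0] += int(correct)
        counts[pid][1] += 1
    # Eqs. 1-2: per-patient accuracy P_rp, averaged over N_p patients.
    p_arp = sum(nrp / nnp for nrp, nnp in counts.values()) / len(counts)
    # Eq. 3: image-level accuracy over all N_all images.
    n_r = sum(nrp for nrp, _ in counts.values())
    n_all = sum(nnp for _, nnp in counts.values())
    return p_arp, n_r / n_all
```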
Parameter       Value
Batch size      32
Epochs          200
Learning rate   0.01
Dropout         0.25
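A minimal PyTorch training setup consistent with these values might look like the sketch below; the SGD optimizer, the placement of dropout in the classifier head, and the absence of a learning-rate schedule are assumptions, since the table lists only the four values.

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader
from torchvision import models

# Values from the table above; everything else is an assumption.
BATCH_SIZE, EPOCHS, LR, DROPOUT = 32, 200, 0.01, 0.25

model = models.densenet121(weights=None)   # stand-in for the improved network
model.classifier = nn.Sequential(          # binary benign/malignant head
    nn.Dropout(p=DROPOUT),
    nn.Linear(model.classifier.in_features, 2),
)
optimizer = torch.optim.SGD(model.parameters(), lr=LR)
criterion = nn.CrossEntropyLoss()

def train(train_set):
    loader = DataLoader(train_set, batch_size=BATCH_SIZE, shuffle=True)
    model.train()
    for _ in range(EPOCHS):
        for images, labels in loader:
            optimizer.zero_grad()
            loss = criterion(model(images), labels)
            loss.backward()
            optimizer.step()
```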
Evaluation    Network          BreakHis dataset (accuracy, %)
metric                         40×     100×    200×    400×
P_arp         DenseNet         73.9    75.0    77.6    78.0
              DenseNet + SE    78.0    78.1    78.5    78.7
              Ours             80.1    84.3    81.2    82.4
P_img         DenseNet         72.5    77.5    77.2    77.5
              DenseNet + SE    72.5    75.6    74.9    80.3
              Ours             78.4    79.2    79.7    84.0
Figure 8: Experimental classification accuracy at 400× magnification

Network            DenseNet   DenseNet + SE   Ours
Parameter Number   7.127k     8.234k          8.064k
Model Size         82.4 MB    89.5 MB         84.7 MB

Table 4: Number of parameters and model size of the network models

As shown in Table 4, the proposed model possesses a slightly greater number of parameters and a larger overall model size than the baseline. However, this increase in parameters and size is acceptable considering the improvement achieved in breast cancer classification.

Network         Convergence time (minutes)
DenseNet        1237
DenseNet + SE   1356
Ours            976

Table 5: Convergence times of the network models

As shown in Table 5 above, the convergence times of the three network models indicate that transfer learning is an effective strategy for addressing the problem of limited training data. By leveraging pre-trained models, we can improve training efficiency and the network's ability to generalize. Without pre-training the network's weights on a large dataset, the initial weights would be randomly set, leading to slower convergence. In this study, we employed transfer learning to initialize the weights and fine-tuned them on the images, thereby accelerating convergence.
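For reference, parameter counts and serialized model sizes like those in Table 4 can be read directly off a PyTorch model; the helpers below are a small sketch, and measuring size via the saved state dict is one simple convention rather than the paper's stated procedure.

```python
import os
import torch

def parameter_count(model: torch.nn.Module) -> int:
    # Total number of learnable parameters (cf. Table 4).
    return sum(p.numel() for p in model.parameters())

def model_size_mb(model: torch.nn.Module, path: str = "weights_tmp.pt") -> float:
    # Size of the serialized state dict on disk, in megabytes.
    torch.save(model.state_dict(), path)
    size_mb = os.path.getsize(path) / 1e6
    os.remove(path)
    return size_mb
```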
6 Conclusion

This paper proposes a medical image classification method based on transfer learning and deep learning, targeting the complexity and limited scale of medical pathology tissue images. The model categorizes breast cancer pathology images from the BreakHis dataset into benign and malignant classes. Experimental results demonstrate the effectiveness of combining transfer learning with deep learning, leading to improvements in classification compared with the baseline model. However, the study has limitations. Firstly, the proposed model performs only binary classification of benign and malignant breast pathology tissues, without distinguishing the grading or subtyping of breast cancer. Additionally, the model has slightly more parameters and a larger model size, without optimization in this regard. These points highlight areas for future research and optimization. Beyond accuracy, as Ma et al. [DBS+23] note, how to balance accuracy and interpretability to develop deep learning models that both doctors and patients can trust will become a research focus of the industry in the future.

References

[BSS+18] Maciej Byra, Grzegorz Styczyński, Cezary Szmigielski, Piotr Kalinowski, Łukasz Michałowski, Rafał Paluszkiewicz, Bogusława Ziarkiewicz-Wróblewska, Krzysztof Zieniewicz, Piotr Sobieraj, and Andrzej Nowicki. Transfer learning with deep convolutional neural network for liver steatosis assessment in ultrasound images. International Journal of Computer Assisted Radiology and Surgery, 13:1895–1903, 2018.
[DBS+23] Danqing Ma, Bo Dang, Shaojie Li, Hengyi Zang, and Xinqi Dong. Implementation of computer vision technology based on artificial intelligence for medical image analysis. International Journal of Computer Science and Information Technology, 2023.

[DYO19] Ahmed Mamoun Dawud, Kaan Yurtkan, and Huseyin Oztoprak. Application of deep learning in neuroradiology: Brain haemorrhage classification using transfer learning. Computational Intelligence and Neuroscience, 2019: Article ID 4629859, 2019.

[KGK00] Nico Karssemeijer and Maryellen L Giger. Computer-aided diagnosis of breast lesions in medical images. Computing in Science & Engineering, 2:39–45, 2000.

[KMKea22] T Kavitha, P P Mathai, C Karthikeyan, et al. Deep learning based capsule neural network model for breast cancer diagnosis using mammogram images. Interdisciplinary Sciences: Computational Life Sciences, 14:113–129, 2022.

[LQD+24] Shaojie Li, Haichen Qu, Xinqi Dong, Bo Dang, Hengyi Zang, and Yulu Gong. Leveraging deep learning and Xception architecture for high-accuracy MRI classification in Alzheimer diagnosis. arXiv preprint arXiv:2403.16212, 2024.

[LWL+18] J. Li, P. Wang, Y.Z. Li, Y. Zhou, X.L. Liu, and K. Luan. Transfer learning of pre-trained Inception-v3 model for colorectal cancer lymph node metastasis classification. In 2018 IEEE International Conference on Mechatronics and Automation, pages 1650–1654, Changchun, 2018.

[MLD+24] Danqing Ma, Shaojie Li, Bo Dang, Hengyi Zang, and Xinqi Dong. Fostc3net: A lightweight YOLOv5 based on the network structure optimization. arXiv preprint arXiv:2403.13703, 2024.

[OPH96] Timo Ojala, Matti Pietikäinen, and David Harwood. A comparative study of texture measures with classification based on featured distributions. Pattern Recognition, 29:51–59, 1996.

[PPPea22] Manish M Pawer, Sunil D Pujari, Satish P Pawar, et al. MuSCF-Net: Multi-scale, multi-channel feature network using ResNet-based attention mechanism for breast histopathological image classification. In Machine Learning and Deep Learning Techniques for Medical Science, pages 243–261. CRC Press, 2022.

[QCWea15] Aiping Qu, Junmei Chen, Liwei Wang, et al. Segmentation of hematoxylin-eosin stained breast cancer histopathological images based on pixel-wise SVM classifier. Science China Information Sciences, 58:1–13, 2015.

[RJK22] Amirhossein Rezazadeh, Yousef Jafarian, and Alireza Kord. Explainable ensemble machine learning for breast cancer diagnosis based on ultrasound image texture features. arXiv preprint, 2022.

[SCPea96] Berkman Sahiner, Heang-Ping Chan, Nicholas Petrick, et al. Classification of mass and normal breast tissue: A convolution neural network classifier with spatial domain and texture images. IEEE Transactions on Medical Imaging, 15:598–610, 1996.

[SOPea15] Fabio A Spanhol, Luiz S Oliveira, Caroline Petitjean, et al. A dataset for breast cancer histopathological image classification. IEEE Transactions on Biomedical Engineering, 63:1455–1462, 2015.

[SZ15] Karen Simonyan and Andrew Zisserman. Very deep convolutional networks for large-scale image recognition. In Proceedings of International Conference on Learning Representations, pages 1–14, San Diego, 2015.

[YHR+24] Yulu Gong, Haoxin Zhang, Ruilin Xu, Zhou Yu, and Jingbo Zhang. Innovative deep learning methods for precancerous lesion detection. International Journal of Innovative Research in Computer Science and Technology (IJIRCST), 12(2):81–86, 2024.