Professional Practice IT Assignment
Professional Practice in IT
Introduction
Over 287,850 new cases of breast cancer were identified in 2022 alone, making it one of the top
causes of mortality for women. Although early detection greatly increases survival rates,
diagnosing breast cancer from mammography images still requires a high level of expertise.
Many issues, such as low-accuracy models, irrelevant feature extraction, and overlapping
malignant and non-malignant regions, can impair diagnosis.
Artificial intelligence (AI), and deep learning in particular, has shown promise in improving
breast cancer detection. To address these problems, the paper reviewed here presents a novel
automated method based on deep learning and transfer learning models.
Literature Review
A number of methods have been proposed in the field of AI-based breast cancer detection. Due to
the complexity of clinical imaging data, traditional techniques such as manual feature extraction and
simple machine learning algorithms have had only limited success. Recently, Convolutional
Neural Networks (CNNs) have shown promise, particularly when used in conjunction with
transfer learning and data augmentation methods. The study's authors cite a wide range of prior
research that emphasises deep learning's efficacy in breast cancer screening.
CNNs and other deep learning techniques work especially well for image classification
problems. Training these models requires large datasets, however. Data scarcity and
imbalanced datasets (e.g., fewer malignant cases than benign cases) have been major
problems in breast cancer detection.
This work builds on previous deep learning techniques by applying transfer learning with
AlexNet, a pre-trained CNN model.
Purpose
The main goal of the study is to create an automated system for classifying breast cancer that
will increase the accuracy of mammography-based cancer detection. The paper proposes a
system that:
Extracts features using transfer learning.
Fine-tunes CNN models with ABER, a novel optimisation algorithm.
Addresses problems such as suboptimal model performance, irrelevant feature
extraction, and imbalanced data.
The ultimate objective is to improve the early detection of breast cancer.
Methodology
Data Augmentation:
Because publicly accessible datasets are often small, data augmentation is used to
expand the amount of training data. To produce more samples and
make the CNN more robust during training, images are flipped, rotated, and otherwise
altered. The study used two datasets:
A Kaggle dataset of breast cancer images.
The Digital Database for Screening Mammography (DDSM).
Augmentation quadrupled both datasets, from 1,696 to 6,784 images
for the first dataset and from 1,356 to 5,424 images for the second.
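The fourfold growth is consistent with keeping each original image and adding three transformed copies. The sketch below illustrates that idea in plain Python. The specific transforms (horizontal flip, vertical flip, 90-degree rotation) and all function names are assumptions made for illustration; the source only says that images are "flipped, rotated, and otherwise altered". In practice an image library would be used instead of nested lists.

```python
from typing import List

# A grayscale image represented as rows of pixel intensities (illustrative only).
Image = List[List[int]]


def flip_horizontal(image: Image) -> Image:
    """Mirror the image left to right."""
    return [row[::-1] for row in image]


def flip_vertical(image: Image) -> Image:
    """Mirror the image top to bottom."""
    return [row[:] for row in image[::-1]]


def rotate_90_clockwise(image: Image) -> Image:
    """Rotate the image a quarter turn clockwise."""
    return [list(column) for column in zip(*image[::-1])]


def augment_dataset(images: List[Image]) -> List[Image]:
    """Return every original image followed by its three transformed copies."""
    augmented: List[Image] = []
    for image in images:
        augmented.extend([
            [row[:] for row in image],   # copy of the original
            flip_horizontal(image),      # assumed transform 1
            flip_vertical(image),        # assumed transform 2
            rotate_90_clockwise(image),  # assumed transform 3
        ])
    return augmented


if __name__ == "__main__":
    toy_image = [[1, 2], [3, 4]]
    # 1,696 originals become 1,696 * 4 = 6,784 samples.
    print(len(augment_dataset([toy_image] * 1696)))
```

With four variants per image, 1,696 originals yield 6,784 samples and 1,356 originals yield 5,424, matching the sizes reported above.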
Feature Extraction:
Feature extraction relies on transfer learning with the AlexNet deep learning architecture.
AlexNet, which has been pre-trained on the ImageNet dataset, is used to extract relevant
features from the mammography images.
Evaluation:
A number of metrics, including accuracy, sensitivity, specificity, and F-score, were used to assess
the model's performance. The system achieved an average classification accuracy of 97.95%
across the two evaluation datasets. ANOVA and Wilcoxon tests were used to validate the
statistical significance of the results, supporting the efficacy of the ABER-optimized CNN for the
classification of breast cancer.
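For reference, these metrics follow directly from the counts in a binary confusion matrix. The sketch below uses the standard definitions, treating malignant as the positive class. The function name and the example counts are hypothetical and are not results from the study.

```python
from typing import Dict


def classification_metrics(tp: int, fp: int, tn: int, fn: int) -> Dict[str, float]:
    """Compute standard binary classification metrics from confusion-matrix counts.

    tp/fn: malignant cases classified correctly/incorrectly.
    tn/fp: benign cases classified correctly/incorrectly.
    """
    def ratio(numerator: float, denominator: float) -> float:
        # Return 0.0 instead of raising when a class is absent.
        return numerator / denominator if denominator else 0.0

    sensitivity = ratio(tp, tp + fn)   # recall on malignant cases
    specificity = ratio(tn, tn + fp)   # recall on benign cases
    precision = ratio(tp, tp + fp)
    return {
        "accuracy": ratio(tp + tn, tp + fp + tn + fn),
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f_score": ratio(2 * precision * sensitivity, precision + sensitivity),
    }


if __name__ == "__main__":
    # Hypothetical counts chosen only to illustrate the arithmetic.
    for name, value in classification_metrics(tp=90, fp=5, tn=95, fn=10).items():
        print(f"{name}: {value:.4f}")
```

Reporting sensitivity and specificity alongside accuracy matters here because, with imbalanced data, a model can reach high accuracy while still missing many malignant cases.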
Conclusion
The study successfully develops an automated system for classifying breast cancer using deep
learning and optimisation methods. By integrating the novel ABER algorithm with transfer
learning (AlexNet), the system outperformed existing techniques in the field
and achieved high classification accuracy. Future work may examine the system's real-time
implementation in healthcare environments and evaluate its efficacy on larger and more
varied datasets.