Machine Learning Models for Breast Cancer Classification
Ravi Teja S

Visvesvaraya Technological University

Research Article

Keywords: Support Vector Machine, Decision Trees, Naive Bayes and K-Nearest Neighbour, Early
Detection

Posted Date: August 27th, 2024

DOI: https://doi.org/10.21203/rs.3.rs-4782472/v1

License: This work is licensed under a Creative Commons Attribution 4.0 International License.

Additional Declarations: No competing interests reported.

Abstract
Objectives

To develop a machine learning technique for automatically classifying breast tumors as benign or
malignant, using clinical and pathological data extracted from diagnostic images and patient records.

Method

The dataset contains clinical data and tumor characteristics such as clump thickness, uniformity of cell
size, and marginal adhesion. Several machine learning algorithms are implemented: Decision Trees,
Support Vector Machines (SVM), Naive Bayes, and K-Nearest Neighbors (KNN). Data preprocessing
includes handling missing values and feature standardization. The models are evaluated using accuracy,
precision, recall, F1-score, the Receiver Operating Characteristic (ROC) curve, and confusion matrices.

Findings: The SVM model achieved an accuracy of 97.14%, outperforming the Decision Tree, Naive
Bayes, and KNN models. It also performed strongly on precision, recall, and F1-score, making it the most
effective classifier for distinguishing benign from malignant tumors.

Novelty: The paper provides a comparison of several machine learning methods previously applied to
breast cancer classification and demonstrates the effectiveness of SVM. Evaluating multiple algorithms
with detailed performance metrics helps health providers make informed decisions about the diagnosis
and management of breast cancer.

1. Introduction
Breast cancer is a serious global health challenge and a major cause of morbidity and mortality among
women. Early detection of breast tumors, followed by accurate classification, is essential for timely
treatment planning, which substantially improves prognosis and chances of survival. Traditional
diagnostic techniques have improved, but they remain time-consuming and subjective, which limits how
far they can be refined. Existing studies often focus on only a few algorithms or datasets and pay too
little attention to preprocessing steps. In addition, the various machine learning approaches have not yet
been brought into a single framework in which their effectiveness for breast cancer classification can be
evaluated comprehensively. In recent years, machine learning techniques have shown promise for
improving the accuracy and efficiency of cancer diagnosis. By analyzing tumor characteristics, these
algorithms support health professionals in decision-making for patients.

This project develops a machine learning-based system that classifies a breast tumor as benign or
malignant from clinical measurements and tumor features such as clump thickness and uniformity of
cell size and shape. Four widely used techniques are applied: decision trees, support vector machines,
Naive Bayes, and k-nearest neighbors. Each model is trained after the data has been preprocessed to
handle missing values and normalize features. Model performance on a test set is measured with
standard criteria: accuracy, precision, recall, and F1-score. Confusion matrices are used to visualize the
performance and discriminative power of the models.

Several studies have applied machine learning and deep learning to breast cancer identification.
N. Manjunathan et al. assessed Random Forest, Decision Tree, and Logistic Regression on the Breast
Cancer Wisconsin (Diagnostic) Dataset; Random Forest had the highest accuracy, 96.5%, indicating its
potential for early detection and better patient management. Niharjyoti Das, Dr. G. Neelima, and
Satyabrata Patro, among others, have also studied machine learning techniques for detecting breast
cancer early enough to improve treatment. These works stress that detection must be accurate and
timely to avoid fatal outcomes, and they show that Support Vector Machines, Decision Trees, Logistic
Regression, Random Forest, K-Nearest Neighbor, and Convolutional Neural Networks can help identify
and manage breast cancer cases. Their findings point to the need to develop and apply machine learning
for early detection and diagnosis, with the aim of increasing diagnostic accuracy, reducing mortality, and
improving patient outcomes. Researchers continue to explore different machine learning algorithms and
methodologies to make these diagnostic tools more effective.

2. Methodology
The methodology in this study uses machine learning algorithms to classify breast tumors as benign or
malignant with high accuracy and reliability.

2.1 Dataset:
Table 1
Parameter and Description
Parameter               Description
id                      A unique identifier for each instance in the dataset.
clump_thickness         Measures the thickness of the cell clumps. Higher values can indicate malignant cells.
uniform_cell_size       Describes the uniformity in cell size. Higher values suggest greater likelihood of malignancy.
uniform_cell_shape      Describes the uniformity in cell shape. Higher values indicate more variability and potential malignancy.
marginal_adhesion       Measures how well the cells stick together. Lower values can be indicative of cancer.
single_epithelial_size  Refers to the size of the single epithelial cells. Larger sizes may indicate malignancy.
bare_nuclei             Counts the number of nuclei that are not surrounded by cytoplasm. Higher numbers are often associated with cancer.
bland_chromatin         Describes the texture of the cell nucleus chromatin. Coarser chromatin is typical in cancer cells.
normal_nucleoli         Counts the number of nucleoli within the nucleus. More numerous and prominent nucleoli are linked to malignancy.
mitoses                 Measures the number of cells undergoing mitosis. Higher rates are typically associated with malignancy.
class                   Indicates whether the cell sample is benign (2) or malignant (4).

This research uses a dataset of clinical and pathological parameters for classifying a breast tumor as
benign or malignant. Every case is uniquely identified by an id. The parameters are as follows.
clump_thickness measures how thick the cell clumps are; higher values may indicate malignancy,
although the relationship between the value and severity is not straightforward. High values of
uniform_cell_size and uniform_cell_shape increase the likelihood of cancer. marginal_adhesion measures
how well the cells stick together; low values may indicate malignancy because the cells do not adhere
well to each other. single_epithelial_size is the size of individual epithelial cells; large sizes may indicate
cancer. bare_nuclei counts the nuclei that are not surrounded by cytoplasm, a characteristic generally
associated with malignancy. bland_chromatin describes the texture of the cell nucleus chromatin;
coarser chromatin is characteristic of cancer cells. normal_nucleoli is the number of nucleoli within the
nucleus; more numerous and prominent nucleoli are associated with malignancy. The last attribute,
mitoses, is the number of cells undergoing mitosis; the larger this count, the greater the likelihood of
malignancy. The class attribute labels each cell sample as benign (2) or malignant (4).
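The preprocessing applied to these parameters (handling missing values, then standardizing features) can be sketched in plain Python. This is an illustration under stated assumptions, not the authors' code: the `?` marker for a missing entry, the median imputation rule, and the helper names `impute_median` and `standardize` are all hypothetical choices made for the example.

```python
from statistics import mean, median, pstdev

MISSING = "?"  # assumed marker for a missing entry; the paper does not name one


def impute_median(column):
    """Replace missing entries in one feature column with the column median."""
    observed = [float(v) for v in column if v != MISSING]
    fill = median(observed)
    return [fill if v == MISSING else float(v) for v in column]


def standardize(column):
    """Rescale a numeric column to zero mean and unit (population) variance."""
    mu = mean(column)
    sigma = pstdev(column)
    if sigma == 0:
        return [0.0 for _ in column]  # a constant column carries no information
    return [(v - mu) / sigma for v in column]


# Toy bare_nuclei column with one missing value (illustrative data only).
bare_nuclei_raw = ["1", "10", "?", "2", "3"]
bare_nuclei = impute_median(bare_nuclei_raw)
bare_nuclei_scaled = standardize(bare_nuclei)

# Map the class codes used in the dataset (2 = benign, 4 = malignant) to 0/1.
labels = [0 if c == 2 else 1 for c in [2, 4, 2, 2, 4]]
```

In a real experiment the median, mean, and standard deviation would normally be estimated on the training split only and then reused on the test split, so that no test information leaks into preprocessing.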

2.2 Proposed System:
Model selection and training use the Decision Tree Classifier, Naive Bayes, K-Nearest Neighbors, and
SVM, each trained on the preprocessed dataset. SVM, which can trace complex decision boundaries and
handle high-dimensional data efficiently, is further optimized through hyperparameter tuning. The
evaluation metrics indicate how reliably a model distinguishes benign from malignant tumors, and the
classification reports and confusion matrices show where it works well and where it falls short. On this
basis, the methodology selects SVM as the most effective classifier for breast cancer because of its
robustness, generalization capability, and performance on binary classification problems with complex
decision boundaries. The trained model is then serialized and prepared for deployment, which makes it
usable outside the research context. Future research directions include combining SVM-based
algorithms with ensemble methods or deep learning approaches, as methodologies for breast cancer
diagnosis continue to evolve.

Figure 1 depicts the machine learning workflow. The process starts with raw data, which goes through
data preprocessing to prepare it for analysis. The preprocessed data is then fed into four machine
learning algorithms: Decision Tree, K-Nearest Neighbors, Naive Bayes, and Support Vector Machine.
Each algorithm processes the data and produces a result. Finally, the result comparison step analyzes
the results from each algorithm to determine which one performs best for the given task.
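The serialization step in this workflow can be illustrated with Python's `pickle` module, which the paper names in its conclusion. The sketch below is a minimal example under assumptions: the `trained_model` dictionary and its fields are hypothetical stand-ins for a fitted classifier, and an in-memory buffer is used instead of a file.

```python
import io
import pickle

# Hypothetical stand-in for a trained model: a plain dict of learned parameters.
trained_model = {
    "algorithm": "svm",
    "feature_names": ["clump_thickness", "uniform_cell_size", "mitoses"],
    "weights": [0.8, 0.5, 0.3],
    "bias": -4.0,
}

# Serialize to an in-memory buffer; a deployment would typically write to a file.
buffer = io.BytesIO()
pickle.dump(trained_model, buffer)

# Reload the model from the serialized bytes.
buffer.seek(0)
restored_model = pickle.load(buffer)
```

Because unpickling can execute arbitrary code, a deployed system should only load pickle files from a trusted source.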

2.2.1 Classification and Comparison of models:

Decision Tree Classifier:

Decision Tree Classifier is a popular supervised learning algorithm that builds a tree-like structure to
make decisions based on feature values. It partitions the data recursively based on the most significant
attribute at each node, aiming to create homogeneous subsets that lead to accurate classification.
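The idea of splitting on the most significant attribute can be made concrete with a small sketch. It assumes Gini impurity as the split criterion, which the paper does not state, and it searches thresholds on a single feature; the function names and the toy values are hypothetical.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels (0.0 means a pure node)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def best_threshold(values, labels):
    """Return (threshold, weighted_gini) of the best split 'value <= threshold'."""
    best = (None, gini(labels))
    for t in sorted(set(values))[:-1]:  # splitting at the maximum leaves one side empty
        left = [y for v, y in zip(values, labels) if v <= t]
        right = [y for v, y in zip(values, labels) if v > t]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
        if score < best[1]:
            best = (t, score)
    return best


# Toy clump_thickness values with class labels (2 = benign, 4 = malignant).
thickness = [1, 2, 3, 7, 8, 10]
classes = [2, 2, 2, 4, 4, 4]
threshold, impurity = best_threshold(thickness, classes)
```

On this toy data the split `clump_thickness <= 3` separates the two classes completely. A full tree would repeat the same search over every feature and recurse into each resulting subset.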

Table 2
Decision Tree Classifier classification report
Class         precision  recall  f1-score  support
2                  0.94    0.98      0.96       95
4                  0.95    0.87      0.91       45
accuracy                             0.94      140
macro avg          0.95    0.92      0.93      140
weighted avg       0.94    0.94      0.94      140

Accuracy Score: 0.9428571428571428

Confusion Matrix: [[93, 2], [6, 39]]

Table 2 shows the performance metrics of the Decision Tree Classifier on the two classes, 2 (benign)
and 4 (malignant). The classifier has an accuracy of 0.9428, so about 94.28 percent of the predictions
were correct. For class 2, precision, recall, and f1-score are 0.94, 0.98, and 0.96 respectively; the model
identifies nearly all true class 2 samples with few false positives. Performance was somewhat weaker
for class 4: precision was 0.95 but recall was 0.87, giving an f1-score of 0.91, so the model misses some
of the malignant samples. The macro-average precision, recall, and f1-score are 0.95, 0.92, and 0.93
respectively. In the confusion matrix, 39 class 4 samples were classified correctly and 6 were
misclassified as class 2, while 93 class 2 samples were classified correctly and 2 were misclassified as
class 4.
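The per-class figures in Table 2 can be reproduced directly from the reported confusion matrix. The sketch below assumes that rows are actual classes and columns are predicted classes, which is consistent with the support counts (95 and 45) matching the row sums; the function name is a hypothetical helper.

```python
def per_class_metrics(matrix, class_index):
    """Precision, recall and F1 for one class; rows are actual, columns predicted."""
    tp = matrix[class_index][class_index]
    fn = sum(matrix[class_index]) - tp                  # actual class, predicted otherwise
    fp = sum(row[class_index] for row in matrix) - tp   # predicted class, actually other
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Decision Tree confusion matrix from Table 2 (class order: 2, then 4).
dt_matrix = [[93, 2], [6, 39]]

accuracy = (dt_matrix[0][0] + dt_matrix[1][1]) / sum(map(sum, dt_matrix))
benign = per_class_metrics(dt_matrix, 0)
malignant = per_class_metrics(dt_matrix, 1)
```

Rounded to two decimals, `benign` gives 0.94, 0.98, 0.96 and `malignant` gives 0.95, 0.87, 0.91, matching the table. The 6 off-diagonal entries in the second row are malignant samples predicted as benign, which is what lowers the class 4 recall.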

Support Vector Machine (SVM):

Support Vector Machine is a powerful algorithm for binary classification tasks. It finds the optimal
hyperplane that maximizes the margin between classes, allowing for effective separation of data points.
SVM can handle linearly separable data and nonlinear relationships through kernel functions.
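To make the margin idea concrete, the following sketch evaluates a linear soft-margin SVM objective for a given hyperplane. It is a simplified illustration, not the training procedure used in the paper: the weight vector `w`, the bias `b`, and the toy points are hand-picked assumptions, and no optimization or kernel is involved.

```python
import math


def decision_value(w, b, x):
    """Signed score w.x + b; its sign gives the predicted side of the hyperplane."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def soft_margin_objective(w, b, points, labels, c=1.0):
    """0.5 * ||w||^2 + C * sum of hinge losses, with labels encoded as -1 / +1."""
    hinge = sum(max(0.0, 1.0 - y * decision_value(w, b, x))
                for x, y in zip(points, labels))
    return 0.5 * sum(wi * wi for wi in w) + c * hinge


def margin_width(w):
    """Distance between the two margin hyperplanes, 2 / ||w||."""
    return 2.0 / math.sqrt(sum(wi * wi for wi in w))


# Toy 2-D points: -1 stands for benign (class 2), +1 for malignant (class 4).
points = [(1.0, 1.0), (2.0, 1.0), (8.0, 9.0), (9.0, 8.0)]
labels = [-1, -1, 1, 1]
w, b = [0.25, 0.25], -2.5  # hypothetical hyperplane, chosen by hand

objective = soft_margin_objective(w, b, points, labels)
predictions = [1 if decision_value(w, b, x) >= 0 else -1 for x in points]
```

Here every point lies outside the margin, so the hinge term is zero and the objective reduces to the regularization term. An SVM trainer searches for the `w` and `b` that minimize this objective, which corresponds to the widest margin that still tolerates few violations.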

Table 3
Support Vector Machine classification report
Class         precision  recall  f1-score  support
2                  0.97    0.99      0.98       95
4                  0.98    0.93      0.95       45
accuracy                             0.97      140
macro avg          0.97    0.96      0.97      140
weighted avg       0.97    0.97      0.97      140

Accuracy Score: 0.9714285714285714

Confusion Matrix: [[94, 1], [3, 42]]

Table 3 shows the performance of the Support Vector Machine on the two classes, 2 and 4. The
classifier has an overall accuracy of 0.9714, so about 97.14 percent of the predictions were correct. For
class 2, precision was 0.97, recall was 0.99, and f1-score was 0.98; the model identifies nearly all true
class 2 samples with very few false positives. For class 4, precision was slightly higher at 0.98, but recall
was lower at 0.93, so the model misses a few of the malignant samples. The macro-average precision,
recall, and f1-score are 0.97, 0.96, and 0.97 respectively, indicating balanced performance across both
classes. In the confusion matrix, 42 class 4 samples were classified correctly and 3 were misclassified
as class 2, while 94 class 2 samples were classified correctly and 1 was misclassified as class 4.

Naive Bayes:

Naive Bayes is a probabilistic classifier based on Bayes' theorem with an assumption of independence
between features. Despite its simplicity, Naive Bayes is effective for text classification and works well
with high-dimensional data.
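The independence assumption can be illustrated with a small Gaussian Naive Bayes written from scratch. The paper does not say which Naive Bayes variant was used, so the Gaussian likelihood, the variance floor, and the toy data here are assumptions made for the sketch.

```python
import math
from collections import defaultdict


def fit_gaussian_nb(rows, labels, var_floor=1e-9):
    """Estimate per-class priors plus per-feature means and variances."""
    grouped = defaultdict(list)
    for row, label in zip(rows, labels):
        grouped[label].append(row)
    model = {}
    for label, members in grouped.items():
        columns = list(zip(*members))
        means = [sum(col) / len(col) for col in columns]
        variances = [sum((v - m) ** 2 for v in col) / len(col) + var_floor
                     for col, m in zip(columns, means)]
        model[label] = (len(members) / len(rows), means, variances)
    return model


def predict_gaussian_nb(model, row):
    """Return the class with the highest log posterior under feature independence."""
    def log_posterior(params):
        prior, means, variances = params
        log_likelihood = sum(
            -0.5 * math.log(2 * math.pi * var) - (x - m) ** 2 / (2 * var)
            for x, m, var in zip(row, means, variances))
        return math.log(prior) + log_likelihood
    return max(model, key=lambda label: log_posterior(model[label]))


# Toy samples with two features (illustrative values, not taken from the dataset).
rows = [(1, 1), (2, 1), (1, 2), (8, 9), (9, 8), (9, 9)]
labels = [2, 2, 2, 4, 4, 4]
nb_model = fit_gaussian_nb(rows, labels)
```

Calling `predict_gaussian_nb(nb_model, (2, 2))` returns class 2 and `predict_gaussian_nb(nb_model, (9, 9))` returns class 4. The per-feature log-likelihoods are simply added, which is exactly where the independence assumption enters.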

Table 4
Naive Bayes Classifier classification report
Class         precision  recall  f1-score  support
2                  0.99    0.95      0.97       95
4                  0.90    0.98      0.94       45
accuracy                             0.96      140
macro avg          0.94    0.96      0.95      140
weighted avg       0.96    0.96      0.96      140

Accuracy Score: 0.9571428571428572

Confusion Matrix: [[90, 5], [1, 44]]

Table 4 shows the performance metrics of the Naive Bayes classifier on the two classes, 2 and 4. The
overall accuracy is 0.9571, so about 96 percent of the predictions were correct. For class 2, precision,
recall, and f1-score are 0.99, 0.95, and 0.97 respectively; almost every sample predicted as class 2 is
truly class 2, although some class 2 samples are missed. For class 4, recall is high at 0.98 but precision
drops to 0.90, giving an f1-score of 0.94; the model finds almost all malignant samples at the cost of
more false positives. The macro-average precision, recall, and f1-score are 0.94, 0.96, and 0.95
respectively, which suggests reasonably balanced performance across both classes. In the confusion
matrix, 90 class 2 samples were classified correctly and 5 were misclassified as class 4, while 44 class 4
samples were classified correctly and 1 was misclassified as class 2.

K-Nearest Neighbors (KNN):

K-Nearest Neighbors is a non-parametric algorithm that classifies data points based on the majority
class of their nearest neighbors. It relies on distance metrics to determine similarity between data points
and is suitable for datasets with local patterns.
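A minimal from-scratch version of this voting rule is sketched below. The paper does not report the value of k or the distance metric, so k = 3 and Euclidean distance are assumptions, and the toy training points are illustrative.

```python
import math
from collections import Counter


def knn_predict(train_rows, train_labels, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    by_distance = sorted(
        zip(train_rows, train_labels),
        key=lambda pair: math.dist(pair[0], query),  # Euclidean distance
    )
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]


# Toy samples with two features (2 = benign, 4 = malignant).
train_rows = [(1, 1), (2, 1), (1, 2), (8, 9), (9, 8), (9, 9)]
train_labels = [2, 2, 2, 4, 4, 4]

near_benign = knn_predict(train_rows, train_labels, (2, 2))
near_malignant = knn_predict(train_rows, train_labels, (8, 8))
```

Because the prediction depends on raw distances, features on larger scales dominate the vote, which is one reason feature standardization matters for this algorithm.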

Table 5
K-Nearest Neighbors classification report
Class         precision  recall  f1-score  support
2                  0.76    0.92      0.83       93
4                  0.92    0.75      0.82      107
accuracy                             0.83      200
macro avg          0.84    0.84      0.83      200
weighted avg       0.85    0.83      0.83      200

Accuracy Score: 0.83

Confusion Matrix: [[86, 7], [27, 80]]

Table 5 shows the performance of the K-Nearest Neighbors classifier on the two classes, 2 and 4. The
overall accuracy is 0.83, so 83 percent of the predictions were correct. For class 2, precision is 0.76,
recall is 0.92, and f1-score is 0.83; the model finds most class 2 samples, but roughly a quarter of its
class 2 predictions are wrong. For class 4, precision is higher at 0.92 but recall falls to 0.75, giving an
f1-score of 0.82, so the model misses about a quarter of the malignant samples. The macro-average
precision, recall, and f1-score are 0.84, 0.84, and 0.83 respectively. In the confusion matrix, 86 class 2
samples were classified correctly and 7 were misclassified as class 4, while 80 class 4 samples were
classified correctly and 27 were misclassified as class 2. Note that this report covers 200 test samples,
whereas the reports for the other three models cover 140.

3. Results And Discussion

The model selected to classify breast tumors as benign or malignant is the Support Vector Machine
(SVM), a strong supervised learning technique for classification problems. In this research, the SVM
model performed better than the Decision Tree, Naive Bayes, and K-Nearest Neighbors models.
Accuracy: The high accuracy of the SVM model on the test set indicates that it classifies tumors
correctly in most cases. F1-score, recall, and precision: Recall is the ratio of correctly predicted positive
cases to all actual positive cases, while precision is the fraction of correctly predicted positive cases
among all cases predicted positive. The F1-score is the harmonic mean of recall and precision, which
gives a balanced evaluation metric. The SVM model had good predictive power for both classes, benign
and malignant, as shown by its high precision, recall, and F1-score. Confusion Matrix: A confusion
matrix is a table that compares actual values with predicted values to show how well a classification
model performs. It has four components: true positives, true negatives, false positives, and false
negatives. The confusion matrix for the SVM model showed a large number of true positives and true
negatives, suggesting that it differentiates well between the two classes.

Figure 2 compares the accuracy of the four machine learning algorithms: Decision Trees, SVM, Naive
Bayes, and K-Nearest Neighbors. The Decision Tree algorithm, which uses a tree structure to make
decisions based on the features, had an accuracy of 94.28%. The Support Vector Machine, which finds
the optimal hyperplane separating the classes, reached the highest accuracy, 97.14%. Naive Bayes, a
simple probabilistic classifier based on Bayes' theorem that assumes independence between predictors,
scored 95.71%, slightly below SVM. K-Nearest Neighbors, which assigns each data point the majority
class among its k nearest neighbors, had an accuracy of 83%, making it the least accurate of the
algorithms evaluated. SVM therefore gave the best performance on this dataset, followed by Naive
Bayes, Decision Trees, and KNN. This paper applies machine learning to the detection of breast cancer,
with a particular focus on the Support Vector Machine.
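The accuracy comparison can be recomputed from the confusion matrices reported in Tables 2 to 5. The short sketch below does only that arithmetic; it assumes rows are actual classes and columns are predicted classes, and the variable names are illustrative.

```python
# Confusion matrices as reported in Tables 2-5 (rows: actual class 2, then 4).
confusion = {
    "Decision Tree": [[93, 2], [6, 39]],
    "SVM": [[94, 1], [3, 42]],
    "Naive Bayes": [[90, 5], [1, 44]],
    "KNN": [[86, 7], [27, 80]],
}


def accuracy(matrix):
    """Fraction of samples on the diagonal of a confusion matrix."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / sum(sum(row) for row in matrix)


accuracies = {name: accuracy(m) for name, m in confusion.items()}
ranking = sorted(accuracies, key=accuracies.get, reverse=True)
test_set_sizes = {name: sum(sum(row) for row in m) for name, m in confusion.items()}
```

The resulting order is SVM, Naive Bayes, Decision Tree, KNN. The `test_set_sizes` dictionary also shows that the KNN matrix sums to 200 samples while the other three sum to 140, so the KNN figure is not measured on the same test set.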

The objectives were to improve diagnostic accuracy and predictive performance by building on insights
and techniques developed in earlier work. Comparing the results with the existing literature underlines
the novelty and effectiveness of the approach. One earlier study reports that SVM reached an accuracy
of 94% on the Wisconsin Breast Cancer dataset, showing that it is robust in separating benign from
malignant cases. Another study proposed a model combining fuzzy-based SVM and Decision Tree, with
an accuracy of 93.2% and relatively better precision, specificity, and recall. In contrast to these studies,
our methodology makes the SVM more discriminative by integrating feature engineering techniques.
We select and preprocess features carefully, and mitigating the effects of class imbalance generally
improves overall prediction accuracy. This position is supported by earlier work that recognized the
potential of SVM for handling extensive datasets and showed its clinical effectiveness. We have also
tried to design our approach to avoid the flaws of previous works. For instance, SVM reached an
accuracy of 96.49% for breast cancer identification in one earlier study, and our improvements in data
preprocessing and model tuning helped to go beyond this result. The validated accuracy of our SVM
model was 95.5%, and its test-set accuracy was 97.14%, which compares well with most of the previous
studies in both sensitivity and specificity. The challenge of imbalanced data in breast cancer prediction
raised by (7), which stems from the low proportion of cancer cases, is addressed in this work through
robust validation techniques and ensemble learning strategies. Integrating these methodologies gives
confidence that the results generalize across patient demographics and a wide range of clinical
scenarios while maintaining predictive accuracy. The advances reported in this study underscore the
potential of support vector machines to improve diagnostic outcomes in breast cancer. Our findings add
to the ongoing discussion of machine learning applications in healthcare and point toward combining
deep learning architectures with SVM for medical image analysis and diagnostic decision-making.

4. Conclusion
In conclusion, we have developed and evaluated several machine learning models for classifying breast
cancer as benign or malignant from features extracted from fine needle aspirates of breast tissue. The
data was preprocessed by handling missing values and converting categorical data into numerical form.
We evaluated four classifiers, Decision Tree, Support Vector Machine, Naive Bayes, and K-Nearest
Neighbors, using accuracy as the main measure. On the test set, the SVM model achieved the highest
accuracy. The trained model can be stored and reloaded for later use by serializing it with the pickle
module. This project therefore illustrates an application of machine learning methods in medical
diagnosis and shows the need for careful data analysis and model evaluation when developing
dependable predictive models.

Several extensions are possible. First, incorporating more data sources and features, such as genetic
data and patient demographics, would help improve the accuracy and robustness of the model. Second,
additional performance could be obtained by investigating ensemble techniques such as Random Forest
and Gradient Boosting. Deep learning, specifically Convolutional Neural Networks, may be appropriate
for more complex data such as medical imaging. The model could be integrated into clinical decision
support systems or provided as a web tool for real-time use by medical practitioners. Long-term
research that validates the model on external datasets and monitors its performance would help
establish its validity and applicability in different clinical contexts. Finally, applications of AI in
healthcare should go hand in hand with the resolution of ethical concerns about algorithmic fairness and
data privacy. These steps could substantially improve patient outcomes through early diagnosis and
tailored treatment of breast cancer patients.

Declarations
Author Contribution
Ravi Teja S and Dr. Sujatha Joshi conceptualized the study and developed the methodology. Ravi Teja S
performed the data preprocessing, conducted the experiments with machine learning algorithms, carried
out the statistical analysis, interpreted the results, prepared the figures and tables, and wrote the main
manuscript text. Dr. Sujatha Joshi reviewed and edited the manuscript. All authors have read and
approved the final manuscript.

References
1. N. Manjunathan, N. Gomathi, S. Muthulingam, "Early Detection of Breast Cancer using Machine
Learning", 2023 International Conference on Sustainable Computing and Smart Systems (ICSCSS).
Available from: https://ieeexplore.ieee.org/document/10169777.
2. Niharjyoti Das, Jutika Borah, Kumaresh Sarmah, "Diagnosis and Classification of Breast Cancer
Using Multiple Machine Learning Algorithms", 2023 International Conference on Advancement in
Computation. Available from: https://ieeexplore.ieee.org/document/10141796.
3. Dr G Neelima, Dr P Kanchanamala, Alok Misra, Ryan Adhitya Nugraha, "Detection Of Breast Cancer
Based on Fuzzy Logic", 2023 International Conference on Advancement in Data Science, E-learning
and Information System. Available from: https://ieeexplore.ieee.org/document/10270874.
4. Satyabrata Patro, B. Vasantha Lakshmi, V. Sailaja, Bhavani Sankar Panda, Devvret Verm, "Detecting
breast cancer using machine learning algorithms: The efficient and accurate way", 2023
International Conference on Artificial Intelligence and Smart Communication. Available from:
https://ieeexplore.ieee.org/document/10085251.
5. Yash Wankhade, Shrividya Toutam, Khushboo Thakre, Kamlesh Kalbande, Prasheel Thakre, "Machine
Learning Approach for Breast Cancer Prediction: A Review", 2023 2nd International Conference on
Applied Artificial Intelligence and Computing. Available from:
https://ieeexplore.ieee.org/document/10141164.
6. Jamal, Jahidul Hasan Antor, Pooja Rani, Rajneesh Kumar, "Breast Cancer Prediction Using Machine
Learning Classifiers", 2022 5th International Conference on Advances in Science and Technology.
Available from: https://ieeexplore.ieee.org/document/10039656.
7. Yajush Tewari, Eshant Ujjwal, Lalit Kumar, "Breast Cancer Classification Using Machine Learning",
2022 2nd International Conference on Advance Computing and Innovative Technologies in
Engineering. Available from: https://ieeexplore.ieee.org/document/9823932.
8. Harsh Sharma, Pooja Singh, Ayush Bhardwaj, "Breast Cancer Detection: Comparative Analysis of
Machine Learning Classification Techniques", 2022 International Conference on Emerging Smart
Computing and Informatics. Available from: https://ieeexplore.ieee.org/document/9758188.
9. Mandalapu Akhil, P.V. Siva Kumar, "Breast Cancer Prognosis using Machine Learning Applications",
2022 4th International Conference on Advances in Computing, Communication Control and
Networking. Available from: https://ieeexplore.ieee.org/document/10074517.
10. Jasjeet Kaur Sandhu, Amandeep Kaur, Chetna Kaushal, "Analysis of Breast Cancer in Early Stage by
Using Machine Learning Algorithms: A Review", 2022 IEEE International Conference on Current
Development in Engineering. Available from: https://ieeexplore.ieee.org/document/10080757/.
11. Srikanta Kumar Mohapatra, Arpit Jain, Anshika, Premananda Sahu, "Comparative Approaches by
using Machine Learning Algorithms in Breast Cancer Prediction", 2022 2nd International Conference
on Advance Computing and Innovative Technologies in Engineering. Available from:
https://ieeexplore.ieee.org/document/9823470.
12. Deng Yang, Yang Yujun, Qiu Laixiang, Zhou Wang, "Machine Learning Base Methods for Breast
Cancer Diagnose", 2022 19th International Computer Conference on Wavelet Active Media
Technology and Information Processing. Available from:
https://ieeexplore.ieee.org/document/10016494.
13. Prerita, Nidhi Sindhwani, Ajay Rana, Alka Chaudhary, "Breast Cancer Detection using Machine
Learning Algorithms", 2021 9th International Conference on Reliability, Infocom Technologies and
Optimization. Available from: https://ieeexplore.ieee.org/document/9596295.
14. D. Sandeep, Dr. G.N. Beena Bethel, "Accurate Breast Cancer Detection and Classification by Machine
Learning Approach", 2021 Fifth International Conference on I-SMAC. Available from:
https://ieeexplore.ieee.org/document/9640710.
15. Harinishree M. S, Aditya C. R, Sachin D. N, "Detection of Breast Cancer using Machine Learning
Algorithms – A Survey", 2021 5th International Conference on Computing Methodologies and
Communication. Available from: https://ieeexplore.ieee.org/document/9418488.

Figures

Figure 1: Proposed System

Figure 2: Comparison based on Accuracy
