Breast Cancer

This document describes a student project to predict breast cancer using machine learning algorithms. The student analyzed three algorithms: decision tree, K-nearest neighbors (KNN), and logistic regression. The student implemented these algorithms on a breast cancer dataset, took snapshots of the results, and compared the classification accuracy of each model. Logistic regression had the highest accuracy at 88.59%. The student concluded logistic regression is best for this medical dataset and future work could involve automating breast cancer analysis or predicting other diseases.

Uploaded by

Vijay Kumar

DEPARTMENT OF COMPUTER SCIENCE ENGINEERING

EAST WEST INSTITUTE OF POLYTECHNIC [597]


YELAHANKA NEW TOWN, BENGALURU – 560064

Predicting Breast Cancer Using Machine Learning

By: VIJAY KUMAR M [597CS20045]
Under the guidance of: RAMESH B.N, HEAD OF THE DEPARTMENT (CSE)
INTRODUCTION

What is breast cancer?

• Breast cancer is a disease in which cells in the breast grow out of control. There are different kinds of breast cancer.
• The kind of breast cancer depends on which cells in the breast turn into cancer. Breast cancer can begin in different parts of the breast.
• Breast cancer can spread outside the breast through blood vessels and lymph vessels.

OBJECTIVE
The proposed machine-learning approaches aim to predict breast cancer, since early detection can help
slow the progress of the disease and reduce the mortality rate through appropriate and timely
therapeutic interventions. Applying different machine-learning approaches, gaining access to bigger
datasets from different institutions (multi-centre study), and considering key features from a variety
of relevant data sources could improve modelling performance.

LITERATURE SURVEY
• Cancer is the second leading cause of death among women worldwide.
• Cancer covers a range of disorders of lethal cells that, if left untreated, lead to indolent lesions and mortality.
• Abnormal cells arise from genetic mutations: changes in a cell's deoxyribonucleic acid (DNA) cause it to grow out of
control and become cancerous.
• Early identification of breast cancer can assist the prognosis process, which can mitigate serious
complications of the disease and improve recovery.
- Jaffar et al. and Khan

SYSTEM REQUIREMENT SPECIFICATION
Hardware Requirements

System Processor : Core i3 / i5

Hard Disk : 500 GB.

RAM : 4 GB.

Software Requirements

Operating system : Windows XP / 7

Coding Language : Python

Software : Anaconda

IDE : Jupyter Notebook

Non-functional requirements

• Reliability
• Performance
• Portability
• Scalability
• Flexibility
• Security

MODEL DESIGN

METHODOLOGY

This section describes the three algorithms used in this system:

• Decision tree Algorithm
• K-Nearest Neighbour (KNN) Algorithm
• Logistic Regression algorithm

IMPLEMENTATION

• Decision tree Algorithm

The Decision Tree algorithm belongs to the family of supervised learning algorithms and can be used
for both regression and classification problems. The general motive for using a Decision Tree is to
create a training model that can predict the class or value of a target variable by learning decision
rules inferred from prior data (training data).
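
As a rough illustration of how a decision rule can be learned from training data, the sketch below searches for the single threshold that best separates two classes using Gini impurity. It is a minimal, standard-library example and not the project's implementation: the function names, the choice of Gini impurity, and the toy feature values are all assumptions made for illustration.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels (0.0 means a pure node)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def best_split(rows, labels):
    """Return (feature_index, threshold, weighted_gini) of the best single split."""
    best = (None, None, gini(labels))
    n = len(rows)
    for f in range(len(rows[0])):
        for threshold in sorted({row[f] for row in rows}):
            left = [y for row, y in zip(rows, labels) if row[f] <= threshold]
            right = [y for row, y in zip(rows, labels) if row[f] > threshold]
            if not left or not right:
                continue  # a split must send samples to both sides
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if score < best[2]:
                best = (f, threshold, score)
    return best


# Hypothetical toy data: [radius, texture]; label 1 = malignant, 0 = benign.
rows = [[12.0, 15.0], [13.5, 22.0], [20.1, 17.0], [22.4, 25.0]]
labels = [0, 0, 1, 1]

feature, threshold, impurity = best_split(rows, labels)
print(feature, threshold, impurity)  # 0 13.5 0.0
```

A full decision tree applies this search recursively to the left and right subsets until the nodes are pure or a depth limit is reached.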

• K-Nearest Neighbour (KNN) Algorithm

The k-nearest neighbours algorithm, also known as KNN or k-NN, is a non-parametric, supervised
learning classifier, which uses proximity to make classifications or predictions about the grouping of an
individual data point.
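
The idea of classifying by proximity can be sketched in a few lines: measure the distance from the new point to every training point, take the k closest, and return the majority label. This is a minimal standard-library sketch; the function name `knn_predict`, the Euclidean distance, the value k = 3, and the toy data are assumptions for illustration, not details taken from the project.

```python
import math
from collections import Counter


def knn_predict(train_rows, train_labels, query, k=3):
    """Predict the label of `query` by majority vote among its k nearest neighbours."""
    # Pair each training point's Euclidean distance to the query with its label.
    distances = sorted(
        (math.dist(row, query), label)
        for row, label in zip(train_rows, train_labels)
    )
    nearest = [label for _, label in distances[:k]]
    return Counter(nearest).most_common(1)[0][0]


# Hypothetical toy data: [radius, texture]; label 1 = malignant, 0 = benign.
train_rows = [
    [12.0, 15.0], [13.5, 22.0], [11.8, 18.0],
    [20.1, 17.0], [22.4, 25.0], [19.5, 21.0],
]
train_labels = [0, 0, 0, 1, 1, 1]

print(knn_predict(train_rows, train_labels, [21.0, 20.0]))  # 1
print(knn_predict(train_rows, train_labels, [12.5, 17.0]))  # 0
```

Because KNN relies on raw distances, features measured on very different scales are usually standardised before the neighbours are computed.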

• Logistic Regression algorithm

Logistic regression is an example of supervised learning. It is used to calculate or predict the
probability of a binary (yes/no) event occurring. For example, logistic regression could be applied
to determine whether a person is likely to be infected with COVID-19. Since there are two possible
outcomes to this question (yes, they are infected, or no, they are not), this is called binary
classification.
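
To make the "probability of a binary event" concrete, the sketch below fits a logistic regression model with plain batch gradient descent and then returns a probability for a new sample. It is an illustrative standard-library sketch only: the function names, learning rate, number of epochs, and the single standardised toy feature are assumptions, and the project itself may well have used a library implementation instead.

```python
import math


def sigmoid(z):
    """Map any real number to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic(rows, labels, lr=0.1, epochs=2000):
    """Fit weights and bias by batch gradient descent on the log-loss."""
    weights = [0.0] * len(rows[0])
    bias = 0.0
    n = len(rows)
    for _ in range(epochs):
        grad_w = [0.0] * len(weights)
        grad_b = 0.0
        for row, y in zip(rows, labels):
            p = sigmoid(bias + sum(w * x for w, x in zip(weights, row)))
            error = p - y  # gradient of the log-loss with respect to z
            for j, x in enumerate(row):
                grad_w[j] += error * x
            grad_b += error
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias


def predict_proba(weights, bias, row):
    """Estimated probability that `row` belongs to class 1."""
    return sigmoid(bias + sum(w * x for w, x in zip(weights, row)))


# Hypothetical toy data: one standardised feature; label 1 = positive class.
rows = [[-2.0], [-1.0], [-0.5], [0.5], [1.0], [2.0]]
labels = [0, 0, 0, 1, 1, 1]

weights, bias = train_logistic(rows, labels)
print(predict_proba(weights, bias, [1.5]) > 0.5)   # True
print(predict_proba(weights, bias, [-1.5]) > 0.5)  # False
```

A sample is assigned to the "yes" class when its predicted probability exceeds a chosen cut-off, commonly 0.5.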

SNAPSHOTS
Snap 1: Decision tree Algorithm
Snap 2: K-Nearest Neighbour (KNN) Algorithm
Snap 3: Logistic Regression algorithm
Snap 3.1: Logistic Regression algorithm
Snap 3.2: Logistic Regression algorithm (Confusion Matrix)
Snap 4: Classification Accuracy Comparison of Models
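
A confusion matrix and a classification accuracy figure both come from the same four counts: true negatives, false positives, false negatives, and true positives. The sketch below shows that calculation for a binary problem. The function names and the label lists are made up for illustration and are not the project's predictions or results.

```python
def confusion_matrix(y_true, y_pred):
    """Return (tn, fp, fn, tp) for binary labels where 1 is the positive class."""
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    return tn, fp, fn, tp


def accuracy(tn, fp, fn, tp):
    """Fraction of all predictions that were correct."""
    return (tn + tp) / (tn + fp + fn + tp)


# Hypothetical labels, for illustration only.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

tn, fp, fn, tp = confusion_matrix(y_true, y_pred)
print(tn, fp, fn, tp)            # 3 1 1 3
print(accuracy(tn, fp, fn, tp))  # 0.75
```

Comparing models then amounts to computing this accuracy for each classifier on the same held-out test set.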

CONCLUSION AND FUTURE ENHANCEMENT
Conclusion
• Medical datasets can be classified with many machine learning algorithms and techniques
beyond those used here, some of which may perform better.
• Logistic Regression surpasses the other algorithms with an accuracy of 88.5964%. I therefore
conclude that, of the three algorithms compared, the Logistic Regression classification
algorithm is the best suited to handling this medical dataset.

Future enhancement
• In the future, the designed system and its machine learning classification algorithms could be used
to predict or diagnose other diseases.
• The work could be extended to automate breast cancer analysis, including with other machine
learning algorithms.
