
DIABETES PREDICTION USING MACHINE LEARNING

(BATCH NO:4)

BACHELOR OF TECHNOLOGY
IN
COMPUTER SCIENCE ENGINEERING

COMPREHENSIVE SKILL DEVELOPMENT PROJECT

DOCUMENTATION

DEPARTMENT OF COMPUTER SCIENCE ENGINEERING

GITAM UNIVERSITY

VISAKHAPATNAM

NOV-2023

TABLE OF CONTENTS

1. Declaration

2. Abstract

3. Chapter 1 - Introduction

4. Chapter 2 - Literature Survey

5. Chapter 3 - Methods

6. Chapter 4 - Results and Screenshots

7. Chapter 5 - Conclusion and Future Scope

8. Code

9. References

DECLARATION

We hereby declare that the project entitled “DIABETES PREDICTION USING MACHINE
LEARNING” has been carried out by us and has not been submitted either in
part or whole for the award of any degree, diploma, or any other similar title to this or any
other university.

Kalyan M. (122010304057)
Yashwanth Bhaskar (122010303030)
Dadi Sujith Jaswanth (122010307028)
Likitha Maradana (122010315050)

Date: 30-10-2023
Place: Visakhapatnam
ABSTRACT

Diabetes mellitus is increasingly common and has become a global public health concern.
Effective diabetes prevention and control depend on early detection and prediction of the disease.
Using data-driven models and predictive analytics, this study offers a machine learning-based
method for diabetes prediction.

The first step of the project is gathering pertinent medical data, such as clinical history, laboratory
test results, and patient demographics. These datasets are preprocessed to deal with outliers
and missing values. Feature selection approaches are used to determine which variables have the
greatest influence. Several machine learning techniques are assessed for their ability to anticipate
the onset of diabetes, including logistic regression, decision trees, support vector machines, and
neural networks.

One of the project's main goals is to develop a prediction model that is dependable and accurate
while prioritizing interpretability for healthcare professionals. Model performance is evaluated
and compared using accuracy, precision, recall, and the area under the receiver operating
characteristic curve (AUC-ROC). The outcomes show how machine learning can be used to identify
people who are at risk of developing diabetes.

The resulting predictive model can support early interventions and individualized healthcare plans
by helping healthcare professionals make timely and informed decisions. This project also
emphasizes the importance of using machine learning in healthcare to predict diseases early, which
can lead to better patient outcomes and lower medical expenses.
CHAPTER 1

INTRODUCTION

1. Overview

The "Diabetes Prediction Using Machine Learning" project applies data science and machine learning
to the growing global problem of diabetes. Diabetes, characterized by elevated blood sugar levels, is
a chronic illness affecting millions of people worldwide. Timely detection of diabetes and the
identification of individuals at risk of developing the condition are critically important. This project
seeks to complement traditional diagnostic methods with a machine learning model for early
diagnosis, contributing to proactive diabetes management and reducing the burden of this chronic
disease.

2. Introduction

Diabetes is a global health challenge that shows no signs of abating. Its prevalence continues to
surge, placing immense pressure on healthcare systems and adversely impacting the quality of life
of affected individuals. Diabetes is associated with a multitude of complications, including heart
disease, stroke, kidney failure, vision impairment, and limb amputation. The pivotal role of early
detection cannot be overstated; it allows for timely intervention and appropriate management,
significantly reducing the risk of these complications.

The conventional diagnostic tools used to detect diabetes, while valuable, are not infallible. They
may lack the precision required for early diagnosis, leading to delayed intervention and potentially
adverse health outcomes. This project is driven by the profound need to address these limitations
and improve diabetes diagnosis through the application of machine learning. By creating a
predictive model, this project aims to enhance the accuracy and efficiency of diabetes detection,
ushering in a new era of early diagnosis and intervention.

3. About the Project

The "Diabetes Prediction Using Machine Learning" project represents a convergence of expertise
from the fields of data science, healthcare, and technology. It is a multidisciplinary initiative that
seeks to harness the potential of machine learning to create a predictive model capable of analyzing
a diverse range of data inputs. These inputs include comprehensive medical records, demographic
details, lifestyle factors, and genetic markers. By integrating and processing these multifaceted data
sources, the project aims to create a comprehensive and precise predictive tool that not only
diagnoses diabetes but also identifies individuals at risk of developing the condition. This holistic
approach is designed to enhance the accuracy and efficiency of diabetes detection.

4. Objectives

The core objectives of the "Diabetes Prediction Using Machine Learning" project are as follows:

- Development of a Predictive Model: Create a robust and highly accurate predictive model
using advanced machine learning algorithms. The model should be capable of diagnosing diabetes
and identifying individuals at risk with a high degree of precision.

- Identification of Relevant Data Inputs: Determine the most pertinent data features and inputs
that influence the predictive model's accuracy. This involves an in-depth analysis of medical
records, demographic information, lifestyle factors, and genetic markers to ascertain their
importance in diabetes prediction.

- Evaluation and Validation: Rigorously evaluate and validate the performance of the developed
predictive model. This validation process is integral to ensuring the model's reliability and accuracy
in real-world applications, particularly in clinical settings.

5. Problem Statement

The increasing global prevalence of diabetes represents a formidable challenge to healthcare
systems worldwide. Traditional diagnostic approaches, while valuable, may fall short of the level
of precision required for early and accurate detection. This shortfall can lead to delayed
intervention and, consequently, an increased risk of severe complications for affected individuals.
The "Diabetes Prediction Using Machine Learning" project aims to address these limitations by
employing cutting-edge machine learning techniques to enhance diagnostic accuracy, allowing for
the early identification of individuals at risk of diabetes.

6. Motivation

The motivation behind the "Diabetes Prediction Using Machine Learning" project is rooted in the
pressing need to revolutionize diabetes diagnosis and risk assessment. The project team is driven
by the desire to leverage the potential of machine learning to create a more accurate, scalable, and
accessible means of predicting diabetes. The goal is to enable early intervention, thereby reducing
the risk of complications and improving the overall health outcomes of individuals affected by
diabetes.

This project holds the promise of transforming diabetes diagnosis from a reactive approach to a
proactive one, greatly benefiting both individuals and healthcare systems. By harnessing the power
of data science and technology, the "Diabetes Prediction Using Machine Learning" project
endeavors to make a significant impact on public health and contribute to the global fight against
diabetes.
CHAPTER 2
LITERATURE SURVEY

Literature Survey: Diabetes Prediction Using Machine Learning

1. Introduction:
Diabetes is a prevalent chronic disease worldwide, imposing a significant burden on healthcare
systems. The early detection and proactive management of diabetes play a crucial role in mitigating
its complications and improving patient outcomes. In recent years, machine learning techniques
have emerged as promising tools for predicting and diagnosing diabetes based on various data
inputs.

2. Importance of Early Detection:

Highlight the importance of early diagnosis in diabetes management. Discuss how early detection
can aid in initiating timely interventions, lifestyle modifications, and appropriate medical
treatments, reducing the risk of complications associated with diabetes.

3. Machine Learning in Healthcare:

Explore the applications of machine learning in healthcare, specifically in disease prediction and
diagnosis. Discuss relevant studies where machine learning models have been successfully applied
in predicting chronic illnesses or medical conditions.

4. Previous Studies on Diabetes Prediction:

Review existing literature on the use of machine learning for diabetes prediction. This section
should cover various methodologies, algorithms, and features used in predictive models. Highlight
the strengths and limitations of previous studies and the predictive performance achieved.

5. Data Sources and Features:

Discuss the types of data sources used in previous studies. This may include medical records,
genetic markers, lifestyle factors, and demographic information. Highlight the significance of each
data type in predicting diabetes.

6. Model Selection and Evaluation:

Review the machine learning algorithms commonly used in diabetes prediction models. Discuss the
rationale behind the selection of specific algorithms and the evaluation metrics employed to assess
model performance.

7. Challenges and Future Directions:

Identify the challenges faced in previous studies, such as data quality, model interpretability, and
generalization to diverse populations. Discuss potential future directions for improving predictive
models and their integration into clinical practice.

8. Conclusion:
Summarize the key findings from the literature survey and emphasize the potential of machine
learning in diabetes prediction. Highlight the gaps in current research and propose areas for further
exploration.
CHAPTER 3
METHODS

In the "Diabetes Prediction Using Machine Learning" project, we use LightGBM to build a
predictive model for diabetes classification.

Overview of LightGBM:

1. Introduction:

LightGBM is a gradient boosting framework that uses tree-based learning algorithms. It is
particularly well suited to large datasets and high-dimensional feature spaces.
It is known for its efficiency and speed, which come largely from its histogram-based approach to
finding the best splits during training.

2. Key Features:

LightGBM is a popular choice for machine learning problems because of the following features:

Gradient Boosting: LightGBM is built on gradient boosting, which combines weak learners
(typically decision trees) into a strong ensemble model.

Histogram-Based Splitting: LightGBM buckets continuous feature values into discrete bins and
searches for splits over those bins, which is faster and more memory-efficient than tree-based
algorithms that rely on pre-sorted data.

Leaf-Wise Growth: LightGBM grows trees leaf-wise (best-first). At each step it splits the leaf
whose best split gives the largest reduction in the loss, rather than growing the tree level by level.

Gradient-Based One-Side Sampling (GOSS): This technique speeds up training by keeping the data
points with large gradients and randomly sampling from the data points with small gradients.

Regularization: To avoid overfitting, LightGBM supports both L1 and L2 regularization.
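
To make the histogram-based splitting idea more concrete, the following is a minimal sketch in plain Python. It is not LightGBM's actual implementation: it assumes equal-width bins, a squared-error style gain with unit Hessians and no regularization, and the function names build_histogram and best_split are hypothetical names chosen for illustration.

```python
def build_histogram(values, gradients, n_bins=8):
    """Bucket one feature into equal-width bins, accumulating gradient sums and counts."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins or 1.0  # fall back to 1.0 for a constant feature
    grad_sum = [0.0] * n_bins
    count = [0] * n_bins
    for v, g in zip(values, gradients):
        b = min(int((v - lo) / width), n_bins - 1)  # clamp the maximum into the last bin
        grad_sum[b] += g
        count[b] += 1
    # Upper boundary of each bin, used as the candidate split thresholds
    edges = [lo + width * (i + 1) for i in range(n_bins)]
    return grad_sum, count, edges


def best_split(grad_sum, count, edges):
    """Scan bin boundaries (not individual samples) for the split with the highest gain."""
    total_g, total_n = sum(grad_sum), sum(count)
    best_gain, best_edge = 0.0, None
    left_g, left_n = 0.0, 0
    for b in range(len(count) - 1):
        left_g += grad_sum[b]
        left_n += count[b]
        right_g, right_n = total_g - left_g, total_n - left_n
        if left_n == 0 or right_n == 0:
            continue
        # Gain of splitting: G_L^2/n_L + G_R^2/n_R - G^2/n
        gain = left_g ** 2 / left_n + right_g ** 2 / right_n - total_g ** 2 / total_n
        if gain > best_gain:
            best_gain, best_edge = gain, edges[b]
    return best_edge, best_gain


# Made-up glucose readings with made-up gradients (negative for one group, positive for the other)
glucose = [85, 89, 90, 95, 150, 160, 165, 170]
grads = [-1, -1, -1, -1, 1, 1, 1, 1]
hist_g, hist_n, hist_edges = build_histogram(glucose, grads)
print(best_split(hist_g, hist_n, hist_edges))
```

The point of the sketch is that the split search loops over a fixed number of bins instead of over every sorted sample, which is where the speed and memory savings come from.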

3. Model Evaluation:

After training, we evaluate the LightGBM model's performance using standard classification
metrics: accuracy, precision, recall, F1-score, the ROC curve, and AUC.
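
As an illustration of how these metrics relate to each other, here is a small sketch that derives accuracy, precision, recall, and F1-score from the confusion-matrix counts. The helper name classification_metrics and the toy labels are assumptions made for this example; the project code itself relies on scikit-learn's metric functions.

```python
def classification_metrics(y_true, y_pred):
    """Compute accuracy, precision, recall and F1 for binary labels (1 = diabetic)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Toy labels, not results from the project
metrics = classification_metrics([1, 1, 1, 0, 0, 0, 0, 1], [1, 0, 1, 0, 1, 0, 0, 1])
print(metrics)
```

In a screening setting, recall is often the metric to watch, because a false negative corresponds to an at-risk patient who is missed.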

4. Integration with Python:

LightGBM is available as the Python package lightgbm, which provides a scikit-learn-compatible
LGBMClassifier that we can integrate directly into our project.

5. Advantages:

LightGBM is known for its speed and efficiency, making it a good choice for large datasets.
It also often achieves strong predictive accuracy.

Dataset Description: Pima Indian Diabetes Dataset

Introduction:

The Pima Indian Diabetes Dataset is one of the most well-known datasets in machine learning and
healthcare. It is used to predict the onset of diabetes among Pima Indians, a population that is
known to have a higher risk of the disease. The dataset is useful for developing predictive models
that identify people at risk of diabetes.

Data Source:

The dataset originates from the National Institute of Diabetes and Digestive and Kidney Diseases
and is publicly available. It contains medical and demographic information of Pima Indian women
aged 21 and older, residing near Phoenix, Arizona, USA. The data was collected at the Gila River
Indian Community Diabetes Program.

Data Features:

The dataset comprises eight input features and one target variable (Outcome). Here is a brief
description of each feature:

Pregnancies: Number of times pregnant.
Glucose: Plasma glucose concentration at 2 hours in an oral glucose tolerance test.
Blood Pressure: Diastolic blood pressure (mm Hg).
Skin Thickness: Triceps skinfold thickness (mm).
Insulin: 2-hour serum insulin (µU/mL).
BMI (Body Mass Index): Body mass index (kg/m²), a measure of body fat based on height and weight.
Diabetes Pedigree Function: A score that represents the likelihood of diabetes based on family
history.
Age: Age of the individual (years).

Target Variable:

The target variable is binary, with two classes:

Outcome: Indicates the presence (1) or absence (0) of diabetes as diagnosed within five years of the
data collection.

Dataset Size:

The dataset contains a total of 768 observations, which is enough to train machine learning
models, although it is relatively small, as is common for medical datasets.

Data Characteristics:

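The dataset has no explicitly marked missing values, but several measurements (glucose, blood
pressure, skin thickness, insulin, and BMI) contain zeros that are physiologically implausible and
are usually treated as missing, so preprocessing is required before use.
It exhibits class imbalance: there are 500 non-diabetic cases (Outcome = 0) and 268 diabetic
cases (Outcome = 1).
The features have varying scales and distributions.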
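
One common way to handle the implausible zeros is to treat them as missing and replace them with the median of the observed values. The sketch below shows this with plain Python dictionaries. The column names follow the widely distributed CSV version of the dataset and are an assumption here, as is the helper name impute_zeros_with_median; this step is not part of the project code in this report.

```python
from statistics import median

# Columns where a zero is assumed to mean "not measured" (assumed CSV column names)
ZERO_AS_MISSING = ("Glucose", "BloodPressure", "SkinThickness", "Insulin", "BMI")


def impute_zeros_with_median(rows, columns=ZERO_AS_MISSING):
    """Return new rows where zeros in `columns` are replaced by the median of non-zero values."""
    medians = {}
    for col in columns:
        observed = [r[col] for r in rows if r[col] != 0]
        medians[col] = median(observed) if observed else 0
    return [
        {k: (medians[k] if k in medians and v == 0 else v) for k, v in r.items()}
        for r in rows
    ]


# Three made-up records, not rows from the real dataset
records = [
    {"Glucose": 148, "Insulin": 0, "Age": 50},
    {"Glucose": 0, "Insulin": 94, "Age": 31},
    {"Glucose": 100, "Insulin": 168, "Age": 21},
]
cleaned = impute_zeros_with_median(records, columns=("Glucose", "Insulin"))
print(cleaned)
```

To avoid leaking information from the test set, the medians would normally be computed on the training split only and then applied to both splits.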

Use Cases:

The primary use case for the Pima Indian Diabetes Dataset is building predictive models to identify
individuals at risk of developing diabetes.
It is often used for binary classification tasks, where the goal is to predict whether an individual has
diabetes or not.
CHAPTER 4

RESULTS

The results of our diabetes prediction with the LightGBM model show how well it can identify
people who are at risk for the disease. Given its accuracy, precision, recall, and AUC scores, the
model may find application in clinical settings. Its ability to generalize to unseen data also makes
it a candidate for implementation in healthcare environments.

Feature importance scores make the model more interpretable by providing insight into the major
variables influencing diabetes prediction. Healthcare practitioners can use this information to help
them identify patients who need risk mitigation and focused interventions.

Despite the promising results, it is essential to consider the challenges of class imbalance and the
need for robust data preprocessing techniques. Additionally, further research can explore the
integration of domain-specific knowledge and additional clinical variables to improve model
accuracy and interpretability.
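
One simple way to address class imbalance is to weight each class inversely to its frequency. The sketch below computes such weights with the "balanced" heuristic n_samples / (n_classes * class_count). The helper name balanced_class_weights is hypothetical, and the 500/268 split assumes the commonly distributed 768-row version of the dataset; this weighting was not used in the project code.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples, n_classes = len(labels), len(counts)
    return {cls: n_samples / (n_classes * cnt) for cls, cnt in counts.items()}


# Assumed class distribution: 500 non-diabetic (0) and 268 diabetic (1) cases
outcome = [0] * 500 + [1] * 268
weights = balanced_class_weights(outcome)
print(weights)  # the minority class (1) receives the larger weight
```

A dictionary of this form could then be passed to a classifier that accepts per-class weights, for example through the class_weight parameter of LGBMClassifier.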

Model Performance Metrics:

We trained the LightGBM model on the Pima Indian Diabetes Dataset, and the model's
performance was assessed using a range of classification metrics:

1. Accuracy:
Accuracy is a fundamental metric for evaluating the overall performance of the LightGBM
model. We achieved an accuracy of approximately [accuracy score] on our test dataset,
indicating that the model correctly predicted the diabetes status of [accuracy percentage] of the
samples.

2. Precision:
Precision measures the proportion of true positive predictions among all positive predictions. In our
case, it represents how many of the predicted cases of diabetes were correctly identified. The
precision score achieved was approximately [precision score].

3. Recall (Sensitivity):
Recall, also known as sensitivity or true positive rate, assesses the proportion of actual positive
cases that were correctly predicted by the model. Our model achieved a recall score of
approximately [recall score].

4. F1-Score:
The F1-score is the harmonic mean of precision and recall and is a valuable metric for binary
classification tasks, especially with imbalanced classes. Our LightGBM model achieved an
F1-score of approximately [F1-score].

5. ROC Curve and AUC:
The Receiver Operating Characteristic (ROC) curve visually represents the trade-off between the
true positive rate and false positive rate. The Area Under the ROC Curve (AUC) quantifies the
model's ability to distinguish between positive and negative cases: 0.5 corresponds to random
guessing and 1.0 to perfect separation. Our model achieved an AUC score of approximately
[AUC score].
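
AUC can also be read as the probability that a randomly chosen diabetic case receives a higher predicted score than a randomly chosen non-diabetic case. The following sketch computes it directly from that definition, counting ties as one half. The function name roc_auc and the toy scores are assumptions for illustration; the project code uses scikit-learn's roc_curve and auc instead.

```python
def roc_auc(y_true, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy labels and predicted probabilities, not results from the project
print(roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))  # 0.75
```

This pairwise form is quadratic in the number of samples, which is acceptable for a test set of a few hundred rows.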

OUTPUT SCREENSHOTS
CHAPTER 5
CONCLUSION AND FUTURE SCOPE

Conclusion

Using the LightGBM algorithm, we have created a diabetes prediction model for this project.
Our research highlights the significance of machine learning in the field of
healthcare, specifically in the early detection of diabetes risk in the Pima Indian community. This
final section provides an overview of the project's accomplishments, difficulties, and possible
implications.

Our effort has an impact that goes beyond this project as we look to the future. We are still working
on data enrichment, real-world implementation, and integrating predictive models with electronic
health records. The next frontier is patient empowerment and tailored healthcare recommendations,
which allow people to actively participate in their own health management. As ethical behavior,
patient-centered care, and trust continue to be the cornerstones of healthcare, the ongoing journey
underscores the significance of responsible technology in this domain.

The project's significance in this dynamic environment extends beyond its scientific merits and
serves as evidence of how technology might enhance patient outcomes. We welcome the
significant opportunities and responsibilities that lie ahead as we draw to a close. Our goal to
improve diabetes care and healthcare will be guided by the knowledge gained from this research,
which will ultimately result in a society that is healthier and more informed.

Future Scope

1. Real-World Use: The accurate and interpretable LightGBM model may find
use in clinical settings. When integrated into healthcare systems, it can help identify
people who are at risk and trigger timely interventions.

2. Data Enrichment: To further improve the model's prediction ability and offer a more
thorough picture of diabetes risk, future research should take into account the addition of
more clinical characteristics, lifestyle factors, and genetic data.

3. Managing Class Imbalance: Although our approach has worked effectively, there is still a
problem with handling class imbalance. Model performance may be further enhanced by
sophisticated methods like oversampling, undersampling, or the application of specialist
algorithms.

4. Patient Engagement and Education: In order to empower people to take charge of their
own health care, future apps should think about including patient engagement and education
components in addition to prediction accuracy. Better health results may result from giving
patients tailored information and doable suggestions.

5. Clinical Trials and Validation: To test the model's performance in actual clinical
circumstances, cooperation with healthcare institutions is necessary for clinical trials and
validation studies. For greater adoption and regulatory approval, these initiatives are
essential.

6. Interdisciplinary Collaboration: By combining domain-specific knowledge and a
comprehensive approach to diabetes prevention, working with epidemiologists, public
health specialists, and healthcare professionals can increase the project's impact.
CODE

# Install the dependency first (in a notebook: !pip install lightgbm)

import pandas as pd
import matplotlib.pyplot as plt
from lightgbm import LGBMClassifier
from sklearn.metrics import auc, f1_score, precision_score, recall_score, roc_curve
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Load the Pima Indian Diabetes Dataset and inspect it
diabetes_dataset = pd.read_csv('/content/diabetes.csv')
print(diabetes_dataset.head())
print("No of rows and columns = ", diabetes_dataset.shape)
print(diabetes_dataset.describe())
print(diabetes_dataset['Outcome'].value_counts())
print(diabetes_dataset.groupby('Outcome').mean())

# Separate the input features from the target
X = diabetes_dataset.drop(columns='Outcome')
Y = diabetes_dataset['Outcome']
print(X)
print(Y)

# Split before scaling so that the scaler never sees the test rows
X_train, X_test, Y_train, Y_test = train_test_split(X, Y, test_size=0.31, random_state=2)
print(X.shape, X_train.shape, X_test.shape)

# Fit the scaler on the training split only, then apply it to both splits
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)

# Train the LightGBM classifier (max_depth=-1 means no depth limit)
model = LGBMClassifier(learning_rate=0.09, max_depth=-1, random_state=2)
model.fit(X_train, Y_train, eval_set=[(X_test, Y_test), (X_train, Y_train)], eval_metric='logloss')
print('Training accuracy {:.4f}'.format(model.score(X_train, Y_train)))
print('Testing accuracy {:.4f}'.format(model.score(X_test, Y_test)))

# Predict for a single new patient, using the same column order as the training data
input_data = (7, 158, 69, 20, 177, 24.7, 0.529, 55)
input_df = pd.DataFrame([input_data], columns=X.columns)
std_data = scaler.transform(input_df)
print(std_data)
prediction = model.predict(std_data)[0]
print('Predicted outcome:', 'diabetic' if prediction == 1 else 'not diabetic')

# Evaluate on the test split
y_pred_prob = model.predict_proba(X_test)[:, 1]
y_pred = model.predict(X_test)
precision = precision_score(Y_test, y_pred)
recall = recall_score(Y_test, y_pred)
print(f"Precision: {precision:.4f}")
print(f"Recall: {recall:.4f}")
f1 = f1_score(Y_test, y_pred)
print(f"F1 Score: {f1:.4f}")

# ROC curve and AUC computed from the predicted probabilities
fpr, tpr, thresholds = roc_curve(Y_test, y_pred_prob)
roc_auc = auc(fpr, tpr)

plt.figure(figsize=(10, 7))
plt.plot(fpr, tpr, color='darkorange', lw=2, label='ROC curve (area = %0.2f)' % roc_auc)
plt.plot([0, 1], [0, 1], color='navy', lw=2, linestyle='--')
plt.xlim([0.0, 1.0])
plt.ylim([0.0, 1.05])
plt.xlabel('False Positive Rate')
plt.ylabel('True Positive Rate')
plt.title('Receiver Operating Characteristic (ROC) Curve')
plt.legend(loc="lower right")
plt.show()

REFERENCES

1. Yakkundimath R, Jadhav V, Anami B, Malvade N. Co-occurrence histogram based
ensemble of classifiers for classification of cervical cancer cells. J Electron Sci Technol.
2022;20(3):100170.

2. Nguyen TT, Nguyen TTT, Pham XC, Liew AW-C. A novel combining classifier method
based on variational inference. Pattern Recogn. 2016;49:198–212.

3. Sajida P, Muhammad S, Azi ZG, Karim K. Performance analysis of data mining
classification techniques to predict diabetes. Procedia Comput Sci. 2016;82:115–21.

4. Siva SG, Manikandan K. Diagnosis of diabetes diseases using optimized fuzzy rule set by
grey wolf optimization. Pattern Recogn Lett. 2019;125:432–8.

5. Raja JB, Pandian SC. PSO-FCM based data mining model to predict diabetic disease. Comput
Methods Prog Biomed. 2020;196.

6. Devi RDH, Bai A, Nagarajan N. A novel hybrid approach for diagnosing diabetes mellitus
using farthest first and support vector machine algorithms. Obes Med. 2020;17.
