Disease Pred
Minor Project-2
Submitted by: -
This is to certify that the Minor Project-II entitled “DISEASE PREDICTION” submitted by
SHASHANK GUPTA, SARANSH LAKHERA, and SAHIL PANDEY has been carried out
under my guidance & supervision. The project report is approved for submission towards partial
fulfillment of the requirement for the award of degree of BACHELOR OF TECHNOLOGY in
COMPUTER SCIENCE AND ENGINEERING from RAJIV GANDHI PROUDYOGIKI
VISHWA-VIDYALAYA, BHOPAL (M.P).
We hereby declare that the project entitled “DISEASE PREDICTION” which is being
submitted in partial fulfillment of the requirement for award of the Degree of Bachelor of
Technology in Computer Science to “RAJIV GANDHI PROUDYOGIKI
VISHWAVIDYALAYA, BHOPAL (M.P.)”
is an authentic record of our own work done under the guidance of PROF. SHIVENDU DUBEY,
Department of Computer Science and Engineering, GYAN GANGA INSTITUTE OF
TECHNOLOGY & SCIENCES, JABALPUR.
The matter reported in this Project has not been submitted earlier for the award of any other degree.
Date:
Place: JABALPUR
ACKNOWLEDGEMENT
We sincerely express our indebtedness to our esteemed and revered guide, PROF. SHIVENDU DUBEY,
of the Department of Computer Science, for his invaluable guidance, supervision, and
encouragement throughout the work. Without his kind patronage and guidance, the project would
not have taken shape.
We take this opportunity to express our deep sense of gratitude to Dr. Ashok Verma, Head of the
Department of Computer Science, for his encouragement and kind approval. We also thank him
for providing the computer lab facility, and we express our sincere regards to him for his
advice and counseling from time to time.
We owe sincere thanks to all the faculty members of the Department of Computer Science and Engineering
for their advice and counseling from time to time.
1. INTRODUCTION
2. PROBLEM STATEMENT
2.1 Business Requirements
2.1.3 Reports
2.1.4 Usability
3.2 Objective
4. DURATION
4.1 Timeline
5. REQUIREMENTS
6. DESIGN TECHNIQUES
6.1 Flask
6.2 Python
6.3 React
7.4 Observation
8. DESIGN
11. CONCLUSION
12. BIBLIOGRAPHY
Abstract
In today’s healthcare landscape, timely diagnosis of chronic diseases like diabetes is critical. With
the increasing load on the healthcare system and lack of adequate resources in rural and semi-
urban areas, the integration of machine learning-based predictive systems has become essential.
This project aims to develop a disease prediction model focusing on diabetes using a machine
learning algorithm trained on medical datasets. The application will allow users to input
symptoms and receive a predictive outcome based on data patterns, offering a cost-effective and
scalable solution. Our goal is to support pre-diagnosis, especially for underserved populations,
and enable early interventions. The system’s effectiveness has been validated using statistical
methods and cross-validation techniques, showing promising results.
INTRODUCTION
1.1 The application of Artificial Intelligence (AI) and Machine Learning (ML) in the healthcare sector is
revolutionizing disease prediction and diagnosis. With the rise of lifestyle diseases such as diabetes,
hypertension, and cardiovascular illnesses, early detection has become paramount. According to the World
Health Organization, diabetes alone caused an estimated 1.5 million deaths globally in 2021. Most of these
cases could have been prevented or managed better with early diagnosis. This project aims to build a
machine learning-based system for predicting diabetes, providing preliminary diagnostic support based on
clinical and behavioral parameters. This helps reduce dependency on immediate physician availability,
speeds up intervention, and increases awareness. The project aligns with the Digital India and Ayushman
Bharat initiatives to improve accessibility and efficiency in healthcare delivery.
2. PROBLEM STATEMENT
Diabetes has become one of the most common non-communicable diseases worldwide, with rising cases in both
urban and rural India. The current medical infrastructure is overburdened, and access to early diagnosis is
limited. Delays in detection lead to complications such as organ failure, vision loss, and cardiovascular issues.
Despite the availability of data, most systems are not intelligent enough to use this data for early detection.
Challenges include:
2.1 BUSINESS REQUIREMENTS
Stakeholder         Requirement
Patient             Simple interface, quick prediction, privacy
Doctor/Admin        Accurate results, analytics dashboard
Developer           Clean datasets, scalable ML models
Government          Align with public health mission goals
Ethics Committee    No misuse of sensitive patient data
2.1.4 USABILITY
Assumptions:
Users have internet access
Dataset used is reliable and verified
Inputs are clinically relevant
Constraints:
Limited dataset
No integration with real-time hospital data
Works only for diabetes in current version
Dependencies:
Python 3.x
Flask Framework
Pandas, NumPy, Scikit-learn
Frontend: HTML, CSS, JavaScript
3.1 OBJECTIVE
The main objective of the Disease Prediction System using Machine Learning is to provide a reliable, fast, and
intelligent platform that can assess the likelihood of a patient developing a disease—primarily diabetes—based on input
health parameters. This system aims to support medical decision-making by offering an initial assessment and reducing
the workload of medical professionals.
Specific Objectives:
To develop a web-based interface that collects patient data in real-time.
To implement a machine learning model that predicts the risk of diabetes.
To improve the accuracy and speed of disease diagnosis using automation.
To create a system that is scalable, secure, and accessible across devices.
To assist in early diagnosis and preventive care for high-risk patients.
To minimize the gap in healthcare accessibility between urban and rural areas.
Management Information System (MIS) reports are essential for tracking the usage,
efficiency, and outcomes of the disease prediction system. These reports provide
meaningful insights into the system's operation and its user base.
Usage Report: Number of users, login frequency, and peak usage times.
Prediction Summary Report: Number of positive and negative cases predicted over a
selected time frame.
Geographical Reports: Location-based usage statistics (urban vs. rural).
Performance Reports: Accuracy rate of predictions, average time taken per diagnosis.
Error Reports: Logs of failed predictions, invalid inputs, and form errors.
Maintenance Logs: Updates, patches applied, and system downtime logs.
These reports can be exported in formats such as CSV or PDF for use in
administrative dashboards or government health department submissions.
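As an illustration of the Prediction Summary Report, the following minimal Python sketch counts positive and negative predictions in a date range and serializes the result as CSV. The log structure, the sample rows, and the function names `prediction_summary` and `summary_to_csv` are assumptions made for this example; they are not taken from the project's code.

```python
import csv
import io
from collections import Counter
from datetime import date

# Hypothetical prediction log: (date of prediction, predicted label).
# In a real deployment these rows would come from the application's own log.
PREDICTION_LOG = [
    (date(2024, 1, 5), "positive"),
    (date(2024, 1, 9), "negative"),
    (date(2024, 1, 20), "negative"),
    (date(2024, 2, 2), "positive"),
]


def prediction_summary(log, start, end):
    """Count positive and negative predictions with start <= day <= end."""
    counts = Counter(label for day, label in log if start <= day <= end)
    return {"positive": counts["positive"], "negative": counts["negative"]}


def summary_to_csv(summary):
    """Serialize the summary as CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["outcome", "count"])
    for outcome, count in summary.items():
        writer.writerow([outcome, count])
    return buffer.getvalue()


january = prediction_summary(PREDICTION_LOG, date(2024, 1, 1), date(2024, 1, 31))
print(summary_to_csv(january))
```

The same summary dictionary could be passed to a PDF generator instead; only the CSV path is sketched here because it needs nothing beyond the standard library.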
5. DURATION
Project Phase                Duration (Days)   Team Members   Total Person Days
Requirement Gathering        12                3              3
Design & Architecture        15                1              1
ML Training & Development    30                1              1
Testing & Debugging          10                1              3
Documentation                1                 1              1
Final Deployment             1                 2              1
Total                        69 Days           -              2 Person Days
External interface requirements define how the system will interact with users and other systems.
Works on any device with a web browser, including mobile phones, tablets, and desktops.
The frontend is built with HTML, CSS, and JavaScript and communicates with the Flask backend that
serves the prediction model. Browser local storage can be used for client-side persistence where needed.
1. PERFORMANCE REQUIREMENTS:
2. SAFETY REQUIREMENTS:
The responsibility for shared content within the disease prediction system lies with its
administrators. Administrators are held accountable for the materials they upload to the app, ensuring a
safe and controlled environment for users.
3. SECURITY REQUIREMENTS:
Security requirements are necessary to ensure the integrity, confidentiality, and responsible use of the system, especially
since it deals with health data.
Data Privacy: All personal information must be anonymized or encrypted before storage.
Access Control: Only authorized users can access sensitive modules like admin dashboards.
Input Validation: All user inputs should be validated to prevent SQL injection or XSS attacks.
Secure Communication: Use HTTPS to prevent interception of data during transmission.
Backup Systems: Periodic backups to recover data in case of system failure.
Error Handling: Clear error messages without exposing internal system logic.
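The input validation and error handling requirements above could be approached along the following lines. This is a sketch only: the field names and the accepted ranges in `FIELD_RANGES` are hypothetical placeholders, not clinical limits and not the project's actual form fields.

```python
# Hypothetical field names and plausible ranges; not taken from the project's dataset.
FIELD_RANGES = {
    "age": (1, 120),
    "bmi": (10.0, 80.0),
    "glucose": (20.0, 600.0),
}


def validate_inputs(form):
    """Return (cleaned_values, errors) for a dict of raw form strings."""
    cleaned, errors = {}, {}
    for field, (low, high) in FIELD_RANGES.items():
        raw = (form.get(field) or "").strip()
        if not raw:
            errors[field] = "This field is required."
            continue
        try:
            value = float(raw)
        except ValueError:
            # Non-numeric text (including markup or SQL fragments) is rejected here.
            errors[field] = "Please enter a number."
            continue
        if not low <= value <= high:
            errors[field] = f"Value must be between {low} and {high}."
            continue
        cleaned[field] = value
    return cleaned, errors


cleaned, errors = validate_inputs({"age": "45", "bmi": "27.5", "glucose": "110"})
print(cleaned, errors)
```

Only values that parse as numbers within range are passed on, and the messages shown to the user describe the problem without revealing internal system logic.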
These attributes describe how well the software behaves in terms of performance, usability, maintainability, and reliability.
1. Reliability:
The system should produce consistent and accurate results under normal and high load.
Fail-safe mechanisms should be implemented in case of model failure or API timeout.
2. Availability:
The application should be available 24/7 with minimal downtime.
Scheduled maintenance should be announced in advance.
3. Security:
Implementation of secure login (with salted password hashing).
Protection against common cyber threats like SQL injection, brute force attacks, etc.
4. Maintainability:
Modular design allows for easy debugging and upgrades.
Proper documentation to help new developers understand the codebase.
5. Portability:
The system should be deployable on different environments (Windows/Linux).
Mobile browser compatibility ensures usability across devices.
6. Performance:
Prediction result should be generated in less than 3 seconds.
Efficient use of RAM and CPU during model training and runtime.
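One possible way to combine the fail-safe requirement (Reliability) with the 3-second target (Performance) is a thin wrapper around the prediction call. The sketch below is an assumption about how this might look, not the project's implementation; `safe_predict`, `stub_model`, and `broken_model` are hypothetical names, and the wrapper only measures latency, it does not interrupt a slow call.

```python
import time

LATENCY_BUDGET_SECONDS = 3.0  # target stated in the Performance attribute


def safe_predict(predict_fn, features):
    """Call predict_fn and return a result dict instead of raising to the caller."""
    started = time.perf_counter()
    try:
        label = predict_fn(features)
        status = "ok"
    except Exception:
        # Fail safe: report that no prediction is available instead of crashing.
        label = None
        status = "unavailable"
    elapsed = time.perf_counter() - started
    return {
        "status": status,
        "prediction": label,
        "seconds": elapsed,
        "within_budget": elapsed < LATENCY_BUDGET_SECONDS,
    }


def stub_model(features):
    """Hypothetical stand-in for the trained model."""
    return "negative"


def broken_model(features):
    """Hypothetical model that fails, e.g. because its file is missing."""
    raise RuntimeError("model file missing")


print(safe_predict(stub_model, {"glucose": 110}))
print(safe_predict(broken_model, {"glucose": 110}))
```

The `seconds` field could feed the Performance Reports described earlier, while the `unavailable` status gives the interface something safe to display when the model fails.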
6. DESIGN TECHNIQUES
Dataset cleaning using Pandas
Feature scaling for better model accuracy
Splitting data into training and testing
Training logistic regression, decision tree classifiers
Storing model with pickle for reuse
Frontend-backend communication via Flask routes
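The scaling, splitting, training, and pickling steps listed above can be sketched with scikit-learn as follows. The data here is synthetic and invented purely for illustration (two made-up features and a label derived from one of them); it is not the project's dataset, and the resulting accuracy says nothing about the real model. Pandas cleaning, the decision tree classifier, and the Flask route are left out to keep the sketch short.

```python
import pickle

import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data: two made-up features (glucose, BMI) and a label
# that depends on glucose plus noise. This is NOT the project's dataset.
rng = np.random.default_rng(0)
glucose = rng.normal(120, 30, size=400)
bmi = rng.normal(28, 5, size=400)
X = np.column_stack([glucose, bmi])
y = (glucose + rng.normal(0, 10, size=400) > 130).astype(int)

# Split the data into training and testing sets.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Feature scaling and logistic regression combined in one pipeline,
# so the scaler is fitted on the training data only.
model = make_pipeline(StandardScaler(), LogisticRegression())
model.fit(X_train, y_train)
accuracy = model.score(X_test, y_test)

# Store the fitted pipeline with pickle and load it back for reuse.
restored = pickle.loads(pickle.dumps(model))
same_predictions = bool((restored.predict(X_test) == model.predict(X_test)).all())
print(f"held-out accuracy: {accuracy:.2f}, pickle round trip ok: {same_predictions}")
```

Pickling the whole pipeline rather than the classifier alone means the same scaling is applied automatically when the stored model is loaded by the web application.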
Design of the site has been done using the following technologies: -
● HTML
● CSS
● JAVASCRIPT
6.1 HTML
Provides the structural layout of the pages, including forms, buttons, and lists.
6.2 CSS
Implements responsive design and layout, ensuring compatibility across devices. Stylesheets manage
the look and feel, using Flexbox or Grid for layout control.
6.3 JAVASCRIPT
Adds interactivity by allowing users to enter and submit their health parameters.
Firebase is a popular choice for many developers and businesses due to its extensive set of features
and benefits. Here are some compelling reasons to use Firebase for your mobile and web
application development projects:
● Real-time Database: Firebase provides a real-time NoSQL database that allows you to
synchronize data across clients in real time. This is ideal for applications that require
instant updates and collaboration, such as chat apps and collaborative tools.
● Authentication: Firebase simplifies user authentication with support for various
authentication methods, including email/password, social logins (e.g., Google, Facebook,
Twitter), and more. It offers a secure and scalable way to manage user identities.
● Cloud Firestore: Cloud Firestore, Firebase's scalable NoSQL database, offers a more
powerful query engine compared to the Realtime Database. It provides real-time data
synchronization, making it suitable for applications with complex data storage needs.
● Cloud Functions: Firebase Cloud Functions allows you to run server-side code in response
to events triggered by Firebase services or HTTP requests. This is valuable for creating
custom backend logic, processing data, and integrating with external services.
● Firebase Hosting: Firebase Hosting provides a straightforward and secure hosting solution for
web applications. You can deploy your web app directly from the Firebase CLI and benefit
from content delivery through a global Content Delivery Network (CDN).
● Firebase Storage: Firebase Storage offers cloud-based file storage with automatic scaling and
easy integration into your Firebase applications. It's often used for storing user-generated
content, such as images, videos, and files.
● Cloud Messaging: Firebase Cloud Messaging (FCM) is a cloud solution for sending
messages and notifications to iOS, Android, and web applications. It supports real-time
messaging and targeting specific user groups.
● Machine Learning Integration: Firebase integrates with Google's machine learning
capabilities, allowing you to leverage features like ML Kit to add machine learning
capabilities to your apps.
● Performance Monitoring: Firebase Performance Monitoring provides insights into your
app's performance, helping you identify and resolve issues related to network requests,
app startup time, and more.
● Analytics: Firebase Analytics offers detailed user analytics, enabling you to track user
behavior, measure in-app events, and gain insights into user engagement with your
application.
● Remote Config: Firebase Remote Config allows you to modify your app's behavior and
appearance without the need to publish a new app update. You can target specific user
groups or app versions with customized configurations.
● A/B Testing: Firebase A/B Testing lets you run experiments in your app to determine which
variations of features or user experiences perform better with users.
● Crashlytics: Firebase Crashlytics provides detailed crash reporting and analysis to help you
identify and fix issues in your app quickly.
● App Indexing: Firebase App Indexing helps your app get discovered on Google Search by
allowing you to index content from your app and make it accessible through Google
search results.
Firebase offers a comprehensive and integrated set of services that simplify many aspects of
application development. It can help you save time, improve app quality, and enhance user
engagement. Whether you are building a mobile app, a web app, or a combination of both, Firebase
is a powerful tool for developers and businesses looking to streamline their development process
and create successful applications.
Sprint Breakdown:
Sprint 1: Requirement analysis and dataset collection
The project does not involve a progressive refinement process that requires evolutionary models, as all
requirements are predefined.
Since user feedback could prompt design improvements, a strict sequential approach like the
Waterfall model would restrict flexibility.
7.4 OBSERVATION
The Agile model allows iterative development and user feedback, which can lead to
enhancements in usability and performance.
● Technical: Utilizes HTML, CSS, Python, and Flask for efficient development and maintenance. The
integrated prediction model showcases technical soundness.
● Operational: Streamlined user roles for guest users and administrators ensure smooth
navigation and effective service delivery.
● Market: Meets a critical need in the healthcare space, offering accessible services
aligning with growing demand.
The project is therefore technically, operationally, and economically feasible.
1. DESIGN
User Roles:
Guest User: Can input symptoms and get prediction
Admin: Can monitor logs, view usage stats
A use case is a description of how end-users will use a software system. It describes a task or a
series of tasks that users will accomplish using the software, and includes the responses of the
software to user actions.
SYSTEM DESIGN
1.2 ACTIVITY DIAGRAM
1.3 SEQUENCE DIAGRAM
1.4 DFD
Software testing involves systematically evaluating a software product to identify and fix defects,
errors, and vulnerabilities.
Testing of software is a critical phase in the software development life cycle aimed at
identifying and fixing defects or issues in the software to ensure its quality, reliability, and
functionality.
Effective testing is crucial to delivering high-quality software. It helps identify and rectify
issues early in the development cycle, reducing the cost and impact of defects on the final
product.
Test Strategy:
Unit Testing: Functions and form validation
Integration Testing: Frontend-backend interactions
System Testing: Full system evaluation
Manual Testing: GUI and user inputs
Test Case           Input           Expected Output     Result
Form Validation     Empty Fields    Error Message       Pass
Prediction Model    Valid Input     Diabetes: Yes/No    Pass
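The two test cases in the table could be automated with Python's built-in `unittest` module, roughly as sketched below. Both `validate_form` and `predict_diabetes` are hypothetical stand-ins written for this example; in particular, the threshold rule inside `predict_diabetes` is invented and only takes the place of the trained model.

```python
import unittest


def validate_form(form):
    """Hypothetical validator: every required field must be non-empty."""
    required = ("age", "bmi", "glucose")
    missing = [name for name in required if not str(form.get(name, "")).strip()]
    return f"Missing fields: {', '.join(missing)}" if missing else None


def predict_diabetes(form):
    """Hypothetical stand-in for the trained model, using a made-up threshold rule."""
    return "Yes" if float(form["glucose"]) >= 126 else "No"


class PredictionFormTests(unittest.TestCase):
    def test_empty_fields_give_error_message(self):
        # Test case "Form Validation": empty fields must produce an error message.
        self.assertEqual(
            validate_form({"age": "", "bmi": "", "glucose": ""}),
            "Missing fields: age, bmi, glucose",
        )

    def test_valid_input_gives_yes_or_no(self):
        # Test case "Prediction Model": valid input must produce Yes or No.
        form = {"age": "50", "bmi": "31.2", "glucose": "140"}
        self.assertIsNone(validate_form(form))
        self.assertIn(predict_diabetes(form), {"Yes", "No"})
```

Saved in a file, these tests can be run with `python -m unittest`, which reports each case as passed or failed in the same spirit as the Result column above.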
2.1 Types of Software Testing:
● Unit Testing: Testing individual components or functions to ensure they work as intended.
● Integration Testing: Verifying the interaction between different modules or components.
● System Testing: Evaluating the entire software system as a whole.
● User Acceptance Testing (UAT): Performed by end-users to confirm the software meets their
requirements.
● Regression Testing: Ensuring that new code changes don't break existing functionality.
● Performance Testing: Assessing the software's speed, scalability, and responsiveness.
● Security Testing: Identifying vulnerabilities and assessing the security of the software.
● Usability Testing: Evaluating the user-friendliness and user experience of the software.
● Compatibility Testing: Checking the software's compatibility with different devices, browsers,
and platforms.
● Test Planning: Developing a testing strategy, defining objectives, and creating test plans.
● Test Design: Creating test cases, scripts, and test data.
● Test Execution: Running the tests and collecting results.
● Defect Reporting: Documenting and managing identified issues.
● Test Closure: Summarizing the testing process, archiving test materials, and generating reports.
● Manual Testing: Testers perform tests manually without using automation tools.
● Automated Testing: Test scripts and tools are used to automate the testing
process, improving efficiency and repeatability.
● Waterfall Model: Testing is typically done at the end of each development phase.
● Agile Model: Testing is integrated into the development process with continuous testing
iterations.
● DevOps/Continuous Integration (CI)/Continuous Delivery (CD): Automated testing is a crucial
part of the development pipeline, ensuring rapid and reliable code deployment.
● Various testing tools are available for different types of testing, such as Selenium for web
testing, JUnit for Java unit testing, and JIRA for test management.
● QA is the overall process of ensuring software quality, which includes testing but also
encompasses processes like code reviews and best practices.
2.1.6 Challenges:
● Test data management, test environment setup, and the evolving nature of software can present
challenges in software testing.
● Define clear testing objectives, create comprehensive test cases, automate repetitive tests,
and collaborate between development and testing teams.
2.2 BETA TESTING
Beta testing is a type of user acceptance testing where a pre-release version of a software
product is made available to a select group of external users, known as beta testers. These users
are not part of the development team but represent the actual target audience for the software.
Beta testing is a crucial step in the software development process to ensure that the software
product is well-received by its intended users and to identify and rectify issues before the
official release. It helps in enhancing the software's quality, user satisfaction, and market
readiness.
3. Usability Testing:
Beta testing provides insights into the software's usability, user interface, and user experience.
This feedback helps in making the software more user-friendly.
4. Stress Testing:
Beta testers can provide information on how the software performs under different conditions,
including heavy usage, various hardware configurations, and network conditions.
6. Real-World Testing:
It allows the software to be tested in a variety of real-world scenarios, providing a more accurate
assessment of its performance and reliability.
● Closed vs. Open Beta: Beta testing can be "closed" (limited to a specific group of invited
testers) or "open" (available to the public). Closed beta tests are often used for more controlled
feedback, while open beta tests can involve a broader range of users.
● Time-Limited: Beta testing typically has a predefined time frame during which feedback is
collected and issues are addressed before the final release.
● Iterative: Feedback from beta testing can lead to multiple iterations and subsequent beta
releases to address identified issues and improve the product.
● Communication with Testers: Effective communication with beta testers is important. This
includes providing them with instructions, collecting feedback, and addressing their questions
and concerns.
White-box testing is a software testing method that examines the internal structure, code, and
logic of a software application. Testers who perform white-box testing have knowledge of the
internal workings of the application, including the source code. This testing method is
sometimes referred to as "clear box testing," "glass box testing," or "structural testing." Its
primary goal is to evaluate the application's internal logic, data flow, and the way it handles
different conditions and scenarios.
Here are some key aspects of white-box testing:
Purpose: White-box testing focuses on verifying the correctness of the code, identifying logical
errors, and ensuring that all code paths are executed as intended.
2.3.2 Testers:
White-box testing is typically performed by developers, code reviewers, or specialized quality
assurance engineers who have access to the source code.
Limitations:
Requires a deep understanding of the code, which may not be available for third-party or
legacy software.
Testing every possible code path can be time-consuming and may not be feasible in
complex applications.
White-box testing is often used in combination with other testing methods, such as black-box
testing, to provide a comprehensive assessment of software quality. It's particularly valuable in
critical systems and applications where code integrity and reliability are of utmost importance.
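To make the idea of exercising every code path concrete, the sketch below shows a small function with four branches and one test input chosen for each branch by reading the code. The function `risk_band` and its thresholds are illustrative assumptions, not part of the project.

```python
def risk_band(probability):
    """Map a predicted probability to a band; thresholds are illustrative only."""
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must be between 0 and 1")
    if probability < 0.3:
        return "low"
    if probability < 0.7:
        return "moderate"
    return "high"


# One input per return path, chosen by inspecting the branches above.
branch_cases = {0.1: "low", 0.5: "moderate", 0.9: "high"}
for probability, expected in branch_cases.items():
    assert risk_band(probability) == expected

# The remaining path: invalid input must raise instead of returning a band.
try:
    risk_band(1.5)
except ValueError:
    rejected = True
else:
    rejected = False
assert rejected
```

A tester without access to the source could not know that exactly these four inputs are enough to reach every branch, which is the distinguishing feature of the white-box approach.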
Black-box testing is a software testing method that assesses the functionality of a software
application without examining its internal code, structure, or logic. Testers who perform black-
box testing do not have access to the source code and focus solely on testing the software based
on its specifications and requirements. This method is sometimes referred to as "behavioral
testing" or "functional testing." Its primary goal is to ensure that the software performs its
intended functions correctly and meets the specified requirements.
1. Purpose:
Black-box testing is primarily used to validate that the software behaves as expected from an
end-user perspective. It focuses on functional correctness, input-output behavior, and system
functionality.
2. Testing Techniques:
- Functional Testing:
Evaluates whether the software's functions and features work as specified.
- Non-Functional Testing:
Assesses aspects like performance, usability, security, and compatibility.
- Boundary Testing:
Examines how the software handles input at the boundaries of valid and invalid data.
- Error Handling Testing:
Verifies how the software manages and reports errors and exceptions.
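Boundary testing, as described above, can be illustrated with a single numeric field. In this sketch the valid age range of 1 to 120 and the function `age_is_valid` are assumptions chosen for the example; the test values sit just outside, exactly on, and just inside each limit.

```python
MIN_AGE, MAX_AGE = 1, 120  # hypothetical valid range for an age field


def age_is_valid(age):
    """Accept whole-number ages within the inclusive valid range."""
    return isinstance(age, int) and MIN_AGE <= age <= MAX_AGE


# (input, expected result) pairs around the lower and upper boundaries.
boundary_cases = [
    (MIN_AGE - 1, False),
    (MIN_AGE, True),
    (MIN_AGE + 1, True),
    (MAX_AGE - 1, True),
    (MAX_AGE, True),
    (MAX_AGE + 1, False),
]
results = [(age, age_is_valid(age)) for age, _ in boundary_cases]
assert results == boundary_cases
```

Off-by-one mistakes such as writing `<` instead of `<=` are caught by the cases that sit exactly on a limit, which is why those values are included alongside their neighbours.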
3. Testers:
Black-box testing can be performed by quality assurance engineers, independent testing teams, or
end-users that do not have knowledge of the application's internal code.
4. Types of Black-Box Testing:
- Functional Testing:
Ensures that the software functions as expected and meets user requirements.
- System Testing:
Evaluates the entire system to ensure it operates correctly as a whole.
- Integration Testing:
Tests how different components or modules interact with each other.
- User Acceptance Testing (UAT):
Performed by end-users to validate that the software meets their needs.
5. Advantages:
- Does not require knowledge of the application's internal code, making it suitable for testing
third-party or legacy software.
- Focuses on user requirements and real-world scenarios.
6. Limitations:
- May not uncover certain types of issues like logic errors or inefficiencies within the code.
- Testing coverage depends on the quality of the requirements and test cases.
Black-box testing is an essential part of the software testing process, providing an independent
assessment of software quality from an end-user perspective. It complements white-box testing,
which focuses on the internal structure of the code, and is crucial for identifying functional
issues, ensuring compliance with requirements, and enhancing the overall quality of software
applications.
Software testing is critically important for several reasons in the software development process:
Ensuring Reliability:
Testing ensures that the software operates reliably under various conditions and user
interactions. This is essential to build trust among users and stakeholders.
Meeting Requirements:
Testing ensures that the software meets its intended requirements and specifications. It verifies
that the software behaves as expected and delivers the functionality users require.
Enhancing Security:
Security testing identifies vulnerabilities and weaknesses in the software that could be exploited
by attackers. Addressing these vulnerabilities is crucial to protect sensitive data and prevent
security breaches.
Optimizing Performance:
Performance testing helps evaluate the software's speed, scalability, and responsiveness. This is
crucial for ensuring that the software can handle the expected load and user demands.
User Satisfaction:
Usability testing and user acceptance testing (UAT) ensure that the software is user-friendly and
meets the needs and expectations of its users. Satisfied users are more likely to continue using
the software.
Reducing Costs:
Identifying and fixing issues during the development and testing phases is generally more cost-
effective than addressing them after the software is in production. Maintenance costs are
significantly lower when issues are resolved early.
Continuous Improvement:
Testing provides valuable feedback and data that can be used to improve the software over time.
It helps in identifying areas for enhancement and optimization.
Risk Mitigation:
Testing helps mitigate risks associated with software development. By identifying and
addressing issues early, it reduces the likelihood of project delays and cost overruns.
Quality Assurance:
Software testing is an integral part of the quality assurance process. It ensures that the software
is of high quality and meets predefined quality standards.
In summary, software testing is an essential and integral part of the software development
process. It helps ensure that the final product is reliable, secure, performs well, and meets user
expectations. Testing is a cost-effective way to identify and address issues early in the
development cycle, ultimately leading to a better software product and a more successful
software project.
3. RESULT AND DISCUSSION
The project demonstrates that machine learning can provide reliable predictions for diabetes based on input
features like age, BMI, glucose levels, etc. The application is lightweight, scalable, and easy to use, even for
non-technical users. Future enhancements include adding other diseases, improving accuracy with deep
learning, and creating an Android version of the app.
4. BIBLIOGRAPHY
Appendix
Sample dataset rows
Output JSON structure
Terminal screenshots of training
Sample error messages and UI output