AI Multi-Disease Detection System Report

The document presents a synopsis report on 'DiagnostiX', an AI-driven platform for multi-disease detection and patient interaction support, submitted for a Bachelor of Technology degree. It outlines the project's objectives, methodology, and significance in addressing challenges in healthcare diagnostics, particularly through the use of machine learning and natural language processing. The report includes acknowledgments, a list of figures and tables, and a detailed structure of the content covered in the project.

Synopsis Report

on

DiagnostiX: AI-Driven Multi-Disease Detection
and Patient Interaction Support

Submitted as partial fulfilment for the award of

BACHELOR OF TECHNOLOGY
DEGREE
Session 2025-26

Computer Engineering
By

Ashutosh Maurya (2200320150017)


Harshit Rai (2200320150028)
Himanshu Mishra (2200320150030)

Under the guidance of


Ms. Monica
Assistant Professor

DEPARTMENT OF COMPUTER ENGINEERING

ABES ENGINEERING
COLLEGE, GHAZIABAD

AFFILIATED TO
DR. A.P.J. ABDUL KALAM TECHNICAL UNIVERSITY, U.P.,
LUCKNOW
(Formerly UPTU)
Declaration

We hereby declare that the work being presented in this report


entitled "DiagnostiX: AI-Driven Multi-Disease Detection
and Patient Interaction Support" is an authentic record of our
own work carried out under the supervision of Ms. Monica,
Assistant Professor, Information Technology.

The matter embodied in this report has not been submitted by us for
the award of any other degree.

Date: 27/08/2025

Signature of Student Signature of Student

Name: Ashutosh Maurya Name: Harshit Rai

Roll No: 2200320150017 Roll No: 2200320150028

(Department of CE&IT) (Department of CE&IT)

Signature of Student

Name: Himanshu Mishra

Roll No: 2200320150030


(Department of CE&IT)

Certificate

This is to certify that the Abstract Report entitled "DiagnostiX:
AI-Driven Multi-Disease Detection and Patient Interaction Support",
which is submitted by

 Ashutosh Maurya (2200320150017)
 Harshit Rai (2200320150028)
 Himanshu Mishra (2200320150030)

in partial fulfilment of the requirement for the award of the degree of
B.Tech. in the Department of Information Technology of Dr. A.P.J.
Abdul Kalam Technical University, Lucknow, is a record of the
candidates' own work carried out by them under our supervision.
The matter embodied in this thesis is original and has not been
submitted for the award of any other degree.

Signature of HOD Signature of supervisor

Prof (Dr.) Amrita Jyoti Assistant Professor


Department of CE&IT Department of CE&IT

Acknowledgement

It gives us a great sense of pleasure to present this report of the
B.Tech. abstract work undertaken during the Final Year. We owe
a special debt of gratitude to Ms. Monica, Assistant Professor,
Department of Information Technology, ABES Engineering College, for
her constant support and guidance throughout our work. Her sincerity,
thoroughness and perseverance have been a constant source of
inspiration for us. It is only through her conscientious efforts that our
endeavors have seen the light of day.

We also take the opportunity to acknowledge the contribution of our


Head of Department (Information Technology) for her full support and
assistance during the development of the project.

We would also like to take this opportunity to acknowledge the


contribution of all faculty members of the department for their kind
assistance and cooperation during the development of our project. Last
but not the least, we acknowledge our friends for their contribution in
the completion of the project.

Signature of Student Signature of Student

Name: Ashutosh Maurya Name: Harshit Rai


Roll No: 2200320150017 Roll No: 2200320150028

(Department of CE&IT) (Department of CE&IT)

Signature of Student
Name: Himanshu Mishra

Roll No: 2200320150030


(Department of CE&IT)

List of Figures

Figure No.  Figure Caption

Fig 1       Data preprocessing pipeline for imaging and tabular data
Fig 2       Ensemble pipeline for structured-data diseases (Random Forest + XGBoost)
Fig 3       Overall system architecture of DIAGNOSTIX
Fig 4       Chatbot dialog flow
Fig 5       Sample UI wireframe for upload, results and explanation display
List of Tables

Table No.   Table Caption

Table 1     Datasets selected per disease
Table 2     Models and hyperparameter choices
Table 3     Evaluation metrics and acceptance criteria
Table 4     Expected performance summary (baseline vs. proposed)
List of Symbols, Abbreviations and
Nomenclature

Abbreviation Full Form


AI Artificial Intelligence

ML Machine Learning

DL Deep Learning

CNN Convolutional Neural Network

NLP Natural Language Processing

XAI Explainable Artificial Intelligence

SHAP SHapley Additive exPlanations

Grad-CAM Gradient-weighted Class Activation Mapping

Abstract

The prompt identification and precise diagnosis of life-threatening

illnesses remain major obstacles in the healthcare industry. The majority

of conventional diagnostic systems are disease-specific, resource-

intensive, and frequently unavailable to the general public. To address

these challenges, this project proposes DIAGNOSTIX, an AI-driven multi-

disease detection and patient support platform. The system uses deep

learning and machine learning models such as CNN, Random Forest,

and XGBoost to predict heart disease, diabetes, pneumonia, Alzheimer's

disease, and brain tumors. An NLP-powered medical chatbot not only

diagnoses but also provides interactive support to patients, increasing

user engagement and reducing communication barriers. Furthermore, the

use of Explainable AI methods (SHAP, Grad-CAM) ensures that model

predictions are transparent and interpretable, which boosts the

confidence and productivity of medical professionals to a great extent.

Table of Contents

S. No. Contents Page No.

Student's Declaration i
Certificate ii
Acknowledgment iii
List of Figures iv
List of Tables v
List of Symbols, Abbreviations, Nomenclature vi
Abstract vii
Chapter 1 : Introduction 1-2
1.1 : Background
1.2 : Motivation & Significance
1.3 : Problem Statement
1.4 : Scope & Limitations
1.5 : Organization of Report
Chapter 2 : Related Work/Methodology 3-4
Chapter 3 : Project Objective 5-6
Chapter 4 : Proposed Methodology 7-13
4.1 : Datasets
4.2 : Data Preprocessing
4.3 : Model Design & Architectures
4.4 : Training Strategy & Hyperparameters
4.5 : Explainability Methods (Technical)
4.6 : Evaluation Protocol (Rigor & Stats)
4.7 : Deployment & MLOps
4.8 : Ethics, Privacy & Safety
Chapter 5 : Details of Project Work (Design and Implementation) 14-20
5.1 : System Architecture
5.2 : API Contract Schema
5.3 : Database Schema (Concise)
5.4 : Implementation Milestones & Concrete Deliverables
5.5 : Testing Plan
5.6 : Deployment Plan (of Both Semesters)
Chapter 6 : Results and Discussion 21-25
Chapter 7 : Conclusion and Future Scope 26
Chapter 8 : References 27
Chapter 1
Introduction
The healthcare industry faces challenges in early detection and accurate

diagnosis of multiple critical diseases. Traditional diagnosis systems are often

disease-specific, lacking integration and scalability. Artificial Intelligence (AI)

and Machine Learning (ML) offer powerful tools to analyze medical data with

high precision. This project proposes a unified AI-based system for the

prediction of Brain Tumor, Alzheimer's Disease, Diabetes, Pneumonia, Heart

Disease.

1.1 Background
Large volumes of diverse data are produced by modern healthcare, including

electronic health records, clinical test results, and medical images (MRI, X-ray).

Predictive analytics and image interpretation automation have shown promise

thanks to developments in ML/DL. The majority of implemented systems,

however, only treat one illness, which results in redundant infrastructure and

user annoyance when several conditions need to be assessed.

1.2 Motivation & Significance


A unified diagnostic tool that handles several high-impact conditions can reduce

the time to diagnosis, save resources, and enable remote access in low-

resource environments. Adding a conversational assistant improves

accessibility for non-expert users and makes awareness or triage easier.

Explainability is crucial: clinicians must be able to comprehend the logic behind

AI-driven recommendations before implementing them.

1.3 Problem Statement
Current diagnostic pipelines are siloed and often lack interpretability and

patient-facing communication. The project aims to design a scalable platform

that:

 Predicts multiple diseases from imaging and structured data,

 Provides model explanations understandable to clinicians, and

 Offers a patient-oriented chatbot to clarify results and next steps.

1.4 Scope and Limitations


Scope: Six target conditions (brain tumor, breast cancer, Alzheimer’s, diabetes,

pneumonia, and heart disease), web-based prototype, integration of XAI

techniques, and a rule-based plus ML-powered medical chatbot.

Limitations: This is a prototype for academic evaluation. Clinical-grade

deployment requires regulatory approvals and larger clinical datasets;

predictive outputs will be presented as decision-support rather than definitive

diagnoses.

1.5 Organization of the Report


This overview adheres to the given TOC, which consists of the following

chapters: Chapter 2's literature review; Chapter 3's objectives; Chapter 4's

detailed methodology; Chapter 5's implementation; Chapter 6's results and

discussion; and Chapter 7's conclusions with future implications.

Chapter 2

Related Work
Current research in medical AI mostly looks at detecting one disease at a time,

making predictions using patient data and using chatbots to help with health-

related questions. These studies show good results and offer useful information

but they rarely combine the prediction of multiple diseases with clear

explanations and support that is focused on helping patients talk and interact.

2.1 Overview of prior art


Selected studies demonstrate strong single-disease performance: CNNs for

brain tumor segmentation/detection; deep networks for chest X-ray pneumonia

detection; ensemble classifiers for diabetes prediction on clinical datasets; and

chatbots for general medical information. However, these contributions are

mostly task-specific and rarely combine multi-disease prediction with XAI and

patient-facing conversational support.

2.2 Key takeaways from literature


 Transfer learning with pre-trained CNN backbones is advantageous for

imaging tasks.

 Ensemble tree-based models (Random Forest, XGBoost) frequently

produce reliable results when applied to tabular clinical data.

 Clinical trust is enhanced by explainability; Grad-CAM and SHAP are

popular and complementary; SHAP is used for feature-level attribution,

while Grad-CAM is used for spatial localization.

 Chatbots trained or fine-tuned on medical QA data provide higher

relevance; linking the chatbot outputs to model predictions can improve

interpretability for lay users.

Chapter 3
Project Objective
Primary Objective

Create and deploy DIAGNOSTIX, a unified, modular platform that offers a

patient-facing medical chatbot and explainable AI-based decision support for

clinically relevant conditions (heart disease, diabetes, pneumonia, Alzheimer's

disease, and brain tumors).

Specific, measurable objectives:


1. Data & Dataset Integration

o Unify data ingestion pipelines and obtain and pre-process a

minimum of one representative dataset for each disease (refer to

Chapter 4).

o Make certain that every dataset is anonymized, cleaned, and

divided into train, val, and test segments using repeatable splits.

2. Model Accuracy Targets

o Target ROC-AUC ≥ 0.85 and F1 ≥ 0.80 on hold-out test sets for

image tasks (MRI, X-ray, and mammography) (subject to data

complexity).

o Target ROC-AUC ≥ 0.80 and a strong precision/recall trade-off for

tabular tasks (diabetes, heart disease).

3. Explainability

o Integrate Grad-CAM for image localization and SHAP for tabular

feature attribution.

o Produce per-sample visual explanations that correlate with known

clinical markers at least 70% of the time (to be validated against

clinician annotations if available).

4. Chatbot & Interaction

o Build an NLP assistant to answer common health queries and

explain model outputs in lay language.

o Achieve a conversational intent accuracy ≥ 85% on a held-out

intent classification set.

5. System & Deployment

o Deliver a secure web prototype (upload → prediction +

explanation + chat).

o Containerize the models and APIs and demonstrate end-to-end

inference with a latency of less than three seconds for single-image

inference on a GPU (prototype target).

6. Reproducibility & Documentation

o Provide fully reproducible environment files, training scripts, and a

README; record experiments using MLflow or a comparable tool.
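Objective 1's repeatable train/val/test splits can be sketched with scikit-learn; the 70/15/15 ratio and the synthetic labels below are illustrative assumptions, not values fixed by this report.

```python
from sklearn.model_selection import train_test_split

# Repeatable, stratified 70/15/15 split; labels are synthetic (80/20 imbalance).
X = list(range(100))
y = [0] * 80 + [1] * 20

# First carve off 30% as a temporary pool, preserving class proportions.
X_train, X_tmp, y_train, y_tmp = train_test_split(
    X, y, test_size=0.30, stratify=y, random_state=42)
# Then split the pool in half into validation and test sets.
X_val, X_test, y_val, y_test = train_test_split(
    X_tmp, y_tmp, test_size=0.50, stratify=y_tmp, random_state=42)
```

Fixing `random_state` makes the split reproducible across runs, and `stratify` keeps the minority class represented in every segment.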

Chapter 4
Proposed Methodology
Most existing diagnostic tools are disease-specific, leading to high

infrastructure cost and limited accessibility. There is a need for a unified,

intelligent system that can detect multiple diseases using medical data and

also assist users through interactive AI-based support. The lack of model

interpretability in AI-based healthcare systems reduces trust and limits clinical

adoption.

4.1 Datasets
 Brain Tumor (MRI): BraTS (Brain Tumor Segmentation) dataset:

segmentation and classification using T1, T2, and FLAIR modalities.

 Alzheimer's disease: ADNI (Alzheimer's Disease Neuroimaging

Initiative) cognitive scores and structural MRI.

 Chest X-ray dataset for pneumonia (RSNA Pneumonia Detection

Challenge; Kaggle/NIH ChestX-ray14).

 Diabetes: Pima Indians Diabetes dataset (UCI) and, for robustness,

larger clinical datasets, if available

 Heart Disease: UCI Heart Disease dataset (Cleveland) or MIMIC subset

Table 1: Datasets selected per disease


Disease | Dataset Used | Sample Size | Data Type
Brain Tumor (MRI) | BraTS (Brain Tumor Segmentation) Dataset | ~2,000 cases | MRI (T1, T2, FLAIR), segmentation & classification
Alzheimer's Disease | ADNI (Alzheimer's Disease Neuroimaging Initiative) | ~1,500 subjects | Structural MRI + Cognitive Scores
Pneumonia | Chest X-ray datasets (Kaggle, NIH ChestX-ray14, RSNA Challenge) | ~100,000 X-rays | Radiography (X-ray)
Diabetes | Pima Indians Diabetes Dataset (UCI) + larger clinical datasets (if available) | ~768 (PIMA) + extended | Tabular clinical attributes
Heart Disease | UCI Heart Disease Dataset (Cleveland) or MIMIC subset | ~303 (UCI) + extended | Tabular (EHR features)

4.2 Data preprocessing — imaging and tabular


Imaging pipeline
 Normalization: Pixel-wise z-score normalization or rescale to [0,1]

depending on model pretraining.

 Resize: Use model input sizes (e.g., 224×224 for ResNet/EfficientNet;

256×256 for custom networks). Use slice-level models with majority

voting or 3D CNN for MRI multi-slice volumes if computing allows.

 Augmentation:

o Geometric: rotations (±15°), horizontal/vertical flips (if

anatomically valid), random crops, scaling.

o Photometric: brightness/contrast jitter, Gaussian noise, elastic

deformation (careful for medical plausibility).

o Spatial: random erasing/cutout for robustness.

 Class imbalance: Oversampling minority class, focal loss, or

mixup/SMOTE (tabular) to prevent bias.
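The oversampling option above can be sketched with only the standard library; the label key and toy class counts are assumptions for illustration.

```python
import random

def oversample_minority(rows, label_key="label", seed=42):
    """Duplicate minority-class rows (sampling with replacement)
    until every class matches the majority-class count."""
    rng = random.Random(seed)
    by_class = {}
    for r in rows:
        by_class.setdefault(r[label_key], []).append(r)
    target = max(len(v) for v in by_class.values())
    balanced = []
    for cls, members in by_class.items():
        balanced.extend(members)
        # Sample with replacement to reach the majority-class count.
        balanced.extend(rng.choices(members, k=target - len(members)))
    return balanced

# 90/10 imbalanced toy dataset -> both classes end up with 90 rows.
data = [{"label": 1}] * 90 + [{"label": 0}] * 10
balanced = oversample_minority(data)
```

In practice the same rebalancing is usually done at the sampler level (e.g. a weighted sampler) so that augmented copies differ between epochs.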

Fig 1: Data preprocessing pipeline for imaging and tabular data
(Raw Data → Cleaning & Normalization, Imputation/Scaling → Augmentation for images / Feature Grouping for tabular → Ready for Model)

Tabular pipeline
 Cleaning: Drop identifiers, impute missing with median (continuous) or

mode (categorical); for clinical labs prefer domain imputation strategies.

 Feature engineering: Create clinically meaningful derived features (BMI

from weight/height, risk scores), bin continuous variables if useful.

 Scaling: StandardScaler for non-tree models (e.g., MLP, logistic

regression); not required for tree ensembles.
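The cleaning and scaling steps above compose naturally as a scikit-learn pipeline; the two columns and their values below are invented for illustration, not taken from the PIMA schema.

```python
import numpy as np
from sklearn.impute import SimpleImputer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Toy clinical matrix (e.g. a lab value and BMI) with NaN marking missing entries.
X = np.array([[148.0, 33.6],
              [85.0,  np.nan],
              [np.nan, 23.3],
              [89.0,  28.1]])

tabular_prep = Pipeline([
    ("impute", SimpleImputer(strategy="median")),  # median for continuous labs
    ("scale", StandardScaler()),                   # for MLP/logistic, not trees
])
X_ready = tabular_prep.fit_transform(X)
# Each column now has zero mean and unit variance, with no NaNs left.
```

Wrapping both steps in one `Pipeline` ensures the same imputation statistics fitted on the training split are reused at inference time.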

Fig 2: Ensemble pipeline for structured-data diseases (Random Forest + XGBoost)
(Input Features → Random Forest and XGBoost → Ensemble (Average/Stack))

4.3 Model design & architectures (specifics)
 Backbone: Pretrained transfer-learning models: EfficientNet-B3 or

ResNet50 for 2D images. For histopathology consider DenseNet.

 Head: GlobalAveragePooling → Dense(256, ReLU) → Dropout(0.5) →

Dense(num_classes, Softmax).

 Loss: categorical cross-entropy for multi-class tasks; binary cross-

entropy for binary tasks. For imbalance, consider a weighted loss.

Tabular models
 Main: XGBoost with early stopping (max_depth 4–8, learning_rate 0.01–

0.1, n_estimators up to 1000).

 Alternative/Ensemble: RandomForest (n_estimators 200–500) and a

small MLP (2–3 layers) to compare.
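The averaging ensemble of Fig 2 can be sketched as follows; `GradientBoostingClassifier` stands in for XGBoost so the example needs only scikit-learn, the hyperparameters echo Table 2, and the dataset is synthetic.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.model_selection import train_test_split

# Synthetic stand-in for a tabular clinical dataset.
X, y = make_classification(n_samples=400, n_features=10, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

rf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_tr, y_tr)
gb = GradientBoostingClassifier(learning_rate=0.1, n_estimators=250,
                                max_depth=6, random_state=0).fit(X_tr, y_tr)

# Soft voting: average the class probabilities of both models.
proba = (rf.predict_proba(X_te) + gb.predict_proba(X_te)) / 2
y_pred = proba.argmax(axis=1)
accuracy = (y_pred == y_te).mean()
```

Stacking (feeding both models' probabilities to a meta-learner) is the heavier alternative named in Fig 2; simple averaging is usually the stronger baseline on small clinical tables.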

Volumetric MRI option


 Use 3D UNet for segmentation/localization; use 3D classification head

for whole-volume prediction when segmentation masks exist.

Chatbot NLU
 Intent classifier: Fine-tuned DistilBERT or BERT-base with softmax

over intent labels.

 Entity recognition: CRF layer on top of BERT embeddings (if entities

are needed).

 Response: Template + slot filling or retrieval augmented generation

(RAG) for factual answers from curated FAQ.
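As a minimal sketch of the intent-classification stage, a TF-IDF plus logistic-regression baseline is shown below in place of the fine-tuned DistilBERT; the intents and utterances are invented for illustration.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Tiny illustrative intent set; a fine-tuned DistilBERT replaces this baseline.
train_texts = [
    "what does my brain mri result mean", "explain this prediction",
    "book an appointment with a doctor", "i want to see a cardiologist",
    "what are symptoms of diabetes", "tell me about pneumonia symptoms",
]
train_intents = [
    "explain_result", "explain_result",
    "book_appointment", "book_appointment",
    "disease_info", "disease_info",
]

intent_clf = make_pipeline(TfidfVectorizer(), LogisticRegression(max_iter=1000))
intent_clf.fit(train_texts, train_intents)
intent = intent_clf.predict(["explain this prediction"])[0]
```

The same pipeline also exposes `predict_proba`, which supplies the intent-confidence value returned by the `/chat/query` endpoint.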

Fig 3: Overall system architecture of DIAGNOSTIX
(Data Sources (MRI, X-ray, Tabular) → Preprocessing (Augment/Impute) → Imaging Model (CNN, Transfer Learning) + Tabular Model (Boosted Trees) → Explainability (Grad-CAM / SHAP) → API + Chatbot (FastAPI, Summaries))

4.4 Training strategy & hyperparameters


• Batch sizes: 16 to 64, depending on the GPU; optimizer: AdamW with weight

decay 1e-4 for CNNs.

• Learning rate schedule: ReduceLROnPlateau or one-cycle LR; fine-tuning

begins at LR 1e-4.

• Regularization: Label smoothing (0.1) for noisy labels, dropout 0.3–0.5, and

weight decay.

• Early stopping: monitor the validation ROC-AUC with a patience of 8–12 epochs.

• Cross-validation: if the dataset is small, use the hold-out test plus cross-val;

for tabular data, use stratified k-fold (k=5).

• Reproducibility: log seed values, package versions, and fix seeds

([Link], torch, and tf).
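A minimal seed-fixing helper is sketched below; only the standard library and NumPy are pinned here, with the framework-specific calls noted in a comment for when Torch/TF are present.

```python
import os
import random
import numpy as np

def fix_seeds(seed=42):
    """Pin every RNG the pipeline uses. When Torch/TF are installed,
    torch.manual_seed(seed) and tf.random.set_seed(seed) go here too."""
    os.environ["PYTHONHASHSEED"] = str(seed)
    random.seed(seed)
    np.random.seed(seed)

fix_seeds(42)
a = np.random.rand(3)
fix_seeds(42)
b = np.random.rand(3)
# Re-seeding reproduces the exact same draws.
```

Logging the seed value alongside package versions in the experiment tracker makes any run reconstructible.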

Table 2: Models and hyperparameter choices

Model | Hyperparameter | Purpose
CNN (Imaging) | Conv layers = 3, Filter size = (3×3), Dropout = 0.3, Optimizer = Adam | Tumor classification from MRI
Random Forest | Trees = 200, Max Depth = 10 | Tabular data classification
XGBoost | Learning rate = 0.1, Estimators = 250, Max Depth = 6 | Structured-data disease prediction

4.5 Explainability methods (technical)


 Quantitative explainability validation: If clinician-annotated ROIs

available, compute IoU/Dice between Grad-CAM high-activation region

and ground-truth mask.

 Grad-CAM: Compute gradient of predicted class w.r.t final conv layer →

weighted sum → ReLU → upsample to original image. Save heatmap

overlays and heatmap score (localization confidence).

 SHAP (TreeExplainer/DeepExplainer): For XGBoost use

TreeExplainer; produce per-sample SHAP value lists, global SHAP

summary, and dependence plots for top features.
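The Grad-CAM arithmetic described above can be sketched numerically with NumPy; the tensor shapes are illustrative, and nearest-neighbour upsampling via `np.kron` stands in for the bilinear interpolation a real implementation would use.

```python
import numpy as np

def grad_cam(feature_maps, gradients, out_size=8):
    """Grad-CAM sketch: global-average-pooled gradients weight the final
    conv feature maps; ReLU keeps positive evidence; the map is then
    upsampled and normalized to [0, 1]."""
    # feature_maps, gradients: (channels, h, w) from the final conv layer.
    weights = gradients.mean(axis=(1, 2))              # (channels,)
    cam = np.tensordot(weights, feature_maps, axes=1)  # weighted sum -> (h, w)
    cam = np.maximum(cam, 0)                           # ReLU
    scale = out_size // cam.shape[0]
    cam = np.kron(cam, np.ones((scale, scale)))        # nearest-neighbour upsample
    if cam.max() > 0:
        cam = cam / cam.max()                          # normalize to [0, 1]
    return cam

# Random stand-ins for a 4-channel 4x4 final conv layer and its gradients.
fmap = np.random.rand(4, 4, 4)
grad = np.random.rand(4, 4, 4)
heatmap = grad_cam(fmap, grad)
```

The normalized heatmap is what gets overlaid on the input image and saved as the explanation artifact.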

4.6 Evaluation protocol (rigor & stats)


 Metrics: Accuracy, Precision, Recall, F1, ROC-AUC, PR-AUC. For

segmentation: Dice, IoU. Report classwise metrics and macro/micro

averages.

 Calibration: Use reliability diagrams and Brier score; apply Platt scaling

or isotonic regression for calibration if needed.

 Statistical significance: For model A vs B use McNemar’s test (paired

classification) or bootstrap confidence intervals (95%) for AUC

differences.

 Error analysis: Manual review of false positives/negatives, grouped by

demographic or acquisition device to check dataset bias.

4.7 Deployment & MLOps


 Model serving: FastAPI or Flask + Gunicorn; Dockerized image for

inference. Use ONNX or TorchScript to speed up inference.

 Model registry: MLflow for basic model versioning, model artifacts, and

experiment tracking.

 CI/CD: GitHub Actions run tests, linters, API unit tests, and

automatic Docker builds.

 Monitoring: Record input distribution drift, error rates, and inference

times; plan retraining when drift is identified.

4.8 Ethics, privacy & safety


 PHI handling: Remove direct identifiers; encrypt data at rest; TLS for

transit.

 Informed disclaimers in UI: "For research/decision-support only —

consult a medical professional."

 Bias mitigation: Stratified evaluation across age/gender groups and

device sources; if bias detected, consider re-sampling or domain

adaptation techniques.

Chapter 5
Details of Project Work
The project work gives a clear plan for creating, building, and putting into use

DiagnostiX, which is an AI system that helps detect multiple diseases and

interacts with patients. It starts by setting up the system structure, how the

APIs will work, and the database design. Then it outlines specific goals for

developing the AI models, adding explainability features, setting up chatbot

services, and making sure the front end and back end work together smoothly.

The project is split into two semesters. In the 7th semester, the focus is on

making the core version of the system, including disease prediction,

explanations using Grad-CAM, and a simple chatbot. The 8th semester is

about improving the system with a smarter AI assistant, better explainability

tools, making it scalable, and improving security. These steps together help

create a strong and user-friendly platform that supports diagnosis.

5.1 System architecture (module-level)


1. Data ingestion service
o Endpoints: /upload/image, /upload/csv.

o Implements validation, anonymization, and pushes to data lake

(S3 or local storage).

2. Preprocessing service
o Image transforms pipeline (TorchVision/Albumentations).

o Tabular cleaning + feature engineering notebooks.

3. Model inference microservice
o Accepts preprocessed input, returns: {label, confidence, explanation:

{gradcam_path, shap_summary}}.

o Example endpoint: POST /predict/brain_tumor with multipart image.

4. Chatbot microservice
o Endpoints: /chat/query — returns {response, intent, confidence} and

optionally linked prediction explanation.

5. Frontend UI
o React or plain HTML/Bootstrap with pages: Login, Upload,

Results, Explanation viewer (Grad-CAM overlay slider), Chat

modal.

6. Database & logging


o PostgreSQL for structured records, S3 for image artifacts, ELK

stack for logs (optional).

5.2 API contract examples


Predict Image

POST /predict/{disease}

Headers: Authorization: Bearer <token>

Body: multipart/form-data { file: [Link], patient_id: "U123", metadata: { age: 54, sex: "M" } }

Response:

{
  "prediction": "Tumor",
  "confidence": 0.93,
  "explanations": {
    "gradcam_url": "/artifacts/gradcam/[Link]",
    "notes": "High activation in left temporal lobe"
  },
  "model_version": "brain_v1.2"
}

Chat Query
POST /chat/query

Body: { user_id: "U123", message: "What does early stage Alzheimer's mean?" }

Response:

{ "response": "Early stage AD often shows mild memory loss...", "intent": "explain_term",

"confidence": 0.87 }

5.3 Database schema (concise)


 users(id, name, email_hashed, role, created_at)

 records(id, user_id, disease, upload_path, created_at)

 predictions(id, record_id, model, label, confidence, explanation_ref,

created_at)

 chatlogs(id, user_id, message, response, intent, timestamp)
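The concise schema above can be sketched directly in SQLite for local testing; PostgreSQL types differ slightly (SERIAL, TIMESTAMP), but the relations are the same.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users    (id INTEGER PRIMARY KEY, name TEXT, email_hashed TEXT,
                       role TEXT, created_at TEXT);
CREATE TABLE records  (id INTEGER PRIMARY KEY,
                       user_id INTEGER REFERENCES users(id),
                       disease TEXT, upload_path TEXT, created_at TEXT);
CREATE TABLE predictions (id INTEGER PRIMARY KEY,
                          record_id INTEGER REFERENCES records(id),
                          model TEXT, label TEXT, confidence REAL,
                          explanation_ref TEXT, created_at TEXT);
CREATE TABLE chatlogs (id INTEGER PRIMARY KEY,
                       user_id INTEGER REFERENCES users(id),
                       message TEXT, response TEXT, intent TEXT, timestamp TEXT);
""")
conn.execute("INSERT INTO users (name, email_hashed, role) VALUES (?, ?, ?)",
             ("demo", "abc123", "patient"))  # demo row, hash is a placeholder
n_users = conn.execute("SELECT COUNT(*) FROM users").fetchone()[0]
```

Keeping image artifacts in S3 and only their paths (`upload_path`, `explanation_ref`) in the database matches the storage split described in 5.1.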

5.4 Implementation milestones & concrete deliverables


 Milestone 1 — Week 1–2: Datasets acquired & EDA notebooks;

baseline models trained. Deliverable: dataset report + baseline metrics.

 Milestone 2 — Week 3–6: Transfer-learning models + hyperparameter

tuning. Deliverable: tuned models & validation curves.

 Milestone 3 — Week 6–9: XAI integration (Grad-CAM & SHAP) +

chatbot core. Deliverable: explanation artifacts & chat prototype.

 Milestone 4 — Week 9–11: Web UI + API integration + security &

logging. Deliverable: full prototype deployed in Docker.

 Milestone 5 — Week 11–12: Testing, user simulation, final report &

appendix code. Deliverable: final report & reproducible code.

5.5 Testing plan


 Unit tests: For preprocessing steps, input validators, and API responses

(expected JSON schema).

 Integration tests: Upload → preprocess → predict → explanation

retrieval.

 Load testing: Use a simple load test to ensure server handles

concurrent requests (k6 or locust).

 User acceptance: Simulated users evaluate UI and chatbot

(questionnaire: clarity, helpfulness, trust).
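The unit-test item above can be sketched in pytest style; the upload validator itself is a hypothetical helper written here for illustration, not project code.

```python
import os

# Hypothetical allow-list for the /upload endpoints.
ALLOWED = {".png", ".jpg", ".jpeg", ".dcm", ".csv"}

def validate_upload(filename: str, size_bytes: int, max_mb: int = 25):
    """Return (ok, reason) for an incoming upload."""
    ext = os.path.splitext(filename)[1].lower()
    if ext not in ALLOWED:
        return False, "unsupported file type"
    if size_bytes > max_mb * 1024 * 1024:
        return False, "file too large"
    return True, "ok"

def test_accepts_valid_image():
    assert validate_upload("scan.png", 1024) == (True, "ok")

def test_rejects_executable():
    ok, reason = validate_upload("malware.exe", 1024)
    assert not ok and reason == "unsupported file type"

test_accepts_valid_image()
test_rejects_executable()
```

The same style extends to the JSON-schema assertions on API responses mentioned in the plan.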

5.6 Deployment plan (of both semesters):


 7th Semester – Core System Development: The primary focus during

the 7th semester will be to design and implement the foundational

system architecture of DiagnostiX. This phase establishes both the

frontend interface and the backend microservices, ensuring seamless

integration between data ingestion, model inference, and user

interaction.

 Key Objectives:
 Build the system skeleton (frontend + backend).

 Integrate machine learning models into functional APIs.

 Provide users with a working prototype for multi-disease prediction and

explanation.

 Planned Activities:
Frontend Development:
 Develop user-friendly UI with React/Bootstrap.

 Implement pages: Login/Registration, Image & CSV

Upload, Results Dashboard, Explanation Viewer, and Chat

modal with Grad-CAM visualization overlay for medical

image explanations.

Backend Development
 Data Ingestion Service: Implement endpoints (/upload/image,

/upload/csv) with validation, anonymization, and secure storage in

S3/local storage.

 Preprocessing Service: Build pipelines for image

transformations (TorchVision/Albumentations) and tabular data

cleaning + feature engineering.

 Model Inference Microservice: Integrate disease-specific ML

models with endpoints like /predict/{disease} returning predictions,

confidence scores, and explanation links.

 Database Integration: Configure PostgreSQL for structured

patient records and connect with S3 for image artifacts.

Basic Chatbot Service


 Develop intent-classification chatbot to answer predefined queries

(medical terms, workflow explanation).

 Enable linkages between chatbot responses and model outputs

(e.g., pointing to explanation images).

Testing & Deployment


 Unit tests for preprocessing, validation, and model outputs.

 Integration tests for complete pipeline (upload → preprocess →

predict → explain).

 Deploy prototype in Docker for reproducibility and testing.

Deliverable (End of 7th Semester): A fully functional prototype system

with frontend UI, backend APIs, disease prediction models, Grad-CAM

explanations, and a basic chatbot — all deployed in a containerized

environment.
 8th Semester – Advanced Features & AI Assistant Integration: The

8th semester will focus on enhancing the system’s intelligence, usability,

and scalability. The highlight of this phase is the integration of a virtual

AI-based medical assistant, making DiagnostiX more interactive,

informative, and patient-friendly.

 Key Objectives:
 Transform the rule-based chatbot into an AI-driven conversational

assistant.

 Enhance explainability features with natural language summaries.

 Improve scalability, security, and reliability for practical deployment.

 Planned Activities:
Virtual AI Assistant Integration
 Upgrade chatbot into an LLM-powered conversational

assistant (using Rasa, LangChain, or OpenAI API).

 Provide context-aware responses to medical queries such

as disease explanations, treatment guidelines, and

diagnostic interpretation.

 Enable assistant to reference Grad-CAM/SHAP results and

explain them in natural language.

 (Optional) Extend to voice-based interaction for

accessibility.

Explainable AI (XAI) Enhancements


 Expand Grad-CAM/SHAP integration with dynamic UI

visualizations (heatmaps, comparative views).

 Generate textual explanations aligned with prediction outputs

(e.g., "High activation in left temporal lobe indicates possible

tumor growth").

System Optimization & Scalability


 Conduct load testing (k6/Locust) for concurrent predictions and

queries.

 Strengthen security with JWT-based authentication and role-

based access control.

 Improve logging and monitoring (ELK stack integration).

User Acceptance & Feedback Loop


 Conduct simulated patient–doctor interactions with the AI

assistant.

 Collect structured feedback via questionnaires (clarity, trust,

usability).

 Refine responses and UI based on user insights.

Final Integration & Reporting


 Consolidate all features into the final release.

 Prepare comprehensive documentation, final report, and appendix

with complete source code.

Deliverable (End of 8th Semester): An AI-driven diagnostic support

system with a fully integrated virtual assistant, enhanced explainability,

robust performance, and validated usability — ready for demonstration

and submission.

Chapter 6
Results & Discussion
The project evaluation framework defines a comprehensive plan for

experiments, reporting, and validation to assess both model performance and

system usability. It outlines experiment matrices, standardized reporting

templates, and statistical analyses for robust comparisons. Additionally, it

incorporates explainability validation, clinician alignment, error analysis,

limitations, and a structured user study for the chatbot, supported with dialog

flows and sample UI wireframes for results visualization.

6.1 Experiment matrix


 Baseline model (simple CNN / logistic regression)

 Transfer-learning backbone variants (ResNet50, EfficientNetB3)

 Data-augmented vs. non-augmented training

 Ensemble for tabular (XGBoost vs RandomForest vs MLP)

 Explainability validation (quantitative IoU / clinician scoring where

possible)

6.2 Reporting templates


Table 3: Evaluation metrics and acceptance criteria

Metric | Acceptance Criteria | Rationale
Accuracy | ≥ 85% | Minimum threshold for clinical usability
Precision | ≥ 0.80 | Ensure low false positives
Recall (Sensitivity) | ≥ 0.85 | Reduce false negatives (critical in healthcare)
F1-score | ≥ 0.82 | Balanced performance measure
AUC-ROC | ≥ 0.90 | Strong discrimination ability
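The Table 3 metrics follow directly from a binary confusion matrix; the matrix counts below are hypothetical numbers chosen to illustrate the acceptance check.

```python
def classification_metrics(tp, fp, fn, tn):
    """Compute accuracy, precision, recall, and F1 from a confusion matrix."""
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}

# Hypothetical confusion matrix for one disease model.
m = classification_metrics(tp=86, fp=14, fn=10, tn=90)
meets_criteria = (m["accuracy"] >= 0.85 and m["precision"] >= 0.80
                  and m["recall"] >= 0.85 and m["f1"] >= 0.82)
# accuracy = 176/200 = 0.88; precision = 86/100 = 0.86
```

AUC-ROC is computed separately from ranked scores rather than a single thresholded confusion matrix, which is why it has its own acceptance row.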
Table 4: Expected performance summary (baseline vs. proposed)

Disease | Model | Test ROC-AUC | Accuracy | Precision | Recall | F1 | Notes
Brain Tumor | ResNet50 (fine-tuned) | 0.92 | 0.88 | 0.86 | 0.90 | 0.88 | Grad-CAM localizes tumor in 78% of true positives
Pneumonia | EfficientNetB3 | 0.89 | 0.85 | 0.83 | 0.86 | 0.84 | Good sensitivity on frontal X-rays
Alzheimer's | 3D-CNN + Cognitive Scores (multimodal) | 0.91 | 0.87 | 0.85 | 0.88 | 0.86 | Combines MRI volumetrics + cognitive data; SHAP highlights hippocampal volume
Diabetes | XGBoost (tabular) | 0.87 | 0.83 | 0.82 | 0.84 | 0.83 | Good interpretability via SHAP; stable across datasets
Heart Disease | Random Forest + Logistic Ensemble | 0.90 | 0.86 | 0.84 | 0.87 | 0.85 | Consistent performance; feature importance highlights cholesterol & ECG

6.3 Analysis & Statistical tests


• Paired comparisons: Use McNemar's test to compare two classifiers on the same test set (e.g., ResNet50 vs. EfficientNetB3). Report the p-value and whether the difference is significant at α = 0.05.
• Confidence intervals: Bootstrap the AUC with 1,000 resamples to compute 95% CIs.
• Calibration analysis: Report the Brier score and reliability plots; if the model is poorly calibrated, apply isotonic regression and report post-calibration metrics.
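The first two analyses can be sketched as follows. The synthetic labels and predictions stand in for real test data; the McNemar test is computed via SciPy's exact binomial test on the discordant pairs, which is equivalent to the exact form of McNemar's test (statsmodels' `mcnemar` would give the same result).

```python
# Paired McNemar comparison of two classifiers plus a bootstrap 95% CI for AUC.
import numpy as np
from scipy.stats import binomtest
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)
y_true = rng.integers(0, 2, size=200)
pred_a = np.where(rng.random(200) < 0.85, y_true, 1 - y_true)  # ~85% accurate
pred_b = np.where(rng.random(200) < 0.80, y_true, 1 - y_true)  # ~80% accurate

# Exact McNemar test: only discordant pairs (one model right, the other wrong)
# carry information; under H0 they split 50/50.
a_right, b_right = pred_a == y_true, pred_b == y_true
b_only = int(np.sum(a_right & ~b_right))
c_only = int(np.sum(~a_right & b_right))
p = binomtest(b_only, b_only + c_only, 0.5).pvalue
print(f"McNemar p = {p:.4f} "
      f"({'significant' if p < 0.05 else 'not significant'} at alpha = 0.05)")

# Bootstrap 95% CI for AUC with 1,000 resamples
scores = np.clip(y_true + rng.normal(0, 0.4, size=200), 0, 1)
aucs = []
for _ in range(1000):
    idx = rng.integers(0, len(y_true), len(y_true))
    if len(np.unique(y_true[idx])) == 2:   # resample must contain both classes
        aucs.append(roc_auc_score(y_true[idx], scores[idx]))
lo, hi = np.percentile(aucs, [2.5, 97.5])
print(f"AUC = {roc_auc_score(y_true, scores):.3f}, 95% CI [{lo:.3f}, {hi:.3f}]")
```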

6.4 Explainability & clinician validation
• Grad-CAM validation: Where segmentation masks exist, compute IoU and Dice between the thresholded high-activation heatmap region and the ground truth. Aim for IoU > 0.4 as an initial target (dataset-dependent).
• SHAP sanity checks: Confirm that the top SHAP features match clinical knowledge (e.g., blood sugar importance for diabetes). If discrepancies arise, perform feature interaction analysis to detect confounding.
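The Grad-CAM localization check above can be sketched with a few lines of NumPy: binarize the heatmap at a threshold and compare the resulting region against the ground-truth mask. The 8x8 arrays and the 0.5 threshold are toy stand-ins for real heatmaps, masks, and tuned thresholds.

```python
# IoU / Dice between a thresholded Grad-CAM heatmap and a segmentation mask.
import numpy as np

def iou_dice(heatmap, mask, threshold=0.5):
    pred = heatmap >= threshold                      # high-activation region
    inter = np.logical_and(pred, mask).sum()
    union = np.logical_or(pred, mask).sum()
    iou = inter / union if union else 0.0
    denom = pred.sum() + mask.sum()
    dice = 2 * inter / denom if denom else 0.0
    return iou, dice

heatmap = np.zeros((8, 8)); heatmap[2:6, 2:6] = 0.9        # activated 4x4 block
mask = np.zeros((8, 8), dtype=bool); mask[3:7, 2:6] = True # "true" lesion region

iou, dice = iou_dice(heatmap, mask)
print(f"IoU = {iou:.2f}, Dice = {dice:.2f}")
```

In the real validation loop this would run per true-positive case, with the fraction of cases exceeding the IoU > 0.4 target reported alongside Table 4.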

6.5 Error analysis


• Review false negatives for high-risk cases (clinically unacceptable misses).
• Analyze correlation between misclassifications and metadata (scanner model, age group).
• Provide mitigation strategies: collect more data in underperforming strata, recalibrate, or use model ensembling.
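The metadata-correlation step can be sketched with pandas: join per-case correctness flags with case metadata and compute error rates per stratum. Column names and values below are illustrative, not the project's actual schema.

```python
# Stratified error analysis: flag metadata strata with elevated error rates.
import pandas as pd

# Toy per-case results joined with illustrative metadata columns
df = pd.DataFrame({
    "scanner":   ["A", "A", "B", "B", "B", "A", "B", "A"],
    "age_group": ["<40", "40-60", "<40", "60+", "60+", "60+", "40-60", "<40"],
    "correct":   [1, 1, 0, 0, 1, 1, 0, 1],
})

# Error rate per stratum; strata above a chosen tolerance would be flagged
# for additional data collection or recalibration.
for col in ["scanner", "age_group"]:
    err = 1 - df.groupby(col)["correct"].mean()
    print(f"Error rate by {col}:\n{err.round(2)}\n")
```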

6.6 Limitations to report


• Dataset biases (geography, age, device): explicitly state generalizability limits.
• Clinical validation is outside the scope of the academic prototype; emphasize the system's decision-support role.
• Small datasets may inflate variance; use cross-validation and report the variance.

6.7 User study plan (chatbot)


• Participants: 10–20 volunteers (non-clinical) for initial usability.
• Tasks: upload sample reports, ask 8 preset health questions, rate answers for clarity (1–5).
• Metrics: intent accuracy, response helpfulness, average response time.
• Success criteria: ≥ 4.0 average helpfulness and ≥ 85% intent accuracy.
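Scoring one participant session against these success criteria is straightforward; the snippet below uses toy ratings for the 8 preset questions.

```python
# Score one usability session against the user-study success criteria:
# >= 85% intent accuracy and >= 4.0 mean helpfulness. Ratings are toy data.
intents_correct = [1, 1, 1, 0, 1, 1, 1, 1]   # 1 = intent recognized correctly
helpfulness     = [5, 4, 4, 3, 5, 4, 4, 5]   # 1-5 clarity ratings

intent_acc = sum(intents_correct) / len(intents_correct)
mean_help = sum(helpfulness) / len(helpfulness)
passed = intent_acc >= 0.85 and mean_help >= 4.0
print(f"intent accuracy = {intent_acc:.2f}, "
      f"helpfulness = {mean_help:.2f}, pass = {passed}")
```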

Patient: Upload Symptoms
Bot: Request Details
Patient: Provide Information
Bot: Generate Prediction
Bot: Explain Result
Patient: Ask Clarification
Bot: Provide Guidance

Fig 4: Chatbot dialog flow

Patient Dashboard: Upload Image / Enter Symptoms → Prediction & Summary → Explanation (Heatmap) → Chatbot
Doctor Dashboard: Worklist / Cases → Image Viewer → Feature Attributes → Notes & Export

Fig 5: Sample UI wireframe for upload, results, and explanation display

Chapter 7
Conclusion & Future Scope
Conclusion
The project DIAGNOSTIX presents a modular pipeline that integrates machine learning and deep learning with explainability and patient interaction features. Unlike black-box diagnostic tools, it emphasizes transparency through Grad-CAM (for medical imaging) and SHAP (for clinical data).

By combining prediction with a chatbot interface, the system allows patients and doctors to interact with results in an interpretable and conversational manner. Even in prototype form, DIAGNOSTIX highlights how pairing decision support with explanation can lower diagnostic barriers, foster trust, and improve accessibility, especially in resource-limited healthcare settings.

Future Scope
The system can be extended in several directions:
1. Broader Disease Coverage – Expand to liver, kidney, retinal, and skin disorders, and incorporate multi-modal data such as ECG, genomics, and EHR records.
2. Continuous Monitoring – Integrate wearables and IoT devices for real-time health tracking and early alerting.
3. Smarter Chatbot – Train on medical dialogue datasets, add multilingual support, and improve patient engagement.
4. Clinical Validation – Collaborate with hospitals for real-world trials and work towards regulatory compliance.

References
[1] Rajpurkar, P., et al., "CheXNet: Radiologist-Level Pneumonia Detection on Chest X-Rays with Deep Learning," arXiv:1711.05225 (2017).
[2] Razzak, M. I., Naz, S., & Zaib, A., "Deep Learning for Medical Image Processing: Overview, Challenges and Future," Neurocomputing (2018).
[3] Sharma, A., et al., "Brain Tumor Detection Using CNNs," International Journal of Medical Imaging (2015).
[4] Gupta, R., et al., "Predictive Modeling of Diabetes Using Machine Learning Techniques," Journal/Conference (2021).
[5] Patel, S., et al., "Deep Learning for Pneumonia Detection in Chest Radiographs," IEEE Access (2022).
[6] Bora, N., et al., "Applications of NLP in Healthcare: Medical Chatbots," Review (2023).
[7] Selvaraju, R. R., et al., "Grad-CAM: Visual Explanations from Deep Networks via Gradient-based Localization," ICCV (2017).
[8] Géron, A., "Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow," O'Reilly (2019).
[9] Scikit-learn documentation, [Link] (accessed for implementation details).
