Lecture 3: Design of a ML System
1/6/2025
Machine Learning (CSC601B), By: Prof. (Dr.) Vineet Mehan
1. Problem Definition
• Clearly identify the problem to solve and its scope.
• Specify the input, output, and type of ML task (e.g., classification, regression, clustering).
• Objective: Predict whether a customer will churn (stop using a service) based on their usage patterns and demographics.
• Inputs: Customer attributes (age, location, subscription type) and behavioural data (session duration, payment history).
• Type of ML Task: Supervised binary classification.

2. Data Collection and Splitting
• Gather data relevant to the problem.
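A single labelled training example for this framing can be sketched as follows. This is a minimal sketch: the attribute names and values are illustrative assumptions, not a prescribed schema.

```python
from dataclasses import dataclass

@dataclass
class CustomerRecord:
    # Input features: customer attributes and behavioural data
    # (hypothetical field names for illustration).
    age: int
    location: str
    subscription_type: str
    avg_session_minutes: float
    late_payments: int
    # Output label for supervised binary classification:
    # 1 = customer churned, 0 = customer stayed.
    churned: int

example = CustomerRecord(age=34, location="Delhi", subscription_type="basic",
                         avg_session_minutes=12.5, late_payments=3, churned=1)
print(example.churned)  # 1
```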
• Splitting: Split the dataset into 70% training, 15% validation, and 15% testing subsets.

Training Set
• Purpose: The training set is used to teach the model to identify patterns and learn from the data. It forms the foundation of the model's understanding of the problem.
• Size: Allocating 70% of the dataset ensures that the model has a sufficient amount of data to learn from, reducing the risk of underfitting (where the model doesn't learn enough).

Validation Set
• Purpose: The validation set is used to fine-tune the model. This subset helps:
  • Monitor the model's performance during training.
  • Tune hyperparameters (e.g., learning rate, number of layers).
  • Detect overfitting, which occurs when the model performs well on the training data but poorly on unseen data.
• Size: A 15% allocation provides a good balance to evaluate the model during training without sacrificing too much data from the training subset.

Testing Set
• Purpose: The testing set evaluates the model's performance on unseen data after training is complete. It gives an unbiased estimate of how the model will perform in real-world scenarios.
• Size: Reserving 15% ensures enough data to reliably assess the model's generalization capability.
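The 70/15/15 split described above can be sketched in plain Python as a shuffled index split. This is a minimal sketch; in practice a library helper such as scikit-learn's `train_test_split` would typically be used instead.

```python
import random

def split_70_15_15(rows, seed=42):
    """Shuffle the rows, then carve out 70% train, 15% validation, 15% test."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)  # fixed seed for reproducibility
    n = len(shuffled)
    n_train = int(0.70 * n)
    n_val = int(0.15 * n)
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]
    return train, val, test

# 100 dummy rows -> 70 / 15 / 15 examples.
train, val, test = split_70_15_15(range(100))
print(len(train), len(val), len(test))  # 70 15 15
```

Shuffling before slicing matters: if the raw data is ordered (e.g., by signup date), an unshuffled split would give the model a biased view of each subset.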
3. Model Selection
• Choose an algorithm suitable for the task and data type.
• Compare traditional ML models (e.g., Random Forest, SVM) and deep learning models (e.g., CNNs, RNNs).
• Select a baseline model for benchmarking.
• Algorithm: Start with Logistic Regression as a baseline due to its simplicity. Then move to Random Forest for better handling of mixed data types and non-linearity.
• Baseline Model: Logistic Regression to establish a minimum expected accuracy.
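A minimal baseline sketch with scikit-learn, assuming numeric, already-encoded features. The toy data here is synthetic and stands in for the real churn dataset.

```python
from sklearn.linear_model import LogisticRegression

# Synthetic, linearly separable toy data: one feature, churn if feature >= 5.
X = [[float(i)] for i in range(10)]
y = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]

baseline = LogisticRegression()
baseline.fit(X, y)

# The baseline's accuracy sets the minimum bar that a more complex
# model (e.g. Random Forest) must beat to justify its extra cost.
baseline_accuracy = baseline.score(X, y)
print(f"baseline accuracy: {baseline_accuracy:.2f}")
```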
4. Model Training
• Define the architecture or configuration of the selected model.
• Train the model using the training dataset and tune hyperparameters.
• Monitor metrics during training to avoid overfitting or underfitting.
• Model: Random Forest with the following hyperparameters:
  • Number of trees: 100.
  • Max depth: 10.
  • Minimum samples per leaf: 2.
• Training Process: Train on the 70% training set.
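The hyperparameters above map directly onto scikit-learn's `RandomForestClassifier`. The training data below is a synthetic stand-in for the 70% training split.

```python
from sklearn.ensemble import RandomForestClassifier

# Hyperparameters from the configuration above:
model = RandomForestClassifier(
    n_estimators=100,      # number of trees: 100
    max_depth=10,          # max depth: 10
    min_samples_leaf=2,    # minimum samples per leaf: 2
    random_state=42,       # fixed seed so the fit is reproducible
)

# Synthetic stand-in for the 70% training split (two numeric features).
X_train = [[float(i), float(i % 3)] for i in range(30)]
y_train = [1 if i >= 15 else 0 for i in range(30)]

model.fit(X_train, y_train)
print(f"training accuracy: {model.score(X_train, y_train):.2f}")
```

Capping `max_depth` and requiring at least 2 samples per leaf are both regularizers: they stop individual trees from memorizing the training data, which helps against overfitting.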
5. Evaluation
• Theory:
  • Use appropriate metrics to measure the model's performance.
  • Conduct cross-validation to ensure robustness.
  • Perform error analysis to identify areas for improvement.
• Let's use a simple example to explain cross-validation, specifically 3-fold cross-validation, with a small dataset.
• Dataset:
  • Imagine we have a dataset of 6 data points:
  • Data: [A, B, C, D, E, F]
  • Labels: [1, 1, 0, 0, 1, 0]
• Goal:
  • We want to evaluate a model's performance using cross-validation. We'll use 3-fold cross-validation, which means:
    1. The dataset will be split into 3 equal parts (folds).
    2. Each fold will take turns as the test set, while the other two are used as the training set.
• Step-by-Step Process:
  • Step 1: Split Data into 3 Folds
  • We divide the dataset into 3 parts (folds):
    • Fold 1: [A, B]
    • Fold 2: [C, D]
    • Fold 3: [E, F]
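The fold rotation for this 6-point example can be written out directly: each pass holds one fold out for testing and trains on the other two.

```python
data = ["A", "B", "C", "D", "E", "F"]
labels = [1, 1, 0, 0, 1, 0]

k = 3
fold_size = len(data) // k  # 6 points / 3 folds = 2 points per fold
folds = [list(range(i * fold_size, (i + 1) * fold_size)) for i in range(k)]
# folds -> [[0, 1], [2, 3], [4, 5]], i.e. [A,B], [C,D], [E,F]

splits = []
for held_out in range(k):
    test_idx = folds[held_out]
    # Training indices: every index not in the held-out fold.
    train_idx = [i for f in range(k) if f != held_out for i in folds[f]]
    splits.append((train_idx, test_idx))
    print(f"Fold {held_out + 1}: test={[data[i] for i in test_idx]}, "
          f"train={[data[i] for i in train_idx]}")
```

In a real pipeline the model would be re-trained from scratch on each `train_idx` subset and scored on the corresponding `test_idx` subset; the per-fold scores are then averaged.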
• Final Result:
  • The cross-validation process tells us the model's average accuracy is 50%. This is a more reliable estimate of the model's performance than using a single train-test split, as it tests the model on all parts of the dataset.
• Results:
  • Accuracy: 92%.
  • Precision: 85%.
  • Recall: 78%.
  • F1-score: 81%.

6. Deployment
• Theory:
  • Deploy the trained model into a production environment.
  • Make predictions accessible via APIs or integrated systems.
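Accuracy, precision, recall, and F1 all derive from confusion-matrix counts. A plain-Python sketch (the toy labels and predictions below are illustrative, not the churn model's actual output):

```python
def classification_metrics(y_true, y_pred):
    """Compute accuracy, precision, recall, and F1 for binary labels."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))  # true positives
    tn = sum(t == 0 and p == 0 for t, p in zip(y_true, y_pred))  # true negatives
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))  # false positives
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))  # false negatives
    accuracy = (tp + tn) / len(y_true)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0
    return accuracy, precision, recall, f1

y_true = [1, 1, 0, 0, 1, 0, 1, 0]
y_pred = [1, 0, 0, 0, 1, 1, 1, 0]
acc, prec, rec, f1 = classification_metrics(y_true, y_pred)
print(acc, prec, rec, f1)  # 0.75 0.75 0.75 0.75
```

For churn, recall matters particularly: a low recall (78% above) means some customers who will churn are missed, so no retention action is taken for them.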
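A deployment endpoint ultimately wraps the trained model behind a request/response boundary. A minimal JSON handler sketch: `churn_score` is a hypothetical stub standing in for a real trained model, and in production this handler would sit behind an HTTP framework such as Flask or FastAPI.

```python
import json

def churn_score(features: dict) -> float:
    # Hypothetical stand-in for calling predict_proba on a trained model.
    return 0.9 if features.get("late_payments", 0) > 2 else 0.1

def handle_predict(request_body: str) -> str:
    """Parse a JSON request body, score it, and return a JSON response."""
    features = json.loads(request_body)
    score = churn_score(features)
    return json.dumps({"churn_probability": score,
                       "will_churn": score >= 0.5})

response = handle_predict('{"age": 34, "late_payments": 3}')
print(response)
```

Keeping the parsing/serialization logic separate from the model call makes the handler easy to test without starting a web server.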
7. Scalability
• Design the system to handle growing amounts of data and users.
• Use techniques like caching, parallel processing, and distributed systems for scalability.

8. Ethical Considerations
• Bias Mitigation:
  • Check if the model unfairly predicts churn for specific demographics (e.g., age or location).
• Compliance:
  • Follow government regulations by anonymizing customer data.
  • Provide explanations for churn predictions using SHAP (SHapley Additive exPlanations) values to stakeholders.
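Of the scalability techniques listed above, caching is the simplest to sketch: repeated requests with identical features can be served without re-running the model. This uses Python's built-in `functools.lru_cache` as an in-process illustration; a production system might use an external cache such as Redis instead.

```python
from functools import lru_cache

@lru_cache(maxsize=10_000)
def cached_churn_score(age: int, subscription_type: str) -> float:
    # Stand-in for an expensive model inference call. Only hashable
    # feature tuples can be cached this way.
    return 0.8 if subscription_type == "basic" else 0.2

cached_churn_score(34, "basic")   # first call: computed (a cache miss)
cached_churn_score(34, "basic")   # second call: served from the cache (a hit)
print(cached_churn_score.cache_info())
```

Caching only helps when the same inputs recur and the model is static; after retraining and redeploying the model, the cache must be invalidated.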
Task
• Explain the steps involved in preprocessing data for a machine learning model. How would you handle missing values, categorical variables, and scaling for numerical features in a churn prediction system?
THANK YOU