A Practical and Technical Introduction To Machine Learning

The document discusses the machine learning project lifecycle including problem framing, data collection, data analysis, data preparation, and model training and evaluation. It provides details on each step such as expressing goals, sampling strategies, exploratory data analysis, feature engineering, establishing baselines, debugging models, and monitoring performance.

Introduction to Machine Learning



What is Machine learning?
● Machine learning is a subset of Artificial Intelligence that enables
computers to learn (progressively improve performance on tasks) from
data (examples, experience) without explicit (rule-based) programming, and to
make predictions or decisions.
Notes: learning autonomously from examples; pattern recognition; autonomously identifying
patterns and extracting insights from data; a training (learning) time followed by a test
(prediction, evaluation) time.

● Traditional Programming: Rules + Data → Output
● Machine Learning: Data + Output → Rules

(The Machine Learning diagram depicts supervised learning at training time.)

Types of Machine Learning

● Prediction (supervised learning): Given an input observation, the model
predicts a numeric value (regression) or a class (classification). (one-time
prediction)
● Analysis (unsupervised learning): The model extracts information (patterns,
structure) from the data.
● Generation: The model generates content (possibly given an input).
● Decision (reinforcement learning): The model (agent) makes decisions by
taking actions and getting rewards in an environment to achieve a goal
(sequential decisions; no labels, but it learns from experience).
Types of Machine Learning (loosely speaking)

Each type can be characterized by its data, ground truth, model output, and objective.
The ground truth differs by type:

Type         Ground truth
Prediction   Label
Analysis     Latent variable
Generation   Target output
Decision     World state

The four pillars of Machine Learning: https://www.youtube.com/watch?v=ZlIjJ9Es-fg


Machine learning project lifecycle: Problem framing

(Pipeline: Problem framing → Data collection → Data wrangling → Data analysis → Model training/evaluation → Model deployment)

Express the goal clearly and concisely

❏ What is the main goal? In which context? (SMART)
❏ What value does it add? (cost-benefit analysis, including maintenance)
❏ Can we solve it without ML? Is it feasible?
❏ Do we have enough quality data? (correct, representative, with predictive power)

Express the goal technically

❏ What is the model's goal (measurable)? What is the success/failure metric?
❏ What are the input and output of the model?
❏ What is the performance measure?
❏ What are the non-ML baselines?
Machine learning project lifecycle: Data collection

❏ Select the dataset size and the sampling strategy
❏ Identify feature and label sources (actual vs proxy labels)
❏ Measure data quality (noise, label or feature errors, missing values, predictive power)
❏ Make sure data is representative of the production use case (beware sampling bias)
❏ Split the data into train, validation, and test sets (beware naive random splitting; fix the
random seed; beware data leakage; with temporal data, train should be older than test; see the
sketch below)
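
A minimal sketch (in Python, with hypothetical column names and toy data) of a leakage-aware, time-ordered split with a fixed seed:

# Minimal sketch: time-ordered train/validation/test split (hypothetical columns).
import numpy as np
import pandas as pd

SEED = 42  # fix the random seed so the split is reproducible
rng = np.random.default_rng(SEED)

# Toy stand-in for the real dataset.
df = pd.DataFrame({
    "timestamp": pd.date_range("2023-01-01", periods=1000, freq="h"),
    "feature": rng.normal(size=1000),
    "label": rng.integers(0, 2, size=1000),
})

# Sort by time so train is strictly older than validation and test,
# which avoids leaking future information into training.
df = df.sort_values("timestamp").reset_index(drop=True)

n = len(df)
train_df = df.iloc[: int(0.7 * n)]               # oldest 70% for training
val_df = df.iloc[int(0.7 * n): int(0.85 * n)]    # next 15% for model selection
test_df = df.iloc[int(0.85 * n):]                # newest 15%, used once at the end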
Machine learning project lifecycle: Data wrangling

❏ Data transformation (convert non-numeric features to numeric, resize inputs to a fixed size)
- Numeric: normalization (range scaling, log-scaling, clipping, z-score
(standardization)), binning (equally-spaced, quantile-based)
- Categorical: one-hot encoding, tokenization
❏ Transform within a pipeline (beware data leakage; see the sketch below)
❏ Data cleaning (missing values, imputation)
❏ Feature engineering (determining which features are important for training and creating
them from raw data)
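
A minimal scikit-learn sketch of leakage-safe transformations: all statistics (medians, scales, category vocabularies) are learned on the training split only and reused unchanged elsewhere. The data and column names are hypothetical:

# Minimal sketch: fit preprocessing on train only, then reuse it unchanged.
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

rng = np.random.default_rng(0)
train_df = pd.DataFrame({"feature": rng.normal(size=700),
                         "category": rng.choice(["a", "b", "c"], size=700)})
val_df = pd.DataFrame({"feature": rng.normal(size=150),
                       "category": rng.choice(["a", "b", "c"], size=150)})

preprocess = ColumnTransformer([
    ("num", Pipeline([
        ("impute", SimpleImputer(strategy="median")),  # fill missing values
        ("scale", StandardScaler()),                   # z-score standardization
    ]), ["feature"]),
    ("cat", OneHotEncoder(handle_unknown="ignore"), ["category"]),
])

# fit_transform learns statistics from train; transform only applies them,
# so nothing leaks from the validation set into preprocessing.
X_train = preprocess.fit_transform(train_df)
X_val = preprocess.transform(val_df)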
Machine learning project lifecycle: Data analysis

❏ Look at data distributions (summary statistics, histogram, density, CDF)
❏ (Periodically) validate the data (expected behavior against a data schema; consistency over
time; consider context, source and sampling strategy)
❏ Identify outliers and decide what to do with them
❏ Handle noise (confidence intervals, hypothesis testing)
❏ Look at individual examples
❏ Group the data (look from the perspective of different subgroups)
❏ Exploratory data analysis (graphs; see the sketch below)
❏ Make hypotheses and look for evidence (scientific method)
❏ Remember correlation != causation
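
A minimal pandas EDA sketch, reusing the hypothetical train_df from the preprocessing sketch above:

# Minimal EDA sketch on the hypothetical train_df from the previous sketch.
import matplotlib.pyplot as plt

print(train_df.describe())       # summary statistics of numeric columns
print(train_df.isna().mean())    # fraction of missing values per column
print(train_df.groupby("category")["feature"].agg(["mean", "std", "count"]))  # subgroup view

train_df["feature"].hist(bins=30)  # distribution of a numeric feature
plt.xlabel("feature")
plt.ylabel("count")
plt.show()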

Machine learning project lifecycle: Model training/evaluation

❏ Establish strong baselines (see the sketch below)
❏ Make sure the (data) pipeline is correct before training (always validate data quality)
❏ Start with a simple model (then incrementally add complexity; train on a small subset first)
❏ Train on training data, select models and tune hyperparameters on validation data, and
test once on test data (update validation and test data over time)
❏ Debug based on the loss and metrics:
- Debug data (validate input data with tests: correctness, representativeness, predictive
power; splits; preprocessing; numerical overflow)
- Debug model: overfitting (reduce model capacity, regularization, more data,
train-test same distribution); underfitting (increase model capacity, reduce
regularization, feature engineering); hyperparameter tuning; feature selection
(correlation with labels, performance on the validation set)
❏ Document everything (especially failures)
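
A minimal sketch (toy data, assumed model choices) comparing a simple model against a trivial baseline; a model that cannot beat the baseline signals a data or pipeline problem:

# Minimal sketch: always compare a simple model against a trivial baseline.
import numpy as np
from sklearn.dummy import DummyClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(0)
X_train, y_train = rng.normal(size=(700, 5)), rng.integers(0, 2, size=700)
X_val, y_val = rng.normal(size=(150, 5)), rng.integers(0, 2, size=150)

baseline = DummyClassifier(strategy="most_frequent").fit(X_train, y_train)
model = LogisticRegression(max_iter=1000).fit(X_train, y_train)

print("baseline accuracy:", accuracy_score(y_val, baseline.predict(X_val)))
print("model accuracy:", accuracy_score(y_val, model.predict(X_val)))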
Machine learning project lifecycle: Model deployment

❏ Periodically retrain the model on new data
❏ Treat data and model as code (version control)
❏ Test each component of the pipeline (input data, data transformations, model updates,
serving infrastructure)
❏ Run an integration test of the end-to-end pipeline (when introducing new models or
training on new data)
❏ Track training-serving skew (data schema skew, feature skew; beware feedback loops;
update the model on new data; see the sketch below)
❏ Monitor model performance and efficiency (regression testing, checkpointing)
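
One way to track feature skew is to compare the training and serving distributions of each feature; a minimal sketch with a two-sample Kolmogorov-Smirnov test (toy data, assumed alert threshold):

# Minimal sketch: flag training-serving skew on a single numeric feature.
import numpy as np
from scipy.stats import ks_2samp

rng = np.random.default_rng(0)
train_feature = rng.normal(loc=0.0, size=5000)   # distribution at training time
serving_feature = rng.normal(loc=0.3, size=500)  # distribution observed in production

stat, p_value = ks_2samp(train_feature, serving_feature)
ALERT_P = 0.01  # assumed alert threshold
if p_value < ALERT_P:
    print(f"Possible feature skew: KS statistic={stat:.3f}, p-value={p_value:.4f}")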


Machine learning project lifecycle (summary)
1. Problem framing
Express the problem within the business context and emphasize its value
Decide if it is solvable without ML; cost-benefit analysis; feasibility; data requirements
Define the problem technically and choose a performance measure
Prepare the environment
2. Data collection
Make sure data is representative of production use cases
Reduce sampling bias
Data annotation strategy if required
Split the data for evaluation
3. Data analysis
EDA (summary statistics, visualisations, identify outliers)
Extract insights from data
4. Data preparation
Data cleaning and formatting (imputation, encoding, standardization)
Feature engineering
5. Model training and evaluation
Build an end-to-end pipeline that can be tested
Start with simple models and find strong baselines
Model selection and hyperparameter tuning
Error analysis
6. Model deployment and maintenance
Pipeline integration
Monitoring and regression testing
7. Presentation
Terminology
Data
❖ Features
❖ Examples
❖ Labels
❖ Dataset

● Supervised learning: Examples are labeled. The goal is to find a model that
predicts y from x.
❏ Classification: Label is a category.
❏ Regression: Label is a real number.
● Unsupervised learning: Examples are unlabeled.
● Reinforcement learning
Supervised learning
Data
❖ Features: the attributes describing each example
❖ Examples (in the sample space X): x ∈ X
❖ Labels (in the label space Y): y ∈ Y
❖ Labeled dataset: D = {(x_1, y_1), ..., (x_n, y_n)}

Examples are labeled. The goal is to find a function f: X → Y that outputs a
"good" prediction of y given x on unseen examples.
➔ A loss function ℓ(f(x), y) measures how good a prediction is.
The goal is to find f with a small loss on unseen examples.
Error decomposition
● All pairs (x, y) are drawn i.i.d. from an unknown distribution P
on X × Y (data generating assumption).

● The goal is to find f with a small expected loss or risk (generalization error):
L(f) = E_{(x,y)~P}[ℓ(f(x), y)]

● But we cannot compute it since P is unknown (no access to the population).
So we use an estimate (based on a sample, the train dataset) and minimize the
empirical risk (train error; see the numeric sketch below):
L_emp(f) = (1/n) Σ_{i=1}^{n} ℓ(f(x_i), y_i)

By the Law of Large Numbers, L_emp(f) → L(f) almost surely.

● The function f* with the smallest risk over all measurable functions is the Bayes
function (best in theory): f* = argmin_f L(f)

● We have to choose f from a set of functions F called the hypothesis space.
The function with the smallest risk in F is f_F = argmin_{f ∈ F} L(f) (best in class).

● The function with the smallest empirical risk in F is f̂ = argmin_{f ∈ F} L_emp(f)
(best in practice).
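
With squared loss and toy numbers (all hypothetical), a minimal sketch of the empirical risk as the average per-example loss:

# Minimal sketch: empirical risk = average loss over the training sample.
import numpy as np

y_true = np.array([1.0, 0.5, -0.2, 2.0])  # toy labels y_i
y_pred = np.array([0.8, 0.7, 0.0, 1.5])   # toy predictions f(x_i)

squared_losses = (y_pred - y_true) ** 2   # per-example loss l(f(x_i), y_i)
L_emp = squared_losses.mean()             # empirical risk (train error)
print(L_emp)  # 0.0925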

[Figure: the risk of f̂ decomposes into estimation error, approximation error, and the Bayes error.]

● Approximation (representation) error: comes from restricting to the class F
instead of all measurable functions (does not change with infinite data).
Smaller when F is bigger (more complex).
● Estimation (generalization) error: comes from using finite training data
(the empirical risk instead of the true risk; zero with infinite data).
Smaller when F is smaller (less complex).
● Optimization error: comes from the algorithmic problem of minimizing the
empirical risk (f̂ may overfit more than f_F; the loss might not be convex).

Balancing approximation and estimation error is the Bias-Variance trade-off. Your
job is to find an F that balances these errors.
Fundamental questions of Machine Learning

● Representation: What class of functions F should we choose?
● Generalization: Will the performance of the predictor transfer
from seen training examples to unseen examples?
● Optimization: How can we efficiently solve the optimization
problem?

These questions are intertwined rather than independent.

Generalization bounds
(under the i.i.d. data generating assumption, for a loss bounded in [0, 1])

● Finite F
Let F be a finite hypothesis set. Then, for all f in F, for all δ > 0, with probability at
least 1 − δ:
L(f) ≤ L_emp(f) + sqrt((ln|F| + ln(1/δ)) / (2n))

Since this holds uniformly over F, it also bounds the estimation error.

● Infinite F (Vapnik-Chervonenkis)
Let F be a hypothesis set with finite VC dimension d_VC. Then, for all f in F, for
all δ > 0, with probability at least 1 − δ, up to constants:
L(f) ≤ L_emp(f) + O(sqrt((d_VC log(n/d_VC) + log(1/δ)) / n))

The VC dimension measures the "effective" size of the class, that is, the size of the
projection of the class onto finite observations.
Regularization
(complexity, capacity, richness, expressivity)

● The main goal of regularization is to reduce the generalization error by
reducing the complexity of the hypothesis space F.
● Given a complexity measure Ω (e.g., a norm on F), the constrained
hypothesis space F_C = {f ∈ F : Ω(f) ≤ C} is the set of functions with
complexity at most C.

Increasing complexities C = 0, 1, 2, 3.56, ... give nested spaces F_0 ⊆ F_1 ⊆ F_2 ⊆ ...

● Constrained (structural) empirical risk minimization: minimize L_emp(f) subject to Ω(f) ≤ C
● Penalized empirical risk minimization: min_{f ∈ F} L_emp(f) + λ Ω(f)
(see the sketch below)
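
A minimal sketch of penalized empirical risk minimization using ridge regression, which solves min_w L_emp(w) + α‖w‖²; the α values are arbitrary assumptions:

# Minimal sketch: penalized ERM with ridge regression.
import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 10))
w_true = rng.normal(size=10)
y = X @ w_true + 0.1 * rng.normal(size=100)

for alpha in (0.01, 1.0, 100.0):  # arbitrary penalty strengths
    model = Ridge(alpha=alpha).fit(X, y)
    # Stronger penalties constrain the hypothesis space more: smaller ||w||.
    print(f"alpha={alpha:7.2f}  ||w|| = {np.linalg.norm(model.coef_):.3f}")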


