Machine Learning
Machine Learning is a technique for building models from data for tasks such as image and
speech recognition. However, it faces challenges such as data quality, overfitting and
underfitting, computational resource demands, interpretability, generalization, ethical concerns,
and security. The learning process involves training a model on data and then applying it to real-
world data, as shown in the vertical and horizontal flows of the figure.
Machine Learning faces a fundamental challenge because the training data and the input data
the deployed model encounters are distinct. A core issue is that a model trained on data from one
source may fail to perform well on data from another source, such as different handwriting styles.
Achieving the desired results requires unbiased training data that accurately reflects real-world
data. The process of ensuring that a model's performance stays consistent across varied data is
called generalization, and the success of Machine Learning hinges on effective generalization.
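To make the generalization gap concrete, here is a minimal Python sketch (assuming scikit-learn is available; the synthetic dataset and the decision-tree model are illustrative choices, not from the text) that compares a model's accuracy on its own training data with its accuracy on held-out data:

```python
# Minimal sketch: measuring generalization with a held-out set.
# Dataset and model are illustrative assumptions, not from the text.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic data standing in for "real-world" inputs.
X, y = make_classification(n_samples=500, n_features=10, random_state=0)

# Hold out data the model never sees during training.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0)

model = DecisionTreeClassifier(random_state=0).fit(X_train, y_train)

# A large gap between the two scores signals poor generalization.
print("training accuracy:", model.score(X_train, y_train))
print("held-out accuracy:", model.score(X_test, y_test))
```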
Overfitting
Overfitting occurs when a model is fitted too closely to the training data, causing it to perform
well on that data but poorly on new, unseen data. The term describes the model's inability to
generalize beyond the training data. The idea is easier to grasp with a case study:
consider a classification problem in which we need to divide position (or coordinate) data into two
groups based on training data points. The goal is to find a curve that accurately defines the
border between the two groups using the given training data.
Judged against this curve, some points are not correctly classified according to the border.
What about grouping the points perfectly with a complex curve, as shown in Figure 1-8?
This model groups the training data perfectly. How does it look? Do you like this model better?
Does it seem to correctly reflect the general behavior? Now, let’s use this model in the real
world. The new input to the model is indicated with the symbol ■, as shown in Figure 1-9.
The previously error-free model now classifies the new data point as class ∆. However, given
the general trend of the training data, this classification is doubtful; grouping it with class •
seems more reasonable. This casts doubt on the model's performance, despite its earlier 100%
accuracy on the training data.
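The same effect can be reproduced in code. The following sketch is an illustration under assumed data (scikit-learn's make_moons stands in for the figure's points, which are not given in the text): it fits a gentle boundary and a very complex one, then scores both on fresh samples.

```python
# Sketch: a flexible boundary can score near 100% on training points
# yet misjudge new data. Data here is synthetic, for illustration only.
from sklearn.datasets import make_moons
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

X_train, y_train = make_moons(n_samples=40, noise=0.3, random_state=1)
X_new, y_new = make_moons(n_samples=200, noise=0.3, random_state=2)

for degree in (1, 15):  # a gentle curve vs. a very complex one
    model = make_pipeline(
        PolynomialFeatures(degree),
        LogisticRegression(C=1e6, max_iter=10000),  # nearly unregularized
    ).fit(X_train, y_train)
    print(f"degree {degree:2d}: "
          f"train={model.score(X_train, y_train):.2f} "
          f"new={model.score(X_new, y_new):.2f}")
```

The complex boundary typically scores higher on the training points but lower on the new samples, mirroring the behavior of Figures 1-8 and 1-9.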
Noisy Data: The data contains outliers and noise that blur the boundary between groups.
Overfitting: A model that accounts for every data point, noise included, becomes overfitted: it
fits the training data too closely but lacks generalizability to new, unseen data.
Balancing Accuracy: The goal is to derive an accurate model from the training data without
deliberately making it less accurate.
Dilemma: Driving the training error toward zero invites overfitting, which harms generalizability.
Techniques to prevent overfitting are introduced in the following section.
The central theme is finding the balance between fitting the training data well and maintaining
the model's ability to generalize to new data.
Confronting Overfitting
In the world of machine learning, overfitting is a common problem that can significantly affect
model performance. Tackling this issue effectively separates the pros from the amateurs. The
following discusses two typical methods used to confront overfitting: regularization and
validation.
Regularization suppresses overfitting by keeping the model simple, even if this means sacrificing
some performance on the training data. Numerically, it penalizes overly complex model structures,
so the resulting model reflects the overall characteristics of the data rather than its noise.
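As one concrete instance (ridge regression with an L2 penalty is a common regularization technique, used here as an assumed example rather than the specific method the text has in mind), the sketch below shows the penalty shrinking the coefficients of a high-degree polynomial fit:

```python
# Sketch: L2 regularization (ridge) penalizes large coefficients,
# simplifying the fitted curve. Data is synthetic, for illustration.
import numpy as np
from sklearn.linear_model import LinearRegression, Ridge
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

rng = np.random.default_rng(0)
x = rng.uniform(-1, 1, size=(30, 1))
y = np.sin(3 * x).ravel() + rng.normal(scale=0.3, size=30)  # noisy samples

unregularized = make_pipeline(PolynomialFeatures(12), LinearRegression())
regularized = make_pipeline(PolynomialFeatures(12), Ridge(alpha=1.0))

for name, model in [("plain", unregularized), ("ridge", regularized)]:
    model.fit(x, y)
    coefs = model[-1].coef_  # coefficients of the final estimator
    print(f"{name}: train R^2={model.score(x, y):.3f}, "
          f"max |coef|={np.abs(coefs).max():.1f}")
```

The regularized model gives up a little training accuracy, but its small coefficients produce a smoother curve that follows the trend rather than the noise.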
Validation is a process that uses a reserved part of the training data to monitor the model's
performance. This reserved data, called the validation set, is not used during the training
process. By checking the model's performance on this validation set, we can determine if the
model is overfitted and make necessary adjustments.
In essence, these methods help in creating models that generalize better to new, unseen data.
When validation is involved, the training process of Machine Learning proceeds by the following
steps:
1. Divide the training data into two groups: one for training and the other for validation. As a
rule of thumb, the ratio of the training set to the validation set is 8:2.
2. Train the model with the training set.
3. Evaluate the performance of the model using the validation set.
   a. If the model yields satisfactory performance, finish the training.
   b. If the performance does not produce sufficient results, modify the model and repeat the
process from Step 2.
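A minimal sketch of these steps in Python follows (the 8:2 split matches the rule of thumb above; the decision-tree model, the growing-depth "modification", and the 0.90 threshold are illustrative assumptions):

```python
# Sketch of the validation loop: hold out 20% of the training data,
# train, check performance, and modify the model until it suffices.
# The model family and stopping threshold are illustrative assumptions.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=1000, random_state=0)

# Step 1: split 8:2 into a training set and a validation set.
X_tr, X_val, y_tr, y_val = train_test_split(X, y, test_size=0.2,
                                            random_state=0)

for depth in range(1, 11):              # "modify the model and repeat"
    model = DecisionTreeClassifier(max_depth=depth, random_state=0)
    model.fit(X_tr, y_tr)               # Step 2: train on the training set
    score = model.score(X_val, y_val)   # Step 3: evaluate on validation set
    print(f"depth={depth}: validation accuracy={score:.3f}")
    if score >= 0.90:                   # Step 3a: satisfactory -> stop
        break
```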
Cross-validation is a slight variation of the validation process. It still divides the training
data into groups for training and validation, but it keeps changing the datasets: instead of
retaining the initially divided sets, cross-validation repeats the division of the data. The
reason is that a model can become overfitted even to the validation set when that set is fixed.
Because cross-validation maintains the randomness of the validation dataset, it can better detect
overfitting of the model. Figure 1-11 describes the concept of cross-validation. The dark shades
indicate the validation data, which is randomly selected throughout the training process.
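The repeated re-division that Figure 1-11 depicts corresponds to what is commonly called k-fold cross-validation; the sketch below uses scikit-learn's KFold as an assumed concrete implementation, rotating which slice of the data serves as the validation set:

```python
# Sketch of cross-validation: the validation fold keeps changing,
# so no single fixed subset can be silently overfitted.
from sklearn.datasets import make_classification
from sklearn.model_selection import KFold
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=500, random_state=0)

scores = []
# Each iteration uses a different slice as the validation set
# (the "dark shade" of Figure 1-11), and the rest for training.
for train_idx, val_idx in KFold(n_splits=5, shuffle=True,
                                random_state=0).split(X):
    model = DecisionTreeClassifier(random_state=0)
    model.fit(X[train_idx], y[train_idx])
    scores.append(model.score(X[val_idx], y[val_idx]))

print("per-fold validation accuracy:", [f"{s:.2f}" for s in scores])
print("mean:", sum(scores) / len(scores))
```

Averaging the score over rotating folds gives a more trustworthy estimate of generalization than any single fixed validation set.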