Linear
Classes are distinct and separable
Iris model from an earlier chapter
Better fit = more attributes
Patterns that do not generalize: over-fitting
Mathematical Functions
Over-fitting and Its Avoidance
Adding more xi's makes the model more complex
Allows for flexibility when searching the data
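The effect of adding attributes can be sketched with a toy least-squares fit (a minimal sketch using numpy and synthetic data; the features and numbers are made up for illustration): each extra xi column gives the model another learned wi, so the fit to the training data can only improve.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(-1, 1, 30)
y = np.sin(2 * x) + rng.normal(0, 0.1, 30)  # synthetic noisy target

def fit_mse(num_features):
    # Feature matrix with a constant column plus x^1 .. x^k;
    # each extra column is one more attribute xi with its own learned wi.
    X = np.column_stack([x ** i for i in range(num_features + 1)])
    w, *_ = np.linalg.lstsq(X, y, rcond=None)  # learned parameters wi
    return np.mean((X @ w - y) ** 2)

# More attributes -> more flexibility -> a tighter fit on the training data
assert fit_mse(9) <= fit_mse(3) <= fit_mse(1)
```

Note this only shows the fit to the *training* data improving; whether the extra flexibility helps on new data is exactly the over-fitting question.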
wi = a learned parameter
Measure accuracy on training and test set
If not pure: estimate based on average
Sweet spot: where it starts to over-fit
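The fitting-graph idea above can be sketched by tracking training and held-out error as complexity grows (a sketch on synthetic data, using polynomial degree as an assumed stand-in for model complexity): the sweet spot is where held-out error bottoms out.

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.uniform(-1, 1, 200)
y = np.sin(3 * x) + rng.normal(0, 0.2, 200)
# Holdout split: first half for training, second half kept hidden
x_tr, y_tr, x_te, y_te = x[:100], y[:100], x[100:], y[100:]

train_err, test_err = [], []
degrees = list(range(1, 13))          # increasing model complexity
for d in degrees:
    w = np.polyfit(x_tr, y_tr, d)
    train_err.append(np.mean((np.polyval(w, x_tr) - y_tr) ** 2))
    test_err.append(np.mean((np.polyval(w, x_te) - y_te) ** 2))

# Training error keeps falling as complexity grows...
assert train_err[-1] <= train_err[0]
# ...but held-out error bottoms out somewhere in between: the sweet spot
sweet_spot = degrees[int(np.argmin(test_err))]
```

Plotting `train_err` and `test_err` against `degrees` gives the fitting graph; past the sweet spot the model is fitting patterns that do not generalize.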
Generalization
Sectioning to get "pure" data
Chapter 5: Mind Map
Does not fit other data: over-fit
Over-fitting in Tree Induction
For previously unseen data
memorizes training data and doesn't generalize
Number of nodes = complexity of the tree
Sampling approach = table model
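The table-model idea can be sketched in a few lines (the churn-style features below are invented for illustration): it memorizes every training example verbatim, so it is perfect on the training set and useless on anything unseen.

```python
# Hypothetical "table model": a pure lookup table of training examples.
training_data = {
    ("high_usage", "old_contract"): "churn",
    ("low_usage", "new_contract"): "stay",
}

def table_model(example):
    # Perfect recall on anything it has seen; no prediction for unseen data
    return training_data.get(example, "no prediction")

print(table_model(("high_usage", "old_contract")))  # memorized -> "churn"
print(table_model(("high_usage", "new_contract")))  # unseen -> "no prediction"
```

Measured on its own training data the table model looks flawless, which is why training-set accuracy alone is misleading.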
Growing trees until the leaves are pure: how to over-fit
If it fails: more realistic models will fail too
All model types can, and do, over-fit
Recognize and manage it in a principled way
Based on how complex you allow the model to be
Tendency to tailor models to the training data
Overfitting
At the expense of Generalization
Shows accuracy as a function of model complexity
Fitting Graph
Comparing predicted values w/ hidden true values
Over-fitting increases when you allow more flexibility
Generalization Performance
Why is it bad?
estimated performance
estimates computed over all the data
Must mistrust accuracy measured on the training set
Cross-validation:
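The splitting behind cross-validation can be sketched by hand (a minimal k-fold sketch, not any particular library's API): each example is held out exactly once, so the estimate draws on all the data.

```python
def kfold_splits(n_examples, k):
    """Yield (train_idx, test_idx) pairs; each fold is held out once."""
    folds = [list(range(i, n_examples, k)) for i in range(k)]
    for i in range(k):
        test = folds[i]
        # Train on everything outside the held-out fold
        train = [idx for j, f in enumerate(folds) if j != i for idx in f]
        yield train, test

# Across the k iterations, every example lands in exactly one test fold
all_test = [i for _, test in kfold_splits(10, 5) for i in test]
assert sorted(all_test) == list(range(10))
```

Averaging the k held-out accuracies gives a generalization estimate that is far more trustworthy than accuracy on the training set.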
More sophisticated
Churn dataset
Model will pick up harmful correlations
all models are susceptible to over-fitting effects
Tree induction
Stop growing the tree
Avoidance
Grow the tree until it is too large, then prune it back
Estimate the generalization performance of each model
Find the right balance
Equations
Parameter optimization
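Parameter optimization can be sketched with plain gradient descent on squared error (a sketch on synthetic data; gradient descent is one common approach, not necessarily the one the chapter's equations use): the learned wi are nudged downhill on the training loss.

```python
import numpy as np

rng = np.random.default_rng(2)
x = rng.uniform(-1, 1, 50)
y = 3.0 * x + 1.0 + rng.normal(0, 0.05, 50)  # true w=3, b=1 plus noise

w, b, lr = 0.0, 0.0, 0.1
for _ in range(500):
    pred = w * x + b
    # Gradients of mean squared error with respect to each parameter
    w -= lr * np.mean(2 * (pred - y) * x)
    b -= lr * np.mean(2 * (pred - y))

# The optimizer recovers parameters close to the ones that generated the data
assert abs(w - 3.0) < 0.3 and abs(b - 1.0) < 0.3
```

Note the loss being minimized is measured on the training data, which is exactly why unconstrained optimization of a flexible function can over-fit.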