Parametric

The document provides an overview of parametric machine learning methods, defining them as algorithms that make strong assumptions about data distribution and estimate a fixed set of parameters. It contrasts parametric methods, like Linear Regression, with non-parametric methods, like Decision Trees, highlighting their differences in assumptions, complexity, and data requirements. Additionally, it discusses the assumptions of parametric models, their violations, and includes Python examples to illustrate concepts such as model fitting and residual analysis.


1. Introduction to Parametric Machine Learning Methods
 Deeper Content:
   o Definition: Parametric machine learning algorithms make strong assumptions about the form of the mapping function (how input features map to the output). They estimate a fixed set of parameters from the data.
   o Parametric vs. Non-parametric (Expanded):
      Parametric:
        Assumptions: Strong assumptions about the data distribution.
        Parameters: Fixed number of parameters.
        Complexity: Lower complexity.
        Interpretability: Often easier to interpret.
        Data Needs: Can work with smaller datasets if the assumptions hold.
        Examples: Linear Regression, Logistic Regression, Naive Bayes.
      Non-parametric:
        Assumptions: Fewer assumptions about the data distribution.
        Parameters: Number of parameters grows with the size of the data.
        Complexity: Higher complexity.
        Interpretability: Can be harder to interpret.
        Data Needs: Generally require larger datasets.
        Examples: Decision Trees, Support Vector Machines, Neural Networks (can be considered semi-parametric in some cases).
   o Model Complexity:
      Underfitting: The model is too simple to capture the underlying structure of the data (high bias).
      Overfitting: The model learns the training data too well, including its noise, and performs poorly on new data (high variance).
      Parametric models are more prone to underfitting if the chosen functional form is wrong, while non-parametric models are more prone to overfitting if not regularized.
   o Bias-Variance Tradeoff (Expanded):
      Bias: Error from erroneous assumptions in the learning algorithm. High bias can cause an algorithm to miss the relevant relations between features and target outputs.
      Variance: Error from sensitivity to small fluctuations in the training set. High variance can cause an algorithm to model the random noise in the training data rather than the intended outputs.
      Parametric models tend to have high bias and low variance (if the chosen form is simple), while non-parametric models tend to have low bias and high variance. The goal is to find a balance; a short sketch after this list illustrates the effect of model complexity on training and test error.
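 Quick Sketch: Underfitting vs. Overfitting (a minimal illustration of the tradeoff described above; it assumes scikit-learn is available, and the polynomial degrees and the simple even/odd train/test split are arbitrary choices for demonstration):
Python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error

# Noisy sine data, split into train and test halves
np.random.seed(0)
X = np.sort(5 * np.random.rand(80, 1), axis=0)
y = np.sin(X).ravel() + np.random.normal(0, 0.2, len(X))
X_train, X_test = X[::2], X[1::2]
y_train, y_test = y[::2], y[1::2]

for degree in (1, 4, 15):
    # Polynomial regression is still parametric: a fixed set of coefficients
    model = make_pipeline(PolynomialFeatures(degree), LinearRegression())
    model.fit(X_train, y_train)
    train_mse = mean_squared_error(y_train, model.predict(X_train))
    test_mse = mean_squared_error(y_test, model.predict(X_test))
    print(f"degree={degree:2d}  train MSE={train_mse:.3f}  test MSE={test_mse:.3f}")

# Typically: degree 1 underfits (high train and test error), a moderate degree
# fits well, and degree 15 overfits (low train error, higher test error).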
 Python Example (with more explanation):
Python
import numpy as np
import matplotlib.pyplot as plt
from sklearn.linear_model import LinearRegression
from sklearn.tree import DecisionTreeRegressor

# Generate some non-linear data
np.random.seed(0)
X = np.sort(5 * np.random.rand(80, 1), axis=0)
y = np.sin(X).ravel() + np.random.normal(0, 0.2, len(X))

# Parametric: Linear Regression
lr = LinearRegression()
lr.fit(X, y)
y_lr = lr.predict(X)

# Non-parametric: Decision Tree (depth limited to prevent overfitting)
dt = DecisionTreeRegressor(max_depth=5)
dt.fit(X, y)
y_dt = dt.predict(X)

# Plot the results
plt.figure()
plt.scatter(X, y, s=20, label="Original")
plt.plot(X, y_lr, color="red", label="Linear Regression")
plt.plot(X, y_dt, color="green", label="Decision Tree")
plt.xlabel("X")
plt.ylabel("Y")
plt.title("Parametric vs. Non-parametric")
plt.legend()
plt.show()

# Explanation:
# - Linear Regression (Parametric):
#   - Definition: Assumes a linear relationship between the input (X) and the output (y).
#     The model learns the parameters of a straight line (slope and intercept).
#   - fit(): Learns the optimal parameters by minimizing the sum of squared errors
#     between the predicted and actual values.
#   - predict(): Uses the learned parameters to make predictions on the input data.
# - Decision Tree (Non-parametric):
#   - Definition: A tree-like model that makes decisions based on the values of the
#     input features. It partitions the feature space into regions and assigns a
#     prediction to each region.
#   - max_depth: A hyperparameter that limits the depth of the tree to prevent overfitting.
#   - fit(): Learns the tree structure by recursively partitioning the data on the
#     features that best separate the target variable.
#   - predict(): Traverses the tree based on the input features to reach a leaf node,
#     which provides the prediction.
2. Assumptions of Parametric Machine Learning Methods
 Deeper Content:
   o Definition: Assumptions are conditions that must be met for a parametric model to be valid and provide reliable results.
   o Impact of Violations (Expanded; a short simulation of the independence violation follows this list):
      Linearity:
        Violation: The relationship between the independent and dependent variables is not linear.
        Impact: The model will not accurately capture the relationship, leading to poor predictions.
        Example: Trying to fit a straight line to a curved relationship.
      Independence of Errors (Residuals):
        Violation: The errors (residuals) are correlated with each other. This often occurs in time series data.
        Impact: Standard errors of the coefficients will be underestimated, leading to unreliable hypothesis tests and confidence intervals.
        Example: In a time series, if one day's error is positive, the next day's error is also likely to be positive.
      Homoscedasticity (Equal Variance of Errors):
        Violation: The variance of the errors is not constant across all levels of the independent variables.
        Impact: Standard errors will be unreliable, affecting hypothesis tests and confidence intervals. Predictions will be more precise in some ranges of the independent variable than in others.
        Example: The variance of the errors might be higher for higher values of income.
      Normality of Errors (Residuals):
        Violation: The errors are not normally distributed.
        Impact: Hypothesis tests and confidence intervals may be unreliable, especially with small sample sizes.
        Example: Errors might be skewed or have heavy tails.
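 Quick Sketch: Autocorrelated Errors (a minimal simulation of the independence violation described above; the AR(1) coefficient of 0.8 is an arbitrary choice for illustration):
Python
import numpy as np

# Simulate AR(1) errors: today's error is 0.8 * yesterday's error plus fresh noise
np.random.seed(0)
n = 200
errors = np.zeros(n)
for t in range(1, n):
    errors[t] = 0.8 * errors[t - 1] + np.random.normal(0, 1)

# Lag-1 correlation of the errors: near 0 would indicate independence,
# a value well above 0 indicates positive autocorrelation
lag1_corr = np.corrcoef(errors[:-1], errors[1:])[0, 1]
print(f"lag-1 autocorrelation of errors: {lag1_corr:.2f}")  # expect roughly 0.8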
   o Assumption Testing (Expanded; a sketch showing how to run these diagnostics follows this list):
      Scatter Plots:
        Purpose: Visual inspection of the relationship between variables to assess linearity.
        Interpretation: A linear pattern suggests linearity; a curved pattern suggests non-linearity.
      Residual Plots:
        Purpose: To check for homoscedasticity and linearity.
        Interpretation:
          Homoscedasticity: Residuals should be randomly scattered around zero with no discernible pattern. A funnel shape or other pattern suggests heteroscedasticity.
          Linearity: Residuals should be randomly scattered around zero. A curved pattern suggests non-linearity.
      Durbin-Watson Test:
        Purpose: To test for autocorrelation (correlation between errors), typically in time series data.
        Interpretation: Values range from 0 to 4. A value of 2 indicates no autocorrelation; values close to 0 indicate positive autocorrelation, and values close to 4 indicate negative autocorrelation.
      Q-Q Plots (Quantile-Quantile Plots):
        Purpose: To assess the normality of residuals.
        Interpretation: If the residuals are normally distributed, the points on the Q-Q plot fall approximately along a straight line.
      Shapiro-Wilk Test:
        Purpose: A statistical test for normality.
        Interpretation: The test returns a test statistic and a p-value. A small p-value (typically < 0.05) indicates that the data differ significantly from a normal distribution.
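 Quick Sketch: Running the Diagnostics (a minimal example assuming statsmodels and scipy are installed; the simulated residuals stand in for a fitted model's model.resid):
Python
import numpy as np
import matplotlib.pyplot as plt
import statsmodels.api as sm
from statsmodels.stats.stattools import durbin_watson
from scipy import stats

# Stand-in residuals; in practice use model.resid from a fitted OLS model
np.random.seed(0)
residuals = np.random.normal(0, 1, 100)

# Durbin-Watson: about 2 means no autocorrelation, < 2 positive, > 2 negative
print("Durbin-Watson:", durbin_watson(residuals))

# Shapiro-Wilk: a small p-value (< 0.05) suggests non-normal residuals
stat, p_value = stats.shapiro(residuals)
print(f"Shapiro-Wilk: statistic={stat:.3f}, p-value={p_value:.3f}")

# Q-Q plot: points near the 45-degree line suggest normality
sm.qqplot(residuals, line="45")
plt.title("Q-Q Plot of Residuals")
plt.show()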
   o Remedial Measures (Expanded; a sketch of two of these remedies follows this list):
      Non-linearity:
        Variable Transformations:
          Log Transformation: Useful for reducing skewness and linearizing exponential relationships.
          Square Root Transformation: Useful for count data and for stabilizing variance.
        Polynomial Regression: Adding polynomial terms of the independent variables to the model.
      Heteroscedasticity:
        Variable Transformations: Can sometimes stabilize the variance.
        Weighted Least Squares (WLS): Assigns weights to observations based on the variance of their errors.
      Non-normality:
        Variable Transformations: Can sometimes make the distribution of the errors more normal.
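 Quick Sketch: Log Transformation and Weighted Least Squares (a minimal example; the simulated data and the weights 1/x**2 are illustrative assumptions, not a general prescription):
Python
import numpy as np
import statsmodels.api as sm

np.random.seed(1)
x = np.linspace(1, 10, 100)
X = sm.add_constant(x)

# Log transformation: an exponential relationship y = exp(a + b*x) becomes
# linear after taking the log of y
y_exp = np.exp(0.5 + 0.3 * x) * np.exp(np.random.normal(0, 0.1, 100))
log_model = sm.OLS(np.log(y_exp), X).fit()
print("log-transformed fit (intercept, slope):", log_model.params)  # roughly 0.5, 0.3

# Weighted least squares: heteroscedastic data where the error standard
# deviation grows with x, so each observation is weighted by 1 / x**2.
# In practice the weights come from an estimate of how the error variance
# changes, not from a known formula.
y_het = 2 * x + np.random.normal(0, x, 100)
wls_model = sm.WLS(y_het, X, weights=1.0 / x**2).fit()
print("OLS coefficients:", sm.OLS(y_het, X).fit().params)
print("WLS coefficients:", wls_model.params)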
 Python Example (with more explanation):
Python
import numpy as np
import matplotlib.pyplot as plt
import statsmodels.api as sm

# Generate data (with non-constant variance for demonstration)
np.random.seed(1)
X = np.linspace(0, 10, 100)
y = 2 * X + np.random.normal(0, X, 100)  # Variance increases with X

# Plot the data to check for linearity
plt.scatter(X, y)
plt.xlabel("X")
plt.ylabel("Y")
plt.title("Scatter Plot")
plt.show()

# Fit a linear regression model
X = sm.add_constant(X)  # Add a constant column for the intercept
model = sm.OLS(y, X).fit()
print(model.summary())  # Print the model summary

# Get residuals
residuals = model.resid

# Plot residuals to check for homoscedasticity
plt.scatter(model.fittedvalues, residuals)
plt.xlabel("Predicted Values")
plt.ylabel("Residuals")
plt.title("Residual Plot")
plt.axhline(y=0, color='r', linestyle='--')
plt.show()
# Explanation:
# - Scatter Plot:
#   - Definition: A graph that plots pairs of data points (X, y) to visualize the
#     relationship between two variables.
#   - Here it shows the relationship between the independent variable (X) and the
#     dependent variable (y).
# - Residual Plot:
#   - Definition: A graph that plots the residuals (the differences between the actual
#     and predicted values) against the predicted values.
#   - Homoscedasticity:
#     - Definition: The property of having equal variance of the errors across all
#       levels of the independent variables.
#     - In this example, the spread of the residuals increases with the predicted
#       values, indicating heteroscedasticity (non-constant variance).
# - model.summary():
#   - Definition: A statsmodels method that provides a detailed summary of the
#     regression results, including parameter estimates, standard errors, t-values,
#     p-values, R-squared, and other statistics.
#   - This output helps to assess the model's fit and the significance of the parameters.
