INTRODUCTION TO AI & ML
Prepared by
Aamir Ajaz Malik
IT Trainer/Consultant
+916005352220
AI- ARTIFICIAL INTELLIGENCE
WHAT IS AI?
AI refers to machines or systems capable of performing tasks that
require human intelligence (e.g., decision-making, language
understanding, and pattern recognition).
Examples: Alexa/Siri, chatbots
WHAT IS MACHINE LEARNING?
ML is a subset of AI where machines learn from data to make
predictions or decisions without being explicitly programmed.
Example: Email spam filters, image recognition.
REAL-LIFE APPLICATIONS OF AI & ML
•Healthcare: Predicting diseases or diagnosing from X-rays.
•E-commerce: Product recommendations (Amazon, Flipkart).
•Entertainment: Netflix/YouTube recommendations.
•Gaming: Smarter opponents or NPCs in video games.
•Education: AI-driven personalized learning platforms.
E-COMMERCE
Product Recommendations
Amazon, Flipkart, and other e-commerce platforms use AI to
suggest products based on a customer’s browsing history,
purchase patterns, and preferences.
Example: If a customer buys a smartphone, they might be
recommended accessories like a case or earphones.
Algorithm Used: Collaborative Filtering, Content-Based Filtering,
and Deep Learning techniques power these recommendation
engines.
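A minimal item-based collaborative filtering sketch with a made-up rating matrix; the data and every name below are illustrative, not any platform's actual system:

```python
import numpy as np

# Hypothetical user-item rating matrix (rows: users, columns: items).
# A 0 means the user has not rated that item.
ratings = np.array([
    [5, 4, 0, 1],   # user 0
    [4, 5, 1, 0],   # user 1
    [1, 0, 5, 4],   # user 2
    [0, 1, 4, 5],   # user 3
], dtype=float)

def cosine_similarity(a, b):
    """Cosine similarity between two rating vectors."""
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

# Item-based collaborative filtering: two items are similar if the same
# users rated them similarly (compare matrix columns).
n_items = ratings.shape[1]
sim = np.array([[cosine_similarity(ratings[:, i], ratings[:, j])
                 for j in range(n_items)] for i in range(n_items)])

# Score user 0's unrated item 2 as a similarity-weighted average of
# the ratings that user has already given.
user = ratings[0]
unrated = 2
mask = user > 0
predicted = (sim[unrated][mask] @ user[mask]) / sim[unrated][mask].sum()
print(round(predicted, 2))
```

Real recommendation engines add matrix factorization and deep learning on top, but the core idea of scoring unseen items via similarity is the same.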
ENTERTAINMENT
Netflix/YouTube Recommendations
AI analyzes user behavior (e.g., watch history, likes, duration spent
watching) to suggest personalized content.
Example: Netflix recommends shows you might like based on
genres or actors in your watch history. Similarly, YouTube suggests
videos based on viewing patterns and engagement metrics.
Real-world Example: The Netflix Recommendation Engine
reportedly drives over 80% of the platform's viewership.
KEY CONCEPTS OF ML
Supervised Learning
Learning with labeled data.
Example: Teach a system to differentiate between cats and dogs using a
dataset of labeled images.
Unsupervised Learning
Learning without labels, finding hidden patterns
Example: Grouping students based on their test performance (clustering)
Reinforcement Learning
Learning by trial and error to maximize rewards.
Example: Training an agent to play chess.
HOW SUPERVISED LEARNING WORKS
Training Data:
The dataset contains inputs (features) and outputs (labels).
Example: In a dataset predicting house prices:
Inputs (features): Square footage, number of bedrooms, location.
Output (label): Price of the house.
Model Training
The algorithm analyzes the patterns in the input data and learns the mapping to the
corresponding outputs.
Prediction:
Once trained, the model can predict outputs for new input data
Evaluation:
The model's performance is tested on a separate dataset (test set) by comparing
predicted outputs with actual outputs.
TYPES OF SUPERVISED LEARNING
1. Regression
Used when the output is continuous.
Examples: Predicting house prices based on size, location, and
number of rooms; forecasting stock prices.
COMMON ALGORITHMS:
•Linear Regression
•Polynomial Regression
•Support Vector Regression (SVR)
LINEAR REGRESSION
Linear Regression assumes a linear relationship between the
independent variable(s) X and the dependent variable Y. It
tries to fit a straight line (or a hyperplane in higher dimensions)
that minimizes the difference between the predicted and actual
values.
Y = β0 + β1X + ϵ
Y: Dependent variable (target)
X: Independent variable (feature)
β0: Intercept (value of Y when X = 0)
β1: Slope of the line (rate of change of Y with respect to X)
ϵ: Error term (accounts for variability not captured by the model)
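The fit can be sketched in a few lines. The data below is synthetic, generated to follow Y = 2 + 3X plus noise, so the recovered coefficients should land near those values:

```python
import numpy as np

# Synthetic data following Y = 2 + 3X + noise (assumed for illustration).
rng = np.random.default_rng(0)
X = np.linspace(0, 10, 50)
Y = 2 + 3 * X + rng.normal(0, 1, size=X.size)

# Least-squares fit of a degree-1 polynomial:
# np.polyfit returns [slope, intercept] = [beta1, beta0].
beta1, beta0 = np.polyfit(X, Y, 1)
print(f"intercept beta0 ≈ {beta0:.2f}, slope beta1 ≈ {beta1:.2f}")
```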
TYPES OF LINEAR REGRESSION
Simple Linear Regression:
Linear regression with one independent variable.
Example: Predicting house price (Y) based on square footage (X).
Multiple Linear Regression:
Linear regression with two or more independent variables.
Y = β0 + β1X1 + β2X2 + ⋯ + βnXn + ϵ
Example: Predicting house price (Y) based on square footage (X1),
number of bedrooms (X2), and location (X3).
ASSUMPTIONS OF LINEAR
REGRESSION
To ensure the model performs well, the following assumptions
should hold:
1. Linearity: The relationship between X and Y is linear.
2. Independence: Observations are independent of each other.
3. Homoscedasticity: The variance of residuals (errors) is
constant across all levels of X.
4. Normality of Errors: Residuals should follow a normal
distribution.
5. No Multicollinearity: Independent variables should not be
highly correlated.
COST FUNCTION
Linear Regression uses a cost function to measure the error
between predicted values (Ŷ) and actual values (Y).
Mean Squared Error (MSE):
MSE = (1/n) Σᵢ₌₁ⁿ (Yᵢ − Ŷᵢ)²
n: Number of observations
Yᵢ: Actual value
Ŷᵢ: Predicted value
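A quick sketch of computing MSE by hand, using hypothetical actual and predicted values:

```python
import numpy as np

# Hypothetical actual and predicted values.
Y = np.array([3.0, 5.0, 7.0, 9.0])
Y_hat = np.array([2.5, 5.5, 6.5, 9.5])

# MSE = (1/n) * sum((Y_i - Y_hat_i)^2)
# Every error here is ±0.5, so each squared error is 0.25.
mse = np.mean((Y - Y_hat) ** 2)
print(mse)  # 0.25
```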
ADVANTAGES OF LINEAR
REGRESSION
Simplicity: Easy to implement and interpret.
Efficiency: Works well for smaller datasets.
Transparency: The relationship between variables is clearly
visible.
LIMITATIONS OF LINEAR REGRESSION
Sensitive to Outliers: Outliers can skew the model significantly.
Assumption Dependency: Violating assumptions (e.g., linearity,
normality) can reduce accuracy.
Limited Complexity: Cannot capture non-linear relationships
without transformation.
EXAMPLE
We'll predict house prices based on features like square footage
and the number of bedrooms.
We have a dataset with the following columns:
SquareFootage: The area of the house in square feet.
Bedrooms: The number of bedrooms in the house.
Price: The price of the house (our target variable).
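A sketch of this example with scikit-learn. The dataset below is invented for illustration (prices are set at roughly 200 per square foot), not real market data:

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error

# Hypothetical dataset: [SquareFootage, Bedrooms] -> Price.
X = np.array([[1000, 2], [1500, 3], [1200, 2], [2000, 4],
              [1800, 3], [2500, 4], [900, 1], [1600, 3]])
y = np.array([200_000, 300_000, 240_000, 400_000,
              360_000, 500_000, 180_000, 320_000])

# Hold out part of the data as a test set for evaluation.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=42)

# Train the model: learn the mapping from features to price.
model = LinearRegression()
model.fit(X_train, y_train)

# Predict on unseen data and evaluate against the actual prices.
y_pred = model.predict(X_test)
print("MSE on test set:", mean_squared_error(y_test, y_pred))
print("Predicted price for 1,400 sqft, 3 bedrooms:",
      model.predict([[1400, 3]])[0])
```

Because the toy prices are exactly linear in square footage, the test MSE comes out near zero; real data would leave a larger residual error.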
UNSUPERVISED LEARNING
Unsupervised Learning is a type of Machine Learning where the
algorithm learns patterns and relationships in the data without
labeled outputs or targets. Unlike supervised learning, there is no
direct supervision, and the algorithm tries to find hidden structures
or groupings in the data
CHARACTERISTICS OF
UNSUPERVISED LEARNING
•No Labels: The data used has only input variables (features) and no corresponding target
labels.
•Learning Patterns: The algorithm identifies patterns, structures, or groupings in the data.
•Exploratory Analysis: Often used for understanding the dataset or discovering unknown
patterns.
TYPES OF UNSUPERVISED LEARNING
Unsupervised learning can be broadly divided into two categories
Clustering
Dimensionality Reduction
CLUSTERING
•Definition: The process of grouping data points into clusters based on similarity.
•Goal: Ensure that data points in the same cluster are more similar to each other
than to those in other clusters.
•Examples:
•Customer segmentation in marketing.
•Grouping genes with similar behavior in biology.
Popular Clustering Algorithms:
•K-Means Clustering: Partitions data into k clusters.
•Hierarchical Clustering: Builds a tree-like structure of clusters.
•DBSCAN: Groups based on density and handles noise in the data.
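A minimal K-Means sketch on made-up 2-D points that form two obvious groups:

```python
import numpy as np
from sklearn.cluster import KMeans

# Hypothetical 2-D points: three near (1, 1.5) and three near (8.5, 9).
X = np.array([[1.0, 2.0], [1.5, 1.8], [1.0, 0.6],
              [8.0, 8.0], [9.0, 9.0], [8.5, 9.5]])

# Partition the points into k = 2 clusters.
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
print("labels:", kmeans.labels_)
print("centers:", kmeans.cluster_centers_)
```

With real customer data, the features would be attributes like spending and visit frequency, and k would be chosen by inspecting the data (e.g. the elbow method).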
DIMENSIONALITY REDUCTION
•Definition: The process of reducing the number of features in a
dataset while preserving important information.
•Goal: Simplify the dataset for visualization, faster computation, or
removing redundant features.
•Examples:
•Visualizing high-dimensional data in 2D or 3D.
•Reducing data dimensions in image compression.
Popular Dimensionality Reduction Techniques:
•Principal Component Analysis (PCA): Finds new axes (principal
components) that maximize variance.
•t-SNE (t-Distributed Stochastic Neighbor Embedding): Reduces
dimensions for visualization by preserving local structure.
•Autoencoders: Neural networks used to learn efficient data
representations.
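A short PCA sketch on synthetic 3-feature data where most of the variance lies along one direction, so two principal components retain nearly all the information:

```python
import numpy as np
from sklearn.decomposition import PCA

# Synthetic data (assumed for illustration): feature 2 is almost a
# multiple of feature 1, and feature 3 is mostly noise.
rng = np.random.default_rng(0)
t = rng.normal(size=(100, 1))
X = np.hstack([t,
               2 * t + 0.05 * rng.normal(size=(100, 1)),
               0.05 * rng.normal(size=(100, 1))])

# Project from 3 dimensions down to 2 principal components.
pca = PCA(n_components=2)
X2 = pca.fit_transform(X)
print("reduced shape:", X2.shape)
print("explained variance ratio:", pca.explained_variance_ratio_)
```

The first component should capture well over 90% of the variance here, which is why dropping the third dimension loses almost nothing.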
THANK YOU