3_AML _Lecture 3_Feature Engg

Feature scaling is a crucial preprocessing step in machine learning that standardizes independent features to improve model performance and accuracy. Techniques like normalization and standardization help ensure that features contribute equally to the learning process, preventing larger-magnitude features from dominating. Additionally, feature scaling aids in avoiding numerical instability and enhances the convergence of algorithms such as gradient descent and K-Nearest Neighbors.


Feature Scaling

Scale data for better performance of a Machine Learning Model
https://towardsdatascience.com/all-about-feature-scaling-bcc0ad75cb35
Intro

• Feature Scaling is a technique to standardize the independent features present in the data within a fixed range.

• It is a vital preprocessing step in machine learning that involves transforming numerical features to a common scale.

• It plays a major role in ensuring accurate and efficient model training and performance.

• Scaling techniques aim to normalize the range, distribution, and magnitude of features, reducing potential biases and inconsistencies that may arise from variations in their values.

• If feature scaling is not done, a machine learning algorithm tends to give larger values more weight and treat smaller values as less important, regardless of the units of those values.

• Crucial for ensuring features are comparable in terms of magnitude, values, and units.
Why Use Feature Scaling?

 Ensures Comparable Scales Across Features
• Feature normalization helps prevent larger-magnitude features from dominating the learning process.
• Guarantees equal contribution of each feature to the model.
 Enhances Algorithm Performance
• Improves Convergence and Accuracy
• Gradient descent-based algorithms converge faster with scaled features.
• Distance-based algorithms (e.g., K-Nearest Neighbors) rely on scaled data for accurate distance
measurement.
• Support Vector Machines perform better with standardized data.
 Prevents Numerical Instability
• Avoids Computational Issues
• Prevents overflow/underflow problems in distance calculations and matrix operations.
• Ensures stable and reliable computations, reducing the risk of errors in model predictions.

What is Feature Scaling and Why Does Machine Learning Need It? | Medium
• Suppose we have a dataset with age, salary, and a 0/1 target indicating whether a product was purchased.
• This is a classification problem, and we apply a KNN model to it.
• We can clearly see that when we calculate distances, the salary feature will dominate, and KNN will not be able to perform well.
In simple words: no need to discuss further; this is the same issue as on the previous slide.

• Importance of Feature Scaling

• Equal Contribution: Scaling ensures that each feature has an equal impact on the learning process.

• Improved Model Performance: Helps models like SVMs, k-nearest neighbors, and neural networks perform better by improving convergence and accuracy.

• Prevent Numerical Instability: Avoids issues in calculations, especially with algorithms sensitive to distance metrics.

• Image: Visualization showing improved convergence in gradient descent with scaling.


• Impact of Feature Scaling on Algorithms

• Gradient Descent Algorithms: Scaling allows gradient descent to converge faster and more consistently by taking uniform steps towards the optimum.

• Distance-Based Algorithms: Algorithms like KNN rely on distance measurements, which can be skewed by unscaled features.

• Impact on Performance: Without scaling, larger features can dominate distance calculations, leading to biased results (see the sketch below).

• Image: Diagram illustrating KNN with scaled vs. unscaled data.

Feature Scaling Type

Normalization (Min-Max Scaling):

•Description: Normalization transforms the features to a fixed range, typically between 0 and 1.
•Formula: X' = (X - X_min) / (X_max - X_min)
•Use Case: Best suited for scenarios where the data does not follow a Gaussian (normal) distribution.

Standardization (Z-Score Scaling):

•Description: Standardization transforms the features so they have a mean of 0 and a standard deviation of 1.
•Formula: X' = (X - μ) / σ, where μ is the feature mean and σ is its standard deviation.
•Use Case: Preferable when the data follows a normal distribution or when the distribution is unknown.
•Key Point: Maintains the shape of the original distribution; does not restrict the feature to a specific range.
Standardization

• Suppose we have a dataset with columns Age and Salary (e.g., 500 rows, with ages such as 25, 26, 15, 18, 19, ..., 20).
• To standardize it, we compute (xi - mean) / SD for each value.
• We get 500 new numbers after transforming; the mean of these new numbers will be 0 and their SD will be 1 (see the sketch below).
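A minimal sketch of this standardization using scikit-learn's StandardScaler; the age/salary values are illustrative:

import pandas as pd
from sklearn.preprocessing import StandardScaler

df = pd.DataFrame({"age": [25, 26, 15, 18, 19, 20],
                   "salary": [50000, 52000, 20000, 24000, 25000, 27000]})

scaler = StandardScaler()
scaled = scaler.fit_transform(df)      # each column becomes (x - mean) / SD

print(scaled.mean(axis=0))             # approximately 0 for every column
print(scaled.std(axis=0))              # approximately 1 for every column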
Understanding standardization
Geometric intuition

• We are doing mean centering (shifting the data so its mean sits at the origin).
• We are then rescaling by the SD, spreading or compressing the data to unit standard deviation.
When to use Standardization
Normalization (Min –max)
• Normalization, a vital aspect of Feature Scaling, is a data preprocessing technique employed
to standardize the values of features in a dataset, bringing them to a common scale.

• In machine learning it is often said that we should not work directly with raw units like weight, height, etc. We should bring features to a common scale to eliminate the effect of units.

• This process enhances data analysis and modeling accuracy by mitigating the influence of
varying scales on machine learning models.

• Normalization is a scaling technique in which values are shifted and rescaled so that they
end up ranging between 0 and 1. It is also known as Min-Max scaling.
Geometric intuition

• When the value of X is the minimum value in the column, the numerator will be 0, and hence X’ is 0
• On the other hand, when the value of X is the maximum value in the column, the numerator is
equal to the denominator, and thus the value of X’ is 1
• If the value of X is between the minimum and the maximum value, then the value of X’ is between 0
and 1
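A minimal sketch of Min-Max normalization with scikit-learn's MinMaxScaler, matching the boundary cases above (the minimum maps to 0, the maximum to 1); the values are illustrative:

import numpy as np
from sklearn.preprocessing import MinMaxScaler

X = np.array([[15.0], [18.0], [25.0], [40.0], [60.0]])

scaler = MinMaxScaler()                # default feature range is (0, 1)
X_scaled = scaler.fit_transform(X)     # (x - min) / (max - min)

print(X_scaled.ravel())                # 15 -> 0.0 ... 60 -> 1.0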
Types of Normalization
• Min-max scaling
• Mean normalization: produces centered data, much like standardization, and is hence rarely used
• Max-abs scaling: used when the data is sparse, i.e., contains too many zeros
Robust scaling

• Works very well when the data contains outliers.
• The scaled values will have their median and IQR set to 0 and 1, respectively.
• It is robust to outliers (see the sketch below).
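A minimal sketch of robust scaling with scikit-learn's RobustScaler on illustrative data containing one outlier:

import numpy as np
from sklearn.preprocessing import RobustScaler

X = np.array([[10.0], [12.0], [13.0], [15.0], [500.0]])   # 500 is an outlier

scaler = RobustScaler()                # centers on the median, scales by the IQR
X_scaled = scaler.fit_transform(X)     # (x - median) / IQR

print(X_scaled.ravel())                # median maps to 0; the outlier stays extreme
                                       # but no longer squashes the other values together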
Is feature scaling required? It depends on the algorithm: if we are working with decision trees, XGBoost, etc., there is no need.

Standardization is used most of the time.

Normalization is used when you know the min and max, as in image processing with CNNs on colored images, where each pixel has a min of 0 and a max of 255.

If there are outliers, use robust scaling.

https://proclusacademy.com/blog/robust-scaler-outliers/
Categorical Data
•Categorical data refers to variables that represent characteristics and can be divided into distinct groups or categories.
•Unlike numerical data, categorical data doesn’t involve numbers or measurements but rather labels or names.
•Common Use Cases:
•Grouping data by attributes like gender, nationality, brand, or type.
•Often used in statistical analysis to count occurrences, create frequency tables, or generate bar charts.

•Types of Categorical Data


•Nominal Data
• Definition:
• Nominal data consists of categories that are purely labels, with no specific order or ranking.
• Each category is unique, and there's no inherent logical sequence among them.
• Examples:
• Colors: Red, Blue, Green, Yellow (No color is “higher” than another).
• Types of Animals: Dog, Cat, Bird, Fish (Each type is distinct but not ordered).
• Brands: Nike, Adidas, Puma (Brand names without any ranking).
• Key Characteristics:
• No Order: Categories cannot be ordered or ranked.
• Mutually Exclusive: Each observation fits into one and only one category.
• Analysis Methods: Mode, frequency distribution, chi-square test for independence.
2. Ordinal Data
• Ordinal data involves categories that have a clear, meaningful order or ranking among them.
• However, the intervals between these categories are not necessarily equal or measurable.
•Examples:
• Survey Ratings: Poor, Fair, Good, Excellent (Ordered by quality).
• Education Levels: High School, Bachelor’s, Master’s, PhD (Ordered by level of education).
• Socioeconomic Status: Low, Middle, High (Ordered by status).
•Key Characteristics:
• Ordered: Categories can be ranked in a meaningful way.
• Unknown Intervals: Differences between ranks are not quantified.
• Analysis Methods: Median, percentile, non-parametric tests like the Mann-Whitney U test.

• Visualization & Analysis


• Visualization:
• Bar Charts: Used to represent the frequency of categories.
• Pie Charts: Used to show the proportion of categories within the whole.
• Common Analytical Techniques:
• Chi-Square Test: To assess relationships between categorical variables.
• Mode Analysis: To find the most common category.
• Logistic Regression: Often used when categorical data is the dependent variable.
Handling

• Some algorithms can work with categorical data directly (e.g., decision trees).

• Challenge: Many machine learning algorithms cannot operate on label data directly and require all input and output variables to be numeric.

• Reason: This is generally a constraint of efficient algorithm implementation rather than a fundamental limitation.

• Solution: Categorical data must be converted to numerical form, and predictions may need to be converted back to categorical form for presentation or application.
https://machinelearningmastery.com/why-one-hot-encode-data-in-machine-learning/
Encoding Categorical data
• This involves two steps:
• Integer Encoding or label encoding or ordinal coding
• One-Hot Encoding

• 1. Integer Encoding
• As a first step, each unique category value is assigned an integer value.
• For example, “red” is 1, “green” is 2, and “blue” is 3.
• This is called a label encoding or an integer encoding and is easily reversible.
• For some variables, this may be enough.
• The integer values have a natural ordered relationship between each other and machine learning algorithms
may be able to understand and harness this relationship.
• For example, ordinal variables like the “place” example above would be a good example where a label
encoding would be sufficient.
• In scikit-learn, if the output (target) variable is categorical, we go for label encoding.
• There is no major difference; the method is the same (see the sketch below).
Use the ordinal encoding file.
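A minimal sketch of ordinal and label encoding with scikit-learn; the column names and the category order are illustrative assumptions:

import pandas as pd
from sklearn.preprocessing import OrdinalEncoder, LabelEncoder

df = pd.DataFrame({"education": ["High School", "Master's", "Bachelor's", "PhD"],
                   "purchased": ["no", "yes", "yes", "no"]})

# Pass an explicit category order so the integers reflect the real ranking.
oe = OrdinalEncoder(categories=[["High School", "Bachelor's", "Master's", "PhD"]])
df["education_enc"] = oe.fit_transform(df[["education"]])

# LabelEncoder is meant for the (categorical) output/target column.
le = LabelEncoder()
df["purchased_enc"] = le.fit_transform(df["purchased"])

print(df)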
One hot
• For categorical variables where no such ordinal relationship exists, the integer encoding is not enough.

• In fact, using this encoding and allowing the model to assume a natural ordering between categories may
result in poor performance or unexpected results (predictions halfway between categories).

• In this case, a one-hot encoding can be applied to the integer representation. This is where the integer
encoded variable is removed and a new binary variable is added for each unique integer value.

• Binary Vector Representation: Each category is represented as a binary vector, where only one element is
"1" (indicating the presence of that category) and the rest are "0".
• Example:
• Categories: [Red, Green, Blue]

• One-Hot Encoded Representation:

• Red: [1, 0, 0]

• Green: [0, 1, 0]

• Blue: [0, 0, 1]
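A minimal sketch of the colour example using scikit-learn's OneHotEncoder (sparse_output=False assumes scikit-learn 1.2 or newer; older versions use sparse=False):

import pandas as pd
from sklearn.preprocessing import OneHotEncoder

colors = pd.DataFrame({"color": ["Red", "Green", "Blue", "Green"]})

ohe = OneHotEncoder(sparse_output=False)       # dense output for readability
encoded = ohe.fit_transform(colors[["color"]])

print(ohe.get_feature_names_out())             # ['color_Blue' 'color_Green' 'color_Red']
print(encoded)                                 # one binary column per category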
Importance of One-Hot Encoding in Machine Learning

• Avoiding Ordinal Relationships:


• Prevents the model from assuming an ordinal relationship between categories (e.g., "Red" > "Green" >
"Blue"), which may not exist.

• Compatibility with Algorithms:


• Many ML algorithms require numerical input, and one-hot encoding allows categorical data to be used
effectively.

• Capturing Information:
• Each category is treated independently, ensuring that the model doesn't make incorrect assumptions about
the relationship between categories.

• Example :
• In logistic regression, treating categories as continuous variables without one-hot encoding can lead to
incorrect predictions.
Pros and Cons of One-Hot Encoding

•Advantages:
• Simplicity: Easy to implement and understand.

• No Assumptions: Does not impose an ordinal relationship between categories.

• Algorithm Compatibility: Works well with distance-based algorithms like KNN and linear models.

•Disadvantages:

• Curse of Dimensionality: For categorical features with a large number of unique values, the number
of dimensions increases significantly.

• Sparsity: The resulting vectors are sparse (many zeros), leading to inefficiencies in storage and
computation.

• Inapplicability for High-Cardinality Features: When categories are numerous (e.g., ZIP codes), one-
hot encoding becomes impractical.
Scenarios for Applying One-Hot Encoding
•Applicable Scenarios:
• Small Categorical Features: Works well for features with a limited number of unique categories (e.g., days
of the week).
• Nominal Data: Ideal for features where the categories do not have an intrinsic order (e.g., color, type).

•Not Ideal For:


• High Cardinality Features: Consider alternatives like target encoding or embedding for features with many
unique categories.
• Tree-Based Algorithms: Algorithms like decision trees and random forests can handle categorical data
without one-hot encoding.
What is Column Transformer?
•A utility in Scikit-Learn that allows for different preprocessing steps to be applied to different
subsets of features in a dataset.
•Handling Mixed Data Types: Essential when dealing with datasets containing both numerical
and categorical features.

•Example :
•Dataset Features: Age (Numerical), Gender (Categorical), Income (Numerical)

•Different Preprocessing: Age and Income might need scaling, while Gender
might require one-hot encoding.
Column transformer

Working:
Defining Transformers:
•Specify the preprocessing steps for different columns.
•Example:
•Numerical Columns: Apply StandardScaler to
standardize features.
•Categorical Columns: Apply OneHotEncoder to
transform categories into binary vectors.
•Combining Transformers:
•The ColumnTransformer combines these
preprocessing steps and applies them to the
corresponding columns simultaneously.
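A minimal sketch of a ColumnTransformer for the Age/Gender/Income example above; the data values are illustrative:

import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import StandardScaler, OneHotEncoder

df = pd.DataFrame({"age": [25, 32, 47, 51],
                   "gender": ["F", "M", "F", "M"],
                   "income": [40000, 52000, 80000, 61000]})

preprocess = ColumnTransformer(transformers=[
    ("num", StandardScaler(), ["age", "income"]),   # scale the numerical columns
    ("cat", OneHotEncoder(), ["gender"]),           # one-hot encode the categorical column
])

X = preprocess.fit_transform(df)                    # all steps applied in one call
print(X)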
Mathematical Transformer

• Function transformer
• Power Transformer
• Binning and binarization
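A minimal sketch of these transformers in scikit-learn, applied to an illustrative right-skewed column:

import numpy as np
from sklearn.preprocessing import (FunctionTransformer, PowerTransformer,
                                   KBinsDiscretizer, Binarizer)

X = np.array([[1.0], [2.0], [5.0], [20.0], [100.0]])

log_tf = FunctionTransformer(np.log1p)                        # function transformer: log(1 + x)
power_tf = PowerTransformer(method="yeo-johnson")             # power transform toward a Gaussian shape
bin_tf = KBinsDiscretizer(n_bins=3, encode="ordinal",
                          strategy="quantile")                # binning into 3 quantile bins
binarize_tf = Binarizer(threshold=10.0)                       # binarization: 1 if value > 10 else 0

print(log_tf.fit_transform(X).ravel())
print(power_tf.fit_transform(X).ravel())
print(bin_tf.fit_transform(X).ravel())
print(binarize_tf.fit_transform(X).ravel())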
Feature splitting and construction
• Feature splitting is a technique in machine learning that involves breaking down a
single feature into multiple features.

• This process generates more informative features that provide greater insight into the
relationships between input variables and the target variable.

• Purpose of Feature Splitting

• Convert continuous variables into categorical variables.

• Extract information from date and time features.

• Break down text features into smaller units for better analysis.
https://medium.com/@brijesh_soni/topic-11-feature-construction-splitting-b116c60c4b2f#:~:text=Feature%20splitting%20is%20a%20technique,variables%20and%20the%20target%20variable .
Benefits of Feature Splitting

• Enhances the performance of a machine-learning model by providing more relevant information.

• Helps models better capture relationships between features and the target variable.

• Increases the interpretability and effectiveness of machine-learning models.

Techniques for Feature Splitting


•Binning:
• Divides continuous variables into discrete intervals or bins.
• Useful for non-linear relationships.
•One-Hot Encoding:
• Converts categorical variables into binary features for each category.
• Each row has a binary value (0 or 1) for the corresponding category.

•Text Splitting:
• Splits text features into smaller units like words or phrases (tokenization, stemming).
•Date and Time Splitting:
• Extracts components like day, month, hour from date/time features.
• Useful for time-series data.
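A minimal sketch of date/time splitting with pandas; the column name order_date and the dates are illustrative:

import pandas as pd

df = pd.DataFrame({"order_date": ["2024-01-05 14:30", "2024-03-22 09:10"]})
df["order_date"] = pd.to_datetime(df["order_date"])

# Extract separate components as new features.
df["day"] = df["order_date"].dt.day
df["month"] = df["order_date"].dt.month
df["hour"] = df["order_date"].dt.hour
df["dayofweek"] = df["order_date"].dt.dayofweek   # 0 = Monday

print(df)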
Feature Construction
•Feature construction involves creating new features from existing data to enhance the information available for model
training.
•Importance:
•Improves Model Accuracy: Well-constructed features can lead to better model performance by providing additional
insights.
•Captures Complex Relationships: Helps to model non-linear relationships that simple features may miss.
•Reduces Dimensionality: Effective feature construction can replace multiple simpler features with a single, more
informative feature.

•Key Techniques:

•Mathematical Transformations:
• Example: Creating polynomial features (e.g., x² or x³).
•Aggregation:
• Example: Calculating the mean or sum of sales over different time periods.
•Domain-Specific Features:
• Example: For financial data, deriving ratios like debt-to-equity or creating features based on fiscal quarters.
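A minimal sketch combining a domain-specific ratio and a polynomial feature; the debt/equity columns and values are illustrative:

import pandas as pd
from sklearn.preprocessing import PolynomialFeatures

df = pd.DataFrame({"debt": [100, 250, 80], "equity": [400, 500, 160]})

# Domain-specific feature: debt-to-equity ratio.
df["debt_to_equity"] = df["debt"] / df["equity"]

# Mathematical transformation: add a squared term for the debt column.
poly = PolynomialFeatures(degree=2, include_bias=False)
poly_feats = poly.fit_transform(df[["debt"]])      # columns: [debt, debt^2]
df["debt_squared"] = poly_feats[:, 1]

print(df)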
Handling mixed data and date-time variables
Use L&T
Linear Regression

Simple
Multiple
Polynomial
Regression Metrics
Gradient Descent
Cost Function
Learning rate
Regularization
Logistic
Precision, recall, F1 score
