Customer Satisfaction Prediction with ML

Uploaded by

Ngoc Anh

SATISFACTION PREDICTION USING MACHINE LEARNING ALGORITHMS
TABLE OF CONTENTS
01 INTRODUCTION
02 DATA OVERVIEW
03 DATA EXPLORATION
04 DATA QUALITY ASSESSMENT
05 DATA PREPROCESSING
06 MODEL PREPARATION
07 MODEL TRAINING
08 MODEL EVALUATION AND COMPARISON
09 CONCLUSION
01
INTRODUCTION
1.1. Purpose of Analysis
The purpose of this analysis is to explore and predict customer satisfaction in the airline industry. Customer satisfaction is a critical determinant of an airline's success and competitiveness.
The analysis looks for:
● The primary drivers of customer satisfaction.
● Patterns or trends in passenger feedback.
● Opportunities for airlines to improve their services and address customer pain points.
1.2. Importance of passenger satisfaction prediction
In a highly competitive market, predicting passenger satisfaction can lead to:
● Increased revenue
● Improved operational efficiency
● Stronger brand loyalty and reputation
02
Data Exploration
2.1. Loading the Data
The dataset is loaded using Python. Its dimensions (97,410 rows and 24 columns) provide a large sample for analysis.

Running Code:
● The pandas library is used to process and analyze tabular data (DataFrame).
● pd is the conventional alias for the pandas library.
● pd.read_csv(): reads a CSV file and converts its content into a DataFrame (table format).
● head(): displays the first rows of the train_data DataFrame (5 by default; pass a different number to change this).
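The loading step described above can be sketched as follows. This is a minimal illustration, not the code from the slides: the inline CSV text and its values are hypothetical stand-ins, and the file name mentioned in the comment is an assumption.

```python
import io

import pandas as pd

# Hypothetical stand-in for the real file; in practice one would call
# something like pd.read_csv("train.csv") (the file name is an assumption).
csv_text = """id,Gender,Age,Flight Distance,satisfaction
1,Male,22,127,0
2,Female,49,3945,1
3,Female,35,850,1
"""

# Parse the CSV content into a DataFrame
train_data = pd.read_csv(io.StringIO(csv_text))

print(train_data.shape)    # (number of rows, number of columns)
print(train_data.head())   # first 5 rows by default
print(train_data.head(2))  # pass n to change how many rows are shown
```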
Output:
● 97,410 rows: This is the number of records or observations in the dataset (e.g., number of customers or transactions).
● 24 columns: This is the number of features or variables in the dataset (e.g., customer information such as gender, age, ticket type, satisfaction level).
2.2. Initial Inspection
Running Code:
Display the first 5 rows of the DataFrame to get an overview of the data structure.
Output:
● Gender: The dataset includes 2 genders: male and female.
● Customer Type: The dataset includes 2 types of customer: disloyal and loyal.
● Age: The dataset includes passenger ages from 22 (youngest) to 49 (oldest).
● Flight distance: Ranges from 127 (smallest) to 3945 (largest).
● Service ratings such as Inflight wifi service, On-board service, and Leg room service are given on a scale of 1 to 5.
Running Code:
The purpose of this code is to display detailed information about the data. train_data.info() and test_data.info() provide an overview of the two DataFrames (train_data and test_data), including:
● Number of rows and columns.
● Names of columns.
● Non-null count of each column (from which missing values can be identified).
● Data type of each column (e.g., int64, float64, object).
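As a small sketch of this inspection step, the snippet below calls info() on a toy DataFrame and counts missing values directly. The three columns and their values are invented for illustration; only the method calls mirror what the slide describes.

```python
import numpy as np
import pandas as pd

# Toy data (values are made up for illustration only)
train_data = pd.DataFrame({
    "Age": [22, 35, 49],
    "Class": ["Eco", "Business", "Eco"],
    "Arrival Delay in Minutes": [0.0, np.nan, 12.0],
})

# Prints row count, column names, non-null counts, and dtypes
train_data.info()

# info() reports non-null counts; isnull().sum() gives the missing counts directly
missing_per_column = train_data.isnull().sum()
print(missing_per_column)
```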
Output:
● Numerical form: ID, Age, Satisfaction.
● Categorical form: Gender, Customer type, Type of travel, and Class.
● Numerical form (service ratings and delays): Cleanliness, Baggage handling, Departure delay in minutes, and Arrival delay in minutes.
Output:
• The final dataset now comprises 32,470 entries and 22 columns.
• There are three data types: int64, a 64-bit integer type used to store whole-number values; float64, a 64-bit floating-point type used to store values with decimal parts; and object, which holds strings or other non-numeric data.
03
Data Quality
Assessment
3.1. Outliers
The code detects outliers in the numeric columns of train_data and reports the number found in each column:

• Age: 0 outliers
• Flight distance: 2,150 outliers
• Departure delay in minutes: 13,711 outliers
• Arrival delay in minutes: 13,215 outliers
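The slides do not state which rule was used to flag outliers. A common choice, and the one box plots use by default, is the 1.5 × IQR rule, so the sketch below assumes it. The helper name count_iqr_outliers and the toy values are hypothetical.

```python
import pandas as pd


def count_iqr_outliers(series: pd.Series, k: float = 1.5) -> int:
    """Count values outside [Q1 - k*IQR, Q3 + k*IQR] (assumed rule)."""
    q1 = series.quantile(0.25)
    q3 = series.quantile(0.75)
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return int(((series < lower) | (series > upper)).sum())


# Toy data: one extreme delay among otherwise small values
toy = pd.DataFrame({
    "Age": [22, 30, 35, 41, 49, 38, 27, 33],
    "Departure Delay in Minutes": [0, 0, 5, 0, 10, 0, 3, 240],
})

outlier_counts = {col: count_iqr_outliers(toy[col]) for col in toy.columns}
print(outlier_counts)
```

Delay columns are typically concentrated near zero with a long right tail, which is why a rule like this flags so many of their values.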
The graph generated by this code is a box plot, which summarizes the distribution of the numerical data in each column of train_data.

Outliers can harm a model, especially a linear one: the fitted line is pulled toward extreme values, which leads to incorrect predictions for typical observations.

Therefore, dealing with outliers is an important step to improve the efficiency and accuracy of the model.
The `Flight Distance` column has a wide distribution, with most values in the range 500 to 1500, but there are many outliers above the upper limit. The `Departure Delay in Minutes` and `Arrival Delay in Minutes` columns have distributions concentrated close to 0.
3.2. Correlation Analysis
• The output of this code is a correlation matrix drawn as a heatmap: each cell represents the relationship between two data columns.
• Values range from -1 (completely negative) to 1 (completely positive); 0 indicates no linear relationship.
• Colors make it easy to identify strong (red) or weak (blue) relationships.
• Inflight wifi service and Inflight entertainment are highly correlated (0.71).
• Ease of Online booking and Online boarding are moderately correlated (0.44).
• Arrival Delay in Minutes and Departure Delay in Minutes are highly correlated (0.94).
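A correlation matrix like the one described can be computed with pandas before any plotting. The sketch below uses invented values and only computes the matrix. The plotting call in the final comment is one possible way to draw the heatmap, not necessarily the one used in the slides.

```python
import pandas as pd

# Toy data (values are made up): the two delay columns move together
toy = pd.DataFrame({
    "Departure Delay in Minutes": [0, 5, 10, 30, 60],
    "Arrival Delay in Minutes": [0, 4, 12, 28, 65],
    "Age": [40, 22, 35, 49, 30],
})

# Pearson correlation between every pair of numeric columns
corr = toy.corr(numeric_only=True)
print(corr.round(2))

# A heatmap could then be drawn with, e.g., seaborn.heatmap(corr, annot=True)
```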
04
Data
Preprocessing
4.1. Handling Missing Values
• The code counts and visualizes the number of missing values in each column of the training dataset.
• The result is a Series whose index holds the column names and whose values hold the number of missing values.
• The bar chart displays the number of missing values in each column of the training dataset.
• Only the `Arrival Delay in Minutes` column contains missing values (about 300).
The code handles missing data and resets the index. Specifically, missing values (`NaN`) in the `Arrival Delay in Minutes` column of both `train_data` and `test_data` are replaced with the mean value of the corresponding column, so that no rows need to be dropped.
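The imputation and index reset described above might look like the following sketch. The toy values and the non-contiguous starting index are assumptions chosen to make the effect visible.

```python
import numpy as np
import pandas as pd

col = "Arrival Delay in Minutes"

# Toy frames (values are made up); the train index is deliberately non-contiguous
train_data = pd.DataFrame({col: [0.0, np.nan, 10.0, 20.0]}, index=[3, 7, 8, 9])
test_data = pd.DataFrame({col: [np.nan, 4.0, 6.0]})

# Replace NaN with the mean of the same column in the same DataFrame
train_data[col] = train_data[col].fillna(train_data[col].mean())
test_data[col] = test_data[col].fillna(test_data[col].mean())

# Rebuild a clean 0..n-1 index
train_data = train_data.reset_index(drop=True)
test_data = test_data.reset_index(drop=True)

print(train_data)
print(test_data)
```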
4.2. Feature Transformation
• This code performs one-hot encoding for the columns 'Customer Type', 'Type of Travel', 'Class', and 'Gender' in both the training and test sets.
• Create a target variable y from the column 'satisfaction' in train_data.
• Create a feature set X by copying train_data and removing the columns 'satisfaction' and 'id'.
• Check the size of the feature sets (X and X_test) after processing.
• After running the code, the categorical columns in both train_data and test_data are converted to binary columns (0 or 1).
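One way to carry out these steps is with pandas.get_dummies, sketched below on a toy frame. The slides do not show which encoding function was used, so get_dummies is an assumption, and only two of the four categorical columns are included to keep the example short.

```python
import pandas as pd

# Toy training data (values are made up)
train_data = pd.DataFrame({
    "id": [1, 2, 3],
    "Gender": ["Male", "Female", "Female"],
    "Class": ["Eco", "Business", "Eco"],
    "Age": [22, 49, 35],
    "satisfaction": [0, 1, 1],
})

categorical_cols = ["Gender", "Class"]

# One-hot encode: each category becomes its own 0/1 column
encoded = pd.get_dummies(train_data, columns=categorical_cols, dtype=int)

y = encoded["satisfaction"]                       # target variable
X = encoded.drop(columns=["satisfaction", "id"])  # feature set

print(X.shape)
print(list(X.columns))
```

When train and test data are encoded separately like this, their columns can differ if a category appears in only one of them, so checking the shapes of X and X_test afterwards is a useful safeguard.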
4.3. Log Transformation
• The code performs a logarithmic transformation on `Flight Distance`, `Departure Delay in Minutes`, and `Arrival Delay in Minutes`.
• The log transform helps by:
  - Reducing the skewness of the data distribution
  - Reducing the influence of outliers
  - Making the distributions easier to visualize
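The slides do not say whether log(x) or log(1 + x) was applied. Because delay columns contain zeros, the sketch below assumes numpy.log1p, which is defined at zero. The values are invented.

```python
import numpy as np
import pandas as pd

# Toy data (values are made up); note the zero delay
df = pd.DataFrame({
    "Flight Distance": [127, 850, 3945],
    "Departure Delay in Minutes": [0, 15, 240],
})

# log1p(x) = log(1 + x): compresses large values and maps 0 to 0
log_df = np.log1p(df)
print(log_df.round(2))

# expm1 is the inverse transform, recovering the original scale
restored = np.expm1(log_df)
```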

• After the transformation, the "Flight Distance" values range from about 4 to 8 on the logarithmic scale.
• On the original scale this corresponds to approximately 54 km to 2,980 km (e^4 ≈ 54.6, e^8 ≈ 2,981).
• The KDE curve (solid blue line) superimposed on the histogram shows that the transformed data is approximately normally distributed.
05
Model
Preparation
5.1. Defining Features and Target Variable
• Features (X_train): Features for training.
• Target Variable (y_train): Labels for training.
• Test Data: Features for testing.

5.2. Train-Test Split
• X_train and y_train are taken directly from train_data.
• Evaluating on the same data used for training gives an inaccurate estimate of the model's performance and hides overfitting, so part of the data is held out.
• Training set (X_train, y_train): Used to train the model.
• Validation set (X_val, y_val): Used to evaluate the model performance on unseen data.
• X_train: Contains 80% of the data (features only) used for training.
• X_val: Contains 20% of the data (features only) reserved for evaluation.
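An 80/20 split of this kind is usually done with scikit-learn's train_test_split. In the sketch below the data is synthetic, and the random_state and stratify arguments are assumptions that the slides do not confirm.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Synthetic data: 10 samples, 2 features, balanced binary labels
X = np.arange(20).reshape(10, 2)
y = np.array([0, 1] * 5)

# Hold out 20% for evaluation; stratify keeps the class balance in both parts
X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

print(X_train.shape, X_val.shape)
```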
5.3. Preprocessing Pipelines for Numeric and Categorical Features

Numeric Features:
• Numerical data columns (int64, float64).
• Easy to process with mathematical operations.

Categorical Features:
• Categorical data columns (object type).
• Need to be encoded before being fed into the machine learning model.
5.3.1. Numeric Transformer
SimpleImputer(strategy='mean'):
• Fill in missing values with the mean of each column.
StandardScaler():
• Standardize each column to mean 0 and standard deviation 1. This rescales the data but does not change the shape of its distribution.

5.3.2. Categorical Transformer
SimpleImputer(strategy='constant', fill_value='missing'):
• Fill missing values with a fixed value ('missing').
OneHotEncoder(handle_unknown='ignore'):
• Encode categorical values into one-hot form (binary variables); categories not seen during fitting are ignored at prediction time.
5.3.3. Aggregate in ColumnTransformer
• numeric_transformer applies to numeric_features.
• categorical_transformer applies to categorical_features.
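The two transformers and their combination can be sketched as below. The parameter values follow the slides. The toy DataFrame and its column names are assumptions, and the step names ("imputer", "scaler", "onehot") are illustrative labels.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Toy data with one missing value in each column (values are made up)
X = pd.DataFrame({
    "Age": [22.0, 35.0, np.nan, 49.0],
    "Class": pd.Series(["Eco", "Business", np.nan, "Eco"], dtype=object),
})
numeric_features = ["Age"]
categorical_features = ["Class"]

# Numeric columns: mean imputation, then standardization
numeric_transformer = Pipeline([
    ("imputer", SimpleImputer(strategy="mean")),
    ("scaler", StandardScaler()),
])

# Categorical columns: constant imputation, then one-hot encoding
categorical_transformer = Pipeline([
    ("imputer", SimpleImputer(strategy="constant", fill_value="missing")),
    ("onehot", OneHotEncoder(handle_unknown="ignore")),
])

# Route each group of columns to its own transformer
preprocessor = ColumnTransformer([
    ("num", numeric_transformer, numeric_features),
    ("cat", categorical_transformer, categorical_features),
])

X_processed = preprocessor.fit_transform(X)
print(X_processed.shape)  # 1 scaled column + one column per category
```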

5.3.4. Combining Preprocessing and Modeling
Preprocessor:
• Performs all preprocessing before the data is fed into the model.
RandomForestClassifier:
• Classification model using the random forest algorithm.
• n_estimators=100: number of trees in the forest.
• random_state=42: ensures reproducible results.
5.3.5. Model Training
Procedure:
• Data preprocessing (imputation, scaling, encoding).
• Training the RandomForestClassifier model.
Purpose:
• Ensure consistent transformations, reducing the risk of data leakage and simplifying predictions for X_test.
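Chaining the preprocessor and the classifier in a single Pipeline means that fit learns the transformations from the training data only, and predict reuses them on new data. The sketch below is a simplified version under stated assumptions: the toy data is invented and the imputation steps are omitted for brevity.

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import RandomForestClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Toy data (values are made up)
X_train = pd.DataFrame({
    "Age": [22, 35, 41, 49, 30, 27],
    "Class": ["Eco", "Business", "Business", "Eco", "Eco", "Business"],
})
y_train = [0, 1, 1, 0, 0, 1]
# "Eco Plus" never appears in training; handle_unknown="ignore" copes with it
X_test = pd.DataFrame({"Age": [33, 45], "Class": ["Business", "Eco Plus"]})

preprocessor = ColumnTransformer([
    ("num", StandardScaler(), ["Age"]),
    ("cat", OneHotEncoder(handle_unknown="ignore"), ["Class"]),
])

model = Pipeline([
    ("preprocessor", preprocessor),
    ("classifier", RandomForestClassifier(n_estimators=100, random_state=42)),
])

model.fit(X_train, y_train)           # fits scaler/encoder, then the forest
predictions = model.predict(X_test)   # applies the same fitted transformations
print(predictions)
```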
The Output
Numerical data processing (num):
• SimpleImputer replaces missing values with the mean.
• StandardScaler standardizes the data, bringing the values to the same scale to improve the efficiency of the machine learning algorithm.
Categorical data processing (cat):
• SimpleImputer replaces missing values with a default value ('missing').
• OneHotEncoder converts categorical values into one-hot encoding, creating representative binary features.
Combining and training:
• ColumnTransformer combines both types of data (numeric and categorical) after processing and feeds them into the RandomForestClassifier model.
06
MODEL
TRAINING
6.1. Random Forest Classifier
Import:
• Random Forest Classifier: Uses multiple decision trees to improve predictive performance and control overfitting.
Creating the Random Forest Model:
• The model is a Pipeline with two steps: the preprocessor and the classifier.
• n_estimators=100: the number of decision trees in the forest.
• random_state=42: Ensures the reproducibility of the random selection of features and samples for each tree.
Evaluating Model Performance:
• The output "Random Forest Accuracy: 1.00" indicates that the Random Forest model achieved 100% accuracy on the data it was evaluated on. A perfect score usually means the model was scored on its own training data or that a feature leaks the target, so it should be confirmed on held-out data.
• The model appears to prioritize aspects of the travel experience (flight distance, age of the customer, and service quality) over categorical variables such as gender, customer loyalty, or travel class.
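To show why a 1.00 score deserves a second look, the sketch below compares training and validation accuracy for a random forest and reads its feature importances. The data is synthetic (make_classification with label noise), so the numbers say nothing about the airline dataset.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Synthetic data with 20% label noise (an assumption for illustration)
X, y = make_classification(
    n_samples=500, n_features=10, flip_y=0.2, random_state=42
)
X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.2, random_state=42
)

rf = RandomForestClassifier(n_estimators=100, random_state=42)
rf.fit(X_train, y_train)

# Accuracy on data the model has seen vs. data it has not
train_acc = accuracy_score(y_train, rf.predict(X_train))
val_acc = accuracy_score(y_val, rf.predict(X_val))
print(f"train accuracy: {train_acc:.2f}, validation accuracy: {val_acc:.2f}")

# Relative contribution of each feature to the forest's splits (sums to 1)
importances = rf.feature_importances_
print(importances.round(3))
```

A large gap between the two scores is the usual sign that a near-perfect training accuracy reflects memorization, not predictive skill.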
6.2. Logistic Regression

Import:
• Logistic Regression: This is a popular algorithm used for binary classification.

Model Evaluation:
• The accuracy is 0.73: 73% of the model's predictions on the data are correct.
6.3. Neural Network (MLP) Classifier
Importing the Multi-layer Perceptron classifier
• This line imports the MLPClassifier, which is a type of neural network model.
MLP Classifier
• hidden_layer_sizes=(100,): one hidden layer with 100 neurons.
• max_iter=300: The maximum number of iterations for training. The model will stop if it converges before reaching this number.
Making Prediction
• The output "Neural Network Accuracy: 0.78" indicates that the Multi-layer Perceptron (MLP) classifier achieved an accuracy of 78% on the dataset.
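A minimal sketch of an MLP with the settings listed above follows. The data is synthetic, and the StandardScaler step and random_state are assumptions added because neural networks are sensitive to feature scale and initialization.

```python
from sklearn.datasets import make_classification
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic binary classification data (not the airline dataset)
X, y = make_classification(n_samples=300, n_features=8, random_state=42)
X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# One hidden layer of 100 neurons, at most 300 training iterations
mlp = make_pipeline(
    StandardScaler(),
    MLPClassifier(hidden_layer_sizes=(100,), max_iter=300, random_state=42),
)
mlp.fit(X_train, y_train)

nn_accuracy = accuracy_score(y_val, mlp.predict(X_val))
print(f"Neural Network Accuracy: {nn_accuracy:.2f}")
```

If training stops at max_iter without converging, scikit-learn emits a ConvergenceWarning; raising max_iter is the usual remedy.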
07
MODEL
EVALUATION &
COMPARISON
7. Model Evaluation & Comparison
• Logistic Regression offers simplicity, with a 73% training accuracy; how well it generalizes should be confirmed on held-out data.
• Its interpretability allows airlines to see how features like flight duration or service quality relate to satisfaction.
• For example, a negative coefficient for flight duration would indicate that longer flights are associated with lower satisfaction.
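Reading coefficient signs from a fitted logistic regression can be sketched as follows. The two feature names and the rule that generates the labels are hypothetical, chosen so that the expected signs are known in advance.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 400

# Hypothetical features (names and values are assumptions)
X = pd.DataFrame({
    "service_rating": rng.integers(1, 6, size=n),   # 1 to 5
    "delay_minutes": rng.integers(0, 120, size=n),
})

# Synthetic rule: satisfaction rises with rating and falls with delay
score = 1.0 * X["service_rating"] - 0.05 * X["delay_minutes"]
y = (score + rng.normal(0, 0.5, size=n) > 0).astype(int)

clf = LogisticRegression(max_iter=1000)
clf.fit(X, y)

# One coefficient per feature; the sign gives the direction of the effect
coefficients = pd.Series(clf.coef_[0], index=X.columns)
print(coefficients)
```

A positive coefficient raises the predicted odds of satisfaction as the feature increases, and a negative one lowers them. Magnitudes are only comparable across features after the features have been put on a common scale.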
Random Forest achieved a perfect 100% accuracy. It can handle complex, high-dimensional data and provides interpretability through feature importance scores. The model captures interactions between features through the structure of its many trees.
For instance, In-Flight Entertainment can be a crucial factor for long flights (> 4 hours) but less so for short flights.

The Neural Network (MLP) reached 78% accuracy. It is particularly effective when satisfaction is influenced by a mix of factors in complex ways, like the interaction between flight delays and customer service quality.
08
CONCLUSION
This study aimed to uncover the key factors influencing airline
passenger satisfaction and identify areas where airlines can enhance
their services to improve customer loyalty. We analyzed a dataset that
included various factors such as inflight entertainment, Wi-Fi service,
seat comfort, departure and arrival delays, and customer
segmentation. Airlines should focus on improving inflight services,
minimizing delays, and customizing their offerings to meet diverse
customer needs.
THANKS for listening!
