This project was created by four students (Vu Truong, Quang Huynh, Anh Ngo, and Thanh Nguyen) at SP Jain School of Global Management to learn the fundamentals of Machine Learning. It is our first ML project, and we hope it will serve as a foundation for larger projects in the future.
We implement Machine Learning models and statistical methods to predict customer churn from a sample of 10,000 customer records. We use Python and its libraries (NumPy, Pandas, Matplotlib, scikit-learn) to evaluate the candidate models and select the most appropriate one, with an accuracy above 80%. We also generate a report that communicates and interprets the results for business users.
OUTLINE
I) Data Exploration: (Thanh)
Import libraries and dataset
General description
Correlation matrix
Exploratory Data Analysis (EDA)
II) Data Cleaning and Preprocessing: (Vu)
Data Cleaning
Preprocessing
Dealing with outliers
III) Testing multiple models (Anh)
Splitting the dataset
Building and testing model
ROC curve for comparison
Precision-Recall curve
Choosing models
IV) Fit models (Thanh)
A – Linear Discriminant Analysis
B – Logistic Regression
C – Gaussian Naïve Bayes
D – K-Nearest Neighbors
V) Conclusion (Quang)
VI) Additional Information
LDA Reduced Model (Quang)
Decision Tree Model (Quang)
Defining threshold (Vu)
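
The comparison described in parts III and IV of the outline could be sketched roughly as follows. This is only an illustration under stated assumptions: the data is synthetic (generated with scikit-learn's make_classification as a stand-in for the customer dataset), and the 80/20 stratified split, the class balance, and the default hyperparameters are hypothetical choices, not the settings used in the project.

```python
from sklearn.datasets import make_classification
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier

# Synthetic stand-in for the customer data (assumption: NOT the project's dataset).
X, y = make_classification(
    n_samples=1000, n_features=10, weights=[0.8, 0.2], random_state=0
)

# Hold out 20% of the rows for testing, keeping the class ratio in both parts.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

# The four models listed in part IV, with default (assumed) hyperparameters.
models = {
    "Linear Discriminant Analysis": LinearDiscriminantAnalysis(),
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "Gaussian Naive Bayes": GaussianNB(),
    "K-Nearest Neighbors": KNeighborsClassifier(n_neighbors=5),
}

results = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    y_pred = model.predict(X_test)               # hard class labels
    y_score = model.predict_proba(X_test)[:, 1]  # probability of the positive class
    results[name] = {
        "accuracy": accuracy_score(y_test, y_pred),
        "roc_auc": roc_auc_score(y_test, y_score),
    }

for name, scores in results.items():
    print(f"{name}: accuracy={scores['accuracy']:.3f}, ROC AUC={scores['roc_auc']:.3f}")
```

Reporting ROC AUC alongside accuracy is useful when the classes are imbalanced, since accuracy alone can look high for a model that rarely predicts the minority class.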