Sales Forecasting and Analysis
Improving Sales Forecasting Accuracy at ABC Inc
Agenda
Introduction
Data Preprocessing
Exploratory Data Analysis (EDA)
Model Building
Model Evaluation
Conclusion and Next Steps
Introduction
Company Overview: ABC Inc, a US-based specialty retailer of crafts and fabrics.
Problem Statement: Improving sales forecasting accuracy for better budget
planning.
Importance: Accurate forecasting informs budget decisions and inventory
management.
Link to the python code
Data Preprocessing
Data Sources: Sales Data, Promotion Calendar, Product Master.
Steps:
Outlier Detection and Treatment.
Missing Value Treatment.
Combining and Cleaning Datasets.
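The three steps above can be sketched with pandas as follows. This is only an illustration under stated assumptions: the toy tables, the `Date` and `Units_Sold` column names, the IQR clipping rule, and the per-product median fill are hypothetical choices, not necessarily what the linked code does.

```python
import numpy as np
import pandas as pd

# Hypothetical toy tables standing in for the three data sources.
sales = pd.DataFrame({
    "Date": pd.to_datetime(["2023-01-01", "2023-01-02", "2023-01-03",
                            "2023-01-04", "2023-01-05"]),
    "Product": ["P1", "P1", "P1", "P2", "P2"],
    "Units_Sold": [10.0, 12.0, 500.0, np.nan, 8.0],  # one outlier, one gap
})
promotions = pd.DataFrame({
    "Date": pd.to_datetime(["2023-01-02", "2023-01-05"]),
    "Product": ["P1", "P2"],
    "Offer_Type": ["BOGO", "Percent_Off"],
})
products = pd.DataFrame({
    "Product": ["P1", "P2"],
    "Category": ["Fabric", "Crafts"],
    "Unit_Retail_Price": [4.99, 2.49],
})


def clip_outliers_iqr(series, k=1.5):
    """Clip values outside [Q1 - k*IQR, Q3 + k*IQR]; NaN values pass through."""
    q1, q3 = series.quantile(0.25), series.quantile(0.75)
    iqr = q3 - q1
    return series.clip(lower=q1 - k * iqr, upper=q3 + k * iqr)


# 1. Outlier treatment (assumed rule: IQR clipping).
sales["Units_Sold"] = clip_outliers_iqr(sales["Units_Sold"])

# 2. Missing value treatment (assumed rule: per-product median).
product_median = sales.groupby("Product")["Units_Sold"].transform("median")
sales["Units_Sold"] = sales["Units_Sold"].fillna(product_median)

# 3. Combine the sources: promotions by date and product, attributes by product.
combined = (
    sales.merge(promotions, on=["Date", "Product"], how="left")
         .merge(products, on="Product", how="left")
)
combined["Offer"] = combined["Offer_Type"].notna().astype(int)
combined["Offer_Type"] = combined["Offer_Type"].fillna("None")
print(combined)
```

Left joins keep every sales row even when no promotion was running, which is why `Offer_Type` needs an explicit "None" fill afterwards.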
Link to the python code
Exploratory Data Analysis (EDA)
Visualizations:
Sales Patterns Over Time.
Seasonality and Trends.
Key Insights:
Seasonal Decomposition.
Offer Effects.
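As a rough illustration of seasonal decomposition, the sketch below splits a daily series into trend, seasonal, and residual parts using a classical additive scheme written in plain pandas. The synthetic series, the weekly period of 7, and the helper name `additive_decompose` are assumptions for this example; the linked code may use a library routine instead.

```python
import numpy as np
import pandas as pd

# Synthetic daily sales: linear trend + weekly pattern + noise (illustrative only).
rng = np.random.default_rng(0)
dates = pd.date_range("2023-01-01", periods=140, freq="D")
day_of_week = dates.dayofweek.to_numpy()
values = (
    np.linspace(100, 130, len(dates))
    + 10 * np.sin(2 * np.pi * day_of_week / 7)
    + rng.normal(0, 1, len(dates))
)
sales = pd.Series(values, index=dates)


def additive_decompose(series, period):
    """Classical additive decomposition; assumes an odd period such as 7."""
    # Trend: centered moving average over one full cycle.
    trend = series.rolling(window=period, center=True).mean()
    detrended = series - trend
    # Seasonal: average detrended value at each position in the cycle,
    # shifted so the seasonal component sums to zero over one period.
    position = np.arange(len(series)) % period
    seasonal_means = detrended.groupby(position).mean()
    seasonal_means = seasonal_means - seasonal_means.mean()
    seasonal = pd.Series(seasonal_means.to_numpy()[position], index=series.index)
    residual = series - trend - seasonal
    return trend, seasonal, residual


trend, seasonal, residual = additive_decompose(sales, period=7)
print(seasonal.iloc[:7].round(2))
```

The trend is undefined for the first and last three days because the centered window needs a full week of data around each point.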
Link to the python code
Seasonality and Trend Analysis
Model Building
Model Selection:
RandomForestRegressor, GradientBoostingRegressor.
Feature Engineering:
Features used: Product, Sub_Category, Category, Unit_Retail_Price, Offer, Offer_Type,
Effective_Price, plus the date-derived features below.
Date Split: the date is split into Year, Month, Day, DayOfWeek, Quarter, WeekOfYear,
IsWeekend, DayOfYear, WeekdayName, and SeasonNameCode, each encoded in numeric form.
Training and Validation Sets: 80/20 split.
Hyperparameter Tuning: GridSearchCV for best model parameters.
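The date split listed on this slide might look like the following sketch. The input column name `Date`, the helper `add_date_features`, the Monday=0 weekday encoding, and the season coding (0=Winter, 1=Spring, 2=Summer, 3=Fall, by Northern Hemisphere meteorological seasons) are assumptions; the slide does not define them.

```python
import pandas as pd

WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]


def add_date_features(df, date_col="Date"):
    """Return a copy of df with numeric calendar features derived from date_col."""
    out = df.copy()
    d = pd.to_datetime(out[date_col])
    out["Year"] = d.dt.year
    out["Month"] = d.dt.month
    out["Day"] = d.dt.day
    out["DayOfWeek"] = d.dt.dayofweek                      # Monday=0 ... Sunday=6
    out["Quarter"] = d.dt.quarter
    out["WeekOfYear"] = d.dt.isocalendar().week.astype(int)  # ISO week number
    out["IsWeekend"] = (d.dt.dayofweek >= 5).astype(int)
    out["DayOfYear"] = d.dt.dayofyear
    # Weekday name, then an assumed numeric encoding in calendar order.
    out["WeekdayName"] = d.dt.day_name().map({n: i for i, n in enumerate(WEEKDAYS)})
    # Assumed season coding: Dec-Feb=0, Mar-May=1, Jun-Aug=2, Sep-Nov=3.
    out["SeasonNameCode"] = (d.dt.month % 12) // 3
    return out


frame = pd.DataFrame({"Date": ["2024-01-06", "2024-07-15"]})
features = add_date_features(frame)
print(features.T)
```

With this encoding `WeekdayName` carries the same information as `DayOfWeek`; a tree-based model tolerates the redundancy, but a linear model may be better served by dropping one of them.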
Link to the python code
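A minimal sketch of the 80/20 split and GridSearchCV tuning for a RandomForestRegressor is given here. The synthetic data, the parameter grid, the 3-fold cross-validation, and the MAE scoring are illustrative assumptions, not the settings of the actual project.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import GridSearchCV, train_test_split

# Synthetic stand-in for the engineered feature matrix and sales target.
rng = np.random.default_rng(42)
X = rng.uniform(0, 1, size=(200, 4))
y = 50 * X[:, 0] + 20 * X[:, 1] ** 2 + rng.normal(0, 1, 200)

# 80/20 split into training and validation sets.
X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Small illustrative grid; a real search would usually cover more values.
param_grid = {"n_estimators": [50, 100], "max_depth": [4, None]}
search = GridSearchCV(
    RandomForestRegressor(random_state=42),
    param_grid,
    scoring="neg_mean_absolute_error",  # scikit-learn maximizes, hence the sign
    cv=3,
)
search.fit(X_train, y_train)

val_mae = mean_absolute_error(y_val, search.predict(X_val))
print("Best parameters:", search.best_params_)
print("Validation MAE:", round(val_mae, 3))
```

`train_test_split` shuffles by default. For time-ordered sales data, a chronological split (`shuffle=False`) or `TimeSeriesSplit` is a common alternative that avoids validating on dates earlier than the training data.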
Model Building (Revised)
Expanding the Regression Toolkit for Enhanced Performance.
Introduction to Additional Regression Techniques:
Lasso Regression, Ridge Regression, Elastic Net Regression, Support Vector Regression
(SVR), Principal Component Analysis (PCA, for dimensionality reduction), Kernel Ridge
Regression, Gaussian Process Regressor, MLP Regressor
Diverse Set of Techniques:
Aiming to further reduce errors and improve forecasting accuracy.
Rigorous Hyperparameter Tuning:
Employing GridSearchCV for each technique to identify optimal parameters.
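Running GridSearchCV once per technique could be organized as in the sketch below, shown for four of the listed regressors. The synthetic data, the parameter grids, and the scaling step are assumptions made for illustration; the remaining techniques would be added to the `candidates` dictionary in the same way.

```python
import numpy as np
from sklearn.linear_model import ElasticNet, Lasso, Ridge
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

# Synthetic regression data (illustrative only).
rng = np.random.default_rng(0)
X = rng.normal(size=(150, 5))
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + rng.normal(0, 0.5, 150)

# One estimator and one hypothetical grid per technique.
candidates = {
    "Lasso": (Lasso(max_iter=10_000), {"model__alpha": [0.01, 0.1, 1.0]}),
    "Ridge": (Ridge(), {"model__alpha": [0.1, 1.0, 10.0]}),
    "ElasticNet": (
        ElasticNet(max_iter=10_000),
        {"model__alpha": [0.01, 0.1], "model__l1_ratio": [0.2, 0.8]},
    ),
    "SVR": (SVR(), {"model__C": [1.0, 10.0], "model__epsilon": [0.1, 0.5]}),
}

results = {}
for name, (estimator, grid) in candidates.items():
    # Scale inside the pipeline so each CV fold is scaled on its own training part.
    pipe = Pipeline([("scale", StandardScaler()), ("model", estimator)])
    search = GridSearchCV(pipe, grid, scoring="neg_mean_absolute_error", cv=3)
    search.fit(X, y)
    results[name] = {
        "best_params": search.best_params_,
        "cv_mae": -search.best_score_,
    }

for name, res in sorted(results.items(), key=lambda kv: kv[1]["cv_mae"]):
    print(f"{name}: CV MAE={res['cv_mae']:.3f}, params={res['best_params']}")
```

Putting the scaler inside the pipeline matters for penalized and kernel models, whose results depend on feature scale.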
Link to the python code
Model Evaluation
Evaluation Metrics: Random Forest Regressor
Mean Absolute Error (MAE): 19.488
Model Comparison: Random Forest Regressor vs. Gradient Boosting Regressor
Performance on validation data.
Best Model Selection: Random Forest Regressor, chosen for the lowest MAE.
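The comparison by MAE can be sketched as follows. MAE is the mean of the absolute differences between actual and predicted values. The data here is synthetic and the models use default settings, so the numbers it prints are unrelated to the 19.488 reported above, and either model may come out ahead.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor, RandomForestRegressor
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import train_test_split


def manual_mae(y_true, y_pred):
    """MAE = mean(|y_true - y_pred|), written out for clarity."""
    return float(np.mean(np.abs(np.asarray(y_true) - np.asarray(y_pred))))


# Synthetic data (illustrative only).
rng = np.random.default_rng(1)
X = rng.uniform(size=(300, 3))
y = 100 * X[:, 0] + 30 * np.sin(6 * X[:, 1]) + rng.normal(0, 2, 300)
X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.2, random_state=1
)

models = {
    "RandomForestRegressor": RandomForestRegressor(random_state=1),
    "GradientBoostingRegressor": GradientBoostingRegressor(random_state=1),
}

maes = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    maes[name] = mean_absolute_error(y_val, model.predict(X_val))

# Select the model with the lowest validation MAE.
best_name = min(maes, key=maes.get)
print(maes)
print("Selected:", best_name)
```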
Link to the python code
Conclusion and Next Steps
Summary of Findings:
Improved sales forecasting accuracy; the Random Forest Regressor was selected for its lowest MAE (19.488).
Next Steps:
Deploying the model for real-time forecasting.
Continual model monitoring and updates.
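One simple way to approach continual monitoring is to track a rolling MAE on recent forecasts and raise a flag when it crosses a threshold. The sketch below assumes daily actuals and predictions are available as pandas Series; the helper name `flag_drift`, the window length, and the threshold value are hypothetical.

```python
import pandas as pd


def flag_drift(actual, predicted, window=5, threshold=25.0):
    """Return the rolling MAE and a boolean Series marking days above threshold."""
    abs_error = (actual - predicted).abs()
    rolling_mae = abs_error.rolling(window).mean()
    # Comparisons with NaN (incomplete windows) evaluate to False.
    return rolling_mae, rolling_mae > threshold


# Toy example: forecasts are off by 5 units for ten days, then by 50 units.
days = pd.date_range("2024-01-01", periods=20, freq="D")
actual = pd.Series([100.0] * 20, index=days)
predicted = pd.Series([95.0] * 10 + [50.0] * 10, index=days)

rolling_mae, alerts = flag_drift(actual, predicted)
first_alert = alerts.idxmax() if alerts.any() else None
print("First alert:", first_alert)
```

An alert of this kind would typically trigger a review or a retraining run rather than an automatic model swap.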
Link to the python code
THANK YOU