EPL Prediction Web App
Presentation: Roshan Gautam
What we cover
❖ INTRODUCTION
❖ PROBLEM STATEMENT
❖ REQUIREMENT ENGINEERING
❖ LITERATURE REVIEW
❖ RESEARCH METHODOLOGY
❖ TESTING
❖ DISCUSSION
❖ CONCLUSION
INTRODUCTION
• The EPL's global significance lies in its unparalleled competitiveness and immense fan base, making it one of the most celebrated football leagues worldwide.
• Growing demand for precise match predictions stems from the sports betting industry, sports analytics, and the decision-making process for managers and coaches.
• Machine learning algorithms have revolutionized EPL analysis, using extensive data and advanced techniques to enhance prediction accuracy and deepen football understanding.
• Predicting football match outcomes is complex due to numerous variables, necessitating an understanding of machine learning's limitations and the multifaceted nature of the game.
• The presentation aims to explore machine learning's application in EPL outcome prediction, comparing various algorithms, and providing insights into performance variations and the future of prediction in the league.
AIMS AND OBJECTIVES
Aim: To develop a predictive model for English Premier League (EPL) football match outcomes
using machine learning algorithms and provide valuable insights for fans, sports analysts, and
team management.
Objectives:
Data Collection and Preparation: a) Collect historical EPL match data, including team and
player information. b) Process and format the data for suitability in training machine learning
models.
Model Development: a) Create predictive models using Logistic Regression, LSTM, and Poisson
Distribution. b) Train these models using the collected historical data to predict EPL match
outcomes.
Model Evaluation: a) Split the dataset into training (80%) and testing (20%) subsets to evaluate
model performance. b) Assess the accuracy and reliability of each model in predicting match
outcomes.
Web Application Integration: a) Incorporate the trained models into a web application. b) Test
the functionality and accuracy of the web application in providing match outcome predictions
based on user input.
User Experience Enhancement: a) Design a user-friendly and visually appealing interface for
the web application. b) Ensure that the application offers an engaging and informative
experience for users seeking match predictions.
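The 80/20 split named under Model Evaluation can be sketched as follows. This is a minimal illustration, not the project's actual code: it assumes the matches are sorted oldest-first and splits them chronologically (the slides do not say whether the split was chronological or random), and the helper names and toy records are hypothetical.

```python
from typing import List, Tuple, TypeVar

T = TypeVar("T")


def chronological_split(rows: List[T], train_fraction: float = 0.8) -> Tuple[List[T], List[T]]:
    """Split rows (assumed sorted oldest-first) into training and testing parts."""
    if not 0.0 < train_fraction < 1.0:
        raise ValueError("train_fraction must be strictly between 0 and 1")
    cut = int(len(rows) * train_fraction)
    return rows[:cut], rows[cut:]


def accuracy(predicted: List[str], actual: List[str]) -> float:
    """Fraction of predictions that match the actual outcomes."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    hits = sum(p == a for p, a in zip(predicted, actual))
    return hits / len(actual)


# Hypothetical toy records: (home_team, away_team, result), result in {"H", "D", "A"}.
matches = [
    ("Team A", "Team B", "H"), ("Team C", "Team D", "D"),
    ("Team B", "Team C", "A"), ("Team D", "Team A", "H"),
    ("Team A", "Team C", "H"), ("Team B", "Team D", "D"),
    ("Team C", "Team A", "A"), ("Team D", "Team B", "H"),
    ("Team A", "Team D", "H"), ("Team C", "Team B", "A"),
]

train, test = chronological_split(matches)

# Naive baseline used only to exercise accuracy(): always predict a home win.
baseline_predictions = ["H"] * len(test)
baseline_accuracy = accuracy(baseline_predictions, [result for _, _, result in test])
```

Splitting by time keeps future matches out of the training set. A shuffled split would be an equally valid reading of the objective.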
PROBLEM STATEMENT
Accurately predicting English Premier League (EPL) match outcomes is a persistent challenge. The EPL's global importance draws fans from diverse backgrounds, each seeking dependable predictions for purposes like sports betting, fantasy football, and informed decision-making by teams.
Traditional prediction methods often lean on historical statistics and basic statistical models, potentially missing crucial factors like player form, injuries, tactical variations, and external variables like weather. These factors significantly impact match results.
• DATA COLLECTION AND PREPROCESSING:
• FEATURE ENGINEERING:
• MODEL SELECTION:
• MODEL TRAINING AND OPTIMIZATION:
• MODEL INTEGRATION:
• USER INTERFACE (UI) DESIGN:
LITERATURE REVIEW
The literature informing this research encompasses diverse studies in
football match outcome prediction. Ulmer and Fernandez's work explores
machine learning classifiers like Linear classification, Naive Bayes, Hidden
Markov Model, SVM, and Random Forest in the context of the EPL, laying
the foundation for our study. Palinggi's team introduces weather conditions
as predictive features, highlighting the value of external factors. Saiedy,
HemmatQachmas, and Faqiri compare SVM and Random Forest's
performance, offering insights into machine learning tools. Constantinou's
global predictions inspire broader applications beyond the EPL. Finally,
Azhari, Widyaningsih, and Lestari's Poisson regression model aligns with our
use of the Poisson Distribution, guiding our methodology. Collectively, this
literature shapes our research, spanning machine learning techniques,
external factors, algorithmic comparisons, global predictions, and Poisson
regression in football match outcome forecasting.
RESEARCH METHODOLOGY
Data Collection:
• Historical Match Data: We collected a comprehensive dataset of historical EPL match results, including team performance metrics, player statistics, and match-specific details.
• Weather Data: Weather conditions for each match were obtained, drawing on sources such as meteorological databases and historical weather records.
• Team and Player Data: Information on team rosters, player attributes, and past performances was sourced from reputable sports databases and EPL records.
Data Preprocessing:
• Feature Engineering: We conducted extensive feature engineering to extract relevant information, such as team form, player form, home and away performance, and weather-related variables.
• Data Cleaning: The dataset underwent rigorous cleaning to handle missing values, outliers, and inconsistencies, ensuring data quality.
• Normalization and Scaling: Features were normalized and scaled to maintain consistency and facilitate machine learning model training.
Model Selection:
• Poisson Regression Model: Given its suitability for modeling goal-related events in football matches, we employed the Poisson regression model as the core predictive algorithm.
• Machine Learning Classifiers: Additionally, we utilized various machine learning classifiers such as Logistic Regression, Random Forest, and Support Vector Machine (SVM) to compare their performance with the Poisson regression model.
Model Training and Validation:
• We divided the dataset into training and validation sets, employing techniques like cross-validation to assess model performance.
• Hyperparameter tuning was conducted to optimize the models for predictive accuracy.
Feature Selection:
• Feature selection methods, including recursive feature elimination and feature importance analysis, were employed to identify the most influential predictors for match outcomes.
Model Evaluation:
• We assessed model performance using metrics such as accuracy, precision, recall, F1-score, and area under the Receiver Operating Characteristic (ROC-AUC) curve.
• Comparative analysis between the Poisson regression model and machine learning classifiers was conducted to determine the most effective approach.
Results Interpretation and Visualization:
• Visualizations, including heatmaps and charts, were used to interpret model outcomes and understand the impact of different features on match predictions.
• Model explanations and insights were derived to provide context to the predictions.
Ethical Considerations:
• Ethical considerations were taken into account, particularly concerning data privacy and bias mitigation in model training and predictions.
Future Research:
• We considered potential avenues for future research, including the integration of real-time data and extending predictions to other football leagues.
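To illustrate how a Poisson goal model can be turned into match outcome probabilities, here is a minimal sketch. It assumes each side's goal count follows an independent Poisson distribution with a known expected-goals rate. The independence assumption, the `max_goals` truncation, and the function names are illustrative choices rather than details from the presentation, and in practice the two rates would come from the fitted Poisson regression.

```python
import math
from typing import Dict


def poisson_pmf(k: int, rate: float) -> float:
    """Probability of exactly k goals when the expected number of goals is `rate`."""
    return math.exp(-rate) * rate ** k / math.factorial(k)


def outcome_probabilities(home_rate: float, away_rate: float, max_goals: int = 10) -> Dict[str, float]:
    """Home win / draw / away win probabilities from two independent Poisson rates.

    Scorelines above `max_goals` per side are ignored, and the result is
    renormalized so the three probabilities sum to 1.
    """
    home_win = draw = away_win = 0.0
    for home_goals in range(max_goals + 1):
        p_home = poisson_pmf(home_goals, home_rate)
        for away_goals in range(max_goals + 1):
            p = p_home * poisson_pmf(away_goals, away_rate)
            if home_goals > away_goals:
                home_win += p
            elif home_goals == away_goals:
                draw += p
            else:
                away_win += p
    total = home_win + draw + away_win
    return {"home": home_win / total, "draw": draw / total, "away": away_win / total}


# Hypothetical expected-goals rates for a single fixture.
probabilities = outcome_probabilities(home_rate=1.6, away_rate=1.1)
predicted_outcome = max(probabilities, key=probabilities.get)
```

The predicted class is simply the outcome with the largest probability, which makes the Poisson model directly comparable with the classifiers on accuracy-style metrics.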
SYSTEM ANALYSIS AND DESIGN
Preliminary System Design:
Data Flow Diagram (DFD):
• A high-level DFD was developed to illustrate the flow of data within the system. It outlined the processes involved in data collection, preprocessing, model training, and prediction.
System Architecture:
• The preliminary system architecture was outlined, including the high-level components and their interactions. This helped identify the major subsystems such as data storage, preprocessing, modeling, and results presentation.
• Objective: To test the system's individual components (functions, methods, classes) in isolation.
Method: Develop test cases for each component, providing input data and assessing the output. Ensure that data preprocessing, model training, and result generation functions behave as expected.
• Objective: To evaluate the interaction between different system modules and components.
Method: Test how different parts of the system work together. Verify that data flows correctly between data preprocessing, modeling, and result visualization components.
• Objective: To validate whether the system meets its functional requirements and user expectations.
Method: Create test cases based on functional requirements and expected user interactions. Verify that users can input data, obtain predictions, and view results as intended.
• Objective: To assess the system's performance, scalability, and responsiveness under various conditions.
Method: Conduct load testing to determine how the system behaves under heavy user loads. Measure response times for predictions and ensure they meet acceptable thresholds.
• Objective: To evaluate the predictive accuracy and generalization of machine learning models.
Method: Implement k-fold cross-validation on the historical match data to assess the model's performance on different subsets of the data. Measure metrics like accuracy, precision, recall, and F1-score.
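The k-fold cross-validation and the precision, recall, and F1-score metrics mentioned above can be sketched in plain Python. This is an illustrative outline under stated assumptions: the folds are contiguous and unshuffled, the metrics are computed one-vs-rest for a single outcome label, and the function names and toy labels are hypothetical rather than taken from the project.

```python
from typing import List, Sequence, Tuple


def k_fold_indices(n_samples: int, k: int) -> List[Tuple[List[int], List[int]]]:
    """Return k (train_indices, test_indices) pairs using contiguous, unshuffled folds."""
    if k < 2 or k > n_samples:
        raise ValueError("k must be between 2 and n_samples")
    # Spread any remainder over the first folds so sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    folds = []
    start = 0
    for size in fold_sizes:
        test_idx = list(range(start, start + size))
        train_idx = list(range(0, start)) + list(range(start + size, n_samples))
        folds.append((train_idx, test_idx))
        start += size
    return folds


def precision_recall_f1(actual: Sequence[str], predicted: Sequence[str], positive: str) -> Tuple[float, float, float]:
    """One-vs-rest precision, recall, and F1-score for the label `positive`."""
    tp = sum(a == positive and p == positive for a, p in zip(actual, predicted))
    fp = sum(a != positive and p == positive for a, p in zip(actual, predicted))
    fn = sum(a == positive and p != positive for a, p in zip(actual, predicted))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


# Hypothetical labels: "H" home win, "D" draw, "A" away win.
actual = ["H", "H", "D", "A"]
predicted = ["H", "D", "H", "A"]
precision, recall, f1 = precision_recall_f1(actual, predicted, positive="H")

folds = k_fold_indices(n_samples=10, k=5)
```

Each fold's test indices would select the held-out matches, with the model refit on the corresponding training indices before the metrics are computed and averaged across folds.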