International Journal of Computer Applications (0975 – 8887)

Volume 154 – No.3, November 2016

Football Match Winner Prediction

Saurabh Vaidya, Harshal Sanghavi, Kushal Gevaria
Department of Computer Engineering
Dwarkadas J. Sanghvi College of Engineering
Mumbai, India

ABSTRACT
Prediction of football match outcomes should follow approaches that are more generalized. Hence, in our project we predict outcomes of the English Premier League based on historical match data, using machine learning algorithms. We gathered data from the past 10 seasons and extracted features such as form, goals scored and conceded, and shots ratio. The computation of the form feature differs from what has been prevalent till now: more focus is given to gaining insight into the form of a team and associating a deeper meaning with it. Basic features like shots ratio and goals scored are combined to create the attacking quotient feature. We use Logistic Regression and implement a voting algorithm between Random Forest and Naive Bayes classifiers to achieve accuracy between 47% and 50%, with a mean absolute error of 0.37.

Keywords
Machine learning; Data mining; Prediction system; Football; Classifiers; Knowledge discovery database system

1. INTRODUCTION
The 2010 FIFA World Cup showed a display of sheer brilliance by Paul the Octopus. Paul predicted the winner correctly an astonishing 8 times when he was tested. There are other prediction techniques: some predict the outcome after half-time, while others predict outcomes on an ongoing basis; however, their accuracy is not good. So, for the love of the game and the eagerness to learn new techniques of prediction, we have attempted to devise our own method to predict the outcome of a football match.

The problem of predicting a football match winner is a multi-class classification problem with three classes: win, loss, and draw. Of these, win and loss are comparatively easy to classify. The draw class, however, is very difficult to predict even in real-world scenarios. A draw is not a favored outcome for pundits or for betting enthusiasts.

The English Premier League (EPL) is the most watched football league in the world, with almost 4.7 billion viewers. In our paper, we have chosen the English Premier League for its competitiveness as well as the random nature of its outcomes. For example, in the 2010-11 season, the distribution of wins, losses and draws was 35.5%, 35.5% and 29% respectively. So if we calculate the measure of randomness:

Entropy = −(0.29 ∗ log_3(0.29) + 2 ∗ (0.355 ∗ log_3(0.355))) = 0.72 [3].

(Evaluated directly with base-3 logarithms, the expression above comes to approximately 0.996.) This is very close to 1 (the state of complete randomness). Thus testing our results on the EPL would only help to justify the generality of our approach.

The major challenge in the task of predicting match outcomes is the extraction and availability of the required data. The data source used by us in this project is www.football-co.uk . The data has to be scraped and stored to extract the features. We collect data over 10 seasons from 2004-05 to 2014-15. We extract a set of 4 features per team. All the data are scraped with the help of crawlers.

The features generally used are taken in their direct form, like shots, cards, goals etc. However, we have attempted to perform some computations to build more complex features. Various machine learning techniques have been used to predict match outcomes, like clustering, SVM, Bayesian classifiers etc. We try different techniques to find the one which suits our data sets.

2. LITERATURE REVIEW
The term “Data Mining” was first used around 1990 in the database community. Data mining and knowledge discovery are used interchangeably. Data mining is the process of extracting information from a data set and converting it into an understandable structured form [4]. Data mining has many applications, and is thus useful in predicting the match winner in football by analyzing previous match data. Data mining with machine learning can make such predictions work efficiently. Arthur Samuel, in 1959, defined machine learning as a "Field of study that gives computers the ability to learn without being explicitly programmed". Machine learning combined with data mining helps us to focus more on exploratory data analysis. Machine learning makes predictions that depend on the properties learnt from the training data [5].

Betting is widely popular across sporting events ranging from cricket and football to tennis and snooker. Douwe Buursma gives importance to effective betting on football matches [1]. Betting is prominently popular in football, as it is one of the most famous and most widely watched sports in the world. The betting system works in the following way: the bettor wins money if the bets placed turn out to be correct and loses money otherwise. The money earned or lost is based on the odds determined by the bookmakers. When the probability of an outcome is, say, 0.2, the fair odds would be 1/0.2 = 5. However, to earn a profit, the bookmakers place the odds at, say, 4.5. Thus, to eliminate this “unfairness”, it is necessary to find accurate probabilities of wins or draws to beat the bookmakers’ odds. Douwe Buursma uses different machine learning classifiers, and an accuracy of 55.08% is obtained by using regression and a multi-class classifier [1].

Nivard van Wijk uses the betting concept, which leads one to predict a match winner, and thus proposes two models to explain the prediction: the toto-model and the score-model. This paper explains the prediction system mathematically by all the methods and formulas specified in the article itself. The accuracy of about 53.03% is
obtained after comparing all the models proposed in this paper [2].

Ben Ulmer and Matthew Fernandez predicted soccer match results in the English Premier League. They used several machine learning classifiers, namely a linear classifier trained with stochastic gradient descent, Naïve Bayes, a hidden Markov model, a Support Vector Machine and Random Forest. The accuracy of every model was calculated to find the better approach. They proposed that the results of the first few matches couldn’t be predicted due to the lack of data regarding the form of the team. They compared all the methods, out of which SVM showed the best result of 40% - 52% accuracy [3].

3. WORKING OF THE SYSTEM

As seen in the literature survey, different systems had their own sets of parameters and classifiers. The accuracy of the system would thus depend on feature selection and computation as well as the type of classifier used. In order to achieve a better accuracy than previous systems, we focus on selecting proper features, computing those features accurately, and selecting the best classifier. The prediction system proposed by us has three main parameter components, viz. current form, attacking quotient and defensive quotient.

The current form is calculated keeping in mind two factors: home/away outcome and the relative position of the two teams. A form matrix is constructed which implements the above factors and gives detailed information about the magnitude of a team’s loss or win.

Table 1. FORM MATRIX

Teams   Points   Multiplying Factor   Home loss   Away win
A       0.75     0.15                 -20%        20%
B       0.6      0.25                 -16%        16%
C       0.4      0.4                  -12%        12%
D       0.15     0.6                  -10%        10%

The above table is used to calculate a team’s form (recent 5 matches). The 20 teams are divided equally into 4 groups (A to D) based on their table position. When a team wins, +1 and some extra points are awarded, which depict the magnitude of that win. That magnitude is calculated using the above table. For example, if a team from group A wins against a team of group C (at the home of group C), the points of team A will be

Points = ((+1) + (0.15 * 0.4)) * 1.2

and those of team C will be

Points = ((-1) – (0.15 * 0.4)) * 0.88

Here the factor 1.2 applies group A's 20% away-win adjustment and 0.88 applies group C's -12% home-loss adjustment.

Finally, the points of the 5 most recent matches are added to generate a collective form.

Two main aspects of a football game are attack and defense. Thus comparing these two quotients of two teams gives us an intuition about the better team both attack-wise and defense-wise. The attacking quotient is computed using the following features: shots on target and the shots on target/goals ratio. These two features signify how good the team is in terms of attack. The defense quotient is computed using the features successful tackles and intercepted passes. These signify the strength of the defense.

After feature selection and computation, the next task is selecting the classifier to be used. Initially we used Logistic Regression to classify the data set; however, it classified only 2 classes and not the 3rd one.

On plotting the dataset on a graph, we got the following result:

Fig. 1: Form v/s Form Graph

As we can observe, the dataset is very sparse, and hence using decision trees and Naïve Bayes classification would yield better results. Hence, the next algorithm that we implemented is the Vote algorithm. This algorithm uses the best outcomes of all the listed algorithms and generates a cumulative outcome. We used the Random Forest and Naïve Bayes classification algorithms. This algorithm was able to classify the 3rd class, which was not possible using any other algorithm.

The following is our system architecture:

Fig. 2: System Architecture

As seen in the architecture, we extract all the required features from a data source and compute the above-mentioned parameters: form, attack quotient and defense quotient. The classifier system gives us a value that determines the class to which the output belongs. This output is then approximated and mapped to defined outputs (1 for a win, 0 for a loss, and 0.5 for a draw). The final output is a list of outcomes predicted for a set of matches.

4. EXPECTED OUTCOME
We collected data from various websites and data sources using different scraping tools. We generated a mathematical model to represent the data in the format required by the algorithms. The dataset was then divided in the ratio 80:20 (training: testing). We achieved 49.37% accuracy using the Logistic Regression algorithm; its confusion matrix is given in Table 2.
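
Part of the data representation mentioned above is the form feature of Section 3. The following is a minimal sketch of that calculation, covering only the away-win case given in the worked example. The names `FORM_MATRIX`, `away_win_points` and `collective_form` are hypothetical, and the reading of the term 0.15 * 0.4 as "winner's multiplying factor times loser's Points value" is an assumption, since group C has 0.4 in both columns of Table 1. Home wins and draws are not specified in the text and are left out.

```python
# Values copied from Table 1 (FORM MATRIX); percentages stored as fractions.
FORM_MATRIX = {
    "A": {"points": 0.75, "factor": 0.15, "home_loss": -0.20, "away_win": 0.20},
    "B": {"points": 0.60, "factor": 0.25, "home_loss": -0.16, "away_win": 0.16},
    "C": {"points": 0.40, "factor": 0.40, "home_loss": -0.12, "away_win": 0.12},
    "D": {"points": 0.15, "factor": 0.60, "home_loss": -0.10, "away_win": 0.10},
}


def away_win_points(winner_group, loser_group):
    """Return (winner_points, loser_points) for an away win.

    Assumption: the extra magnitude is the winner's multiplying factor
    times the loser's Points value, as one reading of the worked example.
    """
    winner = FORM_MATRIX[winner_group]
    loser = FORM_MATRIX[loser_group]
    magnitude = winner["factor"] * loser["points"]
    winner_points = (1 + magnitude) * (1 + winner["away_win"])
    loser_points = (-1 - magnitude) * (1 + loser["home_loss"])
    return winner_points, loser_points


def collective_form(match_points):
    """Add up the points of the 5 most recent matches (last 5 list items)."""
    return sum(match_points[-5:])


# Worked example from Section 3: a group A team wins away at a group C team.
winner_pts, loser_pts = away_win_points("A", "C")
print(round(winner_pts, 4), round(loser_pts, 4))
```

Under these assumptions the sketch reproduces the two expressions of the worked example, ((+1) + (0.15 * 0.4)) * 1.2 for team A and ((-1) – (0.15 * 0.4)) * 0.88 for team C.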

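The accuracy figures in this section can be cross-checked from the confusion matrices, since accuracy is the share of test matches that fall on the diagonal. The sketch below uses the counts reported in Tables 2 and 3; the names `accuracy` and `class_recall` are hypothetical helpers introduced for illustration, not part of the original system.

```python
# Rows are actual classes, columns are predicted classes,
# both in the order Win, Loss, Draw (counts from Tables 2 and 3).
LOGISTIC_REGRESSION = [
    [268, 32, 1],
    [135, 57, 0],
    [138, 27, 0],
]
VOTE = [
    [235, 52, 14],
    [114, 66, 12],
    [112, 44, 9],
]
DRAW = 2  # index of the draw class


def accuracy(matrix):
    """Fraction of instances on the diagonal of a square confusion matrix."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total


def class_recall(matrix, k):
    """Fraction of actual class-k instances that were predicted as class k."""
    return matrix[k][k] / sum(matrix[k])


for name, matrix in [("Logistic regression", LOGISTIC_REGRESSION), ("Vote", VOTE)]:
    print(name,
          "accuracy: %.2f%%" % (100 * accuracy(matrix)),
          "draw recall: %.2f%%" % (100 * class_recall(matrix, DRAW)))
```

With these counts the vote matrix gives 47.11%, matching the text, while the logistic regression matrix gives roughly 49.4%, marginally above the reported 49.37%. The draw recall makes the trade-off concrete: none of the 165 actual draws is recovered by logistic regression, against 9 for the vote.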

Table 2. Confusion Matrix of Logistic Regression

              Predicted Win   Predicted Loss   Predicted Draw
Actual Win    268             32               1
Actual Loss   135             57               0
Actual Draw   138             27               0

As we can see from the confusion matrix, Logistic Regression effectively classifies only 2 classes and assigns just 1 instance to the 3rd class (draw). Hence, we used a different algorithm, Vote, which selects the best results of multiple algorithms. Here, we have used the Random Forest and Naïve Bayes classification algorithms for voting. The accuracy achieved is 47.11%, and below is the confusion matrix:

Table 3. Confusion Matrix of Vote Algorithm

              Predicted Win   Predicted Loss   Predicted Draw
Actual Win    235             52               14
Actual Loss   114             66               12
Actual Draw   112             44               9

Although this algorithm is not as accurate as the previous one, it still classifies the 3rd class, and hence there is a compromise between accuracy and classification of all classes.

5. CONCLUSION AND FUTURE SCOPE
Thus, it is seen that the draw class reduces the accuracy of predicting the remaining two classes. It is observed that by removing the draw instances, accuracy can be increased up to 65%. Logistic Regression fails to classify the draw class, so in order to achieve generality, the voting algorithm is preferred. The availability of more features that help in predicting the draw class would improve the accuracy. Also, algorithms suited to sparse data, such as decision trees and boosting algorithms, may increase the accuracy.

6. REFERENCES
[1] Douwe Buursma; Predicting sports events from past results, University of Twente, 2011.

[2] Nivard, W. & Mei, R. D. Soccer analytics: Predicting the outcome of soccer matches. (Master thesis: VU University of Amsterdam), 2012.

[3] Ben Ulmer and Matthew Fernandez; Predicting Soccer Match results in the English Premier League, cs229, 2014.

[4] Data mining [Online]. Available: https://en.wikipedia.org/wiki/Data_mining

[5] Machine Learning [Online]. Available: https://en.wikipedia.org/wiki/Machine_learning

IJCATM : www.ijcaonline.org
