Prediction of Factors in Vehicular Accident Using Machine Learning
All content following this page was uploaded by Irfan Ahmad Ganie on 01 October 2020.
Aklilu Elias Kurika et al., International Journal of Emerging Trends in Engineering Research, 8(9), September 2020, 5171 – 5176

ABSTRACT
Vehicle traffic accidents are a major item on the government's agenda, and special attention has been given to continuously reducing their occurrence and related risks. Wolaita Zone is one of the major areas in which vehicle traffic accidents are increasing. Government and concerned bodies have given special attention to reducing the accident rate in the country. With this as the motivating factor, this work tried to predict factors of vehicle accidents using machine learning algorithms. We used an unbalanced dataset with 1611 instances, covering seven years of data from 2012-2019. To analyze the data and evaluate patterns in the dataset, the KDD process model was applied. The learning algorithms applied in the experiments were the J48 decision tree, Random Forest, Rep tree, Naïve Bayes and Bayesian network classifiers. The experimental results, model evaluation and performance measurement show that the F-measures of the J48 and Rep tree classifiers are comparable, i.e. 97.87% and 97.80% respectively, while Random Forest performed worse, i.e. 90.9%. We identified the first experiment of the J48 tree as the best model by performance; 23 best rules were generated from this experiment, and the best features were also identified. The most common victims, the vehicles most commonly involved in accidents and the black spot areas for frequent accident occurrences were identified. The findings of this study are significant for the road and traffic authority and the police commission for the revision and endorsement of the rules, regulations and standards related to traffic accidents; vehicle traffic accidents and related risks can therefore be reduced in our country, Ethiopia, in general and in Wolaita Zone in particular. We made the accident data ready for further analysis so that future researchers can extract the most important patterns from the datasets.

Key words : About four key words or phrases in alphabetical order, separated by commas.

1. INTRODUCTION
Road or vehicle traffic accidents are a universal problem [1], and worldwide reports show that, on average, more than four million people die from various causes in one year. Among these, HIV/AIDS and tuberculosis are the first and second causes of death, and vehicle traffic accidents are the third known cause of daily deaths.

According to WHO and World Bank [2], World Health Day 2004, organized by the World Health Organization, was for the first time devoted to road safety. Every year, according to the statistics, 1.2 million people are known to die in road accidents worldwide. A report carried by the Guardian [3] also indicates that in 2020 vehicle traffic accidents will become the leading cause of human death in the world. More than half the people killed in vehicle traffic crashes are young adults aged between 15 and 44 years, often the breadwinners in a family. Furthermore, road traffic injuries cost low-income and middle-income countries between 1% and 2% of their gross national product, more than the total development aid received by these countries [2]. Much research has been conducted on accidents in every part of the world to reduce the accident rate, with researchers viewing accident data from the perspective of their respective areas and countries.

Even though plenty of research has been conducted, vehicle traffic accidents are increasing rapidly and result in massive loss of human life, material damage and other comparable losses. WHO and World Bank [2] show that worldwide an estimated 1.2 million people are killed in road crashes each year and as many as 50 million are injured. Projections indicate that these figures will increase by about 65% over the next 20 years unless there is new commitment to prevention. The increased losses and related injuries cause various problems for the economic development of the respective countries. The attributes and contributing causes of traffic accidents differ from country to country. Accident risk factors are largely determined in the developed countries, and some preventive measures have been taken to reduce the risk. But traffic accident risks, related material damage and loss of life increase from time to time in developing countries. In Ethiopia, some research has been conducted, but the risk factors have not been reduced over time. In the case of
Wolaita Zone, the recorded data and the realities on the ground show that traffic accidents are a major issue that should be given special attention. The reason is that the risks of traffic accidents and the related material and life losses show an enormous increase from time to time. But the reasons for the increase in traffic accidents are not well known. Additional in-depth analysis of accident data is needed, and this is also a motivating factor for conducting a study with machine learning algorithms.

Generally, the amount of data used by previous researchers is small; some others used secondary data collected by questionnaire, as well as social media data, for analysis. Using this kind of data for predicting factors of traffic accidents is not feasible. Most studies in the past literature focus mainly on J48 decision tree algorithms. Other kinds of decision tree algorithms are not used for comparative analysis by most researchers. Thus, performance comparisons have not been made for more than two algorithms.

2. RELATED WORKS
Studies [5] and [4] are related to the locations of accident-related factors; according to them, road features are one of the contributing factors of traffic accidents. But the types of road features are not clearly specified in these studies.

Studies [6] and [7] are comparative analyses of the performance and accuracy of algorithms. The first compared six algorithms (classification and regression tree, Random Forest, ID3, Functional trees, Naïve Bayes and J48) to determine the accident severity level. It reveals that Naïve Bayes and J48 are approximately the same in accuracy. The second is a comparative study of machine learning algorithms; the comparison was made between decision trees and neural networks to determine factors of increased traffic injury. It concludes that decision trees perform better than neural networks.

In Ethiopia, Wolaita Zone is one of the best-known areas in which traffic accidents and related injuries take place. By analyzing the factors with learning algorithms, the most contributing factors will be determined from traffic accident data obtained from WZPC. Contributing factors other than these might also be found for increased traffic accidents. The methodologies used by various researchers are of various types. Akinbola et al. [10] and [11] used machine learning algorithms to predict the factors of traffic accidents. Both used only decision trees. Tibebe et al. [12] is also about machine learning algorithms, but not about determining the causes of traffic accidents, and Gupta and Baluni [7] also used classification and machine learning algorithms to determine traffic injury occurrences.

3. MATERIALS AND METHODS
Classification has been identified as the best technique to attain our objectives given the predetermined datasets we had. From the various classification algorithms, decision tree classifiers (J48, Random Forest and Rep Tree) and Bayesian classifiers (Naïve Bayes and Bayesian Network) were selected to conduct our experiments. We ran 15 experiments, three for each classifier: 10-fold cross-validation, a 66% split and a 90% split. We identified the 14 best features among 36 attributes with the wrapper method.

Knowledge discovery in datasets (KDD) process modeling has been used as the study design, based on Figure 1.
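The experimental design in Section 3 (five classifiers, each evaluated by 10-fold cross-validation, a 66% split and a 90% split) can be illustrated with a minimal sketch. The classifier names below are labels only, and the helper functions `percentage_split` and `k_fold_indices`, the shuffling and the seed are assumptions made for illustration; the section does not state how the authors' tool partitions the 1611 instances.

```python
import random
from itertools import product

# Classifier names as listed in Section 3; used here only as labels.
CLASSIFIERS = ["J48", "Random Forest", "Rep Tree", "Naive Bayes", "Bayesian Network"]
# The three evaluation schemes applied to every classifier.
SCHEMES = ["10-fold cross-validation", "66% split", "90% split"]


def percentage_split(n_instances, train_fraction, seed=1):
    """Shuffle instance indices and cut them into train and test lists."""
    indices = list(range(n_instances))
    random.Random(seed).shuffle(indices)
    cut = round(n_instances * train_fraction)
    return indices[:cut], indices[cut:]


def k_fold_indices(n_instances, k=10, seed=1):
    """Yield (train, test) index lists; every instance is tested exactly once."""
    indices = list(range(n_instances))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    for i, test in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


# 5 classifiers x 3 schemes gives the 15 experiments.
experiments = list(product(CLASSIFIERS, SCHEMES))
print(len(experiments))

# Sizes of the train and test partitions for the 66% split of 1611 instances.
train, test = percentage_split(1611, 0.66)
print(len(train), len(test))
```

With a percentage split the model is scored once on the held-out part, whereas cross-validation averages over ten held-out folds, which is usually the more stable estimate on a dataset of this size.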
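Section 3 reports that 14 of 36 attributes were chosen with the wrapper method, in which candidate feature subsets are scored by the accuracy of a classifier trained on them. The sketch below shows one common variant, greedy forward selection. The search strategy, the simple lookup classifier, and the toy attributes (`road`, `light`, `weekday`) and labels are all assumptions for illustration; none of them is taken from the paper's data or tooling.

```python
from collections import Counter, defaultdict


def lookup_accuracy(train, test, features):
    """Score a feature subset: predict the majority class seen in training
    for each combination of feature values, then measure held-out accuracy."""
    default = Counter(label for _, label in train).most_common(1)[0][0]
    table = defaultdict(Counter)
    for row, label in train:
        table[tuple(row[f] for f in features)][label] += 1
    correct = 0
    for row, label in test:
        key = tuple(row[f] for f in features)
        predicted = table[key].most_common(1)[0][0] if key in table else default
        correct += predicted == label
    return correct / len(test)


def forward_selection(train, test, candidates):
    """Greedy wrapper: keep adding the feature that most improves accuracy."""
    selected, best = [], lookup_accuracy(train, test, [])
    while True:
        scored = [(lookup_accuracy(train, test, selected + [f]), f)
                  for f in candidates if f not in selected]
        if not scored:
            break
        score, feature = max(scored, key=lambda pair: pair[0])
        if score <= best:  # stop when no remaining feature helps
            break
        selected.append(feature)
        best = score
    return selected, best


def rec(road, light, weekday, label):
    """Build one hypothetical accident record as (attributes, class label)."""
    return {"road": road, "light": light, "weekday": weekday}, label


# Toy data: severity depends on road and light; weekday is noise.
train_rows = [
    rec("wet", "dark", "mon", "severe"), rec("wet", "day", "tue", "severe"),
    rec("dry", "dark", "mon", "severe"), rec("dry", "day", "tue", "minor"),
    rec("dry", "day", "mon", "minor"), rec("wet", "dark", "tue", "severe"),
    rec("dry", "day", "wed", "minor"), rec("dry", "dark", "wed", "severe"),
]
test_rows = [
    rec("wet", "day", "mon", "severe"), rec("dry", "dark", "tue", "severe"),
    rec("dry", "day", "wed", "minor"), rec("wet", "dark", "wed", "severe"),
    rec("dry", "day", "tue", "minor"),
]

selected, best = forward_selection(train_rows, test_rows, ["road", "light", "weekday"])
print(selected, best)
```

On this toy data the search keeps the two informative attributes and drops the noisy one, which is the behaviour a wrapper is meant to provide. A real study would score subsets by cross-validation on the training data rather than on a single held-out set.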
4.5 Performance Measurement of Learning Algorithms
In the experiment evaluation part, we identified that J48 and Rep tree are comparatively similar and better than the remaining three classifiers. So we selected the first and third experiments for each classifier and measured the accuracy of the classifiers as follows.

In this study, machine learning approaches have been applied for data analysis and prediction on car traffic accident datasets to explore important features and pattern relationships in car traffic accident occurrences. We addressed various problem statements and objectives to determine the determinant factors of car traffic accidents. We identified the 7 most commonly involved vehicles, 20 areas of frequent accident occurrence, pedestrians and passengers as the most common victims, and J48 and Rep tree as the best algorithms by performance and model accuracy. 23 best rules were generated from the selected model for accident occurrences, the results have been discussed, and finally some points have been recommended for future researchers.
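For reference, the F-measure used to compare the classifiers in Section 4.5 and in the abstract can be computed from per-class confusion counts. The sketch below is a generic illustration: the function names, the class names and all counts are hypothetical and are not the paper's results. The support-weighted average is shown because the dataset is described as unbalanced; whether the authors reported a weighted average is an assumption.

```python
def precision_recall_f1(tp, fp, fn):
    """Return (precision, recall, F1) for one class from its confusion counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


def weighted_f1(per_class_counts):
    """Average per-class F1 weighted by class support (tp + fn).

    per_class_counts maps a class name to its (tp, fp, fn) counts.
    """
    total = sum(tp + fn for tp, _, fn in per_class_counts.values())
    return sum((tp + fn) * precision_recall_f1(tp, fp, fn)[2]
               for tp, fp, fn in per_class_counts.values()) / total


# Hypothetical counts for a two-class, unbalanced problem.
counts = {"fatal": (8, 2, 2), "non-fatal": (88, 2, 2)}
print(precision_recall_f1(*counts["fatal"]))
print(weighted_f1(counts))
```

Because the rare class contributes little weight, a weighted F-measure can stay high even when the minority class is predicted poorly, so per-class values are worth inspecting alongside it.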
REFERENCES
[1] Micheale Kihishen Gebru, "Road traffic accident: Human security perspective," International Journal of Peace and Development Studies, vol. 8, ISSN 2141–6621, p. 16, March 2017.
[2] WHO and World Bank, "World Report on Traffic Injury Preventions," New York, 2013.
[3] Guardian. Traffic Accident Predictions. [Online]. http://politics.guardian.co.uk/homeaffairs/story/0,11026,1187637,00.html. 2012.
[4] David Ian White, An Investigation of Factors Associated with Traffic Accidents and Causality Risk in Scotland. Scotland: Napier University, October 2002.
[5] Sachin Kumar and Durga Toshniwal, A data mining approach to characterize road accident locations. Published online: Springerlink.com, 2016.
[6] Maninder Singh and Armit Kaur, "A Review on Road accidents in Traffic system Using Data Mining Techniques," International Journal of Science and Research, p. 6, 2014.
[7] Bhumika Gupta and Pragya Baluni, "A comparative study of various Algorithms to explore factors for vehicle collision," International Journal of Emerging Trends & Technology in Computer Science (IJETTCS), 2012.
[8] L. J. Muhammad, Sani Salisu, Atomsa Yakubu, Yusuf Musa Malgwi, Elrufai Tijjani Abdullahi, I. A. Mohammed and Nuhu Abdul’alim Muhammad, "Using Decision Tree Data Mining Algorithm to Predict Causes of Road Traffic Accidents, its Prone Locations and Time along Kano–Wudil Highway," International Journal of Database Theory and Applications, 2017.
[9] Heinz Hautzinger, Claus Pastor, Manfred Pfeiffer and Jochen Schmidt, "Analysis for Accident and Injury Risk studies," Heilbronn University, November 2007.
[10] Dipo T. Akomolafe and Akinbola Olutayo, "Using Data Mining Technique to Predict Cause of Accident and Accident Prone Locations on Highways," American Journal of Database Theory and Application, pp. 1-13, 2012.
[11] S. Vasavi, "Extracting Hidden Patterns Within Road Accident Data Using Machine Learning Techniques," in Information and Communication Technology Proceedings, Kanuru, AP, India, 2018, p. 11.
[12] Tibebe Beshah, Dejene Ejigu, Pavel Kromer, Vaclav Snasel, Jan Platos and Ajith Abraham, "Mining Traffic Accident Features by Evolutionary Fuzzy Rules," IEEE Symposium on Computational Intelligence in Vehicles and Transportation Systems, 2013.