Crime Data Analysis Using ML
Presented by
KANDULA SAI
22A11F00D2
up. This continuous increase in criminal cases makes them difficult to classify and to solve.
Recognizing the patterns of criminal activity in a place is important in order to prevent crime from
happening. Crime-solving agencies can work more effectively if they have a good idea of the pattern of
criminal activities in a particular area. Machine learning can provide this by employing different
algorithms to find those patterns. This paper uses a crime data set to predict the types of crimes in a
particular area, which helps speed up the classification of criminal cases so that they can be handled
accordingly. The data covers the past 18 years and was collected from various trusted sources. Data
pre-processing is as important as the final prediction; this paper uses feature selection, null-value
removal and label encoding to clean and prepare the data. This research gives an efficient machine
learning model for predicting the next criminal case.
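The three pre-processing steps named above can be sketched as follows. This is a minimal illustration on a hypothetical toy table, not the paper's actual pipeline: the column names (`state`, `year`, `murder`, `theft`, `remarks`) and all values are assumptions made up for the example.

```python
import pandas as pd
from sklearn.preprocessing import LabelEncoder

# Hypothetical toy records; the column names are assumptions, not the paper's schema.
raw = pd.DataFrame({
    "state": ["Kerala", "Bihar", None, "Kerala", "Goa"],
    "year": [2001, 2002, 2003, 2004, 2005],
    "murder": [10, 25, 7, None, 3],
    "theft": [120, 300, 80, 150, 40],
    "remarks": ["a", "b", "c", "d", "e"],  # column assumed to have no predictive use
})

# Feature selection: keep only the columns assumed to be relevant.
selected = raw[["state", "year", "murder", "theft"]]

# Null removal: drop every row that contains a missing value.
cleaned = selected.dropna().reset_index(drop=True)

# Label encoding: map each state name to an integer class id.
encoder = LabelEncoder()
cleaned = cleaned.assign(state_id=encoder.fit_transform(cleaned["state"]))

print(cleaned)
```

`LabelEncoder` assigns ids in sorted order of the labels, so the mapping can be recovered later from `encoder.classes_`.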
INTRODUCTION
At present, the number of criminal cases pending in India is rapidly increasing as
the number of crimes committed increases. Solving a case from a particular set of
data requires thorough internal investigation and analysis [1]. With the amount of
crime data currently available in India, analysing these cases and making decisions
about them is too difficult for officials. Identifying this as a major problem, this
paper concentrates on creating a solution to support decision making about crimes
that are committed. Machine Learning is the branch of science in which computers
learn from data to make decisions with little human intervention. Machine Learning
is now used in various domains; one example is automated or self-driving cars. With
Machine Learning algorithms we can predict certain results from the inputs given
and so provide a way to help solve crime cases in India.
EXISTING SYSTEM
Many researchers have studied the problem of criminal cases remaining unsolved
for long periods and have proposed different crime prediction algorithms. In all
these models the accuracy varies with the data set and with the features or
attributes selected during data pre-processing.
Crime prediction on the Mississippi crime data set, using models such as linear
regression and the decision stump model, gave results of 83%, 88% and 67%
respectively [1].
These prediction accuracies are not directly comparable, because the machine
learning algorithms were applied to data sets from different places with
distinctive features, so the predictions differ from case to case.
DISADVANTAGES OF EXISTING SYSTEM:
going through various such documentations. Nearly all of the crimes are predicted
based on the location and the types of crimes that occur in those areas.
A survey of previous work shows that Linear Regression, Decision Tree and Random
Forest tend to give good accuracy, so these models are used in this paper to predict
crimes.
The dataset used in this paper is from data.world.com. It contains the different
types of crimes committed in India, by state and year [4].
This paper takes types of crimes as input and gives the area in which crimes are
committed as output.
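A minimal sketch of this setup, with crime-type counts as input features and the area as the predicted class, might look like the following. It runs on synthetic stand-in data rather than the paper's data set: the three areas, their typical counts, and the 75/25 split are all assumptions. Only the two tree-based models are shown.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)

# Synthetic stand-in data: three hypothetical areas, each with its own
# typical counts for three crime types (the real data set is not used here).
area_means = {0: [20, 200, 5], 1: [60, 90, 30], 2: [5, 40, 80]}
X_parts, y_parts = [], []
for area, means in area_means.items():
    X_parts.append(rng.poisson(lam=means, size=(100, 3)))
    y_parts.append(np.full(100, area))
X = np.vstack(X_parts)        # inputs: counts per crime type
y = np.concatenate(y_parts)   # output: area label

# Hold out a quarter of the rows for testing, keeping the areas balanced.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

models = {
    "Decision Tree": DecisionTreeClassifier(random_state=0),
    "Random Forest": RandomForestClassifier(n_estimators=100, random_state=0),
}
scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    scores[name] = accuracy_score(y_test, model.predict(X_test))
    print(f"{name}: {scores[name]:.3f}")
```

Accuracy on this toy data says nothing about accuracy on real crime records; the sketch only shows how the models can be trained and compared on the same split.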
The data pre-processing involves data cleaning, feature selection, dropping
HARDWARE REQUIREMENTS
System - Pentium i3
Hard Disk - 500 GB.
Monitor - 14" Colour Monitor.
Mouse - Optical Mouse.
RAM - 4 GB.
SOFTWARE REQUIREMENTS
Operating system - Windows 8/10.
Coding Language - Python
Software - Jupyter
MODULES:
that will be employed by machine learning agents to classify a criminal activity given
a location and date. The training agent suffers from the imbalanced categories of the
dataset; it was able to overcome this problem by oversampling and undersampling the
dataset. This paper presents crime data prediction that takes the types of crimes as
input and gives the area in which these crimes are committed as output. It uses a
Colab notebook with Python as the core language. Python libraries such as Pandas and
NumPy allow the work to be completed faster, and scikit-learn provides the machine
learning algorithms. Results of prediction differ between algorithms; the Random
Forest Classifier performed well, with an accuracy of 95.122%.
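The oversampling and undersampling mentioned above can be illustrated with a short sketch. The source does not say which resampling method was used, so plain random resampling with `sklearn.utils.resample` is an assumption here, and the 90/10 class split is hypothetical.

```python
import numpy as np
from sklearn.utils import resample

# Hypothetical imbalanced labels: 90 samples of class 0, 10 of class 1.
X = np.arange(100).reshape(-1, 1)
y = np.array([0] * 90 + [1] * 10)

X_major, y_major = X[y == 0], y[y == 0]
X_minor, y_minor = X[y == 1], y[y == 1]

# Oversampling: draw minority samples with replacement up to the majority size.
X_minor_up, y_minor_up = resample(
    X_minor, y_minor, replace=True, n_samples=len(y_major), random_state=0
)
X_over = np.vstack([X_major, X_minor_up])
y_over = np.concatenate([y_major, y_minor_up])

# Undersampling: draw majority samples without replacement down to the minority size.
X_major_down, y_major_down = resample(
    X_major, y_major, replace=False, n_samples=len(y_minor), random_state=0
)
X_under = np.vstack([X_major_down, X_minor])
y_under = np.concatenate([y_major_down, y_minor])

print(np.bincount(y_over))   # class counts after oversampling
print(np.bincount(y_under))  # class counts after undersampling
```

Resampling is normally applied to the training split only, so that the test set keeps the original class proportions.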
REFERENCES
McClendon, Lawrence, and Natarajan Meghanathan. "Using machine learning
algorithms to analyze crime data." Machine Learning and Applications: An
International Journal (MLAIJ) 2.1 (2015): 1-12.
Bharati, Alkesh, and Sarvanaguru RA. K. "Crime Prediction and Analysis Using
Machine Learning." International Research Journal of Engineering and Technology
(IRJET) 5.9 (September 2018).