Crime Data Analysis Using ML


Presented by
KANDULA SAI
22A11F00D2

Under the esteemed guidance of


M.ANJINEYULU
Associate Professor

NEWTONS INSTITUTE OF ENGINEERING


(Approved by AICTE, New Delhi & Affiliated to JNTU Kakinada)
MACHERLA-522426,A.P.,INDIA.
ABSTRACT
Criminal cases in India are increasing rapidly, and the backlog of pending cases is growing with
them. This continuous increase makes cases difficult to classify and to solve. Recognizing the
patterns of criminal activity in a place is important for preventing crime there, and crime-solving
agencies can work more effectively when they understand the patterns of criminal activity in a
particular area. Machine learning can support this by applying different algorithms to find those
patterns. This paper uses a crime data set to predict the types of crimes in a particular area, which
helps speed up the classification of criminal cases so that agencies can proceed accordingly. The data
covers the past 18 years and was collected from various trusted sources. Data pre-processing is as
important as the final prediction; this paper uses feature selection, removal of null values and label
encoding to clean and prepare the data. This research gives an efficient machine learning model for
predicting the next criminal case.
INTRODUCTION

• At present, the number of pending criminal cases in India is rising rapidly as the
number of crimes committed increases. Solving a case from the available data
requires thorough internal investigation and analysis [1]. Given the amount of
crime data now available in India, analysing these cases and making decisions
about them is too difficult for officials. Identifying this as a major problem, this
paper concentrates on creating a solution that supports decision making about
committed crimes. Machine Learning is a branch of computer science in which
systems learn from data and make decisions with minimal human intervention.
It is now used in many domains; self-driving cars are one example. Machine
Learning algorithms can predict outcomes from the inputs they are given, and so
can help in solving crime cases in India.
EXISTING SYSTEM
• Many researchers have studied the problem of criminal cases remaining unsolved
for long periods, and they have proposed different crime prediction algorithms.
In all these models, the accuracy varies depending on the data set and on the
features or attributes selected during data pre-processing.
• In crime prediction on the Mississippi crime data set, models such as Linear
Regression and Decision Stump gave results of 83%, 88% and 67% [1].
• These accuracies do not carry over directly to other settings, because the machine
learning algorithms were applied to data sets from different places with distinctive
features, so the predictions differ from case to case.
DISADVANTAGES OF EXISTING SYSTEM:

• With these methods, the prediction accuracy was comparatively low.


PROPOSED SYSTEM
• The proposed system is based on a survey of previous research work. Nearly all
of those works predict crimes based on the location and the types of crimes
occurring in those areas.
• In previous works, Logistic Regression, Decision Tree and Random Forest tend
to give good accuracy, so these models are used in this paper to predict crimes.
• The dataset used in this paper is from data.world.com. It contains the different
types of crimes committed in India, listed by state and year [4].
• This paper takes the types of crimes as input and gives the area in which the
crimes are committed as output.
• Data pre-processing involves data cleaning, feature selection, dropping null
values, and data scaling by normalizing and standardizing. After pre-processing,
the data is free of null values, which may otherwise alter the accuracy of the
model significantly, and feature selection keeps only the required features so
that irrelevant ones do not affect the accuracy of the model.
• After data pre-processing, the chosen models, i.e., Logistic Regression,
Decision Tree and Random Forest, are trained by splitting the data into train
and test sets. As the required output is a categorical value, classification
models are used. Python is used for the prediction.
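
The pre-processing and training steps listed above can be sketched roughly as follows. This is only an illustrative sketch, not the code used for the reported results: the toy table, the column names (`area`, `murder`, `theft`) and the helper names `preprocess` and `evaluate_models` are assumptions made for the example, and feature selection is left out for brevity.

```python
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelEncoder, StandardScaler
from sklearn.tree import DecisionTreeClassifier

# Hypothetical toy table, NOT the real dataset: yearly crime counts per area.
raw = pd.DataFrame({
    "area":   ["A"] * 4 + ["B"] * 4 + ["C"] * 4 + ["A", None],
    "murder": [10, 12, 11, 9, 40, 42, 38, 41, 70, 72, 69, 71, None, 5],
    "theft":  [200, 210, 190, 205, 90, 95, 88, 92, 20, 25, 18, 22, 198, 50],
})


def preprocess(df, target="area"):
    """Drop rows with null values and label-encode the target column."""
    clean = df.dropna().reset_index(drop=True)
    encoder = LabelEncoder()
    y = encoder.fit_transform(clean[target])
    X = clean.drop(columns=[target])
    return X, y, encoder


def evaluate_models(X, y, seed=0):
    """Train three classifiers on a stratified split; return test accuracy."""
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.25, stratify=y, random_state=seed
    )
    # Fit the scaler on the training split only, then reuse it on the test split.
    scaler = StandardScaler().fit(X_train)
    X_train, X_test = scaler.transform(X_train), scaler.transform(X_test)

    models = {
        "Logistic Regression": LogisticRegression(max_iter=1000),
        "Decision Tree": DecisionTreeClassifier(random_state=seed),
        "Random Forest": RandomForestClassifier(n_estimators=50, random_state=seed),
    }
    scores = {}
    for name, model in models.items():
        model.fit(X_train, y_train)
        scores[name] = accuracy_score(y_test, model.predict(X_test))
    return scores


X, y, encoder = preprocess(raw)
scores = evaluate_models(X, y)
for name, acc in scores.items():
    print(f"{name}: {acc:.3f}")
```

Accuracies on such a small toy table say nothing about the real data; the sketch only shows the order of the steps (clean, encode, split, scale, train, score).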
ADVANTAGES OF PROPOSED SYSTEM:

• The proposed prediction methodology achieved high accuracy.


SYSTEM ARCHITECTURE
HARDWARE REQUIREMENTS

• System - Pentium i3
• Hard Disk - 500 GB
• Monitor - 14" Colour Monitor
• Mouse - Optical Mouse
• RAM - 4 GB
SOFTWARE REQUIREMENTS
• Operating System - Windows 8/10
• Coding Language - Python
• Software - Jupyter
MODULES:

• Data Acquisition: Upload the URL data from the local host.
• Data Preprocessing: In this module, we perform label encoding and convert
the text data into token counts. To quantify a word in the documents, we
compute a weight for each word that signifies its importance in the document
and in the corpus.
• Splitting: In this module, we split the data into train and test data. x_train
and y_train are the data used to fit the machine learning model. Once the
model is created, it is given x_test as input, and its output should match
y_test. The closer the model output is to y_test, the more accurate the
model is.
• Modelling: In this module, we apply CNN-LSTM and CNN-BiLSTM to the
URL text, and we apply the machine learning algorithms to the features of
the URL.
• Comparison: Visualize the accuracies of the different models.
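
The token counts and word weights described in the Data Preprocessing module can be illustrated with a small worked example. The sketch below assumes the textbook TF-IDF weighting (term frequency times log of inverse document frequency); the slides do not state which formula was used, and the sample sentences and function names are hypothetical.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it on whitespace."""
    return text.lower().split()


def token_counts(docs):
    """Return one Counter of token counts per document."""
    return [Counter(tokenize(doc)) for doc in docs]


def tf_idf(docs):
    """Weight each token by tf * log(N / df), the textbook TF-IDF form."""
    counts = token_counts(docs)
    n_docs = len(docs)
    # Document frequency: number of documents that contain each token.
    doc_freq = Counter()
    for count in counts:
        doc_freq.update(count.keys())
    weights = []
    for count in counts:
        total = sum(count.values())
        weights.append({
            token: (n / total) * math.log(n_docs / doc_freq[token])
            for token, n in count.items()
        })
    return weights


# Hypothetical sample documents, used only for illustration.
docs = [
    "theft reported in market",
    "theft reported near station",
    "assault reported near market",
]
weights = tf_idf(docs)
for doc, doc_weights in zip(docs, weights):
    print(doc, {t: round(w, 3) for t, w in doc_weights.items()})
```

A word that occurs in every document, such as "reported" here, gets weight zero, while a word unique to one document gets the largest weight. Library implementations may use a smoothed variant of this formula, so their numbers can differ.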
USE CASE DIAGRAM:
SAMPLE RESULTS:
CONCLUSION
It is clear that basic details of criminal activities in a neighbourhood contain indicators
that machine learning agents can use to classify a criminal activity given a location and
date. The training agent suffers from imbalanced categories in the dataset; it was able
to overcome the problem by oversampling and under-sampling the dataset. This paper
presents crime data prediction that takes the types of crimes as input and gives the
area in which these crimes are committed as output, using a Colab notebook with
Python as the core language. Python provides libraries such as Pandas and NumPy,
through which the work is completed faster, and scikit-learn provides the
pre-processing and modelling routines used. Results of prediction differ between
algorithms, and the Random Forest Classifier was found to perform well, with an
accuracy of 95.122%.
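
The oversampling and under-sampling mentioned above can be sketched as plain random resampling. The slides do not say which resampling method was used, so the function below is an assumption for illustration: it duplicates samples of small classes and randomly drops samples of large classes until every class has `target` samples. The name `resample_balanced` and the toy labels are hypothetical.

```python
import random
from collections import Counter, defaultdict


def resample_balanced(rows, labels, target, seed=0):
    """Return rows and labels in which every class has exactly `target` samples."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for row, label in zip(rows, labels):
        by_class[label].append(row)

    out_rows, out_labels = [], []
    for label, members in by_class.items():
        if len(members) >= target:
            # Under-sample: keep a random subset, without replacement.
            chosen = rng.sample(members, target)
        else:
            # Over-sample: keep all members and add random duplicates.
            chosen = members + rng.choices(members, k=target - len(members))
        out_rows.extend(chosen)
        out_labels.extend([label] * target)
    return out_rows, out_labels


# Hypothetical imbalanced toy data: row ids with their crime-type labels.
rows = list(range(9))
labels = ["theft"] * 6 + ["assault"] * 2 + ["fraud"] * 1

balanced_rows, balanced_labels = resample_balanced(rows, labels, target=3)
print(Counter(labels))
print(Counter(balanced_labels))
```

Resampling of this kind is usually applied to the training split only, so that duplicated samples do not leak into the test data.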
REFERENCES
• McClendon, Lawrence, and Natarajan Meghanathan. "Using machine learning
algorithms to analyze crime data." Machine Learning and Applications: An
International Journal (MLAIJ) 2.1 (2015): 1-12.
• Bharati, Alkesh, and Sarvanaguru RA. K. "Crime Prediction and Analysis Using
Machine Learning." International Research Journal of Engineering and Technology
(IRJET) 5.9 (2018).
• McClendon, Lawrence, and Natarajan Meghanathan. "Using machine learning
algorithms to analyze crime data." Machine Learning and Applications: An
International Journal (MLAIJ) 2.1 (2015): 1-12.
• Chen, Hsinchun, et al. "Crime data mining: a general framework and some
examples." Computer 37.4 (2004): 50-56.
THANK YOU
