Machine Learning Crime Prediction Models and the Gap Between Research
and Implementation: A Systematic Review
Ricardo Huamantingo
Faculty of Systems and Informatics Engineering, Universidad Nacional Mayor de San Marcos, Lima, Peru,
[email protected]
Miguel Cano-Lengua
Faculty of Systems and Informatics Engineering, Universidad Nacional Mayor de San Marcos, Lima, Peru.
Ciro Rodriguez
Faculty of Systems and Informatics Engineering, Universidad Nacional Mayor de San Marcos, Lima, Peru
Part of the Biology Commons, Chemistry Commons, Computer Sciences Commons, and the Physics Commons
Recommended Citation
Huamantingo, Ricardo; Cano-Lengua, Miguel; and Rodriguez, Ciro (2025) "Machine Learning Crime Prediction Models
and the Gap Between Research and Implementation: A Systematic Review," Karbala International Journal of Modern
Science: Vol. 11 : Iss. 3 , Article 11.
Available at: https://doi.org/10.33640/2405-609X.3419
This Review Article is brought to you for free and open access by
Karbala International Journal of Modern Science. It has been
accepted for inclusion in Karbala International Journal of
Modern Science by an authorized editor of Karbala International
Journal of Modern Science. For more information, please
contact [email protected].
Abstract
A crime is an illegal or violent act committed by one individual against another. The increasing crime rate
has become a major concern as it negatively affects people's quality of life and generates significant
social and economic costs. This study aims to identify the most widely used machine learning (ML)
models for crime prediction, determine evaluation metrics for assessing model performance, and analyze
key data characteristics to enhance real-world implementation. The study follows the Preferred Reporting
Items for Systematic Reviews and Meta-Analyses (PRISMA) methodology. A search string was
formulated using the population, intervention, comparison, and outcomes (PICO) framework and applied
to the Scopus and Web of Science databases. After applying eligibility criteria, 50 articles were selected for
in-depth analysis. The findings indicate that the most prominent ML models include extreme gradient
boosting (XGBoost), random forest (RF), gradient boosting decision trees (GBDT), and auto-regressive
integrated moving average (ARIMA), as well as deep learning models such as long short-term memory
(LSTM), which showed high performance in dynamic urban environments. The most relevant metrics for
classification are accuracy, recall, F1-score, precision, and area under the curve (AUC), while for
regression, mean absolute error (MAE), root mean squared error (RMSE), and R-squared (R²) are
preferred. Key data features include date, time, age, gender, education level, location, and coordinates.
Additionally, integrating climate and temperature data is recommended. This study provides a structured
analysis of crime prediction models and proposes an architecture for their development and deployment,
offering valuable insights for future research and practical applications.
Keywords
Machine learning, deep learning, supervised learning, feature selection, classification and regression,
predictive models, predictive analytics, crime prediction.
This work is licensed under a Creative Commons Attribution-Noncommercial-No Derivative Works 4.0
License.
This review article is available in Karbala International Journal of Modern Science: https://kijoms.uokerbala.edu.iq/home/vol11/iss3/11
REVIEW ARTICLE
a Faculty of Systems and Informatics Engineering, Universidad Nacional Mayor de San Marcos, Lima, Peru
b Faculty of Systems and Informatics Engineering, Universidad Tecnológica del Perú, Lima, Peru
* Corresponding author at: Faculty of Systems and Informatics Engineering, Universidad Nacional Mayor de San Marcos, Lima, Peru.
E-mail address: [email protected] (R. Huamantingo).
https://doi.org/10.33640/2405-609X.3419
2405-609X/© 2025 University of Kerbala. This is an open access article under the CC-BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).
R. Huamantingo et al. / Karbala International Journal of Modern Science 11 (2025) 471–488
culture, education, employment, traditional beliefs, and legal systems often influence the distribution of crime in urban areas. The presence of crime negatively affects quality of life and contributes to broader social challenges, imposing significant costs on both public and private sectors [7].
In this context, crime in India has seen a marked increase over time. According to data from the National Crime Records Bureau (NCRB), an average of 8837 offenses under the Indian Penal Code (IPC) were reported daily in 2019. In 2020, IPC case filings increased by more than 430% compared to the same period the previous year. In 2022, violent crimes included 28,522 murders and 107,588 cases of kidnapping and abduction, with crime rates of 66.4 per 100,000 women and 36.6 per 100,000 children, reflecting a persistent rise in criminal activity nationwide [8–10]. In England and Wales, police-recorded crime rose by 13%, particularly in offenses involving violence, weapons, sexual assaults, and personal attacks [11]. Additionally, recidivism significantly impacts public safety and incarceration costs. A 2016 study in England and Wales estimated recidivism-related costs at £18.1 billion, with international estimates placing the recidivism rate at approximately 50% [12]. According to a United Nations report, although crime has declined in some developed regions, such as North America and Western Europe, it has risen in areas like Africa and Latin America. Notably, 60% of urban residents in developing countries have been victims of crime, with some cities in these regions reporting victimization rates as high as 70% over five years [13].
Crime is closely linked to socioeconomic factors, particularly unemployment, creating a cycle in which people with criminal records face employment restrictions, which in turn increases crime rates [14]. Urban crime imposes significant economic and social burdens, necessitating policies that enhance public safety [7,15]. Law enforcement and policymakers are focusing on crime prevention by optimizing resources and addressing the drivers of crime, including politics, economics, education, and demographics [3]. Furthermore, these factors are being investigated in Malaysia, using crime data from the Royal Malaysian Police and the Meteorological Department from 2011 to 2020 [16]. Crime also influences personal and business decisions, impacting relocation, travel, and security investments. Consequently, governments, businesses, and individuals must allocate significant resources to law enforcement, judicial procedures, and other security mechanisms [11]. Addressing these challenges requires data-driven solutions for effective crime prevention.
With the continued rise in crime rates, it is crucial to address the associated problems immediately. The accelerated increase in criminal activity necessitates that law enforcement and justice agencies implement effective measures to control and reduce it. Despite the abundance of data available, predicting crimes and identifying those responsible remains a significant challenge for police departments [16]. As [17] points out, crime is difficult to predict because it can occur at any time and place, posing a significant challenge to society.
In recent years, interest in applying mathematical and statistical methods for crime prediction has grown considerably, enabling more proactive policing strategies [3]. Reliable data combined with robust statistical models provide critical insights that improve the allocation of police resources, as demonstrated in Rio de Janeiro's Public Security Secretariat, where crime patterns are analyzed to optimize police deployment [18]. Given that crime rates and types vary significantly across regions, this remains a persistent global challenge [5,16]. To address these complexities, machine learning has emerged as a valuable tool, strengthening law enforcement's capacity to anticipate and prevent criminal activity [8].
Crime prediction has evolved from basic analysis methods to sophisticated approaches incorporating machine learning and deep learning. These models integrate crime data with demographic, economic, and social information to generate more precise predictions, allowing for a deeper understanding of crime patterns and more effective prevention strategies [19]. However, the effectiveness of crime prevention efforts may be compromised when applied rigidly, without taking into account the unique dynamics of different urban settings [20].
Multiple factors influence crime rates and evolve over time, making data-driven approaches essential. Algorithms such as simple linear regression (SLR), multiple linear regression (MLR), decision tree regression (DTR), support vector regression (SVR), and random forest regression (RFR) are widely used to estimate crime incidents [8]. Additionally, time series techniques like ARIMA and SARIMA have proven effective in forecasting crime based on historical data [21]. Spatiotemporal prediction is also critical, as it enables the identification of potential crime hotspots by integrating historical data, environmental variables, and demographic characteristics through statistical and machine learning methods [22]. These models not only enhance crime hotspot mapping but also support targeted, data-driven interventions [4].
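As a minimal sketch of the simplest regression baseline named above (SLR), the following example fits a least-squares trend to monthly incident counts and scores a held-out period with MAE, RMSE, and R². The function names and the monthly counts are illustrative assumptions, not data or code from any reviewed study.

```python
import math


def fit_simple_linear_regression(x, y):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def regression_metrics(y_true, y_pred):
    """Return MAE, RMSE, and R-squared for paired observations and predictions."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / n
    ss_res = sum(e * e for e in errors)
    rmse = math.sqrt(ss_res / n)
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    r2 = 1.0 - ss_res / ss_tot
    return {"mae": mae, "rmse": rmse, "r2": r2}


# Hypothetical monthly incident counts (illustrative values only).
months = list(range(1, 13))
incidents = [102, 98, 110, 115, 121, 119, 130, 128, 135, 140, 138, 147]

# Fit on the first nine months, evaluate on the last three.
intercept, slope = fit_simple_linear_regression(months[:9], incidents[:9])
predicted = [intercept + slope * m for m in months[9:]]
scores = regression_metrics(incidents[9:], predicted)
print(scores)
```

Holding out the most recent months, rather than a random sample, keeps the evaluation closer to how a forecast would be used in practice. With so few test points, R² can be unstable or negative, so MAE and RMSE are the more informative figures in this toy setting.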
Nowadays, obtaining and analyzing crime data is essential. Crime data can be characterized by several factors, such as the location of the event and the time when the crime was detected; predicting how these factors relate in the future is essential in a crime prevention system. According to [23], time and place are crucial in identifying crime patterns. Machine learning techniques enable the extraction of valuable information from collected datasets and the identification of relationships between crime, place, and time. Many researchers have pointed out that detecting crime patterns is a critical and time-consuming task, which [23] suggests can be solved by machine learning techniques.
However, despite these promising advances, the application of machine learning to crime prediction in real-world contexts faces significant challenges. These include limited access to quality data, biases in datasets and algorithms, and ethical concerns regarding surveillance and privacy. A critical limitation also lies in the lack of understanding of the contextual realities and data structures within institutional systems. Technical expertise alone is insufficient; effective modeling requires domain knowledge of how and why the data were generated. These barriers help explain the persistent gap between academic research and practical implementation in public safety systems.
This study arises from the need to identify the most recent advances in crime prediction. It seeks to identify advances in machine learning, real-world cases treated, the results obtained, and the difficulties and limitations in this thematic area. In addition, the results obtained have important implications in the academic world, particularly in decision-making that relies on evidence and updated, rigorously examined information, as well as in state agencies and institutions responsible for public security, policymakers, and justice administration entities. This study is structured in sections. Section 2 is presented below, where the methodology and process used to obtain the inputs for the systematic review are described. In Section 3, the results obtained are presented, along with a proposal resulting from the reviews and the respective discussions. Finally, the conclusions and future work are given in Section 4.
2. Method
This article adopts the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) methodology to ensure a rigorous and transparent systematic review process. This approach provides a structured guide that allows researchers to reduce bias and communicate results more clearly and comprehensively, facilitating replicability and usefulness for the scientific community and other interested users [24,25]. The PRISMA methodology was implemented to identify, select, and analyze relevant articles, ensuring compliance with defined inclusion and exclusion criteria. This structured approach enables an objective and reliable assessment of the available literature, thereby contributing to the quality and accuracy of the results obtained in research on crime prediction using machine learning.
2.1. Research questions
Today, emerging technologies such as machine learning are significantly transforming strategies for crime prediction and prevention [19]. This article conducts a systematic review of the literature, employing the PRISMA methodology to assess the effectiveness of predictive models in this field. Tables 1 and 2 also present a detailed breakdown of
the research questions formulated under the PICO framework (problem, intervention, comparison, and outcome), providing a clear structure for the analysis.
2.2. Search strategy
The PICO approach was used as a search strategy, facilitating the construction and organization of the search string based on its key components and the questions outlined in Tables 1 and 2. Table 3 presents the details of the elements that make up this framework, along with the selected keywords and terms used in formulating the search string.
The main search string is built once the keywords have been identified. First, the Boolean operator “OR” joins the keywords within each component; then “AND” joins the PICO components according to their coding {(PRO) AND (INT) AND (COM) AND (OUT)}. The resulting search strings are:
For Scopus:
TITLE-ABS-KEY ((“crime prediction” OR “crime detection” OR “prediction of crime” OR “crime prevention” OR “crime forecasting” OR “crime analytics” OR “crime analysis” OR “crime rate”) AND (“machine learning” OR “learning machine” OR “transfer learning” OR “unsupervised learning” OR “supervised learning” OR “reinforcement learning” OR “neural network*” OR “deep learning” OR “artificial intelligence”) AND (“feature” OR “data set” OR “dataset” OR “data features” OR “attributes” OR “variables” OR “data collection”) AND (“metrics” OR “evaluation metrics” OR “model performance” OR “precision” OR “accuracy” OR “recall” OR “F1-score” OR “AUC” OR “ROC curve” OR “confusion matrix” OR “mean squared error” OR “MSE” OR “mean absolute error” OR “RMSE” OR “cross-validation”)).
For Web of Science:
TS=((“crime prediction” OR “crime detection” OR “prediction of crime” OR “crime prevention” OR “crime forecasting” OR “crime analytics” OR “crime analysis” OR “crime rate”) AND (“machine learning” OR “learning machine” OR “transfer learning” OR “unsupervised learning” OR “supervised learning” OR “reinforcement learning” OR “neural network*” OR “deep learning” OR “artificial intelligence”) AND (“feature” OR “data set” OR “dataset” OR “data features” OR “attributes” OR “variables” OR “data collection”) AND (“metrics” OR “evaluation metrics” OR “model performance” OR “precision” OR “accuracy” OR “recall” OR “F1-score” OR “AUC” OR “ROC curve” OR “confusion matrix” OR “mean squared error” OR “MSE” OR “mean absolute error” OR “RMSE” OR “cross-validation”)).
2.3. Eligibility criteria
Inclusion and exclusion criteria were applied to ensure a comprehensive and systematic review of the literature. These criteria helped guarantee that the selected studies were relevant, high-quality, and aligned with the research objectives. Table 4 outlines the specific parameters used to select or
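The search-string construction described in Section 2.2 (OR within each PICO component, AND across components, then a database-specific field tag) can be sketched as follows. The helper names `or_group` and `build_query` are hypothetical, and the keyword lists are abbreviated for readability; the complete lists are those given in the search strings in Section 2.2.

```python
def or_group(terms):
    """Join one PICO component's keywords with OR and wrap them in parentheses."""
    return "(" + " OR ".join(f'"{term}"' for term in terms) + ")"


def build_query(components, field_prefix):
    """AND the OR-groups together and wrap them in a database field tag."""
    return f"{field_prefix}(" + " AND ".join(or_group(c) for c in components) + ")"


# Abbreviated keyword lists, coded PRO / INT / COM / OUT as in the text.
problem = ["crime prediction", "crime forecasting"]
intervention = ["machine learning", "deep learning"]
comparison = ["feature", "dataset"]
outcome = ["metrics", "accuracy"]

pico = [problem, intervention, comparison, outcome]
scopus_query = build_query(pico, "TITLE-ABS-KEY ")
wos_query = build_query(pico, "TS=")
print(scopus_query)
print(wos_query)
```

Generating both strings from one keyword table keeps the Scopus and Web of Science queries identical apart from the field tag, which makes the search easier to reproduce and audit.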
short-term memory (LSTM) and deep neural network (DNN) are the most widely used and best-performing models in this group. Finally, a group of hybrid models developed by the authors in their respective research has been identified. Regarding this, it is necessary to generate experimental environments with similar data sets and new characteristics in the data to test and validate the effectiveness of these models in various scenarios. The above information can be found in Table 6, where the most used models are listed in order of performance. Table 7 also illustrates the techniques employed to enhance the performance of the models.
B. RQ2. What evaluation metrics do machine learning models use in crime prediction?
Table 8 presents the most used evaluation metrics and their frequency across the reviewed articles. It also includes a column that distinguishes whether the metric is used for classification (C) or regression (R), as well as another indicating the highest performance value achieved and the model that obtained it. In addition to the metrics listed in Table 8, other relevant measures were identified in specific studies, such as the Pearson Correlation Coefficient (PCC), Predictive Accuracy Index of Raster (PAIR) [60], execution time, memory usage, training and testing time [5], Predictive Accuracy Index (PAI), Predictive Efficiency Index (PEI), Recapture Rate Index (RRI) [37], Jensen–Shannon Divergence (JSD), and Total Variational Distance (TVD) [55].
Fig. 3 provides a visual summary of the top-performing models, as evaluated by the metrics reported across the reviewed studies. Each point represents a model and the metric where it achieved its highest performance. The color indicates the model type (machine learning, deep learning, or hybrid), while the marker shape distinguishes whether the model is among the most frequently
used in recent literature. This visualization allows for a quick comparison of model effectiveness and popularity, highlighting both widely adopted and high-performing emerging approaches.
Table 7. Techniques for model performance.
Techniques | Authors
Hyperparameter tuning | [16,29,30,36,40,43,48,51]
Min-max normalization | [30,35,55]
Factorial Analysis of Mixed Data (FAMD) | [30,34]
Principal Component Analysis (PCA) | [30,31,34]
Multiple Correspondence Analysis (MCA) | [30]
Over-sampling – SMOTE | [30,40,51,76]
Under sampling | [40]
Data scaling | [40]
Shapley Additive exPlanation (SHAP) | [11,36,52,74]
Stacking (STK) | [38,59]
Adaptive pretraining | [57]
Adversarial training strategies and focal loss to balance imbalanced classes | [57]
Lasso regression | [73]
C. RQ3. What data features are most relevant to improving the accuracy of machine learning models in crime prediction?
The performance of machine learning models in crime prediction depends mainly on the characteristics, variables, or attributes considered in the datasets used. A fundamental aspect is the volume of data, since an extensive dataset allows for capturing more representative patterns, reducing bias, and increasing the model's ability to generalize. In addition, the quality of the dataset is crucial, with data from reliable sources such as police departments, government entities, and open data platforms being preferred.
In Table 9, the most relevant characteristics used in the reviewed articles can be observed. This table shows the city and origin from which the data were obtained, the type of crime, the characteristics, and the number of records in the dataset. The characteristics that stand out the most are the date and time, geographic location (including coordinates, latitude, and longitude), and characteristics related to the crime, the offender, and the victim; interesting characteristics such as climate and temperature have also been included. These variables provide essential context for identifying complex correlations and existing patterns in criminal activity. Additionally, including broad and recent time ranges facilitates the detection of historical and seasonal trends, thereby increasing the predictive capacity of the models. In summary, optimal datasets for predicting crimes combine volume, quality, diversity of variables, and adequate temporal coverage, which enables the replication of successful studies and advances in the accuracy of crime prediction across different contexts.
D. RQ4. In what real-life context have machine learning based crime prediction models been implemented, and what have been the main challenges?
Although machine learning models designed for crime prediction often show promising performance in scientific studies, the review of the selected articles suggests that most of these models do not transcend the academic field. Therefore, although the results achieved are useful, they are rarely implemented in real systems or integrated into operational applications for institutions such as police forces or government agencies. This phenomenon highlights a significant gap between theoretical research and applied practice, where models remain experimental and fail to have a tangible impact on decision-making or crime prevention in real-world contexts.
However, it is worth mentioning the effort made by Ref. [49], who created a dashboard using HTML, CSS, and JavaScript to display interactive graphics and other analyses in an organized manner. This
implies a clear intention and a further step to put these models in a real-world context. On the other hand, [57] suggests integrating machine learning models in judicial assistance systems and smart courts. It is essential to clarify that the technologies and frameworks used include programming languages such as Java and Python, as well as the scikit-learn library, along with RapidMiner Auto Model, among other tools [29–31,35,43,49]. Furthermore, the model in [68] was applied to a real-life case using daily data from Chicago's 22 police districts and proved useful in supporting operational decisions such as patrolling, resource allocation, and police investigation. However, no mention is made of direct practical implementation by a functioning police institution [68]. Also, the algorithm proposed in [76] is applied in a real-time multi-drone patrol context to respond to predicted crimes, assisting in patrol planning and response to high-risk areas.
Table 9. Data features/dataset origin.
City | Origin | Type of Crime | Features | Volume | Articles
Malaysia | Police | Theft, robbery | Location, year, month, temperature, and humidity | – | [16]
Ethiopia | Police | Multiple | The age of the offender, the education status, the job, the victim's sex, the age of the victim, and the place | 1600 | [29]
Turkey | Police | Robbery | Location of each crime, age, and sex of the offender | 2236 | [35]
United Kingdom | Police and Twitter | Diverse crime type | Crime type, location, date, latitude, longitude | – | [11]
Chicago | Open data | Violent crimes, property, drugs, theft, and assault | Type of crime, location (longitude, latitude), community area, and time of crime. Day of the week, holiday. | 7 million; 404269 | [51,68]
Ukraine | Criminal records | Recidivism/Reoffending | Age at first conviction, number of actual convictions, | 12000 | [44]
3.1. Main findings
The systematic review identified that the most widely used models in crime prediction are XGBoost and Random Forest, due to their high performance in classification and regression. The increasing use of deep learning models, such as LSTM and DNN, is also noted, particularly in hybrid approaches. Additionally, an emerging trend is the application of reinforcement learning, which enables dynamic adaptation to changing crime patterns. In terms of evaluation, the most widely used metrics include accuracy, AUC, F1-score, precision, and recall in classification, and R², MAE, MSE, and RMSE in regression, with complementary metrics such as PAI, PEI, and PCC in specific studies. The most relevant variables for prediction include the date and time of the crime, geographic location (latitude and longitude), as well as contextual factors such as climate and temperature, highlighting the importance of broad temporal data in detecting historical patterns. Among the main gaps identified are the lack of standardization in metrics and variables, the need to integrate heterogeneous data sources, and the limited interpretability of advanced models. Future research is recommended to explore more explanatory hybrid models, optimize spatial data, and implement reinforcement learning techniques to enhance adaptability and accuracy in dynamic environments. Deployment architectures and integration of these machine learning models should also be considered.
Despite the promising results obtained in experimental settings, evaluating these models in real-world scenarios poses significant challenges. Many studies highlight issues such as data imbalance, missing records, and limited access to high-quality institutional datasets, which hinder consistent validation. Furthermore, the lack of standardization in metrics and input variables across studies complicates direct comparisons and real-life applicability. These factors underscore the need for robust, context-aware evaluation methods that account for operational constraints and local variability in crime reporting and data collection.
3.2. Machine learning techniques
Machine learning techniques enhance the accuracy of crime prediction through several strategic approaches. The integration of diverse data sources, such as historical crime records, geospatial variables, weather conditions, and social media, enriches datasets and enhances model performance [23,47,62]. Algorithms like XGBoost, Random Forest, LSTM, and DNN are particularly effective in identifying hidden patterns and capturing non-linear relationships among variables.
Moreover, real-time data processing enabled by scalable architectures and frameworks, such as Machine Learning Operations (MLOps) [63,64], allows for continuous model updates and deployment. This adaptability ensures that predictions remain responsive to the evolving dynamics of crime. Reinforcement learning further supports adaptive learning by enabling models to adjust to new patterns without constant manual retraining. Advanced feature selection and engineering techniques, including dimensionality reduction and spatial embeddings, improve data representation and model interpretability. Collectively, these strategies enhance the precision and practicality of crime prediction models, offering valuable tools for public safety decision-making [48].
3.3. About the bibliometric analysis
The bibliometric analysis was performed using VOSviewer and R software, along with its Bibliometrix and Biblioshiny libraries, which have proven effective in these studies. The search string obtained 530 articles related to the research topic. Fig. 4 shows the co-occurrence of key terms in the scientific literature, grouping them into five thematic clusters: (i) traditional machine learning techniques (decision trees, random forest, XGBoost) for crime prediction; (ii) deep learning models (convolutional neural networks, computer vision) in crime detection; (iii) classification and feature extraction approaches in data mining; (iv) predictive analysis and spatial analytics in smart cities and law enforcement; and (v) hybrid methodologies and emerging techniques (contrastive learning, artificial neural networks). This representation highlights current trends and identifies key areas for future contributions in crime analysis.
On the other hand, Fig. 5 illustrates the temporal evolution of key topics. Terms in blue, such as data mining and crime detection, represent the foundational aspects of the field. In contrast, terms in yellow, including XGBoost, LSTM, and reinforcement learning, indicate emerging areas of recent interest.
Fig. 6 shows that different countries are interested in researching the topic addressed in this review.
3.4. About the selection
After the article selection process, 50 articles were retained for in-depth review. These articles come from different years of publication and different quartiles. Fig. 7 shows the growing trend of the study topic. Fig. 8 illustrates the number of articles selected according to their quartiles, providing a more precise measure of the research quality and relevance, with 46% falling into Q1.
3.5. Proposed model
Although crime prediction models have demonstrated high accuracy in experimental settings, their
real-world impact depends on successful integration into institutional systems. These models offer valuable potential for optimizing the allocation of police resources by forecasting high-risk areas and periods [77]. However, without mechanisms for explainability and continuous adaptation, their practical utility may diminish over time [78]. These studies highlight the importance of integrating predictive models into public safety frameworks in a dynamic and sustained manner.
Despite significant advances in crime prediction through deep learning, a critical gap remains between experimental success and real-world implementation. While Shan et al. [79] report high predictive accuracy in datasets from New York, Chicago, and CN-County, they acknowledge the challenge of generalizing to noisy and dynamic urban environments beyond controlled settings. Likewise, Gerards and Hashemighouchani [80] emphasize that most AI-based crime prediction systems remain within academic boundaries and urge that the research-to-practice divide be overcome. Both studies agree that bridging this gap requires not only robust models but also their integration into operational law enforcement systems to ensure real-world impact and reliability.
As a synthesis of the study's findings, an architecture for developing and deploying machine learning models for crime prediction is proposed (see Fig. 9). The first phase involves understanding the data. This implies understanding the business to comprehend the database, including its tables and columns, particularly when working directly with the source of information. If the research is conducted using provided datasets, it is essential to ensure a thorough understanding of each dataset attribute directly with the individuals who provide this input. The second phase involves data preprocessing, which includes data cleaning, removing duplicates, handling outliers, selecting relevant attributes, and transforming data as necessary [8,64].
The third phase involves modeling, which includes the feature engineering sub-phase, where techniques such as data scaling, principal component analysis (PCA), data regularization, and class balancing may be applied. During the model training sub-phase, the algorithms identified in this review are utilized. This is followed by the evaluation sub-phase, where model performance is assessed using the metrics outlined in the study. If the results do not meet performance expectations, the training cycle is repeated with hyperparameter tuning until satisfactory accuracy is achieved. The fourth phase focuses on deployment, where the selected model is brought into production. This
stage must include performance monitoring to automatically verify whether the model maintains efficiency or begins to degrade. To address this, a continuous training pipeline should be implemented to automate the retraining process and maintain performance over time [8,43,63,64].
The integration of MLOps (Machine Learning Operations) into crime prediction and detection systems represents a significant advancement in automating and optimizing the continuous deployment of models. This study proposes a systematic framework that spans from data acquisition and preprocessing to model development, training, deployment, and monitoring, ensuring adaptability to evolving crime patterns. One of the key challenges lies in automating monitoring and retraining processes, an area where MLOps plays a pivotal role by supporting scalability, reproducibility, and sustained model maintenance. Ultimately, this phase highlights the significance of addressing ethical considerations, ensuring transparency, fairness, and data privacy in the deployment of machine learning technologies within the public safety domain [63,64].
The use of MLOps in machine learning is essential, as it enables the structured and automated management of the entire model lifecycle, from data analysis to deployment in production. Technologies such as Jupyter Notebook, Python, Docker, and Heroku facilitate continuous integration, reproducibility, and model monitoring, which enhances operational efficiency, reduces manual errors, and ensures ongoing adaptation to new data or changes in the environment [65].
At this point, we must find a way to integrate MLOps to facilitate the efficient and scalable management and operation of machine learning models in production. This proposal could be applied to other areas of study.
Implementing machine learning and MLOps in crime prevention faces key challenges. Data quality and availability remain a hurdle, as many datasets are incomplete or biased. Integration with existing systems is complex due to outdated infrastructures,
requiring interoperable solutions. Trust in models depends on their accuracy and consistent performance, which requires continuous monitoring and tuning. There are ethical and legal risks, including privacy concerns and algorithmic bias, that call for clear regulations. Lack of resources can be mitigated with MLOps and cloud computing, optimizing costs and scalability. Continuous evaluation is essential for adapting models to the changing nature of crime, as crime is not random but follows spatial and temporal patterns. Models that combine static data, such as demographic characteristics, with dynamic sources, such as urban mobility records (taxi data), have been shown to improve accuracy in crime prediction [63–65]. Furthermore, public acceptance depends on communication and transparency strategies. Overcoming these challenges requires technological innovation, effective regulation, and inter-agency collaboration.
3.6. Discussion
This review highlights the growing study of machine learning models in crime prediction, focusing on classification models, while regression models are comparatively less explored. In the first instance, this leaves a gap for further study of regression models [33,47,58]. While the purpose of this review was to identify the most commonly used machine learning models, it was also found that deep learning models were often used in conjunction with them, yielding interesting results, as shown in Tables 5 and 6. This suggests that combining multiple models and techniques is necessary to achieve high-performing and effective solutions [45,55–57].
The studies and metrics found reveal that it is insufficient to use only one or two metrics to evaluate a model's performance. In these studies, and also in other areas, it is recommended to use all the metrics in Table 8 that apply to the research purpose and the type of model, whether classification (predicting discrete labels or categories) or regression (predicting continuous numerical values), while also taking into account the other metrics indicated in the answer to question RQ2.
To achieve both high performance and precision in a model, the data quality, the number of records, and the characteristics or variables to be considered in the dataset are significant. In this sense, regression-type prediction models consider date and time variables; from these variables, the year and month can be extracted if they are not available [58]. Regarding location, it is advisable to include coordinates to improve location precision [34]. Likewise, variables outside of crime records, such as climatic data, should be integrated [16,53]. At this point, it is recommended to have detailed knowledge of the data source, specifically understanding the variables in each dataset. It is crucial to understand the context and purpose behind data collection by institutions, such as police departments or justice agencies.
All the reviewed articles recommend applying the proposed models in institutions such as police departments and justice agencies. However, there is little to no evidence of real-world implementation or practical applications reaching end-users. This reflects the main gap identified by the review: the transition from experimental models to operational deployment. Achieving this requires not only technical robustness but also institutional alignment, ethical oversight, and contextual understanding of data systems. In real settings, significant challenges arise, including inconsistent data quality, missing values, and the fragmentation of datasets generated under evolving administrative processes. Furthermore, a limited understanding of the operational logic within justice institutions can hinder the alignment between algorithmic outputs and institutional workflows. Bridging this gap demands interdisciplinary efforts to validate, adapt, and deploy these models under real-world constraints, supported by strong collaboration between researchers and institutional stakeholders.
4. Conclusions
In this systematic literature review, conducted using the PRISMA methodology, the four research questions posed were answered after reviewing and analyzing the 50 selected articles. Regarding the first question, about which machine learning models are most used and which are most effective for predicting crimes, this study found that various models are being employed. The most notable are XGBoost, random forest, gradient boosting decision tree, LightGBM, Naive Bayes, support vector machine, ARIMA, and stochastic gradient descent; likewise, deep learning models such as LSTM, DNN, and DeCXGBoost also stand out, and hybrid models are also present, as seen in Table 5.
Regarding the second question, which metrics are used to evaluate the performance of machine learning models, it was found that accuracy, recall, F1-score, precision, and AUC are the most used and most decisive when evaluating classification models; on the other hand, metrics such as MAE, RMSE and R² are the most decisive in regression models.
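As an illustration of the metrics named in the answer to the second question, the following minimal sketch computes accuracy, precision, recall, F1-score and AUC for a binary classifier, and MAE, RMSE and R² for a regressor, using only the Python standard library. The function names and the toy labels, scores and crime counts are hypothetical and chosen for illustration; they do not come from any of the reviewed studies, and a production pipeline would more likely rely on an established metrics library.

```python
import math


def classification_metrics(y_true, y_pred, positive=1):
    """Accuracy, precision, recall and F1 for a binary classification task."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = len(pairs) - tp - fp - fn
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


def auc(y_true, scores, positive=1):
    """AUC as the share of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for t, s in zip(y_true, scores) if t == positive]
    neg = [s for t, s in zip(y_true, scores) if t != positive]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg
    )
    return wins / (len(pos) * len(neg))


def regression_metrics(y_true, y_pred):
    """MAE, RMSE and R^2 for a regression task."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    ss_res = sum(e * e for e in errors)
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return {
        "mae": sum(abs(e) for e in errors) / n,
        "rmse": math.sqrt(ss_res / n),
        "r2": 1 - ss_res / ss_tot if ss_tot else float("nan"),
    }


# Hypothetical toy data: 1 = "crime hotspot", 0 = "not a hotspot".
labels = [1, 0, 1, 1, 0, 0, 1, 0]
predicted = [1, 0, 0, 1, 0, 1, 1, 0]
scores = [0.9, 0.2, 0.4, 0.8, 0.3, 0.6, 0.7, 0.1]

# Hypothetical weekly crime counts and a regressor's forecasts.
counts_true = [10, 12, 8, 15]
counts_pred = [11, 10, 9, 14]

clf = classification_metrics(labels, predicted)
clf["auc"] = auc(labels, scores)
reg = regression_metrics(counts_true, counts_pred)

print(clf)
print(reg)
```

Reporting the two groups side by side reflects the recommendation in the discussion that one or two metrics are not enough: here the thresholded predictions and the ranking quality of the raw scores (AUC) are assessed separately, and the regression errors are given both in absolute terms (MAE, RMSE) and relative to the variance of the target (R²).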
Regarding the third question, which characteristics are relevant in the dataset for greater precision in the models, it was found that most of the datasets include types of crimes, date and time, month, year, age, gender, and educational level of both the offender and the victim. It was also found that a large part of the studies are oriented to the determination of the area, which is why they include data on latitude and longitude (geographic coordinates). Other interesting characteristics are the inclusion of climatic data, temperature, humidity and concurrent areas.
Finally, regarding the implementation of the models in a real context, surprisingly, no study was found in which the most efficient model, once obtained, was implemented in a real context or at least included in some public-facing application, except one that implemented a series of dashboards to display heat maps on a web page. This type of result is important in demonstrating how models are integrated, what results can be achieved for end users, and in providing information for informed decision-making within the corresponding organizations. There is a clear gap between academic research and its implementation in real-world contexts.
Future research should address the current challenges and limitations identified in this study. A key challenge is the need to improve regression models, which can be optimized by integrating reinforcement learning techniques and optimizing the evaluation metrics used in crime prediction.
Furthermore, data availability and quality remain critical issues. To improve prediction accuracy and model robustness, future studies should focus on expanding datasets by incorporating external factors beyond crime records, such as socioeconomic indicators, geospatial analysis, real-time surveillance data, behavioral trends, and the status of court cases, as essential indicators for prediction that assist law enforcement agencies.
Another critical challenge is the practical implementation of predictive models. Future work should not only develop more accurate models but also explore how to integrate these solutions into law enforcement operational systems. This includes designing scalable implementation architectures, creating intuitive visualization tools, and developing prototypes that demonstrate real-world applicability. Furthermore, collaboration with law enforcement agencies and local governments is essential to ensure alignment with institutional processes, foster trust in predictive systems, and facilitate access to real data for validation. By providing structured guidelines, implementation roadmaps, and evidence from pilot programs, future research can effectively bridge the gap between theoretical advances and practical adoption in justice and public safety contexts.
Ethics information
The study did not involve humans or animals, so ethical approval was not required under current regulations.
Funding
This research did not receive any grant from funding agencies in the public, commercial, or not-for-profit sectors.
Conflict of interest
The authors declare no conflict of interest.
Acknowledgements
The authors thank the Universidad Nacional Mayor de San Marcos, Lima, Peru, for their invaluable contributions and support throughout this research.
References
[1] K. Jenga, C. Catal, G. Kar, Machine learning in crime prediction, J. Ambient Intell. Hum. Comput. 14 (2023) 2887–2913, https://doi.org/10.1007/s12652-023-04530-y.
[2] P. Sarzaeim, Q.H. Mahmoud, A. Azim, G. Bauer, I. Bowles, A systematic review of using machine learning and natural language processing in smart policing, Computers 12 (2023) 255, https://doi.org/10.3390/computers12120255.
[3] T. Coleman, P. Mokilane, M. Rangata, J. Holloway, N. Botha, R. Koen, N. Dudeni, Exploring the usefulness of the INLA model in predicting levels of crime in the city of Johannesburg, South Africa, Crime Sci 13 (2024) 25, https://doi.org/10.1186/s40163-024-00219-5.
[4] C. Jing, X. Lv, Y. Wang, M. Qin, S. Jin, S. Wu, G. Xu, A deep multi-scale neural networks for crime hotspot mapping prediction, Comput. Environ. Urban Syst. 109 (2024) 102089, https://doi.org/10.1016/j.compenvurbsys.2024.102089.
[5] S. Sankara, N. Sugitha, Crime rate analysis and mapping from socio-economic data using deep neural networks, J. Comput. Sci. 20 (2024) 1203–1213, https://doi.org/10.3844/jcssp.2024.1203.1213.
[6] H.Y. Barragán-Huamán, K.E. Cataño-Añazco, M.A. Sevincha-Chacabana, O. Vargas-Salas, La inteligencia artificial y la video-vigilancia en la predicción y detección de delitos en espacio-tiempo: una revisión sistemática, Rev. Crim. 65 (2023) 11–25, https://doi.org/10.47741/17943108.398.
[7] J. He, H. Zheng, Prediction of crime rate in urban neighborhoods based on machine learning, Eng. Appl. Artif. Intell. 106 (2021) 104460, https://doi.org/10.1016/j.engappai.2021.104460.
[8] R.M. Aziz, A. Hussain, P. Sharma, P. Kumar, Machine learning-based soft computing regression analysis approach for crime data prediction, Karbala Int. J. Mod. Sci. 8 (2022) 1–19, https://doi.org/10.33640/2405-609X.3197.
[9] K. Vijay, K.R. Sowmia, V. Jananee, D. Mathew, A systematic review on Tanpin Kandri based crime prediction, Remit. Rev. 7 (2022) 1–11, https://doi.org/10.47059/rr.v7i2.2407.
[10] M. Kaur, M. Saini, Role of artificial intelligence in the crime prediction and pattern analysis studies published over the last decade: a scientometric analysis, Artif. Intell. Rev. 57 (2024) 202, https://doi.org/10.1007/s10462-024-10823-1.
[11] G.A. Taiwo, M. Saraee, J. Fatai, Crime prediction using twitter sentiments and crime data, Informatica 48 (2024) 35–42, https://doi.org/10.31449/inf.v48i6.4749.
[12] G.V. Travaini, F. Pacchioni, S. Bellumore, M. Bosia, F. De Micco, Machine learning and criminal justice: a systematic review of advanced methodology for recidivism risk prediction, Int. J. Environ. Res. Publ. Health 19 (2022) 10594, https://doi.org/10.3390/ijerph191710594.
[13] T. Cheng, T. Chen, Urban crime and security, in: W. Shi, M.F. Goodchild, M. Batty, M.-P. Kwan, A. Zhang, eds., Urban Informatics, Springer, Singapore. 2021, pp. 213–228, https://doi.org/10.1007/978-981-15-8983-6_14.
[14] S. Albahli, W. Albattah, Crime type prediction in Saudi Arabia based on intelligence Gathering, Comput. J. 66 (2023) 1936–1948, https://doi.org/10.1093/comjnl/bxac05.
[15] J. Feng, Y. Liang, Q. Hao, K. Xu, W. Qiu, Comparing effectiveness of point of interest data and land use data in theft crime modelling: a case study in Beijing, Land Use Policy 147 (2024) 107357, https://doi.org/10.1016/j.landusepol.2024.107357.
[16] A.Z.M. Zukri, S.R.M. Sakip, S. Masrom, P.R. Megat, N. Zamin, The crime prediction of criminal activity based on weather changes towards quality of life, J. Adv. Res. Appl. Sci. Eng. Technol. 42 (2024) 130–143, https://doi.org/10.37934/araset.42.1.130143.
[17] M. Khan, A. Ali, Y. Alharbi, Predicting and preventing crime: a crime prediction model using San Francisco crime data by classification techniques, Complexity 2022 (2022) 4830411, https://doi.org/10.1155/2022/4830411.
[18] M.L. Nascimento, L.M. Barreto, Improving crime count forecasts in the city of Rio de Janeiro via reconciliation, Secur. J. 37 (2024) 1597–1618, https://doi.org/10.1057/s41284-024-00433-5.
[19] J. Sui, P. Chen, H. Gu, Deep spatio-temporal graph attention network for street-level 110 call incident prediction, Appl. Sci. 14 (2024) 9334, https://doi.org/10.3390/app14209334.
[20] W. Choi, J. Na, S. Lee, Evaluating intelligent CPTED systems to support crime prevention decision-making in municipal control centers, Appl. Sci. 14 (2024) 6581, https://doi.org/10.3390/app14156581.
[21] U.M. Butt, S. Letchmunan, F.H. Hassan, T.W. Koh, Leveraging transfer learning with deep learning for crime prediction, PLoS One 19 (2024) e0296486, https://doi.org/10.1371/journal.pone.0296486.
[22] H. Gu, J. Sui, P. Chen, Graph representation learning for street-level crime prediction, ISPRS Int. J. GeoInf. 13 (2024) 229, https://doi.org/10.3390/ijgi13070229.
[23] R. Ganesan, S. Ravichandran, Performance analysis for crime prediction and detection using machine learning algorithms, Int. J. Intell. Syst. Appl. Eng. 12 (2024) 348–355. https://ijisae.org/index.php/IJISAE/article/view/4671/3345.
[24] R. Neira, M. Cano, A systematic review of the literature on the use of artificial intelligence in forecasting the demand for products and services in various sectors, Int. J. Adv. Comput. IJACSA 15 (2024) 144–156, https://doi.org/10.14569/IJACSA.2024.0150315.
[25] M.J. Page, J.E. McKenzie, P.M. Bossuyt, I. Boutron, T.C. Hoffman, The PRISMA 2020 statement: an updated guideline for reporting systematic reviews, Syst. Rev. 10 (2021) 89, https://doi.org/10.1186/s13643-021-01626-4.
[26] S.K. Sood, K.S. Rawat, A scientometric analysis of ICT-assisted disaster management, Nat. Hazards 106 (2021) 2863–2881, https://doi.org/10.1007/s11069-021-04512-3.
[27] S.K. Sood, N. Kumar, M. Saini, Scientometric analysis of literature on distributed vehicular networks: VOSViewer visualization techniques, Artif. Intell. Rev. 54 (2021) 6309–6341, https://doi.org/10.1007/s10462-021-09980-4.
[28] A. Martín-Martín, M. Thelwall, E. Orduna-Malea, E. Delgado López-Cózar, Google Scholar, Microsoft Academic, Scopus, Dimensions, Web of Science, and OpenCitations' COCI: a multidisciplinary comparison of coverage via citations, Scientometrics 126 (2021) 871–906, https://doi.org/10.1007/s11192-020-03690-4.
[29] B.Z. Wubineh, Crime analysis and prediction using machine-learning approach in the case of Hossana police commission, Secur. J. 37 (2024) 1269–1284, https://doi.org/10.1057/s41284-024-00416-6.
[30] R. de V. Dos Santos, J.V. Venceslau, N.A. Azevedo, D.S. Amorin, A criminal macrocause classification model: an enhancement for violent crime analysis considering an unbalanced dataset, Expert Syst. Appl. 238 (2024) 121702, https://doi.org/10.1016/j.eswa.2023.121702.
[31] H.K. Sharma, T. Choudhury, A. Kandwal, Machine learning based analytical approach for geographical analysis and prediction of Boston city crime using geospatial dataset, Geojournal 88 (2023) 15–27, https://doi.org/10.1007/s10708-021-10485-4.
[32] R. Khatun, S.I. Ayon, R. Hossain, J. Alam, Data mining technique to analyse and predict crime using crime categories and arrest records, Indones. J. Electr. Eng. Comput. Sci. 22 (2021) 1052–1060, https://doi.org/10.11591/ijeecs.v22.i2.pp1052-1060.
[33] W. Safat, S. Asghar, S.A. Gillani, Empirical analysis for crime prediction and forecasting using machine learning and deep learning techniques, IEEE Access 9 (2021) 70080–70094, https://doi.org/10.1109/ACCESS.2021.3078117.
[34] S. Albahli, A. Alsaqabi, F. Aldhubayi, H.T. Rauf, M. Arif, M.A. Mohammed, Predicting the type of crime: intelligence gathering and crime analysis, Comput. Mater. Continua (CMC) 66 (2020) 2317–2341, https://doi.org/10.32604/cmc.2021.014113.
[35] G. Bediroglu, H.E. Colak, Predicting and analyzing crime–environmental design relationship via GIS-Based machine learning approach, Trans. GIS 28 (2024) 1377–1399, https://doi.org/10.1111/tgis.13195.
[36] G. Kim, Y. Cho, J.-H. Lee, G. Lee, Correlation analysis between urban environment features and crime occurrence based on explainable artificial intelligence techniques, J. Asian Architect. Build Eng. (2024) 1–20, https://doi.org/10.1080/13467581.2024.2421260.
[37] Y. Deng, R. He, Y. Liu, Crime risk prediction incorporating geographical spatiotemporal dependency into machine learning models, Inf. Sci. 646 (2023) 119414, https://doi.org/10.1016/j.ins.2023.119414.
[38] A. Angbera, H.Y. Chan, Model for spatiotemporal crime prediction with improved deep learning, Comput. Inf. 42 (2023) 568–590, https://doi.org/10.31577/cai_2023_3_568.
[39] A. Ature, C.H. Yong, Spatiotemporal bandits crime prediction from web news archives analysis, Comput. Sist. 27 (2023) 719–731, https://doi.org/10.13053/CyS-27-3-4110.
[40] D. Kim, S. Jung, Y. Jeong, Theft prediction model based on spatial clustering to reflect spatial characteristics of adjacent lands, Sustain. Switz. 13 (2021) 7715, https://doi.org/10.3390/su13147715.
[41] A. Palanivinayagam, S.S. Gopal, S. Bhattacharya, N. Anumbe, E. Ibeke, C. Biamba, An optimized machine learning and big data approach to crime detection, Wireless Commun. Mobile Comput. 2021 (2021) 5291528, https://doi.org/10.1155/2021/5291528.
[42] J. Zhou, Z. Li, J.J. Ma, F. Jiang, Exploration of the hidden influential factors on crime activities: a big data approach, IEEE Access 8 (2020) 141033–141045, https://doi.org/10.1109/ACCESS.2020.3009969.
[43] A. Alsubayhin, M.S. Ramzan, B. Alzahrani, Crime prediction model using three classification techniques: Random forest, logistic regression, and LightGBM, Int. J. Adv. Comput. Sci. Appl. IJACSA 15 (2024) 240–251, https://doi.org/10.14569/IJACSA.2024.0150123.
[44] L. Babala, M. Kasianchuk, O. Kovalchuk, R. Shevchuk, Support vector machine to criminal recidivism prediction, Int. J. Electron. Telecommun 70 (2024) 691–967, https://doi.org/10.24425/ijet.2024.149598.
[45] S. Tam, O.O. Tanriover, Multimodal deep learning crime prediction using tweets, IEEE Access 11 (2023) 93204–93214, https://doi.org/10.1109/ACCESS.2023.3308967.
[46] S. Sridharan, N. Srish, S. Vigneswaran, P. Santhi, Crime prediction using machine learning, EAI Endorsed Trans. IoT 10 (2024) 1–7, https://doi.org/10.4108/eetiot.5123.
[47] M. Muthamizharasan, R. Ponnusamy, A comparative study of crime event forecasting using ARIMA versus LSTM model, J. Theor. Appl. Inf. Technol. 102 (2024) 2162–2171. https://www.jatit.org/volumes/Vol102No5/36Vol102No5.pdf.
[48] L. Yang, J. Guofan, Z. Yixin, W. Qianze, Z. Jian, R. Alizadehsani, P. Plawiak, A reinforcement learning approach combined with scope loss function for crime prediction on Twitter (X), IEEE Access 12 (2024) 149502–149527, https://doi.org/10.1109/ACCESS.2024.3473296.
[49] M. Vivek, B.R. Prathap, Spatio-temporal crime analysis and forecasting on twitter data using machine learning algorithms, SN Comput. Sci. 4 (2023) 383, https://doi.org/10.1007/s42979-023-01816-y.
[50] N. Tasnim, I.T. Imam, M.M.A. Hashem, A novel multi-module approach to predict crime based on multivariate spatio-temporal data using attention and sequential fusion model, IEEE Access 10 (2022) 48009–48030, https://doi.org/10.1109/ACCESS.2022.3171843.
[51] J. Alghamdi, T. Al-Dala’in, Towards spatio-temporal crime events prediction, Multimed. Tool. Appl. 83 (2024) 18721–18737, https://doi.org/10.1007/s11042-023-16188-x.
[52] H. Xie, L. Liu, H. Yue, Modeling the effect of streetscape environment on crime using street view images and interpretable machine-learning technique, Int. J. Environ. Res. Publ. Health 19 (2022) 13833, https://doi.org/10.3390/ijerph192113833.
[53] Y. Lamari, B. Freskura, A. Abdessamad, S. Eichberg, S. de Bonviller, Predicting spatial crime occurrences through an efficient ensemble-learning model, ISPRS Int. J. GeoInf. 9 (2020) 645, https://doi.org/10.3390/ijgi9110645.
[54] Z.K. Abdalrdha, A.M. Al-Bakry, A.K. Farhan, A hybrid CNN-LSTM and XGBoost approach for crime detection in tweets using an intelligent dictionary, Rev. Intell. Artif. 37 (2023) 1651–1661, https://doi.org/10.18280/RIA.370630.
[55] Y. Rayhan, T. Hashem, AIST: an interpretable attention-based deep learning model for crime prediction, ACM Trans. Spat. Algorithms Syst. 9 (2023) 1–31, https://doi.org/10.1145/3582274.
[56] J. Wu, S.M. Abrar, N. Awasthi, E. Frias-Martinez, V. Frias-Martinez, Enhancing short-term crime prediction with human mobility flows and deep learning architectures, EPJ Data Sci. 11 (2022) 53, https://doi.org/10.1140/epjds/s13688-022-00366-2.
[57] W. Ma, Artificial intelligence-assisted decision-making method for legal judgment based on deep neural network, Mob. Inf. Syst. 2022 (2022) 4636485, https://doi.org/10.1155/2022/4636485.
[58] J.V. Devi, K.S. Kavitha, Automating time series forecasting on crime data using RNN-LSTM, Int. J. Adv. Comput. Sci. Appl. 12 (2021) 458–463, https://doi.org/10.14569/IJACSA.2021.0121051.
[59] S.S. Kshatri, D. Singh, B. Narain, S. Bhatia, M.T. Quasim, G.R. Sinha, An empirical analysis of machine learning algorithms for crime prediction using stacked generalization: an ensemble approach, IEEE Access 9 (2021) 67488–67500, https://doi.org/10.1109/ACCESS.2021.3075140.
[60] H. Yu, L. Liu, B. Yang, M. Lan, Crime prediction with historical crime and movement data of potential offenders using a spatio-temporal cokriging method, ISPRS Int. J. GeoInf. 9 (2020) 732, https://doi.org/10.3390/ijgi9120732.
[61] Z. Yan, H. Chen, X. Dong, K. Zhou, Z. Xu, Research on prediction of multi-class theft crimes by an optimized decomposition and fusion method based on XGBoost, Expert Syst. Appl. 207 (2022) 117943, https://doi.org/10.1016/j.eswa.2022.117943.
[62] R.M. Aziz, A. Hussain, P. Sharma, Cognizable crime rate prediction and analysis under Indian penal code using deep learning with novel optimization approach, Multimed. Tool. Appl. 83 (2024) 22663–22700, https://doi.org/10.1007/s11042-023-16371-0.
[63] C. Amrit, A.K. Narayanappa, An analysis of the challenges in the adoption of MLOps, J. Innov. Knowl. 10 (2025) 100637, https://doi.org/10.1016/j.jik.2024.100637.
[64] S.J. Warnett, U. Zdun, On the understandability of MLOps system architectures, IEEE Trans. Software Eng. 50 (2024) 1015–1039, https://doi.org/10.1109/TSE.2024.3367488.
[65] P. Narang, P. Mittal, Nisha, Hybrid prediction model by integrating machine learning techniques with MLOps, Int. J. Syst. Innov. 9 (2025) 47–59, https://doi.org/10.6977/IJoSI.202504_9(2).0005.
[66] H. Yue, J. Chen, Interpretable spatial machine learning for understanding spatial heterogeneity in factors affecting street theft crime, Appl. Geogr. 175 (2025) 103503, https://doi.org/10.1016/j.apgeog.2024.103503.
[67] E.J. Alasfoor, O. Alshaikh, I. Inuwa-Dutse, S. Khan, S. Parkinson, Offender characterization and prediction: a case study of the Kingdom of Bahrain, IEEE Access 13 (2025) 29406–29431, https://doi.org/10.1109/ACCESS.2025.3531655.
[68] M. Hou, X. Hu, J. Cai, X. Han, S. Yuan, An integrated graph model for spatial–temporal urban crime prediction based on attention mechanism, ISPRS Int. J. GeoInf. 11 (2022) 294, https://doi.org/10.3390/ijgi11050294.
[69] A. Jan, G. Khan, Real world anomalous scene detection and classification using multilayer deep neural networks, Int. J. Interact. Multimedia. Artif. Intell. 8 (2023) 158–167, https://doi.org/10.9781/ijimai.2021.10.010.
[70] A. Solomon, M. Kertis, B. Shapira, L. Rokach, A deep learning framework for predicting burglaries based on multiple contextual factors, Expert Syst. Appl. 199 (2022) 117042, https://doi.org/10.1016/j.eswa.2022.117042.
[71] M.A. Salam, S. Taha, M. Ramadan, Time series crime prediction using a federated machine learning model, Int. J. Comput. Sci. Netw. Secur. 22 (2022) 119–130, https://doi.org/10.22937/IJCSNS.2022.22.4.16.
[72] G. Bhardwaj, R.K. Bawa, Assaying the statistics of crime against women in India using provenance and machine learning models, Int. J. Adv. Comput. Sci. Appl. IJACSA 13 (2022) 492–501, https://doi.org/10.14569/IJACSA.2022.0130760.
[73] M. Saraiva, I. Matijosaitiene, S. Mishra, A. Amante, Crime prediction and monitoring in Porto, Portugal, using machine learning, spatial and text analytics, ISPRS Int. J. GeoInf. 11 (2022) 7, https://doi.org/10.3390/ijgi11070400.
[74] X. Zhang, L. Liu, M. Lan, G. Song, L. Xiao, J. Chen, Interpretable machine learning models for crime prediction, Comput. Environ. Urban Syst. 94 (2022) 101789, https://doi.org/10.1016/j.compenvurbsys.2022.101789.
[75] M. Escobedo, C. Tapia, J. Gutierrez, V. Ayma, Comparing regression models to predict property crime in high-risk Lima districts, Int. J. Adv. Comput. Sci. Appl. IJACSA 15 (2024) 62–68, https://doi.org/10.14569/IJACSA.2024.0150307.
[76] C. Wang, F. Tian, Y. Pan, Swarm intelligence response methods based on urban crime event prediction, Electronics 12 (2023) 4610, https://doi.org/10.3390/electronics12224610.
[77] E.G. Ilgün, M. Dener, Exploratory data analysis, time series analysis, crime type prediction, and trend forecasting in crime data using machine learning, deep learning, and statistical methods, Neural Comput. Appl. 37 (2025) 1177–11798, https://doi.org/10.1007/s00521-025-11094-9.
[78] R. Khalfa, N. Theinert, W. Hardyns, Comparing XAI techniques for interpreting short-term burglary predictions at micro-places, Comput. Urban Sci. 5 (2025) 27, https://doi.org/10.1007/s43762-025-00185-x.
[79] M. Shan, C. Ye, P. Chen, S. Peng, Ada-GCNLSTM: an adaptive urban crime spatiotemporal prediction model, J. Saf. Sci. Resil. 6 (2025) 226–236, https://doi.org/10.1016/j.jnlssr.2024.11.003.
[80] F. Ersoz, T. Ersoz, F. Marcelloni, F. Ruffini, Artificial intelligence in crime prediction: a survey with a focus on explainability, IEEE Access 13 (2025) 59646–59674, https://doi.org/10.1109/ACCESS.2025.3553934.