WOMEN'S SECURITY AND SAFETY
Data Visualization
AMOGH BORGAVE
FA-10
24AMCO2121010
FY MTech Computer Engineering
Introduction
Crimes against women remain a critical social issue that
demands continuous analysis and awareness. Understanding
the patterns, trends, and distribution of such crimes is
essential for effective policy-making, resource allocation, and
public awareness. This project aims to analyze crime data
related to women in India between the years 2001 and 2014
using various data visualization techniques. By transforming
raw data into meaningful visual insights, we seek to uncover
hidden patterns, compare crime types, and track their
progression over time. The visualizations presented in this
project not only highlight key findings but also serve as a
powerful tool to support informed decision-making and raise
awareness about the gravity of the issue.
Source of Data
The dataset used in this project has been sourced from Kaggle,
a popular platform for data science competitions and datasets.
It contains statistics on various types of crimes committed
against women in India from 2001 to 2014, compiled from
official records. The dataset includes crimes such as rape,
domestic violence, dowry deaths, and kidnapping, reported
across different Indian states and union territories.
• Source: Kaggle – Crimes Against Women in India (2001–
2014)
• Publisher: Originally compiled from National Crime
Records Bureau (NCRB), India
• License: As per Kaggle dataset terms
Code
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
# Load the dataset
df = pd.read_csv("C:\\Users\\amogh\\OneDrive\\Desktop\\crimes.csv")
df.columns = df.columns.str.strip() # Clean column names
# Create Total Crimes column
df['Total Crimes'] = df[[
    'Rape',
    'Kidnapping and Abduction',
    'Dowry Deaths',
    'Assault on women with intent to outrage her modesty',
    'Insult to modesty of Women',
    'Cruelty by Husband or his Relatives',
    'Importation of Girls',
]].sum(axis=1)
# 1. Line Chart: Total Crimes Over the Years
plt.figure(figsize=(10, 6))
df.groupby('Year')['Total Crimes'].sum().plot(kind='line', marker='o', color='red')
plt.title('Trend of Total Crimes Against Women (2001-2014)')
plt.xlabel('Year')
plt.ylabel('Total Number of Crimes')
plt.grid(True)
plt.tight_layout()
plt.show()
# 2. Bar Chart: Specific Crimes in 2014
year = 2014
crime_types = [
    'Rape',
    'Kidnapping and Abduction',
    'Dowry Deaths',
    'Assault on women with intent to outrage her modesty',
    'Insult to modesty of Women',
    'Cruelty by Husband or his Relatives',
    'Importation of Girls',
]
df_year = df[df['Year'] == year]
df_crime_types = df_year[crime_types].sum().sort_values(ascending=False)
plt.figure(figsize=(12, 6))
df_crime_types.plot(kind='bar', color='skyblue')
plt.title(f'Number of Specific Types of Crimes in {year}')
plt.xlabel('Crime Type')
plt.ylabel('Number of Crimes')
plt.xticks(rotation=45, ha='right')
plt.tight_layout()
plt.show()
# 3. Pie Chart: Crime Proportions in 2014
plt.figure(figsize=(8, 8))
df_crime_types.plot(kind='pie', autopct='%1.1f%%', colors=sns.color_palette('pastel'))
plt.title(f'Proportion of Different Crime Types in {year}')
plt.ylabel('')
plt.tight_layout()
plt.show()
# 4. Heatmap: Correlation Between Crime Types
plt.figure(figsize=(10, 8))
sns.heatmap(df[crime_types].corr(), annot=True, cmap='coolwarm', fmt='.2f',
            linewidths=0.5)
plt.title('Correlation Between Different Crime Types')
plt.tight_layout()
plt.show()
# 5. Stacked Bar Chart: Comparison for 2010 and 2014
selected_years = [2010, 2014]
df_selected = df[df['Year'].isin(selected_years)]
df_pivot = df_selected.pivot_table(index='STATE/UT', columns='Year',
                                   values='Total Crimes', aggfunc='sum').fillna(0)
df_pivot.plot(kind='bar', stacked=True, figsize=(14, 7), colormap='viridis')
plt.title('Comparison of Total Crimes Across States for 2010 and 2014')
plt.xlabel('State/UT')
plt.ylabel('Total Number of Crimes')
plt.xticks(rotation=90)
plt.legend(title='Year')
plt.tight_layout()
plt.show()
# 6. Box Plot: Rape Case Distribution
plt.figure(figsize=(12, 6))
sns.boxplot(x='Year', y='Rape', data=df)
plt.title('Distribution of Rape Cases Across States (2001–2014)')
plt.xlabel('Year')
plt.ylabel('Number of Rape Cases')
plt.tight_layout()
plt.show()
# 7. Area Plot: Rise of Different Crimes Over Time
crime_trend = df.groupby('Year')[crime_types].sum()
crime_trend.plot.area(figsize=(12, 6), colormap='tab20')
plt.title('Area Chart: Trend of Various Crimes Against Women (2001-2014)')
plt.xlabel('Year')
plt.ylabel('Number of Crimes')
plt.legend(loc='upper left', bbox_to_anchor=(1, 1))
plt.tight_layout()
plt.show() # Show plot instead of saving
# 8. Pair Plot: Relationships Between Multiple Crime Types
sns.pairplot(df[crime_types])
plt.suptitle('Pair Plot: Relationships Between Crime Types', y=1.02)
plt.show() # Show plot instead of saving
# 9. Histogram: Distribution of Rape Cases
# Drop missing values before plotting
rape_data = df['Rape'].dropna()
# Cap values above the 98th percentile so the extreme tail does not flatten the plot
rape_data_clipped = rape_data.clip(upper=rape_data.quantile(0.98))
# Plot the histogram
plt.figure(figsize=(10, 6))
sns.histplot(rape_data_clipped, bins=30, kde=True, color='purple')
plt.title('Histogram: Distribution of Rape Cases Across All District-Year Entries')
plt.xlabel('Number of Rape Cases')
plt.ylabel('Frequency')
plt.grid(True)
plt.tight_layout()
plt.show()
1. Line Chart – Trend of Total Crimes Over the Years
The line chart illustrates the trend in the total number of crimes against women
from 2001 to 2014. This visualization helps in identifying the overall rise or fall in
crime rates over the years and highlights significant shifts or patterns in the
national data.
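A rise or fall in the line chart can also be expressed as a year-over-year percentage change. The sketch below is only an illustration: the yearly totals are made-up numbers, not values from the NCRB dataset, and the name yoy_change is an assumption introduced here.

```python
import pandas as pd

# Toy yearly totals (made-up numbers, NOT taken from the NCRB data)
yearly_totals = pd.Series(
    [100, 110, 99, 132],
    index=[2001, 2002, 2003, 2004],
    name='Total Crimes',
)

# Percentage change relative to the previous year.
# The first year has no predecessor, so its value is NaN.
yoy_change = yearly_totals.pct_change() * 100

print(yoy_change.round(1))
```

With the real data, the same call could be applied to the result of df.groupby('Year')['Total Crimes'].sum().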
2. Bar Chart – Specific Crime Types in 2014
This bar chart shows the number of reported cases for each major crime type
in 2014. It gives a clear comparison of which crimes were most prevalent
that year, offering a snapshot of the final year covered by the dataset.
3. Pie Chart – Crime Type Proportions in 2014
The pie chart displays the proportion of each crime type relative to the total crimes
reported in 2014. It helps in understanding the distribution and dominance of
specific crimes, such as domestic violence or rape, among the total incidents
reported.
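The percentages printed on the pie slices are each crime type's share of the yearly total. A minimal sketch of that arithmetic follows, using hypothetical counts (the three values are invented for illustration and do not come from the dataset).

```python
import pandas as pd

# Hypothetical counts for a single year (illustrative values only)
counts = pd.Series({
    'Rape': 30,
    'Dowry Deaths': 10,
    'Kidnapping and Abduction': 60,
})

# Share of each crime type as a percentage of the total, largest first
shares = (counts / counts.sum() * 100).sort_values(ascending=False)

print(shares.round(1))
```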
4. Heatmap – Correlation Between Crime Types
This heatmap presents the Pearson correlation matrix between different types of crimes
against women. It shows how strongly the counts of different crimes move together
statistically, which is useful for spotting potential behavioural or reporting patterns.
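To read the matrix as a ranked list rather than a grid, the upper triangle can be flattened into pairs sorted by absolute correlation. This is a sketch on a toy table: the column names 'Crime A', 'Crime B', 'Crime C' and their values are placeholders, not dataset columns.

```python
import numpy as np
import pandas as pd

# Toy table with placeholder column names and invented values
toy = pd.DataFrame({
    'Crime A': [10, 20, 30, 40, 50],
    'Crime B': [12, 19, 33, 41, 48],
    'Crime C': [50, 10, 40, 20, 30],
})

# Pearson correlation matrix (the pandas default for DataFrame.corr)
corr = toy.corr()

# Keep only the cells above the diagonal so each pair appears once
mask = np.triu(np.ones(corr.shape, dtype=bool), k=1)
pairs = corr.where(mask).stack().dropna()

# Strongest relationships first, regardless of sign
pairs = pairs.sort_values(key=abs, ascending=False)

print(pairs.round(2))
```

A high correlation here only says that two counts rise and fall together across rows; it does not by itself show that one crime drives the other.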
5. Stacked Bar Chart – Statewise Comparison (2010 vs 2014)
The stacked bar chart compares the total crimes reported by each state/UT in 2010
and 2014. This visualization helps analyze how crime levels have changed over time
on a state level, indicating areas of improvement or concern.
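The change between the two years can also be computed directly from the same kind of pivot table. The sketch below assumes the column names used in the code section ('STATE/UT', 'Year', 'Total Crimes'); the states and counts are invented for illustration.

```python
import pandas as pd

# Invented rows for two hypothetical states
toy = pd.DataFrame({
    'STATE/UT': ['State X', 'State X', 'State Y', 'State Y'],
    'Year': [2010, 2014, 2010, 2014],
    'Total Crimes': [200, 250, 400, 300],
})

# One row per state, one column per year
pivot = toy.pivot_table(index='STATE/UT', columns='Year',
                        values='Total Crimes', aggfunc='sum')

# Absolute and relative change from 2010 to 2014
pivot['Change'] = pivot[2014] - pivot[2010]
pivot['Change (%)'] = pivot['Change'] / pivot[2010] * 100

print(pivot)
```

A positive value marks a state where reported totals rose between the two years, a negative value one where they fell.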
6. Box Plot – Distribution of Rape Cases by Year
This box plot shows the distribution of rape cases across states for each year from
2001 to 2014. It highlights the spread, median, and outliers in the data, giving a year-
wise view of how reporting or incidence has varied across regions.
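The points a box plot draws as outliers follow from the quartiles: with the default whisker setting of 1.5, any value more than 1.5 times the interquartile range beyond the box is plotted individually. A small sketch of that rule on made-up values (not dataset values):

```python
import pandas as pd

# Made-up case counts with one extreme value
values = pd.Series([1, 2, 3, 4, 5, 6, 7, 8, 100])

q1 = values.quantile(0.25)   # lower edge of the box
q3 = values.quantile(0.75)   # upper edge of the box
iqr = q3 - q1                # interquartile range (box height)

# Fences beyond which points are treated as outliers
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

outliers = values[(values < lower_fence) | (values > upper_fence)]

print(f"median={values.median()}, IQR={iqr}, outliers={outliers.tolist()}")
```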
7. Area Plot – Rise of Different Crimes Over Time
The area plot visualizes the rise and fall of various crimes against women over the
years. Each colored band represents a specific crime, enabling easy comparison
and showcasing which crime types have seen significant increases or declines.
8. Pair Plot – Relationship Between Crime Types
The pair plot explores the pairwise relationships between different crime types
using scatter plots and histograms. It helps detect correlations, clusters, and
potential dependencies between crimes that may be useful for deeper analytical
insights.
9. Histogram – Distribution of Rape Cases
This histogram visualizes the distribution of rape cases across all entries in the
dataset, with values above the 98th percentile capped at that percentile for clarity. It shows how frequently different
ranges of cases occur and indicates whether the data is skewed or balanced.
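Whether the data is skewed can be checked numerically as well as visually. The sketch below applies the same 98th-percentile clipping to an invented series and compares the sample skewness before and after; the values are placeholders chosen to have a long right tail.

```python
import pandas as pd

# Invented counts: many small values and a single very large one
values = pd.Series([1] * 50 + [2] * 30 + [5] * 15 + [20] * 4 + [500])

# Cap everything above the 98th percentile at that percentile
cap = values.quantile(0.98)
clipped = values.clip(upper=cap)

# Positive skewness indicates a long right tail
skew_raw = values.skew()
skew_clipped = clipped.skew()

print(f"cap={cap}, skew before={skew_raw:.2f}, skew after={skew_clipped:.2f}")
```

Clipping changes only how the tail is displayed; the capped entries are still counted, so the number of observations stays the same.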
Conclusion
These visualizations collectively provide a clearer understanding of the trends,
distributions, and interrelationships between different crimes against women in
India from 2001 to 2014. The area plot shows how each crime type changes over time,
the pair plot and heatmap show how crime types vary together, and the histogram shows how
rape case counts are distributed across entries. Together, they offer valuable insights for
policymakers, researchers, and law enforcement to strategize more effectively for
prevention and intervention.