NEW Final 2 Internship Project
The core aim of data analysis is to extract actionable insights from data,
enabling informed decision-making, trend detection, problem resolution, and
process improvement. It finds applications across diverse fields such as business,
finance, marketing, healthcare, and social sciences, among others, making it an
essential component in modern-day analytics.
Ultimately, data analysis empowers organizations and individuals to derive
meaningful insights, facilitating data-driven decision-making and enhancing
competitiveness in their sectors.
Descriptive Analytics
Diagnostic Analytics
Predictive Analytics
Prescriptive Analytics
Clearly define the problem or objective that data analytics will address. This
initial step establishes the direction and focus for the analysis, ensuring
alignment with organizational goals and stakeholder needs.
Identify relevant data sources and gather the necessary data. This phase
involves acquiring data, integrating different data sources, and ensuring data
cleanliness and completeness. Understanding the data also includes assessing its
quality, relevance, and potential biases.
Preprocess the data by addressing issues like missing values, outliers, and
data normalization or transformation as needed.
This stage is crucial for laying the groundwork for subsequent analysis and
ensuring the data is in a suitable format for modeling.
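The three preprocessing operations named above can be sketched as follows. This is a minimal illustration using only the Python standard library, not the procedure used in this project: the function name `preprocess`, the sample values, the median fill, the 1.5 × IQR clipping rule, and min-max scaling are all assumptions chosen for the example.

```python
from statistics import median, quantiles


def preprocess(values):
    """Fill gaps, clip outliers, and rescale a numeric series to [0, 1]."""
    # 1. Missing values: replace None with the median of the observed values.
    observed = [v for v in values if v is not None]
    fill = median(observed)
    filled = [fill if v is None else v for v in values]

    # 2. Outliers: clip to the conventional 1.5 * IQR fences.
    q1, _, q3 = quantiles(filled, n=4)
    iqr = q3 - q1
    low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    clipped = [min(max(v, low), high) for v in filled]

    # 3. Normalization: min-max scaling (a constant series maps to 0.0).
    lo, hi = min(clipped), max(clipped)
    span = hi - lo
    return [(v - lo) / span if span else 0.0 for v in clipped]


# Hypothetical daily test counts with one gap and one implausible spike.
daily_tests = [1200, 1350, None, 1280, 1310, 98000, 1295]
scaled = preprocess(daily_tests)
print(scaled)
```

Clipping rather than dropping the spike keeps the series length unchanged, which matters when the values are aligned with dates. Whether clipping is appropriate depends on whether the spike is a recording error or a genuine surge.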
Implement the chosen analytical techniques on the prepared data. This step
involves applying algorithms, statistical tests, or models to extract meaningful
information and uncover patterns or relationships within the data. It is essential
to ensure that the techniques chosen are appropriate for the specific nature of
the data and aligned with the objectives of the analysis.
Analyze the output from the applied techniques and interpret the findings
in the context of the original problem or objective. This step requires a
combination of domain knowledge, statistical expertise, and critical thinking to
derive actionable insights. Understanding the implications of the results is
crucial for making informed decisions and deriving value from the analysis.
Present the results of the analysis in a clear and meaningful way. Effective
communication involves creating visualizations, dashboards, reports, or other
materials that convey the insights to stakeholders and decision-makers.
Visualizations should be intuitive and informative, highlighting key findings
and trends identified through the analysis. This step not only facilitates
understanding but also encourages stakeholders to engage with the data and
derive insights independently.
Business Intelligence
Business intelligence uses data analysis tools and techniques to gather, store,
and analyze business data in order to make informed decisions and improve
operations.
Finance and Banking
Marketing and Advertising
CHAPTER II
We collected data from Kaggle to help understand these challenges.
However, we faced problems with inconsistent and incomplete data. Some
regions had better data than others, and the quality of data varied. There were
also issues with different formats and definitions, making it hard to compare
information. Despite these challenges, using data from Kaggle helped us
identify key areas that need improvement.
Objectives
How many people in India received their first COVID-19 vaccination in 2021?
How many COVID-19 tests were conducted in India in 2020 and 2021?
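Questions of this kind reduce to aggregating dated records by calendar year. The sketch below shows one way to do that with the standard library. The column names `Date` and `DailyTests`, the ISO date format, and the sample rows are hypothetical and are not taken from the Kaggle file used in this project.

```python
import csv
import io
from collections import defaultdict
from datetime import datetime

# Hypothetical extract; real column names and date formats may differ.
SAMPLE_CSV = """Date,DailyTests
2020-12-30,1000
2020-12-31,1500
2021-01-01,2000
2021-01-02,
2021-01-03,2500
"""


def totals_by_year(csv_text, date_col="Date", value_col="DailyTests"):
    """Sum a daily count column per calendar year, skipping blank cells."""
    totals = defaultdict(int)
    for row in csv.DictReader(io.StringIO(csv_text)):
        raw = row[value_col].strip()
        if not raw:
            continue  # missing observation: skip rather than count as zero
        year = datetime.strptime(row[date_col], "%Y-%m-%d").year
        totals[year] += int(raw)
    return dict(totals)


yearly_tests = totals_by_year(SAMPLE_CSV)
print(yearly_tests)
```

Summing is only valid when the column holds daily counts. If the column is a running (cumulative) total, adding its rows together overcounts, and the yearly figure should instead be derived from the difference between year-end values.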
Pandas is a powerful library for data manipulation and analysis. It provides data
structures such as DataFrames for working efficiently with structured data.
Matplotlib is a popular plotting library that enables the creation of a wide range
of static, animated, and interactive visualizations in Python.
CHAPTER III
DATA PREPARATION
CHAPTER IV
METHODOLOGY
MODELS
Exploratory analysis
Positive: 23,06,25,655
Negative: 3,68,79,52,310
Fig:1 Positive and negative results among the tests conducted in India in 2020
The figure shows the distribution of COVID-19 test results obtained through
exploratory analysis. The input data contain 23,06,25,655 positive cases and
3,68,79,52,310 negative cases. The flowchart represents these with labeled boxes
for "Positive Cases" and "Negative Cases," connected by arrows indicating the
flow of data. This visual representation aids in understanding the proportion
and significance of each category within the dataset.
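The figures above use Indian digit grouping (lakh and crore), so they must be parsed before any proportion can be computed. A minimal sketch of that arithmetic follows; the helper name `parse_grouped` is introduced here for illustration only.

```python
def parse_grouped(number_text):
    """Convert a comma-grouped figure such as '23,06,25,655' to an int."""
    return int(number_text.replace(",", ""))


# Values as reported for 2020 in Fig:1.
positive_2020 = parse_grouped("23,06,25,655")
negative_2020 = parse_grouped("3,68,79,52,310")

total_2020 = positive_2020 + negative_2020
positive_share = positive_2020 / total_2020

print(f"Positive: {positive_2020:,}")
print(f"Negative: {negative_2020:,}")
print(f"Positive share of reported results: {positive_share:.2%}")
```

Removing the commas is sufficient because the grouping only affects where separators are placed, not the digits themselves.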
Exploratory analysis
Positive: 8,94,27,591
Negative: 9,33,44,13,022
Fig:2 Positive and negative results among the tests conducted in India in 2021
The 2021 report on COVID-19 test results highlights 8,94,27,591 positive cases
and 9,33,44,13,022 negative cases. Exploratory analysis summarizes these numbers
and represents them visually to illustrate the distribution of results. The
flowchart depicts "Positive Cases" and "Negative Cases" as labeled boxes
connected by arrows, clarifying the dataset's proportions and significance. This
visualization supports clear communication of COVID-19 test outcomes and aids in
understanding their implications.
Exploratory analysis
16,08,18,63,212 71,74,80,86,235
Fig:3 How many COVID-19 tests were conducted in India in 2020 and 2021
CHAPTER V
RESULTS
5.1 PYTHON
Fig:5 COVID-19 deaths reported in India in 2020 and 2021
Description: The analysis shows that a significant number of people in India
received the first dose of Covaxin, indicating widespread acceptance and
administration of this COVID-19 vaccine across the population.
Future work should focus on several key areas. Firstly, ensuring equitable
vaccine distribution to reach vulnerable populations and achieving higher
vaccination coverage overall. Secondly, improving healthcare infrastructure to
handle potential future waves effectively, including adequate ICU beds, oxygen
supply, and medical staff readiness. Thirdly, enhancing public awareness and
compliance with preventive measures like mask-wearing and social distancing,
especially during a potential resurgence. Finally, continuing research on virus
mutations and developing updated vaccines to maintain long-term immunity.
CHAPTER VI
SUMMARY
CONCLUSION
CHAPTER VII
REFERENCE
CHAPTER VIII
APPENDIX
Source code
import pandas as pd
import matplotlib.pyplot as plt
# Print total deaths in 2020 and 2021
print(f"Total deaths in 2020: {total_deaths_2020}")
print(f"Total deaths in 2021: {total_deaths_2021}")
plt.subplot(1, 2, 1)
plt.bar(['2020', '2021'], [total_confirmed_2020, total_confirmed_2021],
        color=['blue', 'green'])
plt.title('Total Confirmed COVID-19 Cases in India (2020 vs 2021)')
plt.xlabel('Year')
plt.ylabel('Total Confirmed Cases')
plt.yscale('log')
plt.grid(True, which="both", ls="--", linewidth=0.5)
# Show the plots
plt.tight_layout()
plt.show()
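The listing above uses yearly totals such as `total_confirmed_2020` and `total_confirmed_2021` without showing how they are obtained. The sketch below is one possible derivation, assuming the source column is a running (cumulative) count. The dictionary `cumulative_confirmed`, its values, and the helper `yearly_increase` are hypothetical and do not come from the project's dataset.

```python
from datetime import date

# Hypothetical cumulative confirmed counts (running totals), keyed by date.
cumulative_confirmed = {
    date(2020, 12, 30): 900,
    date(2020, 12, 31): 1000,
    date(2021, 6, 30): 2500,
    date(2021, 12, 31): 3400,
}


def yearly_increase(cumulative, year):
    """Return the new cases recorded within `year`.

    Computed as the last running total in the year minus the last running
    total before the year began (0 if there is no earlier record).
    """
    in_year = [d for d in cumulative if d.year == year]
    before = [d for d in cumulative if d.year < year]
    if not in_year:
        return 0
    end_value = cumulative[max(in_year)]
    start_value = cumulative[max(before)] if before else 0
    return end_value - start_value


total_confirmed_2020 = yearly_increase(cumulative_confirmed, 2020)
total_confirmed_2021 = yearly_increase(cumulative_confirmed, 2021)
print(total_confirmed_2020, total_confirmed_2021)
```

Taking differences of year-end values avoids counting the same cases on every day they appear in the running total, which is what happens if a cumulative column is summed directly.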