NEW Final 2 Internship Project

COVID 19 test cases

Uploaded by

Rajesh S

CHAPTER I

INTRODUCTION TO DATA ANALYTICS

1.1. Introduction to Data Analytics

Data analysis is a structured approach to examining, cleaning, transforming,
and modeling data with the purpose of discovering valuable information,
forming conclusions, and aiding decision-making. This process leverages
various techniques and tools to extract insights and discern patterns from
unprocessed data, thereby enhancing our understanding of the subject in
question.

The core aim of data analysis is to extract actionable insights from data,
enabling informed decision-making, trend detection, problem resolution, and
process improvement. It finds applications across diverse fields such as business,
finance, marketing, healthcare, and social sciences, among others, making it an
essential component in modern-day analytics.

The process of data analysis typically involves the following steps:

• Define the problem
• Data collection
• Data cleaning and preprocessing
• Data exploration
• Data transformation and feature engineering
• Statistical analysis and modeling
• Interpretation and visualization
• Conclusion and communication

Data analysis draws on a range of tools and programming languages,
including Excel, Python, R, SQL, and specialized software like Tableau or
Power BI. The selection of tools and techniques hinges on data characteristics,
analysis complexity, and desired outcomes.

Ultimately, data analysis empowers organizations and individuals to derive
meaningful insights, facilitating data-driven decision-making and enhancing
competitiveness in their sectors.

1.2. Data Analytics Approaches

Data analytics encompasses a wide array of methods and techniques for
analyzing and interpreting data to derive insights that inform decision-making.

Descriptive Analytics

Descriptive analytics focuses on summarizing and interpreting historical
data to understand past occurrences. Techniques such as data visualization,
summary statistics, and exploratory data analysis are employed to uncover
patterns, trends, and relationships within the data.
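
As a minimal sketch of the summary statistics mentioned above, the following uses only Python's standard library. The daily test counts are invented for illustration and are not taken from the project's dataset.

```python
import statistics

# Hypothetical daily test counts, invented purely for illustration.
daily_tests = [120, 135, 150, 110, 160, 155, 140]

# Basic descriptive statistics for the sample.
summary = {
    "count": len(daily_tests),
    "mean": statistics.mean(daily_tests),
    "median": statistics.median(daily_tests),
    "stdev": statistics.stdev(daily_tests),
    "min": min(daily_tests),
    "max": max(daily_tests),
}

for name, value in summary.items():
    print(f"{name}: {value}")
```

The same quantities are available in one call from a pandas Series via `describe()`, which is the more likely choice once the data is already in a DataFrame.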

Diagnostic Analytics

Diagnostic analytics moves beyond descriptive methods to investigate why
certain events or patterns occurred. It involves deep-diving into data to identify
root causes or contributing factors behind specific outcomes. Common
techniques include root cause analysis, regression analysis, and hypothesis
testing, which help in understanding the reasons behind observed trends.

Predictive Analytics

Predictive analytics leverages historical data and statistical or machine
learning models to forecast future events or behaviors. Techniques such as
regression analysis, time series analysis, and machine learning algorithms like
classification and clustering are used to detect patterns that can be extrapolated
to predict future trends and outcomes.
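
A minimal sketch of the regression-and-extrapolate idea, assuming NumPy is available. The observations are synthetic and follow an exact straight line, so the example only illustrates the mechanics of fitting and forecasting, not a realistic model.

```python
import numpy as np

# Hypothetical observations that follow an exact linear trend (y = 2x + 1).
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = 2.0 * x + 1.0

# Fit a straight line (degree-1 polynomial) by least squares.
slope, intercept = np.polyfit(x, y, deg=1)

# Extrapolate one step beyond the observed range.
next_x = 5.0
forecast = slope * next_x + intercept
print(f"slope={slope:.2f}, intercept={intercept:.2f}, forecast={forecast:.2f}")
```

With real data the fit would not be exact, and the forecast should be read together with some measure of its uncertainty.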

Prescriptive Analytics

Prescriptive analytics takes predictive analysis further by recommending
actions to optimize outcomes. By combining historical data, predictive models,
and optimization algorithms, prescriptive analytics suggests optimal strategies
or actions to achieve desired outcomes. It aids in decision-making by providing
actionable insights and recommendations based on rigorous data analysis.

1.3. Steps of Data Analytics

Data analytics typically follows a systematic process consisting of several
key steps aimed at extracting insights and deriving value from data.

Define the problem or objective

Clearly define the problem or objective that data analytics will address. This
initial step establishes the direction and focus for the analysis, ensuring
alignment with organizational goals and stakeholder needs.

Collect and understand the data

Identify relevant data sources and gather the necessary data. This phase
involves acquiring data, integrating different data sources, and ensuring data
cleanliness and completeness. Understanding the data also includes assessing its
quality, relevance, and potential biases.

Explore and prepare the data

Conduct exploratory data analysis to gain initial insights and uncover
patterns or anomalies within the dataset.

Preprocess the data by addressing issues like missing values, outliers, and
data normalization or transformation as needed. This stage is crucial for
laying the groundwork for subsequent analysis and ensuring the data is in a
suitable format for modeling.
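
The three preprocessing issues named above can be sketched with pandas as follows. The column name `tests` and its values are hypothetical, and median filling, 1.5 × IQR clipping, and min-max scaling are just one reasonable choice for each step, not the method used in this project.

```python
import pandas as pd

# Hypothetical data: one missing value and one obvious outlier.
df = pd.DataFrame({"tests": [100.0, 110.0, None, 105.0, 5000.0, 95.0]})

# 1. Missing values: fill with the column median (one of several options).
df["tests_filled"] = df["tests"].fillna(df["tests"].median())

# 2. Outliers: clip values to the 1.5 * IQR fences.
q1 = df["tests_filled"].quantile(0.25)
q3 = df["tests_filled"].quantile(0.75)
iqr = q3 - q1
lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr
df["tests_clipped"] = df["tests_filled"].clip(lower=lower, upper=upper)

# 3. Normalization: min-max scale the clipped values to the range [0, 1].
col = df["tests_clipped"]
df["tests_scaled"] = (col - col.min()) / (col.max() - col.min())

print(df)
```

Keeping each step in its own column makes it easy to compare the raw and cleaned values before committing to a choice.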

Select appropriate analytical techniques

Choose the most suitable analytical techniques based on the defined
problem or objective. This may include utilizing descriptive statistics, data
visualization, regression analysis, classification, clustering, time series analysis,
or other advanced analytical methods. The selection of techniques should be
driven by the nature of the data and the specific insights sought.

Apply the selected techniques

Implement the chosen analytical techniques on the prepared data. This step
involves applying algorithms, statistical tests, or models to extract meaningful
information and uncover patterns or relationships within the data. It is essential
to ensure that the techniques chosen are appropriate for the specific nature of
the data and aligned with the objectives of the analysis.

Interpret the results

Analyze the output from the applied techniques and interpret the findings
in the context of the original problem or objective. This step requires a
combination of domain knowledge, statistical expertise, and critical thinking to
derive actionable insights. Understanding the implications of the results is
crucial for making informed decisions and deriving value from the analysis.

Communicate and visualize the insights

Present the results of the analysis in a clear and meaningful way. Effective
communication involves creating visualizations, dashboards, reports, or other
forms of communication to convey the insights to stakeholders and
decision-makers. Visualizations should be intuitive and informative,
highlighting key findings and trends identified through the analysis. This step
not only facilitates understanding but also encourages stakeholders to engage
with the data and derive insights independently.

Iterate and refine

Continuously evaluate the effectiveness of the analysis and the impact of
insights on decision-making. This iterative approach ensures that the analysis
remains relevant and actionable, driving continuous improvement and
enhancing the organization's analytical capabilities.

1.4. Applications of Data Analytics

Business Intelligence

Involves using data analysis tools and techniques to gather, store, and
analyze business data to make informed decisions and improve operations.

Finance and Banking

Uses data analytics to manage financial transactions, assess risks, detect
fraud, and optimize investments to ensure financial stability and profitability.

Healthcare and Medicine

Applies data analytics to improve patient care, optimize hospital operations,
conduct medical research, and develop personalized treatments based on patient
data.

Marketing and Advertising

Utilizes data analysis to understand consumer behavior, target audiences
effectively, measure campaign performance, and optimize marketing strategies
for better ROI.

Manufacturing and Supply Chain

Uses data analytics to enhance production efficiency, manage inventory,
predict demand, optimize supply chain logistics, and reduce operational costs.

Social and Customer Analytics

Focuses on analyzing social media and customer data to understand
sentiment, behavior patterns, preferences, and trends to improve customer
satisfaction and engagement.

Energy and Utilities

Applies data analytics to monitor energy consumption, optimize resource
allocation, predict maintenance needs, and improve overall operational
efficiency in utilities and energy production.

Transportation and Logistics

Uses data to optimize route planning and supply chain efficiency, reducing
costs.

CHAPTER II

OVERVIEW OF THE PROBLEM

2.1. Problem Study

This study investigates the effectiveness of COVID-19 interventions
implemented throughout India. It focuses on assessing the public's adherence to
preventive measures such as mask-wearing and social distancing. Additionally,
it evaluates the distribution, administration, and acceptance of COVID-19
vaccines across diverse demographic groups and geographical regions. The
study also scrutinizes the functionality and capacity of testing infrastructure in
identifying and monitoring COVID-19 cases nationwide. Understanding the
socioeconomic factors and geographic disparities that influence the outcomes of
these intervention strategies is crucial.

The study aims to provide evidence-based insights to policymakers, healthcare
professionals, and public health authorities. By evaluating the strengths and
weaknesses of current interventions, the study seeks to propose informed
recommendations for refining strategies, improving healthcare delivery, and
enhancing overall public health outcomes in India amidst the ongoing
challenges posed by the COVID-19 pandemic.

2.2. Challenges / Need of the study

Collecting reliable data on cases, testing, vaccinations, and healthcare
facilities is difficult. There are big differences in healthcare access between
regions and between urban and rural areas, making it hard to distribute help
equally. Vaccine hesitancy, influenced by demographics and socio-economic
factors, adds to the problem, as do logistical issues like storing and transporting
vaccines. Effective policies need to consider local contexts and address
economic hardships to improve the COVID-19 response.

We collected data from Kaggle to help understand these challenges.
However, we faced problems with inconsistent and incomplete data. Some
regions had better data than others, and the quality of data varied. There were
also issues with different formats and definitions, making it hard to compare
information. Despite these challenges, using data from Kaggle helped us
identify key areas that need improvement.

Objectives

How many people in India received their first COVID-19 vaccination in 2021?

How many COVID-19 tests were conducted in India in 2020 and 2021?

2.3. Hardware / System Requirements

Installed RAM: 8.00 GB
Processor: 11th Gen Intel(R) Core(TM) i3-1115G4 @ 3.00GHz
System type: 64-bit operating system, x64-based processor

2.4. Software

Python version 3.12

Tools and Libraries Requirements

NumPy is a fundamental library for numerical computing in Python, providing
support for large, multi-dimensional arrays and a collection of mathematical
functions.

Pandas is a powerful library for data manipulation and analysis. It provides data
structures such as DataFrames for efficiently working with structured data.

Matplotlib is a popular plotting library that enables the creation of a wide range
of static, animated, and interactive visualizations in Python.

CHAPTER III

DATA PREPARATION

3.1. Data Collection Approaches

We collected and curated the dataset directly from Kaggle. We downloaded
relevant COVID-19 datasets on cases, testing, vaccinations, and healthcare
facilities in India. The data was then cleaned and organized to ensure
consistency and accuracy. Finally, we analyzed the curated data to identify key
challenges and areas needing improvement.

3.2. Data Method

Exploratory data analysis (EDA), advocated by John Tukey, encourages
statisticians to thoroughly explore data, potentially prompting the formulation of
new hypotheses and the initiation of further data collection and experiments.
EDA specifically centers on validating assumptions necessary for model fitting
and hypothesis testing. Additionally, it addresses the handling of missing values
and the transformation of variables as required. [Reference:7.5]

3.3. Purpose of Data

The purpose of collecting data for assessing the effectiveness of COVID-19
interventions in India is to evaluate their impact comprehensively. This data
helps gauge the effectiveness of various interventions in reducing transmission
rates, increasing vaccination coverage, and improving healthcare outcomes. It
allows for a deeper understanding of the population's response to these
interventions, identifying challenges and optimizing strategies accordingly.

CHAPTER IV

METHODOLOGY

This chapter provides a comprehensive overview of both exploratory and
statistical analyses, detailing the methods employed for each. Exploratory
analysis utilizes various methods to summarize and interpret data effectively.
This approach helps in organizing and presenting data insights clearly through
categories and aggregates. In contrast, statistical analysis employs tools like the
forecast method to predict future trends based on historical data patterns. In
exploratory analysis, the diagrams follow the Input-Processing-Output format,
illustrating how data is collected (input), processed using analytical methods
(processing), and finally interpreted or visualized to derive insights (output).
These diagrams provide a structured approach to understanding the flow of data
analysis, enhancing clarity and facilitating informed decision-making based on
the findings.
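
The Input-Processing-Output flow described above can be sketched as three small functions. The function names and the record values are hypothetical and chosen only to show the shape of the flow; they are not the project's data.

```python
from collections import defaultdict

def load_records():
    """Input: return raw records (hypothetical values for illustration)."""
    return [
        {"year": 2020, "result": "positive", "count": 3},
        {"year": 2020, "result": "negative", "count": 7},
        {"year": 2021, "result": "positive", "count": 2},
        {"year": 2021, "result": "negative", "count": 18},
    ]

def aggregate(records):
    """Processing: total the counts per (year, result) category."""
    totals = defaultdict(int)
    for row in records:
        totals[(row["year"], row["result"])] += row["count"]
    return dict(totals)

def report(totals):
    """Output: format the aggregates as readable lines, sorted by key."""
    return [
        f"{year} {result}: {count}"
        for (year, result), count in sorted(totals.items())
    ]

totals = aggregate(load_records())
lines = report(totals)
print("\n".join(lines))
```

Separating the three stages this way keeps the aggregation logic testable independently of where the data comes from or how it is displayed.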

4.1. Exploratory Analysis

Exploratory Data Analysis (EDA) aims to provide a comprehensive and
clear understanding of the dataset's properties and structures. It is the
foundational step in data analysis. EDA involves summarizing the main
characteristics of the data, often using visual methods. This process helps in
identifying patterns, spotting anomalies, and testing hypotheses.

MODELS

Input: In 2020, how many were positive and negative
Processing: Exploratory analysis
Output: Positive 23,06,25,655; Negative 3,68,79,52,310

Fig:1 The tests conducted in India in 2020, how many were positive and
negative

Fig:1 shows the distribution of COVID-19 test results in 2020 based on
exploratory analysis: 23,06,25,655 positive cases and 3,68,79,52,310 negative
cases. The flowchart represents this analysis with labeled boxes for "Positive
Cases" and "Negative Cases," connected by arrows indicating the flow of data.
This visual representation aids in understanding the proportion and
significance of each category within the dataset.

Input: In 2021, how many were positive and negative
Processing: Exploratory analysis
Output: Positive 8,94,27,591; Negative 9,33,44,13,022

Fig:2 The tests conducted in India in 2021, how many were positive and
negative

The 2021 report on COVID-19 test results highlights 8,94,27,591 positive cases
and 9,33,44,13,022 negative cases. Using exploratory analysis, these numbers
are summarized and visually represented to illustrate the distribution of results.
A flowchart depicts "Positive Cases" and "Negative Cases" with labeled boxes
connected by arrows, clarifying the dataset's proportions and significance. This
visualization facilitates clear communication of COVID-19 test outcomes,
aiding in understanding their implications.
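
The share of positive results in each year follows directly from the figures reported in Fig:1 and Fig:2. The sketch below is illustrative arithmetic rather than the report's own code: the helper name `parse_indian_number` is hypothetical, and it assumes the comma-grouped figures are plain integers written in Indian digit grouping.

```python
def parse_indian_number(text: str) -> int:
    """Convert a string in Indian digit grouping (e.g. '23,06,25,655') to an int."""
    return int(text.replace(",", ""))

# Figures exactly as reported in Fig:1 and Fig:2.
reported = {
    2020: {"positive": "23,06,25,655", "negative": "3,68,79,52,310"},
    2021: {"positive": "8,94,27,591", "negative": "9,33,44,13,022"},
}

# Positive share = positive / (positive + negative) for each year.
positive_share = {}
for year, counts in reported.items():
    positive = parse_indian_number(counts["positive"])
    negative = parse_indian_number(counts["negative"])
    positive_share[year] = positive / (positive + negative)
    print(f"{year}: positive share = {positive_share[year]:.2%}")
```

Expressing the two counts as a share makes the years comparable even though the absolute numbers differ by an order of magnitude.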

Input: COVID-19 tests conducted in India in 2020 and 2021
Processing: Exploratory analysis
Output: Total 2020 16,08,18,63,212; Total 2021 71,74,80,86,235

Fig:3 How many COVID-19 tests were conducted in India in 2020 and 2021

In 2020, India conducted a total of 16,08,18,63,212 COVID-19 tests as part of
its initial response to the pandemic. By 2021, the country significantly escalated
its testing efforts, performing 71,74,80,86,235 tests. This increase highlights
India's proactive approach to monitoring and controlling the spread of
COVID-19. Exploratory analysis of these figures through a flowchart visually
demonstrates the substantial rise in testing from 2020 to 2021, aiding in
assessing testing strategies and their impact on public health measures.
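
The size of the rise can be quantified from the two totals in Fig:3. This is a small arithmetic sketch that takes the figures exactly as reported; the variable names are illustrative.

```python
# Totals as reported in Fig:3, written in Indian digit grouping.
tests_2020 = int("16,08,18,63,212".replace(",", ""))
tests_2021 = int("71,74,80,86,235".replace(",", ""))

# Growth factor: how many times larger the 2021 total is than the 2020 total.
growth_factor = tests_2021 / tests_2020

# Percent increase relative to the 2020 total.
percent_increase = (tests_2021 - tests_2020) / tests_2020 * 100

print(f"Growth factor: {growth_factor:.2f}x")
print(f"Percent increase: {percent_increase:.1f}%")
```

Note that the growth factor and the percent increase describe the same change: the percent increase is the growth factor minus one, times 100.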

CHAPTER V

RESULTS

5.1 PYTHON

Fig:4 Total Confirmed COVID-19 Cases in India 2020 to 2021

Description: Analyzing the increase in COVID-19 testing from 2020 to 2021
indicates a heightened awareness and response to the pandemic, likely driven by
a corresponding increase in COVID-19 cases during that period. The significant
rise in testing numbers in 2021 reflects efforts to identify and manage the spread
of the virus more effectively, which is crucial in understanding and mitigating
the impact of COVID-19 on public health.

Fig:5 COVID-19 deaths reported in India in 2020 and 2021

Description: In analyzing the total COVID-19 deaths in India from 2020 to
2021, it is observed that there were more deaths in 2021 compared to the
previous year.

Fig:6 First Dose of Covaxin and Covishield Administered in India 2021

Description: The analysis shows that a significant number of people in
India received the first dose of Covaxin, indicating widespread acceptance and
administration of this COVID-19 vaccine across the population.

5.2. DISCUSSION & FUTURE WORK

The effectiveness of COVID-19 interventions in India has been a critical
topic throughout the pandemic. Initially, measures like lockdowns, mask
mandates, and vaccination drives were implemented to curb the spread of the
virus. These interventions showed varying degrees of success. Lockdowns
helped in controlling the rapid transmission by limiting movement, although
they also posed economic and social challenges. Mask mandates were crucial in
reducing infection rates by preventing the airborne spread of the virus.
Vaccination drives played a pivotal role in boosting immunity across the
population, reducing severe cases and hospitalizations significantly.

Future work should focus on several key areas. First, equitable vaccine
distribution should reach vulnerable populations and raise overall vaccination
coverage. Second, healthcare infrastructure should be improved to handle
potential future waves effectively, including adequate ICU beds, oxygen supply,
and medical staff readiness. Third, public awareness of and compliance with
preventive measures like mask-wearing and social distancing should be
strengthened, especially during a potential resurgence. Finally, continued
investment in research and development is needed to monitor virus mutations
and develop updated vaccines or treatments that maintain long-term immunity.

Additionally, strengthening testing and surveillance capabilities will enable
prompt detection and containment of outbreaks.

CHAPTER VI

SUMMARY

Testing efforts in India saw a notable increase, although access to timely
testing varied, posing challenges across different regions. The vaccination
campaigns achieved widespread success, administering doses of Covaxin,
Covishield, and other vaccines to millions, despite facing initial hurdles such as
vaccine hesitancy and logistical issues in rural areas. Both Covaxin and
Covishield played pivotal roles in the vaccination drive, overcoming early
supply chain challenges. While lockdowns and restrictions aided in controlling
outbreaks, their effectiveness was tempered by issues related to enforcement
and compliance.

CONCLUSION

Analyzing the effects of COVID-19 in India reveals a profound impact
across various dimensions. Healthwise, the pandemic strained hospitals and
healthcare resources, leading to significant challenges in managing patient
influx and providing adequate care. The emergence of new variants further
complicated efforts to control transmission rates. Economically, lockdowns and
restrictions disrupted supply chains, businesses, and livelihoods, particularly
affecting informal sectors and daily wage earners. The country experienced a
contraction in GDP growth and rising unemployment rates as a result of these
disruptions.

The pandemic exacerbated existing inequalities, disproportionately
affecting marginalized communities with limited access to healthcare and
resources. Educational institutions faced closures, impacting millions of
students' learning and development. Government responses included ramping
up testing facilities, accelerating vaccination drives, and implementing public
health measures to curb transmission. Challenges such as vaccine hesitancy,
misinformation, and vaccine equity issues posed ongoing hurdles in achieving
widespread immunization. Strengthening resilience against future health crises
involves learning from COVID-19 experiences to build a more robust and
inclusive public health framework for the future.

CHAPTER VII

REFERENCE

[1] Sachs JD, Karim SSA, Aknin L, Allen J, Brosbøl K, Colombo F, et al. The Lancet
Commission on lessons for the future from the COVID-19 pandemic. Lancet. 2022.
Available from: https://doi.org/10.1016/S0140-6736(22)01585-9.

[2] World Health Organization. WHO responds to The Lancet COVID-19 Commission [press
release]. Geneva: WHO; 2022. Available from:
https://www.who.int/es/news/item/15-09-2022-who-responds-to-the-lancet-covid-19-commission.

[3] United Nations. Transforming our world: the 2030 Agenda for Sustainable Development.
New York: United Nations; 2015. Available from:
https://www.un.org/en/development/desa/population/migration/generalassembly/docs/globalcompact/A_RES_70_1_E.pdf.

[4] Pan American Health Organization. Sustainable Health Agenda for the Americas 2018-
2030: A Call to Action for Health and Well-Being in the Region. Washington, D.C.: PAHO;
2017. Available from: https://iris.paho.org/handle/10665.2/49170.

[5] Economic Commission for Latin America and the Caribbean. Repercussions in Latin
America. Santiago: ECLAC; 2022. Available from:
https://repositorio.cepal.org/handle/11362/47913.

CHAPTER VIII

APPENDIX

Source code

import pandas as pd
import matplotlib.pyplot as plt

# Load COVID-19 data
covid_data_path = '/covid_19_india.csv'
covid_data = pd.read_csv(covid_data_path)

# Convert the 'Date' column to datetime format
covid_data['Date'] = pd.to_datetime(covid_data['Date'], format='%Y-%m-%d')

# Extract the year from the date
covid_data['Year'] = covid_data['Date'].dt.year

# Filter the data for 2020 and 2021
covid_data_2020 = covid_data[covid_data['Year'] == 2020]
covid_data_2021 = covid_data[covid_data['Year'] == 2021]

# Calculate total confirmed cases for each year.
# Note: .max() returns the largest single value in the column. It equals the
# yearly total only if 'Confirmed' is a running nationwide count; if the file
# holds one row per state per day, it is the peak value of a single state.
total_confirmed_2020 = covid_data_2020['Confirmed'].max()
total_confirmed_2021 = covid_data_2021['Confirmed'].max()

# Calculate total deaths for each year (the same caveat applies to 'Deaths')
total_deaths_2020 = covid_data_2020['Deaths'].max()
total_deaths_2021 = covid_data_2021['Deaths'].max()

# Print total deaths in 2020 and 2021
print(f"Total deaths in 2020: {total_deaths_2020}")
print(f"Total deaths in 2021: {total_deaths_2021}")

# Plotting total confirmed cases (left panel)
plt.figure(figsize=(15, 6))

plt.subplot(1, 2, 1)
plt.bar(['2020', '2021'], [total_confirmed_2020, total_confirmed_2021],
        color=['blue', 'green'])
plt.title('Total Confirmed COVID-19 Cases in India (2020 vs 2021)')
plt.xlabel('Year')
plt.ylabel('Total Confirmed Cases')
plt.yscale('log')
plt.grid(True, which="both", ls="--", linewidth=0.5)

# Plotting total deaths (right panel)
plt.subplot(1, 2, 2)
plt.bar(['2020', '2021'], [total_deaths_2020, total_deaths_2021],
        color=['red', 'orange'])
plt.title('Total COVID-19 Deaths in India (2020 vs 2021)')
plt.xlabel('Year')
plt.ylabel('Total Deaths')
plt.yscale('log')
plt.grid(True, which="both", ls="--", linewidth=0.5)

# Show the plots
plt.tight_layout()
plt.show()

# Load COVID-19 vaccine data
vaccine_data_path = '/covid_vaccine_statewise.csv'
vaccine_data = pd.read_csv(vaccine_data_path)

# Convert the 'Updated On' column to datetime format
vaccine_data['Updated On'] = pd.to_datetime(vaccine_data['Updated On'],
                                            format='%d/%m/%Y')

# Extract the year from the date
vaccine_data['Year'] = vaccine_data['Updated On'].dt.year

# Filter the data for 2021 and sum the 'First Dose Administered' column.
# Note: the column name does not mention a vaccine; the result is reported
# below as Covaxin first doses, following the labels used in this report.
vaccine_data_2021 = vaccine_data[vaccine_data['Year'] == 2021]
covaxin_first_dose_2021 = vaccine_data_2021['First Dose Administered'].sum()

# Print the total number of people who received the first dose of Covaxin in 2021
print(f"Total number of people who received the first dose of Covaxin in 2021: {covaxin_first_dose_2021}")

# Plotting first dose of Covaxin administered
plt.figure(figsize=(8, 6))
plt.bar(['First Dose Covaxin'], [covaxin_first_dose_2021], color='purple')
plt.title('First Dose of Covaxin Administered in India (2021)')
plt.xlabel('Dose Type')
plt.ylabel('Number of People')
plt.grid(True, which="both", ls="--", linewidth=0.5)

# Show the plot
plt.show()
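
The listing above takes `.max()` of the `Confirmed` column as each year's total. If that column holds cumulative counts recorded per state and per day (an assumption about the file layout, not something stated in this report), a national figure would instead be the sum of each state's last value in the year. The sketch below shows that alternative on a tiny invented table; the state names and counts are hypothetical.

```python
import pandas as pd

# Hypothetical cumulative counts for two states (values invented for illustration).
data = pd.DataFrame({
    "Date": pd.to_datetime(
        ["2020-12-30", "2020-12-31", "2021-01-01", "2021-01-02"] * 2
    ),
    "State": ["A"] * 4 + ["B"] * 4,
    "Confirmed": [90, 100, 110, 130, 40, 50, 55, 70],
})
data["Year"] = data["Date"].dt.year

# Last cumulative value per state within each year, then summed over states.
year_end = (
    data.sort_values("Date")
        .groupby(["Year", "State"])["Confirmed"]
        .last()
        .groupby(level="Year")
        .sum()
)

# New cases within each year: difference between consecutive year-end totals.
# The first year has no predecessor, so its year-end total is used as is.
new_cases = year_end.diff().fillna(year_end)

print(year_end)
print(new_cases)
```

Whether this or the original `.max()` approach is appropriate depends on how the source file is organized, so it is worth inspecting a few rows of the data before choosing.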
