
Data Analytics Life Cycle Explained

The Data Analytics Life Cycle consists of six phases: Discovery, Data Preparation, Model Planning, Model Building, Communicate Results, and Operationalize, which collectively transform raw data into actionable insights. The document also discusses the concept of 'Data Deluge,' driven by the increase in digital devices, IoT, social media, and more, emphasizing the importance of managing and analyzing vast amounts of data. Additionally, it outlines the differences between Business Intelligence and Data Science, as well as the characteristics and sources of Big Data, and describes three types of data analysis: Descriptive, Diagnostic, and Predictive.

Uploaded by

kulkarnisohama

Unit 3: DSBDA

Data Analytics Life Cycle (All Phases)

Data Analytics Life Cycle

The Data Analytics Life Cycle is a step-by-step process used to solve
data-related problems using analytics. It helps transform raw data into
valuable insights for decision-making. It consists of six phases:

1. Discovery

● This is the first phase, where the goal is to understand the
problem.
● Involves identifying:
○ Business goals
○ Key stakeholders
○ Available resources
● Analysts gather domain knowledge, define objectives, and assess
tools and technologies.

📌 Example: A bank wants to reduce customer churn. In this phase,
analysts study why customers are leaving.

2. Data Preparation

● Also called Data Preprocessing or Data Wrangling.
● In this phase, raw data is:
○ Collected
○ Cleaned
○ Integrated from multiple sources
● Data is transformed into a usable format for analysis.

📌 Example: Removing missing values, correcting data types, and
merging data from different departments.
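
The three steps in the example can be sketched in plain Python. This is only a minimal illustration: the field names (customer_id, monthly_spend, complaints), the sample rows, and the choice to treat a customer with no support record as having 0 complaints are all assumptions. Real projects usually rely on a data-frame library for this work.

```python
# Hypothetical raw records from two departments (illustrative data only).
sales_rows = [
    {"customer_id": "101", "monthly_spend": "250.0"},
    {"customer_id": "102", "monthly_spend": ""},        # missing value
    {"customer_id": "103", "monthly_spend": "410.5"},
]
support_rows = [
    {"customer_id": "101", "complaints": "2"},
]


def prepare(sales, support):
    """Clean, type-correct, and merge the two sources on customer_id."""
    # Cleaning: drop rows whose spend is missing.
    cleaned = [row for row in sales if row["monthly_spend"].strip()]
    # Index the second source by key so the merge is a dictionary lookup.
    complaints_by_id = {
        int(r["customer_id"]): int(r["complaints"]) for r in support
    }
    merged = []
    for row in cleaned:
        customer_id = int(row["customer_id"])   # correct the data type
        merged.append({
            "customer_id": customer_id,
            "monthly_spend": float(row["monthly_spend"]),
            # Assumption: no support record means 0 complaints.
            "complaints": complaints_by_id.get(customer_id, 0),
        })
    return merged


prepared = prepare(sales_rows, support_rows)
print(prepared)
```

Customer 102 is dropped because its spend is missing, and the remaining rows carry numeric types and a merged complaints column.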

3. Model Planning

● This phase focuses on deciding:
○ Which analytical techniques to use
○ Which algorithms or models might fit best
● Tools like R, Python, or SQL can be used to explore data visually
and statistically.

📌 Example: Choosing regression for prediction, or clustering to group
similar customers.
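
One simple way to explore data statistically before choosing a technique is to check how strongly a candidate input moves with the target. The sketch below computes the Pearson correlation by hand. The variable names and numbers are invented for illustration, and a strong correlation only suggests that a linear regression is worth trying, nothing more.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    return covariance / sqrt(spread_x * spread_y)


# Hypothetical exploration data (made-up numbers).
ad_spend = [10, 20, 30, 40, 50]
store_visits = [3, 1, 4, 1, 5]
sales = [25, 45, 65, 85, 105]

r_ad_spend = pearson(ad_spend, sales)      # strong linear relationship
r_visits = pearson(store_visits, sales)    # much weaker relationship
print(round(r_ad_spend, 3), round(r_visits, 3))
```

Here ad_spend tracks sales almost perfectly, so it is a reasonable candidate input for a regression model, while store_visits is a much weaker one.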

4. Model Building

● Actual creation of models happens here using:
○ Machine learning
○ Statistical methods
● Training data is used to build models; test data is used to
evaluate them.

📌 Example: Building a classification model to predict if a customer will
churn.
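
The train/test idea can be shown with a deliberately tiny classifier: a single threshold on one feature. This is a sketch under stated assumptions, not a recommended churn model. The feature (number of support calls), the records, and the fixed 8/2 split are all made up, and in practice the data would be shuffled before splitting.

```python
# Hypothetical labelled records: (support_calls, churned). Illustrative only.
records = [
    (0, 0), (1, 0), (2, 0), (1, 0), (5, 1), (6, 1), (4, 1), (7, 1),
    (0, 0), (6, 1),
]
# Fixed split keeps the sketch deterministic; real projects shuffle first.
train, test = records[:8], records[8:]


def accuracy(threshold, rows):
    """Share of rows where 'calls >= threshold' matches the churn label."""
    correct = sum((calls >= threshold) == bool(label) for calls, label in rows)
    return correct / len(rows)


def fit_threshold(rows):
    """Training: pick the threshold with the best accuracy on the given rows."""
    candidates = sorted({calls for calls, _ in rows})
    return max(candidates, key=lambda t: accuracy(t, rows))


threshold = fit_threshold(train)            # built from training data only
test_accuracy = accuracy(threshold, test)   # evaluated on unseen test data
print(threshold, test_accuracy)
```

The threshold is chosen using only the training rows, and the test rows are touched once, to estimate how the model behaves on data it has not seen.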

5. Communicate Results

● Findings are shared with stakeholders using:
○ Charts
○ Dashboards
○ Reports
● Helps business users understand the impact and take action.

📌 Example: A dashboard showing which customer segments are at
highest risk of leaving.
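
A dashboard like this is usually built in a BI tool, but the underlying summary is simple. The sketch below averages predicted churn probabilities per segment and prints a crude text bar chart. The segment names and probabilities are hypothetical.

```python
from collections import defaultdict

# Hypothetical scored customers: (segment, predicted churn probability).
scored = [
    ("students", 0.62), ("students", 0.58),
    ("families", 0.21), ("families", 0.35),
    ("retirees", 0.12),
]


def risk_by_segment(rows):
    """Average predicted churn probability per segment, highest risk first."""
    grouped = defaultdict(list)
    for segment, probability in rows:
        grouped[segment].append(probability)
    averages = {seg: sum(ps) / len(ps) for seg, ps in grouped.items()}
    return sorted(averages.items(), key=lambda item: item[1], reverse=True)


segment_report = risk_by_segment(scored)
for segment, risk in segment_report:
    bar = "#" * round(risk * 20)   # crude text bar chart
    print(f"{segment:<10} {risk:5.0%} {bar}")
```

Sorting by risk puts the segment that needs attention at the top, which is what a business reader looks for first.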

6. Operationalize

● Final phase where the model is:
○ Deployed into production
○ Integrated into business operations
● Also involves creating final reports and monitoring the model over
time.

📌 Example: The churn prediction model is integrated into the CRM
system to alert sales teams.
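
Monitoring a deployed model can be as simple as tracking accuracy over the most recent predictions. The class below is a hypothetical sketch: the name AccuracyMonitor, the window size, and the 0.6 threshold are assumptions chosen for illustration, not part of any particular CRM or monitoring product.

```python
from collections import deque


class AccuracyMonitor:
    """Tracks accuracy over the most recent predictions and flags degradation."""

    def __init__(self, window=5, min_accuracy=0.6):
        self.outcomes = deque(maxlen=window)   # True when a prediction was right
        self.min_accuracy = min_accuracy

    def record(self, predicted, actual):
        self.outcomes.append(predicted == actual)

    def accuracy(self):
        # Assumes at least one outcome has been recorded.
        return sum(self.outcomes) / len(self.outcomes)

    def needs_review(self):
        # Only judge once the window is full, to avoid alerts on tiny samples.
        window_full = len(self.outcomes) == self.outcomes.maxlen
        return window_full and self.accuracy() < self.min_accuracy


monitor = AccuracyMonitor(window=5, min_accuracy=0.6)
# Hypothetical (predicted, actual) churn labels arriving over time.
for predicted, actual in [(1, 1), (0, 0), (1, 0), (0, 1), (1, 0)]:
    monitor.record(predicted, actual)
print(monitor.accuracy(), monitor.needs_review())
```

When needs_review() returns True, the model would typically be re-evaluated or retrained on fresher data.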

✅ Conclusion
The Data Analytics Life Cycle ensures a structured and effective way
to convert data into actionable insights. Each phase plays a crucial role
in solving real-world business problems with data science.

Driving Data Deluge

The term "Data Deluge" refers to the rapid and massive increase in
data being generated from various sources every second. This
explosion of data is closely associated with the rise of Big Data.

There are several key drivers that contribute to this data deluge:

1. Increase in Digital Devices

● Devices like smartphones, laptops, tablets, and wearables
constantly generate data.
● These devices are connected to the internet and produce
real-time data such as location, messages, videos, etc.

📌 Example: A smartphone using GPS, fitness tracking, and apps all at
once.

2. Internet of Things (IoT)

● Billions of devices like smart TVs, smartwatches, cars, home
appliances, and industrial machines are connected to the internet.
● These devices collect, transmit, and store huge amounts of
sensor and machine data.

📌 Example: A smart refrigerator monitoring temperature and sending
alerts.

3. Social Media Platforms

● Sites like Facebook, Instagram, Twitter, and YouTube generate
vast data through posts, comments, likes, shares, and videos.
● This data is largely unstructured, large in volume, and produced
at high speed.

📌 Example: According to widely cited figures, more than 500 hours of
video are uploaded to YouTube every minute.

4. Cloud Computing

● Cloud platforms offer virtually unlimited, on-demand storage and
processing power, encouraging organizations to store and analyze
more data than ever before.

📌 Example: Amazon Web Services (AWS), Google Cloud, and
Microsoft Azure hosting enterprise data.

5. Digital Transactions

● Online activities like banking, e-commerce, ticket bookings, and
online payments generate structured data.
● This includes timestamps, transaction IDs, user data, etc.

📌 Example: Every swipe of a credit card creates a new data record.
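
"Structured" here means every record follows the same fixed set of typed fields. A minimal sketch of one such record is shown below. The class name Transaction, its field names, and the sample values are hypothetical.

```python
from dataclasses import asdict, dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class Transaction:
    """One card-swipe record with a fixed, typed schema (hypothetical fields)."""
    transaction_id: str
    user_id: int
    amount: float
    timestamp: datetime


swipe = Transaction(
    transaction_id="TXN-0001",
    user_id=42,
    amount=1299.00,
    timestamp=datetime(2024, 1, 15, 10, 30, tzinfo=timezone.utc),
)
row = asdict(swipe)   # the same record as a plain row, ready for a table
print(sorted(row))
```

Because every transaction has the same fields, such records fit naturally into rows and columns of a relational table.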

6. Multimedia Content

● High-definition videos, images, and audio shared online consume
large amounts of storage.
● Content from cameras, CCTV, drones, and video streaming adds to
the data deluge.

📌 Example: Netflix streaming or Zoom video meetings across the
world.

7. Business and Government Data

● Enterprises and governments collect vast amounts of data for
decision-making.
● This includes customer data, healthcare records, census data,
security footage, etc.

📌 Example: A city using data from sensors and traffic cameras to
manage congestion.

✅ Conclusion
The data deluge is driven by the growing number of connected
devices, online activity, and digital services. Managing and analyzing
this massive data is crucial for organizations to stay competitive, make
decisions, and innovate.

Data Science (Business Intelligence And Data Science)

✅ Definition of Data Science:

Data Science is the process of extracting meaningful insights and
knowledge from structured and unstructured data using mathematics,
statistics, machine learning, programming, and domain knowledge.

🔍 What is Business Intelligence (BI)?

Business Intelligence (BI) involves:

● Collecting, processing, and analyzing historical data
● Generating reports, dashboards, and visualizations
● Helping organizations monitor performance and make decisions

📌 BI answers questions like: "What happened?" and "Why did it
happen?"

🤖 What is Data Science?

Data Science is a more advanced approach that includes:

● BI techniques, plus
● Predictive analytics, machine learning, and AI
● Used for forecasting, pattern detection, and decision
automation

📌 Data Science answers: "What will happen?" and "What should we do
next?"

🆚 Difference Between Business Intelligence and Data Science:

Feature       | Business Intelligence (BI)                | Data Science
--------------+-------------------------------------------+--------------------------------------------------
Data Type     | Structured                                | Structured + Unstructured
Focus         | Descriptive (Past/Present)                | Predictive and Prescriptive (Future)
Tools Used    | SQL, Excel, Tableau                       | Python, R, Hadoop, Machine Learning
Techniques    | Reporting, Dashboards                     | Statistics, AI, ML, Data Mining
Main Question | "What happened?" and "Why did it happen?" | "What will happen?" and "What should we do next?"
User Type     | Business Analysts                         | Data Scientists, Analysts

✅ How They Work Together:

● BI helps in understanding past performance.
● Data Science builds on BI to predict trends, automate
processes, and optimize decisions.
● Many organizations use BI for reporting and Data Science for
advanced analytics.

📌 Example:
● BI shows sales dropped last month.
● Data Science predicts sales for next month and suggests
marketing strategies.

📝 Conclusion:
Business Intelligence and Data Science are both essential parts of
modern data-driven decision-making.

● BI helps in understanding the "what and why" of past data.
● Data Science provides deeper insights and helps in predicting and
optimizing the future.

Big Data (Characteristics, Sources)

Big Data refers to extremely large and complex datasets that
traditional data processing tools cannot handle efficiently. It includes
data from various sources, in various formats, arriving at very high
speed.

✅ Characteristics of Big Data (The 5 Vs)

Big Data is typically described using 5 key characteristics, often
called the 5 Vs:

1. Volume

● Refers to the huge amount of data generated every second.

📌 Example: Facebook generates many terabytes of data every day.

2. Velocity

● Describes the speed at which data is generated and processed.

📌 Example: Stock market prices update many times per second.

3. Variety

● Refers to the different types of data:
○ Structured (tables)
○ Semi-structured (XML, JSON)
○ Unstructured (videos, emails, images)

📌 Example: Tweets, videos, logs, GPS data, etc.
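
The difference between the three types shows up in how much work is needed before the data can be used. The sketch below handles one tiny sample of each with the Python standard library. All sample values are invented for illustration.

```python
import csv
import io
import json

# Hypothetical samples of each data type (made-up values).
csv_text = "customer_id,city\n101,Pune\n102,Mumbai\n"
json_text = '{"customer_id": 101, "devices": [{"type": "phone"}, {"type": "watch"}]}'
free_text = "Great service, but the mobile app keeps crashing."

# Structured: every row has the same named columns.
rows = list(csv.DictReader(io.StringIO(csv_text)))

# Semi-structured: self-describing keys, nesting, and variable-length lists.
document = json.loads(json_text)
device_types = [device["type"] for device in document["devices"]]

# Unstructured: no schema at all; even a word count needs custom processing.
words = free_text.lower().replace(",", "").replace(".", "").split()

print(rows[1]["city"], device_types, len(words))
```

The CSV and JSON samples can be read directly into fields, while the free text needs its own processing rules before any analysis is possible.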

4. Veracity

● Refers to the trustworthiness and quality of the data.
● Big data may have inconsistencies, noise, or errors.

📌 Example: User-entered data may be inaccurate or missing.
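
A basic veracity check counts how many records are incomplete or implausible before they are trusted for analysis. The function below is a minimal sketch. The field names, the sample entries, and the 0 to 120 age range are assumptions made for illustration.

```python
def quality_report(records, required=("name", "age")):
    """Count records with missing required fields or implausible ages."""
    missing = invalid = 0
    for record in records:
        if any(record.get(field) in (None, "") for field in required):
            missing += 1
        elif not 0 <= record["age"] <= 120:   # plausibility range is an assumption
            invalid += 1
    clean = len(records) - missing - invalid
    return {"missing": missing, "invalid": invalid, "clean": clean}


# Hypothetical user-entered records.
entries = [
    {"name": "Asha", "age": 34},
    {"name": "", "age": 29},          # missing name
    {"name": "Ravi", "age": 430},     # typing error in age
    {"name": "Meera", "age": None},   # missing age
]
quality = quality_report(entries)
print(quality)
```

Only one of the four entries passes both checks, which is the kind of finding that drives the cleaning work in the Data Preparation phase.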

5. Value

● Often considered the most important "V": extracting meaningful
insights from data to make decisions and create business value.

📌 Example: Predicting customer behavior to increase sales.

🔍 Sources of Big Data

Big Data comes from multiple sources, including:

1. Social Media

● Platforms like Facebook, Instagram, and Twitter generate huge
amounts of unstructured data: images, videos, comments.

2. Sensor Data / IoT Devices

● Smart devices, GPS, wearables, and home automation systems
collect real-time sensor data.

3. Transaction Data

● Online purchases, bank records, ATM transactions, and e-commerce
data, mostly structured.

4. Machine Data / Logs

● Web servers, network devices, and system logs provide valuable
data for analysis.

5. Multimedia Content

● Audio, video, CCTV footage, and satellite images: data with large
volume and storage needs.

6. Government and Public Data

● Weather reports, census data, healthcare records, and public
services generate bulk data.

✅ Conclusion
Big Data is defined by its Volume, Velocity, Variety, Veracity, and
Value, and is sourced from social media, sensors, transactions, logs,
and more. Managing and analyzing this data helps in decision-making,
innovation, and improving services across industries.

Descriptive, Diagnostic, Predictive Analysis

Types of Data Analysis

Data Analytics is broadly divided into different types based on the
purpose and nature of insights. The three most commonly used types
are:

🔷 1. Descriptive Analysis – “What happened?”


● This type of analysis helps in summarizing and understanding
past data.
● It answers what has occurred in a business or system using
historical data.
● Used for reporting, dashboards, and KPIs.

✅ Key Features:
● Based on historical data
● Uses graphs, charts, and tables
● Helps monitor performance and trends

📌 Example: Monthly sales report, website traffic, attendance reports.
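
A descriptive summary needs nothing more than basic statistics over historical values. The sketch below summarizes a hypothetical set of monthly sales figures. The numbers are made up.

```python
from statistics import mean, median

# Hypothetical monthly sales (units sold); made-up numbers.
monthly_sales = {"Jan": 120, "Feb": 135, "Mar": 150, "Apr": 110, "May": 160}

summary = {
    "total": sum(monthly_sales.values()),
    "mean": mean(monthly_sales.values()),
    "median": median(monthly_sales.values()),
    "best_month": max(monthly_sales, key=monthly_sales.get),
    "worst_month": min(monthly_sales, key=monthly_sales.get),
}
print(summary)
```

Every value in the summary describes what already happened. Nothing here explains causes or predicts the next month.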

🔷 2. Diagnostic Analysis – “Why did it happen?”


● It goes one step beyond descriptive analysis.
● Focuses on finding the root cause of a problem or event.
● Uses data mining, correlation analysis, and drill-down
techniques.

✅ Key Features:
● Explains causes and reasons
● Involves comparison and data relationships
● Helps in understanding failures or successes

📌 Example: Analyzing why sales dropped in a particular region last
month.
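
Drill-down means breaking a total into its parts to see where a change came from. In the sketch below, the overall sales drop is split by region. The region names and figures are hypothetical.

```python
# Hypothetical sales by region for two consecutive months (made-up numbers).
last_month = {"North": 500, "South": 480, "East": 450, "West": 470}
this_month = {"North": 495, "South": 485, "East": 300, "West": 465}


def drill_down(before, after):
    """Change per region, sorted so the largest drop comes first."""
    changes = {region: after[region] - before[region] for region in before}
    return sorted(changes.items(), key=lambda item: item[1])


changes = drill_down(last_month, this_month)
total_change = sum(delta for _, delta in changes)
worst_region, worst_delta = changes[0]
print(total_change, worst_region, worst_delta)
```

Almost the entire drop comes from one region, which narrows the search for a root cause to that region. Finding the actual cause still requires further investigation.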

🔷 3. Predictive Analysis – “What is likely to happen?”


● Uses historical data, statistical models, and machine learning to
predict future outcomes.
● Helps in forecasting trends, behaviors, and risks.

✅ Key Features:
● Uses predictive models and algorithms
● Helps in proactive decision-making
● Improves business planning

📌 Example: Predicting customer churn, forecasting demand, or
detecting fraud.
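
Demand forecasting can be illustrated with the simplest predictive model, a straight line fitted by least squares. This is a sketch under the assumption that demand follows a roughly linear trend. The monthly figures are made up, and real forecasts usually need richer models and validation.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = (
        sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
        / sum((x - mean_x) ** 2 for x in xs)
    )
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical historical demand for months 1 to 5 (made-up numbers).
months = [1, 2, 3, 4, 5]
demand = [100, 110, 125, 130, 145]

slope, intercept = fit_line(months, demand)
forecast_month_6 = slope * 6 + intercept   # extrapolate one month ahead
print(slope, intercept, forecast_month_6)
```

The model learns the trend from historical data and extends it one step into the future, which is the core idea behind predictive analysis.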

📊 Comparison Table:

Type        | Main Question      | Focus               | Techniques Used                    | Example
------------+--------------------+---------------------+------------------------------------+------------------------------
Descriptive | What happened?     | Past data summary   | Reporting, dashboards              | Monthly sales report
Diagnostic  | Why did it happen? | Root cause analysis | Drill-down, correlation, queries   | Analyzing customer complaints
Predictive  | What will happen?  | Forecasting future  | ML models, regression, time series | Predicting product demand

✅ Conclusion
These three types of analysis (Descriptive, Diagnostic, and
Predictive) play a vital role in the Data Analytics Life Cycle.

● Descriptive tells what happened,
● Diagnostic tells why it happened, and
● Predictive tells what is likely to happen next.

Together, they help businesses make smarter, data-driven
decisions.
