Data Science

Data science is an interdisciplinary field that uses scientific methods and techniques from many fields to extract knowledge from structured and unstructured data. It employs techniques from mathematics, statistics, computer science and information science. The document discusses the data science life cycle which includes phases of discovery, data preparation, model planning, model building, operationalizing results, and communicating findings. It also outlines the key skills needed for a data scientist including statistics, machine learning, domain knowledge, coding, and communication.

Uploaded by

Shikhar Choudhary


Data Science

Data science is an interdisciplinary field that uses scientific methods, processes,
algorithms and systems to extract knowledge and insights from data in various forms,
both structured and unstructured, similar to data mining.
Data science is a "concept to unify statistics, data analysis, machine learning and
their related methods" in order to "understand and analyze actual phenomena" with
data. It employs techniques and theories drawn from many fields within the context
of mathematics, statistics, information science, and computer science.
As the world entered the era of big data, the need to store that data also grew.
Storage was the main challenge and concern for enterprises until 2010, and the main
focus was on building frameworks and solutions to store data. Now that Hadoop and
other frameworks have successfully solved the problem of storage, the focus has
shifted to the processing of this data, and Data Science is the key to doing so. Many
ideas that you see in Hollywood sci-fi movies can be turned into reality with Data
Science, and it is closely tied to the future of Artificial Intelligence. It is
therefore important to understand what Data Science is and how it can add value to
your business.
Data Science Life Cycle
Here is a brief overview of the main phases of the Data Science Lifecycle:

Phase 1—Discovery: Before you begin the project, it is important to understand the
various specifications, requirements, priorities and required budget. You must be able
to ask the right questions. Here, you assess whether you have the required resources
in terms of people, technology, time and data to support the project. In this phase,
you also need to frame the business problem and formulate initial hypotheses (IH) to
test.

Phase 2—Data preparation: In this phase, you require an analytical sandbox in which
you can perform analytics for the entire duration of the project. You need to explore,
preprocess and condition data prior to modeling. Further, you will perform ETLT
(extract, transform, load and transform) to get data into the sandbox. Let’s have a look
at the Statistical Analysis flow below.
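
The ETLT steps described above (extract, transform, load and transform) can be
sketched in a few lines of Python. This is only a minimal illustration under stated
assumptions: the CSV content, the column names (customer_id, age, monthly_spend) and
the use of a plain dictionary as the "sandbox" are all hypothetical and do not come
from the original text.

```python
import csv
import io

# Hypothetical raw extract; column names and values are made up for illustration.
RAW_CSV = """customer_id,age,monthly_spend
1,34,120.50
2,,80.00
3,45,not_available
4,29,200.00
"""


def extract(text):
    """Extract: read the raw CSV text into a list of dict rows."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transform: convert types and drop rows with missing or invalid values."""
    clean = []
    for row in rows:
        try:
            clean.append({
                "customer_id": int(row["customer_id"]),
                "age": int(row["age"]),
                "monthly_spend": float(row["monthly_spend"]),
            })
        except ValueError:
            continue  # skip rows that cannot be parsed
    return clean


def load(rows, sandbox):
    """Load: store the cleaned rows in the sandbox (here, just a dict)."""
    sandbox["customers"] = rows
    return sandbox


def scale_spend(sandbox):
    """Second transform, inside the sandbox: min-max scale monthly_spend."""
    rows = sandbox["customers"]
    values = [r["monthly_spend"] for r in rows]
    low, high = min(values), max(values)
    for r in rows:
        r["spend_scaled"] = (r["monthly_spend"] - low) / (high - low) if high > low else 0.0
    return sandbox


sandbox = scale_spend(load(transform(extract(RAW_CSV)), {}))
for row in sandbox["customers"]:
    print(row)
```

In a real project the sandbox would typically be a database or a distributed file
system rather than an in-memory dictionary; the point here is only the order of the
steps.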

Phase 3—Model planning: Here, you will determine the methods and techniques to
draw the relationships between variables. These relationships will set the base for
the algorithms which you will implement in the next phase. You will apply Exploratory
Data Analysis (EDA) using various statistical formulas and visualization tools.
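
One common way to examine the relationship between two variables during exploratory
analysis is to compute summary statistics and the Pearson correlation coefficient.
The sketch below assumes two hypothetical variables, ad_spend and sales, with made-up
values; the original text does not name a specific formula or dataset.

```python
import math
import statistics

# Hypothetical observations, invented for illustration.
ad_spend = [10.0, 20.0, 30.0, 40.0, 50.0]
sales = [25.0, 45.0, 65.0, 85.0, 105.0]


def summarize(values):
    """Return basic descriptive statistics for one variable."""
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "stdev": statistics.stdev(values),
    }


def pearson(xs, ys):
    """Pearson correlation coefficient between two equally long sequences."""
    mean_x, mean_y = statistics.mean(xs), statistics.mean(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


ad_summary = summarize(ad_spend)
r = pearson(ad_spend, sales)
print("ad_spend summary:", ad_summary)
print("correlation between ad_spend and sales:", r)
```

A coefficient close to 1 or -1 suggests a strong linear relationship that may be
worth modeling in the next phase; a value near 0 suggests no linear relationship,
although a non-linear one may still exist.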

Phase 4—Model building: In this phase, you will develop datasets for training and
testing purposes. You will consider whether your existing tools will suffice for running
the models or whether you will need a more robust environment (such as fast, parallel
processing). You will analyze various learning techniques like classification, association
and clustering to build the model.
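
The idea of separate training and testing datasets can be illustrated with a small
classification sketch. It assumes synthetic two-dimensional data with two labels,
"A" and "B", and uses a nearest-centroid classifier chosen purely for brevity; the
original text does not prescribe a particular algorithm, and all names below are
hypothetical.

```python
import random


def make_data(seed=0):
    """Generate hypothetical labeled points: two well-separated clusters."""
    rng = random.Random(seed)
    data = []
    for _ in range(50):
        data.append(((rng.gauss(0.0, 0.5), rng.gauss(0.0, 0.5)), "A"))
        data.append(((rng.gauss(5.0, 0.5), rng.gauss(5.0, 0.5)), "B"))
    return data


def train_test_split(data, test_fraction=0.2, seed=0):
    """Shuffle a copy of the data and split it into training and testing sets."""
    rng = random.Random(seed)
    shuffled = data[:]
    rng.shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def fit_centroids(train):
    """Training step: compute the mean point (centroid) of each label."""
    sums, counts = {}, {}
    for (x, y), label in train:
        sx, sy = sums.get(label, (0.0, 0.0))
        sums[label] = (sx + x, sy + y)
        counts[label] = counts.get(label, 0) + 1
    return {label: (sx / counts[label], sy / counts[label])
            for label, (sx, sy) in sums.items()}


def predict(centroids, point):
    """Assign the label whose centroid is closest to the point."""
    return min(
        centroids,
        key=lambda label: (point[0] - centroids[label][0]) ** 2
        + (point[1] - centroids[label][1]) ** 2,
    )


def accuracy(centroids, test):
    """Fraction of held-out points that are classified correctly."""
    correct = sum(1 for point, label in test if predict(centroids, point) == label)
    return correct / len(test)


data = make_data()
train, test = train_test_split(data)
centroids = fit_centroids(train)
test_accuracy = accuracy(centroids, test)
print("training rows:", len(train), "testing rows:", len(test))
print("accuracy on the testing set:", test_accuracy)
```

Measuring accuracy only on the held-out testing set gives an estimate of how the
model behaves on data it has not seen during training.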

Fig: Lifecycle of Data Science

Phase 5—Operationalize: In this phase, you deliver final reports, briefings, code and
technical documents. In addition, a pilot project is sometimes implemented in a
real-time production environment. This will give you a clear picture of the
performance and other related constraints on a small scale before full deployment.

Phase 6—Communicate results: Now it is important to evaluate whether you have been
able to achieve the goal that you had planned in the first phase. So, in the last phase,
you identify all the key findings, communicate them to the stakeholders and determine
whether the results of the project are a success or a failure based on the criteria
developed in Phase 1.
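
Checking results against the criteria set in the first phase can be as simple as
comparing measured values with agreed thresholds. The sketch below is illustrative
only: the metric names (accuracy, recall), the thresholds and the measured values are
hypothetical and are not taken from the original text.

```python
# Hypothetical success criteria agreed in Phase 1 (minimum acceptable values).
success_criteria = {"accuracy": 0.85, "recall": 0.80}

# Hypothetical results measured at the end of the project.
measured = {"accuracy": 0.91, "recall": 0.76}


def evaluate(criteria, results):
    """Return a per-metric pass/fail report and an overall verdict."""
    report = {
        name: results.get(name, 0.0) >= threshold
        for name, threshold in criteria.items()
    }
    return report, all(report.values())


report, success = evaluate(success_criteria, measured)
for name, passed in report.items():
    print(f"{name}: {'met' if passed else 'not met'}")
print("Project outcome:", "success" if success else "failure")
```

A per-metric report of this kind makes it easier to tell stakeholders which goals were
met and which were not, rather than giving only an overall verdict.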

Data Scientist skills

Being a Data Scientist is easier said than done. So, let’s see what you need to
be a Data Scientist. A Data Scientist requires skills from three major areas, as
shown below.

Fig: Data scientist skills

As you can see in the above image, you need to acquire various hard skills and
soft skills. You need to be good at statistics and mathematics to analyze and visualize
data. Machine Learning forms the heart of Data Science, so you need to be good at it
as well. You also need a solid understanding of the domain you are working in to
understand the business problems clearly. Your task does not end here. You should be
capable of implementing various algorithms, which requires good coding skills. Finally,
once you have made certain key decisions, it is important to deliver them to the
stakeholders, so good communication skills are a definite advantage.
