M.Sc (IT - AI/CC/Security) Semester I
DATA SCIENCE AND ANALYTICS
Ms. Pooja R. Tupe
Visiting Faculty, UDIT, University of Mumbai.
TOPICS TO COVER
Chapter 1.
What is Data Science?
• Introduction to Data Science
• Some common definitions of Data Science
• Components of Data Science
• Domain Expertise and Scientific Methods
• What is data science used for?
• Different Sectors Using Data Science
• Data science use cases
• Challenges Faced by Data Scientists
1.1 Introduction to Data Science
• Data science is a process, not an event. It is the process of using data to
understand different phenomena and, more broadly, the world.
• For some people, data science means starting with a model or a hypothesis
about a problem and trying to validate it against data.
• Data science is the art of uncovering the insights and trends that are hiding
behind data.
• It means translating data into a story and using that storytelling to
generate insights.
• With these insights, you can make strategic choices for a company or
institution.
1.2 Some Common Definitions of Data Science
• Data science is a field concerned with processes and systems for extracting
meaning from data in its various forms, whether structured or unstructured.
• Data science is the study of data, just as the biological sciences are the
study of biology and the physical sciences are the study of physical reactions.
Data is real and has real properties, and we need to study them if we are
going to work with them.
• Data science involves data and some science.
1.3 COMPONENTS OF DATA SCIENCE
• Data science is created when subject expertise and scientific methodologies
are combined with technology.
1.4 DOMAIN EXPERTISE AND SCIENTIFIC METHODS
• Data scientists collect, explore, analyze, and visualize data.
• They apply mathematical and statistical models to find patterns and solutions
in the data.
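As a rough illustration of the collect, explore, analyze, and visualize steps above, here is a minimal Python sketch. The `visits` figures are invented for this example, and the "more than two standard deviations from the mean" rule is just one simple statistical check among many, not a method prescribed by these notes.

```python
from statistics import mean, median, stdev

# Collect: hypothetical daily website visits (invented for illustration).
visits = [120, 132, 128, 125, 410, 130, 127]

# Explore: basic summary statistics of the collected values.
summary = {
    "mean": mean(visits),
    "median": median(visits),
    "stdev": stdev(visits),  # sample standard deviation
}

# Analyze: flag values more than 2 standard deviations away from the mean.
outliers = [v for v in visits if abs(v - summary["mean"]) > 2 * summary["stdev"]]

# Visualize: a plain-text bar chart, one '#' per 10 visits.
for day, v in enumerate(visits, start=1):
    print(f"day {day}: {'#' * (v // 10)} {v}")

print("summary:", summary)
print("possible outliers:", outliers)
```

The large gap between the mean and the median is itself a hint that one unusual day is distorting the average, which is the kind of pattern the exploration step is meant to surface.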
1.5 WHAT IS DATA SCIENCE USED FOR?
• Data science is used to study data in four main ways:
  • Descriptive analysis: study a dataset to decipher the details.
    • Examines data to gain insights into what happened or what is happening
    in the data environment.
  • Diagnostic analysis: dive deep to find why things happened.
    • A deep-dive or detailed examination of data to understand why something
    happened.
  • Predictive analysis: create a model based on existing information to
  predict outcomes and behavior.
    • Uses historical data to forecast data patterns that may occur in the
    future.
  • Prescriptive analysis: suggest actions for a current situation using the
  collected information.
    • It not only predicts what is likely to happen but also suggests an
    optimum response to that outcome.
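The four kinds of analysis can be sketched on one small dataset. The example below is only an illustration under stated assumptions: the `ad_spend` and `sales` numbers, the `target_sales` goal, and the `candidate_budgets` are all hypothetical, and a straight-line least-squares fit stands in for whatever predictive model a real project would choose.

```python
from statistics import mean

# Hypothetical monthly figures, invented purely for illustration.
ad_spend = [10, 12, 14, 16, 18, 20]        # advertising spend per month
sales = [104, 126, 144, 167, 184, 205]     # units sold per month

# Descriptive: what happened? Summarize the observed sales.
avg_sales = mean(sales)

# Diagnostic: why did it happen? Measure how strongly sales move with spend.
# (Correlation suggests an association; it does not prove a cause.)
x_bar, y_bar = mean(ad_spend), mean(sales)
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(ad_spend, sales))
sxx = sum((x - x_bar) ** 2 for x in ad_spend)
syy = sum((y - y_bar) ** 2 for y in sales)
correlation = sxy / (sxx * syy) ** 0.5

# Predictive: what is likely to happen? Fit a least-squares line to the history.
slope = sxy / sxx
intercept = y_bar - slope * x_bar

def predict_sales(spend):
    """Predicted sales for a given ad spend, using the fitted line."""
    return intercept + slope * spend

# Prescriptive: what should we do? Pick the smallest candidate budget whose
# predicted sales reach the target (both values are assumptions; budgets above
# 20 extrapolate beyond the observed data, so treat the result with caution).
target_sales = 250
candidate_budgets = [18, 22, 26, 30]
recommended = min(b for b in candidate_budgets if predict_sales(b) >= target_sales)

print("average sales:", avg_sales)
print("correlation:", round(correlation, 3))
print("predicted sales at spend 22:", round(predict_sales(22), 1))
print("recommended budget:", recommended)
```

Each step builds on the previous one: the summary describes the past, the correlation offers a possible explanation, the fitted line turns that explanation into a forecast, and the budget choice turns the forecast into a suggested action.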
1.6 DIFFERENT SECTORS USING DATA SCIENCE
• Various sectors use data science to extract the information they need to
create different services and products.
1.7 DATA SCIENCE USE CASES
1.8 WHAT ARE THE CHALLENGES FACED BY DATA SCIENTISTS?
• Some of the challenges data scientists face in the real world are:
  • Data quality often does not conform to the set standards.
  • Data integration is a complex task.
  • Data is distributed across large clusters in HDFS (Hadoop Distributed File
  System), which makes it difficult to integrate and analyze.
  • Unstructured and semi-structured data are harder to analyze.
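To make the data quality and data integration challenges more concrete, here is a small Python sketch using only built-in types. The two record lists (`crm_rows` and `billing_rows`) and their field names are hypothetical. Real projects would typically face far larger and messier sources, so this only shows the shape of the problem.

```python
# Hypothetical records from two separate sources (invented for illustration).
crm_rows = [
    {"customer_id": 1, "email": "a@example.com"},
    {"customer_id": 2, "email": None},             # missing value
    {"customer_id": 2, "email": None},             # exact duplicate row
    {"customer_id": 3, "email": "c@example.com"},
]
billing_rows = [
    {"customer_id": 1, "total": 250.0},
    {"customer_id": 3, "total": 90.5},
    {"customer_id": 4, "total": 12.0},             # no matching CRM record
]

# Data quality: remove exact duplicates while keeping the original order.
seen = set()
unique_crm = []
for row in crm_rows:
    key = (row["customer_id"], row["email"])
    if key not in seen:
        seen.add(key)
        unique_crm.append(row)

# Data quality: list the customers whose email is missing.
missing_email = [row["customer_id"] for row in unique_crm if row["email"] is None]

# Data integration: join the two sources on customer_id (an inner join).
billing_by_id = {row["customer_id"]: row for row in billing_rows}
merged = [
    {**row, **billing_by_id[row["customer_id"]]}
    for row in unique_crm
    if row["customer_id"] in billing_by_id
]

# Billing records that could not be matched to any CRM customer.
unmatched = sorted(set(billing_by_id) - {row["customer_id"] for row in unique_crm})

print("rows after de-duplication:", len(unique_crm))
print("customers missing email:", missing_email)
print("merged records:", merged)
print("unmatched billing ids:", unmatched)
```

Even with seven rows, the join silently drops customer 2 (no billing record) and customer 4 (no CRM record), which is why integration needs explicit checks for unmatched keys.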