
DATA ANALYTICS
(Professional Elective - I)
for B.Tech (CSE)
3rd Year – 1st Sem (R18)
AY 22-23, Sem-1
Faculty: B. Ravikrishna
UNIT - 2
INTRODUCTION & TOOLS AND ENVIRONMENT

UNIT – II Syllabus
Data Analytics: Introduction to Analytics, Introduction to Tools and Environment, Application of Modeling in Business, Databases & Types of Data and Variables, Data Modeling Techniques, Missing Imputations, Need for Business Modeling.

Topics:
1. Introduction to Data Analytics
2. Data Analytics Tools and Environment
3. Need for Business Modeling
4. Data Modeling Techniques
5. Application of Modeling in Business
6. Databases & Types of Data and Variables
7. Missing Imputations
Role of Data Analytics:
 Gather Hidden Insights – Hidden insights are extracted from data and analyzed with respect to business requirements.
 Generate Reports – Reports are generated from the data and passed on to the respective teams and individuals so they can take further action to grow the business.
 Perform Market Analysis – Market analysis can be performed to understand the strengths and weaknesses of competitors.
 Improve Business Requirements – Analyzing data helps a business improve its understanding of customer requirements and the customer experience.
Ways to Use Data Analytics:
[Figure: ways to use data analytics]
Role of Data Analytics:
1. Improved Decision Making:
 Data analytics eliminates guesswork and manual tasks.
 Whether it is choosing the right content, planning marketing campaigns, or developing products, organizations can use the insights gained from data analytics to make informed decisions, leading to better outcomes and higher customer satisfaction.
Role of Data Analytics:
2. Better Customer Service:
 Data analytics allows you to tailor customer service to customers' needs.
 It also enables personalization and builds stronger relationships with customers.
 Analyzed data can reveal information about customers' interests, concerns, and more.
 It helps you make better recommendations for products and services.
3. Efficient Operations:
 With an improved understanding of what your audience wants, you can streamline your processes, save money, and boost production.
Role of Data Analytics:
4. Effective Marketing:
 Data analytics gives you valuable insight into how your campaigns are performing.
 It helps you fine-tune them for optimal outcomes.
 Additionally, you can find the potential customers who are most likely to interact with a campaign and convert into leads.
Steps Involved in Data Analytics:
1. Understand the Problem:
• Defining the organizational goals and planning a suitable solution is the first step in the analytics process.
• Examples include identifying relevant product recommendations, detecting fraud, and optimizing vehicle routing.

2. Data Collection:
• Next, you need to collect transactional business data and customer-related information from the past few years to address the problems your business is facing.
• The data can include the total units sold for a product, the sales and profit made, and when each order was placed. Past data plays a crucial role in shaping the analysis.
Steps Involved in Data Analytics:
3. Data Cleaning:
• The data you collect will often be disorderly, messy, and contain unwanted or missing values. Such data is not suitable or relevant for data analysis.
• You need to clean the data by removing unwanted, redundant, and missing values to make it ready for analysis.
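As a minimal sketch of this cleaning step, the Python snippet below removes redundant (duplicate) rows and fills missing values with the mean of the observed values, a simple form of imputation. The records and field names are invented for illustration; they are not data from the course.

```python
from statistics import mean

# Toy order records; None marks a missing "units_sold" value (hypothetical data).
records = [
    {"order_id": 1, "units_sold": 10},
    {"order_id": 2, "units_sold": None},
    {"order_id": 3, "units_sold": 14},
    {"order_id": 2, "units_sold": None},  # duplicate order -> redundant row
]

# 1. Drop redundant (duplicate) rows, keeping the first occurrence of each order.
seen, deduped = set(), []
for r in records:
    if r["order_id"] not in seen:
        seen.add(r["order_id"])
        deduped.append(r)

# 2. Impute missing values with the mean of the observed values.
observed = [r["units_sold"] for r in deduped if r["units_sold"] is not None]
fill = mean(observed)  # mean imputation; the median is a common alternative
cleaned = [
    {**r, "units_sold": r["units_sold"] if r["units_sold"] is not None else fill}
    for r in deduped
]
```

Mean imputation is only one option; dropping incomplete rows or using the median are equally common, and the right choice depends on the data.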

4. Data Exploration and Analysis:
• Applying suitable methods can reveal the impact and relationship of certain features relative to other variables.
• You can use data visualization and business intelligence tools, data mining techniques, and predictive modelling to analyze, visualize, and predict future outcomes from this data.
Steps Involved in Data Analytics:
4. Data Exploration and Analysis:
Below are the kinds of results you can get from the analysis:
 You can identify when a customer will purchase the next product.
 You can understand how long it took to deliver the product.
 You get better insight into the kind of items a customer looks for, product returns, etc.
 You will be able to predict the sales and profit for the next quarter.
 You can minimize order cancellations by dispatching only relevant products.
 You will be able to figure out the shortest route to deliver the product.
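For instance, the next-quarter sales prediction mentioned above can be sketched with a simple ordinary least-squares trend line in Python. The quarterly figures are invented for illustration; real forecasting would use more data and richer models.

```python
# Quarterly sales figures (hypothetical numbers for illustration).
sales = [100.0, 110.0, 125.0, 135.0]  # quarters 1..4

n = len(sales)
xs = list(range(1, n + 1))

# Ordinary least-squares fit of sales = a + b * quarter.
x_bar = sum(xs) / n
y_bar = sum(sales) / n
b = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, sales)) / \
    sum((x - x_bar) ** 2 for x in xs)
a = y_bar - b * x_bar

# Forecast for quarter 5 by extending the fitted line one step ahead.
next_quarter = a + b * (n + 1)
```

With these toy numbers the fitted slope is 12 units per quarter, so the line projects 147.5 for quarter 5; the point is the mechanics of fitting and extrapolating a trend, not the specific figures.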
Steps Involved in Data Analytics:
5. Interpret the Results:
• The final step is to interpret the results and validate whether the outcomes meet your expectations.
• You can find hidden patterns and future trends.
• This will help you gain insights that support appropriate data-driven decision making.
Tools in Data Analytics:
R programming –
 A leading analytics tool used for statistics and data modeling.
 R compiles and runs on various platforms such as UNIX, Windows, and macOS.
 Provides tools to automatically install packages as per user requirements.

Python –
 Python is an open-source, object-oriented programming language that is easy to read, write, and maintain.
 It provides various machine learning and visualization libraries such as Scikit-learn, TensorFlow, Matplotlib, and Seaborn.
Tools in Data Analytics:
Tableau Public:
• Free software that connects to any data source such as Excel, a corporate data warehouse, etc.
• Creates visualizations, maps, dashboards, etc. with real-time updates on the web.
QlikView:
• Offers in-memory data processing, with results delivered to end users quickly.
• Also offers data association and data visualization, with data compressed to almost 10% of its original size.
Tools in Data Analytics:
SAS:
• A programming language and environment for data manipulation and analytics; this tool is easily accessible and can analyze data from different sources.
Microsoft Excel:
• One of the most widely used tools for data analytics.
• Mostly used for clients' internal data, it summarizes data with a preview of pivot tables.
Tools in Data Analytics:
 RapidMiner:
• A powerful, integrated platform that can connect to many data source types, such as Access, Excel, Microsoft SQL, Teradata, Oracle, Sybase, etc.
• This tool is mostly used for predictive analytics, such as data mining, text analytics, and machine learning.
 KNIME:
• Konstanz Information Miner (KNIME) is an open-source data analytics platform which allows you to analyze and model data.
• Offering the benefit of visual programming, KNIME provides a graphical workbench for building analysis workflows.
Tools in Data Analytics:
 OpenRefine:
 Also known as Google Refine, this data cleaning software helps you clean up data for analysis.
 It is used for cleaning messy data, transforming data, and parsing data from websites.
 Apache Spark:
 One of the largest large-scale data processing engines, this tool executes applications in Hadoop clusters up to 100 times faster in memory and 10 times faster on disk.
 Also popular for data pipelines and machine learning model development.
Data Analytics Applications:
Data analytics is used in almost every sector of business.
1. Retail:
Data analytics helps retailers understand their customers' needs and buying habits to predict trends, recommend new products, and boost their business.
It optimizes the supply chain and retail operations at every step of the customer journey.
2. Healthcare:
Healthcare industries analyse patient data to provide lifesaving diagnoses and treatment options.
Data Analytics Applications:
3. Manufacturing: Using data analytics, manufacturing sectors can discover new cost-saving opportunities. They can solve complex supply chain issues, labour constraints, and equipment breakdowns.
4. Banking sector: Banking and financial institutions use analytics to identify probable loan defaulters and the customer churn rate. It also helps in detecting fraudulent transactions immediately.
5. Logistics: Logistics companies use data analytics to develop new business models and optimize routes. This, in turn, ensures that deliveries arrive on time in a cost-efficient manner.
Cluster Computing
 Cluster computing is a collection of tightly or loosely connected computers that work together so that they act as a single entity.
 The connected computers execute operations together, creating the impression of a single system.
 The clusters are generally connected through fast local area networks (LANs).
Cluster Computing
 A relatively inexpensive, unconventional alternative to large server or mainframe computer solutions.
 Resolves the demand for content criticality and process services in a faster way.
 IT companies are implementing cluster computing to augment their scalability, availability, processing speed, and resource management at economical prices.
 Ensures that computational power is always available.
 Provides a single general strategy for implementing and applying parallel high-performance systems independent of the underlying hardware.
Apache Spark
 Apache Spark is a lightning-fast cluster computing technology, designed for fast computation.
 It is based on Hadoop MapReduce and extends the MapReduce model to use it efficiently for more types of computations, including interactive queries and stream processing.
 The main feature of Spark is its in-memory cluster computing, which increases the processing speed of applications.
 Spark is designed to cover a wide range of workloads, such as batch applications, iterative algorithms, interactive queries, and streaming.
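The "map" and "reduce" model that Spark extends can be illustrated with a classic word count. The sketch below is plain Python on toy data, not Spark code; in real Spark the mapped partitions would be distributed across a cluster and kept in memory between stages.

```python
from collections import Counter
from functools import reduce

# Two "partitions" of input text (toy data; on a cluster these would be distributed).
lines = ["spark extends mapreduce", "spark runs in memory"]

# Map phase: turn each line into per-word counts (like emitting (word, 1) pairs).
mapped = [Counter(line.split()) for line in lines]

# Reduce phase: merge the partial counts into one final result.
counts = reduce(lambda left, right: left + right, mapped, Counter())
```

The key idea is that the map phase is independent per partition, so it parallelizes trivially, while the reduce phase combines partial results with an associative operation.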
Apache Spark
 Apart from supporting all these workloads in a single system, Spark reduces the management burden of maintaining separate tools.
Apache Spark
Evolution of Apache Spark
• Spark is one of Hadoop's sub-projects, developed in 2009 in UC Berkeley's AMPLab by Matei Zaharia.
• It was open-sourced in 2010 under a BSD license.
• It was donated to the Apache Software Foundation in 2013.
• Apache Spark became a top-level Apache project in February 2014.
Apache Spark - Features
Speed −
• Spark helps run applications in a Hadoop cluster up to 100 times faster in memory and 10 times faster when running on disk.
• This is possible by reducing the number of read/write operations to disk.
• It stores the intermediate processing data in memory and supports 'map' and 'reduce' operations.
Supports multiple languages −
• Spark provides built-in APIs in Java, Scala, and Python, so you can write applications in different languages.
Apache Spark - Features
Advanced Analytics −
• Spark also supports SQL queries, streaming data, machine learning (ML), and graph algorithms.
Apache Spark – Built on Hadoop
[Diagram: three ways Spark can be built with Hadoop components.]
Apache Spark – Built on Hadoop
Big Data processing is becoming inevitable for small to large enterprises.
There are three ways of Spark deployment:
Standalone −
• Spark Standalone deployment means Spark occupies the place on top of HDFS (Hadoop Distributed File System), and space is allocated for HDFS explicitly.
• Here, Spark and MapReduce run side by side to cover all Spark jobs on the cluster.
Apache Spark – Built on Hadoop
Hadoop YARN −
• Hadoop YARN deployment means, simply, that Spark runs on YARN (Yet Another Resource Negotiator) without any pre-installation or root access required.
• This helps integrate Spark into the Hadoop ecosystem or Hadoop stack.
Apache Spark – Built on Hadoop
Spark in MapReduce (SIMR) −
• Spark in MapReduce is used to launch Spark jobs in addition to standalone deployment.
• With SIMR, users can start Spark and use its shell without any administrative access.
Apache Spark – Components
[Diagram: the different components of Spark.]
Apache Spark – Components
Apache Spark Core:
• Spark Core is the underlying general execution engine for the Spark platform; all other functionality is built upon it.
• Spark primarily achieves its speed via a data model called resilient distributed datasets (RDDs), which are stored in memory while being computed upon, thus eliminating expensive intermediate disk writes.
Apache Spark – Components
Spark SQL:
• Spark SQL is a component on top of Spark Core that introduces a data abstraction called SchemaRDD, which provides support for structured and semi-structured data.
Spark Streaming:
• Spark Streaming leverages Spark Core's fast scheduling capability to perform streaming analytics.
• It ingests data in mini-batches and performs RDD (Resilient Distributed Dataset) transformations on those mini-batches of data.
Apache Spark – Components
MLlib (Machine Learning Library):
• MLlib is a distributed machine learning framework above Spark, because of the distributed memory-based Spark architecture.
• According to benchmarks done by the MLlib developers against the Alternating Least Squares (ALS) implementations, Spark MLlib is nine times as fast as the Hadoop disk-based version of Apache Mahout (before Mahout gained a Spark interface).
Apache Spark – Components
GraphX:
• GraphX is a distributed graph-processing framework on top of Spark.
• It provides an API for expressing graph computation that can model user-defined graphs using the Pregel abstraction API.
• It also provides an optimized runtime for this abstraction.
Scala:
 Scala is a statically typed programming language that incorporates both functional and object-oriented styles, and is also suitable for imperative programming approaches.
 It is a general-purpose programming language.
 In Scala, everything is an object, whether it is a function or a number.
 It does not have the concept of primitive data.
 Scala primarily runs on the JVM platform; it can also be used to write software for native platforms using Scala Native and for JavaScript runtimes through Scala.js.
Scala:
 Scala is a "Scalable Language" used to write software for multiple platforms; hence it got the name "Scala".
Scala:
 It is designed for applications that are concurrent (parallel), distributed, and resilient (robust) message-driven.
 Scala offers many duck types (structural types).
 Unlike Java, Scala has many features of functional programming languages like Scheme, Standard ML, and Haskell, including type inference, immutability, lazy evaluation, and pattern matching.
Scala:
Where can Scala be used?
 Web applications
 Utilities and libraries
 Data streaming
 Parallel batch processing
 Concurrency and distributed applications
 Data analytics with Spark
 AWS Lambda expressions
Cloudera Impala:
 Cloudera Impala is Cloudera's open-source massively parallel processing (MPP) SQL query engine for data stored in a computer cluster running Apache Hadoop.
 It serves as a native analytic database for Hadoop clusters.
 It is shipped by vendors such as Cloudera, MapR, Oracle, and Amazon.
Cloudera Impala:
 Cloudera Impala is a query engine that runs on Apache Hadoop.
 Impala enables users to issue low-latency SQL queries against data stored in HDFS and Apache HBase without requiring data movement or transformation.
 It is integrated with Hadoop to use the same file and data formats, metadata, security, and resource management frameworks used by MapReduce, Apache Hive, Apache Pig, and other Hadoop software.
Cloudera Impala:
 The result is that large-scale data processing (via MapReduce) and interactive queries can be done on the same system using the same data and metadata, removing the need to migrate data sets into specialized systems and/or proprietary formats simply to perform analysis.
Cloudera Impala:
Features include:
 Supports HDFS and Apache HBase storage.
 Reads Hadoop file formats, including text, LZO, SequenceFile, Avro, RCFile, and Parquet.
 Supports Hadoop security (Kerberos authentication).
 Fine-grained, role-based authorization with Apache Sentry.
 Uses metadata, ODBC driver, and SQL syntax from Apache Hive.
Databases & Types of Data and Variables

Database: A database is a collection of related data.

Database Management System: A DBMS is a software system or set of programs used to define, construct, and manipulate data.

Relational Database Management System: An RDBMS is a software system used to maintain relational databases.
Many relational database systems have the option of using SQL (Structured Query Language) for querying and maintaining the database.
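The define / construct / manipulate cycle of a DBMS can be sketched with Python's built-in sqlite3 module. This is a toy in-memory relational database; the table name and values are hypothetical, chosen only to show the three operations.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # throwaway in-memory database
cur = conn.cursor()

# Define: create a relation (table) with a schema.
cur.execute("CREATE TABLE sales (product TEXT, units INTEGER)")

# Construct: load data into the relation.
cur.executemany("INSERT INTO sales VALUES (?, ?)",
                [("pen", 120), ("book", 45), ("pen", 30)])

# Manipulate: query the data with SQL.
cur.execute("SELECT product, SUM(units) FROM sales "
            "GROUP BY product ORDER BY product")
rows = cur.fetchall()
conn.close()
```

The same SQL statements would work, with minor dialect differences, against any relational database system.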
End of Unit-2
