UNIT-IV PDF
UNIT IV:
Advanced Topics: Introduction, Apache Hadoop, Using Hadoop Map
Reduce for Batch Data Analysis.
IEEE 802.15.4: The IEEE 802 committee family of protocols, The
physical layer, The Media Access Control layer, Uses of 802.15.4, The
Future of 802.15.4: 802.15.4e and 802.15.4g.
Introduction:
The definition of a powerful person has changed in this world. A powerful person is now one
who has access to data. This is because data is growing at a tremendous rate: if we consider all
the data in the world today to be 100%, then about 90% of it was produced in the last two to
four years. Now, when a child is born, she faces the flash of a camera even before she sees her
mother. All these pictures and videos are nothing but data. Similarly, there is data from emails,
various smartphone applications, statistics, and so on. All this data has enormous power to
influence incidents and trends. It is used not only by companies to influence their consumers
but also by politicians to influence elections. This huge data is referred to as Big Data. In such a
world, where data is being produced at such an exponential rate, it needs to be maintained,
analyzed, and tackled. This is where Hadoop comes in.
Hadoop is an open-source framework of tools distributed under the Apache License. It is
used to store, manage, and process data for various big data applications running on clustered
systems. In previous years, Big Data was defined by the "3Vs", but now there are "5Vs" of Big
Data, which are also termed the characteristics of Big Data: Volume, Velocity, Variety,
Veracity, and Value.
Variety: Variety refers to the different forms in which data arrives.
Structured Data: Relational data, which is stored in the form of rows and columns.
Unstructured Data: Texts, pictures, videos, etc. are examples of unstructured data, which
cannot be stored in the form of rows and columns.
Semi-Structured Data: Log files are an example of this type of data.
Veracity: The term veracity refers to inconsistent or incomplete data, which results in
the generation of doubtful or uncertain information. Often data inconsistency arises because of
the volume of data: data in bulk can create confusion, whereas too little data can convey only
half or incomplete information.
Value: After taking the other four Vs into account, there comes one more V, which stands for
Value. Bulk data having no value is of no good to a company unless it is turned into something
useful. Data in itself is of no use or importance; it needs to be converted into something
valuable in order to extract information. Hence, Value can be considered the most important of
the 5Vs.
Big data is a collection of large datasets that cannot be processed using traditional computing
techniques. It is not a single technique or a tool; rather, it has become a complete subject, which
involves various tools, techniques, and frameworks.
Big data involves the data produced by different devices and applications. Given below are
some of the fields that come under the umbrella of Big Data.
Black Box Data − The black box is a component of helicopters, airplanes, jets, etc. It
captures the voices of the flight crew, recordings of microphones and earphones, and the
performance information of the aircraft.
Social Media Data − Social media such as Facebook and Twitter hold information and
the views posted by millions of people across the globe.
Stock Exchange Data − Stock exchange data holds information about the 'buy' and
'sell' decisions made by customers on the shares of different companies.
Power Grid Data − Power grid data holds information about the power consumed by a
particular node with respect to a base station.
Transport Data − Transport data includes the model, capacity, distance, and availability of a
vehicle.
Search Engine Data − Search engines retrieve large amounts of data from different databases.
Thus Big Data includes huge volume, high velocity, and extensible variety of data. The data in
it will be of three types.
Benefits of Big Data:
Using the information kept in social networks like Facebook, marketing agencies
are learning about the response to their campaigns, promotions, and other advertising
media.
Using information in social media, such as the preferences and product perception of
their consumers, product companies and retail organizations are planning their
production.
Using data regarding the previous medical history of patients, hospitals are providing
better and quicker service.
The major challenges associated with big data are as follows:
Capturing data
Curation
Storage
Searching
Sharing
Transfer
Analysis
Presentation
To fulfill the above challenges, organizations normally take the help of enterprise servers.
Traditional Approach
In this approach, an enterprise has a computer to store and process big data. For storage,
programmers take the help of their choice of database vendor, such as Oracle or
IBM. In this approach, the user interacts with the application, which in turn handles
data storage and analysis.
Limitation
This approach works fine for applications that process less voluminous data, which can be
accommodated by standard database servers, or up to the limit of the processor that is
processing the data. But when it comes to dealing with huge amounts of scalable data,
processing all of it through a single database server becomes a bottleneck.
Google’s Solution
Google solved this problem using an algorithm called MapReduce. This algorithm divides the
task into small parts, assigns those parts to many computers, and collects the results from them;
when integrated, these form the result dataset.
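The divide-assign-collect idea can be sketched in a few lines of plain Python (an illustrative single-machine sketch of the concept, not Hadoop's actual API; the function names are invented for illustration):

```python
from collections import defaultdict

# Illustrative sketch of the MapReduce idea: the input is split into
# chunks, each chunk is "mapped" independently (on a real cluster, on a
# different machine), the intermediate pairs are grouped by key, and a
# "reduce" step combines each group into the final result dataset.

def map_phase(chunk):
    # Emit (word, 1) pairs for one chunk of the input.
    return [(word, 1) for word in chunk.split()]

def shuffle(mapped):
    # Group the intermediate pairs by key, as the framework would.
    groups = defaultdict(list)
    for pairs in mapped:
        for key, value in pairs:
            groups[key].append(value)
    return groups

def reduce_phase(groups):
    # Combine the values for each key into the final result.
    return {key: sum(values) for key, values in groups.items()}

# Pretend each chunk lives on a different machine.
chunks = ["big data is big", "data needs hadoop"]
mapped = [map_phase(c) for c in chunks]
result = reduce_phase(shuffle(mapped))
print(result["big"], result["data"])  # → 2 2
```

In a real cluster, each `map_phase` call would run on a separate node and the shuffle would move intermediate pairs across the network; the structure of the computation is the same.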
Hadoop
Using the solution provided by Google, Doug Cutting and his team developed an Open Source
Project called HADOOP.
Hadoop runs applications using the MapReduce algorithm, where the data is processed in
parallel on different nodes. In short, Hadoop is used to develop applications that can perform
complete statistical analysis on huge amounts of data.
Evolution of Hadoop: Hadoop was designed by Doug Cutting and Michael Cafarella in 2005.
The design of Hadoop is inspired by Google. Hadoop stores huge amounts of data through a
system called the Hadoop Distributed File System (HDFS) and processes this data with the
MapReduce technology. The designs of HDFS and MapReduce are inspired by the Google
File System (GFS) and Google's MapReduce. In the year 2000, Google suddenly overtook all
existing search engines and became the most popular and profitable search engine. The success
of Google was attributed to its unique Google File System and MapReduce. No one except
Google knew about these until, in the year 2003, Google released a paper on GFS. But it was
not enough to understand the overall working of Google, so in 2004 Google released the
remaining papers. The two enthusiasts Doug Cutting and Michael Cafarella studied those
papers and, in the year 2005, designed what is called Hadoop. Doug's son had a toy elephant
whose name was Hadoop, and thus Doug and Michael gave their new creation the name
"Hadoop", and hence the symbol of a toy elephant.
1. Hadoop Distributed File System: On a local PC, the default block size on a hard disk
is 4 KB. When we install Hadoop, HDFS changes the default block size to 64 MB, since it
is used to store huge data; the block size can also be changed to 128 MB. HDFS works
with a Data Node and a Name Node: the Name Node is a master service that keeps the
metadata describing which commodity hardware the data resides on, while the Data
Node stores the actual data. Since the block size is 64 MB, the storage required for
metadata is reduced, making HDFS more efficient. Also, Hadoop stores three copies of
every dataset at three different locations. This ensures that Hadoop is not prone to a
single point of failure.
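The metadata saving from large blocks can be illustrated with some back-of-the-envelope arithmetic (a sketch; the 1 TB file size is chosen for illustration, and the block sizes and replication factor follow the description above):

```python
# Back-of-the-envelope comparison of block counts for a 1 TB file.
# The Name Node keeps one metadata entry per block, so fewer, larger
# blocks mean far less metadata to track.

FILE_SIZE = 1 * 1024**4        # 1 TB
SMALL_BLOCK = 4 * 1024         # 4 KB, typical local-disk block size
HDFS_BLOCK = 64 * 1024**2      # 64 MB, the HDFS default described above

small_blocks = FILE_SIZE // SMALL_BLOCK
hdfs_blocks = FILE_SIZE // HDFS_BLOCK

print(small_blocks)   # 268435456 blocks at 4 KB
print(hdfs_blocks)    # 16384 blocks at 64 MB

# With a replication factor of 3, each block is stored on
# three different Data Nodes:
replicas = hdfs_blocks * 3
print(replicas)       # 49152 physical block copies
```

So moving from 4 KB to 64 MB blocks cuts the number of metadata entries for the same file by a factor of 16384, which is what keeps the Name Node's memory footprint manageable.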
2. Map Reduce: In the simplest terms, MapReduce breaks a query into multiple parts,
and each part then processes the data concurrently. This parallel execution helps
execute a query faster and makes Hadoop a suitable and optimal choice for dealing
with Big Data.
3. YARN: Yet Another Resource Negotiator works like an operating system for Hadoop;
just as operating systems are resource managers, YARN manages the resources of
Hadoop so that Hadoop can serve big data in a better way.
IOT UNIT-4
IEEE 802.15.4
• The Institute of Electrical and Electronics Engineers (IEEE) committee 802 defines physical
and data link technologies.
• The IEEE decomposes the OSI link layer into two sublayers:
1. The media access control layer (MAC), which sits on top of the physical layer (PHY) and
implements the methods used to access the network: carrier-sense multiple access with
collision detection (CSMA/CD), used by Ethernet, and carrier-sense multiple access with
collision avoidance (CSMA/CA), used by IEEE wireless protocols.
2. The logical link control layer (LLC), which formats the data frames sent over the
communication channel through the MAC and PHY layers.
• IEEE 802.2 defines a frame format that is independent of the MAC and PHY
layers, and presents a uniform interface to the upper layers.
(Figure: the link layer comprises the LLC sublayer, which passes data frames down to the MAC
sublayer; CSMA/CD is the access method used by Ethernet, and CSMA/CA is used by wireless
protocols.)
• The MLME (MAC layer management entity) contains the configuration and state
parameters for the MAC layer, such as:
– the 64-bit IEEE address and 16-bit short address for the node
• Two alternative topology models can be used, each with its corresponding data-
transfer method:
– The star topology: data transfers are possible only between the PAN
coordinator and the devices.
– The peer-to-peer topology: data transfers can occur between any two
devices.
– The beacon-enabled access method (or slotted CSMA/CA). When this mode is
selected, the PAN coordinator periodically broadcasts a superframe, composed of a
starting and ending beacon frame, 15 time slots, and an optional inactive period during
which the coordinator may enter a low-power mode. The superframe is shown in the
figure "MAC layer access control methods for 802.15.4".
– The nonbeacon-enabled access method (or unslotted CSMA/CA).
– The first time slots define the contention access period (CAP).
– The last N (N ≤ 7) time slots form the optional contention-free period (CFP), for use by nodes
requiring deterministic network access or guaranteed bandwidth.
– The beacon frame starts with the general MAC layer frame control field, then includes the
source PAN ID and a list of addresses for which the coordinator has pending data, and provides
the superframe settings parameters.
– Devices wishing to send data to a coordinator first listen for the superframe beacon, then
synchronize to the superframe and transmit their data either during the CAP using CSMA/CA,
or during the CFP.
✓ The nonbeacon-enabled access method (unslotted CSMA/CA) is the mode used by ZigBee
and 6LoWPAN. All nodes access the network using CSMA/CA.
✓ The coordinator provides a beacon only when requested by a node, and sets
the beacon order (BO) parameter to 15 to indicate use of the nonbeacon-
enabled access method.
✓ Nodes (including the coordinator) request a beacon during the active scan
procedure, which also identifies whether networks are located in the vicinity and
what their PAN IDs are.
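The effect of the beacon order can be made concrete with the standard's timing formulas (a sketch assuming the 2.4 GHz PHY, where aBaseSuperframeDuration is 960 symbols and one symbol lasts 16 µs):

```python
# Beacon interval (BI) in 802.15.4, per the standard's formula:
#   BI = aBaseSuperframeDuration * 2**BO  (in symbols), for 0 <= BO <= 14
# BO = 15 signals the nonbeacon-enabled mode (no periodic beacons).

A_BASE_SUPERFRAME_DURATION = 960   # symbols
SYMBOL_US = 16                     # microseconds per symbol (2.4 GHz PHY)

def beacon_interval_ms(bo):
    if bo == 15:
        return None  # nonbeacon-enabled mode: no periodic beacon
    return A_BASE_SUPERFRAME_DURATION * (2 ** bo) * SYMBOL_US / 1000.0

print(beacon_interval_ms(0))    # 15.36 ms, the shortest beacon interval
print(beacon_interval_ms(14))   # 251658.24 ms, roughly 4.2 minutes
print(beacon_interval_ms(15))   # None: nonbeacon-enabled access method
```

This is why setting BO to 15, as described above, is the conventional way of saying "no superframe at all" rather than just "a very long one".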
Association:
✓ The association request specifies the PAN ID that the node wishes to join,
and a set of capability flags encoded in one octet.
802.15.4 Addresses:
1. EUI-64: Each 802.15.4 node is required to have a unique 64-bit address, called the extended
unique identifier (EUI-64).
2. 16-bit short address: Since longer addresses increase the packet size, and therefore require
more transmission time and more energy, devices can also request a 16-bit short address from
the PAN coordinator.
• The special 16-bit address FFFF is used as the MAC broadcast address. The MAC layer of all
devices will transmit packets addressed to FFFF to the upper layers.
– The type of data contained in the payload field is determined from the first 3 bits of the frame
control field:
1. Data frames contain network layer data directly in the payload part of the MAC frame.
2. The Ack frame format is specific: it contains only a sequence number and a frame check
sequence, and omits the address and data fields.
3. The payload of command frames begins with a command identifier (Figure 1.10), followed
by a command-specific payload.
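Extracting the frame type from the first 3 bits of the frame control field is a one-line mask (a sketch; the type codes follow the standard — 0 beacon, 1 data, 2 acknowledgment, 3 MAC command — and 0x8841 is shown as a typical frame control value for a data frame with 16-bit addressing):

```python
# The 802.15.4 frame type lives in the 3 least-significant bits of the
# 16-bit frame control field.

FRAME_TYPES = {0: "beacon", 1: "data", 2: "ack", 3: "command"}

def frame_type(frame_control: int) -> str:
    # Mask off everything except the 3 low-order frame-type bits.
    return FRAME_TYPES.get(frame_control & 0b111, "reserved")

print(frame_type(0x8841))   # data
print(frame_type(0x0002))   # ack
print(frame_type(0x0000))   # beacon
```

A receiver uses this value to decide how to interpret the payload, per the three frame formats listed above.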
Security in 802.15.4:
– 802.15.4 facilitates the use of symmetric key cryptography in order to provide data
confidentiality, data authenticity, and replay protection. It is possible to use a specific
key for each pair of devices (a link key), or a common key for a group of devices.
Uses of 802.15.4:
• 802.15.4 provides all the MAC and PHY level mechanisms required by higher-level
protocols to exchange packets securely and form a network.
• It does not provide a fragmentation and reassembly mechanism, so applications need to be
careful when sending unsecured packets larger than 108 bytes.
• Bandwidth is also very limited, and much less than the PHY level bitrate of 250
kbit/s; packets cannot be sent continuously.
• ZigBee and 6LoWPAN introduce segmentation mechanisms that overcome the issue of
small and hard-to-predict application payload sizes at the MAC layer.
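Since 802.15.4 itself offers no fragmentation, a higher layer has to split oversized payloads and reassemble them at the receiver. A minimal illustrative sketch (a generic splitter with a made-up 2-byte per-fragment header budget; this is NOT the real 6LoWPAN FRAG1/FRAGN format):

```python
# Illustrative higher-layer fragmentation over a MAC with a small
# maximum payload. The 108-byte figure comes from the text above; the
# 2-byte header reservation is invented for illustration.

MAX_MAC_PAYLOAD = 108

def fragment(payload: bytes, mtu: int = MAX_MAC_PAYLOAD):
    # Reserve 2 bytes per fragment for a (tag, offset)-style header.
    chunk = mtu - 2
    return [payload[i:i + chunk] for i in range(0, len(payload), chunk)]

def reassemble(fragments):
    # In-order delivery is assumed here; a real protocol uses the
    # header to reorder and detect losses.
    return b"".join(fragments)

data = bytes(300)          # a 300-byte application payload
frags = fragment(data)
print(len(frags))          # 3 fragments of at most 106 bytes each
assert reassemble(frags) == data
```

This is essentially the job that the ZigBee and 6LoWPAN segmentation mechanisms mentioned above perform, with standardized headers and loss handling.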
The Future of 802.15.4: 802.15.4e and 802.15.4g
• The need for more modulation options, notably in the sub-GHz space
• 802.15.4e
– Regarding sensor network performance and memory buffers, it is generally considered that in
a 1000-node network
– Coordinated sampled listening (CSL): the idea is that the receiver is switched on periodically
(about 5 ms) but with a very low duty cycle.
– On the transmission side, this requires senders to use preambles longer than the receive
periodicity of the target.
– CSL is the mode of choice if the receive latency needs to be on the order of one second or less.
– The RIT (receiver-initiated transmission) strategy is a simple power-saving strategy that is
employed by many existing wireless technologies:
– the application layer of the receiving node periodically polls a server in the network for
pending data;
– the receiver broadcasts a data request frame and listens for a short amount of time;
– the receiver can also be turned on for a brief period after sending data.
– Channel hopping adds frequency diversity to other diversity methods and will improve the
resilience of 802.15.4 networks to transient spectrum pollution.
– In a multimode network, there are situations in which finding a common usable channel across
all nodes is challenging.
802.15.4g: this amendment, targeting smart utility networks, addresses the need for more
modulation options, notably in the sub-GHz space.
Hadoop MapReduce
Introduction to Hadoop Framework:
• Apache top level project, open-source implementation of frameworks for reliable, scalable,
distributed computing and data storage.
• It is a flexible and highly-available architecture for large scale computation and data processing
on a network of commodity hardware.
• Hadoop offers a software platform that was originally developed by a Yahoo! group. The
package enables users to write and run applications over vast amounts of distributed data.
• Users can easily scale Hadoop to store and process petabytes of data in the web space. Hadoop
is economical in that it comes with an open source version of MapReduce that minimizes
overhead in task spawning and massive data communication.
• It is efficient, as it processes data with a high degree of parallelism across a large number of
commodity nodes, and it is reliable in that it automatically keeps multiple data copies to
facilitate redeployment of computing tasks upon failures.
Hadoop:
• An open-source software framework that supports data-intensive distributed applications,
licensed under the Apache v2 license.
• A software platform that lets one easily write and run applications that process vast amounts
of data. It includes MapReduce and HDFS.
• Goals / Requirements:
• Abstract and facilitate the storage and processing of large and/or rapidly growing data sets
• Fault-tolerance
Hadoop's Architecture:
• Distributed, with some centralization
• Main nodes of cluster are where most of the computational power and storage of the system
lies
• Main nodes run TaskTracker to accept and reply to MapReduce tasks, and also DataNode to
store needed blocks as closely as possible
• Central control node runs NameNode to keep track of HDFS directories & files, and JobTracker
to dispatch compute tasks to TaskTracker
• Written in Java, also supports Python and Ruby
MapReduce Engine:
• JobTracker & TaskTracker
• JobTracker splits up data into smaller tasks ("Map") and sends them to the TaskTracker
process in each node
• TaskTracker reports back to the JobTracker node on job progress, sends data ("Reduce"), or
requests new jobs
• None of these components is necessarily limited to using HDFS
• Many other distributed file systems with quite different architectures work
• Many other software packages besides Hadoop's MapReduce platform make use of HDFS
• Hadoop is in use at many organizations that handle big data:
• Yahoo
• Facebook
• Amazon
• Netflix, etc.
MapReduce:
• Hadoop implements Google’s MapReduce, using HDFS
• MapReduce divides applications into many small blocks of work.
• HDFS creates multiple replicas of data blocks for reliability, placing them on
compute nodes around the cluster.
• MapReduce can then process the data where it is located.
• MapReduce is sort/merge-based distributed computing
• The underlying system takes care of the partitioning of the input data, scheduling the
program’s execution across several machines, handling machine failures, and
managing required inter-machine communication. (This is the key for Hadoop’s
success)
• The run time partitions the input and provides it to different Map instances;
• MapReduce Usage
✓ Log processing
✓ Web search indexing
✓ Ad-hoc queries
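The classic word-count job can be sketched in the Hadoop Streaming style, where the mapper emits (key, value) pairs, the framework sorts them by key, and the reducer aggregates each sorted group (a local simulation in plain Python, not the actual Streaming command line):

```python
import itertools

# Word count in the Hadoop Streaming style: the mapper turns each input
# line into (word, 1) pairs, the framework sorts the pairs by key, and
# the reducer sums the counts for each word. The sort-and-group step is
# simulated locally here instead of being done by the cluster.

def mapper(lines):
    for line in lines:
        for word in line.split():
            yield (word.lower(), 1)

def reducer(sorted_pairs):
    # Pairs arrive sorted by key, so each word forms one contiguous group.
    for word, group in itertools.groupby(sorted_pairs, key=lambda kv: kv[0]):
        yield (word, sum(count for _, count in group))

lines = ["Hadoop runs MapReduce", "MapReduce runs on Hadoop"]
pairs = sorted(mapper(lines))       # the "shuffle and sort" step
counts = dict(reducer(pairs))
print(counts["hadoop"], counts["mapreduce"])  # → 2 2
```

With Hadoop Streaming, the same mapper and reducer logic would read lines from stdin and write tab-separated pairs to stdout, and the JobTracker/TaskTracker machinery described above would handle the partitioning, sorting, and scheduling.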
• JobClient
Submits the job
• JobTracker
Manages and schedules the job, splits the job into tasks
• TaskTracker
Starts and monitors the task execution
• Child
The process that really executes the task
Protocols:
JobSubmissionProtocol
JobClient <-------------> JobTracker
InterTrackerProtocol
TaskTracker <------------> JobTracker
TaskUmbilicalProtocol
TaskTracker <-------------> Child
JobTracker implements both protocols and works as the server in both IPCs.
TaskTracker implements the TaskUmbilicalProtocol; the Child gets task
information and reports task status through it.
HDFS:
The Hadoop Distributed File System (HDFS) is a distributed file system
designed to run on commodity hardware. It has many similarities with existing
distributed file systems. However, the differences from other distributed file
systems are significant.
http://hadoop.apache.org/core/.
HDFS Architecture:
• Block Server
• Block Report
• Block Placement
• Replication Strategy
• Data Correctness
• File Creation
• File Access