Introduction To MapReduce

AIDS – B.E – BDA

Dr. Pooja K Revankar
Assistant Professor,
Dept. of Computer Science and Engg.,
SIES Graduate School of Technology

Traditional Way of Parallel & Distributed Processing

Solution: The MapReduce Framework
1. MAPPER: software that performs the assigned task after organizing the
imported data blocks by key.
2. REDUCER: software that reduces the mapped data by aggregation.
3. AGGREGATION: groups the values of multiple rows together to
produce a single value of more significant meaning or measurement.
4. QUERYING FUNCTION: for example, finding the best student of a class.

Features of MapReduce
1. Provides automatic parallelization and distribution of computation
across several processors.
2. Processes data stored on distributed clusters of DataNodes and
racks.
3. Scales to a large number of servers.
4. Provides the batch-oriented MapReduce programming model in
Hadoop version 1.
5. Provides additional processing modes in the Hadoop 2 YARN-based
system and enables the parallel processing required for data with 3V
characteristics (volume, velocity, variety).

What is MapReduce?
● MapReduce is a programming framework.
● It allows us to perform parallel processing on large data sets in a
distributed environment.

Main Phases of MapReduce
● Map: each worker node applies the map function to its local data
and writes the output to temporary storage. A master node
ensures that only one copy of redundant input data is
processed.
● Shuffle: worker nodes redistribute data based on the output keys
(produced by the map function), such that all data belonging to one
key is located on the same worker node.
● Reduce: worker nodes process each group of output data, per
key, in parallel.
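
The three phases can be sketched in a few lines of single-process Python. This is only an illustration using word count, not Hadoop code: the function names `map_fn`, `shuffle`, `reduce_fn` and `word_count` and the sample documents are assumptions made for the example.

```python
from collections import defaultdict

def map_fn(document):
    """Map: emit a (word, 1) pair for every word in one input record."""
    for word in document.lower().split():
        yield word, 1

def shuffle(pairs):
    """Shuffle: bring all values that share a key together."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_fn(key, values):
    """Reduce: collapse the list of values for one key into a single count."""
    return key, sum(values)

def word_count(documents):
    # On a real cluster each document (or block) would be mapped on a different worker.
    mapped = [pair for doc in documents for pair in map_fn(doc)]
    grouped = shuffle(mapped)
    return dict(reduce_fn(key, values) for key, values in grouped.items())

if __name__ == "__main__":
    # Hypothetical sample input.
    docs = ["deer bear river", "car car river", "deer car bear"]
    print(word_count(docs))
```

Each function corresponds to one phase, so the grouping done by `shuffle` is the only place where data for the same key has to meet.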

MapReduce Job Execution Flow

Mapper in Hadoop MapReduce
● A Hadoop Mapper task processes each input record and generates
new <key, value> pairs.
● The output <key, value> pairs can be completely different from the input
pair.
● The output of a mapper task is the full collection of all these <key,
value> pairs.

Reducer in Hadoop MapReduce
1. The input to the reducer is the <key, value> pairs output by the mapper.
2. The Hadoop Reducer takes the set of intermediate key-value pairs
produced by the mapper as input and runs a Reducer function
on each of them.
3. One can aggregate, filter, and combine this (key, value) data in a
number of ways for a wide range of processing.
4. The Reducer first processes the intermediate values for a particular key
generated by the map function and then generates the output (zero
or more key-value pairs).

Shuffling and Sorting

https://d2h0cx97tjks2p.cloudfront.net/blogs/wp-content/uploads/sites/2/2017/01/hadoop-mapreduce-data-flow-execution-1.gif

Word Count using MapReduce Algorithm

MapReduce Examples

Daemons used in MapReduce programming

JobTracker
• The JobTracker process runs on a separate node, not usually on a DataNode.
• JobTracker is an essential daemon for MapReduce execution in MRv1. It is
replaced by ResourceManager/ApplicationMaster in MRv2.
• JobTracker receives requests for MapReduce execution from the client.
• JobTracker talks to the NameNode to determine the location of the data.
• JobTracker finds the best TaskTracker nodes to execute tasks, based on
data locality (proximity of the data) and the slots available to execute a task on
a given node.
• JobTracker monitors the individual TaskTrackers and reports the
overall status of the job back to the client.
• The JobTracker process is critical to the Hadoop cluster in terms of MapReduce
execution.
• When the JobTracker is down, HDFS is still functional, but MapReduce
execution cannot be started and existing MapReduce jobs are halted.

TaskTracker
• TaskTracker runs on DataNodes, usually on all of them.
• TaskTracker is replaced by NodeManager in MRv2.
• Mapper and Reducer tasks are executed on DataNodes administered by
TaskTrackers.
• The JobTracker assigns Mapper and Reducer tasks to TaskTrackers for
execution.
• TaskTracker is in constant communication with the JobTracker, signalling
the progress of the task in execution.
• TaskTracker failure is not considered fatal. When a TaskTracker becomes
unresponsive, the JobTracker assigns the tasks it was executing to
another node.

Matrix-Vector Multiplication by MapReduce

●MapReduce was created to execute very large matrix-vector multiplications.
●In the ranking of Web pages that goes on at search engines, n (the matrix dimension) is in the tens of billions.
●PageRank is an iterative algorithm.
●Also useful for simple (memory-based) recommender systems.

Problem Statement
Given,

●An n × n matrix M, whose element in row i and column j is denoted 𝑚𝑖𝑗.
●A vector v of length n.
●Assume that
○The row-column coordinates of each matrix element are discoverable, either from its position in the file, or because it is stored with explicit coordinates, as a triple (i, j, 𝑚𝑖𝑗).
○The position of element 𝑣𝑗 in the vector v is discoverable in the analogous way.
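
For reference, the quantity to be computed is the matrix-vector product. With x denoting the result vector (the name used for the output later in these slides), the standard definition of its i-th component is:

```latex
x_i = \sum_{j=1}^{n} m_{ij} \, v_j , \qquad i = 1, \dots, n
```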

Algorithm for Map Function

Algorithm for Reduce Function:

Computing the mapper for Matrix A
# Each of k, i and j takes the values 1 and 2.
# So for k=1, i can take 2 values (1 and 2), and for each i, j can take 2 values (1 and 2).
# Substituting all values into the mapper formula, which emits ((i, k), (A, j, Aij)):

k=1  i=1  j=1  ((1, 1), (A, 1, 1))
k=1  i=1  j=2  ((1, 1), (A, 2, 2))
k=1  i=2  j=1  ((2, 1), (A, 1, 3))
k=1  i=2  j=2  ((2, 1), (A, 2, 4))
k=2  i=1  j=1  ((1, 2), (A, 1, 1))
k=2  i=1  j=2  ((1, 2), (A, 2, 2))
k=2  i=2  j=1  ((2, 2), (A, 1, 3))
k=2  i=2  j=2  ((2, 2), (A, 2, 4))

Computing the mapper for Matrix B

i=1  j=1  k=1  ((1, 1), (B, 1, 5))
i=1  j=1  k=2  ((1, 2), (B, 1, 6))
i=1  j=2  k=1  ((1, 1), (B, 2, 7))
i=1  j=2  k=2  ((1, 2), (B, 2, 8))
i=2  j=1  k=1  ((2, 1), (B, 1, 5))
i=2  j=1  k=2  ((2, 2), (B, 1, 6))
i=2  j=2  k=1  ((2, 1), (B, 2, 7))
i=2  j=2  k=2  ((2, 2), (B, 2, 8))

The formula for Reducer is:

Reducer(k, v): for each key (i, k), make sorted Alist and Blist
(i, k) => Summation over j of (Aij * Bjk)
Output => ((i, k), sum)

Computing the reducer:

# From the mapper computation we can observe that four keys occur:
# (1, 1), (1, 2), (2, 1) and (2, 2).
# For each key, make separate lists for Matrix A and Matrix B from the values
# emitted in the mapper step.

(1, 1) => Alist = {(A, 1, 1), (A, 2, 2)}
          Blist = {(B, 1, 5), (B, 2, 7)}
          Aij x Bjk: [(1*5) + (2*7)] = 19 -------(i)

(1, 2) => Alist = {(A, 1, 1), (A, 2, 2)}
          Blist = {(B, 1, 6), (B, 2, 8)}
          Aij x Bjk: [(1*6) + (2*8)] = 22 -------(ii)

(2, 1) => Alist = {(A, 1, 3), (A, 2, 4)}
          Blist = {(B, 1, 5), (B, 2, 7)}
          Aij x Bjk: [(3*5) + (4*7)] = 43 -------(iii)

(2, 2) => Alist = {(A, 1, 3), (A, 2, 4)}
          Blist = {(B, 1, 6), (B, 2, 8)}
          Aij x Bjk: [(3*6) + (4*8)] = 50 -------(iv)

From (i), (ii), (iii) and (iv) we conclude that

((1, 1), 19)
((1, 2), 22)
((2, 1), 43)
((2, 2), 50)
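
The mapper tables and the reducer computation above can be reproduced with a short single-process Python sketch. The matrices A = [[1, 2], [3, 4]] and B = [[5, 6], [7, 8]] are read off the mapper outputs; the function names `map_a`, `map_b`, `reduce_cell` and `multiply` are hypothetical and chosen only for this illustration.

```python
from collections import defaultdict

# Matrices inferred from the mapper outputs in the worked example.
A = [[1, 2], [3, 4]]
B = [[5, 6], [7, 8]]

def map_a(A, n_cols_b):
    """For each A[i][j], emit ((i, k), ('A', j, A[i][j])) for every column k of B."""
    for i, row in enumerate(A, start=1):
        for j, a_ij in enumerate(row, start=1):
            for k in range(1, n_cols_b + 1):
                yield (i, k), ("A", j, a_ij)

def map_b(B, n_rows_a):
    """For each B[j][k], emit ((i, k), ('B', j, B[j][k])) for every row i of A."""
    for j, row in enumerate(B, start=1):
        for k, b_jk in enumerate(row, start=1):
            for i in range(1, n_rows_a + 1):
                yield (i, k), ("B", j, b_jk)

def reduce_cell(values):
    """Build Alist and Blist indexed by j, then sum the products Aij * Bjk."""
    a_list = {j: x for name, j, x in values if name == "A"}
    b_list = {j: x for name, j, x in values if name == "B"}
    return sum(a_list[j] * b_list[j] for j in a_list)

def multiply(A, B):
    groups = defaultdict(list)  # shuffle: group the mapper output by key (i, k)
    for key, value in list(map_a(A, len(B[0]))) + list(map_b(B, len(A))):
        groups[key].append(value)
    return {key: reduce_cell(values) for key, values in sorted(groups.items())}

if __name__ == "__main__":
    for key, total in multiply(A, B).items():
        print(key, total)
```

Indexing the two lists by j inside `reduce_cell` plays the role of the "sorted Alist and Blist" step: it lines up each Aij with the Bjk that has the same j.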

Solution

●Case I: n is large, but not so large that vector v cannot fit in main memory.

●Case II: n is so large that vector v cannot fit in main memory.

The MapReduce Phase

Case I: n is large, but not so large that vector v cannot fit in
main memory
●The Map Function:
○ The Map function is written to apply to one element of M.
○ v is first read into the compute node executing a Map task and is
available for all applications of the Map function at that compute
node.
○ Each Map task operates on a chunk of the matrix M.
○ From each matrix element 𝑚𝑖𝑗 it produces the key-value pair (i, 𝑚𝑖𝑗𝑣𝑗).

The Reduce Function:
• The Reduce function simply sums all the values
associated with a given key i.
• The result will be a pair (i, 𝑥𝑖).

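Map(i, j, 𝑚𝑖𝑗)

\[
\underbrace{\begin{bmatrix} 1 & 4 & 7 \\ 2 & 5 & 8 \\ 3 & 6 & 9 \end{bmatrix}}_{M = (m_{ij})}
\underbrace{\begin{bmatrix} 1 \\ 2 \\ 3 \end{bmatrix}}_{v = (v_j)}
=
\underbrace{\begin{bmatrix} x_0 \\ x_1 \\ x_2 \end{bmatrix}}_{x = (x_i)}
\]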

Map Function

#   (i, j, 𝑚𝑖𝑗)   𝑣𝑗   Output (i, 𝑚𝑖𝑗𝑣𝑗)
1   (0, 0, 1)     1    (0, 1)
2   (0, 1, 4)     2    (0, 8)
3   (0, 2, 7)     3    (0, 21)
4   (1, 0, 2)     1    (1, 2)
5   (1, 1, 5)     2    (1, 10)
6   (1, 2, 8)     3    (1, 24)
7   (2, 0, 3)     1    (2, 3)
8   (2, 1, 6)     2    (2, 12)
9   (2, 2, 9)     3    (2, 27)

Shuffle and Reduce

𝑥0 = 1 + 8 + 21 = 30
𝑥1 = 2 + 10 + 24 = 36
𝑥2 = 3 + 12 + 27 = 42
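
The worked example can be checked with a minimal single-process Python sketch of Case I, where the whole vector v is available to every Map call. The names `map_fn`, `reduce_fn` and `mat_vec` are assumptions made for this illustration; the data is the 3 × 3 example above with its 0-based indices.

```python
from collections import defaultdict

# Matrix stored as explicit (i, j, m_ij) triples, plus the vector v.
M = [(0, 0, 1), (0, 1, 4), (0, 2, 7),
     (1, 0, 2), (1, 1, 5), (1, 2, 8),
     (2, 0, 3), (2, 1, 6), (2, 2, 9)]
v = [1, 2, 3]

def map_fn(triple, v):
    """Map: from one matrix element produce the pair (i, m_ij * v_j)."""
    i, j, m_ij = triple
    return i, m_ij * v[j]

def reduce_fn(i, values):
    """Reduce: sum all values associated with key i, giving (i, x_i)."""
    return i, sum(values)

def mat_vec(M, v):
    groups = defaultdict(list)  # shuffle: collect the products for each row index
    for triple in M:
        i, product = map_fn(triple, v)
        groups[i].append(product)
    return [reduce_fn(i, groups[i])[1] for i in sorted(groups)]

if __name__ == "__main__":
    print(mat_vec(M, v))
```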

Output:

Case II: n is so large that vector v cannot fit in main memory
● v should be stored at the compute nodes used for the Map tasks, but here
it does not fit in main memory as a whole.

● Divide the matrix into vertical stripes of equal width and divide the vector
into an equal number of horizontal stripes of the same height.

● The goal is to use enough stripes so that the portion of the vector in one
stripe fits conveniently into main memory at a compute node.

Division of a matrix and vector into five stripes

Continued
● The ith stripe of the matrix multiplies only components from the ith stripe of
the vector.
● Divide the matrix into one file for each stripe, and do the same for the
vector.
● Each Map task is assigned a chunk from one of the stripes of the matrix
and gets the entire corresponding stripe of the vector.
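
The striping idea can be sketched in single-process Python as follows. This is an illustration under assumptions, not a cluster implementation: the function name `striped_mat_vec` and the `width` parameter (number of columns per stripe) are hypothetical, and the stripes are held in ordinary dictionaries rather than in separate files.

```python
from collections import defaultdict

def striped_mat_vec(triples, v, width):
    """Multiply a matrix given as (i, j, m_ij) triples by v, one stripe at a time."""
    # Vertical stripe s of the matrix holds columns s*width .. (s+1)*width - 1.
    matrix_stripes = defaultdict(list)
    for i, j, m_ij in triples:
        matrix_stripes[j // width].append((i, j, m_ij))

    pairs = []
    for s, chunk in matrix_stripes.items():
        # Only the matching horizontal stripe of v is needed for this Map task.
        v_stripe = v[s * width:(s + 1) * width]
        for i, j, m_ij in chunk:
            # Map output (i, m_ij * v_j), with j shifted to an index within the stripe.
            pairs.append((i, m_ij * v_stripe[j - s * width]))

    sums = defaultdict(int)
    for i, product in pairs:  # reduce: sum the products for each row index i
        sums[i] += product
    return [sums[i] for i in sorted(sums)]

if __name__ == "__main__":
    triples = [(0, 0, 1), (0, 1, 4), (0, 2, 7),
               (1, 0, 2), (1, 1, 5), (1, 2, 8),
               (2, 0, 3), (2, 1, 6), (2, 2, 9)]
    print(striped_mat_vec(triples, [1, 2, 3], width=2))
```

The Reduce step is unchanged from Case I; striping only limits how much of v each Map task has to hold.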

The reduce( ) step in the MapReduce Algorithm for matrix multiplication

•The inputs to the reduce( ) step
(function) of the MapReduce algorithm are:
•One row vector from matrix A.
•One column vector from matrix B.

The reduce( ) function will compute

Preprocessing for the map( ) function
•The map( ) function (really) only has one input stream, of the format (key_i, value_i)

Pre-processing used for matrix multiplication:

Overview of the MapReduce Algorithm for Matrix Multiplication

•The map( ) will duplicate N times as follows
where N = # rows in matrix A (= # columns in matrix B)

Relational-Algebra Operations
●There are a number of operations on large-scale data that are used in database queries.
●Many traditional database applications involve retrieval of small amounts of data, even though the database itself may be large.
●For example, a query may ask for the bank balance of one particular account. Such queries are not useful applications of MapReduce.
●There are several standard operations on relations, often referred to as relational algebra, that are used to implement queries.
●The queries themselves usually are written in SQL.

Relational-Algebra Operations

●Selections

●Projections

●Union

●Intersection

●Difference

Representation of a table in HDFS

Selection in MapReduce
●Selection can be done most conveniently in the map portion alone, although it could also be done in the reduce portion alone.
●Here is a MapReduce implementation of selection σC(R).
●The Map Function:
○For each tuple t in R, test if it satisfies C.
■If so, produce the key-value pair (t, t). That is, both the key and value are t.
●The Reduce Function: The Reduce function is the identity. It simply passes each key-value pair to the output.

Selection in MapReduce
● Selection: σC(R)
○ Apply condition C to each tuple of relation R.
○ Output a relation containing only the tuples that satisfy C.

Selection in MapReduce

For our example we will compute Selection(B <= 3): select all the rows where the value of B is less than or
equal to 3.
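
A minimal Python sketch of Selection(B <= 3) is shown here. The table contents are hypothetical (the original example table is not reproduced in this text), rows are (A, B) tuples, and the names `select_map`, `select_reduce` and `selection` are assumptions made for the illustration.

```python
def select_map(row, condition):
    """Map: emit (t, t) only for tuples that satisfy the condition."""
    if condition(row):
        yield row, row

def select_reduce(key, values):
    """Reduce: identity, pass each key-value pair straight to the output."""
    for value in values:
        yield key, value

def selection(table, condition):
    groups = {}  # shuffle: group map output by key (dicts keep insertion order)
    for row in table:
        for key, value in select_map(row, condition):
            groups.setdefault(key, []).append(value)
    return [k for key, values in groups.items() for k, _ in select_reduce(key, values)]

if __name__ == "__main__":
    # Hypothetical table with columns (A, B).
    table = [(1, 2), (5, 4), (6, 3), (2, 7)]
    print(selection(table, lambda row: row[1] <= 3))
```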

Selection in MapReduce

Based on the number of reduce workers (2 in our case), the files for the reduce workers on the map workers
will look like:

Output of Selection

6 3

Projection Using MapReduce

• Map Function: For each row r in the table, produce a key-value pair (r', r'), where r'
contains only the columns wanted in the projection.

• Reduce Function: The reduce function receives input of the form r': [r', r', r', r',
...]. Because removing some columns may leave duplicate rows, it
just takes the value at index 0, which gets rid of the duplicates.
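
A minimal sketch of projection onto columns (A, B), assuming rows are Python dictionaries and using a hypothetical three-column table. The names `project_map`, `project_reduce` and `projection` are not from the slides.

```python
from collections import defaultdict

def project_map(row, columns):
    """Map: keep only the wanted columns and emit (r', r')."""
    r_prime = tuple(row[c] for c in columns)
    return r_prime, r_prime

def project_reduce(key, values):
    """Reduce: the list holds copies of r'; keep the one at index 0."""
    return values[0]

def projection(table, columns):
    groups = defaultdict(list)  # shuffle: identical r' tuples land in the same list
    for row in table:
        key, value = project_map(row, columns)
        groups[key].append(value)
    return [project_reduce(key, values) for key, values in groups.items()]

if __name__ == "__main__":
    # Hypothetical table; the first two rows collide after dropping column C.
    table = [{"A": 1, "B": 2, "C": 3},
             {"A": 1, "B": 2, "C": 9},
             {"A": 4, "B": 5, "C": 6}]
    print(projection(table, ("A", "B")))
```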

Projection in MapReduce

●Projection is performed similarly to selection. However, because projection may cause the same tuple to appear several times, the Reduce function must eliminate duplicates.
●We may compute as follows.

computing projection(A, B)

Union Using MapReduce

• Map Function: For each row r generate the key-value pair (r, r).

• Reduce Function: Each key has one or two values (two when the row appears
in both tables, since neither table contains duplicate rows). In either case, just output the first value.
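
A minimal sketch of union, assuming rows are tuples and neither input table contains duplicate rows. The names `union_map`, `union_reduce` and `union` and the sample tables are hypothetical.

```python
from collections import defaultdict

def union_map(row):
    """Map: emit (r, r) for every row of either table."""
    return row, row

def union_reduce(key, values):
    """Reduce: the list has one or two copies of the row; output the first."""
    return values[0]

def union(table1, table2):
    groups = defaultdict(list)  # shuffle: equal rows from both tables share a key
    for row in table1 + table2:
        key, value = union_map(row)
        groups[key].append(value)
    return [union_reduce(key, values) for key, values in groups.items()]

if __name__ == "__main__":
    print(union([(1, 2), (3, 4)], [(3, 4), (5, 6)]))
```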

Union Using Map Reduce

Intersection Using MapReduce

• Map Function: For each row r generate the key-value pair (r, r) (same as
union).

• Reduce Function: Each key has one or two values (since neither table
contains duplicate rows). If the list has length 2, output the
first value; otherwise output nothing.
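
A minimal sketch of intersection under the same assumption (tuple rows, no duplicate rows within a table). The names `intersect_map`, `intersect_reduce` and `intersection` and the sample tables are hypothetical.

```python
from collections import defaultdict

def intersect_map(row):
    """Map: emit (r, r), exactly as for union."""
    return row, row

def intersect_reduce(key, values):
    """Reduce: output the row only when both tables contributed a copy."""
    if len(values) == 2:
        yield values[0]

def intersection(table1, table2):
    groups = defaultdict(list)  # shuffle: equal rows from both tables share a key
    for row in table1 + table2:
        key, value = intersect_map(row)
        groups[key].append(value)
    return [row for key, values in groups.items() for row in intersect_reduce(key, values)]

if __name__ == "__main__":
    print(intersection([(1, 2), (3, 4)], [(3, 4), (5, 6)]))
```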

Intersection Using Map Reduce

Difference Using MapReduce

• Map Function: For each row r, produce the key-value pair (r, T1) if the row is from
table 1; otherwise produce the key-value pair (r, T2).

• Reduce Function: Output the row if and only if the only value in the list is T1;
otherwise output nothing.
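
A minimal sketch of the difference table 1 minus table 2, assuming tuple rows without duplicates inside a table. The tags "T1" and "T2" follow the description above; the function names and sample tables are hypothetical.

```python
from collections import defaultdict

def difference_map(row, table_name):
    """Map: tag each row with the name of the table it came from."""
    return row, table_name

def difference_reduce(key, values):
    """Reduce: output the row only if its value list is exactly ['T1']."""
    if values == ["T1"]:
        yield key

def difference(table1, table2):
    groups = defaultdict(list)  # shuffle: collect the tags seen for each row
    tagged = [(row, "T1") for row in table1] + [(row, "T2") for row in table2]
    for row, name in tagged:
        key, value = difference_map(row, name)
        groups[key].append(value)
    return [row for key, values in groups.items() for row in difference_reduce(key, values)]

if __name__ == "__main__":
    print(difference([(1, 2), (3, 4)], [(3, 4), (5, 6)]))
```

Sending only the table name as the value, rather than the whole row, keeps the shuffled data small.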

Difference Using Map Reduce

Grouping and Aggregation Using MapReduce

• Map Function: For each row in the table, use the attributes on which grouping is
to be done as the key, and the attributes on which aggregation is to be
performed as the value.
• For example, if a relation has 4 columns A, B, C, D and we want to group by A, B
and do an aggregation on C, we make (A, B) the key and C the value.
• Reduce Function: Apply the aggregation operation (sum, max, min, avg, …) to the
list of values and output the result.
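
A minimal sketch of "group by (A, B), aggregate C" for the four-column relation described above. The rows are hypothetical (A, B, C, D) tuples, and `group_map`, `group_reduce` and `group_by` are names chosen for the illustration.

```python
from collections import defaultdict

def group_map(row):
    """Map: key is the grouping attributes (A, B); value is the aggregated attribute C."""
    a, b, c, _d = row
    return (a, b), c

def group_reduce(key, values, aggregate):
    """Reduce: apply the aggregation operation to the list of values."""
    return key, aggregate(values)

def group_by(table, aggregate=sum):
    groups = defaultdict(list)  # shuffle: collect the C values for each (A, B)
    for row in table:
        key, value = group_map(row)
        groups[key].append(value)
    return dict(group_reduce(key, values, aggregate) for key, values in groups.items())

if __name__ == "__main__":
    # Hypothetical rows (A, B, C, D).
    table = [(1, 2, 10, "x"), (1, 2, 5, "y"), (3, 4, 7, "z")]
    print(group_by(table))        # sum(C) per group
    print(group_by(table, max))   # max(C) per group
```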

Grouping and Aggregation Using Map Reduce

Output of group by (A, B) sum(C)

Natural Join Using MapReduce

• Map Function: For two relations Table 1(A, B) and Table 2(B, C), the map
function creates key-value pairs of the form b: [(T1, a)] for table 1, where T1
records that the value a came from table 1; for table 2 the key-value
pairs are of the form b: [(T2, c)].

• Reduce Function: For a given key b, construct all possible combinations
of values in which one value is from table T1 and the other is from
table T2. The output consists of key-value pairs of the form b: [(a, c)], each
representing one row (a, b, c) of the output table.
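
A minimal sketch of the natural join of Table 1(A, B) with Table 2(B, C). The sample rows are hypothetical and `join_map`, `join_reduce` and `natural_join` are names made up for the illustration; for readability the reducer emits the full row (a, b, c) rather than the pair b: [(a, c)].

```python
from collections import defaultdict

def join_map(table1, table2):
    """Map: key every row by its B value and tag it with its source table."""
    for a, b in table1:
        yield b, ("T1", a)
    for b, c in table2:
        yield b, ("T2", c)

def join_reduce(b, values):
    """Reduce: combine every T1 value with every T2 value that shares key b."""
    left = [x for tag, x in values if tag == "T1"]
    right = [x for tag, x in values if tag == "T2"]
    for a in left:
        for c in right:
            yield a, b, c

def natural_join(table1, table2):
    groups = defaultdict(list)  # shuffle: group the tagged values by join key b
    for key, value in join_map(table1, table2):
        groups[key].append(value)
    return [row for b, values in groups.items() for row in join_reduce(b, values)]

if __name__ == "__main__":
    t1 = [(1, "x"), (2, "x"), (3, "y")]   # hypothetical rows (a, b)
    t2 = [("x", 10), ("z", 20)]           # hypothetical rows (b, c)
    print(natural_join(t1, t2))
```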

Natural Join Using Map Reduce

Projection in MapReduce

●The Map Function: For each tuple t in R, construct a tuple t′ by eliminating from t those components whose attributes are not in S. Output the key-value pair (t′, t′).
●The Reduce Function: For each key t′ produced by any of the Map tasks, there will be one or more key-value pairs (t′, t′). The Reduce function turns (t′, [t′, t′, . . . , t′]) into (t′, t′), so it produces exactly one pair (t′, t′) for this key.

Projection in MapReduce
● Projection: πS(R)
○ Given a subset S of the attributes of relation R.
○ Output a relation containing, for each tuple, only the components for the attributes in S.

Projection in MapReduce
● Similar process to selection.
○ But projection may cause the same tuple to appear several times!
● A MapReduce implementation of πS(R)
○ Map: For each tuple t in R, construct a tuple t′ by eliminating those
components whose attributes are not in S. Emit a key/value pair (t′, t′).
○ Reduce: For each key t′ produced by any of the Map tasks, fetch (t′, [t′, ···, t′]).
Emit a key/value pair (t′, t′).

Union
●Suppose relations R and S have the same schema.
●Map tasks will be assigned chunks from either R or S; it doesn’t matter which.
●The Map tasks don’t really do anything except pass their input tuples as key-value pairs to the Reduce tasks.
●The latter need only eliminate duplicates as for projection.
○The Map Function: Turn each input tuple t into a key-value pair (t, t).
○The Reduce Function: Associated with each key t there will be either one or two values. Produce output (t, t) in either case.

Intersection
●To compute the intersection, we can use the same Map function.
●However, the Reduce function must produce a tuple only if both relations have the tuple. If the key t has a list of two values [t, t] associated with it, then the Reduce task for t should produce (t, t).
●However, if the value-list associated with key t is just [t], then one of R and S is missing t, so we don’t want to produce a tuple for the intersection.
●The Map Function: Turn each tuple t into a key-value pair (t, t).
●The Reduce Function: If key t has value list [t, t], then produce (t, t). Otherwise, produce nothing.

Difference
●The Map Function: For a tuple t in R, produce key-value pair (t, R), and for a tuple t in S, produce key-value pair (t, S). Note that the intent is that the value is the name of R or S (or better, a single bit indicating whether the relation is R or S), not the entire relation.
●The Reduce Function: For each key t, if the associated value list is [R], then produce (t, t). Otherwise, produce nothing.

https://medium.com/swlh/relational-operations-using-mapreduce-f49e8bd14e31

Thank You!!!
([email protected])
