UNIT-4 PPT
4.8 Shuffle and Sort:
MapReduce makes the guarantee that the input to every reducer is sorted by key. The process by
which the system performs the sort—and transfers the map outputs to the reducers as inputs—is
known as the shuffle.
1. The Map Side
When the map function starts producing output, it is not simply written to disk. Each map task
has a circular memory buffer that it writes the output to. The buffer is 100 MB by default, a size
which can be tuned by changing the io.sort.mb property.
When the contents of the buffer reach a certain threshold size (io.sort.spill.percent, default 0.80,
or 80%), a background thread will start to spill the contents to disk.
Map outputs will continue to be written to the buffer while the spill takes place, but if the buffer
fills up during this time, the map will block until the spill is complete.
Spills are written in round-robin fashion to the directories specified by the mapred.local.dir
property, in a job-specific subdirectory.
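As an illustration, these map-side settings can be tuned per job through the job configuration. A minimal sketch using the old-API JobConf and the classic property names quoted above; the class name and values are only examples:

import org.apache.hadoop.mapred.JobConf;

public class MapSideTuning {
    public static void main(String[] args) {
        JobConf conf = new JobConf(MapSideTuning.class);
        // Enlarge the in-memory sort buffer from the 100 MB default (example value).
        conf.setInt("io.sort.mb", 200);
        // Spill to disk once the buffer is 90% full instead of the 80% default.
        conf.setFloat("io.sort.spill.percent", 0.90f);
        // ... configure mapper, reducer, input and output paths, then submit the job as usual ...
    }
}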
2. The Reduce Side:
Let’s turn now to the reduce part of the process.
The map output file is sitting on the local disk of the machine that ran the map task, but now it is
needed by the machine that is about to run the reduce task for the partition. The reduce task needs
the map output for its particular partition from several map tasks across the cluster.
This is the copy phase of the reduce task. The reduce task has a small number of copier threads so
that it can fetch map outputs in parallel. The default is five threads, but this number can be changed
by setting the mapred.reduce.parallel.copies property.
The map outputs are copied to the reduce task JVM's memory if they are small enough; otherwise
they are copied to disk. When the in-memory buffer reaches a threshold size, or reaches a
threshold number of map outputs (mapred.inmem.merge.threshold), it is merged and spilled to
disk.
When all the map outputs have been copied, the reduce task moves into the merge phase, which
merges the map outputs, maintaining their sort ordering.
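The reduce-side copy and merge behaviour is tunable in the same way. A minimal sketch, again with the classic property names and example values:

import org.apache.hadoop.mapred.JobConf;

public class ReduceSideTuning {
    public static void main(String[] args) {
        JobConf conf = new JobConf(ReduceSideTuning.class);
        // Number of parallel copier threads used to fetch map outputs (default 5).
        conf.setInt("mapred.reduce.parallel.copies", 10);
        // Merge and spill to disk after this many map outputs have accumulated in memory.
        conf.setInt("mapred.inmem.merge.threshold", 500);
        // ... configure the rest of the job as usual ...
    }
}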
4.9 Failures in YARN:
1. Task Failures: Failure of a running task is similar to the classic case.
Runtime exceptions and sudden exits of the JVM are propagated back to the
application master and the task attempt is marked as failed.
The configuration properties for determining when a task is considered to be
failed are the same as the classic case: a task is marked as failed after four
attempts.
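A hedged sketch of how the attempt limits could be raised for a particular job, assuming the classic property names mapred.map.max.attempts and mapred.reduce.max.attempts (both default to 4):

import org.apache.hadoop.mapred.JobConf;

public class TaskRetryConfig {
    public static void main(String[] args) {
        JobConf conf = new JobConf(TaskRetryConfig.class);
        // A task is marked as failed after this many attempts (default 4).
        conf.setInt("mapred.map.max.attempts", 6);
        conf.setInt("mapred.reduce.max.attempts", 6);
        // ... configure the rest of the job as usual ...
    }
}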
2. Application Master Failure: An application master sends periodic
heartbeats to the resource manager, and in the event of application master failure,
the resource manager will detect the failure and start a new instance of the master
running in a new container (managed by a node manager).
In the case of the MapReduce application master, it can recover the state of the
tasks that had already been run by the (failed) application so they don’t have to
be rerun.
3. Node Manager Failure:
If a node manager fails, then it will stop sending heartbeats to the resource manager, and the node
manager will be removed from the resource manager’s pool of available nodes.
The property yarn.resourcemanager.nm.liveness-monitor.expiry-interval-ms, which defaults to 600000
(10 minutes), determines how long the resource manager waits before considering a node manager
that has sent no heartbeat in that time to have failed.
Node managers may be blacklisted if the number of failures for the application is high. Blacklisting is
done by the application master, and for MapReduce the application master will try to reschedule tasks
on different nodes if more than three tasks fail on a node manager.
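The expiry interval is a cluster-level setting that would normally live in yarn-site.xml rather than in job code; the sketch below only illustrates the property name and its units (milliseconds), with an example value of 5 minutes:

import org.apache.hadoop.conf.Configuration;

public class NodeManagerLiveness {
    public static void main(String[] args) {
        Configuration conf = new Configuration();
        // Declare a node manager failed after 5 minutes without heartbeats (default is 10 minutes).
        conf.setLong("yarn.resourcemanager.nm.liveness-monitor.expiry-interval-ms", 300000L);
        System.out.println(conf.get("yarn.resourcemanager.nm.liveness-monitor.expiry-interval-ms"));
    }
}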
4. Resource Manager Failure:
Failure of the resource manager is serious, since without it neither jobs nor task containers can be
launched.
After a crash, a new resource manager instance is brought up (by an administrator) and it recovers from
the saved state. The state consists of the node managers in the system as well as the running
applications.
4.10 TASK EXECUTION:
1. The Task Execution Environment
2. Speculative Execution
3. Output Committers
1. The Task Execution Environment:
Hadoop provides information to a map or reduce task
about the environment in which it is running. For
example, a map task can discover the name of the file
it is processing, and a map or reduce task can find out
the attempt number of the task.
These details are exposed to each task as properties in its job configuration (the task execution environment properties); a sketch of reading them from within a mapper follows.
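As a rough illustration of the information available to a task, here is a sketch using the new MapReduce API. The class name is hypothetical, and the cast to FileSplit assumes a file-based input format:

import java.io.IOException;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.lib.input.FileSplit;

// Illustrative mapper that inspects its execution environment.
public class EnvironmentAwareMapper extends Mapper<LongWritable, Text, Text, Text> {

    @Override
    protected void setup(Context context) throws IOException, InterruptedException {
        // Name of the file this map task is processing (file-based input formats only).
        String fileName = ((FileSplit) context.getInputSplit()).getPath().getName();
        // Attempt number of this task, taken from the task attempt ID.
        int attempt = context.getTaskAttemptID().getId();
        System.out.println("Processing " + fileName + ", attempt " + attempt);
    }

    @Override
    protected void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        context.write(new Text("line"), value);
    }
}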
2. Speculative Execution:
The MapReduce model is to break jobs into tasks and run the tasks in parallel to make the overall
job execution time smaller than it would otherwise be if the tasks ran sequentially.
This makes job execution time sensitive to slow-running tasks, as it takes only one slow task to
make the whole job take significantly longer than it would have done otherwise. When a job
consists of hundreds or thousands of tasks, the possibility of a few straggling tasks is very real.
Tasks may be slow for various reasons, including hardware degradation or software mis-
configuration, but the causes may be hard to detect since the tasks still complete successfully,
albeit after a longer time than expected. Hadoop doesn’t try to diagnose and fix slow-running
tasks; instead, it tries to detect when a task is running slower than expected and launches another,
equivalent, task as a backup. This is termed speculative execution of tasks.
A speculative task is launched only after all the tasks for a job have been launched, and then only
for tasks that have been running for some time (at least a minute) and have failed to make as
much progress, on average, as the other tasks from the job.
When a task completes successfully, any duplicate tasks that are running are killed since they are
no longer needed. So if the original task completes before the speculative task, then the
speculative task is killed; on the other hand, if the speculative task finishes first, then the original
is killed.
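Speculative execution can be switched on or off per job. A minimal sketch, assuming the classic property names for the map and reduce switches (both are on by default):

import org.apache.hadoop.mapred.JobConf;

public class SpeculativeExecutionConfig {
    public static void main(String[] args) {
        JobConf conf = new JobConf(SpeculativeExecutionConfig.class);
        // Disable speculative execution, independently for map and reduce tasks.
        conf.setBoolean("mapred.map.tasks.speculative.execution", false);
        conf.setBoolean("mapred.reduce.tasks.speculative.execution", false);
        // ... configure the rest of the job as usual ...
    }
}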
3.Output Committers:
Hadoop MapReduce uses a commit protocol to ensure that jobs and tasks either succeed, or fail
cleanly. The behavior is implemented by the OutputCommitter in use for the job, and this is set
in the old MapReduce API by calling setOutputCommitter() on JobConf, or by setting
mapred.output.committer.class in the configuration.
In the new MapReduce API, the OutputCommitter is determined by the OutputFormat, via its
getOutputCommitter() method. The default is FileOutputCommitter, which is appropriate for
file-based MapReduce.
The setupJob() method is called before the job is run, and is typically used to perform initialization.
For FileOutputCommitter the method creates the final output directory, ${mapred.output.dir}, and a
temporary working space for task output, ${mapred.output.dir}/_temporary.
If the job succeeds, then the commitJob() method is called, which in the default file-based
implementation deletes the temporary working space and creates a hidden empty marker file in the
output directory called _SUCCESS to indicate to filesystem clients that the job completed
successfully.
If the job did not succeed, then abortJob() is called with a state object indicating whether the job
failed or was killed (by a user, for example). In the default implementation this will delete the job's
temporary working space.
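The sketch below is not the FileOutputCommitter implementation; it is only a skeleton showing where each lifecycle method of the new-API OutputCommitter fits, with placeholder bodies:

import java.io.IOException;
import org.apache.hadoop.mapreduce.JobContext;
import org.apache.hadoop.mapreduce.JobStatus;
import org.apache.hadoop.mapreduce.OutputCommitter;
import org.apache.hadoop.mapreduce.TaskAttemptContext;

// Skeleton committer showing the commit protocol's lifecycle methods.
public class SkeletonOutputCommitter extends OutputCommitter {

    @Override
    public void setupJob(JobContext jobContext) throws IOException {
        // Called before the job runs; e.g. create the final output directory.
    }

    @Override
    public void commitJob(JobContext jobContext) throws IOException {
        // Called if the job succeeds; e.g. delete temporary space, write the _SUCCESS marker.
    }

    @Override
    public void abortJob(JobContext jobContext, JobStatus.State state) throws IOException {
        // Called if the job failed or was killed; clean up the temporary working space.
    }

    @Override
    public void setupTask(TaskAttemptContext taskContext) throws IOException {
        // Per-task initialization before the task runs.
    }

    @Override
    public boolean needsTaskCommit(TaskAttemptContext taskContext) throws IOException {
        return true; // whether this task attempt has output that needs committing
    }

    @Override
    public void commitTask(TaskAttemptContext taskContext) throws IOException {
        // Promote the task attempt's output to the job output.
    }

    @Override
    public void abortTask(TaskAttemptContext taskContext) throws IOException {
        // Discard the task attempt's output.
    }
}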
4.11 Map Reduce Types:
The map and reduce functions in Hadoop MapReduce have the following general form:
map: (K1, V1) → list(K2, V2)
reduce: (K2, list(V2)) → list(K3, V3)
In general, the map input key and value types (K1 and V1) are different from the map output
types (K2 and V2). However, the reduce input must have the same types as the map output,
although the reduce output types may be different again (K3 and V3).
If a combine function is used, then it is the same form as the reduce function (and is an
implementation of Reducer), except its output types are the intermediate key and value types (K2
and V2), so they can feed the reduce function:
map: (K1, V1) → list(K2, V2)
combine: (K2, list(V2)) → list(K2, V2)
reduce: (K2, list(V2)) → list(K3, V3)
Often the combine and reduce functions are the same, in which case, K3 is the same as K2, and
V3 is the same as V2.
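In the Java API these type parameters appear directly on the Mapper and Reducer class declarations. A word-count-style sketch (the class names are illustrative), in which K1/V1 are LongWritable/Text and K2, V2, K3, V3 are Text and IntWritable:

import java.io.IOException;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;

// map: (K1, V1) -> list(K2, V2), here (LongWritable, Text) -> list(Text, IntWritable)
class TokenMapper extends Mapper<LongWritable, Text, Text, IntWritable> {
    private static final IntWritable ONE = new IntWritable(1);

    @Override
    protected void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        for (String token : value.toString().split("\\s+")) {
            if (!token.isEmpty()) {
                context.write(new Text(token), ONE);
            }
        }
    }
}

// reduce: (K2, list(V2)) -> list(K3, V3), here (Text, list(IntWritable)) -> list(Text, IntWritable)
class SumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
    @Override
    protected void reduce(Text key, Iterable<IntWritable> values, Context context)
            throws IOException, InterruptedException {
        int sum = 0;
        for (IntWritable v : values) {
            sum += v.get();
        }
        context.write(key, new IntWritable(sum));
    }
}

Because the reducer's input and output types match (K2 = K3 and V2 = V3), the same class could also serve as the combine function described above.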
1. Input Formats (InputFormat class hierarchy diagram):
Hadoop can process many different types of data formats, from flat text files to databases.
The Relationship Between Input Splits and HDFS Blocks: The figure shows an example. A single file is
broken into lines, and the line boundaries do not correspond with the HDFS block boundaries. Splits honor
logical record boundaries, in this case lines, so we see that the first split contains line 5, even though it spans
the first and second block. The second split starts at line 6.
1. TextInputFormat:
TextInputFormat is the default InputFormat. Each record is a line of input. The key, a LongWritable, is the
byte offset within the file of the beginning of the line. The value is the contents of the line.
So a file containing the following text:
On the top of the Crumpetty Tree
The Quangle Wangle sat,
But his face you could not see,
On account of his Beaver Hat.
is divided by TextInputFormat into one split of four records, with the byte offset of each line as the key:
(0, On the top of the Crumpetty Tree)
(33, The Quangle Wangle sat,)
(57, But his face you could not see,)
(89, On account of his Beaver Hat.)
KeyValueTextInputFormat:
It is common for each line in a file to be a key-value pair, separated by a delimiter such as a tab character;
KeyValueTextInputFormat interprets such lines. When the lines of the file above are prefixed with a key and
a tab character (line1, line2, and so on), the input is again a single split comprising four records, although this
time the keys are the Text sequences before the tab in each line:
(line1, On the top of the Crumpetty Tree)
(line2, The Quangle Wangle sat,)
(line3, But his face you could not see,)
(line4, On account of his Beaver Hat.)
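A sketch of how KeyValueTextInputFormat might be selected for a job, assuming a release where it is available in the new API (org.apache.hadoop.mapreduce.lib.input); on older releases the org.apache.hadoop.mapred version is set through JobConf.setInputFormat() instead:

import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.KeyValueTextInputFormat;

public class KeyValueInputExample {
    public static void main(String[] args) throws Exception {
        Job job = new Job();
        job.setJarByClass(KeyValueInputExample.class);
        // Treat each line as a tab-separated key-value pair instead of using byte-offset keys.
        job.setInputFormatClass(KeyValueTextInputFormat.class);
        // ... set mapper, reducer, input and output paths as usual ...
    }
}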
2. NLineInputFormat:
With TextInputFormat and KeyValueTextInputFormat, each mapper receives a variable number of lines of
input; the number depends on the size of the split and the length of the lines. If you want your mappers to
receive a fixed number of lines, NLineInputFormat is the InputFormat to use. As with TextInputFormat, the
keys are the byte offsets within the file and the values are the lines themselves.
N refers to the number of lines of input that each mapper receives. With N set to one (the default), each
mapper receives exactly one line of input. Consider the four lines of input again:
On the top of the Crumpetty Tree
The Quangle Wangle sat,
But his face you could not see,
On account of his Beaver Hat.
If, for example, N is two, then each split contains two lines. One mapper will receive the first two key-value
pairs:
(0, On the top of the Crumpetty Tree)
(33, The Quangle Wangle sat,)
And another mapper will receive the second two key-value pairs:
(57, But his face you could not see,)
(89, On account of his Beaver Hat.)
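A sketch of how N might be set to two for a job; setNumLinesPerSplit is assumed to be available in the release in use (the classic alternative is to set the mapred.line.input.format.linespermap property directly):

import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.NLineInputFormat;

public class NLineExample {
    public static void main(String[] args) throws Exception {
        Job job = new Job();
        job.setJarByClass(NLineExample.class);
        job.setInputFormatClass(NLineInputFormat.class);
        // Give each mapper two lines of input (N = 2), as in the example above.
        NLineInputFormat.setNumLinesPerSplit(job, 2);
        // ... set mapper, reducer, input and output paths as usual ...
    }
}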
3. Binary Input:
SequenceFileInputFormat
Hadoop’s sequence file format stores sequences of binary key-value pairs.
Sequence files are well suited as a format for MapReduce data since they are
splittable and they support compression as a part of the format.
SequenceFileAsTextInputFormat
SequenceFileAsTextInputFormat is a variant of SequenceFileInputFormat that
converts the sequence file’s keys and values to Text objects.
SequenceFileAsBinaryInputFormat
SequenceFileAsBinaryInputFormat is a variant of SequenceFileInputFormat that
retrieves the sequence file’s keys and values as opaque binary objects.
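A minimal sketch of selecting SequenceFileInputFormat for a job (the class name is illustrative):

import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.SequenceFileInputFormat;

public class SequenceFileInputExample {
    public static void main(String[] args) throws Exception {
        Job job = new Job();
        job.setJarByClass(SequenceFileInputExample.class);
        // The mapper's input key and value types must match the key and value
        // types stored in the sequence file being read.
        job.setInputFormatClass(SequenceFileInputFormat.class);
        // ... set mapper, reducer, input and output paths as usual ...
    }
}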
4. Multiple Inputs:
Although the input to a MapReduce job may consist of multiple input files (constructed by a combination of
file globs, filters, and plain paths), all of the input is interpreted by a single InputFormat and a single
Mapper.
However, data sources often come in different formats or need different parsing, so one InputFormat and one
Mapper do not always suffice. These cases are handled elegantly by using the MultipleInputs class, which
allows you to specify the InputFormat and Mapper to use on a per-path basis. For example, if we had weather
data from the UK Met Office that we wanted to combine with the NCDC data for our maximum temperature
analysis, then we might set up the input as follows:
MultipleInputs.addInputPath(job, ncdcInputPath,
TextInputFormat.class, MaxTemperatureMapper.class);
MultipleInputs.addInputPath(job, metOfficeInputPath,
TextInputFormat.class, MetOfficeMaxTemperatureMapper.class);
5. Database Input:
DBInputFormat is an input format for reading data from a relational database, using JDBC. It is best used
for loading relatively small datasets, perhaps for joining with larger datasets from HDFS, using
MultipleInputs.
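A hedged sketch of reading from a database table with DBInputFormat; the table, column, record class, JDBC driver, and connection details are all placeholders:

import java.io.DataInput;
import java.io.DataOutput;
import java.io.IOException;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import org.apache.hadoop.io.Writable;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.db.DBConfiguration;
import org.apache.hadoop.mapreduce.lib.db.DBInputFormat;
import org.apache.hadoop.mapreduce.lib.db.DBWritable;

public class DatabaseInputExample {

    // Hypothetical record type mapping one row of a "stations" table.
    public static class StationRecord implements Writable, DBWritable {
        private String stationId;

        public void readFields(ResultSet resultSet) throws SQLException {
            stationId = resultSet.getString("station_id");
        }
        public void write(PreparedStatement statement) throws SQLException {
            statement.setString(1, stationId);
        }
        public void readFields(DataInput in) throws IOException {
            stationId = in.readUTF();
        }
        public void write(DataOutput out) throws IOException {
            out.writeUTF(stationId);
        }
    }

    public static void main(String[] args) throws Exception {
        Job job = new Job();
        job.setJarByClass(DatabaseInputExample.class);
        job.setInputFormatClass(DBInputFormat.class);
        // JDBC driver class, connection URL, and credentials are placeholders.
        DBConfiguration.configureDB(job.getConfiguration(),
                "com.mysql.jdbc.Driver", "jdbc:mysql://dbhost/weather", "user", "password");
        // Read the station_id column of the stations table, one StationRecord per row.
        DBInputFormat.setInput(job, StationRecord.class, "stations", null, null, "station_id");
        // ... set mapper, reducer and output path as usual ...
    }
}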
2. Output Formats (OutputFormat class hierarchy diagram):
1.Text Output:
The default output format, TextOutputFormat, writes records as lines of text. Its
keys and values may be of any type, since TextOutputFormat turns them into
strings by calling toString() on them.
Each key-value pair is separated by a tab character. The counterpart to
TextOutputFormat for reading in this case is KeyValueTextInputFormat.
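The key-value separator can be changed per job. A sketch assuming the classic property name mapred.textoutputformat.separator (newer releases use mapreduce.output.textoutputformat.separator):

import org.apache.hadoop.mapred.JobConf;

public class TextOutputSeparator {
    public static void main(String[] args) {
        JobConf conf = new JobConf(TextOutputSeparator.class);
        // Use a comma instead of the default tab between keys and values in the output.
        conf.set("mapred.textoutputformat.separator", ",");
        // ... configure the rest of the job as usual ...
    }
}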
2.Binary Output:
SequenceFileOutputFormat: As the name indicates,
SequenceFileOutputFormat writes sequence files for its output.
SequenceFileAsBinaryOutputFormat: SequenceFileAsBinaryOutputFormat is
the counterpart to SequenceFileAsBinaryInputFormat, and it writes keys and
values in raw binary format into a SequenceFile container.
MapFileOutputFormat:
MapFileOutputFormat writes MapFiles as output. The keys in a MapFile must be
added in order, so you need to ensure that your reducers emit keys in sorted order.
3. Multiple Outputs:
FileOutputFormat and its subclasses generate a set of files in the output directory.
There is one file per reducer, and files are named by the partition number: part-r-00000,
part-r-00001, and so on. There is sometimes a need to have more control over the
naming of the files or to produce multiple files per reducer, which is what the
MultipleOutputs class provides, as sketched below.
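A hedged sketch of the MultipleOutputs approach in a reducer; the class name is illustrative, and the base output path is derived from the key (assumed here to be a safe file name):

import java.io.IOException;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.output.MultipleOutputs;

// Reducer that writes each key's output to a file named after the key.
public class PartitionByKeyReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
    private MultipleOutputs<Text, IntWritable> multipleOutputs;

    @Override
    protected void setup(Context context) {
        multipleOutputs = new MultipleOutputs<Text, IntWritable>(context);
    }

    @Override
    protected void reduce(Text key, Iterable<IntWritable> values, Context context)
            throws IOException, InterruptedException {
        int sum = 0;
        for (IntWritable v : values) {
            sum += v.get();
        }
        // The third argument is the base path of the output file, here taken from the key.
        multipleOutputs.write(key, new IntWritable(sum), key.toString());
    }

    @Override
    protected void cleanup(Context context) throws IOException, InterruptedException {
        multipleOutputs.close();
    }
}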
4. Lazy Output:
FileOutputFormat subclasses will create output (part-r-nnnnn) files, even if they
are empty. Some applications prefer that empty files not be created, which is
where LazyOutputFormat helps.
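A minimal sketch of wrapping TextOutputFormat with LazyOutputFormat so that empty part files are not created:

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.output.LazyOutputFormat;
import org.apache.hadoop.mapreduce.lib.output.TextOutputFormat;

public class LazyOutputExample {
    public static void main(String[] args) throws Exception {
        Job job = new Job();
        job.setJarByClass(LazyOutputExample.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        // Wrap the real output format so that a part file is created only
        // when a reducer actually writes its first record.
        LazyOutputFormat.setOutputFormatClass(job, TextOutputFormat.class);
        // ... set mapper, reducer, input and output paths as usual ...
    }
}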
5. Database Output:
There are output formats for writing to relational databases and to HBase. For relational databases,
DBOutputFormat is useful for dumping job outputs (of modest size) into a database.