Anatomy of a MapReduce Job Run
There are five independent entities:
The client, which submits the MapReduce job.
The YARN resource manager, which coordinates the allocation of
compute resources on the cluster.
The YARN node managers, which launch and monitor the compute
containers on machines in the cluster.
The MapReduce application master, which coordinates the tasks
running the MapReduce job. The application master and the MapReduce tasks
run in containers that are scheduled by the resource manager and managed by
the node managers.
The distributed filesystem, which is used for sharing job files between
the other entities.
Task Assignment:
If the job does not qualify for running as an uber task, then the application
master requests containers for all the map and reduce tasks in the job from
the resource manager.
Requests for map tasks are made first and with a higher priority than those
for reduce tasks, since all the map tasks must complete before the sort
phase of the reduce can start.
Task Execution:
Once a task has been assigned resources for a container on a particular node
by the resource manager’s scheduler, the application master starts the
container by contacting the node manager.
The task is executed by a Java application whose main class is YarnChild.
Before it can run the task, it localizes the resources that the task needs,
including the job configuration and JAR file, and any files from the
distributed cache.
Finally, it runs the map or reduce task.
Streaming:
Streaming runs special map and reduce tasks for the purpose of launching the
user-supplied executable and communicating with it.
The Streaming task communicates with the process (which may be written in
any language) using standard input and output streams.
During execution of the task, the Java process passes input key-value pairs to
the external process, which runs them through the user-defined map or reduce
function and passes the output key-value pairs back to the Java process.
From the node manager’s point of view, it is as if the child process ran the
map or reduce code itself.
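To make the Streaming contract concrete, here is a minimal word-count mapper in Python. This is an illustrative sketch, not code from these notes: the Streaming task writes input records to the executable's standard input and reads tab-separated key-value pairs back from its standard output.

```python
import sys

def map_stream(lines):
    """Yield a tab-separated (word, 1) pair for every word in the input
    lines, following the Streaming contract: records arrive one per line
    on stdin, and each output line is "key<TAB>value"."""
    for line in lines:
        for word in line.split():
            yield f"{word}\t1"

def run_mapper():
    # Entry point when launched by the Streaming task, e.g. (hypothetical
    # invocation): hadoop jar hadoop-streaming.jar -mapper mapper.py ...
    for pair in map_stream(sys.stdin):
        sys.stdout.write(pair + "\n")
```

A corresponding reducer would read the sorted key-value lines on stdin and aggregate the counts per key in the same way.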
Progress and Status Updates:
• MapReduce jobs are long-running batch jobs, taking anything from tens of
seconds to hours to run.
• A job and each of its tasks have a status, which includes such things as the
state of the job or task (e.g., running, successfully completed, failed), the
progress of maps and reduces, the values of the job’s counters, and a status
message or description (which may be set by user code).
• When a task is running, it keeps track of its progress (i.e., the proportion
of the task completed).
• For map tasks, this is the proportion of the input that has been processed.
• For reduce tasks, it’s a little more complex, but the system can still
estimate the proportion of the reduce input processed.
It does this by dividing the total progress into three parts, corresponding to
the three phases of the shuffle (copy, sort, and reduce).
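As an illustration (a sketch of the idea, not Hadoop's actual implementation), the reduce-side estimate can be modeled by giving each of the three phases an equal third of the total:

```python
def reduce_progress(copy_frac, sort_frac, reduce_frac):
    """Estimate overall reduce-task progress by weighting the three
    shuffle phases (copy, sort, reduce) equally, since the total
    progress is divided into three parts, one per phase.
    Each argument is that phase's completed fraction in [0, 1]."""
    return (copy_frac + sort_frac + reduce_frac) / 3.0

# Example: copy and sort are finished, and the reduce function is
# halfway through its input -> 1/3 + 1/3 + 1/6 = 5/6 overall.
```

The real phase fractions come from internal counters (e.g., bytes copied, records reduced); the equal weighting is what makes a rough whole-task estimate possible even before the reduce function has started.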
• As the map or reduce task runs, the child process communicates with its
parent application master through the umbilical interface.
• The task reports its progress and status (including counters) back to its
application master, which has an aggregate view of the job, every three
seconds over the umbilical interface.
• The resource manager web UI displays all the running applications with links
to the web UIs of their respective application masters, each of which displays
further details on the MapReduce job, including its progress.
• During the course of the job, the client receives the latest status by polling
the application master every second (the interval is set via
mapreduce.client.progressmonitor.pollinterval).
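The client-side polling loop can be sketched as follows. The helper names here are hypothetical; in the real client the interval is read from the mapreduce.client.progressmonitor.pollinterval property (in milliseconds).

```python
import time

def wait_for_completion(poll_status, poll_interval_ms=1000):
    """Poll the application master for job status every poll_interval_ms
    milliseconds until the job reaches a terminal state, mimicking the
    behaviour of the client's waitForCompletion() loop.
    `poll_status` is a caller-supplied function returning the job state."""
    while True:
        state = poll_status()
        if state in ("SUCCEEDED", "FAILED", "KILLED"):
            return state
        time.sleep(poll_interval_ms / 1000.0)
```

In the sketch, `poll_status` stands in for the RPC call to the application master; once the job is done it returns a terminal state and the loop exits, just as the client then prints its final message and returns.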
Job Completion:
• When the application master receives a notification that the last task for a
job is complete, it changes the status for the job to Successful.
• Then, when the Job polls for status, it learns that the job has completed
successfully, so it prints a message to tell the user and then returns from
the waitForCompletion() method.
• Finally, on job completion, the application master and the task containers
clean up their working state and the OutputCommitter’s commitJob()
method is called.
• Job information is archived by the job history server to enable later
interrogation by users if desired.