Unit 5. MapReduce and YARN
YARN architecture
Topics
• Introduction to MapReduce
• Hadoop v1 and MapReduce v1 architecture and limitations
• YARN architecture
• Hadoop and MapReduce v1 compared to v2
YARN
• Acronym for Yet Another Resource Negotiator.
• A new resource manager that is included in Hadoop 2.x and later.
• Decouples Hadoop workload management from resource management.
• Introduces a general-purpose application container.
• Hadoop 2.2.0 includes the first generally available (GA) version of YARN.
• Most Hadoop vendors support YARN.
Figure: The YARN stack: Apache MapReduce v2 (batch), Tez (interactive), HBase (online), Spark (in memory), and others (varied) run on YARN (cluster resource management), which runs on HDFS.
Figure: A YARN cluster: the ResourceManager runs at node132, and NodeManagers run at node134, node135, and node136.
Figure: Application 1 (analyze the lineitem table) is submitted, and the ResourceManager launches Application Master 1 in a container on the NodeManager at node135.
Figure: Application Master 1 sends a resource request to the ResourceManager and receives container IDs in return.
Figure: Application Master 1 launches App 1 containers on the NodeManagers at node134 and node136.
Figure: Application 2 (analyze the customer table) is submitted, and the ResourceManager launches Application Master 2 on the NodeManager at node136 while Application 1 continues to run.
Figure: The ResourceManager grants containers to Application Master 2, and App 2 containers run on node134 and node135 alongside the App 1 containers.
Figure: How YARN runs an application: (1) the YARN client submits the application to the resource manager; (2a, 2b) the resource manager has a node manager launch the application master in a container; (3) the application master allocates resources from the resource manager (heartbeat); (4a, 4b) node managers start additional containers that run application processes.
To run an application on YARN, a client contacts the resource manager and prompts it to run an application master process (step 1). The resource manager then finds a node manager that can launch the application master in a container (steps 2a and 2b). Precisely what the application master does after it is running depends on the application. It might simply run a computation in the container it is running in and return the result to the client, or it might request more containers from the resource manager (step 3) and use them to run a distributed computation (steps 4a and 4b). For more information, see White, T. (2015). Hadoop: The Definitive Guide (4th ed.). Sebastopol, CA: O'Reilly Media, p. 80.
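The same flow can be traced from the client side through the YARN client API. The following Java sketch is illustrative only: the class name, the application name, the com.example.MyApplicationMaster command, and the resource sizes are assumptions made for this example, not values from the course environment. It performs step 1 (contacting the resource manager and submitting the application) and describes the container in which the resource manager should launch the application master (steps 2a and 2b).

```java
import java.util.Collections;
import org.apache.hadoop.yarn.api.records.ApplicationId;
import org.apache.hadoop.yarn.api.records.ApplicationSubmissionContext;
import org.apache.hadoop.yarn.api.records.ContainerLaunchContext;
import org.apache.hadoop.yarn.api.records.Resource;
import org.apache.hadoop.yarn.client.api.YarnClient;
import org.apache.hadoop.yarn.client.api.YarnClientApplication;
import org.apache.hadoop.yarn.conf.YarnConfiguration;
import org.apache.hadoop.yarn.util.Records;

public class SubmitApplication {
    public static void main(String[] args) throws Exception {
        // Step 1: the client contacts the ResourceManager.
        YarnConfiguration conf = new YarnConfiguration();
        YarnClient yarnClient = YarnClient.createYarnClient();
        yarnClient.init(conf);
        yarnClient.start();

        // Ask the ResourceManager for a new application ID.
        YarnClientApplication app = yarnClient.createApplication();
        ApplicationSubmissionContext appContext = app.getApplicationSubmissionContext();
        appContext.setApplicationName("demo-application");

        // Describe the container in which the application master should run.
        // com.example.MyApplicationMaster is a hypothetical AM class.
        ContainerLaunchContext amContainer = Records.newRecord(ContainerLaunchContext.class);
        amContainer.setCommands(Collections.singletonList(
                "$JAVA_HOME/bin/java com.example.MyApplicationMaster"));
        appContext.setAMContainerSpec(amContainer);

        // Resources requested for the application master container.
        appContext.setResource(Resource.newInstance(1024, 1)); // 1 GB, 1 vcore

        // Submit; the ResourceManager finds a NodeManager that launches the AM
        // in a container (steps 2a and 2b).
        ApplicationId appId = yarnClient.submitApplication(appContext);
        System.out.println("Submitted " + appId);
    }
}
```

Frameworks such as MapReduce v2 perform this kind of submission on your behalf when you run a job.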
YARN features
• Scalability
• Multi-tenancy
• Compatibility
• Serviceability
• Higher cluster utilization
• Reliability and availability
YARN lifts the scalability ceiling in Hadoop by splitting the roles of the Hadoop JobTracker into two processes: a ResourceManager controls access to the cluster's resources (memory, CPU, and other components), and an ApplicationMaster (one per job) controls task execution.
YARN can run on larger clusters than MapReduce v1. MapReduce v1 reaches scalability
bottlenecks in the region of 4,000 nodes and 40,000 tasks, which stems from the fact that the
JobTracker must manage both jobs and tasks. YARN overcomes these limitations by using its split
ResourceManager / ApplicationMaster architecture: It is designed to scale up to 10,000 nodes and
100,000 tasks.
In contrast to the JobTracker, each instance of an application has a dedicated ApplicationMaster,
which runs for the duration of the application. This model is closer to the original Google
MapReduce paper, which describes how a master process is started to coordinate Map and
Reduce tasks running on a set of workers.
Multi-tenancy generally refers to a set of features that enable multiple business users and processes to share a common set of resources, such as an Apache Hadoop cluster, through policy rather than physical separation, without negatively impacting service-level agreements (SLAs), violating security requirements, or even revealing the existence of each party.
YARN decouples Hadoop workload management from resource management, which means that multiple applications can share a common infrastructure pool. Although this idea is not new, it is new to Hadoop. Earlier versions of Hadoop consolidated both workload and resource management functions into a single JobTracker, which limited customers that hoped to run multiple applications on the same cluster infrastructure.
To borrow from object-oriented programming terminology, multi-tenancy is an overloaded term: it means different things to different people depending on their orientation and context. To say that a solution is multi-tenant is not helpful unless you are specific about the meaning.
Some interpretations of multi-tenancy in big data environments are:
• Support for multiple concurrent Hadoop jobs
• Support for multiple lines of business on a shared infrastructure
• Support for multiple application workloads of different types (Hadoop and non-Hadoop)
• Provisions for security isolation between tenants
• Contract-oriented service level guarantees for tenants
• Support for multiple versions of applications and application frameworks concurrently
Organizations that are sophisticated in their view of multi-tenancy need all these capabilities and more. YARN addresses some of these requirements, and does so in large measure; future releases of Hadoop will add other approaches that provide other forms of multi-tenancy.
Although YARN is an important technology, the world is not suffering from a shortage of resource managers: some Hadoop providers support YARN, and others support Apache Mesos.
To ease the transition from Hadoop v1 to YARN, a major goal of YARN and the MapReduce
framework implementation on top of YARN was to ensure that existing MapReduce applications
that were programmed and compiled against previous MapReduce APIs (MRv1 applications) can
continue to run with little or no modification on YARN (MRv2 applications).
For users of the org.apache.hadoop.mapred APIs, MapReduce on YARN ensures full binary compatibility. These existing applications can run on YARN directly without recompilation: you can take the JAR files of an existing application that codes against the mapred APIs and use bin/hadoop to submit them directly to YARN.
Unfortunately, it was difficult to ensure full binary compatibility with existing applications that compiled against the MRv1 org.apache.hadoop.mapreduce APIs. These APIs have gone through many changes; for example, several classes stopped being abstract classes and changed to interfaces. Therefore, the YARN community compromised by supporting source compatibility only for the org.apache.hadoop.mapreduce APIs. Existing applications that use these APIs are source-compatible and can run on YARN either with no changes, with simple recompilation against the MRv2 .jar files that are included with Hadoop 2, or with minor updates.
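To make the binary-compatibility point concrete, here is a minimal mapper written against the old org.apache.hadoop.mapred API. The class name and the word-count logic are illustrative assumptions, not code from the course; the point is that a JAR compiled against Hadoop v1 containing classes like this can be submitted to YARN without recompiling.

```java
import java.io.IOException;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.MapReduceBase;
import org.apache.hadoop.mapred.Mapper;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.Reporter;

// A word-count mapper coded against the old mapred (MRv1) interfaces.
public class WordCountMapper extends MapReduceBase
        implements Mapper<LongWritable, Text, Text, LongWritable> {

    private static final LongWritable ONE = new LongWritable(1);
    private final Text word = new Text();

    @Override
    public void map(LongWritable key, Text value,
                    OutputCollector<Text, LongWritable> output, Reporter reporter)
            throws IOException {
        // Emit (word, 1) for every whitespace-separated token in the line.
        for (String token : value.toString().split("\\s+")) {
            if (!token.isEmpty()) {
                word.set(token);
                output.collect(word, ONE);
            }
        }
    }
}
```

Because MapReduce on YARN is binary-compatible with this API, the unmodified MRv1 JAR can be submitted to a Hadoop 2 cluster with a command such as bin/hadoop jar wordcount.jar.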
The NodeManager is a more generic and efficient version of the TaskTracker. Instead of having a
fixed number of Map and Reduce slots, the NodeManager has several dynamically created
resource containers. The size of a container depends upon the amount of resources it contains,
such as memory, CPU, disk, and network I/O.
Currently, only memory and CPU are supported (YARN-3); cgroups might be used to control disk
and network I/O in the future.
The number of containers on a node is determined by configuration parameters and by the total node resources (such as total CPU and total memory) that remain after the resources dedicated to the other daemons and the OS are set aside.
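As a rough sketch of that relationship, the fragment below reads the standard NodeManager and scheduler properties through the YarnConfiguration API and derives an upper bound on the container count from memory alone. The class name and the simple division are illustrative assumptions; the actual number also depends on vcores, the scheduler in use, and the sizes that applications request.

```java
import org.apache.hadoop.yarn.conf.YarnConfiguration;

// Rough illustration of how node resources and configuration bound the
// number of containers a NodeManager can host.
public class ContainerEstimate {
    public static void main(String[] args) {
        YarnConfiguration conf = new YarnConfiguration();

        // Memory and vcores the NodeManager advertises to the ResourceManager
        // (what is left for containers after the OS and other daemons).
        int nodeMemMb = conf.getInt(YarnConfiguration.NM_PMEM_MB,
                                    YarnConfiguration.DEFAULT_NM_PMEM_MB);
        int nodeVcores = conf.getInt(YarnConfiguration.NM_VCORES,
                                     YarnConfiguration.DEFAULT_NM_VCORES);

        // Smallest memory allocation the scheduler hands out per container.
        int minAllocMb = conf.getInt(
                YarnConfiguration.RM_SCHEDULER_MINIMUM_ALLOCATION_MB,
                YarnConfiguration.DEFAULT_RM_SCHEDULER_MINIMUM_ALLOCATION_MB);

        int maxContainersByMemory = nodeMemMb / minAllocMb;
        System.out.printf(
                "Node offers %d MB and %d vcores; at the minimum allocation of "
                + "%d MB, that is at most %d containers.%n",
                nodeMemMb, nodeVcores, minAllocMb, maxContainersByMemory);
    }
}
```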
5.4. Hadoop and MapReduce v1 compared to v2
The original Hadoop (v1) and MapReduce (v1) had limitations, and several issues surfaced over
time. We review these issues in preparation for looking at the differences and changes that were
introduced with Hadoop v2 and MapReduce v2.
Topics
• Introduction to MapReduce
• Hadoop v1 and MapReduce v1 architecture and limitations
• YARN architecture
• Hadoop and MapReduce v1 compared to v2
Hadoop v1 to Hadoop v2
The most notable change from Hadoop v1 to Hadoop v2 is the separation of cluster and resource
management from the execution and data processing environment. This change allows for many
new application types to run on Hadoop, including MapReduce v2.
HDFS is common to both versions. MapReduce is the only execution engine in Hadoop v1. The
YARN framework provides work scheduling that is neutral to the nature of the work that is
performed. Hadoop v2 supports many execution engines, including a port of MapReduce that is
now a YARN application.
The fundamental idea of YARN and MRv2 is to split the two major functions of the JobTracker,
resource management and job scheduling / monitoring, into separate daemons. The idea is to have
a global ResourceManager (RM) and per-application ApplicationMaster (AM). An application is
either a single job in the classical sense of MapReduce jobs or a DAG of jobs.
The ResourceManager and per-node worker, the NodeManager (NM), form the data-computation
framework. The ResourceManager is the ultimate authority that arbitrates resources among all the
applications in the system.
The per-application ApplicationMaster is, in effect, a framework-specific library that is tasked with
negotiating resources from the ResourceManager and working with the NodeManagers to run and
monitor the tasks.
The ResourceManager has two main components: the Scheduler and the ApplicationsManager:
• The Scheduler is responsible for allocating resources to the various running applications. It is a pure scheduler in the sense that it performs no monitoring or tracking of application status, and it offers no guarantees about restarting tasks that fail because of an application error or a hardware failure. The Scheduler performs its scheduling function based on the resource requirements of the applications; it does so based on the abstract notion of a resource Container, which incorporates elements such as memory, CPU, disk, network, and other resources. In the first version, only memory is supported.
The Scheduler has a pluggable policy plug-in, which is responsible for partitioning the cluster
resources among the various queues, applications, and other items. The current MapReduce
schedulers, such as the CapacityScheduler and the FairScheduler, are some examples of the
plug-in.
The CapacityScheduler supports hierarchical queues to allow for more predictable sharing of
cluster resources.
• The ApplicationsManager is responsible for accepting job submissions and negotiating the first
container for running the application-specific ApplicationMaster. It provides the service for
restarting the ApplicationMaster container on failure.
The NodeManager is the per-machine framework agent that is responsible for containers,
monitoring their resource usage (CPU, memory, disk, and network), and reporting the same to
the ResourceManager / Scheduler.
The per-application ApplicationMaster has the task of negotiating appropriate resource
containers from the Scheduler, tracking their status, and monitoring for progress.
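That negotiation can be sketched with the AMRMClient library. Everything in the sketch below is an assumption made for illustration (class name, container count, resource sizes, and priority), and the code is only meaningful when it runs inside an ApplicationMaster container that YARN launched, because registration relies on the AM's security tokens and environment.

```java
import java.util.List;
import org.apache.hadoop.yarn.api.protocolrecords.AllocateResponse;
import org.apache.hadoop.yarn.api.records.Container;
import org.apache.hadoop.yarn.api.records.FinalApplicationStatus;
import org.apache.hadoop.yarn.api.records.Priority;
import org.apache.hadoop.yarn.api.records.Resource;
import org.apache.hadoop.yarn.client.api.AMRMClient;
import org.apache.hadoop.yarn.client.api.AMRMClient.ContainerRequest;
import org.apache.hadoop.yarn.conf.YarnConfiguration;

// Simplified container-negotiation loop inside an ApplicationMaster.
public class NegotiateContainers {
    public static void main(String[] args) throws Exception {
        YarnConfiguration conf = new YarnConfiguration();
        AMRMClient<ContainerRequest> rmClient = AMRMClient.createAMRMClient();
        rmClient.init(conf);
        rmClient.start();

        // Register this ApplicationMaster with the ResourceManager.
        rmClient.registerApplicationMaster("", 0, "");

        // Ask the Scheduler for two containers of 1 GB and 1 vcore each,
        // expressed through the abstract Resource notion described above.
        Resource capability = Resource.newInstance(1024, 1);
        Priority priority = Priority.newInstance(0);
        for (int i = 0; i < 2; i++) {
            rmClient.addContainerRequest(
                    new ContainerRequest(capability, null, null, priority));
        }

        // Heartbeat the ResourceManager until the containers are granted.
        int granted = 0;
        while (granted < 2) {
            AllocateResponse response = rmClient.allocate(0.1f);
            List<Container> allocated = response.getAllocatedContainers();
            granted += allocated.size(); // each Container names a node and its granted resources
            Thread.sleep(1000);
        }

        rmClient.unregisterApplicationMaster(FinalApplicationStatus.SUCCEEDED, "", "");
    }
}
```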
MRv2 maintains API compatibility with the previous stable release (hadoop-1.x), which means that all MapReduce jobs should still run unchanged on top of MRv2 with just a recompile.
Architecture of MRv1
Figure: Classic version of MapReduce (MRv1), showing the client and the TaskTrackers that run the Map and Reduce tasks.
In MapReduce v1, there is only one JobTracker that is responsible for allocation of resources, task
assignment to data nodes (as TaskTrackers), and ongoing monitoring ("heartbeat") as each job is
run (the TaskTrackers constantly report back to the JobTracker on the status of each running task).
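A minimal MRv1-style driver makes the JobTracker's role visible from the client side. The class below is an illustrative sketch that reuses the hypothetical WordCountMapper from the earlier compatibility example; in MapReduce v1, JobClient.runJob submits the job to the JobTracker, which then assigns Map and Reduce tasks to TaskTrackers and tracks them through their heartbeats.

```java
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.FileInputFormat;
import org.apache.hadoop.mapred.FileOutputFormat;
import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.lib.LongSumReducer;

// Classic MRv1 driver: configure the job, then hand it to the JobTracker.
public class WordCountDriver {
    public static void main(String[] args) throws Exception {
        JobConf conf = new JobConf(WordCountDriver.class);
        conf.setJobName("wordcount");

        conf.setMapperClass(WordCountMapper.class); // mapper from the earlier sketch
        conf.setCombinerClass(LongSumReducer.class);
        conf.setReducerClass(LongSumReducer.class);
        conf.setOutputKeyClass(Text.class);
        conf.setOutputValueClass(LongWritable.class);

        FileInputFormat.setInputPaths(conf, new Path(args[0]));
        FileOutputFormat.setOutputPath(conf, new Path(args[1]));

        // Blocks until the job finishes, printing progress reported back
        // by the TaskTrackers through the JobTracker.
        JobClient.runJob(conf);
    }
}
```

Under MRv2, the same driver still works, but the submission goes to the ResourceManager and a per-job MapReduce ApplicationMaster instead of a JobTracker.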
YARN architecture
Figure: High-level architecture of YARN, showing the ResourceManager (RM) and the NodeManagers (NM).
The NodeManager is a more generic and efficient version of the TaskTracker. Instead of having a
fixed number of Map and Reduce slots, the NodeManager has several dynamically created
resource containers. The size of a container depends upon the amount of resources it contains,
such as memory, CPU, disk, and network I/O. Currently, only memory and CPU (YARN-3) are
supported; cgroups might be used to control disk and network I/O in the future. The number of containers on a node is determined by configuration parameters and by the total node resources (such as total CPU and total memory) that remain after the resources dedicated to the other daemons and the OS are set aside.
The ApplicationMaster can run any type of task inside a container. For example, the MapReduce
ApplicationMaster requests a container to start a Map or a Reduce task, and the Giraph
ApplicationMaster requests a container to run a Giraph task. You can also implement a custom
ApplicationMaster that runs specific tasks and invent a new distributed application framework. I
encourage you to read about Apache Twill, which aims to make it easier to write distributed
applications sitting on top of YARN.
In YARN, MapReduce is reduced to the role of one distributed application among many (but still a useful one) and is now called MRv2. MRv2 is simply the reimplementation of the classic MapReduce engine, now called MRv1, that runs on top of YARN.
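To make the "any type of task" point concrete, here is an illustrative sketch of how an ApplicationMaster uses the NMClient library to start an arbitrary command inside a container that the ResourceManager has already granted (for example, one returned by the allocation loop sketched earlier). The method name and the command are assumptions for the example, and the call only succeeds inside a real ApplicationMaster that holds the container's token.

```java
import java.util.Collections;
import org.apache.hadoop.yarn.api.records.Container;
import org.apache.hadoop.yarn.api.records.ContainerLaunchContext;
import org.apache.hadoop.yarn.client.api.NMClient;
import org.apache.hadoop.yarn.conf.YarnConfiguration;
import org.apache.hadoop.yarn.util.Records;

// The ApplicationMaster asks a NodeManager to run a process in a granted container.
public class LaunchInContainer {
    static void launchTask(Container container) throws Exception {
        YarnConfiguration conf = new YarnConfiguration();
        NMClient nmClient = NMClient.createNMClient();
        nmClient.init(conf);
        nmClient.start();

        // The launch context describes what runs inside the container. Here it
        // is a shell command, but it could equally be a Map task, a Reduce
        // task, a Giraph worker, or any other process.
        ContainerLaunchContext ctx = Records.newRecord(ContainerLaunchContext.class);
        ctx.setCommands(Collections.singletonList(
                "echo hello-from-a-yarn-container"));

        nmClient.startContainer(container, ctx);
    }
}
```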
A rough mapping of YARN components to their MapReduce v1 counterparts:
• ApplicationMaster (but dedicated and short-lived) ↔ JobTracker
• NodeManager ↔ TaskTracker
• Container ↔ Slot