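1. Introduction
In the ResourceManager, two services, ClientRMService and AdminService, handle requests from ordinary users and from administrators respectively. The two kinds of requests reach the ResourceManager over separate communication channels so that a flood of ordinary user requests cannot block administrator requests and leave them waiting.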
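2. The ApplicationClientProtocol protocol
ClientRMService is an RPC server that serves the RPC requests coming from clients. It implements the ApplicationClientProtocol protocol.
It listens on port 8032 by default.
3. Method walkthrough
3.1. ApplicationClientProtocol
This is the protocol between clients and the RM. A client (for example a JobClient) uses this RPC protocol to submit applications and to query application status, cluster status, nodes, queues, and access-control information.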
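- Application information
Method | Description |
---|---|
getNewApplication | The client obtains a new, monotonically increasing ApplicationId to use when submitting an application. The response also carries cluster details such as the maximum resource capability the cluster can allocate. |
submitApplication | The client submits an application to the ResourceManager. Through SubmitApplicationRequest the client supplies details such as the queue, the resources needed to run the ApplicationMaster, and the ContainerLaunchContext used to launch the ApplicationMaster. If the ResourceManager accepts the request it immediately returns an empty SubmitApplicationResponse; if it rejects the request it throws an exception. Receiving a SubmitApplicationResponse does not guarantee that the RM will still "remember" the application after a failover or restart, so the client must call getApplicationReport to confirm that the application was really submitted. If the ResourceManager failed over or restarted before it saved the application, getApplicationReport throws ApplicationNotFoundException, and the client then resubmits the application with the same ApplicationSubmissionContext. During submission the RM checks whether the application already exists; if it does, the RM simply returns a SubmitApplicationResponse. In secure mode the ResourceManager verifies access to the queue and related resources before accepting the submission. |
failApplicationAttempt | The client asks the ResourceManager to fail an application attempt. |
forceKillApplication | The client asks the ResourceManager to terminate a submitted application. |
moveApplicationAcrossQueues | Moves an application to another queue. |
updateApplicationPriority | Updates the priority of an application. |
signalToContainer | The client asks the ResourceManager to send a signal to a container, for example the OUTPUT_THREAD_DUMP command to obtain a thread dump. In secure mode the ResourceManager verifies access to the application before signaling the container; the user needs the MODIFY_APP permission. |
updateApplicationTimeouts | Updates the timeouts of an application. The timeout must be given in the format yyyy-MM-dd'T'HH:mm:ss.SSSZ; an exception is thrown if the timeout is less than or equal to the current time. |
- Resource reservations
Method | Description |
---|---|
getNewReservation | The client asks the ResourceManager for a new ReservationId. |
submitReservation | Submits a reservation. |
updateReservation | Updates a reservation. |
deleteReservation | Deletes a reservation. |
listReservations | Lists the reservations that match the filter criteria. The filter can use queue, reservationId, startTime, endTime, and includeReservationAllocations. |
The reservation workflow proceeds as follows: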
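- The user submits a reservation creation request and receives a response containing a ReservationId.
- The user submits a reservation request made up of a specification written in RDL (Reservation Definition Language) and the ReservationId. The specification describes the user's need for resources over time (for example, a skyline of resources) and the temporal constraints (for example, a deadline). This can be done programmatically through the usual Client-to-RM protocol or through the RM's REST API. If a reservation is submitted with the same ReservationId and an identical RDL, no new reservation is created and the request succeeds. If the RDL differs, the reservation is rejected and the request fails.
- The ReservationSystem uses a ReservationAgent (labeled GREE in the architecture figure of the YARN ReservationSystem documentation) to find a plausible allocation for the reservation in the Plan, a data structure that tracks all currently accepted reservations and the resources available in the system.
- The SharingPolicy provides a way to accept or reject the reservation.
- Upon successful validation, the ReservationSystem returns a ReservationId to the user.
- When the time comes, a new component called the PlanFollower publishes the state of the plan to the scheduler by dynamically creating, adjusting, and destroying queues.
- The user can then submit a job to the reservable queue by including the ReservationId in the ApplicationSubmissionContext.
- The scheduler then provides containers from a special queue created to ensure that the reservation is respected. Within the limits of the reservation the user has guaranteed access to the resources; above that, resource sharing proceeds with standard Capacity/Fairness sharing.
- The system also includes mechanisms to adapt to a drop in cluster capacity.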
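The rule for repeated submissions with the same ReservationId can be sketched in a few lines. This is a minimal illustration only: the class name, the use of plain strings for the id and the RDL text, and the in-memory map are all assumptions made for the example and are not the ReservationSystem's actual data structures.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical in-memory model of the "same id, same RDL" rule. */
final class ReservationIdempotencySketch {
    // reservationId -> RDL text that was accepted for it (assumed representation)
    private final Map<String, String> accepted = new HashMap<>();

    /** Returns true if the request succeeds, false if it is rejected. */
    boolean submit(String reservationId, String rdl) {
        String existing = accepted.putIfAbsent(reservationId, rdl);
        // First submission, or a repeat with an identical definition: success.
        // A repeat with a different definition: rejected, stored entry unchanged.
        return existing == null || existing.equals(rdl);
    }

    int size() {
        return accepted.size();
    }

    public static void main(String[] args) {
        ReservationIdempotencySketch plan = new ReservationIdempotencySketch();
        System.out.println(plan.submit("reservation_1", "2 containers for 1h"));
        System.out.println(plan.submit("reservation_1", "2 containers for 1h"));
        System.out.println(plan.submit("reservation_1", "8 containers for 1h"));
    }
}
```

The point of the design is that a client that is unsure whether its first request arrived can safely send it again without creating a second reservation.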
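- Cluster, node, and configuration information
Method | Description |
---|---|
getClusterMetrics | The client fetches cluster metrics from the ResourceManager. |
getClusterNodes | The client fetches information about all nodes in the cluster from the ResourceManager. |
getNodeToLabels | Returns the labels of each node. |
getLabelsToNodes | Returns the nodes that carry each label. |
getClusterNodeLabels | Returns the labels defined in the cluster. |
getResourceProfiles | Returns the configured resource profiles (resource-profiles.json). |
getResourceTypeInfo | Returns information about the configured resource types. |
getAttributesToNodes | Returns the attribute -> node mapping. |
getClusterNodeAttributes | Returns the node attributes defined in the cluster. |
getNodesToAttributes | Returns the node -> attribute mapping. |
- Queue information
Method | Description |
---|---|
getQueueInfo | Returns information about a cluster queue (used capacity, total capacity, child queues, running applications). |
getQueueUserAcls | The client fetches the queue ACLs of the current user from the ResourceManager. |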
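3.2. RMContext
ClientRMService keeps a reference to the ResourceManager context object, RMContext. Most of the ResourceManager's state can be reached through it, including the node list, the queue hierarchy, and the application list, so ClientRMService can answer client requests simply by querying RMContext.
The implementation class of RMContext is RMContextImpl.
Two of its fields matter most: RMServiceContext serviceContext and RMActiveServiceContext activeServiceContext.
The most important field of RMServiceContext is Dispatcher rmDispatcher.
RMActiveServiceContext has many more fields, mostly for managing the currently running applications and nodes. The commonly used ones are: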
// Applications known to the RM
private final ConcurrentMap<ApplicationId, RMApp> applications = new ConcurrentHashMap<ApplicationId, RMApp>();
// Active nodes
private final ConcurrentMap<NodeId, RMNode> nodes = new ConcurrentHashMap<NodeId, RMNode>();
// Inactive nodes
private final ConcurrentMap<NodeId, RMNode> inactiveNodes = new ConcurrentHashMap<NodeId, RMNode>();
private final ConcurrentMap<ApplicationId, ByteBuffer> systemCredentials = new ConcurrentHashMap<ApplicationId, ByteBuffer>();
private boolean isWorkPreservingRecoveryEnabled;
// Heartbeat (liveliness) monitor for running AMs
private AMLivelinessMonitor amLivelinessMonitor;
// Heartbeat (liveliness) monitor for AMs that have finished
private AMLivelinessMonitor amFinishingMonitor;
// Where the ResourceManager state is persisted
private RMStateStore stateStore = null;
// Container expiry monitor: an application must use an allocated
// container within a certain time, otherwise the container is reclaimed
private ContainerAllocationExpirer containerAllocationExpirer;
// Resource scheduler
private ResourceScheduler scheduler;
// Node list manager
private NodesListManager nodesListManager;
There are many more fields; they are not expanded here and will be examined later.
3.3. getNewApplicationId method
Allocates an ApplicationId. The id is built from the cluster timestamp and an incrementing counter, for example: application_1606203073635_0007
ApplicationId getNewApplicationId() {
  ApplicationId applicationId = org.apache.hadoop.yarn.server.utils.BuilderUtils
      .newApplicationId(recordFactory, ResourceManager.getClusterTimeStamp(),
          applicationCounter.incrementAndGet());
  LOG.info("Allocated new applicationId: " + applicationId.getId());
  return applicationId;
}
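To make the id layout concrete, here is a small stand-alone sketch. It is not the Hadoop ApplicationId class: the class and method names are invented for the example, and the zero-padding to at least four digits is an assumption inferred from the sample id application_1606203073635_0007.

```java
import java.util.Locale;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical stand-in for the RM's id allocation; not the Hadoop class. */
final class ApplicationIdSketch {
    // Assumed to be fixed once, when the ResourceManager starts.
    private final long clusterTimestamp;
    // Incrementing counter that is safe to use from concurrent RPC handlers.
    private final AtomicInteger applicationCounter = new AtomicInteger(0);

    ApplicationIdSketch(long clusterTimestamp) {
        this.clusterTimestamp = clusterTimestamp;
    }

    /** Renders "application_<clusterTimestamp>_<sequence>", padding the sequence to 4 digits. */
    static String format(long clusterTimestamp, int sequence) {
        return String.format(Locale.ROOT, "application_%d_%04d", clusterTimestamp, sequence);
    }

    /** Allocates the next id for this (simulated) RM instance. */
    String next() {
        return format(clusterTimestamp, applicationCounter.incrementAndGet());
    }

    public static void main(String[] args) {
        ApplicationIdSketch ids = new ApplicationIdSketch(1606203073635L);
        System.out.println(ids.next());
        System.out.println(ids.next());
    }
}
```

Because the timestamp part changes when the RM restarts, the counter can start again from zero without producing an id that was already handed out.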
3.4. getNewApplication method
The client obtains a new, monotonically increasing ApplicationId to use when submitting an application. The response also carries cluster details such as the maximum resource capability the cluster can allocate.
@Override
public GetNewApplicationResponse getNewApplication(GetNewApplicationRequest request) throws YarnException {
  GetNewApplicationResponse response = recordFactory
      .newRecordInstance(GetNewApplicationResponse.class);
  // Return the new ApplicationId
  response.setApplicationId(getNewApplicationId());
  // Return the maximum resources the cluster can allocate
  // Pick up min/max resource from scheduler...
  response.setMaximumResourceCapability(scheduler.getMaximumResourceCapability());
  return response;
}
3.5. getApplicationReport method
Returns the status report of an application, looked up by ApplicationId.
/**
 * It gives response which includes application report if the application
 * present otherwise throws ApplicationNotFoundException.
 */
@Override
public GetApplicationReportResponse getApplicationReport(
    GetApplicationReportRequest request) throws YarnException {
  // Read the ApplicationId from the request
  ApplicationId applicationId = request.getApplicationId();
  if (applicationId == null) {
    throw new ApplicationNotFoundException("Invalid application id: null");
  }
  UserGroupInformation callerUGI;
  try {
    callerUGI = UserGroupInformation.getCurrentUser();
  } catch (IOException ie) {
    LOG.info("Error getting UGI ", ie);
    throw RPCUtil.getRemoteException(ie);
  }
  // Look up the application
  RMApp application = this.rmContext.getRMApps().get(applicationId);
  if (application == null) {
    // If the RM doesn't have the application, throw
    // ApplicationNotFoundException and let client to handle.
    throw new ApplicationNotFoundException("Application with id '"
        + applicationId + "' doesn't exist in RM. Please check "
        + "that the job submission was successful.");
  }
  // Access check
  boolean allowAccess = checkAccess(callerUGI, application.getUser(),
      ApplicationAccessType.VIEW_APP, application);
  // Build the report
  ApplicationReport report =
      application.createAndGetApplicationReport(callerUGI.getUserName(),
          allowAccess);
  // Populate the response
  GetApplicationReportResponse response = recordFactory
      .newRecordInstance(GetApplicationReportResponse.class);
  response.setApplicationReport(report);
  return response;
}
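3.6. submitApplication method
This is a core method: the client submits an application to the ResourceManager. Through SubmitApplicationRequest the client supplies details such as the queue, the resources needed to run the ApplicationMaster, and the ContainerLaunchContext used to launch the ApplicationMaster. If the ResourceManager accepts the request it immediately returns an empty SubmitApplicationResponse; if it rejects the request it throws an exception. Receiving a SubmitApplicationResponse does not guarantee that the RM will still "remember" the application after a failover or restart, so the client must call getApplicationReport to confirm that the application was really submitted. If the ResourceManager failed over or restarted before it saved the application, getApplicationReport throws ApplicationNotFoundException, and the client then resubmits the application with the same ApplicationSubmissionContext. During submission the RM checks whether the application already exists; if it does, the RM simply returns a SubmitApplicationResponse. In secure mode the ResourceManager verifies access to the queue and related resources before accepting the submission.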
@Override
public SubmitApplicationResponse submitApplication(
    SubmitApplicationRequest request) throws YarnException, IOException {
  // Read the ApplicationSubmissionContext from the request
  ApplicationSubmissionContext submissionContext = request.getApplicationSubmissionContext();
  // Read the ApplicationId
  ApplicationId applicationId = submissionContext.getApplicationId();
  // Read the CallerContext
  CallerContext callerContext = CallerContext.getCurrent();
  // ApplicationSubmissionContext needs to be validated for safety - only
  // those fields that are independent of the RM's configuration will be
  // checked here, those that are dependent on RM configuration are validated
  // in RMAppManager.
  // Resolve the short name of the calling user
  String user = null;
  try {
    // Safety
    user = UserGroupInformation.getCurrentUser().getShortUserName();
  } catch (IOException ie) {
    LOG.warn("Unable to get the current user.", ie);
    RMAuditLogger.logFailure(user, AuditConstants.SUBMIT_APP_REQUEST,
        ie.getMessage(), "ClientRMService",
        "Exception in submitting application", applicationId, callerContext,
        submissionContext.getQueue());
    throw RPCUtil.getRemoteException(ie);
  }
  // Timeline service v2: validate the flow run id tag
  if (timelineServiceV2Enabled) {
    // Sanity check for flow run
    String value = null;
    try {
      for (String tag : submissionContext.getApplicationTags()) {
        if (tag.startsWith(TimelineUtils.FLOW_RUN_ID_TAG_PREFIX + ":") ||
            tag.startsWith(
                TimelineUtils.FLOW_RUN_ID_TAG_PREFIX.toLowerCase() + ":")) {
          value = tag.substring(TimelineUtils.FLOW_RUN_ID_TAG_PREFIX.length()
              + 1);
          // In order to check the number format
          Long.valueOf(value);
        }
      }
    } catch (NumberFormatException e) {
      LOG.warn("Invalid to flow run: " + value +
          ". Flow run should be a long integer", e);
      RMAuditLogger.logFailure(user, AuditConstants.SUBMIT_APP_REQUEST,
          e.getMessage(), "ClientRMService",
          "Exception in submitting application", applicationId,
          submissionContext.getQueue());
      throw RPCUtil.getRemoteException(e);
    }
  }
  // Check whether app has already been put into rmContext,
  // If it is, simply return the response
  if (rmContext.getRMApps().get(applicationId) != null) {
    LOG.info("This is an earlier submitted application: " + applicationId);
    return SubmitApplicationResponse.newInstance();
  }
  // Read the token configuration supplied by the application
  ByteBuffer tokenConf =
      submissionContext.getAMContainerSpec().getTokensConf();
  if (tokenConf != null) {
    int maxSize = getConfig()
        .getInt(YarnConfiguration.RM_DELEGATION_TOKEN_MAX_CONF_SIZE,
            YarnConfiguration.DEFAULT_RM_DELEGATION_TOKEN_MAX_CONF_SIZE_BYTES);
    LOG.info("Using app provided configurations for delegation token renewal,"
        + " total size = " + tokenConf.capacity());
    if (tokenConf.capacity() > maxSize) {
      throw new YarnException(
          "Exceed " + YarnConfiguration.RM_DELEGATION_TOKEN_MAX_CONF_SIZE
              + " = " + maxSize + " bytes, current conf size = "
              + tokenConf.capacity() + " bytes.");
    }
  }
  // Set the queue; default: "default"
  if (submissionContext.getQueue() == null) {
    submissionContext.setQueue(YarnConfiguration.DEFAULT_QUEUE_NAME);
  }
  // Set the application name; default: N/A
  if (submissionContext.getApplicationName() == null) {
    submissionContext.setApplicationName(
        YarnConfiguration.DEFAULT_APPLICATION_NAME);
  }
  // Set the application type; default: YARN. Longer values are truncated.
  if (submissionContext.getApplicationType() == null) {
    submissionContext
        .setApplicationType(YarnConfiguration.DEFAULT_APPLICATION_TYPE);
  } else {
    if (submissionContext.getApplicationType().length() > YarnConfiguration.APPLICATION_TYPE_LENGTH) {
      submissionContext.setApplicationType(submissionContext
          .getApplicationType().substring(0,
              YarnConfiguration.APPLICATION_TYPE_LENGTH));
    }
  }
  // Read the reservation id, if any
  ReservationId reservationId = request.getApplicationSubmissionContext()
      .getReservationID();
  // Check the reservation ACLs for the target queue
  checkReservationACLs(submissionContext.getQueue(), AuditConstants
      .SUBMIT_RESERVATION_REQUEST, reservationId);
  if (this.contextPreProcessor != null) {
    this.contextPreProcessor.preProcess(Server.getRemoteIp().getHostName(),
        applicationId, submissionContext);
  }
  try {
    // call RMAppManager to submit application directly
    rmAppManager.submitApplication(submissionContext, System.currentTimeMillis(), user);
    LOG.info("Application with id " + applicationId.getId() + " submitted by user " + user);
    // Write the audit log entry
    RMAuditLogger.logSuccess(user, AuditConstants.SUBMIT_APP_REQUEST,
        "ClientRMService", applicationId, callerContext,
        submissionContext.getQueue());
  } catch (YarnException e) {
    LOG.info("Exception in submitting " + applicationId, e);
    RMAuditLogger.logFailure(user, AuditConstants.SUBMIT_APP_REQUEST,
        e.getMessage(), "ClientRMService",
        "Exception in submitting application", applicationId, callerContext,
        submissionContext.getQueue());
    throw e;
  }
  return recordFactory
      .newRecordInstance(SubmitApplicationResponse.class);
}
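The client side of this contract (submit, confirm with getApplicationReport, resubmit on ApplicationNotFoundException) can be sketched without any Hadoop dependency. Everything in the block is hypothetical: the Rm interface, the AppNotFound exception, and the ForgetfulRm fake only simulate an RM that loses one submission, as if it had failed over before saving the application.

```java
import java.util.HashSet;
import java.util.Set;

/** Hypothetical simulation of the submit-then-confirm loop; not YARN client code. */
final class SubmitRetrySketch {
    /** Local stand-in for ApplicationNotFoundException. */
    static final class AppNotFound extends RuntimeException {
        AppNotFound(String id) {
            super("unknown application " + id);
        }
    }

    /** Minimal, assumed view of the RM used by this sketch. */
    interface Rm {
        void submit(String appId);
        String report(String appId);
    }

    /** Fake RM that accepts the first submission but never saves it. */
    static final class ForgetfulRm implements Rm {
        private final Set<String> stored = new HashSet<>();
        private boolean lostOne = false;
        int submissions = 0;

        @Override
        public void submit(String appId) {
            submissions++;
            if (!lostOne) {
                lostOne = true; // accepted, then "forgotten" by a simulated failover
                return;
            }
            stored.add(appId); // a repeated submit of a known id is harmless
        }

        @Override
        public String report(String appId) {
            if (!stored.contains(appId)) {
                throw new AppNotFound(appId);
            }
            return "ACCEPTED";
        }
    }

    /** Submits, then confirms; resubmits with the same id when the RM does not know the app. */
    static String submitAndConfirm(Rm rm, String appId, int maxAttempts) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        AppNotFound last = null;
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            rm.submit(appId);
            try {
                return rm.report(appId);
            } catch (AppNotFound e) {
                last = e; // the RM lost the application: try again
            }
        }
        throw last;
    }

    public static void main(String[] args) {
        ForgetfulRm rm = new ForgetfulRm();
        System.out.println(submitAndConfirm(rm, "application_1606203073635_0007", 3));
        System.out.println("submissions: " + rm.submissions);
    }
}
```

The early return in submitApplication for an application that is already in rmContext is what makes such a resubmission safe on the server side.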
3.7. forceKillApplication method
The client asks the ResourceManager to terminate a submitted application, identified by its applicationId.
@SuppressWarnings("unchecked")
@Override
public KillApplicationResponse forceKillApplication(
    KillApplicationRequest request) throws YarnException {
  // Read the applicationId from the request
  ApplicationId applicationId = request.getApplicationId();
  CallerContext callerContext = CallerContext.getCurrent();
  // Resolve the caller's identity (used by the access check below)
  UserGroupInformation callerUGI;
  try {
    callerUGI = UserGroupInformation.getCurrentUser();
  } catch (IOException ie) {
    LOG.info("Error getting UGI ", ie);
    RMAuditLogger.logFailure("UNKNOWN", AuditConstants.KILL_APP_REQUEST,
        "UNKNOWN", "ClientRMService", "Error getting UGI",
        applicationId, callerContext);
    throw RPCUtil.getRemoteException(ie);
  }
  // Look up the application
  RMApp application = this.rmContext.getRMApps().get(applicationId);
  if (application == null) {
    RMAuditLogger.logFailure(callerUGI.getUserName(),
        AuditConstants.KILL_APP_REQUEST, "UNKNOWN", "ClientRMService",
        "Trying to kill an absent application", applicationId, callerContext);
    throw new ApplicationNotFoundException("Trying to kill an absent"
        + " application " + applicationId);
  }
  if (!checkAccess(callerUGI, application.getUser(),
      ApplicationAccessType.MODIFY_APP, application)) {
    RMAuditLogger.logFailure(callerUGI.getShortUserName(),
        AuditConstants.KILL_APP_REQUEST,
        "User doesn't have permissions to "
            + ApplicationAccessType.MODIFY_APP.toString(), "ClientRMService",
        AuditConstants.UNAUTHORIZED_USER, applicationId, callerContext);
    throw RPCUtil.getRemoteException(new AccessControlException("User "
        + callerUGI.getShortUserName() + " cannot perform operation "
        + ApplicationAccessType.MODIFY_APP.name() + " on " + applicationId));
  }
  if (application.isAppFinalStateStored()) {
    return KillApplicationResponse.newInstance(true);
  }
  StringBuilder message = new StringBuilder();
  message.append("Application ").append(applicationId)
      .append(" was killed by user ").append(callerUGI.getShortUserName());
  InetAddress remoteAddress = Server.getRemoteIp();
  if (null != remoteAddress) {
    message.append(" at ").append(remoteAddress.getHostAddress());
  }
  String diagnostics = org.apache.commons.lang3.StringUtils
      .trimToNull(request.getDiagnostics());
  if (diagnostics != null) {
    message.append(" with diagnostic message: ");
    message.append(diagnostics);
  }
  // Send a kill event to the dispatcher
  this.rmContext.getDispatcher().getEventHandler()
      .handle(new RMAppKillByClientEvent(applicationId, message.toString(),
          callerUGI, remoteAddress));
  // For Unmanaged AMs, return true so they don't retry
  return KillApplicationResponse.newInstance(
      application.getApplicationSubmissionContext().getUnmanagedAM());
}
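Note what the method returns: true only when the application's final state is already stored (or for an unmanaged AM); otherwise it just dispatches the kill event and returns false. A caller that wants to know the kill has completed therefore has to ask again. The following sketch simulates that polling; the Rm interface and FakeRm class are hypothetical, and a real client would sleep between polls.

```java
/** Hypothetical simulation of polling until a kill is reported as completed. */
final class KillPollSketch {
    /** Assumed view of the RM: returns true once the kill has completed. */
    interface Rm {
        boolean forceKill(String appId);
    }

    /** Fake RM that needs a fixed number of calls before the final state is "stored". */
    static final class FakeRm implements Rm {
        private final int callsUntilStored;
        int calls = 0;

        FakeRm(int callsUntilStored) {
            this.callsUntilStored = callsUntilStored;
        }

        @Override
        public boolean forceKill(String appId) {
            calls++;
            return calls > callsUntilStored;
        }
    }

    /** Repeats the kill request until it is reported as completed or the poll budget runs out. */
    static boolean killAndWait(Rm rm, String appId, int maxPolls) {
        for (int i = 0; i < maxPolls; i++) {
            if (rm.forceKill(appId)) {
                return true;
            }
            // A real client would wait here before polling again (omitted in this sketch).
        }
        return false;
    }

    public static void main(String[] args) {
        FakeRm rm = new FakeRm(2);
        System.out.println(killAndWait(rm, "application_1606203073635_0007", 5));
        System.out.println("calls: " + rm.calls);
    }
}
```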
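3.8. moveApplicationAcrossQueues method
Moves an application to another queue. The call ends up in the moveApplicationAcrossQueue method.
The Capacity scheduler follows these steps:
1. Pre-validate the move request for access violations or other errors; if validation fails, a YarnException is thrown.
2. Update the information in the state store.
3. Perform the actual move and update the in-memory data structures.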
/**
 * moveToQueue will invoke scheduler api to perform move queue operation.
 *
 * @param applicationId
 *          Application Id.
 * @param targetQueue
 *          Target queue to which this app has to be moved.
 * @throws YarnException
 *           Handle exceptions.
 */
public void moveApplicationAcrossQueue(ApplicationId applicationId, String targetQueue)
    throws YarnException {
  // Look up the application
  RMApp app = this.rmContext.getRMApps().get(applicationId);
  // Capacity scheduler will directly follow below approach.
  // 1. Do a pre-validate check to ensure that changes are fine.
  // 2. Update this information to state-store
  // 3. Perform real move operation and update in-memory data structures.
  synchronized (applicationId) {
    // Nothing to do if the app is unknown or already completed
    if (app == null || app.isAppInCompletedStates()) {
      return;
    }
    // Remember the source queue
    String sourceQueue = app.getQueue();
    // 1. pre-validate move application request to check for any access
    // violations or other errors. If there are any violations, YarnException
    // will be thrown.
    rmContext.getScheduler().preValidateMoveApplication(applicationId,
        targetQueue);
    // 2. Update to state store with new queue and throw exception if failed.
    updateAppDataToStateStore(targetQueue, app, false);
    // 3. Perform the real move application; the moveApplication
    // implementation depends on the configured scheduler
    String queue = "";
    try {
      queue = rmContext.getScheduler().moveApplication(applicationId,
          targetQueue);
    } catch (YarnException e) {
      // Revert to source queue since in-memory move has failed. Chances
      // of this is very rare as we have already done the pre-validation.
      updateAppDataToStateStore(sourceQueue, app, true);
      throw e;
    }
    // update in-memory
    if (queue != null && !queue.isEmpty()) {
      app.setQueue(queue);
    }
  }
  rmContext.getSystemMetricsPublisher().appUpdated(app,
      System.currentTimeMillis());
}
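The validate, persist, apply, and revert-on-failure sequence above is a general pattern, and it can be shown in isolation. The sketch below uses two plain maps as stand-ins for the state store and the scheduler's in-memory view; the class, its fields, and the failInMemoryMove switch are hypothetical and exist only to exercise the rollback path.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical model of "pre-validate, persist, apply, revert on failure". */
final class MoveQueueSketch {
    final Map<String, String> stateStore = new HashMap<>(); // persisted appId -> queue
    final Map<String, String> inMemory = new HashMap<>();   // scheduler's view
    boolean failInMemoryMove = false;                       // switch to simulate a failure

    void register(String appId, String queue) {
        stateStore.put(appId, queue);
        inMemory.put(appId, queue);
    }

    void move(String appId, String targetQueue) {
        String sourceQueue = inMemory.get(appId);
        if (sourceQueue == null) {
            return; // unknown application: nothing to do
        }
        // 1. Pre-validate before touching any state.
        if (targetQueue == null || targetQueue.isEmpty()) {
            throw new IllegalArgumentException("target queue must be set");
        }
        // 2. Persist the new queue first.
        stateStore.put(appId, targetQueue);
        // 3. Apply in memory; restore the persisted value if that fails.
        try {
            applyInMemory(appId, targetQueue);
        } catch (RuntimeException e) {
            stateStore.put(appId, sourceQueue);
            throw e;
        }
    }

    private void applyInMemory(String appId, String targetQueue) {
        if (failInMemoryMove) {
            throw new IllegalStateException("simulated scheduler failure");
        }
        inMemory.put(appId, targetQueue);
    }

    public static void main(String[] args) {
        MoveQueueSketch sketch = new MoveQueueSketch();
        sketch.register("app1", "default");
        sketch.move("app1", "prod");
        System.out.println(sketch.stateStore + " " + sketch.inMemory);
    }
}
```

Persisting before the in-memory change means a crash between the two steps leaves the durable record pointing at the target queue, while the explicit revert keeps the store and memory consistent when the in-memory step fails without a crash.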
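3.9. updateApplicationPriority method
Updates the priority of an application by calling RMAppManager#updateApplicationPriority.
3.10. getClusterMetrics method
The client fetches cluster metrics from the ResourceManager: the total number of NodeManagers and the numbers of decommissioned, active, lost, unhealthy, and rebooted NodeManagers.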
@Override
public GetClusterMetricsResponse getClusterMetrics(
    GetClusterMetricsRequest request) throws YarnException {
  GetClusterMetricsResponse response = recordFactory
      .newRecordInstance(GetClusterMetricsResponse.class);
  YarnClusterMetrics ymetrics = recordFactory
      .newRecordInstance(YarnClusterMetrics.class);
  // Total number of NodeManagers
  ymetrics.setNumNodeManagers(this.rmContext.getRMNodes().size());
  ClusterMetrics clusterMetrics = ClusterMetrics.getMetrics();
  // Number of decommissioned NodeManagers
  ymetrics.setNumDecommissionedNodeManagers(clusterMetrics.getNumDecommisionedNMs());
  // Number of active NodeManagers
  ymetrics.setNumActiveNodeManagers(clusterMetrics.getNumActiveNMs());
  // Number of lost NodeManagers
  ymetrics.setNumLostNodeManagers(clusterMetrics.getNumLostNMs());
  // Number of unhealthy NodeManagers
  ymetrics.setNumUnhealthyNodeManagers(clusterMetrics.getUnhealthyNMs());
  // Number of rebooted NodeManagers
  ymetrics.setNumRebootedNodeManagers(clusterMetrics.getNumRebootedNMs());
  response.setClusterMetrics(ymetrics);
  return response;
}
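3.11. getClusterNodes method
The client fetches information about all nodes in the cluster from the ResourceManager.
Each report includes: node id, state, HTTP address, rack name, used resources, total capacity, number of containers, health status, time of the last report, labels, average utilization, and total utilization.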
@Override
public GetClusterNodesResponse getClusterNodes(GetClusterNodesRequest request)
    throws YarnException {
  GetClusterNodesResponse response = recordFactory.newRecordInstance(GetClusterNodesResponse.class);
  EnumSet<NodeState> nodeStates = request.getNodeStates();
  if (nodeStates == null || nodeStates.isEmpty()) {
    nodeStates = EnumSet.allOf(NodeState.class);
  }
  Collection<RMNode> nodes = RMServerUtils.queryRMNodes(rmContext,
      nodeStates);
  List<NodeReport> nodeReports = new ArrayList<NodeReport>(nodes.size());
  for (RMNode nodeInfo : nodes) {
    // Convert each node into a NodeReport
    nodeReports.add(createNodeReports(nodeInfo));
  }
  response.setNodeReports(nodeReports);
  return response;
}
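The one subtle point in this method is the filter default: a null or empty set of requested states means "all states", not "no nodes". A stand-alone sketch of that rule follows. The State enum is a reduced, hypothetical substitute for NodeState, and node names are plain strings.

```java
import java.util.ArrayList;
import java.util.EnumSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical model of filtering nodes by state with an "all states" default. */
final class NodeFilterSketch {
    /** Reduced stand-in for NodeState; the real enum has more values. */
    enum State { RUNNING, UNHEALTHY, LOST }

    static List<String> query(Map<String, State> nodes, EnumSet<State> wanted) {
        // A missing or empty filter selects every state.
        if (wanted == null || wanted.isEmpty()) {
            wanted = EnumSet.allOf(State.class);
        }
        List<String> matches = new ArrayList<>();
        for (Map.Entry<String, State> entry : nodes.entrySet()) {
            if (wanted.contains(entry.getValue())) {
                matches.add(entry.getKey());
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        Map<String, State> nodes = new LinkedHashMap<>();
        nodes.put("host1:8041", State.RUNNING);
        nodes.put("host2:8041", State.LOST);
        System.out.println(query(nodes, null));
        System.out.println(query(nodes, EnumSet.of(State.LOST)));
    }
}
```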
3.12. getNodeToLabels / getLabelsToNodes methods
getNodeToLabels: returns the labels of each node, in the form NodeId --> Set<String>
getLabelsToNodes: returns the nodes that carry each label, in the form label name --> Set<NodeId>
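The two results are inverse views of the same relation. The sketch below shows how one shape can be derived from the other; it is purely illustrative, uses plain strings in place of NodeId, and makes no claim about how the ResourceManager's label manager actually maintains these mappings.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

/** Hypothetical helper that turns a node -> labels map into a label -> nodes map. */
final class LabelIndexSketch {
    static Map<String, Set<String>> invert(Map<String, Set<String>> nodeToLabels) {
        // Sorted collections keep the printed output stable.
        Map<String, Set<String>> labelToNodes = new TreeMap<>();
        for (Map.Entry<String, Set<String>> entry : nodeToLabels.entrySet()) {
            for (String label : entry.getValue()) {
                labelToNodes.computeIfAbsent(label, k -> new TreeSet<>()).add(entry.getKey());
            }
        }
        return labelToNodes;
    }

    public static void main(String[] args) {
        Map<String, Set<String>> nodeToLabels = new HashMap<>();
        nodeToLabels.put("host1:8041", new HashSet<>(Arrays.asList("gpu", "ssd")));
        nodeToLabels.put("host2:8041", new HashSet<>(Arrays.asList("gpu")));
        System.out.println(invert(nodeToLabels));
    }
}
```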