nuPlan: A closed-loop ML-based planning benchmark for autonomous vehicles

Abstract

In this work, we propose the world’s first closed-loop ML-based planning benchmark for autonomous driving. While there is a growing body of ML-based motion planners, the lack of established datasets and metrics has limited progress in this area. Existing benchmarks for autonomous vehicle motion prediction have focused on short-term motion forecasting rather than long-term planning. This has led previous works to use open-loop evaluation with L2-based metrics, which are not suitable for fairly evaluating long-term planning. Our benchmark overcomes these limitations by introducing a large-scale driving dataset, a lightweight closed-loop simulator, and motion-planning-specific metrics. We provide a high-quality dataset with 1500 hours of human driving data from 4 cities across the US and Asia with widely varying traffic patterns (Boston, Pittsburgh, Las Vegas and Singapore). We will provide a closed-loop simulation framework with reactive agents and a large set of both general and scenario-specific planning metrics. We plan to release the dataset at NeurIPS 2021 and organize benchmark challenges starting in early 2022.

1. Introduction

Large-scale human labeled datasets in combination with deep Convolutional Neural Networks have led to an impressive performance increase in autonomous vehicle (AV) perception over the last few years [9, 4]. In contrast, existing solutions for AV planning are still primarily based on carefully engineered expert systems, that require significant amounts of engineering to adapt to new geographies and do not scale with more training data. We believe that providing suitable data and metrics will enable ML-based planning and pave the way towards a full “Software 2.0” stack.
Here, a “Software 2.0” stack refers to one in which software functionality is designed and implemented with machine learning models rather than with traditional rule-based programming.
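Existing real-world benchmarks are focused on short-term motion forecasting, also known as prediction [6, 4, 11, 8], rather than planning. This is evident in the lack of high-level goals, the choice of metrics, and the open-loop evaluation. Prediction focuses on the behavior of other agents, while planning relates to the ego vehicle behavior.
In autonomous driving, motion prediction usually involves forecasting the trajectories of other vehicles, pedestrians, or cyclists over a short future horizon. Planning, in turn, uses these predictions together with the ego vehicle's current state and high-level goals (such as the destination) to decide the ego vehicle's path and behavior. Planning has to account for more long-term factors, such as obeying traffic rules and optimizing travel time or comfort.
Existing benchmarks may not adequately support long-term planning, possibly because short-term prediction is technically easier, or because suitable data and evaluation methods are lacking. Reaching more advanced autonomous driving functionality, however, requires systems capable of long-term planning, along with benchmarks and metrics designed for them.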
Prediction is typically multi-modal, which means that for each agent we predict the N most likely trajectories. In contrast, planning is typically uni-modal (except for contingency planning) and we predict a single trajectory. As an example, in Fig. 1a, turning left or right at an intersection are equally likely options. Prediction datasets lack a baseline navigation route to indicate the high-level goals of the agents. In Fig. 1b, the options of merging immediately or later are both equally valid, but the commonly used L2 distance-based metrics (minADE, minFDE, and miss rate) penalize the option that was not observed in the data. Intuitively, the distance between the predicted trajectory and the observed trajectory is not a suitable indicator in a multi-modal scenario. In Fig. 1c, the decision whether to continue to overtake or get back into the lane should be based on the consecutive actions of all agent vehicles, which is not possible in open-loop evaluation. Lack of closed-loop evaluation leads to systematic drift, making it difficult to evaluate beyond a short time horizon (3-8s).
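The merging example of Fig. 1b can be made concrete with a minimal sketch. The helper `displacement_metrics`, the toy trajectories, and the 2 m miss threshold below are all assumptions chosen for illustration; they are not taken from the paper or from any benchmark's reference implementation, and miss-rate definitions vary between datasets.

```python
import numpy as np


def displacement_metrics(candidates, observed, miss_threshold=2.0):
    """Return (minADE, minFDE, missed) for K candidate trajectories.

    candidates: array of shape (K, T, 2), K candidate (x, y) trajectories.
    observed:   array of shape (T, 2), the logged trajectory.
    miss_threshold: hypothetical final-displacement threshold in meters.
    """
    candidates = np.asarray(candidates, dtype=float)
    observed = np.asarray(observed, dtype=float)
    # L2 distance per candidate and time step, shape (K, T).
    dists = np.linalg.norm(candidates - observed[None, :, :], axis=-1)
    min_ade = float(dists.mean(axis=1).min())  # best average displacement
    min_fde = float(dists[:, -1].min())        # best final displacement
    missed = bool(min_fde > miss_threshold)
    return min_ade, min_fde, missed


# Toy lane merge: 8 steps at 10 m per step, lane offset of 3.5 m.
x = 10.0 * np.arange(1, 9)
y_early = np.array([1.75, 3.5, 3.5, 3.5, 3.5, 3.5, 3.5, 3.5])  # logged: merge now
y_late = np.array([0.0, 0.0, 0.0, 0.0, 0.0, 1.75, 3.5, 3.5])   # plan: merge later
observed = np.stack([x, y_early], axis=1)
late_merge = np.stack([x, y_late], axis=1)

# Uni-modal planner that picks the (equally valid) late merge.
uni_modal = displacement_metrics(late_merge[None], observed)
# Multi-modal predictor that outputs both options.
multi_modal = displacement_metrics(np.stack([late_merge, observed]), observed)

print("uni-modal plan:        ", uni_modal)
print("multi-modal prediction:", multi_modal)
```

In this toy setup the late-merge plan receives a non-zero average displacement error even though it ends in the same lane position as the logged trajectory, while a multi-modal predictor that also outputs the logged option scores zero. This is the sense in which displacement metrics penalize a valid choice simply because it was not the one observed.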
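Figure 1 shows different driving scenarios to highlight the limitations of existing benchmarks. The observed driving route of the ego vehicle is shown in white and the hypothetical planner route in red:
(a) The absence of a goal leads to ambiguity at intersections.
(b) Displacement metrics do not take into account the multi-modal nature of driving.
(c)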
