LaneCPP: 3D Lane Detection with Priors
data, the model can focus its full capacity on learning richer features for the lane detection task.

We can further use physical knowledge about the road geometry to support the model in learning an internal transformation from image features to 3D space. While methods based on Inverse Perspective Mapping (IPM) [4, 6, 9, 17, 23, 32] make false flat-ground assumptions, learning-based transformations [1, 2, 46] completely ignore road properties. In contrast, integrating prior knowledge about the road surface allows us to model 3D features in a geometry-aware manner and helps the network to focus on the 3D region of interest.

Thus, we propose a novel 3D lane detection approach named LaneCPP that leverages valuable prior knowledge to achieve accurate and robust perception behavior. It introduces a new, sophisticated continuous curve representation, which enables us to incorporate physical priors. In addition, we present a spatial transformation component for learning a physically inspired mapping from 2D image to 3D space, providing meaningful spatial features.

Our main contributions can be summarized as follows:
• We propose a novel architecture for 3D lane detection from monocular images using a more sophisticated, flexible parametric spline-based lane representation.
• We present a way to incorporate priors about lane structure and geometry into our continuous representation.
• We introduce a new way to use prior knowledge about the road surface geometry for learning spatial features.
• We demonstrate the benefits of our contributions in several ablation studies.
• We show state-of-the-art performance of our model.

2. Related work

Different Lane Representations. An important design choice in deep-learning-based lane detection is the representation that the network uses to model lane line geometry, which can be categorized as follows: 1) Pixel-wise representations, which formulate lane detection as a segmentation problem, were used mainly in 2D methods [7, 11, 16, 26, 29, 33, 51, 53] and were adopted in 3D by SALAD [50], combining line segmentation with depth prediction. These representations come with a high computational load, since a large number of parameters is required. 2) Grid-based approaches divide the space into cells and model lanes using local segments [13] or keypoints [15, 34, 45]. 3D-LaneNet+ [4] suggests using local line segments, and BEV-LaneDet [46] defines keypoints on a BEV grid representation. Both depend on the grid resolution and require costly post-processing to obtain lines. 3) Anchor-based representations [19, 40, 41, 52] model lines as straight anchors with positional offsets at predefined locations. They are widely used in 3D detection approaches, including 3D-LaneNet [6] and Gen-LaneNet [9], which use vertical anchors in the top view, and Anchor3DLane [12], which introduces anchor projection with iterative regression. Similar to grid-based representations, it requires subsequent curve fitting to obtain smooth lines. 4) Continuous curve representations [5, 22, 24, 43, 44] instead directly model smooth curves without requiring costly post-processing. While CLGO [23] and CurveFormer [1] use simple polynomials, 3D-SpLineNet [32] proposes B-Splines [3]. Since B-Splines offer local control over curve segments, they are well suited to model complex shapes with low-degree basis functions, while polynomials and Bézier curves show global dependence and thus require higher degrees, causing expensive computation. Although 3D-SpLineNet achieves superior detection performance on synthetic data, it lacks flexibility, as its curve formulation is limited to monotonically progressing lanes, making it hardly applicable to real-world data. To resolve this issue, we propose a more flexible representation based on actual 3D B-Splines. In contrast to discrete grids and anchors, continuous representations even allow us to integrate prior knowledge in an analytical manner.

Geometry Priors. Several approaches suggest incorporating prior knowledge into learning-based methods, e.g. by integrating invariance into the model architecture [35, 36] or task-specific transformations, as for trajectory planning [10, 47, 49]. In the field of lane detection, line parallelism has been formulated as a hard constraint to resolve depth ambiguity and determine camera parameters [27, 48]. Deep declarative networks [8] offer a general framework to incorporate arbitrary properties as constraints by solving a constrained optimization problem in the forward pass. While such methods are appropriate when hard constraints must be enforced, our goal is rather to guide the network in learning typical geometric lane properties by formulating soft constraints in a regularization objective. Such a regularization only affects training and does not require solving an optimization problem in the forward pass, and thus comes without additional computational cost during inference. Following this paradigm, SGNet [24, 40] proposes to penalize the deviation of the lateral distance from a constant lane width in the IPM-warped top view, but ignores that this property does not hold for lines deviating from the ground plane. GP [17] presents a parallelism loss that enforces constant distance between nearest neighbors locally, which depends on the number of anchor points. In contrast, our method presents a way to learn parallelism globally and independently of the resolution of discrete lane representations. We propose an elegant way to learn parallelism as well as other geometry priors using analytical formulations of tangents and normals, which are well-defined on our continuous spline representation.

Leveraging 3D Features. An important model component is the extraction of 3D features, which encode valuable information for detecting lanes along the road surface.
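The local-control argument above can be made concrete with a toy script (our own illustration using SciPy, not code from any of the cited methods): perturbing a single control point of a clamped cubic B-Spline only changes the curve on the few knot spans supported by that point's basis function, whereas a global polynomial would change everywhere.

```python
import numpy as np
from scipy.interpolate import BSpline

d, K = 3, 10  # cubic B-Spline with 10 control points (illustrative choice)
# clamped knot vector: d+1 repeated boundary knots plus uniform interior knots
knots = np.concatenate(([0.0] * (d + 1),
                        np.linspace(0.0, 1.0, K - d + 1)[1:-1],
                        [1.0] * (d + 1)))
# a straight 3D lane: x ahead, y lateral, z height
ctrl = np.stack([np.linspace(0.0, 30.0, K),
                 np.zeros(K),
                 np.zeros(K)], axis=1)

t = np.linspace(0.0, 1.0, 200)
before = BSpline(knots, ctrl, d)(t)

ctrl_mod = ctrl.copy()
ctrl_mod[0, 1] += 0.5          # shift only the first control point laterally
after = BSpline(knots, ctrl_mod, d)(t)

# the perturbation is confined to the support of the first basis function,
# i.e. to parameters below the first interior knot
changed = np.any(np.abs(after - before) > 1e-12, axis=1)
print("curve changed only for t <=", round(t[changed].max(), 3))
```

This locality is what lets low-degree B-Splines fit complex lane shapes segment by segment, without the global ripple effects of high-degree polynomial or Bézier fits.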
Figure 1. Our approach: First, the front-view image I is propagated through the backbone, which extracts multi-scale feature maps. These are transformed to 3D using our spatial transformation and then fused to obtain a single 3D feature map. Feature pooling is applied to obtain features for each line proposal, which are propagated through fully connected layers to obtain the parameters of our line representation. Finally, prior knowledge is exploited to regularize the lane representation and to produce surface hypotheses for the spatial transformation.
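The transformation step in the caption can be pictured as a Lift-Splat-style [31] weighting. Below is a minimal numpy sketch of that idea under our own simplifying assumptions (random projection indices, a 1D-simplified grid, placeholder shapes; this is not the paper's implementation): front-view features are copied onto every surface hypothesis, scaled by a per-pixel distribution over the hypotheses, and scatter-added into the 3D feature grid.

```python
import numpy as np

rng = np.random.default_rng(0)
C, H, W, S = 8, 12, 20, 5      # feature channels, image height/width, surface hypotheses
GRID = 16                      # rows of a (here 1D-simplified) 3D feature grid

feats = rng.normal(size=(C, H, W))        # front-view feature map
logits = rng.normal(size=(S, H, W))
# per-pixel categorical distribution over the S road-surface hypotheses
surf_dist = np.exp(logits) / np.exp(logits).sum(axis=0, keepdims=True)

# toy geometry: each hypothesis s projects image row h to some grid row
# (a stand-in for the real camera-to-surface projection)
grid_row = rng.integers(0, GRID, size=(S, H))

grid_feats = np.zeros((C, GRID))
for s in range(S):
    for h in range(H):
        # lift pixel features, weighted by their probability of lying on surface s
        grid_feats[:, grid_row[s, h]] += (surf_dist[s, h] * feats[:, h]).sum(axis=-1)

# because the weights per pixel sum to one, total feature mass is preserved
print(np.allclose(grid_feats.sum(), feats.sum()))
```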
obtain

\boldsymbol{f}(t) = \begin{pmatrix} x(t) \\ y(t) \\ z(t) \end{pmatrix} = \sum_{k=1}^{K} \boldsymbol{c}_k \cdot B_{k,d}(t) \quad (1)

\mathcal{L}_{vis} = - \frac{1}{|\mathcal{P}_{GT}|} \sum_{\boldsymbol{p} \in \mathcal{P}_{GT}} \hat{v}_{\boldsymbol{p}} \cdot \log\big(\sigma\big(v(t_{\boldsymbol{p}})\big)\big) + (1 - \hat{v}_{\boldsymbol{p}}) \cdot \log\big(1 - \sigma\big(v(t_{\boldsymbol{p}})\big)\big) \,, \quad (4)

¹ For the concept of visibility, we follow the prevailing definition from the literature [2, 9].

\mathcal{L}^{(ij)}_{par} = \frac{\mathbbm{1}^{(ij)}_{\boldsymbol{p}}}{|\mathcal{P}^{(i)}|} \cdot \sum_{\boldsymbol{p} \in \mathcal{P}^{(i)}} 1 - \big(\mathbf{T}^{(i)}(t_{\boldsymbol{p}})\big)^T \cdot \mathbf{T}^{(j)}(t_{\boldsymbol{p}^*}) \,. \quad (6)

Since the criterion of line parallelism should not hold for all normal point pairs of neighboring lines (e.g. merging or splitting lines), \mathbbm{1}^{(ij)}_{\boldsymbol{p}} \in \{0, 1\} represents the indicator function determining whether the parallelism loss is applied to the point pair. More precisely, the function ensures that only the overlapping range of neighboring lines is taken into account. Furthermore, it determines whether the line pair should be considered a parallel pair based on the standard deviation of the Euclidean distances between normal point pairs, i.e. high deviations indicate that the line pair might belong to a merge or split structure. In our experiments, we achieve state-of-the-art performance on test sets containing merges and splits, proving that our model is also capable of learning non-parallel line pairs using this indicator function.
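The combination of Eq. (6) with this indicator can be sketched in a few lines of numpy. This is a simplified illustration under our own assumptions: tangents come from finite differences of sampled points instead of the analytical spline derivative, the matched parameter t_{p*} is reduced to index correspondence, and the deviation threshold is an arbitrary placeholder value.

```python
import numpy as np

def unit_tangents(pts):
    """Finite-difference stand-in for the analytical spline tangents T(t)."""
    d = np.gradient(pts, axis=0)
    return d / np.linalg.norm(d, axis=1, keepdims=True)

def parallelism_loss(lane_i, lane_j, std_thresh=0.3):
    """Cosine-distance term of Eq. (6), gated by a std-based indicator."""
    dists = np.linalg.norm(lane_i - lane_j, axis=1)
    if dists.std() > std_thresh:   # high spread: likely a merge/split, indicator = 0
        return 0.0
    t_i, t_j = unit_tangents(lane_i), unit_tangents(lane_j)
    return float(np.mean(1.0 - np.sum(t_i * t_j, axis=1)))

y = np.linspace(0.0, 30.0, 50)
left  = np.stack([-1.8 * np.ones_like(y), y, np.zeros_like(y)], axis=1)
right = np.stack([ 1.8 * np.ones_like(y), y, np.zeros_like(y)], axis=1)
split = np.stack([ 1.8 + 0.005 * y**2,    y, np.zeros_like(y)], axis=1)

print(parallelism_loss(left, right))   # ~0: a parallel pair incurs (almost) no penalty
print(parallelism_loss(left, split))   # 0.0: the distance spread disables the loss
```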
Surface smoothness. Since the lines reside on a smooth road, the surface normals of neighboring lanes should be similar. Analogously to \mathcal{L}_{par}, we express this with the cosine distance between surface normals \mathbf{N}^{(ih)} and \mathbf{N}^{(ij)} as

\mathcal{L}^{(i)}_{sm} = \frac{\mathbbm{1}^{(hij)}_{\boldsymbol{p}}}{|\mathcal{P}^{(i)}|} \cdot \sum_{\boldsymbol{p} \in \mathcal{P}^{(i)}} 1 - \big(\mathbf{N}^{(ih)}(t_{\boldsymbol{p}})\big)^T \cdot \mathbf{N}^{(ij)}(t_{\boldsymbol{p}}) \,, \quad (7)

with indicator function \mathbbm{1}^{(hij)}_{\boldsymbol{p}}. The surface normal between

Figure 4. Our proposed spatial transformation module. First, several road surface hypotheses are defined (a), to which front-view features are lifted (b) and weighted according to the predicted depth distribution. Afterwards, point features are aggregated in a weighted manner to obtain the 3D feature map (c).
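Analogously, the cosine-distance smoothness term of Eq. (7) can be sketched with numpy. In this toy version (our assumptions, not the paper's exact construction), the surface normal between two neighboring lanes is approximated as the cross product of the along-lane tangent and the inter-lane direction:

```python
import numpy as np

def pair_normals(lane_a, lane_b):
    """Approximate normals of the surface patch between two neighboring lanes."""
    tangent = np.gradient(lane_a, axis=0)   # along-lane direction
    across = lane_b - lane_a                # inter-lane direction
    n = np.cross(tangent, across)
    return n / np.linalg.norm(n, axis=1, keepdims=True)

def smoothness_loss(left, mid, right):
    """Eq. (7)-style cosine distance between normals of adjacent lane pairs."""
    n_lm = pair_normals(left, mid)
    n_mr = pair_normals(mid, right)
    return float(np.mean(1.0 - np.sum(n_lm * n_mr, axis=1)))

y = np.linspace(0.0, 30.0, 40)
lane = lambda x, z=0.0: np.stack([x * np.ones_like(y), y,
                                  z * np.ones_like(y)], axis=1)

print(smoothness_loss(lane(-3.6), lane(0.0), lane(3.6)))       # ~0: planar road
print(smoothness_loss(lane(-3.6), lane(0.0), lane(3.6, 1.0)))  # > 0: kinked surface
```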
Priors F1(%)↑ X-near(m)↓ X-far(m)↓ Z-near(m)↓ Z-far(m)↓
None 65.0 0.316 0.384 0.106 0.153
Par. 66.2 0.291 0.373 0.103 0.150
Surf. 65.8 0.320 0.356 0.103 0.144
Curv. 66.7 0.322 0.366 0.105 0.146
Comb. 66.7 0.301 0.359 0.103 0.144
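The "Comb." row corresponds to training with the prior losses jointly. Schematically, the soft constraints enter only the training objective as weighted additive terms; the weights below are illustrative placeholders, not the paper's tuned values:

```python
# Illustrative (assumed) weights for combining the prior regularizers.
W_PAR, W_SM, W_CURV = 0.1, 0.1, 0.05

def total_loss(l_task, l_par, l_sm, l_curv, training=True):
    """Task loss plus soft geometric regularization, applied during training only."""
    if not training:
        return l_task          # the priors add no inference-time cost
    return l_task + W_PAR * l_par + W_SM * l_sm + W_CURV * l_curv

print(total_loss(1.0, 0.2, 0.4, 0.6))                  # ~1.09
print(total_loss(1.0, 0.2, 0.4, 0.6, training=False))  # 1.0
```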
# Surface Hypotheses 1 3 5 15 27
F1-Score(%)↑ 65.0 65.9 66.6 66.1 66.0
Method F1-Score(%)↑ X-near(m)↓ X-far(m)↓ Z-near(m)↓ Z-far(m)↓ U&D C EW N I M&S
(The last six columns report the F1-Score(%) per scenario ↑.)
3D-LaneNet [6] 44.1 0.479 0.572 0.367 0.443 40.8 46.5 47.5 41.5 32.1 41.7
Gen-LaneNet [9] 32.3 0.591 0.684 0.411 0.521 25.4 33.5 28.1 18.7 21.4 31.0
PersFormer [2] 50.5 0.485 0.553 0.364 0.431 42.4 55.6 48.6 46.6 40.0 50.7
PersFormer* [2] 53.1 0.361 0.328 0.124 0.129 46.8 58.7 54.0 48.4 41.4 52.5
CurveFormer [1] 50.5 0.340 0.772 0.207 0.651 45.2 56.6 49.7 49.1 42.9 45.4
BEV-LaneDet [46] 58.4 0.309 0.659 0.244 0.631 48.7 63.1 53.4 53.4 50.3 53.7
Anchor3DLane [12] 53.7 0.276 0.311 0.107 0.138 46.7 57.2 52.5 47.8 45.4 51.2
Anchor3DLane-T [12] 54.3 0.275 0.310 0.105 0.135 47.2 58.0 52.7 48.7 45.8 51.7
LaneCPP (Ours) 60.3 0.264 0.310 0.077 0.117 53.6 64.4 56.7 54.9 52.0 58.7
Table 4. Quantitative comparison on OpenLane [2]. Best performance and second best are highlighted. The scenario categories are Up
and Down (U&D), Curve (C), Extreme Weather (EW), Night (N), Intersection (I), Merge and Split (M&S). PersFormer* denotes the latest
performance reported on the official code base, Anchor3DLane-T represents the temporal multi-frame method of [12].
Figure 6. Qualitative comparison on OpenLane. Our method is compared to PersFormer* with ground truth visualized as dashed lines.
Implementation details. We use input size 360 × 480 and adopt the same backbone as in [2], based on a modified EfficientNet [42]. We extract four feature maps of resolutions [1/2, 1/4, 1/8, 1/16]. The final 3D feature map has size 26 × 16 with 64 channels. We use M = 64 initial line proposals and B-Splines of degree d = 3 with K = 10 control points. We apply the Adam optimizer [14] with an initial learning rate of 2 × 10⁻⁴ for OpenLane and 10⁻⁴ for Apollo and a dataset-specific step-wise scheduler. We train for 30 epochs on OpenLane and 300 epochs on Apollo with batch size 16. For more details we refer to the supplementary.

4.2. Ablation studies

Table 1 indicates the effect of our proposed prior-based regularization. It is evident that each prior improves the F1-Score as well as the geometric errors. While the surface and curvature priors result in better far-range estimates, line parallelism supports X-regression in the near-range. Besides, using the surface smoothness loss results in the lowest Z-far errors. Finally, a combination of priors yields a good balance of F1-Score and geometric errors. The positive effect of parallelism is confirmed by Fig. 5, where reinforcing parallel lane structure leads to better estimates in the near-range (a) and far-range (b) compared to the unregularized model. Learning parallel lines is also evidently beneficial in cases of poor visibility (b) and occlusions (a). In the latter case, the regularized model even shows better predictions than the noisy ground truth. This emphasizes the high relevance of priors for more robust behavior on real-world datasets, where 3D ground truth often comes with inaccuracies.

For the spatial transformation (see Table 2), too low a number of surface hypotheses results in a worse score, presumably because the 3D geometry is not captured sufficiently, whereas larger numbers tend to decrease performance due to the higher complexity. The best F1-Score is obtained with 5 hypotheses, which is chosen for further experiments. While the improvement over IPM is already considerable, we think that the simplifications of plane hypotheses prevent the component from developing its full potential. We see ways to enhance the 3D transformation even further
Method | Balanced Scenes: F1(%)↑ X-near(m)↓ X-far(m)↓ Z-near(m)↓ Z-far(m)↓ | Rare Scenes: F1(%)↑ X-near(m)↓ X-far(m)↓ Z-near(m)↓ Z-far(m)↓
3D-LaneNet [6] 86.4 0.068 0.477 0.015 0.202 72.0 0.166 0.855 0.039 0.521
GP [17] 91.9 0.049 0.387 0.008 0.213 83.7 0.126 0.903 0.023 0.625
PersFormer [2] 92.9 0.054 0.356 0.01 0.234 87.5 0.107 0.782 0.024 0.602
3D-SpLineNet [32] 96.3 0.037 0.324 0.009 0.213 92.9 0.077 0.699 0.021 0.562
CurveFormer [1] 95.8 0.078 0.326 0.018 0.219 95.6 0.182 0.737 0.039 0.561
BEV-LaneDet [46] 96.9 0.016 0.242 0.02 0.216 97.6 0.031 0.594 0.040 0.556
Anchor3DLane [12] 95.4 0.045 0.300 0.016 0.223 94.4 0.082 0.699 0.030 0.580
LaneCPP (Ours) 97.4 0.030 0.277 0.011 0.206 96.2 0.073 0.651 0.023 0.543
Table 5. Quantitative comparison of best methods on Apollo 3D Synthetic [9]. Best performance and second best are highlighted.
using more sophisticated spatial representations in the future.

The impact of our different contributions is summarized in Table 3, where the first row shows our baseline (see Sec. 4.1). More than two percent in F1-Score are gained with our novel lane representation compared to the simplified one from [32]. Moreover, it is clear that both the regularization using combined priors and the spatial transformation using 5 hypotheses result in significant improvements. Eventually, combining all components yields the best model configuration, which we choose for further evaluation.

4.3. Evaluation on OpenLane

On the real-world OpenLane benchmark, our model evidently outperforms all other methods with respect to F1-Score as well as geometric errors, as shown in Table 4. Compared to BEV-LaneDet, which achieves a high detection score, our model gains +1.9 %, while reaching significantly lower geometric errors. In comparison to Anchor3DLane, the improvements with respect to X-errors are less substantial; however, our approach surpasses its F1-Score by a large gap of +6.6 %. Analyzing the detection scores among different scenarios, an outstanding performance gain is observed on the up- and down-hill test set (+5.9 %), which highlights the capability of our approach to capture 3D space proficiently and is supported by the low Z-errors.

Apart from quantitative results, we show qualitative examples in Fig. 6. In up-hill scenarios like Fig. 6b, our model manages to estimate both the lateral and the height profile accurately, since our assumptions about road surface and line parallelism are satisfied. In contrast, PersFormer lacks spatial features and does not use any kind of physical regularization. Consequently, it fails to estimate the 3D lane geometry and even collapses in Fig. 6c, whereas our surface and curvature priors prevent such behavior. Noteworthy is also the top performance on the merges and splits set. This proves that our soft regularization is even capable of handling situations containing non-parallel lines, which is also confirmed by Fig. 6d. However, we rarely observe limitations with our formulation for line pairs with a similar orientation but weakly converging course, as shown in Fig. 6e. In such cases the indicator function might erroneously decide for the parallelism loss during training. One possible solution for future work would be to consider ground truth for the indicator function to identify such situations.

4.4. Evaluation on Apollo 3D Synthetic

The Apollo 3D Synthetic dataset is very limited in size and only consists of simple situations in contrast to OpenLane. While we find the results on OpenLane more meaningful, we would still like to provide and discuss the quantitative results on the Apollo dataset. Due to the simplicity of the dataset, our model cannot benefit as significantly from our priors, but still achieves results competitive with the state of the art, with the highest F1-Score on the balanced scenes dataset and comparable error metrics (second best for most errors).

5. Conclusions and future work

In this work, we present LaneCPP, a novel approach for 3D lane detection that leverages physical prior knowledge about lane structure and road geometry. Our new continuous lane representation overcomes previous deficiencies by allowing arbitrary lane structures and enables us to regularize lane geometry based on analytically formulated priors. We further introduce a novel spatial transformation module that models 3D features while carefully considering knowledge about the road surface geometry. In our experiments, we demonstrate state-of-the-art performance on real and synthetic benchmarks. The full capability of our approach is revealed on real-world OpenLane, for which we prove the relevance of priors quantitatively and qualitatively. In the future, priors could be individualized for different driving scenarios and might support learning inter-lane relations to achieve better scene understanding in a global context. We also see ways to leverage the full potential of the spatial transformation by using more sophisticated surface representations.
References

[1] Yifeng Bai, Zhirong Chen, Zhangjie Fu, Lang Peng, Pengpeng Liang, and Erkang Cheng. Curveformer: 3d lane detection by curve propagation with curve queries and attention. In Proc. IEEE International Conf. on Robotics and Automation (ICRA), 2023. 1, 2, 3, 7, 8
[2] Li Chen, Chonghao Sima, Yang Li, Zehan Zheng, Jiajie Xu, Xiangwei Geng, Hongyang Li, Conghui He, Jianping Shi, Yu Qiao, et al. Persformer: 3d lane detection via perspective transformer and the openlane benchmark. In Proc. of the European Conf. on Computer Vision (ECCV), 2022. 2, 3, 4, 6, 7, 8
[3] Carl de Boor. On calculating with b-splines. Journal of Approximation Theory, 1972. 2
[4] Netalee Efrat, Max Bluvstein, Shaul Oron, Dan Levi, Noa Garnett, and Bat El Shlomo. 3d-lanenet+: Anchor free lane detection using a semi-local representation. arXiv/2011.01535, 2020. 1, 2, 3
[5] Zhengyang Feng, Shaohua Guo, Xin Tan, Ke Xu, Min Wang, and Lizhuang Ma. Rethinking efficient lane detection via curve modeling. In Proc. IEEE Conf. on Computer Vision and Pattern Recognition (CVPR), 2022. 2
[6] Noa Garnett, Rafi Cohen, Tomer Pe'er, Roee Lahav, and Dan Levi. 3d-lanenet: End-to-end 3d multiple lane detection. In Proc. of the IEEE International Conf. on Computer Vision (ICCV), 2019. 1, 2, 3, 7, 8
[7] Mohsen Ghafoorian, Cedric Nugteren, Nóra Baka, Olaf Booij, and Michael Hofmann. EL-GAN: embedding loss driven generative adversarial networks for lane detection. In Proc. of the European Conf. on Computer Vision (ECCV), 2018. 1, 2
[8] Stephen Gould, Richard Hartley, and Dylan Campbell. Deep declarative networks. IEEE Trans. on Pattern Analysis and Machine Intelligence (PAMI), 2021. 2
[9] Yuliang Guo, Guang Chen, Peitao Zhao, Weide Zhang, Jinghao Miao, Jingao Wang, and Tae Eun Choe. Gen-lanenet: A generalized and scalable approach for 3d lane detection. In Proc. of the European Conf. on Computer Vision (ECCV), 2020. 1, 2, 3, 4, 6, 7, 8
[10] Steffen Hagedorn, Marcel Milich, and Alexandru P. Condurache. Pioneering se(2)-equivariant trajectory planning for automated driving. arXiv:2403.11304, 2024. 2
[11] Yuenan Hou, Zheng Ma, Chunxiao Liu, and Chen Change Loy. Learning lightweight lane detection cnns by self attention distillation. In Proc. of the IEEE International Conf. on Computer Vision (ICCV), 2019. 1, 2
[12] Shaofei Huang, Zhenwei Shen, Zehao Huang, Zihan Ding, Jiao Dai, Jizhong Han, Naiyan Wang, and Si Liu. Anchor3dlane: Learning to regress 3d anchors for monocular 3d lane detection. In Proc. IEEE Conf. on Computer Vision and Pattern Recognition (CVPR), 2023. 2, 3, 7, 8
[13] Brody Huval, Tao Wang, Sameep Tandon, Jeff Kiske, Will Song, Joel Pazhayampallil, Mykhaylo Andriluka, Pranav Rajpurkar, Toki Migimatsu, Royce Cheng-Yue, Fernando A. Mujica, Adam Coates, and Andrew Y. Ng. An empirical evaluation of deep learning on highway driving. arXiv/1504.01716, 2015. 1, 2
[14] Diederik P. Kingma and Jimmy Ba. Adam: A method for stochastic optimization. In Proc. of the International Conf. on Learning Representations (ICLR), 2015. 7
[15] YeongMin Ko, Jiwon Jun, Donghwuy Ko, and Moongu Jeon. Key points estimation and point instance segmentation approach for lane detection. arXiv/2002.06604, 2020. 1, 2
[16] Seokju Lee, Junsik Kim, Jae Shin Yoon, Seunghak Shin, Oleksandr Bailo, Namil Kim, Tae-Hee Lee, Hyun Seok Hong, Seung-Hoon Han, and In So Kweon. Vpgnet: Vanishing point guided network for lane and road marking detection and recognition. In Proc. of the IEEE International Conf. on Computer Vision (ICCV), 2017. 1, 2
[17] Chenguang Li, Jia Shi, Ya Wang, and Guangliang Cheng. Reconstruct from top view: A 3d lane detection approach based on geometry structure prior. In Proc. IEEE Conf. on Computer Vision and Pattern Recognition (CVPR), 2022. 2, 3, 8
[18] Qi Li, Yue Wang, Yilun Wang, and Hang Zhao. Hdmapnet: An online HD map construction and evaluation framework. In Proc. IEEE International Conf. on Robotics and Automation (ICRA), 2022. 3
[19] Xiang Li, Jun Li, Xiaolin Hu, and Jian Yang. Line-cnn: End-to-end traffic line detection with line proposal unit. IEEE Trans. on Intelligent Transportation Systems (T-ITS), 2020. 1, 2
[20] Zhiqi Li, Wenhai Wang, Hongyang Li, Enze Xie, Chonghao Sima, Tong Lu, Yu Qiao, and Jifeng Dai. Bevformer: Learning bird's-eye-view representation from multi-camera images via spatiotemporal transformers. In Proc. of the European Conf. on Computer Vision (ECCV), 2022. 3
[21] Tsung-Yi Lin, Priya Goyal, Ross B. Girshick, Kaiming He, and Piotr Dollár. Focal loss for dense object detection. In Proc. of the IEEE International Conf. on Computer Vision (ICCV), 2017. 6
[22] Ruijin Liu, Zejian Yuan, Tie Liu, and Zhiliang Xiong. End-to-end lane shape prediction with transformers. In Proc. of the IEEE Winter Conference on Applications of Computer Vision (WACV), 2021. 2
[23] Ruijin Liu, Dapeng Chen, Tie Liu, Zhiliang Xiong, and Zejian Yuan. Learning to predict 3d lane shape and camera pose from a single image via geometry constraints. In Proc. of the Conf. on Artificial Intelligence (AAAI), 2022. 1, 2, 3
[24] Pingping Lu, Chen Cui, Shaobing Xu, Huei Peng, and Fan Wang. SUPER: A novel lane detection system. IEEE Trans. on Intelligent Vehicles (T-IV), 2021. 2
[25] Hanspeter Mallot, Heinrich Bülthoff, J. J. Little, and S. Bohrer. Inverse perspective mapping simplifies optical flow computation and obstacle detection. Biological Cybernetics, 1991. 3
[26] Davy Neven, Bert De Brabandere, Stamatios Georgoulis, Marc Proesmans, and Luc Van Gool. Towards end-to-end lane detection: an instance segmentation approach. In Proc. IEEE Intelligent Vehicles Symposium (IV), 2018. 1, 2
[27] Marcos Nieto, Luis Salgado, Fernando Jaureguizar, and Jon Arróspide. Robust multiple lane road modeling based on perspective analysis. In Proc. IEEE International Conf. on Image Processing (ICIP), 2008. 2
[28] Bowen Pan, Jiankai Sun, Ho Yin Tiga Leung, Alex Andonian, and Bolei Zhou. Cross-view semantic segmentation for sensing surroundings. IEEE Robotics Autom. Lett., 2020. 3
[29] Xingang Pan, Jianping Shi, Ping Luo, Xiaogang Wang, and Xiaoou Tang. Spatial as deep: Spatial CNN for traffic scene understanding. In Proc. of the Conf. on Artificial Intelligence (AAAI), 2018. 1, 2
[30] Lang Peng, Zhirong Chen, Zhangjie Fu, Pengpeng Liang, and Erkang Cheng. Bevsegformer: Bird's eye view semantic segmentation from arbitrary camera rigs. In Proc. of the IEEE Winter Conference on Applications of Computer Vision (WACV), 2023. 3
[31] Jonah Philion and Sanja Fidler. Lift, splat, shoot: Encoding images from arbitrary camera rigs by implicitly unprojecting to 3d. In Proc. of the European Conf. on Computer Vision (ECCV), 2020. 3, 5
[32] Maximilian Pittner, Alexandru Condurache, and Joel Janai. 3d-splinenet: 3d traffic line detection using parametric spline representations. In Proc. of the IEEE Winter Conference on Applications of Computer Vision (WACV), 2023. 1, 2, 3, 4, 6, 8
[33] Fabio Pizzati, Marco Allodi, Alejandro Barrera, and Fernando García. Lane detection and classification using cascaded cnns. In Proc. of the International Conf. on Computer Aided Systems Theory (EUROCAST), 2019. 1, 2
[34] Zhan Qu, Huan Jin, Yang Zhou, Zhen Yang, and Wei Zhang. Focus on local: Detecting lane marker from bottom up via key point. In Proc. IEEE Conf. on Computer Vision and Pattern Recognition (CVPR), 2021. 1, 2
[35] Matthias Rath and Alexandru Paul Condurache. Invariant integration in deep convolutional feature space. In Proc. of the European Symposium on Artificial Neural Networks, Computational Intelligence and Machine Learning (ESANN), 2020. 2
[36] Matthias Rath and Alexandru Paul Condurache. Improving the sample-complexity of deep classification networks with invariant integration. In Proc. of the International Joint Conf. on Computer Vision, Imaging and Computer Graphics Theory and Applications (VISIGRAPP), 2022. 2
[37] Thomas Roddick and Roberto Cipolla. Predicting semantic map representations from images using pyramid occupancy networks. In Proc. IEEE Conf. on Computer Vision and Pattern Recognition (CVPR), 2020. 3
[38] Thomas Roddick, Alex Kendall, and Roberto Cipolla. Orthographic feature transform for monocular 3d object detection. In Proc. of the British Machine Vision Conf. (BMVC), 2019. 3
[39] Avishkar Saha, Oscar Mendez, Chris Russell, and Richard Bowden. Translating images into maps. In Proc. IEEE International Conf. on Robotics and Automation (ICRA), 2022. 3
[40] Jinming Su, Chao Chen, Ke Zhang, Junfeng Luo, Xiaoming Wei, and Xiaolin Wei. Structure guided lane detection. In Proc. of the International Joint Conf. on Artificial Intelligence (IJCAI), 2021. 2
[41] Lucas Tabelini, Rodrigo Berriel, Thiago M. Paixão, Claudine Badue, Alberto F. De Souza, and Thiago Oliveira-Santos. Keep your eyes on the lane: Real-time attention-guided lane detection. In Proc. IEEE Conf. on Computer Vision and Pattern Recognition (CVPR), 2021. 1, 2
[42] Mingxing Tan and Quoc Le. Efficientnet: Rethinking model scaling for convolutional neural networks. In Proc. of the International Conf. on Machine Learning (ICML), 2019. 7
[43] Lucas Tabelini Torres, Rodrigo Ferreira Berriel, Thiago M. Paixão, Claudine Badue, Alberto F. De Souza, and Thiago Oliveira-Santos. Polylanenet: Lane estimation via deep polynomial regression. In Proc. of the International Conf. on Pattern Recognition (ICPR), 2020. 2
[44] Bingke Wang, Zilei Wang, and Yixin Zhang. Polynomial regression network for variable-number lane detection. In Proc. of the European Conf. on Computer Vision (ECCV), 2020. 2
[45] Jinsheng Wang, Yinchao Ma, Shaofei Huang, Tianrui Hui, Fei Wang, Chen Qian, and Tianzhu Zhang. A keypoint-based global association network for lane detection. In Proc. IEEE Conf. on Computer Vision and Pattern Recognition (CVPR), 2022. 1, 2
[46] Ruihao Wang, Jian Qin, Kaiying Li, Yaochen Li, Dong Cao, and Jintao Xu. Bev-lanedet: An efficient 3d lane detection based on virtual camera via key-points. In Proc. IEEE Conf. on Computer Vision and Pattern Recognition (CVPR), 2023. 2, 3, 7, 8
[47] Yuping Wang and Jier Chen. Eqdrive: Efficient equivariant motion forecasting with multi-modality for autonomous driving. arXiv:2310.17540, 2023. 2
[48] Lu Xiong, Zhenwen Deng, Peizhi Zhang, and Zhiqiang Fu. A 3d estimation of structural road surface based on lane-line information. In IFAC Conf. on Engine and Powertrain Control, Simulation and Modeling (E-COSM), 2018. 2
[49] Chenxin Xu, Robby T. Tan, Yuhong Tan, Siheng Chen, Yu Guang Wang, Xinchao Wang, and Yanfeng Wang. Eqmotion: Equivariant multi-agent motion prediction with invariant interaction reasoning. In Proc. IEEE Conf. on Computer Vision and Pattern Recognition (CVPR), 2023. 2
[50] Fan Yan, Ming Nie, Xinyue Cai, Jianhua Han, Hang Xu, Zhen Yang, Chaoqiang Ye, Yanwei Fu, Michael Bi Mi, and Li Zhang. Once-3dlanes: Building monocular 3d lane detection. In Proc. IEEE Conf. on Computer Vision and Pattern Recognition (CVPR), 2022. 2, 3
[51] Tu Zheng, Hao Fang, Yi Zhang, Wenjian Tang, Zheng Yang, Haifeng Liu, and Deng Cai. RESA: recurrent feature-shift aggregator for lane detection. In Proc. of the Conf. on Artificial Intelligence (AAAI), 2021. 2
[52] Tu Zheng, Yifei Huang, Yang Liu, Wenjian Tang, Zheng Yang, Deng Cai, and Xiaofei He. Clrnet: Cross layer refinement network for lane detection. In Proc. IEEE Conf. on Computer Vision and Pattern Recognition (CVPR), 2022. 2
[53] Qin Zou, Hanwen Jiang, Qiyu Dai, Yuanhao Yue, Long Chen, and Qian Wang. Robust lane detection from continuous driving scenes using deep neural networks. IEEE Trans. on Vehicular Technology (VTC), 2020. 1, 2