An Improved UFLD-V2 Lane Line Recognition Method
ABSTRACT- Lane line recognition remains a crucial component of autonomous driving, particularly under complex
scenarios involving illumination changes and occlusions. This paper presents a structurally efficient and robust improvement of the
UFLD-V2 architecture, designed for real-time and reliable lane detection. The proposed method integrates three lightweight yet
complementary components: (1) Res2Net, replacing the original ResNet backbone, enhances multi-scale feature extraction and
inference efficiency through reparameterization; (2) an Efficient Multi-scale Attention (EMA) module captures fine-grained
contextual details across varying scene complexities; and (3) the Simple Attention Module (SimAM) is applied in the segmentation
head to suppress background noise and improve localization accuracy. Unlike prior work that uses these modules in isolation, we
propose a tailored integration strategy that achieves a favorable trade-off between accuracy and computational cost. Extensive
experiments on the TuSimple dataset show the effectiveness of our method, achieving 0.95947 accuracy, 0.0262 false positive rate,
0.02328 false negative rate, and an F1 score of 0.96517. Our approach surpasses several state-of-the-art models, including UFLD,
PolyLaneNet, EL-GAN, SAD, CurveFormer++, BEVLaneDet, and PersFormer, particularly under challenging conditions. These
findings highlight the potential of our approach for practical deployment in intelligent driving systems.
Keywords: Lane Line Recognition, UFLD-V2, Res2Net, EMA Attention Module, SimAM Module.
The rest of the paper is organized as follows: Section 2 reviews related work, Section 3 introduces the proposed method, Section 4 presents experimental results, and Section 5 concludes the paper.

░ 2. RELATED WORKS
Lane detection, as a crucial component of autonomous driving systems, plays a vital role in ensuring vehicle safety, supporting decision-making, and maintaining driving stability[5]. Consequently, it has become a research hotspot in the fields of intelligent transportation and computer vision in recent years. Traditional lane detection methods primarily rely on handcrafted features and classical image processing techniques, such as color thresholding, edge detection, and the Hough transform. These approaches can deliver acceptable performance under controlled conditions; however, they often lack robustness in complex environments involving shadow interference, illumination changes, lane occlusions, and road texture variability, which limits their adaptability to diverse and unstructured road scenarios. With the rapid development of deep learning, convolutional neural network (CNN)-based methods have become dominant[6]. These methods enable end-to-end learning, automatically extracting multi-scale features from input images, ranging from low-level textures to high-level structural semantics, and leading to more accurate and stable lane detection under challenging conditions[7]. Furthermore, the integration of attention mechanisms and self-supervised learning techniques in recent work has further improved generalization and robustness, enabling more reliable performance in real-world driving scenes[8].

Feature-based lane detection methods primarily rely on handcrafted features and classical image processing pipelines. For example, some approaches utilize color thresholding in HSV or CIELab color spaces, edge detection, and the Hough transform to extract lane markings[9-11]. These techniques can achieve good performance under stable lighting and clear road conditions. However, their strong dependence on color and edge contrast limits their adaptability in real-world environments with variable illumination, shadowing, or blurred markings.

Deep learning-based methods, in contrast, offer markedly better robustness against illumination changes, occlusions, and road texture variations. End-to-end frameworks such as instance segmentation models [15, 16], generative adversarial methods like EL-GAN[16], and regression-based networks like PolyLaneNet[17] have achieved strong performance on benchmarks like TuSimple and CULane.

More recently, a number of UFLD-based extensions have been proposed to further improve accuracy and robustness while maintaining real-time efficiency. MCA-UFLD[18], for example, utilizes a lightweight MobileNetV2 backbone with coordinate attention and a vanishing point branch to enhance semantic perception without sacrificing speed. Other methods incorporate Split-Attention and ASFF modules to enhance multi-scale feature fusion, resulting in notable gains in F1-score on CULane[8]. Additionally, techniques such as random masking and smooth curve loss have been employed to improve performance under occlusions and enhance the continuity of predicted lane lines. To further increase generalization under poor visibility, contrastive learning strategies such as CLLD have also been introduced[19].

These studies highlight the effectiveness of integrating lightweight backbones, attention mechanisms, and auxiliary learning strategies for lane detection. Inspired by these advances, we propose a structurally enhanced UFLD-V2 framework that incorporates a Res2Net backbone for multi-scale feature extraction, an Efficient Multi-scale Attention (EMA) module for refined spatial-channel modeling, and a Simple Attention Module (SimAM) to suppress background interference. Unlike prior works that use such modules independently, our method adopts a task-driven, latency-aware integration strategy designed to maintain real-time applicability while significantly improving robustness and accuracy. Detailed descriptions are provided in Section 3.

░ 3. PROPOSED METHODOLOGY
Figure 1 shows the basic network studied in this paper. It consists of two main branches: an existence branch and a localization branch. The input image is first passed through the backbone network to extract deep features, and a multi-layer perceptron (MLP) then determines whether a lane is present in the image. If a lane exists, the localization branch predicts its specific position, which is output as the final lane location.
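To make this structure concrete, the following is a minimal PyTorch sketch of such a two-branch head. It is an illustration only: the feature dimension, the row-anchor grid size, and the layer widths are our assumptions, not the authors' exact implementation.

import torch
import torch.nn as nn

class TwoBranchLaneHead(nn.Module):
    """Illustrative two-branch head: an existence branch (is each lane
    present?) and a localization branch (where is it?), as in Figure 1."""
    def __init__(self, feat_dim=512, num_lanes=4, num_rows=56, num_cols=100):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)
        # Existence branch: MLP producing one presence logit per lane.
        self.exist = nn.Sequential(
            nn.Linear(feat_dim, 128), nn.ReLU(inplace=True),
            nn.Linear(128, num_lanes))
        # Localization branch: per-lane location logits over a row-anchor
        # grid (num_rows anchors x num_cols candidate cells).
        self.locate = nn.Sequential(
            nn.Linear(feat_dim, 2048), nn.ReLU(inplace=True),
            nn.Linear(2048, num_lanes * num_rows * num_cols))
        self.shape = (num_lanes, num_rows, num_cols)

    def forward(self, feat):                        # feat: (B, C, H, W) from backbone
        v = self.pool(feat).flatten(1)              # global feature vector (B, C)
        exist = torch.sigmoid(self.exist(v))        # lane presence probabilities
        loc = self.locate(v).view(-1, *self.shape)  # location logits per lane
        return exist, loc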
3.2. Introducing EMA Attention Module
The Efficient Multi-scale Attention (EMA) module is introduced to capture fine-grained contextual details across varying scene complexities while adding little computational cost.
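For reference, the sketch below follows the publicly available EMA formulation (Ouyang et al., 2023) rather than code from this paper; the grouping factor and where the module is inserted are our assumptions (channels must be divisible by the factor).

import torch
import torch.nn as nn

class EMA(nn.Module):
    """Efficient Multi-scale Attention: a 1x1 directional-pooling branch and
    a 3x3 branch exchange information via cross-spatial softmax weighting."""
    def __init__(self, channels, factor=8):
        super().__init__()
        self.groups = factor
        self.softmax = nn.Softmax(dim=-1)
        self.agp = nn.AdaptiveAvgPool2d((1, 1))
        self.pool_h = nn.AdaptiveAvgPool2d((None, 1))   # pool along width
        self.pool_w = nn.AdaptiveAvgPool2d((1, None))   # pool along height
        self.gn = nn.GroupNorm(channels // factor, channels // factor)
        self.conv1x1 = nn.Conv2d(channels // factor, channels // factor, 1)
        self.conv3x3 = nn.Conv2d(channels // factor, channels // factor, 3, padding=1)

    def forward(self, x):
        b, c, h, w = x.size()
        g = x.reshape(b * self.groups, -1, h, w)         # split channels into groups
        x_h = self.pool_h(g)                             # (bg, c/g, h, 1)
        x_w = self.pool_w(g).permute(0, 1, 3, 2)         # (bg, c/g, w, 1)
        hw = self.conv1x1(torch.cat([x_h, x_w], dim=2))  # joint directional encoding
        x_h, x_w = torch.split(hw, [h, w], dim=2)
        x1 = self.gn(g * x_h.sigmoid() * x_w.permute(0, 1, 3, 2).sigmoid())
        x2 = self.conv3x3(g)                             # local multi-scale branch
        # Cross-spatial learning: each branch reweights the other.
        y1 = self.softmax(self.agp(x1).reshape(b * self.groups, -1, 1).permute(0, 2, 1))
        y2 = self.softmax(self.agp(x2).reshape(b * self.groups, -1, 1).permute(0, 2, 1))
        m1 = x2.reshape(b * self.groups, c // self.groups, -1)
        m2 = x1.reshape(b * self.groups, c // self.groups, -1)
        weights = (y1 @ m1 + y2 @ m2).reshape(b * self.groups, 1, h, w)
        return (g * weights.sigmoid()).reshape(b, c, h, w)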
3.3. Optimizing Segmentation Head with SimAM Module
The Simple Attention Module (SimAM) is applied in the segmentation head to suppress background noise and improve localization accuracy; being parameter-free, it adds no learnable weights.
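A minimal PyTorch sketch of the standard SimAM formulation follows; the e_lambda default comes from the original SimAM paper, while how the module is wired into the head here is our assumption.

import torch
import torch.nn as nn

class SimAM(nn.Module):
    """Parameter-free attention: weights each activation by an inverse
    energy term so that distinctive neurons are emphasized."""
    def __init__(self, e_lambda=1e-4):
        super().__init__()
        self.e_lambda = e_lambda

    def forward(self, x):
        _, _, h, w = x.shape
        n = h * w - 1
        # Squared deviation of each activation from its channel mean.
        d = (x - x.mean(dim=(2, 3), keepdim=True)).pow(2)
        # Channel-wise variance estimate.
        v = d.sum(dim=(2, 3), keepdim=True) / n
        # Inverse energy: larger for activations that stand out.
        e_inv = d / (4 * (v + self.e_lambda)) + 0.5
        return x * torch.sigmoid(e_inv)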
░ 4. EXPERIMENTAL RESULTS
This section compares the improved model with the original UFLD-V2 in multiple complex scenarios, verifying the effectiveness and superiority of the proposed method. The experimental setup follows the configuration detailed in Section 3.4, ensuring consistency between architectural design and training procedures.
4.1. Training Settings
The model was trained on the TuSimple dataset, with the original image resolution (1280×720) downscaled to 320×800 to improve computational efficiency. Training was conducted for 200 epochs using a batch size of 16. The Adam optimizer was employed with an initial learning rate of 0.05, which was decayed using a cosine annealing schedule. A weight decay of 0.0005 was applied to mitigate overfitting. All experiments were conducted on Ubuntu 20.04 using the PyTorch framework and an NVIDIA H800 GPU.
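These settings map directly onto standard PyTorch calls, as in the sketch below; model, criterion, and train_loader are placeholders for the lane network, its loss, and a DataLoader built with batch size 16.

import torch

def train(model, train_loader, criterion, epochs=200):
    """Training loop with the stated settings: Adam (lr=0.05, weight decay
    5e-4) and cosine annealing over 200 epochs; inputs are assumed to be
    resized to 320x800 inside the data pipeline."""
    optimizer = torch.optim.Adam(model.parameters(), lr=0.05, weight_decay=5e-4)
    scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=epochs)
    for _ in range(epochs):
        for images, targets in train_loader:
            optimizer.zero_grad()
            loss = criterion(model(images), targets)
            loss.backward()
            optimizer.step()
        scheduler.step()   # decay the learning rate once per epoch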
To improve model generalization and reduce overfitting, a variety of data augmentation techniques were applied online during training via PyTorch's data pipeline.

Additionally, we introduce a targeted low-light enhancement method called adaptive brightness compensation enhancement (ABCE) to further improve robustness under poor illumination. As illustrated in Figure 6, ABCE first computes the grayscale brightness of the input image. If the brightness is above a defined threshold, the image remains unchanged. Otherwise, the algorithm:
• Computes the intensity range between the 5th and 95th percentiles of pixel values;
• Discards pixels outside this range to remove outliers;
• Applies linear contrast stretching or gamma correction to enhance important visual details;
• Outputs the adjusted image for model training.

Figure 6. Flowchart of the Adaptive Brightness Compensation Enhancement (ABCE) algorithm

This preprocessing strategy selectively enhances dark or shadowed images during training, ensuring lane markings remain visible while maintaining a lightweight and efficient training pipeline. In short, the ABCE module enhances underexposed training images by removing intensity outliers and normalizing contrast, improving lane visibility without increasing model complexity.

Figure 7. Visual comparison before and after applying the ABCE algorithm

The left image in Figure 7 shows the original low-light input from the TuSimple dataset, where lane markings are difficult to distinguish due to poor illumination. The right image displays the result after applying the proposed ABCE algorithm, which improves local contrast and lane visibility. This preprocessing enhances input quality before lane detection, especially under shadowed or underexposed conditions.
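A minimal sketch of the ABCE steps is given below. The brightness threshold value and the choice of linear stretching (rather than gamma correction) are assumptions on our part; the paper does not specify exact values.

import cv2
import numpy as np

def abce(image_bgr, brightness_thresh=90.0, low_pct=5, high_pct=95):
    """Adaptive Brightness Compensation Enhancement (sketch of Figure 6)."""
    gray = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2GRAY)
    if gray.mean() > brightness_thresh:        # bright enough: return unchanged
        return image_bgr
    # Intensity range between the 5th and 95th grayscale percentiles.
    lo, hi = np.percentile(gray, (low_pct, high_pct))
    img = image_bgr.astype(np.float32)
    img = np.clip(img, lo, hi)                 # discard intensity outliers
    img = (img - lo) / max(hi - lo, 1.0) * 255.0   # linear contrast stretching
    return img.astype(np.uint8)                # adjusted image for training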
4.2. Dataset Preprocessing and Label Generation
To facilitate structured training, the original TuSimple dataset annotations in JSON format were converted into a standardized format using a custom preprocessing script (convert_tusimple.py). The process involves the following key steps:
• Lane classification: Lane lines are extracted from the annotation files and classified based on their slope. Lanes are sorted from left to right, and lines with a length shorter than 90 pixels are filtered out to exclude incomplete or irrelevant annotations.
• Segmentation label generation: For each input image, a corresponding segmentation mask is generated in which pixel values {1, 2, 3, 4} represent the four lane lines from left to right. These masks serve as the ground truth for segmentation-based training.
• Structured label files: Two text files are generated, train_gt.txt for training and test.txt for testing. The training file records the path to each image together with binary existence indicators for the lane positions, while the test file contains paths to test images for evaluation without labels.
• Annotation caching: Parsed lane information is saved into a cache file (tusimple_anno_cache.json) to reduce preprocessing time during repeated training runs.
This preprocessing step ensures that the lane annotations are consistent, structured, and suitable for deep learning-based lane detection frameworks, while remaining fully compatible with the official TuSimple evaluation protocol.
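The sketch below illustrates the conversion logic described above. It is a hypothetical re-creation modeled on the description of convert_tusimple.py, not the authors' script; the drawing thickness and the mean-x ordering proxy are assumptions.

import json
import cv2
import numpy as np

def convert_tusimple(label_file, img_hw=(720, 1280), min_len_px=90):
    """Yield (image_path, segmentation_mask) pairs from TuSimple JSON labels:
    filter short lanes, order them left to right, rasterize as values 1-4."""
    with open(label_file) as f:
        for line in f:
            anno = json.loads(line)
            lanes = []
            for xs in anno["lanes"]:
                pts = [(x, y) for x, y in zip(xs, anno["h_samples"]) if x >= 0]
                # Vertical extent as a simple length proxy; drop short fragments.
                if len(pts) >= 2 and pts[-1][1] - pts[0][1] >= min_len_px:
                    lanes.append(pts)
            # Sort left to right (the paper classifies by slope; mean x is a proxy).
            lanes.sort(key=lambda p: np.mean([x for x, _ in p]))
            mask = np.zeros(img_hw, dtype=np.uint8)
            for idx, pts in enumerate(lanes[:4], start=1):   # pixel values 1..4
                cv2.polylines(mask, [np.array(pts, np.int32)], False, idx, 5)
            yield anno["raw_file"], mask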
4.3. Evaluation Metrics
Following the official TuSimple evaluation protocol, we report accuracy, false positive rate, false negative rate, and F1 score.

Accuracy: the proportion of correctly predicted lane points,

accuracy = Σ_clip C_clip / Σ_clip S_clip    (7)

where C_clip is the number of correctly predicted lane points in a clip and S_clip is the total number of ground-truth points in that clip.

False negative rate: the ratio of ground-truth lane points missed by the model,

FN = M_pred / N_gt    (8)

where M_pred is the number of missed ground-truth lane points and N_gt is the total number of ground-truth lane points.

False positive rate: the ratio of predicted lane line points that do not correspond to any real lane,

FP = F_pred / N_pred    (9)

where F_pred represents the number of lane line points that are misdetected during prediction, and N_pred represents the total number of lane line points in the prediction results.

F1 score: an indicator that comprehensively evaluates the performance of a classification model, often used to measure the balance between the precision and recall of binary and multi-class models; it is the harmonic mean of the two,

F1 = 2PR / (P + R)    (10)

where P = TP / (TP + FP) and R = TP / (TP + FN). TP represents the number of lane points correctly recognized by the model, and TN represents the number of negative samples correctly recognized as non-lane points.
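In code form these metrics reduce to a few lines; the counts are assumed to come from the point-matching step of the official TuSimple evaluation script.

def lane_metrics(tp, fp, fn, n_pred, n_gt):
    """Eqs. (7)-(10) as ratios over matched lane points."""
    fp_rate = fp / n_pred                 # eq. (9): misdetected / all predicted
    fn_rate = fn / n_gt                   # eq. (8): missed / all ground truth
    precision = tp / (tp + fp)            # P
    recall = tp / (tp + fn)               # R
    f1 = 2 * precision * recall / (precision + recall)   # eq. (10)
    return fp_rate, fn_rate, f1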
4.4. Performance Curves
Figure 8 shows the training progression of accuracy, false positives, false negatives, and F1 score over 200 epochs. The model demonstrates stable convergence and consistent performance gains across all metrics.
Figure 8. Training curves of Accuracy, False Positive Rate, False Negative Rate, and F1 Score over the first 100 epochs. The complete training spanned 200 epochs, with continued performance refinement observed beyond the 100-epoch mark; only the initial portion is visualized here for clarity.
░ Table 2. Model Complexity and Inference Efficiency

Model                Backbone     Params (M)   FLOPs (G)   FPS   Notes
UFLD-V2 (Baseline)   ResNet-18    10.8         13.6        260   As reported in [3]
Ours (Full Model)    Res2Net-50   ~200.0       ~35–50      235   Estimated from pth and testing

(Note: Parameter counts vary with backbone size. UFLD-V2 uses ResNet-18, while the proposed model adopts Res2Net-50 along with the EMA and SimAM modules. FLOPs are approximate estimates based on architectural analysis (input size 320×800). FPS is measured on an NVIDIA RTX 2080Ti using batch size = 1.)
4.6. Benchmark Comparison
Table 3 compares the improved model against the baseline and state-of-the-art methods. Our model consistently outperforms the others across all four evaluation metrics.

░ Table 3. Comparative performance with existing methods on the TuSimple test set

Model              Accuracy (%)   FP Rate (%)   FN Rate (%)   F1 Score (%)
UFLD               95.82          19.05         3.92          87.87
PolyLaneNet        93.36          9.42          9.33          90.63
EL-GAN             94.90          4.14          3.36          96.26
SAD                95.64          60.20         20.50         95.92
CurveFormer++      95.81          2.75          3.12          96.30
BEV-LaneDet        95.38          3.56          3.78          95.47
PersFormer         95.69          3.22          2.99          96.08
Improved UFLD-V2   95.95↑         2.62↓         2.33↓         96.52↑

(Note: Metrics for Improved UFLD-V2 are averaged over three runs.)
As shown in Table 3, the proposed Improved UFLD-V2 achieves the best overall performance across all evaluation metrics. It obtains the highest accuracy of 95.95%, outperforming the weakest baseline, PolyLaneNet (93.36%), by 2.59 percentage points.

In terms of false positive rate, our model achieves the lowest value of 2.62%, which is significantly lower than that of EL-GAN (4.14%) and dramatically lower than SAD (60.20%), indicating enhanced capability in suppressing erroneous detections.

For false negative rate, the proposed model also performs best with only 2.33%, improving upon the next-best EL-GAN (3.36%) by 1.03 percentage points and outperforming SAD (20.50%) by a wide margin.

The overall detection quality is reflected in the F1 score, where the Improved UFLD-V2 achieves 96.52%, slightly surpassing EL-GAN (96.26%) and CurveFormer++ (96.30%).

Compared with recent transformer-based approaches such as CurveFormer++, BEV-LaneDet, and PersFormer, our method consistently achieves superior performance across all four metrics. Specifically, it improves accuracy by up to 0.32 percentage points, reduces the FP and FN rates by up to 0.97 and 0.54 percentage points, respectively, and enhances the F1 score by up to 1.45 percentage points.

These results demonstrate that the proposed method not only achieves state-of-the-art detection accuracy, but also offers greater robustness and reliability across diverse scenarios, making it well-suited for practical deployment in autonomous driving systems.

4.7. Inference Speed Evaluation
To evaluate the real-time performance of our method, we compare the inference speed (FPS) of the proposed Enhanced UFLD-V2 with several representative lane detection models. For fairness, all FPS values are measured or estimated under a unified hardware setting equivalent to an RTX 2080Ti.

As shown in Table 4, our model achieves an inference speed of 235 FPS, which is slightly lower than the original UFLD-V2 (260 FPS) but significantly higher than transformer-based methods such as CurveFormer++ (110 FPS). This demonstrates that the proposed improvements (Res2Net, EMA, and SimAM) offer substantial accuracy gains without sacrificing real-time inference capability. The complete comparison is shown below.

░ Table 4. Inference speed (FPS) comparison of different lane detection models (unified on RTX 2080Ti)

Model              FPS
UFLD               326
UFLD-V2            260
PolyLaneNet        115
CurveFormer++      110
Enhanced UFLD-V2   235

(Note: FPS values are either measured directly or scaled from published data based on performance-equivalent conversions to RTX 2080Ti.)
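For reproducibility, FPS can be measured with a simple synchronized timing loop; the warm-up and iteration counts below are our assumed procedure, as the paper states only that batch size = 1 was used on an RTX 2080Ti.

import time
import torch

def measure_fps(model, iters=300, warmup=50):
    """Batch-1 throughput at the paper's 320x800 input resolution."""
    model.eval().cuda()
    x = torch.randn(1, 3, 320, 800, device="cuda")
    with torch.no_grad():
        for _ in range(warmup):           # warm-up to stabilize clocks/caches
            model(x)
        torch.cuda.synchronize()
        start = time.time()
        for _ in range(iters):
            model(x)
        torch.cuda.synchronize()          # wait for queued GPU work to finish
    return iters / (time.time() - start)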
4.8. Qualitative Visualization
To assess the robustness and generalization of the proposed method beyond the TuSimple dataset, we performed qualitative testing on additional challenging scenes from the CULane and BDD100K datasets.

Figure 9 presents the model's visual outputs in five typical driving conditions:
(a) Normal: standard daylight scenes from TuSimple
(b) Occlusion: vehicle-blocked lanes from TuSimple
(c) High Light and (d) Shadow: strong illumination and contrast from CULane
(e) Night: low-light driving scenes from BDD100K

As observed, the proposed model consistently achieves robust and accurate lane detection, maintaining continuity and reducing false positives even under adverse visual disturbances. These results provide strong visual evidence of the model's generalization ability across diverse urban and illumination conditions.
Figure 9. Qualitative lane detection results across five representative scenarios from three datasets: (a) Normal, (b) Occlusion (TuSimple), (c)
High Light (CULane), (d) Shadow (CULane), and (e) Night (BDD100K). Each row displays three sample outputs produced by the proposed
method under the given condition. The results demonstrate the model’s ability to generalize across varying road structures, lighting conditions,
and occlusion patterns.
░ 5. CONCLUSION AND FUTURE WORKS
This paper presents an enhanced lane detection framework based on the UFLD-V2 architecture, incorporating Res2Net for multi-scale feature extraction, the EMA attention module for robust feature representation in complex scenarios, and the SimAM module for refined lane boundary segmentation. Experimental results on the TuSimple dataset validate the effectiveness of the proposed approach, achieving an accuracy of 0.95947, a false positive rate of 0.0262, a false negative rate of 0.02328, and an F1 score of 0.96517, thereby outperforming the original UFLD-V2 and several state-of-the-art baselines.

The proposed method achieves a favorable balance between detection robustness and real-time inference efficiency. Its lightweight design and high FPS performance demonstrate that the model is well-suited for deployment in practical autonomous driving systems, where both accuracy and speed are essential for reliable operation in diverse driving conditions.

Future research will focus on further enhancing the model's efficiency, adaptability, and generalization capabilities in real-world applications. This includes optimizing inference through network compression and pruning, incorporating domain adaptation to improve cross-scenario robustness, integrating multimodal sensor inputs (e.g., LiDAR, GPS) for more reliable environmental perception, exploring advanced attention mechanisms for spatiotemporal modeling, and leveraging self-supervised learning to reduce reliance on large-scale labeled datasets.

░ REFERENCES
[1] Ni, J., et al., A survey on theories and applications for self-driving cars based on deep learning methods. Applied Sciences (Switzerland), 2020. 10(8).
[2] Rui, S. and C. Hui, Lane detection algorithm based on geometric matrix sampling. Chinese Science: Information Science, 2017. 47(04): p. 455-467.
[3] Qin, Z., H. Wang, and X. Li. Ultra Fast Structure-Aware Deep Lane Detection. in Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics). 2020.
[4] Qin, Z., P. Zhang, and X. Li, Ultra Fast Deep Lane Detection With Hybrid Anchor Driven Ordinal Classification. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2024. 46(5): p. 2555-2568.
[8] Yang, D., W. Bao, and K. Zheng, Lane Detection of Smart Car based on Deep Learning. Journal of Physics: Conference Series, 2021. 1873(1): p. 012068.
[9] Narote, S.P., et al., A review of recent advances in lane detection and departure warning system. Pattern Recognition, 2018. 73: p. 216-234.
[10] Ma, C.X., et al., A Real-time Semantic Segmentation Model for Lane Detection. Journal of Network Intelligence, 2024. 9(4): p. 2234-2257.
[11] Yoo, J.H., et al., A robust lane detection method based on vanishing point estimation using the relevance of line segments. IEEE Transactions on Intelligent Transportation Systems, 2017. 18(12): p. 3254-3266.
[12] Wang, J., et al. Lane boundary detection based on parabola model. in 2010 IEEE International Conference on Information and Automation, ICIA 2010. 2010.
[13] Wang, J. and X. An. A multi-step curved lane detection algorithm based on hyperbola-pair model. in 2010 IEEE International Conference on Automation and Logistics, ICAL 2010. 2010.
[14] Huang, A.S., et al., Finding multiple lanes in urban road networks with vision and lidar. Autonomous Robots, 2009. 26(2-3): p. 103-122.
[15] Neven, D., et al. Towards End-to-End Lane Detection: An Instance Segmentation Approach. in IEEE Intelligent Vehicles Symposium, Proceedings. 2018.
[16] Ghafoorian, M., et al. EL-GAN: Embedding loss driven generative adversarial networks for lane detection. in Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics). 2019.
[17] Tabelini, L., et al. PolyLaneNet: Lane estimation via deep polynomial regression. in Proceedings - International Conference on Pattern Recognition. 2020.
[18] Han, L., et al. Lane Detection Method Based on MCA-UFLD. in 2023 IEEE 8th International Conference on Intelligent Transportation Engineering (ICITE). 2023.
[19] Chen, S. and Y. Zhang. Lane detection algorithm based on improved UFLD. in Fourth International Conference on Image Processing and Intelligent Control (IPIC 2024), Vol. 13250, p. 748-755. SPIE. 2024.
[20] Gao, S.H., et al., Res2Net: A New Multi-Scale Backbone Architecture. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2021. 43(2): p. 652-662.

© 2025 by LiKang Bo, Fei Lu Siaw, and Tzer Hwai Gilbert Thio. Submitted for possible open access publication under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).