BEVFusion: Introduction, Environment Setup and Installation, and Fixes for Common Errors
Introduction to BEVFusion
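Point-level multimodal fusion comes in two forms: projecting the point cloud onto the image, or projecting the image onto the point cloud. The former loses spatial geometric information and the latter loses image semantic information, so neither handles both object detection and map segmentation well.
BEVFusion is an efficient, general-purpose framework for multi-task, multi-sensor fusion. It unifies multimodal features in a shared bird's-eye-view (BEV) representation space and fuses image and point-cloud features there. The network is fully convolutional, and the 2D-to-3D view transformation has been optimized for efficiency, cutting its latency by more than 40x.
The network architecture of BEVFusion is shown in the figure below: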
BEVFusion Environment Setup and Installation
My hardware and system configuration:
- Linux (Ubuntu 20.04)
- NVIDIA GeForce RTX 3090
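- NVIDIA driver version: 12.0 (this looks like the CUDA version reported by nvidia-smi rather than a driver release number)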
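- CUDA version: 11.3
Set up the Python environment that BEVFusion needs (if your CUDA version differs, adjust the cudatoolkit version and the mmcv download URL accordingly):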
# Create a conda environment named bevfusion
conda create -n bevfusion python=3.8 -y
conda activate bevfusion
# Install PyTorch
conda install pytorch==1.10.0 torchvision==0.11.0 torchaudio==0.10.0 cudatoolkit=11.3 -c pytorch -c conda-forge
# Install mmcv-full (the build that ships the compiled ops)
pip install mmcv-full==1.4.0 -f https://download.openmmlab.com/mmcv/dist/cu113/torch1.10.0/index.html
# Install mmdetection
pip install mmdet==2.20.0
# Install the nuScenes devkit (dataset tools and SDK)
pip install nuscenes-devkit
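# Install mpi4py
pip install mpi4py==3.0.3
# Install torchpack
pip install torchpack
# Install numba
pip install numba
# Enter the root of the BEVFusion repository
cd bevfusion
# Build and install the mmdet3d package bundled with BEVFusion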
python setup.py develop
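After the install finishes, it can help to confirm that the pinned versions actually ended up in the environment. The following is a minimal sketch, not part of BEVFusion: the helper names (`installed_version`, `matches_pin`, `check_pins`) are made up for illustration, and the distribution names in `PINNED` assume each package registers itself under the same name used in the install commands above.

```python
from importlib import metadata

# Pins copied from the install commands above. The distribution names are an
# assumption about how each package registers itself in the environment.
PINNED = {
    "torch": "1.10.0",
    "torchvision": "0.11.0",
    "mmcv-full": "1.4.0",
    "mmdet": "2.20.0",
    "mpi4py": "3.0.3",
}


def installed_version(name):
    """Return the installed version string of a distribution, or None."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None


def matches_pin(installed, pinned):
    """Compare versions while ignoring a local suffix such as '+cu113'."""
    if installed is None:
        return False
    return installed.split("+", 1)[0] == pinned


def check_pins(pins, lookup=installed_version):
    """Map each package name to (installed_version, matches_pin)."""
    report = {}
    for name, pinned in pins.items():
        found = lookup(name)
        report[name] = (found, matches_pin(found, pinned))
    return report


if __name__ == "__main__":
    for name, (found, ok) in check_pins(PINNED).items():
        status = "OK" if ok else "MISMATCH"
        print(f"{name:12s} pinned={PINNED[name]:8s} installed={found} [{status}]")
```

Run it inside the activated bevfusion environment. A MISMATCH line points at the package to reinstall before moving on to the build step.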
Troubleshooting
This section covers the errors I ran into during installation, training, and testing, along with how I fixed them.
Error 1
During training or testing, spconv raises:
RuntimeError: /XXX/XXX/XXX/bevfusion/mmdet3d/ops/spconv/src/indice_cuda.cu 124
cuda execution failed with error 2
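Fix:
1. In the bevfusion directory, run python setup.py develop so that the bundled spconv ops are built for your current environment.
2. If the error persists after step 1, the GPU has probably run out of memory. Switch to a GPU with more memory or reduce batch_size.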
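CUDA runtime error code 2 corresponds to a memory allocation failure, which fits the out-of-memory explanation in step 2. To see how much memory is free before launching a run, one option is to parse the output of nvidia-smi. The sketch below is only an illustration: the function names are hypothetical, and it assumes nvidia-smi is on the PATH and supports the `--query-gpu` and `--format` options shown.

```python
import shutil
import subprocess

# Ask nvidia-smi for used and total memory per GPU, in MiB, without headers.
QUERY = [
    "nvidia-smi",
    "--query-gpu=memory.used,memory.total",
    "--format=csv,noheader,nounits",
]


def parse_gpu_memory(csv_text):
    """Parse 'used, total' lines (MiB) into a list of (used, total) ints."""
    rows = []
    for line in csv_text.strip().splitlines():
        used, total = (int(field.strip()) for field in line.split(","))
        rows.append((used, total))
    return rows


def query_gpu_memory():
    """Return per-GPU (used, total) in MiB, or [] if nvidia-smi is missing."""
    if shutil.which("nvidia-smi") is None:
        return []
    out = subprocess.run(QUERY, capture_output=True, text=True, check=True)
    return parse_gpu_memory(out.stdout)


def free_mib(rows):
    """Free memory per GPU in MiB."""
    return [total - used for used, total in rows]
```

Calling `free_mib(query_gpu_memory())` right before training shows whether another process is already holding most of the card, which is worth ruling out before shrinking batch_size.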
Error 2
During training:
TypeError: FormatCode() got an unexpected keyword argument 'verify'
The fix is to install a compatible version of yapf:
pip install yapf==0.40.1
Error 3
AttributeError: module 'distutils' has no attribute 'version'
Primary job terminated normally, but 1 process returned a non-zero exit code. Per user-direction, the job has been aborted.
The fix is to install a compatible version of setuptools:
pip install setuptools==59.5.0
Error 4
Traceback (most recent call last):
  File "/home/XXX/bevfusion/tools/train.py", line 91, in <module>
    main()
  File "/home/XXX/bevfusion/tools/train.py", line 24, in main
    dist.init()
  File "/home/XXX/anaconda3/envs/BEVFusionEnv/lib/python3.8/site-packages/torchpack/distributed/context.py", line 23, in init
    master_host = 'tcp://' + os.environ['MASTER_HOST']
  File "/home/XXX/anaconda3/envs/BEVFusionEnv/lib/python3.8/os.py", line 673, in __getitem__
    raise KeyError(key) from None
KeyError: 'MASTER_HOST'
The fix is to set the MASTER_HOST environment variable, replacing IP:port with the address and port of the master node:
export MASTER_HOST=IP:port
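The traceback shows that torchpack reads MASTER_HOST directly from os.environ, so the variable can also be given a fallback value from Python before dist.init() runs. This is a sketch under assumptions: `ensure_master_host` is a hypothetical helper, and `127.0.0.1:29500` is a placeholder for a single-machine run, not a value taken from BEVFusion or torchpack. Other variables normally supplied by the launcher may still be required; this only covers the key named in the traceback.

```python
import os

# Hypothetical single-machine default; replace with the real IP:port.
DEFAULT_MASTER_HOST = "127.0.0.1:29500"


def ensure_master_host(environ=os.environ, default=DEFAULT_MASTER_HOST):
    """Set MASTER_HOST if it is missing and return the value in effect."""
    return environ.setdefault("MASTER_HOST", default)
```

An already exported MASTER_HOST takes precedence, so the shell-level fix keeps working unchanged.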
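Error 5
ModuleNotFoundError: No module named 'mmcv._ext'
The cause is that mmcv==1.4.0 was installed by mistake; the plain mmcv package does not include the compiled ops.
The fix is to uninstall mmcv and install mmcv-full==1.4.0. Do not drop the -full suffix.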
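A quick way to tell which variant is installed is to check whether the compiled submodule can be located at all. The helper below is an illustrative sketch (the name `has_submodule` is made up); it relies only on the standard library's `importlib.util.find_spec`, which imports the parent package as a side effect.

```python
import importlib.util


def has_submodule(package, submodule):
    """Return True if `package.submodule` can be located.

    find_spec imports the parent package first, so a missing or broken
    parent surfaces as ImportError and is reported as False here.
    """
    try:
        return importlib.util.find_spec(f"{package}.{submodule}") is not None
    except ImportError:
        return False
```

In the bevfusion environment, `has_submodule("mmcv", "_ext")` returning False suggests that the lite mmcv package is installed, or that mmcv is missing entirely.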
Error 6
Running download_pretrained.sh fails to download the .pth files with:
ERROR 404: Not Found
The fix is to download them from Google Drive instead: https://drive.google.com/drive/folders/1Jru7VTfgvFF949DlP1oDfriP3wrmOd3c
Error 7
_configtest.c:2:10: fatal error: mpi.h: No such file or directory
#include <mpi.h>
^~~~~~~
compilation terminated.
failure.
removing: _configtest.c _configtest.o
error: Cannot compile MPI programs. Check your configuration!!!
[end of output]
note: This error originates from a subprocess, and is likely not a problem with pip.
ERROR: Failed building wheel for mpi4py
Running setup.py clean for mpi4py
Failed to build mpi4py
ERROR: Could not build wheels for mpi4py, which is required to install pyproject.toml-based projects
The fix is to install libopenmpi-dev first, then install mpi4py:
sudo apt install libopenmpi-dev
pip install mpi4py==3.0.3
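Before retrying the mpi4py build, it may be worth checking that an MPI compiler wrapper is visible on the PATH. This is a rough sketch: `find_mpi_compiler` is a hypothetical helper, and it assumes the system MPI package exposes a wrapper named mpicc, which is what the mpi4py build normally looks for.

```python
import shutil


def find_mpi_compiler(which=shutil.which, names=("mpicc",)):
    """Return the path of the first MPI compiler wrapper found, or None."""
    for name in names:
        path = which(name)
        if path:
            return path
    return None
```

If this returns None after installing libopenmpi-dev, the wrapper is probably not on the PATH of the shell in which pip runs.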
Error 8
cannot import name 'ball_query_ext' from 'mmdet3d.ops.ball_query'
The fix is to rebuild the extensions from the repository root:
cd bevfusion
python setup.py develop
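Since python setup.py develop builds extension modules in place, a successful build should leave compiled .so files next to the Python sources. The sketch below lists them; `find_built_extensions` is a hypothetical helper, and the exact file names depend on the Python version and platform.

```python
from pathlib import Path


def find_built_extensions(root, stem=""):
    """List compiled extension modules under `root` whose names start with `stem`."""
    return sorted(Path(root).rglob(f"{stem}*.so"))
```

For example, `find_built_extensions("mmdet3d/ops", "ball_query_ext")` returning an empty list from the repository root indicates that the extension was never built, which matches the import error above.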
Error 9
AttributeError: module 'numpy' has no attribute 'int'.
`np.int` was a deprecated alias for the builtin `int`. To avoid this error in existing code, use `int` by itself. Doing this will not modify any behavior and is safe. When replacing `np.int`, you may wish to use e.g. `np.int64` or `np.int32` to specify the precision. If you wish to review your current use, check the release note link for additional information.
The aliases was originally deprecated in NumPy 1.20; for more details and guidance see the original release note at:
https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
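The cause is that the installed numpy is too new (the np.int alias was removed in NumPy 1.24). The fix is to install an older version, for example: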
pip install numpy==1.23.1
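Downgrading numpy works around the problem. If you would rather patch the offending code, as the error message itself suggests, the first step is finding where the removed aliases are used. The following sketch scans source text with a regular expression; the function name is made up, the alias list covers the common ones removed in NumPy 1.24, and matches inside comments or strings are reported too.

```python
import re

# Aliases deprecated in NumPy 1.20 and removed in 1.24.
REMOVED_ALIASES = ("bool", "int", "float", "complex", "object", "str")

# The trailing \b keeps sized types such as np.int32 or np.float64, and
# underscore variants such as np.int_, from matching.
PATTERN = re.compile(r"\b(?:np|numpy)\.(?:" + "|".join(REMOVED_ALIASES) + r")\b")


def find_removed_aliases(source):
    """Return (line_number, matched_text) pairs for each removed alias."""
    hits = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        for match in PATTERN.finditer(line):
            hits.append((lineno, match.group(0)))
    return hits
```

Each hit can then be replaced with the builtin (`int`, `float`, `bool`) or an explicit sized type such as `np.int64`.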
Error 10
N > 0 assert faild. CUDA kernel launch blocks must be positive, but got N= 0
The fix is to install the spconv build that matches your CUDA version:
# For CUDA 10.2
pip install spconv-cu102
# For CUDA 11.3
pip install spconv-cu113
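The two package names above follow the pattern spconv-cu plus the CUDA major and minor digits. A tiny helper can derive the name from a version string. This is only a sketch: `spconv_package` is a hypothetical function, and only the two names listed above are confirmed here, so check that a wheel exists for any other CUDA version before relying on the generated name.

```python
def spconv_package(cuda_version):
    """Build the pip package name from a CUDA version, e.g. '11.3' -> 'spconv-cu113'."""
    major, minor = cuda_version.split(".")[:2]
    return f"spconv-cu{major}{minor}"
```

With PyTorch installed, `spconv_package(torch.version.cuda)` gives the name for the CUDA version that PyTorch was built against.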