Environment
- WSL2 + Ubuntu 22.04
- GPU driver: 528.89
- CUDA: 11.7
Problem
Creating a container with --gpus all fails with the following error:
docker run --gpus all -it -e DISPLAY=unix$DISPLAY -v /tmp/.X11-unix:/tmp/.X11-unix:rw celinachild/orbslam2 /bin/bash
docker: Error response from daemon: failed to create shim task: OCI runtime create failed: runc create failed: unable to start container process: error during container init: error running hook #0: error running hook: exit status 1, stdout: , stderr: Auto-detected mode as 'legacy'
nvidia-container-cli: mount error: file creation failed: /var/lib/docker/overlay2/b92e32945b90e830d9786a89fa0c138933ab7657c8a459bafe551fb944836b65/merged/usr/lib/x86_64-linux-gnu/libnvidia-ml.so.1: file exists: unknown.
ERRO[0000] error waiting for container: context canceled
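Solution
Reference: issue1551
The fix is to:
- Create a container without GPU support first
- Delete the conflicting files inside it (delete only the files named in the error message, nothing more)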
rm -rf /usr/lib/x86_64-linux-gnu/libnvidia-ml.so.1 /usr/lib/x86_64-linux-gnu/libcuda.so.1
- Create a new image that no longer contains these files
- Create the container from the new image
In practice:
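The file to delete can be read off the error message: it is the part of the path after the overlay2 layer's /merged directory. A minimal sketch of extracting it, assuming the error text keeps the format shown above (the function name conflicting_path is made up for illustration):

```shell
#!/bin/sh
# conflicting_path is a hypothetical helper, not part of Docker or the NVIDIA tooling.
# It reads a docker error message on stdin and prints the in-image path of the
# file that nvidia-container-cli could not create (the part after ".../merged").
conflicting_path() {
    sed -n 's|.*file creation failed: .*/merged\(/[^:]*\): file exists.*|\1|p'
}

# Example with the error line from this post.
err='nvidia-container-cli: mount error: file creation failed: /var/lib/docker/overlay2/b92e32945b90e830d9786a89fa0c138933ab7657c8a459bafe551fb944836b65/merged/usr/lib/x86_64-linux-gnu/libnvidia-ml.so.1: file exists: unknown.'
printf '%s\n' "$err" | conflicting_path
```

The error only names one file per attempt, so after removing it the same error may come back for another library (libcuda.so.1 in this case).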
╭─changym at LAPTOP-CHANGYM in ~ (base)
╰─○ docker run -it celinachild/orbslam2
root@5ef649cf9ec8:/# rm /usr/lib/x86_64-linux-gnu/libnvidia-*
root@5ef649cf9ec8:/# rm /usr/lib/x86_64-linux-gnu/libcuda.so*
root@5ef649cf9ec8:/# rm /usr/lib/x86_64-linux-gnu/libnvcuvid.so.*
root@5ef649cf9ec8:/# exit
╭─changym at LAPTOP-CHANGYM in ~ (base)
╰─○ docker commit peaceful_ganguly orbslam2_cuda
sha256:2a93ce93469ed986226d119c2f4fe886b8b61554f20304cfcdd8832d8d02da0c
╭─changym at LAPTOP-CHANGYM in ~ (base)
╰─○ docker run --gpus all -it -e DISPLAY=unix$DISPLAY -v /tmp/.X11-unix:/tmp/.X11-unix:rw orbslam2_cuda /bin/bash
root@47d847a00e55:/# nvidia-smi
Fri Apr 14 08:32:53 2023
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 525.105.16   Driver Version: 528.89       CUDA Version: 12.0     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  Quadro T2000 wi...  On   | 00000000:01:00.0  On |                  N/A |
| N/A   51C    P8     3W /  35W |   1839MiB /  4096MiB |     10%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|    0   N/A  N/A        20      G   /Xwayland                       N/A      |
|    0   N/A  N/A        20      G   /Xwayland                       N/A      |
|    0   N/A  N/A        33      G   /Xwayland                       N/A      |
+-----------------------------------------------------------------------------+
The container can now be created normally.
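The same cleanup could also be expressed as a Dockerfile instead of docker commit, which makes the new image reproducible. This is only a sketch and was not part of the original fix: the helper name write_dockerfile and the ./orbslam2_cuda build directory are assumptions, and the rm patterns are the ones used in the session above.

```shell
#!/bin/sh
# write_dockerfile is a hypothetical helper: it writes a Dockerfile into the
# given directory that starts from the original image and removes the driver
# libraries baked into it, so the NVIDIA runtime can mount the host versions.
write_dockerfile() {
    dir=$1
    mkdir -p "$dir"
    cat > "$dir/Dockerfile" <<'EOF'
FROM celinachild/orbslam2
# rm -f does not fail if a pattern matches nothing.
RUN rm -f /usr/lib/x86_64-linux-gnu/libnvidia-* \
          /usr/lib/x86_64-linux-gnu/libcuda.so* \
          /usr/lib/x86_64-linux-gnu/libnvcuvid.so.*
EOF
}

write_dockerfile ./orbslam2_cuda
# Build and run afterwards (requires Docker, not executed here):
#   docker build -t orbslam2_cuda ./orbslam2_cuda
#   docker run --gpus all -it orbslam2_cuda /bin/bash
```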