Deploying and Configuring Xinference on WSL2 Ubuntu 22.04, and All the Pitfalls

NVIDIA Windows Driver x86
https://www.nvidia.com/Download/index.aspx
https://www.nvidia.cn/drivers/lookup/

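Within mainland China, use the .cn site.

NVIDIA Studio Driver
GeForce Game Ready Driver

Q00 Which one to pick? My machine already has a driver installed, so I am not upgrading for now.

Verify it: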


PS C:\Users\zy-de> nvidia-smi
Mon Jan 20 13:45:26 2025
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 560.94                 Driver Version: 560.94         CUDA Version: 12.6     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                  Driver-Model | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA GeForce RTX 4070 ...  WDDM  |   00000000:01:00.0  On |                  N/A |
|  0%   43C    P8              3W /  220W |    1610MiB /  12282MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+

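Installing WSL2

Full WSL2 installation procedure on Windows 11

1. Enable the Windows Subsystem for Linux and the virtualization features

2. Manual installation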
wsl --update
wsl --version

Set the default WSL version
We use WSL 2 only. Run PowerShell as administrator and set the default WSL version to WSL 2:
wsl --set-default-version 2

List the distributions that are already installed
wsl -l -v

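Changing the storage location

A shortcut for relocating WSL2 storage, available in WSL 2.3.11 and later

Detailed steps
1. List the installed WSL distributions
wsl --list
2. If WSL is running, shut it down
wsl --shutdown
3. Move the chosen distribution to the target path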
wsl --manage Ubuntu-22.04 --move <path>

Install Ubuntu and change its storage location

wsl --list --online
wsl --install -d Ubuntu-22.04

wsl --shutdown
wsl --list --verbose

wsl --manage Ubuntu-22.04 --move d:\WSL_Ubuntu2204

Installing Miniconda (key step)

This is a key step: it sets up the Python environment in preparation for installing the CUDA Toolkit.

wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh
bash Miniconda3-latest-Linux-x86_64.sh

conda --version
conda info -e


conda create -n mypy310 python=3.10

conda activate mypy310
conda deactivate

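Installing the CUDA Toolkit

First switch to the conda mypy310 environment. There is no need to switch to the root user.

Install CUDA
CUDA Toolkit 12.6 Update 3

On the download page, be sure to select WSL-Ubuntu as the target!

There was no need to switch Ubuntu's apt mirror; the download was fast, and the installation went through without any errors.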

wget https://developer.download.nvidia.com/compute/cuda/repos/wsl-ubuntu/x86_64/cuda-wsl-ubuntu.pin
sudo mv cuda-wsl-ubuntu.pin /etc/apt/preferences.d/cuda-repository-pin-600
wget https://developer.download.nvidia.com/compute/cuda/12.6.3/local_installers/cuda-repo-wsl-ubuntu-12-6-local_12.6.3-1_amd64.deb
(The download is larger than 2 GB.)

sudo dpkg -i cuda-repo-wsl-ubuntu-12-6-local_12.6.3-1_amd64.deb
sudo cp /var/cuda-repo-wsl-ubuntu-12-6-local/cuda-*-keyring.gpg /usr/share/keyrings/
sudo apt-get update
sudo apt-get -y install cuda-toolkit-12-6

It succeeded on the first try.

Configure environment variables

vim ~/.bashrc
export PATH=/usr/local/cuda-12.6/bin:$PATH
source ~/.bashrc

[1] also asks for LD_LIBRARY_PATH to be set; I have not set it for now:
export LD_LIBRARY_PATH=/usr/local/cuda-12.6/lib64:$LD_LIBRARY_PATH
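
Editing ~/.bashrc by hand works fine. As an alternative, here is a minimal sketch that appends the same lines without creating duplicates when it is run more than once. `append_once` and `RC_FILE` are illustrative names, not part of any tool, and the paths assume the default /usr/local/cuda-12.6 install location used above. The LD_LIBRARY_PATH line is optional here, since the steps in this post worked without it.

```shell
# Hypothetical helper: append LINE to FILE only if that exact line is not there yet.
append_once() {
    file=$1
    line=$2
    # -x: match the whole line, -F: treat the pattern as a fixed string
    if ! grep -qxF -- "$line" "$file" 2>/dev/null; then
        printf '%s\n' "$line" >> "$file"
    fi
}

RC_FILE="${RC_FILE:-$HOME/.bashrc}"

# Single quotes keep $PATH literal so it is expanded when the shell starts
append_once "$RC_FILE" 'export PATH=/usr/local/cuda-12.6/bin:$PATH'
# Optional: only needed if something fails to find the CUDA shared libraries
append_once "$RC_FILE" 'export LD_LIBRARY_PATH=/usr/local/cuda-12.6/lib64:$LD_LIBRARY_PATH'
```

Run `source ~/.bashrc` (or open a new shell) afterwards so the change takes effect.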

(mypy310) ***:~$ nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2024 NVIDIA Corporation
Built on Tue_Oct_29_23:50:19_PDT_2024
Cuda compilation tools, release 12.6, V12.6.85
Build cuda_12.6.r12.6/compiler.35059454_0

(base) ***:~$ nvidia-smi
Mon Jan 20 17:21:38 2025
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 560.35.02              Driver Version: 560.94         CUDA Version: 12.6     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|

The CUDA Toolkit is installed successfully.
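
The two checks above can be wrapped in a small sketch that reports whether each tool is visible in the current shell, which is handy after changing PATH. `check_tool` is an illustrative helper name; the script only looks the commands up and does not run them.

```shell
# Hypothetical helper: succeed if the named command is on PATH, and say where.
check_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "found:   $1 -> $(command -v "$1")"
    else
        echo "missing: $1"
        return 1
    fi
}

# nvcc comes from the CUDA Toolkit; nvidia-smi is provided through the Windows driver
for tool in nvcc nvidia-smi; do
    check_tool "$tool" || true
done
```

If `nvcc` shows up as missing, the PATH export in ~/.bashrc has probably not been sourced in this shell yet.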

No need to install PyTorch manually

PyTorch

conda install pytorch torchvision torchaudio pytorch-cuda=12.4 -c pytorch -c nvidia

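TODO: Not installed for now. The installed CUDA is 12.6, so the 12.4 in this command looks a bit odd.

PyTorch gets installed automatically later, when Xinference is installed with pip.

No need to manually install NCCL or the pile of cuda-xxx packages

These also get installed automatically later, when Xinference is installed with pip.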
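
On the 12.4 versus 12.6 question: `pytorch-cuda=12.4` names the CUDA runtime that the PyTorch build ships with, not the system toolkit, so the two do not have to match exactly as long as the driver is new enough (here the driver reports CUDA 12.6). A minimal sketch for checking what actually got installed, assuming it is run inside the environment where Xinference was installed with pip; `torch_cuda_report` is an illustrative helper name.

```shell
# Hypothetical helper: report the PyTorch build and whether it can see the GPU.
torch_cuda_report() {
    python3 - <<'EOF'
try:
    import torch
except ImportError:
    print("torch is not installed in this environment")
else:
    # torch.version.cuda is the CUDA runtime version this build was made for
    print("torch", torch.__version__, "built for CUDA", torch.version.cuda)
    print("CUDA available:", torch.cuda.is_available())
EOF
}

torch_cuda_report
```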

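Deploying Xinference with pip rather than Docker

I tried deploying both the CPU and GPU versions of Xinference with Docker.
For Docker I tried both Docker Desktop on Windows and Docker inside a virtual machine. Neither worked.
It was also very time-consuming: the CPU image is about 6 GB and the GPU image about 17 GB.

Installation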

pip install "xinference[all]"

This was very slow, so I switched the pip index to a mirror.

pip config set global.index-url https://mirrors.aliyun.com/pypi/simple
pip config set install.trusted-host mirrors.aliyun.com

OK. Run the install again:

pip install "xinference[all]"

Pitfall: nvidia-cublas-cu12 download timed out

Retrying once fixed it.
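
Since a plain retry was enough here, the retry can be automated. The following is a minimal sketch of a generic retry wrapper; `retry` and `RETRY_DELAY` are illustrative names, not part of pip.

```shell
# Hypothetical helper: run a command up to MAX_ATTEMPTS times until it succeeds.
retry() {
    max_attempts=$1
    shift
    attempt=1
    while [ "$attempt" -le "$max_attempts" ]; do
        if "$@"; then
            return 0
        fi
        echo "attempt $attempt/$max_attempts failed: $*" >&2
        # Pause only when another attempt is still coming
        if [ "$attempt" -lt "$max_attempts" ]; then
            sleep "${RETRY_DELAY:-5}"
        fi
        attempt=$((attempt + 1))
    done
    return 1
}

# Example (commented out because it starts a large download):
# retry 3 pip install --timeout 60 "xinference[all]"
```

pip's own `--timeout` and `--retries` options cover individual network requests; the wrapper above reruns the whole command, which is what resolved the timeout in this case.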

Pitfall: llama-cpp-python

note: This error originates from a subprocess, and is likely not a problem with pip.
ERROR: Failed building wheel for llama-cpp-python
ERROR: ERROR: Failed to build installable wheels for some pyproject.toml based projects (llama-cpp-python)

Skip xinference[all] and install the other backends individually, leaving out llama-cpp-python for now.

pip install "xinference[transformers]"
pip install "xinference[vllm]"
pip install "xinference[sglang]"

These installed successfully.


CMAKE_ARGS="-DLLAMA_CUBLAS=on" pip install llama-cpp-python

CMake Error at vendor/llama.cpp/CMakeLists.txt:104 (message):
LLAMA_CUBLAS is deprecated and will be removed in the future.

Use GGML_CUDA instead
*** CMake configuration failed
[end of output]


CMAKE_ARGS="-DGGML_CUDA=on" pip install llama-cpp-python


cuda.so: undefined reference to `cudaGetLastError@libcudart.so.12'
collect2: error: ld returned 1 exit status

*** CMake build failed
[end of output]

llama-cpp-python still fails to build. For now I will not use models that need this backend.

Finding the IP address

ifconfig
172.29.225.230

TODO (follow-up): how to configure WSL so that this IP stays fixed.
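
`ifconfig` comes from the net-tools package, which may not be present on every install. A minimal sketch that reads the address from `hostname -I` instead; `first_field` is an illustrative helper, and taking the first address is an assumption that fits a typical WSL setup with a single network interface.

```shell
# Hypothetical helper: print the first whitespace-separated field of stdin.
first_field() {
    awk '{print $1; exit}'
}

# hostname -I prints the addresses assigned to this host, separated by spaces
hostname -I 2>/dev/null | first_field
```

Because the address can change after a WSL restart, looking it up this way before each session avoids hard-coding it.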

Starting Xinference

XINFERENCE_HOME=/***/xinference XINFERENCE_MODEL_SRC=modelscope xinference-local --host 0.0.0.0 --port 9997

Access

From a browser on Windows, opening 172.29.225.230:9997 works directly.
There is no need to launch a browser inside WSL Ubuntu.
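
The same reachability check can be done from a terminal. This is a minimal sketch, assuming the server exposes an OpenAI-style `/v1/models` endpoint that lists running models; `xinference_url`, `XINF_HOST`, and `XINF_PORT` are illustrative names local to this script, and the default address is the one reported by ifconfig above.

```shell
# Hypothetical helper: build a URL from host, port, and path.
xinference_url() {
    echo "http://$1:$2$3"
}

HOST="${XINF_HOST:-172.29.225.230}"   # WSL address; it may differ after a restart
PORT="${XINF_PORT:-9997}"             # port passed to xinference-local

# -f: fail on HTTP errors, -sS: quiet but still show errors
curl -fsS --max-time 5 "$(xinference_url "$HOST" "$PORT" /v1/models)" \
    || echo "Xinference is not reachable at $HOST:$PORT"
```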

Q00 Can a virtual machine access it?
Yes, it can!

Checking GPU usage


(base) ***:~$ nvidia-smi
Mon Jan 20 17:33:08 2025
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 560.35.02              Driver Version: 560.94         CUDA Version: 12.6     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA GeForce RTX 4070 ...    On  |   00000000:01:00.0  On |                  N/A |
|  0%   41C    P8              2W /  220W |    3617MiB /  12282MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+

+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI        PID   Type   Process name                              GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
|    0   N/A  N/A      7792      C   /python3.10                                 N/A      |
+-----------------------------------------------------------------------------------------+