Hardware configuration
Model: Inspur CS5260H2
CPU: Hygon C86 5380 16-core Processor (x86 architecture)
GPU: Tesla T4
RAID card: pm8204
NIC: Wangxun WX1860A2
Command output:
(base) [root@localhost ~]# lshw -c network
*-network:0
description: Ethernet interface
product: WX1860A2 Gigabit Ethernet Controller
vendor: Beijing Wangxun Technology Co., Ltd.
physical id: 0
bus info: pci@0000:01:00.0
logical name: em1
version: 01
serial: 9c:c2:c4:61:7a:df
size: 1Gbit/s
capacity: 1Gbit/s
width: 64 bits
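On a machine with several interfaces the full lshw output gets long. As a rough sketch, the helper below condenses it to one line per interface. The name nic_summary is my own and not part of lshw, and it assumes the product field is printed before the logical name field, as in the output above.

```shell
#!/usr/bin/env bash
# nic_summary (hypothetical helper): read `lshw -c network` text on stdin and
# print "<logical name><TAB><product>" for every interface that has a logical name.
nic_summary() {
  awk -F': ' '
    /^[[:space:]]*\*-network/    { product = "" }               # a new device section starts
    /^[[:space:]]*product:/      { product = $2 }               # remember the product string
    /^[[:space:]]*logical name:/ { printf "%s\t%s\n", $2, product }
  '
}

# Usage on a real machine (needs root for complete output):
#   lshw -c network | nic_summary
```

Reading from stdin keeps the function usable on saved output as well as on a live system.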
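System installation
The early OS installation steps are omitted. Use Ventoy to prepare a bootable USB drive, download the image, copy it onto the drive, then boot from it and install.
System version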
(base) [root@localhost ~]# nkvers
############## Kylin Linux Version #################
Release:
Kylin Linux Advanced Server release V10 (Lance)
Kernel:
4.19.90-52.22.v2207.ky10.x86_64
Build:
Kylin Linux Advanced Server
release V10 (SP3) /(Lance)-x86_64-Build23/20230324
#################################################
(base) [root@localhost ~]#
Kernel version
(base) [root@localhost ~]# uname -a
Linux localhost.localdomain 4.19.90-52.22.v2207.ky10.x86_64 #1 SMP Tue Mar 14 12:19:10 CST 2023 x86_64 x86_64 x86_64 GNU/Linux
(base) [root@localhost ~]# rpm -qa | grep kernel
kernel-modules-extra-4.19.90-52.22.v2207.ky10.x86_64
kernel-tools-4.19.90-52.22.v2207.ky10.x86_64
kernel-core-4.19.90-52.22.v2207.ky10.x86_64
kernel-devel-4.19.90-52.22.v2207.ky10.x86_64
kernel-tools-libs-4.19.90-52.22.v2207.ky10.x86_64
kernel-headers-4.19.90-52.22.v2207.ky10.x86_64
kernel-4.19.90-52.22.v2207.ky10.x86_64
kernel-modules-4.19.90-52.22.v2207.ky10.x86_64
(base) [root@localhost ~]#
Blacklist nouveau and change the boot mode
Edit /lib/modprobe.d/dist-blacklist.conf
# Comment out nvidiafb
#blacklist nvidiafb
# Add the following two lines
blacklist nouveau
options nouveau modeset=0
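The manual edit can also be scripted. Below is a minimal sketch, assuming the file has an active "blacklist nvidiafb" line (as on this system) and ends with a newline. blacklist_nouveau is a hypothetical helper name, and it is written so that running it twice does not duplicate lines.

```shell
#!/usr/bin/env bash
# blacklist_nouveau (hypothetical helper): apply the edits described above
# to the modprobe config file given as $1.
blacklist_nouveau() {
  local conf="$1" tmp line
  tmp=$(mktemp) || return 1
  # Comment out an active "blacklist nvidiafb" line; already-commented lines are left alone.
  sed 's/^[[:space:]]*blacklist[[:space:]]\{1,\}nvidiafb[[:space:]]*$/#&/' "$conf" > "$tmp" || return 1
  # Write back with cat so the original file keeps its owner and permissions.
  cat "$tmp" > "$conf" && rm -f "$tmp"
  # Append each required line only if an identical line is not already present.
  for line in 'blacklist nouveau' 'options nouveau modeset=0'; do
    grep -qxF "$line" "$conf" || printf '%s\n' "$line" >> "$conf"
  done
}

# Usage on a real machine (as root, after backing up the file):
#   blacklist_nouveau /lib/modprobe.d/dist-blacklist.conf
```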
Rebuild the initramfs
mv /boot/initramfs-$(uname -r).img /boot/initramfs-$(uname -r).img.bak
dracut /boot/initramfs-$(uname -r).img $(uname -r)
Change the system boot mode
Check the current default target
systemctl get-default
Switch to command-line (multi-user) mode
systemctl set-default multi-user.target
Reboot
Run reboot
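After the reboot it is worth confirming that nouveau really stayed out of the kernel before moving on. The sketch below checks lsmod-style text for a module named nouveau. nouveau_loaded is my own helper name; it compares only the first column, so modules that merely list nouveau under "Used by" do not count.

```shell
#!/usr/bin/env bash
# nouveau_loaded (hypothetical helper): exit 0 if the lsmod-style text on stdin
# has a module whose name (first column) is exactly "nouveau", exit 1 otherwise.
nouveau_loaded() {
  awk '$1 == "nouveau" { found = 1 } END { exit !found }'
}

# Usage on a real machine:
#   if lsmod | nouveau_loaded; then
#     echo "nouveau is still loaded; recheck the blacklist and the initramfs"
#   else
#     echo "nouveau is not loaded"
#   fi
```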
CUDA dependency setup and installation
Install gcc; skip this step if the system already ships with it
yum install gcc gcc-c++
Check the gcc version
(base) [root@localhost ~]# gcc --version
gcc (GCC) 7.3.0
Copyright © 2017 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
(base) [root@localhost ~]#
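If this check ends up in a script, the major version can be pulled out of the first line of the gcc --version output. This is only a sketch: gcc_major is a hypothetical helper name, and which gcc versions a given CUDA release accepts has to be looked up in NVIDIA's documentation, so no threshold is hard-coded here.

```shell
#!/usr/bin/env bash
# gcc_major (hypothetical helper): read `gcc --version` text on stdin and print
# the major number of the last x.y.z version string found on its first line.
gcc_major() {
  head -n 1 | sed -n 's/.*[[:space:]]\([0-9]\{1,\}\)\.[0-9]\{1,\}\.[0-9]\{1,\}.*/\1/p'
}

# Usage on a real machine:
#   gcc --version | gcc_major
```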
Check the kernel version
(base) [root@localhost ~]# ls /boot | grep vmlinu
vmlinuz-0-rescue-4f9a6ce3aaba4101b848f3a5814fe999
vmlinuz-4.19.90-52.22.v2207.ky10.x86_64
(base) [root@localhost ~]#
(base) [root@localhost ~]# rpm -aq | grep kernel-devel
kernel-devel-4.19.90-52.22.v2207.ky10.x86_64
(base) [root@localhost ~]# rpm -aq | grep kernel
kernel-modules-extra-4.19.90-52.22.v2207.ky10.x86_64
kernel-tools-4.19.90-52.22.v2207.ky10.x86_64
kernel-core-4.19.90-52.22.v2207.ky10.x86_64
kernel-devel-4.19.90-52.22.v2207.ky10.x86_64
kernel-tools-libs-4.19.90-52.22.v2207.ky10.x86_64
kernel-headers-4.19.90-52.22.v2207.ky10.x86_64
kernel-4.19.90-52.22.v2207.ky10.x86_64
kernel-modules-4.19.90-52.22.v2207.ky10.x86_64
(base) [root@localhost ~]#
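If a package is missing, install it with yum, or upgrade the kernel so that all versions match. A mismatch between the running kernel and kernel-devel causes errors when installing CUDA.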
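The version comparison does not have to be done by eye. A minimal sketch, assuming package names follow the kernel-devel-<release> pattern seen in the rpm output above; kernel_devel_matches is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# kernel_devel_matches (hypothetical helper): exit 0 if the package list on stdin
# contains a kernel-devel package built for the kernel release given as $1.
kernel_devel_matches() {
  local release="$1"
  # -x: match the whole line, -F: treat the release string literally (it contains dots).
  grep -qxF "kernel-devel-${release}"
}

# Usage on a real machine:
#   if rpm -qa | kernel_devel_matches "$(uname -r)"; then
#     echo "kernel-devel matches the running kernel"
#   else
#     echo "kernel-devel for $(uname -r) is missing"
#   fi
```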
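CUDA installation
For installing and configuring CUDA by following the official documentation, see my other blog post:
Ubuntu 22.04.4: installing and configuring CUDA 12.5 and cuDNN, detailed official version (CSDN blog)
For this Kylin system I downloaded the CentOS 7 binary package; running the installer is all that is needed.
Archive of previous CUDA versions and their documentation: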
https://developer.nvidia.com/cuda-toolkit-archive
Run the following commands:
wget https://develope