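Environment: CentOS 7.9
Download the software package from the official website.
Preliminary steps: passwordless (key-based) SSH access between the nodes, and an NFS share.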
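Before installing, it can help to confirm that key-based login really works from the master to every node. The following is a minimal sketch, assuming OpenSSH is the SSH client; the function name `check_passwordless_ssh` is hypothetical, and `node01` in the usage line is a placeholder host name that does not come from this guide.

```shell
# Hypothetical helper: report which hosts accept key-based SSH logins.
check_passwordless_ssh() {
    local host failed=0
    for host in "$@"; do
        # BatchMode=yes makes ssh fail instead of prompting for a password.
        if ssh -o BatchMode=yes -o ConnectTimeout=5 "$host" true 2>/dev/null; then
            echo "$host: ok"
        else
            echo "$host: FAILED"
            failed=1
        fi
    done
    return "$failed"
}
```

Usage: `check_passwordless_ssh master01 node01`. A non-zero return status means at least one host still asks for a password or is unreachable.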
Upload the package with Xftp:
Create the installation directories and extract the package:
[root@master01 lsf10.1_x86_lnx310]# mkdir /share01/install
[root@master01 lsf10.1_x86_lnx310]# mkdir /share01/app
[root@master01 install]# tar xf lsf10.1_x86_lnx310.tar
Create the administrator user:
[root@master01 lsf10.1_x86_lnx310]# useradd lsfadmin -u 1500
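The fixed UID is presumably chosen so that lsfadmin owns the same files on every node: on an NFS share, ownership is typically resolved by numeric UID. A minimal sketch for checking this on each node, assuming the standard `id` utility; the function name `check_uid` is hypothetical.

```shell
# Hypothetical helper: verify that a user exists with the expected numeric UID.
check_uid() {
    local user="$1" expected="$2" actual
    if ! actual=$(id -u "$user" 2>/dev/null); then
        echo "$user: no such user"
        return 1
    fi
    if [ "$actual" != "$expected" ]; then
        echo "$user: uid $actual, expected $expected"
        return 1
    fi
    echo "$user: uid $actual ok"
}
```

Usage: run `check_uid lsfadmin 1500` on every node that mounts the share.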
[root@master01 install]# cd lsf10.1_x86_lnx310/
[root@master01 lsf10.1_x86_lnx310]# ls
lsf10.1_lnx310-lib217-x86_64.tar.Z lsf_std_entitlement.dat
lsf10.1_lsfinstall_linux_x86_64.tar
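The three files listed above are the ones the later steps rely on (the binary tarball, the installer tarball, and the entitlement file). A minimal sketch that checks they are all present before continuing; the function name `check_lsf_package` is hypothetical.

```shell
# Hypothetical helper: confirm the extracted package directory contains the
# three files shown in the listing above.
check_lsf_package() {
    local dir="$1" f missing=0
    for f in lsf10.1_lnx310-lib217-x86_64.tar.Z \
             lsf10.1_lsfinstall_linux_x86_64.tar \
             lsf_std_entitlement.dat; do
        if [ ! -f "$dir/$f" ]; then
            echo "missing: $f"
            missing=1
        fi
    done
    return "$missing"
}
```

Usage: `check_lsf_package /share01/install/lsf10.1_x86_lnx310`.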
[root@master01 lsf10.1_x86_lnx310]# yum install ed
[root@master01 lsf10.1_x86_lnx310]# tar xf lsf10.1_lsfinstall_linux_x86_64.tar
[root@master01 lsf10.1_x86_lnx310]# cd lsf10.1_lsfinstall
[root@master01 lsf10.1_lsfinstall]# ls
aws_enable.config instlib patchlib scripts
aws_enable.sh lap pversions slave.config
conf_tmpl lsfinstall README
hostsetup lsf_unix_install.pdf rhostsetup
install.config patchinstall rpm
[root@master01 lsf10.1_lsfinstall]# vi install.config
LSF_TOP="/share01/app/lsf"
LSF_ADMINS="lsfadmin"
LSF_CLUSTER_NAME="nju_cluster1"
LSF_MASTER_LIST="master01"
LSF_ENTITLEMENT_FILE="/share01/install/lsf10.1_x86_lnx310/lsf_std_entitlement.dat"
LSF_TARDIR="/share01/install/lsf10.1_x86_lnx310"
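The shipped install.config is mostly commented-out templates, so it is easy to leave a key inactive by accident. A minimal sketch that checks the six keys set in this walkthrough are active assignments; the function name `check_install_config` is hypothetical, and the key list simply mirrors the lines above rather than any official list of required parameters.

```shell
# Hypothetical helper: report which of the keys used in this walkthrough are
# missing (or still commented out) in an install.config file.
check_install_config() {
    local file="$1" key missing=0
    for key in LSF_TOP LSF_ADMINS LSF_CLUSTER_NAME LSF_MASTER_LIST \
               LSF_ENTITLEMENT_FILE LSF_TARDIR; do
        # Match only active assignments such as KEY="value", not "# KEY=...".
        if ! grep -Eq "^[[:space:]]*${key}=" "$file"; then
            echo "not set: $key"
            missing=1
        fi
    done
    return "$missing"
}
```

Usage: `check_install_config install.config` from the lsf10.1_lsfinstall directory, before running lsfinstall.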
[root@master01 lsf10.1_lsfinstall]# ./lsfinstall -f install.config
Set the environment variables:
[root@master01 conf]# pwd
/share01/app/lsf/conf
[root@master01 conf]# source profile.lsf
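Sourcing profile.lsf only affects the current shell, so a new login session needs it again. A minimal sketch for checking whether the environment is loaded, assuming profile.lsf exports `LSF_ENVDIR` and `LSF_BINDIR` as it does in typical LSF installations; the function name `lsf_env_loaded` is hypothetical.

```shell
# Hypothetical helper: check whether profile.lsf has been sourced in this
# shell by looking at two variables it is expected to set.
lsf_env_loaded() {
    if [ -n "${LSF_ENVDIR:-}" ] && [ -n "${LSF_BINDIR:-}" ]; then
        echo "LSF environment loaded (LSF_ENVDIR=$LSF_ENVDIR)"
    else
        echo "LSF environment not loaded; source profile.lsf first"
        return 1
    fi
}
```

Usage: `lsf_env_loaded || source /share01/app/lsf/conf/profile.lsf`.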
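Run the lsfstartup command to start the cluster:
The lsfstartup command uses rsh to connect to every node in the cluster and start LSF. If rsh is not configured in your environment, you can make lsfstartup use SSH instead by adding the following line to the lsf.conf file:
Append to the end of the lsf.conf file: LSF_RSH=ssh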
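Editing lsf.conf by hand works, but a scripted append is easier to repeat safely. A minimal sketch that adds the line only when no `LSF_RSH` setting exists yet; the function name `ensure_lsf_rsh` is hypothetical, and the sketch assumes the file already ends with a newline.

```shell
# Hypothetical helper: append LSF_RSH=ssh to an lsf.conf file unless an
# LSF_RSH line is already present, so repeated runs do not duplicate it.
ensure_lsf_rsh() {
    local conf="$1"
    if grep -q '^LSF_RSH=' "$conf"; then
        echo "LSF_RSH already set in $conf"
    else
        echo 'LSF_RSH=ssh' >> "$conf"
        echo "added LSF_RSH=ssh to $conf"
    fi
}
```

Usage: `ensure_lsf_rsh /share01/app/lsf/conf/lsf.conf`.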
****The following enables start at boot (decide whether this is appropriate for a production environment)****
[root@master01 install]# ./hostsetup --top="/share01/app/lsf/" --boot="y"
Logging installation sequence in /share01/app/lsf/log/Install.log
------------------------------------------------------------
L S F H O S T S E T U P U T I L I T Y
------------------------------------------------------------
This script sets up local host (LSF server, client or slave) environment.
Setting up LSF server host "master01" ...
Checking LSF installation for host "master01" ... Done
Created symlink from /etc/systemd/system/multi-user.target.wants/lsfd.service to /usr/lib/systemd/system/lsfd.service.
Installing LSF RC scripts on host "master01" ... Done
LSF service ports are defined in /share01/app/lsf/conf/lsf.conf.
Checking LSF service ports definition on host "master01" ... Done
You are installing IBM Spectrum LSF - Standard Edition.
... Setting up LSF server host "master01" is done
... LSF host setup is done.
****With start at boot enabled, the status of the service on the current node can be checked with the following command****
[root@master01 install]# systemctl status lsfd.service
● lsfd.service - IBM Spectrum LSF
Loaded: loaded (/usr/lib/systemd/system/lsfd.service; enabled; vendor preset: disabled)
Active: inactive (dead)
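In the output above the unit is enabled but inactive: hostsetup registered it for boot without starting it. A minimal sketch that condenses this into one line using the `systemctl is-enabled` and `systemctl is-active` subcommands; the function name `unit_summary` is hypothetical.

```shell
# Hypothetical helper: summarize whether a systemd unit is enabled at boot and
# whether it is currently running.
unit_summary() {
    local unit="$1" enabled active
    # Both subcommands print a one-word state and exit non-zero for states
    # such as "disabled" or "inactive", hence the "|| true".
    enabled=$(systemctl is-enabled "$unit" 2>/dev/null || true)
    active=$(systemctl is-active "$unit" 2>/dev/null || true)
    echo "$unit: enabled=${enabled:-unknown} active=${active:-unknown}"
}
```

Usage: `unit_summary lsfd.service`.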
****The following command starts LSF on all nodes****
[root@master01 install]# lsfstartup
Starting up all LIMs ...
Do you really want to start up LIM on all hosts ? [y/n]y
Start up LIM on <master01> ...... Warning: Permanently added 'master01,192.168.10.30' (ECDSA) to the list of known hosts.
done
Waiting for Master LIM to start up ... Master LIM is ok
Starting up all RESes ...
Do you really want to start up RES on all hosts ? [y/n]y
Start up RES on <master01> ...... Warning: Permanently added 'master01,192.168.10.30' (ECDSA) to the list of known hosts.
done
Starting all slave daemons on LSBATCH hosts ...
Do you really want to start up slave batch daemon on all hosts ? [y/n] y
Start up slave batch daemon on <master01> ...... Warning: Permanently added 'master01,192.168.10.30' (ECDSA) to the list of known hosts.
done
Done starting up LSF daemons on the local LSF cluster ...
****The bhosts command shows the status of the nodes****
[root@master01 install]# bhosts
HOST_NAME STATUS JL/U MAX NJOBS RUN SSUSP USUSP RSV
master01 ok - 4 0 0 0 0 0
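With more than a handful of hosts, scanning the STATUS column by eye gets tedious. A minimal sketch that filters the output for hosts whose status is anything other than `ok`, assuming the column layout shown above; the function name `hosts_not_ok` is hypothetical.

```shell
# Hypothetical helper: read `bhosts` output on stdin and print the name and
# status of every host whose STATUS column is not "ok".
hosts_not_ok() {
    # NR > 1 skips the header line; $1 is HOST_NAME and $2 is STATUS.
    awk 'NR > 1 && $2 != "ok" { print $1 " " $2 }'
}
```

Usage: `bhosts | hosts_not_ok`. Empty output means every listed host is in the `ok` state.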
****The following command starts LSF on the current node only****
[root@master01 install]# lsf_daemons start
Starting the LSF subsystem
****The following command shows the daemon status on the current node****
[root@master01 install]# lsf_daemons status
Show status of the LSF subsystem
lim (pid 17226) is running...
res (pid 17232) is running...
sbatchd (pid 17236) is running...
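The status output above can also be checked from a script, for example in a health check. A minimal sketch that looks for an "is running" line for each of the three daemons, assuming the output format shown above; the function name `daemons_running` is hypothetical.

```shell
# Hypothetical helper: read `lsf_daemons status` output on stdin and check
# that lim, res and sbatchd each have an "is running" line.
daemons_running() {
    local status daemon rc=0
    status=$(cat)
    for daemon in lim res sbatchd; do
        if printf '%s\n' "$status" | grep -Eq "^${daemon} \(pid [0-9]+\) is running"; then
            echo "$daemon: running"
        else
            echo "$daemon: NOT running"
            rc=1
        fi
    done
    return "$rc"
}
```

Usage: `lsf_daemons status | daemons_running`. The return status is non-zero if any of the three daemons is not reported as running.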