Using Pacemaker

鲍飞大剑

Last modified: 2022-07-06 11:45:50

License: CC 4.0 BY-SA

Tags: centos, linux, server

First published: 2022-07-06 11:37:40

Article link: https://blog.csdn.net/qq_27164239/article/details/125635753

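This article introduces Pacemaker, a cluster resource manager that maximizes the availability of cluster services and offers features such as failure detection and recovery and storage independence. It also walks through installing and configuring the cluster software, creating an active/passive cluster, and setting resource stickiness, in order to keep data safe and resources running stably.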


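What is Pacemaker

Pacemaker is a cluster resource manager. It uses resource-level monitoring and recovery to maximize the availability of cluster services (aka resources). It can use the cluster infrastructure you prefer (Corosync or Heartbeat) for messaging and membership. Pacemaker has the following key features:
1. Detection of and recovery from node-level and service-level failures
2. Storage agnostic: no shared storage is required
3. Resource agnostic: anything that can be controlled by a script can be run as a service
4. Supports STONITH to ensure data integrity
5. Supports large and small clusters
6. Supports both quorate and resource-driven clusters
7. Supports practically any redundancy configuration
8. Automatically synchronizes the configuration across all nodes
9. Cluster-wide ordering, colocation, and anti-colocation can be specified
10. Supports advanced service types
  1. Clones: for services that need to run on multiple nodes
  2. Multi-state: for services with multiple modes (e.g. master/slave, primary/secondary)
11. Unified, scriptable cluster shell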

Install the cluster software

yum makecache fast
yum install -y pacemaker pcs psmisc policycoreutils-python

Configure the cluster software

On each node, allow cluster-related services through the local firewall

firewall-cmd --permanent --add-service=high-availability
firewall-cmd --reload

Enable the pcs daemon

systemctl start pcsd.service 
systemctl enable pcsd.service 
passwd hacluster

Configure Corosync

On any node, use the pcs cluster auth command to authenticate to the nodes as the hacluster user

[root@centos-101 ~]# pcs cluster  auth centos-101 centos-minion4
Username: hacluster
Password: 
centos-minion4: Authorized
centos-101: Authorized
Then, on the same machine, use the pcs cluster setup command to generate and synchronize the Corosync configuration

[root@centos-101 ~]# pcs cluster setup --name mycluster centos-101 centos-minion4
Destroying cluster on nodes: centos-101, centos-minion4...
centos-minion4: Stopping Cluster (pacemaker)...
centos-101: Stopping Cluster (pacemaker)...
centos-minion4: Successfully destroyed cluster
centos-101: Successfully destroyed cluster

Sending cluster config files to the nodes...
centos-101: Succeeded
centos-minion4: Succeeded

Synchronizing pcsd certificates on nodes centos-101, centos-minion4...
centos-minion4: Success
centos-101: Success

Restarting pcsd on the nodes in order to reload the certificates...
centos-minion4: Success
centos-101: Success

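With the --start option, setup also starts the cluster right away. The following output comes from a different two-node cluster (ceph1, ceph2):

[root@ceph1 ~]# pcs cluster setup --start --name ha-test ceph1 ceph2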
Shutting down pacemaker/corosync services...
Redirecting to /bin/systemctl stop  pacemaker.service
Redirecting to /bin/systemctl stop  corosync.service
Killing any remaining services...
Removing all cluster configuration files...
ceph1: Succeeded
ceph2: Succeeded
Starting cluster on nodes: ceph1, ceph2...
ceph2: Starting Cluster...
ceph1: Starting Cluster...
Synchronizing pcsd certificates on nodes ceph1, ceph2...
ceph1: Success
ceph2: Success

Restaring pcsd on the nodes in order to reload the certificates...
ceph1: Success
ceph2: Success

Build and configure the cluster

Start the cluster

[root@centos-101 svn_repo]# pcs cluster start --all
centos-minion4: Starting Cluster...
centos-101: Starting Cluster...
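
The authorization, setup, and start steps above can be collected into one small script. The following is only a minimal sketch, assuming the pcs 0.9 command syntax shown in this article. The `run` and `bootstrap_cluster` helpers and the `DRY_RUN` variable are hypothetical conveniences introduced here and are not part of pcs.

```shell
#!/bin/sh
# Sketch: bootstrap a cluster with the pcs commands used in this article.
# `run`, `bootstrap_cluster` and DRY_RUN are illustrative, not part of pcs.

# Print the command when DRY_RUN=1, otherwise execute it.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Usage: bootstrap_cluster CLUSTER_NAME NODE [NODE...]
bootstrap_cluster() {
    name=$1
    shift
    # Prompts for the hacluster user name and password.
    run pcs cluster auth "$@" || return 1
    # Generates the corosync configuration and syncs it to the nodes.
    run pcs cluster setup --name "$name" "$@" || return 1
    run pcs cluster start --all
}

# Example (dry run, only prints the three commands):
#   DRY_RUN=1; bootstrap_cluster mycluster centos-101 centos-minion4
```

As the setup output above shows, pcs cluster setup first destroys any existing cluster configuration on the listed nodes, so a dry run is worth doing before running this for real.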

Verify the Corosync installation

First, use corosync-cfgtool to check whether cluster communication is healthy:

[root@centos-101 svn_repo]# corosync-cfgtool -s
Printing ring status.
Local node ID 1
RING ID 0
	id	= 127.0.0.1
	status	= ring 0 active with no faults

If id is 127.0.0.1, edit the hosts file and change the IP that the hostname maps to into the machine's actual IP. Afterwards the output should look like this:
[root@centos-101 svn_repo]# corosync-cfgtool -s
Printing ring status.
Local node ID 1
RING ID 0
	id	= 192.168.56.101
	status	= ring 0 active with no faults
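
The loopback check described above is easy to automate. This is a small sketch, assuming the `corosync-cfgtool -s` output format shown in this article; the function name `ring_on_loopback` is made up for the example.

```shell
#!/bin/sh
# Sketch: detect a ring bound to the loopback address in the output of
# `corosync-cfgtool -s`. The function name is invented for this example.

# Reads `corosync-cfgtool -s` output on stdin.
# Returns 0 and prints a hint if any ring id starts with 127., else returns 1.
ring_on_loopback() {
    if grep -Eq '^[[:space:]]*id[[:space:]]*=[[:space:]]*127\.'; then
        echo "ring bound to loopback: map the hostname to the real IP in /etc/hosts"
        return 0
    fi
    return 1
}

# Example:
#   corosync-cfgtool -s | ring_on_loopback
```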
Next, check the membership and quorum APIs:
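
The commands for this step are not shown in the text. The sketch below assumes the two commands that the Clusters from Scratch guide (linked at the end of this article) uses for this check, `corosync-cmapctl` and `pcs status corosync`; the wrapper function `check_membership` is hypothetical.

```shell
#!/bin/sh
# Sketch: show cluster membership and quorum information.
# The wrapper function is illustrative; the two tools ship with
# corosync and pcs respectively.

check_membership() {
    # Fail early with a clear message if a tool is not installed.
    for tool in corosync-cmapctl pcs; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "missing tool: $tool" >&2
            return 127
        fi
    done
    # Member entries from the corosync runtime database.
    corosync-cmapctl | grep members
    # Membership and quorum summary as seen by pcs.
    pcs status corosync
}

# Example:
#   check_membership
```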

Verify the Pacemaker installation

Finally, make sure there were no startup errors:

[root@centos-minion4 ~]# journalctl | grep -i error
If there are STONITH-related errors, ignore them for now, because no resources have been added yet.
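
To see only the errors that still need attention at this stage, the STONITH lines can be filtered out. A minimal sketch; the function name `non_stonith_errors` is invented here, and hiding STONITH messages only makes sense until fencing is configured.

```shell
#!/bin/sh
# Sketch: list error lines from the journal while skipping STONITH ones.
# The function name is invented for this example.

# Reads log lines on stdin and prints those that contain "error"
# (case-insensitive) but do not mention STONITH.
non_stonith_errors() {
    grep -i 'error' | grep -vi 'stonith'
}

# Example:
#   journalctl | non_stonith_errors
```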

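Create an active/passive cluster
Resource stickiness (resource-stickiness) determines how strongly a resource prefers to stay on the node where it is currently running. The value ranges and their effects are:
0: This is the default. The resource is placed at the most suitable location in the system, which means it may be moved when a "better" or less loaded node becomes available. This is almost equivalent to automatic failback, except that the resource may move to a node other than the one it was previously active on.
Greater than 0: The resource prefers to stay where it is, but will move if a more suitable node becomes available. The higher the value, the stronger the preference to stay.
Less than 0: The resource prefers to move away from its current location. The higher the absolute value, the stronger the preference to leave.
INFINITY: The resource always stays where it is unless it is forced off because the node is no longer eligible to run it (node shutdown, node standby, reaching the migration-threshold, or a configuration change). This is almost equivalent to completely disabling automatic failback.
-INFINITY: The resource always moves away from its current location.
Here we can set a default stickiness value for resources as follows (crm shell syntax): rsc_defaults resource-stickiness=100
(I still don't quite get it...)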
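
The value ranges above can be restated as a small classifier, which may make the rules easier to remember. This is only an illustration of the semantics described in the text; `describe_stickiness` is a made-up helper and does not talk to the cluster.

```shell
#!/bin/sh
# Sketch: restate the resource-stickiness value ranges as a case statement.
# `describe_stickiness` is a made-up helper; it does not query Pacemaker.

# Usage: describe_stickiness VALUE
describe_stickiness() {
    case $1 in
        INFINITY)   echo "always stays unless forced off the node" ;;
        -INFINITY)  echo "always moves away" ;;
        0)          echo "placed on the most suitable node (default)" ;;
        # Empty input, a bare "-", non-digit characters, or a misplaced "-".
        ''|-|*[!0-9-]*|?*-*)
                    echo "not a valid stickiness value" >&2
                    return 1 ;;
        -*)         echo "prefers to move away" ;;
        *)          echo "prefers to stay" ;;
    esac
}

# Example:
#   describe_stickiness 100
```

With pcs 0.9, the counterpart of the crm shell line above would be `pcs resource defaults resource-stickiness=100`; this is an assumption based on the pcs syntax of that era and may differ in newer pcs releases.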

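Set the default resource stickiness
1. Explore the existing configuration
  
  When Pacemaker starts, it automatically records the number and details of the nodes in the cluster, as well as which stack is being used and the version of Pacemaker in use.
  
  The same information is also available in XML format.
  
  To keep data safe, STONITH is enabled by default in Pacemaker. However, Pacemaker also knows when no STONITH configuration has been supplied and reports this as a problem.
  
  For now, disable STONITH and configure it later.
2. Add a resource
  
  Our first resource is a unique IP address that the cluster can bring up on any node. Regardless of where the cluster services are running, users need a consistent address to reach them. We choose 192.168.56.99 as the floating address, give it the name ClusterIP, and tell the cluster to check every 30 seconds that it is running.
  
  As it turns out, cidr_netmask is optional. The IP must be in the same subnet as the network interface, otherwise creating the resource fails...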
  
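  Resource stickiness...
  
  ----------------------------------------------------------------------------------
  
  After node1.test.com starts up normally again, the cluster resource vip will quite likely move back from node2.test.com to node1.test.com, although it may also stay put. Every such move between nodes makes the resource unreachable for a short time, so after a resource has failed over because of a node failure, we sometimes want to prevent it from moving back even once the original node has recovered. This is done by defining the resource's stickiness, which can be specified either when the resource is created or afterwards. Now, a brief recap of resource stickiness.
  
  (7). Resource stickiness
  
  Resource stickiness expresses how strongly a resource prefers to stay on the node where it is currently running.
  
  The stickiness value ranges and their effects are the ones listed under the active/passive cluster heading above.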
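
The commands for the steps in this section are not shown in the text. The sketch below prints one plausible sequence so that it can be reviewed before anything is executed. The commands are assumptions taken from the pcs 0.9 syntax used in the Clusters from Scratch guide for Pacemaker 1.1; the `config_steps` function is hypothetical, and the IP defaults to the 192.168.56.99 floating address chosen above.

```shell
#!/bin/sh
# Sketch: print the configuration commands for this section for review.
# `config_steps` is a made-up helper; the pcs and crm_verify invocations
# are assumed from pcs 0.9 era documentation and may differ in newer versions.

# Usage: config_steps [FLOATING_IP]
config_steps() {
    ip=${1:-192.168.56.99}
    cat <<EOF
# 1. Explore the existing configuration
pcs status
pcs cluster cib
crm_verify -L -V
pcs property set stonith-enabled=false
# 2. Add the floating IP, monitored every 30 seconds (cidr_netmask omitted)
pcs resource create ClusterIP ocf:heartbeat:IPaddr2 ip=$ip op monitor interval=30s
# Keep resources where they are after a failed node recovers
pcs resource defaults resource-stickiness=100
EOF
}

# Example: review the plan first, then execute it on a cluster node.
#   config_steps
#   config_steps | sh
```

Printing the plan instead of executing it directly is a deliberate choice here: disabling STONITH is meant to be temporary, so it is worth reading the commands before piping them to a shell.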
Source
http://clusterlabs.org/doc/zh-CN/Pacemaker/1.1/html-single/Clusters_from_Scratch/index.html#_install_the_cluster_software
