AIX for System Administrators: HA - CAA

The document provides a comprehensive guide on Cluster Aware AIX (CAA), detailing its features, commands, and configurations for creating and managing high availability clusters. It explains the role of CAA in conjunction with PowerHA, the requirements for the cluster repository disk, and the functionality of the AHAFS for event notifications. Additionally, it covers the deadman switch mechanism to protect data integrity during node isolation and includes various commands for cluster management.


AIX for System Administrators

Practical Guide to AIX (and PowerVM, PowerHA, PowerVC, HMC, DevOps ...)


HA - CAA
CAA (Cluster Aware AIX)

CAA is an AIX feature that was introduced in AIX 6.1 TL6 and AIX 7.1 to make creating a cluster easier. CAA is not used as a stand-alone
package; it is used together with PowerHA or with Shared Storage Pools (SSP). It can be seen as a set of commands and services that other
applications (like PowerHA or SSP) can exploit to provide high availability and disaster recovery support. CAA itself does not provide
application monitoring or resource failover; those capabilities are provided by PowerHA, for example. IBM PowerHA, SSP and even RSCT
(Reliable Scalable Cluster Technology) use these built-in AIX clustering capabilities, which were added to simplify the configuration and
management of high availability clusters.

CAA can provide specific events, so that applications can monitor these from any node in the cluster:
Node UP and node DOWN
Network adapter UP and DOWN
Network address change
Disk UP and DOWN
Predefined and user-defined events

CAA needs the following ports on all nodes for network communication:
4098 (for multicast)
6181
16191
42112

Checking CAA related daemons (services):

# lssrc -g caa
Subsystem         Group            PID          Status
 clcomd           caa              6553806      active
 clconfd          caa              6619352      active

clcomd: The cluster communications daemon. Since PowerHA 7.1 it is a CAA service; before PowerHA 7.1 it was part of PowerHA (clcomdES).
The rhosts file used by clcomd is /etc/cluster/rhosts. The old clcomdES rhosts file in the /usr/es/sbin/cluster/etc directory is no
longer used.
clconfd: The clconfd subsystem runs on each node of the cluster. The clconfd daemon wakes up every 10 minutes to synchronize any
necessary cluster changes.
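
The daemon check above can be scripted. The following is only a minimal sketch: caa_daemons_active is a hypothetical helper name (not an AIX command), and it assumes the lssrc output layout shown above, with the subsystem name in the first column and the status in the last one.

```shell
# Hypothetical helper: reads `lssrc -g caa` output on stdin and succeeds
# only if both clcomd and clconfd are reported as active.
caa_daemons_active() {
    awk '
        $1 == "clcomd"  && $NF == "active" { comd = 1 }
        $1 == "clconfd" && $NF == "active" { confd = 1 }
        END { if (comd && confd) exit 0; exit 1 }
    '
}

# Usage on a cluster node (not executed here):
#   lssrc -g caa | caa_daemons_active && echo "CAA daemons are active"
```

The function reads from stdin rather than calling lssrc itself, so it can be tried against saved output as well.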

Starting with AIX 7.1 TL2, upgrading the cluster nodes no longer requires a total cluster outage:
A rolling upgrade of a cluster is done by taking a node offline and upgrading it to the new AIX technology level, while the other nodes
remain active. After a node is upgraded, it is rebooted and brought back online by issuing the clctrl command. This process is repeated
until all the nodes are upgraded. In a mixed cluster environment, CAA maintains compatibility with nodes that are still running prior AIX
levels. New features are not enabled until all the cluster nodes are upgraded to the new technology level.
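
The rolling upgrade sequence can be written down as a per-node plan. This is a sketch under assumptions: rolling_upgrade_plan is a hypothetical helper that only prints the steps and does not run them, the upgrade step itself is site-specific and is left as a comment line, and the clctrl -stop / -start forms with -n and -m are the ones listed in the Commands section of this page.

```shell
# Hypothetical helper: prints (does not execute) the rolling upgrade steps.
# $1 = cluster name, remaining arguments = node names, one node at a time.
rolling_upgrade_plan() {
    cluster=$1
    shift
    for node in "$@"; do
        echo "clctrl -stop -n $cluster -m $node"
        echo "# upgrade AIX on $node to the new technology level, then reboot"
        echo "clctrl -start -n $cluster -m $node"
    done
}

# Usage (prints six lines for a two-node cluster):
#   rolling_upgrade_plan mycluster nodeA nodeB
```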

=============================

CAA Repository disk

The cluster repository disk stores the cluster configuration data. It provides a central repository, so this disk must be accessible from
all nodes in the cluster. A disk size of 10 GB is preferred (1 GB may also be enough). This disk cannot be used for application storage
or any other purpose, and the use of LVM commands (mkvg, mklv...) is not supported with the cluster repository disk: the AIX LVM commands
are single-node administrative commands and are not applicable in a clustered configuration. (The cluster repository disk must be
compliant with the 512 byte block size.)

CAA stores the repository disk related information in the ODM CuAt, as part of the cluster information.

# odmget CuAt | grep -p cluster

CuAt:
        name = "cluster0"
        attribute = "node_uuid"
        value = "52a6b8be-fff8-11e5-8e37-56a1a7627864"
        type = "R"
        generic = "DU"
        rep = "s"
        nls_index = 3

CuAt:
        name = "cluster0"
        attribute = "clvdisk"
        value = "d7063c81-3f64-b5f7-d82b-fa8ed99bfe61"
        type = "R"
        generic = "DU"
        rep = "s"
        nls_index = 2

If this ODM entry is missing (which can cause a node to fail to join the cluster), it can be repopulated, and the node forced to join
the cluster, with the clusterconf command: clusterconf -r hdisk# (hdisk# is the repository disk)
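
To check quickly whether the repository disk entry is present, a single attribute can be filtered out of the ODM stanzas. This is a minimal sketch: cuat_value is a hypothetical helper (not an AIX command), and it assumes the stanza layout shown above, where each "attribute" line is followed by its "value" line and the value contains no spaces.

```shell
# Hypothetical helper: reads odmget CuAt stanzas on stdin and prints the
# value that belongs to the attribute named in $1 (for example clvdisk).
cuat_value() {
    awk -v attr="$1" '
        $1 == "attribute" { gsub(/"/, "", $3); current = $3 }
        $1 == "value" && current == attr { gsub(/"/, "", $3); print $3; exit }
    '
}

# Usage on a cluster node (not executed here):
#   odmget -q "name=cluster0" CuAt | cuat_value clvdisk
```

An empty result would suggest that the entry is missing and that the clusterconf step described above may be needed.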

=============================

/aha (AHAFS)

Nodes that belong to a CAA cluster use the common Autonomic Health Advisor File System (AHAFS) for event notification. AHAFS is a pseudo
file system used for synchronized information exchange, and it is implemented as an AIX kernel extension. AHAFS is mounted on /aha. It
can monitor predefined and user-defined system events and automatically notifies registered users or processes when one of the following
types of events occurs:
- Modification of content of a file
- Usage of a file system that exceeds a user-defined threshold
- Death of a process
- Change in the value of a kernel tunable parameter

=============================

Creating a cluster

The command mkcluster is used for creating a CAA cluster:

# mkcluster -n mycluster -m nodeA,nodeB -r hdisk1 -d hdisk2,hdisk3

The name of the cluster is mycluster, the nodes are nodeA and nodeB, the repository disk is hdisk1 and the shared disks are hdisk2 and
hdisk3. When the cluster is ready a special volume group (caavg_private), new logical volumes and filesystems are created.

The following happens after issuing the mkcluster command:

- The cluster configuration is written to the cluster repository disk.
- A special volume group (caavg_private), logical volumes and filesystems are created on the cluster repository disk.
- Cluster services are made available to other applications like RSCT or PowerHA.
- Additional storage related tasks...
- A clusterwide multicast address is established.
- The node discovers and monitors the available communication interfaces.
- The cluster interacts with the Autonomic Health Advisor File System (AHAFS) for clusterwide event distribution and makes messages
available to PowerHA, RSCT...

CAA uses IP-based network communication and storage interface communication through Fibre Channel. When both types of communication are
used, every node has more than one path to reach the other nodes in the cluster, which guards against "split brain" incidents. If a node
cannot communicate with the others, the DMS (Dead Man Switch) timers are triggered.

A deadman switch is the action taken when Cluster Aware AIX (CAA) detects that a node has become isolated. This occurs when nodes can no
longer communicate via the network and the repository disk. Depending on the deadman switch setting (the deadman_mode tunable), the AIX
operating system reacts differently. DMS monitors IO traffic, process health etc. for a specific time (node_timeout), and after the
timeout it can force a system shutdown or generate an Autonomic Health Advisor File System (AHAFS) event.

=============================

Deadman switch (DMS)

A deadman switch is the action taken when CAA detects that a node has become isolated. This occurs when nodes can no longer communicate
via the network and the repository disk. The purpose of the DMS is to protect the data on the external disks. The AIX operating system
reacts differently depending on the DMS (deadman_mode) tunable, which can be set to force a system crash or to generate an AHAFS event.

# clctrl -tune -L deadman_mode              <--check the current setting (clctrl -tune -h deadman_mode gives more details)
NAME                      DEF    MIN    MAX    UNIT           SCOPE
     ENTITY_NAME(UUID)                                                CUR
--------------------------------------------------------------------------------
deadman_mode              a                                   c n
     caa_cl(25ebea90-784a-11e1-a79d-b6fcc11f846f)                     a
--------------------------------------------------------------------------------

When the value is set to "a" (assert), the node will crash upon the deadman timer popping.
When the value is set to "e" (event), an AHAFS event is generated.

By default, the CAA deadman_mode option is “a”. If the deadman timeout is reached, the node crashes immediately to prevent a partitioned
cluster and data corruption.
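
Reading the current value out of the tunable listing can be scripted. This is a sketch under assumptions: deadman_mode_current and describe_deadman_mode are hypothetical helper names, and the parsing assumes the layout shown above, where the entity line has the form name(uuid) and ends with the CUR value.

```shell
# Hypothetical helper: reads `clctrl -tune -L deadman_mode` output on stdin
# and prints the CUR value (last field) of every entity line "name(uuid)".
deadman_mode_current() {
    awk '$1 ~ /\([0-9a-f-]+\)$/ { print $NF }'
}

# Hypothetical helper: maps a deadman_mode value to the behaviour described above.
describe_deadman_mode() {
    case $1 in
        a) echo "assert: the node crashes when the deadman timer expires" ;;
        e) echo "event: an AHAFS event is generated" ;;
        *) echo "unknown deadman_mode: $1" ;;
    esac
}

# Usage on a cluster node (not executed here):
#   describe_deadman_mode "$(clctrl -tune -L deadman_mode | deadman_mode_current)"
```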

=============================

Commands:

/var/adm/ras/syslog.caa                   CAA log (in syslog.conf: caa.info /var/adm/ras/syslog.caa rotate size 1m files 10)

lscluster -i                              lists interface configuration of the cluster
lscluster -i | egrep 'Node|Interface'     overview of the cluster, all interfaces (network and disk heartbeat)
lscluster -m                              lists info about nodes in the cluster
lscluster -d                              lists disks in the cluster and their status
lscluster -s                              lists network statistics of the cluster
lscluster -c                              shows info about the cluster configuration

mkcluster                                 create a cluster
chcluster                                 change a cluster configuration
rmcluster                                 remove a cluster configuration
clcmd                                     run a command on all nodes of a cluster (clcmd date: it shows the date on all nodes)

lsattr -El cluster0                       lists IDs of cluster, disks

clctrl -stop -n mycluster -a              stop cluster on all nodes (stop cluster on 1 node: clctrl -stop -n mycluster -m myserver1)
clctrl -start -n mycluster -a             start cluster on all nodes (after completing maintenance)
clctrl -tune -L                           lists CAA related tunables (values stored in repository disk)
clctrl -tune -o <tunable>=<value>         modifies a tunable across the cluster (new value will be active at the next start)

=============================

If you want to use the force option with CAA commands (this is not the -f flag), the environment variable CAA_FORCE_ENABLED has to be set to 1:
# export CAA_FORCE_ENABLED=1
# rmcluster -r hdisk2

(Using force with rmcluster will remove the repository disk and ignore all errors.)

=============================


3 comments:
Anonymous said...
One question: how do you vary off the "caavg_private" VG without messing up the cluster? I need to update SDDPCM and I am not sure
whether removing the cluster would be a good choice.
February 20, 2018 at 11:24 AM

Satpal said...
You can vary off CAA (caavg_private) using 'clctrl -stop -n CLUSTERNAME -m NODENAME'.
Use "clctrl -start -n CLUSTERNAME -m NODENAME" to vary it on again. (Try this command from the other node of the cluster if you are
facing problems with the varyon on the problematic node.)
July 28, 2018 at 10:42 AM
