Config files for Corosync and Pacemaker
/etc/corosync/corosync.conf – config file for corosync cluster membership and quorum
/var/lib/pacemaker/crm/cib.xml – config file for cluster nodes and resources
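To check which nodes a corosync.conf defines, the nodelist section can be read with standard tools. The sketch below assumes the usual layout, in which each node block carries one "ring0_addr:" line. The function name list_corosync_nodes is made up for this example.

```shell
# Print the ring0_addr value of every node defined in a corosync.conf.
# Assumption: one "ring0_addr: <address>" line per node block.
list_corosync_nodes() {
    conf="${1:-/etc/corosync/corosync.conf}"
    awk '$1 == "ring0_addr:" { print $2 }' "$conf"
}

# Usage (on a cluster node):
#   list_corosync_nodes
#   list_corosync_nodes /path/to/a/copy/of/corosync.conf
```

The file path is passed as an argument so that the function can also be pointed at a backup copy of the file.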
Log files
/var/log/cluster/corosync.log
/var/log/pacemaker.log
/var/log/pcsd/pcsd.log
/var/log/messages – used for some other services including crmd and pengine etc.
Pacemaker Cluster Resources and Resource Groups
A cluster resource refers to any object or service which is managed by the Pacemaker cluster.
A number of different resources are defined by Pacemaker:
Primitive: this is the basic resource managed by the cluster.
Clone: a resource which can run on multiple nodes simultaneously.
MultiState or Master/Slave: a resource in which one instance serves as master and the other as slave. A common example of this is DRBD.
Resource Group: this is a set of primitives or clones which is used to group resources together for easier admin.
Resource Classes:
OCF or Open Cluster Framework: this is the most commonly used resource class for Pacemaker clusters
Service: used for implementing systemd, upstart, and lsb commands
Systemd: used for systemd commands
Fencing: used for Stonith fencing resources
Nagios: used for Nagios plugins
LSB or Linux Standard Base: these are for the older Linux init script operations. Now deprecated
Resource stickiness: this is a preference for keeping a resource on the node where it is currently running, even after a node that previously hosted it has failed and been brought back into the cluster. Setting it is advised, since migrating healthy resources between nodes should generally be avoided.
Constraints
Constraints: A set of rules that sets out how resources or resource groups should be started.
Constraint Types:
Location: A location constraint defines on which node a resource should run – or not run, if the score is set to -INFINITY.
Colocation: A colocation constraint defines which resources should be started together – or not started together in the case of -INFINITY.
Order: Order constraints define in which order resources should be started. This is to allow for pre-conditional services to be started first.
Resource Order Priority Scores:
These are used with the constraint types above.
The priority score can be set to a value between -1,000,000 (-INFINITY = the event will never happen) right up to INFINITY (1,000,000 = the event must happen).
A negative score makes the cluster avoid the placement where it can; only -INFINITY rules it out completely.
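Scores from several constraints and from resource stickiness are added together per node when Pacemaker decides where a resource runs. The following is a minimal sketch of that arithmetic, assuming INFINITY is 1,000,000 as stated above. The function names to_number and score_add are made up for illustration and are not part of any Pacemaker tool.

```shell
# Add two Pacemaker-style scores, treating INFINITY as 1000000.
# Rules: anything + -INFINITY = -INFINITY (this wins even against INFINITY),
#        anything else + INFINITY = INFINITY, and sums are clamped to the range.
INF=1000000

to_number() {
    case "$1" in
        INFINITY|+INFINITY) echo "$INF" ;;
        -INFINITY)          echo "-$INF" ;;
        *)                  echo "$1" ;;
    esac
}

score_add() {
    a=$(to_number "$1")
    b=$(to_number "$2")
    if [ "$a" -le "-$INF" ] || [ "$b" -le "-$INF" ]; then
        echo "-INFINITY"
    elif [ "$a" -ge "$INF" ] || [ "$b" -ge "$INF" ]; then
        echo "INFINITY"
    else
        sum=$((a + b))
        if [ "$sum" -ge "$INF" ]; then echo "INFINITY"
        elif [ "$sum" -le "-$INF" ]; then echo "-INFINITY"
        else echo "$sum"
        fi
    fi
}

score_add 100 -50              # prints 50
score_add INFINITY -INFINITY   # prints -INFINITY
```

The first call shows a stickiness of 100 outweighing a mild negative preference of -50, so the resource would still be allowed on that node. The second shows that a -INFINITY ban cannot be overridden.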
Cluster Admin Commands
On Red Hat Pacemaker clusters, the pcs command is used to manage the cluster. pcs stands for “Pacemaker/Corosync Configuration System”:
pcs status – View cluster status.
pcs config – View and manage cluster configuration.
pcs cluster – Configure cluster options and nodes.
pcs resource – Manage cluster resources.
pcs stonith – Manage fence devices.
pcs constraint – Manage resource constraints.
pcs property – Manage pacemaker properties.
pcs node – Manage cluster nodes.
pcs quorum – Manage cluster quorum settings.
pcs alert – Manage pacemaker alerts.
pcs pcsd – Manage pcs daemon.
pcs acl – Manage pacemaker access control lists.
Pacemaker Cluster Installation and Configuration Commands:
To install packages:
yum install pcs -y
yum install fence-agents-all -y
echo CHANGE_ME | passwd --stdin hacluster
systemctl start pcsd
systemctl enable pcsd
To authenticate new cluster nodes:
pcs cluster auth \
node1.example.com node2.example.com node3.example.com
Username: hacluster
Password:
node1.example.com: Authorized
node2.example.com: Authorized
node3.example.com: Authorized
To create and start a new cluster:
pcs cluster setup <option> <member> …
e.g.
pcs cluster setup --start --enable --name mycluster \
node1.example.com node2.example.com node3.example.com
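The authentication and setup steps above can be combined into one small helper. This is a sketch only: the function name bootstrap_cluster and the PCS variable are made up for this example, and by default the commands are printed rather than executed.

```shell
# By default the pcs commands are only printed; set PCS=pcs to really run them.
PCS="${PCS:-echo pcs}"

# Authenticate the given nodes, then create, start and enable the cluster.
# Usage: bootstrap_cluster <cluster_name> <node> [<node> ...]
bootstrap_cluster() {
    name="$1"; shift
    $PCS cluster auth "$@" &&
    $PCS cluster setup --start --enable --name "$name" "$@"
}

bootstrap_cluster mycluster node1.example.com node2.example.com node3.example.com
# prints:
#   pcs cluster auth node1.example.com node2.example.com node3.example.com
#   pcs cluster setup --start --enable --name mycluster node1.example.com node2.example.com node3.example.com
```

Printing first makes it easy to review the exact command lines before running them against real nodes.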
To enable cluster services to start on reboot:
pcs cluster enable --all
To enable cluster service on a specific node[s]:
pcs cluster enable [--all] [node] […]
To disable cluster services on a node[s]:
pcs cluster disable [--all] [node] […]
To display cluster status:
pcs status
pcs config
pcs cluster status
pcs quorum status
pcs resource show
crm_verify -L -V
crm_mon – Pacemaker's own cluster monitor; it is also available on clusters managed with crmsh rather than pcs
To delete a cluster (without --all, only the cluster configuration on the local node is destroyed):
pcs cluster destroy [--all]
To start/stop a cluster:
pcs cluster start --all
pcs cluster stop --all
To start/stop a cluster node:
pcs cluster start <node>
pcs cluster stop <node>
To carry out maintenance on a specific node:
pcs cluster standby <node>
Then to restore the node to the cluster service:
pcs cluster unstandby <node>
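A maintenance window can be wrapped so that the node is always taken out of standby again, even if the maintenance step fails. This is a sketch under assumptions: the function name with_standby and the PCS variable are made up for this example, the commands are only printed by default, and the maintenance command runs on the machine that calls the function.

```shell
# By default the pcs commands are only printed; set PCS=pcs to really run them.
PCS="${PCS:-echo pcs}"

# Put <node> into standby, run the given command, then restore the node.
# Usage: with_standby <node> <command> [<args> ...]
with_standby() {
    node="$1"; shift
    $PCS cluster standby "$node" || return 1
    "$@"
    status=$?
    # Restore the node regardless of how the maintenance command ended.
    $PCS cluster unstandby "$node"
    return "$status"
}

with_standby node1.example.com echo "maintenance work goes here"
# prints:
#   pcs cluster standby node1.example.com
#   maintenance work goes here
#   pcs cluster unstandby node1.example.com
```

The exit status of the maintenance command is returned, so a calling script can still detect a failed maintenance step.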
To set a cluster property:
pcs property set <property>=<value>
To disable stonith fencing: NOTE: you should usually not do this on a live production cluster!
pcs property set stonith-enabled=false
To re-enable stonith fencing:
pcs property set stonith-enabled=true
To configure firewalling for the cluster:
firewall-cmd --permanent --add-service=high-availability
firewall-cmd --reload
To add a node to the cluster:
First, on the new node, check that the hacluster user and password are set and that pcsd is running:
systemctl status pcsd
Then on an active node:
pcs cluster auth node4.example.com
pcs cluster node add node4.example.com
Then, on the new node:
pcs cluster start
pcs cluster enable
To display the XML configuration:
pcs cluster cib
To display current cluster status:
pcs status
To manage cluster resources:
pcs resource <tab>
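To move or relocate resources and resource groups:
pcs resource move <resource>
or, to relocate resources to their preferred nodes:
pcs resource relocate run [<resource>]
To remove the constraint created by a move, so that the resource is free to return to its original node:
pcs resource clear <resource>
To manage constraints directly:
pcs constraint <type> <option>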
To create a new resource:
pcs resource create <resource_name> <resource_type> <resource_options>
To create new resources, reference the appropriate resource agents or RAs.
To list ocf resource types:
(example below with ocf:heartbeat)
pcs resource list heartbeat
ocf:heartbeat:IPaddr2
ocf:heartbeat:LVM
ocf:heartbeat:Filesystem
ocf:heartbeat:oracle
ocf:heartbeat:apache
To display the options of a resource type or agent:
pcs resource describe <resource_type>
pcs resource describe ocf:heartbeat:IPaddr2
pcs resource create vip_cluster ocf:heartbeat:IPaddr2 ip=192.168.125.10 --group myservices
pcs resource create apache-ip ocf:heartbeat:IPaddr2 ip=192.168.125.20 cidr_netmask=24
To display a resource:
pcs resource show
Cluster Troubleshooting
Logging functions:
journalctl
tail -f /var/log/messages
tail -f /var/log/cluster/corosync.log
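When a log is too busy to follow with tail -f, filtering it for problem lines is often quicker. The sketch below is one way to do that; the function name cluster_log_problems and the keyword list are assumptions made for this example, not something defined by Corosync or Pacemaker.

```shell
# Print the most recent lines of a log that mention errors, warnings or
# critical conditions. Defaults: the corosync log and the last 20 matches.
# Usage: cluster_log_problems [<logfile>] [<max_lines>]
cluster_log_problems() {
    log="${1:-/var/log/cluster/corosync.log}"
    grep -iE 'error|warn|crit' "$log" | tail -n "${2:-20}"
}

# Usage (on a cluster node):
#   cluster_log_problems
#   cluster_log_problems /var/log/messages 50
```

The match is case-insensitive, so lines containing "Warning" and "WARN" are both picked up.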
Debug information commands:
pcs resource debug-start <resource>
pcs resource debug-stop <resource>
pcs resource debug-monitor <resource>
pcs resource failcount show <resource>
To update a resource after modification:
pcs resource update <resource> <options>
To reset the failcount:
pcs resource cleanup <resource>
To move a resource away from its current node (or to the given node, if one is specified):
pcs resource move <resource> [ <node> ]
To start a resource or a resource group:
pcs resource enable <resource>
To stop a resource or resource group:
pcs resource disable <resource>
To create a resource group and add a new resource:
pcs resource create <resource_name> <resource_type> <resource_options> --group <group>
To delete a resource:
pcs resource delete <resource>
To add a resource to a group:
pcs resource group add <group> <resource>
pcs resource group list
pcs resource list
To add a constraint to a resource group:
pcs constraint colocation add apache-group with ftp-group -100000
pcs constraint order apache-group then ftp-group
To reset a constraint for a resource or a resource group:
pcs resource clear <resource>
To list resource agent (RA) classes:
pcs resource standards
To list available RAs:
pcs resource agents ocf | service | stonith
To list specific resource agents of a specific RA provider:
pcs resource agents ocf:pacemaker
To list RA information:
pcs resource describe RA
pcs resource describe ocf:heartbeat:RA
To create a resource:
pcs resource create ClusterIP ocf:heartbeat:IPaddr2 ip=192.168.100.125 cidr_netmask=24 op monitor interval=60s
To delete a resource:
pcs resource delete resourceid
To display a resource (example with ClusterIP):
pcs resource show ClusterIP
To start a resource:
pcs resource enable ClusterIP
To stop a resource:
pcs resource disable ClusterIP
To remove a resource:
pcs resource delete ClusterIP
To modify a resource:
pcs resource update ClusterIP clusterip_hash=sourceip
To change parameters for a resource (resource specific, here for ClusterIP):
pcs resource update ClusterIP ip=192.168.100.25
To list the current resource defaults:
pcs resource defaults
To set resource defaults:
pcs resource defaults resource-stickiness=100
To list current operation defaults:
pcs resource op defaults
To set operation defaults:
pcs resource op defaults timeout=240s
To set colocation:
pcs constraint colocation add ClusterIP with WebSite INFINITY
To set colocation with roles:
pcs constraint colocation add Started AnotherIP with Master WebSite INFINITY
To set constraint ordering:
pcs constraint order ClusterIP then WebSite
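Colocation and ordering are often applied as a pair: one resource must run where another runs, and must start after it. The helper below is a sketch of that pattern. The function name bind_resources and the PCS variable are made up for this example, and the commands are only printed by default. Note that "colocation add X with Y" places X on the node where Y runs.

```shell
# By default the pcs commands are only printed; set PCS=pcs to really run them.
PCS="${PCS:-echo pcs}"

# Make <dependent> run on the same node as <base> and start after it.
# Usage: bind_resources <base> <dependent>
bind_resources() {
    base="$1"; dependent="$2"
    $PCS constraint colocation add "$dependent" with "$base" INFINITY &&
    $PCS constraint order "$base" then "$dependent"
}

bind_resources ClusterIP WebSite
# prints:
#   pcs constraint colocation add WebSite with ClusterIP INFINITY
#   pcs constraint order ClusterIP then WebSite
```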
To display constraint list:
pcs constraint list --full
To show a resource failure count:
pcs resource failcount show <resource>
To reset a resource failure count:
pcs resource failcount reset <resource>
To create a resource clone:
pcs resource clone ClusterIP globally-unique=true clone-max=2 clone-node-max=2
To manage a resource:
pcs resource manage <resource>
To unmanage a resource:
pcs resource unmanage <resource>
Fencing (Stonith) commands:
ipmitool -H rh7-node1-irmc -U admin -P password power on
fence_ipmilan --ip=rh7-node1-irmc.localdomain --username=admin --password=password --action=status
Status: ON
pcs stonith
pcs stonith describe fence_ipmilan
pcs stonith create ipmi-fencing1 fence_ipmilan \
pcmk_host_list="rh7-node1.localdomain" \
ipaddr=192.168.100.125 \
login=admin passwd=password \
op monitor interval=60s
pcs property set stonith-enabled=true
pcs stonith fence pcmk-2
stonith_admin --reboot pcmk-2
To display fencing resources:
pcs stonith show
To display Stonith RA information:
pcs stonith describe fence_ipmilan
To list available fencing agents:
pcs stonith list
To add a filter to list available resource agents for Stonith:
pcs stonith list <string>
To setup properties for Stonith:
pcs property set no-quorum-policy=ignore
pcs property set stonith-action=poweroff # default is reboot
To create a fencing device:
pcs stonith create stonith-rsa-node1 fence_rsa action=off ipaddr="node1_rsa" login=<user> passwd=<pass> pcmk_host_list=node1 secure=true
To display fencing devices:
pcs stonith show
To fence a node off from the rest of the cluster:
pcs stonith fence <node>
To modify a fencing device:
pcs stonith update stonithid [options]
To display fencing device options:
pcs stonith describe <stonith_ra>
To delete a fencing device:
pcs stonith delete stonithid