LPIC3 DIPLOMA Linux Clustering – LAB NOTES LESSON 6: Configuring SBD Fencing on SUSE
These are my notes made during my lab practical as part of my LPIC3 Diploma course in Linux Clustering. They are in “rough format”, presented as they were written.
Overview
SBD, or STONITH Block Device (also known as Storage-Based Death), is a cluster-node fencing mechanism used by Pacemaker-based Linux clusters.
The system uses a small disk or disk partition, reserved for exclusive use by SBD, to manage node fencing operations.
This disk has to be accessible to SBD from all cluster nodes under the same device path, so it needs to be provisioned on shared storage. For this purpose I am using iSCSI, served from an external (i.e. non-cluster) storage server.
The cluster comprises three openSUSE Leap 15 nodes housed on a KVM virtual machine system on a Linux Ubuntu host.
ENSURE WHEN YOU BOOT THE CLUSTER THAT YOU ALWAYS BOOT THE susestorage VM FIRST! Otherwise SBD will fail to run, because it relies on access to an iSCSI target disk located on shared storage on the susestorage server.
Networking Preliminaries on susestorage Server
First we need to fix up a couple of networking issues on the new susestorage server.
To set the default route on susestorage, add the following line to the interface route config file:
susestorage:/etc/sysconfig/network # cat ifroute-eth0
default 192.168.122.1 - eth0
susestorage:/etc/sysconfig/network #
Then set the DNS servers by adding this to the network config file:
susestorage:/etc/sysconfig/network # cat config | grep NETCONFIG_DNS_STATIC_SERVERS
NETCONFIG_DNS_STATIC_SERVERS="192.168.179.1 8.8.8.8 8.8.4.4"
then do:
susestorage:/etc/sysconfig/network # service network restart
Default routing and DNS lookups are now working.
Install Watchdog
Load the softdog watchdog kernel module on all nodes:
modprobe softdog
suse61:~ # lsmod | grep dog
softdog 16384 0
suse61:~ #
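With the module loaded, the kernel exposes a software watchdog device node, which SBD will use later (the sbd query-watchdog output further down lists the same devices). A quick check:
ls -l /dev/watchdog*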
When using SBD as a fencing mechanism, it is vital to consider the timeouts of all components, because they depend on each other.
Watchdog Timeout
This timeout is set during initialization of the SBD device. It depends mostly on your storage latency. The majority of devices must be successfully read within this time. Otherwise, the node might self-fence.
Note: Multipath or iSCSI Setup
If your SBD device(s) reside on a multipath setup or iSCSI, the timeout should be set to the time required to detect a path failure and switch to the next path.
This also means that in /etc/multipath.conf the value of max_polling_interval must be less than the watchdog timeout.
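The watchdog and msgwait timeouts are written into the SBD device header when the device is created (see the dump output later in these notes, where the defaults of 5s watchdog / 10s msgwait appear). Purely as an illustrative sketch for a slower iSCSI or multipath setup, they could be set explicitly at create time with the -1 and -4 options listed in the sbd help at the end of these notes:
# illustrative values only, not what was used in this lab; tune to your storage latency.
# msgwait (-4) is commonly set to at least twice the watchdog timeout (-1).
sbd -d /dev/disk/by-path/<your-sbd-device> -1 15 -4 30 create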
Create a small SCSI disk on susestorage
Create a small disk, e.g. 10 MB (not any smaller).
Do NOT partition or format the exported SBD device with a file system; SBD works with raw block devices. (In this lab the backing disk /dev/sdb on susestorage is partitioned and the partition /dev/sdb1 is exported as the LUN, but the cluster nodes use that LUN as a raw block device.)
Disk /dev/sdb: 11.3 MiB, 11811840 bytes, 23070 sectors
Disk model: QEMU HARDDISK
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: dos
Disk identifier: 0x8571f370
Device Boot Start End Sectors Size Id Type
/dev/sdb1 2048 23069 21022 10.3M 83 Linux
susestorage:~ #
Install the ISCSI software packages
susestorage:/etc/sysconfig/network # zypper in yast2-iscsi-lio-server
Retrieving repository ‘Main Update Repository’ metadata …………………………………………………………………….[done]
Building repository ‘Main Update Repository’ cache …………………………………………………………………………[done]
Retrieving repository ‘Update Repository (Non-Oss)’ metadata ………………………………………………………………..[done]
Building repository ‘Update Repository (Non-Oss)’ cache …………………………………………………………………….[done]
Loading repository data…
Reading installed packages…
Resolving package dependencies…
The following 5 NEW packages are going to be installed:
python3-configshell-fb python3-rtslib-fb python3-targetcli-fb targetcli-fb-common yast2-iscsi-lio-server
5 new packages to install.
Create the iSCSI Target on the susestorage iSCSI target server using targetcli
The susestorage target IQN is:
iqn.2003-01.org.linux-iscsi.susestorage.x8664:sn.1789836ce415
This is generated in targetcli by the create command.
However, the IQNs of the client initiators are clearly unusable because they are all identical, so we can't use them.
The reason for this is that the virtual machines were cloned from a single source image.
suse61:/etc/sysconfig/network # cat /etc/iscsi/initiatorname.iscsi
InitiatorName=iqn.1996-04.de.suse:01:117bd2582b79
suse62:~ # cat /etc/iscsi/initiatorname.iscsi
InitiatorName=iqn.1996-04.de.suse:01:117bd2582b79
suse63:~ # cat /etc/iscsi/initiatorname.iscsi
InitiatorName=iqn.1996-04.de.suse:01:117bd2582b79
So we first have to generate new ones.
Modify the client initiator IQNs
Sometimes, when systems are mass-deployed using the same Linux image, or through cloning of virtual machines with KVM, Xen, VMware or Oracle VirtualBox, you will initially have duplicate initiator IQNs on all these systems.
You will need to create a new iSCSI initiator IQN. The initiator IQN for the system is defined in /etc/iscsi/initiatorname.iscsi.
To change the IQN, follow the steps given below.
1. Back up the existing /etc/iscsi/initiatorname.iscsi:
mv /etc/iscsi/initiatorname.iscsi /var/tmp/initiatorname.iscsi.backup
2. Generate the new IQN:
echo "InitiatorName=`/sbin/iscsi-iname`" > /etc/iscsi/initiatorname.iscsi
3. Reconfigure the ISCSI target ACLs to allow access using the new initiator IQN.
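Putting steps 1 and 2 together, a minimal sketch of what to run on each cloned node (suse61, suse62, suse63). The iscsid restart is my own addition, not part of the original transcript:
# run on each of suse61, suse62 and suse63
mv /etc/iscsi/initiatorname.iscsi /var/tmp/initiatorname.iscsi.backup
echo "InitiatorName=$(/sbin/iscsi-iname)" > /etc/iscsi/initiatorname.iscsi
# assumption: restart iscsid (if it is already running) so the new initiator name is picked up
systemctl restart iscsid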
suse61:/etc/sysconfig/network # cat /etc/iscsi/initiatorname.iscsi
InitiatorName=iqn.2016-04.com.open-iscsi:8c43f05f2f6b
suse61:/etc/sysconfig/network #
suse62:~ # cat /etc/iscsi/initiatorname.iscsi
InitiatorName=iqn.2016-04.com.open-iscsi:66a864405884
suse62:~ #
suse63:~ # cat /etc/iscsi/initiatorname.iscsi
InitiatorName=iqn.2016-04.com.open-iscsi:aa5ca12c8fc
suse63:~ #
The new initiator IQNs are:
iqn.2016-04.com.open-iscsi:8c43f05f2f6b
iqn.2016-04.com.open-iscsi:66a864405884
iqn.2016-04.com.open-iscsi:aa5ca12c8fc
and the corresponding ACL entries to create on the target (run inside targetcli, as in the session below):
/iscsi/iqn.2003-01.org.linux-iscsi.susestorage.x8664:sn.1789836ce415/tpg1/acls create iqn.2016-04.com.open-iscsi:8c43f05f2f6b
/iscsi/iqn.2003-01.org.linux-iscsi.susestorage.x8664:sn.1789836ce415/tpg1/acls create iqn.2016-04.com.open-iscsi:66a864405884
/iscsi/iqn.2003-01.org.linux-iscsi.susestorage.x8664:sn.1789836ce415/tpg1/acls create iqn.2016-04.com.open-iscsi:aa5ca12c8fc
susestorage:/ # targetcli
targetcli shell version 2.1.52
Copyright 2011-2013 by Datera, Inc and others.
For help on commands, type ‘help’.
/> /backstores/block create lun0 /dev/sdb1
Created block storage object lun0 using /dev/sdb1.
/> /iscsi create
Created target iqn.2003-01.org.linux-iscsi.susestorage.x8664:sn.1789836ce415.
Created TPG 1.
Global pref auto_add_default_portal=true
Created default portal listening on all IPs (0.0.0.0), port 3260.
/> cd iscsi
/iscsi> ls
o- iscsi ……………………………………………………………………………………………….. [Targets: 1]
o- iqn.2003-01.org.linux-iscsi.susestorage.x8664:sn.1789836ce415 ………………………………………………. [TPGs: 1]
o- tpg1 ……………………………………………………………………………………. [no-gen-acls, no-auth]
o- acls ……………………………………………………………………………………………… [ACLs: 0]
o- luns ……………………………………………………………………………………………… [LUNs: 0]
o- portals ………………………………………………………………………………………… [Portals: 1]
o- 0.0.0.0:3260 …………………………………………………………………………………………. [OK]
/iscsi> cd iqn.2003-01.org.linux-iscsi.susestorage.x8664:sn.1789836ce415/
/iscsi/iqn.20….1789836ce415> ls
o- iqn.2003-01.org.linux-iscsi.susestorage.x8664:sn.1789836ce415 ………………………………………………… [TPGs: 1]
o- tpg1 ……………………………………………………………………………………… [no-gen-acls, no-auth]
o- acls ……………………………………………………………………………………………….. [ACLs: 0]
o- luns ……………………………………………………………………………………………….. [LUNs: 0]
o- portals ………………………………………………………………………………………….. [Portals: 1]
o- 0.0.0.0:3260 …………………………………………………………………………………………… [OK]
/iscsi/iqn.20….1789836ce415> /tpg1/luns> create /backstores/block/lun0
No such path /tpg1
/iscsi/iqn.20….1789836ce415> cd tpg1/
/iscsi/iqn.20…836ce415/tpg1> cd luns
/iscsi/iqn.20…415/tpg1/luns> create /backstores/block/lun0
Created LUN 0.
/iscsi/iqn.20…415/tpg1/luns> cd /
/> /iscsi/iqn.2003-01.org.linux-iscsi.susestorage.x8664:sn.1789836ce415/tpg1/acls create iqn.2016-04.com.open-iscsi:8c43f05f2f6b
Created Node ACL for iqn.2016-04.com.open-iscsi:8c43f05f2f6b
Created mapped LUN 0.
/> /iscsi/iqn.2003-01.org.linux-iscsi.susestorage.x8664:sn.1789836ce415/tpg1/acls create iqn.2016-04.com.open-iscsi:66a864405884
Created Node ACL for iqn.2016-04.com.open-iscsi:66a864405884
Created mapped LUN 0.
/> /iscsi/iqn.2003-01.org.linux-iscsi.susestorage.x8664:sn.1789836ce415/tpg1/acls create iqn.2016-04.com.open-iscsi:aa5ca12c8fc
Created Node ACL for iqn.2016-04.com.open-iscsi:aa5ca12c8fc
Created mapped LUN 0.
/>
/> ls
o- / …………………………………………………………………………………………………………. […]
o- backstores ……………………………………………………………………………………………….. […]
| o- block …………………………………………………………………………………….. [Storage Objects: 1]
| | o- lun0 ………………………………………………………………… [/dev/sdb1 (10.3MiB) write-thru activated]
| | o- alua ……………………………………………………………………………………… [ALUA Groups: 1]
| | o- default_tg_pt_gp …………………………………………………………….. [ALUA state: Active/optimized]
| o- fileio ……………………………………………………………………………………. [Storage Objects: 0]
| o- pscsi …………………………………………………………………………………….. [Storage Objects: 0]
| o- ramdisk …………………………………………………………………………………… [Storage Objects: 0]
| o- rbd ………………………………………………………………………………………. [Storage Objects: 0]
o- iscsi ……………………………………………………………………………………………… [Targets: 1]
| o- iqn.2003-01.org.linux-iscsi.susestorage.x8664:sn.1789836ce415 …………………………………………….. [TPGs: 1]
| o- tpg1 ………………………………………………………………………………….. [no-gen-acls, no-auth]
| o- acls ……………………………………………………………………………………………. [ACLs: 3]
| | o- iqn.2016-04.com.open-iscsi:66a864405884 …………………………………………………….. [Mapped LUNs: 1]
| | | o- mapped_lun0 ………………………………………………………………………. [lun0 block/lun0 (rw)]
| | o- iqn.2016-04.com.open-iscsi:8c43f05f2f6b …………………………………………………….. [Mapped LUNs: 1]
| | | o- mapped_lun0 ………………………………………………………………………. [lun0 block/lun0 (rw)]
| | o- iqn.2016-04.com.open-iscsi:aa5ca12c8fc ……………………………………………………… [Mapped LUNs: 1]
| | o- mapped_lun0 ………………………………………………………………………. [lun0 block/lun0 (rw)]
| o- luns ……………………………………………………………………………………………. [LUNs: 1]
| | o- lun0 ……………………………………………………………. [block/lun0 (/dev/sdb1) (default_tg_pt_gp)]
| o- portals ………………………………………………………………………………………. [Portals: 1]
| o- 0.0.0.0:3260 ……………………………………………………………………………………….. [OK]
o- loopback …………………………………………………………………………………………… [Targets: 0]
o- vhost ……………………………………………………………………………………………… [Targets: 0]
o- xen-pvscsi …………………………………………………………………………………………. [Targets: 0]
/> saveconfig
Last 10 configs saved in /etc/target/backup/.
Configuration saved to /etc/target/saveconfig.json
/> quit
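For reference, the same target setup could be scripted by passing one command per targetcli invocation; a sketch, assuming the same backing device, target IQN and initiator IQNs as above (the interactive session remains the record of what was actually done):
targetcli /backstores/block create lun0 /dev/sdb1
targetcli /iscsi create iqn.2003-01.org.linux-iscsi.susestorage.x8664:sn.1789836ce415
targetcli /iscsi/iqn.2003-01.org.linux-iscsi.susestorage.x8664:sn.1789836ce415/tpg1/luns create /backstores/block/lun0
targetcli /iscsi/iqn.2003-01.org.linux-iscsi.susestorage.x8664:sn.1789836ce415/tpg1/acls create iqn.2016-04.com.open-iscsi:8c43f05f2f6b
targetcli /iscsi/iqn.2003-01.org.linux-iscsi.susestorage.x8664:sn.1789836ce415/tpg1/acls create iqn.2016-04.com.open-iscsi:66a864405884
targetcli /iscsi/iqn.2003-01.org.linux-iscsi.susestorage.x8664:sn.1789836ce415/tpg1/acls create iqn.2016-04.com.open-iscsi:aa5ca12c8fc
targetcli saveconfig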
susestorage:/ # systemctl enable targetcli
Created symlink /etc/systemd/system/remote-fs.target.wants/targetcli.service → /usr/lib/systemd/system/targetcli.service.
susestorage:/ # systemctl status targetcli
● targetcli.service – “Generic Target-Mode Service (fb)”
Loaded: loaded (/usr/lib/systemd/system/targetcli.service; enabled; vendor preset: disabled)
Active: active (exited) since Fri 2021-03-12 13:27:54 GMT; 1min 15s ago
Main PID: 2522 (code=exited, status=1/FAILURE)
Mar 12 13:27:54 susestorage systemd[1]: Starting “Generic Target-Mode Service (fb)”…
Mar 12 13:27:54 susestorage targetcli[2522]: storageobject ‘block:lun0’ exist not restoring
Mar 12 13:27:54 susestorage systemd[1]: Started “Generic Target-Mode Service (fb)”.
susestorage:/ #
susestorage:/ # systemctl stop firewalld
susestorage:/ # systemctl disable firewalld
Removed /etc/systemd/system/multi-user.target.wants/firewalld.service.
Removed /etc/systemd/system/dbus-org.fedoraproject.FirewallD1.service.
susestorage:/ #
susestorage:/ # systemctl status firewalld
● firewalld.service – firewalld – dynamic firewall daemon
Loaded: loaded (/usr/lib/systemd/system/firewalld.service; disabled; vendor preset: disabled)
Active: inactive (dead)
Docs: man:firewalld(1)
Mar 12 12:55:38 susestorage systemd[1]: Starting firewalld – dynamic firewall daemon…
Mar 12 12:55:39 susestorage systemd[1]: Started firewalld – dynamic firewall daemon.
Mar 12 13:30:17 susestorage systemd[1]: Stopping firewalld – dynamic firewall daemon…
Mar 12 13:30:18 susestorage systemd[1]: Stopped firewalld – dynamic firewall daemon.
susestorage:/ #
Also enable and start iscsid (the Open-iSCSI daemon) on susestorage:
susestorage:/ # systemctl enable iscsid ; systemctl start iscsid ; systemctl status iscsid
Created symlink /etc/systemd/system/multi-user.target.wants/iscsid.service → /usr/lib/systemd/system/iscsid.service.
● iscsid.service – Open-iSCSI
Loaded: loaded (/usr/lib/systemd/system/iscsid.service; enabled; vendor preset: disabled)
Active: active (running) since Fri 2021-03-12 13:37:52 GMT; 10ms ago
Docs: man:iscsid(8)
man:iscsiuio(8)
man:iscsiadm(8)
Main PID: 2701 (iscsid)
Status: “Ready to process requests”
Tasks: 1
CGroup: /system.slice/iscsid.service
└─2701 /sbin/iscsid -f
Mar 12 13:37:52 susestorage systemd[1]: Starting Open-iSCSI…
Mar 12 13:37:52 susestorage systemd[1]: Started Open-iSCSI.
susestorage:/ #
iSCSI Client Configuration (iSCSI initiators)
Next, on the clients suse61, suse62 and suse63, install the iSCSI initiator packages and configure as follows (on all 3 nodes):
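(The package installation itself is not shown in the transcript; on openSUSE Leap the initiator tools should come from the open-iscsi package. Treat the exact package selection below as an assumption:)
# assumption: package providing iscsiadm/iscsid on openSUSE Leap 15
zypper in open-iscsi
systemctl enable iscsid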
suse61:~ # iscsiadm -m discovery -t sendtargets -p 10.0.6.10
10.0.6.10:3260,1 iqn.2003-01.org.linux-iscsi.susestorage.x8664:sn.1789836ce415
suse61:~ #
suse61:~ # iscsiadm -m node -T iqn.2003-01.org.linux-iscsi.susestorage.x8664:sn.1789836ce415 -p 10.0.6.10 -l
Logging in to [iface: default, target: iqn.2003-01.org.linux-iscsi.susestorage.x8664:sn.1789836ce415, portal: 10.0.6.10,3260]
Login to [iface: default, target: iqn.2003-01.org.linux-iscsi.susestorage.x8664:sn.1789836ce415, portal: 10.0.6.10,3260] successful.
suse61:~ #
Note: we do NOT mount the iSCSI disk for SBD!
Check that the iSCSI target disk is attached:
suse61:~ # iscsiadm -m session -P 3 | grep 'Target\|disk'
Target: iqn.2003-01.org.linux-iscsi.susestorage.x8664:sn.1789836ce415 (non-flash)
Target Reset Timeout: 30
Attached scsi disk sdd State: running
suse61:~ #
IMPORTANT: this is NOT the same as mounting the disk, we do NOT do that!
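To have each node log back in to the target automatically at boot (remembering that susestorage must be up first), the node record can be switched to automatic startup. A sketch, assuming the same target and portal as above:
iscsiadm -m node -T iqn.2003-01.org.linux-iscsi.susestorage.x8664:sn.1789836ce415 -p 10.0.6.10 --op update -n node.startup -v automatic
# assumption: iscsi.service (from open-iscsi) performs the automatic node logins at boot
systemctl enable iscsi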
on each node we have the same path to the disk:
suse61:~ # ls /dev/disk/by-path/
ip-10.0.6.10:3260-iscsi-iqn.2003-01.org.linux-iscsi.susestorage.x8664:sn.1789836ce415-lun-0
suse62:~ # ls /dev/disk/by-path/
ip-10.0.6.10:3260-iscsi-iqn.2003-01.org.linux-iscsi.susestorage.x8664:sn.1789836ce415-lun-0
suse63:~ # ls /dev/disk/by-path/
ip-10.0.6.10:3260-iscsi-iqn.2003-01.org.linux-iscsi.susestorage.x8664:sn.1789836ce415-lun-0
So you can put this disk path in your SBD fencing config file.
Configure SBD on the Cluster
In the sbd config file you have the directive for the location of your sbd device:
suse61:~ # nano /etc/sysconfig/sbd
# SBD_DEVICE specifies the devices to use for exchanging sbd messages
# and to monitor. If specifying more than one path, use ";" as
# separator.
#
#SBD_DEVICE=""
You can use the /dev/disk/by-path designation for this, to be certain it is the same on all nodes, namely:
/dev/disk/by-path/ip-10.0.6.10:3260-iscsi-iqn.2003-01.org.linux-iscsi.susestorage.x8664:sn.1789836ce415-lun-0
suse61:~ # nano /etc/sysconfig/sbd
# SBD_DEVICE specifies the devices to use for exchanging sbd messages
# and to monitor. If specifying more than one path, use ";" as
# separator.
#
#SBD_DEVICE=""
SBD_DEVICE="/dev/disk/by-path/ip-10.0.6.10:3260-iscsi-iqn.2003-01.org.linux-iscsi.susestorage.x8664:sn.1789836ce415-lun-0"
Then on all three nodes:
Check that you have put a config file in /etc/modules-load.d with the name watchdog.conf (the .conf extension is essential!).
In this file just put the line:
softdog
suse61:/etc/modules-load.d # cat /etc/modules-load.d/watchdog.conf
softdog
suse61:/etc/modules-load.d #
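A one-liner that produces exactly the file shown above:
echo softdog > /etc/modules-load.d/watchdog.conf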
systemctl status systemd-modules-load
suse61:~ # systemctl status systemd-modules-load
● systemd-modules-load.service – Load Kernel Modules
Loaded: loaded (/usr/lib/systemd/system/systemd-modules-load.service; static; vendor preset: disabled)
Active: active (exited) since Thu 2021-03-11 12:38:46 GMT; 15h ago
Docs: man:systemd-modules-load.service(8)
man:modules-load.d(5)
Main PID: 7772 (code=exited, status=0/SUCCESS)
Tasks: 0
CGroup: /system.slice/systemd-modules-load.service
Mar 11 12:38:46 suse61 systemd[1]: Starting Load Kernel Modules…
Mar 11 12:38:46 suse61 systemd[1]: Started Load Kernel Modules.
suse61:~ #
then do on all 3 nodes:
systemctl restart systemd-modules-load
suse61:/etc/modules-load.d # systemctl status systemd-modules-load
● systemd-modules-load.service – Load Kernel Modules
Loaded: loaded (/usr/lib/systemd/system/systemd-modules-load.service; static; vendor preset: disabled)
Active: active (exited) since Fri 2021-03-12 04:18:16 GMT; 11s ago
Docs: man:systemd-modules-load.service(8)
man:modules-load.d(5)
Process: 24239 ExecStart=/usr/lib/systemd/systemd-modules-load (code=exited, status=0/SUCCESS)
Main PID: 24239 (code=exited, status=0/SUCCESS)
Mar 12 04:18:16 suse61 systemd[1]: Starting Load Kernel Modules…
Mar 12 04:18:16 suse61 systemd[1]: Started Load Kernel Modules.
suse61:/etc/modules-load.d # date
Fri 12 Mar 04:18:35 GMT 2021
suse61:/etc/modules-load.d #
Verify with lsmod | grep dog:
suse61:/etc/modules-load.d # lsmod | grep dog
softdog 16384 0
suse61:/etc/modules-load.d #
Create the SBD fencing device
sbd -d /dev/disk/by-path/ip-10.0.6.10:3260-iscsi-iqn.2003-01.org.linux-iscsi.susestorage.x8664:sn.1789836ce415-lun-0 create
suse61:/etc/modules-load.d # sbd -d /dev/disk/by-path/ip-10.0.6.10:3260-iscsi-iqn.2003-01.org.linux-iscsi.susestorage.x8664:sn.1789836ce415-lun-0 create
Initializing device /dev/disk/by-path/ip-10.0.6.10:3260-iscsi-iqn.2003-01.org.linux-iscsi.susestorage.x8664:sn.1789836ce415-lun-0
Creating version 2.1 header on device 3 (uuid: 614c3373-167d-4bd6-9e03-d302a17b429d)
Initializing 255 slots on device 3
Device /dev/disk/by-path/ip-10.0.6.10:3260-iscsi-iqn.2003-01.org.linux-iscsi.susestorage.x8664:sn.1789836ce415-lun-0 is initialized.
suse61:/etc/modules-load.d #
Then edit the SBD config file:
nano /etc/sysconfig/sbd
SBD_DEVICE - as above
SBD_WATCHDOG="yes"
SBD_STARTMODE="clean" (optional; not used in this test environment)
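Putting that together, the relevant lines of /etc/sysconfig/sbd on each node end up looking like this (SBD_STARTMODE left commented out, since it was not used here):
SBD_DEVICE="/dev/disk/by-path/ip-10.0.6.10:3260-iscsi-iqn.2003-01.org.linux-iscsi.susestorage.x8664:sn.1789836ce415-lun-0"
SBD_WATCHDOG="yes"
#SBD_STARTMODE="clean"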
Then make sure the SBD configuration is in place on all three nodes and restart the cluster stack. (On pcs-based clusters you would use pcs cluster sync; on SUSE I simply restarted the cluster services:)
suse61:/etc/modules-load.d # crm cluster restart
INFO: Cluster services stopped
INFO: Cluster services started
suse61:/etc/modules-load.d #
suse61:/etc/modules-load.d # sbd query-watchdog
Discovered 2 watchdog devices:
[1] /dev/watchdog
Identity: Software Watchdog
Driver: softdog
CAUTION: Not recommended for use with sbd.
[2] /dev/watchdog0
Identity: Software Watchdog
Driver: softdog
CAUTION: Not recommended for use with sbd.
suse61:/etc/modules-load.d #
After you have added your SBD devices to the SBD configuration file, enable the SBD daemon. The SBD daemon is a critical piece of the cluster stack. It needs to be running when the cluster stack is running. Thus, the sbd service is started as a dependency whenever the pacemaker service is started.
suse61:/etc/modules-load.d # systemctl enable sbd
Created symlink /etc/systemd/system/corosync.service.requires/sbd.service → /usr/lib/systemd/system/sbd.service.
Created symlink /etc/systemd/system/pacemaker.service.requires/sbd.service → /usr/lib/systemd/system/sbd.service.
Created symlink /etc/systemd/system/dlm.service.requires/sbd.service → /usr/lib/systemd/system/sbd.service.
suse61:/etc/modules-load.d # crm cluster restart
INFO: Cluster services stopped
INFO: Cluster services started
suse61:/etc/modules-load.d #
suse63:~ # crm_resource --cleanup
Cleaned up all resources on all nodes
suse63:~ #
suse61:/etc/modules-load.d # crm configure
crm(live/suse61)configure# primitive stonith_sbd stonith:external/sbd
crm(live/suse61)configure# property stonith-enabled="true"
crm(live/suse61)configure# property stonith-timeout="30"
crm(live/suse61)configure#
verify with:
crm(live/suse61)configure# show
node 167773757: suse61
node 167773758: suse62
node 167773759: suse63
primitive iscsiip IPaddr2 \
params ip=10.0.6.200 \
op monitor interval=10s
primitive stonith_sbd stonith:external/sbd
property cib-bootstrap-options: \
have-watchdog=true \
dc-version=”2.0.4+20200616.2deceaa3a-lp152.2.3.1-2.0.4+20200616.2deceaa3a” \
cluster-infrastructure=corosync \
cluster-name=hacluster \
stonith-enabled=true \
last-lrm-refresh=1615479646 \
stonith-timeout=30
rsc_defaults rsc-options: \
resource-stickiness=1 \
migration-threshold=3
op_defaults op-options: \
timeout=600 \
record-pending=true
crm(live/suse61)configure# commit
crm(live/suse61)configure# exit
WARNING: This command ‘exit’ is deprecated, please use ‘quit’
bye
suse61:/etc/modules-load.d #
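For reference, the same three configuration steps could also be entered non-interactively with crm one-shot commands; a sketch (the interactive session above is what was actually run):
crm configure primitive stonith_sbd stonith:external/sbd
crm configure property stonith-enabled=true
crm configure property stonith-timeout=30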
Verify the SBD System is active on the cluster
After the resource has started, your cluster is successfully configured for use of SBD. It will use this method in case a node needs to be fenced.
so now it looks like this:
crm_mon
Cluster Summary:
* Stack: corosync
* Current DC: suse63 (version 2.0.4+20200616.2deceaa3a-lp152.2.3.1-2.0.4+20200616.2deceaa3a) – partition with quorum
* Last updated: Fri Mar 12 10:41:40 2021
* Last change: Fri Mar 12 10:40:02 2021 by hacluster via crmd on suse62
* 3 nodes configured
* 2 resource instances configured
Node List:
* Online: [ suse61 suse62 suse63 ]
Active Resources:
* iscsiip (ocf::heartbeat:IPaddr2): Started suse62
* stonith_sbd (stonith:external/sbd): Started suse61
also verify with
suse61:/etc/modules-load.d # sbd -d /dev/disk/by-path/ip-10.0.6.10:3260-iscsi-iqn.2003-01.org.linux-iscsi.susestorage.x8664:sn.1789836ce415-lun-0 list
suse61 clear
suse61:/etc/modules-load.d #
suse62:~ # sbd -d /dev/disk/by-path/ip-10.0.6.10:3260-iscsi-iqn.2003-01.org.linux-iscsi.susestorage.x8664:sn.1789836ce415-lun-0 list
0 suse61 clear
suse62:~ #
suse63:~ # sbd -d /dev/disk/by-path/ip-10.0.6.10:3260-iscsi-iqn.2003-01.org.linux-iscsi.susestorage.x8664:sn.1789836ce415-lun-0 list
0 suse61 clear
suse63:~ #
MAKE SURE WHEN YOU BOOT THE CLUSTER THAT YOU ALWAYS BOOT THE susestorage VM FIRST! Otherwise SBD will fail to run, because the SBD disk is housed on an iSCSI target disk on the susestorage server.
You can also verify with the dump command (again on each cluster node, but only one is shown here):
suse61:/etc/modules-load.d # sbd -d /dev/disk/by-path/ip-10.0.6.10:3260-iscsi-iqn.2003-01.org.linux-iscsi.susestorage.x8664:sn.1789836ce415-lun-0 dump
==Dumping header on disk /dev/disk/by-path/ip-10.0.6.10:3260-iscsi-iqn.2003-01.org.linux-iscsi.susestorage.x8664:sn.1789836ce415-lun-0
Header version : 2.1
UUID : 614c3373-167d-4bd6-9e03-d302a17b429d
Number of slots : 255
Sector size : 512
Timeout (watchdog) : 5
Timeout (allocate) : 2
Timeout (loop) : 1
Timeout (msgwait) : 10
==Header on disk /dev/disk/by-path/ip-10.0.6.10:3260-iscsi-iqn.2003-01.org.linux-iscsi.susestorage.x8664:sn.1789836ce415-lun-0 is dumped
suse61:/etc/modules-load.d #
At this point I did a KVM snapshot backup of each node.
Next we can test the SBD:
suse61:/etc/modules-load.d # sbd -d /dev/disk/by-path/ip-10.0.6.10:3260-iscsi-iqn.2003-01.org.linux-iscsi.susestorage.x8664:sn.1789836ce415-lun-0 message suse63 test
sbd failed; please check the logs.
suse61:/etc/modules-load.d #
in journalctl we find:
Mar 12 10:55:20 suse61 sbd[5721]: /dev/disk/by-path/ip-10.0.6.10:3260-iscsi-iqn.2003-01.org.linux-iscsi.susestorage.x8664:sn.1789836ce415-lun-0: error: slot_msg: slot_msg(): No slot found for suse63.
Mar 12 10:55:20 suse61 sbd[5720]: warning: messenger: Process 5721 failed to deliver!
Mar 12 10:55:20 suse61 sbd[5720]: error: messenger: Message is not delivered via more then a half of devices
I had to reboot all machines, and then:
suse61:~ # sbd -d /dev/disk/by-path/ip-10.0.6.10:3260-iscsi-iqn.2003-01.org.linux-iscsi.susestorage.x8664:sn.1789836ce415-lun-0 list
0 suse61 clear
1 suse63 clear
2 suse62 clear
suse61:~ #
To test SBD fencing
suse61:~ # sbd -d /dev/disk/by-path/ip-10.0.6.10:3260-iscsi-iqn.2003-01.org.linux-iscsi.susestorage.x8664:sn.1789836ce415-lun-0 message suse62 off
suse62:~ #
Broadcast message from systemd-journald@suse62 (Sat 2021-03-13 00:57:17 GMT):
sbd[1983]: emerg: do_exit: Rebooting system: off
client_loop: send disconnect: Broken pipe
root@yoga:/home/kevin#
You can also test the fencing by using the command
echo c > /proc/sysrq-trigger
suse63:~ #
suse63:~ # echo c > /proc/sysrq-trigger
With that, node suse63 has hung and crm_mon then shows:
Cluster Summary:
* Stack: corosync
* Current DC: suse62 (version 2.0.4+20200616.2deceaa3a-lp152.2.3.1-2.0.4+20200616.2deceaa3a) – partition with quorum
* Last updated: Sat Mar 13 15:00:40 2021
* Last change: Fri Mar 12 11:14:12 2021 by hacluster via crmd on suse62
* 3 nodes configured
* 2 resource instances configured
Node List:
* Node suse63: UNCLEAN (offline)
* Online: [ suse61 suse62 ]
Active Resources:
* iscsiip (ocf::heartbeat:IPaddr2): Started suse63 (UNCLEAN)
* stonith_sbd (stonith:external/sbd): Started [ suse62 suse63 ]
Failed Fencing Actions:
* reboot of suse62 failed: delegate=, client=pacemaker-controld.1993, origin=suse61, last-failed=’2021-03-12 20:55:09Z’
Pending Fencing Actions:
* reboot of suse63 pending: client=pacemaker-controld.2549, origin=suse62
Thus we can see that node suse63 has been recognized by the cluster as failed and has been fenced.
We must now reboot node suse63 and clear the fenced state.
How To Restore A Node After SBD Fencing
A fencing message from SBD in the sbd slot for the node will not allow the node to join the cluster until it’s been manually cleared.
This means that when the node next boots up it will not join the cluster and will initially be in error state.
So, after fencing a node, when it reboots you need to do the following:
First make sure the iSCSI disk is connected on ALL nodes, including the fenced one. On each node do:
suse62:/dev/disk/by-path # iscsiadm -m discovery -t sendtargets -p 10.0.6.10
suse62:/dev/disk/by-path # iscsiadm -m node -T iqn.2003-01.org.linux-iscsi.susestorage.x8664:sn.1789836ce415 -p 10.0.6.10 -l
Logging in to [iface: default, target: iqn.2003-01.org.linux-iscsi.susestorage.x8664:sn.1789836ce415, portal: 10.0.6.10,3260]
Login to [iface: default, target: iqn.2003-01.org.linux-iscsi.susestorage.x8664:sn.1789836ce415, portal: 10.0.6.10,3260] successful.
suse62:/dev/disk/by-path #
THEN, run the sbd “clear fencing poison pill” command:
either locally on the fenced node:
suse62:/dev/disk/by-path # sbd -d /dev/disk/by-path/ip-10.0.6.10:3260-iscsi-iqn.2003-01.org.linux-iscsi.susestorage.x8664:sn.1789836ce415-lun-0 message LOCAL clear
or else from another node in the cluster, replacing LOCAL with the name of the fenced node:
suse61:/dev/disk/by-path # sbd -d /dev/disk/by-path/ip-10.0.6.10:3260-iscsi-iqn.2003-01.org.linux-iscsi.susestorage.x8664:sn.1789836ce415-lun-0 message suse62 clear
I also had to start pacemaker on the fenced node after the reboot, i.e. on suse63:
systemctl start pacemaker
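Putting the recovery steps together, a minimal sketch of what gets run on a fenced node after it reboots (same target, portal and device path as above):
# 1. re-attach the iSCSI disk
iscsiadm -m discovery -t sendtargets -p 10.0.6.10
iscsiadm -m node -T iqn.2003-01.org.linux-iscsi.susestorage.x8664:sn.1789836ce415 -p 10.0.6.10 -l
# 2. clear the node's own slot on the SBD device
sbd -d /dev/disk/by-path/ip-10.0.6.10:3260-iscsi-iqn.2003-01.org.linux-iscsi.susestorage.x8664:sn.1789836ce415-lun-0 message LOCAL clear
# 3. rejoin the cluster and clear resource error states
systemctl start pacemaker
crm_resource --cleanup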
The cluster was then synced correctly. To verify:
suse61:~ # crm cluster restart
INFO: Cluster services stopped
INFO: Cluster services started
suse61:~ #
suse61:~ # crm_resource --cleanup
Cleaned up all resources on all nodes
Then verify again (the "Failed Fencing Actions" entry below is historical: it refers to the point after the reboot when the fenced node suse62 had not yet been cleared of the SBD fence and so could not rejoin the cluster):
suse61:~ # crm_mon
Cluster Summary:
* Stack: corosync
* Current DC: suse63 (version 2.0.4+20200616.2deceaa3a-lp152.2.3.1-2.0.4+20200616.2deceaa3a) – partition with quorum
* Last updated: Sat Mar 13 07:04:38 2021
* Last change: Fri Mar 12 11:14:12 2021 by hacluster via crmd on suse62
* 3 nodes configured
* 2 resource instances configured
Node List:
* Online: [ suse61 suse62 suse63 ]
Active Resources:
* iscsiip (ocf::heartbeat:IPaddr2): Started suse63
* stonith_sbd (stonith:external/sbd): Started suse63
Failed Fencing Actions:
* reboot of suse62 failed: delegate=, client=pacemaker-controld.1993, origin=suse61, last-failed=’2021-03-12 20:55:09Z’
On Reboot
1. Check that the SBD iSCSI disk is present on each node:
suse61:/dev/disk/by-path # ls -l
total 0
lrwxrwxrwx 1 root root 9 Mar 15 13:51 ip-10.0.6.10:3260-iscsi-iqn.2003-01.org.linux-iscsi.susestorage.x8664:sn.1789836ce415-lun-0 -> ../../sdd
If not present, then re-login to the iscsi target server:
iscsiadm -m discovery -t sendtargets -p 10.0.6.10
iscsiadm -m node -T iqn.2003-01.org.linux-iscsi.susestorage.x8664:sn.1789836ce415 -p 10.0.6.10 -l
2. Check that the SBD device header is present (e.g. with the sbd dump command shown earlier). If it is not, re-create the device with:
suse62:/dev/disk/by-path # sbd -d /dev/disk/by-path/ip-10.0.6.10:3260-iscsi-iqn.2003-01.org.linux-iscsi.susestorage.x8664:sn.1789836ce415-lun-0 create
Initializing device /dev/disk/by-path/ip-10.0.6.10:3260-iscsi-iqn.2003-01.org.linux-iscsi.susestorage.x8664:sn.1789836ce415-lun-0
Creating version 2.1 header on device 3 (uuid: 0d1a68bb-8ccf-4471-8bc9-4b2939a5f063)
Initializing 255 slots on device 3
Device /dev/disk/by-path/ip-10.0.6.10:3260-iscsi-iqn.2003-01.org.linux-iscsi.susestorage.x8664:sn.1789836ce415-lun-0 is initialized.
suse62:/dev/disk/by-path #
It should not usually be necessary to start pacemaker or corosync manually, as these are started automatically on each node at boot; they only fail here when the SBD/iSCSI disk is not yet available.
use
crm_resource cleanup
to clear error states.
If nodes still do not join the cluster, on the affected nodes use:
systemctl start pacemaker
see example below:
suse63:/dev/disk/by-path # crm_resource cleanup
Could not connect to the CIB: Transport endpoint is not connected
Error performing operation: Transport endpoint is not connected
suse63:/dev/disk/by-path # systemctl status corosync
● corosync.service – Corosync Cluster Engine
Loaded: loaded (/usr/lib/systemd/system/corosync.service; disabled; vendor preset: disabled)
Active: active (running) since Mon 2021-03-15 13:04:50 GMT; 58min ago
Docs: man:corosync
man:corosync.conf
man:corosync_overview
Main PID: 1828 (corosync)
Tasks: 2
CGroup: /system.slice/corosync.service
└─1828 corosync
Mar 15 13:16:14 suse63 corosync[1828]: [CPG ] downlist left_list: 1 received
Mar 15 13:16:14 suse63 corosync[1828]: [CPG ] downlist left_list: 1 received
Mar 15 13:16:14 suse63 corosync[1828]: [QUORUM] Members[2]: 167773758 167773759
Mar 15 13:16:14 suse63 corosync[1828]: [MAIN ] Completed service synchronization, ready to provide service.
Mar 15 13:16:41 suse63 corosync[1828]: [TOTEM ] A new membership (10.0.6.61:268) was formed. Members joined: 167773757
Mar 15 13:16:41 suse63 corosync[1828]: [CPG ] downlist left_list: 0 received
Mar 15 13:16:41 suse63 corosync[1828]: [CPG ] downlist left_list: 0 received
Mar 15 13:16:41 suse63 corosync[1828]: [CPG ] downlist left_list: 0 received
Mar 15 13:16:41 suse63 corosync[1828]: [QUORUM] Members[3]: 167773757 167773758 167773759
Mar 15 13:16:41 suse63 corosync[1828]: [MAIN ] Completed service synchronization, ready to provide service.
suse63:/dev/disk/by-path # systemctl status pacemaker
● pacemaker.service – Pacemaker High Availability Cluster Manager
Loaded: loaded (/usr/lib/systemd/system/pacemaker.service; enabled; vendor preset: disabled)
Active: inactive (dead)
Docs: man:pacemakerd
https://clusterlabs.org/pacemaker/doc/en-US/Pacemaker/2.0/html-single/Pacemaker_Explained/index.html
Mar 15 13:06:20 suse63 systemd[1]: Dependency failed for Pacemaker High Availability Cluster Manager.
Mar 15 13:06:20 suse63 systemd[1]: pacemaker.service: Job pacemaker.service/start failed with result ‘dependency’.
Mar 15 13:08:46 suse63 systemd[1]: Dependency failed for Pacemaker High Availability Cluster Manager.
Mar 15 13:08:46 suse63 systemd[1]: pacemaker.service: Job pacemaker.service/start failed with result ‘dependency’.
Mar 15 13:13:28 suse63 systemd[1]: Dependency failed for Pacemaker High Availability Cluster Manager.
Mar 15 13:13:28 suse63 systemd[1]: pacemaker.service: Job pacemaker.service/start failed with result ‘dependency’.
Mar 15 13:30:07 suse63 systemd[1]: Dependency failed for Pacemaker High Availability Cluster Manager.
Mar 15 13:30:07 suse63 systemd[1]: pacemaker.service: Job pacemaker.service/start failed with result ‘dependency’.
suse63:/dev/disk/by-path # systemctl start pacemaker
suse63:/dev/disk/by-path # systemctl status pacemaker
● pacemaker.service – Pacemaker High Availability Cluster Manager
Loaded: loaded (/usr/lib/systemd/system/pacemaker.service; enabled; vendor preset: disabled)
Active: active (running) since Mon 2021-03-15 14:03:54 GMT; 2s ago
Docs: man:pacemakerd
https://clusterlabs.org/pacemaker/doc/en-US/Pacemaker/2.0/html-single/Pacemaker_Explained/index.html
Main PID: 2474 (pacemakerd)
Tasks: 7
CGroup: /system.slice/pacemaker.service
├─2474 /usr/sbin/pacemakerd -f
├─2475 /usr/lib/pacemaker/pacemaker-based
├─2476 /usr/lib/pacemaker/pacemaker-fenced
├─2477 /usr/lib/pacemaker/pacemaker-execd
├─2478 /usr/lib/pacemaker/pacemaker-attrd
├─2479 /usr/lib/pacemaker/pacemaker-schedulerd
└─2480 /usr/lib/pacemaker/pacemaker-controld
Mar 15 14:03:56 suse63 pacemaker-controld[2480]: notice: Could not obtain a node name for corosync nodeid 167773758
Mar 15 14:03:56 suse63 pacemaker-controld[2480]: notice: Node (null) state is now member
Mar 15 14:03:56 suse63 pacemaker-controld[2480]: notice: Node suse63 state is now member
Mar 15 14:03:56 suse63 pacemaker-controld[2480]: notice: Defaulting to uname -n for the local corosync node name
Mar 15 14:03:56 suse63 pacemaker-controld[2480]: notice: Pacemaker controller successfully started and accepting connections
Mar 15 14:03:56 suse63 pacemaker-controld[2480]: notice: State transition S_STARTING -> S_PENDING
Mar 15 14:03:57 suse63 pacemaker-controld[2480]: notice: Could not obtain a node name for corosync nodeid 167773757
Mar 15 14:03:57 suse63 pacemaker-controld[2480]: notice: Could not obtain a node name for corosync nodeid 167773758
Mar 15 14:03:57 suse63 pacemaker-controld[2480]: notice: Fencer successfully connected
Mar 15 14:03:57 suse63 pacemaker-controld[2480]: notice: State transition S_PENDING -> S_NOT_DC
suse63:/dev/disk/by-path #
To start the cluster:
crm cluster start
SBD Command Syntax
suse61:~ # sbd
Not enough arguments.
Shared storage fencing tool.
Syntax:
sbd <options> <command> <cmdarguments>
Options:
-d <devname> Block device to use (mandatory; can be specified up to 3 times)
-h Display this help.
-n <node> Set local node name; defaults to uname -n (optional)
-R Do NOT enable realtime priority (debugging only)
-W Use watchdog (recommended) (watch only)
-w <dev> Specify watchdog device (optional) (watch only)
-T Do NOT initialize the watchdog timeout (watch only)
-S <0|1> Set start mode if the node was previously fenced (watch only)
-p <path> Write pidfile to the specified path (watch only)
-v|-vv|-vvv Enable verbose|debug|debug-library logging (optional)
-1 <N> Set watchdog timeout to N seconds (optional, create only)
-2 <N> Set slot allocation timeout to N seconds (optional, create only)
-3 <N> Set daemon loop timeout to N seconds (optional, create only)
-4 <N> Set msgwait timeout to N seconds (optional, create only)
-5 <N> Warn if loop latency exceeds threshold (optional, watch only)
(default is 3, set to 0 to disable)
-C <N> Watchdog timeout to set before crashdumping
(def: 0s = disable gracefully, optional)
-I <N> Async IO read timeout (defaults to 3 * loop timeout, optional)
-s <N> Timeout to wait for devices to become available (def: 120s)
-t <N> Dampening delay before faulty servants are restarted (optional)
(default is 5, set to 0 to disable)
-F <N> # of failures before a servant is considered faulty (optional)
(default is 1, set to 0 to disable)
-P Check Pacemaker quorum and node health (optional, watch only)
-Z Enable trace mode. WARNING: UNSAFE FOR PRODUCTION!
-r Set timeout-action to comma-separated combination of
noflush|flush plus reboot|crashdump|off (default is flush,reboot)
Commands:
create initialize N slots on <dev> – OVERWRITES DEVICE!
list List all allocated slots on device, and messages.
dump Dump meta-data header from device.
allocate <node>
Allocate a slot for node (optional)
message <node> (test|reset|off|crashdump|clear|exit)
Writes the specified message to node’s slot.
watch Loop forever, monitoring own slot
query-watchdog Check for available watchdog-devices and print some info
test-watchdog Test the watchdog-device selected.
Attention: This will arm the watchdog and have your system reset
in case your watchdog is working properly!
suse61:~ #