Bug 2166243
Summary: | Commands `pcs stonith sbd enable|disable` do not work properly when cluster is not running | |||
---|---|---|---|---|
Product: | Red Hat Enterprise Linux 8 | Reporter: | Miroslav Lisik <mlisik> | |
Component: | pcs | Assignee: | Miroslav Lisik <mlisik> | |
Status: | CLOSED ERRATA | QA Contact: | cluster-qe <cluster-qe> | |
Severity: | high | Docs Contact: | ||
Priority: | high | |||
Version: | 8.8 | CC: | cluster-maint, idevat, mlisik, mpospisi, nhostako, omular, tojeline | |
Target Milestone: | rc | Keywords: | Regression, Triaged | |
Target Release: | 8.8 | |||
Hardware: | Unspecified | |||
OS: | Unspecified | |||
Whiteboard: | ||||
Fixed In Version: | pcs-0.10.15-4.el8 | Doc Type: | No Doc Update | |
Doc Text: | The affected packages have not been released. |
Story Points: | --- | |
Clone Of: | ||||
: | 2166249 | Environment: | |
Last Closed: | 2023-05-16 08:12:43 UTC | Type: | Bug | |
Regression: | --- | Mount Type: | --- | |
Documentation: | --- | CRM: | ||
Verified Versions: | Category: | --- | ||
Bug Depends On: | ||||
Bug Blocks: | 2166249 |
Description
Miroslav Lisik
2023-02-01 09:43:18 UTC
Another additional info:

[root@r88-1 ~]# rpm -q pcs
pcs-0.10.15-1.el8.x86_64

[root@r88-1 ~]# pcs host auth -u hacluster -p $PASSWORD r88-{1..2}.vm
r88-1.vm: Authorized
r88-2.vm: Authorized
[root@r88-1 ~]# pcs cluster setup HACluster r88-{1..2}.vm --start --wait
No addresses specified for host 'r88-1.vm', using 'r88-1.vm'
No addresses specified for host 'r88-2.vm', using 'r88-2.vm'
Destroying cluster on hosts: 'r88-1.vm', 'r88-2.vm'...
r88-1.vm: Successfully destroyed cluster
r88-2.vm: Successfully destroyed cluster
Requesting remove 'pcsd settings' from 'r88-1.vm', 'r88-2.vm'
r88-1.vm: successful removal of the file 'pcsd settings'
r88-2.vm: successful removal of the file 'pcsd settings'
Sending 'corosync authkey', 'pacemaker authkey' to 'r88-1.vm', 'r88-2.vm'
r88-1.vm: successful distribution of the file 'corosync authkey'
r88-1.vm: successful distribution of the file 'pacemaker authkey'
r88-2.vm: successful distribution of the file 'corosync authkey'
r88-2.vm: successful distribution of the file 'pacemaker authkey'
Sending 'corosync.conf' to 'r88-1.vm', 'r88-2.vm'
r88-1.vm: successful distribution of the file 'corosync.conf'
r88-2.vm: successful distribution of the file 'corosync.conf'
Cluster has been successfully set up.
Starting cluster on hosts: 'r88-1.vm', 'r88-2.vm'...
Waiting for node(s) to start: 'r88-1.vm', 'r88-2.vm'...
r88-1.vm: Cluster started
r88-2.vm: Cluster started
[root@r88-1 ~]# pcs cluster stop --all
r88-2.vm: Stopping Cluster (pacemaker)...
r88-1.vm: Stopping Cluster (pacemaker)...
r88-1.vm: Stopping Cluster (corosync)...
r88-2.vm: Stopping Cluster (corosync)...
[root@r88-1 ~]# pcs stonith sbd enable
Running SBD pre-enabling checks...
r88-1.vm: SBD pre-enabling checks done
r88-2.vm: SBD pre-enabling checks done
Warning: auto_tie_breaker quorum option will be enabled to make SBD fencing effective. Cluster has to be offline to be able to make this change.
Checking corosync is not running on nodes...
r88-1.vm: corosync is not running
r88-2.vm: corosync is not running
Sending updated corosync.conf to nodes...
r88-1.vm: Succeeded
r88-2.vm: Succeeded
Distributing SBD config...
r88-1.vm: SBD config saved
r88-2.vm: SBD config saved
Enabling sbd...
r88-2.vm: sbd enabled
r88-1.vm: sbd enabled
Warning: Cluster restart is required in order to apply these changes.
[root@r88-1 ~]# pcs cluster start --all --wait
r88-2.vm: Starting Cluster...
r88-1.vm: Starting Cluster...
Waiting for node(s) to start...
r88-2.vm: Started
r88-1.vm: Started
[root@r88-1 ~]# grep SBD_WATCHDOG_TIMEOUT /etc/sysconfig/sbd
SBD_WATCHDOG_TIMEOUT=5
[root@r88-1 ~]# pcs property set stonith-watchdog-timeout=10
[root@r88-1 ~]# pcs cluster cib | grep stonith-watchdog-timeout
<nvpair id="cib-bootstrap-options-stonith-watchdog-tim" name="stonith-watchdog-timeout" value="10"/>
[root@r88-1 ~]# pcs cluster stop --all
r88-2.vm: Stopping Cluster (pacemaker)...
r88-1.vm: Stopping Cluster (pacemaker)...
r88-1.vm: Stopping Cluster (corosync)...
r88-2.vm: Stopping Cluster (corosync)...

1) sbd enable

[root@r88-1 ~]# grep stonith-watchdog-timeout -R /var/lib/pacemaker/cib; ls -la /var/lib/pacemaker/cib
/var/lib/pacemaker/cib/cib.xml: <nvpair id="cib-bootstrap-options-stonith-watchdog-tim" name="stonith-watchdog-timeout" value="10"/>
total 76
drwxr-x---. 2 hacluster haclient 4096 Jan 31 18:26 .
drwxr-x---. 6 hacluster haclient 4096 Jan 31 16:15 ..
-rw-------. 1 hacluster haclient 258 Jan 31 18:24 cib-1.raw
-rw-------. 1 hacluster haclient 32 Jan 31 18:24 cib-1.raw.sig
-rw-------. 1 hacluster haclient 416 Jan 31 18:24 cib-2.raw
-rw-------. 1 hacluster haclient 32 Jan 31 18:24 cib-2.raw.sig
-rw-------. 1 hacluster haclient 719 Jan 31 18:25 cib-3.raw
-rw-------. 1 hacluster haclient 32 Jan 31 18:25 cib-3.raw.sig
-rw-------. 1 hacluster haclient 958 Jan 31 18:25 cib-4.raw
-rw-------. 1 hacluster haclient 32 Jan 31 18:25 cib-4.raw.sig
-rw-------. 1 hacluster haclient 946 Jan 31 18:25 cib-5.raw
-rw-------. 1 hacluster haclient 32 Jan 31 18:25 cib-5.raw.sig
-rw-------. 1 hacluster haclient 945 Jan 31 18:26 cib-6.raw
-rw-------. 1 hacluster haclient 32 Jan 31 18:26 cib-6.raw.sig
-rw-------. 1 hacluster haclient 957 Jan 31 18:26 cib-7.raw
-rw-------. 1 hacluster haclient 32 Jan 31 18:26 cib-7.raw.sig
-rw-r-----. 1 hacluster haclient 1 Jan 31 18:26 cib.last
-rw-------. 1 hacluster haclient 1065 Jan 31 18:26 cib.xml
-rw-------. 1 hacluster haclient 32 Jan 31 18:26 cib.xml.sig
[root@r88-2 ~]# grep stonith-watchdog-timeout -R /var/lib/pacemaker/cib; ls -la /var/lib/pacemaker/cib
/var/lib/pacemaker/cib/cib.xml: <nvpair id="cib-bootstrap-options-stonith-watchdog-tim" name="stonith-watchdog-timeout" value="10"/>
total 84
drwxr-x---. 2 hacluster haclient 4096 Jan 31 18:26 .
drwxr-x---. 6 hacluster haclient 4096 Jan 31 16:15 ..
-rw-------. 1 hacluster haclient 258 Jan 31 18:24 cib-1.raw
-rw-------. 1 hacluster haclient 32 Jan 31 18:24 cib-1.raw.sig
-rw-------. 1 hacluster haclient 416 Jan 31 18:24 cib-2.raw
-rw-------. 1 hacluster haclient 32 Jan 31 18:24 cib-2.raw.sig
-rw-------. 1 hacluster haclient 719 Jan 31 18:25 cib-3.raw
-rw-------. 1 hacluster haclient 32 Jan 31 18:25 cib-3.raw.sig
-rw-------. 1 hacluster haclient 958 Jan 31 18:25 cib-4.raw
-rw-------. 1 hacluster haclient 32 Jan 31 18:25 cib-4.raw.sig
-rw-------. 1 hacluster haclient 958 Jan 31 18:25 cib-5.raw
-rw-------. 1 hacluster haclient 32 Jan 31 18:25 cib-5.raw.sig
-rw-------. 1 hacluster haclient 946 Jan 31 18:25 cib-6.raw
-rw-------. 1 hacluster haclient 32 Jan 31 18:25 cib-6.raw.sig
-rw-------. 1 hacluster haclient 945 Jan 31 18:26 cib-7.raw
-rw-------. 1 hacluster haclient 32 Jan 31 18:26 cib-7.raw.sig
-rw-------. 1 hacluster haclient 957 Jan 31 18:26 cib-8.raw
-rw-------. 1 hacluster haclient 32 Jan 31 18:26 cib-8.raw.sig
-rw-r-----. 1 hacluster haclient 1 Jan 31 18:26 cib.last
-rw-------. 1 hacluster haclient 1065 Jan 31 18:26 cib.xml
-rw-------. 1 hacluster haclient 32 Jan 31 18:26 cib.xml.sig
[root@r88-1 ~]# pcs stonith sbd enable
Running SBD pre-enabling checks...
r88-1.vm: SBD pre-enabling checks done
r88-2.vm: SBD pre-enabling checks done
Distributing SBD config...
r88-1.vm: SBD config saved
r88-2.vm: SBD config saved
Enabling sbd...
r88-1.vm: sbd enabled
r88-2.vm: sbd enabled
Warning: Cluster restart is required in order to apply these changes.
[root@r88-1 ~]# grep stonith-watchdog-timeout -R /var/lib/pacemaker/cib; ls -la /var/lib/pacemaker/cib
total 76
drwxr-x---. 2 hacluster haclient 4096 Jan 31 18:26 .
drwxr-x---. 6 hacluster haclient 4096 Jan 31 16:15 ..
-rw-------. 1 hacluster haclient 258 Jan 31 18:24 cib-1.raw
-rw-------. 1 hacluster haclient 32 Jan 31 18:24 cib-1.raw.sig
-rw-------. 1 hacluster haclient 416 Jan 31 18:24 cib-2.raw
-rw-------. 1 hacluster haclient 32 Jan 31 18:24 cib-2.raw.sig
-rw-------. 1 hacluster haclient 719 Jan 31 18:25 cib-3.raw
-rw-------. 1 hacluster haclient 32 Jan 31 18:25 cib-3.raw.sig
-rw-------. 1 hacluster haclient 958 Jan 31 18:25 cib-4.raw
-rw-------. 1 hacluster haclient 32 Jan 31 18:25 cib-4.raw.sig
-rw-------. 1 hacluster haclient 946 Jan 31 18:25 cib-5.raw
-rw-------. 1 hacluster haclient 32 Jan 31 18:25 cib-5.raw.sig
-rw-------. 1 hacluster haclient 945 Jan 31 18:26 cib-6.raw
-rw-------. 1 hacluster haclient 32 Jan 31 18:26 cib-6.raw.sig
-rw-------. 1 hacluster haclient 957 Jan 31 18:26 cib-7.raw
-rw-------. 1 hacluster haclient 32 Jan 31 18:26 cib-7.raw.sig
-rw-r-----. 1 hacluster haclient 1 Jan 31 18:26 cib.last
-rw-------. 1 hacluster haclient 968 Jan 31 18:28 cib.xml
-rw-------. 1 hacluster haclient 32 Jan 31 18:26 cib.xml.sig
[root@r88-2 ~]# grep stonith-watchdog-timeout -R /var/lib/pacemaker/cib; ls -la /var/lib/pacemaker/cib
/var/lib/pacemaker/cib/cib.xml: <nvpair id="cib-bootstrap-options-stonith-watchdog-tim" name="stonith-watchdog-timeout" value="10"/>
total 84
drwxr-x---. 2 hacluster haclient 4096 Jan 31 18:26 .
drwxr-x---. 6 hacluster haclient 4096 Jan 31 16:15 ..
-rw-------. 1 hacluster haclient 258 Jan 31 18:24 cib-1.raw
-rw-------. 1 hacluster haclient 32 Jan 31 18:24 cib-1.raw.sig
-rw-------. 1 hacluster haclient 416 Jan 31 18:24 cib-2.raw
-rw-------. 1 hacluster haclient 32 Jan 31 18:24 cib-2.raw.sig
-rw-------. 1 hacluster haclient 719 Jan 31 18:25 cib-3.raw
-rw-------. 1 hacluster haclient 32 Jan 31 18:25 cib-3.raw.sig
-rw-------. 1 hacluster haclient 958 Jan 31 18:25 cib-4.raw
-rw-------. 1 hacluster haclient 32 Jan 31 18:25 cib-4.raw.sig
-rw-------. 1 hacluster haclient 958 Jan 31 18:25 cib-5.raw
-rw-------. 1 hacluster haclient 32 Jan 31 18:25 cib-5.raw.sig
-rw-------. 1 hacluster haclient 946 Jan 31 18:25 cib-6.raw
-rw-------. 1 hacluster haclient 32 Jan 31 18:25 cib-6.raw.sig
-rw-------. 1 hacluster haclient 945 Jan 31 18:26 cib-7.raw
-rw-------. 1 hacluster haclient 32 Jan 31 18:26 cib-7.raw.sig
-rw-------. 1 hacluster haclient 957 Jan 31 18:26 cib-8.raw
-rw-------. 1 hacluster haclient 32 Jan 31 18:26 cib-8.raw.sig
-rw-r-----. 1 hacluster haclient 1 Jan 31 18:26 cib.last
-rw-------. 1 hacluster haclient 1065 Jan 31 18:26 cib.xml
-rw-------. 1 hacluster haclient 32 Jan 31 18:26 cib.xml.sig
[root@r88-1 ~]# pcs cluster start --all --wait
r88-2.vm: Starting Cluster...
r88-1.vm: Starting Cluster...
Waiting for node(s) to start...
r88-2.vm: Started
r88-1.vm: Started
[root@r88-1 ~]# pcs cluster cib | grep stonith-watchdog-timeout
<nvpair id="cib-bootstrap-options-stonith-watchdog-tim" name="stonith-watchdog-timeout" value="10"/>

RESULT: There is stonith-watchdog-timeout set from previous configuration. It should be removed during sbd enable. Another issue is that pcs is removing stonith-watchdog-timeout only on one node.

[root@r88-1 ~]# pcs cluster stop --all
r88-2.vm: Stopping Cluster (pacemaker)...
r88-1.vm: Stopping Cluster (pacemaker)...
r88-2.vm: Stopping Cluster (corosync)...
r88-1.vm: Stopping Cluster (corosync)...
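
The per-node check used in scenario 1 (grepping each node's cib.xml for stonith-watchdog-timeout) can be wrapped in a small helper. This is only a sketch and not part of pcs: the function name `cib_watchdog_timeout` and the `unset` marker are made up for illustration, and it assumes the property sits on a single line in the `<nvpair ... name=... value=.../>` form shown in the output above.

```shell
# Print the stonith-watchdog-timeout value stored in a CIB file,
# or "unset" when the property is absent (or the file cannot be read).
# Hypothetical helper for illustration; not a pcs or pacemaker command.
cib_watchdog_timeout() {
    # Capture the text inside value="..." on the matching nvpair line.
    value=$(sed -n 's/.*name="stonith-watchdog-timeout" value="\([^"]*\)".*/\1/p' "$1" 2>/dev/null | head -n 1)
    printf '%s\n' "${value:-unset}"
}
```

Running `cib_watchdog_timeout /var/lib/pacemaker/cib/cib.xml` on every node while the cluster is stopped and comparing the results would show the inconsistency reported above: one node prints `unset` while the other still prints `10`.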

2) sbd disable

[root@r88-1 ~]# grep stonith-watchdog-timeout -R /var/lib/pacemaker/cib; ls -la /var/lib/pacemaker/cib
/var/lib/pacemaker/cib/cib.xml: <nvpair id="cib-bootstrap-options-stonith-watchdog-tim" name="stonith-watchdog-timeout" value="10"/>
total 92
drwxr-x---. 2 hacluster haclient 4096 Jan 31 18:31 .
drwxr-x---. 6 hacluster haclient 4096 Jan 31 16:15 ..
-rw-------. 1 hacluster haclient 258 Jan 31 18:24 cib-1.raw
-rw-------. 1 hacluster haclient 32 Jan 31 18:24 cib-1.raw.sig
-rw-------. 1 hacluster haclient 416 Jan 31 18:24 cib-2.raw
-rw-------. 1 hacluster haclient 32 Jan 31 18:24 cib-2.raw.sig
-rw-------. 1 hacluster haclient 719 Jan 31 18:25 cib-3.raw
-rw-------. 1 hacluster haclient 32 Jan 31 18:25 cib-3.raw.sig
-rw-------. 1 hacluster haclient 958 Jan 31 18:25 cib-4.raw
-rw-------. 1 hacluster haclient 32 Jan 31 18:25 cib-4.raw.sig
-rw-------. 1 hacluster haclient 946 Jan 31 18:25 cib-5.raw
-rw-------. 1 hacluster haclient 32 Jan 31 18:25 cib-5.raw.sig
-rw-------. 1 hacluster haclient 945 Jan 31 18:26 cib-6.raw
-rw-------. 1 hacluster haclient 32 Jan 31 18:26 cib-6.raw.sig
-rw-------. 1 hacluster haclient 957 Jan 31 18:26 cib-7.raw
-rw-------. 1 hacluster haclient 32 Jan 31 18:26 cib-7.raw.sig
-rw-------. 1 hacluster haclient 945 Jan 31 18:30 cib-9.raw
-rw-------. 1 hacluster haclient 32 Jan 31 18:30 cib-9.raw.sig
-rw-------. 1 hacluster haclient 968 Jan 31 18:28 cib.auto.2O0wo1
-rw-------. 1 hacluster haclient 32 Jan 31 18:26 cib.auto.scCD80
-rw-r-----. 1 hacluster haclient 2 Jan 31 18:31 cib.last
-rw-------. 1 hacluster haclient 1054 Jan 31 18:31 cib.xml
-rw-------. 1 hacluster haclient 32 Jan 31 18:31 cib.xml.sig
[root@r88-2 ~]# grep stonith-watchdog-timeout -R /var/lib/pacemaker/cib; ls -la /var/lib/pacemaker/cib
/var/lib/pacemaker/cib/cib-9.raw: <nvpair id="cib-bootstrap-options-stonith-watchdog-tim" name="stonith-watchdog-timeout" value="10"/>
/var/lib/pacemaker/cib/cib-10.raw: <nvpair id="cib-bootstrap-options-stonith-watchdog-tim" name="stonith-watchdog-timeout" value="10"/>
/var/lib/pacemaker/cib/cib.xml: <nvpair id="cib-bootstrap-options-stonith-watchdog-tim" name="stonith-watchdog-timeout" value="10"/>
total 100
drwxr-x---. 2 hacluster haclient 4096 Jan 31 18:31 .
drwxr-x---. 6 hacluster haclient 4096 Jan 31 16:15 ..
-rw-------. 1 hacluster haclient 1053 Jan 31 18:30 cib-10.raw
-rw-------. 1 hacluster haclient 32 Jan 31 18:30 cib-10.raw.sig
-rw-------. 1 hacluster haclient 258 Jan 31 18:24 cib-1.raw
-rw-------. 1 hacluster haclient 32 Jan 31 18:24 cib-1.raw.sig
-rw-------. 1 hacluster haclient 416 Jan 31 18:24 cib-2.raw
-rw-------. 1 hacluster haclient 32 Jan 31 18:24 cib-2.raw.sig
-rw-------. 1 hacluster haclient 719 Jan 31 18:25 cib-3.raw
-rw-------. 1 hacluster haclient 32 Jan 31 18:25 cib-3.raw.sig
-rw-------. 1 hacluster haclient 958 Jan 31 18:25 cib-4.raw
-rw-------. 1 hacluster haclient 32 Jan 31 18:25 cib-4.raw.sig
-rw-------. 1 hacluster haclient 958 Jan 31 18:25 cib-5.raw
-rw-------. 1 hacluster haclient 32 Jan 31 18:25 cib-5.raw.sig
-rw-------. 1 hacluster haclient 946 Jan 31 18:25 cib-6.raw
-rw-------. 1 hacluster haclient 32 Jan 31 18:25 cib-6.raw.sig
-rw-------. 1 hacluster haclient 945 Jan 31 18:26 cib-7.raw
-rw-------. 1 hacluster haclient 32 Jan 31 18:26 cib-7.raw.sig
-rw-------. 1 hacluster haclient 957 Jan 31 18:26 cib-8.raw
-rw-------. 1 hacluster haclient 32 Jan 31 18:26 cib-8.raw.sig
-rw-------. 1 hacluster haclient 1065 Jan 31 18:26 cib-9.raw
-rw-------. 1 hacluster haclient 32 Jan 31 18:26 cib-9.raw.sig
-rw-r-----. 1 hacluster haclient 2 Jan 31 18:31 cib.last
-rw-------. 1 hacluster haclient 1053 Jan 31 18:31 cib.xml
-rw-------. 1 hacluster haclient 32 Jan 31 18:31 cib.xml.sig
[root@r88-1 ~]# pcs stonith sbd disable
Disabling sbd...
r88-1.vm: sbd disabled
r88-2.vm: sbd disabled
Warning: Cluster restart is required in order to apply these changes.
[root@r88-1 ~]# grep stonith-watchdog-timeout -R /var/lib/pacemaker/cib; ls -la /var/lib/pacemaker/cib
/var/lib/pacemaker/cib/cib.xml: <nvpair id="cib-bootstrap-options-stonith-watchdog-tim" name="stonith-watchdog-timeout" value="0"/>
total 92
drwxr-x---. 2 hacluster haclient 4096 Jan 31 18:31 .
drwxr-x---. 6 hacluster haclient 4096 Jan 31 16:15 ..
-rw-------. 1 hacluster haclient 258 Jan 31 18:24 cib-1.raw
-rw-------. 1 hacluster haclient 32 Jan 31 18:24 cib-1.raw.sig
-rw-------. 1 hacluster haclient 416 Jan 31 18:24 cib-2.raw
-rw-------. 1 hacluster haclient 32 Jan 31 18:24 cib-2.raw.sig
-rw-------. 1 hacluster haclient 719 Jan 31 18:25 cib-3.raw
-rw-------. 1 hacluster haclient 32 Jan 31 18:25 cib-3.raw.sig
-rw-------. 1 hacluster haclient 958 Jan 31 18:25 cib-4.raw
-rw-------. 1 hacluster haclient 32 Jan 31 18:25 cib-4.raw.sig
-rw-------. 1 hacluster haclient 946 Jan 31 18:25 cib-5.raw
-rw-------. 1 hacluster haclient 32 Jan 31 18:25 cib-5.raw.sig
-rw-------. 1 hacluster haclient 945 Jan 31 18:26 cib-6.raw
-rw-------. 1 hacluster haclient 32 Jan 31 18:26 cib-6.raw.sig
-rw-------. 1 hacluster haclient 957 Jan 31 18:26 cib-7.raw
-rw-------. 1 hacluster haclient 32 Jan 31 18:26 cib-7.raw.sig
-rw-------. 1 hacluster haclient 945 Jan 31 18:30 cib-9.raw
-rw-------. 1 hacluster haclient 32 Jan 31 18:30 cib-9.raw.sig
-rw-------. 1 hacluster haclient 968 Jan 31 18:28 cib.auto.2O0wo1
-rw-------. 1 hacluster haclient 32 Jan 31 18:26 cib.auto.scCD80
-rw-r-----. 1 hacluster haclient 2 Jan 31 18:31 cib.last
-rw-------. 1 hacluster haclient 1065 Jan 31 18:32 cib.xml
-rw-------. 1 hacluster haclient 32 Jan 31 18:31 cib.xml.sig
[root@r88-2 ~]# grep stonith-watchdog-timeout -R /var/lib/pacemaker/cib; ls -la /var/lib/pacemaker/cib
/var/lib/pacemaker/cib/cib-9.raw: <nvpair id="cib-bootstrap-options-stonith-watchdog-tim" name="stonith-watchdog-timeout" value="10"/>
/var/lib/pacemaker/cib/cib-10.raw: <nvpair id="cib-bootstrap-options-stonith-watchdog-tim" name="stonith-watchdog-timeout" value="10"/>
/var/lib/pacemaker/cib/cib.xml: <nvpair id="cib-bootstrap-options-stonith-watchdog-tim" name="stonith-watchdog-timeout" value="10"/>
total 100
drwxr-x---. 2 hacluster haclient 4096 Jan 31 18:31 .
drwxr-x---. 6 hacluster haclient 4096 Jan 31 16:15 ..
-rw-------. 1 hacluster haclient 1053 Jan 31 18:30 cib-10.raw
-rw-------. 1 hacluster haclient 32 Jan 31 18:30 cib-10.raw.sig
-rw-------. 1 hacluster haclient 258 Jan 31 18:24 cib-1.raw
-rw-------. 1 hacluster haclient 32 Jan 31 18:24 cib-1.raw.sig
-rw-------. 1 hacluster haclient 416 Jan 31 18:24 cib-2.raw
-rw-------. 1 hacluster haclient 32 Jan 31 18:24 cib-2.raw.sig
-rw-------. 1 hacluster haclient 719 Jan 31 18:25 cib-3.raw
-rw-------. 1 hacluster haclient 32 Jan 31 18:25 cib-3.raw.sig
-rw-------. 1 hacluster haclient 958 Jan 31 18:25 cib-4.raw
-rw-------. 1 hacluster haclient 32 Jan 31 18:25 cib-4.raw.sig
-rw-------. 1 hacluster haclient 958 Jan 31 18:25 cib-5.raw
-rw-------. 1 hacluster haclient 32 Jan 31 18:25 cib-5.raw.sig
-rw-------. 1 hacluster haclient 946 Jan 31 18:25 cib-6.raw
-rw-------. 1 hacluster haclient 32 Jan 31 18:25 cib-6.raw.sig
-rw-------. 1 hacluster haclient 945 Jan 31 18:26 cib-7.raw
-rw-------. 1 hacluster haclient 32 Jan 31 18:26 cib-7.raw.sig
-rw-------. 1 hacluster haclient 957 Jan 31 18:26 cib-8.raw
-rw-------. 1 hacluster haclient 32 Jan 31 18:26 cib-8.raw.sig
-rw-------. 1 hacluster haclient 1065 Jan 31 18:26 cib-9.raw
-rw-------. 1 hacluster haclient 32 Jan 31 18:26 cib-9.raw.sig
-rw-r-----. 1 hacluster haclient 2 Jan 31 18:31 cib.last
-rw-------. 1 hacluster haclient 1053 Jan 31 18:31 cib.xml
-rw-------. 1 hacluster haclient 32 Jan 31 18:31 cib.xml.sig
[root@r88-1 ~]# pcs cluster start --all --wait
r88-2.vm: Starting Cluster...
r88-1.vm: Starting Cluster...
Waiting for node(s) to start...
r88-1.vm: Started
[root@r88-2 ~]#
Broadcast message from systemd-journald@r88-2 (Tue 2023-01-31 18:35:18 CET):
pacemaker-controld[11656]: emerg: Shutting down: stonith-watchdog-timeout configured (10) but SBD not active
Message from syslogd@r88-2 at Jan 31 18:35:18 ...
pacemaker-controld[11656]: emerg: Shutting down: stonith-watchdog-timeout configured (10) but SBD not active
[root@r88-1 ~]# pcs cluster start --all --wait=1
r88-1.vm: Starting Cluster...
r88-2.vm: Starting Cluster...
Waiting for node(s) to start...
Broadcast message from systemd-journald@r88-1 (Tue 2023-01-31 18:40:47 CET):
pacemaker-controld[6920]: emerg: Shutting down: stonith-watchdog-timeout configured (10) but SBD not active
Message from syslogd@r88-1 at Jan 31 18:40:47 ...
pacemaker-controld[6920]: emerg: Shutting down: stonith-watchdog-timeout configured (10) but SBD not active
r88-1.vm: Waiting timeout
r88-2.vm: Started
Error: unable to verify all nodes have started

RESULT: Cluster cannot start because stonith-watchdog-timeout was not set to 0 during disable.

3) enabling sbd before first cluster start

[root@r88-1 ~]# pcs host auth -u hacluster -p $PASSWORD r88-{1..2}.vm
r88-1.vm: Authorized
r88-2.vm: Authorized
[root@r88-1 ~]# pcs cluster setup HACluster r88-{1..2}.vm
No addresses specified for host 'r88-1.vm', using 'r88-1.vm'
No addresses specified for host 'r88-2.vm', using 'r88-2.vm'
Destroying cluster on hosts: 'r88-1.vm', 'r88-2.vm'...
r88-1.vm: Successfully destroyed cluster
r88-2.vm: Successfully destroyed cluster
Requesting remove 'pcsd settings' from 'r88-1.vm', 'r88-2.vm'
r88-1.vm: successful removal of the file 'pcsd settings'
r88-2.vm: successful removal of the file 'pcsd settings'
Sending 'corosync authkey', 'pacemaker authkey' to 'r88-1.vm', 'r88-2.vm'
r88-1.vm: successful distribution of the file 'corosync authkey'
r88-1.vm: successful distribution of the file 'pacemaker authkey'
r88-2.vm: successful distribution of the file 'corosync authkey'
r88-2.vm: successful distribution of the file 'pacemaker authkey'
Sending 'corosync.conf' to 'r88-1.vm', 'r88-2.vm'
r88-1.vm: successful distribution of the file 'corosync.conf'
r88-2.vm: successful distribution of the file 'corosync.conf'
Cluster has been successfully set up.
[root@r88-1 ~]# ls -la /var/lib/pacemaker/cib
total 8
drwxr-x---. 2 hacluster haclient 4096 Feb 1 09:46 .
drwxr-x---. 6 hacluster haclient 4096 Jan 31 16:15 ..
[root@r88-1 ~]# pcs stonith sbd enable
Running SBD pre-enabling checks...
r88-1.vm: SBD pre-enabling checks done
r88-2.vm: SBD pre-enabling checks done
Warning: auto_tie_breaker quorum option will be enabled to make SBD fencing effective. Cluster has to be offline to be able to make this change.
Checking corosync is not running on nodes...
r88-2.vm: corosync is not running
r88-1.vm: corosync is not running
Sending updated corosync.conf to nodes...
r88-1.vm: Succeeded
r88-2.vm: Succeeded
Distributing SBD config...
r88-1.vm: SBD config saved
r88-2.vm: SBD config saved
Enabling sbd...
r88-1.vm: sbd enabled
r88-2.vm: sbd enabled
Warning: Cluster restart is required in order to apply these changes.
[root@r88-1 ~]# ls -la /var/lib/pacemaker/cib
total 12
drwxr-x---. 2 hacluster haclient 4096 Feb 1 09:57 .
drwxr-x---. 6 hacluster haclient 4096 Jan 31 16:15 ..
-rw-r--r--. 1 root root 275 Feb 1 09:57 cib.xml
[root@r88-1 ~]# cat /var/lib/pacemaker/cib/cib.xml
<cib admin_epoch="0" epoch="2" num_updates="0" validate-with="pacemaker-3.1">
  <configuration>
    <crm_config>
      <cluster_property_set id="cib-bootstrap-options"/>
    </crm_config>
    <nodes/>
    <resources/>
    <constraints/>
  </configuration>
  <status/>
</cib>
[root@r88-1 ~]# pcs cluster start --all --wait
r88-2.vm: Starting Cluster...
r88-1.vm: Starting Cluster...
Waiting for node(s) to start...
r88-2.vm: Started
^CTraceback (most recent call last):
<snip>
KeyboardInterrupt

RESULT: command hung and node r88-1 has not fully started

[root@r88-1 ~]# systemctl status pacemaker
● pacemaker.service - Pacemaker High Availability Cluster Manager
   Loaded: loaded (/usr/lib/systemd/system/pacemaker.service; disabled; vendor preset: disabled)
   Active: active (running) since Wed 2023-02-01 10:02:39 CET; 4min 4s ago
     Docs: man:pacemakerd
           https://clusterlabs.org/pacemaker/doc/
 Main PID: 7637 (pacemakerd)
    Tasks: 6
   Memory: 21.1M
   CGroup: /system.slice/pacemaker.service
           ├─7637 /usr/sbin/pacemakerd
           ├─7641 /usr/libexec/pacemaker/pacemaker-fenced
           ├─7642 /usr/libexec/pacemaker/pacemaker-execd
           ├─7644 /usr/libexec/pacemaker/pacemaker-schedulerd
           ├─8498 /usr/libexec/pacemaker/pacemaker-attrd
           └─8499 /usr/libexec/pacemaker/pacemaker-controld

Feb 01 10:06:37 r88-1 pacemaker-attrd[8498]: notice: Starting Pacemaker node attribute manager
Feb 01 10:06:37 r88-1 pacemaker-controld[8423]: warning: Couldn't complete CIB registration 10 times... pause and retry
Feb 01 10:06:39 r88-1 pacemakerd[7637]: error: pacemaker-controld[8423] is unresponsive to ipc after 5 tries but we found the pid so have it killed that we can restart
Feb 01 10:06:39 r88-1 pacemakerd[7637]: notice: Stopping pacemaker-controld
Feb 01 10:06:39 r88-1 pacemakerd[7637]: warning: pacemaker-controld[8423] terminated with signal 9 (Killed)
Feb 01 10:06:39 r88-1 pacemakerd[7637]: notice: Respawning pacemaker-controld subdaemon after unexpected exit
Feb 01 10:06:39 r88-1 pacemaker-controld[8499]: notice: Additional logging available in /var/log/pacemaker/pacemaker.log
Feb 01 10:06:39 r88-1 pacemaker-controld[8499]: notice: Starting Pacemaker controller
Feb 01 10:06:40 r88-1 pacemaker-controld[8499]: warning: Couldn't complete CIB registration 1 times... pause and retry
Feb 01 10:06:43 r88-1 pacemakerd[7637]: notice: pacemaker-attrd[8498] is unresponsive to ipc after 1 tries

Snippet from journalctl:
Feb 01 10:03:02 r88-1 pacemaker-based[7781]: notice: Additional logging available in /var/log/pacemaker/pacemaker.log
Feb 01 10:03:02 r88-1 pacemaker-based[7781]: notice: Starting Pacemaker CIB manager
Feb 01 10:03:02 r88-1 pacemaker-based[7781]: notice: /var/lib/pacemaker/cib/cib.xml is not owned by user hacluster
Feb 01 10:03:02 r88-1 pacemaker-based[7781]: notice: /var/lib/pacemaker/cib/cib.xml is not owned by group haclient
Feb 01 10:03:02 r88-1 pacemaker-based[7781]: error: /var/lib/pacemaker/cib/cib.xml must be owned and writable by either user hacluster or group haclient
Feb 01 10:03:02 r88-1 pacemaker-based[7781]: error: Ignoring invalid CIB
Feb 01 10:03:02 r88-1 pacemaker-based[7781]: crit: Could not write out new CIB and no saved version to revert to
Feb 01 10:03:02 r88-1 pacemaker-based[7781]: crit: Cannot start CIB... terminating
Feb 01 10:03:02 r88-1 pacemakerd[7637]: error: pacemaker-based[7781] exited with status 66 (Input file not available)
Feb 01 10:03:02 r88-1 pacemakerd[7637]: notice: Respawning pacemaker-based subdaemon after unexpected exit

Upstream patch:
https://github.com/ClusterLabs/pcs/commit/b2086863548d94be0df3dfafcf5d6369e92d4c0f

Test:

(pcs) [root@r88-1 pcs]# pcs/pcs host auth -u hacluster -p $PASSWORD r88-{1..2}.vm
r88-1.vm: Authorized
r88-2.vm: Authorized
(pcs) [root@r88-1 pcs]# pcs/pcs cluster setup HACluster r88-{1..2}.vm
<snip>

>>> Enabling sbd before first cluster start
(pcs) [root@r88-1 pcs]# pcs/pcs stonith sbd enable
<snip>
(pcs) [root@r88-1 pcs]# ls -la /var/lib/pacemaker/cib
total 8
drwxr-x---. 2 hacluster haclient 4096 Feb 2 13:05 .
drwxr-x---. 6 hacluster haclient 4096 Jan 31 16:15 ..
(pcs) [root@r88-2 pcs]# ls -la /var/lib/pacemaker/cib
total 8
drwxr-x---. 2 hacluster haclient 4096 Feb 2 13:05 .
drwxr-x---. 6 hacluster haclient 4096 Jan 31 16:15 ..
(pcs) [root@r88-1 pcs]# pcs/pcs cluster start --all --wait
<snip>
(pcs) [root@r88-1 pcs]# pcs/pcs status | grep -A1 "Node List"
Node List:
  * Online: [ r88-1.vm r88-2.vm ]
>>> Cluster started without issues.

>>> Set stonith-watchdog-timeout and stop cluster.
(pcs) [root@r88-1 pcs]# pcs/pcs property set stonith-watchdog-timeout=10
(pcs) [root@r88-1 pcs]# pcs/pcs cluster cib | grep stonith-watchdog-timeout
<nvpair id="cib-bootstrap-options-stonith-watchdog-tim" name="stonith-watchdog-timeout" value="10"/>
(pcs) [root@r88-1 pcs]# pcs/pcs cluster stop --all
<snip>

>>> Check cib file before and after enabling sbd.
(pcs) [root@r88-1 pcs]# grep stonith-watchdog-timeout -R /var/lib/pacemaker/cib; ls -la /var/lib/pacemaker/cib/cib.xml*
/var/lib/pacemaker/cib/cib.xml: <nvpair id="cib-bootstrap-options-stonith-watchdog-tim" name="stonith-watchdog-timeout" value="10"/>
-rw-------. 1 hacluster haclient 1065 Feb 2 13:21 /var/lib/pacemaker/cib/cib.xml
-rw-------. 1 hacluster haclient 32 Feb 2 13:21 /var/lib/pacemaker/cib/cib.xml.sig
(pcs) [root@r88-2 pcs]# grep stonith-watchdog-timeout -R /var/lib/pacemaker/cib; ls -la /var/lib/pacemaker/cib/cib.xml*
/var/lib/pacemaker/cib/cib.xml: <nvpair id="cib-bootstrap-options-stonith-watchdog-tim" name="stonith-watchdog-timeout" value="10"/>
-rw-------. 1 hacluster haclient 1065 Feb 2 13:21 /var/lib/pacemaker/cib/cib.xml
-rw-------. 1 hacluster haclient 32 Feb 2 13:21 /var/lib/pacemaker/cib/cib.xml.sig
(pcs) [root@r88-1 pcs]# pcs/pcs stonith sbd enable
<snip>
(pcs) [root@r88-1 pcs]# grep stonith-watchdog-timeout -R /var/lib/pacemaker/cib; ls -la /var/lib/pacemaker/cib/cib.xml*
-rw-------. 1 hacluster haclient 968 Feb 2 13:23 /var/lib/pacemaker/cib/cib.xml
(pcs) [root@r88-2 pcs]# grep stonith-watchdog-timeout -R /var/lib/pacemaker/cib; ls -la /var/lib/pacemaker/cib/cib.xml*
-rw-------. 1 hacluster haclient 968 Feb 2 13:23 /var/lib/pacemaker/cib/cib.xml
>>> Cib files were updated on all nodes.

>>> Start cluster and check stonith-watchdog-timeout property.
(pcs) [root@r88-1 pcs]# pcs/pcs cluster start --all --wait
<snip>
(pcs) [root@r88-1 pcs]# pcs/pcs cluster cib | grep stonith-watchdog-timeout
>>> Cluster property stonith-watchdog-timeout is not set, as expected.

>>> Set stonith-watchdog-timeout again and stop the cluster.
(pcs) [root@r88-1 pcs]# pcs/pcs property set stonith-watchdog-timeout=10
(pcs) [root@r88-1 pcs]# pcs/pcs cluster cib | grep stonith-watchdog-timeout
<nvpair id="cib-bootstrap-options-stonith-watchdog-tim" name="stonith-watchdog-timeout" value="10"/>
(pcs) [root@r88-1 pcs]# pcs/pcs cluster stop --all
<snip>

>>> Check cib file before and after disabling sbd.
(pcs) [root@r88-1 pcs]# grep stonith-watchdog-timeout -R /var/lib/pacemaker/cib; ls -la /var/lib/pacemaker/cib/cib.xml*
/var/lib/pacemaker/cib/cib.xml: <nvpair id="cib-bootstrap-options-stonith-watchdog-tim" name="stonith-watchdog-timeout" value="10"/>
-rw-------. 1 hacluster haclient 1065 Feb 2 13:46 /var/lib/pacemaker/cib/cib.xml
-rw-------. 1 hacluster haclient 32 Feb 2 13:46 /var/lib/pacemaker/cib/cib.xml.sig
(pcs) [root@r88-2 pcs]# grep stonith-watchdog-timeout -R /var/lib/pacemaker/cib; ls -la /var/lib/pacemaker/cib/cib.xml*
/var/lib/pacemaker/cib/cib.xml: <nvpair id="cib-bootstrap-options-stonith-watchdog-tim" name="stonith-watchdog-timeout" value="10"/>
-rw-------. 1 hacluster haclient 1065 Feb 2 13:46 /var/lib/pacemaker/cib/cib.xml
-rw-------. 1 hacluster haclient 32 Feb 2 13:46 /var/lib/pacemaker/cib/cib.xml.sig
(pcs) [root@r88-1 pcs]# pcs/pcs stonith sbd disable
<snip>
(pcs) [root@r88-1 pcs]# grep stonith-watchdog-timeout -R /var/lib/pacemaker/cib; ls -la /var/lib/pacemaker/cib/cib.xml*
/var/lib/pacemaker/cib/cib.xml: <nvpair id="cib-bootstrap-options-stonith-watchdog-tim" name="stonith-watchdog-timeout" value="0"/>
-rw-------. 1 hacluster haclient 1076 Feb 2 13:53 /var/lib/pacemaker/cib/cib.xml
(pcs) [root@r88-2 pcs]# grep stonith-watchdog-timeout -R /var/lib/pacemaker/cib; ls -la /var/lib/pacemaker/cib/cib.xml*
/var/lib/pacemaker/cib/cib.xml: <nvpair id="cib-bootstrap-options-stonith-watchdog-tim" name="stonith-watchdog-timeout" value="0"/>
-rw-------. 1 hacluster haclient 1076 Feb 2 13:53 /var/lib/pacemaker/cib/cib.xml
>>> Cib files were updated on all nodes.

>>> Start cluster and check stonith-watchdog-timeout property.
(pcs) [root@r88-1 pcs]# pcs/pcs cluster start --all --wait
<snip>
(pcs) [root@r88-1 pcs]# pcs/pcs cluster cib | grep stonith-watchdog-timeout
<nvpair id="cib-bootstrap-options-stonith-watchdog-tim" name="stonith-watchdog-timeout" value="0"/>
(pcs) [root@r88-1 pcs]# pcs/pcs status | grep -A1 "Node List"
Node List:
  * Online: [ r88-1.vm r88-2.vm ]
>>> Cluster property stonith-watchdog-timeout is set to 0 as expected.
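
In scenario 3, pacemaker-based rejected a cib.xml that was owned by root:root instead of hacluster:haclient. A quick ownership check along these lines could catch that before starting the cluster. It is a sketch under assumptions: `check_cib_owner` is a hypothetical helper name, the default user and group are taken from the journal messages above, and `stat -c` is the GNU coreutils form available on RHEL.

```shell
# Return 0 when FILE is owned by USER:GROUP, 1 on a mismatch,
# and 2 when the file cannot be inspected.
# Usage: check_cib_owner FILE [USER] [GROUP]
# Hypothetical helper for illustration; not a pcs or pacemaker command.
check_cib_owner() {
    file=$1
    expected="${2:-hacluster}:${3:-haclient}"
    # GNU stat: %U is the owning user name, %G the owning group name.
    actual=$(stat -c '%U:%G' "$file") || return 2
    if [ "$actual" = "$expected" ]; then
        echo "$file: ownership OK ($actual)"
    else
        echo "$file: owned by $actual, expected $expected"
        return 1
    fi
}
```

On the affected build, `check_cib_owner /var/lib/pacemaker/cib/cib.xml` after `pcs stonith sbd enable` would have reported the root:root ownership seen in the `ls -la` output of scenario 3. Note that it only compares ownership; it does not verify the file is writable.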

DevTestResults:

(venv) [root@r08-08-a pcs]# rpm -q pcs
pcs-0.10.15-4.el8.x86_64

(venv) [root@r08-08-a pcs]# pcs cluster setup coolcluster r08-08-a.vm r08-08-b.vm
No addresses specified for host 'r08-08-a.vm', using 'r08-08-a.vm'
No addresses specified for host 'r08-08-b.vm', using 'r08-08-b.vm'
Destroying cluster on hosts: 'r08-08-a.vm', 'r08-08-b.vm'...
r08-08-b.vm: Successfully destroyed cluster
r08-08-a.vm: Successfully destroyed cluster
Requesting remove 'pcsd settings' from 'r08-08-a.vm', 'r08-08-b.vm'
r08-08-a.vm: successful removal of the file 'pcsd settings'
r08-08-b.vm: successful removal of the file 'pcsd settings'
Sending 'corosync authkey', 'pacemaker authkey' to 'r08-08-a.vm', 'r08-08-b.vm'
r08-08-a.vm: successful distribution of the file 'corosync authkey'
r08-08-a.vm: successful distribution of the file 'pacemaker authkey'
r08-08-b.vm: successful distribution of the file 'corosync authkey'
r08-08-b.vm: successful distribution of the file 'pacemaker authkey'
Sending 'corosync.conf' to 'r08-08-a.vm', 'r08-08-b.vm'
r08-08-a.vm: successful distribution of the file 'corosync.conf'
r08-08-b.vm: successful distribution of the file 'corosync.conf'
Cluster has been successfully set up.
(venv) [root@r08-08-a pcs]# pcs stonith sbd enable
Running SBD pre-enabling checks...
r08-08-a.vm: SBD pre-enabling checks done
r08-08-b.vm: SBD pre-enabling checks done
Warning: auto_tie_breaker quorum option will be enabled to make SBD fencing effective. Cluster has to be offline to be able to make this change.
Checking corosync is not running on nodes...
r08-08-a.vm: corosync is not running
r08-08-b.vm: corosync is not running
Sending updated corosync.conf to nodes...
r08-08-a.vm: Succeeded
r08-08-b.vm: Succeeded
Distributing SBD config...
r08-08-a.vm: SBD config saved
r08-08-b.vm: SBD config saved
Enabling sbd...
r08-08-b.vm: sbd enabled
r08-08-a.vm: sbd enabled
Warning: Cluster restart is required in order to apply these changes.
(venv) [root@r08-08-a pcs]# pcs cluster start --all --wait
r08-08-a.vm: Starting Cluster...
r08-08-b.vm: Starting Cluster...
Waiting for node(s) to start...
r08-08-a.vm: Started
r08-08-b.vm: Started
(venv) [root@r08-08-a pcs]# pcs status
Cluster name: coolcluster
Status of pacemakerd: 'Pacemaker is running' (last updated 2023-02-13 17:16:55 +01:00)
Cluster Summary:
  * Stack: corosync
  * Current DC: r08-08-b.vm (version 2.1.5-4.el8-a3f44794f94) - partition with quorum
  * Last updated: Mon Feb 13 17:16:55 2023
  * Last change: Mon Feb 13 17:16:49 2023 by hacluster via crmd on r08-08-b.vm
  * 2 nodes configured
  * 0 resource instances configured

Node List:
  * Online: [ r08-08-a.vm r08-08-b.vm ]

Full List of Resources:
  * No resources

Daemon Status:
  corosync: active/disabled
  pacemaker: active/disabled
  pcsd: active/enabled
  sbd: active/enabled
(venv) [root@r08-08-a pcs]# pcs stonith sbd status
SBD STATUS
<node name>: <installed> | <enabled> | <running>
r08-08-a.vm: YES | YES | YES
r08-08-b.vm: YES | YES | YES

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory (pcs bug fix and enhancement update), and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2023:2738