Bug 2042433

Summary: Enabling sbd before starting the cluster sets an incorrect `validate-with` value in /var/lib/pacemaker/cib/cib.xml [rhel-8.5.0.z]
Product: Red Hat Enterprise Linux 8 Reporter: RHEL Program Management Team <pgm-rhel-tools>
Component: pcs Assignee: Tomas Jelinek <tojeline>
Status: CLOSED ERRATA QA Contact: cluster-qe <cluster-qe>
Severity: urgent Docs Contact:
Priority: urgent    
Version: 8.5 CC: cfeist, cluster-maint, idevat, kmalyjur, mlisik, mmazoure, mpospisi, nhostako, omular, sbradley, sfoucek, tojeline, troy.engel
Target Milestone: rc Keywords: EasyFix, Regression, Triaged, ZStream
Target Release: ---   
Hardware: Unspecified   
OS: Linux   
Whiteboard:
Fixed In Version: pcs-0.10.10-4.el8_5.1 Doc Type: Bug Fix
Doc Text:
Cause: A user sets up a cluster without starting it, then sets up SBD, and then starts the cluster.
Consequence: Pcs creates a default empty CIB in Pacemaker 1.x format. Various pcs commands or their options do not work until the CIB is manually upgraded to Pacemaker 2.x format.
Fix: Make pcs create an empty CIB in Pacemaker 2.x format.
Result: Pcs works with no need to upgrade the CIB manually.
Story Points: ---
Clone Of: 2022463 Environment:
Last Closed: 2022-03-15 09:23:20 UTC Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On: 2022463    
Bug Blocks:    

Comment 2 Miroslav Lisik 2022-01-27 10:27:19 UTC
DevTestResults:

[root@r85-node-01 ~]# rpm -q pcs
pcs-0.10.10-4.el8_5.1.x86_64


[root@r85-node-01 ~]# pcs host auth -u hacluster -p $PASS r85-node-0{1,2}
r85-node-01: Authorized
r85-node-02: Authorized
[root@r85-node-01 ~]# pcs cluster setup HACluster r85-node-0{1,2}
No addresses specified for host 'r85-node-01', using 'r85-node-01'
No addresses specified for host 'r85-node-02', using 'r85-node-02'
Destroying cluster on hosts: 'r85-node-01', 'r85-node-02'...
r85-node-01: Successfully destroyed cluster
r85-node-02: Successfully destroyed cluster
Requesting remove 'pcsd settings' from 'r85-node-01', 'r85-node-02'
r85-node-01: successful removal of the file 'pcsd settings'
r85-node-02: successful removal of the file 'pcsd settings'
Sending 'corosync authkey', 'pacemaker authkey' to 'r85-node-01', 'r85-node-02'
r85-node-01: successful distribution of the file 'corosync authkey'
r85-node-01: successful distribution of the file 'pacemaker authkey'
r85-node-02: successful distribution of the file 'corosync authkey'
r85-node-02: successful distribution of the file 'pacemaker authkey'
Sending 'corosync.conf' to 'r85-node-01', 'r85-node-02'
r85-node-01: successful distribution of the file 'corosync.conf'
r85-node-02: successful distribution of the file 'corosync.conf'
Cluster has been successfully set up.

[root@r85-node-01 ~]# ls -l /var/lib/pacemaker/cib/
total 0

[root@r85-node-01 ~]# pcs stonith sbd enable
Running SBD pre-enabling checks...
r85-node-01: SBD pre-enabling checks done
r85-node-02: SBD pre-enabling checks done
Warning: auto_tie_breaker quorum option will be enabled to make SBD fencing effective. Cluster has to be offline to be able to make this change.
Checking corosync is not running on nodes...
r85-node-01: corosync is not running
r85-node-02: corosync is not running
Sending updated corosync.conf to nodes...
r85-node-01: Succeeded
r85-node-02: Succeeded
Distributing SBD config...
r85-node-01: SBD config saved
r85-node-02: SBD config saved
Enabling sbd...
r85-node-02: sbd enabled
r85-node-01: sbd enabled
Warning: Cluster restart is required in order to apply these changes.

[root@r85-node-01 ~]# ls -1 /var/lib/pacemaker/cib/
cib-0.raw
cib.last
cib.xml
cib.xml.sig

[root@r85-node-01 ~]# grep validate-with /var/lib/pacemaker/cib/cib.xml
<cib admin_epoch="0" epoch="2" num_updates="0" validate-with="pacemaker-3.1" cib-last-written="Wed Jan 26 17:12:22 2022">

[root@r85-node-01 ~]# pcs cluster start --all --wait
r85-node-02: Starting Cluster...
r85-node-01: Starting Cluster...
Waiting for node(s) to start...
r85-node-01: Started
r85-node-02: Started

[root@r85-node-01 ~]# pcs cluster cib | grep validate-with
<cib admin_epoch="0" epoch="7" num_updates="4" validate-with="pacemaker-3.1" cib-last-written="Wed Jan 26 17:17:12 2022" update-origin="r85-node-02" update-client="crmd" update-user="hacluster" crm_feature_set="3.11.0" have-quorum="1" dc-uuid="2">
[root@r85-node-01 ~]# pcs alert create path=/var/lib/pacemaker/alert_file.sh id=test_alert
[root@r85-node-01 ~]# echo $?
0

Comment 6 Simon Foucek 2022-01-28 12:11:27 UTC
Before fix:

>[root@virt-029 ~]# rpm -q pcs
pcs-0.10.10-4.el8.x86_64
>[root@virt-029 ~]#  pcs host auth -u hacluster -p password virt-029 virt-030
virt-030: Authorized
virt-029: Authorized
>[root@virt-029 ~]# pcs cluster setup HACluster virt-029 virt-030
No addresses specified for host 'virt-029', using 'virt-029'
No addresses specified for host 'virt-030', using 'virt-030'
Destroying cluster on hosts: 'virt-029', 'virt-030'...
virt-029: Successfully destroyed cluster
virt-030: Successfully destroyed cluster
Requesting remove 'pcsd settings' from 'virt-029', 'virt-030'
virt-029: successful removal of the file 'pcsd settings'
virt-030: successful removal of the file 'pcsd settings'
Sending 'corosync authkey', 'pacemaker authkey' to 'virt-029', 'virt-030'
virt-029: successful distribution of the file 'corosync authkey'
virt-029: successful distribution of the file 'pacemaker authkey'
virt-030: successful distribution of the file 'corosync authkey'
virt-030: successful distribution of the file 'pacemaker authkey'
Sending 'corosync.conf' to 'virt-029', 'virt-030'
virt-029: successful distribution of the file 'corosync.conf'
virt-030: successful distribution of the file 'corosync.conf'
Cluster has been successfully set up.
>[root@virt-029 ~]# ls -l /var/lib/pacemaker/cib/
total 0
>[root@virt-029 ~]# pcs stonith sbd enable
Running SBD pre-enabling checks...
virt-030: SBD pre-enabling checks done
virt-029: SBD pre-enabling checks done
Warning: auto_tie_breaker quorum option will be enabled to make SBD fencing effective. Cluster has to be offline to be able to make this change.
Checking corosync is not running on nodes...
virt-030: corosync is not running
virt-029: corosync is not running
Sending updated corosync.conf to nodes...
virt-029: Succeeded
virt-030: Succeeded
Distributing SBD config...
virt-030: SBD config saved
virt-029: SBD config saved
Enabling sbd...
virt-030: sbd enabled
virt-029: sbd enabled
Warning: Cluster restart is required in order to apply these changes.
>[root@virt-029 ~]# ls /var/lib/pacemaker/cib/
cib-0.raw  cib.last  cib.xml  cib.xml.sig
>[root@virt-029 ~]#  grep validate-with /var/lib/pacemaker/cib/cib.xml
<cib admin_epoch="0" epoch="2" num_updates="0" validate-with="pacemaker-1.2" cib-last-written="Fri Jan 28 12:55:29 2022">
>[root@virt-029 ~]# pcs cluster start --all --wait
virt-029: Starting Cluster...
virt-030: Starting Cluster...
Waiting for node(s) to start...
virt-029: Started
virt-030: Started
>[root@virt-029 ~]#  pcs cluster cib | grep validate-with
<cib admin_epoch="0" epoch="7" num_updates="4" validate-with="pacemaker-1.2" cib-last-written="Fri Jan 28 12:56:20 2022" update-origin="virt-030" update-client="crmd" update-user="hacluster" crm_feature_set="3.11.0" have-quorum="1" dc-uuid="2">
>[root@virt-029 ~]# pcs alert create path=/var/lib/pacemaker/alert_file.sh id=test_alert
Error: Unable to update cib
Call cib_apply_diff failed (-203): Update does not conform to the configured schema

<cib admin_epoch="0" epoch="8" num_updates="0" validate-with="pacemaker-1.2" cib-last-written="Fri Jan 28 12:56:42 2022" update-origin="virt-029" update-client="cibadmin" update-user="root" crm_feature_set="3.11.0" have-quorum="1" dc-uuid="2">
  <configuration>
    <crm_config>
      <cluster_property_set id="cib-bootstrap-options">
        <nvpair id="cib-bootstrap-options-have-watchdog" name="have-watchdog" value="true"/>
        <nvpair id="cib-bootstrap-options-dc-version" name="dc-version" value="2.1.0-8.el8-7c3f660707"/>
        <nvpair id="cib-bootstrap-options-cluster-infrastructure" name="cluster-infrastructure" value="corosync"/>
        <nvpair id="cib-bootstrap-options-cluster-name" name="cluster-name" value="HACluster"/>
      </cluster_property_set>
    </crm_config>
    <nodes>
      <node id="1" uname="virt-029"/>
      <node id="2" uname="virt-030"/>
    </nodes>
    <resources/>
    <constraints/>
    <alerts>
      <alert id="test_alert" path="/var/lib/pacemaker/alert_file.sh"/>
    </alerts>
  </configuration>
  <status>
    <node_state id="2" uname="virt-030" in_ccm="true" crmd="online" crm-debug-origin="do_state_transition" join="member" expected="member">
      <lrm id="2">
        <lrm_resources/>
      </lrm>
    </node_state>
    <node_state id="1" uname="virt-029" in_ccm="true" crmd="online" crm-debug-origin="do_state_transition" join="member" expected="member">
      <lrm id="1">
        <lrm_resources/>
      </lrm>
    </node_state>
  </status>
</cib>

>[root@virt-029 ~]# echo $?
1
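
The Doc Text notes that affected clusters keep failing like this until the CIB is manually upgraded. A minimal sketch of that workaround follows, assuming the cluster is running and that the installed pcs provides `pcs cluster cib-upgrade`. The function name `upgrade_cib_if_legacy` is hypothetical and is not part of pcs.

```shell
# Sketch only: upgrade the live CIB when it still validates against a
# 1.x schema. Assumes a running cluster and pcs in PATH.
# upgrade_cib_if_legacy is a hypothetical helper name, not a pcs command.
upgrade_cib_if_legacy() {
    if pcs cluster cib | grep -q 'validate-with="pacemaker-1\.'; then
        # Ask pcs to upgrade the CIB to the latest schema it supports.
        pcs cluster cib-upgrade
    else
        echo "CIB schema is not in the 1.x series; nothing to do"
    fi
}
```

Calling `upgrade_cib_if_legacy` on one node should be enough, since the CIB is synchronized across the cluster. Re-running `pcs cluster cib | grep validate-with` afterwards shows whether the schema version changed.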

After fix:
>[root@virt-029 ~]# rpm -q pcs
pcs-0.10.10-4.el8_5.1.x86_64
>[root@virt-029 ~]# pcs host auth -u hacluster -p password virt-029 virt-030
virt-029: Authorized
virt-030: Authorized
>[root@virt-029 ~]# pcs cluster setup HACluster virt-029 virt-030
No addresses specified for host 'virt-029', using 'virt-029'
No addresses specified for host 'virt-030', using 'virt-030'
Destroying cluster on hosts: 'virt-029', 'virt-030'...
virt-029: Successfully destroyed cluster
virt-030: Successfully destroyed cluster
Requesting remove 'pcsd settings' from 'virt-029', 'virt-030'
virt-029: successful removal of the file 'pcsd settings'
virt-030: successful removal of the file 'pcsd settings'
Sending 'corosync authkey', 'pacemaker authkey' to 'virt-029', 'virt-030'
virt-029: successful distribution of the file 'corosync authkey'
virt-029: successful distribution of the file 'pacemaker authkey'
virt-030: successful distribution of the file 'corosync authkey'
virt-030: successful distribution of the file 'pacemaker authkey'
Sending 'corosync.conf' to 'virt-029', 'virt-030'
virt-029: successful distribution of the file 'corosync.conf'
virt-030: successful distribution of the file 'corosync.conf'
Cluster has been successfully set up.
>[root@virt-029 ~]# ls -l /var/lib/pacemaker/cib/
total 0
>[root@virt-029 ~]# pcs stonith sbd enable
Running SBD pre-enabling checks...
virt-030: SBD pre-enabling checks done
virt-029: SBD pre-enabling checks done
Warning: auto_tie_breaker quorum option will be enabled to make SBD fencing effective. Cluster has to be offline to be able to make this change.
Checking corosync is not running on nodes...
virt-030: corosync is not running
virt-029: corosync is not running
Sending updated corosync.conf to nodes...
virt-029: Succeeded
virt-030: Succeeded
Distributing SBD config...
virt-029: SBD config saved
virt-030: SBD config saved
Enabling sbd...
virt-030: sbd enabled
virt-029: sbd enabled
Warning: Cluster restart is required in order to apply these changes.
>[root@virt-029 ~]# ls -1 /var/lib/pacemaker/cib/
cib-0.raw
cib.last
cib.xml
cib.xml.sig
>[root@virt-029 ~]# grep validate-with /var/lib/pacemaker/cib/cib.xml
<cib admin_epoch="0" epoch="2" num_updates="0" validate-with="pacemaker-3.1" cib-last-written="Fri Jan 28 13:02:20 2022">
>[root@virt-029 ~]#  pcs cluster start --all --wait
virt-030: Starting Cluster...
virt-029: Starting Cluster...
Waiting for node(s) to start...
virt-030: Started
virt-029: Started
>[root@virt-029 ~]#  pcs cluster cib | grep validate-with
<cib admin_epoch="0" epoch="7" num_updates="4" validate-with="pacemaker-3.1" cib-last-written="Fri Jan 28 13:02:58 2022" update-origin="virt-030" update-client="crmd" update-user="hacluster" crm_feature_set="3.11.0" have-quorum="1" dc-uuid="2">
>[root@virt-029 ~]# pcs alert create path=/var/lib/pacemaker/alert_file.sh id=test_alert
>[root@virt-029 ~]# echo $?
0
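
The `grep validate-with` checks in the two transcripts above can be wrapped in a small helper for repeated verification. This is a sketch, not part of pcs or Pacemaker: the function names `cib_schema_major` and `cib_schema_is_legacy` are hypothetical, and the helper only understands values of the form `pacemaker-N.M` as seen in this report.

```shell
# Hypothetical helper: print the major number of the validate-with schema
# in a CIB XML file, e.g. 1 for validate-with="pacemaker-1.2".
# Prints nothing if the attribute is missing or has another form.
cib_schema_major() {
    sed -n 's/.*validate-with="pacemaker-\([0-9][0-9]*\)\.[0-9][0-9]*".*/\1/p' "$1" | head -n 1
}

# Hypothetical helper: succeed only when the file uses a 1.x schema,
# which is the state this bug produced before the fix.
cib_schema_is_legacy() {
    major=$(cib_schema_major "$1")
    [ "$major" = "1" ]
}
```

For example, `cib_schema_is_legacy /var/lib/pacemaker/cib/cib.xml && echo "affected"` run after `pcs stonith sbd enable` and before the first cluster start would flag a node that hit this bug.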

Result:
The "validate-with" value is set correctly when the cluster is started after enabling sbd, and alerts can be created.

Comment 10 errata-xmlrpc 2022-03-15 09:23:20 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (pcs bug fix and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2022:0881