Bug 2095662

Summary: A duplicate ACL user causes a DC election loop (RHEL 8)
Product: Red Hat Enterprise Linux 8
Reporter: Reid Wahl <nwahl>
Component: pacemaker
Assignee: Ken Gaillot <kgaillot>
Status: CLOSED ERRATA
QA Contact: cluster-qe <cluster-qe>
Severity: high
Priority: high
Version: 8.5
CC: cluster-maint, cluster-qe, kgaillot, msmazova, sbradley
Target Milestone: rc
Keywords: Triaged
Target Release: 8.8
Hardware: All
OS: Linux
Fixed In Version: pacemaker-2.1.5-1.el8
Doc Type: Bug Fix
Doc Text:
Cause: Pacemaker interprets two acl_target entries with the same id as a single entry that moved, and for a full CIB replace it would start a new DC election.
Consequence: The cluster would get into an infinite DC election loop.
Fix: Moved entries in the CIB ACL section no longer start a new DC election.
Result: Duplicate acl_target ids do not cause an election loop.
Clone Of: 2095597
Last Closed: 2023-05-16 08:35:22 UTC
Type: Bug
Target Upstream Version: 2.1.5
Bug Depends On: 2095597

Description Reid Wahl 2022-06-10 07:18:05 UTC
+++ This bug was initially created as a clone of Bug #2095597 +++

Note: The issue (reported on RHEL 7) is reproducible on RHEL 8.

Description of problem:

If a duplicate ACL user (acl_target) exists in the CIB, DC elections will loop quickly and infinitely. There is no good reason to have a duplicate ACL user, but pacemaker should not loop in this way if a user makes this mistake.

    <acls>
      <acl_role id="read-access">
        <acl_permission id="read-access-read" kind="read" xpath="/"/>
      </acl_role>
      <acl_target id="testuser">
        <role id="read-access"/>
      </acl_target>
      <acl_target id="testuser">
        <role id="read-access"/>
      </acl_target>
    </acls>
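
A quick way to check an existing cluster for this condition (a sketch using cibadmin plus standard shell tools; any id reported by the last command appears more than once):

    # list acl_target ids that occur more than once in the live CIB
    cibadmin --query --scope acls | grep -o 'acl_target id="[^"]*"' | sort | uniq -c | awk '$1 > 1'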

[root@fastvm-rhel-7-6-21 pcs]# tail -f /var/log/messages
Jun  9 17:31:25 fastvm-rhel-7-6-21 crmd[17198]:  notice: State transition S_ELECTION -> S_INTEGRATION
Jun  9 17:31:25 fastvm-rhel-7-6-21 stonith-ng[17194]:  notice: Versions did not change in patch 0.40.3
Jun  9 17:31:25 fastvm-rhel-7-6-21 attrd[17196]:  notice: Updating all attributes after cib_refresh_notify event
Jun  9 17:31:25 fastvm-rhel-7-6-21 crmd[17198]:  notice: State transition S_ELECTION -> S_INTEGRATION
Jun  9 17:31:25 fastvm-rhel-7-6-21 stonith-ng[17194]:  notice: Versions did not change in patch 0.40.7
Jun  9 17:31:25 fastvm-rhel-7-6-21 attrd[17196]:  notice: Updating all attributes after cib_refresh_notify event

Detailed logs are in "Additional info."

Also note: pcs does not allow creating this duplicate in the first place.

[root@fastvm-rhel-7-6-21 pcs]# pcs acl user create testuser read-access
Error: 'testuser' already exists

But if the duplicate already exists, pcs seems unable to remove it. It has to be removed manually and then pushed. IMO this isn't worth fixing in pcs, but others' opinions may differ.
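
A minimal sketch of that manual cleanup, using the same dump-edit-push commands as in "Steps to Reproduce" below (the file name /tmp/cib.xml is just an example):

    pcs cluster cib > /tmp/cib.xml
    # delete one of the duplicate <acl_target id="testuser"> entries by hand
    vi /tmp/cib.xml
    pcs cluster cib-push --config /tmp/cib.xml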

-----

Version-Release number of selected component (if applicable):

pacemaker-2.1.2-4.el8

-----

How reproducible:

Always

-----

Steps to Reproduce:
1. Enable ACLs (`pcs acl enable`).
2. Save the CIB to a file (`pcs cluster cib > /tmp/cib.xml`), and add the following to /tmp/cib.xml, just before the closing </configuration> tag.

    <acls>
      <acl_role id="read-access">
        <acl_permission id="read-access-read" kind="read" xpath="/"/>
      </acl_role>
      <acl_target id="testuser">
        <role id="read-access"/>
      </acl_target>
      <acl_target id="testuser">
        <role id="read-access"/>
      </acl_target>
    </acls>

3. Push the updated CIB (`pcs cluster cib-push --config /tmp/cib.xml`).
4. It may be necessary to perform a full cluster stop-and-start in order to trigger the issue.
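
For a non-interactive reproduction, the edit in step 2 can be scripted. This is only a sketch; it assumes the <acls> block above has been saved to a hypothetical helper file /tmp/acls-snippet.xml, and it writes the result to /tmp/cib-dup.xml:

    pcs acl enable
    pcs cluster cib > /tmp/cib.xml
    # insert the snippet immediately before the closing </configuration> tag
    awk '/<\/configuration>/ { while ((getline line < "/tmp/acls-snippet.xml") > 0) print line } { print }' \
        /tmp/cib.xml > /tmp/cib-dup.xml
    pcs cluster cib-push --config /tmp/cib-dup.xml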

-----

Actual results:

An infinite and rapid DC election loop as shown in the description.

-----

Expected results:

No DC election loop; perhaps a warning or a failure due to the duplicate ACL user, or perhaps the duplicate is simply ignored.

-----

Additional info:

Jun 09 17:31:25 [17198] fastvm-rhel-7-6-21       crmd: (     utils.c:975   )    info: update_dc:        Set DC to node1 (3.0.14)
Jun 09 17:31:25 [17193] fastvm-rhel-7-6-21        cib: ( callbacks.c:1113  )    info: cib_process_request:      Completed cib_modify operation for section crm_config: OK (rc=0, origin=node1/crmd/574, version=0.40.125)
Jun 09 17:31:25 [17198] fastvm-rhel-7-6-21       crmd: (       fsa.c:548   )    info: do_state_transition:      State transition S_INTEGRATION -> S_FINALIZE_JOIN | input=I_INTEGRATED cause=C_FSA_INTERNAL origin=check_join_state
Jun 09 17:31:25 [17193] fastvm-rhel-7-6-21        cib: (   cib_ops.c:230   )    info: cib_process_replace:      Digest matched on replace from node1: a0c66419385d0b14385cfa03f3e8d523
Jun 09 17:31:25 [17193] fastvm-rhel-7-6-21        cib: (   cib_ops.c:266   )    info: cib_process_replace:      Replaced 0.40.125 with 0.40.125 from node1
Jun 09 17:31:25 [17193] fastvm-rhel-7-6-21        cib: (       xml.c:3531  )    info: __xml_diff_object:        acl_target.testuser moved from 1 to 2
Jun 09 17:31:25 [17193] fastvm-rhel-7-6-21        cib: (       xml.c:891   )    info: cib_perform_op:   cib_perform_op: Local-only Change: 0.40.125
Jun 09 17:31:25 [17193] fastvm-rhel-7-6-21        cib: (       xml.c:920   )    info: cib_perform_op:   +~ /cib/configuration/acls/acl_target[@id='testuser'] moved to offset 2
Jun 09 17:31:25 [17198] fastvm-rhel-7-6-21       crmd: (       cib.c:316   )    info: controld_delete_node_state:       Deleting resource history for node node1 (via CIB call 580) | xpath=//node_state[@uname='node1']/lrm
Jun 09 17:31:25 [17198] fastvm-rhel-7-6-21       crmd: (       cib.c:316   )    info: controld_delete_node_state:       Deleting resource history for node node2 (via CIB call 582) | xpath=//node_state[@uname='node2']/lrm
Jun 09 17:31:25 [17193] fastvm-rhel-7-6-21        cib: (    notify.c:366   )    info: cib_replace_notify:       Local-only Replace: 0.40.125 from node1
Jun 09 17:31:25 [17193] fastvm-rhel-7-6-21        cib: ( callbacks.c:1113  )    info: cib_process_request:      Completed cib_replace operation for section 'all': OK (rc=0, origin=node1/crmd/577, version=0.40.125)
Jun 09 17:31:25 [17196] fastvm-rhel-7-6-21      attrd: (      main.c:97    )  notice: attrd_cib_replaced_cb:    Updating all attributes after cib_refresh_notify event
Jun 09 17:31:25 [17196] fastvm-rhel-7-6-21      attrd: (  commands.c:1079  )   debug: write_attributes: Writing out all attributes
Jun 09 17:31:25 [17196] fastvm-rhel-7-6-21      attrd: (  commands.c:1290  )    info: write_attribute:  Processed 2 private changes for #attrd-protocol, id=n/a, set=n/a
Jun 09 17:31:25 [17194] fastvm-rhel-7-6-21 stonith-ng: (       xml.c:1320  )  notice: xml_patch_version_check:  Versions did not change in patch 0.40.125
Jun 09 17:31:25 [17198] fastvm-rhel-7-6-21       crmd: (       fsa.c:548   )    info: do_state_transition:      State transition S_FINALIZE_JOIN -> S_ELECTION | input=I_ELECTION cause=C_FSA_INTERNAL origin=do_cib_replaced
Jun 09 17:31:25 [17198] fastvm-rhel-7-6-21       crmd: (     utils.c:979   )    info: update_dc:        Unset DC. Was node1
Jun 09 17:31:25 [17193] fastvm-rhel-7-6-21        cib: ( callbacks.c:1049  )    info: cib_process_request:      Forwarding cib_modify operation for section nodes to all (origin=local/crmd/578)
Jun 09 17:31:25 [17193] fastvm-rhel-7-6-21        cib: ( callbacks.c:1049  )    info: cib_process_request:      Forwarding cib_modify operation for section nodes to all (origin=local/crmd/579)
Jun 09 17:31:25 [17193] fastvm-rhel-7-6-21        cib: ( callbacks.c:1049  )    info: cib_process_request:      Forwarding cib_delete operation for section //node_state[@uname='node1']/lrm to all (origin=local/crmd/580)
Jun 09 17:31:25 [17193] fastvm-rhel-7-6-21        cib: ( callbacks.c:1049  )    info: cib_process_request:      Forwarding cib_modify operation for section status to all (origin=local/crmd/581)
Jun 09 17:31:25 [17193] fastvm-rhel-7-6-21        cib: ( callbacks.c:1049  )    info: cib_process_request:      Forwarding cib_delete operation for section //node_state[@uname='node2']/lrm to all (origin=local/crmd/582)
Jun 09 17:31:25 [17193] fastvm-rhel-7-6-21        cib: (  cib_file.c:293   )    info: cib_file_backup:  Archived previous version as /var/lib/pacemaker/cib/cib-82.raw
Jun 09 17:31:25 [17198] fastvm-rhel-7-6-21       crmd: (  election.c:381   )    info: election_check:   election-DC won by local node
Jun 09 17:31:25 [17198] fastvm-rhel-7-6-21       crmd: (      misc.c:46    )    info: do_log:   Input I_ELECTION_DC received in state S_ELECTION from election_win_cb
Jun 09 17:31:25 [17193] fastvm-rhel-7-6-21        cib: ( callbacks.c:1113  )    info: cib_process_request:      Completed cib_modify operation for section nodes: OK (rc=0, origin=node1/crmd/578, version=0.40.125)
Jun 09 17:31:25 [17198] fastvm-rhel-7-6-21       crmd: (       fsa.c:548   )  notice: do_state_transition:      State transition S_ELECTION -> S_INTEGRATION | input=I_ELECTION_DC cause=C_FSA_INTERNAL origin=election_win_cb
Jun 09 17:31:25 [17193] fastvm-rhel-7-6-21        cib: ( callbacks.c:1113  )    info: cib_process_request:      Completed cib_modify operation for section nodes: OK (rc=0, origin=node1/crmd/579, version=0.40.125)
Jun 09 17:31:25 [17193] fastvm-rhel-7-6-21        cib: (       xml.c:884   )    info: cib_perform_op:   Diff: --- 0.40.125 2
Jun 09 17:31:25 [17193] fastvm-rhel-7-6-21        cib: (       xml.c:886   )    info: cib_perform_op:   Diff: +++ 0.40.126 (null)
Jun 09 17:31:25 [17193] fastvm-rhel-7-6-21        cib: (       xml.c:966   )    info: cib_perform_op:   -- /cib/status/node_state[@id='1']/lrm[@id='1']
Jun 09 17:31:25 [17193] fastvm-rhel-7-6-21        cib: (       xml.c:952   )    info: cib_perform_op:   +  /cib:  @num_updates=126
Jun 09 17:31:25 [17193] fastvm-rhel-7-6-21        cib: ( callbacks.c:1113  )    info: cib_process_request:      Completed cib_delete operation for section //node_state[@uname='node1']/lrm: OK (rc=0, origin=node1/crmd/580, version=0.40.126)
Jun 09 17:31:25 [17193] fastvm-rhel-7-6-21        cib: (       xml.c:884   )    info: cib_perform_op:   Diff: --- 0.40.126 2
Jun 09 17:31:25 [17193] fastvm-rhel-7-6-21        cib: (       xml.c:886   )    info: cib_perform_op:   Diff: +++ 0.40.127 (null)
Jun 09 17:31:25 [17193] fastvm-rhel-7-6-21        cib: (       xml.c:952   )    info: cib_perform_op:   +  /cib:  @num_updates=127
Jun 09 17:31:25 [17193] fastvm-rhel-7-6-21        cib: (       xml.c:952   )    info: cib_perform_op:   +  /cib/status/node_state[@id='1']:  @crm-debug-origin=do_lrm_query_internal
Jun 09 17:31:25 [17193] fastvm-rhel-7-6-21        cib: (       xml.c:908   )    info: cib_perform_op:   ++ /cib/status/node_state[@id='1']:  <lrm id="1"/>
Jun 09 17:31:25 [17193] fastvm-rhel-7-6-21        cib: (       xml.c:915   )    info: cib_perform_op:   ++                                     <lrm_resources/>
Jun 09 17:31:25 [17193] fastvm-rhel-7-6-21        cib: (       xml.c:915   )    info: cib_perform_op:   ++                                   </lrm>
Jun 09 17:31:25 [17193] fastvm-rhel-7-6-21        cib: ( callbacks.c:1113  )    info: cib_process_request:      Completed cib_modify operation for section status: OK (rc=0, origin=node1/crmd/581, version=0.40.127)
Jun 09 17:31:25 [17193] fastvm-rhel-7-6-21        cib: (       xml.c:884   )    info: cib_perform_op:   Diff: --- 0.40.127 2
Jun 09 17:31:25 [17193] fastvm-rhel-7-6-21        cib: (       xml.c:886   )    info: cib_perform_op:   Diff: +++ 0.40.128 (null)
Jun 09 17:31:25 [17193] fastvm-rhel-7-6-21        cib: (       xml.c:966   )    info: cib_perform_op:   -- /cib/status/node_state[@id='2']/lrm[@id='2']
Jun 09 17:31:25 [17193] fastvm-rhel-7-6-21        cib: (       xml.c:952   )    info: cib_perform_op:   +  /cib:  @num_updates=128
Jun 09 17:31:25 [17193] fastvm-rhel-7-6-21        cib: ( callbacks.c:1113  )    info: cib_process_request:      Completed cib_delete operation for section //node_state[@uname='node2']/lrm: OK (rc=0, origin=node1/crmd/582, version=0.40.128)
Jun 09 17:31:25 [17193] fastvm-rhel-7-6-21        cib: ( callbacks.c:1049  )    info: cib_process_request:      Forwarding cib_modify operation for section status to all (origin=local/crmd/583)
Jun 09 17:31:25 [17193] fastvm-rhel-7-6-21        cib: ( callbacks.c:1049  )    info: cib_process_request:      Forwarding cib_modify operation for section nodes to all (origin=local/crmd/586)
Jun 09 17:31:25 [17193] fastvm-rhel-7-6-21        cib: ( callbacks.c:1049  )    info: cib_process_request:      Forwarding cib_modify operation for section status to all (origin=local/crmd/587)
Jun 09 17:31:25 [17193] fastvm-rhel-7-6-21        cib: (       xml.c:884   )    info: cib_perform_op:   Diff: --- 0.40.128 2
Jun 09 17:31:25 [17193] fastvm-rhel-7-6-21        cib: (       xml.c:886   )    info: cib_perform_op:   Diff: +++ 0.40.129 (null)
Jun 09 17:31:25 [17193] fastvm-rhel-7-6-21        cib: (       xml.c:952   )    info: cib_perform_op:   +  /cib:  @num_updates=129
Jun 09 17:31:25 [17193] fastvm-rhel-7-6-21        cib: (       xml.c:952   )    info: cib_perform_op:   +  /cib/status/node_state[@id='2']:  @crm-debug-origin=do_lrm_query_internal
Jun 09 17:31:25 [17193] fastvm-rhel-7-6-21        cib: (       xml.c:908   )    info: cib_perform_op:   ++ /cib/status/node_state[@id='2']:  <lrm id="2"/>
Jun 09 17:31:25 [17193] fastvm-rhel-7-6-21        cib: (       xml.c:915   )    info: cib_perform_op:   ++                                     <lrm_resources/>
Jun 09 17:31:25 [17193] fastvm-rhel-7-6-21        cib: (       xml.c:915   )    info: cib_perform_op:   ++                                   </lrm>
Jun 09 17:31:25 [17193] fastvm-rhel-7-6-21        cib: ( callbacks.c:1113  )    info: cib_process_request:      Completed cib_modify operation for section status: OK (rc=0, origin=node1/crmd/583, version=0.40.129)
Jun 09 17:31:25 [17193] fastvm-rhel-7-6-21        cib: ( callbacks.c:1113  )    info: cib_process_request:      Completed cib_modify operation for section nodes: OK (rc=0, origin=node1/crmd/586, version=0.40.129)
Jun 09 17:31:25 [17193] fastvm-rhel-7-6-21        cib: (       xml.c:884   )    info: cib_perform_op:   Diff: --- 0.40.129 2
Jun 09 17:31:25 [17193] fastvm-rhel-7-6-21        cib: (       xml.c:886   )    info: cib_perform_op:   Diff: +++ 0.40.130 (null)
Jun 09 17:31:25 [17193] fastvm-rhel-7-6-21        cib: (       xml.c:952   )    info: cib_perform_op:   +  /cib:  @num_updates=130
Jun 09 17:31:25 [17193] fastvm-rhel-7-6-21        cib: (       xml.c:952   )    info: cib_perform_op:   +  /cib/status/node_state[@id='1']:  @crm-debug-origin=do_cib_replaced
Jun 09 17:31:25 [17193] fastvm-rhel-7-6-21        cib: (       xml.c:952   )    info: cib_perform_op:   +  /cib/status/node_state[@id='2']:  @crm-debug-origin=do_cib_replaced
Jun 09 17:31:25 [17193] fastvm-rhel-7-6-21        cib: ( callbacks.c:1113  )    info: cib_process_request:      Completed cib_modify operation for section status: OK (rc=0, origin=node1/crmd/587, version=0.40.130)
Jun 09 17:31:25 [17198] fastvm-rhel-7-6-21       crmd: (  election.c:225   )    info: do_dc_takeover:   Taking over DC status for this partition
Jun 09 17:31:25 [17193] fastvm-rhel-7-6-21        cib: ( callbacks.c:1113  )    info: cib_process_request:      Completed cib_master operation for section 'all': OK (rc=0, origin=local/crmd/588, version=0.40.130)
Jun 09 17:31:25 [17193] fastvm-rhel-7-6-21        cib: ( callbacks.c:1049  )    info: cib_process_request:      Forwarding cib_modify operation for section cib to all (origin=local/crmd/589)
Jun 09 17:31:25 [17194] fastvm-rhel-7-6-21 stonith-ng: (       xml.c:1325  )   debug: xml_patch_version_check:  Can apply patch 0.40.126 to 0.40.125
Jun 09 17:31:25 [17194] fastvm-rhel-7-6-21 stonith-ng: (       xml.c:1325  )   debug: xml_patch_version_check:  Can apply patch 0.40.127 to 0.40.126
Jun 09 17:31:25 [17193] fastvm-rhel-7-6-21        cib: ( callbacks.c:1113  )    info: cib_process_request:      Completed cib_modify operation for section cib: OK (rc=0, origin=node1/crmd/589, version=0.40.130)
Jun 09 17:31:25 [17194] fastvm-rhel-7-6-21 stonith-ng: (       xml.c:1325  )   debug: xml_patch_version_check:  Can apply patch 0.40.128 to 0.40.127
Jun 09 17:31:25 [17194] fastvm-rhel-7-6-21 stonith-ng: (       xml.c:1325  )   debug: xml_patch_version_check:  Can apply patch 0.40.129 to 0.40.128
Jun 09 17:31:25 [17194] fastvm-rhel-7-6-21 stonith-ng: (       xml.c:1325  )   debug: xml_patch_version_check:  Can apply patch 0.40.130 to 0.40.129
Jun 09 17:31:25 [17193] fastvm-rhel-7-6-21        cib: (  cib_file.c:423   )    info: cib_file_write_with_digest:       Wrote version 0.40.0 of the CIB to disk (digest: 6ebef8311746943742a847b50bbb11ef)
Jun 09 17:31:25 [17193] fastvm-rhel-7-6-21        cib: ( callbacks.c:1049  )    info: cib_process_request:      Forwarding cib_modify operation for section crm_config to all (origin=local/crmd/591)
Jun 09 17:31:25 [17193] fastvm-rhel-7-6-21        cib: ( callbacks.c:1113  )    info: cib_process_request:      Completed cib_modify operation for section crm_config: OK (rc=0, origin=node1/crmd/591, version=0.40.130)
Jun 09 17:31:25 [17193] fastvm-rhel-7-6-21        cib: ( callbacks.c:1049  )    info: cib_process_request:      Forwarding cib_modify operation for section crm_config to all (origin=local/crmd/593)
Jun 09 17:31:25 [17193] fastvm-rhel-7-6-21        cib: ( callbacks.c:1113  )    info: cib_process_request:      Completed cib_modify operation for section crm_config: OK (rc=0, origin=node1/crmd/593, version=0.40.130)
Jun 09 17:31:25 [17198] fastvm-rhel-7-6-21       crmd: (   join_dc.c:179   )    info: join_make_offer:  Sending join-29 offer to node1
Jun 09 17:31:25 [17198] fastvm-rhel-7-6-21       crmd: (   join_dc.c:179   )    info: join_make_offer:  Sending join-29 offer to node2
Jun 09 17:31:25 [17198] fastvm-rhel-7-6-21       crmd: (   join_dc.c:212   )    info: do_dc_join_offer_all:     Waiting on join-29 requests from 2 outstanding nodes
Jun 09 17:31:25 [17193] fastvm-rhel-7-6-21        cib: ( callbacks.c:1049  )    info: cib_process_request:      Forwarding cib_modify operation for section crm_config to all (origin=local/crmd/595)
Jun 09 17:31:25 [17198] fastvm-rhel-7-6-21       crmd: (     utils.c:975   )    info: update_dc:        Set DC to node1 (3.0.14)

Comment 2 Ken Gaillot 2022-10-06 20:42:37 UTC
This was fixed by various commits upstream, all of which will be in the upstream 2.1.5 release.

Comment 7 Markéta Smazová 2022-12-21 12:11:34 UTC
Tested using the reproducer in the Description (Comment 0).

before fix:
-----------

>    [root@virt-511 ~]# rpm -q pacemaker
>    pacemaker-2.1.0-8.el8.x86_64

Setup cluster:

>    [root@virt-511 ~]# pcs status
>    Cluster name: STSRHTS32383
>    Cluster Summary:
>      * Stack: corosync
>      * Current DC: virt-511 (version 2.1.0-8.el8-7c3f660707) - partition with quorum
>      * Last updated: Wed Dec 21 11:40:35 2022
>      * Last change:  Tue Dec 20 17:02:06 2022 by root via cibadmin on virt-511
>      * 2 nodes configured
>      * 2 resource instances configured

>    Node List:
>      * Online: [ virt-511 virt-515 ]

>    Full List of Resources:
>      * fence-virt-511	(stonith:fence_xvm):	 Started virt-511
>      * fence-virt-515	(stonith:fence_xvm):	 Started virt-515

>    Daemon Status:
>      corosync: active/disabled
>      pacemaker: active/disabled
>      pcsd: active/enabled

Enable ACLs:

>    [root@virt-511 ~]# pcs acl enable
>    [root@virt-511 ~]# pcs acl
>    ACLs are enabled

Save a copy of CIB and add ACLs:

>    [root@virt-511 ~]# pcs cluster cib > /tmp/cib.xml
>    [root@virt-511 ~]# vim /tmp/cib.xml

Push the updated CIB:

>    [root@virt-511 ~]# date && pcs cluster cib-push --config /tmp/cib.xml
>    Wed 21 Dec 11:42:24 CET 2022
>    CIB updated

Check the ACLs:

>    [root@virt-511 ~]# cibadmin --query --scope acls
>    <acls>
>      <acl_role id="read-access">
>        <acl_permission id="read-access-read" kind="read" xpath="/"/>
>      </acl_role>
>      <acl_target id="testuser">
>        <role id="read-access"/>
>      </acl_target>
>      <acl_target id="testuser">
>        <role id="read-access"/>
>      </acl_target>
>    </acls>

>    [root@virt-511 ~]# pcs acl
>    ACLs are enabled

>    User: testuser
>      Roles: read-access
>    User: testuser
>      Roles: read-access
>    Role: read-access
>      Permission: read xpath / (read-access-read)
>    [root@virt-511 ~]#

Check log:

>    [root@virt-511 ~]# tail -f /var/log/messages
>    Dec 21 11:42:25 virt-511 pacemaker-controld[51510]: notice: State transition S_IDLE -> S_POLICY_ENGINE
>    Dec 21 11:42:25 virt-511 pacemaker-controld[51510]: notice: State transition S_ELECTION -> S_INTEGRATION
>    Dec 21 11:42:25 virt-511 pacemaker-controld[51510]: warning: watchdog-fencing-query failed
>    Dec 21 11:42:25 virt-511 pacemaker-fenced[51506]: notice: Versions did not change in patch 0.9.1
>    Dec 21 11:42:25 virt-511 pacemaker-controld[51510]: notice: State transition S_ELECTION -> S_INTEGRATION
>    Dec 21 11:42:25 virt-511 pacemaker-controld[51510]: warning: watchdog-fencing-query failed
>    Dec 21 11:42:25 virt-511 pacemaker-fenced[51506]: notice: Versions did not change in patch 0.9.1
>    Dec 21 11:42:25 virt-511 pacemaker-controld[51510]: notice: State transition S_ELECTION -> S_INTEGRATION
>    Dec 21 11:42:25 virt-511 pacemaker-controld[51510]: warning: watchdog-fencing-query failed
>    Dec 21 11:42:25 virt-511 pacemaker-fenced[51506]: notice: Versions did not change in patch 0.9.1
>    Dec 21 11:42:25 virt-511 pacemaker-controld[51510]: notice: State transition S_ELECTION -> S_INTEGRATION
>    Dec 21 11:42:25 virt-511 pacemaker-controld[51510]: warning: watchdog-fencing-query failed
>    Dec 21 11:42:25 virt-511 pacemaker-fenced[51506]: notice: Versions did not change in patch 0.9.1
>    Dec 21 11:42:25 virt-511 pacemaker-controld[51510]: notice: State transition S_ELECTION -> S_INTEGRATION
>    Dec 21 11:42:25 virt-511 pacemaker-controld[51510]: warning: watchdog-fencing-query failed
>    Dec 21 11:42:25 virt-511 pacemaker-fenced[51506]: notice: Versions did not change in patch 0.9.1
>    Dec 21 11:42:25 virt-511 pacemaker-controld[51510]: notice: State transition S_ELECTION -> S_INTEGRATION
>    Dec 21 11:42:25 virt-511 pacemaker-controld[51510]: warning: watchdog-fencing-query failed
>    Dec 21 11:42:25 virt-511 pacemaker-fenced[51506]: notice: Versions did not change in patch 0.9.1

Result: An infinite DC election loop.


after fix:
----------

>    [root@virt-507 ~]# rpm -q pacemaker
>    pacemaker-2.1.5-4.el8.x86_64

Setup cluster:

>    [root@virt-507 ~]# pcs status
>    Cluster name: STSRHTS29018
>    Status of pacemakerd: 'Pacemaker is running' (last updated 2022-12-19 16:54:14 +01:00)
>    Cluster Summary:
>      * Stack: corosync
>      * Current DC: virt-508 (version 2.1.5-4.el8-a3f44794f94) - partition with quorum
>      * Last updated: Mon Dec 19 16:54:15 2022
>      * Last change:  Mon Dec 19 16:53:54 2022 by root via cibadmin on virt-507
>      * 2 nodes configured
>      * 2 resource instances configured

>    Node List:
>      * Online: [ virt-507 virt-508 ]

>    Full List of Resources:
>      * fence-virt-507	(stonith:fence_xvm):	 Started virt-507
>      * fence-virt-508	(stonith:fence_xvm):	 Started virt-508

>    Daemon Status:
>      corosync: active/disabled
>      pacemaker: active/disabled
>      pcsd: active/enabled

Enable ACLs:

>    [root@virt-507 ~]# pcs acl enable
>    [root@virt-507 ~]# pcs acl
>    ACLs are enabled

Save a copy of CIB and add ACLs:

>    [root@virt-507 ~]# pcs cluster cib > /tmp/cib.xml
>    [root@virt-507 ~]# vim /tmp/cib.xml

Push the updated CIB:

>    [root@virt-507 ~]# date && pcs cluster cib-push --config /tmp/cib.xml
>    Mon 19 Dec 17:03:26 CET 2022
>    CIB updated

Check the ACLs:

>    [root@virt-507 ~]# cibadmin --query --scope acls
>    <acls>
>      <acl_role id="read-access">
>        <acl_permission id="read-access-read" kind="read" xpath="/"/>
>      </acl_role>
>      <acl_target id="testuser">
>        <role id="read-access"/>
>      </acl_target>
>      <acl_target id="testuser">
>        <role id="read-access"/>
>      </acl_target>
>    </acls>

>    [root@virt-507 ~]# pcs acl
>    ACLs are enabled

>    User: testuser
>      Roles: read-access
>    User: testuser
>      Roles: read-access
>    Role: read-access
>      Permission: read xpath / (read-access-read)

Check log:

>    [root@virt-508 ~]# tail -f /var/log/messages
>    Dec 19 17:03:27 virt-508 pacemaker-controld[82133]: notice: State transition S_IDLE -> S_POLICY_ENGINE
>    Dec 19 17:03:27 virt-508 pacemaker-schedulerd[82132]: notice: Calculated transition 7, saving inputs in /var/lib/pacemaker/pengine/pe-input-19.bz2
>    Dec 19 17:03:27 virt-508 pacemaker-controld[82133]: notice: Transition 7 (Complete=0, Pending=0, Fired=0, Skipped=0, Incomplete=0, Source=/var/lib/pacemaker/pengine/pe-input-19.bz2): Complete
>    Dec 19 17:03:27 virt-508 pacemaker-controld[82133]: notice: State transition S_TRANSITION_ENGINE -> S_IDLE


Results: Cluster proceeds normally, no DC election loop.

Marking VERIFIED in pacemaker-2.1.5-4.el8.

Comment 9 errata-xmlrpc 2023-05-16 08:35:22 UTC
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory (pacemaker bug fix and enhancement update), and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2023:2818