Bug 2228955 - Race condition when DC and attribute writer are both shutting down
Summary: Race condition when DC and attribute writer are both shutting down
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 8
Classification: Red Hat
Component: pacemaker
Version: 8.4
Hardware: All
OS: All
Priority: urgent
Severity: urgent
Target Milestone: rc
Target Release: 8.9
Assignee: Ken Gaillot
QA Contact: cluster-qe
URL:
Whiteboard:
Depends On: 2228933
Blocks: 2229013
 
Reported: 2023-08-03 18:06 UTC by Ken Gaillot
Modified: 2023-11-14 16:54 UTC (History)

Fixed In Version: pacemaker-2.1.6-8.el8
Doc Type: Bug Fix
Doc Text:
Cause: A node's attribute manager writes all its transient node attributes from memory to the CIB after winning the election for attribute writer, even if its node has requested shutdown.
Consequence: If a node is DC, requests shutdown, and wins the attribute writer election after its controller has left the cluster but before its attribute manager has left, it can write out its shutdown attribute to the CIB. The next time it rejoins the cluster, it will be immediately shut down.
Fix: A node's attribute manager should not write out its attributes after winning an election if shutdown has been requested for its node.
Result: A leaving DC node does not have an unexpected shutdown the next time it rejoins.
Clone Of: 2228933
: 2229013 (view as bug list)
Environment:
Last Closed: 2023-11-14 15:32:36 UTC
Type: Bug
Target Upstream Version: 2.1.7
Embargoed:
pm-rhel: mirror+




Links
System ID Private Priority Status Summary Last Updated
Red Hat Issue Tracker CLUSTERQE-6844 0 None None None 2023-08-04 14:11:15 UTC
Red Hat Issue Tracker RHELPLAN-164425 0 None None None 2023-08-03 18:07:11 UTC
Red Hat Product Errata RHEA-2023:6970 0 None None None 2023-11-14 15:33:44 UTC

Description Ken Gaillot 2023-08-03 18:06:09 UTC
+++ This bug was initially created as a clone of Bug #2228933 +++

Description of problem:

Pacemaker consists of multiple daemons, including the controller and the attribute manager, which both elect one node to have a special role (the Designated Controller a.k.a. DC and the attribute writer).

When a node needs to be shut down, a "shutdown" transient node attribute is created for it.

Transient node attributes are stored both in the CIB and in attribute manager memory. When the DC leaves the cluster, all other nodes remove its transient node attributes from the CIB, including "shutdown". When any node's attribute manager leaves the cluster, its transient node attributes are removed from memory by all other nodes' attribute managers.

When a node wins the attribute writer election, it writes out all its transient node attributes to the CIB.

This creates a race condition when different nodes are the DC and the writer, and both nodes are shutting down while other nodes remain up.

When the DC's controller exits, the remaining nodes erase its attributes from the CIB. However, its attribute manager may still be up at this point. If the former attribute writer leaves at this time, the DC node's attribute manager may win the election for the new attribute writer and write its attributes back to the CIB.

Since the shutdown attribute is written back out, the next time the node joins the cluster, it will immediately be shut down.
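
The sequence above can be sketched as a toy model. This is only an illustration of the ordering problem, not Pacemaker code: the Node class, the function names, and the skip_if_shutting_down flag are all hypothetical, and the CIB is reduced to a plain dictionary.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Toy model of one node's attribute manager state (hypothetical)."""
    name: str
    attrs: dict = field(default_factory=dict)  # transient attributes in memory
    shutdown_requested: bool = False


def request_shutdown(node, when):
    """Record a shutdown request as a transient attribute in memory."""
    node.shutdown_requested = True
    node.attrs["shutdown"] = str(when)


def on_won_writer_election(node, cib, skip_if_shutting_down):
    """Write the node's in-memory transient attributes to the CIB.

    With skip_if_shutting_down=True, a node that has requested shutdown
    writes nothing after winning the election.
    """
    if skip_if_shutting_down and node.shutdown_requested:
        return
    cib.setdefault(node.name, {}).update(node.attrs)


def simulate(skip_if_shutting_down):
    """Return True if the stale shutdown attribute ends up in the CIB."""
    cib = {}
    dc = Node("node-dc")
    request_shutdown(dc, 1692793538)
    cib[dc.name] = dict(dc.attrs)  # shutdown attribute is also in the CIB

    # The DC's controller exits; remaining nodes erase its CIB attributes.
    cib.pop(dc.name, None)

    # The former writer leaves; the DC node's attribute manager is still
    # up, wins the new election, and may write its attributes back.
    on_won_writer_election(dc, cib, skip_if_shutting_down)
    return "shutdown" in cib.get(dc.name, {})
```

In this model, simulate(False) leaves the shutdown attribute in the CIB, while simulate(True) does not, which is the behavior expected once a node that has requested shutdown no longer writes after winning.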


Version-Release number of selected component (if applicable):


How reproducible: Difficult


Steps to Reproduce:

1. Configure a cluster of at least 5 nodes (so that quorum can be retained after shutting down 2).

2. Ensure that different nodes are DC and attribute writer. The DC can be determined with "crmadmin -D". The attribute writer can be determined by searching /var/log/pacemaker/pacemaker.log on all nodes for the most recent "Recorded local node as attribute writer" message. If the same node holds both roles, restart it to force a new election, and repeat until the roles are on different nodes.

3. Shut down the DC and attribute writer at the same time.
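
For step 2, picking the most recent message across nodes can be scripted. The following is a minimal sketch, assuming each log line starts with a "Mon DD HH:MM:SS" timestamp without a year (so the year must be supplied) and that month names are parsed in an English locale. The function name latest_writer and its input shape are hypothetical.

```python
from datetime import datetime

MARKER = "Recorded local node as attribute writer"


def latest_writer(log_lines_by_node, year):
    """Return the node whose log holds the most recent writer message.

    log_lines_by_node maps a node name to lines from that node's
    pacemaker.log. Returns None if no line contains the marker.
    """
    latest = None  # (timestamp, node) of the newest match so far
    for node, lines in log_lines_by_node.items():
        for line in lines:
            if MARKER not in line:
                continue
            # The first three whitespace-separated fields are the timestamp.
            stamp = " ".join(line.split()[:3])
            when = datetime.strptime(f"{year} {stamp}", "%Y %b %d %H:%M:%S")
            if latest is None or when > latest[0]:
                latest = (when, node)
    return None if latest is None else latest[1]
```

If two nodes log the message in the same second, this sketch keeps the first one it encounters, so a tie should be checked by hand.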

Actual results: Sometimes, the CIB will still have a "shutdown" node attribute for the former DC. This can be checked by running "pcs cluster cib" and looking under "transient_attributes" in the "node_state" section for the node.


Expected results: The "shutdown" node attribute for the former DC is never present after it leaves the cluster.
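
The check for a leftover "shutdown" attribute can also be automated. This is a rough sketch that parses CIB XML such as the output of "pcs cluster cib". The function name nodes_with_shutdown is hypothetical, and it assumes each "node_state" element carries a "uname" or "id" attribute identifying the node.

```python
import xml.etree.ElementTree as ET


def nodes_with_shutdown(cib_xml):
    """Return the nodes whose transient attributes include "shutdown"."""
    root = ET.fromstring(cib_xml)
    found = []
    for state in root.iter("node_state"):
        pairs = state.iterfind(
            "transient_attributes/instance_attributes/nvpair"
        )
        for nvpair in pairs:
            if nvpair.get("name") == "shutdown":
                # Prefer the node name; fall back to the node ID.
                found.append(state.get("uname") or state.get("id"))
                break
    return found
```

An empty list after the former DC has left the cluster matches the expected result; any entry for the former DC indicates the problem described here.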


Additional info: If this can't be reproduced, a sanity check of the fixed build is sufficient.

Comment 3 Ken Gaillot 2023-08-03 22:34:25 UTC
Fixed upstream as of commit f5263c94

Comment 8 jrehova 2023-08-25 08:39:41 UTC
Version of pacemaker:
> [root@virt-543:~]# rpm -q pacemaker
> pacemaker-2.1.6-7.el8.x86_64

Determining the DC node:
> [root@virt-543:~]# crmadmin -D
> Designated Controller is: virt-544

Determining the attribute writer node --> virt-546:
> [root@virt-543:~]# for n in 543 544 545 546 547; do echo $n; qarsh -l root virt-$n "grep 'Recorded local node as attribute writer' /var/log/pacemaker/pacemaker.log | tail -1"; done
> 543
> Aug 23 14:19:13 virt-543 pacemaker-attrd     [65920] (attrd_declare_winner) 	notice: Recorded local node as attribute writer (was unset)
> 544
> Aug 23 14:19:13 virt-544 pacemaker-attrd     [65845] (attrd_declare_winner) 	notice: Recorded local node as attribute writer (was unset)
> 545
> Aug 23 14:19:13 virt-545 pacemaker-attrd     [65698] (attrd_declare_winner) 	notice: Recorded local node as attribute writer (was unset)
> 546
> Aug 23 14:19:21 virt-546 pacemaker-attrd     [65700] (attrd_declare_winner) 	notice: Recorded local node as attribute writer (was unset)
> 547
> Aug 23 14:19:13 virt-547 pacemaker-attrd     [65497] (attrd_declare_winner) 	notice: Recorded local node as attribute writer (was unset)

Rebooting both DC and attribute writer nodes at the same time:
> [root@virt-544 ~]# reboot
> [root@virt-546 ~]# reboot

Result: "shutdown" attribute is present in the CIB.

> [root@virt-543:~]# pcs cluster cib | xmllint --xpath '//node_state/transient_attributes' -
> <transient_attributes id="1">
>         <instance_attributes id="status-1">
>           <nvpair id="status-1-.feature-set" name="#feature-set" value="3.17.4"/>
>         </instance_attributes>
>       </transient_attributes><transient_attributes id="3">
>         <instance_attributes id="status-3">
>           <nvpair id="status-3-.feature-set" name="#feature-set" value="3.17.4"/>
>         </instance_attributes>
>       </transient_attributes><transient_attributes id="2">
>         <instance_attributes id="status-2">
>           <nvpair id="status-2-.feature-set" name="#feature-set" value="3.17.4"/>
>           <nvpair id="status-2-shutdown" name="shutdown" value="1692793538"/>
>         </instance_attributes>
>       </transient_attributes><transient_attributes id="5">
>         <instance_attributes id="status-5">
>           <nvpair id="status-5-.feature-set" name="#feature-set" value="3.17.4"/>
>         </instance_attributes>
>       </transient_attributes>

Comment 10 Ken Gaillot 2023-08-28 15:19:28 UTC
The original fix was found to be incomplete. The completed fix has been merged in upstream main branch as of commit 58400e27.

Comment 12 jrehova 2023-09-07 13:17:17 UTC
Marking Verified in version pacemaker-2.1.6-8.el8.x86_64.

Comment 14 errata-xmlrpc 2023-11-14 15:32:36 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (pacemaker bug fix and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2023:6970

