Bug 2228933 - Race condition when DC and attribute writer are both shutting down
Summary: Race condition when DC and attribute writer are both shutting down
Keywords:
Status: ON_QA
Alias: None
Product: Red Hat Enterprise Linux 9
Classification: Red Hat
Component: pacemaker
Version: 9.2
Hardware: All
OS: All
Priority: urgent
Severity: urgent
Target Milestone: rc
Target Release: 9.3
Assignee: Ken Gaillot
QA Contact: cluster-qe
URL:
Whiteboard:
Duplicates: 2230133
Depends On:
Blocks: 2228955 2229014
Reported: 2023-08-03 16:57 UTC by Ken Gaillot
Modified: 2023-08-10 15:41 UTC (History)

Fixed In Version: pacemaker-2.1.6-8.el9
Doc Type: Bug Fix
Doc Text:
Cause: A node's attribute manager writes all its transient node attributes from memory to the CIB after winning the election for attribute writer, even if its node has requested shutdown.
Consequence: If a node is DC, requests shutdown, and wins the attribute writer election after its controller has left the cluster but before its attribute manager has left, it can write out its shutdown attribute to the CIB. The next time it rejoins the cluster, it will be immediately shut down.
Fix: A node's attribute manager does not write out its attributes after winning an election if shutdown has been requested for its node.
Result: A leaving DC node does not have an unexpected shutdown the next time it rejoins.
Clone Of:
: 2228955 2229014 (view as bug list)
Environment:
Last Closed:
Type: Bug
Target Upstream Version: 2.1.7
Embargoed:




Links
System                 ID               Private  Priority  Status  Summary  Last Updated
Red Hat Issue Tracker  CLUSTERQE-6845   0        None      None    None     2023-08-04 14:17:05 UTC
Red Hat Issue Tracker  RHELPLAN-164410  0        None      None    None     2023-08-03 16:59:01 UTC

Description Ken Gaillot 2023-08-03 16:57:54 UTC
Description of problem:

Pacemaker consists of multiple daemons, including the controller and the attribute manager. Each of these two daemons elects one node to a special role: the controller elects the Designated Controller (DC), and the attribute manager elects the attribute writer.

When a node needs to be shut down, a "shutdown" transient node attribute is created for it.

Transient node attributes are stored both in the CIB and in attribute manager memory. When the DC leaves the cluster, all other nodes remove its transient node attributes from the CIB, including "shutdown". When any node's attribute manager leaves the cluster, its transient node attributes are removed from memory by all other nodes' attribute managers.

When a node wins the attribute writer election, it writes out all its transient node attributes to the CIB.

This creates a race condition when different nodes are the DC and the writer, and both nodes are shutting down while other nodes remain up.

When the DC's controller exits, the remaining nodes erase the DC node's transient attributes from the CIB. However, the DC node's attribute manager may still be running at this point. If the previous attribute writer leaves at this time, the former DC's attribute manager may win the election for the new attribute writer and write its in-memory attributes, including "shutdown", back to the CIB.

Since the shutdown attribute is written back out, the next time the node joins the cluster, it will immediately be shut down.
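
The sequence above can be sketched with a toy model. This is an illustration only, not Pacemaker code: the Node class, the dict standing in for the CIB, and the function names are all hypothetical, and the skip_if_shutting_down flag merely mimics the behaviour described in the Doc Text fix.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    shutdown_requested: bool = False
    # Transient attributes held in this node's attribute manager memory
    attrs: dict = field(default_factory=dict)

    def request_shutdown(self):
        self.shutdown_requested = True
        self.attrs["shutdown"] = "1"


def on_controller_exit(cib, node):
    """Remaining nodes erase the departed node's transient attributes from the CIB."""
    cib.pop(node.name, None)


def on_writer_election_won(cib, node, skip_if_shutting_down):
    """The election winner writes its in-memory attributes to the CIB.

    skip_if_shutting_down=True models the fix described in this report.
    """
    if skip_if_shutting_down and node.shutdown_requested:
        return
    if node.attrs:
        cib[node.name] = dict(node.attrs)


def simulate(skip_if_shutting_down):
    cib = {}  # hypothetical stand-in: node name -> transient attributes
    dc = Node("dc-node")
    dc.request_shutdown()
    cib[dc.name] = dict(dc.attrs)  # "shutdown" is recorded in the CIB
    on_controller_exit(cib, dc)    # DC's controller leaves; peers clean the CIB
    # The old writer leaves now; the former DC's attribute manager is still
    # up and wins the writer election.
    on_writer_election_won(cib, dc, skip_if_shutting_down)
    return cib


print(simulate(skip_if_shutting_down=False))  # {'dc-node': {'shutdown': '1'}}
print(simulate(skip_if_shutting_down=True))   # {}
```

In the first run the stale "shutdown" entry reappears after the cleanup, which is the state that causes the node to be shut down when it rejoins. In the second run the winner skips the write because shutdown has been requested.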


Version-Release number of selected component (if applicable):


How reproducible: Difficult


Steps to Reproduce:

1. Configure a cluster of at least 5 nodes (so that quorum can be retained after shutting down 2).

2. Ensure that different nodes are DC and attribute writer. The DC can be determined with "crmadmin -D". The attribute writer can be determined by searching /var/log/pacemaker/pacemaker.log on all nodes for the most recent "Recorded local node as attribute writer" message. If the same node holds both roles, restart the existing winner to force a new election, and repeat until the roles are on different nodes.

3. Shut down the DC and attribute writer at the same time.
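
For step 2, the log search can be scripted. The following is a minimal sketch that assumes only that the message text quoted above appears somewhere in the line; it makes no assumption about the rest of the log line format. The function names are hypothetical helpers, not part of any Pacemaker tool.

```python
WRITER_MESSAGE = "Recorded local node as attribute writer"


def last_writer_line(log_lines):
    """Return the most recent line containing WRITER_MESSAGE, or None."""
    latest = None
    for line in log_lines:
        if WRITER_MESSAGE in line:
            latest = line.rstrip("\n")
    return latest


def read_log(path="/var/log/pacemaker/pacemaker.log"):
    """Read the log file, tolerating undecodable bytes."""
    with open(path, errors="replace") as handle:
        return handle.readlines()
```

Running last_writer_line(read_log()) on each node and comparing the timestamps of the returned lines by hand should indicate which node most recently recorded itself as the writer.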

Actual results: Sometimes, the CIB will still have a "shutdown" node attribute for the former DC. This can be checked by running "pcs cluster cib" and looking under "transient_attributes" in the node's "node_state" section.


Expected results: The "shutdown" node attribute for the former DC is never present after it leaves the cluster.
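
The check described under "Actual results" can be automated against saved "pcs cluster cib" output. This is a sketch using Python's standard xml.etree.ElementTree. The SAMPLE fragment is hand-written for illustration and is not real cluster output; the function name is hypothetical.

```python
import xml.etree.ElementTree as ET


def nodes_with_shutdown_attr(cib_xml):
    """Return uname (or id) of each node_state that has a "shutdown" transient attribute."""
    root = ET.fromstring(cib_xml)
    found = []
    for state in root.iter("node_state"):
        # Look at every nvpair nested under this node's transient_attributes
        for nvpair in state.iterfind("./transient_attributes//nvpair"):
            if nvpair.get("name") == "shutdown":
                found.append(state.get("uname") or state.get("id"))
                break
    return found


# Hand-written minimal fragment (an assumption, not captured from a cluster)
SAMPLE = """
<cib>
  <status>
    <node_state id="1" uname="node1">
      <transient_attributes id="1">
        <instance_attributes id="status-1">
          <nvpair id="status-1-shutdown" name="shutdown" value="1"/>
        </instance_attributes>
      </transient_attributes>
    </node_state>
    <node_state id="2" uname="node2"/>
  </status>
</cib>
"""

print(nodes_with_shutdown_attr(SAMPLE))  # ['node1']
```

After the former DC has left the cluster, an empty list for that node corresponds to the expected result; its name appearing in the list corresponds to the bug.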


Additional info: If the race cannot be reproduced, the fix can be verified by sanity checking only.

Comment 3 Ken Gaillot 2023-08-03 22:36:40 UTC
Fixed upstream as of commit f5263c94

Comment 5 Ken Gaillot 2023-08-09 14:40:15 UTC
*** Bug 2230133 has been marked as a duplicate of this bug. ***

