Bug 1721198
Summary: | A pacemaker_remoted node fails monitor (probe) and stop/start operations on a resource because it returns "rc=189 | ||
---|---|---|---|
Product: | Red Hat Enterprise Linux 8 | Reporter: | Ken Gaillot <kgaillot> |
Component: | pacemaker | Assignee: | Ken Gaillot <kgaillot> |
Status: | CLOSED ERRATA | QA Contact: | pkomarov |
Severity: | high | Docs Contact: | |
Priority: | high | ||
Version: | 8.0 | CC: | abeekhof, aherr, cluster-maint, cluster-qe, kwenning, lmiccini, pkomarov, sbradley, toneata, ykulkarn |
Target Milestone: | rc | Keywords: | ZStream |
Target Release: | 8.1 | ||
Hardware: | Unspecified | ||
OS: | Unspecified | ||
Whiteboard: | |||
Fixed In Version: | pacemaker-2.0.2-2.el8 | Doc Type: | Bug Fix |
Doc Text: |
Cause: Pacemaker implicitly ordered all stops needed on a Pacemaker Remote node before the stop of the node's Pacemaker Remote connection, including stops that were implied by fencing of the node. Also, Pacemaker scheduled actions on Pacemaker Remote nodes with a failed connection so that the actions could be done once the connection is recovered, even if the connection wasn't being recovered (for example, if the node was shutting down when the failure occurred).
Consequence: If a Pacemaker Remote node needed to be fenced while it was in the process of shutting down, Pacemaker scheduled probes on the node once the fencing completed. The probes failed because the connection was not actually active. The failed probe caused a stop to be scheduled, which also failed, leading to the node being fenced again, and the cycle repeated indefinitely.
Fix: Pacemaker Remote connection stops are no longer ordered after implied stops, and actions are not scheduled on Pacemaker Remote nodes when the connection has failed and is not being started again.
Result: A Pacemaker Remote node that needs to be fenced while it is in the process of shutting down is fenced once, without repeating indefinitely.
|
Story Points: | --- |
Clone Of: | 1704870 | Environment: | |
Last Closed: | 2019-11-05 20:57:48 UTC | Type: | Bug |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | Category: | --- | |
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: | |||
Bug Depends On: | 1704870 | ||
Bug Blocks: | 1703946 |
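The loop described in the Doc Text above can be illustrated with a small toy model. This is only a sketch under stated assumptions, not Pacemaker's scheduler code: the names RemoteNode, should_schedule_actions, and count_fencing_rounds are hypothetical, and the round cap exists only so the pre-fix behavior terminates in the example.

```python
from dataclasses import dataclass


@dataclass
class RemoteNode:
    """Hypothetical, simplified view of a Pacemaker Remote node's state."""
    name: str
    connection_failed: bool      # the Pacemaker Remote connection has failed
    connection_recovering: bool  # the connection is scheduled to start again


def should_schedule_actions(node: RemoteNode, fixed: bool) -> bool:
    """Decide whether probes/stops may be scheduled on the node.

    fixed=False models the old behavior: actions were scheduled on a node
    with a failed connection even if the connection was not being recovered.
    fixed=True models the behavior described under "Fix".
    """
    if not node.connection_failed:
        return True
    if fixed:
        return node.connection_recovering
    return True


def count_fencing_rounds(fixed: bool, max_rounds: int = 5) -> int:
    """Count fencing events for a node fenced while it is shutting down."""
    # Scenario from the Doc Text: the connection failed during shutdown,
    # so nothing is going to start it again.
    node = RemoteNode("remote1", connection_failed=True,
                      connection_recovering=False)
    fences = 0
    for _ in range(max_rounds):
        fences += 1  # the node is fenced
        if not should_schedule_actions(node, fixed):
            break  # no probes are scheduled, so the cycle ends here
        # Otherwise: the probe fails (connection not active), a stop is
        # scheduled and also fails, and the node must be fenced again.
    return fences


if __name__ == "__main__":
    print("fencing rounds with fix:", count_fencing_rounds(fixed=True))
    print("fencing rounds without fix (capped):",
          count_fencing_rounds(fixed=False))
```

In this model the fixed case stops after a single fencing, while the unfixed case keeps fencing until it hits the artificial cap, mirroring the "repeats itself indefinitely" consequence.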
Description
Ken Gaillot
2019-06-17 14:47:08 UTC
The fix is already in the 8.1 build via the rebase Bug 1695737, but I made this separate BZ so we could ask for an 8.0.0 z-stream.

This bug has been copied as 8.0.0 z-stream bug#1734066 and now must be resolved in the current update release, set blocker flag.

Verified; verification steps at: https://bugzilla.redhat.com/show_bug.cgi?id=1734066#c3

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2019:3385