Bug 2168633
| Field | Value |
|---|---|
| Summary: | [BDI] Pacemaker resources left UNCLEAN after controller node failure |
| Product: | Red Hat Enterprise Linux 8 |
| Component: | pacemaker |
| Version: | 8.4 |
| Hardware: | x86_64 |
| OS: | All |
| Status: | CLOSED ERRATA |
| Severity: | urgent |
| Priority: | high |
| Reporter: | Riccardo Bruzzone <rbruzzon> |
| Assignee: | Klaus Wenninger <kwenning> |
| QA Contact: | cluster-qe <cluster-qe> |
| Docs Contact: | Steven J. Levine <slevine> |
| CC: | cfeist, cluster-maint, jrehova, kgaillot, kwenning, lmiccini, matteo.panella, mjuricek, nwahl, sbradley, slevine |
| Keywords: | Triaged, ZStream |
| Flags: | pm-rhel: mirror+ |
| Target Milestone: | rc |
| Target Release: | 8.9 |
| Target Upstream Version: | 2.1.6 |
| Fixed In Version: | pacemaker-2.1.6-1.el8 |
| Doc Type: | Bug Fix |
| Doc Text: | .A fence watchdog configured as a second fencing device now fences a node when the first device times out. Previously, when a watchdog fencing device was configured as the second device in a fencing topology, the watchdog timeout was not considered when calculating the timeout for the fencing operation. As a result, if the first device timed out, the fencing operation would time out even though the watchdog would fence the node. With this fix, the watchdog timeout is included in the fencing operation timeout, and the fencing operation succeeds if the first device times out. |
| Bug Blocks: | 2182482, 2187419, 2187421, 2187422, 2187423 |
| Type: | Bug |
| Last Closed: | 2023-11-14 15:32:36 UTC |
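For context, the scenario in the doc text above, a regular fence device at level 1 with watchdog fencing as the fallback level, might be configured roughly as follows. This is only a hedged sketch: the node name, fence agent, and credentials are placeholders, referencing the implicit `watchdog` device in a topology level assumes sbd is running in watchdog-only mode, and the exact pcs invocation on a given release may differ.

```sh
# Illustrative sketch of a two-level fencing topology with watchdog fallback.
# Device name, node name, and credentials are placeholders.

# Level-1 fence device (example: IPMI)
pcs stonith create fence_ipmi_node1 fence_ipmilan \
    ip=192.0.2.10 username=admin password=secret \
    pcmk_host_list=node1

# Enable watchdog fencing (requires sbd in watchdog-only mode)
pcs property set stonith-watchdog-timeout=30

# Topology: try the IPMI device first, fall back to watchdog fencing
pcs stonith level add 1 node1 fence_ipmi_node1
pcs stonith level add 2 node1 watchdog
```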
Description
Riccardo Bruzzone
2023-02-09 15:38:21 UTC
The "watchdog is not eligible to fence <node>" part is reproducible as of pacemaker-2.1.4-5.el9_1.2.x86_64. The following code appears to add only the local node (and no other nodes) to device->targets, so that we can use the watchdog device only to fence ourselves and not to fence any other node. - https://github.com/ClusterLabs/pacemaker/blame/Pacemaker-2.1.4/daemons/fenced/fenced_commands.c#L1344-L1361 If that's intended behavior, then I'm not sure why. With that being said, on my reproducer cluster (using newer packages), so far there's no real issue: yes, we're unable to use `watchdog` to fence the node, but the node is declared "fenced" anyway after stonith-watchdog-timeout expires. Then resources are recovered. Maybe it's related to the particular timeouts in place, as proposed in chat, or perhaps it's an issue in an older version. (In reply to Reid Wahl from comment #5) > > If that's intended behavior, then I'm not sure why. > Behavior is intended - sort of. If the node to be watchdog-fenced is alive it will then advertise capability to self-fence and - assuming there is no other reason here to self fence like being unclean or without quorum - get the job to do so. Problem I'm seeing is how timeout for a topology is evaluated. If there are 2 levels the timeout will be derived to 2x the standard timeout. So if the fence-action on the 1st level times out we still have a standard-timeout for watchdog-fencing to take place. This is what is observed when the node to be watchdog-fenced is available. In case this node isn't available, timeout for the full topology is derived to just 1x standard timeout as watchdog-fencing - as pointed out above - isn't reported as available by the node. In this scenario all timeout is used up by the first level timing out and thus we haven't got enough time left for the watchdog-timeout. Haven't checked but this optimized timeout evaluation might have been introduced after playing with this scenario initially - or first level timing out was actually never tested as we always had other errors - don't remember ... The least intrusive way to cope with this would probably be to make topology overall-timeout calculation watchdog-aware in a sense that it would still add one timeout for watchdog-fencing - even if reported not available. This would already correct the behavior. When we entirely suppress the unavailable-message for watchdog-fencing we would loose information. Thus I'd recommend tweaking it in a way that it becomes more obvious what is going on. These special cases for watchdog-fencing are kind of ugly and thus alternative ideas are welcome. https://github.com/ClusterLabs/pacemaker/pull/3026 @nwahl: Do you think we still need the additional logs? (In reply to Klaus Wenninger from comment #9) > https://github.com/ClusterLabs/pacemaker/pull/3026 > > @nwahl: Do you think we still need the additional logs? It seems reproducible so probably not, but it's good to know that we have them now :) Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (pacemaker bug fix and enhancement update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHEA-2023:6970 |