Bug 1869728 - sbd always triggers a reboot, although with no-quorum-policy=stop, assuring that all resources are down within watchdog-timeout might be safe enough
Summary: sbd always triggers a reboot, although with no-quorum-policy=stop, assuring that all resources are down within watchdog-timeout might be safe enough
Keywords:
Status: CLOSED WONTFIX
Alias: None
Product: Red Hat Enterprise Linux 8
Classification: Red Hat
Component: sbd
Version: 8.3
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: rc
Target Release: 8.4
Assignee: Klaus Wenninger
QA Contact: cluster-qe@redhat.com
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2020-08-18 14:08 UTC by Klaus Wenninger
Modified: 2022-02-18 07:27 UTC
CC List: 3 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2022-02-18 07:27:18 UTC
Type: Bug
Target Upstream Version:
Embargoed:



Description Klaus Wenninger 2020-08-18 14:08:02 UTC
Description of problem:
On quorum loss, sbd always triggers a reboot, although with no-quorum-policy=stop, assuring that all resources are down within watchdog-timeout might be safe enough (i.e. the reboot could be avoided).

Version-Release number of selected component (if applicable):
sbd-1.4.0-7.el8

How reproducible:
100%

Steps to Reproduce:
1. Set up a simple 3-node cluster with resources that shut down quickly
2. Configure watchdog fencing with sbd
3. Set no-quorum-policy=stop (or leave it at the default)
4. Disconnect the nodes from each other (one possible command sequence is sketched below)
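
For reference, a minimal sketch of steps 2-4 using pcs (assumptions: a diskless/watchdog-only sbd setup and corosync on its default UDP port 5405; exact syntax and values may differ between releases):

  # with the cluster stopped, enable sbd across the cluster without a shared device
  pcs stonith sbd enable
  # have pacemaker rely on the hardware watchdog for self-fencing
  pcs property set stonith-watchdog-timeout=10s
  # explicit default: stop all resources on quorum loss
  pcs property set no-quorum-policy=stop
  # one way to "disconnect" the nodes: drop corosync traffic on every node
  iptables -A INPUT -p udp --dport 5405 -j DROP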

Actual results:
all 3 nodes will reboot after watchdog-timeout

Expected results:
The cluster will try to shut down resources, and if it succeeds within watchdog-timeout, the nodes won't reboot.

Additional info:

Comment 1 Klaus Wenninger 2020-08-31 09:00:30 UTC
Discussion

Comment 2 Klaus Wenninger 2020-08-31 09:18:38 UTC
Discussion revealed that rejoining a node that has successfully stopped its resources within the given timeout won't mimic a fence reboot well enough for the rest of the cluster. One example is leftover transient attributes; in general this would probably also impose some burden on testing, as we'd somehow have to ensure that the behavior is comparable to that after a reboot.

In general we might think about use cases where preventing a reboot really is that desirable.

A brainstorming list could start with:
  - certain server hardware is quite slow to reboot, while a quorum loss might go away quickly, so we could recover the cluster sooner
  - it is always a pain for an admin to find that a shell they were using to observe node behavior has been starved/closed because of a reboot
  - the node might run services outside of pacemaker control that would be unnecessarily affected
  - ...
These arguments are valid for most cluster scenarios, but the unnecessary reboots might be more annoying with watchdog fencing, as we might expect such issues to happen more frequently.
Should a cluster node run anything but services under pacemaker control? Maybe not; but maybe there are reasons why it makes sense ...

Another possibility that came to my mind is the introduction of a new no-quorum-policy=shutdown (or whatever name poses less risk of misunderstanding) that would make the node attempt a graceful pacemaker shutdown. sbd would again allow watchdog-timeout for this to happen, and if it detects a graceful shutdown of pacemaker (without resources running, i.e. not in maintenance mode), it would be content and not trigger an actual reboot.
That way, from a testing perspective, we would have the same case as a manual service stop/start.
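
To make the proposal concrete, the hypothetical configuration might look roughly like this (no-quorum-policy=shutdown is not an existing pacemaker option value; it only sketches the idea above):

  # hypothetical value, not implemented: on quorum loss the node would attempt
  # a clean pacemaker shutdown, and sbd would self-fence only if that does not
  # finish within watchdog-timeout
  pcs property set no-quorum-policy=shutdown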

Comment 3 Ken Gaillot 2020-08-31 16:47:57 UTC
(In reply to Klaus Wenninger from comment #2)
> Discussion revealed that rejoining such a node that has successfully stopped
> resources within the given timeout won't mimic a fence-reboot well enough
> for the rest of the cluster. Examples are e.g. leftover transient attributes
> and in general this would probably impose some burden on testing as we'd
> somehow have to ensure behavior being comparable to after a reboot.

It's a tough question. There's no way to mimic what happens with nodes not using sbd:

- If any nodes retain quorum, they will fence the nodes without quorum.

- Any node that loses quorum will stop resources if it's able, but leave pacemaker running so it can rejoin the cluster if quorum is regained before fencing is scheduled against it.

The problem of course is that sbd can't know if any other nodes retain quorum, so it has to fence to be safe.

As you suggest, if we can absolutely guarantee that all resources are stopped, and pacemaker and corosync are restarted, then perhaps fencing should be considered unnecessary. On the other hand, sbd can't guarantee that pacemaker and corosync behave correctly once restarted, which may violate assumptions held by any surviving partition. We could stop pacemaker and corosync instead of restarting them, but then the node can't rejoin if quorum is regained, so the only practical benefit is less chance of losing logs.

> In general we might think over use-cases where it really is that desirable
> to prevent a reboot.
> 
> A list for brain-storming could start with:
>   - certain server-hardware is quite slow on a reboot while a quorum-loss
> might go away quickly and we could recover the cluster quicker

Just brainstorming, what about a separate quorum loss timeout? If pacemaker detects sbd running and sees a quorum loss timeout from the sbd sysconfig, it would wait that long before declaring the node fencing successful. The timeout would have to be identical on all nodes.

That would slow down quorum recovery for the chance of the node rejoining more quickly. Users would have to balance the two concerns.
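
To illustrate, such a timeout might sit alongside the existing sbd sysconfig settings roughly as follows (SBD_QUORUM_LOSS_TIMEOUT is a hypothetical name for the idea above; the other two variables exist today):

  # /etc/sysconfig/sbd (sketch)
  SBD_WATCHDOG_DEV=/dev/watchdog
  SBD_WATCHDOG_TIMEOUT=10
  # hypothetical: extra time to wait after quorum loss before the lost node's
  # fencing is declared successful; would have to be identical on all nodes
  SBD_QUORUM_LOSS_TIMEOUT=30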

>   - it is always a pain for an admin to find a shell he was using to observe
> the node behavior to be starved/closed because of a reboot
>   - the node might run services outside of pacemaker-control that would be
> unnecessarily affected

I don't think that's an issue since fencing is always a possibility, so the admin must already incorporate that into any policy regarding non-clustered services.

>   - ...
> These arguments are valid for most cluster-scenarios but might be more
> annoying with watchdog fencing as we might expect issues happening more
> frequently.
> Should a cluster-node run anything but services under pacemaker control -
> maybe not - maybe there are reasons why it makes sense ...
> 
> Another possibility that came to my mind was introduction of a new
> no-quorum-policy=shutdown (or whatever imposes less risk of
> missunderstanding) that would make the node attempt a graceful
> pacemaker-shutdown. SBD would again allow watchdog-timeout for this to
> happen and if it detects a graceful-shutdown of pacemaker (without resources
> running - meaning not in maintenance mode) it would be content and not
> trigger an actual reboot.
> Like this from a testing-perspective we would have the same case as a manual
> service-stop/start.

Per above, I think the problem is that either the node can't rejoin if quorum is regained, or we risk corosync/pacemaker operating without any observation or check from the quorate partition.

Comment 8 RHEL Program Management 2022-02-18 07:27:18 UTC
After evaluating this issue, we have no plans to address it further or fix it in an upcoming release; therefore, it is being closed. If plans change such that this issue will be fixed in an upcoming release, the bug can be reopened.

