Bug 1872490
Summary: | [RFE] Show in cluster status when Pacemaker is waiting on sbd at start-up | |
---|---|---|---
Product: | Red Hat Enterprise Linux 8 | Reporter: | Ken Gaillot <kgaillot>
Component: | pacemaker | Assignee: | Klaus Wenninger <kwenning>
Status: | CLOSED ERRATA | QA Contact: | cluster-qe <cluster-qe>
Severity: | low | Docs Contact: |
Priority: | high | |
Version: | 8.3 | CC: | cluster-maint, jpokorny, kgaillot, msmazova
Target Milestone: | rc | Keywords: | FutureFeature, Triaged
Target Release: | 8.4 | |
Hardware: | All | |
OS: | All | |
Whiteboard: | | |
Fixed In Version: | pacemaker-2.0.5-6.el8 | Doc Type: | Enhancement
Doc Text: | Feature: Cluster status (via crm_mon or "pcs status") will now display more detailed information when the cluster is in the process of starting up. Reason: Previously, cluster status would have deficient or misleading information when the cluster was starting up on the local node. Result: Users get more accurate information when they check cluster status during cluster start-up. | |
Story Points: | --- | |
Clone Of: | | Environment: |
Last Closed: | 2021-05-18 15:26:40 UTC | Type: | Feature Request
Regression: | --- | Mount Type: | ---
Documentation: | --- | CRM: |
Verified Versions: | | Category: | ---
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: |
Cloudforms Team: | --- | Target Upstream Version: |
Embargoed: | | |
Bug Depends On: | | |
Bug Blocks: | 1195703, 1229826, 1251196 | |
Description
Ken Gaillot 2020-08-25 21:45:06 UTC
Probably a good idea to make crm_resource --why show it too.

Rolling back to selinux-policy-3.14.3-11.el8.noarch should be another way to observe the issue, as this will prevent IPC between sbd and pacemaker and thus prevent pacemaker from getting kicked by sbd. But of course the debugger/signal makes it easier to resume normal operation.

'crm_mon' or 'pcs status' called on the node that is waiting may even make it look as if pacemaker wasn't running at all, as the sub-daemon(s) contacted by crm_mon are in fact not running. On the node that is waiting, the state of pacemakerd can be queried using 'crmadmin -P'. If we want status/analysis tools to be able to do that query from other nodes, we might have to make this usable from other nodes as well, while pcs might make use of pcsd instead.

(In reply to Klaus Wenninger from comment #3)
> 'crm_mon' or 'pcs status' called on the node that is waiting may even make
> it look as if pacemaker wasn't running at all as the sub-daemon(s) contacted
> by crm_mon are in fact not running.
> On the node that is waiting the state of pacemakerd can be queried using
> 'crmadmin -P'.
> If we want status/analysis tools to be able to do that query from other
> nodes we might have to make this usable from other nodes as well while pcs
> might make use of pcsd instead.

I was thinking only of crm_mon on the host that's waiting -- if we make crm_mon query pacemakerd first, it could show a useful message before the other daemons start. If run from other nodes, I believe crm_mon will already show the node as "pending" (i.e. in the corosync ring but not joined to the pacemaker cluster), which is probably fine. I don't think we could reasonably do anything different on the pacemaker side, though pcs status could potentially check it via pcsd as you suggested.

Your comment made me realize Bug 1194761 ('[RFE] make crm_mon indicate "pacemaker being started here" as a per-node state') overlaps with this one -- I'll close that one as a duplicate since the comments here have more detail. Basically the idea is to print that the local node is starting instead of showing all other nodes as unclean. Maybe we could show something along the lines of:

* "Pacemaker does not appear to be running on this node" = unable to contact pacemakerd
* "Pacemaker is waiting to be contacted by sbd before starting" = pacemakerd in sbd wait
* "Pacemaker is starting" = pacemakerd starting subdaemons, or no node_state entry for the local node

Beyond that, we could check whether any other node has a node_state entry, but without one we still can't know whether we've finished starting in a single-node situation, started and are now waiting for other nodes, or started and are now cut off from other nodes. So I'm not sure we can (or should) avoid the UNCLEAN messages at that point. Maybe we could show something like "never seen yet" instead of "UNCLEAN" if a node has no node_state.

*** Bug 1194761 has been marked as a duplicate of this bug. ***

Fix merged upstream as of commit 586e69ec.

crm_mon will now display more informative messages at cluster start-up when in interactive console mode, including "Pacemaker daemons starting ...", "Waiting for startup-trigger from SBD ...", and "Waiting for CIB ..." as appropriate (in normal operation they should flash by very quickly, but if any step is slow it will show up). Similarly, there are more informative messages at shutdown. For "pcs status" (or equivalently, running crm_mon in one-shot mode), these states will be shown as error messages.
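As a small sketch of the local query discussed in the comments above: the following commands can be run on the node that is still starting up. The flags are the ones this bug refers to ('crmadmin -P' and crm_mon's one-shot mode); the exact output wording may differ between pacemaker builds, so treat the comments as a rough description rather than literal output.

    # Ask pacemakerd directly for its state; this works even while the other
    # Pacemaker sub-daemons (and therefore the usual status output) are not
    # available yet, e.g. while pacemakerd is still waiting on sbd.
    crmadmin -P

    # One-shot status, the same mode "pcs status" uses; with the fix it reports
    # the transitional start-up/shutdown states as error messages instead of
    # making it look as if Pacemaker were not running at all.
    crm_mon -1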
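To hold a test cluster in the "waiting on sbd" state long enough to see these messages, one possible approach (a sketch only, for disposable test nodes with a watchdog-based sbd setup like the one in the verification below) is to pause sbd with SIGSTOP, watch the status output, and resume it before the watchdog expires:

    # Start cluster services and immediately pause sbd so it cannot deliver
    # the start-up trigger; pacemakerd then sits in its "waiting on sbd" state.
    pcs cluster start --all && killall -STOP sbd

    # In another terminal, interactive crm_mon should show the transitional
    # messages such as "Waiting for startup-trigger from SBD ...".
    crm_mon

    # Resume sbd before the watchdog timeout expires, otherwise the node will
    # eventually be fenced (as happens in the verification steps below).
    killall -CONT sbd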
> [root@virt-175 ~]# rpm -q pacemaker
> pacemaker-2.0.5-6.el8.x86_64
> [root@virt-175 ~]# rpm -q sbd
> sbd-1.4.2-2.el8.x86_64

Configure a cluster using sbd:

> [root@virt-175 ~]# pcs host auth virt-1{75,76} -u hacluster -p password
> virt-175: Authorized
> virt-176: Authorized
> [root@virt-175 ~]# pcs cluster setup test_cluster virt-175 virt-176
> [...]
> Cluster has been successfully set up.

Configure a long sbd timeout:

> [root@virt-175 ~]# pcs stonith sbd enable watchdog=/dev/watchdog SBD_WATCHDOG_TIMEOUT=40
> Running SBD pre-enabling checks...
> virt-175: SBD pre-enabling checks done
> virt-176: SBD pre-enabling checks done
> Warning: auto_tie_breaker quorum option will be enabled to make SBD fencing effective. Cluster has to be offline to be able to make this change.
> Checking corosync is not running on nodes...
> virt-175: corosync is not running
> virt-176: corosync is not running
> Sending updated corosync.conf to nodes...
> virt-175: Succeeded
> virt-176: Succeeded
> Distributing SBD config...
> virt-175: SBD config saved
> virt-176: SBD config saved
> Enabling sbd...
> virt-175: sbd enabled
> virt-176: sbd enabled
> Warning: Cluster restart is required in order to apply these changes.

Start the cluster and immediately pause sbd (SIGSTOP); as a result, the nodes are eventually fenced:

> [root@virt-175 ~]# pcs cluster start --all && killall -STOP sbd
> virt-176: Starting Cluster...
> virt-175: Starting Cluster...

Watch the output of crm_mon in another window during cluster start-up:

> [root@virt-175 ~]# crm_mon
> Waiting until cluster is available on this node ...
> Waiting for startup-trigger from SBD ...

Test whether additional messages are displayed at cluster shutdown. With the cluster running normally:

> [root@virt-175 ~]# pcs status
> Cluster name: test_cluster
> Cluster Summary:
> * Stack: corosync
> * Current DC: virt-175 (version 2.0.5-6.el8-ba59be7122) - partition with quorum
> * Last updated: Mon Feb 15 20:04:14 2021
> * Last change: Mon Feb 15 20:02:19 2021 by hacluster via crmd on virt-175
> * 2 nodes configured
> * 2 resource instances configured
> Node List:
> * Online: [ virt-175 virt-176 ]
> Full List of Resources:
> * dummy1 (ocf::pacemaker:Dummy): Started virt-175
> * dummy2 (ocf::pacemaker:Dummy): Started virt-176
> Daemon Status:
> corosync: active/disabled
> pacemaker: active/disabled
> pcsd: active/enabled
> sbd: active/enabled

Stop the cluster and run `pcs status` in another window:

> [root@virt-175 ~]# pcs cluster destroy --all
> virt-176: Stopping Cluster (pacemaker)...
> virt-175: Stopping Cluster (pacemaker)...
> virt-176: Successfully destroyed cluster
> virt-175: Successfully destroyed cluster

> [root@virt-176 ~]# pcs status
> Error: error running crm_mon, is pacemaker running?
> crm_mon: Error: cluster is not available on this node
> Pacemaker daemons shut down - reporting to SBD ...
> Error: error running crm_mon, is pacemaker running?

Marking verified in pacemaker-2.0.5-6.el8.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (pacemaker bug fix and enhancement update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2021:1782