Bug 1194761

Summary:	[RFE] make crm_mon indicate "pacemaker being started here" as a per-node state
Product:	Red Hat Enterprise Linux 8	Reporter:	Jan Pokorný [poki] <jpokorny>
Component:	pacemaker	Assignee:	Ken Gaillot <kgaillot>
Status:	CLOSED DUPLICATE	QA Contact:	cluster-qe <cluster-qe>
Severity:	low	Docs Contact:
Priority:	medium
Version:	8.0	CC:	cfeist, cluster-maint, fdinitto, kgaillot, phagara, tojeline
Target Milestone:	pre-dev-freeze	Keywords:	FutureFeature
Target Release:	8.4	Flags:	pm-rhel: mirror+
Hardware:	Unspecified
OS:	Unspecified
Whiteboard:
Fixed In Version:		Doc Type:	Enhancement
Doc Text:		Story Points:	---
Clone Of:
Clones:	1195703		Environment:
Last Closed:	2020-08-26 14:40:55 UTC	Type:	Feature Request
Regression:	---	Mount Type:	---
Documentation:	---	CRM:
Verified Versions:		Category:	---
oVirt Team:	---	RHEL 7.3 requirements from Atomic Host:
Cloudforms Team:	---	Target Upstream Version:
Embargoed:
Bug Depends On:	1682116
Bug Blocks:	1195703, 1229826, 1251196

Description Jan Pokorný [poki] 2015-02-20 17:05:36 UTC

It takes some time from the point pacemaker is started until it reaches
the expected "up and running on the node(s)" state as indicated by crm_mon
(and, in turn, by pcs).

During such a period, "crm_mon -X" output looks like this:

# crm_mon -X
> <crm_mon version="1.1.12">
>     <summary>
>         <last_update time="Fri Feb 20 10:26:54 2015" />
>         <last_change time="" user="" client="clufter 0.3.6a" origin="" />
>         <current_dc present="false" />
>         <nodes_configured number="3" expected_votes="unknown" />
>         <resources_configured number="5" />
>     </summary>
>     <nodes>
> 	<node name="host-034.virt" id="1"
>               online="false"
>               standby="false"
>               standby_onfail="false"
>               maintenance="false"
>               pending="false"
>               unclean="true"
>               shutdown="false"
>               expected_up="false"
>               is_dc="false"
>               resources_running="0"
>               type="member" />

[other nodes ditto]

>     </nodes>
>     <resources>
>     </resources>
> </crm_mon>

leading to this output of pcs:


# pcs status
> Cluster name: 
> Last updated: Fri Feb 20 10:26:55 2015
> Last change:  via clufter 0.3.6a
> Current DC: NONE
> 3 Nodes configured
> 5 Resources configured
> 
> 
> Node host-034.virt (1): UNCLEAN (offline)

[...]

> Daemon Status:
>   corosync: active/disabled
>   pacemaker: active/disabled
>   pcsd: active/enabled

Note that pacemaker is already running.


It would be nice if pacemaker were able to indicate "starting up" on
particular nodes, perhaps by means of introducing a new predicate
(or utilizing/overloading(?) the existing ones).
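
To illustrate why a dedicated predicate is needed, here is a minimal
sketch of what a client can do today with only the attributes shown in
the "crm_mon -X" output above. The function name and the heuristic are
assumptions made for illustration, not anything pacemaker or pcs
provides: it treats "no DC present" plus "offline, unclean, not shutting
down" as "possibly starting". That guess cannot tell a starting node
from a genuinely unclean one, which is exactly the ambiguity this
request is about.

```python
import xml.etree.ElementTree as ET


def nodes_possibly_starting(crm_mon_xml: str) -> list[str]:
    """Return names of nodes that *might* still be starting up.

    Heuristic only (an assumption, not pacemaker's own logic): no DC is
    present yet, and the node is reported offline and unclean while not
    shutting down.
    """
    root = ET.fromstring(crm_mon_xml)
    dc = root.find("./summary/current_dc")
    # Once a DC is present (or the element is missing), make no guess.
    if dc is None or dc.get("present") != "false":
        return []
    return [
        node.get("name")
        for node in root.findall("./nodes/node")
        if node.get("online") == "false"
        and node.get("unclean") == "true"
        and node.get("shutdown") == "false"
    ]
```

The input is expected to be the complete XML document as printed by
"crm_mon -X", without the "> " quoting used in this report.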

More broadly, it might also be good if pcs were able to
wait/block until selected/all(?) nodes that are starting are indeed
started.  Some similar polling mechanisms were introduced to pcs
recently; I expect this would be in the same vein.
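
As a rough sketch of the kind of polling meant here (not how pcs
implements its waits), a client could re-run "crm_mon -X" until the
selected nodes report online="true" or a timeout expires. The function
names, parameters and default timings below are hypothetical; only the
"crm_mon -X" invocation and the "online" attribute are taken from the
output above.

```python
import subprocess
import time
import xml.etree.ElementTree as ET


def fetch_crm_mon_xml() -> str:
    """Run `crm_mon -X` (the invocation shown in this report)."""
    return subprocess.run(
        ["crm_mon", "-X"], check=True, capture_output=True, text=True
    ).stdout


def wait_for_nodes_online(node_names, timeout=60.0, interval=2.0,
                          fetch=fetch_crm_mon_xml, sleep=time.sleep,
                          clock=time.monotonic):
    """Poll until every node in node_names is online; False on timeout."""
    wanted = set(node_names)
    deadline = clock() + timeout
    while True:
        try:
            root = ET.fromstring(fetch())
            online = {
                node.get("name")
                for node in root.findall("./nodes/node")
                if node.get("online") == "true"
            }
        except (subprocess.CalledProcessError, ET.ParseError):
            # crm_mon may fail or print nothing usable while starting.
            online = set()
        if wanted <= online:
            return True
        if clock() >= deadline:
            return False
        sleep(interval)
```

For example, wait_for_nodes_online(["host-034.virt"], timeout=120)
would block for up to two minutes. The fetch, sleep and clock parameters
exist only so the loop can be exercised without a running cluster.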

Comment 1 Jan Pokorný [poki] 2018-08-01 11:59:55 UTC

See also this tangential discussion: there is apparently demand for
identifying transient states or, conversely, for hooking into the
moment the target state (here "ready to provide CIB data") is reached
without busy-waiting:

https://lists.clusterlabs.org/pipermail/developers/2018-July/001271.html

Comment 4 Ken Gaillot 2020-08-26 14:40:55 UTC


*** This bug has been marked as a duplicate of bug 1872490 ***