Bug 1195703

Summary: [RFE] make pcs indicate "pacemaker being started here" as a per-node state, with an option to wait until start process has finished
Product: Red Hat Enterprise Linux 8
Reporter: Jan Pokorný [poki] <jpokorny>
Component: pcs
Assignee: Tomas Jelinek <tojeline>
Status: CLOSED WONTFIX
QA Contact: cluster-qe <cluster-qe>
Severity: unspecified
Docs Contact:
Priority: low
Version: 8.0
CC: cfeist, cluster-maint, cluster-qe, fdinitto, idevat, omular, phagara, tojeline
Target Milestone: rc
Keywords: FutureFeature
Target Release: 8.1
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: Enhancement
Doc Text:
Story Points: ---
Clone Of: 1194761
: 1229822
Environment:
Last Closed: 2020-11-01 03:02:39 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On: 1194761, 1872490
Bug Blocks:

Description Jan Pokorný [poki] 2015-02-24 12:45:39 UTC
Please see the last paragraph of the original bug below.
The rest sets the context/use case.

It would likely need some heavy lifting on the pacemaker side as per
that very bug (hence it is a blocker for this one).


+++ This bug was initially created as a clone of Bug #1194761 +++

It takes some time from the point pacemaker is started until it reaches
the expected "up and running on the node(s)" state as indicated by crm_mon
(and, in turn, by pcs).

During such a period, "crm_mon -X" output looks like this:

# crm_mon -X
> <crm_mon version="1.1.12">
>     <summary>
>         <last_update time="Fri Feb 20 10:26:54 2015" />
>         <last_change time="" user="" client="clufter 0.3.6a" origin="" />
>         <current_dc present="false" />
>         <nodes_configured number="3" expected_votes="unknown" />
>         <resources_configured number="5" />
>     </summary>
>     <nodes>
>         <node name="host-034.virt" id="1"
>               online="false"
>               standby="false"
>               standby_onfail="false"
>               maintenance="false"
>               pending="false"
>               unclean="true"
>               shutdown="false"
>               expected_up="false"
>               is_dc="false"
>               resources_running="0"
>               type="member" />

[other nodes ditto]

>     </nodes>
>     <resources>
>     </resources>
> </crm_mon>

leading to this output of pcs:


# pcs status
> Cluster name: 
> Last updated: Fri Feb 20 10:26:55 2015
> Last change:  via clufter 0.3.6a
> Current DC: NONE
> 3 Nodes configured
> 5 Resources configured
> 
> 
> Node host-034.virt (1): UNCLEAN (offline)

[...]

> Daemon Status:
>   corosync: active/disabled
>   pacemaker: active/disabled
>   pcsd: active/enabled

Note that pacemaker is already running.
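
To make the ambiguity concrete, here is a minimal sketch of how a client
could guess at the "still starting" situation from the output above. It
assumes the XML layout shown in the "crm_mon -X" sample; the function name
looks_like_starting and the heuristic itself (daemon active, no DC
elected, no node online) are illustrative assumptions, not something
pacemaker or pcs provides.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the crm_mon -X output quoted above.
SAMPLE = """\
<crm_mon version="1.1.12">
    <summary>
        <current_dc present="false" />
        <nodes_configured number="3" expected_votes="unknown" />
    </summary>
    <nodes>
        <node name="host-034.virt" id="1" online="false" unclean="true" />
    </nodes>
    <resources>
    </resources>
</crm_mon>
"""


def looks_like_starting(crm_mon_xml, pacemaker_active):
    """Heuristic guess only (an assumption, not a pacemaker feature).

    Returns True when the daemon is reported active, yet no DC has been
    elected and no node is online.
    """
    root = ET.fromstring(crm_mon_xml)
    dc = root.find("./summary/current_dc")
    has_dc = dc is not None and dc.get("present") == "true"
    nodes = root.findall("./nodes/node")
    any_online = any(node.get("online") == "true" for node in nodes)
    return pacemaker_active and not has_dc and not any_online
```

Such a guess cannot tell a cluster that is starting from one that is
genuinely stuck in this state, which is why an explicit indication from
pacemaker is requested below.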


It would be nice if pacemaker was able to indicate "starting up" on
particular nodes, perhaps by means of introducing a new predicate
(or utilizing/overloading? the existing ones).

In a wider perspective, it might also be good if pcs was able to
wait/block until selected/all(?) nodes that are starting are indeed
started.  Some similar polling mechanisms were introduced to pcs
recently, I expect this would be in the same vein.

Comment 1 Jan Pokorný [poki] 2015-02-25 14:43:27 UTC
re "with an option to wait until start process has finished" part:

A precedent for that has already been established with [bug 1156311],
i.e., I would expect something along the lines of:

(1)    pcs cluster start --wait

and perhaps even something like:

(2)    pcs cluster status --wait-started


Note that in case of (1), the chances of exposing state-handling
errors/race conditions[*] to cluster users (especially automated scripts)
ought to be eliminated, and hence this would offer the safest way to
achieve "hold further execution until the cluster is fully functional".

[*] between a possibly asynchronous/failing/blocked start of the cluster
    runtime and waiting until it is fully started
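
A rough sketch of the "start, then block until started" behaviour of (1),
with a timeout so that a blocked start cannot hang the caller forever.
The callables start and is_started are hypothetical placeholders for
whatever would launch the cluster and query its state; nothing here
reflects how pcs actually implements waiting.

```python
import time


def wait_until(predicate, timeout=60.0, interval=1.0):
    """Poll predicate() until it returns true or the timeout expires.

    Returns True if the predicate became true, False on timeout.
    """
    deadline = time.monotonic() + timeout
    while True:
        if predicate():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


def start_and_wait(start, is_started, timeout=60.0, interval=1.0):
    """Run start() and then wait for is_started() in a single call.

    Both callables are hypothetical. An exception raised by start()
    propagates to the caller before any waiting begins.
    """
    start()
    return wait_until(is_started, timeout, interval)
```

Keeping the start and the wait in one call means the caller never has to
decide on its own whether the start has been issued yet, which is the
race described in [*].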

Comment 2 Jan Pokorný [poki] 2015-02-25 14:46:20 UTC
re (2) from [comment 1]:

Apparently, pcs would only wait if it detected that the cluster is being
started (likely thanks to the feature proposed in blocker [bug 1194761]).
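
As a sketch of that decision, assuming pacemaker exposed a distinct
per-node "starting" state (which it does not at the time of this report;
the NodeState names below are hypothetical):

```python
from enum import Enum


class NodeState(Enum):
    # Hypothetical per-node states; "STARTING" is the one this RFE asks for.
    STOPPED = "stopped"
    STARTING = "starting"
    STARTED = "started"


def should_wait(states):
    """Wait only if at least one node is in the middle of starting."""
    return any(state is NodeState.STARTING for state in states)
```

With this, a fully started or fully stopped cluster returns immediately
instead of blocking.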

Comment 3 Jan Pokorný [poki] 2015-02-25 18:34:43 UTC
FWIW, this is how phd currently deals with a similar task IIUIC:
https://github.com/davidvossel/phd/blob/49fad931331cdfc32e1e9b5d1f752890510c1da3/lib/phd_utils_api.sh#L405

Comment 5 Jan Pokorný [poki] 2018-08-01 12:01:46 UTC
See [bug 1194761 comment 1].

Comment 9 RHEL Program Management 2020-11-01 03:02:39 UTC
After evaluating this issue, there are no plans to address it further or fix it in an upcoming release.  Therefore, it is being closed.  If plans change such that this issue will be fixed in an upcoming release, then the bug can be reopened.