Bug 1195703 - [RFE] make pcs indicate "pacemaker being started here" as a per-node state, with an option to wait until start process has finished
Summary: [RFE] make pcs indicate "pacemaker being started here" as a per-node state, with an option to wait until start process has finished
Keywords:
Status: CLOSED WONTFIX
Alias: None
Product: Red Hat Enterprise Linux 8
Classification: Red Hat
Component: pcs
Version: 8.0
Hardware: Unspecified
OS: Unspecified
Priority: low
Severity: unspecified
Target Milestone: rc
Target Release: 8.1
Assignee: Tomas Jelinek
QA Contact: cluster-qe@redhat.com
URL:
Whiteboard:
Depends On: 1194761 1872490
Blocks:
 
Reported: 2015-02-24 12:45 UTC by Jan Pokorný [poki]
Modified: 2020-11-03 20:32 UTC
CC List: 8 users

Fixed In Version:
Doc Type: Enhancement
Doc Text:
Clone Of: 1194761
: 1229822 (view as bug list)
Environment:
Last Closed: 2020-11-01 03:02:39 UTC
Type: Bug
Target Upstream Version:
Embargoed:


Attachments: none


Links:
System            ID       Private  Priority  Status  Summary  Last Updated
Red Hat Bugzilla  1207345  1        None      None    None     2021-01-20 06:05:38 UTC

Internal Links: 1207345

Description Jan Pokorný [poki] 2015-02-24 12:45:39 UTC
Please see the last paragraph of the original bug below.
The rest sets the context/use case.

It would likely need some heavy lifting on the pacemaker side as per
that very bug (hence it is a blocker for this one).


+++ This bug was initially created as a clone of Bug #1194761 +++

It takes some time from the point pacemaker is started until it reaches
the expected "up and running on the node(s)" state as indicated by crm_mon
(and, in turn, by pcs).

During such a period, "crm_mon -X" output looks like this:

# crm_mon -X
> <crm_mon version="1.1.12">
>     <summary>
>         <last_update time="Fri Feb 20 10:26:54 2015" />
>         <last_change time="" user="" client="clufter 0.3.6a" origin="" />
>         <current_dc present="false" />
>         <nodes_configured number="3" expected_votes="unknown" />
>         <resources_configured number="5" />
>     </summary>
>     <nodes>
> 	<node name="host-034.virt" id="1"
>               online="false"
>               standby="false"
>               standby_onfail="false"
>               maintenance="false"
>               pending="false"
>               unclean="true"
>               shutdown="false"
>               expected_up="false"
>               is_dc="false"
>               resources_running="0"
>               type="member" />

[other nodes ditto]

>     </nodes>
>     <resources>
>     </resources>
> </crm_mon>

leading to this output of pcs:


# pcs status
> Cluster name: 
> Last updated: Fri Feb 20 10:26:55 2015
> Last change:  via clufter 0.3.6a
> Current DC: NONE
> 3 Nodes configured
> 5 Resources configured
> 
> 
> Node host-034.virt (1): UNCLEAN (offline)

[...]

> Daemon Status:
>   corosync: active/disabled
>   pacemaker: active/disabled
>   pcsd: active/enabled

Note that pacemaker is already running.


It would be nice if pacemaker were able to indicate "starting up" on
particular nodes, perhaps by means of introducing a new predicate
(or possibly by utilizing/overloading the existing ones).

From a wider perspective, it might also be good if pcs were able to
wait/block until selected (or all?) nodes that are starting are indeed
started.  Similar polling mechanisms were introduced into pcs
recently; I expect this would be in the same vein.
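
For illustration, a minimal sketch of the kind of external polling loop
that scripts have to carry today, and which such a pcs/pacemaker feature
would make unnecessary.  It assumes only the crm_mon -X output shown
above; the timeout and poll interval are arbitrary:

    # poll crm_mon XML until a DC is elected and no configured node
    # reports online="false"; give up after $timeout seconds
    timeout=120; elapsed=0
    while :; do
        xml=$(crm_mon -X 2>/dev/null) || xml=""
        if printf '%s' "$xml" | grep -q 'current_dc present="true"' &&
           ! printf '%s' "$xml" | grep -q 'online="false"'; then
            break    # cluster looks up and running
        fi
        if [ "$elapsed" -ge "$timeout" ]; then
            echo "cluster did not come up within ${timeout}s" >&2
            exit 1
        fi
        sleep 2; elapsed=$((elapsed + 2))
    done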

Comment 1 Jan Pokorný [poki] 2015-02-25 14:43:27 UTC
re "with an option to wait until start process has finished" part:

A precedent for that has already been established with [bug 1156311],
i.e., I would expect something along the lines of:

(1)    pcs cluster start --wait

and perhaps even something like:

(2)    pcs cluster status --wait-started


Note that in case of (1), the chances of state-handling errors/race
conditions[*] being exposed to cluster users (especially automated
scripts) ought to be eliminated, and hence this would offer the safest
way to achieve "hold further execution until the cluster is fully
functional".

[*] between a possibly asynchronous/failing/blocked start of the cluster
    runtime and waiting until it is fully started
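
For illustration, the kind of automation fragment that proposal (1) would
enable; the --wait option is only proposed here, so its exact name and
semantics are an assumption:

    # hypothetical: block until the cluster is fully functional before
    # continuing with further configuration
    if pcs cluster start --all --wait; then
        pcs status
    else
        echo "cluster failed to start and stabilize" >&2
        exit 1
    fi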

Comment 2 Jan Pokorný [poki] 2015-02-25 14:46:20 UTC
re (2) from [comment 1]:

Apparently pcs would only wait if it detected that the cluster is being started
(likely thanks to the feature proposed in blocker [bug 1194761]).

Comment 3 Jan Pokorný [poki] 2015-02-25 18:34:43 UTC
FWIW, this is how phd currently deals with a similar task IIUIC:
https://github.com/davidvossel/phd/blob/49fad931331cdfc32e1e9b5d1f752890510c1da3/lib/phd_utils_api.sh#L405

Comment 5 Jan Pokorný [poki] 2018-08-01 12:01:46 UTC
See [bug 1194761 comment 1].

Comment 9 RHEL Program Management 2020-11-01 03:02:39 UTC
After evaluating this issue, we have no plans to address it further or fix it in an upcoming release; therefore, it is being closed.  If plans change such that this issue will be fixed in an upcoming release, the bug can be reopened.

