Feature:
Add an optional '--wait' parameter to 'pcs resource' commands, making pcs wait for resources to start/move and report the nodes on which the resource started or failed.
Reason:
The user needs to know whether the resource has started or failed without running 'pcs status' after the 'pcs resource create' command.
Result:
When using the '--wait' parameter with a supported 'pcs resource' command, the user is informed whether the resource has started/been moved/failed and on which nodes, and receives a failure explanation message in case of failure.
Description of problem:
When you create a resource with pcs, the command returns immediately, so if there is an error in the resource configuration you do not know about it without checking the status (and you do not know why the resource fails to start).
The new option should be --wait=[n] to be consistent with resource enable/disable.
If a wait time is not specified, we use the default wait timeout from pacemaker.
If a resource fails, we don't wait the full timeout; we return immediately.
We should also return the node that the resource failed to start on (or succeeded in starting on).
Also, if a resource fails we want to get the error output from the resource agent. There is some work with resource agents to provide this information to pacemaker, but we may need to parse /var/log/messages to try to find that output. We may also need to use pcsd to get this information.
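As a sketch of the interface described above (resource names here are illustrative, not from this bug), the timeout variants would be used like this:

```shell
# Wait up to 60 seconds for the resource to start; the command exits
# non-zero if the resource fails to start or the timeout is reached.
pcs resource create webserver apache --wait=60

# With no value given, pcs falls back to pacemaker's default wait timeout.
pcs resource create webserver apache --wait

# The exit code makes the result usable from scripts:
if pcs resource create webserver apache --wait=60; then
    echo "webserver started"
else
    echo "webserver failed to start; see the pcs output and cluster logs"
fi
```

These commands require a live Pacemaker cluster, so they are shown here only as an illustration of the intended behavior.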
Created attachment 954471
proposed fix
Test:
[root@rh70-node1:~]# pcs resource create apa1 apache --wait
Resource 'apa1' is running on node rh70-node1.
[root@rh70-node1:~]# echo $?
0
[root@rh70-node1:~]# time pcs resource create apa2 apache configfile=/root/missing --wait
Error: unable to start: 'apa2', please check logs for failure information
rh70-node2: Port number is invalid!
rh70-node1: Port number is invalid!
real 0m1.312s
user 0m0.190s
sys 0m0.071s
[root@rh70-node1:~]# echo $?
1
Affected commands (guide for testing):
pcs resource create
- waits for the resource to be started
- if --clone is specified waits for all instances to be running (works with globally-unique, clone-max and clone-node-max meta attributes to get the number of instances)
- if --master is specified waits for the resource to be promoted (works with master-max and master-node-max meta attributes to get the number of promoted instances)
- reports failures and nodes on which the resource is running
- does not wait if -f, --disabled or meta target-role=Stopped is specified
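For the clone and master cases above, hedged examples might look like the following (resource and agent choices are illustrative; the exact placement of the meta attributes may vary between pcs versions):

```shell
# Wait for all clone instances to be running; clone-max and clone-node-max
# bound the expected number of instances.
pcs resource create ping ocf:pacemaker:ping --clone clone-max=2 --wait

# Wait for the expected number of promoted instances, as determined by
# master-max and master-node-max.
pcs resource create stateful ocf:pacemaker:Stateful --master master-max=1 --wait
```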
pcs resource enable, pcs resource disable
- already supported the --wait option
- added: reports failures and nodes on which the resource is running
pcs resource move
- waits for the resource to be started on the target node (or on a node different from the current one if no target node is specified)
- reports failures and nodes on which the resource is running
- does not wait if -f is specified or the resource is not running
pcs resource ban
- waits for the resource to be started on a node different from the target node (or from the current node if no target node is specified)
- reports failures and nodes on which the resource is running
- does not wait if -f is specified or the resource is not running
pcs resource clear
- gets operations related to the specified resource using crm_simulate and waits for them to finish
- reports failures and nodes on which the resource is running
- does not wait if -f is specified
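The move/ban/clear behavior above can be exercised like this (resource and node names are illustrative):

```shell
# Move 'webserver' to node2 and wait for it to start there.
pcs resource move webserver node2 --wait

# Ban 'webserver' from its current node and wait for it to start elsewhere.
pcs resource ban webserver --wait

# Remove the location constraints created by move/ban; pcs determines the
# resulting actions via crm_simulate and waits for them to finish.
pcs resource clear webserver --wait
```

Again, these require a running cluster and are shown only to illustrate the intended usage.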
pcs resource meta, pcs resource clone, pcs resource master
- waits for the resource to be started/stopped when changing target-role
- waits for master/clone instances to be started/stopped/promoted when changing globally-unique, clone-max, clone-node-max, master-max, master-node-max options
- reports failures and nodes on which the resource is running
- does not wait if -f is specified
pcs resource unclone
- waits for the resource to be running as one instance
- does not wait if -f is specified or the resource is not running
pcs resource group add, pcs resource group remove, pcs resource ungroup
- gets operations related to the specified resource(s) using crm_simulate and waits for them to finish
- reports failures and nodes on which the resource is running
- does not wait if -f is specified
pcs resource update
- waits for the resource to be started/stopped when changing target-role
- waits for master/clone instances to be started/stopped/promoted when changing globally-unique, clone-max, clone-node-max, master-max, master-node-max options
- reports failures and nodes on which the resource is running
- does not wait if -f is specified
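Finally, the meta/update cases can be sketched as follows (resource names are illustrative, and the exact syntax for setting clone meta attributes may differ between pcs versions):

```shell
# Stop the resource by changing target-role and wait for it to stop.
pcs resource meta webserver target-role=Stopped --wait

# Change the number of clone instances and wait for the cluster to converge.
pcs resource meta webserver-clone clone-max=2 --wait
```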
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.
For information on the advisory, and where to find the updated
files, follow the link below.
If the solution does not work for you, open a new bug report.
https://rhn.redhat.com/errata/RHBA-2015-0415.html