Bug 1459251 - pcs should not guess expected status of a resource when --wait is used
Status: NEW
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: pcs
Version: 7.4
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: rc
Target Release: ---
Assigned To: Tomas Jelinek
QA Contact: cluster-qe@redhat.com
Docs Contact:
Depends On:
Blocks:
Reported: 2017-06-06 11:57 EDT by Tomas Jelinek
Modified: 2017-07-21 07:09 EDT (History)
CC List: 5 users

See Also:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed:
Type: Bug
Regression: ---


Attachments: None
Description Tomas Jelinek 2017-06-06 11:57:13 EDT
Description of problem:
When the --wait flag is used in pcs commands, pcs guesses what state a resource managed by the command should be in when the command finishes. At the end of the command, pcs checks what state the resource is actually in and returns 0 if the real and expected statuses match, or 1 if they do not.

The issue is that the expected state of the resource is very hard to determine correctly, which may lead to pcs exiting with a wrong return code.
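For comparison, the heuristic works as intended in the simple case where the resource is managed and able to start. A minimal sketch, assuming a healthy test cluster where ocf:pacemaker:Dummy starts without problems (resource name made up for illustration):

# pcs resource create test0 ocf:pacemaker:Dummy --wait
# echo $?
0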


Version-Release number of selected component (if applicable):
pcs-0.9.158-4.el7.x86_64


How reproducible:
always, easily (depending on the complexity of the cluster settings)


Steps to Reproduce:
# pcs resource create test1 ocf:pacemaker:Dummy meta is-managed=false --wait
Error: resource 'test1' is not running on any node
# echo $?
1


Actual results:
pcs exits with 1 because the resource did not start


Expected results:
pcs exits with 0: the resource is not supposed to start (Pacemaker does not start unmanaged resources), so the command succeeded


Additional info:
With this particular reproducer the issue may seem easy to fix in pcs: if the resource is unmanaged, we expect it not to be started. However, more complex setups are possible: the resource may fail to start due to constraints, utilization, cluster properties, and so on. Also, this is not limited to 'resource create'; most of the commands supporting --wait are affected. See the sketch below for a constraint-based example.
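For example, a location constraint alone can keep a resource stopped, so a command that merely enables it hits the same problem. A sketch, assuming a single-node test cluster whose node is named node1 (node and resource names are made up):

# pcs resource create test2 ocf:pacemaker:Dummy --disabled
# pcs constraint location test2 avoids node1
# pcs resource enable test2 --wait
# echo $?
1

The enable operation itself succeeds, but the constraint prevents the resource from starting anywhere, so pcs again exits with 1.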
Comment 1 Tomas Jelinek 2017-06-07 07:32:43 EDT
One way to deal with this is to not make any assumptions about the expected resource state. --wait would cause pcs to run 'crm_resource --wait' and print the resource status, but the pcs exit code would not depend on the resource status. Users would be able to use separate commands to figure out resources' status, as described in bz1290830 comment 3.
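Under that approach, the reproducer from the description would look roughly like this (a sketch; the exact status output depends on the pcs version):

# pcs resource create test1 ocf:pacemaker:Dummy meta is-managed=false --wait
# echo $?
0
# pcs status resources
 test1  (ocf::pacemaker:Dummy):  Stopped (unmanaged)

Here --wait only waits for the cluster to settle; whether the resource is actually running is left for the user to check.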
