Bug 1298581 - Need a way to provide a reason for why a resource won't start (or was stopped)
Status: ON_QA
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: pacemaker
Version: 7.2
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: rc
Target Release: 7.5
Assigned To: Ken Gaillot
QA Contact: cluster-qe@redhat.com
Depends On:
Blocks:
Reported: 2016-01-14 08:31 EST by Chris Feist
Modified: 2017-10-10 13:47 EDT

See Also:
Fixed In Version: pacemaker-1.1.18-1.el7
Doc Type: No Doc Update
Doc Text:
Low-level tools do not need documentation. The corresponding pcs functionality will be documented as part of that separate bz.
Type: Bug


Attachments: None
Description Chris Feist 2016-01-14 08:31:27 EST
Description of problem:
People are running 'pcs resource enable' and the resource doesn't start because of constraints, but they don't know which constraints are preventing it from starting (or if constraints are the reason it isn't starting).

We'd like a command we can run that specifies a resource and then tells us it is either running, or gives us a reason why it isn't running.  pcs can then use that output and pass it back to the user.

Version-Release number of selected component (if applicable):
all

How reproducible:
always
Comment 2 Ken Gaillot 2016-01-18 17:56:10 EST
Upstream commit https://github.com/ClusterLabs/pacemaker/commit/11ac60a modified "crm_resource --cleanup" to print reasons why the resource would stay stopped. For example:

* The configuration specifies that 'myrsc' should remain stopped

and/or

* Resource myrsc is configured to not be managed by the cluster

I'm thinking we can add a new option that would do just the check and print. The current implementation does not know what particular constraints or other configuration prevent startup, but it's a good starting point.
Comment 4 Ken Gaillot 2016-05-16 12:27:18 EDT
This will not be ready in the 7.3 timeframe.
Comment 5 Ken Gaillot 2017-03-06 18:24:32 EST
This will not be ready in the 7.4 timeframe.
Comment 6 Ken Gaillot 2017-05-31 13:35:40 EDT
Upstream pacemaker now supports a "crm_resource --why" command as of commit 6f0a149c, thanks to a submission by contributor Aravind Kumar.

The descriptions that --cleanup and --why give are currently limited to a few conditions, but this provides a basis for future enhancements.
Comment 7 Ken Gaillot 2017-10-10 12:56:55 EDT
QA: Test procedure:

1. Configure a pacemaker cluster with various resources.

2. Cause some of the resources to be stopped by various means (pcs resource disable, -INFINITY constraints on all nodes, colocation or ordering constraint with another resource stopped by one of those means, etc.).

3. Run crm_resource with the new --why option. Before the fix, the option doesn't exist; after the fix, it shows why each resource is stopped. Usage:

crm_resource --why
-> shows status of all resources (running or not running, and why)

crm_resource --why -r <resource-id>
-> shows status of particular resource

crm_resource --why -r <resource-id> -N <node-name>
-> shows status of particular resource on particular node
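
The procedure above could be scripted roughly as follows. This is only a sketch: the resource name "myrsc" and node name "node1" are placeholders, the run and why helpers are made up for illustration, and the "pcs constraint location ... avoids" form is assumed to match the installed pcs version. By default the script only prints the commands it would run; set RUN=1 on a cluster node to execute them.

```shell
#!/bin/sh
# Sketch of the QA steps; "myrsc" and "node1" are placeholder names.
# Dry run by default: commands are printed, not executed, unless RUN=1.

run() {
    if [ "${RUN:-0}" = "1" ]; then
        "$@"
    else
        echo "+ $*"
    fi
}

# Hypothetical helper: builds a "crm_resource --why" call for an optional
# resource id ($1) and an optional node name ($2).
why() {
    rsc=${1:-}
    node=${2:-}
    set -- crm_resource --why
    if [ -n "$rsc" ]; then
        set -- "$@" -r "$rsc"
    fi
    if [ -n "$node" ]; then
        set -- "$@" -N "$node"
    fi
    run "$@"
}

# Step 2: stop a resource, either by disabling it or by banning it from a node
# (assumed pcs syntax; repeat the constraint for every node to ban it everywhere).
run pcs resource disable myrsc
run pcs constraint location myrsc avoids node1=INFINITY

# Step 3: ask why, for all resources, one resource, and one resource on one node.
why
why myrsc
why myrsc node1
```

The dry-run default makes it possible to review the exact command lines before touching a live cluster.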
