Bug 1298581
Summary: | Need a way to provide a reason for why a resource won't start (or was stopped) | ||
---|---|---|---|
Product: | Red Hat Enterprise Linux 7 | Reporter: | Chris Feist <cfeist> |
Component: | pacemaker | Assignee: | Ken Gaillot <kgaillot> |
Status: | CLOSED ERRATA | QA Contact: | cluster-qe <cluster-qe> |
Severity: | medium | Docs Contact: | |
Priority: | medium | ||
Version: | 7.2 | CC: | abeekhof, cluster-maint, michele, phagara |
Target Milestone: | rc | ||
Target Release: | 7.5 | ||
Hardware: | Unspecified | ||
OS: | Unspecified | ||
Whiteboard: | |||
Fixed In Version: | pacemaker-1.1.18-1.el7 | Doc Type: | No Doc Update |
Doc Text: |
Low-level tools do not need documentation. The corresponding pcs functionality will be documented as part of that separate bz.
|
Story Points: | --- |
Clone Of: | Environment: | ||
Last Closed: | 2018-04-10 15:28:37 UTC | Type: | Bug |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | Category: | --- | |
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: |
Description
Chris Feist
2016-01-14 13:31:27 UTC
Upstream commit https://github.com/ClusterLabs/pacemaker/commit/11ac60a modified "crm_resource --cleanup" to print reasons why the resource would stay stopped. For example:

* The configuration specifies that 'myrsc' should remain stopped

and/or

* Resource myrsc is configured to not be managed by the cluster

I'm thinking we can add a new option that would do just the check and print.

The current implementation does not know what particular constraints or other configuration prevent startup, but it's a good starting point.

This will not be ready in the 7.3 timeframe.

This will not be ready in the 7.4 timeframe.

Upstream pacemaker now supports a "crm_resource --why" command as of commit 6f0a149c, thanks to a submission by contributor Aravind Kumar. The descriptions that --cleanup and --why give are currently limited to a few conditions, but this provides a basis for future enhancements.

QA: Test procedure:

1. Configure a pacemaker cluster with various resources.
2. Cause some of the resources to be stopped by various means (pcs resource disable, -INFINITY constraints on all nodes, colocation or ordering constraint with another resource stopped by one of those means, etc.).
3. Run crm_resource with the new --why option. Before the fix, the option doesn't exist; after the fix, it shows why each resource is stopped.
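The test procedure above could be scripted roughly as follows. This is only a sketch: the resource names (dummy1 to dummy4), the node names passed as arguments, and the DRY_RUN switch are assumptions made for illustration, not part of this report. With DRY_RUN=1 (the default here) the commands are printed instead of executed, so the sequence can be reviewed before touching a cluster.

```shell
#!/bin/sh
# Sketch of the QA procedure. Resource names and the DRY_RUN switch are
# hypothetical; adjust to the cluster under test.
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Usage: setup_stopped_resources <node-name>...
setup_stopped_resources() {
    # Step 1: create a few dummy resources.
    for rsc in dummy1 dummy2 dummy3 dummy4; do
        run pcs resource create "$rsc" ocf:pacemaker:Dummy
    done

    # Step 2a: stop a resource by disabling it.
    run pcs resource disable dummy1

    # Step 2b: ban a resource from every node (-INFINITY location constraints).
    for node in "$@"; do
        run pcs constraint location dummy2 avoids "${node}=INFINITY"
    done

    # Step 2c: colocate a resource with the disabled one.
    run pcs constraint colocation add dummy3 with dummy1

    # Step 2d: order a resource after the disabled one.
    run pcs constraint order dummy1 then dummy4

    # Step 3: ask why the resources are not running.
    run crm_resource --why
}
```

Calling `setup_stopped_resources node1 node2` in dry-run mode prints the ten commands in order; setting DRY_RUN=0 before calling it would run them against the live cluster.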
Usage:

crm_resource --why -> shows status of all resources (running or not running, and why)
crm_resource --why -r <resource-id> -> shows status of particular resource
crm_resource --why -r <resource-id> -N <node-name> -> shows status of particular resource on particular node

Before the fix (1.1.16-12.el7-94ff4df), the "--why" option does not exist:

> [root@virt-131 ~]# crm_resource --why
> crm_resource: unrecognized option '--why'

After the fix (1.1.18-11.el7-2b07d5c5a9), creating a bunch of dummy resources and (mis)configuring them to not be able to start using the following methods:

* disabling the resource, which resulted in the following being printed by "crm_resource --why":
> The configuration specifies that '<resource>' should remain stopped
* banning the resource on all nodes (i.e. creating -INFINITY location constraints); "crm_resource --why" DOES NOT show any extra info apart from "is not running"
* constraining a resource to be colocated with a disabled resource; "crm_resource --why" DOES NOT show any extra info apart from "is not running"
* constraining a resource to be started after a disabled resource; "crm_resource --why" DOES NOT show any extra info apart from "is not running"

While the number of particular reasons why resources are not starting is currently *very* limited (see link at the end of this comment), it seems highly unlikely to break anything and can be enhanced to cover more situations in the future (to actually become useful in practice).

Marking VERIFIED.

https://github.com/ClusterLabs/pacemaker/blob/edd67444e967a0c58a96aab1748b378eec3b40f9/tools/crm_resource_runtime.c#L829

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2018:0860
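For anyone repeating the verification, the check of whether "crm_resource --why" gave an actual reason could be automated along these lines. This is a minimal sketch: the function name classify_why and the labels it prints are hypothetical, and the matched message fragments are only those quoted in this report, so real output may contain other wording.

```shell
#!/bin/sh
# classify_why: read "crm_resource --why" output on stdin and print a label
# for the reason it contains. The labels are made up for this sketch; the
# message fragments are the ones quoted in this bug report.
classify_why() {
    out=$(cat)
    case "$out" in
        *"should remain stopped"*)
            echo "disabled" ;;
        *"configured to not be managed"*)
            echo "unmanaged" ;;
        *"is not running"*)
            # Stopped, but no specific reason was reported.
            echo "no-reason-given" ;;
        *)
            echo "unknown" ;;
    esac
}
```

It would be used as `crm_resource --why -r <resource-id> | classify_why`. Based on the results above, a disabled resource should yield "disabled", while the banned, colocated, and ordered cases would fall through to "no-reason-given".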