Bug 1918527
| Summary: | pcs resource manage --monitor enables monitor for all resources in a group |
|---|---|
| Product: | Red Hat Enterprise Linux 8 |
| Component: | pcs |
| Version: | 8.3 |
| Status: | CLOSED ERRATA |
| Severity: | medium |
| Priority: | medium |
| Reporter: | Reid Wahl <nwahl> |
| Assignee: | Tomas Jelinek <tojeline> |
| QA Contact: | cluster-qe <cluster-qe> |
| Docs Contact: | Steven J. Levine <slevine> |
| CC: | cluster-maint, idevat, mlisik, mmazoure, mpospisi, nhostako, omular, slevine, tojeline |
| Target Milestone: | rc |
| Target Release: | 8.8 |
| Keywords: | Triaged |
| Flags: | pm-rhel: mirror+ |
| Hardware: | All |
| OS: | Linux |
| Fixed In Version: | pcs-0.10.14-6.el8 |
| Doc Type: | Bug Fix |
| Clones: | 2092950 (view as bug list) |
| Type: | Bug |
| Last Closed: | 2023-05-16 08:12:42 UTC |

Doc Text:

> .Enabling a single resource and monitoring operation no longer enables monitoring operations for all resources in a resource group
> Previously, after unmanaging all resources and monitoring operations in a resource group, managing one of the resources in that group along with its monitoring operation re-enabled the monitoring operations for all resources in the resource group. This could trigger unexpected cluster behavior.
> With this fix, managing a resource and re-enabling its monitoring operation re-enables the monitoring operation for that resource only, not for the other resources in the resource group.
Upstream fix + tests: https://github.com/ClusterLabs/pcs/commit/e0387c0878c5f7ca11d9e14d0096c265694d2365

Test: See Steps to Reproduce, Actual results, and Expected results in comment 0.

DevTestResults:
~~~
[root@r88-1 ~]# rpm -q pcs
pcs-0.10.14-6.el8.x86_64
[root@r88-1 ~]# pcs resource
  * Resource Group: G:
    * d1 (ocf::pacemaker:Dummy): Started r88-2
    * d2 (ocf::pacemaker:Dummy): Started r88-2
[root@r88-1 ~]# pcs resource unmanage --monitor d1
[root@r88-1 ~]# pcs resource unmanage --monitor d2
[root@r88-1 ~]# pcs resource
  * Resource Group: G:
    * d1 (ocf::pacemaker:Dummy): Started r88-2 (unmanaged)
    * d2 (ocf::pacemaker:Dummy): Started r88-2 (unmanaged)
[root@r88-1 ~]# pcs resource manage --monitor d1
[root@r88-1 ~]# pcs resource
  * Resource Group: G:
    * d1 (ocf::pacemaker:Dummy): Started r88-2
    * d2 (ocf::pacemaker:Dummy): Started r88-2 (unmanaged)
~~~
BEFORE:
=======
~~~
[root@virt-498 ~]# rpm -q pcs
pcs-0.10.12-6.el8.x86_64
[root@virt-498 ~]# pcs resource create d1 ocf:heartbeat:Dummy
[root@virt-498 ~]# pcs resource create d2 ocf:heartbeat:Dummy
[root@virt-498 ~]# pcs resource group add g1 d1 d2
[root@virt-498 ~]# pcs resource config | egrep 'Resource:|Meta|monitor'
Resource: d1 (class=ocf provider=heartbeat type=Dummy)
  monitor interval=10s timeout=20s (d1-monitor-interval-10s)
Resource: d2 (class=ocf provider=heartbeat type=Dummy)
  monitor interval=10s timeout=20s (d2-monitor-interval-10s)
[root@virt-498 ~]# pcs resource unmanage --monitor d1 && pcs resource unmanage --monitor d2
[root@virt-498 ~]# pcs resource config | egrep 'Resource:|Meta|monitor'
Resource: d1 (class=ocf provider=heartbeat type=Dummy)
  Meta Attrs: is-managed=false
  monitor enabled=false interval=10s timeout=20s (d1-monitor-interval-10s)
Resource: d2 (class=ocf provider=heartbeat type=Dummy)
  Meta Attrs: is-managed=false
  monitor enabled=false interval=10s timeout=20s (d2-monitor-interval-10s)
[root@virt-498 ~]# pcs resource manage --monitor d1
[root@virt-498 ~]# pcs resource config d1 d2 | egrep 'Resource:|Meta|monitor'
Resource: d1 (class=ocf provider=heartbeat type=Dummy)
  monitor interval=10s timeout=20s (d1-monitor-interval-10s)
Resource: d2 (class=ocf provider=heartbeat type=Dummy)
  Meta Attrs: is-managed=false
  monitor interval=10s timeout=20s (d2-monitor-interval-10s)
~~~
AFTER:
======
~~~
[root@virt-252 ~]# rpm -q pcs
pcs-0.10.15-2.el8.x86_64

## 2 resources in a group: run 'pcs resource unmanage --monitor' on both, then 'pcs resource manage --monitor' only on the first one
[root@virt-252 ~]# pcs resource create d1 ocf:heartbeat:Dummy
[root@virt-252 ~]# pcs resource create d2 ocf:heartbeat:Dummy
[root@virt-252 ~]# pcs resource group add g1 d1 d2
[root@virt-252 ~]# pcs resource
  * Resource Group: g1:
    * d1 (ocf::heartbeat:Dummy): Started virt-252
    * d2 (ocf::heartbeat:Dummy): Started virt-252
[root@virt-252 ~]# pcs resource config | grep -e 'Resource\|is-managed' -e 'monitor\|enabled'
Resource: d1 (class=ocf provider=heartbeat type=Dummy)
  monitor: d1-monitor-interval-10s
Resource: d2 (class=ocf provider=heartbeat type=Dummy)
  monitor: d2-monitor-interval-10s
[root@virt-252 ~]# pcs resource unmanage --monitor d1 && pcs resource unmanage --monitor d2
[root@virt-252 ~]# pcs resource config | grep -e 'Resource\|is-managed' -e 'monitor\|enabled'
Resource: d1 (class=ocf provider=heartbeat type=Dummy)
  is-managed=false
  monitor: d1-monitor-interval-10s
    enabled=0
Resource: d2 (class=ocf provider=heartbeat type=Dummy)
  is-managed=false
  monitor: d2-monitor-interval-10s
    enabled=0
[root@virt-252 ~]# pcs resource manage --monitor d1
[root@virt-252 ~]# pcs resource config | grep -e 'Resource\|is-managed' -e 'monitor\|enabled'
Resource: d1 (class=ocf provider=heartbeat type=Dummy)
  monitor: d1-monitor-interval-10s
Resource: d2 (class=ocf provider=heartbeat type=Dummy)
  is-managed=false
  monitor: d2-monitor-interval-10s
    enabled=0
[root@virt-252 ~]# pcs resource
  * Resource Group: g1:
    * d1 (ocf::heartbeat:Dummy): Started virt-252
    * d2 (ocf::heartbeat:Dummy): Started virt-252 (unmanaged)
> OK: The second resource (where 'pcs resource manage --monitor' wasn't run) is still unmanaged and its monitor is disabled

# re-enable the second resource as well
[root@virt-252 ~]# pcs resource manage --monitor d2
[root@virt-252 ~]# pcs resource config | grep -e 'Resource\|is-managed' -e 'monitor\|enabled'
Resource: d1 (class=ocf provider=heartbeat type=Dummy)
  monitor: d1-monitor-interval-10s
Resource: d2 (class=ocf provider=heartbeat type=Dummy)
  monitor: d2-monitor-interval-10s
> OK

## 3 resources in a group: run 'pcs resource unmanage --monitor' on all 3, then 'pcs resource manage --monitor' on the first one and the second one
[root@virt-252 ~]# pcs resource create d3 ocf:heartbeat:Dummy
[root@virt-252 ~]# pcs resource group add g1 d3
[root@virt-252 ~]# pcs resource
  * Resource Group: g1:
    * d1 (ocf::heartbeat:Dummy): Started virt-252
    * d2 (ocf::heartbeat:Dummy): Started virt-252
    * d3 (ocf::heartbeat:Dummy): Started virt-252
[root@virt-252 ~]# pcs resource config | grep -e 'Resource\|is-managed' -e 'monitor\|enabled'
Resource: d1 (class=ocf provider=heartbeat type=Dummy)
  monitor: d1-monitor-interval-10s
Resource: d2 (class=ocf provider=heartbeat type=Dummy)
  monitor: d2-monitor-interval-10s
Resource: d3 (class=ocf provider=heartbeat type=Dummy)
  monitor: d3-monitor-interval-10s
[root@virt-252 ~]# pcs resource unmanage --monitor d1 && pcs resource unmanage --monitor d2 && pcs resource unmanage --monitor d3
[root@virt-252 ~]# pcs resource config | grep -e 'Resource\|is-managed' -e 'monitor\|enabled'
Resource: d1 (class=ocf provider=heartbeat type=Dummy)
  is-managed=false
  monitor: d1-monitor-interval-10s
    enabled=0
Resource: d2 (class=ocf provider=heartbeat type=Dummy)
  is-managed=false
  monitor: d2-monitor-interval-10s
    enabled=0
Resource: d3 (class=ocf provider=heartbeat type=Dummy)
  is-managed=false
  monitor: d3-monitor-interval-10s
    enabled=0
[root@virt-252 ~]# pcs resource manage --monitor d1
[root@virt-252 ~]# pcs resource config | grep -e 'Resource\|is-managed' -e 'monitor\|enabled'
Resource: d1 (class=ocf provider=heartbeat type=Dummy)
  monitor: d1-monitor-interval-10s
Resource: d2 (class=ocf provider=heartbeat type=Dummy)
  is-managed=false
  monitor: d2-monitor-interval-10s
    enabled=0
Resource: d3 (class=ocf provider=heartbeat type=Dummy)
  is-managed=false
  monitor: d3-monitor-interval-10s
    enabled=0
> OK
[root@virt-252 ~]# pcs resource manage --monitor d2
[root@virt-252 ~]# pcs resource config | grep -e 'Resource\|is-managed' -e 'monitor\|enabled'
Resource: d1 (class=ocf provider=heartbeat type=Dummy)
  monitor: d1-monitor-interval-10s
Resource: d2 (class=ocf provider=heartbeat type=Dummy)
  monitor: d2-monitor-interval-10s
Resource: d3 (class=ocf provider=heartbeat type=Dummy)
  is-managed=false
  monitor: d3-monitor-interval-10s
    enabled=0
> OK

# re-enable the last resource as well
[root@virt-252 ~]# pcs resource manage --monitor d3
[root@virt-252 ~]# pcs resource config | grep -e 'Resource\|is-managed' -e 'monitor\|enabled'
Resource: d1 (class=ocf provider=heartbeat type=Dummy)
  monitor: d1-monitor-interval-10s
Resource: d2 (class=ocf provider=heartbeat type=Dummy)
  monitor: d2-monitor-interval-10s
Resource: d3 (class=ocf provider=heartbeat type=Dummy)
  monitor: d3-monitor-interval-10s
> OK

## 3 resources without a group: run 'pcs resource unmanage --monitor' on all 3, then 'pcs resource manage --monitor' on them one at a time, verifying that the resources don't affect each other
[root@virt-252 ~]# pcs resource ungroup g1
[root@virt-252 ~]# pcs resource
  * d1 (ocf::heartbeat:Dummy): Started virt-252
  * d2 (ocf::heartbeat:Dummy): Started virt-253
  * d3 (ocf::heartbeat:Dummy): Started virt-252
[root@virt-252 ~]# pcs resource unmanage --monitor d1
[root@virt-252 ~]# pcs resource config | grep -e 'Resource\|is-managed' -e 'monitor\|enabled'
Resource: d1 (class=ocf provider=heartbeat type=Dummy)
  is-managed=false
  monitor: d1-monitor-interval-10s
    enabled=0
Resource: d2 (class=ocf provider=heartbeat type=Dummy)
  monitor: d2-monitor-interval-10s
Resource: d3 (class=ocf provider=heartbeat type=Dummy)
  monitor: d3-monitor-interval-10s
[root@virt-252 ~]# pcs resource unmanage --monitor d2
[root@virt-252 ~]# pcs resource unmanage --monitor d3
[root@virt-252 ~]# pcs resource config | grep -e 'Resource\|is-managed' -e 'monitor\|enabled'
Resource: d1 (class=ocf provider=heartbeat type=Dummy)
  is-managed=false
  monitor: d1-monitor-interval-10s
    enabled=0
Resource: d2 (class=ocf provider=heartbeat type=Dummy)
  is-managed=false
  monitor: d2-monitor-interval-10s
    enabled=0
Resource: d3 (class=ocf provider=heartbeat type=Dummy)
  is-managed=false
  monitor: d3-monitor-interval-10s
    enabled=0
[root@virt-252 ~]# pcs resource manage --monitor d1
[root@virt-252 ~]# pcs resource config | grep -e 'Resource\|is-managed' -e 'monitor\|enabled'
Resource: d1 (class=ocf provider=heartbeat type=Dummy)
  monitor: d1-monitor-interval-10s
Resource: d2 (class=ocf provider=heartbeat type=Dummy)
  is-managed=false
  monitor: d2-monitor-interval-10s
    enabled=0
Resource: d3 (class=ocf provider=heartbeat type=Dummy)
  is-managed=false
  monitor: d3-monitor-interval-10s
    enabled=0
[root@virt-252 ~]# pcs resource manage --monitor d2
[root@virt-252 ~]# pcs resource config | grep -e 'Resource\|is-managed' -e 'monitor\|enabled'
Resource: d1 (class=ocf provider=heartbeat type=Dummy)
  monitor: d1-monitor-interval-10s
Resource: d2 (class=ocf provider=heartbeat type=Dummy)
  monitor: d2-monitor-interval-10s
Resource: d3 (class=ocf provider=heartbeat type=Dummy)
  is-managed=false
  monitor: d3-monitor-interval-10s
    enabled=0
> OK

# without the --monitor option
[root@virt-252 ~]# pcs resource manage d3
Warning: Resource 'd3' has no enabled monitor operations. Re-run with '--monitor' to enable them.
[root@virt-252 ~]# echo $?
0
[root@virt-252 ~]# pcs resource config | grep -e 'Resource\|is-managed' -e 'monitor\|enabled'
Resource: d1 (class=ocf provider=heartbeat type=Dummy)
  monitor: d1-monitor-interval-10s
Resource: d2 (class=ocf provider=heartbeat type=Dummy)
  monitor: d2-monitor-interval-10s
Resource: d3 (class=ocf provider=heartbeat type=Dummy)
  monitor: d3-monitor-interval-10s
    enabled=0
[root@virt-252 ~]# pcs resource
  * d1 (ocf::heartbeat:Dummy): Started virt-252
  * d2 (ocf::heartbeat:Dummy): Started virt-253
  * d3 (ocf::heartbeat:Dummy): Started virt-252
> OK
[root@virt-252 ~]# pcs resource manage --monitor d3
[root@virt-252 ~]# pcs resource config | grep -e 'Resource\|is-managed' -e 'monitor\|enabled'
Resource: d1 (class=ocf provider=heartbeat type=Dummy)
  monitor: d1-monitor-interval-10s
Resource: d2 (class=ocf provider=heartbeat type=Dummy)
  monitor: d2-monitor-interval-10s
Resource: d3 (class=ocf provider=heartbeat type=Dummy)
  monitor: d3-monitor-interval-10s
> OK
~~~
Marking as VERIFIED for pcs-0.10.15-2.el8.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (pcs bug fix and enhancement update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2023:2738
Description of problem:

Suppose that `pcs resource unmanage --monitor` is run on two individual resources in a group, and then `pcs resource manage --monitor` is run on one of those resources. Both resources' monitor operations get re-enabled in this case. However, only the resource where the `manage --monitor` command was run gets re-managed.

Demo (with dummy1 and dummy2 inside dummygrp):

~~~
# pcs resource unmanage --monitor dummy1 && pcs resource unmanage --monitor dummy2
# pcs resource config dummy1 dummy2 | egrep 'Resource:|Meta|monitor'
Resource: dummy1 (class=ocf provider=heartbeat type=Dummy)
  Meta Attrs: is-managed=false
  monitor enabled=false interval=10s timeout=20s (dummy1-monitor-interval-10s)
Resource: dummy2 (class=ocf provider=heartbeat type=Dummy)
  Meta Attrs: is-managed=false
  monitor enabled=false interval=10s timeout=20s (dummy2-monitor-interval-10s)
# pcs resource manage --monitor dummy1
# pcs resource config dummy1 dummy2 | egrep 'Resource:|Meta|monitor'
Resource: dummy1 (class=ocf provider=heartbeat type=Dummy)
  monitor interval=10s timeout=20s (dummy1-monitor-interval-10s)
Resource: dummy2 (class=ocf provider=heartbeat type=Dummy)
  Meta Attrs: is-managed=false
  monitor interval=10s timeout=20s (dummy2-monitor-interval-10s)
~~~

IMO, the least surprising behavior would be to both re-manage and re-enable monitor on **only** the resource for which we ran `pcs resource manage --monitor`. Ideally, the manage command should perform the inverse of the unmanage command. However, even if we argue for the opposite behavior (re-manage and re-enable monitor on all resources in the group), we should at least be consistent: I don't see a reason to re-enable monitor on all resources when we only re-manage one of them.
-----

The cause:

With some debugging added:

~~~
# pcs resource manage --monitor dummy1
Iterating over _find_resources_expand_tags_or_raise:
    resource_el is dummy1
Iterating over to_manage_set:
    resource_el is dummy1
    resource_el is dummygrp
Iterating over primitives_set:
    resource_el is dummy1
    resource_el is dummy2
~~~

find_resources_to_manage() grabs both the primitive ID and the group ID, in accordance with the comment block below its docstring (https://github.com/ClusterLabs/pcs/blob/v0.10.7/pcs/lib/cib/resource/common.py#L237-L243). Then find_primitives() updates primitives_set with all the primitives in dummygrp (dummy1 and dummy2). We iterate over these primitives and enable all their monitor operations.

Note: I acknowledge that specifying the behavior for pcs resource enable/manage is quite difficult and that decisions often have unintended consequences, as we also discussed in BZ 1875632. In this case, I believe we should enable monitors on a particular primitive only if either (a) the primitive was specified explicitly on the command line or (b) the primitive's group/parent ID was specified explicitly on the command line. In my example, that would look like: "enable monitor on dummy1, but don't enable monitor for dummy2, because neither dummy2 nor dummygrp was passed as a CLI argument." (Without looking further, I'm not sure how tags would factor into this.)

-----

Version-Release number of selected component (if applicable):
pcs-0.10.6-4.el8

-----

How reproducible:
Always

-----

Steps to Reproduce:
1. Configure two resources in a group (e.g., dummy1 and dummy2 in dummygrp).
2. Run `pcs resource unmanage --monitor` on both resources.
3. Run `pcs resource manage --monitor` on only one resource (e.g., dummy1).

-----

Actual results:
The monitor operation is re-enabled for both resources. Only the specified resource is re-managed.

-----

Expected results:
The monitor operation is re-enabled only for the specified resource. Only the specified resource is re-managed.
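The selection rule proposed in the cause analysis above (enable monitors only on primitives that were named explicitly, or whose parent group was named explicitly) can be sketched as a small standalone Python model. This is not actual pcs code: `Primitive` and `monitors_to_enable` are hypothetical names, and real pcs would also have to account for tags, clones, and bundles.

~~~python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Primitive:
    """A primitive resource; parent_id is its containing group, if any."""
    id: str
    parent_id: Optional[str] = None


def monitors_to_enable(primitives: List[Primitive],
                       cli_ids: List[str]) -> List[str]:
    """Return IDs of primitives whose monitor operations should be re-enabled.

    A primitive qualifies only if (a) it was named explicitly on the command
    line, or (b) its parent group was named explicitly. Siblings pulled in
    merely because they share a group with a named resource do NOT qualify --
    re-enabling their monitors was the bug.
    """
    named = set(cli_ids)
    return [p.id for p in primitives if p.id in named or p.parent_id in named]


# dummy1 and dummy2 live in dummygrp; only dummy1 is passed on the CLI.
group_members = [
    Primitive("dummy1", parent_id="dummygrp"),
    Primitive("dummy2", parent_id="dummygrp"),
]
print(monitors_to_enable(group_members, ["dummy1"]))    # just dummy1's monitor
print(monitors_to_enable(group_members, ["dummygrp"]))  # the whole group
~~~

Passing `dummy1` touches only dummy1's monitor, while passing `dummygrp` covers both members, matching the intuition that the manage command should be the inverse of the unmanage command that was actually run.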
-----

Additional info:
As a workaround, users can omit the `--monitor` option and instead manipulate the monitor operation manually.