Bug 2092950 - pcs resource manage --monitor enables monitor for all resources in a group
Summary: pcs resource manage --monitor enables monitor for all resources in a group
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 9
Classification: Red Hat
Component: pcs
Version: 9.0
Hardware: All
OS: Linux
Priority: medium
Severity: medium
Target Milestone: rc
Target Release: 9.2
Assignee: Tomas Jelinek
QA Contact: cluster-qe@redhat.com
Docs Contact: Steven J. Levine
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2022-06-02 15:27 UTC by Tomas Jelinek
Modified: 2023-05-16 15:54 UTC

Fixed In Version: pcs-0.11.3-5.el9
Doc Type: Bug Fix
Doc Text:
.Enabling a single resource and monitoring operation no longer enables monitoring operations for all resources in a resource group
Previously, after unmanaging all resources and monitoring operations in a resource group, managing one of the resources in that group along with its monitoring operation re-enabled the monitoring operations for all resources in the resource group. This could trigger unexpected cluster behavior. With this fix, managing a resource and re-enabling its monitoring operation re-enables the monitoring operation for that resource only and not for the other resources in a resource group.
Clone Of: 1918527
Environment:
Last Closed: 2023-05-09 07:18:23 UTC
Type: Bug
Target Upstream Version:
Embargoed:




Links
System                             ID                Last Updated
Red Hat Issue Tracker              CLUSTERQE-6161    2022-11-11 20:35:55 UTC
Red Hat Issue Tracker              RHELPLAN-124088   2022-06-02 15:34:08 UTC
Red Hat Knowledge Base (Solution)  5721201           2023-05-16 15:54:17 UTC
Red Hat Product Errata             RHBA-2023:2151    2023-05-09 07:18:47 UTC

Description Tomas Jelinek 2022-06-02 15:27:07 UTC
+++ This bug was initially created as a clone of Bug #1918527 +++

Description of problem:

Suppose that `pcs resource unmanage --monitor` is run on two individual resources in a group, and then `pcs resource manage --monitor` is run on one of those resources.

Both resources' monitor operations get re-enabled in this case. However, only the resource where the `manage --monitor` command was run gets re-managed.

Demo (with dummy1 and dummy2 inside dummygrp):
~~~
# pcs resource unmanage --monitor dummy1 && pcs resource unmanage --monitor dummy2

# pcs resource config dummy1 dummy2 | egrep 'Resource:|Meta|monitor'
 Resource: dummy1 (class=ocf provider=heartbeat type=Dummy)
  Meta Attrs: is-managed=false
              monitor enabled=false interval=10s timeout=20s (dummy1-monitor-interval-10s)
 Resource: dummy2 (class=ocf provider=heartbeat type=Dummy)
  Meta Attrs: is-managed=false
              monitor enabled=false interval=10s timeout=20s (dummy2-monitor-interval-10s)

# pcs resource manage --monitor dummy1

# pcs resource config dummy1 dummy2 | egrep 'Resource:|Meta|monitor'
 Resource: dummy1 (class=ocf provider=heartbeat type=Dummy)
              monitor interval=10s timeout=20s (dummy1-monitor-interval-10s)
 Resource: dummy2 (class=ocf provider=heartbeat type=Dummy)
  Meta Attrs: is-managed=false
              monitor interval=10s timeout=20s (dummy2-monitor-interval-10s)
~~~

IMO, the least surprising behavior would be to both re-manage and re-enable monitor on **only** the resource for which we ran `pcs resource manage --monitor`. Ideally the manage command should perform the inverse of the unmanage command.

However, even if we made the opposite case (re-managing and re-enabling monitor on all resources in the group), we should at least be consistent. I don't see a reason to re-enable monitor on all resources when we re-manage only one of them.

-----

The cause:

With some debugging added:
~~~
# pcs resource manage --monitor dummy1
Iterating over _find_resources_expand_tags_or_raise:
resource_el is dummy1
Iterating over to_manage_set:
resource_el is dummy1
resource_el is dummygrp
Iterating over primitives_set:
resource_el is dummy1
resource_el is dummy2
~~~

find_resources_to_manage() grabs both the primitive ID and the group ID in accordance with the comment block below its doc string.
  - (https://github.com/ClusterLabs/pcs/blob/v0.10.7/pcs/lib/cib/resource/common.py#L237-L243)

Then find_primitives() updates primitives_set with all the primitives in dummygrp (dummy1 and dummy2).

We iterate over these primitives and enable all their monitor operations.
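
To make the expansion concrete, here is a minimal, self-contained Python sketch of the flow described above. The function names mirror the pcs internals, but the data structures and bodies are simplified stand-ins, not the actual pcs code:
~~~
def find_resources_to_manage(resource_el):
    # pcs collects the resource itself plus its parents, so that
    # managing a child also re-manages any enclosing group.
    return [resource_el] + resource_el["parents"]

def find_primitives(resource_el):
    # For a group element this expands to *all* member primitives.
    if resource_el["type"] == "group":
        return resource_el["members"]
    return [resource_el]

dummy1 = {"id": "dummy1", "type": "primitive", "parents": []}
dummy2 = {"id": "dummy2", "type": "primitive", "parents": []}
dummygrp = {"id": "dummygrp", "type": "group", "members": [dummy1, dummy2]}
dummy1["parents"] = [dummygrp]
dummy2["parents"] = [dummygrp]

# `pcs resource manage --monitor dummy1` requests only dummy1 ...
to_manage_set = find_resources_to_manage(dummy1)  # [dummy1, dummygrp]
# ... but expanding every managed element back to its primitives also
# pulls in dummy2, whose monitor operation then gets re-enabled.
primitives_set = {p["id"] for el in to_manage_set for p in find_primitives(el)}
print(sorted(primitives_set))  # ['dummy1', 'dummy2']
~~~
The user asked for dummy1 only, yet walking up to the group and back down to its members sweeps dummy2 into the set whose monitor operations get enabled.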


Note: I acknowledge that specifying the behavior for pcs resource enable/manage is quite difficult and that decisions often have unintended consequences, as we also discussed in BZ 1875632. In this case, I believe we should enable monitors on a particular primitive only if either (a) the primitive was specified explicitly on the command line or (b) the primitive's group/parent ID was specified explicitly on the command line.

In my example, that would look like: "enable monitor on dummy1, but don't enable monitor for dummy2 because neither dummy2 nor dummygrp was passed as a CLI argument."

(Without looking further, I'm not sure how tags would factor into this.)
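
The proposed rule could look roughly like the following Python sketch (illustrative names and toy data, not the actual pcs API):
~~~
def monitors_to_enable(primitives, cli_ids):
    # Re-enable a primitive's monitor only if the primitive itself, or
    # one of its parents (its group), was named on the command line.
    enable = []
    for prim in primitives:
        parent_ids = {p["id"] for p in prim["parents"]}
        if prim["id"] in cli_ids or parent_ids & cli_ids:
            enable.append(prim["id"])
    return enable

dummygrp = {"id": "dummygrp"}
dummy1 = {"id": "dummy1", "parents": [dummygrp]}
dummy2 = {"id": "dummy2", "parents": [dummygrp]}

# `pcs resource manage --monitor dummy1` names only dummy1:
print(monitors_to_enable([dummy1, dummy2], {"dummy1"}))    # ['dummy1']
# `pcs resource manage --monitor dummygrp` names the whole group, so
# every member's monitor qualifies:
print(monitors_to_enable([dummy1, dummy2], {"dummygrp"}))  # ['dummy1', 'dummy2']
~~~
Note that passing the group ID itself would still re-enable monitors on all members, which is consistent with the "explicitly specified on the command line" criterion above.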

-----

Version-Release number of selected component (if applicable):

pcs-0.10.6-4.el8

-----

How reproducible:

Always

-----

Steps to Reproduce:
1. Configure two resources in a group (e.g., dummy1 and dummy2 in dummygrp).
2. Run `pcs resource unmanage --monitor` on both resources.
3. Run `pcs resource manage --monitor` on only one resource (e.g., dummy1).

-----

Actual results:

The monitor operation is re-enabled for both resources. Only the specified resource is re-managed.

-----

Expected results:

The monitor operation is re-enabled only for the specified resource. Only the specified resource is re-managed.

-----

Additional info:

As a workaround, users can omit the `--monitor` keyword and instead manipulate the monitor operation manually.

Comment 1 Tomas Jelinek 2022-06-27 13:29:20 UTC
Upstream fix + tests: https://github.com/ClusterLabs/pcs/commit/ddbd11f255702a171ddbfc84cd219adce4e0ea7b

Test:
See Steps to Reproduce, Actual results, Expected results in comment 0

Comment 2 Miroslav Lisik 2022-10-26 13:10:13 UTC
DevTestResults:

[root@r92-1 ~]# rpm -q pcs
pcs-0.11.3-5.el9.x86_64

[root@r92-1 ~]# pcs resource
  * Resource Group: G:
    * d1        (ocf:pacemaker:Dummy):   Started r92-1
    * d2        (ocf:pacemaker:Dummy):   Started r92-1
[root@r92-1 ~]# pcs resource unmanage --monitor d1
[root@r92-1 ~]# pcs resource unmanage --monitor d2
[root@r92-1 ~]# pcs resource
  * Resource Group: G:
    * d1        (ocf:pacemaker:Dummy):   Started r92-1 (unmanaged)
    * d2        (ocf:pacemaker:Dummy):   Started r92-1 (unmanaged)
[root@r92-1 ~]# pcs resource manage --monitor d1
[root@r92-1 ~]# pcs resource
  * Resource Group: G:
    * d1        (ocf:pacemaker:Dummy):   Started r92-1
    * d2        (ocf:pacemaker:Dummy):   Started r92-1 (unmanaged)

Comment 7 Michal Mazourek 2023-01-17 15:48:11 UTC
The same test as in bz1918527 comment 10 was used to verify this bz.
Marking as VERIFIED for pcs-0.11.4-4.el9.

Comment 10 errata-xmlrpc 2023-05-09 07:18:23 UTC
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory (pcs bug fix and enhancement update), and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2023:2151

