Bug 2218232 - Cluster does not move resource group when colocation constraint exists for individual group member
Summary: Cluster does not move resource group when colocation constraint exists for individual group member
Keywords:
Status: VERIFIED
Alias: None
Product: Red Hat Enterprise Linux 8
Classification: Red Hat
Component: pacemaker
Version: 8.9
Hardware: All
OS: All
Priority: urgent
Severity: urgent
Target Milestone: rc
Target Release: 8.9
Assignee: Ken Gaillot
QA Contact: cluster-qe
URL:
Whiteboard:
Depends On: 2218218
Blocks:
 
Reported: 2023-06-28 13:56 UTC by Ken Gaillot
Modified: 2023-08-10 15:39 UTC
CC: 3 users

Fixed In Version: pacemaker-2.1.6-3.el8
Doc Type: Bug Fix
Doc Text:
Cause: When assigning groups to a node, Pacemaker did not consider constraints that were configured explicitly with a group member instead of the group itself.
Consequence: A group could be assigned to a node where some of its members were unable to run.
Fix: Pacemaker now considers member colocations when assigning groups.
Result: Groups run on the best available node.
Clone Of: 2218218
Environment:
Last Closed:
Type: Bug
Target Upstream Version: 2.1.7
Embargoed:


Links:
Red Hat Issue Tracker CLUSTERQE-6792    2023-06-28 15:10:54 UTC
Red Hat Issue Tracker RHELPLAN-161090   2023-06-28 13:56:53 UTC
Red Hat Issue Tracker RHELPLAN-161091   2023-06-28 13:56:55 UTC

Description Ken Gaillot 2023-06-28 13:56:08 UTC
+++ This bug was initially created as a clone of Bug #2218218 +++

Description of problem:
When a resource that already has ordering and colocation constraints against other resources is added to a group, and the group's resources are already running on a different node than the one where the resource's dependencies are running, the resource will fail to start after it is added to the group.

Version-Release number of selected component (if applicable):
pacemaker-2.1.6-2.el9

How reproducible:
always

Steps to Reproduce:
1. create two colocated and ordered resources, "vip-dep" and "vip"
> pcs resource create vip-dep ocf:pacemaker:Dummy
> pcs resource create vip ocf:pacemaker:Dummy
> pcs constraint order start vip-dep then vip
> pcs constraint colocation add vip with vip-dep score=INFINITY

2. create a resource group "grp" with some random resources (resources inside a group have implicit ordering and colocation constraints)
> pcs resource create foo ocf:pacemaker:Dummy --group grp
> pcs resource create bar ocf:pacemaker:Dummy --group grp

3. due to resource load balancing, the "grp" group is now started on one node and the "vip" and "vip-dep" resources on another node
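
If load balancing does not produce this split on its own, the starting state can be forced temporarily; a minimal sketch using standard pcs commands, where "node2" is a placeholder for the second cluster node:
> pcs resource move grp node2    # temporarily pin "grp" away from "vip"/"vip-dep"
> pcs status                     # confirm "grp" and "vip"/"vip-dep" run on different nodes
> pcs resource clear grp         # remove the temporary location constraint
(With default stickiness the group may migrate back after the clear, so re-check placement before step 4.)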

4. add "vip" to the "grp" resource group
> pcs resource group add grp vip

Actual results:
the "vip" resource will fail to start after adding it into the group that's already running on a different node that the "vip-dep" resource, on which "vip" has a colocation and ordering constraints

Expected results:
Pacemaker should stop and move the "vip-dep" resource to the node where "grp" is already running (or vice versa), so that all constraints are satisfied and all resources can start
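
The transition Pacemaker would compute for this configuration can be previewed against the live cluster; a minimal sketch using crm_simulate (shipped with pacemaker):
> crm_simulate --live-check --show-scores   # allocation scores per resource and node
> crm_simulate --simulate --live-check      # actions the scheduler would execute next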

Additional info:
upstream patch https://github.com/ClusterLabs/pacemaker/pull/3141

Comment 4 Markéta Smazová 2023-07-25 12:13:55 UTC
after fix:
----------

>   [root@virt-521 ~]# rpm -q pacemaker
>   pacemaker-2.1.6-4.el8.x86_64

Create two colocated and ordered resources:
>   [root@virt-521 ~]# pcs resource create blue1 ocf:pacemaker:Dummy
>   [root@virt-521 ~]# pcs resource create blue2 ocf:pacemaker:Dummy
>   [root@virt-521 ~]# pcs constraint order start blue1 then blue2
>   Adding blue1 blue2 (kind: Mandatory) (Options: first-action=start then-action=start)
>   [root@virt-521 ~]# pcs constraint colocation add blue2 with blue1 score=INFINITY

Create group with two resources:
>   [root@virt-521 ~]# pcs resource create green1 ocf:pacemaker:Dummy --group green-group
>   [root@virt-521 ~]# pcs resource create green2 ocf:pacemaker:Dummy --group green-group

The colocated and ordered resources "blue1" and "blue2" run on node "virt-521" and the group "green-group"
runs on node "virt-522":
>   [root@virt-521 ~]# pcs status
>   Cluster name: STSRHTS29909
>   Cluster Summary:
>     * Stack: corosync (Pacemaker is running)
>     * Current DC: virt-521 (version 2.1.6-4.el8-6fdc9deea29) - partition with quorum
>     * Last updated: Mon Jul 24 16:49:15 2023 on virt-521
>     * Last change:  Mon Jul 24 16:49:05 2023 by root via cibadmin on virt-521
>     * 2 nodes configured
>     * 6 resource instances configured

>   Node List:
>     * Online: [ virt-521 virt-522 ]

>   Full List of Resources:
>     * fence-virt-521	(stonith:fence_xvm):	 Started virt-521
>     * fence-virt-522	(stonith:fence_xvm):	 Started virt-522
>     * blue1	(ocf::pacemaker:Dummy):	 Started virt-521
>     * blue2	(ocf::pacemaker:Dummy):	 Started virt-521
>     * Resource Group: green-group:
>       * green1	(ocf::pacemaker:Dummy):	 Started virt-522
>       * green2	(ocf::pacemaker:Dummy):	 Started virt-522

>   Daemon Status:
>     corosync: active/enabled
>     pacemaker: active/enabled
>     pcsd: active/enabled

>   [root@virt-521 ~]# pcs constraint --full
>   Location Constraints:
>   Ordering Constraints:
>     start blue1 then start blue2 (kind:Mandatory) (id:order-blue1-blue2-mandatory)
>   Colocation Constraints:
>     blue2 with blue1 (score:INFINITY) (id:colocation-blue2-blue1-INFINITY)
>   Ticket Constraints:

Add resource "blue2" to the "green-group":
>   [root@virt-521 ~]# pcs resource group add green-group blue2
>   [root@virt-521 ~]# pcs status
>   Cluster name: STSRHTS29909
>   Cluster Summary:
>     * Stack: corosync (Pacemaker is running)
>     * Current DC: virt-521 (version 2.1.6-4.el8-6fdc9deea29) - partition with quorum
>     * Last updated: Mon Jul 24 16:50:46 2023 on virt-521
>     * Last change:  Mon Jul 24 16:50:20 2023 by root via cibadmin on virt-521
>     * 2 nodes configured
>     * 6 resource instances configured

>   Node List:
>     * Online: [ virt-521 virt-522 ]

>   Full List of Resources:
>     * fence-virt-521	(stonith:fence_xvm):	 Started virt-521
>     * fence-virt-522	(stonith:fence_xvm):	 Started virt-522
>     * blue1	(ocf::pacemaker:Dummy):	 Started virt-521
>     * Resource Group: green-group:
>       * green1	(ocf::pacemaker:Dummy):	 Started virt-521
>       * green2	(ocf::pacemaker:Dummy):	 Started virt-521
>       * blue2	(ocf::pacemaker:Dummy):	 Started virt-521

>   Daemon Status:
>     corosync: active/enabled
>     pacemaker: active/enabled
>     pcsd: active/enabled

RESULT: Resource group "green-group" moved to node "virt-521", where the resources "blue1" and "blue2" were originally started.

marking VERIFIED in pacemaker-2.1.6-4.el8
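
To re-run the verification from a clean state, the membership change can be reverted; a sketch assuming the same resource and group names:
> pcs resource group remove green-group blue2   # take "blue2" back out of "green-group"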

