Bug 2218232 - Cluster does not move resource group when colocation constraint exists for individual group member
Summary: Cluster does not move resource group when colocation constraint exists for individual group member
Keywords:
Status: VERIFIED
Alias: None
Product: Red Hat Enterprise Linux 8
Classification: Red Hat
Component: pacemaker
Version: 8.9
Hardware: All
OS: All
Priority: urgent
Severity: urgent
Target Milestone: rc
Target Release: 8.9
Assignee: Ken Gaillot
QA Contact: cluster-qe
URL:
Whiteboard:
Depends On: 2218218
Blocks:
 
Reported: 2023-06-28 13:56 UTC by Ken Gaillot
Modified: 2023-08-10 15:39 UTC
CC: 3 users

Fixed In Version: pacemaker-2.1.6-3.el8
Doc Type: Bug Fix
Doc Text:
Cause: When assigning groups to a node, Pacemaker did not consider constraints that were configured explicitly with a group member instead of the group itself.
Consequence: A group could be assigned to a node where some of its members were unable to run.
Fix: Pacemaker now considers member colocations when assigning groups.
Result: Groups run on the best available node.
Clone Of: 2218218
Environment:
Last Closed:
Type: Bug
Target Upstream Version: 2.1.7
Embargoed:


Links:
Red Hat Issue Tracker CLUSTERQE-6792    2023-06-28 15:10:54 UTC
Red Hat Issue Tracker RHELPLAN-161090   2023-06-28 13:56:53 UTC
Red Hat Issue Tracker RHELPLAN-161091   2023-06-28 13:56:55 UTC

Description Ken Gaillot 2023-06-28 13:56:08 UTC
+++ This bug was initially created as a clone of Bug #2218218 +++

Description of problem:
When a resource that already has ordering and colocation constraints against other resources is added to a group, and the group's resources are already running on a different node than the one where the resource's dependencies are running, the resource will fail to start after it is added to the group.

Version-Release number of selected component (if applicable):
pacemaker-2.1.6-2.el9

How reproducible:
always

Steps to Reproduce:
1. create two colocated and ordered resources, "vip-dep" and "vip"
> pcs resource create vip-dep ocf:pacemaker:Dummy
> pcs resource create vip ocf:pacemaker:Dummy
> pcs constraint order start vip-dep then vip
> pcs constraint colocation add vip with vip-dep score=INFINITY

2. create a resource group "grp" with some random resources (resources inside a group have implicit ordering and colocation constraints)
> pcs resource create foo ocf:pacemaker:Dummy --group grp
> pcs resource create bar ocf:pacemaker:Dummy --group grp

3. due to resource load balancing, the "grp" group is now started on one node and the "vip" and "vip-dep" resources on another node
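
If load balancing does not produce this split on its own, the starting state can be forced temporarily; a minimal sketch using standard pcs commands, where "node2" is a placeholder for the second cluster node:
> pcs resource move grp node2    # temporarily pin "grp" away from "vip"/"vip-dep"
> pcs status                     # confirm "grp" and "vip"/"vip-dep" run on different nodes
> pcs resource clear grp         # remove the temporary location constraint
(With default stickiness the group may migrate back after the clear, so re-check placement before step 4.)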

4. add "vip" to the "grp" resource group
> pcs resource group add grp vip

Actual results:
the "vip" resource will fail to start after adding it into the group that's already running on a different node that the "vip-dep" resource, on which "vip" has a colocation and ordering constraints

Expected results:
Pacemaker should stop and move the "vip-dep" resource to the node where "grp" is already running (or vice versa), so that all constraints are satisfied and all resources can start
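
The transition Pacemaker would compute for this configuration can be previewed against the live cluster; a minimal sketch using crm_simulate (shipped with pacemaker):
> crm_simulate --live-check --show-scores   # allocation scores per resource and node
> crm_simulate --simulate --live-check      # actions the scheduler would execute next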

Additional info:
upstream patch https://github.com/ClusterLabs/pacemaker/pull/3141

Comment 4 Markéta Smazová 2023-07-25 12:13:55 UTC
after fix:
----------

>   [root@virt-521 ~]# rpm -q pacemaker
>   pacemaker-2.1.6-4.el8.x86_64

Create two colocated and ordered resources:
>   [root@virt-521 ~]# pcs resource create blue1 ocf:pacemaker:Dummy
>   [root@virt-521 ~]# pcs resource create blue2 ocf:pacemaker:Dummy
>   [root@virt-521 ~]# pcs constraint order start blue1 then blue2
>   Adding blue1 blue2 (kind: Mandatory) (Options: first-action=start then-action=start)
>   [root@virt-521 ~]# pcs constraint colocation add blue2 with blue1 score=INFINITY

Create group with two resources:
>   [root@virt-521 ~]# pcs resource create green1 ocf:pacemaker:Dummy --group green-group
>   [root@virt-521 ~]# pcs resource create green2 ocf:pacemaker:Dummy --group green-group

The colocated and ordered resources "blue1" and "blue2" run on node "virt-521" and the group "green-group"
runs on node "virt-522":
>   [root@virt-521 ~]# pcs status
>   Cluster name: STSRHTS29909
>   Cluster Summary:
>     * Stack: corosync (Pacemaker is running)
>     * Current DC: virt-521 (version 2.1.6-4.el8-6fdc9deea29) - partition with quorum
>     * Last updated: Mon Jul 24 16:49:15 2023 on virt-521
>     * Last change:  Mon Jul 24 16:49:05 2023 by root via cibadmin on virt-521
>     * 2 nodes configured
>     * 6 resource instances configured

>   Node List:
>     * Online: [ virt-521 virt-522 ]

>   Full List of Resources:
>     * fence-virt-521	(stonith:fence_xvm):	 Started virt-521
>     * fence-virt-522	(stonith:fence_xvm):	 Started virt-522
>     * blue1	(ocf::pacemaker:Dummy):	 Started virt-521
>     * blue2	(ocf::pacemaker:Dummy):	 Started virt-521
>     * Resource Group: green-group:
>       * green1	(ocf::pacemaker:Dummy):	 Started virt-522
>       * green2	(ocf::pacemaker:Dummy):	 Started virt-522

>   Daemon Status:
>     corosync: active/enabled
>     pacemaker: active/enabled
>     pcsd: active/enabled

>   [root@virt-521 ~]# pcs constraint --full
>   Location Constraints:
>   Ordering Constraints:
>     start blue1 then start blue2 (kind:Mandatory) (id:order-blue1-blue2-mandatory)
>   Colocation Constraints:
>     blue2 with blue1 (score:INFINITY) (id:colocation-blue2-blue1-INFINITY)
>   Ticket Constraints:

Add resource "blue2" to the "green-group":
>   [root@virt-521 ~]# pcs resource group add green-group blue2
>   [root@virt-521 ~]# pcs status
>   Cluster name: STSRHTS29909
>   Cluster Summary:
>     * Stack: corosync (Pacemaker is running)
>     * Current DC: virt-521 (version 2.1.6-4.el8-6fdc9deea29) - partition with quorum
>     * Last updated: Mon Jul 24 16:50:46 2023 on virt-521
>     * Last change:  Mon Jul 24 16:50:20 2023 by root via cibadmin on virt-521
>     * 2 nodes configured
>     * 6 resource instances configured

>   Node List:
>     * Online: [ virt-521 virt-522 ]

>   Full List of Resources:
>     * fence-virt-521	(stonith:fence_xvm):	 Started virt-521
>     * fence-virt-522	(stonith:fence_xvm):	 Started virt-522
>     * blue1	(ocf::pacemaker:Dummy):	 Started virt-521
>     * Resource Group: green-group:
>       * green1	(ocf::pacemaker:Dummy):	 Started virt-521
>       * green2	(ocf::pacemaker:Dummy):	 Started virt-521
>       * blue2	(ocf::pacemaker:Dummy):	 Started virt-521

>   Daemon Status:
>     corosync: active/enabled
>     pacemaker: active/enabled
>     pcsd: active/enabled

RESULT: Resource group "green-group" moved to node "virt-521", where the resources "blue1" and "blue2" were originally started.

marking VERIFIED in pacemaker-2.1.6-4.el8
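
To re-run the verification from a clean state, the membership change can be reverted; a sketch assuming the same resource and group names:
> pcs resource group remove green-group blue2   # take "blue2" back out of "green-group"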

