Bug 2030869 - Transient node attributes should be preserved if remote node leaves and returns without restarting
Summary: Transient node attributes should be preserved if remote node leaves and retur...
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 8
Classification: Red Hat
Component: pacemaker
Version: 8.4
Hardware: x86_64
OS: Linux
Priority: high
Severity: high
Target Milestone: rc
Target Release: 8.9
Assignee: Chris Lumens
QA Contact: cluster-qe
Docs Contact: Steven J. Levine
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2021-12-09 23:00 UTC by lpham
Modified: 2023-11-14 16:51 UTC
CC List: 8 users

Fixed In Version: pacemaker-2.1.6-1.el8
Doc Type: Enhancement
Doc Text:
.Pacemaker Remote nodes now preserve transient node attributes after a brief connection outage
Previously, when a Pacemaker Remote connection was lost, Pacemaker would always purge its transient node attributes. This was unnecessary if the connection was quickly recoverable and the remote daemon had not restarted in the meantime. Pacemaker Remote nodes now preserve transient node attributes after a brief, recoverable connection outage.
Clone Of:
Environment:
Last Closed: 2023-11-14 15:32:34 UTC
Type: Bug
Target Upstream Version:
Embargoed:


Attachments
This tar file contains the "crm_report -S" output files for the 3 hosts mentioned in the description section (782.00 KB, application/x-tar)
2021-12-09 23:00 UTC, lpham


Links
System ID                                Last Updated
Red Hat Issue Tracker CLUSTERQE-6693     2023-05-16 13:10:52 UTC
Red Hat Issue Tracker RHELPLAN-105309    2021-12-09 23:06:26 UTC
Red Hat Product Errata RHEA-2023:6970    2023-11-14 15:33:21 UTC

Description lpham 2021-12-09 23:00:23 UTC
Created attachment 1845554 [details]
This tar file contains the "crm_report -S" output files for the 3 hosts mentioned in the description section

Description of problem:

When the cluster node that hosts the Pacemaker remote connection resource is fenced, the resources on the corresponding Pacemaker remote host are also stopped.  Note that the remote resource itself was able to migrate successfully to another cluster node.


Version-Release number of selected component (if applicable):
Pacemaker version: 2.0.4-2.db2pcmk.el8.x86_64


How reproducible:
In my environment, this problem is reproduced every time


Steps to Reproduce:
1. Set up a cluster with 3 nodes: 2 nodes run as cluster nodes and the third node runs as a Pacemaker remote node (see the sketch below).
2. Set up a fence agent for the cluster.
3. Run "reboot -f" on the cluster node where the remote resource is currently active.

Actual results:
The host is fenced, and the remote resource is migrated to run on the other cluster node.  However, the resources on the Pacemaker remote host are stopped.

Expected results:
It is expected that the resources on the Pacemaker remote host continue running without interruption.

Additional info:
The attached files are the "crm_report -S" output files, captured on the 3 hosts in my cluster.
In my test setup:
- The 2 hosts svtx05 and svtx06 are configured as cluster nodes
- The 3rd host svtx07 is the Pacemaker remote host
- The remote resource, named "svtx07", was active on host svtx05
- Host svtx05 was rebooted at "2021-12-09 17:05:20"

Comment 3 lpham 2021-12-10 18:35:14 UTC
This is how the remote resource is defined in the resource model, as shown in the "crm configure show" output:

primitive svtx07 ocf:pacemaker:remote \
        params server=svtx07 reconnect_interval=1m \
        op monitor interval=30s \
        meta is-managed=true

>> It may be better for the user to configure them as non-transient so that their -INFINITY rules don't take effect in this scenario.
What change do I need to make to configure the node to have non-transient state?
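For reference, a minimal sketch of the two kinds of node attribute; the attribute name "my-app-state" is only an example, not taken from this report:

# Transient attribute (CIB status section), managed by the attribute manager;
# this is the kind discussed in this bug:
attrd_updater --node svtx07 --name my-app-state --update 1

# Permanent attribute (CIB nodes section), which survives node restarts and fencing:
crm_attribute --node svtx07 --name my-app-state --update 1 --lifetime forever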

Comment 4 lpham 2021-12-10 19:48:41 UTC
Assuming that transient state refers to using attrd_updater and persistent state refers to using crm_attribute when setting CIB attributes: the issue for us with crm_attribute is that if the node that was acting as the Designated Controller (DC) is down (and before the role moves to another node), the crm_attribute command sometimes hangs or returns an error.  That is why we hesitate to use crm_attribute in our code when we need to set a CIB attribute that does not have to be persistent.

We need the CIB attribute values to be preserved during remote resource migration.

Comment 5 Ken Gaillot 2021-12-10 20:03:09 UTC
Hi,

(In reply to lpham from comment #4)
> Assuming that transient state refers to using attrd_updater and persistent
> state refers to using crm_attribute when setting CIB attributes, there is an
> issue for us when using crm_attribute is that if the node that was acting as
> the Domain Control is down (and before it migrates to another node), then
> the crm_attribute command sometimes hang or returned an error so that's why

I don't think crm_attribute has any direct dependencies on the DC, but the DC node also sets itself as the cluster's CIB authority, so the hang or error might have something to do with the CIB status. Do you have any more details about that issue?

> we hesitate to use crm_attribute in our code when we need to set a CIB
> attribute that does not have to be persistent.
> 
> We need to CIB attributes values to be preserved during remote resource
> migration.

See Comment 2. Transient attributes for remote nodes are preserved during clean migrations of the connection, but not when the host node is being fenced.

Comment 6 lpham 2021-12-13 17:21:45 UTC
I was able to see the comment, thanks Ken.
The way we use these transient CIB attributes, they represent the (internal) state of the resources and are usually set to match the corresponding cluster resource state on that host.  Hence, making them persistent doesn't make sense.
These transient CIB attributes are set by the resource monitor function.  If the monitor function determines that the resource is online, it sets the value to "1"; otherwise, it sets the value to "0".  A missing or "0" value indicates that the resource is not active and hence prevents the dependent resources from being started.  I suppose I can try using crm_attribute with --lifetime=reboot, but I think I might hit some issues when the DC host is temporarily unavailable, as I remember hitting an unexpected error when I ran a reboot test of the DC node.
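As a rough illustration of that monitor-driven pattern (the attribute name and the health check are hypothetical, not taken from the actual agent):

# Inside a resource agent's monitor action: publish a transient attribute
# that mirrors whether the application is running on this host.
if my_app_is_healthy; then          # placeholder for the agent's real health check
    attrd_updater --name my-app-active --update 1
else
    attrd_updater --name my-app-active --update 0
fi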

A couple of questions:
- Why can't the cluster manager treat transient attributes the same way for a clean migration and a fencing-triggered migration?  What is special about a fencing-triggered migration?  Can't the cluster manager trigger migration of all remote resources as part of cluster node fencing (before invoking the fence agent)?
- If we add code to trigger migration of the remote resource(s) within the fence agent script, would that help, or would that still be considered an unclean migration so that all transient attributes would still be lost?  Not that this is what we want to do anyway, as we prefer the fencing action to be quick, but I just want to explore the options.

Comment 7 Ken Gaillot 2021-12-13 17:57:31 UTC
(In reply to lpham from comment #6)
> I was able to see the comment, thanks Ken.
> The way we use these transient CIB attributes, they represent the (internal)
> state of the resources and they are usually set to match the equivalent
> cluster resource state on that host.  Hence, making them persistent doesn't
> make sense.
> These transient CIB attributes are set by the resource monitor function.  If
> the monitor function determines that the resource is online, it sets the
> value as "1".  Otherwise, it sets the value to "0".  A non-existence or a
> "0" setting would indicate that the resource is not active and hence prevent
> the dependent resources to be started.  I suppose that I can try using

Are ordering constraints insufficient for the dependent resources?
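For example, an ordering constraint (optionally with a colocation constraint) between hypothetical resources can express the dependency without relying on node attributes; the resource names below are placeholders:

# Start dependent-app only after base-app has started, and keep them together
pcs constraint order start base-app then start dependent-app
pcs constraint colocation add dependent-app with base-app INFINITY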

> crm_attribute with --lifetime=reboot, but I think I might hit some issues
> when the DC host was temporarily unavailable, as I remember hitting some
> unexpected error when I ran a reboot test of the DC node.

crm_attribute --lifetime=reboot should be essentially equivalent to attrd_updater, so that wouldn't get around this issue.
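That is, assuming a hypothetical attribute name, these two commands both write a transient attribute to the CIB status section:

attrd_updater --node svtx07 --name my-app-state --update 1
crm_attribute --node svtx07 --name my-app-state --update 1 --lifetime reboot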
 
> A couple of questions:
> - Why can't the cluster manager treat transient attributes the same way for
> clean migration and fenced migration ?  What is special about a
> fence-trigger migration ?  Can't the cluster manager trigger migration of
> all remote resources as part of the cluster node fencing (before invoking
> the fencing agent) ?

No, nothing can be done on a host that's scheduled for fencing, because the assumption is that the host is not functioning correctly. The stop of the remote connection is implied by successful fencing of the host.

> - If we add code to trigger migration of the remote resource(s) within the
> fence agent script, would that help or would that still be considered
> un-clean migration and hence all transient attributes would still be lost ? 
> Not that this is what we want to do anyway, as we do prefer that fencing
> action to be quick, but I just want to explore the options.

No, once the host is scheduled to be fenced, nothing more will be done on it until fencing completes successfully and it rejoins the cluster.

Comment 8 lpham 2022-01-19 16:58:51 UTC
I have been testing a solution that uses a combination of persistent CIB attributes and ordering constraints, and removes all resource dependencies based on transient CIB attributes.  It seems to work in early tests, but I still need to do more validation.

Comment 9 lpham 2022-01-31 21:36:37 UTC
Just a quick update here that the problem seems to have been resolved by not using transient CIB attributes for resource dependency.
But I have now encountered a related (but different) issue when replacing attrd_updater with the crm_attribute command for setting a persistent CIB attribute value.
In my testing, the crm_attribute command with the -l forever option hung during fencing and never returned, even after 2 minutes.  To work around this issue, I had to reduce the monitor timeout for the corresponding resources from 120s to 10s, so that the monitor action would time out and be retried faster.  The same crm_attribute command succeeded on the next retry.
This is a different issue, so I am good with closing out this one.
Thanks for all the help.
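For reference, a monitor timeout change like the one described above could be made with pcs; the resource name and values here are illustrative only:

# Shorten the monitor timeout so a hung monitor fails and is retried sooner
pcs resource update my-app op monitor interval=30s timeout=10s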

Comment 10 Ken Gaillot 2022-02-01 17:00:09 UTC
That's good to hear :)

The more general problem of remote nodes losing their transient attributes in a network blip is still interesting, so I'm going to leave this bz open for a while, in case we have time to address it.

Comment 12 Ken Gaillot 2023-05-02 21:50:33 UTC
Fixed in upstream 2.1 branch as of commit 09058c94

Comment 18 Ken Gaillot 2023-06-29 16:15:12 UTC
Added docs

Comment 21 Markéta Smazová 2023-07-24 08:56:16 UTC
after fix
----------
>   [root@virt-529 ~]# rpm -q pacemaker
>   pacemaker-2.1.6-3.el8.x86_64


Set up a 3-node cluster (2 cluster nodes and 1 pacemaker remote node) and create a transient attribute
on the pacemaker remote node:
>   [root@virt-529 ~]# pcs status --full
>   Cluster name: STSRHTS31813
>   Cluster Summary:
>     * Stack: corosync (Pacemaker is running)
>     * Current DC: virt-528 (2) (version 2.1.6-3.el8-6fdc9deea29) - partition with quorum
>     * Last updated: Fri Jul 21 10:34:02 2023 on virt-529
>     * Last change:  Fri Jul 21 10:33:48 2023 by virt-526 via cibadmin on virt-528
>     * 3 nodes configured
>     * 6 resource instances configured

>   Node List:
>     * RemoteNode virt-526: online
>     * Node virt-528 (2): online, feature set 3.17.4
>     * Node virt-529 (3): online, feature set 3.17.4

>   Full List of Resources:
>     * fence-virt-526	(stonith:fence_xvm):	 Started virt-529
>     * fence-virt-528	(stonith:fence_xvm):	 Started virt-529
>     * fence-virt-529	(stonith:fence_xvm):	 Started virt-528
>     * virt-526	(ocf::pacemaker:remote):	 Started virt-528
>     * dummy2	(ocf::pacemaker:Dummy):	 Started virt-526
>     * dummy1	(ocf::pacemaker:Dummy):	 Started virt-526

>   Node Attributes:
>     * Node: virt-526:
>       * test-attribute                  	: testing-remote-node

>   Migration Summary:

>   Tickets:

>   PCSD Status:
>     virt-528: Online
>     virt-529: Online

>   Daemon Status:
>     corosync: active/enabled
>     pacemaker: active/enabled
>     pcsd: active/enabled

>   [root@virt-529 ~]# pcs constraint --full
>   Location Constraints:
>   Ordering Constraints:
>   Colocation Constraints:
>   Ticket Constraints:

Check remote node's transient attribute:
>   [root@virt-529 ~]# crm_attribute --type status --node virt-526 --name test-attribute --query
>   scope=status  name=test-attribute value=testing-remote-node
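(A transient attribute like this can be set with, for example, "crm_attribute --type status --node virt-526 --name test-attribute --update testing-remote-node"; the exact command used during setup is not shown in this comment.)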

Reboot the node where the pacemaker remote resource is running:
>   [root@virt-528 ~]# reboot -f
>   Rebooting.

>   [root@virt-529 ~]# date; crm_mon -r -m -f -A -1
>   Fri 21 Jul 10:55:55 CEST 2023
>   Cluster Summary:
>     * Stack: corosync (Pacemaker is running)
>     * Current DC: virt-529 (version 2.1.6-3.el8-6fdc9deea29) - partition with quorum
>     * Last updated: Fri Jul 21 10:55:56 2023 on virt-529
>     * Last change:  Fri Jul 21 10:33:48 2023 by virt-526 via cibadmin on virt-528
>     * 3 nodes configured
>     * 6 resource instances configured

>   Node List:
>     * Node virt-528: UNCLEAN (offline)
>     * Online: [ virt-529 ]
>     * RemoteOnline: [ virt-526 ]

>   Full List of Resources:
>     * fence-virt-526	(stonith:fence_xvm):	 Started virt-529
>     * fence-virt-528	(stonith:fence_xvm):	 Started virt-529
>     * fence-virt-529	(stonith:fence_xvm):	 Starting [ virt-528 virt-529 ]
>     * virt-526	(ocf::pacemaker:remote):	 Started [ virt-528 virt-529 ]
>     * dummy2	(ocf::pacemaker:Dummy):	 Stopped
>     * dummy1	(ocf::pacemaker:Dummy):	 Stopped

>   Node Attributes:
>     * Node: virt-526:
>       * test-attribute                    : testing-remote-node

>   Migration Summary:

>   Fencing History:
>     * reboot of virt-528 pending: client=pacemaker-controld.59743, origin=virt-529

Node is fenced:
>   [root@virt-529 ~]# date; crm_mon -r -m -f -A -1
>   Fri 21 Jul 10:55:58 CEST 2023
>   Cluster Summary:
>     * Stack: corosync (Pacemaker is running)
>     * Current DC: virt-529 (version 2.1.6-3.el8-6fdc9deea29) - partition with quorum
>     * Last updated: Fri Jul 21 10:55:58 2023 on virt-529
>     * Last change:  Fri Jul 21 10:33:48 2023 by virt-526 via cibadmin on virt-528
>     * 3 nodes configured
>     * 6 resource instances configured

>   Node List:
>     * Online: [ virt-529 ]
>     * OFFLINE: [ virt-528 ]
>     * RemoteOnline: [ virt-526 ]

>   Full List of Resources:
>     * fence-virt-526	(stonith:fence_xvm):	 Started virt-529
>     * fence-virt-528	(stonith:fence_xvm):	 Started virt-529
>     * fence-virt-529	(stonith:fence_xvm):	 Started virt-529
>     * virt-526	(ocf::pacemaker:remote):	 Started virt-529
>     * dummy2	(ocf::pacemaker:Dummy):	 Started virt-526
>     * dummy1	(ocf::pacemaker:Dummy):	 Started virt-526

>   Node Attributes:
>     * Node: virt-526:
>       * test-attribute                    : testing-remote-node

>   Migration Summary:

>   Fencing History:
>     * reboot of virt-528 successful: delegate=virt-529, client=pacemaker-controld.59743, origin=virt-529, completed='2023-07-21 10:55:57.141210 +02:00'

RESULT: The pacemaker remote resource "virt-526" migrates to the other node, "virt-529", and the resources stay running on the remote node.


Check remote node's transient attributes:
>   [root@virt-529 ~]# crm_attribute --type status --node virt-526 --name test-attribute --query
>   scope=status  name=test-attribute value=testing-remote-node

RESULT: Transient attributes on the pacemaker remote node were not deleted.


The fenced node returns:
>   [root@virt-529 ~]# date; crm_mon -r -m -f -A -1
>   Fri 21 Jul 10:57:27 CEST 2023
>   Cluster Summary:
>     * Stack: corosync (Pacemaker is running)
>     * Current DC: virt-529 (version 2.1.6-3.el8-6fdc9deea29) - partition with quorum
>     * Last updated: Fri Jul 21 10:57:27 2023 on virt-529
>     * Last change:  Fri Jul 21 10:33:48 2023 by virt-526 via cibadmin on virt-528
>     * 3 nodes configured
>     * 6 resource instances configured

>   Node List:
>     * Online: [ virt-528 virt-529 ]
>     * RemoteOnline: [ virt-526 ]

>   Full List of Resources:
>     * fence-virt-526	(stonith:fence_xvm):	 Started virt-528
>     * fence-virt-528	(stonith:fence_xvm):	 Started virt-529
>     * fence-virt-529	(stonith:fence_xvm):	 Started virt-528
>     * virt-526	(ocf::pacemaker:remote):	 Started virt-529
>     * dummy2	(ocf::pacemaker:Dummy):	 Started virt-526
>     * dummy1	(ocf::pacemaker:Dummy):	 Started virt-526

>   Node Attributes:
>     * Node: virt-526:
>       * test-attribute                    : testing-remote-node

>   Migration Summary:

>   Fencing History:
>     * reboot of virt-528 successful: delegate=virt-529, client=pacemaker-controld.59743, origin=virt-529, completed='2023-07-21 10:55:57.141210 +02:00'


marking verified in pacemaker-2.1.6-3.el8

Comment 24 errata-xmlrpc 2023-11-14 15:32:34 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (pacemaker bug fix and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2023:6970

