Bug 2039399 - Removing pacemaker_remote node ends up in fence action on the remote node
Summary: Removing pacemaker_remote node ends up in fence action on the remote node
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 9
Classification: Red Hat
Component: pacemaker
Version: 9.0
Hardware: All
OS: All
Priority: high
Severity: medium
Target Milestone: rc
Target Release: 9.0
Assignee: Ken Gaillot
QA Contact: cluster-qe@redhat.com
URL:
Whiteboard:
Depends On:
Blocks: 2046446
Reported: 2022-01-11 16:29 UTC by Michal Mazourek
Modified: 2024-03-08 14:45 UTC
CC List: 3 users

Fixed In Version: pacemaker-2.1.2-4.el9
Doc Type: No Doc Update
Doc Text:
This issue was not in a released build
Clone Of:
Clones: 2046446
Environment:
Last Closed: 2022-05-17 12:20:40 UTC
Type: Bug
Target Upstream Version:
Embargoed:


Links
System                  ID               Private  Priority  Status  Summary  Last Updated
Red Hat Issue Tracker   RHELPLAN-107444  0        None      None    None     2022-01-11 16:33:06 UTC
Red Hat Product Errata  RHBA-2022:2293   0        None      None    None     2022-05-17 12:20:51 UTC

Description Michal Mazourek 2022-01-11 16:29:42 UTC
Description of problem:
When a pacemaker_remote node is removed from the cluster configuration (via 'pcs cluster node remove-remote'), a fence action occurs on the remote node.


Version-Release number of selected component (if applicable):
pacemaker-2.1.2-1.el9.x86_64


How reproducible:
always


Steps to Reproduce:
[root@virt-255 ~]# pcs status
Cluster name: STSRHTS22418
Cluster Summary:
  * Stack: corosync
  * Current DC: virt-256 (version 2.1.2-1.el9-ada5c3b36e2) - partition with quorum
  * Last updated: Tue Jan 11 15:29:56 2022
  * Last change:  Tue Jan 11 15:29:01 2022 by root via cibadmin on virt-255
  * 4 nodes configured
  * 4 resource instances configured

Node List:
  * Online: [ virt-255 virt-256 virt-261 virt-262 ]

Full List of Resources:
  * fence-virt-255	(stonith:fence_xvm):	 Started virt-255
  * fence-virt-256	(stonith:fence_xvm):	 Started virt-256
  * fence-virt-261	(stonith:fence_xvm):	 Started virt-261
  * fence-virt-262	(stonith:fence_xvm):	 Started virt-262

Daemon Status:
  corosync: active/disabled
  pacemaker: active/disabled
  pcsd: active/enabled

[root@virt-255 ~]# pcs cluster node remove virt-262
Destroying cluster on hosts: 'virt-262'...
virt-262: Successfully destroyed cluster
Sending updated corosync.conf to nodes...
virt-261: Succeeded
virt-255: Succeeded
virt-256: Succeeded
virt-255: Corosync configuration reloaded

[root@virt-255 ~]# pcs cluster node add-remote virt-262
No addresses specified for host 'virt-262', using 'virt-262'
Sending 'pacemaker authkey' to 'virt-262'
virt-262: successful distribution of the file 'pacemaker authkey'
Requesting 'pacemaker_remote enable', 'pacemaker_remote start' on 'virt-262'
virt-262: successful run of 'pacemaker_remote enable'
virt-262: successful run of 'pacemaker_remote start'

[root@virt-255 ~]# pcs status | grep "Node List" -A 2
Node List:
  * Online: [ virt-255 virt-256 virt-261 ]
  * RemoteOnline: [ virt-262 ]

## removing the remote node

[root@virt-255 ~]# pcs cluster node remove-remote virt-262
Requesting 'pacemaker_remote disable', 'pacemaker_remote stop' on 'virt-262'
virt-262: successful run of 'pacemaker_remote disable'
virt-262: successful run of 'pacemaker_remote stop'
Requesting remove 'pacemaker authkey' from 'virt-262'
virt-262: successful removal of the file 'pacemaker authkey'
Deleting Resource - virt-262
[root@virt-255 ~]# echo $?
0

> The command gets stuck on the 'Deleting Resource - virt-262' line for a few minutes; the remote node is fenced after that.


Actual results:
[root@virt-255 ~]# pcs stonith history
reboot of virt-262 successful: delegate=virt-261, client=pacemaker-controld.439906, origin=virt-256, completed='1970-01-05 01:18:46 +01:00'
1 event found


Expected results:
No fence action occurs.


Additional info:
The same issue is also present in pacemaker-2.1.2-2, on both RHEL8 and RHEL9.
The issue is not present in pacemaker-2.1.0-11.el9 and earlier.
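
The version information in this report can be turned into a quick check of whether a given el9 build falls in the affected range. This is only a sketch: the helper names `version_le` and `is_affected` are hypothetical, the lower bound is simply the first build reported as affected here (2.1.2-1.el9), the upper bound is the "Fixed In Version" build (2.1.2-4.el9), and GNU `sort -V` only approximates RPM version comparison.

```shell
#!/bin/sh
# Assumption: el9 builds from 2.1.2-1 up to, but not including, 2.1.2-4
# are affected. Both bounds are taken from this bug report.
FIRST_BAD="2.1.2-1.el9"
FIXED="2.1.2-4.el9"

# version_le A B: succeed if A sorts at or before B under GNU "sort -V".
# This approximates, but is not identical to, RPM's own comparison.
version_le() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$1" ]
}

# is_affected VERSION-RELEASE: hypothetical helper, succeeds if the given
# version-release string lies in the assumed affected range.
is_affected() {
    version_le "$FIRST_BAD" "$1" && ! version_le "$FIXED" "$1"
}
```

On a cluster node the helper could be fed the installed build, for example `is_affected "$(rpm -q --qf '%{VERSION}-%{RELEASE}' pacemaker)" && echo "affected build"`.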

Comment 1 Ken Gaillot 2022-01-11 17:14:11 UTC
Hi, can you attach a pcs cluster report from around the time of the issue?

Comment 4 Ken Gaillot 2022-01-15 00:14:12 UTC
I confirmed that the issue was introduced between the upstream 2.1.1 and 2.1.2 releases. Pacemaker is not properly detecting that the remote node was intentionally shut down. More investigation will be needed to determine the cause and a fix.

Comment 5 Ken Gaillot 2022-01-26 16:45:39 UTC
The fix turned out to be straightforward, so we're going to get this into 9.0.

Comment 6 Ken Gaillot 2022-01-26 17:15:59 UTC
Fixed upstream by commit 16928cfc69

Comment 11 Ilias Romanos 2022-02-24 14:56:39 UTC
[root@virt-258 ~]# rpm -qa pacemaker
pacemaker-2.1.2-4.el9.x86_64

[root@virt-258 ~]# pcs status
Cluster name: STSRHTS21045
Cluster Summary:
  * Stack: corosync
  * Current DC: virt-260 (version 2.1.2-4.el9-ada5c3b36e2) - partition with quorum
  * Last updated: Thu Feb 24 15:06:04 2022
  * Last change:  Thu Feb 24 15:01:45 2022 by root via cibadmin on virt-258
  * 5 nodes configured
  * 5 resource instances configured

Node List:
  * Online: [ virt-258 virt-259 virt-260 virt-261 virt-262 ]

Full List of Resources:
  * fence-virt-258      (stonith:fence_xvm):     Started virt-258
  * fence-virt-259      (stonith:fence_xvm):     Started virt-259
  * fence-virt-260      (stonith:fence_xvm):     Started virt-260
  * fence-virt-261      (stonith:fence_xvm):     Started virt-261
  * fence-virt-262      (stonith:fence_xvm):     Started virt-262

Daemon Status:
  corosync: active/disabled
  pacemaker: active/disabled
  pcsd: active/enabled

[root@virt-258 ~]# pcs cluster node remove virt-262
Destroying cluster on hosts: 'virt-262'...
virt-262: Successfully destroyed cluster
Sending updated corosync.conf to nodes...
virt-260: Succeeded
virt-259: Succeeded
virt-261: Succeeded
virt-258: Succeeded
virt-258: Corosync configuration reloaded

[root@virt-258 ~]# pcs cluster node add-remote virt-262
No addresses specified for host 'virt-262', using 'virt-262'
Sending 'pacemaker authkey' to 'virt-262'
virt-262: successful distribution of the file 'pacemaker authkey'
Requesting 'pacemaker_remote enable', 'pacemaker_remote start' on 'virt-262'
virt-262: successful run of 'pacemaker_remote enable'
virt-262: successful run of 'pacemaker_remote start'

[root@virt-258 ~]# pcs status | grep "Node List" -A 2
Node List:
  * Online: [ virt-258 virt-259 virt-260 virt-261 ]
  * RemoteOnline: [ virt-262 ]

[root@virt-258 ~]# pcs cluster node remove-remote virt-262
Requesting 'pacemaker_remote disable', 'pacemaker_remote stop' on 'virt-262'
virt-262: successful run of 'pacemaker_remote disable'
virt-262: successful run of 'pacemaker_remote stop'
Requesting remove 'pacemaker authkey' from 'virt-262'
virt-262: successful removal of the file 'pacemaker authkey'
Deleting Resource - virt-262

[root@virt-258 ~]# echo $?
0

[root@virt-258 ~]# sleep 120 && pcs stonith history
0 events found
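
The final check in the transcript above could be scripted roughly as follows. This is a minimal sketch: `fence_event_count` is a hypothetical helper name, and it assumes that `pcs stonith history` ends with a line of the form "N event(s) found", as in the two outputs shown in this report.

```shell
#!/bin/sh
# fence_event_count: read "pcs stonith history" output on stdin and print
# the number N from its "N event found" / "N events found" summary line.
# Assumption: the summary line has exactly the form seen in this report.
fence_event_count() {
    sed -n 's/^\([0-9][0-9]*\) events\{0,1\} found$/\1/p'
}
```

After removing the remote node and waiting, the expected result corresponds to `[ "$(pcs stonith history | fence_event_count)" = "0" ]` succeeding.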

Comment 13 errata-xmlrpc 2022-05-17 12:20:40 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (new packages: pacemaker), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2022:2293

