801355 – cman+pacemaker leads to double fences

Bug 801355 - cman+pacemaker leads to double fences

Keywords:
Status:	CLOSED ERRATA
Alias:	None
Product:	Red Hat Enterprise Linux 6
Classification:	Red Hat
Component:	pacemaker
Sub Component:
Version:	6.3
Hardware:	Unspecified
OS:	Unspecified
Priority:	high
Severity:	medium
Target Milestone:	rc
Target Release:	---
Assignee:	David Vossel
QA Contact:	Cluster QE
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:

Reported:	2012-03-08 10:58 UTC by Jaroslav Kortus
Modified:	2016-04-26 13:25 UTC
CC List:	5 users
Fixed In Version:	pacemaker-1.1.8-1.el6
Doc Type:	Bug Fix
Doc Text:	Cause: Multiple parts of the system may notice a node failure at slightly different times.
	Consequence: If more than one component requests that the node be fenced, the fencing component fences it multiple times.
	Fix: Identical requests from different clients are merged if the first is still in progress.
	Result: The node is fenced only once.
Clone Of:
Environment:
Last Closed:	2013-02-21 09:51:03 UTC
Target Upstream Version:
Embargoed:

Attachments
crm_report output (136.41 KB, application/x-bzip2) 2012-03-08 10:59 UTC, Jaroslav Kortus
/var/log/messages snippet (2.29 KB, text/plain) 2012-11-29 11:35 UTC, michal novacek
/var/log/messages snippet (13.04 KB, text/plain) 2012-12-04 13:25 UTC, michal novacek

Links
System	ID	Private	Priority	Status	Summary	Last Updated
Red Hat Product Errata	RHBA-2013:0375	0	normal	SHIPPED_LIVE	pacemaker bug fix and enhancement update	2013-02-20 20:52:23 UTC

Description Jaroslav Kortus 2012-03-08 10:58:00 UTC

Description of problem:
cman with fence_pcmk + pacemaker (in this case with virt fencing) fences a failed node twice.

This is most probably due to cman requesting a fence and pacemaker making its own fencing decision right after that.

Version-Release number of selected component (if applicable):
pacemaker-1.1.7-2.el6.x86_64
cman-3.0.12.1-27.el6.x86_64

How reproducible:
100%

Steps to Reproduce:
1. Set up a cman+pacemaker cluster with fence_pcmk and (for example) virt fencing.
2. ssh to one node and issue halt -fin.
3. Wait for the node to get fenced and observe that it gets fenced twice in a row.
  
Actual results:
node fenced twice

Expected results:
one fence

Additional info:
config snips:
cluster.conf:
		<clusternode name="node01" nodeid="1" votes="1">
			<fence>
				<method name="pcmk-redirect">
					<device name="pcmk" port="node01"/>
				</method>
			</fence>
		</clusternode>
	<fencedevices>
		<fencedevice agent="fence_pcmk" name="pcmk"/>
	</fencedevices>
crm:
primitive virt-fencing stonith:fence_xvm \
	params pcmk_host_check="static-list" pcmk_host_list="node01,node02,node03" action="reboot" debug="1"


You can see the double fence via virsh console or by running fence_virtd manually (fence_virtd -F -f /etc/fence_virt.conf -d 2):

Request 2 seqno 367654 domain node03
Plain TCP request
Request 2 seqno 367654 src 192.168.100.101 target node03
Rebooting domain node03...
[REBOOT] Calling virDomainDestroy(0x233cc30)
Domain has been shut off
Calling virDomainCreateLinux()...
Request 2 seqno 623812 domain node03
Plain TCP request
Request 2 seqno 623812 src 192.168.100.101 target node03
Rebooting domain node03...
[REBOOT] Calling virDomainDestroy(0x233cd60)
Domain has been shut off
Calling virDomainCreateLinux()...
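
The fence_virtd debug output above can be checked mechanically rather than by eye. The following is a minimal sketch, assuming that each distinct seqno on a "Request ... src ... target ..." line corresponds to one separate fence request; the function name count_fence_requests and the regular expression are illustrative and not part of fence_virtd.

```python
import re
from collections import Counter

# Matches lines such as:
#   Request 2 seqno 367654 src 192.168.100.101 target node03
# The "... domain node03" lines are deliberately not matched, so each
# request is counted once. (Pattern is an assumption based on the log above.)
REQUEST_RE = re.compile(r"^Request \d+ seqno (\d+) src \S+ target (\S+)$")


def count_fence_requests(log_text):
    """Return {target: number of distinct seqno values} found in the log."""
    seen = set()
    for line in log_text.splitlines():
        match = REQUEST_RE.match(line.strip())
        if match:
            seqno, target = match.group(1), match.group(2)
            seen.add((target, seqno))
    return dict(Counter(target for target, _ in seen))


SAMPLE = """\
Request 2 seqno 367654 domain node03
Plain TCP request
Request 2 seqno 367654 src 192.168.100.101 target node03
Rebooting domain node03...
Request 2 seqno 623812 domain node03
Plain TCP request
Request 2 seqno 623812 src 192.168.100.101 target node03
Rebooting domain node03...
"""

if __name__ == "__main__":
    for target, count in count_fence_requests(SAMPLE).items():
        print(f"{target}: {count} fence request(s)")
```

A count above 1 for a single node failure indicates the duplicate fencing reported here.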

Comment 1 Jaroslav Kortus 2012-03-08 10:59:07 UTC

Created attachment 568590
crm_report output

output of crm_report after node03 was fenced

Comment 3 David Vossel 2012-03-12 22:26:03 UTC

(In reply to comment #1)
> Created attachment 568590
> crm_report output
> 
> output of crm_report after node03 was fenced

I set up your environment and can see the problem.  I'm working on tracking it down.

Comment 4 David Vossel 2012-03-23 17:51:44 UTC

I was able to track down the cause of this issue last week.  Here is what I found.

Both cman and crmd are alerted to the loss of a node's membership at the same time, so each process independently schedules the node to be fenced.  Because the remote stonith operation that fences the node responds asynchronously, neither cman nor crmd can know that the other has scheduled a fence operation on the same node at nearly the same time.

To be clear, this is not the result of the crmd trying to fence a node because it detected it is down as a result of someone else fencing it earlier on.  This is the result of both crmd and cman scheduling the fencing of a node at the same time as a result of the same event within the cluster.

I have a patch that resolves this issue specifically, but now that I understand how this situation occurs I am not satisfied that it will prevent other similar situations from occurring.  The correct solution to resolve this is still being discussed.
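
The race described in this comment, and the approach recorded in the Doc Text field (merging identical requests while the first is still in progress), can be illustrated with a small model. This is a sketch only and not Pacemaker's implementation: the FenceDaemon class, its method names, and the use of asyncio are all assumptions made for illustration.

```python
import asyncio


class FenceDaemon:
    """Toy stand-in for a fencing daemon (hypothetical, not stonith-ng code)."""

    def __init__(self, merge_duplicates):
        self.merge_duplicates = merge_duplicates
        self.in_progress = {}   # (target, action) -> task that is still running
        self.device_calls = []  # one entry per real fence device invocation

    async def _run_device(self, target, action):
        self.device_calls.append((target, action))
        await asyncio.sleep(0.01)  # stands in for the slow, async device operation
        return "success"

    async def request_fence(self, client, target, action="reboot"):
        key = (target, action)
        if self.merge_duplicates and key in self.in_progress:
            # An identical request is already running: share its result
            # instead of triggering the device a second time.
            return await self.in_progress[key]
        task = asyncio.create_task(self._run_device(target, action))
        self.in_progress[key] = task
        try:
            return await task
        finally:
            # Only clear the entry if it still refers to our own task.
            if self.in_progress.get(key) is task:
                del self.in_progress[key]


async def simulate(merge_duplicates):
    daemon = FenceDaemon(merge_duplicates)
    # Two clients react to the same membership loss at nearly the same time.
    results = await asyncio.gather(
        daemon.request_fence("cman", "node03"),
        daemon.request_fence("crmd", "node03"),
    )
    return results, len(daemon.device_calls)


def run(merge_duplicates):
    return asyncio.run(simulate(merge_duplicates))


if __name__ == "__main__":
    print("without merging:", run(False))
    print("with merging:   ", run(True))
```

In this model both clients still receive a result for their request; only the number of device invocations changes. A request that arrives after the first operation has completed is treated as new and triggers the device again.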

Comment 5 Andrew Beekhof 2012-03-23 21:05:58 UTC

NACK for 6.3, better to be strictly correct and keep the data safe.
We'd likely introduce more corner cases than we solve.

Rather than rush this in, we'll take our time for 6.4 and make sure we're not leaving any holes.

Comment 9 michal novacek 2012-11-29 11:35:26 UTC

Created attachment 654205
/var/log/messages snippet

It shows that both fenced (line 8) and stonith-ng (line 9) fence the node.

Comment 10 michal novacek 2012-11-30 14:03:17 UTC

buggy version:

node01$ rpm -q pacemaker cman
pacemaker-1.1.7-6.el6.x86_64
cman-3.0.12.1-32.el6.x86_64

node01$ crm status
============
Last updated: Mon Nov 26 05:53:06 2012
Last change: Mon Nov 26 05:40:06 2012 via cibadmin on c3-node01
Stack: cman
Current DC: c3-node01 - partition with quorum
Version: 1.1.7-6.el6-148fccfd5985c5590cc601123c6c16e966b85d14
3 Nodes configured, unknown expected votes
1 Resources configured.
============

Online: [ c3-node01 c3-node02 c3-node03 ]

 virt-fencing   (stonith:fence_xvm):    Started c3-node01

node01$ fence_node c3-node03
fence c3-node03 success

...and on the virtual host I see that the node is fenced multiple times:

host$ fence_virtd -F -d 2
...
Got virbr0 for interface
Request 2 seqno 925456 domain c3-node03
Plain TCP request
Request 2 seqno 925456 src 192.168.122.202 target c3-node03
Rebooting domain c3-node03...
[REBOOT] Calling virDomainDestroy(0xef1270)
Domain has been shut off
Calling virDomainCreateLinux()...
Request 2 seqno 88574 domain c3-node03
Plain TCP request
Request 2 seqno 88574 src 192.168.122.202 target c3-node03
Rebooting domain c3-node03...
[REBOOT] Calling virDomainDestroy(0xef13a0)
Domain has been shut off
Calling virDomainCreateLinux()...
Request 2 seqno 195177 domain c3-node03
Plain TCP request
Request 2 seqno 195177 src 192.168.122.202 target c3-node03
Rebooting domain c3-node03...
[REBOOT] Calling virDomainDestroy(0xef15a0)
Domain has been shut off
Calling virDomainCreateLinux()...

-----
patched version

node01$ rpm -q pacemaker cman 
pacemaker-1.1.8-4.el6.x86_64
cman-3.0.12.1-46.el6.x86_64

node01$ crm_mon -1
Last updated: Tue Nov 27 04:37:52 2012
Last change: Mon Nov 26 08:54:17 2012 via crmd on c3-node02
Stack: cman
Current DC: c3-node02 - partition with quorum
Version: 1.1.8-4.el6-394e906
3 Nodes configured, unknown expected votes
1 Resources configured.

Online: [ c3-node01 c3-node02 c3-node03 ]

 virt-fencing   (stonith:fence_xvm):    Started c3-node02

node01$ fence_node c3-node03
fence c3-node03 success

...and on the virtual host I see that the node is fenced only once:

host$ fence_virtd -F -d 2
...
Got virbr0 for interface
Request 2 seqno 463322 domain c3-node03
Plain TCP request
Request 2 seqno 463322 src 192.168.122.202 target c3-node03
Rebooting domain c3-node03...
[REBOOT] Calling virDomainDestroy(0x1939270)
Domain has been shut off
Calling virDomainCreateLinux()...

Comment 11 Andrew Beekhof 2012-12-04 02:59:02 UTC

(In reply to comment #9)
> Created attachment 654205
> /var/log/messages snippet
> 
> It shows that both fenced (line 8) and stonith-ng (line 9) fence the node.

Not really, fenced is using the fence_pcmk device.
All this does is tell stonith-ng that the node needs to be shot.

So the node is only shot once, you're just seeing logs from two different subsystems regarding the same event.

Comment 12 michal novacek 2012-12-04 13:25:30 UTC

Created attachment 657504
/var/log/messages snippet

Comment 13 michal novacek 2012-12-04 13:34:17 UTC

I got confused and previously attached the log from the corrected version of pacemaker. Attachment 657504 is the one that shows the incorrect behaviour.

Comment 14 Andrew Beekhof 2012-12-05 03:40:55 UTC

David, looks like we're actually fencing the node three times.

Comment 15 Jaroslav Kortus 2012-12-05 12:36:49 UTC

That was most likely due to the fence_node command. Previously I was testing with something that shuts down the node (pkill -9 corosync, halt -fin, panic or similar).

It now looks like fence_node lets the cluster know about the fencing. IIRC this was not done before (or at least I could see fence_node shutting down the node and then plain cman fencing it once more after it realized it had lost the token).

Comment 16 David Vossel 2012-12-05 15:55:48 UTC

(In reply to comment #14)
> David, looks like we're actually fencing the node three times.

I'm confused and just want to clarify this.  What version of pacemaker is that log (657504) from?  I would not expect that behavior from the 1.1.8-4 release.

-- Vossel

Comment 17 michal novacek 2012-12-05 16:26:24 UTC

Comment on attachment 657504
/var/log/messages snippet

This is the behaviour encountered with pacemaker-1.1.7-6.

Comment 19 errata-xmlrpc 2013-02-21 09:51:03 UTC

Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHBA-2013-0375.html
