Bug 801355

Summary: cman+pacemaker leads to double fences
Product: Red Hat Enterprise Linux 6
Reporter: Jaroslav Kortus <jkortus>
Component: pacemaker
Assignee: David Vossel <dvossel>
Status: CLOSED ERRATA
QA Contact: Cluster QE <mspqa-list>
Severity: medium
Priority: high
Version: 6.3
CC: abeekhof, cluster-maint, dvossel, fdinitto, mnovacek
Target Milestone: rc
Hardware: Unspecified
OS: Unspecified
Fixed In Version: pacemaker-1.1.8-1.el6
Doc Type: Bug Fix
Doc Text:
Cause: Multiple parts of the system may notice a node failure at slightly different times. Consequence: If more than one component requests that the node be fenced, then the fencing component will do so multiple times. Fix: Merge identical requests from different clients if the first is still in progress. Result: The node is fenced only once.
Last Closed: 2013-02-21 09:51:03 UTC
Attachments:
crm_report output
/var/log/messages snippet
/var/log/messages snippet

Description Jaroslav Kortus 2012-03-08 10:58:00 UTC
Description of problem:
cman with fence_pcmk + pacemaker (in this case with virt fencing) fences the node twice if it fails.

This is most probably due to cman requesting a fence and pacemaker making its own fencing decision right after that.

Version-Release number of selected component (if applicable):
pacemaker-1.1.7-2.el6.x86_64
cman-3.0.12.1-27.el6.x86_64

How reproducible:
100%

Steps to Reproduce:
1. Set up a cman+pacemaker cluster with fence_pcmk and (for example) virt fencing
2. ssh to one node and issue halt -fin
3. Wait for the node to get fenced and observe that it gets fenced twice in a row (example below)
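
For example, with the virt fencing setup from the description (host and node names are illustrative), the reproduction looks roughly like this:

# on the virtualization host, watch the incoming fencing requests
host$ fence_virtd -F -f /etc/fence_virt.conf -d 2

# from another cluster member, kill one node without a clean shutdown
node01$ ssh node03 'halt -fin'

# node03 is rebooted by fence_xvm; with the affected packages a second
# reboot request for the same node arrives shortly afterwards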
  
Actual results:
node fenced twice

Expected results:
one fence

Additional info:
config snips:
cluster.conf:
		<clusternode name="node01" nodeid="1" votes="1">
			<fence>
				<method name="pcmk-redirect">
					<device name="pcmk" port="node01"/>
				</method>
			</fence>
		</clusternode>
	<fencedevices>
		<fencedevice agent="fence_pcmk" name="pcmk"/>
	</fencedevices>
crm:
primitive virt-fencing stonith:fence_xvm \
	params pcmk_host_check="static-list" pcmk_host_list="node01,node02,node03" action="reboot" debug="1"


You can see the double fence via the virsh console or by manually running fence_virtd (fence_virtd -F -f /etc/fence_virt.conf -d 2):

Request 2 seqno 367654 domain node03
Plain TCP request
Request 2 seqno 367654 src 192.168.100.101 target node03
Rebooting domain node03...
[REBOOT] Calling virDomainDestroy(0x233cc30)
Domain has been shut off
Calling virDomainCreateLinux()...
Request 2 seqno 623812 domain node03
Plain TCP request
Request 2 seqno 623812 src 192.168.100.101 target node03
Rebooting domain node03...
[REBOOT] Calling virDomainDestroy(0x233cd60)
Domain has been shut off
Calling virDomainCreateLinux()...

Comment 1 Jaroslav Kortus 2012-03-08 10:59:07 UTC
Created attachment 568590 [details]
crm_report output

output of crm_report after node03 was fenced

Comment 3 David Vossel 2012-03-12 22:26:03 UTC
(In reply to comment #1)
> Created attachment 568590 [details]
> crm_report output
> 
> output of crm_report after node03 was fenced

I set up your environment and can see the problem.  I'm working on tracking it down.

Comment 4 David Vossel 2012-03-23 17:51:44 UTC
I was able to track down the cause of this issue last week.  Here is what I found.

Both cman and crmd are alerted to the loss of membership of a node at the same time.  This results in both of these processes scheduling the node to be fenced independently of one another.  Since the remote operation asking stonith to fence the node has an asynchronous response, neither cman nor crmd can be aware that the other has scheduled a fence operation for the same node at nearly the same time.

To be clear, this is not a case of crmd trying to fence a node because it detected the node was down as a result of someone else fencing it earlier.  It is the result of both crmd and cman scheduling the fencing of the node at the same time, in response to the same event within the cluster.

I have a patch that resolves this issue specifically, but now that I understand how this situation occurs I am not satisfied that it will prevent other similar situations from occurring.  The correct solution to resolve this is still being discussed.
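
The effect is roughly as if two stonith clients issued the same reboot request for the node at nearly the same time, e.g. (node name illustrative):

node01$ stonith_admin --reboot node03 &
node01$ stonith_admin --reboot node03 &

Today each such request would be executed independently; the direction described in the Doc Text above is to recognise the second, identical request as a duplicate of the one already in progress and merge them, so the node is rebooted only once.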

Comment 5 Andrew Beekhof 2012-03-23 21:05:58 UTC
NACK for 6.3, better to be strictly correct and keep the data safe.
We'd likely introduce more corner cases than we solve.

Rather than rush this in, we'll take our time for 6.4 and make sure we're not leaving any holes.

Comment 9 michal novacek 2012-11-29 11:35:26 UTC
Created attachment 654205 [details]
/var/log/messages snippet

It shows that both fenced (line 8) and stonith-ng (line 9) fence the node.

Comment 10 michal novacek 2012-11-30 14:03:17 UTC
buggy version:

node01$ rpm -q pacemaker cman
pacemaker-1.1.7-6.el6.x86_64
cman-3.0.12.1-32.el6.x86_64

node01$ crm status
============
Last updated: Mon Nov 26 05:53:06 2012
Last change: Mon Nov 26 05:40:06 2012 via cibadmin on c3-node01
Stack: cman
Current DC: c3-node01 - partition with quorum
Version: 1.1.7-6.el6-148fccfd5985c5590cc601123c6c16e966b85d14
3 Nodes configured, unknown expected votes
1 Resources configured.
============

Online: [ c3-node01 c3-node02 c3-node03 ]

 virt-fencing   (stonith:fence_xvm):    Started c3-node01

node01$ fence_node c3-node03
fence c3-node03 success

...and on the virtual host I see that the node is fenced multiple times:

host$ fence_virtd -F -d 2
...
Got virbr0 for interface
Request 2 seqno 925456 domain c3-node03
Plain TCP request
Request 2 seqno 925456 src 192.168.122.202 target c3-node03
Rebooting domain c3-node03...
[REBOOT] Calling virDomainDestroy(0xef1270)
Domain has been shut off
Calling virDomainCreateLinux()...
Request 2 seqno 88574 domain c3-node03
Plain TCP request
Request 2 seqno 88574 src 192.168.122.202 target c3-node03
Rebooting domain c3-node03...
[REBOOT] Calling virDomainDestroy(0xef13a0)
Domain has been shut off
Calling virDomainCreateLinux()...
Request 2 seqno 195177 domain c3-node03
Plain TCP request
Request 2 seqno 195177 src 192.168.122.202 target c3-node03
Rebooting domain c3-node03...
[REBOOT] Calling virDomainDestroy(0xef15a0)
Domain has been shut off
Calling virDomainCreateLinux()...

-----
patched version:

node01$ rpm -q pacemaker cman 
pacemaker-1.1.8-4.el6.x86_64
cman-3.0.12.1-46.el6.x86_64

node01$ crm_mon -1
Last updated: Tue Nov 27 04:37:52 2012
Last change: Mon Nov 26 08:54:17 2012 via crmd on c3-node02
Stack: cman
Current DC: c3-node02 - partition with quorum
Version: 1.1.8-4.el6-394e906
3 Nodes configured, unknown expected votes
1 Resources configured.

Online: [ c3-node01 c3-node02 c3-node03 ]

 virt-fencing   (stonith:fence_xvm):    Started c3-node02

node01$ fence_node c3-node03
fence c3-node03 success

...and on the virtual host I see that the fence happens only once:

host$ fence_virtd -F -d 2
...
Got virbr0 for interface
Request 2 seqno 463322 domain c3-node03
Plain TCP request
Request 2 seqno 463322 src 192.168.122.202 target c3-node03
Rebooting domain c3-node03...
[REBOOT] Calling virDomainDestroy(0x1939270)
Domain has been shut off
Calling virDomainCreateLinux()...
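
As an additional cross-check (assuming the installed stonith_admin provides the history option), the fencing history recorded on the cluster side should likewise show a single reboot of the node:

node01$ stonith_admin --history c3-node03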

Comment 11 Andrew Beekhof 2012-12-04 02:59:02 UTC
(In reply to comment #9)
> Created attachment 654205 [details]
> /var/log/messages snippet
> 
> It shows that both fenced (lines 8) and stonith-ng (line 9) fence the node.

Not really, fenced is using the fence_pcmk device.
All this does is tell stonith-ng that the node needs to be shot.

So the node is only shot once; you're just seeing logs from two different subsystems regarding the same event.
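
For context, fence_pcmk is only a thin redirect shipped for the cman integration; the request it forwards is roughly equivalent to asking stonith-ng directly, e.g.:

node01$ stonith_admin --reboot c3-node03

which is why both fenced and stonith-ng log the event even though a single reboot is carried out.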

Comment 12 michal novacek 2012-12-04 13:25:30 UTC
Created attachment 657504 [details]
/var/log/messages snippet

Comment 13 michal novacek 2012-12-04 13:34:17 UTC
I somehow got confused and previously attached a log from the corrected version of pacemaker. Attachment 657504 is the one that shows the incorrect behaviour.

Comment 14 Andrew Beekhof 2012-12-05 03:40:55 UTC
David, looks like we're actually fencing the node three times.

Comment 15 Jaroslav Kortus 2012-12-05 12:36:49 UTC
That was most likely due to the fence_node command. Previously I was testing with something that shuts the node down (pkill -9 corosync, halt -fin, panic, or similar).

Now it looks like fence_node lets the cluster know about the fencing. IIRC this was not done before (or at least I could see fence_node shutting down the node, and then plain cman fenced it once more after it realized it had lost the token).

Comment 16 David Vossel 2012-12-05 15:55:48 UTC
(In reply to comment #14)
> David, looks like we're actually fencing the node three times.

I'm confused and just want to clarify this.  What version of pacemaker is that log (657504) from?  I would not expect that to be the behavior of the 1.1.8-4 release.

-- Vossel

Comment 17 michal novacek 2012-12-05 16:26:24 UTC
Comment on attachment 657504 [details]
/var/log/messages snippet

This is the behaviour encountered with pacemaker-1.1.7-6.

Comment 19 errata-xmlrpc 2013-02-21 09:51:03 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHBA-2013-0375.html