Bug 472782

Summary: Master in qdisk does not win and both nodes are fenced off in race condition
Product: [Retired] Red Hat Cluster Suite
Reporter: Shane Bradley <sbradley>
Component: cman
Assignee: Lon Hohberger <lhh>
Status: CLOSED WONTFIX
QA Contact: Cluster QE <mspqa-list>
Severity: medium
Priority: medium
Version: 4
CC: cluster-maint, dash, iannis, jko, rbinkhor, tao
Hardware: All
OS: Linux
Doc Type: Bug Fix
Last Closed: 2010-05-11 17:05:21 UTC
Attachments:
Description            Flags
sosreport for node1    none
sosreport for node2    none

Description Shane Bradley 2008-11-24 16:06:38 UTC

A cluster environment set up with a qdisk heuristic goes into a fence race
if the heartbeat link goes down (cable unplugged).

There is clearly only 1 master in this configuration. However, the
master does not win and both nodes fence each other off.

--------------------------------------------------------------------------------

Nov 14 11:44:08 pe1950-3 qdiskd[6193]: <info> Assuming master role
Nov 14 12:00:57 pe1950-3 qdiskd[6193]: <notice> Writing eviction notice for node 2

Nov 14 11:44:03 pe1950-4 qdiskd[5857]: <notice> Score sufficient for master operation (1/1; required=1); upgrading
Nov 14 11:44:09 pe1950-4 qdiskd[5857]: <info> Node 1 is the master
Nov 14 12:08:45 pe1950-4 qdiskd[5605]: <info> Quorum Daemon Initializing

----------------------------------------------------------------------------------

Nov 14 11:43:54 pe1950-3 fenced: startup succeeded
Nov 14 12:00:42 pe1950-3 fenced[6203]: pe1950-4-hb not a cluster member after 100 sec post_fail_delay
Nov 14 12:00:42 pe1950-3 fenced[6203]: fencing node "pe1950-4-hb"
Nov 14 12:00:51 pe1950-3 fenced[6203]: fence "pe1950-4-hb" success

Nov 14 11:43:54 pe1950-4 fenced: startup succeeded
Nov 14 12:00:42 pe1950-4 fenced[5867]: pe1950-3-hb not a cluster member after 100 sec post_fail_delay
Nov 14 12:00:42 pe1950-4 fenced[5867]: fencing node "pe1950-3-hb"
Nov 14 12:09:06 pe1950-4 fenced[5615]: pe1950-3-hb not a cluster member after 3 sec post_join_delay
Nov 14 12:09:06 pe1950-4 fenced[5615]: fencing node "pe1950-3-hb"
Nov 14 12:09:17 pe1950-4 fenced[5615]: fence "pe1950-3-hb" success
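
The mutual fencing in the fenced messages above (both nodes start fencing each other at 12:00:42) can be spotted mechanically. The following is a minimal sketch, not a supported tool: the function name `find_fence_races` is hypothetical, and it assumes that a node name is the syslog hostname plus a `-hb` suffix, as in these logs.

```python
import re
from collections import defaultdict

# Matches syslog lines such as:
#   Nov 14 12:00:42 pe1950-3 fenced[6203]: fencing node "pe1950-4-hb"
FENCE_RE = re.compile(
    r'^(?P<ts>\w{3}\s+\d+\s+\d{2}:\d{2}:\d{2})\s+(?P<host>\S+)\s+'
    r'fenced\[\d+\]:\s+fencing node "(?P<target>[^"]+)"'
)


def find_fence_races(lines, hb_suffix="-hb"):
    """Return sorted timestamps at which two hosts began fencing each other.

    Assumption: the fenced node name is the hostname plus hb_suffix.
    """
    attempts = defaultdict(set)  # timestamp -> {(host, target_host)}
    for line in lines:
        m = FENCE_RE.match(line.strip())
        if not m:
            continue
        target = m.group("target")
        if hb_suffix and target.endswith(hb_suffix):
            target = target[: -len(hb_suffix)]
        attempts[m.group("ts")].add((m.group("host"), target))

    races = []
    for ts, pairs in attempts.items():
        # A race is a pair (A fences B) with a matching (B fences A).
        if any(h != t and (t, h) in pairs for h, t in pairs):
            races.append(ts)
    return sorted(races)


if __name__ == "__main__":
    sample = [
        'Nov 14 12:00:42 pe1950-3 fenced[6203]: fencing node "pe1950-4-hb"',
        'Nov 14 12:00:42 pe1950-4 fenced[5867]: fencing node "pe1950-3-hb"',
        'Nov 14 12:09:06 pe1950-4 fenced[5615]: fencing node "pe1950-3-hb"',
    ]
    print(find_fence_races(sample))  # ['Nov 14 12:00:42']
```

The second attempt at 12:09:06 is not reported because only one node is fencing at that point.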

--------------------------------------------------------------------------------
Nov 14 12:01:25 pe1950-3 qdiskd[6193]: <crit> Node 2 is undead.
Nov 14 12:01:25 pe1950-3 qdiskd[6193]: <alert> Writing eviction notice for node 2
Nov 14 12:01:26 pe1950-3 root: Time Stamp: Fri Nov 14 12:01:25 2008 Node ID: 1 Score: 1/1 (Minimum required = 1) Current state: Master Initializing Set: { } Visible Set: {1 } Master Node ID: 1 Quorate Set: { 1 }
Nov 14 12:01:26 pe1950-3 qdiskd[6193]: <crit> Node 2 is undead.
Nov 14 12:01:26 pe1950-3 qdiskd[6193]: <alert> Writing eviction notice for node 2

Nov 14 12:00:41 pe1950-4 root: Time Stamp: Fri Nov 14 12:00:40 2008 Node ID: 2 Score: 1/1 (Minimum required = 1) Current state: Running Initializing Set: { } Visible Set: { 1 2 } Master Node ID: 1 Quorate Set: { 1 }
Nov 14 12:00:42 pe1950-4 fenced[5867]: pe1950-3-hb not a cluster member after 100 sec post_fail_delay
Nov 14 12:00:42 pe1950-4 fenced[5867]: fencing node "pe1950-3-hb"
Nov 14 12:00:42 pe1950-4 ccsd[5792]: Cluster is not quorate.  Refusing connection.
Nov 14 12:00:42 pe1950-4 ccsd[5792]: Error while processing connect: Connection refused


Reproducible: Always

Steps to Reproduce:
1. Set up a cluster with qdisk.
2. Shut down the heartbeat network on both nodes at the same time.

Actual Results:  
Both nodes try to fence each other off.

Expected Results:  
Both nodes should see that the network is down.
The qdisk master should fence the other node to prevent a race condition.

This issue looks identical to the following bug filed against RHEL 5:
https://bugzilla.redhat.com/show_bug.cgi?id=372901

Comment 1 Shane Bradley 2008-11-24 16:08:29 UTC
Created attachment 324493 [details]
sosreport for node1

Comment 2 Shane Bradley 2008-11-24 16:09:05 UTC
Created attachment 324494 [details]
sosreport for node2

Comment 4 Lon Hohberger 2009-05-11 19:04:17 UTC
First of all, this is a feature request.  While I believe this is a reasonable
course of action, qdiskd currently has no master-wins behavior in its feature
set when no heuristics are present.

The only way to do this cleanly is to interrupt the fencing operation in the
non-master node.

Since CMAN decides on a new membership view before the fencing operation takes
place, the only method to ensure this works is to notify qdiskd that CMAN has
decided to fence and to have qdiskd do something based on:

- whether or not a master exists
- whether or not the other node exists, and
- if a master exists, which node is master
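
The three inputs listed above can be sketched as a small decision function. This is a hypothetical illustration only: `QdiskView`, `should_fence`, and the policy that a non-master stands down are assumptions made for the example, not existing qdiskd or CMAN behavior.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class QdiskView:
    """Hypothetical snapshot of what one node sees on the quorum disk."""
    local_id: int
    master_id: Optional[int]  # None if no master is currently elected
    peer_visible: bool        # True if the other node still updates the qdisk


def should_fence(view: QdiskView) -> bool:
    """Decide whether this node may carry out a fence that CMAN requested.

    Assumed policy (not current qdiskd behavior):
    - peer not visible on the qdisk: it is really gone, so fence it;
    - no master exists: nothing breaks the tie, so fence as usual;
    - master exists and peer is visible: only the master fences.
    """
    if not view.peer_visible:
        return True
    if view.master_id is None:
        return True
    return view.master_id == view.local_id


if __name__ == "__main__":
    # State from the logs above: node 1 is master, both nodes see each other.
    node1 = QdiskView(local_id=1, master_id=1, peer_visible=True)
    node2 = QdiskView(local_id=2, master_id=1, peer_visible=True)
    print(should_fence(node1), should_fence(node2))  # True False
```

Under this assumed policy, node 2 would have to abort or delay its fence, which is the "interrupt the fencing operation in the non-master node" step described above.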

Some possible solutions as well as a workaround are here:

https://bugzilla.redhat.com/show_bug.cgi?id=372901#c7

Since administrators cannot control which node is the qdiskd master (nor will
this be an option), a workaround that causes a node to hang will provide more
predictable behavior in a network partition than an implementation of
master-wins.

Comment 7 Lon Hohberger 2009-07-13 13:39:45 UTC
https://bugzilla.redhat.com/show_bug.cgi?id=372901#c9

^^ simple design