Bug 144170 - Regain quorum: "fencing deferred to 4294967295"
Status: CLOSED CURRENTRELEASE
Product: Red Hat Cluster Suite
Classification: Red Hat
Component: fence
Version: 4
Hardware: All Linux
Priority: medium
Severity: medium
Assigned To: Adam "mantis" Manthei
QA Contact: Cluster QE
Reported: 2005-01-04 16:44 EST by Derek Anderson
Modified: 2009-04-16 15:56 EDT
Doc Type: Bug Fix
Last Closed: 2005-01-11 14:37:27 EST

Description Derek Anderson 2005-01-04 16:44:54 EST
Description of problem:
Establish quorum in a 3-node cluster (link-10, link-11, link-12).  Remove
two of the nodes (link-10, link-12) so that link-11 loses quorum and goes
into "Activity blocked" mode.  Bring link-10 back with: ccsd;
cman_tool join; fence_tool join.

When link-10 joins the fence domain this appears in the messages file:
Jan  4 15:39:00 link-10 fenced[2489]: fencing deferred to 4294967295

###
### link-11's view of the inquorate cluster:
###
[root@link-11 root]# cat /proc/cluster/nodes
Node  Votes Exp Sts  Name
   1    1    3   X   link-10
   2    1    3   M   link-11
   3    1    3   X   link-12
[root@link-11 root]# cat /proc/cluster/services
Service          Name                              GID LID State     Code
Fence Domain:    "default"                           1   2 run       -
[2]

DLM Lock Space:  "clvmd"                             2   3 run       -
[2]

[root@link-11 root]# cat /proc/cluster/status
Protocol version: 4.0.1
Config version: 1
Cluster name: MILTON
Cluster ID: 4812
Membership state: Cluster-Member
Nodes: 1
Expected_votes: 3
Total_votes: 1
Quorum: 2  Activity blocked
Active subsystems: 3
Node addresses: 192.168.44.161

###
### link-11's view of the cluster after link-10 rejoins
###
[root@link-11 root]# cat /proc/cluster/nodes
Node  Votes Exp Sts  Name
   1    1    3   M   link-10
   2    1    3   M   link-11
   3    1    3   X   link-12
[root@link-11 root]# cat /proc/cluster/services
Service          Name                              GID LID State     Code
Fence Domain:    "default"                           1   2 run       -
[2 1]

DLM Lock Space:  "clvmd"                             2   3 run       -
[2]

[root@link-11 root]# cat /proc/cluster/status
Protocol version: 4.0.1
Config version: 1
Cluster name: MILTON
Cluster ID: 4812
Membership state: Cluster-Member
Nodes: 2
Expected_votes: 3
Total_votes: 2
Quorum: 2
Active subsystems: 3
Node addresses: 192.168.44.161
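
The Expected_votes, Total_votes and Quorum values in the two status dumps above are consistent with a simple-majority rule. The sketch below assumes that rule (quorum is more than half of the expected votes); the function names are hypothetical and this is not taken from the cman source.

```c
#include <stdbool.h>

/* Assumption: quorum is a simple majority of the expected votes.
 * With Expected_votes = 3 this gives 2, matching "Quorum: 2" above. */
static unsigned int quorum_threshold(unsigned int expected_votes)
{
    return expected_votes / 2 + 1;
}

/* The cluster is quorate when the votes present reach the threshold. */
static bool is_quorate(unsigned int total_votes, unsigned int expected_votes)
{
    return total_votes >= quorum_threshold(expected_votes);
}
```

Under that assumption, is_quorate(1, 3) is false (link-11 alone, "Activity blocked") and is_quorate(2, 3) is true (after link-10 rejoins).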

Version-Release number of selected component (if applicable):
Latest 6.1 RPMS built Wed 15 Dec 2004 01:13:08 PM CST

How reproducible:
First time this has been noticed

Steps to Reproduce:
1. 3 nodes quorate and joined to fence domain
2. Remove 2 nodes simultaneously
3. Join one of the two back to regain quorum and rejoin fence domain
  
Actual results:
fencing deferred to an unknown node number (4294967295).

Expected results:
fencing action assumed by one of the now quorate cluster members.

Additional info:
Comment 1 David Teigland 2005-01-06 10:51:38 EST
Fixing this is as simple as printing the node name instead of the node
number.  In this situation the node number hasn't been set yet
(4294967295 is -1 printed as an unsigned 32-bit value) but the node
name is available.
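
A minimal sketch of why an unset node number shows up as 4294967295: converting -1 to a 32-bit unsigned type wraps to 2^32 - 1. NODEID_UNSET and format_nodeid_unsigned are hypothetical names for illustration, not identifiers from fenced.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical sentinel meaning "node id not assigned yet". */
#define NODEID_UNSET (-1)

/* Format a node id as an unsigned 32-bit value, as an unsigned
 * format specifier in a log call would. Returns snprintf's result. */
static int format_nodeid_unsigned(char *buf, size_t len, int nodeid)
{
    /* Conversion to uint32_t is modulo 2^32, so -1 becomes 4294967295. */
    uint32_t as_unsigned = (uint32_t)nodeid;
    return snprintf(buf, len, "fencing deferred to %" PRIu32, as_unsigned);
}
```

Calling format_nodeid_unsigned(buf, sizeof buf, NODEID_UNSET) fills buf with the same text seen in the messages file.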
Comment 2 David Teigland 2005-01-07 11:13:53 EST
In the cases where we'd print -1, we actually don't know which node
we're deferring to; we only know it's not us.  Now we print the
node name if we know it, or "prior member" if we don't.

/cvs/cluster/cluster/fence/fenced/recover.c,v  <--  recover.c
new revision: 1.10.2.1; previous revision: 1.10
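
The behaviour described in this comment could be sketched roughly as follows. This is an illustration only, not the code from recover.c: format_deferred_msg and its master_name parameter are hypothetical, and a NULL or empty name stands in for "we don't know the node".

```c
#include <stdio.h>

/* Hypothetical helper: build the "fencing deferred" log text.
 * master_name is NULL (or empty) when the deferring node does not
 * know which member will perform the fencing. */
static int format_deferred_msg(char *buf, size_t len, const char *master_name)
{
    const char *who = (master_name != NULL && master_name[0] != '\0')
                          ? master_name
                          : "prior member";
    return snprintf(buf, len, "fencing deferred to %s", who);
}
```

With a known name such as "link-11" the message names that node; with NULL it falls back to "prior member", which is the form verified in the next comment.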
Comment 3 Derek Anderson 2005-01-11 14:37:27 EST
Verified.  Log message now looks like:

Jan 11 13:33:32 link-12 fenced[2634]: fencing deferred to prior member

[root@link-11 root]# fenced -V
fenced 1.7. (built Jan 10 2005 16:22:11)
