Bug 128571 - Can't join fence domain b/c ccs_test connect fails
Status: CLOSED CURRENTRELEASE
Product: Red Hat Cluster Suite
Classification: Red Hat
Component: gfs
Version: 4
Hardware: All Linux
Priority: high
Severity: medium
Assigned To: Jonathan Earl Brassow
QA Contact: Cluster QE
Reported: 2004-07-26 12:04 EDT by Derek Anderson
Modified: 2010-01-11 21:55 EST
Doc Type: Bug Fix
Last Closed: 2004-07-27 10:34:42 EDT
Description Derek Anderson 2004-07-26 12:04:35 EDT
Description of problem:
In a quorate 3-node cluster, attempts to join the fence domain fail:

[root@link-11 root]# cat /proc/cluster/nodes
Node  Votes Exp Sts  Name
   1    1    3   M   link-10
   2    1    3   M   link-11
   3    1    3   M   link-12
[root@link-11 root]# fenced -D
Command Line Arguments:
  name = default
  debug = 1
fenced: fence_domain_add: init_nodes ccs error -1
[root@link-11 root]# ccs_test connect
ccs_connect failed: Connection refused
[root@link-11 root]# pidof ccsd
3377
[root@link-11 root]#

Version-Release number of selected component (if applicable):
[root@link-11 root]# fenced -V
fenced DEVEL.1090872651 (built Jul 26 2004 15:11:58)
Copyright (C) Red Hat, Inc.  2004  All rights reserved.
[root@link-11 root]# ccsd -V
ccsd DEVEL.1090872650 (built Jul 26 2004 15:11:54)
Copyright (C) Red Hat, Inc.  2004  All rights reserved.

Comment 1 Derek Anderson 2004-07-26 12:06:55 EDT
Raising priority.  This is a blocker for doing anything with a filesystem.
Comment 2 Derek Anderson 2004-07-26 12:40:44 EDT
I can work around this by running 'ccs_test connect force' on each node
before attempting to join the fence domain.
Comment 3 Jonathan Earl Brassow 2004-07-26 16:48:34 EDT
I was able to reproduce this by:
1. forming a quorate cluster
2. on a single node, do cman_tool leave; cman_tool join

The descriptor held on the cluster manager was becoming invalid.  Now I 
close the descriptor on cman shutdown and attempt to reconnect when it 
becomes available again.

If this is truly the fix for the problem, it may also address bug 128569.
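
For readers following along, the close-and-reconnect approach described in this comment could look roughly like the sketch below. This is not the ccsd source: `struct cm_link`, `cm_link_on_shutdown`, `cm_link_ensure`, and the `demo_open` stand-in are all hypothetical names made up for illustration, and the real daemon presumably connects to the cluster manager through its own interface.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical connection state; names are illustrative, not from ccsd. */
struct cm_link {
    int fd;                /* -1 while disconnected */
    int (*open_fn)(void);  /* returns a new descriptor, or -1 if unavailable */
};

/* Call when the cluster manager shuts down: drop the stale descriptor
 * instead of continuing to use it. */
void cm_link_on_shutdown(struct cm_link *l)
{
    assert(l != NULL);
    if (l->fd >= 0) {
        close(l->fd);
        l->fd = -1;
    }
}

/* Call once per main-loop pass: try to reconnect if disconnected.
 * Returns 1 if a descriptor is held afterwards, 0 otherwise. */
int cm_link_ensure(struct cm_link *l)
{
    assert(l != NULL);
    if (l->fd < 0)
        l->fd = l->open_fn();
    return l->fd >= 0;
}

/* Stand-in opener for demonstration only: succeeds while demo_available
 * is nonzero. A real opener would connect to the cluster manager. */
int demo_available = 0;

int demo_open(void)
{
    return demo_available ? open("/dev/null", O_RDONLY) : -1;
}
```

Marking the descriptor as -1 on shutdown means a later leave/join cycle (cman_tool leave; cman_tool join) is picked up by the next call to cm_link_ensure rather than being served from an invalid descriptor.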
Comment 4 Jonathan Earl Brassow 2004-07-26 18:25:12 EDT
OK, I needed to call
FD_ZERO(&rset);
before populating the variable.

This appears to be what was causing this bug, as well as bug 128569.
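
To illustrate the class of bug: an fd_set declared on the stack starts with indeterminate contents, so select() may be asked to watch descriptors that were never added. A minimal sketch of the correct pattern follows. The helper `wait_readable` is a hypothetical name and is not taken from ccsd; only `rset` matches the variable mentioned above.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/select.h>
#include <sys/time.h>
#include <unistd.h>

/*
 * Hypothetical helper (not ccsd code): wait up to timeout_ms for fd to
 * become readable. Returns 1 if readable, 0 on timeout, -1 on error.
 */
int wait_readable(int fd, int timeout_ms)
{
    fd_set rset;
    struct timeval tv;
    int n;

    assert(fd >= 0 && fd < FD_SETSIZE);

    /* An automatic fd_set holds indeterminate bits; clear it before use. */
    FD_ZERO(&rset);
    FD_SET(fd, &rset);

    tv.tv_sec = timeout_ms / 1000;
    tv.tv_usec = (timeout_ms % 1000) * 1000;

    n = select(fd + 1, &rset, NULL, NULL, &tv);
    if (n < 0)
        return -1;
    return (n > 0 && FD_ISSET(fd, &rset)) ? 1 : 0;
}
```

Because select() overwrites the set with the descriptors that are ready, a loop that calls it repeatedly needs to rebuild the set (FD_ZERO, then FD_SET) on every pass, not just once at startup.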
Comment 5 Derek Anderson 2004-07-27 10:34:42 EDT
Works now.
Comment 6 Kiersten (Kerri) Anderson 2004-11-16 14:10:18 EST
Updating version to the right level in the defects.  Sorry for the storm.
