Bug 128571

Summary: Can't join fence domain b/c ccs_test connect fails
Product: [Retired] Red Hat Cluster Suite Reporter: Derek Anderson <danderso>
Component: gfs Assignee: Jonathan Earl Brassow <jbrassow>
Status: CLOSED CURRENTRELEASE QA Contact: Cluster QE <mspqa-list>
Severity: medium Docs Contact:
Priority: high    
Version: 4   
Target Milestone: ---   
Target Release: ---   
Hardware: All   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2004-07-27 14:34:42 UTC Type: ---

Description Derek Anderson 2004-07-26 16:04:35 UTC
Description of problem:
In a quorate 3-node cluster, attempts to join the fence domain fail:

[root@link-11 root]# cat /proc/cluster/nodes
Node  Votes Exp Sts  Name
   1    1    3   M   link-10
   2    1    3   M   link-11
   3    1    3   M   link-12
[root@link-11 root]# fenced -D
Command Line Arguments:
  name = default
  debug = 1
fenced: fence_domain_add: init_nodes ccs error -1
[root@link-11 root]# ccs_test connect
ccs_connect failed: Connection refused
[root@link-11 root]# pidof ccsd
3377
[root@link-11 root]#

Version-Release number of selected component (if applicable):
[root@link-11 root]# fenced -V
fenced DEVEL.1090872651 (built Jul 26 2004 15:11:58)
Copyright (C) Red Hat, Inc.  2004  All rights reserved.
[root@link-11 root]# ccsd -V
ccsd DEVEL.1090872650 (built Jul 26 2004 15:11:54)
Copyright (C) Red Hat, Inc.  2004  All rights reserved.

Comment 1 Derek Anderson 2004-07-26 16:06:55 UTC
Raising priority.  This is a blocker for doing anything with a filesystem.

Comment 2 Derek Anderson 2004-07-26 16:40:44 UTC
I can work around this by running 'ccs_test connect force' on each node
before attempting to join the fence domain.

Comment 3 Jonathan Earl Brassow 2004-07-26 20:48:34 UTC
I was able to reproduce this by:
1. forming a quorate cluster
2. on a single node, do cman_tool leave; cman_tool join

The descriptor held on the cluster manager was becoming invalid.  I now
close the descriptor on cman shutdown and attempt to reconnect when cman
becomes available again.

If this is truly the fix for the problem, it may also address bug 128569.
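
The close-and-reconnect approach described in this comment could look roughly like the sketch below. It is not the ccsd code: `struct cm_link`, `cm_link_drop`, `cm_link_ensure`, and the `stub_connect` stand-in (which just opens /dev/null) are hypothetical names invented for illustration, and the real connection to cman is not shown.

```c
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical state: fd is -1 whenever no connection is held. */
struct cm_link {
    int fd;
};

/* A connector returns a usable descriptor, or -1 if the peer is down. */
typedef int (*cm_connect_fn)(void);

/* Stand-in connector for this sketch only; it does not talk to cman. */
int stub_cman_up = 1;

int stub_connect(void)
{
    return stub_cman_up ? open("/dev/null", O_RDONLY) : -1;
}

/* Call when the cluster manager shuts down: close the descriptor and
 * mark it unused so that a stale value is never reused. */
void cm_link_drop(struct cm_link *l)
{
    if (l->fd >= 0) {
        close(l->fd);
        l->fd = -1;
    }
}

/* Call before each use: reconnect if no descriptor is currently held.
 * Returns 0 when a descriptor is available, -1 otherwise. */
int cm_link_ensure(struct cm_link *l, cm_connect_fn do_connect)
{
    if (l->fd >= 0)
        return 0;
    l->fd = do_connect();
    return (l->fd >= 0) ? 0 : -1;
}
```

Resetting the field to -1 on shutdown means a later caller can tell "no connection" apart from a descriptor that still holds a number but no longer refers to a live connection.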

Comment 4 Jonathan Earl Brassow 2004-07-26 22:25:12 UTC
OK, I needed to call
FD_ZERO(&rset);
before populating the set.

This appears to be what was causing this bug, as well as bug 128569.
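
For reference, a minimal sketch of the usual select() pattern, with the set cleared before any descriptor is added. Only the name `rset` and the FD_ZERO call come from this comment; the helper `wait_readable` and its timeout parameter are hypothetical and are not taken from the ccsd source.

```c
#include <sys/select.h>
#include <unistd.h>

/* Hypothetical helper: wait up to timeout_ms for fd to become readable.
 * Returns 1 if readable, 0 on timeout, -1 on error. */
int wait_readable(int fd, int timeout_ms)
{
    fd_set rset;
    struct timeval tv;
    int n;

    if (fd < 0 || fd >= FD_SETSIZE)
        return -1;

    /* An automatic fd_set starts with indeterminate contents, so it
     * must be cleared before FD_SET; otherwise select() may be asked
     * to watch descriptors that were never added. */
    FD_ZERO(&rset);
    FD_SET(fd, &rset);

    tv.tv_sec = timeout_ms / 1000;
    tv.tv_usec = (timeout_ms % 1000) * 1000;

    n = select(fd + 1, &rset, NULL, NULL, &tv);
    if (n < 0)
        return -1;
    return (n > 0 && FD_ISSET(fd, &rset)) ? 1 : 0;
}
```

Because select() overwrites the set on return, code that calls it in a loop needs to repeat the FD_ZERO and FD_SET steps on every iteration.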


Comment 5 Derek Anderson 2004-07-27 14:34:42 UTC
Works now.

Comment 6 Kiersten (Kerri) Anderson 2004-11-16 19:10:18 UTC
Updating version to the right level in the defects.  Sorry for the storm.