Bug 250688 - cman_tool status shows blank cluster name and Cluster Id of 0
Summary: cman_tool status shows blank cluster name and Cluster Id of 0
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: cman
Version: 5.0
Hardware: All
OS: Linux
Priority: high
Severity: high
Target Milestone: ---
Target Release: ---
Assignee: Christine Caulfield
QA Contact: Cluster QE
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2007-08-02 21:27 UTC by Bryn M. Reeves
Modified: 2018-10-19 19:41 UTC (History)

Fixed In Version: RHBA-2007-0575
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2007-11-07 16:59:47 UTC
Target Upstream Version:
Embargoed:


Attachments
cluster.conf for clu01 (1.70 KB, text/plain)
2007-08-02 21:27 UTC, Bryn M. Reeves
cluster.conf for clu02 (1.50 KB, text/plain)
2007-08-02 21:28 UTC, Bryn M. Reeves
Patch to fix (607 bytes, patch)
2007-08-16 08:16 UTC, Christine Caulfield


Links
System ID Private Priority Status Summary Last Updated
Red Hat Product Errata RHBA-2007:0575 0 normal SHIPPED_LIVE cman bug fix update 2007-10-31 12:26:24 UTC

Description Bryn M. Reeves 2007-08-02 21:27:37 UTC
Description of problem:
When two clusters run on a single LAN segment, cman_tool status displays an
empty cluster name field. Both clusters therefore end up with the same cluster
ID (0) and the same default multicast address, causing node membership to
"leak" from one cluster to the other.

For example, a two-node cluster (clu01):
# cman_tool status
Version: 6.0.1
Config Version: 12
Cluster Name:
Cluster Id: 0
Cluster Member: Yes
Cluster Generation: 8
Membership state: Cluster-Member
Nodes: 2
Expected votes: 1
Total votes: 2
Quorum: 1
Active subsystems: 5
Flags: 2node
Ports Bound: 0
Node name: clu01n01.example.com
Node ID: 1
Multicast addresses: 239.192.0.0
Node addresses: 10.0.0.1

And a three-node cluster on the same LAN (clu02):

# cman_tool status
Version: 6.0.1
Config Version: 3
Cluster Name:
Cluster Id: 0
Cluster Member: Yes
Cluster Generation: 12
Membership state: Cluster-Member
Nodes: 3
Expected votes: 3
Total votes: 3
Quorum: 2
Active subsystems: 6
Flags:
Ports Bound: 0
Node name: clu02n01.example.com
Node ID: 1
Multicast addresses: 239.192.0.0
Node addresses: 10.0.0.11
Version-Release number of selected component (if applicable):

Both clusters form correctly when only one is started at a time, but attempts
to start both simultaneously result in nodes joining the wrong clusters.

How reproducible:
Unclear; it has been reported twice but not yet reproduced.

Steps to Reproduce:
1. Configure a pair of clusters on a single LAN segment, e.g. with the above
addresses in a single 10.0.0.0/16 network.
2. Do not specify an explicit multicast address/port in cluster.conf
3. Allow both clusters to run at the same time
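
Step 2 can be checked mechanically. The following is a minimal sketch, assuming cluster.conf carries the cluster name as the name attribute of the root <cluster> element and an explicit multicast address as <cman><multicast addr="..."/></cman>. The sample XML and the helper name summarize_conf are hypothetical and are not taken from the attached files.

```python
import xml.etree.ElementTree as ET

# Hypothetical samples; NOT the contents of the attached cluster.conf files.
SAMPLE_DEFAULT = """<cluster name="example" config_version="1">
  <cman two_node="1" expected_votes="1"/>
</cluster>"""

SAMPLE_EXPLICIT = """<cluster name="example" config_version="1">
  <cman><multicast addr="239.192.10.1"/></cman>
</cluster>"""


def summarize_conf(conf_xml):
    """Return the cluster name and explicit multicast address (None if unset).

    Assumes the <cluster name=...> / <cman><multicast addr=...> layout
    described in the lead-in above.
    """
    root = ET.fromstring(conf_xml)
    mcast = root.find("./cman/multicast")
    return {
        "name": root.get("name"),
        "multicast": None if mcast is None else mcast.get("addr"),
    }


if __name__ == "__main__":
    for label, xml_text in (("default", SAMPLE_DEFAULT),
                            ("explicit", SAMPLE_EXPLICIT)):
        print(label, summarize_conf(xml_text))
```

A result with "multicast" set to None means the cluster relies on the default multicast address, which is the condition this step asks for.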
  
Actual results:
One or more nodes join the wrong cluster.

Expected results:
Nodes all join the correct cluster.

Additional info:

Comment 1 Bryn M. Reeves 2007-08-02 21:27:38 UTC
Created attachment 160553 [details]
cluster.conf for clu01

Comment 2 Bryn M. Reeves 2007-08-02 21:28:57 UTC
Created attachment 160554 [details]
cluster.conf for clu02

Comment 3 Christine Caulfield 2007-08-03 07:38:10 UTC
It seems that this happens if the cluster name is passed to the cman_tool
command-line:

# cman_tool join -c chrissie
# cman_tool status
Version: 6.0.1
Config Version: 39
Cluster Name: 
Cluster Id: 0
...

# cman_tool join
# cman_tool status
Version: 6.0.1
Config Version: 39
Cluster Name: chrissie
Cluster Id: 26347
...

Does that sound like what might be happening in this case?

The fix is simple and has been applied to CVS head:
Checking in cmanccs.c;
/cvs/cluster/cluster/cman/daemon/cmanccs.c,v  <--  cmanccs.c
new revision: 1.29; previous revision: 1.28
done
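
For illustration, the two outputs above are consistent with the cluster ID being derived from the cluster name. The sketch below assumes a 16-bit shift-and-add hash over the bytes of the name. That algorithm and the function name cluster_id_sketch are assumptions rather than something stated in this report, although the hash does reproduce the 26347 shown for "chrissie" and yields 0 when the name is empty.

```python
def cluster_id_sketch(name):
    """Hypothetical reconstruction of the name-to-ID mapping.

    Assumption: each byte of the name is folded in with a left shift by one
    and an add, and the result is truncated to 16 bits.
    """
    value = 0
    for byte in name.encode("ascii"):
        value = (value << 1) + byte
    return value & 0xFFFF


if __name__ == "__main__":
    for name in ("chrissie", ""):
        print(repr(name), "->", cluster_id_sketch(name))
```

Under this assumption, any cluster whose name is lost before the ID is computed gets ID 0, which would explain two unrelated clusters reporting the same ID.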



Comment 5 Christine Caulfield 2007-08-03 10:50:10 UTC
I've added this fix to the RHEL5 branch for 5.2.

Checking in cmanccs.c;
/cvs/cluster/cluster/cman/daemon/cmanccs.c,v  <--  cmanccs.c
new revision: 1.21.2.5; previous revision: 1.21.2.4
done


Comment 6 Issue Tracker 2007-08-03 12:16:59 UTC
In my case, the client is just booting up the cluster and letting it
automatically form the cluster in accordance with its cluster.conf.

Internal Status set to 'Waiting on Customer'
Status set to: Waiting on Client

This event sent from IssueTracker by mbelangia 
 issue 127532

Comment 7 Christine Caulfield 2007-08-03 12:40:59 UTC
But does the customer have CLUSTERNAME defined in /etc/sysconfig/cman?

That's what passes the cluster name to cman_tool join.
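
One way to answer this on a node is to look for an uncommented CLUSTERNAME assignment in /etc/sysconfig/cman. The following is a minimal sketch, assuming the file uses plain shell-style KEY=value lines; the helper name sysconfig_value is hypothetical.

```python
import shlex


def sysconfig_value(text, key):
    """Return the last uncommented KEY=value assignment in text, or None.

    Assumes simple shell-style lines; quoting and trailing comments are
    handled by shlex, but variable expansion is not.
    """
    found = None
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        name, sep, raw = line.partition("=")
        if sep and name.strip() == key:
            parts = shlex.split(raw, comments=True)
            found = parts[0] if parts else ""
    return found


if __name__ == "__main__":
    path = "/etc/sysconfig/cman"
    try:
        with open(path) as fh:
            print("CLUSTERNAME =", repr(sysconfig_value(fh.read(), "CLUSTERNAME")))
    except OSError as exc:
        print("could not read", path, "-", exc)
```

A result of None means no CLUSTERNAME is set in the file, so the name would not be passed on the cman_tool join command line from there.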


Comment 10 Kiersten (Kerri) Anderson 2007-08-03 14:10:36 UTC
Customer problem; setting the blocker flag for 5.1 so we pick up the fix.

Comment 11 Christine Caulfield 2007-08-06 10:27:28 UTC
Putting this back to ASSIGNED, as I'm pretty sure the change fixes the problem.
Now all we need is all the ACKs, I think.

Comment 15 Ken Wright 2007-08-16 03:03:30 UTC
Please provide the status of this bug, as I have been in a down state since 7/21/2007.

I am happy to assist by testing any patches that fix this problem prior to
official public release of the patch.

Regards,

Ken

Comment 16 Christine Caulfield 2007-08-16 08:16:00 UTC
Created attachment 161634 [details]
Patch to fix

Here's the patch that's in CVS head.

Comment 18 Christine Caulfield 2007-08-17 07:51:25 UTC
On the RHEL51 branch:

Checking in cmanccs.c;
/cvs/cluster/cluster/cman/daemon/cmanccs.c,v  <--  cmanccs.c
new revision: 1.21.2.4.2.1; previous revision: 1.21.2.4
done


Comment 24 errata-xmlrpc 2007-11-07 16:59:47 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2007-0575.html


