Bug 217461 - cman view of cluster nodes not updated when adding nodes via ccs_tool update command
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: cman
Version: 5.0
Hardware: All
OS: Linux
Priority: medium
Severity: medium
Target Milestone: ---
Assignee: Christine Caulfield
QA Contact: Cluster QE
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2006-11-27 23:39 UTC by Kiersten (Kerri) Anderson
Modified: 2009-04-16 22:29 UTC

Fixed In Version: RC
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2007-02-08 00:57:46 UTC
Target Upstream Version:
Embargoed:



Description Kiersten (Kerri) Anderson 2006-11-27 23:39:06 UTC
Description of problem:
cman does not resync its node information after a node is added by running
ccs_tool update with a new version of the cluster.conf file.

This causes a problem when starting the new node into the cluster.  If I copy
the new cluster.conf file to the new node, CCSD gets a version mismatch between
that file and the version the cluster was formed around.  The new node then
tries to form its own cluster around its version of the file, which causes its
fencing agent to hang.

[root@kanderso-xen-01 cluster]# diff cluster.conf.1node cluster.conf
2c2
< <cluster name="ka-xen-cluster" config_version="32">
---
> <cluster name="ka-xen-cluster" config_version="31">
31,35d30
<       <clusternode name="kanderso-xen-06.lab.msp.redhat.com" votes="1" nodeid="10">
<               <fence>
<                       <method name="1"><device name="xvm" domain="kanderso-xen-06"/></method>
<               </fence>
<       </clusternode>
[root@kanderso-xen-01 cluster]# cman_tool status
Version: 6.0.1
Config Version: 31
Cluster Name: ka-xen-cluster
Cluster Id: 15028
Cluster Member: Yes
Cluster Generation: 24
Membership state: Cluster-Member
Nodes: 9
Expected votes: 9
Total votes: 9
Quorum: 5  
Active subsystems: 6
Flags: 
Ports Bound: 0  
Node name: kanderso-xen-01.lab.msp.redhat.com
Node ID: 1
Multicast addresses: 239.192.58.238 
Node addresses: 10.15.85.21 
[root@kanderso-xen-01 cluster]# ccs_tool update cluster.conf.1node 
Config file updated from version 31 to 32

Update complete.
[root@kanderso-xen-01 cluster]# ccs_tool lsnode

Cluster name: ka-xen-cluster, config_version: 32

Nodename                        Votes Nodeid Fencetype
kanderso-xen-01.lab.msp.redhat.com   1    1    xvm
kanderso-xen-02.lab.msp.redhat.com   1    2    xvm
kanderso-xen-03.lab.msp.redhat.com   1    3    xvm
kanderso-xen-04.lab.msp.redhat.com   1    4    xvm
kanderso-xen-05.lab.msp.redhat.com   1    5    xvm
kanderso-xen-06.lab.msp.redhat.com   1   10    xvm
kanderso-xen-22.lab.msp.redhat.com   1    6    xvm
kanderso-xen-23.lab.msp.redhat.com   1    7    xvm
kanderso-xen-24.lab.msp.redhat.com   1    8    xvm
kanderso-xen-25.lab.msp.redhat.com   1    9    xvm
[root@kanderso-xen-01 cluster]# cman_tool status
Version: 6.0.1
Config Version: 31
Cluster Name: ka-xen-cluster
Cluster Id: 15028
Cluster Member: Yes
Cluster Generation: 24
Membership state: Cluster-Member
Nodes: 9
Expected votes: 9
Total votes: 9
Quorum: 5  
Active subsystems: 6
Flags: 
Ports Bound: 0  
Node name: kanderso-xen-01.lab.msp.redhat.com
Node ID: 1
Multicast addresses: 239.192.58.238 
Node addresses: 10.15.85.21 
[root@kanderso-xen-01 cluster]# cman_tool join
cman_tool: Node is already active
[root@kanderso-xen-01 cluster]# cman_tool status
Version: 6.0.1
Config Version: 31
Cluster Name: ka-xen-cluster
Cluster Id: 15028
Cluster Member: Yes
Cluster Generation: 24
Membership state: Cluster-Member
Nodes: 9
Expected votes: 9
Total votes: 9
Quorum: 5  
Active subsystems: 6
Flags: 
Ports Bound: 0  
Node name: kanderso-xen-01.lab.msp.redhat.com
Node ID: 1
Multicast addresses: 239.192.58.238 
Node addresses: 10.15.85.21 
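
The mismatch in the output above (cman still reporting Config Version 31 while
ccs reports config_version 32) can be checked with a small script.  This is a
minimal sketch, not part of cman or ccs: the function names are made up for
illustration, it only parses the cman_tool status text format shown in this
report, and the /etc/cluster/cluster.conf path in the usage comment is an
assumption.

```shell
#!/bin/sh
# Hypothetical helpers for illustration; not shipped with cman or ccs.

# Print the config_version attribute of the <cluster> element in a
# cluster.conf file given as $1.
conf_version() {
    sed -n 's/.*<cluster[[:space:]][^>]*config_version="\([0-9][0-9]*\)".*/\1/p' "$1" | head -n 1
}

# Print the "Config Version" value from `cman_tool status` output on stdin.
cman_version() {
    sed -n 's/^Config Version:[[:space:]]*\([0-9][0-9]*\).*/\1/p' | head -n 1
}

# Compare the file's version ($1) with cman's version (status text on stdin).
# Returns 0 when they match, 1 when they differ or the file version is missing.
versions_match() {
    file_ver=$(conf_version "$1")
    run_ver=$(cman_version)
    echo "cluster.conf: ${file_ver:-unknown}, cman: ${run_ver:-unknown}"
    [ -n "$file_ver" ] && [ "$file_ver" = "$run_ver" ]
}

# Usage (path is an assumption):
#   cman_tool status | versions_match /etc/cluster/cluster.conf
```

A non-zero exit status here corresponds to the state in this report, where ccs
has the new file but cman has not picked it up.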



Comment 1 Kiersten (Kerri) Anderson 2006-11-27 23:39:50 UTC
I opened this to track it, even if it is just a configuration command that I am
missing.

Comment 2 Kiersten (Kerri) Anderson 2006-11-27 23:56:20 UTC
Doing a service cman restart on a node in the existing cluster also causes the
errors to occur.  The initscript hangs while trying to start the fence daemon.

Comment 3 Kiersten (Kerri) Anderson 2006-11-28 00:02:08 UTC
However, after doing the service cman restart on one node, the other remaining
nodes in the cluster now show a new version of the configuration:

[root@kanderso-xen-02 ~]# cman_tool nodes
Node  Sts   Inc   Joined               Name
   1   X     12                        kanderso-xen-01.lab.msp.redhat.com
   2   M      4   2006-11-27 17:10:57  kanderso-xen-02.lab.msp.redhat.com
   3   M     20   2006-11-27 17:10:58  kanderso-xen-03.lab.msp.redhat.com
   4   M     24   2006-11-27 17:10:59  kanderso-xen-04.lab.msp.redhat.com
   5   M     16   2006-11-27 17:10:58  kanderso-xen-05.lab.msp.redhat.com
   6   M     20   2006-11-27 17:10:58  kanderso-xen-22.lab.msp.redhat.com
   7   M     16   2006-11-27 17:10:58  kanderso-xen-23.lab.msp.redhat.com
   8   M     12   2006-11-27 17:10:57  kanderso-xen-24.lab.msp.redhat.com
   9   M     12   2006-11-27 17:10:57  kanderso-xen-25.lab.msp.redhat.com
  10   X      0                        kanderso-xen-06.lab.msp.redhat.com
[root@kanderso-xen-02 ~]# cman_tool status
Version: 6.0.1
Config Version: 32
Cluster Name: ka-xen-cluster
Cluster Id: 15028
Cluster Member: Yes
Cluster Generation: 28
Membership state: Cluster-Member
Nodes: 9
Expected votes: 9
Total votes: 8
Quorum: 5  
Active subsystems: 6
Flags: 
Ports Bound: 0  
Node name: kanderso-xen-02.lab.msp.redhat.com
Node ID: 2
Multicast addresses: 239.192.58.238 
Node addresses: 10.15.85.22 

Then starting the cluster software on the new node caused this node to be
reconfigured and brought back into operation.  Very strange, but everything is
running correctly again with 10 nodes in the cluster.

Comment 4 Christine Caulfield 2006-11-28 10:20:30 UTC
The correct procedure for updating the config file is either:

1. ccs_tool update <file>; cman_tool version -r <version>
  or
2. Simply start a new node with the later config file (the others will spot the
new version and read it from ccs)

So you did half of 1, and 2 rescued you :-)

Number 1 is the faff - it should be a single-step process and you shouldn't have
to remember the version number for the second command.

It's (in RHEL5) trivial to make ccs_tool tell cman that the file has been
changed so I'll do that.
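
Until that change is in a build, procedure 1 can be wrapped so the version
number does not have to be remembered.  This is only a sketch under stated
assumptions, and it is not the change being made to ccs_tool: the function
names and the DRY_RUN variable are hypothetical, and the version is read from
the config_version attribute of the file being pushed.

```shell
#!/bin/sh
# Hypothetical wrapper for illustration; not the fix made in ccs_tool.

# Print the config_version attribute of the <cluster> element in file $1.
conf_version() {
    sed -n 's/.*<cluster[[:space:]][^>]*config_version="\([0-9][0-9]*\)".*/\1/p' "$1" | head -n 1
}

# Run both steps of procedure 1 for the file given as $1.
# Set DRY_RUN=1 to print the two commands instead of running them.
update_cluster_conf() {
    file=$1
    version=$(conf_version "$file")
    if [ -z "$version" ]; then
        echo "no config_version found in $file" >&2
        return 1
    fi
    run=${DRY_RUN:+echo}
    # Step 1: push the new file through ccs.
    # Step 2: tell cman which config version to load.
    $run ccs_tool update "$file" && $run cman_tool version -r "$version"
}

# Usage:
#   DRY_RUN=1 update_cluster_conf cluster.conf.1node
```

With DRY_RUN=1 the wrapper only echoes the two commands, which makes it easy to
confirm the version it extracted before touching a live cluster.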

Comment 5 Christine Caulfield 2006-11-28 11:23:37 UTC
Checking in update.c;
/cvs/cluster/cluster/ccs/ccs_tool/update.c,v  <--  update.c
new revision: 1.9; previous revision: 1.8
done
Checking in update.c;
/cvs/cluster/cluster/ccs/ccs_tool/update.c,v  <--  update.c
new revision: 1.8.2.1; previous revision: 1.8
done


Comment 6 Kiersten (Kerri) Anderson 2006-11-28 15:56:48 UTC
Ok, so operator error.  Do we want to change the process for rhel5, given that
it is currently the same as for rhel4?  What happens if someone runs the
cman_tool version command after doing the ccs_tool update with the new changes?
At this point, I am leaning towards not changing this for the release.

Comment 7 Christine Caulfield 2006-11-28 16:08:17 UTC
I think it's a nice change to have (though I've left it off the RHEL50 branch).
The extra step seems a bit pointless as we already know all the information.
And if the user does do the extra step with the new code, it's harmless.

The rest of the new cman has code to reduce the amount of manual intervention
needed with ccs updates, and this is a logical next step, I feel.

Comment 8 Kiersten (Kerri) Anderson 2006-11-30 15:46:16 UTC
Add it to the RHEL50 branch as well, given that it will be harmless and is
heading in the right direction.

Comment 9 Christine Caulfield 2006-11-30 15:57:30 UTC
Added to RHEL50:

Checking in update.c;
/cvs/cluster/cluster/ccs/ccs_tool/update.c,v  <--  update.c
new revision: 1.8.4.1; previous revision: 1.8
done


Comment 11 RHEL Program Management 2007-02-08 00:57:46 UTC
A package has been built which should help the problem described in 
this bug report. This report is therefore being closed with a resolution 
of CURRENTRELEASE. You may reopen this bug report if the solution does 
not work for you.


Comment 12 Nate Straz 2007-12-13 17:22:29 UTC
Moving all RHCS ver 5 bugs to RHEL 5 so we can remove RHCS v5 which never existed.

