Bug 249291

Summary: delete node task fails to do all items listed in the help document
Product: Red Hat Enterprise Linux 5
Component: conga
Version: 5.0
Status: CLOSED ERRATA
Severity: low
Priority: low
Reporter: Corey Marthaler <cmarthal>
Assignee: Ryan McCabe <rmccabe>
QA Contact: Brian Brock <bbrock>
CC: cluster-maint, jparsons, kanderso, kupcevic, rmccabe
Hardware: All   
OS: Linux   
Fixed In Version: RHSA-2007-0640
Doc Type: Bug Fix
Last Closed: 2007-11-07 15:38:25 UTC
Attachments:
  log from luci server (flags: none)

Description Corey Marthaler 2007-07-23 15:24:26 UTC
According to the conga help:
Delete Node - when a node is deleted, it is made to leave the cluster, all
cluster services are stopped on the node, its cluster.conf file is deleted, and
a new cluster.conf file is propagated to the remaining nodes in the cluster with
the deleted node removed from the configuration. Note that deleting a node does
not remove the installed cluster packages from the node.

First, after the delete operation is done, the view conga shows still contains
that node. It is not removed from conga's view until conga is reloaded by
clicking on the cluster tab.

Second, the cluster.conf is not only "propagated to the remaining nodes" but
also to the node that was just deleted.

Third, the other nodes don't appear to believe the system is gone, just that it's
currently not running:

[root@taft-03 ~]# cman_tool nodes
Node  Sts   Inc   Joined               Name
   1   M     84   2007-07-20 15:34:46  taft-02.lab.msp.redhat.com
   2   M     92   2007-07-20 15:53:22  taft-01.lab.msp.redhat.com
   3   X     88                        taft-04.lab.msp.redhat.com
   4   M     80   2007-07-20 15:34:46  taft-03.lab.msp.redhat.com
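
A minimal sketch of how the listing above could be checked from a script, assuming that `M` in the Sts column marks a current member and `X` a node the cluster still knows about but that is not currently a member. The helper names `parse_cman_nodes` and `non_members` are hypothetical and not part of cman; the sample text is the output pasted above.

```python
SAMPLE = """\
Node  Sts   Inc   Joined               Name
   1   M     84   2007-07-20 15:34:46  taft-02.lab.msp.redhat.com
   2   M     92   2007-07-20 15:53:22  taft-01.lab.msp.redhat.com
   3   X     88                        taft-04.lab.msp.redhat.com
   4   M     80   2007-07-20 15:34:46  taft-03.lab.msp.redhat.com
"""


def parse_cman_nodes(text):
    """Return (node_id, status, name) tuples parsed from `cman_tool nodes` output."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        # Skip blank lines and the header row (its first field is not a number).
        if len(fields) < 4 or not fields[0].isdigit():
            continue
        # The Joined column is empty for non-members, so take the name from the end.
        rows.append((int(fields[0]), fields[1], fields[-1]))
    return rows


def non_members(text):
    """Names of nodes that are still listed but whose status is not 'M'."""
    return [name for _node_id, status, name in parse_cman_nodes(text) if status != "M"]


print(non_members(SAMPLE))
```

If the deleted node had really been removed from the configuration, it would be expected to disappear from this listing rather than show up as a non-member.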



Version-Release number of selected component (if applicable):
ricci-0.10.0-2.el5
luci-0.10.0-2.el5

Comment 1 Corey Marthaler 2007-07-23 15:28:12 UTC
More delete stuff...

After attempting to re-add the deleted node I see the following from conga:

Status messages:

    * Host taft-04.lab.msp.redhat.com is already authenticated


The following errors occurred:

    * taft-04.lab.msp.redhat.com reports it is already a member of cluster
"taft_cluster"



Comment 2 Ryan McCabe 2007-07-23 17:40:23 UTC
Which version of cman do you have? This looks like it might be the same problem 
fixed here https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=244867

Comment 3 Corey Marthaler 2007-07-23 18:25:03 UTC
cman-2.0.69-1.el5

Comment 7 Corey Marthaler 2007-08-01 19:39:53 UTC
With the latest, all I see when attempting to delete is the following:

Deletion of node "taft-03.lab.msp.redhat.com" from cluster "TAFT_CLUSTER" failed

And nothing happens at all.

Comment 8 Ryan McCabe 2007-08-01 20:00:00 UTC
When this happens, it's an indication that at least one cluster service on the 
node could not be stopped. The delete procedure tries to stop all cluster 
services, then delete the cluster.conf and propagate a new conf to the 
remaining cluster members. If the first step fails, luci bails out. When the 
node couldn't be deleted, were there any init scripts hung trying to stop? If 
not, are you able to stop cluster services manually?
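
The order of operations described above can be sketched roughly as follows. This is an illustration only, not luci's actual code: `delete_node`, `NodeDeleteError`, and the three callables passed in are hypothetical names standing in for whatever luci and ricci really do.

```python
class NodeDeleteError(Exception):
    """Raised when the node cannot be taken out of the cluster (hypothetical)."""


def delete_node(node, remaining, stop_services, remove_conf, propagate_conf):
    """Stop services on `node`, delete its cluster.conf, then update the other members.

    `stop_services(node)` is assumed to return True on success.
    """
    # Step 1: stop all cluster services; if this fails, bail out before
    # touching any configuration.
    if not stop_services(node):
        raise NodeDeleteError("could not stop cluster services on %s" % node)
    # Step 2: delete cluster.conf on the node being removed.
    remove_conf(node)
    # Step 3: propagate the new configuration to the remaining members only.
    for member in remaining:
        propagate_conf(member)


# Usage with stub callables that just record what was done.
calls = []
try:
    delete_node(
        "taft-03",
        ["taft-01", "taft-02"],
        stop_services=lambda n: False,  # simulate a service that will not stop
        remove_conf=lambda n: calls.append(("remove", n)),
        propagate_conf=lambda n: calls.append(("propagate", n)),
    )
except NodeDeleteError as exc:
    print("Deletion failed:", exc)
print(calls)
```

With a failing first step, nothing else runs, which matches the "nothing happens at all" symptom reported in comment 7.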

If you edit /var/lib/luci/Extensions/conga_constants, at the bottom of the 
file, there are three constants that control debugging output. If you set 
LUCI_DEBUG_MODE to True and LUCI_DEBUG_VERBOSITY to something >= 3, and 
configure syslogd to log authpriv.debug, luci will produce a lot of debugging 
output that should indicate what went wrong.

Comment 9 Ryan McCabe 2007-08-01 20:01:24 UTC
Sorry, that should read /var/lib/luci/Extensions/conga_constants.py above.
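
For reference, the settings from comment 8 might look like the fragment below. Only the two constant names given above are shown; the third debugging constant is not named in this report, so it is left out. The syslog destination path in the comment is an assumption chosen for illustration.

```python
# Near the bottom of /var/lib/luci/Extensions/conga_constants.py

LUCI_DEBUG_MODE = True       # turn debugging output on
LUCI_DEBUG_VERBOSITY = 3     # any value >= 3 should do, per comment 8

# syslogd must also be told to keep authpriv debug messages, for example
# with a line like this in /etc/syslog.conf (the log file path is an assumption):
#
#   authpriv.debug    /var/log/luci-debug.log
```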

Comment 10 Corey Marthaler 2007-08-16 22:35:15 UTC
Created attachment 161700
log from luci server

Here is the log that you requested during a delete attempt. Again, all I saw
was, "Deletion of node "link-08.lab.msp.redhat.com" from cluster "LINK_278"
failed"

Comment 11 Ryan McCabe 2007-08-17 20:28:43 UTC
Thanks for the log. Turns out the bug has nothing specific to do with node
deletion; the real problem is that, in a few places, luci does not properly
handle clusters whose names are not all lowercase. Fix committed to CVS and
will be in the next build.
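
To illustrate the class of failure described here (this is not the actual luci code or the actual fix, which this report does not show), a lookup that lowercases the cluster name in one place but not in another will miss names such as "TAFT_CLUSTER" or "LINK_278". The dictionary and function names below are hypothetical.

```python
# Hypothetical store keyed by the cluster name exactly as it was created.
clusters = {
    "TAFT_CLUSTER": ["taft-01", "taft-02", "taft-03", "taft-04"],
    "lowercase_cluster": ["node-a", "node-b"],
}


def lookup_inconsistent(name):
    """Lowercases the name before the lookup, so non-lowercase names are not found."""
    return clusters.get(name.lower())


def lookup_consistent(name):
    """Uses the name unchanged, so it matches however the cluster was named."""
    return clusters.get(name)


print(lookup_inconsistent("lowercase_cluster"))  # found: name is already lowercase
print(lookup_inconsistent("TAFT_CLUSTER"))       # None: the lowercased key does not exist
print(lookup_consistent("TAFT_CLUSTER"))         # found
```

A mismatch of this kind would explain why the same delete operation worked for some clusters and failed outright for "TAFT_CLUSTER" and "LINK_278".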

Comment 13 Corey Marthaler 2007-08-21 20:29:19 UTC
fix verified in 0.10.0-5.el5.

Comment 15 errata-xmlrpc 2007-11-07 15:38:25 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2007-0640.html