Bug 227743

Summary: Intermittent/recurring problem - when cluster is deleted, sometimes a node is not affected
Product: Red Hat Enterprise Linux 5 Reporter: Len DiMaggio <ldimaggi>
Component: conga Assignee: Ryan McCabe <rmccabe>
Status: CLOSED ERRATA QA Contact: Corey Marthaler <cmarthal>
Severity: medium Docs Contact:
Priority: medium    
Version: 5.0 CC: cluster-maint, djansa, jlaska, kanderso, kupcevic, rmccabe
Target Milestone: ---   
Target Release: ---   
Hardware: All   
OS: Linux   
Whiteboard:
Fixed In Version: RHSA-2007-0640 Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2007-11-07 15:37:04 UTC Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Len DiMaggio 2007-02-07 21:37:48 UTC
Description of problem:
Intermittent/recurring problem - when cluster is deleted, sometimes a node is
not affected

Version-Release number of selected component (if applicable):
luci-0.8-30.el5
ricci-0.8-30.el5

How reproducible:
10% - 25%? Saw it once today

Steps to Reproduce:
1. Create a multi-node cluster and delete it
2. On some nodes, sometimes, cluster.conf is not deleted, and cman and rgmanager
are not stopped or do not have their boot settings set to off - details are included below

Actual results:
Sometimes a node is not cleaned up

Expected results:
All nodes should be removed from the cluster

Additional info:


->  /etc/cluster/cluster.conf is not deleted

->  these services' boot settings are not set to off:

[root@tng3-4 ~]# chkconfig --list cman
cman            0:off   1:off   2:on    3:on    4:on    5:on    6:off
[root@tng3-4 ~]# chkconfig --list rgmanager
rgmanager       0:off   1:off   2:on    3:on    4:on    5:on    6:off

-> This process is left running:
root      4921  4854  0 09:53 ?        00:00:00 /bin/bash /etc/rc6.d/K03yum-updatesd stop

cman is also left running - I haven't seen rgmanager left running.
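
The symptoms listed above could be checked on a node with a small script. The following is only a minimal sketch in Python, assuming the `chkconfig --list <service>` output format shown above; the function names `enabled_runlevels` and `node_leftovers` are hypothetical and are not part of luci or ricci.

```python
import re


def enabled_runlevels(chkconfig_line):
    """Return the runlevels marked 'on' in one line of `chkconfig --list` output."""
    pairs = re.findall(r"(\d):(on|off)", chkconfig_line)
    return [int(level) for level, state in pairs if state == "on"]


def node_leftovers(chkconfig_output, cluster_conf_exists):
    """List what a cluster delete left behind on a node.

    chkconfig_output: dict mapping a service name to its `chkconfig --list` line.
    cluster_conf_exists: whether /etc/cluster/cluster.conf is still present.
    (Both inputs are gathered by the caller; this sketch does not run commands.)
    """
    problems = []
    if cluster_conf_exists:
        problems.append("/etc/cluster/cluster.conf is not deleted")
    for service, line in sorted(chkconfig_output.items()):
        levels = enabled_runlevels(line)
        if levels:
            problems.append("%s is still enabled at boot in runlevels %s" % (service, levels))
    return problems
```

Feeding it the two `chkconfig` lines from this report, together with a flag saying that cluster.conf still exists, yields one entry per leftover. An empty list would mean the node looks clean by these checks.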

Comment 1 Ryan McCabe 2007-03-06 18:57:44 UTC
The failure to stop the node may be a result of either (or both) of Bug #222919 and
Bug #230783

Comment 2 RHEL Program Management 2007-03-21 22:08:41 UTC
This request was evaluated by Red Hat Product Management for inclusion in a Red
Hat Enterprise Linux maintenance release.  Product Management has requested
further review of this request by Red Hat Engineering, for potential
inclusion in a Red Hat Enterprise Linux Update release for currently deployed
products.  This request is not yet committed for inclusion in an Update
release.

Comment 3 Kiersten (Kerri) Anderson 2007-04-23 17:05:53 UTC
Fixing Product Name.  Cluster Suite was merged into Enterprise Linux for version
5.0.

Comment 4 Ryan McCabe 2007-05-21 22:22:12 UTC
Fixed in my current branch. Before deleting a cluster, we first try to stop the
cluster services on each of the nodes. If this fails, bail out before deleting
any nodes. This should prevent the case where some nodes are deleted, but others
are not (e.g., because a service hangs while stopping).
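
The stop-first, delete-second ordering described in this comment can be sketched roughly as below. This is an illustration only, not the actual conga code: `delete_cluster`, `stop_services`, and `delete_node` are hypothetical names, and the choice to attempt a stop on every node before reporting failures (rather than bailing at the first one) is an assumption.

```python
def delete_cluster(nodes, stop_services, delete_node):
    """Delete a cluster only if cluster services stop on every node.

    stop_services(node) -> bool and delete_node(node) are hypothetical
    callables supplied by the caller. Returns (ok, failed_nodes).
    """
    failed = []
    # Phase 1: try to stop cluster services everywhere; delete nothing yet.
    for node in nodes:
        try:
            stopped = stop_services(node)
        except Exception:
            # Treat an error while stopping the same as a failed stop.
            stopped = False
        if not stopped:
            failed.append(node)

    # Bail out before deleting any node if a stop failed anywhere.
    if failed:
        return False, failed

    # Phase 2: every node stopped cleanly, so delete them all.
    for node in nodes:
        delete_node(node)
    return True, []
```

With this ordering, a node whose services hang or fail to stop blocks the whole delete, so the cluster is left intact instead of partially removed.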

Comment 7 errata-xmlrpc 2007-11-07 15:37:04 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2007-0640.html