Description of problem:

I found this problem today while verifying other bug fixes. The detailed steps to recreate are listed below.

Version-Release number of selected component (if applicable):

RHEL5-Server-20070126.0
luci-0.8-30.el5
ricci-0.8-30.el5

How reproducible:

Seems to be 100%

Steps to Reproduce:
1. Create a new cluster
2. After the cluster is complete - and the nodes have rebooted - delete the cluster from the cluster tab in luci (not from the homepage/manage systems tab)
3. The cluster nodes should still be accessible as storage nodes. Go to the storage tab in luci and select one of the nodes. This error is returned in a dialog box:

An error has occured while probing storage: Generic error on host: cluster tools: cman_tool errored

(See the attachment for a screenshot.)

Actual results:

The error listed above and in the screenshot.

Expected results:

No errors.

Additional info:

I also saw this SELinux AVC message - before I installed these new packages:

selinux-policy-2.4.6-32.el5
selinux-policy-targeted-2.4.6-32.el5

type=AVC msg=audit(1170257190.911:160): avc: denied { write } for pid=24613 comm="cman_tool" name="cman_client" dev=dm-0 ino=620947 scontext=system_u:system_r:ricci_modstorage_t:s0 tcontext=system_u:object_r:ccs_var_run_t:s0 tclass=sock_file

I'm not too sure how to clear up this condition - rebooting the nodes seems to do it sometimes.

Here are the services/processes running when the failure occurs:

[root@tng3-1 ~]# service --status-all | grep cman
[root@tng3-1 ~]# service --status-all | grep cluster
modclusterd (pid 2039) is running...
[root@tng3-1 ~]# ps -ef | grep cman
root      4203  2348  0 12:16 pts/0    00:00:00 grep cman
[root@tng3-1 ~]# service cman status
ccsd is stopped
[root@tng3-1 ~]# service rgmanager status
[root@tng3-1 ~]# ps -ef
UID        PID  PPID  C STIME TTY          TIME CMD
root         1     0  0 12:20 ?        00:00:00 init [3]
root         2     1  0 12:20 ?        00:00:00 [migration/0]
root         3     1  0 12:20 ?        00:00:00 [ksoftirqd/0]
root         4     1  0 12:20 ?        00:00:00 [watchdog/0]
root         5     1  0 12:20 ?        00:00:00 [events/0]
root         6     1  0 12:20 ?        00:00:00 [khelper]
root         7     1  0 12:20 ?        00:00:00 [kthread]
root        10     7  0 12:20 ?        00:00:00 [kblockd/0]
root        11     7  0 12:20 ?        00:00:00 [cqueue/0]
root        14     7  0 12:20 ?        00:00:00 [khubd]
root        16     7  0 12:20 ?        00:00:00 [kseriod]
root        87     7  0 12:20 ?        00:00:00 [pdflush]
root        88     7  0 12:20 ?        00:00:00 [pdflush]
root        89     7  0 12:20 ?        00:00:00 [kswapd0]
root        90     7  0 12:20 ?        00:00:00 [aio/0]
root       242     7  0 12:20 ?        00:00:00 [kpsmoused]
root       258     7  0 12:20 ?        00:00:00 [kmirrord]
root       263     7  0 12:20 ?        00:00:00 [ksnapd]
root       266     7  0 12:20 ?        00:00:00 [kjournald]
root       293     7  0 12:20 ?        00:00:00 [kauditd]
root       322     1  0 12:20 ?        00:00:00 /sbin/udevd -d
root       620     7  0 12:20 ?        00:00:00 [scsi_eh_0]
root       621     7  0 12:20 ?        00:00:00 [lpfc_worker_0]
root       622     7  0 12:20 ?        00:00:00 [scsi_wq_0]
root       623     7  0 12:20 ?        00:00:00 [fc_wq_0]
root       624     7  0 12:20 ?        00:00:00 [fc_dl_0]
root      1015     7  0 12:20 ?        00:00:00 [kjournald]
root      1443     1  0 12:20 ?        00:00:00 /sbin/dhclient -1 -q -cf /etc/dhclient-eth0.conf -l
root      1525     1  0 12:20 ?        00:00:00 /usr/sbin/restorecond
root      1536     1  0 12:20 ?        00:00:00 auditd
root      1538  1536  0 12:20 ?        00:00:00 python /sbin/audispd
root      1551     1  0 12:20 ?        00:00:00 syslogd -m 0
root      1554     1  0 12:20 ?        00:00:00 klogd -x
root      1579     1  0 12:20 ?        00:00:00 mcstransd
rpc       1591     1  0 12:20 ?        00:00:00 portmap
root      1611     1  0 12:20 ?        00:00:00 rpc.statd
root      1642     1  0 12:20 ?        00:00:00 rpc.idmapd
dbus      1659     1  0 12:20 ?        00:00:00 dbus-daemon --system
root      1670     1  0 12:20 ?        00:00:00 /usr/sbin/hcid
root      1674     1  0 12:20 ?        00:00:00 /usr/sbin/sdpd
root      1697     1  0 12:20 ?        00:00:00 [krfcommd]
root      1733     1  0 12:20 ?        00:00:00 pcscd
root      1751     1  0 12:20 ?        00:00:00 /usr/bin/hidd --server
root      1765     1  0 12:20 ?        00:00:00 automount
root      1789     1  0 12:20 ?        00:00:00 cupsd
root      1802     1  0 12:20 ?        00:00:00 /usr/sbin/sshd
root      1813     1  0 12:20 ?        00:00:00 xinetd -stayalive -pidfile /var/run/xinetd.pid
root      1833     1  0 12:20 ?        00:00:00 sendmail: accepting connections
smmsp     1842     1  0 12:20 ?        00:00:00 sendmail: Queue runner@01:00:00 for /var/spool/clie
root      1861     1  0 12:20 ?        00:00:00 gpm -m /dev/input/mice -t exps2
root      1887     1  0 12:20 ?        00:00:00 crond
xfs       1909     1  0 12:20 ?        00:00:00 xfs -droppriv -daemon
root      1928     1  0 12:20 ?        00:00:00 /usr/sbin/atd
root      1940     1  0 12:20 ?        00:00:00 /usr/bin/python /usr/sbin/yum-updatesd
avahi     1951     1  0 12:20 ?        00:00:00 avahi-daemon: running [tng3-1.local]
avahi     1952  1951  0 12:20 ?        00:00:00 avahi-daemon: chroot helper
68        1962     1  0 12:20 ?        00:00:00 hald
root      1963  1962  0 12:20 ?        00:00:00 hald-runner
root      1981  1963  0 12:20 ?        00:00:00 hald-addon-storage: polling /dev/hdd
root      2011     1  0 12:21 ?        00:00:00 modclusterd
root      2066     1  0 12:21 ?        00:00:00 /usr/sbin/oddjobd -p /var/run/oddjobd.pid -t 300
root      2097     1  0 12:21 ?        00:00:00 /usr/sbin/saslauthd -m /var/run/saslauthd -a pam
root      2098  2097  0 12:21 ?        00:00:00 /usr/sbin/saslauthd -m /var/run/saslauthd -a pam
root      2099  2097  0 12:21 ?        00:00:00 /usr/sbin/saslauthd -m /var/run/saslauthd -a pam
root      2100  2097  0 12:21 ?        00:00:00 /usr/sbin/saslauthd -m /var/run/saslauthd -a pam
root      2102  2097  0 12:21 ?        00:00:00 /usr/sbin/saslauthd -m /var/run/saslauthd -a pam
ricci     2109     1  0 12:21 ?        00:00:00 ricci -u 101
root      2133     1  0 12:21 ?        00:00:00 /usr/sbin/smartd -q never
root      2136     1  0 12:21 ttyS0    00:00:00 /sbin/agetty ttyS0 115200 vt100-nav
root      2137     1  0 12:21 tty1     00:00:00 /sbin/mingetty tty1
root      2138     1  0 12:21 tty2     00:00:00 /sbin/mingetty tty2
root      2139     1  0 12:21 tty3     00:00:00 /sbin/mingetty tty3
root      2140     1  0 12:21 tty4     00:00:00 /sbin/mingetty tty4
root      2141     1  0 12:21 tty5     00:00:00 /sbin/mingetty tty5
root      2142     1  0 12:21 tty6     00:00:00 /sbin/mingetty tty6
root      2205  1802  0 12:23 ?        00:00:00 sshd: root@pts/0
root      2207  2205  0 12:23 pts/0    00:00:00 -bash
root      2287  2207  0 12:32 pts/0    00:00:00 ps -ef
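The failing check can also be reproduced by hand on an affected node. A minimal sketch, assuming modstorage's probe boils down to a cman_tool query (the exact invocation it uses is a guess on my part):

# ccsd/cman are stopped, so any cman_tool query errors out,
# which is what surfaces as the luci dialog above:
service cman status
cman_tool status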
Created attachment 147037: Screenshot
Note - a reboot does not always clear this problem - having the node join a new cluster does seem to work.
Fixed AVC in selinux-policy-2.4.6-35
So far, the only workarounds are to either have the nodes in question join a new cluster or reinstall the OS. Something seems to be cached somewhere.
modstorage tries to check cluster quorum, but there is no cluster anymore, so cman_tool errors out. `lvmconf --disable-cluster` should be run on cluster deletion/node removal, and a better error message should be generated.
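Until that lands, a minimal sketch of the manual cleanup this diagnosis suggests, run on each node after it has been removed from the cluster (the assumption being that the stale clustered locking setting left in /etc/lvm/lvm.conf is what keeps triggering the quorum check):

# Show whether LVM is still configured for clustered locking
# (locking_type = 3 means clustered, 1 means local):
grep locking_type /etc/lvm/lvm.conf
# Reset LVM to local, file-based locking:
lvmconf --disable-cluster
# Then retry the probe from the luci storage tab.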
Committed to HEAD/RHEL4/RHEL5
This request was evaluated by Red Hat Product Management for inclusion in a Red Hat Enterprise Linux maintenance release. Product Management has requested further review of this request by Red Hat Engineering, for potential inclusion in a Red Hat Enterprise Linux Update release for currently deployed products. This request is not yet committed for inclusion in an Update release.
Test with 0.9.2-4.el5 (conga and clustermon)
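A quick sketch for verifying that the nodes actually carry the builds under test (binary package names are my assumption: the conga source package produces luci and ricci, and clustermon produces the modcluster agent):

rpm -q luci ricci modcluster
# then repeat the reproduction steps from the description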
The associated SELinux policy will not get updated. Users can either generate SELinux policy modules using audit2allow -M myconga or run in permissive mode. The fix will show up in 5.1.
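A minimal sketch of that local-module workaround, assuming the AVC denials (like the cman_tool one quoted in the description) are already in /var/log/audit/audit.log; "myconga" is just the module name suggested above:

# Build a local policy module from the logged denials and load it:
grep avc /var/log/audit/audit.log | audit2allow -M myconga
semodule -i myconga.pp
# Or temporarily switch SELinux to permissive mode instead:
setenforce 0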
Fixing Product Name. Cluster Suite was merged into Enterprise Linux for version 5.0.
An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you. http://rhn.redhat.com/errata/RHSA-2007-0640.html