Bug 225747

Summary: Create/delete cluster - then access disk on node = Generic error on host: cluster tools: cman_tool errored
Product: Red Hat Enterprise Linux 5
Reporter: Len DiMaggio <ldimaggi>
Component: conga
Assignee: Jim Parsons <jparsons>
Status: CLOSED ERRATA
QA Contact: Corey Marthaler <cmarthal>
Severity: medium
Priority: medium
Version: 5.0
CC: cluster-maint, djansa, dwalsh, jlaska, kanderso, kupcevic, rmccabe
Target Milestone: ---
Target Release: ---
Hardware: All
OS: Linux
Fixed In Version: RHSA-2007-0640
Doc Type: Bug Fix
Last Closed: 2007-11-07 15:36:55 UTC
Bug Blocks: 236023    
Attachments: Screenshot

Description Len DiMaggio 2007-01-31 18:35:48 UTC
Description of problem:

I found this problem today while verifying other bug fixes. The detailed steps to
recreate it are listed below.

Version-Release number of selected component (if applicable):
RHEL5-Server-20070126.0
luci-0.8-30.el5
ricci-0.8-30.el5

How reproducible:
Seems to be 100%

Steps to Reproduce:
1. Create a new cluster
2. After the cluster is complete - and the nodes have rebooted - delete the
cluster from the cluster tab in luci (not from the homepage/manage systems tab)
3. The cluster nodes should still be accessible as storage nodes. Go to the
storage tab in luci and select one of the nodes. This error is returned in a
dialog box:

An error has occured while probing storage:
Generic error on host: cluster tools: cman_tool errored

(See the attachment for a screenshot.)
  
Actual results:
The error listed above and shown in the screenshot.

Expected results:
No errors.

Additional info:

I also saw this SELinux AVC message - before I installed these new packages:

selinux-policy-2.4.6-32.el5
selinux-policy-targeted-2.4.6-32.el5

type=AVC msg=audit(1170257190.911:160): avc:  denied  { write } for  pid=24613
comm="cman_tool" name="cman_client" dev=dm-0 ino=620947
scontext=system_u:system_r:ricci_modstorage_t:s0
tcontext=system_u:object_r:ccs_var_run_t:s0 tclass=sock_file
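For reference, a quick way to confirm the installed policy versions and pull that
denial back out of the audit log (a sketch only; assumes auditd is running with
default logging, not captured from these nodes):

rpm -q selinux-policy selinux-policy-targeted
ausearch -m avc -c cman_tool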

I'm not too sure about how to clear up this condition - rebooting the nodes
seems to do it sometimes.

Here are the services/processes running when the failure occurs:

[root@tng3-1 ~]# service --status-all | grep cman
[root@tng3-1 ~]# service --status-all | grep cluster
modclusterd (pid 2039) is running...
[root@tng3-1 ~]# ps -ef | grep cman
root      4203  2348  0 12:16 pts/0    00:00:00 grep cman
[root@tng3-1 ~]# service cman status
ccsd is stopped
[root@tng3-1 ~]# service rgmanager status

 ps -ef
UID        PID  PPID  C STIME TTY          TIME CMD
root         1     0  0 12:20 ?        00:00:00 init [3]
root         2     1  0 12:20 ?        00:00:00 [migration/0]
root         3     1  0 12:20 ?        00:00:00 [ksoftirqd/0]
root         4     1  0 12:20 ?        00:00:00 [watchdog/0]
root         5     1  0 12:20 ?        00:00:00 [events/0]
root         6     1  0 12:20 ?        00:00:00 [khelper]
root         7     1  0 12:20 ?        00:00:00 [kthread]
root        10     7  0 12:20 ?        00:00:00 [kblockd/0]
root        11     7  0 12:20 ?        00:00:00 [cqueue/0]
root        14     7  0 12:20 ?        00:00:00 [khubd]
root        16     7  0 12:20 ?        00:00:00 [kseriod]
root        87     7  0 12:20 ?        00:00:00 [pdflush]
root        88     7  0 12:20 ?        00:00:00 [pdflush]
root        89     7  0 12:20 ?        00:00:00 [kswapd0]
root        90     7  0 12:20 ?        00:00:00 [aio/0]
root       242     7  0 12:20 ?        00:00:00 [kpsmoused]
root       258     7  0 12:20 ?        00:00:00 [kmirrord]
root       263     7  0 12:20 ?        00:00:00 [ksnapd]
root       266     7  0 12:20 ?        00:00:00 [kjournald]
root       293     7  0 12:20 ?        00:00:00 [kauditd]
root       322     1  0 12:20 ?        00:00:00 /sbin/udevd -d
root       620     7  0 12:20 ?        00:00:00 [scsi_eh_0]
root       621     7  0 12:20 ?        00:00:00 [lpfc_worker_0]
root       622     7  0 12:20 ?        00:00:00 [scsi_wq_0]
root       623     7  0 12:20 ?        00:00:00 [fc_wq_0]
root       624     7  0 12:20 ?        00:00:00 [fc_dl_0]
root      1015     7  0 12:20 ?        00:00:00 [kjournald]
root      1443     1  0 12:20 ?        00:00:00 /sbin/dhclient -1 -q -cf
/etc/dhclient-eth0.conf -l
root      1525     1  0 12:20 ?        00:00:00 /usr/sbin/restorecond
root      1536     1  0 12:20 ?        00:00:00 auditd
root      1538  1536  0 12:20 ?        00:00:00 python /sbin/audispd
root      1551     1  0 12:20 ?        00:00:00 syslogd -m 0
root      1554     1  0 12:20 ?        00:00:00 klogd -x
root      1579     1  0 12:20 ?        00:00:00 mcstransd
rpc       1591     1  0 12:20 ?        00:00:00 portmap
root      1611     1  0 12:20 ?        00:00:00 rpc.statd
root      1642     1  0 12:20 ?        00:00:00 rpc.idmapd
dbus      1659     1  0 12:20 ?        00:00:00 dbus-daemon --system
root      1670     1  0 12:20 ?        00:00:00 /usr/sbin/hcid
root      1674     1  0 12:20 ?        00:00:00 /usr/sbin/sdpd
root      1697     1  0 12:20 ?        00:00:00 [krfcommd]
root      1733     1  0 12:20 ?        00:00:00 pcscd
root      1751     1  0 12:20 ?        00:00:00 /usr/bin/hidd --server
root      1765     1  0 12:20 ?        00:00:00 automount
root      1789     1  0 12:20 ?        00:00:00 cupsd
root      1802     1  0 12:20 ?        00:00:00 /usr/sbin/sshd
root      1813     1  0 12:20 ?        00:00:00 xinetd -stayalive -pidfile
/var/run/xinetd.pid
root      1833     1  0 12:20 ?        00:00:00 sendmail: accepting connections
smmsp     1842     1  0 12:20 ?        00:00:00 sendmail: Queue runner@01:00:00
for /var/spool/clie
root      1861     1  0 12:20 ?        00:00:00 gpm -m /dev/input/mice -t exps2
root      1887     1  0 12:20 ?        00:00:00 crond
xfs       1909     1  0 12:20 ?        00:00:00 xfs -droppriv -daemon
root      1928     1  0 12:20 ?        00:00:00 /usr/sbin/atd
root      1940     1  0 12:20 ?        00:00:00 /usr/bin/python
/usr/sbin/yum-updatesd
avahi     1951     1  0 12:20 ?        00:00:00 avahi-daemon: running [tng3-1.local]
avahi     1952  1951  0 12:20 ?        00:00:00 avahi-daemon: chroot helper
68        1962     1  0 12:20 ?        00:00:00 hald
root      1963  1962  0 12:20 ?        00:00:00 hald-runner
root      1981  1963  0 12:20 ?        00:00:00 hald-addon-storage: polling /dev/hdd
root      2011     1  0 12:21 ?        00:00:00 modclusterd
root      2066     1  0 12:21 ?        00:00:00 /usr/sbin/oddjobd -p
/var/run/oddjobd.pid -t 300
root      2097     1  0 12:21 ?        00:00:00 /usr/sbin/saslauthd -m
/var/run/saslauthd -a pam
root      2098  2097  0 12:21 ?        00:00:00 /usr/sbin/saslauthd -m
/var/run/saslauthd -a pam
root      2099  2097  0 12:21 ?        00:00:00 /usr/sbin/saslauthd -m
/var/run/saslauthd -a pam
root      2100  2097  0 12:21 ?        00:00:00 /usr/sbin/saslauthd -m
/var/run/saslauthd -a pam
root      2102  2097  0 12:21 ?        00:00:00 /usr/sbin/saslauthd -m
/var/run/saslauthd -a pam
ricci     2109     1  0 12:21 ?        00:00:00 ricci -u 101
root      2133     1  0 12:21 ?        00:00:00 /usr/sbin/smartd -q never
root      2136     1  0 12:21 ttyS0    00:00:00 /sbin/agetty ttyS0 115200 vt100-nav
root      2137     1  0 12:21 tty1     00:00:00 /sbin/mingetty tty1
root      2138     1  0 12:21 tty2     00:00:00 /sbin/mingetty tty2
root      2139     1  0 12:21 tty3     00:00:00 /sbin/mingetty tty3
root      2140     1  0 12:21 tty4     00:00:00 /sbin/mingetty tty4
root      2141     1  0 12:21 tty5     00:00:00 /sbin/mingetty tty5
root      2142     1  0 12:21 tty6     00:00:00 /sbin/mingetty tty6
root      2205  1802  0 12:23 ?        00:00:00 sshd: root@pts/0 
root      2207  2205  0 12:23 pts/0    00:00:00 -bash
root      2287  2207  0 12:32 pts/0    00:00:00 ps -ef

Comment 1 Len DiMaggio 2007-01-31 18:35:49 UTC
Created attachment 147037 [details]
Screenshot

Comment 2 Len DiMaggio 2007-01-31 18:51:39 UTC
Note - a reboot does not always clear this problem - having the node join a new
cluster does seem to work.

Comment 3 Daniel Walsh 2007-02-01 20:35:47 UTC
Fixed avc in selinux-policy-2.4.6-35

Comment 4 Len DiMaggio 2007-02-08 14:30:15 UTC
So far - the only workarounds are to either have the nodes in question join a
new cluster - or reinstall the OS. Something seems to be cached somewhere.
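For what it's worth, a hedged guess at where the stale state might live on such a
node (my assumption only, not verified on these machines):

cat /etc/cluster/cluster.conf          # is an old cluster config still left behind?
grep locking_type /etc/lvm/lvm.conf    # is LVM still set up for clustered locking?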

Comment 5 Stanko Kupcevic 2007-02-09 12:52:05 UTC
modstorage tries to check cluster quorum, but the cluster no longer exists, so
cman_tool fails.

`lvmconf --disable-cluster` should be run when a cluster is deleted or a node is
removed, and a better error message should be generated.
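A minimal sketch of that cleanup, as it might be run on each node when the cluster
is deleted (assumes the lvm2 lvmconf script is present; restarting ricci afterwards
is my assumption and may not be required):

lvmconf --disable-cluster    # reset /etc/lvm/lvm.conf to local (non-clustered) locking
service ricci restart        # hypothetical: pick up the change for later storage probes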

Comment 6 Stanko Kupcevic 2007-03-06 09:16:12 UTC
Committed to HEAD/RHEL4/RHEL5

Comment 7 RHEL Program Management 2007-03-21 22:21:06 UTC
This request was evaluated by Red Hat Product Management for inclusion in a Red
Hat Enterprise Linux maintenance release.  Product Management has requested
further review of this request by Red Hat Engineering, for potential
inclusion in a Red Hat Enterprise Linux Update release for currently deployed
products.  This request is not yet committed for inclusion in an Update
release.

Comment 8 Stanko Kupcevic 2007-03-30 18:45:29 UTC
Test with 0.9.2-4.el5 (conga and clustermon)


Comment 10 Daniel Walsh 2007-04-11 19:21:29 UTC
The associated SELinux policy will not get updated. Users can either generate a
local SELinux policy module using audit2allow -M myconga or run in permissive mode.
The fix will show up in 5.1.
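A sketch of that local-module route, using the myconga name from above (paths assume
default auditd logging):

grep cman_tool /var/log/audit/audit.log | audit2allow -M myconga
semodule -i myconga.pp

or, to run permissive temporarily:

setenforce 0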

Comment 11 Kiersten (Kerri) Anderson 2007-04-23 17:08:53 UTC
Fixing Product Name.  Cluster Suite was merged into Enterprise Linux for version
5.0.

Comment 15 errata-xmlrpc 2007-11-07 15:36:55 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2007-0640.html