Description of problem:

I found this problem today while verifying other bug fixes. The detailed steps to recreate are listed below.

Version-Release number of selected component (if applicable):

RHEL5-Server-20070126.0
luci-0.8-30.el5
ricci-0.8-30.el5

How reproducible:

Seems to be 100%

Steps to Reproduce:
1. Create a new cluster
2. After the cluster is complete - and the nodes have rebooted - delete the cluster from the cluster tab in luci (not from the homepage/manage systems tab)
3. The cluster nodes should still be accessible as storage nodes. Go to the storage tab in luci and select one of the nodes. This error is returned in a dialog box:

An error has occured while probing storage: Generic error on host: cluster tools: cman_tool errored

(See the attachment for a screenshot.)

Actual results:

The error listed above and in the screenshot.

Expected results:

No errors.

Additional info:

I also saw this SELinux AVC message - before I installed these new packages:

selinux-policy-2.4.6-32.el5
selinux-policy-targeted-2.4.6-32.el5

type=AVC msg=audit(1170257190.911:160): avc: denied { write } for pid=24613 comm="cman_tool" name="cman_client" dev=dm-0 ino=620947 scontext=system_u:system_r:ricci_modstorage_t:s0 tcontext=system_u:object_r:ccs_var_run_t:s0 tclass=sock_file

I'm not too sure how to clear up this condition - rebooting the nodes seems to do it sometimes.

Here are the services/processes running when the failure occurs:

[root@tng3-1 ~]# service --status-all | grep cman
[root@tng3-1 ~]# service --status-all | grep cluster
modclusterd (pid 2039) is running...
[root@tng3-1 ~]# ps -ef | grep cman
root      4203  2348  0 12:16 pts/0    00:00:00 grep cman
[root@tng3-1 ~]# service cman status
ccsd is stopped
[root@tng3-1 ~]# service rgmanager status
[root@tng3-1 ~]# ps -ef
UID        PID  PPID  C STIME TTY          TIME CMD
root         1     0  0 12:20 ?        00:00:00 init [3]
root         2     1  0 12:20 ?        00:00:00 [migration/0]
root         3     1  0 12:20 ?        00:00:00 [ksoftirqd/0]
root         4     1  0 12:20 ?        00:00:00 [watchdog/0]
root         5     1  0 12:20 ?        00:00:00 [events/0]
root         6     1  0 12:20 ?        00:00:00 [khelper]
root         7     1  0 12:20 ?        00:00:00 [kthread]
root        10     7  0 12:20 ?        00:00:00 [kblockd/0]
root        11     7  0 12:20 ?        00:00:00 [cqueue/0]
root        14     7  0 12:20 ?        00:00:00 [khubd]
root        16     7  0 12:20 ?        00:00:00 [kseriod]
root        87     7  0 12:20 ?        00:00:00 [pdflush]
root        88     7  0 12:20 ?        00:00:00 [pdflush]
root        89     7  0 12:20 ?        00:00:00 [kswapd0]
root        90     7  0 12:20 ?        00:00:00 [aio/0]
root       242     7  0 12:20 ?        00:00:00 [kpsmoused]
root       258     7  0 12:20 ?        00:00:00 [kmirrord]
root       263     7  0 12:20 ?        00:00:00 [ksnapd]
root       266     7  0 12:20 ?        00:00:00 [kjournald]
root       293     7  0 12:20 ?        00:00:00 [kauditd]
root       322     1  0 12:20 ?        00:00:00 /sbin/udevd -d
root       620     7  0 12:20 ?        00:00:00 [scsi_eh_0]
root       621     7  0 12:20 ?        00:00:00 [lpfc_worker_0]
root       622     7  0 12:20 ?        00:00:00 [scsi_wq_0]
root       623     7  0 12:20 ?        00:00:00 [fc_wq_0]
root       624     7  0 12:20 ?        00:00:00 [fc_dl_0]
root      1015     7  0 12:20 ?        00:00:00 [kjournald]
root      1443     1  0 12:20 ?        00:00:00 /sbin/dhclient -1 -q -cf /etc/dhclient-eth0.conf -l
root      1525     1  0 12:20 ?        00:00:00 /usr/sbin/restorecond
root      1536     1  0 12:20 ?        00:00:00 auditd
root      1538  1536  0 12:20 ?        00:00:00 python /sbin/audispd
root      1551     1  0 12:20 ?        00:00:00 syslogd -m 0
root      1554     1  0 12:20 ?        00:00:00 klogd -x
root      1579     1  0 12:20 ?        00:00:00 mcstransd
rpc       1591     1  0 12:20 ?        00:00:00 portmap
root      1611     1  0 12:20 ?        00:00:00 rpc.statd
root      1642     1  0 12:20 ?        00:00:00 rpc.idmapd
dbus      1659     1  0 12:20 ?        00:00:00 dbus-daemon --system
root      1670     1  0 12:20 ?        00:00:00 /usr/sbin/hcid
root      1674     1  0 12:20 ?        00:00:00 /usr/sbin/sdpd
root      1697     1  0 12:20 ?        00:00:00 [krfcommd]
root      1733     1  0 12:20 ?        00:00:00 pcscd
root      1751     1  0 12:20 ?        00:00:00 /usr/bin/hidd --server
root      1765     1  0 12:20 ?        00:00:00 automount
root      1789     1  0 12:20 ?        00:00:00 cupsd
root      1802     1  0 12:20 ?        00:00:00 /usr/sbin/sshd
root      1813     1  0 12:20 ?        00:00:00 xinetd -stayalive -pidfile /var/run/xinetd.pid
root      1833     1  0 12:20 ?        00:00:00 sendmail: accepting connections
smmsp     1842     1  0 12:20 ?        00:00:00 sendmail: Queue runner@01:00:00 for /var/spool/clie
root      1861     1  0 12:20 ?        00:00:00 gpm -m /dev/input/mice -t exps2
root      1887     1  0 12:20 ?        00:00:00 crond
xfs       1909     1  0 12:20 ?        00:00:00 xfs -droppriv -daemon
root      1928     1  0 12:20 ?        00:00:00 /usr/sbin/atd
root      1940     1  0 12:20 ?        00:00:00 /usr/bin/python /usr/sbin/yum-updatesd
avahi     1951     1  0 12:20 ?        00:00:00 avahi-daemon: running [tng3-1.local]
avahi     1952  1951  0 12:20 ?        00:00:00 avahi-daemon: chroot helper
68        1962     1  0 12:20 ?        00:00:00 hald
root      1963  1962  0 12:20 ?        00:00:00 hald-runner
root      1981  1963  0 12:20 ?        00:00:00 hald-addon-storage: polling /dev/hdd
root      2011     1  0 12:21 ?        00:00:00 modclusterd
root      2066     1  0 12:21 ?        00:00:00 /usr/sbin/oddjobd -p /var/run/oddjobd.pid -t 300
root      2097     1  0 12:21 ?        00:00:00 /usr/sbin/saslauthd -m /var/run/saslauthd -a pam
root      2098  2097  0 12:21 ?        00:00:00 /usr/sbin/saslauthd -m /var/run/saslauthd -a pam
root      2099  2097  0 12:21 ?        00:00:00 /usr/sbin/saslauthd -m /var/run/saslauthd -a pam
root      2100  2097  0 12:21 ?        00:00:00 /usr/sbin/saslauthd -m /var/run/saslauthd -a pam
root      2102  2097  0 12:21 ?        00:00:00 /usr/sbin/saslauthd -m /var/run/saslauthd -a pam
ricci     2109     1  0 12:21 ?        00:00:00 ricci -u 101
root      2133     1  0 12:21 ?        00:00:00 /usr/sbin/smartd -q never
root      2136     1  0 12:21 ttyS0    00:00:00 /sbin/agetty ttyS0 115200 vt100-nav
root      2137     1  0 12:21 tty1     00:00:00 /sbin/mingetty tty1
root      2138     1  0 12:21 tty2     00:00:00 /sbin/mingetty tty2
root      2139     1  0 12:21 tty3     00:00:00 /sbin/mingetty tty3
root      2140     1  0 12:21 tty4     00:00:00 /sbin/mingetty tty4
root      2141     1  0 12:21 tty5     00:00:00 /sbin/mingetty tty5
root      2142     1  0 12:21 tty6     00:00:00 /sbin/mingetty tty6
root      2205  1802  0 12:23 ?        00:00:00 sshd: root@pts/0
root      2207  2205  0 12:23 pts/0    00:00:00 -bash
root      2287  2207  0 12:32 pts/0    00:00:00 ps -ef
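The failing check can also be reproduced by hand on an affected node. A minimal sketch, assuming modstorage's probe boils down to a cman_tool query (the exact invocation it uses is a guess on my part):

# ccsd/cman are stopped, so any cman_tool query errors out,
# which is what surfaces as the luci dialog above:
service cman status
cman_tool status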
Created attachment 147037: Screenshot
Note - a reboot does not always clear this problem - having the node join a new cluster does seem to work.
Fixed AVC in selinux-policy-2.4.6-35
So far, the only workarounds are to either have the nodes in question join a new cluster or reinstall the OS. Something seems to be cached somewhere.
modstorage tries to check cluster quorum, but there is no cluster anymore, so cman_tool errors out. `lvmconf --disable-cluster` should be run on cluster deletion/node removal, and a better error message should be generated.
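Until that lands, a minimal sketch of the manual cleanup this diagnosis suggests, run on each node after it has been removed from the cluster (the assumption being that the stale clustered locking setting left in /etc/lvm/lvm.conf is what keeps triggering the quorum check):

# Show whether LVM is still configured for clustered locking
# (locking_type = 3 means clustered, 1 means local):
grep locking_type /etc/lvm/lvm.conf
# Reset LVM to local, file-based locking:
lvmconf --disable-cluster
# Then retry the probe from the luci storage tab.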
Committed to HEAD/RHEL4/RHEL5
This request was evaluated by Red Hat Product Management for inclusion in a Red Hat Enterprise Linux maintenance release. Product Management has requested further review of this request by Red Hat Engineering, for potential inclusion in a Red Hat Enterprise Linux Update release for currently deployed products. This request is not yet committed for inclusion in an Update release.
Test with 0.9.2-4.el5 (conga and clustermon)
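A quick sketch for verifying that the nodes actually carry the builds under test (binary package names are my assumption: the conga source package produces luci and ricci, and clustermon produces the modcluster agent):

rpm -q luci ricci modcluster
# then repeat the reproduction steps from the description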
The associated SELinux policy will not get updated. Users can either generate SELinux policy modules using audit2allow -M myconga or run in permissive mode. The fix will show up in 5.1.
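A minimal sketch of that local-module workaround, assuming the AVC denials (like the cman_tool one quoted in the description) are already in /var/log/audit/audit.log; "myconga" is just the module name suggested above:

# Build a local policy module from the logged denials and load it:
grep avc /var/log/audit/audit.log | audit2allow -M myconga
semodule -i myconga.pp
# Or temporarily switch SELinux to permissive mode instead:
setenforce 0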
Fixing Product Name. Cluster Suite was merged into Enterprise Linux for version 5.0.
An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you. http://rhn.redhat.com/errata/RHSA-2007-0640.html