Description of problem:

Two-node RHCS setup. Both nodes are running RHCS and serving GFS. Every time one of the nodes powers down for a reboot, we see the following:

openais[6111]: [TOTEM] The consensus timeout expired.
openais[6111]: [TOTEM] entering GATHER state from 3.

The node never shuts down; this output is printed forever. The only way to get the node back up is to physically power it off and back on.

Version-Release number of selected component (if applicable):

Kernel Release: 2.6.18-194.el5
RHEL Release: Red Hat Enterprise Linux Server release 5.5 (Tikanga)
Version: Linux version 2.6.18-194.el5 (mockbuild.bos.redhat.com) (gcc version 4.1.2 20080704 (Red Hat 4.1.2-48)) #1 SMP Tue Mar 16 22:03:12 EDT 2010
Platform: ppc64

How reproducible:
Often.

Steps to Reproduce:
1. Two RHEL 5.5 PPC servers.
2. Connected to two storage arrays.
3. Created 4 volumes from each storage array and mapped them to the cluster group of the two servers.
4. Installed RHCS on both servers.
5. Set up RHCS, also using GFS.
6. Reboot one of the nodes.

Actual results:
The node repeatedly reports the following messages during the shutdown sequence:

openais[6111]: [TOTEM] The consensus timeout expired.
openais[6111]: [TOTEM] entering GATHER state from 3.
openais[6111]: [TOTEM] The consensus timeout expired.
openais[6111]: [TOTEM] entering GATHER state from 3.

Expected results:
The node shuts down and boots back up without any issue.

Additional info:

[root@tsunami ~]# chkconfig --list | egrep "clvmd|gfs |rgmanage|cman"
clvmd           0:off   1:off   2:on    3:on    4:on    5:on    6:off
cman            0:off   1:off   2:on    3:on    4:on    5:on    6:off
gfs             0:off   1:off   2:on    3:on    4:on    5:on    6:off
rgmanager       0:off   1:off   2:on    3:on    4:on    5:on    6:off

[root@tsunami ~]# service gfs status
Configured GFS mountpoints:
/home/smashmnt0
/home/smashmnt1
/home/smashmnt2
/home/smashmnt3
/home/smashmnt4
/home/smashmnt5
/home/smashmnt6
/home/smashmnt7
Active GFS mountpoints:
/home/smashmnt0
/home/smashmnt1
/home/smashmnt2
/home/smashmnt3
/home/smashmnt4
/home/smashmnt5
/home/smashmnt6
/home/smashmnt7

[root@tsunami ~]# service cman status
cman is running.

[root@washuu testutils]# chkconfig --list | egrep "clvmd|gfs |rgmanage|cman"
clvmd           0:off   1:off   2:on    3:on    4:on    5:on    6:off
cman            0:off   1:off   2:on    3:on    4:on    5:on    6:off
gfs             0:off   1:off   2:on    3:on    4:on    5:on    6:off
rgmanager       0:off   1:off   2:on    3:on    4:on    5:on    6:off

[root@washuu testutils]# service gfs status
Configured GFS mountpoints:
/home/smashmnt0
/home/smashmnt1
/home/smashmnt2
/home/smashmnt3
/home/smashmnt4
/home/smashmnt5
/home/smashmnt6
/home/smashmnt7
Active GFS mountpoints:
/home/smashmnt0
/home/smashmnt1
/home/smashmnt2
/home/smashmnt3
/home/smashmnt4
/home/smashmnt5
/home/smashmnt6
/home/smashmnt7

[root@washuu testutils]# service cman status
cman is running.
[root@tsunami ~]# cat /etc/cluster/cluster.conf
<?xml version="1.0"?>
<cluster alias="washuu-tsunami" config_version="4" name="washuu-tsunami">
        <fence_daemon post_fail_delay="0" post_join_delay="3"/>
        <clusternodes>
                <clusternode name="washuu" nodeid="1" votes="1">
                        <fence>
                                <method name="1">
                                        <device name="Persistent_Reserve" node="washuu"/>
                                </method>
                        </fence>
                </clusternode>
                <clusternode name="tsunami" nodeid="2" votes="1">
                        <fence>
                                <method name="1">
                                        <device name="Persistent_Reserve" node="tsunami"/>
                                </method>
                        </fence>
                </clusternode>
        </clusternodes>
        <cman expected_votes="1" two_node="1"/>
        <fencedevices>
                <fencedevice agent="fence_scsi" name="Persistent_Reserve"/>
        </fencedevices>
        <rm>
                <failoverdomains>
                        <failoverdomain name="tsunami1" ordered="1" restricted="0">
                                <failoverdomainnode name="washuu" priority="2"/>
                                <failoverdomainnode name="tsunami" priority="1"/>
                        </failoverdomain>
                        <failoverdomain name="washuu1" ordered="1">
                                <failoverdomainnode name="washuu" priority="1"/>
                                <failoverdomainnode name="tsunami" priority="2"/>
                        </failoverdomain>
                </failoverdomains>
                <resources>
                        <ip address="172.22.229.160" monitor_link="1"/>
                        <ip address="172.22.229.165" monitor_link="1"/>
                </resources>
                <service autostart="1" domain="tsunami1" exclusive="0" name="service-172.22.229.160" recovery="relocate">
                        <ip ref="172.22.229.160"/>
                </service>
                <service autostart="1" domain="washuu1" exclusive="0" name="service-172.22.229.165" recovery="relocate">
                        <ip ref="172.22.229.165"/>
                </service>
        </rm>
</cluster>

Console output during shutdown:

The system is going down for reboot NOW!
INIT: Sending processes the TERM signal
Shutting down Cluster Module - cluster monitor: [ OK ]
Shutting down Cluster Service Manager...
Waiting for services to stop: [ OK ]
Cluster Service Manager is stopped.
Shutting down ricci: [ OK ]
Shutting down smartd: [ OK ]
[ OK ]
down CIM server: [ OK ]
Shutting down Avahi daemon: [ OK ]
Shutting down oddjobd: [ OK ]
Stopping yum-updatesd: [ OK ]
Stopping anacron: [ OK ]
Stopping atd: [ OK ]
Stopping saslauthd: [ OK ]
Stopping cups: [ OK ]
Stopping hpiod: [ OK ]
Stopping hpssd: [ OK ]
Shutting down xfs: [ OK ]
Shutting down console mouse services: [ OK ]
Shutting down NFS mountd: [ OK ]
Shutting down NFS daemon: nfsd: last server has exited
nfsd: unexporting all filesystems
[ OK ]
Shutting down NFS quotas: [ OK ]
Shutting down NFS services: [ OK ]
Stopping sshd: [ OK ]
Shutting down sm-client: [ OK ]
Shutting down sendmail: [ OK ]
Shutting down vsftpd: [ OK ]
Stopping xinetd: [ OK ]
Stopping crond: [ OK ]
Stopping autofs: Stopping automount: [ OK ]
[ OK ]
Deactivating VG lvm_vg: Can't deactivate volume group "lvm_vg" with 8 open logical volume(s) [FAILED]
Unmounting GFS filesystems: [ OK ]
Stopping HAL daemon: [ OK ]
Unmounting NFS filesystems: [ OK ]
Stopping cluster:
   Stopping fencing... done
   Stopping cman... failed
/usr/sbin/cman_tool: Error leaving cluster: Device or resource busy [FAILED]
Shutting down fcauthd [ OK ]
Stopping system message bus: [ OK ]
Stopping RPC idmapd: [ OK ]
Stopping NFS statd: [ OK ]
Stopping portmap: [ OK ]
Stopping auditd: audit(1273785255.045:142): audit_pid=0 old=2562 by auid=4294967295
[ OK ]
Stopping PC/SC smart card daemon (pcscd): [ OK ]
Shutting down kernel logger: [ OK ]
Shutting down system logger: [ OK ]
Shutting down hidd: [ OK ]
[ OK ]
Bluetooth services: [ OK ]
Shutting down interface eth0: ehea: eth0: Logical port down
ehea: eth0: Physical port up
ehea: External switch port is backup port
[ OK ]
Shutting down loopback interface: [ OK ]
Starting killall:
openais[6111]: [TOTEM] The token was lost in the OPERATIONAL state.
openais[6111]: [TOTEM] Receive multicast socket recv buffer size (320000 bytes).
openais[6111]: [TOTEM] Transmit multicast socket send buffer size (258048 bytes).
openais[6111]: [TOTEM] The network interface is down.
openais[6111]: [TOTEM] entering GATHER state from 15.
openais[6111]: [TOTEM] entering GATHER state from 2.
openais[6111]: [TOTEM] entering GATHER state from 0.
openais[6111]: [TOTEM] The consensus timeout expired.
openais[6111]: [TOTEM] entering GATHER state from 3.
openais[6111]: [TOTEM] The consensus timeout expired.
openais[6111]: [TOTEM] entering GATHER state from 3.
Any updates?
Abdel - is this ppc specific, or can it be reproduced on other arches as well?
After some testing and log file analysis, I'm really NOT sure that this is an openais problem. It *can* be a problem in clvmd (Can't deactivate volume group "lvm_vg" with 8 open logical volume(s)), in the init scripts, or maybe in cman (Error leaving cluster: Device or resource busy). But the fact is that openais never receives the shutdown signal (cman is responsible for that). I'm reassigning the bug to cman. Chrissie, if you feel that this IS an openais problem, please reassign the bug back to me.
As Honza says, this all comes down to the fact that clvmd can't deactivate the logical volumes. Until that happens, there is a whole chain of dependent services that can't shut down. So the first thing to investigate is what is holding those volumes open. There is also the side issue that sending signals to openais does not shut it down.
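A minimal diagnostic sketch for that investigation (not from the original report; the lvm_vg name and the smashmnt mountpoints are taken from the console output and GFS status above, the rest is generic LVM/util-linux tooling):

# Which LVs in the VG are still held open ("Open" column > 0)?
dmsetup info -c | grep lvm_vg
lvs -o lv_name,vg_name,lv_attr lvm_vg

# Are the LVs still mounted (the GFS mountpoints from the report)?
mount | grep -e lvm_vg -e smashmnt

# Which processes hold a given device or mountpoint open?
fuser -vm /dev/lvm_vg/*
lsof +D /home/smashmnt0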
To answer the question about other architectures: we have also seen this issue on x64 and ia64.
Any updates?
The log messages from openais are not valid indicators of this failure. The actual failure is:

Stopping cman... failed

This means that openais does not shut down. When openais doesn't shut down and the network interfaces are later stopped, openais prints those messages to the log. Not an openais problem.
Wow.

[root@molly rc6.d]# ls -l *clvmd*
lrwxrwxrwx 1 root root 15 Jul  1 16:36 K74clvmd -> ../init.d/clvmd
[root@molly rc6.d]# chkconfig --del clvmd
[root@molly rc6.d]# ls -l *clvmd*
ls: *clvmd*: No such file or directory
[root@molly rc6.d]# grep chkconfig ../init.d/clvmd
# chkconfig: - 24 76
[root@molly rc6.d]# chkconfig --level 345 clvmd on
[root@molly rc6.d]# ls -l *clvmd*
lrwxrwxrwx 1 root root 15 Jul  1 16:40 K74clvmd -> ../init.d/clvmd
[root@molly rc6.d]#
Clvmd is stopping at 74 instead of 76 - at the same level as gfs/gfs2. It should be stopping at 76 (after gfs/gfs2).
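As a quick illustration of the ordering problem, the kill links and the declared priorities can be compared directly (a sketch assuming the standard SysV rc directory layout; /etc/init.d/gfs2 may not exist on every node):

# Kill scripts run in ascending numeric order at shutdown, so clvmd needs a
# higher K-number than gfs/gfs2 in order to stop after them.
ls -l /etc/rc6.d/ | egrep 'clvmd|gfs'

# Intended stop priorities declared in the initscript headers:
grep '^# chkconfig:' /etc/init.d/clvmd /etc/init.d/gfs /etc/init.d/gfs2 2>/dev/null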
Performing the following allows clvmd to stop at the right time (i.e. after gfs/gfs2 are unmounted by their respective initscripts); a scripted sketch of these steps follows below:

* Remove "Required-Stop: $local_fs" from /etc/init.d/clvmd
* chkconfig --del clvmd
* chkconfig --level 345 clvmd on

Changing component to lvm2-cluster.
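For convenience, the same workaround written out as a small script. It is only a sketch: it assumes clvmd's LSB header carries the "$local_fs" token on a line starting with "# Required-Stop:", and it should be reviewed before use.

#!/bin/bash
# Sketch of the workaround from the comment above (review before running).
set -e

# 1. Drop the $local_fs token from clvmd's Required-Stop header line
#    (keeps a backup in /etc/init.d/clvmd.bak).
sed -i.bak '/^# Required-Stop:/s/ \$local_fs//' /etc/init.d/clvmd

# 2. Remove and re-add the service so chkconfig recalculates the start/kill links.
chkconfig --del clvmd
chkconfig --level 345 clvmd on

# 3. clvmd should now stop at priority 76, i.e. after the gfs/gfs2 initscripts.
ls -l /etc/rc6.d/*clvmd*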
Steve Dake noticed that this works as expected on later releases. It turns out that /etc/init.d/netfs declares "Provides: $local_fs" on Red Hat Enterprise Linux 5, but not on later releases of Fedora or Red Hat Enterprise Linux 6 Beta. That declaration interferes with chkconfig, which reorders the initscripts based on the "Provides:" information at the top of each script.
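To make the interaction concrete: chkconfig resolves clvmd's "Required-Stop: $local_fs" against whatever script claims to provide $local_fs, so when netfs provides it on RHEL 5, clvmd's kill priority is recomputed relative to netfs instead of keeping the "# chkconfig: - 24 76" default. The header fragments below are illustrative assumptions, not copies of the shipped scripts; the grep at the end is how one could check what a given system actually declares.

# Hypothetical fragment of /etc/init.d/netfs on RHEL 5 (illustrative only):
#   ### BEGIN INIT INFO
#   # Provides: $local_fs $remote_fs
#   ### END INIT INFO
#
# Hypothetical fragment of /etc/init.d/clvmd (illustrative only):
#   # chkconfig: - 24 76
#   ### BEGIN INIT INFO
#   # Required-Stop: $local_fs
#   ### END INIT INFO

# Check what the scripts on a given system really declare:
grep -n '^# *\(Provides\|Required-Stop\|chkconfig\):' /etc/init.d/netfs /etc/init.d/clvmd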
Created attachment 428565 [details]
Example fix.

After applying this patch to /etc/init.d/netfs, clvmd will be stopped at level 76 as expected, which is after the gfs/gfs2 init scripts.
After applying the above patch to /etc/init.d/netfs, you must perform:

* chkconfig --del clvmd
* chkconfig --level 345 clvmd on

... in order to reset the links.
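To confirm that the links were regenerated as intended (a sketch assuming the standard SysV rc directory layout; the expected K76 number comes from clvmd's "# chkconfig: - 24 76" header quoted earlier):

ls -l /etc/rc6.d/ | egrep 'clvmd|gfs'
# The gfs/gfs2 kill links should now sort before the clvmd one, e.g.:
#   K74gfs   -> ../init.d/gfs
#   K76clvmd -> ../init.d/clvmd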
But how did chkconfig choose to set them at the same level if there's a dependency between them? Is the gfs script missing 'Required-Stop: clvmd' (and Required-Start, too)? And is clvmd missing a dependency on cman?
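For reference, the kind of explicit dependencies being asked about would look roughly like the following; these header fragments are hypothetical, not taken from the shipped gfs or clvmd scripts:

# Hypothetical /etc/init.d/gfs LSB header with an explicit clvmd dependency
# (Required-Stop means clvmd must still be running when gfs stops,
#  i.e. gfs is stopped before clvmd):
### BEGIN INIT INFO
# Provides: gfs
# Required-Start: clvmd
# Required-Stop: clvmd
### END INIT INFO

# Hypothetical /etc/init.d/clvmd header with an explicit cman dependency:
### BEGIN INIT INFO
# Provides: clvmd
# Required-Start: cman
# Required-Stop: cman
### END INIT INFO

After a chkconfig --del/--add cycle, the resolver should then order the stop sequence gfs, clvmd, cman regardless of the numeric defaults.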
I'd agree... changing this in initscripts seems fishy, not least because some application may rely on that Provides entry in RHEL 5.
*** This bug has been marked as a duplicate of bug 588903 ***
You're right, Bill. Turns out it was a regression in lvm2-cluster.