Description of problem:
I saw this after 26 iterations of HA service relocation. There was no nfs client I/O taking place during these iterations.

================================================================================
Iteration 26 started at Fri Feb 4 12:54:30 CST 2011
Verifying that all services are started on all nodes in cluster
Sleeping 1 minute(s) in between each relocation...
Relocating nfs1 from grant-01 to grant-03
service:nfs1 owner should be [grant-03], not [none].
service:nfs1 is stuck in the [stopped] state.
Failed relocation attempt

[root@grant-01 ~]# clustat
Cluster Status for GRANT @ Fri Feb 4 14:26:10 2011
Member Status: Quorate

 Member Name        ID   Status
 ------ ----        ---- ------
 grant-01           1    Online, Local, rgmanager
 grant-02           2    Online, rgmanager
 grant-03           3    Online, rgmanager

 Service Name       Owner (Last)       State
 ------- ----       ----- ------       -----
 service:nfs1       (grant-01)         stopped

[root@grant-03 ~]# clustat
Cluster Status for GRANT @ Fri Feb 4 14:27:03 2011
Member Status: Quorate

 Member Name        ID   Status
 ------ ----        ---- ------
 grant-01           1    Online, rgmanager
 grant-02           2    Online, rgmanager
 grant-03           3    Online, Local, rgmanager

 Service Name       Owner (Last)       State
 ------- ----       ----- ------       -----
 service:nfs1       (grant-01)         stopped

Feb 4 12:55:23 grant-01 qarshd[21754]: Running cmdline: clusvcadm -r nfs1 -m grant-03
Feb 4 12:55:23 grant-01 rgmanager[2162]: Stopping service service:nfs1
Feb 4 12:55:23 grant-01 rgmanager[21788]: Removing IPv4 address 10.15.89.208/24 from eth0
Feb 4 12:55:33 grant-01 rgmanager[21831]: Removing export: *:/mnt/grant1
Feb 4 12:55:33 grant-01 rgmanager[21863]: Stopping NFS daemons
Feb 4 12:55:33 grant-01 mountd[20550]: Caught signal 15, un-registering and exiting.
Feb 4 12:55:33 grant-01 kernel: nfsd: last server has exited, flushing export cache
Feb 4 12:55:34 grant-01 rgmanager[21967]: Stopping rpc.statd
Feb 4 12:55:35 grant-01 rgmanager[22100]: unmounting /mnt/grant1
Feb 4 12:55:35 grant-01 rgmanager[2162]: Service service:nfs1 is stopped
Feb 4 12:55:40 grant-01 rgmanager[2162]: #60: Mangled reply from member #3 during RG relocate
Feb 4 12:56:32 grant-03 rgmanager[2156]: #37: Error receiving header from 1 sz=0 CTX 0x8acbf0

Version-Release number of selected component (if applicable):
2.6.32-94.el6.x86_64
rgmanager-3.0.12-10.el6.x86_64
*** This bug has been marked as a duplicate of bug 635152 ***