Description of problem:
There may already be a bz open for this, but I couldn't find it. I had NFS I/O running to two filesystems (ext3 and gfs) while relocating the service around the cluster. On the last relocation attempt, the client I/O to the gfs filesystem failed. When I then tried an ls, it took about 3-5 minutes to return.

Ran the cmd on hayes-01:
Feb 22 16:59:49 hayes-01 qarshd[23533]: Running cmdline: clusvcadm -r nfs1 -m hayes-03

Service is running on hayes-02:
Feb 22 16:59:49 hayes-02 clurgmgrd[12469]: <notice> Stopping service nfs1
Feb 22 16:59:49 hayes-02 clurgmgrd: [12469]: <info> Removing IPv4 address 10.15.89.209 from eth0
Feb 22 16:59:59 hayes-02 clurgmgrd: [12469]: <info> Removing export: *:/mnt/hayes0
Feb 22 16:59:59 hayes-02 clurgmgrd: [12469]: <warning> Dropping node-wide NFS locks
Feb 22 16:59:59 hayes-02 clurgmgrd: [12469]: <info> unmounting /dev/mapper/HAYES-HAYES0 (/mnt/hayes0)
Feb 22 16:59:59 hayes-02 clurgmgrd: [12469]: <info> Removing export: *:/mnt/hayes1
Feb 22 16:59:59 hayes-02 clurgmgrd: [12469]: <info> unmounting /mnt/hayes1
Feb 22 16:59:59 hayes-02 clurgmgrd[12469]: <notice> Service nfs1 is stopped

Service is supposed to relocate to hayes-03:
Feb 22 17:00:00 hayes-03 clurgmgrd[13157]: <notice> Starting stopped service nfs1
Feb 22 17:00:00 hayes-03 clurgmgrd: [13157]: <info> mounting /dev/mapper/HAYES-HAYES1 on /mnt/hayes1
Feb 22 17:00:00 hayes-03 kernel: kjournald starting.  Commit interval 5 seconds
Feb 22 17:00:00 hayes-03 kernel: EXT3 FS on dm-3, internal journal
Feb 22 17:00:00 hayes-03 kernel: EXT3-fs: mounted filesystem with ordered data mode.
Feb 22 17:00:00 hayes-03 clurgmgrd: [13157]: <info> Adding export: *:/mnt/hayes1 (fsid=7777,rw)
Feb 22 17:00:00 hayes-03 kernel: GFS: Trying to join cluster "lock_dlm", "HAYES:HAYES0"
Feb 22 17:00:02 hayes-03 kernel: GFS: fsid=HAYES:HAYES0.0: Joined cluster. Now mounting FS...
Feb 22 17:00:02 hayes-03 kernel: GFS: fsid=HAYES:HAYES0.0: jid=0: Trying to acquire journal lock...
Feb 22 17:00:02 hayes-03 kernel: GFS: fsid=HAYES:HAYES0.0: jid=0: Looking at journal...
Feb 22 17:00:02 hayes-03 kernel: GFS: fsid=HAYES:HAYES0.0: jid=0: Done
Feb 22 17:00:02 hayes-03 kernel: GFS: fsid=HAYES:HAYES0.0: jid=1: Trying to acquire journal lock...
Feb 22 17:00:02 hayes-03 kernel: GFS: fsid=HAYES:HAYES0.0: jid=1: Looking at journal...
Feb 22 17:00:02 hayes-03 kernel: GFS: fsid=HAYES:HAYES0.0: jid=1: Done
Feb 22 17:00:02 hayes-03 kernel: GFS: fsid=HAYES:HAYES0.0: jid=2: Trying to acquire journal lock...
Feb 22 17:00:02 hayes-03 kernel: GFS: fsid=HAYES:HAYES0.0: jid=2: Looking at journal...
Feb 22 17:00:02 hayes-03 kernel: GFS: fsid=HAYES:HAYES0.0: jid=2: Done
Feb 22 17:00:02 hayes-03 clurgmgrd: [13157]: <info> Adding export: *:/mnt/hayes0 (fsid=8868,rw)
Feb 22 17:00:02 hayes-03 clurgmgrd: [13157]: <info> Adding IPv4 address 10.15.89.209 to eth0
Feb 22 17:00:03 hayes-03 clurgmgrd[13157]: <notice> Service nfs1 started

Client I/O failure:
[accordion_quick] accordion(): cache_open(accrdfile4, 4162, 0666) failed: Stale NFS file handle

Version-Release number of selected component (if applicable):
2.6.9-67.ELsmp
rgmanager-1.9.72-1
NFS client: 2.6.9-42.ELhugemem
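Not part of the original report, but for context: a stale-handle error like the one above surfaces to the application as ESTALE from a read or open. A minimal, hypothetical client-side sketch in Python (the helper name and retry policy are my own, not from the accordion tool) of how an I/O loop might retry on a transient stale handle after a relocation:

```python
import errno

def read_with_estale_retry(path, retries=3):
    """Read a file, retrying if the NFS server returns a stale
    file handle (ESTALE), e.g. after a service relocation."""
    last_err = None
    for attempt in range(retries):
        try:
            # Re-open on every attempt so the client can look the
            # file up again and obtain a fresh NFS file handle.
            with open(path, "rb") as f:
                return f.read()
        except OSError as e:
            if e.errno != errno.ESTALE:
                raise  # only retry on stale-handle errors
            last_err = e
    raise last_err
```

Note that a retry only helps if the handle can become valid again; with a changed or missing fsid on the new node, every re-lookup keeps failing, which matches the behavior seen here.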
Here was my resource section:

<rm>
  <failoverdomains>
    <failoverdomain name="HAYES_domain" ordered="0" restricted="0">
      <failoverdomainnode name="hayes-01" priority="1"/>
      <failoverdomainnode name="hayes-02" priority="1"/>
      <failoverdomainnode name="hayes-03" priority="1"/>
    </failoverdomain>
  </failoverdomains>
  <resources>
    <ip address="10.15.89.209" monitor_link="1"/>
    <clusterfs device="/dev/HAYES/HAYES0" force_unmount="1" self_fence="1" fsid="8868" fstype="gfs" mountpoint="/mnt/hayes0" name="HAYES0" options=""/>
    <fs device="/dev/HAYES/HAYES1" force_fsck="0" force_unmount="1" self_fence="1" fsid="7777" fstype="ext3" mountpoint="/mnt/hayes1" name="HAYES1" options=""/>
    <nfsexport name="HAYES nfs exports"/>
    <nfsclient name="*" options="rw" target="*"/>
  </resources>
  <service autostart="1" domain="HAYES_domain" name="nfs1" nfslock="1">
    <clusterfs ref="HAYES0">
      <nfsexport ref="HAYES nfs exports">
        <nfsclient ref="*"/>
      </nfsexport>
    </clusterfs>
    <fs ref="HAYES1">
      <nfsexport ref="HAYES nfs exports">
        <nfsclient ref="*"/>
      </nfsexport>
    </fs>
    <ip ref="10.15.89.209"/>
  </service>
</rm>

The relocate did appear to work:

[root@hayes-03 etc]# clustat
Member Status: Quorate

  Member Name                 Status
  ------ ----                 ------
  hayes-01                    Online, rgmanager
  hayes-02                    Online, rgmanager
  hayes-03                    Online, Local, rgmanager

  Service Name      Owner (Last)          State
  ------- ----      ------------          -----
  nfs1              hayes-03              started
*** This bug has been marked as a duplicate of 252335 ***