166543 – NFS Stale locks when NFS export moves to another node

Summary: NFS Stale locks when NFS export moves to another node

Keywords:
Status:	CLOSED DUPLICATE of bug 132823
Alias:	None
Product:	Red Hat Enterprise Linux 4
Classification:	Red Hat
Component:	nfs-utils
Sub Component:
Version:	4.0
Hardware:	i386
OS:	Linux
Priority:	medium
Severity:	medium
Target Milestone:	---
Target Release:	---
Assignee:	Steve Dickson
QA Contact:	Cluster QE
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:

Reported:	2005-08-23 05:11 UTC by Alex Samad
Modified:	2007-11-30 22:07 UTC
CC List:	6 users
Fixed In Version:
Doc Type:	Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed:	2005-10-20 20:50:32 UTC
Target Upstream Version:
Embargoed:

Attachments
cluster conf file (4.44 KB, text/plain) 2005-08-23 05:11 UTC, Alex Samad
rmtab file from 4 nodes (542 bytes, text/plain) 2005-08-23 05:13 UTC, Alex Samad

Description Alex Samad 2005-08-23 05:11:46 UTC

Description of problem:
I have set up a 4-node cluster with GFS (AS4, Cluster Suite 4, GFS 6.1). I have
3 GFS partitions that I am trying to export via NFS. This has been done and I
can mount them on another Red Hat AS3 box. When I move the NFS shares to other
nodes to simulate a failover, I get stale NFS handle errors on the client
machine.

Upon investigation I found an article for Cluster Suite 3 about setting FS for
NFS shares, as these need to be the same on each node. The ability to set
these options through the GUI seems to have been removed, but the
documentation states that clurmtabd should handle this.

I have run cat /var/lib/nfs/rmtab on all the nodes, both while it is working
(before the move) and afterwards (when I am getting the stale NFS handles), and
the numbers are not similar or consistent across the nodes. NOTE: I am assuming
that the numbers in these files are the FS numbers.

I have checked the major/minor numbers for all the base devices (LVM
partitions) and they are the same on all the machines!
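
For anyone repeating the major/minor check, a minimal Python sketch follows.
It is an illustration only, not something from the original report: the
function names are made up here, and the device paths to pass in (the LVM
device nodes on each cluster node) are left to the reader.

```python
import os
import stat


def device_numbers(path):
    """Return (major, minor) for the block or character device node at path."""
    st = os.stat(path)
    if not (stat.S_ISBLK(st.st_mode) or stat.S_ISCHR(st.st_mode)):
        raise ValueError(f"{path} is not a device node")
    # st_rdev holds the device ID of the node itself (not of the
    # filesystem that contains it, which would be st_dev).
    return os.major(st.st_rdev), os.minor(st.st_rdev)


def numbers_consistent(numbers_by_node):
    """True if every node reported the same (major, minor) pair.

    numbers_by_node maps a node name (hypothetical labels chosen by the
    caller) to the tuple returned by device_numbers() on that node.
    """
    return len(set(numbers_by_node.values())) <= 1
```

Run device_numbers() on each node for the same logical volume, collect the
results by hand or over ssh, and feed them to numbers_consistent().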

Here is example rmtab output before the move:
---
forall cat /var/lib/nfs/rmtab \; echo 
172.20.0.0/16:/mnt/gfs/lv1:0x00000014
172.20.0.0/16:/mnt/gfs/lv2:0x00000014
172.20.0.0/16:/mnt/gfs/lv3:0x0000001a

172.20.0.0/16:/mnt/gfs/lv1:0x00000014
172.20.0.0/16:/mnt/gfs/lv2:0x00000012
172.20.0.0/16:/mnt/gfs/lv3:0x00000017
172.20.231.100:172.20.0.0/16:0x0000000a

172.20.0.0/16:/mnt/gfs/lv1:0x00000014
172.20.0.0/16:/mnt/gfs/lv2:0x00000014
172.20.0.0/16:/mnt/gfs/lv3:0x0000001a

172.20.0.0/16:/mnt/gfs/lv1:0x00000014
172.20.0.0/16:/mnt/gfs/lv2:0x00000014
---

And after the move:

[samad@rhnsat test]$ forall cat /var/lib/nfs/rmtab \; echo 
172.20.0.0/16:/mnt/gfs/lv1:0x00000014
172.20.0.0/16:/mnt/gfs/lv2:0x00000014
172.20.0.0/16:/mnt/gfs/lv3:0x0000001a

172.20.0.0/16:/mnt/gfs/lv1:0x00000017
172.20.0.0/16:/mnt/gfs/lv2:0x00000012
172.20.0.0/16:/mnt/gfs/lv3:0x00000017
172.20.231.100:172.20.0.0/16:0x0000000b

172.20.0.0/16:/mnt/gfs/lv1:0x00000016
172.20.0.0/16:/mnt/gfs/lv2:0x00000017
172.20.231.100:172.20.0.0/16:0x00000004
172.20.0.0/16:/mnt/gfs/lv3:0x00000002

172.20.0.0/16:/mnt/gfs/lv1:0x00000014
172.20.0.0/16:/mnt/gfs/lv2:0x00000017
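
The comparison above was done by eye. A rough Python sketch of automating it
is given below, assuming each rmtab line has the form
client:path:0xVALUE as in the output shown. The trailing hex field is treated
as an opaque number, since the report only assumes what it means. The
function names and the node labels passed in are hypothetical.

```python
from collections import defaultdict


def parse_rmtab(text):
    """Parse rmtab text into {(client, path): value}."""
    entries = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        # Split from the right so only the last two colons are used:
        # the final field is the hex value, the one before it the path.
        client, path, value = line.rsplit(":", 2)
        entries[(client, path)] = int(value, 16)
    return entries


def find_mismatches(rmtab_by_node):
    """Return entries whose value differs between nodes.

    rmtab_by_node maps a node label to the contents of that node's
    /var/lib/nfs/rmtab. Entries present on only one node are not
    reported, because there is nothing to compare them against.
    """
    seen = defaultdict(dict)
    for node, text in rmtab_by_node.items():
        for key, value in parse_rmtab(text).items():
            seen[key][node] = value
    return {
        key: per_node
        for key, per_node in seen.items()
        if len(set(per_node.values())) > 1
    }
```

Calling find_mismatches() with the four rmtab files keyed by node name returns
a dictionary of the (client, path) pairs that disagree, together with the
value each node reported.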




Version-Release number of selected component (if applicable):
rgmanager-1.9.34-1

How reproducible:
Every time

Steps to Reproduce:
1. Start the cluster
2. Mount the NFS exports on the client
3. Move an NFS export to a new node
 
  
Actual results:
I get NFS stale locks on the client machine.

Expected results:
The NFS share should seamlessly fail over to the new node.

Additional info:
N/A

Comment 1 Alex Samad 2005-08-23 05:11:46 UTC

Created attachment 117989
cluster conf file

Comment 2 Alex Samad 2005-08-23 05:13:17 UTC

Created attachment 117990
rmtab file from 4 nodes

this shows the rmtab file from all the nodes

Comment 5 Lon Hohberger 2005-10-04 21:28:45 UTC

Assigning to NFS maintainer, but staying on CC list for now.

Comment 9 Kiersten (Kerri) Anderson 2005-10-20 20:50:32 UTC

We think this is a duplicate of the NFS failover defect, so we are going to
close it as such and link it to that one: BZ 132823.

*** This bug has been marked as a duplicate of 132823 ***

Comment 10 Axel Thimm 2005-10-20 21:35:51 UTC

bug #132823 is protected. Could it be opened to the public? Otherwise it would
be good to keep this one open to have something to track.
