Bug 166543

Summary: NFS Stale locks when NFS export moves to another node
Product: Red Hat Enterprise Linux 4
Reporter: Alex Samad <alexander.samad>
Component: nfs-utils
Assignee: Steve Dickson <steved>
Status: CLOSED DUPLICATE
QA Contact: Cluster QE <mspqa-list>
Severity: medium
Priority: medium
Version: 4.0
CC: axel.thimm, kanderso, lhh, poelstra, rkenna, steved
Hardware: i386
OS: Linux
Doc Type: Bug Fix
Last Closed: 2005-10-20 20:50:32 UTC
Attachments:
  cluster conf file (flags: none)
  rmtab file from 4 nodes (flags: none)

Description Alex Samad 2005-08-23 05:11:46 UTC
Description of problem:
I have set up a 4-node cluster with GFS (AS4, Cluster Suite 4, GFS 6.1). I have
3 GFS partitions that I am exporting via NFS, and I can mount them on another
Red Hat AS3 box. When I move the NFS shares to other nodes to simulate a
failover, I get stale NFS handle errors on the client machine.

Upon investigation I found an article for Cluster Suite 3 about setting the FS
ID for NFS shares, since it needs to be the same on each node. The ability to
set this option through the GUI seems to have been removed, but the
documentation states that clurmtabd should handle this.
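
For reference, as I understand it the Cluster Suite 3 approach amounts to pinning the same fsid on every node so that the file handles a client holds stay valid when the export moves. A rough sketch of what that would look like in /etc/exports (the fsid values and options here are only an example, not my actual config, which is managed through cluster.conf):

---
# /etc/exports, identical on every cluster node; fsid must match across
# nodes so NFS file handles stay valid after the export relocates
/mnt/gfs/lv1  172.20.0.0/16(rw,sync,fsid=1)
/mnt/gfs/lv2  172.20.0.0/16(rw,sync,fsid=2)
/mnt/gfs/lv3  172.20.0.0/16(rw,sync,fsid=3)
---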

I have run cat /var/lib/nfs/rmtab on all the nodes while it is working (before
the move) and after the move, when I am getting the stale NFS handles, and the
numbers are not similar or consistent across the nodes. NOTE: I am assuming
the numbers in these files are the FS numbers.

I have checked the major/minor numbers for all the base devices (LVM
partitions) and they are the same on all the machines!
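
One way to compare them across nodes (the LVM device names below are placeholders, not my actual volume names):

---
# run on every node; the major,minor pair should be identical everywhere
ls -l /dev/mapper/<vg>-<lv1> /dev/mapper/<vg>-<lv2> /dev/mapper/<vg>-<lv3>
# or print them in hex with stat
stat -c '%n %t:%T' /dev/mapper/<vg>-<lv1>
---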

Here is example output before the move:
---
forall cat /var/lib/nfs/rmtab \; echo 
172.20.0.0/16:/mnt/gfs/lv1:0x00000014
172.20.0.0/16:/mnt/gfs/lv2:0x00000014
172.20.0.0/16:/mnt/gfs/lv3:0x0000001a

172.20.0.0/16:/mnt/gfs/lv1:0x00000014
172.20.0.0/16:/mnt/gfs/lv2:0x00000012
172.20.0.0/16:/mnt/gfs/lv3:0x00000017
172.20.231.100:172.20.0.0/16:0x0000000a

172.20.0.0/16:/mnt/gfs/lv1:0x00000014
172.20.0.0/16:/mnt/gfs/lv2:0x00000014
172.20.0.0/16:/mnt/gfs/lv3:0x0000001a

172.20.0.0/16:/mnt/gfs/lv1:0x00000014
172.20.0.0/16:/mnt/gfs/lv2:0x00000014
---

and after the move:

[samad@rhnsat test]$ forall cat /var/lib/nfs/rmtab \; echo 
172.20.0.0/16:/mnt/gfs/lv1:0x00000014
172.20.0.0/16:/mnt/gfs/lv2:0x00000014
172.20.0.0/16:/mnt/gfs/lv3:0x0000001a

172.20.0.0/16:/mnt/gfs/lv1:0x00000017
172.20.0.0/16:/mnt/gfs/lv2:0x00000012
172.20.0.0/16:/mnt/gfs/lv3:0x00000017
172.20.231.100:172.20.0.0/16:0x0000000b

172.20.0.0/16:/mnt/gfs/lv1:0x00000016
172.20.0.0/16:/mnt/gfs/lv2:0x00000017
172.20.231.100:172.20.0.0/16:0x00000004
172.20.0.0/16:/mnt/gfs/lv3:0x00000002

172.20.0.0/16:/mnt/gfs/lv1:0x00000014
172.20.0.0/16:/mnt/gfs/lv2:0x00000017




Version-Release number of selected component (if applicable):
rgmanager-1.9.34-1

How reproducible:
Every time

Steps to Reproduce:
1. Start the cluster.
2. Mount the NFS exports on a client.
3. Move an NFS export service to another node (example commands below).
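
For example (the floating IP, service name, and node name below are placeholders; clusvcadm is one way to relocate an rgmanager service, the GUI is another):

---
# on the NFS client: mount an export through the clustered service address
mount -t nfs <floating_ip>:/mnt/gfs/lv1 /mnt/nfs-test

# on a cluster node: relocate the NFS service to another member
clusvcadm -r <nfs_service> -m <other_node>
---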
 
  
Actual results:
I get stale NFS handle errors on the client machine.

Expected results:
The NFS share should seamlessly fail over to the new node.

Additional info:
N/A

Comment 1 Alex Samad 2005-08-23 05:11:46 UTC
Created attachment 117989 [details]
cluster conf file

Comment 2 Alex Samad 2005-08-23 05:13:17 UTC
Created attachment 117990 [details]
rmtab file from 4 nodes

this shows the rmtab file from all the nodes

Comment 5 Lon Hohberger 2005-10-04 21:28:45 UTC
Assigning to NFS maintainer, but staying on CC list for now.

Comment 9 Kiersten (Kerri) Anderson 2005-10-20 20:50:32 UTC
We think this is a duplicate of the NFS failover defect (BZ 132823), so we are
going to close it as such and link it to that one.

*** This bug has been marked as a duplicate of 132823 ***

Comment 10 Axel Thimm 2005-10-20 21:35:51 UTC
bug #132823 is protected. Could it be opened to the public? Otherwise it would
be good to keep this one open to have something to track.