Bug 575505

Summary:

NFS4 locking issues

Product:

Red Hat Enterprise Linux 5

Reporter:

Marcus Alves Grando <marcus>

Component:

kernel

Assignee:

Jeff Layton <jlayton>

Status:

CLOSED INSUFFICIENT_DATA

QA Contact:

Red Hat Kernel QE team <kernel-qe>

Severity:

high

Docs Contact:

Priority:

low

Version:

5.5

CC:

bfields, jlayton, steved

Target Milestone:

Target Release:

---

Hardware:

All

OS:

Linux

Whiteboard:

Fixed In Version:

Doc Type:

Bug Fix

Last Closed:

2011-03-02 13:18:32 UTC

Attachments:

Description	Flags
tcpdump	none

Description Marcus Alves Grando 2010-03-21 03:16:32 UTC

Created attachment 401520
tcpdump

Jeff, as we discussed in bug #551028:

I have some problems with NFS4 locks. I'm using a Linux client and an EMC Celerra server.

# uname -a
Linux c5-64.terra.com.br 2.6.18-194.el5 #1 SMP Fri Mar 19 20:54:58 UTC 2010 x86_64 x86_64 x86_64 GNU/Linux
+ patches from bug #551028

The problem is easy to reproduce using the LTP test case ltp-full-20100228/testcases/network/nfs/nfslock01.

Steps:
Copy the nfslock01 programs to /nfs, then run:
cd /nfs
mkdir test
/nfs/nfs_flock_dgen /nfs/test/nfs_flock_data 63 16384 0
cp /nfs/test/nfs_flock_data /nfs/test/nfs_flock_idata
/nfs/nfs_flock_frk /nfs/nfs_flock /nfs/test/nfs_flock_idata
CTRL+C (interrupt the test)
rm -rf test <- the locking issue shows up here

I'll attach a tcpdump capture of the problem.

Regards

Comment 1 Jeff Layton 2010-10-19 11:35:30 UTC

Sorry for the delay in looking at this...

Looking at the capture, the first occurrence of an NFS4ERR_OLD_STATEID error is in frame 9842, which is the response to the LOCK request in frame 9840. That LOCK request has "new lock owner" set to no and a stateid with a seqid of 0.

Prior to that, in frame 9836, there was a LOCKU request whose stateid had the same data field and a seqid of 0xff. So the question becomes: why did the client go from using a seqid of 0xff to one of 0?

Maybe there is a problem with the seqid wrapping, or the upper bits are being masked off somehow?
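
To make the masking hypothesis concrete, here is a small illustrative sketch. It is not kernel code and does not claim to show what the client actually does; `next_seqid` and its `width_bits` parameter are hypothetical names invented for this example. It only shows the arithmetic: a full 32-bit counter moves from 0xff to 0x100, while a counter that keeps only the low 8 bits moves from 0xff to 0, which matches the transition seen between frames 9836 and 9840.

```python
def next_seqid(seqid, width_bits=32):
    """Increment a seqid held in a counter of the given width (hypothetical helper)."""
    mask = (1 << width_bits) - 1
    return (seqid + 1) & mask


if __name__ == "__main__":
    last_seen = 0xFF  # seqid in the LOCKU stateid (frame 9836)

    # Full 32-bit counter: no wrap at this value.
    print(hex(next_seqid(last_seen)))

    # Assumption for illustration: only the low byte survives the increment.
    print(hex(next_seqid(last_seen, width_bits=8)))
```

If something like the second case were happening, the server would consider the stateid with seqid 0 older than the one it last handed out, and an old-stateid error would be the expected response.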

Comment 2 Jeff Layton 2010-10-19 12:04:10 UTC

Most of the code seems to treat an nfs4_stateid as a 16-byte opaque value. I'm still trying to figure out how the seqid embedded in the first 4 bytes gets incremented.
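
For reference, the layout described above can be sketched as follows. This is an illustration under stated assumptions, not the kernel's implementation: `split_stateid` and `bump_seqid` are hypothetical helpers, the sample bytes are made up rather than taken from the capture, and the seqid is read as a big-endian 32-bit integer because that is how XDR encodes unsigned integers on the wire.

```python
import struct

STATEID_SIZE = 16  # 4-byte seqid followed by 12 opaque bytes


def split_stateid(raw):
    """Return (seqid, other) from a 16-byte on-the-wire stateid."""
    if len(raw) != STATEID_SIZE:
        raise ValueError("stateid must be exactly 16 bytes")
    (seqid,) = struct.unpack(">I", raw[:4])  # XDR integers are big-endian
    return seqid, raw[4:]


def bump_seqid(raw):
    """Return a copy of the stateid with its seqid incremented as a 32-bit value."""
    seqid, other = split_stateid(raw)
    return struct.pack(">I", (seqid + 1) & 0xFFFFFFFF) + other


if __name__ == "__main__":
    # Made-up example: seqid 0xff plus 12 placeholder bytes.
    sample = bytes.fromhex("000000ff") + b"\x01" * 12
    seqid, other = split_stateid(bump_seqid(sample))
    print(hex(seqid), other.hex())
```

Code that treats all 16 bytes as opaque has to go through some step like `split_stateid` before it can touch the seqid, which is one place where a width or byte-order mistake could creep in.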

Note that there is a large set of lock stateid fixes queued for 5.6. Would you be able to test the kernels here and report back on whether they help with this problem?

    http://people.redhat.com/jwilson/el5/

Comment 3 Jeff Layton 2011-01-21 14:10:05 UTC

Ping... have you been able to test more recent kernels? 5.6 is now released, so testing against the latest 5.6 kernel would be ideal.

Comment 4 Jeff Layton 2011-03-02 13:18:32 UTC

No reply from the reporter in several months. Closing the bug. Please reopen it if you are still able to reproduce the problem on 5.6.