Bug 575505
| Summary: | NFS4 locking issues | ||||||
|---|---|---|---|---|---|---|---|
| Product: | Red Hat Enterprise Linux 5 | Reporter: | Marcus Alves Grando <marcus> | ||||
| Component: | kernel | Assignee: | Jeff Layton <jlayton> | ||||
| Status: | CLOSED INSUFFICIENT_DATA | QA Contact: | Red Hat Kernel QE team <kernel-qe> | ||||
| Severity: | high | Docs Contact: | |||||
| Priority: | low | ||||||
| Version: | 5.5 | CC: | bfields, jlayton, steved | ||||
| Target Milestone: | rc | ||||||
| Target Release: | --- | ||||||
| Hardware: | All | ||||||
| OS: | Linux | ||||||
| Whiteboard: | |||||||
| Fixed In Version: | Doc Type: | Bug Fix | |||||
| Doc Text: | Story Points: | --- | |||||
| Clone Of: | Environment: | ||||||
| Last Closed: | 2011-03-02 13:18:32 UTC | Type: | --- | ||||
Sorry for the delay in looking at this... Looking at the capture, the first occurrence of an NFS4ERR_OLD_STATEID error is in frame 9842, which is the response to the LOCK request in frame 9840. That LOCK request has "new lock owner" set to no and carries a stateid with a seqid of 0. Prior to that, in frame 9836, there was a LOCKU request whose stateid had the same data field and a seqid of 0xff.
So the question becomes why the seqid went from 0xff to 0. Maybe there is a problem with the seqid wrapping, or the upper bits are being masked off somehow? Most of the code seems to treat an nfs4_stateid as a 16-byte opaque value. Still trying to figure out how the seqid embedded in the first 4 bytes gets incremented.
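To make the 0xff to 0 question concrete, here is a minimal sketch of how the two hypotheses differ. It assumes the standard NFSv4 stateid layout (a 32-bit big-endian seqid followed by 12 opaque bytes). The helper names and the contents of the `other` field are invented for illustration; they are not taken from the capture or from the kernel code, and the 8-bit increment is only a model of the "upper bits masked off" guess, not a confirmed diagnosis.

```python
import struct


def split_stateid(raw: bytes):
    """Split a 16-byte NFSv4 stateid into (seqid, other)."""
    if len(raw) != 16:
        raise ValueError("stateid must be exactly 16 bytes")
    # XDR encodes unsigned integers big-endian; the seqid is the first 4 bytes.
    (seqid,) = struct.unpack(">I", raw[:4])
    return seqid, raw[4:]


def next_seqid_32bit(seqid: int) -> int:
    """Increment treating the seqid as a full 32-bit counter."""
    return (seqid + 1) & 0xFFFFFFFF


def next_seqid_8bit(seqid: int) -> int:
    """Hypothetical faulty increment that keeps only the low byte."""
    return (seqid + 1) & 0xFF


# Placeholder 'other' field; the real bytes would come from the capture.
other = bytes(range(12))
locku = struct.pack(">I", 0xFF) + other  # stateid as in the LOCKU, seqid 0xff

seqid, _ = split_stateid(locku)
print(hex(next_seqid_32bit(seqid)))  # 0x100
print(hex(next_seqid_8bit(seqid)))   # 0x0
```

If the next stateid on the wire has the same `other` bytes and a seqid of 0 rather than 0x100, that matches the low-byte behaviour modelled by `next_seqid_8bit`, which is one way to narrow down where the value is being truncated.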
Note that there is a large swath of lock stateid fixes queued for 5.6. Would you be able to test the kernels here and report back as to whether they help this problem?
http://people.redhat.com/jwilson/el5/
Ping...have you been able to test more recent kernels? 5.6 is now released, so testing against the latest 5.6 kernel would be ideal.
No reply from reporter in several months. Closing bug. Please reopen if you are still able to reproduce the bug on 5.6.
Created attachment 401520 [details] tcpdump
Jeff, as we discussed in #551028, I have some problems with NFS4 locks. I'm using a Linux client and an EMC Celerra server.
# uname -a
Linux c5-64.terra.com.br 2.6.18-194.el5 #1 SMP Fri Mar 19 20:54:58 UTC 2010 x86_64 x86_64 x86_64 GNU/Linux
+ patches from #551028
It's easy to reproduce using the LTP test case ltp-full-20100228/testcases/network/nfs/nfslock01.
Steps:
copy the nfslock01 programs to /nfs
cd /nfs
mkdir test
/nfs/nfs_flock_dgen /nfs/test/nfs_flock_data 63 16384 0
cp /nfs/test/nfs_flock_data /nfs/test/nfs_flock_idata
/nfs/nfs_flock_frk /nfs/nfs_flock /nfs/test/nfs_flock_idata
CTRL+C
rm -rf test <- Locking issue
I'll attach a tcpdump of the problem.
Regards