I reproduced problems locally that resulted in similar reference count leaks, and submitted patches to fix those. According to the original reporter, those patches may help his situation somewhat, but he is still left with an unmountable filesystem.
Further debugging will therefore be required.
I've tested the -178 kernel and (as mentioned) it does not fully fix our bug. The quota information is more accurate now, but still not completely correct. In the past the quota would show a value slightly below the hard limit. With the patches applied, the quota shows 28 blocks in use, where this should be 4 (only a single directory).
I've also tried the 3.0.1 kernel as this seemed to have a patch that could be relevant and I was not sure the -178 kernel has this one:
Author: J. Bruce Fields <firstname.lastname@example.org>
Date: Wed Jun 29 16:49:04 2011 -0400
svcrpc: fix list-corrupting race on nfsd shutdown
commit ebc63e531cc6a457595dd110b07ac530eae788c3 upstream.
The 3.0.1 kernel shows similar behaviour.
Kernels having these fixes applied all seem to have the 40s delay during the simulation, although there doesn't seem to be any network traffic during the freeze. Older kernels don't have this freeze period.
I will also add this comment to the new bug report. I've attached a vmcore and tcpdump using the -178 kernel to the support case. I'm not sure you can access that information?
FYI: I can not see comment 2. Does it contain relevant information?
No fix on hand at this point, so too late for 6.2; expecting to get back to this for 6.3.
According to the support case, "the issue is now known and we are expecting to come with the patch soon".
Does that mean you have been able to reproduce the remaining problems and are working on a fix?
If you have not yet been able to reproduce this problem, can I provide some more debugging output in some way? If possible I can run test kernels with specific debugging enabled on the server to hopefully help pinpoint where things go wrong.
Do you have any estimate on when a patch could be available? The system on which I'm testing this needs to be reconfigured for something else in the near future.
Does it make sense to test newer -rc kernels?
Thanks for your patience. Taking a closer look at the 3.0.1 network trace, I see strange behavior around frame 1677: a downgrade to OPEN_WRITE, followed by an open for READ which is granted a read delegation. That read delegation should not have been granted as long as the client still held a write open.
Subsequent delays appear to be due to a write performed under the stateid associated with that open returning DELAY, because the write is causing the server to recall the delegation.
I'm not quite sure what's going on here yet. But I'll try to have a patch for testing soon.
Created attachment 524022
fix an open-downgrade problem
I think this could explain both of the symptoms you were seeing: the client's and server's idea of the open state could get out of sync, causing strange delegation recall behavior that could cause delays on write (explaining the ERR_DELAY replies I see to WRITEs in your trace). And a failure to convert open types in the downgrade logic here could mess up the reference counting.
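To illustrate the reasoning above, here is a minimal toy model in Python of how a server might count read and write opens on a file. It is a sketch under stated assumptions, not the nfsd code or the attached patch: the class `FileOpenState` and its methods are hypothetical names invented for this illustration. It shows the two properties discussed in this thread: a read delegation should not be granted while a write open is still held, and a downgrade must release exactly the access being given up, otherwise references leak and the file system stays busy.

```python
# Toy model only: all names here are hypothetical, not taken from nfsd.
READ, WRITE = 1, 2
BOTH = READ | WRITE


class FileOpenState:
    """Per-file counts of the read and write opens held by clients."""

    def __init__(self):
        self.refs = {READ: 0, WRITE: 0}

    def open(self, access):
        # Take one reference for each access bit requested.
        for bit in (READ, WRITE):
            if access & bit:
                self.refs[bit] += 1

    def downgrade(self, old_access, new_access):
        # A downgrade may only remove access, never add it.
        if new_access & ~old_access:
            raise ValueError("downgrade may not add access")
        # Release a reference for each bit that is being given up.
        dropped = old_access & ~new_access
        for bit in (READ, WRITE):
            if dropped & bit:
                self.refs[bit] -= 1

    def close(self, access):
        # Closing is a downgrade to no access at all.
        self.downgrade(access, 0)

    def can_grant_read_delegation(self):
        # Assumption of this model: no read delegation while a writer exists.
        return self.refs[WRITE] == 0

    def busy(self):
        # Any leftover reference would keep the file system busy.
        return any(self.refs.values())


state = FileOpenState()
state.open(BOTH)
state.downgrade(BOTH, WRITE)
print(state.can_grant_read_delegation())  # False: a write open is still held
state.close(WRITE)
print(state.busy())  # False: all references released
```

In this model, a downgrade that released the wrong bits (or none) would leave `busy()` returning True after the final close, which corresponds to the unmountable file system described in this report.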
Thanks for your patience; any testing you could do would be appreciated.
The patch is against the tip of my latest (3.1-rc1-based) tree, but I believe it should also apply to any kernel (such as 3.0.1) that has the patches you previously tested.
I tested the 3.0.1 kernel with your patch two times and it seems to fix the issue!
I no longer have the long delay during the start of the simulation, the quota information is correct and I can unmount the file system on the server! Yay!
I assume this will be in the upstream 3.1 kernel?
Will it be possible to still have this for RHEL 6.2, please?
If the patch is in Linus' tree, can it be proposed to be included in the longterm stable kernels?
So for the RHEL 6.1 kernels I need the 3 patches you provided. I believe the RHEL 5.7 kernel also has this bug now? Does it need all 3 patches or just the latest one?
Thanks for your help!
Thanks once more for the quick test results.
I've posted the same patch upstream, and if nobody catches a problem in review then it should be included in 3.2 (and applied to stable 3.1.z and 3.0.z shortly afterwards). I want to give other upstream developers a chance to comment and then we can start the process for 6.3 and 6.2.z.
I've tried rebuilding the 2.6.32-131.12.1 kernel with the patches applied but it seems it fails to build from source. Even without any patches applied the compilation (on x86_64) fails with
Documentation/video4linux/v4lgrab.c:34:28: error: linux/videodev.h: No such file or directory
Documentation/video4linux/v4lgrab.c: In function 'main':
Documentation/video4linux/v4lgrab.c:103: error: storage size of 'cap' isn't known
Documentation/video4linux/v4lgrab.c:104: error: storage size of 'win' isn't known
Documentation/video4linux/v4lgrab.c:105: error: storage size of 'vpic' isn't known
Documentation/video4linux/v4lgrab.c:116: error: 'VIDIOCGCAP' undeclared (first use in this function)
Documentation/video4linux/v4lgrab.c:116: error: (Each undeclared identifier is reported only once
Documentation/video4linux/v4lgrab.c:116: error: for each function it appears in.)
Documentation/video4linux/v4lgrab.c:123: error: 'VIDIOCGWIN' undeclared (first use in this function)
Documentation/video4linux/v4lgrab.c:129: error: 'VIDIOCGPICT' undeclared (first use in this function)
Documentation/video4linux/v4lgrab.c:135: error: 'VID_TYPE_MONOCHROME' undeclared (first use in this function)
Documentation/video4linux/v4lgrab.c:137: error: 'VIDEO_PALETTE_GREY' undeclared (first use in this function)
Documentation/video4linux/v4lgrab.c:138: error: 'VIDIOCSPICT' undeclared (first use in this function)
Documentation/video4linux/v4lgrab.c:151: error: 'VIDEO_PALETTE_RGB24' undeclared (first use in this function)
Documentation/video4linux/v4lgrab.c:154: error: 'VIDEO_PALETTE_RGB565' undeclared (first use in this function)
Documentation/video4linux/v4lgrab.c:158: error: 'VIDEO_PALETTE_RGB555' undeclared (first use in this function)
Documentation/video4linux/v4lgrab.c:105: warning: unused variable 'vpic'
Documentation/video4linux/v4lgrab.c:104: warning: unused variable 'win'
Documentation/video4linux/v4lgrab.c:103: warning: unused variable 'cap'
make: *** [Documentation/video4linux/v4lgrab] Error 1
make: *** [Documentation/video4linux] Error 2
make: *** Waiting for unfinished jobs....
make: *** [vmlinux] Error 2
Is there a patch for this error that is already applied to a later revision of the RHEL6 kernel? Are the official RHEL kernels not built from the same sources?
For QA: I've added a pynfs test to
# ./nfs4.0/testserver.py server:/export/ --maketree --rundeps OPDG10
and then, on "server", run "service nfs stop" and "umount /export". The umount should fail before the patch, and succeed after.
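The server-side half of that check could be scripted roughly as follows. This is only a sketch: the function name `umount_succeeds_after_nfs_stop` is hypothetical, it assumes it is run as root on "server" after the pynfs test has finished, and it uses only the two commands quoted above. The `run` parameter exists so the logic can be exercised without touching a real system.

```python
import subprocess


def umount_succeeds_after_nfs_stop(export="/export", run=subprocess.run):
    """Stop nfsd, then report whether the export can be unmounted.

    Hypothetical helper: expected to return False on an unpatched
    kernel and True on a patched one.
    """
    # Stop the NFS server first; raise if the service cannot be stopped.
    run(["service", "nfs", "stop"], check=True)
    # A non-zero exit status from umount means the file system is still busy.
    result = run(["umount", export])
    return result.returncode == 0
```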
I'm not sure why that compile is failing, apologies. It seems unrelated to the patch.
This is a worse bug than I thought, and would be a regression new to 6.2, so I think it should go into 6.2.
This request was evaluated by Red Hat Product Management for inclusion
in a Red Hat Enterprise Linux maintenance release. Product Management has
requested further review of this request by Red Hat Engineering, for potential
inclusion in a Red Hat Enterprise Linux Update release for currently deployed
products. This request is not yet committed for inclusion in an Update release.
Compiling the kernel from the .src.rpm works on a 6.0 system, but not a 6.1 system.
When you say that the bug is worse than you thought and that it should go into 6.2, are you referring to the NFS bug or the FTBFS?
"When you say that the bug is worse than you thought and that it should go into
6.2, are you referring to the NFS bug or the FTBFS?"
I'm referring to the NFS bug.
Patch(es) available on kernel-2.6.32-206.el6
Reproduced in 2.6.32-178.el6.x86_64, unable to reproduce in 2.6.32-205.el6.x86_64 and verified in 2.6.32-206.el6.x86_64.
Are the fixes for this bug now in the upstream kernel? What are the relevant commits, and/or since what kernel version? Have they been applied to -stable kernels?
The upstream commit was 3d02fa29dec920c, upstream as of 3.2-rc1, 3.1.1, and 3.0.9. Looks like it's passing all our tests, but any additional test results are welcomed.
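For anyone checking an upstream kernel against those versions, a small Python sketch follows. The function name `has_upstream_fix` is hypothetical, and the check only looks at upstream version numbers as listed in the comment above. It says nothing about distribution kernels such as 2.6.32-206.el6, which carry the fix as a backport.

```python
import re

# First stable releases listed in this thread as containing the commit.
FIRST_FIXED = {(3, 0): 9, (3, 1): 1}


def has_upstream_fix(version):
    """Return True if an upstream version string is at or past the fix."""
    m = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", version)
    if m is None:
        raise ValueError(f"unrecognized kernel version: {version!r}")
    major, minor = int(m.group(1)), int(m.group(2))
    patch = int(m.group(3) or 0)
    if (major, minor) in FIRST_FIXED:
        # Stable series: compare against the first fixed point release.
        return patch >= FIRST_FIXED[(major, minor)]
    # Mainline: fixed as of 3.2-rc1, so any 3.2 or later qualifies.
    return (major, minor) >= (3, 2)


print(has_upstream_fix("3.0.1"))    # False
print(has_upstream_fix("3.0.9"))    # True
print(has_upstream_fix("3.2-rc1"))  # True
```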
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.
For information on the advisory, and where to find the updated
files, follow the link below.
If the solution does not work for you, open a new bug report.