Bug 729176 - ext4 regression: quota incorrect/orphan inodes on removal of (locked) files
Summary: ext4 regression: quota incorrect/orphan inodes on removal of (locked) files
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: kernel
Version: 6.1
Hardware: All
OS: Linux
Priority: medium
Severity: high
Target Milestone: rc
Target Release: ---
Assignee: J. Bruce Fields
QA Contact: Petr Beňas
URL:
Whiteboard:
Depends On: 714153
Blocks:
Reported: 2011-08-08 22:55 UTC by J. Bruce Fields
Modified: 2015-01-04 23:01 UTC (History)

Fixed In Version: kernel-2.6.32-206.el6
Doc Type: Bug Fix
Doc Text:
Clone Of: 714153
Environment:
Last Closed: 2011-12-06 14:00:58 UTC


Attachments
fix an open-downgrade problem (1.81 KB, patch)
2011-09-20 12:29 UTC, J. Bruce Fields


Links
System ID | Priority | Status | Summary | Last Updated
Red Hat Product Errata RHSA-2011:1530 | normal | SHIPPED_LIVE | Moderate: Red Hat Enterprise Linux 6 kernel security, bug fix and enhancement update | 2011-12-06 01:45:35 UTC

Comment 1 J. Bruce Fields 2011-08-08 23:00:28 UTC
I reproduced problems locally that resulted in similar reference count leaks, and submitted patches to fix those.  According to the original reporter, those patches may help his situation somewhat, but he is still left with an unmountable filesystem.

Further debugging will therefore be required.

Comment 3 Rik Theys 2011-08-10 09:42:25 UTC
I've tested the -178 kernel and (as mentioned) it does not fully fix our bug. The quota information is more correct now, but still not completely correct. In the past the quota would show a value slightly below the hard limit. With the patches applied, the quota shows 28 blocks in use, where this should be 4 (only a single directory).

I've also tried the 3.0.1 kernel as this seemed to have a patch that could be relevant and I was not sure the -178 kernel has this one:

commit 83d20a07d3fc171d5d7cddb6ebe2cd7a5fee1047
Author: J. Bruce Fields <bfields@redhat.com>
Date:   Wed Jun 29 16:49:04 2011 -0400

    svcrpc: fix list-corrupting race on nfsd shutdown
    
    commit ebc63e531cc6a457595dd110b07ac530eae788c3 upstream.

The 3.0.1 kernel shows similar behaviour.

Kernels having these fixes applied all seem to have the 40s delay during the simulation, although there doesn't seem to be any network traffic during the freeze. Older kernels don't have this freeze period.

I will also add this comment to the new bug report. I've attached a vmcore and tcpdump using the -178 kernel to the support case. I'm not sure you can access that information?

Comment 4 Rik Theys 2011-08-10 09:43:50 UTC
FYI: I can not see comment 2. Does it contain relevant information?

Comment 5 J. Bruce Fields 2011-09-13 20:36:07 UTC
No fix on hand at this point, so too late for 6.2; expecting to get back to this for 6.3.

Comment 6 Rik Theys 2011-09-19 09:53:25 UTC
Hi,

According to the support case, "the issue is now known and we are expecting to come with the patch soon".

Does that mean you have been able to reproduce the remaining problems and are working on a fix?

If you have not yet been able to reproduce this problem, can I provide some more debugging output in some way? If possible I can run test kernels with specific debugging enabled on the server to hopefully help pinpoint where things go wrong.

Do you have any estimate on when a patch could be available? The system on which I'm testing this needs to be reconfigured for something else in the near future.

Does it make sense to test newer -rc kernels?

Regards,

Rik

Comment 7 J. Bruce Fields 2011-09-19 15:12:29 UTC
Thanks for your patience.  Taking a closer look at the 3.0.1 network trace, I see strange behavior around frame 1677: a downgrade to OPEN_WRITE, followed by an open for READ which is granted a read delegation.  That read delegation should not have been granted as long as the client still held a write open.

Subsequent delays appear to be due to a write performed under the stateid associated with that open returning DELAY, because the write is causing the server to recall the delegation.

I'm not quite sure what's going on here yet.  But I'll try to have a patch for testing soon.
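
Illustration of the rule described in comment 7, as a toy model rather than nfsd code: a read delegation should only be handed out when no current open on the file carries write access, and a write that arrives while a read delegation is outstanding gets NFS4ERR_DELAY while the server recalls it. The function names below are hypothetical; the share_access values are the ones NFSv4 defines.

```python
# NFSv4 share_access values.
NFS4_SHARE_ACCESS_READ = 1
NFS4_SHARE_ACCESS_WRITE = 2
NFS4_SHARE_ACCESS_BOTH = 3


def may_grant_read_delegation(open_accesses):
    """True only if none of the current opens includes write access."""
    return not any(a & NFS4_SHARE_ACCESS_WRITE for a in open_accesses)


def write_status(read_delegation_outstanding):
    """Status of a WRITE in this model: DELAY while a recall is needed."""
    return "NFS4ERR_DELAY" if read_delegation_outstanding else "NFS4_OK"


# State around frame 1677: the open was downgraded to write-only,
# then a new open for READ arrived.
opens_after_downgrade = [NFS4_SHARE_ACCESS_WRITE]
granted = may_grant_read_delegation(opens_after_downgrade)
print("grant read delegation:", granted)
print("write with delegation outstanding:", write_status(True))
```

In this model the delegation seen in the trace would be refused, and the later writes would not have to wait for a recall.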

Comment 8 J. Bruce Fields 2011-09-20 12:29:40 UTC
Created attachment 524022 [details]
fix an open-downgrade problem

I think this could explain both of the symptoms you were seeing: the client's and server's idea of the open state could get out of sync, causing strange delegation recall behavior that could cause delays on write (explaining the ERR_DELAY replies I see to WRITEs in your trace).  And a failure to convert open types in the downgrade logic here could mess up the reference counting.

Thanks for your patience; any testing you could do would be appreciated.

The patch is against the tip of my latest (3.1-rc1-based) tree, but I believe it should also apply to any kernel (such as 3.0.1) that has the patches you previously tested.
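
To illustrate the second point in comment 8 (a sketch under stated assumptions, not the code touched by the patch): NFSv4 share_access values (READ=1, WRITE=2, BOTH=3) are numbered differently from POSIX open modes (O_RDONLY=0, O_WRONLY=1, O_RDWR=2). If reference counts are kept per open mode, releasing with an unconverted access value decrements the wrong counter and leaves the real reference behind. `FileRefs` and `open_then_release` are hypothetical names used only for this example.

```python
import os

# NFSv4 share_access values.
NFS4_SHARE_ACCESS_READ = 1
NFS4_SHARE_ACCESS_WRITE = 2
NFS4_SHARE_ACCESS_BOTH = 3


def access_to_open_mode(access):
    """Map an NFSv4 share_access value to a POSIX open mode."""
    table = {
        NFS4_SHARE_ACCESS_READ: os.O_RDONLY,
        NFS4_SHARE_ACCESS_WRITE: os.O_WRONLY,
        NFS4_SHARE_ACCESS_BOTH: os.O_RDWR,
    }
    return table[access]


class FileRefs:
    """Hypothetical per-file reference counters, one per open mode."""

    def __init__(self):
        self.count = {os.O_RDONLY: 0, os.O_WRONLY: 0, os.O_RDWR: 0}

    def get(self, mode):
        self.count[mode] += 1

    def put(self, mode):
        self.count[mode] -= 1

    def balanced(self):
        return all(n == 0 for n in self.count.values())


def open_then_release(convert_on_release):
    """Take a read reference, then drop it with or without conversion."""
    refs = FileRefs()
    refs.get(access_to_open_mode(NFS4_SHARE_ACCESS_READ))
    raw = NFS4_SHARE_ACCESS_READ
    # Without conversion, the raw value 1 is treated as O_WRONLY.
    refs.put(access_to_open_mode(raw) if convert_on_release else raw)
    return refs


print("converted:", open_then_release(True).count)
print("unconverted:", open_then_release(False).count)
```

In the unconverted case the read counter never returns to zero, which is the kind of leftover reference that could keep a filesystem busy at unmount time.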

Comment 9 Rik Theys 2011-09-20 13:30:04 UTC
I tested the 3.0.1 kernel with your patch two times and it seems to fix the issue!

I no longer have the long delay during the start of the simulation, the quota information is correct and I can unmount the file system on the server! Yay!

I assume this will be in the upstream 3.1 kernel?

Will it be possible to still have this for RHEL 6.2, please?

If the patch is in Linus' tree, can it be proposed to be included in the longterm stable kernels?

So for the RHEL 6.1 kernels I need the 3 patches you provided. I believe the RHEL 5.7 kernel also has this bug now? Does it need all 3 patches or just the latest one?

Thanks for your help!

Regards,

Rik

Comment 10 J. Bruce Fields 2011-09-20 19:06:47 UTC
Thanks once more for the quick test results.

I've posted the same patch upstream, and if nobody catches a problem in review then it should be included in 3.2 (and applied to stable 3.1.z and 3.0.z shortly afterwards).  I want to give other upstream developers a chance to comment and then we can start the process for 6.3 and 6.2.z.

Comment 11 Rik Theys 2011-09-26 14:41:40 UTC
Hi,

I've tried rebuilding the 2.6.32-131.12.1 kernel with the patches applied, but it fails to build from source. Even without any patches applied the compilation (on x86_64) fails with:

Documentation/video4linux/v4lgrab.c:34:28: error: linux/videodev.h: No such file or directory
Documentation/video4linux/v4lgrab.c: In function 'main':
Documentation/video4linux/v4lgrab.c:103: error: storage size of 'cap' isn't known
Documentation/video4linux/v4lgrab.c:104: error: storage size of 'win' isn't known
Documentation/video4linux/v4lgrab.c:105: error: storage size of 'vpic' isn't known
Documentation/video4linux/v4lgrab.c:116: error: 'VIDIOCGCAP' undeclared (first use in this function)
Documentation/video4linux/v4lgrab.c:116: error: (Each undeclared identifier is reported only once
Documentation/video4linux/v4lgrab.c:116: error: for each function it appears in.)
Documentation/video4linux/v4lgrab.c:123: error: 'VIDIOCGWIN' undeclared (first use in this function)
Documentation/video4linux/v4lgrab.c:129: error: 'VIDIOCGPICT' undeclared (first use in this function)
Documentation/video4linux/v4lgrab.c:135: error: 'VID_TYPE_MONOCHROME' undeclared (first use in this function)
Documentation/video4linux/v4lgrab.c:137: error: 'VIDEO_PALETTE_GREY' undeclared (first use in this function)
Documentation/video4linux/v4lgrab.c:138: error: 'VIDIOCSPICT' undeclared (first use in this function)
Documentation/video4linux/v4lgrab.c:151: error: 'VIDEO_PALETTE_RGB24' undeclared (first use in this function)
Documentation/video4linux/v4lgrab.c:154: error: 'VIDEO_PALETTE_RGB565' undeclared (first use in this function)
Documentation/video4linux/v4lgrab.c:158: error: 'VIDEO_PALETTE_RGB555' undeclared (first use in this function)
Documentation/video4linux/v4lgrab.c:105: warning: unused variable 'vpic'
Documentation/video4linux/v4lgrab.c:104: warning: unused variable 'win'
Documentation/video4linux/v4lgrab.c:103: warning: unused variable 'cap'
make[2]: *** [Documentation/video4linux/v4lgrab] Error 1
make[1]: *** [Documentation/video4linux] Error 2
make[1]: *** Waiting for unfinished jobs....
make: *** [vmlinux] Error 2

Is there a patch for this error that is already applied to a later revision of the RHEL6 kernel? Are the official RHEL kernels not built from the same sources?

Regards,

Rik

Comment 12 J. Bruce Fields 2011-09-28 01:30:02 UTC
For QA: I've added a pynfs test to

  git://linux-nfs.org/~bfields/pynfs.git

Run

  # ./nfs4.0/testserver.py server:/export/ --maketree --rundeps OPDG10

and then, on "server", run "service nfs stop" and "umount /export".  The umount should fail before the patch, and succeed after.
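
For anyone scripting the check, the steps in comment 12 could be collected roughly as follows. This is only a sketch: it builds the command lines without running them, the host name "server" and the path "/export" are the placeholders from the comment, and `repro_commands` and `verdict` are hypothetical helper names. How the server-side commands are executed (locally, over ssh, and so on) is left open.

```python
import shlex


def repro_commands(server="server", export="/export"):
    """Return the client-side command and the server-side commands."""
    client = ["./nfs4.0/testserver.py", f"{server}:{export}/",
              "--maketree", "--rundeps", "OPDG10"]
    on_server = [["service", "nfs", "stop"], ["umount", export]]
    return client, on_server


def verdict(umount_returncode):
    """Interpret the umount exit status: zero means the unmount worked."""
    return "fixed" if umount_returncode == 0 else "bug present"


if __name__ == "__main__":
    client, on_server = repro_commands()
    print("client:", shlex.join(client))
    for cmd in on_server:
        print("server:", shlex.join(cmd))
```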

Comment 13 J. Bruce Fields 2011-09-28 01:32:21 UTC
I'm not sure why that compile is failing, apologies.  It seems unrelated to the patch.

This is a worse bug than I thought, and would be a regression new to 6.2, so I think it should go into 6.2.

Comment 14 RHEL Product and Program Management 2011-09-28 01:40:40 UTC
This request was evaluated by Red Hat Product Management for inclusion
in a Red Hat Enterprise Linux maintenance release. Product Management has 
requested further review of this request by Red Hat Engineering, for potential
inclusion in a Red Hat Enterprise Linux Update release for currently deployed 
products. This request is not yet committed for inclusion in an Update release.

Comment 15 Rik Theys 2011-09-28 06:35:44 UTC
Hi,

Compiling the kernel from the .src.rpm works on a 6.0 system, but not a 6.1 system.

When you say that the bug is worse than you thought and that it should go into 6.2, are you referring to the NFS bug or the FTBFS?

Regards,

Rik

Comment 17 J. Bruce Fields 2011-09-28 11:21:42 UTC
"When you say that the bug is worse than you thought and that it should go into
6.2, are you referring to the NFS bug or the FTBFS?"

I'm referring to the NFS bug.

Comment 19 Aristeu Rozanski 2011-10-05 15:33:51 UTC
Patch(es) available on kernel-2.6.32-206.el6

Comment 27 Petr Beňas 2011-10-06 15:20:17 UTC
Reproduced in 2.6.32-178.el6.x86_64, unable to reproduce in 2.6.32-205.el6.x86_64 and verified in 2.6.32-206.el6.x86_64.

Comment 28 Rik Theys 2011-12-02 15:11:39 UTC
Are the fixes for this bug now in the upstream kernel? What are the relevant commits, and since which kernel version are they included? Have they been applied to the -stable kernels?

Comment 29 J. Bruce Fields 2011-12-02 16:37:53 UTC
The upstream commit was 3d02fa29dec920c, upstream as of 3.2-rc1, 3.1.1, and 3.0.9.  Looks like it's passing all our tests, but any additional test results are welcomed.

Comment 30 errata-xmlrpc 2011-12-06 14:00:58 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHSA-2011-1530.html

