Bug 1254020
| Field | Value |
|---|---|
| Summary | RHEL6.6: NFS client has kernel panic after seeing 'VFS: Busy inodes after unmount ... Self-destruct in 5 seconds. Have a nice day' |
| Product | Red Hat Enterprise Linux 6 |
| Component | kernel |
| Kernel sub component | NFS |
| Version | 6.6 |
| Hardware | Unspecified |
| OS | Linux |
| Status | CLOSED ERRATA |
| Severity | urgent |
| Priority | urgent |
| Keywords | ZStream |
| Target Milestone | rc |
| Target Release | --- |
| Regression | --- |
| Reporter | Dave Wysochanski <dwysocha> |
| Assignee | Benjamin Coddington <bcodding> |
| QA Contact | Yongcheng Yang <yoyang> |
| Docs Contact | Jana Heves <jsvarova> |
| CC | bcodding, dhoward, djeffery, eguan, fsorenso, hbarcomb, ikent, jbainbri, jiyin, jsvarova, nmurray, pbokoc, plambri, rkothiya, salmy, smayhew, swhiteho, yoyang |
| Fixed In Version | kernel-2.6.32-582.el6 |
| Doc Type | Bug Fix |
| Cloned as | 1272858 (view as bug list) |
| Bug Blocks | 1172231, 1268411, 1272858 |
| Type | Bug |
| Last Closed | 2016-05-10 23:08:15 UTC |

Doc Text:
Loop checks added to VFS to prevent kernel crashes
The NFS client was previously failing to detect a directory loop for some NFS server directory structures. This failure could cause NFS inodes to remain referenced after attempting to unmount the file system, leading to a kernel crash. This update adds loop checks to VFS, which effectively prevents this problem from occurring.
Description (Dave Wysochanski, 2015-08-16 20:19:10 UTC)

Looking at the commits between RHEL6.5 and RHEL6.6 I wonder if either of these could have possibly introduced a regression:

* [fs] vfs: fix ref count leak in path_mountpoint()
* [fs] Fix mountpoint reference leakage in linkat

Created attachment 1063571 [details]
bash shell script with systemtap which was used for partial debugging
The attached bash + systemtap script produces output like the following, matching what is seen in the kernel logs of some of the vmcores:
VFS: Busy inodes after unmount of 0:b4. Self-destruct in 5 seconds. Have a nice day...
Mon Jul 6 23:15:18 2015 PDT: automount(2406) => umount(4784) => umount.nfs(4785) (/sbin/umount.nfs /mnt/nfs): invalidate_inodes returning 1 for 'nfs' super_block 0xffff880434d7f000 (0:b3)
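
For reference, a minimal sketch of the same idea (this is not the attached script): report any non-zero return from invalidate_inodes() for NFS superblocks, since that is what produces the "Busy inodes" message at unmount. The probe points, the parameter name sb, and the struct fields used below are assumptions based on the RHEL 6 (2.6.32) sources, and kernel debuginfo must be installed.

stap -e '
global info
probe kernel.function("invalidate_inodes") {
  # remember NFS superblocks on entry (s_type/s_dev per the 2.6.32 sources)
  if (kernel_string($sb->s_type->name) == "nfs")
    info[tid()] = sprintf("super_block %p (%d:%x)", $sb,
                          MAJOR($sb->s_dev), MINOR($sb->s_dev))
}
probe kernel.function("invalidate_inodes").return {
  # a non-zero return here is what later produces the "Busy inodes" printk
  if (tid() in info) {
    if ($return != 0)
      printf("%s: %s(%d): invalidate_inodes returned %d for nfs %s\n",
             ctime(gettimeofday_s()), execname(), pid(), $return, info[tid()])
    delete info[tid()]
  }
}'
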
(In reply to Dave Wysochanski from comment #4)
> Looking at the commits between RHEL6.5 and RHEL6.6
>
> I wonder if either of these could have possibly introduced a regression:
> [fs] vfs: fix ref count leak in path_mountpoint()

Maybe not this one specifically. It applies only when the last component of the umount path is a symlink, and that doesn't appear to be the case here. That doesn't mean a corner case for the associated series couldn't have been missed due to differences between upstream and our kernel. It's worth considering, but there was quite a bit of testing surrounding the series.

> [fs] Fix mountpoint reference leakage in linkat

This looks like a straightforward bug fix. The get and put on the path, old_path, are entirely local, so it looks like the path_put() was in fact missing on ESTALE error.

Created attachment 1064086 [details]
Analysis of code paths which touch super_block.s_active and why we want to do this
Created attachment 1064087 [details]
WIP systemtap script to trace super_block.s_active - the bulk still needs to be filled in and this may not work - may need a kernel patch
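
A rough sketch of what such a script might look like (again, not the WIP attachment): probe the fs/super.c helpers that take or drop s_active references and dump the count plus a backtrace at each call. The function names (grab_super, deactivate_super, deactivate_locked_super), the parameter name s, and the atomic_t layout are assumptions based on the 2.6.32 sources; grab_super is static and may be inlined away, which is one reason a kernel patch could end up being necessary.

stap -e '
probe kernel.function("grab_super"),
      kernel.function("deactivate_super"),
      kernel.function("deactivate_locked_super")
{
  # s_active is an atomic_t in 2.6.32, hence the ->counter access
  printf("%s %s(%d) %s: sb=%p s_active=%d\n",
         ctime(gettimeofday_s()), execname(), pid(), probefunc(),
         $s, $s->s_active->counter)
  print_backtrace()
}'
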
Looking through the vfs related changes from rhel-6.5 to 6.6, there aren't many at all.

The most significant change was the addition of the path_mountpoint() series, which is meant to avoid revalidation of the last path component during path resolution. This was mostly needed to prevent hangs on NFS umount in cases where the server is unresponsive (I believe). But it was also required for a similar hang with autofs direct mounts at startup.

So we need to confirm that the problem hasn't been seen on kernel revisions prior to revision 451 (i.e. on rhel-6.5 kernels, rev 431). If we can confirm that, then we can focus instrumentation on the functions that patch series adds or changes, and there are only a few.

As far as s_active goes, I think that's a symptom and not a cause, because the code where the VFS message is issued is invoked when the mount reference count goes to 0 to clean up the in-memory mount tree and discard the super block. If the mount reference count isn't right then all sorts of bad things happen on the final mntput(). So it's probably better to assume this is a mount reference count problem.

In rhel-6 the struct vfsmount still holds the mount reference count, so that's where we should look for reference count accounting problems, specifically on entry and exit of the functions I mentioned above. The difficult part will be working out what the counts should be (there are so many special cases) so we know a problem when we see it.

I'll have a look at the functions and see if I can spot anything and try to work out what we should expect. But without a reproducer we have an uphill battle, since the customer will get sick of testing very quickly.

Ian

(In reply to Ian Kent from comment #23)
> Looking through the vfs related changes from rhel-6.5 to
> 6.6 there aren't many at all.

snip ...

> I'll have a look at the functions and see if I can spot
> anything and try and work out what we should expect.

After looking through the path_mountpoint() change I can't see a problem with the mount ref count.

It occurs to me that autofs wouldn't be selecting these mounts to expire unless the mount ref count indicated they weren't in use, so the problem likely occurred before the path_mountpoint() code was called.

The autofs expire code hasn't changed for a long, long time, so I doubt the problem lies there. There are a couple of calls that lead to path_mountpoint() due to autofs ioctls, but an initial check indicates they are ok too.

I'll work through this again but I'm not hopeful. Then it will be back to the changelog.

Ian

(In reply to Ian Kent from comment #24)
> After looking through the path_mountpoint() change I can't
> see a problem with the mount ref count.
>
> It occurs to me that autofs wouldn't be selecting these
> mounts to expire unless the mount ref count indicated they
> weren't in use so the problem likely occurred before the
> path_mountpoint() code was called.
>
> The autofs expire code hasn't changed for a long long time
> so I doubt the problem lies there.

Actually that's not right. There was a change that went into rhel-6.6 for expiry of symlinks. If there's a symlink present in an autofs file system, only the last access time can be checked, and when it expires it amounts to an unlink() done by the daemon, not a umount.

I don't see how that could be a problem here, but a debug log and some knowledge of what's in the filesystem should tell us if confusion has crept in somewhere.

Ian

Created attachment 1064376 [details]
verbose automount debugging
Here is verbose automount debugging, beginning with the first mount of a filesystem that will result in the 'Busy inodes', until all the filesystems that produce the 'Busy inodes' are unmounted.
This is just one time through; however, each cycle of mount/unmount for these 6 particular /users/datadomain mounts looks about the same.
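
For reference, verbose logging like this is usually captured in one of two ways (a sketch; the LOGGING option in /etc/sysconfig/autofs and the automount -f -d invocation are the usual RHEL 6 mechanisms):

# set LOGGING="debug" in /etc/sysconfig/autofs (assumes the stock file,
# which ships a LOGGING= line), then restart autofs; the output goes to
# syslog at daemon priority, so syslogd may need a rule to catch it
sed -i 's/^LOGGING=.*/LOGGING="debug"/' /etc/sysconfig/autofs
service autofs restart

# or run the daemon in the foreground with debugging written to stderr
service autofs stop
automount -f -d
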
(In reply to Frank Sorenson from comment #28)
> Created attachment 1064376 [details]
> verbose automount debugging
>
> Here is verbose automount debugging, beginning with the first mount of a
> filesystem that will result in the 'Busy inodes', until all the filesystems
> that produce the 'Busy inodes' are unmounted.
>
> This is just one time through, however each cycle of mount/unmount for these
> 6 particular /users/datadomain mounts looks about the same.

When you post debug logs you should post the entire log, not just fragments, and give time stamps of likely problem activity, if possible. It may be harder for me to work through those logs, but often something has happened long before the problem is actually seen.

Ian

Created attachment 1068648 [details]
case 01441432 - find plus tcpdump
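
A capture like this one is typically gathered along the following lines (a sketch; the server address, mount point, and output file are placeholders, and the port 2049 filter only covers the NFS traffic proper, not the NFSv3 side protocols such as mountd):

SERVER=192.0.2.1      # placeholder: NFS server address
MNT=/mnt/nfs          # placeholder: NFS mount point
tcpdump -s 0 -w /tmp/find-nfs.pcap host "$SERVER" and port 2049 &
TCPDUMP_PID=$!
find "$MNT" > /dev/null
kill "$TCPDUMP_PID"
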
I can reproduce on RHEL6 2.6.32-504.el6.x86_64:

[root@rhel6 ~]# cd /exports
[root@rhel6 exports]# mkdir exports_loop
[root@rhel6 exports]# mount -o bind ../exports exports_loop
[root@rhel6 exports]# cd
[root@rhel6 ~]# /etc/init.d/nfs start
Starting RPC svcgssd:       [  OK  ]
Starting NFS services:      [  OK  ]
Starting NFS quotas:        [  OK  ]
Starting NFS mountd:        [  OK  ]
Starting NFS daemon:        [  OK  ]
[root@rhel6 ~]# exportfs -ua
[root@rhel6 ~]# exportfs -o crossmnt *:/exports
[root@rhel6 ~]# mount rhel6:/exports /manmnt/rhel6
[root@rhel6 ~]# stat /manmnt/rhel6/exports_loop/.
  File: `/manmnt/rhel6/exports_loop/.'
  Size: 4096      Blocks: 8          IO Block: 262144 directory
Device: 18h/24d   Inode: 275493      Links: 6
Access: (0755/drwxr-xr-x)  Uid: (    0/    root)   Gid: (    0/    root)
Access: 2015-08-31 10:08:20.883717287 -0400
Modify: 2015-08-31 10:08:14.451617433 -0400
Change: 2015-08-31 10:08:14.451617433 -0400
[root@rhel6 ~]# umount /manmnt/rhel6
[root@rhel6 ~]# tail -1 /var/log/messages
Aug 31 10:11:13 rhel6 kernel: VFS: Busy inodes after unmount of 0:18. Self-destruct in 5 seconds. Have a nice day...

I think it's safe to rule out autofs at this point. Upstream Fedora server seems not to create this problem.

Nice work Ben! I tried your simple reproducer on a few NFS client versions. Here are the results.

NOTE that "FAIL" means we were able to mount the share, and after umount I saw "VFS: Busy inodes after unmount of 0:17. Self-destruct in 5 seconds." in /var/log/messages.

RHEL5u11 (2.6.18-402.el5): PASS - mount fails with "mount.nfs: Stale NFS file handle"
RHEL6u0 (2.6.32-71.40.1.el6): FAIL
RHEL6u1 (2.6.32-131.4.1.el6): FAIL
  * NOTE: Interestingly the 'find' returned a lot of these:
    Aug 31 09:27:18 rhel6u1-node1 kernel: fsid 0:16: expected fileid 0x54fe3, got 0x409d6
    Aug 31 09:27:18 rhel6u1-node1 kernel: NFS: server 192.168.122.49 error: fileid changed
RHEL6u2 (2.6.32-220.2.1.el6): FAIL
RHEL6u3 (2.6.32-279.11.1.el6): FAIL
RHEL6u4 (2.6.32-358.el6): FAIL
RHEL6u5 (2.6.32-431.32.1.el6): FAIL
RHEL7u0 (3.10.0-123.el7): PASS - mount fails with "Too many levels of symbolic links"

Assuming Ben's reproducer is the same as the customer's problem, I think we can remove Regression since it looks like RHEL6 has always had this bug.

One small correction - all my tests used the default NFS version for the release. So for RHEL6-RHEL7 it was an NFSv4 mount, and RHEL5 was NFSv3. I tried an RHEL5 NFSv4 mount and I could mount it, but I don't get the "busy inodes after umount" message.

Reproduced the case as described in Comment 65.

Thanks Ben! I'm seeing some strange differences in behavior, depending on what exactly I'm exporting and mounting:

nfs server side:
# mkdir -p /loop_test/dir_loop /exports/loop_test/dir_loop
# mount -o bind /loop_test /loop_test/dir_loop
# mount -o bind /exports/loop_test /exports/loop_test/dir_loop
# echo -e "/loop_test\t*(rw,crossmnt)" >> /etc/exports
# echo -e "/exports/loop_test\t*(rw,crossmnt)" >> /etc/exports
# exportfs -arv
exporting *:/exports/loop_test
exporting *:/loop_test
# find /loop_test /exports/loop_test
/loop_test
find: File system loop detected; `/loop_test/dir_loop' is part of the same file system loop as `/loop_test'.
/exports/loop_test
find: File system loop detected; `/exports/loop_test/dir_loop' is part of the same file system loop as `/exports/loop_test'.

nfs client side:
# mkdir -p /mnt/loop_test
# mount vm17:/loop_test /mnt/loop_test/
# find /mnt/loop_test/
/mnt/loop_test/
find: File system loop detected; `/mnt/loop_test/dir_loop' is part of the same file system loop as `/mnt/loop_test/'.
# umount /mnt/loop_test
# dmesg -c
[47853.323277] VFS: Busy inodes after unmount of 0:19. Self-destruct in 5 seconds. Have a nice day...
# mount vm17:/exports/loop_test /mnt/loop_test/
# find /mnt/loop_test
/mnt/loop_test
find: `/mnt/loop_test/dir_loop': Too many levels of symbolic links
# umount /mnt/loop_test/
# dmesg -c

So:

* when mounting server:/export, we get the 'File system loop detected', followed by the 'Busy inodes after unmount'
* when mounting server:/directory/export, we get the 'Too many levels of symbolic links' message instead, and no 'Busy inodes after unmount'

*** Bug 1236695 has been marked as a duplicate of this bug. ***

Patch(es) available on kernel-2.6.32-582.el6

Moved to VERIFIED according to Comment 100.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHSA-2016-0855.html