Bug 1749390
Summary: | VFS: Busy inodes after unmount of loop0 when encountering duplicate directory inodes | ||||||||
---|---|---|---|---|---|---|---|---|---|
Product: | Red Hat Enterprise Linux 7 | Reporter: | Frank Sorenson <fsorenso> | ||||||
Component: | kernel | Assignee: | Miklos Szeredi <mszeredi> | ||||||
kernel sub component: | VFS | QA Contact: | Kun Wang <kunwan> | ||||||
Status: | CLOSED ERRATA | Docs Contact: | |||||||
Severity: | high | ||||||||
Priority: | urgent | CC: | aviro, bfields, dhoward, dhowells, djeffery, dwysocha, esandeen, jaeshin, jaltman, lvaz, marc.c.dionne, mszeredi, swhiteho, tborcin, xzhou, yoyang, zyan | ||||||
Version: | 7.6 | Keywords: | Regression, Reproducer, ZStream | ||||||
Target Milestone: | rc | ||||||||
Target Release: | --- | ||||||||
Hardware: | Unspecified | ||||||||
OS: | Unspecified | ||||||||
Whiteboard: | |||||||||
Fixed In Version: | kernel-3.10.0-1106.el7 | Doc Type: | If docs needed, set a value | ||||||
Doc Text: | Story Points: | --- | |||||||
Clone Of: | |||||||||
: | 1774943 1781158 1781159 1781585 (view as bug list) | Environment: | |||||||
Last Closed: | 2020-03-31 19:29:38 UTC | Type: | Bug | ||||||
Regression: | --- | Mount Type: | --- | ||||||
Documentation: | --- | CRM: | |||||||
Verified Versions: | Category: | --- | |||||||
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |||||||
Cloudforms Team: | --- | Target Upstream Version: | |||||||
Embargoed: | |||||||||
Bug Depends On: | |||||||||
Bug Blocks: | 1774943, 1781158, 1781159, 1781585 | ||||||||
Attachments: |
|
Description
Frank Sorenson
2019-09-05 14:07:25 UTC
I'm not sure that I understand the issue. You cannot hard link a directory, so I'm trying to figure out whats really the issue here. isofs creation permits the equivalent of a hardlinked directory. When the iso is mounted, all kernels other than 7.6.z and 7.y (y > 6) behave as if hardlinked directories are allowed. So either returning EIO is a RHEL 7 regression, or RHEL 8 (and upstream) have a bug, in that they do not return EIO. The bug is that we shouldn't have hard linked directories in the first place. In your example a bind mount is used before running mkisofs, and if that is what is intended, it could possibly be replicated by some means (effectively an auto-mount) when the fs is later mounted by the kernel. However, a hard linked directory is not allowed, and if there are some previous kernels that might appear to support it, then those are buggy. This sounds like something that needs to be resolved upstream. So after the setup steps, we have this tree structure: # tree --inodes iso_root/ iso_root/ ├── [16853932] dir1 │ ├── [16853933] testfile1 │ ├── [16853934] testfile2 │ └── [16853935] testfile3 └── [16853932] dir2 ├── [16853933] testfile1 ├── [16853934] testfile2 └── [16853935] testfile3 and if we list the dirs 1 at a time to not get the clash from teh vfs, we see that mkisofs did create them w/ the same inode nr: # mount -oloop out.iso mnt/ # ls -id mnt/dir1 1856 mnt/dir1 # umount mnt # mount -oloop out.iso mnt/ mount: /dev/loop0 is write-protected, mounting read-only # ls -id mnt/dir2 1856 mnt/dir2 (Interestingly mksquashfs doesn't produce the same result, because it ends up with different inode nrs for the 2 dirs within the fs image) So it's ... odd that mkisofs chooses to create two dirs w/ the same inode number. I'm inclined to think that this may be a mkisofs bug, although it would be good to know why this result seems unique to RHEL7. just as background, the customer encountered this issue with essentially the following: # mkdir -p iso_root/release-{1.0,1.1,1.2,latest} # mount server:/exports/rpms/rpms-1.0 iso_root/release-1.0 # mount server:/exports/rpms/rpms-1.1 iso_root/release-1.1 # mount server:/exports/rpms/rpms-1.2 iso_root/release-1.2 # mount server:/exports/rpms/rpms-1.2 iso_root/release-latest # mkisofs -o out.iso -fRJ iso_root 2>&1 # mount -oloop out.iso /mnt/tmp 2>/dev/null * affected kernels # tree --inodes /mnt/tmp /mnt/tmp ├── [ 1856] release-1.0 ├── [ 1920] release-1.1 └── [ 2048] release-latest # ls -lid /mnt/tmp/* ls: cannot access /mnt/tmp/release-1.2: Input/output error 1856 drwxr-xr-x 2 root root 2048 Sep 5 13:20 /mnt/tmp/release-1.0 1920 drwxr-xr-x 2 root root 2048 Sep 5 13:20 /mnt/tmp/release-1.1 2048 drwxr-xr-x 2 root root 2048 Sep 5 13:20 /mnt/tmp/release-latest * upstream kernel # tree --inodes /mnt/tmp /mnt/tmp ├── [ 1856] release-1.0 ├── [ 1920] release-1.1 ├── [ 2048] release-1.2 └── [ 2048] release-latest also, rhel 7.6.z and 7.7+ kernels throw the following message (not seen with other kernels) on unmount: [1382870.155190] VFS: Busy inodes after unmount of loop0. Self-destruct in 5 seconds. Have a nice day... ***** note also the following upstream behavior, when given a true directory loop: # mkdir -p iso_root/dir3/subdir # mount -obind iso_root/dir3 iso_root/dir3/subdir # mkisofs -o out.iso -fRJ iso_root 2>&1 # tree --inodes /mnt/tmp/ /mnt/tmp/ ├── [ 1856] dir1 │ ├── [ 1870] testfile1 │ ├── [ 1866] testfile2 │ └── [ 1862] testfile3 ├── [ 1856] dir2 │ ├── [ 1870] testfile1 │ ├── [ 1866] testfile2 │ └── [ 1862] testfile3 └── [ 1984] dir3 # ls -lid /mnt/tmp/dir3 /mnt/tmp/dir3/subdir ls: cannot access /mnt/tmp/dir3/subdir: Too many levels of symbolic links 1984 drwxr-xr-x 3 root root 2048 Sep 5 10:27 /mnt/tmp/dir3 in the kernel messages: [1378462.410201] VFS: Lookup of 'subdir' in iso9660 loop0 would have caused loop (no 'busy inodes' message on unmount) Bisecting it lands on commit 7449bc83052a970f5fc15d6c94369cc1b2a214eb (HEAD) Author: Zheng Yan <zyan> Date: Thu Oct 4 07:02:44 2018 +0200 [fs] dcache: d_splice_alias mustn't create directory aliases Talking w/ Viro - isofs returning same inode nr for 2 dirs /is/ a bug. Not clear whether it's in mkisofs, or in the kernel isofs driver, i.e. how isofs_get_ino() generates inodes. But, upstream silently mitigates the problem: } else if (!IS_ROOT(new)) { struct dentry *old_parent = dget(new->d_parent); int err = __d_unalias(inode, dentry, new); <----- HERE write_sequnlock(&rename_lock); if (err) { dput(new); new = ERR_PTR(err); } dput(old_parent); } Essentially, this was done by b5ae6b15b merge d_materialise_unique() into d_splice_alias() upstream, which should be backported. [08:56] <viro> OK, what you want is commit b5ae6b15bd73e35b129408755a0804287a87e041 [08:56] <viro> Author: Al Viro <viro.org.uk> [08:56] <viro> Date: Sun Oct 12 22:16:02 2014 -0400 [08:56] <viro> merge d_materialise_unique() into d_splice_alias() [08:56] <viro> backported [08:57] <viro> essentially, d_materialize_unique() is d_splice_alias() with opposite order of arguments *and* mitigation of that kind of crap [08:58] <viro> in mainline I'd just folded the latter into d_splice_alias() and killed d_materialize_unique() off [08:58] <viro> very few callers of the latter to start with [08:59] <viro> in RHEL we'd want a wrapper for d_splice_alias() - basically, calling it with the arguments flipped [08:59] <viro> again, kABI... I have opened bz1749860 against cdrkit in RHEL 8 to address creation of ISOs with hardlinked directories (In reply to Frank Sorenson from comment #12) > I have opened bz1749860 against cdrkit in RHEL 8 to address creation of ISOs > with hardlinked directories Sounds good, if I use OSX to create an iso image from the directories problematic ISO created above, it assigns new/distinct inode numbers so this does seem like a mkisofs flaw as well. I had a go at al's suggestion but I'm not confident enough messing with dcache.c, would be better to have someone more familiar investigate it I think. Miklos, would you be able to take a look at this? I know it is outside of your normal realm, but dcache knowledge is required in order to provide the required fix I here it seems. Okay. Miklos, do you have any updates? We need to figure out the correct flags/version that we want to target. Created attachment 1618935 [details]
proposed patch
So the real bug here is the lack of iput() on error. Easily fixed with the attached patch (untested).
I wouldn't mess with backporting commit b5ae6b15bd73 ("merge d_materialise_unique() into d_splice_alias()"), as it's basically irrelevant if lookup returns EIO or ELOOP in the broken iso image case.
Created attachment 1621014 [details]
proposed patch to work around buggy iso images
So if we want to mitigate buggy mkisofs, then we do need __d_unalias(). But it's a kludge: it can still sporadically fail with EBUSY if hard linked directories have a different parent directory. Also regardless of the parent directories results from getcwd(2) will jump around the hard links' location based on which one was accessed last.
I've tried to carefully compare what d_splice_alias() and d_materialise_unique() do, and AFAICS the patch is correct, but this is not a trivial fix, like the previous patch, so obviously it's higher risk.
There's one change that I haven't fully looked into: after the patch security_d_instantiate() isn't called on an extant alias, while previously it was. I don't know if this was just an oversight or if there was some reason to that security_d_instantiate() call. A security person needs to look at that.
Patch(es) committed on kernel-3.10.0-1106.el7 May I ask which of the 2 attached patches has been included in the 1106 kernel and will be in 7.8, since I don't have access to it? Also, are there any plans to include a fix in 7.6 and 7.7 kernels? Just the comment #20 patch, which is the one with the iput additions. Currently I don't see any requests for backport to earlier kernels. The real issue here however is that the images are incorrect and we have a bug open for mkisofs to fix that issue as a separate item. Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2020:1016 The needinfo request[s] on this closed bug have been removed as they have been unresolved for 120 days |