Bug 584484
Summary: When doing r/o bind mounts ro flag is improperly propagated back to source device.
Product: [Fedora] Fedora
Component: util-linux-ng
Version: 13
Hardware: All
OS: Linux
Status: CLOSED UPSTREAM
Severity: medium
Priority: low
Reporter: Lennart Poettering <lpoetter>
Assignee: Karel Zak <kzak>
QA Contact: Fedora Extras Quality Assurance <extras-qa>
CC: anton, dougsland, gansalmon, itamar, jonathan, kernel-maint, kmcmartin, kzak
Doc Type: Bug Fix
Last Closed: 2010-06-14 10:49:16 UTC

Description
Lennart Poettering
2010-04-21 16:51:20 UTC
This seems to work on 2.6.33-1.fc13.i686

OK, I think I know more now: the remount request is apparently applied to the whole of /tmp if you type it as shown above, and depending on whether somebody has a file open for writing on /tmp this will fail with EBUSY or not. If it doesn't fail, then the entire /tmp tree is actually made read-only! It appears read-only bind mounts are hence entirely broken:

20 [root@omega] /tmp# mkdir a b
21 [root@omega] /tmp# mount --bind a b
22 [root@omega] /tmp# touch /tmp/waldo /tmp/a/waldo2 /tmp/b/waldo3
23 [root@omega] /tmp# mount -o ro,remount a b
24 [root@omega] /tmp# touch /tmp/waldo /tmp/a/waldo2 /tmp/b/waldo3
touch: setting times of `/tmp/waldo': Read-only file system
touch: cannot touch `/tmp/a/waldo2': Read-only file system
touch: cannot touch `/tmp/b/waldo3': Read-only file system
25 [root@omega] /tmp# mount -o rw,remount /tmp
26 [root@omega] /tmp# touch /tmp/waldo /tmp/a/waldo2 /tmp/b/waldo3
touch: cannot touch `/tmp/b/waldo3': Read-only file system
27 [root@omega] /tmp#

That is on a freshly booted 2.6.33.2-57.fc13.x86_64.

Hmm, so I played around with --make-private, under the assumption that this weirdness might have something to do with the shared subtree logic, but this didn't change anything: even if both /tmp and /tmp/b are marked as "private", the ro change on /tmp/b will still be reflected back to /tmp. I even made /tmp/a a bind mount on itself and also marked it private, to no avail.

In summary, there is something wrong with the MS_RDONLY flag propagation for bind mounts.

Hmm, that is on ext3 btw.

Btw, just for completeness' sake, the two actual mount syscalls involved above look like this:

17 [root@omega] /tmp# strace -e mount mount --bind a b
mount("/tmp/a", "b", 0x7f15416dddd0, MS_MGC_VAL|MS_BIND, NULL) = 0

and

19 [root@omega] /tmp# strace -e mount mount -o ro,remount a b
mount("/tmp/a", "b", NULL, MS_MGC_VAL|MS_RDONLY|MS_REMOUNT, NULL) = -1 EBUSY (Device or resource busy)

I think the problem is a documentation problem.

You're asking mount to:
mount -o remount,ro [list of mounts]

And /tmp/a is not a mount, it's a subdir of /tmp. So you end up with /tmp mounted readonly.

[root@ihatethathostname tmp]# mkdir a b
[root@ihatethathostname tmp]# mount --bind a b
[root@ihatethathostname tmp]# touch waldo a/waldo2 b/waldo3
[root@ihatethathostname tmp]# ls a
waldo2 waldo3
[root@ihatethathostname tmp]# ls b
waldo2 waldo3
[root@ihatethathostname tmp]# mount -o remount,ro b
[root@ihatethathostname tmp]# touch waldo a/waldo2 b/waldo3
touch: cannot touch `b/waldo3': Read-only file system
[root@ihatethathostname tmp]# mount -o remount,rw b
[root@ihatethathostname tmp]# touch waldo a/waldo2 b/waldo3
[root@ihatethathostname tmp]# mount -o remount,ro a b
[root@ihatethathostname tmp]# touch waldo a/waldo2 b/waldo3
touch: cannot touch `waldo': Read-only file system
touch: cannot touch `a/waldo2': Read-only file system
touch: cannot touch `b/waldo3': Read-only file system

The first case, mount -o remount,ro /tmp/b (which is the bind mount) seems to work as intended.

(In reply to comment #6)
> I think the problem is a documentation problem.
>
> You're asking mount to:
> mount -o remount,ro [list of mounts]

Actually not. I was only passing the exact same args as with the original mount, which is what the man page suggests.

> And /tmp/a is not a mount, it's a subdir of /tmp. So you end up with /tmp
> mounted readonly.

That's not what happens, if you strace things. mount will only issue one mount() syscall, not two.

> [root@ihatethathostname tmp]# mkdir a b
> [root@ihatethathostname tmp]# mount --bind a b
> [root@ihatethathostname tmp]# touch waldo a/waldo2 b/waldo3
> [root@ihatethathostname tmp]# ls a
> waldo2 waldo3
> [root@ihatethathostname tmp]# ls b
> waldo2 waldo3
> [root@ihatethathostname tmp]# mount -o remount,ro b
> [root@ihatethathostname tmp]# touch waldo a/waldo2 b/waldo3
> touch: cannot touch `b/waldo3': Read-only file system
> [root@ihatethathostname tmp]# mount -o remount,rw b
> [root@ihatethathostname tmp]# touch waldo a/waldo2 b/waldo3
> [root@ihatethathostname tmp]# mount -o remount,ro a b
> [root@ihatethathostname tmp]# touch waldo a/waldo2 b/waldo3
> touch: cannot touch `waldo': Read-only file system
> touch: cannot touch `a/waldo2': Read-only file system
> touch: cannot touch `b/waldo3': Read-only file system
>
> The first case, mount -o remount,ro /tmp/b (which is the bind mount) seems to
> work as intended.

Hmm, this is certainly interesting.

I have now prepared this C test case:

http://0pointer.de/public/robind.c

which hopefully shows the problem. I marked the expected outcome with assert()s, and at least on my machine I run into one of two different asserts, depending on whether /tmp can be mounted r/o or not, which in turn depends on whether some app has a file open for writing or not.

BTW, on my machine the strace for this test case is something like this:

mkdir("/tmp/a", 0777) = 0
mkdir("/tmp/b", 0777) = 0
mount("/tmp/a", "/tmp/b", NULL, MS_BIND, NULL) = 0
open("/tmp/waldo1", O_WRONLY|O_CREAT, 0777) = 3
close(3) = 0
open("/tmp/a/waldo2", O_WRONLY|O_CREAT, 0777) = 3
close(3) = 0
open("/tmp/b/waldo3", O_WRONLY|O_CREAT, 0777) = 3
close(3) = 0
mount(NULL, "/tmp/b", NULL, MS_RDONLY|MS_REMOUNT, NULL) = 0
open("/tmp/waldo1", O_WRONLY|O_CREAT, 0777) = -1 EROFS (Read-only file system)
brk(0) = 0x996e000
brk(0x998f000) = 0x998f000
write(2, "robind: robind.c:33: main: Asser"..., 54robind: robind.c:33: main: Assertion `r >= 0' failed.
) = 54
rt_sigprocmask(SIG_UNBLOCK, [ABRT], NULL, 8) = 0
gettid() = 8118
tgkill(8118, 8118, SIGABRT) = 0
--- SIGABRT (Aborted) @ 0 (0) ---
+++ killed by SIGABRT (core dumped) +++
Aborted (core dumped)

The two mounts go through, but we see how the second one changed the r/o bit of /tmp, so that subsequent write accesses fail with EROFS. And that shouldn't happen.

kyle@phobos / $ sudo mount -t tmpfs none /kyle
kyle@phobos / $ cd /kyle
kyle@phobos /kyle $ sudo mkdir a b
kyle@phobos /kyle $ sudo touch foo-root a/foo-A b/foo-B
kyle@phobos /kyle $ sudo strace -e mount mount --bind a/ b/
mount("/kyle/a", "b/", 0x7f3b06ee8dd0, MS_MGC_VAL|MS_BIND, NULL) = 0
kyle@phobos /kyle $ cat /proc/mounts | grep kyle
none /kyle tmpfs rw,relatime 0 0
none /kyle/b tmpfs rw,relatime 0 0
kyle@phobos /kyle $ sudo touch foo-root a/foo-A b/foo-B
kyle@phobos /kyle $ sudo strace -e mount mount -o remount,ro b/
mount("/kyle/a", "/kyle/b", 0x7f1581bd3dd0, MS_MGC_VAL|MS_RDONLY|MS_REMOUNT|MS_BIND, NULL) = 0
kyle@phobos /kyle $ sudo touch foo-root a/foo-A b/foo-B
touch: cannot touch `b/foo-B': Read-only file system
kyle@phobos /kyle $ cat /proc/mounts | grep kyle
none /kyle tmpfs rw,relatime 0 0
none /kyle/b tmpfs ro,relatime 0 0

Ok, working up to here.

kyle@phobos /kyle $ sudo strace -e mount mount -o remount,ro a/ b/
mount("/kyle/a", "b/", NULL, MS_MGC_VAL|MS_RDONLY|MS_REMOUNT, NULL) = 0
kyle@phobos /kyle $ sudo touch foo-root a/foo-A b/foo-B
touch: cannot touch `foo-root': Read-only file system
touch: cannot touch `a/foo-A': Read-only file system
touch: cannot touch `b/foo-B': Read-only file system
kyle@phobos /kyle $ cat /proc/mounts | grep kyle
none /kyle tmpfs ro,relatime 0 0
none /kyle/b tmpfs ro,relatime 0 0

Things break with mount -o remount,ro a/ b/

kyle@phobos /kyle $ sudo strace -e mount mount -o remount,rw a/ b/
mount("/kyle/a", "b/", NULL, MS_MGC_VAL|MS_REMOUNT, NULL) = 0
kyle@phobos /kyle $ cat /proc/mounts | grep kyle
none /kyle tmpfs rw,relatime 0 0
none /kyle/b tmpfs rw,relatime 0 0

Things are rw again, let's try making them ro

kyle@phobos /kyle $ sudo strace -e mount mount -o remount,ro b/
mount("/kyle/a", "/kyle/b", 0x7f38874f4900, MS_MGC_VAL|MS_RDONLY|MS_REMOUNT, NULL) = 0
kyle@phobos /kyle $ cat /proc/mounts | grep kyle
none /kyle tmpfs ro,relatime 0 0
none /kyle/b tmpfs ro,relatime 0 0

Wait, now it's propagating it back, even though the command worked before! (Notice that MS_BIND is no longer specified.)

Looks like a util-linux-ng bug in mount to me, and a corner case of the kernel behaviour. :\

Created attachment 408635 [details]
robind.c that works
Or-ing in MS_BIND on line 29 makes the test-case succeed. Not sure why util-linux is dropping the bit. (/kyle was a tmpfs freshly created.)
(In reply to comment #9)
> kyle@phobos /kyle $ sudo strace -e mount mount -o remount,ro a/ b/
> mount("/kyle/a", "b/", NULL, MS_MGC_VAL|MS_RDONLY|MS_REMOUNT, NULL) = 0
> kyle@phobos /kyle $ sudo touch foo-root a/foo-A b/foo-B
> touch: cannot touch `foo-root': Read-only file system
> touch: cannot touch `a/foo-A': Read-only file system
> touch: cannot touch `b/foo-B': Read-only file system
> kyle@phobos /kyle $ cat /proc/mounts | grep kyle
> none /kyle tmpfs ro,relatime 0 0
> none /kyle/b tmpfs ro,relatime 0 0
>
> Things break with mount -o remount,ro a/ b/

If you specify both paths, mtab will not be read and the only options set will be the ones you provide.

> kyle@phobos /kyle $ sudo strace -e mount mount -o remount,rw a/ b/
> mount("/kyle/a", "b/", NULL, MS_MGC_VAL|MS_REMOUNT, NULL) = 0
> kyle@phobos /kyle $ cat /proc/mounts | grep kyle
> none /kyle tmpfs rw,relatime 0 0
> none /kyle/b tmpfs rw,relatime 0 0
>
> Things are rw again, let's try making them ro
>

But now it's not a bind mount anymore.

> kyle@phobos /kyle $ sudo strace -e mount mount -o remount,ro b/
> mount("/kyle/a", "/kyle/b", 0x7f38874f4900, MS_MGC_VAL|MS_RDONLY|MS_REMOUNT,
> NULL) = 0
> kyle@phobos /kyle $ cat /proc/mounts | grep kyle
> none /kyle tmpfs ro,relatime 0 0
> none /kyle/b tmpfs ro,relatime 0 0
>
> Wait, now it's propagating it back, even though the command worked before!
> (Notice that MS_BIND is no longer specified.)
>
> Looks like a util-linux-ng bug in mount to me, and a corner case of the kernel
> behaviour. :\

Anyway, there is a bug somewhere, but I don't think it is the kernel. I.e. either mount needs to be fixed to OR in MS_BIND in this case, or the man page needs to be fixed not to claim that "mount -o remount,ro newdir" would do the job, if it really takes something like "mount --bind -o remount,ro newdir" to make things work.

Reassigning to util-linux-ng.

(Oh, and I verified that MS_BIND in the remount is indeed the one thing that makes things work.
Thanks Kyle, for tracking that down.)

Some notes:

- the same filesystem can be mounted more than once

- from the kernel's point of view there is no difference between

    mount /dev/sda1 /mnt/a
    mount /dev/sda1 /mnt/b

  and

    mount /dev/sda1 /mnt/a
    mount --bind /mnt/a /mnt/b

  for the kernel, the same FS is mounted in two places. The important detail is that the kernel does not keep track of how a mountpoint was created (bind or non-bind) and does not store the "bind" option in /proc/mounts. So "cat /proc/mounts" does not make sense here.

- MS_REMOUNT|MS_RDONLY updates the filesystem superblock, which means the change is visible everywhere in the VFS where the filesystem is mounted

- MS_REMOUNT|MS_BIND|MS_RDONLY updates the mount option, and the change is visible for that mountpoint only

- the "bind" option is maintained in /etc/mtab only

mount(8) behaviour:

a) "mount -o remount,ro /mnt/a" reads /etc/{mtab,fstab}

b) "mount -o remount,ro /mnt/b /mnt/a" does not read fstab/mtab and only updates mtab.

This mount(8) behaviour is documented in the mount.8 man page.

I see only one bug: the man page does not say that --bind is required for the remount on systems without /etc/mtab, or when mtab is ignored (because both the mount source and target are specified). I'll add this info to the man page.

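The difference between the two flag combinations in the notes above can be sketched in C. This is only a minimal illustration, not code taken from robind.c or util-linux-ng: the function names are hypothetical, the flag values come from <sys/mount.h>, and the mount() call itself needs root.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/mount.h>

/* Flags that change the read-only state of the whole filesystem
 * (superblock): every place where it is mounted is affected. */
unsigned long superblock_ro_flags(void)
{
    return MS_REMOUNT | MS_RDONLY;
}

/* Flags that change the read-only state of a single mountpoint only.
 * The extra MS_BIND bit is what keeps the change local. */
unsigned long mountpoint_ro_flags(void)
{
    return MS_REMOUNT | MS_BIND | MS_RDONLY;
}

/* Hypothetical helper: make one bind mount read-only without touching
 * the filesystem it was bound from. Requires root (CAP_SYS_ADMIN).
 * Returns 0 on success, -1 with errno set on failure. */
int remount_bind_readonly(const char *target)
{
    assert(target != NULL);
    /* Source, filesystem type and data are ignored for this remount. */
    return mount(NULL, target, NULL, mountpoint_ro_flags(), NULL);
}
```

Under these assumptions, calling remount_bind_readonly("/mnt/b") as root corresponds to "mount --bind -o remount,ro /mnt/b", while passing superblock_ro_flags() instead would reproduce the behaviour reported in this bug.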
(In reply to comment #12)
> not to claim that "mount -o remount,ro newdir" would do the job

It does the job if the "bind" option is stored in your mtab:

# mount /dev/sda6 /mnt/a
# mount --bind /mnt/a /mnt/b
# grep -E '/mnt/(a|b)' /proc/mounts
/dev/sda6 /mnt/a ext4 rw,relatime,barrier=1,data=ordered 0 0
/dev/sda6 /mnt/b ext4 rw,relatime,barrier=1,data=ordered 0 0
# mount -o remount,ro /mnt/b <<<<
# grep -E '/mnt/(a|b)' /proc/mounts
/dev/sda6 /mnt/a ext4 rw,relatime,barrier=1,data=ordered 0 0
/dev/sda6 /mnt/b ext4 ro,relatime,barrier=1,data=ordered 0 0
                      ^^

(In reply to comment #13)
> I see only one bug: the man page does not say that --bind is required for the
> remount on systems without /etc/mtab, or when mtab is ignored (because both
> the mount source and target are specified). I'll add this info to the man
> page.

The upstream version of the man page has been updated. It will be available in F-14.