Bug 537463 - du using a ghosted automounted directory results in 'No such file or directory'
Status: CLOSED ERRATA
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: coreutils
Version: 5.7
Hardware: All
OS: Linux
Priority: medium
Severity: medium
Target Milestone: rc
Target Release: ---
Assigned To: Ondrej Vasik
QA Contact: qe-baseos-daemons
Duplicates: 509923
Depends On:
Blocks:

Reported: 2009-11-13 12:46 EST by Jeff Bastian
Modified: 2011-07-21 08:11 EDT (History)

See Also:
Fixed In Version: coreutils-5.97-31.el5
Doc Type: Bug Fix
Doc Text:
If the ghost option is enabled for an automount point, the du command fails on an automounted directory that is not mounted yet, and succeeds only on the second attempt. An additional fts_stat call was added so that du sees subsequent changes in the hierarchy, and du now succeeds on the first attempt.
Last Closed: 2011-07-21 06:35:31 EDT


Attachments
autofs debug logs from the 'du' test (16.01 KB, text/plain)
2009-11-13 12:51 EST, Jeff Bastian


External Trackers
Tracker ID Priority Status Summary Last Updated
Red Hat Knowledge Base (Legacy) 23570 None None None Never

Description Jeff Bastian 2009-11-13 12:46:26 EST
Description of problem:
If the ghost option is enabled for an automount point, the du command fails on an automounted directory if it's not mounted yet.  It works on the second attempt.

This problem was fixed for the find command in bug 448869, but it persists for du.


Version-Release number of selected component (if applicable):
autofs-5.0.1-0.rc2.131.el5_4.1
kernel-2.6.18-164.6.1.el5
kernel-xen-2.6.18-164.6.1.el5


How reproducible:
every time

Steps to Reproduce:
1. Configure an automount to use the ghost option
     [jbastian@termite ~]$ grep data /etc/auto.master
     /data   /etc/auto.data --ghost
     [jbastian@termite ~]$ grep centipede /etc/auto.data
     centipede       -intr,tcp centipede:/export
     [jbastian@termite ~]$ sudo service autofs restart
     Stopping automount:                                        [  OK  ]
     Starting automount:                                        [  OK  ]

2. Verify that the mount point is not yet mounted
     [jbastian@termite ~]$ grep centipede /proc/mounts
     [jbastian@termite ~]$

3. Try running 'du' on the mount point
     [jbastian@termite ~]$ du /data/centipede
     du: `/data/centipede': No such file or directory

4. Note that it did get mounted, however
     [jbastian@termite ~]$ grep centipede /proc/mounts
     centipede:/export /data/centipede nfs rw,vers=3,rsize=32768,...

5. Subsequent attempts work fine
     [jbastian@termite ~]$ du /data/centipede
     4448    /data/centipede/data/wallpapers
     4456    /data/centipede/data
     4464    /data/centipede
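
Steps 2 and 4 can also be checked programmatically rather than with grep. This is a minimal sketch, assuming the usual whitespace-separated /proc/mounts layout with the mount point in the second field; `is_mounted`, `read_proc_mounts` and `demo` are hypothetical helper names, not part of autofs or coreutils.

```python
def is_mounted(mount_point, mounts_text):
    """Return True if mount_point is a mount target in /proc/mounts-style text."""
    for line in mounts_text.splitlines():
        fields = line.split()
        # Assumed layout: device, mount point, fs type, options, ...
        if len(fields) >= 2 and fields[1] == mount_point:
            return True
    return False


def read_proc_mounts(path="/proc/mounts"):
    """Read the live mount table (Linux only)."""
    with open(path) as f:
        return f.read()


def demo():
    # Before the first 'du': nothing for centipede is listed (step 2).
    before = ""
    # After the first 'du': the NFS mount line from step 4 of the report.
    after = "centipede:/export /data/centipede nfs rw,vers=3,rsize=32768,...\n"
    return is_mounted("/data/centipede", before), is_mounted("/data/centipede", after)
```

On a live system, `is_mounted("/data/centipede", read_proc_mounts())` would be called once before and once after the first du run.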

Actual results:
du failed on the first attempt (but it did mount the directory)

Expected results:
du succeeds

Additional info:
Comment 1 Jeff Bastian 2009-11-13 12:51:37 EST
Created attachment 369466
autofs debug logs from the 'du' test
Comment 2 Jeff Bastian 2009-11-13 16:16:43 EST
Curiously, if I add a "/." to the end of the path, it works on the first try:

   [jbastian@termite ~]$ grep centipede /proc/mounts
   [jbastian@termite ~]$ du /data/centipede/.
   4448    /data/centipede/./data/wallpapers
   4456    /data/centipede/./data
   4464    /data/centipede/.
Comment 3 Jeff Bastian 2009-11-13 16:53:09 EST
Side note: du works on the first try with Fedora 11 (autofs-5.0.4-45.x86_64 and kernel-2.6.30.9-96.fc11.x86_64)
Comment 4 Jeff Bastian 2009-11-13 17:35:36 EST
There might be a problem with 'du' on RHEL-5.  With a bit of hacking, I managed to build RHEL-5's coreutils-5.97 on Fedora 11, and 'du' failed with the same 'No such file or directory' error on F-11.


I then built the latest coreutils from upstream on RHEL-5 and it worked fine:

     [jbastian@termite ~]$ grep centipede /proc/mounts
     [jbastian@termite ~]$ /tmp/coreutils-7.6/src/du /data/centipede
     4448    /data/centipede/data/wallpapers
     4456    /data/centipede/data
     4456    /data/centipede

I'll move this to the coreutils component.
Comment 5 Jeff Bastian 2009-11-23 18:20:18 EST
Note: 'du -L' also works fine.

   [jbastian@termite ~]$ grep centipede /proc/mounts
   [jbastian@termite ~]$ du -L /data/centipede
   4448    /data/centipede/data/wallpapers
   4456    /data/centipede/data
   4456    /data/centipede

I've been comparing coreutils-5.97 with coreutils-6.5 (which works). The -L flag changes the fts_open() flags from FTS_PHYSICAL to FTS_LOGICAL.

The lib/fts* files have had hundreds of changes between 5.97 and 6.5, and it's difficult to narrow down which of them might have an effect here.
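
To make the FTS_PHYSICAL versus FTS_LOGICAL difference concrete: a physical walk stats entries without following symlinks (lstat), while a logical walk follows them (stat). The sketch below only illustrates that distinction from Python; `stat_entry` and `demo` are hypothetical names, and it does not call fts itself.

```python
import os
import stat
import tempfile


def stat_entry(path, logical=False):
    """Stat a path the way a physical (lstat) or logical (stat) walk would."""
    return os.stat(path) if logical else os.lstat(path)


def demo():
    with tempfile.TemporaryDirectory() as d:
        target = os.path.join(d, "target")
        link = os.path.join(d, "link")
        os.mkdir(target)
        os.symlink(target, link)
        physical = stat_entry(link)                # describes the symlink itself
        logical = stat_entry(link, logical=True)   # describes the directory behind it
        return stat.S_ISLNK(physical.st_mode), stat.S_ISDIR(logical.st_mode)
```

Whether this difference alone explains why 'du -L' avoids the error is not established here.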
Comment 6 Ian Kent 2009-11-23 22:54:48 EST
I'm not sure this will be useful but I had a quick look at
strace output from my F-10 system where this worked as required
(although I'm not sure this working as required is actually a
good thing, but that's another story) and RHEL-5.3 where it didn't.

Note that I observed that the -L appears to work but actually doesn't
quite get it right either.

On F-10 the interesting bit (du /test/foo):
newfstatat(AT_FDCWD, "/test/foo", {st_mode=02, st_size=17592186044416, ...}, AT_SYMLINK_NOFOLLOW) = 0
openat(AT_FDCWD, "/test/foo", O_RDONLY) = 3
fstat(3, {st_mode=S_IFDIR|0555, st_size=4096, ...}) = 0
fcntl(3, F_GETFL)                       = 0x8000 (flags O_RDONLY|O_LARGEFILE)
fcntl(3, F_SETFD, FD_CLOEXEC)           = 0
fcntl(3, F_DUPFD, 3)                    = 4
getdents(3, /* 2 entries */, 4096)      = 48
getdents(3, /* 0 entries */, 4096)      = 0
close(3) 

and on RHEL (du /test/foo):
lstat("/test/foo", {st_mode=S_IFDIR|0555, st_size=0, ...}) = 0
open(".", O_RDONLY|O_NOCTTY|O_NONBLOCK|O_DIRECTORY) = 3
fchdir(3)                               = 0
open("/test/foo", O_RDONLY|O_NONBLOCK|O_DIRECTORY) = 4
fstat(4, {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
fcntl(4, F_SETFD, FD_CLOEXEC)           = 0
fstat(4, {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
close(4)

and on RHEL (du -L /test/foo):
stat("/test/foo", {st_mode=S_IFDIR|0555, st_size=0, ...}) = 0
open("/test/foo", O_RDONLY|O_NONBLOCK|O_DIRECTORY) = 3
fstat(3, {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
fcntl(3, F_SETFD, FD_CLOEXEC)           = 0
getdents(3, /* 2 entries */, 4096)      = 48
getdents(3, /* 0 entries */, 4096)      = 0
close(3)

The first thing that stands out is the use of O_NONBLOCK.
Although I only looked briefly, I didn't see any select() or poll()
calls that would wait for this potentially lengthy callback to the
automount daemon to complete, and I noticed what looks like incomplete
information returned for /test/foo from the "du -L" command. Perhaps
-L appears to work because it takes longer to complete, but I can't
say for sure.

OTOH, in the F-10 trace the newfstatat() shouldn't cause the mount
to happen but the openat() will and it will block until the mount
completes. At least I think that is the case but I will need to
check the flags used in the kernel to say for sure. In any case
it behaves as required so I'm likely correct.

Anyway, I'll look a bit more when I get a chance.
Ian
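
In the RHEL trace above, lstat() reports mode 0555 and size 0 while fstat() on the opened descriptor reports 0755 and size 4096, which suggests the path refers to a different object once the open has triggered the mount. A minimal sketch of how to observe that from user space, comparing (st_dev, st_ino) before and after the open; the helper names are hypothetical and the comparison on device/inode is an assumption, since the trace itself does not show those fields.

```python
import os
import tempfile


def identity(st):
    """Reduce a stat result to the (device, inode) pair that identifies a file."""
    return (st.st_dev, st.st_ino)


def stat_then_open(path):
    """Return the identity seen by lstat() and by fstat() on the opened directory."""
    before = identity(os.lstat(path))
    fd = os.open(path, os.O_RDONLY | os.O_DIRECTORY)
    try:
        after = identity(os.fstat(fd))
    finally:
        os.close(fd)
    return before, after


def changed_underneath(path):
    """True if the directory's identity changed between lstat() and open()."""
    before, after = stat_then_open(path)
    return before != after


def demo():
    # An ordinary directory keeps its identity across the open.
    with tempfile.TemporaryDirectory() as d:
        return changed_underneath(d)
```

On an unmounted ghosted autofs directory one would expect `changed_underneath()` to return True on the first call, but that has not been verified here.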
Comment 7 Ian Kent 2009-11-23 22:59:14 EST
(In reply to comment #6)
> 
> OTOH, in the F-10 trace the newfstatat() shouldn't cause the mount
> to happen but the openat() will and it will block until the mount
> completes. At least I think that is the case but I will need to
> check the flags used in the kernel to say for sure. In any case
> it behaves as required so I'm likely correct.

To clarify, openat() will block if it triggers a mount.

My question to myself is whether it will trigger the automount
with the provided flags, and that appears to be the case (so
probably not much of a question really).

Ian
Comment 16 RHEL Product and Program Management 2010-08-09 14:38:14 EDT
This request was evaluated by Red Hat Product Management for
inclusion in the current release of Red Hat Enterprise Linux.
Because the affected component is not scheduled to be updated in the
current release, Red Hat is unfortunately unable to address this
request at this time. Red Hat invites you to ask your support
representative to propose this request, if appropriate and relevant,
in the next release of Red Hat Enterprise Linux.
Comment 18 Kamil Dudka 2011-01-11 09:34:07 EST
I am able to repeat the failure.  Bug 509923 and bug 501848 seem to be closely related.  I'll try to find some solution.
Comment 19 Kamil Dudka 2011-01-11 16:04:11 EST
Created attachment 472895
possible solution

It looks like el5 clone of bug 501848 - the attached patch solves the problem for me.  Could anybody give it a try?
Comment 20 RHEL Product and Program Management 2011-01-11 16:04:31 EST
This request was evaluated by Red Hat Product Management for
inclusion in the current release of Red Hat Enterprise Linux.
Because the affected component is not scheduled to be updated in the
current release, Red Hat is unfortunately unable to address this
request at this time. Red Hat invites you to ask your support
representative to propose this request, if appropriate and relevant,
in the next release of Red Hat Enterprise Linux.
Comment 23 Kamil Dudka 2011-01-11 16:26:01 EST
RHEL-5 findutils also detects the dev/ino change underneath it, but seems to be able to survive the event -- it just prints a warning:

# umount /autofs/boot; find /autofs
/autofs
/autofs/boot
find: WARNING: Hard link count is wrong for /autofs/boot: this may be a bug in your filesystem driver.  Automatically turning on find's -noleaf option.  Earlier results may have failed to include directories that should have been searched.
/autofs/boot/grub
...
Comment 25 Kamil Dudka 2011-01-11 17:33:17 EST
This request was erroneously denied for the current release of
Red Hat Enterprise Linux.  The error has been fixed and this
request has been re-proposed for the current release.
Comment 26 Jeff Bastian 2011-01-12 15:01:20 EST
(In reply to comment #19)
> It looks like el5 clone of bug 501848 - the attached patch solves the problem
> for me.  Could anybody give it a try?


Your patch fixed the problem for me!

And to my surprise, most of my original reproducer system was still set up from over a year ago so it was easy to verify this patch fixed the bug.

Jeff
Comment 28 Kamil Dudka 2011-01-12 15:20:34 EST
Comment on attachment 472895
possible solution

Jeff, thank you for testing the patch.
Comment 33 Ondrej Vasik 2011-03-31 10:05:29 EDT
    Technical note added. If any revisions are required, please edit the "Technical Notes" field
    accordingly. All revisions will be proofread by the Engineering Content Services team.
    
    New Contents:
If the ghost option is enabled for an automount point, the du command fails on an automounted directory that is not mounted yet, and succeeds only on the second attempt. An additional fts_stat call was added so that du sees subsequent changes in the hierarchy, and du now succeeds on the first attempt.
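
The idea behind the note can be sketched as follows. This is not the actual patch: it is a simplified model that assumes the traversal caches a (device, inode) pair per directory, treats a mismatch after open as ENOENT in the old behaviour, and refreshes the cached pair in the new behaviour. `enter_directory`, `restat_on_change` and `demo` are hypothetical names.

```python
import errno
import os
import tempfile


def enter_directory(path, cached_id, restat_on_change):
    """Open path and reconcile its identity with the cached (dev, ino) pair.

    Returns the identity the caller should keep using.
    """
    fd = os.open(path, os.O_RDONLY | os.O_DIRECTORY)
    try:
        st = os.fstat(fd)
    finally:
        os.close(fd)
    current_id = (st.st_dev, st.st_ino)
    if current_id != cached_id:
        if not restat_on_change:
            # Modeled old behaviour: the directory "disappeared".
            raise FileNotFoundError(errno.ENOENT, os.strerror(errno.ENOENT), path)
        # Modeled new behaviour: accept the change and refresh the cache.
        cached_id = current_id
    return cached_id


def demo():
    with tempfile.TemporaryDirectory() as d:
        st = os.lstat(d)
        real = (st.st_dev, st.st_ino)
        stale = (real[0], real[1] + 1)  # stands in for the pre-mount identity
        try:
            enter_directory(d, stale, restat_on_change=False)
            strict_failed = False
        except FileNotFoundError:
            strict_failed = True
        refreshed = enter_directory(d, stale, restat_on_change=True)
        return strict_failed, refreshed == real
```

With the stale identity, the strict variant fails with "No such file or directory" while the refreshing variant proceeds, which mirrors the first-attempt failure and the fixed behaviour described above.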
Comment 34 Ondrej Vasik 2011-04-01 10:29:21 EDT
*** Bug 509923 has been marked as a duplicate of this bug. ***
Comment 37 errata-xmlrpc 2011-07-21 06:35:31 EDT
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2011-1074.html
Comment 38 errata-xmlrpc 2011-07-21 08:11:40 EDT
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2011-1074.html
