Bug 533569 - nfs4: two directories may have identical st_dev and st_ino
Status: CLOSED WONTFIX
Product: Fedora
Classification: Fedora
Component: kernel
Version: rawhide
Hardware: All Linux
Priority: high
Severity: medium
Assigned To: Jeff Layton
QA Contact: Fedora Extras Quality Assurance
Reported: 2009-11-07 06:52 EST by Jim Meyering
Modified: 2013-03-13 16:41 EDT
Doc Type: Bug Fix
Last Closed: 2012-10-18 11:29:02 EDT
Description Jim Meyering 2009-11-07 06:52:48 EST
Description of problem: in the vicinity of a mount point directory, two directories may have the same device and inode number.  This is a serious problem because many tools treat the condition as indicating a hard directory cycle, which usually indicates file system corruption.

Version-Release number of selected component (if applicable):
2.6.31.5-122.fc12.x86_64

How reproducible: every time

Steps to Reproduce:
Based on the set-up from Kamil Dudka in https://bugzilla.redhat.com/show_bug.cgi?id=501848#c45

# mount | grep ^/
...
/dev/sda8 on /home type ext4 (rw,noatime)
...
# top=/home
# cat /etc/exports
# printf "/ *(fsid=0,crossmnt)\n$top *(crossmnt)\n" >> /etc/exports
# service nfs restart
...
# mkdir /tmp/mnt
# mount -t nfs4 localhost:/ /tmp/mnt
# stat --printf "%d %i %n\n" /tmp/mnt{,$top}
22 2 /tmp/mnt
22 2 /tmp/mnt/home

Then, using the very latest du from upstream coreutils.git,
I see this:

    $ du /tmp/mnt > /dev/null
    du: WARNING: Circular directory structure.
    This almost certainly means that you have a corrupted file system.
    NOTIFY YOUR SYSTEM MANAGER.
    The following directory is part of the cycle:
      `/tmp/mnt/home'

Actual results: above


Expected results: different dev and/or inode, no du failure
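For readers unfamiliar with why an identical (st_dev, st_ino) pair upsets traversal tools, here is a minimal Python sketch of an ancestor check of that general kind. It is not du's actual implementation; the names `dir_id` and `find_apparent_cycles` are hypothetical, and the `get_id` parameter exists only so the check can be exercised without an NFS mount.

```python
import os
from typing import Callable, Dict, Iterator, List, Tuple

DirId = Tuple[int, int]  # (st_dev, st_ino)


def dir_id(path: str) -> DirId:
    """Return (st_dev, st_ino) for path, using lstat() like a tree walker would."""
    st = os.lstat(path)
    return (st.st_dev, st.st_ino)


def find_apparent_cycles(
    top: str, get_id: Callable[[str], DirId] = dir_id
) -> Iterator[Tuple[str, str]]:
    """Yield (directory, ancestor) pairs that report the same (st_dev, st_ino)."""
    # Each stack entry carries the identities of the ancestors above that path.
    stack: List[Tuple[str, Dict[DirId, str]]] = [(top, {})]
    while stack:
        path, ancestors = stack.pop()
        ident = get_id(path)
        if ident in ancestors:
            # Same identity as a directory we are already inside: looks like a cycle.
            yield (path, ancestors[ident])
            continue  # do not descend any further
        below = dict(ancestors)
        below[ident] = path
        try:
            with os.scandir(path) as entries:
                children = [e.path for e in entries if e.is_dir(follow_symlinks=False)]
        except OSError:
            continue  # unreadable directory: skip it in this sketch
        for child in children:
            stack.append((child, below))
```

On the setup above, `list(find_apparent_cycles("/tmp/mnt"))` should report `/tmp/mnt/home` against `/tmp/mnt`, since both show up as `22 2` in the stat output, even though no real cycle exists.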


Additional info:
Comment 1 Steve Dickson 2009-11-10 13:33:05 EST
> # stat --printf "%d %i %n\n" /tmp/mnt{,$top}
> 22 2 /tmp/mnt
> 22 2 /tmp/mnt/home
I do see this... but 
    
> $ du /tmp/mnt > /dev/null
> du: WARNING: Circular directory structure.
> This almost certainly means that you have a corrupted file system.
> NOTIFY YOUR SYSTEM MANAGER.
> The following directory is part of the cycle:
> `/tmp/mnt/home'

What kernel and nfs-utils versions are you using?
Comment 2 Steve Dickson 2009-11-10 13:35:03 EST
I meant to say... I don't see the du error... what kernel/nfs-utils are
you using..
Comment 3 Kamil Dudka 2009-11-10 13:37:09 EST
(In reply to comment #2)
> I meant to say... I don't see the du error... what kernel/nfs-utils are
> you using..  

You need to compile GNU coreutils from git to see the error.
Comment 4 Jim Meyering 2009-11-10 13:39:56 EST
Hi Steve, kernel version is listed above.
nfs-utils-1.2.0-18.fc12.x86_64
Comment 5 Jeff Layton 2009-11-10 13:45:33 EST
I think I understand what the issue is here. I just don't think that there's much we can do about it...

The stat program is doing an lstat(), and that doesn't trigger a submount (LOOKUP_FOLLOW isn't set). So we end up doing a GETATTR call that returns info on the root inode of the /home mount. The result is that the stat() syscall gets the "real" st_ino of /tmp/mnt/home, but the st_dev is still that of the parent (/tmp/mnt).

This is particularly evident here because the root of any ext3/4 filesystem has an st_ino of 2.
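To see the effect from userspace, a small diagnostic like the following Python sketch compares what different lookups report for the same directory. The function names are hypothetical, and it is an assumption (dependent on kernel version) that stat() or a trailing "/." actually triggers the submount; the sketch only reports what each lookup returns.

```python
import os
from typing import Dict, Tuple


def identity_views(path: str) -> Dict[str, Tuple[int, int]]:
    """Return (st_dev, st_ino) for path as seen through three different lookups."""

    def ident(st: os.stat_result) -> Tuple[int, int]:
        return (st.st_dev, st.st_ino)

    return {
        # lstat(): the final path component is not followed.
        "lstat": ident(os.lstat(path)),
        # stat(): the final path component is followed.
        "stat": ident(os.stat(path)),
        # A trailing "/." makes the kernel walk into the directory itself.
        "stat_dot": ident(os.stat(os.path.join(path, "."))),
    }


def views_disagree(path: str) -> bool:
    """True if the three lookups do not all report the same (st_dev, st_ino)."""
    return len(set(identity_views(path).values())) > 1
```

On an ordinary local directory all three views agree. On a path such as /tmp/mnt/home, a differing st_dev between the views would point to the not-yet-triggered submount described here.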

I think our options are:

1) fix the kernel to trigger a submount even when LOOKUP_FOLLOW isn't set (quite possibly very hard on performance)

2) fix the kernel to return a bit more info when we have a "potential mountpoint" like this. My suggestion on LKML was to co-opt a new st_mode/i_mode bit and use that to indicate that a directory is potentially a new mountpoint if someone were to walk into it.

So far, my suggestion hasn't received any feedback upstream.
Comment 6 Bug Zapper 2009-11-16 10:17:01 EST
This bug appears to have been reported against 'rawhide' during the Fedora 12 development cycle.
Changing version to '12'.

More information and reason for this action is here:
http://fedoraproject.org/wiki/BugZappers/HouseKeeping
Comment 7 Jim Meyering 2010-03-16 03:48:36 EDT
AFAIK, nothing has changed, so I've reset "Version:" to rawhide.
Comment 8 Bug Zapper 2010-03-16 08:18:51 EDT
This bug appears to have been reported against 'rawhide' during the Fedora 13 development cycle.
Changing version to '13'.

More information and reason for this action is here:
http://fedoraproject.org/wiki/BugZappers/HouseKeeping
Comment 9 Jim Meyering 2010-04-08 08:33:19 EDT
Still affects rawhide, too.
Comment 10 Bug Zapper 2010-07-30 06:46:43 EDT
This bug appears to have been reported against 'rawhide' during the Fedora 14 development cycle.
Changing version to '14'.

More information and reason for this action is here:
http://fedoraproject.org/wiki/BugZappers/HouseKeeping
Comment 11 Jim Meyering 2010-09-02 03:02:44 EDT
Changing version back to 'rawhide'.
Comment 12 Ric Wheeler 2012-10-17 03:45:33 EDT
Is this something that we can change in upstream or should we close this out?
Comment 13 Jeff Layton 2012-10-18 11:29:02 EDT
Not much we can do, I don't think...

If anything, the automount semantics are even less likely to trigger a mount these days. I think the only hope for this problem is the xstat() work that dhowells was working on, but that has sort of died upstream.

I'll go ahead and close this WONTFIX for now. Please reopen it if you want to discuss it further.
