+++ This bug was initially created as a clone of Bug #237108 +++

Description of problem:
When exporting mounts across filesystems, NFSv4 does not work properly.

Version-Release number of selected component (if applicable):
2.6.9-42.EL

How reproducible:
Always (in my test setup)

Steps to Reproduce:
Build a RHEL4-U4 server, create several partitions, and enable NFS.

Disk configuration:
/dev/hda1 /boot
/dev/hda2 /
/dev/hda3 /export

In the locally mounted filesystem /export, create the directory /export/home.

/etc/exports:
/ *(rw,fsid=0,sync,nohide,no_root_squash)
/export/home *(rw,fsid=1,sync,nohide,no_root_squash)

Actual results:
From a client you will not be able to mount the server via NFSv4. You will get an error.

Expected results:
This should work fine.

Additional info:
Basically, it looks like NFSv4 can't export off of the root filesystem. Spoke with SteveD about this already. He can reproduce the error case.

-- Additional comment from steved on 2007-04-19 15:24 EST --

I believe I know what the problem is. There are two exports, '/' and '/export/home', with '/' being the pseudo root. Both have nohide set and both have different fsids, which is needed.

Now the reason 'mount -tnfs4 server:/export/home' fails with permission denied is that even though '/' and '/export/home' are exported, the middle directory '/export' is not. So when the server tries to access /export on its way to /export/home, the access is denied.

Now the reason 'mount -tnfs3 server:/export' succeeds but the directory is empty is that the server is seeing the export directory on '/' but not the mount point.

So exporting the middle directory 'export' should take care of this problem:

/ *(rw,fsid=0,sync,nohide,no_root_squash)
/export/ *(rw,fsid=1,sync,nohide,no_root_squash)
/export/home *(rw,fsid=2,sync,nohide,no_root_squash)

-- Additional comment from jburke on 2007-04-27 08:50 EST --

Exporting the middle directory did work.
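
For illustration only, the gap described above can be found mechanically. The following is a minimal Python sketch, not anything taken from nfs-utils or the kernel: the function name missing_intermediate_exports and its two arguments are hypothetical, and it assumes the list of mountpoints is supplied by the caller. It reports every mountpoint that lies between the pseudo root and an export but is not itself exported.

```python
from pathlib import PurePosixPath


def missing_intermediate_exports(exports, mountpoints):
    """Return mountpoints that sit above an export but are not exported.

    Hypothetical helper for illustration: 'exports' is the list of paths
    from /etc/exports, 'mountpoints' is the list of local mountpoints.
    """
    exported = {PurePosixPath(p) for p in exports}
    mounts = {PurePosixPath(p) for p in mountpoints}
    missing = set()
    for path in exported:
        # .parents walks from the immediate parent up to '/'.
        for ancestor in path.parents:
            if ancestor in mounts and ancestor not in exported:
                missing.add(ancestor)
    return sorted(str(p) for p in missing)


mountpoints = ["/", "/boot", "/export"]

# Configuration from the original report: /export is a gap.
print(missing_intermediate_exports(["/", "/export/home"], mountpoints))

# Configuration with the middle directory exported: no gap left.
print(missing_intermediate_exports(["/", "/export/", "/export/home"], mountpoints))
```

With the disk layout from this report, the first call lists /export and the second returns an empty list, which matches the workaround confirmed above.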

-- Additional comment from jlayton on 2007-06-22 12:06 EST --

Looks like an upstream and RHEL5 issue as well...

-- Additional comment from jlayton on 2007-06-22 15:30 EST --

Yikes, this is a sticky problem. 4.6 seems doubtful -- setting to 4.7.

RFC 3530 has quite a bit to say about this issue (around chapter 7). The server is expected to create a 'pseudo' export when there are gaps in the namespace like this, but it seems like no version of Linux so far actually does. I can see why -- you need to detect when a mountpoint might have exports that are below it and treat that differently than the case when one doesn't. I don't see how to do that without some fairly complex logic.

Perhaps we can fix this up by making implicit exports for mountpoints that lie in between the namespace root and the actual export. We'd have to flag them so they'd be invisible to users, but the lookups could then cross them...

-- Additional comment from jlayton on 2007-06-25 10:14 EST --

Actually, it looks like Bruce Fields posted a status update to the NFS4 list and mentioned this problem specifically:

---------[snip]-----------
- export paths consistent with nfsv2/v3: No really promising progress, though (thanks mainly to Bryce Harrington), we do now have some prototype code at:

    git://linux-nfs.org/~bfields/nfs-tuils.git pseudo-export

  which attempts to solve the whole problem in userspace by automatically loopback-mounting a file to serve as the NFSv4 root, bind-mounting under it everything listed in the exports, and faking up corresponding exports in mountd. The code more or less works, but I'm not sure it's the right approach. This is a really important problem, and not one I've had a lot of time to work on lately, so we could definitely still use help here.
---------[snip]-----------

I need to have a closer look, but this looks like it's really still in the design phase.
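
To make the bind-mount idea in that status update a little more concrete, here is a small Python sketch of how the mount plan could be computed. It is not the prototype's code and makes no claim about how the prototype works internally: the function name bind_mount_plan and the pseudo-root path /nfs4root are assumptions made up for this example, and the sketch only builds the list of mounts, it does not run anything.

```python
from pathlib import PurePosixPath


def bind_mount_plan(exports, pseudo_root):
    """Map each exported path to a target under a pseudo root.

    Hypothetical helper: returns (source, target) pairs that a tool
    could turn into 'mount --bind SOURCE TARGET' calls.
    """
    root = PurePosixPath("/")
    base = PurePosixPath(pseudo_root)
    plan = []
    for export in sorted(exports):
        src = PurePosixPath(export)
        if src == root:
            # Binding the whole root under the pseudo root is left out
            # of this sketch.
            continue
        # Mirror the export's path below the pseudo root.
        target = base / src.relative_to(root)
        plan.append((str(src), str(target)))
    return plan


for source, target in bind_mount_plan(["/export/home"], "/nfs4root"):
    print(f"mount --bind {source} {target}")
```

The intermediate directory /nfs4root/export exists only inside the pseudo root in this scheme, which is what fills the gap in the namespace without exporting the real /export.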
I'm going to see if I can get involved with it, but I don't have a good idea so far of the right approach for solving this. I don't really like this userspace hackery either, but we might be able to use that as a model of how to fix this in kernel space.

-- Additional comment from jlayton on 2007-07-10 10:53 EST --

I figure we're going to have to do something similar to the approach that Bryce has done here. When exports are done, we need to fill in the gaps with a pseudo filesystem of some sort. Some possible approaches:

1) Use Bryce's approach whole-hog. Do it all in userspace with a /dev/loop ext2 fs as the base.

2) Use Bryce's basic approach, but use tmpfs as the underlying filesystem (may require kernel work to make sure that tmpfs is up to the job). I'd prefer this rather than dealing with losetup.

3) Move all of it into the kernel, perhaps with a new filesystem type as the underlying pseudo-rootfs. We could maintain a separate pseudofs root directory with appropriate bind mounts that's disconnected from the actual root dir.

4) Don't do any sort of pseudo-rootfs. When a directory is exported, check to see if it's "disconnected". If it is, backtrack dentries until we get to something that is exported. Tag those dentries with a flag that indicates that they are only to be used for LOOKUP calls, and fix up LOOKUP to recognize a dir that has been tagged this way. Only do lookups in it for entries that are valid exports or similarly tagged entries.

This is still all very much in the blue-sky stage. While doing this in kernel somehow feels cleaner to me, I don't know of a good reason to *not* do it in userspace. Perhaps #2 is the best thing to do?

-- Additional comment from jlayton on 2007-07-10 11:08 EST --

#4 might be really tough when mixed in with subtree-check.
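
As a rough model of approach #4, the tagging and the restricted LOOKUP can be sketched in Python using plain paths in place of dentries. Everything here is hypothetical (the names tag_lookup_only and visible_entries are invented for this example), and it simplifies by treating every unexported ancestor as "disconnected" rather than checking filesystem boundaries or subtree-check.

```python
from pathlib import PurePosixPath


def tag_lookup_only(exports):
    """Backtrack from each export and tag unexported ancestors.

    Returns (exported, tagged); tagged directories would be usable
    for LOOKUP only. Hypothetical model, paths instead of dentries.
    """
    exported = {PurePosixPath(p) for p in exports}
    tagged = set()
    for path in exported:
        for ancestor in path.parents:
            if ancestor in exported:
                break  # reached something that is exported
            tagged.add(ancestor)
    return exported, tagged


def visible_entries(directory, exported, tagged):
    """Names a LOOKUP may resolve inside a tagged directory."""
    directory = PurePosixPath(directory)
    candidates = exported | tagged
    return sorted(
        p.name for p in candidates
        if p.parent == directory and p != directory
    )


exported, tagged = tag_lookup_only(["/", "/export/home"])
print(sorted(str(p) for p in tagged))
print(visible_entries("/export", exported, tagged))
```

For the exports in this report, only /export gets tagged, and a lookup inside it resolves only "home"; any other entry in /export stays invisible to clients.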
There seems to be a general idea that tmpfs (or something like it) as the pseudo root is the right thing to do. The problem there is that the filehandles for those sorts of directories can change (for instance, on reboot). So we need to declare those filehandles volatile -- something we don't do much with in Linux so far.
*** Bug 248062 has been marked as a duplicate of this bug. ***
I'm thinking it does not make sense to backport the RHEL6 code to RHEL5 since there is so much difference... Unless someone can come up with a compelling argument, I'm going to close this bz with a WONTFIX.