Description of problem:
negative dentries causing screwiness when unlinking a file stored on an NFS mount.

Version-Release number of selected component (if applicable):
kernel-2.4.21-20.EL

How reproducible:
Always

Steps to Reproduce:
1. Export a local filesystem via NFS
2. mkdir /mnt/1 /mnt/2
3. mount -t nfs server:/export /mnt/1
4. mount -t nfs server:/export /mnt/2
5. ls /mnt/1/file-that-does-not-exist
6. touch /mnt/2/file-that-does-not-exist

Actual results:
[root@hogwash mnt]# mount -t nfs localhost:/tmp /mnt/1
[root@hogwash mnt]# mount -t nfs localhost:/tmp /mnt/2
[root@hogwash mnt]# ls /mnt/1/oogabooga
ls: /mnt/1/oogabooga: No such file or directory
[root@hogwash mnt]# touch /mnt/2/oogabooga
[root@hogwash mnt]# rm /mnt/1/oogabooga
rm: cannot lstat `/mnt/1/oogabooga': No such file or directory
[root@hogwash mnt]# ls /mnt/[12]/oogabooga
/mnt/2/oogabooga
[root@hogwash mnt]#

Expected results:
[root@hogwash mnt]# ls /mnt/1/oogabooga
ls: /mnt/1/oogabooga: No such file or directory
[root@hogwash mnt]# touch /mnt/2/oogabooga
[root@hogwash mnt]# rm /mnt/1/oogabooga
[root@hogwash mnt]#

Additional info:
A workaround is to mount the file system with acdirmin=0 and acdirmax=0. Then the nfs_neg_need_reval() function in fs/nfs/dir.c always returns true, meaning the nfs code never trusts negative dentries, and always does a fresh LOOKUP. But this then affects all system calls, not just unlink(). And it hurts NFS performance a lot.
Is there any news on this?
Well, it appears to be an interoperability issue, since I cannot reproduce the problem with a RHEL3 server and client. But it is reproducible with a Solaris 10 server and a RHEL3 client (as stated in the report). Investigating the posted patch and other possible options.
This patch is a bit confusing. Although I like the idea of intent bits, I don't see how this patch helps, since it's not the sys_unlink() that is failing (during the 'rm /mnt/1/oogabooga'), it's the lstat(). That means sys_unlink() is not even being called, so it is unclear to me how setting intent bits in sys_unlink() will help.

Now, the reason lstat() (or the lookup of /mnt/1/oogabooga) is failing is because of how the VFS layer caches directory entries (or dentries). When 'ls /mnt/1/oogabooga' is done, a negative dentry is created in the dentry cache. When 'touch /mnt/2/oogabooga' is done, a new (or used) dentry is created, different from the negative dentry because of the different filesystems (or super blocks). Finally, when /mnt/1/oogabooga is looked up again, the lookup fails because the negative dentry is used until NFS times it out.

The moral of the story: because the Linux VFS creates different dentries on different filesystems that point to the same file, there is really nothing we can do about this other than use the acdirmin and acdirmax mount options to cut down the timeouts.
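To illustrate the caching behavior described above, here is a toy model in Python. It is not kernel code and does not reflect the actual fs/nfs/dir.c logic: the Server and Mount classes, the neg_timeout parameter, and the 30-second value are all made up for illustration. The only assumption carried over from this report is that each mount keeps its own negative entries and trusts them until a timeout expires, and that a timeout of 0 (as with acdirmin=0,acdirmax=0) forces a fresh lookup every time.

```python
class Server:
    """Stands in for the exported directory shared by both mounts."""

    def __init__(self):
        self.files = set()

    def lookup(self, name):
        return name in self.files


class Mount:
    """One mount point with its own private cache of negative entries."""

    def __init__(self, server, neg_timeout):
        self.server = server
        self.neg_timeout = neg_timeout  # hypothetical stand-in for acdirmin/acdirmax
        self.negative = {}              # name -> time at which the miss was cached

    def lookup(self, name, now):
        cached_at = self.negative.get(name)
        if cached_at is not None and now - cached_at < self.neg_timeout:
            return False                # cached miss is still trusted, server not asked
        found = self.server.lookup(name)
        if found:
            self.negative.pop(name, None)
        else:
            self.negative[name] = now   # remember the miss on this mount only
        return found

    def create(self, name, now):
        # Creating through this mount does not touch the other mount's cache.
        self.server.files.add(name)
        self.negative.pop(name, None)

    def unlink(self, name, now):
        # rm does an lstat() first; a failed lookup means unlink is never reached.
        if not self.lookup(name, now):
            raise FileNotFoundError(name)
        self.server.files.discard(name)


def replay(neg_timeout):
    """Replay the steps from the report against two mounts of one export."""
    server = Server()
    m1 = Mount(server, neg_timeout)
    m2 = Mount(server, neg_timeout)
    m1.lookup("oogabooga", now=0)       # ls /mnt/1/oogabooga
    m2.create("oogabooga", now=1)       # touch /mnt/2/oogabooga
    try:
        m1.unlink("oogabooga", now=2)   # rm /mnt/1/oogabooga
        return "removed"
    except FileNotFoundError:
        return "No such file or directory"


if __name__ == "__main__":
    print("timeout 30:", replay(30))
    print("timeout 0: ", replay(0))
```

With a non-zero timeout the model fails the same way as the "Actual results" transcript, because the first mount still trusts its cached miss. With a timeout of 0 the removal goes through, at the cost of asking the server on every lookup.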