From Bugzilla Helper:
User-Agent: Mozilla/4.77 [en] (X11; U; Linux 2.2.16-3 i686; Nav)

Description of problem:
The NFS client in the Linux kernel (both 2.2.18 and 2.4.6) does the wrong thing when an NFS server returns NFS3ERR_BAD_COOKIE on a readdir operation.

What it does: at line 399 in kernel-2.4.6/linux/fs/nfs/dir.c

    ...
    if (res == -EBADCOOKIE) {
        /* This means either end of directory */
        if (desc->entry->cookie != desc->target) {
            /* Or that the server has 'lost' a cookie */
            res = uncached_readdir(desc, dirent, filldir);
            if (res >= 0)
                continue;
        }
        res = 0;
        break;
    } else if (res < 0)
    ...

What this appears to do is start the readdir() from the beginning again. What it should do is return an error code to the user who called readdir(). See RFC 1813 for a discussion of the behavior.

I saw the problem when using a Network Appliance NFS server. One NFS client creates and deletes files in a directory while another NFS client calls readdir() to go through all of the files in that directory. I expected that the readdir() operation would see every file that was neither created nor deleted, and that it might see some of the created files or some of the deleted files. Instead, the client sees some files twice, and some files get skipped.

The readdir should not be starting over. The client should have been notified that there was a problem, and then the client could have started the readdir() operation all over, throwing away its partial results.

How reproducible:
Always

Steps to Reproduce:
1. Use a Network Appliance NFS server and two Linux clients.
2. Client A is doing a readdir() to get the entries in a directory containing files named A*.
3. Client B is concurrently adding and removing subdirectories or files from the directory. The files being added and deleted are named B*.

Actual Results: Client A will see the same A* file twice, and will see some A* files skipped. (It is OK if Client A sees B* files, or doesn't see them.)
Expected Results: Client A should have seen each A* file exactly once.

Additional info:
This appears to be fixed in later kernels.
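
For illustration only, here is a user-space sketch of the recovery described above: if readdir() reports an error partway through, the caller throws away its partial results and starts the listing over. It assumes the NFS client actually surfaces the problem as a readdir() error, which is what this report asks for and not what 2.4.6 does. The helper names list_dir_restarting() and free_names() are hypothetical and not part of any library. Because it is not clear which errno value user space would receive for a bad cookie, the sketch restarts on any readdir() error.

```c
#include <assert.h>
#include <dirent.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: release a name list built by list_dir_restarting(). */
void free_names(char **names, size_t n)
{
    for (size_t i = 0; i < n; i++)
        free(names[i]);
    free(names);
}

/*
 * Hypothetical helper: collect every entry name in `path`.
 * If readdir() fails partway through, the partial list is discarded and
 * the whole listing is restarted, up to `max_attempts` times.
 * Returns the number of names stored in *out, or -1 on failure.
 */
long list_dir_restarting(const char *path, int max_attempts, char ***out)
{
    assert(path != NULL && out != NULL && max_attempts > 0);

    for (int attempt = 0; attempt < max_attempts; attempt++) {
        DIR *dir = opendir(path);
        if (dir == NULL)
            return -1;

        char **names = NULL;
        size_t count = 0, cap = 0;
        int read_error = 0;

        for (;;) {
            /* readdir() returns NULL for both end-of-directory and error;
             * only errno tells the two apart, so clear it first. */
            errno = 0;
            struct dirent *ent = readdir(dir);
            if (ent == NULL) {
                read_error = (errno != 0);
                break;
            }
            if (count == cap) {
                size_t new_cap = cap ? cap * 2 : 16;
                char **grown = realloc(names, new_cap * sizeof *grown);
                if (grown == NULL) {
                    free_names(names, count);
                    closedir(dir);
                    return -1;
                }
                names = grown;
                cap = new_cap;
            }
            size_t len = strlen(ent->d_name) + 1;
            char *copy = malloc(len);
            if (copy == NULL) {
                free_names(names, count);
                closedir(dir);
                return -1;
            }
            memcpy(copy, ent->d_name, len);
            names[count++] = copy;
        }
        closedir(dir);

        if (!read_error) {
            *out = names;           /* complete, consistent pass */
            return (long)count;
        }
        free_names(names, count);   /* throw away partial results, retry */
    }
    return -1;                      /* every attempt hit a read error */
}
```

A caller would invoke list_dir_restarting(dirpath, 3, &names), process the returned names, and then call free_names(names, n). With the kernel behavior reported here the restart branch is never taken, because the kernel restarts the listing silently instead of returning an error.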