Used the following code to troubleshoot:

    #include <dirent.h>
    #include <stdio.h>   /* printf(), perror() */
    #include <stdlib.h>

    int main(int argc, char **argv)
    {
        struct dirent **namelist;
        int n;

        if (argc != 2) {
            printf("usage: %s dirname\n", argv[0]);
            exit(1);
        }
        printf("%s: started with argument %s\n", argv[0], argv[1]);
        /* No filter (NULL selects every entry); sort with alphasort(). */
        n = scandir(argv[1], &namelist, NULL, alphasort);
        if (n < 0)
            perror("scandir");
        else
            /* Entries are printed from last to first. */
            while (n--)
                printf("%s\n", namelist[n]->d_name);
        return 0;
    }

The above code produces the following results when run against an NFS-mounted filesystem exported from an HPUX 10.20 machine:

    lcano/t /scott
    /home/lcano/t: started with argument /scott
    nawprv
    nawbk
    nadata
    lost+found
    ldmis2
    ..
    .

Which is correct. It produces the following when run against an IRIX 6.3 export (I'm assuming that 6.4 and 6.5 would produce the same results, but I didn't check):

    lcano/t /scott
    /home/lcano/t: started with argument /scott
    scandir: Invalid argument

Which did not work. Also tried it on an IRIX 5.1 system and it worked.

Analyzed the code through:

    ../glibc-2.0.7/sysdeps/unix/readdir.c
    59 bytes = __getdirentries (dirp->fd, dirp->data, maxread, &base);

    ../glibc-2.0.7/sysdeps/unix/getdents.c
    79 retval = __getdents (fd, (char *) kdp, red_nbytes);

and kdp->d_off seems invalid for IRIX 6.3 but OK for HPUX 10.20 and IRIX 5.1:

-- irix 6.3

    (gdb) p *kdp
    $145 = {d_ino = 128, d_off = 5934, d_reclen = 12, d_name = ".\000\200\000\000\0001u\034\000\020\000..\000\206\004\b\203\000\000\0002u\034\000\020\000sj1\000\000\000\200\000\020\000%r{\034\020\000sj2\000\000\000fv*\000\024\000jacobs\0007\000@\000\220\n@H3\000\000\003\000\000\0002\000\000\000\000\000\000\000\000\000\000\000\000\020\t\000m\017\t\000m\017\t\000\000\000\000\000\005\000\000\000\000\020\t\000\000\220\t\000\204\204\t\000HC\n\000\000\000\t\000\003\000\000\000\000\000\000\000#E\000\000w1\000@Kb\000@H#\000@P+\000@\rH\000@p+\000@\2361\000@H#\000@4\000\000\000 \000\000\000\210+\000@lz?19"...}

    ls -al /scott
    total 9
    drwxr-xr-x 5 root root 48 Dec 29 1998 .
    drwxr-xr-x 30 root root 1024 Dec 29 04:18 ..
    drwxr-xr-x 51 1082 2008 4096 Dec 29 07:28 jacobs
    drwxr-xr-x 8 1082 2008 98 Dec 17 12:02 sj1
    drwxr-xr-x 14 1082 2008 4096 Dec 17 11:07 sj2

-- irix 5.1

    (gdb) p *kdp
    $146 = {d_ino = 2, d_off = 1, d_reclen = 12, d_name = ".\000\002\000\000\000\002\000\000\000\020\000..\000\206\004\b\003\000\000\000\003\000\000\000\030\000lost+found\000\000\000\000\034\000\000\000\004\000\000\000\024\000nadatadv\000@8\a\000\000\005\000\000\000\024\000nadop2\000\000\000\000\211\000\000\000\000\002\000\000\020\000test\000\000\000\020\t\000m\017\t\000m\017\t\000\000\000\000\000\005\000\000\000\000\020\t\000\000\220\t\000\204\204\t\000HC\n\000\000\000\t\000\003\000\000\000\000\000\000\000#E\000\000w1\000@Kb\000@H#\000@P+\000@\rH\000@p+\000@\2361\000@H#\000@4\000\000\000 \000\000\000\210+\000@lz?19"...}

    ls -al /scott
    total 16
    drwxr-xr-x 5 root root 512 Dec 29 1998 .
    drwxr-xr-x 30 root root 1024 Dec 29 04:18 ..
    drwx------ 2 root root 10752 Apr 28 1997 lost+found
    drwxrwxr-x 19 1591 wd4 1024 Dec 8 22:40 nadatadv
    drwxr-xr-x 3 1595 wd4 512 Apr 28 1997 nadop2
    -rw-r--r-- 1 root root 348 Dec 29 1997 test

-- hpux 10.20

    (gdb) p *kdp
    $148 = {d_ino = 2, d_off = 12, d_reclen = 12, d_name = ".\000\002\000\000\000\030\000\000\000\020\000..\000\206\004\b\003\000\000\000,\000\000\000\030\000lost+found\000\000\000\000\001X\000\000@\000\000\000\024\000nawprv\000r\000@\000\200\000\000P\000\000\000\024\000nadata\000\000\000\000\005X\000\000`\000\000\000\024\000ldmis2\000\020\t\000\000\230\000\000\000\004\000\000\020\000nawbk\000\000\020\t\000\000\220\t\000\204\204\t\000HC\n\000\000\000\t\000\003\000\000\000\000\000\000\000#E\000\000w1\000@Kb\000@H#\000@P+\000@\rH\000@p+\000@\2361\000@H#\000@4\000\000\000 \000\000\000\210+\000@lz?19"...}

    ls -al /scott
    total 11
    drwxr-xr-x 7 root root 1024 Dec 1 1997 .
    drwxr-xr-x 30 root root 1024 Dec 29 04:18 ..
    drwxr-xr-x 9 1686 wd4 4096 Dec 29 1998 ldmis2
    drwxr-xr-x 2 root root 8192 Aug 30 1993 lost+found
    drwxr-xr-x 2 1318 wd4 24 Sep 20 1993 nadata
    drwxr-xr-x 18 1611 2008 2048 Nov 24 15:54 nawbk
    drwxr-xr-x 19 1674 2008 1024 Sep 22 08:05 nawprv

In readdir, _DIRENT_HAVE_D_RECLEN is defined, so the code uses the value of dp->d_off to set dirp->filepos.

My apologies for being so wordy with this description.

Thanks,
Luis Cano
lcano.gov
The #includes for the test code should be: #include <dirent.h> #include <stdlib.h> Sorry for the omission. Louie
The greater-than and less-than signs that are part of the #include statements are not carrying over to the "view bug" page. Once again, the includes should be: #include dirent.h #include stdlib.h with the greater-than and less-than signs inserted around each header name. I need to read the FAQ. Louie
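For anyone who wants to see the d_off values without stepping through glibc in gdb, here is a minimal sketch that prints them from user space. It is not part of the original report. The function name dump_dir_offsets is hypothetical, and the sketch assumes a glibc/Linux struct dirent, where d_off and d_reclen are available (POSIX does not require either member).

```c
#include <assert.h>
#include <dirent.h>
#include <errno.h>
#include <stdio.h>

/* Print every entry of `path` together with its d_off cookie.
 * Returns the number of entries read, or -1 on failure.
 * Hypothetical helper, written for illustration only. */
int dump_dir_offsets(const char *path)
{
    DIR *dirp;
    struct dirent *dp;
    int count = 0;

    assert(path != NULL);

    dirp = opendir(path);
    if (dirp == NULL) {
        perror("opendir");
        return -1;
    }

    for (;;) {
        errno = 0;  /* readdir() returns NULL for both end-of-directory and error */
        dp = readdir(dirp);
        if (dp == NULL)
            break;
        /* d_off and d_reclen are glibc/Linux members, not required by POSIX. */
        printf("d_ino=%llu d_off=%lld d_reclen=%u d_name=%s\n",
               (unsigned long long) dp->d_ino, (long long) dp->d_off,
               (unsigned) dp->d_reclen, dp->d_name);
        count++;
    }

    if (errno != 0) {
        /* errno was left set by the failing readdir() call. */
        perror("readdir");
        closedir(dirp);
        return -1;
    }

    closedir(dirp);
    return count;
}
```

Calling dump_dir_offsets(argv[1]) from a small main() on each of the three mounts should make it possible to compare the offsets side by side. On the failing mount it would be expected to stop with a readdir error rather than list every entry.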
The root cause of this problem is that the Linux kernel (at least in the 2.0.* series) cannot handle NFS v2 directory cookies that have their high bit set (NFS v2 directory cookies are nominally unsigned 32-bit blobs). This is a difficult problem to solve for real, because the user/kernel interface for dealing with them is lseek(), which only handles positive signed 32-bit integers and thus will never be able to cope well with such cookies. It is possible to create a somewhat gory kernel hack that will let glibc et al. do a telldir() (but not a seekdir()) on such cookies. This keeps glibc's readdir() happy. I have such a patch if people want a copy; send email.
------- Email Received From Luis Cano <lcano.gov> 02/22/99 10:57 -------
comments?
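To make the high-bit problem from the root-cause analysis concrete, here is a small sketch of the arithmetic only. It is not the kernel's or glibc's code. The function names are hypothetical, and the cookie values a caller passes in are made up for illustration. It shows what an unsigned 32-bit cookie looks like once it is squeezed into the signed 32-bit offset that an lseek()-style interface works with.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Reinterpret an unsigned 32-bit cookie as the signed 32-bit offset
 * that a 32-bit lseek()-style interface would see.
 * Hypothetical helper, written for illustration only. */
int32_t cookie_as_signed_offset(uint32_t cookie)
{
    /* Work in 64 bits so the wrap-around is well defined in C. */
    int64_t wide = (int64_t) cookie;

    if (wide > INT32_MAX)
        wide -= INT64_C(4294967296);  /* wrap modulo 2^32 */

    assert(wide >= INT32_MIN && wide <= INT32_MAX);
    return (int32_t) wide;
}

/* Nonzero if the cookie survives as a non-negative signed offset. */
int cookie_is_seekable(uint32_t cookie)
{
    return cookie_as_signed_offset(cookie) >= 0;
}

/* Print one cookie and how it looks as a signed offset. */
void describe_cookie(uint32_t cookie)
{
    printf("cookie 0x%08" PRIx32 " -> signed offset %" PRId32 " (%s)\n",
           cookie, cookie_as_signed_offset(cookie),
           cookie_is_seekable(cookie) ? "usable as a seek position"
                                      : "negative, not usable as a seek position");
}
```

Any cookie at or above 0x80000000 comes out negative here. Negative positions are the values that lseek() cannot represent, which would be consistent with the "Invalid argument" error that scandir reported.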
I believe that this is worked around in the 2.2 kernel, but I don't expect that we will release a 2.0 kernel with a workaround for this as an errata upgrade for 5.2. If you like, you might try our unofficial 2.2 kernel upgrade and see if it works better for you. Alternatively, if the "somewhat gory" kernel hack from cks worked for you, please note it here.
This bug seems to still be present in the 2.2.* kernel series (I don't have a Rawhide/Starbuck/6.0 machine handy, but I tried it on a 2.2.6-ac2 kernel on a RH 5.2 install, and it had the same problem).
Alan, do you have any clue on this one?
Two suggestions. Firstly, there is an Irix option to use 32-bit cookies. Check the Irix docs; I no longer have an Indy, so I can't easily check here what it is. Secondly, get me a tcpdump of the failing case. The only problem I know about for Irix is duplicated directory cookies, which would be very different from what you see. Check the tcpdump docs, by the way: you can ask it to capture all of each packet and decode higher protocols. The trace then has NFS data in it, which I can work from. Alan
This could very well be a kernel problem. Does a later release of the kernel (2.2.x) fix the problem? Is it still true for a 6.0 system running glibc-2.1.1 or later? Please reopen if the problem still persists.