From Bugzilla Helper:
User-Agent: Mozilla/4.0 (compatible; MSIE 5.5; Windows NT 5.0; T312461)

Description of problem:
An NFS mount of an Ultrix 4.2 system appears to work fine, but in very large
directories it lists only about 1/50 of the files. For example, in a directory
with 5958 files, it displays around 100. No errors are given, no errors are
logged.

Version-Release number of selected component (if applicable):

How reproducible:
Always

Steps to Reproduce:
1. Mount a filesystem from Ultrix.
2. 'ls' a directory with at least 4000 files.
3. 'ls' the directory from Ultrix.
4. Compare.

Actual Results:
[root@york-bizapps5 acdms2]# df .
Filesystem           1k-blocks      Used Available Use% Mounted on
vsgraham:/source.old   1025542    595881    327107  65% /export/vaxstuff/backups/mounts/vsgraham/source
[root@york-bizapps5 acdms2]# pwd
/export/vaxstuff/backups/mounts/vsgraham/source
[root@york-bizapps5 source]# ls acdms2/editions |wc -l
180

Expected Results:
# uname -a
ULTRIX vsgraham 4.2 0 VAX
# pwd
/source.old
# ls acdms2/editions |wc -l
9188

Additional info:
This doesn't happen on smaller directories. I don't know where the exact
cutoff is, since the smallest directory it has happened to has over 5,000
entries, and the largest that it hasn't happened to has just over 400 entries.
Two things to check here:

1) If you are using ipchains or netfilter on the Linux machine, make sure
that you are set up to allow fragmented packets through.
2) Try setting the r/wsize on the server (in /etc/exports, I assume, for
Ultrix?) to 1024 and see if this fixes the problem.
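On the first point, a sketch of what "allow fragments through" looks like with the 2.4-era tools (hypothetical rules, only relevant if a packet filter is actually loaded; the script just echoes the rules, since inserting them for real needs root):

```shell
#!/bin/sh
# Hypothetical allow-fragment rules; in both tools -f matches the second
# and further fragments of a fragmented IP datagram. Echoed only, since
# actually inserting rules requires root and a loaded filter.
echo "iptables -A INPUT -f -j ACCEPT    # netfilter (2.4 kernels)"
echo "ipchains -A input -f -j ACCEPT    # ipchains (2.2 compatibility)"
```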
No filtering on the system at all. And Ultrix is oooooold BSD. No options in
the exports file except what to map uid 0 to, and read-only. Here's from the
man page -- identifiers are hostnames or YP groups:

    pathname [-r=#] [-o] [identifier_1 identifier_2 ... identifier_n]

Any way to figure out the size used by Ultrix, and give an option to 'mount'
to make them match?
> Any way to figure out the size used by Ultrix, and give an option to
> 'mount' to make them match?

I'm not at all familiar with Ultrix. You could try snooping the RPC traffic
between the server and client and see if you can figure out the request size.
My first suspicion was that the server is sending out very large UDP packets
which are being fragmented at the network level and then not being reassembled
correctly. Have you seen this problem in any earlier versions of RHL?

Some other things to try: upgrade the kernel on the Linux machine to 2.4.9-31.
And this is probably a long shot, but you wouldn't happen to be using tcsh,
would you? If so, do you see this problem with csh or bash?
Just FYI, I'll take "extremely inconsistent results for one..."

[root@york-bizapps5 vsgraham]# mount -t nfs -o rsize=4096,wsize=4096 vsgraham:/source.old source
[root@york-bizapps5 vsgraham]# ls source/acdms2/editions |wc -l
180
[root@york-bizapps5 vsgraham]# umount source
[root@york-bizapps5 vsgraham]# mount -t nfs -o rsize=2048,wsize=2048 vsgraham:/source.old source
[root@york-bizapps5 vsgraham]# ls source/acdms2/editions |wc -l
360
[root@york-bizapps5 vsgraham]# umount source
[root@york-bizapps5 vsgraham]# mount -t nfs -o rsize=1024,wsize=1024 vsgraham:/source.old source
[root@york-bizapps5 vsgraham]# ls source/acdms2/editions |wc -l
360
[root@york-bizapps5 vsgraham]# umount source
[root@york-bizapps5 vsgraham]# mount -t nfs -o rsize=512,wsize=512 vsgraham:/source.old source
[root@york-bizapps5 vsgraham]# ls source/acdms2/editions |wc -l
180
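If it helps to repeat the sweep, a minimal sketch of the same sequence as a script (server, export, and paths are the ones from this report; it only prints each command so the sequence can be reviewed before running it as root):

```shell
#!/bin/sh
# Sweep candidate rsize/wsize values against the Ultrix export and count
# directory entries at each size. Hostname and paths are taken from this
# report. The script only echoes the commands so they can be reviewed
# before being run for real as root.
for sz in 512 1024 2048 4096 8192; do
    echo "mount -t nfs -o rsize=$sz,wsize=$sz vsgraham:/source.old source"
    echo "ls source/acdms2/editions | wc -l"
    echo "umount source"
done
```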
To answer your question about fragmentation, they are both on the same LAN.
Here's a tcpdump, though, that appears to confirm your suspicion. I haven't
read it through yet, just forwarding it on...

[root@york-bizapps5 cvs]# tcpdump host vsgraham
tcpdump: listening on eth0
19:23:59.083106 arp who-has vsgraham.accentopto.com tell york-bizapps5
19:23:59.083896 arp reply vsgraham.accentopto.com is-at 8:0:2b:18:1f:c4
19:23:59.083915 york-bizapps5.2503990956 > vsgraham.accentopto.com.nfs: 148 lookup [|nfs] (DF)
19:23:59.089219 vsgraham.accentopto.com.nfs > york-bizapps5.2503990956: reply ok 128 lookup [|nfs]
19:23:59.089308 york-bizapps5.2520768172 > vsgraham.accentopto.com.nfs: 148 lookup [|nfs] (DF)
19:23:59.093543 vsgraham.accentopto.com.nfs > york-bizapps5.2520768172: reply ok 128 lookup [|nfs]
19:23:59.093737 york-bizapps5.2537545388 > vsgraham.accentopto.com.nfs: 144 readdir [|nfs] (DF)
19:23:59.115146 vsgraham.accentopto.com.nfs > york-bizapps5.2537545388: reply ok 1472 readdir offset 1 size 8613 eof (frag 36411:1480@0+)
19:23:59.116967 vsgraham.accentopto.com > york-bizapps5: (frag 36411:1480@1480+)
19:23:59.117936 vsgraham.accentopto.com > york-bizapps5: (frag 36411:1212@2960)
19:23:59.118071 york-bizapps5.2554322604 > vsgraham.accentopto.com.nfs: 144 readdir [|nfs] (DF)
19:23:59.140092 vsgraham.accentopto.com.nfs > york-bizapps5.2554322604: reply ok 1472 readdir offset 1 size 8716 eof (frag 36667:1480@0+)
19:23:59.141861 vsgraham.accentopto.com > york-bizapps5: (frag 36667:1480@1480+)
19:23:59.142827 vsgraham.accentopto.com > york-bizapps5: (frag 36667:1212@2960)
19:23:59.143240 york-bizapps5.2571099820 > vsgraham.accentopto.com.nfs: 144 readdir [|nfs] (DF)
19:23:59.164739 vsgraham.accentopto.com.nfs > york-bizapps5.2571099820: reply ok 1472 readdir offset 1 size 8821 eof (frag 36923:1480@0+)
19:23:59.166552 vsgraham.accentopto.com > york-bizapps5: (frag 36923:1480@1480+)
19:23:59.167520 vsgraham.accentopto.com > york-bizapps5: (frag 36923:1212@2960)
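Reading the trace: each readdir reply arrives as three IP fragments (1480, 1480, and 1212 bytes of payload at offsets 0, 1480, and 2960), so every reply has to survive reassembly on the Linux side. A quick shell sanity check that those offsets and lengths line up:

```shell
# Fragments of datagram 36411 from the trace, as size@offset:
# 1480@0+, 1480@1480+, 1212@2960. Each fragment should begin exactly
# where the previous one ended, and the sum of the lengths is the
# reassembled UDP payload size.
off1=0;    len1=1480
off2=1480; len2=1480
off3=2960; len3=1212
[ $((off1 + len1)) -eq "$off2" ] && \
[ $((off2 + len2)) -eq "$off3" ] && echo "offsets consistent"
echo "total payload: $((len1 + len2 + len3)) bytes"
```

4172 bytes per reply is nearly three Ethernet MTUs, which fits the earlier suspicion that large fragmented UDP replies are involved.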
FYI, that tcpdump was from rerunning the last command in the output above --
'ls source/acdms2/editions | wc -l' -- with the mount still set to 512 read
and write.
I swear I checked /var/log/messages before, but anyway, look what I found:

May 13 20:50:38 york-bizapps5 kernel: NFS: short packet in readdir reply!
May 13 20:50:41 york-bizapps5 last message repeated 9 times
May 13 20:50:41 york-bizapps5 kernel: NFS: giant filename in readdir (len 0x989a1808)!
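For anyone hitting the same thing, a filter for those two messages (a sketch; shown here running against the lines quoted above rather than the live log, but in practice the input would be /var/log/messages or dmesg output):

```shell
# Pull the NFS readdir errors out of a kernel log. The here-document
# below is the sample quoted in this report; replace it with
# /var/log/messages (or pipe in dmesg) on a live system.
grep -E 'NFS: (short packet in readdir|giant filename in readdir)' <<'EOF'
May 13 20:50:38 york-bizapps5 kernel: NFS: short packet in readdir reply!
May 13 20:50:41 york-bizapps5 last message repeated 9 times
May 13 20:50:41 york-bizapps5 kernel: NFS: giant filename in readdir (len 0x989a1808)!
EOF
```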
Using these messages, I finally found a clue -- some discussion and a
potential patch in this mailing-list thread:
http://linux-kernel.skylab.org/20011014/msg00639.html
And here's a possible patch? From redhat? http://www.uwsg.iu.edu/hypermail/linux/kernel/9906.1/0228.html
That patch is against 2.3.5, so it is already in our kernel. Which kernel version are you running?
Linux york-bizapps5 2.4.9-31 #1 Tue Feb 26 07:11:02 EST 2002 i686 unknown
Any update/status on this bug?
Yes, appears to be a problem with v3 support in our 2.4.9-31 errata kernel. We're working on a fix for the next errata kernel.
Staled out. Sorry. [But actually I think RHEL 3 works ok in this situation]