Description of problem:
Copying a file from an NFS mount to local disk exhibits stalls as long as 5 seconds:

[relevant output from strace -tt cp /path/to/nfs/file /tmp]
11:44:01.195024 read(3, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 4096) = 4096
11:44:06.789237 fstat64(4, {st_mode=S_IFREG|0644, st_size=1347584, ...}) = 0

These stalls ultimately cause the copy of a 1.44MB floppy disk image to take about a minute and a half to complete. The strace -r output did not show any unusual timings for the system calls, but the -tt wall-clock output did show these long delays between read() and the subsequent fstat64()/_llseek()/fcntl64()/write() calls.

Version-Release number of selected component (if applicable):
kernel-2.4.9-e.34smp
kernel-2.4.9-e.35smp

How reproducible:
In our testing, this was 100% reproducible with both kernels over UDP mounts, both with the default [rw]size and with [rw]size set to 4096, 8192, and 32768. This was against a NetApp 960 running OnTap 6.4.2P6. During these tests, the filer was showing ~10% CPU utilization, and the interface we were testing against was not heavily used. We did not test TCP transport.

Steps to Reproduce:
1. mount filer:/volX/export /mnt/filer
2. time cp /mnt/filer/file /tmp

Actual results:
Throughput is much lower than it should be, and strace shows the stalls quoted above.

Expected results:
The file should copy in a time that is reasonable given the networking speeds of the two hosts.

Additional info:
Falling back to the e.30smp kernel allows us to get consistent timings in this simple test.
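For anyone trying to confirm the same symptom, the wall-clock gaps in strace -tt output can be picked out mechanically rather than by eye. The following is only a rough sketch, not part of the original test procedure: the function name find_stalls and the 1-second threshold are made up for illustration, and it assumes a single-process trace where each line begins with an HH:MM:SS.microseconds timestamp as in the excerpt in the description.

```python
import re
import sys
from datetime import datetime

# Matches the timestamp prefix that "strace -tt" puts on each line,
# e.g. "11:44:01.195024 read(3, ...) = 4096".
TS_LINE = re.compile(r"^(\d{2}:\d{2}:\d{2}\.\d{6})\s+(.*)$")


def find_stalls(lines, threshold=1.0):
    """Return (gap_seconds, previous_call, next_call) for every pair of
    consecutive timestamped lines at least `threshold` seconds apart.

    `threshold` is an arbitrary illustrative default, not a value taken
    from the report.
    """
    stalls = []
    prev_ts = None
    prev_call = None
    for line in lines:
        m = TS_LINE.match(line.strip())
        if not m:
            continue  # skip lines without a -tt timestamp
        ts = datetime.strptime(m.group(1), "%H:%M:%S.%f")
        call = m.group(2)
        if prev_ts is not None:
            gap = (ts - prev_ts).total_seconds()
            if gap < 0:
                gap += 86400  # trace crossed midnight
            if gap >= threshold:
                stalls.append((gap, prev_call, call))
        prev_ts, prev_call = ts, call
    return stalls


if __name__ == "__main__":
    # Usage: strace -tt -o trace.txt cp /mnt/filer/file /tmp
    #        python find_stalls.py trace.txt
    if len(sys.argv) > 1:
        with open(sys.argv[1]) as fh:
            for gap, before, after in find_stalls(fh):
                print(f"{gap:.6f}s between: {before[:40]} -> {after[:40]}")
```

Run against a trace saved with strace's -o option, this lists each pair of calls separated by a stall, which makes it easier to compare kernels side by side.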
I too am seeing atrocious performance with the e.35smp kernel. I am also running a NetApp at release 6.4.2P6 (but it's an F740). Load average spikes in bursts of up to 140, and Apache response times are bad. I am reverting to the e.30 kernel tomorrow.
This has long been fixed in the current erratum, e.38.
I disagree. It is much better, but performance is still much worse than with e.30. I have already opened support issue #312835 and returned to e.30 once again.