Description of problem:
Bug 321111 is still not fixed. Red Hat Enterprise Linux 5 with any kernel newer than 2.6.18-53.el5 suffers from degraded NFS performance.

Version-Release number of selected component (if applicable):
Kernel > 2.6.18-53.el5

How reproducible:
Install a kernel newer than 2.6.18-53.el5 and create an NFS mount to a directory containing a large number of files. Run:

    time ls -al /tmp/mnt | wc -l

My test is:

    time ls -al /tmp/mnt | wc -l ; time ls -al /tmp/mnt/sdp | wc -l

This measures the time it takes to stat all the files in these directories. Please note that the first time you stat a directory with 100K files it may take a VERY long time (I have seen 60-second response times), since the client has to cache the access information for all 100K files on that first pass. Subsequent requests should be MUCH faster (a few seconds).

Actual results:
*** From a Red Hat Enterprise Linux 5 NFS client (2.6.18-53.el5):
First query to the /tmp/mnt/sdp directory (100K files): 47 seconds
Second query to /tmp/mnt/sdp: 17 seconds

*** From a Red Hat Enterprise Linux 5 NFS client (2.6.18-92.1.10.el5, or any release after 2.6.18-53):
First query to the /tmp/mnt/sdp directory (100K files): 124 seconds
Second query to /tmp/mnt/sdp: 42 seconds

Expected results:
If this issue had been resolved as described in bug 321111, I would expect "ls -al" times to be as fast as on the 2.6.18-53 kernel; instead they are twice as long.

Additional info:
Here is another thread. It discusses two different Bugzilla reports on this issue and mentions a possible beta-kernel fix in 5.2, although some posters say they tried it and it does not solve the problem.
http://www.mail-archive.com/rhelv5-list@redhat.com/msg03028.html
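The reproduction steps above can be sketched as a small script. This is a minimal sketch, not part of the original report: the mount point /tmp/mnt/sdp is taken from the report, while the local fallback directory (/tmp/nfs-repro) and the reduced file count are assumptions so the script can be dry-run on a machine without the NFS mount.

```shell
#!/bin/bash
# Reproducer sketch for the NFS "ls -al" slowdown described above.
# Assumption: if the NFS directory is not mounted, fall back to a local
# scratch directory so the commands themselves can still be exercised.
DIR="${1:-/tmp/mnt/sdp}"

if [ ! -d "$DIR" ]; then
    DIR=/tmp/nfs-repro
    mkdir -p "$DIR"
    # The report used ~100,000 files; 1,000 keeps a local dry run fast.
    for i in $(seq 1 1000); do : > "$DIR/file$i"; done
fi

# Cold pass: on a real NFS mount this populates the client-side caches.
time ls -al "$DIR" | wc -l
# Warm pass: should be served largely from the dentry/attribute caches.
time ls -al "$DIR" | wc -l
```

On a real mount, the interesting number is the gap between the two wall-clock times on the old versus the new kernel.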
There was no expectation that this situation would be explicitly addressed by bz321111. I think some work needs to be done to characterize the over-the-wire traffic between the two releases, see how it differs, and then determine what can be done. After that, we can look into optimizing other things.
The first thing I'd want to see is the output from "nfsstat -c" after the same test has been run on each kernel. That should let us see how the counts of each call type change between kernels and may help show why this is occurring.
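One way to capture what is being asked for here is to snapshot the client call counters around the test and diff them. A minimal sketch, assuming nfsstat from nfs-utils is installed and /tmp/mnt (the mount point from the report) is the directory under test; the snapshot file handling is my own addition:

```shell
#!/bin/bash
# Snapshot "nfsstat -c" (per-call NFS client counters) before and after
# the test so the deltas in GETATTR/LOOKUP/READDIR etc. are visible.
MNT="${1:-/tmp/mnt}"
BEFORE=$(mktemp)
AFTER=$(mktemp)

# Guarded so the script still runs on a box without nfs-utils installed.
command -v nfsstat >/dev/null 2>&1 && nfsstat -c > "$BEFORE"

time ls -al "$MNT" 2>/dev/null | wc -l

if command -v nfsstat >/dev/null 2>&1; then
    nfsstat -c > "$AFTER"
    # Show which counters moved during the test run.
    diff "$BEFORE" "$AFTER"
fi
rm -f "$BEFORE" "$AFTER"
```

Running this once per kernel and comparing the diffs should show whether the newer kernel is issuing extra over-the-wire calls or the same calls more slowly.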
I am testing using a -142 kernel on a 2.8GHz i686 system with 1GB of memory. I created a directory with 100,000 files and then ran testing using the mechanism described in the Description. I created a shell script which runs "nfsstat -c", "time ls -al | wc -l", and then "nfsstat -c" again. As far as I can tell, the system is behaving as expected.

The first pass took about 32 seconds and generated the expected numbers of GETATTR (1), LOOKUP (100,000), and READDIR (788) operations. The second pass took about 17 seconds and generated the expected numbers of GETATTR (100,001), LOOKUP (0), and READDIR (0) operations. The fifth pass actually ran faster, taking about 5 seconds, and again generated the expected numbers of operations: GETATTR (1), LOOKUP (1), and READDIR (0).

The differences between the passes are expected. The first pass must populate the various caches: it must read the directory using READDIR, which populates the directory contents cache, and then look up every file in the directory using LOOKUP, which populates the dentry cache and fills in the attribute cache for each file. The single GETATTR was for opening the directory and was due to close-to-open semantics.

The second pass generated no further LOOKUP or READDIR operations because the directory reads and filename lookups were satisfied from the directory contents cache and the dentry cache. Of the GETATTR operations, one was for opening the directory (again due to close-to-open semantics) and the other 100,000 were to refresh the attribute cache for each file in the directory.

By the fifth pass, the attribute cache timeout had grown long enough that all attribute information could be satisfied directly from the attribute cache. The seventh pass was again similar to the second pass because the attribute caches had timed out and the over-the-wire GETATTR operations had to be done again to refresh them.

In short, the system that I tested is behaving as expected.
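Comparing passes is easier if the interesting counters are pulled out of the nfsstat output automatically. This is a sketch of my own, not part of the original test script: it assumes the nfs-utils nfsstat layout, where a row of call names is followed by a row of "count percent" pairs, and it includes a small embedded sample (made-up numbers matching the shape above) so the parsing can be exercised without an NFS mount.

```shell
#!/bin/bash
# Extract the getattr/lookup/readdir counters from "nfsstat -c" output.
# Pass a saved snapshot as $1; with no argument, parse an embedded sample.

extract() {
  awk '
    # A lowercase-leading line is a row of call names; remember them.
    /^[a-z]/ { n = 0; for (i = 1; i <= NF; i++) name[++n] = $i; have = 1; next }
    # The following row holds "count percent" pairs for those names.
    have {
        col = 0
        for (i = 1; i <= NF; i += 2) {
            col++
            if (name[col] ~ /^(getattr|lookup|readdir)$/)
                printf "%s %s\n", name[col], $i
        }
        have = 0
    }'
}

if [ -n "$1" ]; then
    extract < "$1"
else
    # Hypothetical sample in the nfs-utils column layout.
    demo_out=$(extract <<'EOF'
null         getattr      setattr      lookup
1         0% 100001   45% 0         0% 100000  45%
read         write        readdir      readdirplus
0         0% 0         0% 788       0% 0         0%
EOF
)
    echo "$demo_out"
fi
```

Run against the before/after snapshots from each kernel, the per-counter deltas make it obvious whether, say, the newer kernel is re-issuing LOOKUPs that the older one served from the dentry cache.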
I'm afraid I'm going to need some more information to help diagnose the problem. Is it safe to assume that exactly the same hardware (not just similar systems, but exactly the same machine) was used to test the various operating system images?
If more information becomes available, please reopen this bugzilla and I will do more investigation then.