Description of problem:
When running autofs and accessing/expiring mount points repeatedly, the system slowly leaks kernel memory. On my 128MB Celeron, an overnight run of the test below drove the system load average to 5 and left the system pretty much unusable. The size-256 slab looks like this:

    size-256 365535 365535 256 24369 24369 1 : 252 63

Note that the system only has 128MB of RAM, and this is taking roughly 90MB!

Version-Release number of selected component (if applicable):
2.4.21-28.ELsmp

How reproducible:
I'm not sure yet, but it is a very controlled test case, so I will know soon.

Steps to Reproduce:
1. Set up a server to export /export/<dir>
2. Create a number of directories under /export/<dir> like so:
    for n in `seq 1 48`; do mkdir $n; touch $n/$n; done
3. Now, on the client, create an automounter map that points at your server (replace <IP> with the server IP):
    for n in `seq 1 48`; do echo "$n <IP>:/export/<dir>/$n" >> /etc/auto.test ; done
4. Create an entry in auto.master on the client that points at this map:
    echo "/test /etc/auto.test --timeout=1" >> /etc/auto.master
5. Restart autofs on the client.
6. Ensure that NFS services are running on the server.
7. On the client, create a script that does this:
    while true; do for n in `seq 1 48`; do ls /test/$n ; done; sleep 2; done
8. For my test, I ran two copies of this script on the client.

Actual results:
The kernel slowly leaks memory, and the system begins to swap. The load average climbs as well.

Expected results:
The kernel should not leak memory!

Additional info:
I'm in the process of trying to reproduce this without autofs in the picture. I get messages like this in the kernel log:

    nfs_get_root: getattr error = 5
    nfs_read_super: get root inode failed
    nfs warning: mount version older than kernel
    RPC: Can't bind to reserved port (98).

So the error cases for NFS are suspect. If I can't reproduce this without autofs, then I'll try to reproduce without NFS.
This can be achieved by making the client also the server. The client automount map would then point to localhost, and the automount daemon will do bind mounts instead of NFS mounts.
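For anyone wanting to try the localhost variant, here is a minimal sketch of generating such a map. The function name gen_map and its three parameters (host, export directory, entry count) are hypothetical, made up for illustration; the entry format is the one used in step 3 of the report.

```shell
# Hypothetical helper: print an automounter map for entries 1..count.
# host, export_dir and count are illustrative parameters, not from the report.
gen_map() {
    host=$1
    export_dir=$2
    count=$3
    n=1
    while [ "$n" -le "$count" ]; do
        # One map entry per line: "<key> <host>:<exported path>"
        echo "$n $host:$export_dir/$n"
        n=$((n + 1))
    done
}
```

Running gen_map localhost "/export/<dir>" 48 and redirecting the output to /etc/auto.test should give the same 48 keys as the original test, with the client acting as its own server.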
Created attachment 111630
sysrq-m output when the problem occurs
Created attachment 111631
contents of /proc/slabinfo
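As a rough check of the "roughly 90MB" figure, the sketch below sums up a single cache from slabinfo-style input. It assumes the 2.4-era column order (name, active objects, total objects, object size, active slabs, total slabs, pages per slab) and a 4096-byte page; both are assumptions, and slab_usage is a hypothetical name.

```shell
# Hypothetical helper: report memory used by one slab cache.
# Reads slabinfo-format lines on stdin; $1 is the cache name.
# Assumes 2.4-style columns and 4096-byte pages (both assumptions).
slab_usage() {
    awk -v cache="$1" -v page=4096 '
        $1 == cache {
            obj_bytes  = $3 * $4          # total objects * object size
            slab_bytes = $6 * $7 * page   # total slabs * pages per slab * page size
            printf "%s objects=%d MiB slabs=%d MiB\n", $1,
                   int(obj_bytes / 1048576), int(slab_bytes / 1048576)
        }'
}
```

Fed the size-256 line from the description, this gives 89 MiB of objects (365535 * 256 bytes) and 95 MiB of backing pages (24369 slabs of one page each). On a live system the input would come from /proc/slabinfo.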
Created attachment 111661
Fix a memory leak in autofs4_wait

This patch resolves the problem on my system. Tests successfully ran overnight with no growth in the size-256 cache.
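One way to verify "no growth" is to compare two saved copies of /proc/slabinfo, taken before and after a test run. This is only a sketch: slab_objs and slab_growth are hypothetical names, and the third column is assumed to be the total object count, as in the size-256 line quoted in the description.

```shell
# Hypothetical helpers for comparing two slabinfo snapshots.
# Assumes column 3 is the total object count (an assumption).
slab_objs() {
    # $1 = cache name, $2 = snapshot file
    awk -v cache="$1" '$1 == cache { print $3 }' "$2"
}

slab_growth() {
    # $1 = cache name, $2 = "before" snapshot, $3 = "after" snapshot
    before=$(slab_objs "$1" "$2")
    after=$(slab_objs "$1" "$3")
    # Prints the change in object count; 0 or a small number means no leak.
    echo $((after - before))
}
```

For example, copy /proc/slabinfo to a file before starting the test loop and again the next morning, then run slab_growth size-256 on the two files.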
Fix the summary.
A fix for this was committed to the 2.4.21-32.1.EL kernel build. Setting status to MODIFIED.
A fix for this problem was committed to the RHEL3 U6 patch pool on 20-Apr-2005 (in kernel version 2.4.21-32.1.EL).
*** This bug has been marked as a duplicate of 160392 ***
An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you. http://rhn.redhat.com/errata/RHSA-2005-663.html