From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.4b) Gecko/20030509

Description of problem:
After one day of heavy activity such as compiling KDE, my system becomes terribly swappy and unusable. Inspecting /proc/slabinfo reveals several hundred thousand allocated mm_struct objects, which exhaust almost all my memory.

I've analyzed the problem on my custom version of 2.4.20-9, but I've seen the same behavior for some months on several other kernel versions from RedHat, both older and newer.

I'm not sure whether the leak is caused by creating processes, filesystem activity or even network activity (I'm using distcc and I keep the source tree on NFS). I'm also using ReiserFS.

This doesn't happen on my server, which is also running 2.4.20-9 with a slightly different configuration. I'm attaching my .config file for reference.

Version-Release number of selected component (if applicable):
kernel-source-2.4.20-9 (and several other versions)

How reproducible:
Always

Steps to Reproduce:
1. run big compiling job
2. cat /proc/slabinfo and look at mm_struct
3. you should see _MANY_ instances of mm_struct (>1000)
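For anyone who wants to script step 2 instead of eyeballing the output, here is a minimal sketch in Python. It assumes each cache row in /proc/slabinfo starts with the cache name followed by whitespace-separated integer columns, as on 2.4 kernels; the function names are made up for illustration.

```python
def find_slab_row(slabinfo_text, cache_name="mm_struct"):
    """Return the leading integer columns of one cache's row, or None if absent."""
    for line in slabinfo_text.splitlines():
        fields = line.split()
        if fields and fields[0] == cache_name:
            numbers = []
            for token in fields[1:]:
                # Stop at the first non-numeric token (some kernels append
                # extra tuning columns after a ":" separator).
                if not token.isdigit():
                    break
                numbers.append(int(token))
            return numbers
    return None


def read_proc_slabinfo(path="/proc/slabinfo"):
    """Read the slab statistics file; may need root on some systems."""
    with open(path) as handle:
        return handle.read()
```

Calling find_slab_row(read_proc_slabinfo()) on an affected machine should return the mm_struct counters. On 2.4 the first number should be the count of active objects, which is the value step 3 compares against 1000.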
Created attachment 91758 [details] this is the .config file for which the bug shows
Created attachment 91759 [details] this is the .config file for the computer which doesn't show the bug
This appears to be fixed in 2.4.21-20.1.2024.2.1.nptl; you could probably close this bug now.
There appears to be a similar leak in the currently shipping kernel for Red Hat 9 (kernel-2.4.20-20.9). Tracking the mm_struct "allocated pages" value from /proc/slabinfo shows a steady growth over time. See for instance http://tyge.sslug.dk/bb-cgi/larrd-grapher.cgi?host=tyge.sslug.dk&service=slabinfo&graph=daily which is a graph of the slabinfo values sampled every 5 minutes. Some software appears to trigger this leak. It has been the subject of much discussion on the "Big Brother network Monitor" mailing list (http://bb4.com/), since the code implementing the Big Brother paging scheme appears to trigger this leak quite often. As a result, systems running this monitoring system gradually go into a thrashing mode, where everything gets swapped out and the system requires a reboot.
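To put a number on "steady growth" from periodic samples like these, something along the following lines could be used. This is only a sketch: the (seconds, value) sample format and both function names are my own assumptions, not anything the grapher or Big Brother provides.

```python
def growth_per_hour(samples):
    """Average growth per hour between the first and last sample.

    samples: list of (seconds_since_start, value) pairs, oldest first.
    """
    if len(samples) < 2:
        raise ValueError("need at least two samples")
    (t_first, v_first), (t_last, v_last) = samples[0], samples[-1]
    if t_last <= t_first:
        raise ValueError("samples must span a positive time interval")
    return (v_last - v_first) * 3600.0 / (t_last - t_first)


def never_shrinks(samples):
    """True if the value never decreases from one sample to the next."""
    values = [value for _, value in samples]
    return all(later >= earlier for earlier, later in zip(values, values[1:]))
```

A cache that never shrinks and keeps a positive growth rate over many hours, even while the machine is idle, would be consistent with the leak described here. A cache that rises and falls with load would not.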
I'm seeing a constant leak in size-4096 on a machine running 2.4.20-18 SMP BIGMEM, which may or may not be related to the machine finally running out of memory and hanging. I'm trying 2.4.20-20.9 now to see if it helps. At first look, it appears the problem might be gone. I will update with long-term results.
The problem is still present in kernel 2.4.20-24.9smp. I'm using the "BigBrother network monitor" as described above by Henrik Storner. I am forced to make "planned reboots" on my RedHat box every 3-4 days (as is usual for the OS from Redmond).
I have also seen this problem on a RH9 (2.4.20-28.9) system running BigBrother. I have to reboot the server roughly every two days. mm_struct values increase until the system becomes unresponsive and must be rebooted. In my case the system is a 200 MHz Pentium. I have attached a file containing /proc/slabinfo for a few hours, showing the increasing mm_struct. The system has been up for roughly 26 hours at this point and free memory is already down 50MB. This system is a server running BigBrother, Apache and not much else.
Created attachment 96838 [details] An hourly concatenation of /proc/slabinfo showing the increasing mm_struct
I'm also reproducing this regularly on a RHL9 build machine:

mm_struct 56138 56145 256 3743 3743 1

The machine is used as a nightly build system, copying a bunch of large tar files off NFS and building them. (This is also an Athlon box.)
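For a rough idea of how much memory that row represents, here is a small Python sketch. It assumes the 2.4 slabinfo column order (name, active objects, total objects, object size in bytes, active slabs, total slabs, pages per slab) and a 4096-byte page, as on i386. The function name is hypothetical.

```python
PAGE_SIZE = 4096  # assumption: i386 page size in bytes


def slab_cache_bytes(row_line, page_size=PAGE_SIZE):
    """Estimate the memory held by one slab cache from its slabinfo row."""
    fields = row_line.split()
    name = fields[0]
    # Assumed column order: active_objs num_objs objsize
    # active_slabs num_slabs pages_per_slab
    (active_objs, num_objs, objsize,
     active_slabs, num_slabs, pages_per_slab) = (int(f) for f in fields[1:7])
    return {
        "name": name,
        # Bytes taken by the objects themselves.
        "object_bytes": num_objs * objsize,
        # Bytes of whole pages backing the cache, including per-slab overhead.
        "slab_bytes": num_slabs * pages_per_slab * page_size,
    }
```

Under those assumptions, the row above works out to 3743 slabs x 1 page x 4096 bytes = 15,331,328 bytes, or about 14.6 MiB held by mm_struct alone.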
Dave, if you want access to the box, let me know.
Comment #5 mentions size-4096. There is a leak in ext3. See http://marc.theaimsgroup.com/?l=linux-kernel&m=106637047820058&w=2 for a patch.
This can't be the cause of the original bug report. At the time, I was exclusively using ReiserFS on the machine where the bug had shown up. Also, it seems unlikely that a filesystem would allocate mm_struct objects. BTW, since I switched to 2.6.x, I've never seen this problem again.
Thanks for the bug report. However, Red Hat no longer maintains this version of the product. Please upgrade to the latest version and open a new bug if the problem persists. The Fedora Legacy project (http://fedoralegacy.org/) maintains some older releases, and if you believe this bug is interesting to them, please report the problem in the bug tracker at: http://bugzilla.fedora.us/