From Bugzilla Helper:
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; rv:1.7.3) Gecko/20040913 Firefox/0.10

Description of problem:
The error affects GeoViz, an application from SIS. Memory that is allocated and then freed does not become available again until 15-30 minutes later, even if the application exits. Apparently, this is similar to https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=124576

Version-Release number of selected component (if applicable):

How reproducible: Always

Steps to Reproduce:
1. Run a program that allocates more than 50% of available RAM + Swap
2. Exit
3. Immediately rerun the above program

Actual Results: Swapping or memory allocation error

Expected Results: No memory allocation error

Additional info:
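The reproduction steps above could be exercised with a small stand-in program along the following lines. This is only a sketch and not the reporter's GeoViz workload: the function names, the 4096-byte page size, and the 64 MiB default are all assumptions made for illustration. To match step 1, the size would need to be raised to more than half of RAM + swap on the test machine.

```python
import sys

PAGE_SIZE = 4096  # assumed page size, for illustration only


def allocate_and_touch(num_bytes, page_size=PAGE_SIZE):
    """Allocate num_bytes and write one byte in every page of the buffer."""
    # bytearray() zero-fills the buffer; the loop below additionally writes
    # one byte per page so that every page has been explicitly touched.
    buf = bytearray(num_bytes)
    for offset in range(0, num_bytes, page_size):
        buf[offset] = 1
    return buf


def main(argv):
    # Hypothetical command line: an optional size in MiB (default 64).
    size_mib = int(argv[0]) if argv and argv[0].isdigit() else 64
    buf = allocate_and_touch(size_mib * 1024 * 1024)
    print("allocated and touched %d MiB" % (len(buf) // (1024 * 1024)))
    # Returning here ends the process, which corresponds to step 2 (Exit).


if __name__ == "__main__":
    main(sys.argv[1:])
```

Running the script twice in quick succession with a large enough size argument corresponds to steps 1 through 3.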
First of all, the normal system behavior is to hold on to pagecache pages that a process accessed rather than freeing them when the process exits. It works this way so that when another process accesses the same file pages, they are already in memory and do not have to be re-read. What this means to users and casual observers is that there is more free memory before you start running an application than after it terminates. Having said that, the kernel is responsible for reclaiming pages when a memory deficit occurs. We have made several changes to the kernel in RHEL3-U3 and RHEL3-U4 that expedite reclaiming pagecache pages rather than swapping anonymous memory pages when the system runs out of memory. You really need to run the very latest RHEL3 update kernel in order to have the system reclaim pagecache pages instead of swapping. Can you test with the latest RHEL3-U4 kernel?

We can provide you with the latest RHEL 3 U4 kernel (2.4.21-22.EL) once it emerges from QA (later today or tomorrow). The current RHEL 3 U4 beta ISOs, which are posted to Red Hat Network (RHN), contain the 2.4.21-21.EL kernel, which does NOT contain the fix that you'll need. We'll send mail and also post the location of the -22.EL kernel here.
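To see why a low "free" figure after a program exits is not by itself a sign of lost memory, it can help to look at how much memory is sitting in the page cache. The sketch below parses /proc/meminfo-style text and adds up MemFree, Buffers, and Cached as a rough upper bound on what the kernel could hand back. The function names are hypothetical, the sample numbers are invented, and the sample layout is only an assumption about what a 2.4-era /proc/meminfo looks like. The sum is an estimate, since not every cached page can necessarily be reclaimed.

```python
def parse_meminfo(text):
    """Return a dict of the 'Name:  <number> kB' entries found in text."""
    values = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        fields = rest.split()
        # Keep only lines of the form "Key:   12345 kB"; skip everything else.
        if sep and len(fields) == 2 and fields[1] == "kB" and fields[0].isdigit():
            values[key.strip()] = int(fields[0])
    return values


def reclaimable_estimate_kb(values):
    """Rough estimate: free memory plus buffers plus page cache, in kB."""
    return (values.get("MemFree", 0)
            + values.get("Buffers", 0)
            + values.get("Cached", 0))


# Invented sample data, for illustration only.
SAMPLE = """\
        total:    used:    free:  shared: buffers:  cached:
Mem:  1055760384 1040187392 15572992 0 12345344 820000000
MemTotal:      1031016 kB
MemFree:         15208 kB
Buffers:         12056 kB
Cached:         800780 kB
SwapTotal:     2040244 kB
"""

info = parse_meminfo(SAMPLE)
print("MemFree: %d kB" % info["MemFree"])
print("Free + buffers + cache: %d kB" % reclaimable_estimate_kb(info))
```

On a live Linux system the same functions could be fed the contents of /proc/meminfo, for example with parse_meminfo(open("/proc/meminfo").read()).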
Arun, please try the latest RHEL3-U4 kernel and see if that fixes the problem you are seeing. As I said above on 10/27, it's normal behavior for the Linux kernel to hold onto filesystem cache pages, so you can't expect the system to have exactly the same number of free pages when the program ends as it did when it first started. However, if this is causing memory allocation failures, especially at the user level, there is certainly a bug somewhere. Please try out the latest RHEL3-U4 kernel located here: ftp://partners.redhat.com/a61d109e2483b0bf579b0b5f90a5ea8c/2.4.21-27.EL/ Larry
Arun, we are waiting on you to confirm whether this problem has been fixed. Please download the appropriate RPM from the URL specified in comment #4 and update this bug report. Thanks in advance. (The kernel version should be 2.4.21-27.EL.)
Actually, my team attempted to use this and the feedback is enclosed. This link only points to the single processor version. We also need the "smp" version of this. I tried going up one level and adding "smp" to the link and neither gets me to a place where I can find the "smp" version. There is no sense testing this unless we can also verify that it works in multiple processor mode. Can you contact the people at RedHat and get a link to the "smp" version?
Arun, it's there. This link points to kernel RPMs for every kernel configuration on every architecture we build: ftp://partners.redhat.com/a61d109e2483b0bf579b0b5f90a5ea8c/2.4.21-27.EL/ Specifically, this is the exact link to the x86_64 smp kernel: ftp://partners.redhat.com/a61d109e2483b0bf579b0b5f90a5ea8c/2.4.21-27.EL/x86_64/kernel-smp-2.4.21-27.EL.x86_64.rpm Larry Woodman
Thanks! We will give it a try right away. Cheers, Arun
Hello, Arun. Any confirmation that Larry's test kernel resolved this problem? Also, there have been two security errata issued since U4, so if you're going to begin your testing from the beginning, it would be better to use the latest 2.4.21-27.0.2.EL package from RHN (which was released last night).
I can confirm based on our testing that we are no longer seeing this problem with U4. Thanks for all your help, it is much appreciated. Thanks, Arun
A fix for this problem was committed to the RHEL3 U4 patch pool on 23-Sep-2004 (in kernel version 2.4.21-20.11.EL), and this was later followed up by a 2nd fix committed on 18-Oct-2004 (in kernel version 2.4.21-22.EL). Thanks for the confirmation, Arun.
An erratum has been issued which should help with the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you. http://rhn.redhat.com/errata/RHBA-2004-550.html