Description of problem:

I'm using an HP Integrity rx1600 Server with 1 GB RAM for the test. After booting up, we've got this:

--- snipp ---
[root@rx1600 root]# free
             total       used       free     shared    buffers     cached
Mem:       1005968     151568     854400          0      22464      50736
-/+ buffers/cache:      78368     927600
Swap:      2040208          0    2040208
[root@rx1600 root]#
--- snapp ---

For this benchmark I use iozone <http://www.iozone.org/>; the application itself needs 100-300 MB of RAM (depending on the parameters given to iozone). iozone does a lot of I/O (as the name says), and this needs page cache, which is ultimately taken from RAM. If too much RAM is required (this happens sometimes during the tests), the Linux kernel kills applications (I think this is the normal kernel 2.4 behaviour):

--- snipp ---
May 27 14:51:19 rx1600 kernel: Out of Memory: Killed process 9424 (sendmail).
May 27 14:51:45 rx1600 kernel: Out of Memory: Killed process 9461 (xfs).
May 27 14:51:56 rx1600 kernel: Out of Memory: Killed process 9983 (bash).
May 27 14:51:57 rx1600 sshd(pam_unix)[9978]: session closed for user root
May 27 14:54:40 rx1600 kernel: Out of Memory: Killed process 10142 (bash).
May 27 14:54:40 rx1600 sshd(pam_unix)[10140]: session closed for user root
--- snapp ---

PID 10142 was the shell running iozone, so iozone was killed as well. But if I log in again and check the memory again:

--- snipp ---
[root@rx1600 root]# free
             total       used       free     shared    buffers     cached
Mem:       1005968     987232      18736          0     546928     293984
-/+ buffers/cache:     146320     859648
Swap:      2040208          0    2040208
[root@rx1600 root]#
--- snapp ---

Only 18 MB are free. I tried to start iozone again with the same parameters, but it was killed about 30 seconds after start (out of memory, too).
So my conclusion is that the kernel seems to leak memory after killing applications through "Out of Memory", because the normal behaviour would (or should) be that the page cache is freed after the out-of-memory kill, which yields more free RAM... but that isn't the case here :-( And iozone really is killed (ps -aux | grep iozone returned nothing).

Version-Release number of selected component (if applicable):
kernel-2.4.21-9.EL and newer

How reproducible & Steps to Reproduce:
Every time, see above.

Actual results:
Well, a reboot for example gives me the leaked memory back.

Expected results:
When the kernel kills applications because of "Out of Memory" problems, the page cache in use should be freed, which leads to more free RAM.
Robert, can you see if the unexpected OOM kills are still a problem with the latest RHEL3 IA64 kernel? It is located in:

http://people.redhat.com/~lwoodman/IA64/

Also, FYI, the pagecache memory that was mapped into a process which was OOM-killed will not get freed up immediately. Instead, the reference count will be decremented accordingly and the pages will remain in the pagecache until they are either reused by some other process or the system deems it necessary to reclaim and free them. The system is not leaking memory, it's just holding on to cached filesystem data pages. If that results in premature OOM kills then that's a separate problem, not a memory leak.

We have made several changes to the kernel since 2.4.21-9.EL that will delay and/or eliminate OOM kills, and that is what I want you to test for us.

Larry Woodman
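The distinction between "free" and "held in cache" can be checked against the two `free` outputs quoted in the report. The following is a minimal sketch, not part of the original thread: the helper name `reclaimable_view` is hypothetical, and the values are simply copied from the report in the units `free` printed them. It recomputes the "-/+ buffers/cache" row, that is, memory used outside buffers and cache, and the memory that would be free if all buffers and cache were reclaimed.

```python
# Values copied from the two `free` outputs in the report (units as printed by free).
# `reclaimable_view` is a hypothetical helper name, not a kernel or procps API.

def reclaimable_view(used, free, buffers, cached):
    """Recompute the '-/+ buffers/cache' row that free prints."""
    used_outside_cache = used - buffers - cached
    free_if_cache_reclaimed = free + buffers + cached
    return used_outside_cache, free_if_cache_reclaimed

# Right after boot.
after_boot = reclaimable_view(used=151568, free=854400, buffers=22464, cached=50736)

# After the OOM kills, when only 18736 was reported as free.
after_oom = reclaimable_view(used=987232, free=18736, buffers=546928, cached=293984)

print(after_boot)  # (78368, 927600)
print(after_oom)   # (146320, 859648)
```

Both results match the "-/+ buffers/cache" rows in the report. After the OOM kills, most of the "used" memory sits in buffers and cache rather than in processes, which is consistent with the explanation above that the pages are being held, not lost. Whether the kernel actually reclaims them in time is the separate question raised in this comment.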
At least 2.4.21-27.EL solves this issue for me on RHEL3. RHEL4 does not have this problem to begin with. Thank you, Larry :)
An errata has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you. http://rhn.redhat.com/errata/RHBA-2004-550.html