From Bugzilla Helper:
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.5a) Gecko/20030729 Mozilla Firebird/0.6.1

Description of problem:
We are running a system without swap (necessary because Linux swaps aggressively and disk utilization is at a premium). If a program leaks memory, the system eventually freezes in what looks like a kswapd runaway. Here is top output from moments before the freeze:

08:29:02 up 3 min, 3 users, load average: 0.21, 0.11, 0.04
52 processes: 50 sleeping, 2 running, 0 zombie, 0 stopped
CPU0 states:   0.0% user   0.0% system   0.0% nice   0.0% iowait 100.0% idle
CPU1 states:   8.0% user  91.0% system   0.0% nice   0.0% iowait   0.0% idle
CPU2 states:   0.0% user   1.0% system   0.0% nice   0.0% iowait  98.0% idle
CPU3 states:   0.0% user  34.0% system   0.0% nice   0.0% iowait  65.0% idle
Mem:  2063936k av, 2053552k used,   10384k free,       0k shrd,     208k buff
                   1947248k active,                 1660k inactive
Swap:       0k av,       0k used,       0k free                   34196k cached

  PID USER     PRI  NI  SIZE  RSS SHARE STAT %CPU %MEM   TIME CPU COMMAND
 2115 root      25   0 1864M 1.8G    32 R    98.6 92.5   0:06   1 usemm
   11 root      15   0     0    0     0 DW   34.6  0.0   0:00   3 kswapd
 1876 root      15   0  1412 1412  1176 S     1.7  0.0   0:00   1 sshd
 2091 root      15   0   728  728   540 R     1.7  0.0   0:00   2 top

And here is the source code of the usemm program, designed to make the system freeze:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <errno.h>

int main(int argc, char *argv[])
{
    size_t chunksize;
    unsigned char *cptr;
    size_t i;

    if (argc < 2) {
        fprintf(stderr, "usage: %s chunksize\n", argv[0]);
        exit(EXIT_FAILURE);
    }
    /* chunk size in bytes, taken from the command line */
    chunksize = (size_t)strtoul(argv[1], NULL, 10);

    while (1) {
        cptr = malloc(chunksize);
        if (cptr == NULL) {
            fprintf(stderr, "Unable to allocate %zu bytes: %s\n",
                    chunksize, strerror(errno));
            exit(EXIT_FAILURE);
        }
        /* now make it all dirty. */
        for (i = 0; i < chunksize; i++) {
            cptr[i] = (unsigned char)(i & 0xff);
        }
        /* do it again; let our memory leak! */
    }
}

This is reproducible whether the system is running under load or has just booted up.
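For anyone who wants to exercise the same allocate-and-dirty pattern on a box they cannot afford to freeze, a bounded variant may be useful. The sketch below is not part of the original test program: the function name leak_until and its limit_bytes cap are hypothetical, added only for illustration. It leaks and dirties chunks exactly as usemm does, but stops before the total would exceed the cap.

```c
#include <stddef.h>
#include <stdlib.h>

/*
 * Hypothetical bounded variant of usemm (not from the original report).
 * Leaks and dirties chunks of chunksize bytes, stopping before the total
 * would exceed limit_bytes. Returns the number of bytes allocated and
 * touched.
 */
size_t leak_until(size_t chunksize, size_t limit_bytes)
{
    size_t total = 0;

    if (chunksize == 0)
        return 0;               /* nothing to dirty; avoid looping forever */

    /* total never exceeds limit_bytes, so the subtraction cannot wrap */
    while (limit_bytes - total >= chunksize) {
        /* volatile keeps the compiler from discarding the stores */
        volatile unsigned char *cptr = malloc(chunksize);
        size_t i;

        if (cptr == NULL)
            break;              /* allocation failed: report what we got */

        for (i = 0; i < chunksize; i++)
            cptr[i] = (unsigned char)(i & 0xff);

        total += chunksize;
        /* deliberately no free(): the leak is the point */
    }
    return total;
}
```

Calling leak_until(1UL << 20, 512UL << 20), for example, would dirty at most 512 MB in 1 MB chunks and then return, which leaves room to watch kswapd activity in top as free memory shrinks without running the machine all the way out.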
Now, I understand that running out of memory on a system is a Bad Thing(tm), but the kernel should be able to kill these processes, no? If anything, when the program calls malloc and there is no free memory left, the call should fail and the program should exit, but that never happens. It looks like the kernel gets into a state where it is either trying to kill the process and can't, or doing so much work trying to free memory that it has no time for anything else.

What's worse, while this program chews through memory and freezes the system in seconds, a slow memory leak (literally over the span of weeks) causes the same behavior. It seems that no matter how slowly memory is consumed, the process never fails to allocate memory; the system always freezes first.

Version-Release number of selected component (if applicable):
kernel-2.4.20-19.9

How reproducible:
Always

Steps to Reproduce:
1. Run the usemm program.

Actual Results:
System freezes. kswapd is usually using a lot of CPU before the system becomes unresponsive.

Expected Results:
The process should die when it tries to allocate more memory than is available. Either that, or the system should panic or something. The box shouldn't freeze up like this.

Additional info:
While we tested on systems without swap, we were able to reproduce the problem in the early stages on systems with swap. Thinking that excessive swapping might have been the problem, we disabled swap and got the same results.

sysctl vm settings:
vm.max_map_count = 65536
vm.max-readahead = 31
vm.min-readahead = 3
vm.page-cluster = 3
vm.pagetable_cache = 25 50
vm.kswapd = 512 32 8
vm.overcommit_memory = 0
vm.bdflush = 30 500 0 0 50 300 60 20 0
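On the expected result above (an allocation that fails cleanly instead of a frozen box): one stopgap, which does not address the kernel behavior itself, is to cap a process's address space with setrlimit(RLIMIT_AS) so that malloc returns NULL once the cap is hit. The sketch below is an illustration under that assumption, not something tested on kernel-2.4.20-19.9; the helper name alloc_fails_under_limit is hypothetical. It forks a child, applies the cap there, and reports whether a single malloc of the requested size failed.

```c
#include <stdlib.h>
#include <sys/resource.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from the original report).
 * Forks a child, caps its address space at limit_bytes via RLIMIT_AS,
 * and tries one malloc(request_bytes) there.
 * Returns 1 if the allocation failed, 0 if it succeeded,
 * and -1 if the experiment itself could not be set up.
 */
int alloc_fails_under_limit(size_t limit_bytes, size_t request_bytes)
{
    int status;
    int code;
    pid_t pid = fork();

    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* child: apply the cap, then attempt the allocation */
        struct rlimit rl;
        void *volatile p;   /* volatile so the malloc is not optimized out */

        rl.rlim_cur = (rlim_t)limit_bytes;
        rl.rlim_max = (rlim_t)limit_bytes;
        if (setrlimit(RLIMIT_AS, &rl) != 0)
            _exit(2);

        p = malloc(request_bytes);
        _exit(p == NULL ? 1 : 0);
    }

    /* parent: translate the child's exit status into the result */
    if (waitpid(pid, &status, 0) < 0 || !WIFEXITED(status))
        return -1;
    code = WEXITSTATUS(status);
    return (code == 0 || code == 1) ? code : -1;
}
```

With a 64 MB cap and a 512 MB request the helper would be expected to return 1. The same limit can usually be applied from the shell with ulimit -v before starting a suspect program, so that a leak ends in a failed allocation inside that program rather than in system-wide memory exhaustion.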
Thanks for the bug report. However, Red Hat no longer maintains this version of the product. Please upgrade to the latest version and open a new bug if the problem persists. The Fedora Legacy project (http://fedoralegacy.org/) maintains some older releases, and if you believe this bug is interesting to them, please report the problem in the bug tracker at: http://bugzilla.fedora.us/