Description of problem:
IssueTracker 50542 opened to match this BZ. RHEL3 pre-U4 kernel 2.4.21-20.5 plus two patches described in bugzilla 131525 (lwoodman's patch and my free_more_memory patch). System running the OMSA omdiag system memory test. System has 256MB RAM, and was tried with 512MB, 1GB, and then 2GB of swap available. The system runs out of swap space and livelocks with each amount of swap space; failure occurs in under 3 hours. SysRq-{m,p,t} output indicates that the VM moves pages between the active anonymous, inactive dirty, and inactive laundry lists repeatedly and continuously. No disk I/O is occurring. The out-of-memory killer does kill 16 threads of the omawsd32 web server daemon during the run. I'm not sure if these all happen at once, or if the system can make forward progress after killing them. Why is the system running out of swap space?

Version-Release number of selected component (if applicable):

How reproducible:
Easy on one system at Dell. Other similar systems did not fail in the same manner.
Can you attach the show_mem() output from dmesg when the OOM kill occurs? Larry
Created attachment 104451 [details] 20040928-sysrq-outofswap.txt Sorry, it was attached in the IssueTracker, just not in the bugzilla. The first 16 or so show_mem() dumps come from the oom_killer killing omawsd32 threads. Following those are the results of my pressing sysrq-[mptw] repeatedly.
Matt, do you know what the "OMSA omdiag system memory test" does? Do you really think it uses up all of the memory and swap space or do you think we are leaking swap space? Can you grab a "ps aux" when the system is in this state so we can see the VSS and RSS of every process? Larry
Something is leaking swap space; I don't know if it's the tool itself or the kernel. omdiag (I don't have the source to it, but it is Dell-developed, just not open source) first allocates as much memory as it can, up to 95% of system RAM, via calls to malloc() in large (1MB+) blocks, reducing the block size as malloc() fails, until it's gotten to 95% of RAM. For systems with >2GB RAM, it forks itself until it can malloc up to the 95% point. It then spawns two threads per process: one touches each allocated page in a loop, which keeps its VSS=RSS and most of its pages on the active anonymous lists; the second write/read/compares each byte of each page. After a while it decides it's finished, frees all the memory, and starts over. The goal is to induce single-bit-correctable ECC memory errors; in practice it beats on the VM until it cries uncle. This is the same tool that induced the kswapd deadlock we've just fixed. The processes are named omdiag and memorytestprocess.

At the point of failure, no additional shell work is possible. I did happen to have a 'top' running on another VT, which still shows the first 15 or so processes. It's showing six instances of memorytestprocess, with SIZE=256M each and RSSes of 1396, 1336, 1364, 1264, 1500, and 1704, and one instance of omdiag with SIZE=9944 and RSS=1752.

You've got a point: there shouldn't be six memorytestprocess processes listed on this config, there should only be one at a time, because there's <2GB RAM so it need not fork to be able to allocate all RAM. /me is going to talk to the omdiag writers...
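For reference, here's a minimal userspace sketch of the allocation strategy as I've described it above. This is NOT the actual omdiag source (which I don't have); it's a reconstruction from observed behavior, and helpers like total_ram_bytes() are my own names:

/* Hypothetical sketch of the omdiag-style allocator described above.
 * Not the real tool -- names and structure are assumptions. */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <pthread.h>
#include <unistd.h>

struct region { void *base; size_t len; struct region *next; };

static size_t total_ram_bytes(void)
{
    long pages = sysconf(_SC_PHYS_PAGES);
    long psize = sysconf(_SC_PAGE_SIZE);
    return (size_t)pages * (size_t)psize;
}

/* Toucher thread: walk every allocated page in a loop so the pages
 * stay resident (VSS == RSS, pages stay on the active anon list). */
static void *toucher(void *arg)
{
    struct region *head = arg;
    long psize = sysconf(_SC_PAGE_SIZE);
    for (;;) {
        for (struct region *r = head; r; r = r->next)
            for (size_t off = 0; off < r->len; off += (size_t)psize)
                ((volatile char *)r->base)[off]++;
    }
    return NULL;
}

int main(void)
{
    size_t target = total_ram_bytes() / 100 * 95;   /* 95% of RAM */
    size_t got = 0, block = 1 << 20;                /* start at 1MB */
    struct region *head = NULL;

    /* Allocate in large blocks, halving the block size each time
     * malloc() fails, until we've reached the 95% target. */
    while (got < target && block >= (size_t)sysconf(_SC_PAGE_SIZE)) {
        void *p = malloc(block);
        if (!p) {
            block /= 2;
            continue;
        }
        memset(p, 0, block);    /* fault the pages in */
        struct region *r = malloc(sizeof(*r));
        if (!r)
            break;
        r->base = p; r->len = block; r->next = head; head = r;
        got += block;
    }
    printf("allocated %zu bytes\n", got);

    pthread_t t;
    pthread_create(&t, NULL, toucher, head);
    pthread_join(t, NULL);      /* runs until the test is stopped */
    return 0;
}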
Even if the app does manage to accidentally run the system out of swap space, the kernel shouldn't livelock.
Agreed. I wonder if the OOM killing of tasks is failing to free memory and/or swap space the way it was designed to? Larry
Maybe, but the processes that were oom_killed were little web server threads, not the larger memorytestprocess processes. So the oom_killer may have done little good anyhow in this case.
Note: only 16 (or in another case, 11) threads of omawsd32 were killed, when there were likely 30+ threads running. We know for a fact from the sysrq-t output that there were more threads. Therefore, no memory was reclaimed during the kill. See 2.4-bk for wli's patch to mm/oom_kill.c on 13-Aug-2004, which fixes this by taking the task_lock and mmlist_lock when reading p->mm (which could otherwise wind up NULL accidentally), and calling mmput(mm) on the mm for the whole process to make sure the whole mm is freed.
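Roughly, the locking pattern of that fix looks like the sketch below. This is my paraphrase of the description above, not the verbatim 2.4-bk diff; it assumes 2.4 kernel context (task_lock(), for_each_task, tasklist_lock, mmput() all exist there), and I've left the mmlist_lock handling out of the sketch:

/* Paraphrased sketch of the 2.4 oom_kill fix pattern -- illustrative,
 * not the actual patch. */
static void oom_kill(void)
{
	struct mm_struct *mm;
	struct task_struct *p, *q;

	p = select_bad_process();
	if (p == NULL)
		return;

	/* p->mm can go NULL if the task exits between selection and
	 * the kill; read it under task_lock() and pin it with a
	 * reference so the mm can't vanish underneath us. */
	task_lock(p);
	mm = p->mm;
	if (mm)
		atomic_inc(&mm->mm_users);
	task_unlock(p);
	if (mm == NULL)
		return;

	/* Kill *every* thread sharing this mm, not just the first few
	 * we happen to see -- otherwise the address space is never
	 * torn down and no memory or swap is actually reclaimed. */
	read_lock(&tasklist_lock);
	for_each_task(q) {
		if (q->mm == mm)
			oom_kill_task(q);
	}
	read_unlock(&tasklist_lock);

	/* Drop our pin: once the last user exits, the whole mm and
	 * its swap entries are freed. */
	mmput(mm);
}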
A newer build of omdiag (AppCD 4.1.0 rev A00) no longer induces this failure. It correctly cleans up the extra memorytestprocess processes on exit, and only allocates 85% of system RAM for the test rather than 95%. However, I believe the kernel failure is still real and a valid bug to fix.
The related IT was closed, so closing this.
I don't believe we actually did anything to fix this, so I'm changing the disposition to WONTFIX.