Bug 31617

Summary: [VM balance] top/ps/procinfo lock up under high load.
Product: [Retired] Red Hat Linux
Reporter: Ed McKenzie <eem12>
Component: kernel
Assignee: Michael K. Johnson <johnsonm>
Status: CLOSED WONTFIX
QA Contact: Brock Organ <borgan>
Severity: medium
Priority: medium    
Version: 7.1   
Target Milestone: ---   
Target Release: ---   
Hardware: i386   
OS: Linux   
Doc Type: Bug Fix
Last Closed: 2001-08-09 18:58:12 UTC
Attachments:
Description Flags
spawn.py none

Description Ed McKenzie 2001-03-13 03:05:06 UTC
Under high load, top/ps/procinfo stop updating, while other applications
seem to keep running (slowly, of course).  top can lock up and stop updating
for two to three minutes at a time!  I would hope that the ps utilities would
stay responsive under load, as that's basically the only way to rescue a
box that is thrashing to death.

Comment 1 Glen Foster 2001-03-19 15:26:29 UTC
We (Red Hat) should really try to fix this before next release.

Comment 2 Arjan van de Ven 2001-03-27 16:16:30 UTC
A fix for this is in Linus's kernel 2.4.3-pre8.
Unfortunately, this fix has caused major instabilities.
We're hoping to get a stable fix as soon as possible.

Comment 3 Ed McKenzie 2001-06-12 02:42:52 UTC
I don't think 2.4.3 fixed it, as 2.4.5-0.2.9 is no better than the 7.1-release
kernel.

I've attached a Python script to help demonstrate the problem.  Run this script
in one terminal and wait for it to stop spawning threads; after it's done, run
ps aux in another terminal.  Alternatively, if you have *a lot* of patience, try
starting top in another terminal.
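
The attachment itself is not reproduced in this report.  A minimal sketch of a
script in the same spirit might look like the following, assuming all it needs
to do is start many threads and keep them alive while ps or top is run
elsewhere.  The function name, thread count and hold time are hypothetical and
are not taken from spawn.py.

```python
import threading
import time


def spawn_idle_threads(count, hold_seconds):
    """Start up to `count` idle threads, hold them, and return how many started.

    Hypothetical stand-in for the attached script, not its actual contents.
    """
    release = threading.Event()
    threads = []
    try:
        for _ in range(count):
            # Each thread just blocks until the event is set.
            t = threading.Thread(target=release.wait, daemon=True)
            t.start()
            threads.append(t)
    except RuntimeError:
        # Thread creation failed (for example, a resource limit was hit);
        # carry on with the threads that did start.
        pass

    started = len(threads)
    print(f"spawned {started} threads; now run 'ps aux' in another terminal")

    # Keep the threads alive long enough to observe ps/top behaviour.
    time.sleep(hold_seconds)

    release.set()
    for t in threads:
        t.join()
    return started


if __name__ == "__main__":
    # Assumed values; raise the count to put more pressure on ps and top.
    spawn_idle_threads(count=500, hold_seconds=60)
```

These threads only sleep, so this exercises the thread count rather than CPU or
memory load; a heavier workload per thread may be needed to reproduce the
original conditions.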

It's not nearly so bad under normal circumstances, but in any case ps and top
aren't very useful under high load.  FreeBSD does quite well on this test
compared to Linux 2.4.

Comment 4 Ed McKenzie 2001-06-12 02:51:18 UTC
Created attachment 20803 [details]
spawn.py

Comment 5 Ed McKenzie 2001-08-09 18:58:08 UTC
It turns out this problem is caused by VMA fragmentation in heavily threaded 
processes.  Watching ps output, there's a marked slowdown as it prints stats 
for threads.  glibc issue?  Kernel issue?  Who knows.  For now, don't run lots 
of threads on Linux ...
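
One rough way to watch the number of VMAs in a process is to count the lines
of /proc/<pid>/maps, since each line there describes one mapping.  The sketch
below assumes a Linux /proc filesystem; the helper name count_vmas is
hypothetical, and the before/after comparison is only an illustration, not a
measurement from this bug.

```python
import threading


def count_vmas(pid="self"):
    """Return the number of mappings in /proc/<pid>/maps, or None if unreadable."""
    path = f"/proc/{pid}/maps"
    try:
        with open(path) as maps:
            # One line per mapping (VMA).
            return sum(1 for _ in maps)
    except OSError:
        # No /proc, no such process, or no permission to read it.
        return None


if __name__ == "__main__":
    before = count_vmas()

    # Start some idle threads so their stacks show up as extra mappings.
    release = threading.Event()
    threads = [threading.Thread(target=release.wait) for _ in range(50)]
    for t in threads:
        t.start()

    during = count_vmas()

    release.set()
    for t in threads:
        t.join()

    print(f"mappings before threads: {before}, with 50 threads: {during}")
```

Pass a numeric pid instead of "self" to inspect another process, permissions
allowing.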