Bug 109019

Summary: System not responding with processes waiting on wakeup_kswapd
Product: [Retired] Red Hat Linux
Reporter: yuval yeret <yuval>
Component: kernel
Assignee: Arjan van de Ven <arjanv>
Status: CLOSED ERRATA
QA Contact: Brian Brock <bbrock>
Severity: high
Priority: high
Version: 7.1
CC: riel
Target Milestone: ---   
Target Release: ---   
Hardware: i686   
OS: Linux   
Doc Type: Bug Fix
Last Closed: 2004-01-05 04:06:17 UTC

Description yuval yeret 2003-11-04 14:11:01 UTC
Description of problem:
Under significant I/O load, a hyperthreaded P4 SMP machine (2 real
CPUs) configured with HIGHMEM support gets into a "zombie" state: it
still answers pings and responds to magic SysRq keys, but login hangs
and most other processes stop responding.
The magic keys output shows most processes in D state with the
following call trace (the only difference is the process name):
Oct 12 09:27:26 node0 kernel: crond         D C53D6000     0 23425   1392         23426 23424 (NOTLB)
Oct 12 09:27:26 node0 kernel: Call Trace:   [wakeup_kswapd+282/320]  (0xc670ded0))
Oct 12 09:27:26 node0 kernel: Call Trace:   [<c0140cfa>]  (0xc670ded0))
Oct 12 09:27:27 node0 kernel: [__alloc_pages+222/848]  (0xc670def8))
Oct 12 09:27:27 node0 kernel: [<c0141e8e>]  (0xc670def8))
Oct 12 09:27:27 node0 kernel: [__get_free_pages+16/32]  (0xc670df24))
Oct 12 09:27:27 node0 kernel: [<c0142110>]  (0xc670df24))
Oct 12 09:27:27 node0 kernel: [pipe_new+18/176]  (0xc670df28))
Oct 12 09:27:27 node0 kernel: [<c0154c02>]  (0xc670df28))
Oct 12 09:27:27 node0 kernel: [get_pipe_inode+28/144]  (0xc670df34))
Oct 12 09:27:27 node0 kernel: [<c0154ccc>]  (0xc670df34))
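
For anyone triaging a similar hang from a saved syslog, the following is
a minimal sketch (not part of the original report) that lists the tasks
shown in D state in a magic-keys task dump. It assumes each task header
line has the layout seen above (name, state letter, PC, free stack, PID)
and that wrapped log lines have been rejoined. The names TASK_RE,
iter_tasks, blocked_tasks and count_states are hypothetical.

```python
import re
from collections import Counter

# Assumed header layout, taken from the trace in this report:
#   "... kernel: crond         D C53D6000     0 23425   1392 ..."
# i.e. task name, state letter, PC, free stack, PID.
TASK_RE = re.compile(
    r"kernel:\s+(?P<name>\S+)\s+(?P<state>[RSDTZW])\s+\S+\s+\d+\s+(?P<pid>\d+)"
)


def iter_tasks(log_text):
    """Yield (name, state, pid) for every task header line in the log."""
    for line in log_text.splitlines():
        match = TASK_RE.search(line)
        if match:
            yield match.group("name"), match.group("state"), int(match.group("pid"))


def blocked_tasks(log_text):
    """Return (name, pid) pairs for tasks in uninterruptible sleep (D)."""
    return [(name, pid) for name, state, pid in iter_tasks(log_text) if state == "D"]


def count_states(log_text):
    """Return a Counter mapping each state letter to its number of tasks."""
    return Counter(state for _, state, _ in iter_tasks(log_text))
```

A large share of tasks in D state, all sharing one call trace, matches
the symptom described here. Call trace lines are skipped because they do
not fit the task header pattern.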



Version-Release number of selected component (if applicable):
2.4.20-18 RH errata kernel (compiled from sources)



How reproducible:
Run a heavy load combining I/O and CPU-intensive work, with memory
nearly full.

  
Actual results:
System is in a zombie state. It answers ping but not rsh/ssh.
Console login does not progress beyond typing the user name.
Magic keys respond and show many processes in D state, including
system processes such as sshd and crond.

Expected results:
Regardless of I/O or CPU load, the system should keep functioning,
even if very slowly.

Additional info:
This seemed related:
1. http://groups.google.com/groups?hl=en&lr=lang_en|lang_iw&ie=UTF-8&oe=UTF-8&safe=off&selm=20030109023006%245fb4%40gated-at.bofh.it&rnum=10

Comment 1 Dave Jones 2004-01-05 04:06:17 UTC
Try the 2.4.20-27.7 errata kernel, which fixes lots of problems in
this area. Also note that RHL 7 has been end of life since Jan 1st.