Bug 133183
Summary: | cpio with many files flips kswapd, system hangs | |
---|---|---|---
Product: | Red Hat Enterprise Linux 3 | Reporter: | Roberto Bourgonjen <otrebor>
Component: | kernel | Assignee: | Larry Woodman <lwoodman>
Status: | CLOSED ERRATA | QA Contact: |
Severity: | high | Docs Contact: |
Priority: | medium | |
Version: | 3.0 | CC: | dlewis, joe, petrides, redhat-bugzilla, riel, say, tao
Target Milestone: | --- | |
Target Release: | --- | |
Hardware: | i686 | |
OS: | Linux | |
Whiteboard: | | |
Fixed In Version: | | Doc Type: | Bug Fix
Doc Text: | | Story Points: | ---
Clone Of: | | Environment: |
Last Closed: | 2004-12-20 20:56:40 UTC | Type: | ---
Regression: | --- | Mount Type: | ---
Documentation: | --- | CRM: |
Verified Versions: | | Category: | ---
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: |
Cloudforms Team: | --- | Target Upstream Version: |
Embargoed: | | |
Attachments: | | |
Description
Roberto Bourgonjen
2004-09-22 09:08:01 UTC
Roberto, for starters can you get me several top, AltSysrq-M and AltSysrq-W outputs when your system is in this state?
Thanks, Larry Woodman

Larry, I don't think I can manage that. The server is back at the colo, running Fedora 2 now (it has already crashed once, but that is another story). I will see if I have time to run this on my development server (which has only 3 GB of memory and slightly slower hard disks, which might make a difference), but that will be next week at the earliest.

Seems bug #124058 might be related.

Roberto, when you get a chance can you test out the fix for this problem? It's located at:
http://people.redhat.com/~lwoodman/.RHEL3/kernel-smp-2.4.21-22.prune_icachefix.EL.i686.rpm
Thanks, Larry
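
The top output requested above is mainly useful for seeing how much CPU time kswapd accumulates. As a rough companion to top, the sketch below reads the same counters from /proc/<pid>/stat. It is only an illustration: the helper names are hypothetical, the field positions (utime and stime as fields 14 and 15) follow the proc(5) layout, and it has not been run on the RHEL 3 kernels discussed in this report.

```python
import os


def parse_cpu_ticks(stat_line):
    """Return (command name, utime + stime in clock ticks) for one stat line."""
    # The command name is wrapped in parentheses and may itself contain
    # spaces or parentheses, so split around the first '(' and the last ')'.
    open_paren = stat_line.index("(")
    close_paren = stat_line.rindex(")")
    comm = stat_line[open_paren + 1:close_paren]
    rest = stat_line[close_paren + 1:].split()
    # rest[0] is the process state (field 3), so field 14 (utime) is
    # rest[11] and field 15 (stime) is rest[12].
    return comm, int(rest[11]) + int(rest[12])


def cpu_ticks_for_pid(pid, proc_root="/proc"):
    """Read and parse /proc/<pid>/stat for a single process."""
    with open(os.path.join(proc_root, str(pid), "stat")) as handle:
        return parse_cpu_ticks(handle.read())
```

Calling cpu_ticks_for_pid twice for the kswapd PID, a few seconds apart, and subtracting the two totals gives the ticks consumed in that interval, which can be noted alongside the top captures.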
Created attachment 105537
captured sysrq data during various times
This is the sysrq information, time-stamped on the system.
Created attachment 105540
Screen Captures of TOP
This is the screen capture of TOP and some other comments.
I have exactly the same kswapd issue and have attached documentation. Please look at this as soon as possible, as the system is not usable. All I have to do is restore many files from tape using cpio, or copy files with scp or rcp across the network.

Don, is this with the latest kernel I posted? Also, please get me a few AltSysrq-T outputs so I can see where kswapd is hanging out.
Larry

Larry: I don't know about the kernels that you have posted. I have been around Unix a long time, but Red Hat is somewhat new to me. I only have what up2date provided: kernel 2.4.21-20.ELsmp. Can you give me more instructions on the AltSysrq-T? I will have to start the test up again, as the system has been rebooted.
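
For readers with the same question about collecting AltSysrq output without a console keyboard, here is a minimal sketch. It assumes the running kernel exposes /proc/sysrq-trigger and that SysRq is enabled (/proc/sys/kernel/sysrq set to 1); neither is confirmed in this report for the kernels involved. The function names are hypothetical. The dumps themselves go to the kernel log, so they are read afterwards with dmesg or from the system log.

```python
import time


def trigger_sysrq(key, trigger_path="/proc/sysrq-trigger"):
    """Write a single SysRq command character (e.g. 'm' or 't') to the trigger file."""
    if len(key) != 1 or not key.isalnum():
        raise ValueError("SysRq key must be a single letter or digit: %r" % (key,))
    # Writing the character has the same effect as pressing Alt-SysRq-<key>.
    with open(trigger_path, "w") as trigger:
        trigger.write(key)


def dump_sequence(keys="mwt", repeats=3, delay=5.0,
                  trigger_path="/proc/sysrq-trigger"):
    """Trigger each key in turn, several times, pausing between rounds."""
    for _ in range(repeats):
        for key in keys:
            trigger_sysrq(key, trigger_path)
        time.sleep(delay)
```

Run as root while the system is in the bad state, dump_sequence() produces several rounds of the M, W and T dumps requested in this thread; the repeat count and delay are arbitrary defaults.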
Don, can you grab this kernel and give it a try?
http://people.redhat.com/~lwoodman/.RHEL3/kernel-smp-2.4.21-22.prune_icachefix.EL.i686.rpm
Larry
Larry: We have installed your prune_icachefix kernel and run our cpio test. The system never became bogged down by the kswapd daemon. The system load seemed a bit high for the actual work being done, around 1.4. When can we expect a production fix and release of this kernel? Thanks for your help on this matter.
Don Lewis

I have also tried the prune_icachefix kernel. A simple Perl script that fetches images from a webcam and analyzes them with GD, and that runs very steadily on the 2.4.21-20.ELsmp kernel, now eats memory like crazy, expanding to over 5 GB within half an hour. So I am afraid there are some really serious side effects to your solution.

A fix for this problem has just been committed to the RHEL3 U4 patch pool this evening (in kernel version 2.4.21-23.EL).

Hi there. Is the patch for this issue generally available? I too have the same problem with kswapd taking down my systems (I built several new servers for a critical project with RHEL AS 3.0 U3). The link to the patch earlier in this thread is no longer valid and I need to get my systems operational. THANKS!!!
Joe

The fix is in the latest U4 kernel, which is in beta test right now (and is available in the RHN beta channel). However, there will be another respin next week, so the -23.EL kernel is not exactly what will be released in Update 4. I would advise waiting until the final U4 is released (beginning of December).

Unfortunately I can't wait a month for this... When I look at the RHN AS3 Update 4 beta channel, these are the only kernel packages I see:
glibc-kernheaders-2.4-9.1.87.i386.rpm
kernel-2.6.9-1.648_EL.i586.rpm
kernel-2.6.9-1.648_EL.src.rpm
kernel-smp-2.6.9-1.648_EL.i586.rpm
kernel-utils-2.4-13.1.37.i386.rpm
Are those the correct kernel packages? Thanks again,
Joe

Joe, you're looking in the wrong channel. The kernel version is 2.4.21-23.EL, and 2.4.21-24.EL will be built tonight (but won't be available in RHN for about a week).

Ah! I found the right location this time. Thanks for your help with this.
Joe

Ernie, per our internal policy we must build a custom kernel. We have the same problem that has been described. Could you please give us a direct link to the patch that fixes this problem? Alternatively, please give a tip on how I can find the src.rpm of the kernel with the kswapd fix in RHN. I looked but did not see any way to find the src.rpm :(

The relevant RPM is kernel-source-2.4.21-27.EL.i386.rpm, which can be found in the i386 subdirectory of the following URL:
ftp://partners.redhat.com/a61d109e2483b0bf579b0b5f90a5ea8c/2.4.21-27.EL/
The kernel (along with the rest of U4) is scheduled for release on 20-Dec-2004, at which time you will be able to find it in the main RHN channel(s).

An erratum has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you.
http://rhn.redhat.com/errata/RHBA-2004-550.html
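
Several kernel builds are mentioned in this report (2.4.21-20.ELsmp, the -22 prune_icachefix test build, -23.EL, -27.EL), so a quick check of whether a given release string is at or past the released U4 kernel can be handy. The sketch below is an assumption-laden illustration: it treats 2.4.21-27.EL as the first generally released kernel carrying the fix, which is this editor's reading of the comments above rather than a statement from the erratum, and the function names are hypothetical. It only compares builds within the 2.4.21 series.

```python
import re

# Assumed threshold, taken from the U4 kernel named in this report.
FIXED_RELEASE = "2.4.21-27.EL"


def release_key(release):
    """Turn a release such as '2.4.21-20.ELsmp' into (2, 4, 21, 20)."""
    match = re.match(r"(\d+)\.(\d+)\.(\d+)-(\d+)", release)
    if match is None:
        raise ValueError("unrecognized kernel release: %r" % (release,))
    return tuple(int(part) for part in match.groups())


def has_kswapd_fix(release, fixed=FIXED_RELEASE):
    """Return True if the release is at or after the assumed fixed build."""
    current, threshold = release_key(release), release_key(fixed)
    # Builds from a different base version (for example 2.6.9) are not
    # comparable by build number alone, so refuse to guess.
    if current[:3] != threshold[:3]:
        raise ValueError("not a %d.%d.%d kernel: %r" % (threshold[:3] + (release,)))
    return current[3] >= threshold[3]
```

The release string would normally come from uname -r (or platform.release() in Python). Note that the -22 prune_icachefix test build reports False here even though it carried an early version of the fix, since the check looks only at the build number.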