Bug 78759
Summary: | The system randomly hangs on kswapd
---|---
Product: | Red Hat Enterprise Linux 2.1
Reporter: | Pietro Dania <p.dania>
Component: | kernel
Assignee: | Larry Woodman <lwoodman>
Status: | CLOSED WONTFIX
QA Contact: | Brian Brock <bbrock>
Severity: | medium
Priority: | medium
Version: | 2.1
CC: | bnocera, christoph.pirchl, john-redhat, jon.nangle, jure.pecar, kmannth, lcm, mark, oivpe, p.ready, ray, rob, scott.carlson, shibata, steve, tao
Hardware: | i686
OS: | Linux
Doc Type: | Bug Fix
Last Closed: | 2007-10-19 19:25:47 UTC

Description
Pietro Dania
2002-11-29 07:55:37 UTC
I am confident that we have fixed the AS2.1 kswapd hang problems. Would it be possible to try the new kernel out on a test machine? The kernel can be found in: http://people.redhat.com/lwoodman/.private/

Thanks, Larry Woodman

Larry, I am trying out your kernel on a test machine and it does not appear to fix the problem.

    18 root 15 0 0 0 0 SW 0.0 0.0 0:18 kswapd

kernel 2.4.9-e.8corruptsdata.17enterprise

Is there anything else I can try?

We have a machine which may be suffering from the same problem:

    root 18 1.6 0.0 0 0 ? SW Nov28 350:06 [kswapd]

However, the machine is "stable": it doesn't crash. It will regularly slow down for 15-60 seconds. During this time the machine locks up for a few seconds at a time and then recovers. "while usleep 500000; do date; done" will skip seconds during the slowdowns. This is a 4 (physical) processor Xeon box with Hyperthreading. Kernel: 2.4.9-e.10summit. Is there a "summit" kernel I can test on this machine?

Mark, those are the same symptoms I am having. Do you have an x360 or an x440? I see the problem on our 2 x360s. I am running the summit kernel, but have tried all the other kernels with no success. I finally called Redhat and they asked me to run some stats (kernel profiling) when the problem happens and send them in. I can send this email to you. I would suggest you call Redhat and report the problem as well. If this isn't fixed soon I will have to switch back to 7.3.

I'd like to make some remarks. In order to evaluate 2.1AS, should my customer decide to purchase it, I rebuilt it from SRPMS and deployed it by hand on top of a 7.2 (ugly but easy way). I only had UNCERTIFIED hardware available for testing (HP lp1000r/lp2000r). Everything seemed to work fine; I set up a couple of FOS clusters running a very light service (a multicast streaming server) and didn't observe any slowdown. I then installed the machine described above and started experiencing the problem.

Ray, yes, this is happening on an x440.
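
The `while usleep 500000; do date; done` check mentioned above only shows stalls if someone is watching the terminal. A minimal Python sketch of the same idea follows; it records how late each wake-up is so the stalls can be logged. The function names `find_stalls` and `watch` and the 0.5 second tolerance are illustrative assumptions, not something used by the reporters.

```python
import time


def find_stalls(timestamps, interval=0.5, tolerance=0.5):
    """Return (previous, current, gap) for each gap longer than interval + tolerance.

    `timestamps` is a sequence of wake-up times in seconds. `interval` is the
    intended sleep and `tolerance` the extra delay we accept before calling it
    a stall. Both defaults are assumptions chosen to mirror the shell loop.
    """
    stalls = []
    for prev, cur in zip(timestamps, timestamps[1:]):
        gap = cur - prev
        if gap > interval + tolerance:
            stalls.append((prev, cur, gap))
    return stalls


def watch(interval=0.5, tolerance=0.5, iterations=None):
    """Sleep repeatedly and print a line whenever a wake-up arrives late.

    Runs forever when `iterations` is None, like the shell loop.
    """
    last = time.monotonic()
    count = 0
    while iterations is None or count < iterations:
        time.sleep(interval)
        now = time.monotonic()
        gap = now - last
        if gap > interval + tolerance:
            stamp = time.strftime("%H:%M:%S")
            print(f"{stamp} wake-up was {gap - interval:.1f}s late")
        last = now
        count += 1
```

Calling `watch()` runs until interrupted and prints one line per late wake-up, while `find_stalls` can be applied afterwards to a list of recorded timestamps. A monotonic clock is used so that clock adjustments are not mistaken for stalls.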
I'm also having this problem on a Compaq DL380 w/2 Proc & 4GB RAM with Multithreading enabled. kswapd consumes 99.9% of the CPU for 30-45 minutes at a time once the available memory gets rather low. We're running a very high volume sendmail application on this server that occasionally has 3000 sendmail child processes running at a time.

    6 root 25 0 0 0 0 RW 99.9 0.0 671:16 kswapd

More stuff if needed. I have the information to turn on profiling, but since this is a production system, we have not done that yet.

    Linux version 2.4.9-e.8smp (bhcompile.redhat.com) (gcc version 2.96 20000731 (Red Hat Linux 7.2 2.96-108.1)) #1 SMP Fri Jul 19 15:38:30 EDT 2002
    BIOS-provided physical RAM map:
     BIOS-e820: 0000000000000000 - 000000000009f400 (usable)
     BIOS-e820: 000000000009f400 - 00000000000a0000 (reserved)
     BIOS-e820: 00000000000f0000 - 0000000000100000 (reserved)
     BIOS-e820: 0000000000100000 - 000000008fffc000 (usable)
     BIOS-e820: 000000008fffc000 - 0000000090000000 (ACPI data)
     BIOS-e820: 00000000fec00000 - 00000000fec10000 (reserved)
     BIOS-e820: 00000000fee00000 - 00000000fee10000 (reserved)
     BIOS-e820: 00000000ffc00000 - 0000000100000000 (reserved)
    Scanning bios EBDA for MXT signature
    1407MB HIGHMEM available.
    896MB LOWMEM available.
    found SMP MP-table at 000f4fd0
    hm, page 000f4000 reserved twice.
    hm, page 000f5000 reserved twice.
    hm, page 000f3000 reserved twice.
    hm, page 000f4000 reserved twice.
    On node 0 totalpages: 589820
    zone(0): 4096 pages.
    zone(1): 225280 pages.
    Intel MultiProcessor Specification v1.4
    Virtual Wire compatibility mode.
    OEM ID: COMPAQ Product ID: PROLIANT APIC at: 0xFEE00000
    Processor #3 Pentium(tm) Pro APIC version 16
    Processor #0 Pentium(tm) Pro APIC version 16
    I/O APIC #8 Version 17 at 0xFEC00000.
    I/O APIC #2 Version 17 at 0xFEC01000.
    Processors: 2
    Kernel command line: ro root=/dev/cciss/c0d0p9
    Initializing CPU#0
    Detected 1396.545 MHz processor.
    Console: colour VGA+ 80x25
    Calibrating delay loop... 2785.28 BogoMIPS
    Memory: 2308028k/2359280k available (1976k kernel code, 42672k reserved, 107k data, 244k init, 1441776k highmem)

I think we are seeing the same thing with a Proliant DL760R PIII900 with 8 processors and 8GB RAM. We had kswapd problems under RH6.2, which were resolved by upgrading the kernel from 2.4.2 to 2.4.16 (aka "the Google problem"). But it hung again when we upgraded from 4cpu/4GB RAM to this machine, so now we are testing Advanced Server. We are hoping to put this machine in production next week. Is there a way to confirm this is the issue? Is there any monitoring we should be doing to help diagnose it?

Any update on resolution?

Created attachment 89454 [details]
System profiling
Here is some profiling which was done while experiencing the slowdowns, as
requested.
We are getting ready to release an AS2.1 errata kernel to fix the kswapd issue you are seeing. You can try out a beta release of that errata kernel if you like; it can be downloaded from: http://people.redhat.com/lwoodman/.private/

Larry Woodman

After applying the 2.4.9-e.12summit kernel errata the problem still appears to be occurring.

Hello, I have been testing 2.4.9-e.12summit on an x440. I think I have found the root of the problem: kswapd can't ever reclaim slabcache memory. After running some I/O heavy scripts, the memory state of the machine can be described as:

            total:       used:       free:  shared: buffers:  cached:
    Mem:  16880508928 7220150272 9660358656 0 978944 6545244160
    Swap: 2146754560 0 2146754560
    MemTotal: 16484872 kB
    MemFree: 9433944 kB
    MemShared: 0 kB
    Buffers: 956 kB
    Cached: 6391840 kB
    SwapCached: 0 kB
    Active: 3731408 kB
    Inact_dirty: 444 kB
    Inact_clean: 2660808 kB
    Inact_target: 4121192 kB
    HighTotal: 15859160 kB
    HighFree: 9432412 kB
    LowTotal: 625712 kB
    LowFree: 1532 kB
    SwapTotal: 2096440 kB
    SwapFree: 2096440 kB
    BigPagesFree: 0 kB

    slabinfo - version: 1.1 (SMP)
    ....
    mnt_cache 13 30 128 1 1 1 : 252 126
    inode_cache 698012 698145 512 99735 99735 1 : 124 62
    dentry_cache 512 1860 128 62 62 1 : 252 126
    dquot 0 0 128 0 0 1 : 252 126
    filp 929 1110 128 37 37 1 : 252 126
    names_cache 12 12 4096 12 12 1 : 60 30
    buffer_head 953133 1265100 128 42170 42170 1 : 252 126
    mm_struct 186 195 256 13 13 1 : 252 126

    cat /proc/sys/vm/pagecache
    2 50 70

This snapshot was taken about an hour after the scripts had finished. Checks on top show kswapd using lots of CPU time trying to reclaim as much lowmem as it can. In this state the box can easily hang under a strong attack on lowmem usage. If the rest of lowmem is needed, kswapd will go into a loop and stay there (at least 10-20 min), which in my book is hung. There needs to be a way for the slabcache pages to be reclaimed.

Thanks, Keith

Keith, can you get us an "AltSysrq m" output and a full /proc/slabinfo output when the machine is in this state and attach it to this bug? FYI, we did add code to AS2.1 to cause some of the kernel data structures to be reclaimed from the slab when there is a lowmem shortage. Perhaps we need to be more aggressive for 16GB systems though.

Thanks, Larry Woodman

Created attachment 90320 [details]
meminfo slabinfo and top
This is output from meminfo, slabinfo and top when the kernel is
in a low-lowmem state. (Kernel is 2.4.9-e.12summit)
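
The slabinfo rows quoted earlier in the thread can be turned into a rough lowmem figure. The sketch below is an approximation only, under stated assumptions: the slabinfo 1.1 (SMP) column order is taken to be name, active objects, total objects, object size, active slabs, total slabs, pages per slab, and the page size is taken to be 4096 bytes on i686. The helper name `parse_slabinfo_v11` is hypothetical.

```python
PAGE_SIZE = 4096  # bytes; assumed i686 page size

# Rows copied from the slabinfo snapshot in this report.
SAMPLE = """\
slabinfo - version: 1.1 (SMP)
inode_cache 698012 698145 512 99735 99735 1 : 124 62
dentry_cache 512 1860 128 62 62 1 : 252 126
buffer_head 953133 1265100 128 42170 42170 1 : 252 126
"""

LOW_TOTAL_KB = 625712  # LowTotal from the meminfo snapshot in this report


def parse_slabinfo_v11(text):
    """Return {cache_name: bytes_held} for slabinfo 1.1 (SMP) rows.

    Assumed columns before the colon: name, active objects, total objects,
    object size, active slabs, total slabs, pages per slab.
    """
    usage = {}
    for line in text.splitlines():
        fields = line.split(":")[0].split()
        # Skip the version header and anything that is not a 7-column data row.
        if len(fields) != 7 or not fields[1].isdigit():
            continue
        total_slabs = int(fields[5])
        pages_per_slab = int(fields[6])
        usage[fields[0]] = total_slabs * pages_per_slab * PAGE_SIZE
    return usage


usage = parse_slabinfo_v11(SAMPLE)
held = usage["inode_cache"] + usage["buffer_head"]
fraction = held / (LOW_TOTAL_KB * 1024)
print(f"inode_cache + buffer_head: {held / 2**20:.0f} MiB "
      f"({fraction:.0%} of LowTotal)")
```

Under these assumptions the two caches hold roughly 550 MiB, about nine tenths of the reported LowTotal, which is in line with the ~500MB figure Larry Woodman gives later in this report for inodes and buffer headers.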
We have a 2x1400 MHz P-III DL380 w/4 GB running 2.4.9-e.12enterprise. I think we might have the same problem: this machine becomes unusable during 1) NFS copies of big files or 2) dd'ing of big files (e.g., dd if=/dev/zero of=/opt/2gB-testfile bs=1024 count=2M). You can't log in to the machine on the console; commands like "ls" and "free" take minutes to return. If we boot into a non-RHAS 2.4.20 kernel, everything works as it should. Boot back into 2.4.9-e.12 (or any of the older ones), and the problem is back.

We are also seeing this on a 2x2.8GHz/6GB hyperthreaded DL380 G3 running 2.4.9-e.12enterprise. Larry, I notice that you have uploaded a -14 build to your website. Is this worth trying or is it still a work in progress?

Jon, please try the kernel in http://people.redhat.com/lwoodman/.private/ It specifically flushes inodes and buffer headers so they can be returned to the slab cache, and then the slab cache pages can be reclaimed. This should free up lowmem. Please let me know how this works.

Larry Woodman

Yep, I am having the same problem. It only happens on 440s and 360s. The kernel I am using is 2.4.9-e.12enterprise. I find that it is easy to reproduce. I take a machine, say with 2GB memory, and look at top. It shows 2GB installed, 1.4GB free, 100MB buffer and 300MB cached. Then I run: scp -pr root@localhost:/usr /other/area. For about 20 minutes the copy streams fine, then it crawls to a halt. A du in /other/area shows approx. 1.4GB copied over, and all activity is now affected; a ps takes 14+ seconds. A top shows that the 1.4GB of free memory has moved to cache (not buffer) and that we now have only approx. 5MB free, which is not enough to run smoothly. kswapd is busy but not flat out. I have noticed that bdflush does not kick into life that often and that, even though I only have 6 -> 4 MB free, swap is at 0% used. I have tried this on MDK 9 and RH 8.0; both are OK.

Regards, Paul

Hi Guys, I've been testing 2.4.9-e.14.1enterprise and it looks like the problem is fixed. Although free memory drops down to 5 MB at times, it often jumps back to 15 / 20 MB, then after about 20 mins free jumps to 80 MB.

Thanks. Paul

Hello all, I have tested 2.4.9-e.14.1summit on a 16 GB x440 box and the problem seems to be fixed. It is way better than before; I have been unable to hang the box due to kswapd issues (unlike e.12). Thanks for looking into this.

Keith

The new enterprise kernel is working well for us too. Nice job, thanks!

Jon

I've noticed that a new kernel (2.4.9-e.16) has been released, and I've been asked whether this kernel contains the fixes that were applied to 2.4.9-e.14.1. Can you confirm that the new release also contains this fix?

Thanks, Oivind

We're still seeing this, albeit nowhere near as often as before. One of our DB servers (running Sybase on raw partitions) is getting this about once a day, usually when we are backing up a large DB to the filesystem (with several stripes). Is there something I can tune under /proc/sys/vm that will alleviate the problem? I don't mind giving up some buffer/cache space if it means that kswapd can recover more gracefully when under very heavy pressure.

We have found the problem is still occurring with the e.16summit kernel. However, it seems to occur a bit less often now than with the earlier kernels. I have increased the values in /proc/sys/vm/freepages to attempt to give the system a little more room for manoeuvre.

I might have some useful info to add to this bug. My setup is a ~500 GB reiserfs volume used for a cyrus mailstore on a dual P3 1.26 GHz box with 2 GB RAM. With 2.4.9-e.3 I could measure deadlocks caused by kswapd in tens of minutes, so the box was useless as a server. kswapd used more than 1300 minutes of CPU time in about two days of uptime. The box never touched the swap partition at all. Upgrading to 2.4.9-e.16 makes a noticeable difference. kswapd still kicks in occasionally, but it has only spent ~50 minutes of CPU time in two and a half days of uptime.
What bothers me is that when it starts its job, the box still slows down to the point of being unusable, but fortunately this time is now measured in seconds. FYI: for this specific workload (cyrus mailstore), I found that -aa kernels give much better performance and also a much better 'feeling' (responsiveness) of the server. I'll try to do some tests when I finish migrating users from the old system to this new one.

Our 4 servers with RH-AS2.1 also randomly hang and/or slow down while kswapd eats CPU. At first, kernel-2.4.9-e3 hung very often, once or twice a day. After we upgraded the kernels to 2.4.9-e16, we get slowdowns twice a week. The following is our 'top' output during the latest slowdown. At that time, only 3 cron batch jobs were running. If you need more information, please let us know what we should show. And please raise the priority of this bug to HIGH.

Best Regards, Hisaaki Shibata

    ===== 2003/05/20 23:59:19 =====
    11:59pm up 5 days, 14:49, 0 users, load average: 0.08, 0.06, 0.01
    166 processes: 164 sleeping, 2 running, 0 zombie, 0 stopped
    CPU0 states: 0.0% user, 0.0% system, 0.0% nice, 100.0% idle
    CPU1 states: 0.0% user, 0.0% system, 0.0% nice, 100.0% idle
    CPU2 states: 0.0% user, 0.0% system, 0.0% nice, 100.0% idle
    CPU3 states: 0.1% user, 0.0% system, 0.0% nice, 99.0% idle
    Mem: 2058832K av, 2044540K used, 14292K free, 652K shrd, 219500K buff
    Swap: 4192944K av, 76K used, 4192868K free 1653216K cached
    PID USER PRI NI SIZE RSS SHARE STAT %CPU %MEM TIME COMMAND
    [snip]
    10 root 15 0 0 0 0 SW 0.0 0.0 2:15 kswapd

    ===== 2003/05/21 00:10:52 =====
    12:10am up 5 days, 15:00, 0 users, load average: 73.86, 67.02, 36.97
    191 processes: 189 sleeping, 2 running, 0 zombie, 0 stopped
    CPU0 states: 0.0% user, 0.1% system, 0.0% nice, 99.0% idle
    CPU1 states: 0.1% user, 0.1% system, 0.0% nice, 98.1% idle
    CPU2 states: 0.0% user, 0.0% system, 0.0% nice, 100.0% idle
    CPU3 states: 16.0% user, 3.0% system, 0.0% nice, 80.0% idle
    Mem: 2058832K av, 2045536K used, 13296K free, 652K shrd, 225480K buff
    Swap: 4192944K av, 76K used, 4192868K free 1644660K cached
    PID USER PRI NI SIZE RSS SHARE STAT %CPU %MEM TIME COMMAND
    [snip]
    10 root 15 0 0 0 0 SW 0.0 0.0 12:17 kswapd

    ===== 2003/05/21 00:12:53 =====
    12:12am up 5 days, 15:02, 0 users, load average: 12.72, 45.89, 32.86
    180 processes: 177 sleeping, 3 running, 0 zombie, 0 stopped
    CPU0 states: 0.0% user, 0.1% system, 0.0% nice, 99.0% idle
    CPU1 states: 90.0% user, 9.0% system, 0.0% nice, 0.0% idle
    CPU2 states: 0.1% user, 0.1% system, 0.0% nice, 98.0% idle
    CPU3 states: 0.0% user, 0.0% system, 0.0% nice, 100.0% idle
    Mem: 2058832K av, 2053496K used, 5336K free, 652K shrd, 240600K buff
    Swap: 4192944K av, 76K used, 4192868K free 1635976K cached
    PID USER PRI NI SIZE RSS SHARE STAT %CPU %MEM TIME COMMAND
    [snip]
    10 root 15 0 0 0 0 SW 0.0 0.0 12:17 kswapd

Hi, I tried the memtest.sh script from http://people.redhat.com/dledford/memtest.html The system crashes or hangs after 20 minutes of running the test, and kswapd is at 99.9% CPU load. It seems to be the same problem as described above. I tried this with kernel-2.4.9-e.12, e.16 and also e.20. The behavior is slightly different, but the symptoms are the same. Any idea what else I can try?

System info:
IBM X440
4 x 2.4 GHz Xeon
8 GB RAM
ext. FAStT700 with QLogic QLA 2312

Thanks in advance, CHristoPh

I extracted the kernel source from the Redhat 9 kernel (rpm2cpio) and have been running that on my AS systems. It has been going for about a week with no major problems, and seems to do a good job of keeping memory free (with the kscand threads). I believe this kernel (like -ac) uses the rmap VM. I have given up on the AS kernel until Redhat puts this bug at a higher priority and fixes it for good. There are obviously still some major problems in the AS kernel VM. We have been dealing with this bug since December and have not seen any progress.
It has left a very bad taste in my manager's mouth, and he will question the use of AS or any Linux for future large-scale projects. Sorry to get political in a bug report, but that is the reality. If any of you are still having problems, try the RH9 kernel.

People are updating this bug and I'm on the CC list, so please remember that Bugzilla is not a support channel. If you want focused help on your problems, contact your local Red Hat support. BTW, I believe that the 2.4.9-e.24 kernel has fixed these issues. See https://rhn.redhat.com/network/errata/details/index.pxt?eid=1698 for details.

Cheers

Updated to kernel 2.4.9-e.24, still no change. After running the memtest.sh script for 45 minutes, the server hangs!

    12:40pm up 1:42, 3 users, load average: 3,79, 11,97, 17,68
    113 processes: 110 sleeping, 3 running, 0 zombie, 0 stopped
    CPU0 states: 1,5% user, 61,2% system, 0,5% nice, 36,4% idle
    CPU1 states: 0,2% user, 49,0% system, 0,2% nice, 50,1% idle
    CPU2 states: 2,3% user, 59,0% system, 0,0% nice, 38,0% idle
    CPU3 states: 0,1% user, 84,4% system, 0,3% nice, 14,2% idle
    Mem: 8240660K av, 4670672K used, 3569988K free, 0K shrd, 1288K buff
    Swap: 2096220K av, 0K used, 2096220K free 3837888K cached
    PID USER PRI NI SIZE RSS SHARE STAT %CPU %MEM TIME COMMAND
    10 root 25 0 0 0 0 RW 99,9 0,0 35:42 kswapd
    2189 root 39 0 1112 1112 824 R 13,7 0,0 14:47 top

Any suggestions?

CHristoPh Pirchl

Hi everyone, when I run the test script with 4 GB RAM (instead of 8 GB RAM), everything works ;-)) Does anyone know whether the summit kernel is compiled with HIGH MEMORY SUPPORT=64GB or with HIGH MEMORY SUPPORT=4GB?

Thanks in advance, CHristoPh

This is just FYI: We're running 2.4.9-e.16, which seems to be better than the really old kernels as far as memory management goes, but we're now running into more problems with general "sluggishness" on e.16 boxes, which matches some symptoms above. Per RH's advice (see bug #85231), we adjusted these parameters:

    echo 4000 > /proc/sys/vm/high_io_sectors
    echo 2000 > /proc/sys/vm/low_io_sectors

but when we crank up oracle, we see periodic sluggishness on the console and poor oracle performance compared to our current production kernel (2.4.17-rmap12f). We're in the same boat as Ray DeJean above--we can't deploy RHAS to our db farm until this problem is solved, and we _have_ to run RHAS because oracle won't support any other Linux except SUSE-enterprise, which we may start testing soon. We haven't tried e.24 yet, though, and it looks like there's a lot of VM work in the errata notes on rhn.

What's going on here is that the system is out of lowmem. The system is managing 16GB, which requires ~300MB of lowmem for the mem-map, and there is ~6GB of memory in the pagecache, which is consuming a further ~500MB of lowmem for the inodes and buffer headers, as can be seen in the slabinfo output. So kswapd can't reclaim the lowmem directly. I did add additional memory reclaiming logic to launder highmem pagecache memory when lowmem is consumed by inodes and buffer headers. This will free the inodes and buffer headers so the slab memory can also be freed to reduce lowmem pressure. This logic is in the e.24 kernel but not e.16, so you should try that kernel out ASAP.

Larry Woodman

Doesn't this problem relate to bug #98333?

This bug is filed against RHEL2.1, which is in maintenance phase. During the maintenance phase, only security errata and select mission critical bug fixes will be released for enterprise products. Since this bug does not meet those criteria, it is now being closed. For more information on the RHEL errata support policy, please visit: http://www.redhat.com/security/updates/errata/ If you feel this bug is indeed mission critical, please contact your support representative. You may be asked to provide detailed information on how this bug is affecting you.