Red Hat Bugzilla – Bug 293641
kswapd0 hangs the system
Last modified: 2012-06-20 12:12:37 EDT
Description of problem:
I am running RHEL AS 4.0 Update 4 on an HP DL580 with 16GB of memory and 4 Xeon
processors. About once a month, the kswapd0 process takes all the CPU
resources on the machine and brings it to a halt.
Actually, there are three such servers. They are part of our 3-node
Oracle RAC cluster and run our production databases. The kswapd process runs on a
server and takes all the CPU resources, and other processes hang. Since the
node is part of the Oracle cluster, it is evicted from the cluster when
it is unable to communicate.
I have seen this problem on all three nodes at different times.
It has happened several times, each time after the server had been
running for about 1 month.
Version-Release number of selected component (if applicable):
[email@example.com] cat /etc/redhat-release
Red Hat Enterprise Linux AS release 4 (Nahant Update 4)
[firstname.lastname@example.org] uname -a
Linux racdbp2.dcb.ichotels.com 2.6.9-42.0.8.ELsmp #1 SMP Tue Jan 23 12:49:51 EST 2007 x86_64 x86_64 x86_64 GNU/Linux
It has happened several times on all three servers, each time
after the server had been up for about 1 month.
Steps to Reproduce:
I will upload a few files with the data collected before the server rebooted.
Created attachment 197701 [details]
mpstat data collected on the server just before it rebooted
Created attachment 197721 [details]
top data collected on the server just before it rebooted
Created attachment 197731 [details]
vmstat data collected on the server before it rebooted.
The server has 16GB memory and 18GB of swap.
             total       used       free     shared    buffers     cached
Mem:      16417680   15057028    1360652          0     246984    8021748
-/+ buffers/cache:    6788296    9629384
Swap:     18876364          0   18876364
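For readers checking the numbers: the "-/+ buffers/cache" row is derived from the "Mem:" row by treating buffers and page cache as reclaimable. The sketch below redoes that arithmetic with the values from the output above; the function name buffers_cache_row is hypothetical, and the values are in kB as free prints them by default.

```shell
#!/bin/sh
# Recompute the "-/+ buffers/cache" row from the "Mem:" row of the
# older procps `free` layout shown in this report (values in kB).
# Arguments: used free buffers cached
buffers_cache_row() {
    used=$1; free=$2; buffers=$3; cached=$4
    app_used=$((used - buffers - cached))   # memory held by applications
    app_free=$((free + buffers + cached))   # free plus reclaimable cache
    echo "$app_used $app_free"
}

# Values taken from the "Mem:" row in this report.
buffers_cache_row 15057028 1360652 246984 8021748
# prints: 6788296 9629384
```

So roughly 6.8GB is in use by applications and about 8GB of the "used" figure is page cache, which matches the row printed by free.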
I am working on this issue. Can you get me AltSysrq-W and AltSysrq-M outputs
when this happens, just to make sure it's the same thing I think it is?
Thanks, Larry Woodman
The problem is that this happens only about once a month. The most recent server
reboot was yesterday, after an uptime of 42 days. If the system hangs for a few
minutes, it is rebooted by the Oracle cluster software. It is hard to tell
when the system will hang.
I was running Oracle's data collection utility, called oswatcher. It basically
collects data about the system every 30 seconds and stores it in files. I can
upload those files for you.
I am sorry I cannot get you AltSysrq-W and AltSysrq-M data.
If you want to talk to me please feel free to call me at 770-604-5606 or
419-290-9988. I really appreciate your help.
Created attachment 198811 [details]
This is the typical data that is collected by Oracle oswatcher utility.
The oswatcher data is rolled-over every 48 hours, so if you need some old data
I can upload it.
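A rough sketch of how the saved top samples could be scanned for the intervals where kswapd was busy. It assumes the default procps top column order (PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND); the function name busy_kswapd_lines and the file name in the usage line are hypothetical, and the oswatcher file layout may differ from plain top batch output.

```shell
#!/bin/sh
# Print kswapd lines from saved `top` batch output whose %CPU is at or
# above a limit. Assumes the default procps column order, where %CPU is
# field 9 and COMMAND is field 12 (an assumption, not taken from the
# attached data).
busy_kswapd_lines() {
    limit=$1
    awk -v limit="$limit" \
        '$12 ~ /^kswapd/ && ($9 + 0) >= (limit + 0) { print }'
}

# Usage (file name is hypothetical):
#   busy_kswapd_lines 90 < top_samples.dat
```

Header lines and other processes fall through the filter, so only the kswapd samples at or above the chosen percentage are printed.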
Good Morning Larry,
Please update. Thanks.
Is anyone working on this? I have been asking for an update on this for the last
several days with no response from your end. This is unprofessional. Thanks.
Sorry for the delay. There is no way to get AltSysrq-M or AltSysrq-W data?
This is what I usually need to debug a hung system. Also, can you tell me if
the system is running with memory interleaving enabled or if it is a NUMA system?
1.) echo 1 > /proc/sys/kernel/sysrq
2.) echo m > /proc/sysrq-trigger
3.) Run dmesg and attach the output.
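The steps above can be wrapped in a small script so they are ready to run as root the moment the problem appears. This is only a sketch: the function name capture_sysrq and the default output path are hypothetical, and it adds the "w" trigger (the AltSysrq-W equivalent requested earlier in this bug) alongside "m".

```shell
#!/bin/sh
# Sketch: enable SysRq, trigger the memory (m) and blocked-task (w)
# dumps, then save the kernel log. Must be run as root.
# Arguments (all optional): output file, trigger path, enable path.
capture_sysrq() {
    out=${1:-/tmp/sysrq-$(date +%Y%m%d-%H%M%S).log}   # hypothetical default
    trigger=${2:-/proc/sysrq-trigger}
    enable=${3:-/proc/sys/kernel/sysrq}
    echo 1 > "$enable" || return 1        # step 1: enable SysRq
    for key in m w; do                    # step 2: request each dump
        echo "$key" > "$trigger" || return 1
    done
    dmesg > "$out" || return 1            # step 3: save the kernel log
    echo "$out"
}

# Usage: capture_sysrq /tmp/sysrq-hang.log
```

If the machine is already too far gone to schedule a shell, this will not help; it is only useful while the system still responds.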
Thanks, Larry Woodman
Created attachment 205521 [details]
> There is no way to get AltSysrq-M or AltSysrq-W data?
No. We don't know when the system will be hung next time. It will be rebooted
automagically by Oracle within few minutes once it is hung.
> Also, can you tell me if the system is running with memory interleaving enabled
> or if it is a NUMA system.
I will ask the system admins at the data center and let you know as soon as
I have attached the dmesg output.
This is not a NUMA system. This is a DL580 G3, and I do not think NUMA is an option.
[root@racdbp1]~# numactl --show
preferred node: 0
[root@racdbp1]~# numactl --hardware
available: 1 nodes (0-0)
node 0 size: 16895 MB
node 0 free: 46 MB
[root@racdbp1]~# dmesg |grep -i numa
No NUMA configuration found
[root@racdbp1]~# dmesg |grep command
Bootdata ok (command line is ro root=LABEL=/ apm=off nousb apm=off iommu=off)
Kernel command line: ro root=LABEL=/ apm=off nousb apm=off iommu=off console=tty0
Not being a NUMA system is even more of a reason I need to see AltSysrq-M and
AltSysrq-W outputs when the system is hung. Having said that, I did make changes
to RHEL4-U6 to prevent the system from getting into this state. Can you try the
latest RHEL4-U6 beta kernel? If yes, you also need to set the new tunable
parameter /proc/sys/vm/pagecache to 10.
If I give you a kernel that will print the AltSysrq-M and AltSysrq-W output to
the console when you ping the system, can you run it? I'd REALLY like to see
exactly what the system is doing when this hang occurs so I can verify it's the
same problem I fixed in RHEL4-U6. Also, are you considering running the latest
Your service Sucks!! We have decided to move to Oracle Linux. Thanks for response.
Did you get a chance to try the latest RHEL4-U6 kernel? I made a change to
prevent the system from getting hung in this state. You need to install
RHEL4-U6 and then set /proc/sys/vm/pagecache to 10%. This will prevent kswapd
and all other callers of try_to_free_pages() from getting stuck on the
zone->lru_lock. We have verified that this prevents the hang you are seeing
when memory becomes exhausted on x86_64 systems with lots of CPUs and lots of RAM.
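A minimal sketch of applying the setting, assuming a RHEL4-U6 kernel where the tunable exists. The path and the value 10 come from this comment; the function name set_pagecache, the existence check, and the error message are illustrative additions.

```shell
#!/bin/sh
# Sketch: write a value to the pagecache tunable and read it back.
# Arguments (optional): value, tunable path. Must be run as root.
set_pagecache() {
    value=${1:-10}
    tunable=${2:-/proc/sys/vm/pagecache}
    if [ ! -w "$tunable" ]; then
        # Older kernels do not have this tunable at all.
        echo "tunable $tunable not present or not writable" >&2
        return 1
    fi
    echo "$value" > "$tunable" && cat "$tunable"
}

# Usage: set_pagecache 10
```

A write to /proc does not survive a reboot. By the usual sysctl naming, the matching /etc/sysctl.conf entry would presumably be vm.pagecache = 10, but that is an assumption and should be checked against the RHEL4-U6 documentation.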
I'm having issues with kswapd on RHEL 4 U6. I've tried kernels 2.6.9-67.0.20,
2.6.9-67.0.15 and 2.6.9-67.0.4, both with and without smp.
It seems to be a memory leak. I'm going to check RHEL 5, but I can't move to that at
this particular juncture.
Any help would be greatly appreciated.
Kathy, can you provide us with whatever data you have on this kswapd issue?
Is the system hanging? If so, can you get AltSysrq-M output so I can see the
exact memory state? Also, you seem to think the system is leaking memory; can
you provide me with whatever data or evidence you have of this? Finally, if you
can send me some sort of reproducer program that I can run on my system, that
would make debugging this problem much faster than going back and forth.
Thanks, Larry Woodman
I have consistently reproduced the issue with the installation of the Intel
compilers from: http://intel.com/cd/software/products/asmo-na/eng/219771.htm
2.6.9-67.0.4.ELsmp just never seems to get past the testing mode of the above
With the other kernels, I watched top and saw swap usage climb to 2G on a system
with 2G of actual RAM, a Dell GX280. I imagine I can reproduce it on my
Dell GX 755 as well.
Need to do some work to get firefox and g++ on my RHEL 5 test system and can let
you know the results there.
I've also contacted Intel Product Support about this, but no response yet.
Thanks for your prompt reply.
I just tried on a freshly installed RHEL 4 U6 Dell Optiplex 755 and it seems to
3G RAM in this baby, but it hardly ever seems to touch it...
certainly it never swaps.
Kernel is 2.6.9-67.0.20.ELsmp
Thank you for submitting this issue for consideration in Red Hat Enterprise Linux. The release for which you requested us to review is now End of Life.
Please see https://access.redhat.com/support/policy/updates/errata/
If you would like Red Hat to re-consider your feature request for an active release, please re-open the request via appropriate support channels and provide additional supporting details about the importance of this issue.