Red Hat Bugzilla – Bug 118397
system needlessly thrashing swap partition
Last modified: 2007-11-30 17:07:00 EST
Since changing from a box running squid on Red Hat 7.1 to a new box
running squid on RHEL 3.0, web access through our squid proxy server
has become significantly slower, and the load average has increased.
I can't see why this is.
I attach output from top and vmstat:
12:06:04 up 5 days, 16:29, 2 users, load average: 1.04, 1.01, 0.94
59 processes: 58 sleeping, 1 running, 0 zombie, 0 stopped
CPU states:  cpu    user    nice  system    irq  softirq  iowait
           total   11.3%    0.0%    0.9%   0.0%     0.0%   37.6%
           cpu00    9.9%    0.0%    0.9%   0.0%     0.0%   10.8%
           cpu01   12.8%    0.0%    0.9%   0.0%     0.0%   64.3%
Mem: 1028484k av, 1012088k used, 16396k free, 0k shrd,
785112k actv, 187964k in_d, 5000k in_c
Swap: 2096472k av, 269308k used, 1827164k free
  PID USER     PRI  NI  SIZE  RSS SHARE STAT %CPU %MEM   TIME CPU COMMAND
22153 squid     25   0  202M 103M   800 D    12.3 10.3 108:17   0
23879 root      19   0  1180 1180   892 R     0.4  0.1   0:00   0 top
    1 root      15   0   488  456   436 S     0.0  0.0   0:09   0 init
procs          memory             swap    io
 r  b   swpd   free   buff  cache  si  so  bi  bo  in  cs  us  sy  id  wa
 1  2 270220  16296 272624 589028  23  14  15  28  29  32   4   1   2   5
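For anyone reading the vmstat sample above, the si and so columns (swap-in and swap-out) are the ones that show swap activity. The following is a minimal Python sketch, not part of the original report, that pairs each column name with its value and flags any non-zero swap traffic. The function names and the zero threshold are illustrative assumptions.

```python
# Header and row copied from the vmstat output in this report.
VMSTAT_HEADER = "r b swpd free buff cache si so bi bo in cs us sy id wa"
VMSTAT_ROW = "1 2 270220 16296 272624 589028 23 14 15 28 29 32 4 1 2 5"


def parse_vmstat(header, row):
    """Pair each vmstat column name with its integer value."""
    names = header.split()
    values = [int(field) for field in row.split()]
    if len(names) != len(values):
        raise ValueError("header and row have different column counts")
    return dict(zip(names, values))


def is_swapping(sample, threshold=0):
    """Return True if swap-in or swap-out exceeds the threshold.

    The default threshold of 0 is an assumption for illustration only.
    """
    return sample["si"] > threshold or sample["so"] > threshold


sample = parse_vmstat(VMSTAT_HEADER, VMSTAT_ROW)
print(sample["si"], sample["so"], is_swapping(sample))
```

A single sample like this says little on its own. Running vmstat with an interval and checking successive rows gives a better picture of sustained swapping.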
We suspect ext3 is causing the slowdown, with squid waiting for the
data to be written to disk.
We've switched to ext2 for the cache partition and will see if that helps.
Switching to ext2 fixed the problem.
Switching to ext2 delayed the problem reoccurring. Now the problem is
Since RHEL 3 has the NPTL, should aufs or ufs be used?
Does squid work well with NPTL?
Created attachment 98822
Created attachment 98823
This seems to be a kernel bug.
The swap partition is needlessly thrashing, despite free RAM.
To confirm that the high iowait was due to the kernel needlessly
swapping, the machine was rebooted with zero swap space.
iowait is low, squid is running happily, and the internet access is
back to Red Hat 7.1 speeds.
kswapd hogged the computer for five minutes today.
~45% cpu for the whole time.
The machine was unusable as a web proxy.
Is there any progress on this?
There are now nine comments here, no comments from Red Hat.
Weeks after reporting this bug, I now have a box that will serve
squid requests most of the time, but with kswapd going mad despite no
swap space the rest of the time.
The lack of swap space means that grep has become slow - the kernel
isn't caching disk access.
Despite being encouraged to submit bug reports, Red Hat *still* don't
even appear to be doing anything at all, and I'm seriously
considering switching to SuSE.
I think you got the wrong impression about what bugzilla is.
Bugzilla is *not* support. Bugzilla has no SLA.
If your production server has a problem you really should contact RH support.
Reassigning to Larry. -ernie
Please try to reproduce this problem with the RHEL3 Update 2 kernel
and let me know how it goes. We did make changes that reduce swap
aggression in U2.
We made our own independent change that involved echoing a value to
the proc virtual fs. It seems to have worked. We reduced the value
from 100 to 30.
I'll post what we did once I've confirmed it'll stay working.
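The comment above does not say which file under /proc was changed, so the following is only a hedged sketch of the general technique: read the current value of a tunable, write a new one, and keep the old value so the change can be reverted. The function name and the path argument are hypothetical placeholders, and the demonstration writes to a temporary file rather than a real kernel tunable.

```python
from pathlib import Path
import tempfile


def set_proc_value(path, new_value):
    """Write new_value to a /proc-style file and return the previous contents.

    `path` is a placeholder: the report does not name the tunable involved.
    """
    target = Path(path)
    previous = target.read_text().strip()
    target.write_text(f"{new_value}\n")
    return previous


if __name__ == "__main__":
    # A temporary file stands in for the unnamed tunable in this demo.
    with tempfile.TemporaryDirectory() as tmp:
        demo = Path(tmp) / "tunable"
        demo.write_text("100\n")
        old = set_proc_value(demo, 30)
        print(old, "->", demo.read_text().strip())
```

Writing to a real file under /proc/sys needs root privileges, and the change does not survive a reboot unless it is also applied at boot time.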
It's still working. I'll close the bug - one question though:
Is RHEL3 Update 2 identical to RHEL3 Update 1 + updates from up2date?
Yes, assuming you're subscribed to the RHEL3 beta channel. After
RHEL3 U2 is officially released (expected sometime next week),
then the answer is "yes" in any case.
An erratum has been issued which should help the problem described in this bug report.
This report is therefore being closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files, please follow the link below. You may reopen
this bug report if the solution does not work for you.