Since changing from a box running squid on Red Hat 7.1 to a new box running squid on RHEL 3.0, web access through our squid proxy server has become very significantly slower, and load average has increased. I can't see why this is. I attach output from top and vmstat:

top:

 12:06:04  up 5 days, 16:29,  2 users,  load average: 1.04, 1.01, 0.94
59 processes: 58 sleeping, 1 running, 0 zombie, 0 stopped
CPU states:  cpu    user    nice  system    irq  softirq  iowait    idle
           total   11.3%    0.0%    0.9%   0.0%     0.0%   37.6%   50.0%
           cpu00    9.9%    0.0%    0.9%   0.0%     0.0%   10.8%   78.2%
           cpu01   12.8%    0.0%    0.9%   0.0%     0.0%   64.3%   21.7%
Mem:  1028484k av, 1012088k used,   16396k free,       0k shrd,  272336k buff
                    785112k actv,  187964k in_d,    5000k in_c
Swap: 2096472k av,  269308k used, 1827164k free                  589580k cached

  PID USER     PRI  NI  SIZE  RSS SHARE STAT %CPU %MEM   TIME CPU COMMAND
22153 squid     25   0  202M 103M   800 D    12.3 10.3 108:17   0 squid
23879 root      19   0  1180 1180   892 R     0.4  0.1   0:00   0 top
    1 root      15   0   488  456   436 S     0.0  0.0   0:09   0 init

vmstat:

procs                      memory      swap          io     system         cpu
 r  b   swpd   free   buff  cache   si   so    bi    bo   in    cs us sy id wa
 1  2 270220  16296 272624 589028   23   14    15    28   29    32  4  1  2  5
We suspect ext3 is causing the slowdown, with squid waiting for data to be written to disk. We've switched to ext2 for the cache partition and will see if that improves things.
Switching to ext2 fixed the problem. Closing.
Switching to ext2 only delayed the problem; it is now back. Since RHEL 3 has NPTL, should aufs or ufs be used? Does squid work well with NPTL?
Created attachment 98822: vmstat output
Created attachment 98823: top output
This seems to be a kernel bug. The machine is needlessly thrashing the swap partition, despite free RAM.
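One way to check for swap activity is to look at the si (swapped in from disk) and so (swapped out to disk) columns of vmstat. The following is a minimal Python sketch, not something used in this report: it parses the single vmstat sample posted in the original description, and the helper names and the zero threshold are assumptions made for illustration. Note that the first line vmstat prints reports averages since boot, so a run with an interval (for example "vmstat 5") gives a better picture of current behaviour.

```python
# Column order taken from the vmstat header posted in the original report.
VMSTAT_COLUMNS = "r b swpd free buff cache si so bi bo in cs us sy id wa".split()


def parse_vmstat_line(line):
    """Turn one vmstat data line into a dict keyed by column name."""
    values = [int(v) for v in line.split()]
    if len(values) != len(VMSTAT_COLUMNS):
        raise ValueError(
            f"expected {len(VMSTAT_COLUMNS)} fields, got {len(values)}"
        )
    return dict(zip(VMSTAT_COLUMNS, values))


def is_swapping(sample, threshold=0):
    """Report whether swap-in or swap-out exceeds the (assumed) threshold."""
    return sample["si"] > threshold or sample["so"] > threshold


# The data line from the report.
sample = parse_vmstat_line("1 2 270220 16296 272624 589028 23 14 15 28 29 32 4 1 2 5")
print("si:", sample["si"], "so:", sample["so"], "swapping:", is_swapping(sample))
```

Non-zero si and so while buff and cache are large is the pattern described here: pages going to and from swap even though memory is mostly holding buffers and page cache.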
To confirm that the high iowait was due to the kernel needlessly swapping, the machine was rebooted with zero swap space. iowait is low, squid is running happily, and the internet access is back to Red Hat 7.1 speeds.
kswapd hogged the computer for five minutes today. ~45% cpu for the whole time. The machine was unusable as a web proxy.
Is there any progress on this? There are now nine comments here, no comments from Red Hat.
Weeks after reporting this bug, I now have a box that will serve squid requests most of the time, but with kswapd going mad despite no swap space the rest of the time. The lack of swap space means that grep has become slow - the kernel isn't caching disk access. Despite being encouraged to submit bug reports, Red Hat *still* don't even appear to be doing anything at all, and I'm seriously considering switching to SuSE.
I think you got the wrong impression about what bugzilla is. Bugzilla is *not* support. Bugzilla has no SLA. If your production server has a problem you really should contact RH support.
Reassigning to Larry. -ernie
Please try to reproduce this problem with the RHEL3 Update 2 kernel and let me know how it goes. We did make changes that reduce swap aggression in U2. Larry Woodman
We made our own independent change that involved echoing a value to the proc virtual fs. It seems to have worked. We reduced the value from 100 to 30. I'll post what we did once I've confirmed it'll stay working.
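For reference, "echoing a value to the proc virtual fs" amounts to writing a number into a file under /proc. The comment above does not name the file that was changed, so the Python sketch below does not either: write_proc_value is a hypothetical helper, and the demo runs against a temporary file standing in for the real tunable. Only the values 100 and 30 come from this report. Writing to a real /proc file needs root, and the change does not survive a reboot.

```python
import os
import tempfile


def write_proc_value(path, value):
    """Write an integer to a tunable file, like `echo value > path`.

    Returns the previous contents so the change can be reverted.
    `path` is supplied by the caller; no specific /proc file is assumed.
    """
    with open(path) as f:
        previous = f.read().strip()
    with open(path, "w") as f:
        f.write(f"{value}\n")
    return previous


# Demo against a temporary file that stands in for the real tunable.
with tempfile.NamedTemporaryFile("w", delete=False) as tmp:
    tmp.write("100\n")
    demo_path = tmp.name

previous = write_proc_value(demo_path, 30)
with open(demo_path) as f:
    current = f.read().strip()
os.remove(demo_path)

print(previous, "->", current)
```

Keeping the previous value makes it easy to put the setting back if the change does not help.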
It's still working. I'll close the bug - one question though: Is RHEL3 Update 2 identical to RHEL3 Update 1 + updates from up2date?
Yes, assuming you're subscribed to the RHEL3 beta channel. After RHEL3 U2 is officially released (expected sometime next week), then the answer is "yes" in any case.
An errata has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you. http://rhn.redhat.com/errata/RHSA-2004-188.html