Bug 437202
| Field | Value |
|---|---|
| Summary | kswapd causing system disk to have 50% io wait |
| Product | Red Hat Enterprise Linux 5 |
| Component | kernel |
| Version | 5.1 |
| Hardware | All |
| OS | Linux |
| Status | CLOSED NOTABUG |
| Severity | high |
| Priority | low |
| Target Milestone | rc |
| Reporter | Mike Snitzer <snitzer> |
| Assignee | Peter Zijlstra <pzijlstr> |
| QA Contact | Red Hat Kernel QE team <kernel-qe> |
| CC | dmugtasimov, lwang, lwoodman, riel |
| Doc Type | Bug Fix |
| Last Closed | 2012-05-09 14:09:34 UTC |

Description
Mike Snitzer
2008-03-12 20:34:59 UTC
One important fact I forgot to mention: if any of the system services (e.g. ntpd, autofs) is stopped, kswapd backs off and the heavy IO on the system disk subsides. Note that even while the system is experiencing the kswapd load, it is not heavily loaded:

    load average: 1.80, 1.82, 1.90

Once a service (e.g. autofs, aka automount) is stopped, the load on the system returns to 0.

I suspect that one of the problems is that, when kswapd is started, almost no memory is freeable. This causes kswapd to free memory more and more aggressively, increasing its free targets. Under some circumstances (I have not figured out the problem yet, even though I see it once a week or so on my own system) it looks like kswapd (and other processes in the pageout code) will continue to free pages even after the system already has plenty of free pages. I am not quite sure how to fix this, since sometimes the VM actually needs to do this, e.g. to satisfy higher-order allocations.

I added the tunable /proc/sys/vm/pagecache to RHEL5-U1. This tunable (which defaults to 100) controls the percentage of memory that can be in the pagecache before the system starts reclaiming almost entirely from the pagecache. It works by having mark_page_accessed() place pagecache pages on the inactive list if the percentage of memory in the pagecache is over /proc/sys/vm/pagecache. This way, if you lower /proc/sys/vm/pagecache to 10, the inactive list consists almost entirely of pagecache pages, which are mostly clean thanks to pdflush and kupdate. The system therefore does not need to swap, especially if the majority of the page demand comes via the pagecache.

Another observation is that min_free_kbytes controls the zone min, low and high watermarks. If you increase min_free_kbytes, low becomes min*2 and high becomes min*3. This causes kswapd to be woken up earlier and to free more pages until it stops running.
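
To make the watermark arithmetic above concrete, here is a minimal Python sketch. It assumes the scaling exactly as stated in this comment (low = 2 × min, high = 3 × min) and treats min_free_kbytes as if it applied to a single zone. The real kernel distributes the value across zones and the exact multipliers depend on the kernel version, so the function names and the example values (4096 and 16384 kB) are hypothetical and for illustration only.

```python
def watermarks_kb(min_free_kbytes, low_factor=2, high_factor=3):
    """Return (min, low, high) watermarks in kB for one hypothetical zone.

    Assumption: low = low_factor * min and high = high_factor * min, the
    scaling described in the comment above. The per-zone split of
    min_free_kbytes is ignored for simplicity.
    """
    if min_free_kbytes < 0:
        raise ValueError("min_free_kbytes must be non-negative")
    wmark_min = min_free_kbytes
    return wmark_min, low_factor * wmark_min, high_factor * wmark_min


def describe(free_kb, marks):
    """Classify a free-memory level against the (min, low, high) watermarks."""
    wmark_min, wmark_low, wmark_high = marks
    if free_kb < wmark_min:
        return "below min: allocations may have to reclaim directly"
    if free_kb < wmark_low:
        return "below low: kswapd is woken"
    if free_kb < wmark_high:
        return "between low and high: a running kswapd keeps reclaiming"
    return "at or above high: kswapd can go back to sleep"


# Hypothetical settings: raising min_free_kbytes raises all three marks,
# so kswapd is woken while more memory is still free.
for min_kb in (4096, 16384):
    marks = watermarks_kb(min_kb)
    print(min_kb, marks, describe(20000, marks))
```

With 20000 kB free, the smaller setting leaves kswapd asleep, while the larger one places the same level between min and low, so kswapd would already have been woken.
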
If this does not work, we probably need to change the scaling of low and high so they are much higher than 2 and 3 times min respectively. This way the allocator will wake up kswapd well before it drives the free list down below min, and kswapd has a chance of keeping up with the memory demand.

Larry Woodman

Oh, and one final comment: if you move the swap partition to a different device, kswapd's IO will not stall other processes' IO requests.

Mike, can you please provide some feedback on how your systems behave after applying the suggested tuning parameter settings?

Sorry for not getting back to you sooner (was away/busy). It turns out that the users of the systems that were having a problem resolved the issue simply by adding more physical memory (going from 1G to 2G). Testing with the suggested tunings will be tough, seeing as the systems in question aren't under my control (or available to me). Has anyone who is looking at this issue reproduced this behavior (Rik seemed to say he had seen it periodically), or are you relying purely on me to make further progress?

I also have the same problem with Ubuntu on a Lenovo T520. I have 8 GB of RAM. It is definitely not a lack of RAM; it is a bug in kswapd.

    $ uname -a
    Linux dmugtasimov 3.2.0-23-generic #36-Ubuntu SMP Tue Apr 10 20:39:51 UTC 2012 x86_64 x86_64 x86_64 GNU/Linux
    $ cat /etc/issue
    Linux Mint 13 Maya \n \l
    $ free -m
                 total       used       free     shared    buffers     cached
    Mem:          7939       2301       5638          0         67       1051
    -/+ buffers/cache:       1182       6757
    Swap:        15999          0      15999
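
As a rough aid to reading the `free -m` output above, the following Python sketch recomputes the `-/+ buffers/cache` line from the `Mem:` line. The function name is hypothetical. The results differ from the printed line by 1 MB because `free` converts each column from kB to MB separately.

```python
def buffers_cache_line(used_mb, free_mb, buffers_mb, cached_mb):
    """Return (used, free) with buffers and page cache counted as available."""
    used_by_apps = used_mb - buffers_mb - cached_mb
    available = free_mb + buffers_mb + cached_mb
    return used_by_apps, available


# Values copied from the "Mem:" line of the report above.
used, free = buffers_cache_line(used_mb=2301, free_mb=5638,
                                buffers_mb=67, cached_mb=1051)
print(used, free)  # 1183 6756 (free itself printed 1182 and 6757)
```

By this reading, applications hold a little under 1.2 GB of the 7939 MB total and no swap is in use.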