From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.6) Gecko/20040115
Description of problem:
I am seeing significant swapping on systems even though there is
plenty of free (although used by cache) memory. I have 6 dual Opteron
systems, 5 with 4GB of memory, and one with 8GB, all running the same
kernel, and all showing this behavior. A user will start a couple of
simulations, each taking 1GB of RAM and writing lots of data over NFS.
The system will then swap fairly heavily, having a very bad effect
on simulation performance. I've reproduced the behavior with both
Matlab and LS-DYNA (commercial FEM code), so it's not application-specific.
Version-Release number of selected component (if applicable):
Steps to Reproduce:
1. Fill available memory with cached data.
2. Start resource intensive application.
3. Weep as performance plummets.
Actual Results: System begins swapping.
Expected Results: The cached memory should be tossed, and filled in
with actual application data.
I'll attach a file with some system output.
Created attachment 98489 [details]
vmstat, free, and ps output
Can you get me a quick "AltSysrq M" output when the system is in this
state? You can do this by:
1.) login as root
2.) echo 1 > /proc/sys/kernel/sysrq
3.) echo m > /proc/sysrq-trigger
4.) attach the dmesg output to this bug
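The four steps above can be run as a single root shell sequence (a sketch; requires root and a kernel built with magic SysRq support):

```shell
# Enable the magic SysRq interface, trigger the memory report,
# and save the resulting kernel messages for attachment.
echo 1 > /proc/sys/kernel/sysrq      # allow sysrq triggers
echo m > /proc/sysrq-trigger         # dump memory info to the kernel log
dmesg > /tmp/altsysrq-m.txt          # capture the output to attach
```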
Created attachment 98534 [details]
"AltSysrq M" output
3 times -- let me know if you need more.
Ah, I think I see the problem. Try setting pagecache.maxpercent to 15
via "echo 1 15 15 > /proc/sys/vm/pagecache" and see if this helps.
It does indeed. I'm now seeing minimal swapping with the same workload.
I have come across this same problem, but even with vm.pagecache set
to "2 10 20", I still get more than 20% of memory used as cache.
Here is /proc/meminfo:
[grma@shane 59] ~ > cat /proc/meminfo
        total:     used:     free:   shared: buffers:   cached:
Mem:  525836288 445038592  80797696        0 31404032 187092992
Swap: 2146787328 30904320 2115883008
MemTotal: 513512 kB
MemFree: 78904 kB
MemShared: 0 kB
Buffers: 30668 kB
Cached: 181156 kB
SwapCached: 1552 kB
Active: 347540 kB
ActiveAnon: 204324 kB
ActiveCache: 143216 kB
Inact_dirty: 32452 kB
Inact_laundry: 27108 kB
Inact_clean: 5700 kB
Inact_target: 82560 kB
HighTotal: 0 kB
HighFree: 0 kB
LowTotal: 513512 kB
LowFree: 78904 kB
SwapTotal: 2096472 kB
SwapFree: 2066292 kB
Hugepagesize: 4096 kB
My system seems to grind to a halt due to there being no memory
available to the applications, because it is all being used for cache.
Any ideas what I can do to prevent this?
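For reference, the share of RAM held by the page cache can be computed directly from the meminfo dump above; here is a small awk sketch fed with the Buffers, Cached, and MemTotal values posted:

```shell
# Compute (Buffers + Cached) as a percentage of MemTotal, using the
# figures from the /proc/meminfo dump above (values in kB).
cache_pct=$(awk '/^MemTotal:/ {t=$2} /^Buffers:/ {b=$2} /^Cached:/ {c=$2}
                 END {printf "%.0f", 100*(b+c)/t}' <<'EOF'
MemTotal:       513512 kB
Buffers:         30668 kB
Cached:         181156 kB
EOF
)
echo "page cache is ${cache_pct}% of RAM"
```

This works out to roughly 41%, which matches the poster's observation that the cache is well above the 20% maxpercent setting.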
Can someone please explain why Red Hat seems to be doing nothing about
a MAJOR issue with their RHEL 3 flagship product.
I have been doing some research into this and it is definitely a
problem that people are coming across.
It appears that there is no workaround, so where is the updated kernel?
This problem has been around for a long time now. Where is the
support that we pay £000's for??
The pagecache parameter (specifically the third one: maxpercent) does
not stop the system from allocating more than that amount of memory in
the pagecache. Instead, when the system runs out of memory and starts
reclaiming, it forces the pagecache to give up memory until it is down
below /proc/sys/vm/pagecache.maxpercent before it starts reclaiming
anonymous memory and therefore starts paging. If your system is not
responding to lowering this value like Joshua's did, please get me
some AltSysrq-M outputs (like Joshua did) and attach them to this bug.
I'm more than happy to figure out what's going on in your system.
Created attachment 98693 [details]
swap usage when there's still free (although cached) memory
AltSysRq-M output for memory problem
grmansell, could you please try "echo 30 >
(the new default for RHEL3 U2)
I have added the new value for inactive_clean_percent and will
feedback if I get a problem again.
I must say that the machine has not locked up since I logged the call
report - typical !!!
For the record, I already had /proc/sys/vm/inactive_clean_percent set
to 30, due to bug 115438.
Joshua, is this problem now fixed with inactive_clean_percent
set to 30, or have you still seen your system swap excessively?
Thanks, Larry Woodman
Sorry if I wasn't clear. When I opened this bug (i.e. when I was
seeing the swapping), I already had inactive_clean_percent set to 30.
On your advice in this bug, I set pagecache to "1 15 15", and that
fixed the problem.
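For anyone wanting the two settings from this thread to survive a reboot, they can be appended to /etc/sysctl.conf (a sketch; the sysctl names are assumed to mirror the /proc/sys/vm paths on RHEL3, and this requires root):

```shell
# Persist the two tunings discussed above across reboots.
cat >> /etc/sysctl.conf <<'EOF'
vm.pagecache = 1 15 15
vm.inactive_clean_percent = 30
EOF
sysctl -p    # apply the settings now
```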
A fix for this problem has just been committed to the RHEL3 U3
patch pool this evening (in kernel version 2.4.21-15.10.EL).
*** Bug 127240 has been marked as a duplicate of this bug. ***
Created attachment 101654 [details]
patch to evict page cache faster
If the patch that is currently applied to the tree isn't aggressive enough to
completely resolve your problem, this patch might help a bit more. By evicting
the page cache more aggressively, more memory should be left over for the
applications.
Setting vm.pagecache to "1 15 15" helped a bit but not completely. And I
was unable to check Rik's patch because of some kernel compilation
problems (bug #127365).
We cannot wait for U3. These swapping issues have been causing us
performance problems since updating from Red Hat 7.3 to RHEL 3.
In response to comment #18, the RHEL3 U3 beta will begin in a few
weeks. You are welcome to try the fixed kernel during the beta
period. From now until then, the fixed kernel is undergoing QA.
I wouldn't recommend running a kernel that has not yet been through QA.
Also, Rik has answered bug #127365.
Applied Rik's patch. Swapping went back to normal. Thanks. Should I change
vm.pagecache back from "1 15 15" to something else?
But one problem still persists. From time to time we are getting errors:
oracleMOON: error while loading shared
libraries: /ora/product/9.2.0/lib/libjox9.so: cannot make segment writable for
relocation: Cannot allocate memory
It looks like a Linux error, not an Oracle one.
To recap the situation: the kernel is hugemem, and Oracle is relinked to be able to use
Should I open a new bugzilla entry about this bug?
Good to hear that my patch brings swapping back to normal. The Oracle
problem does indeed look like a bug; could you please open a bugzilla
entry about it?
An errata has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.