Bug 118152
| Summary: | swap usage when there's still free (although cached) memory | | |
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 3 | Reporter: | Joshua Baker-LePain <joshua.bakerlepain> |
| Component: | kernel | Assignee: | Larry Woodman <lwoodman> |
| Status: | CLOSED ERRATA | QA Contact: | Brian Brock <bbrock> |
| Severity: | high | Priority: | medium |
| Version: | 3.0 | CC: | gary.mansell, mindaugas, petrides, riel, woodard |
| Hardware: | x86_64 | OS: | Linux |
| Doc Type: | Bug Fix | Last Closed: | 2004-09-02 04:31:10 UTC |
Description
Joshua Baker-LePain
2004-03-12 16:09:04 UTC
Created attachment 98489 [details]
vmstat, free, and ps output
Can you get me a quick "AltSysrq M" output when the system is in this state? You can do this as follows:
1.) log in as root
2.) echo 1 > /proc/sys/kernel/sysrq
3.) echo m > /proc/sysrq-trigger
4.) attach the dmesg output to this bug

Larry Woodman

Created attachment 98534 [details]
"AltSysrq M" output
3 times -- let me know if you need more.
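
When several dumps are needed, as with the three taken here, the four steps Larry lists above can be wrapped in a small script. This is only a sketch: the function name capture_sysrq_m, the PROC and INTERVAL variables, and the default output file name are hypothetical and not part of this thread. It has to run as root on a kernel with SysRq support.

```shell
#!/bin/sh
# Sketch: trigger N "AltSysrq M" memory dumps and save the kernel log.
# capture_sysrq_m, PROC and INTERVAL are illustrative names (assumptions).
capture_sysrq_m() {
    count=${1:-3}               # number of dumps to take
    out=${2:-sysrq-m.txt}       # file that receives the dmesg output
    proc=${PROC:-/proc}         # overridable only so the steps can be dry-run

    # Step 2: enable the SysRq interface.
    echo 1 > "$proc/sys/kernel/sysrq" || return 1

    i=0
    while [ "$i" -lt "$count" ]; do
        # Step 3: ask the kernel to log its current memory state.
        echo m > "$proc/sysrq-trigger" || return 1
        sleep "${INTERVAL:-5}"  # pause between dumps
        i=$((i + 1))
    done

    # Step 4: save the kernel log so it can be attached to the bug.
    dmesg > "$out"
}
```

Run as root, for example "capture_sysrq_m 3 sysrq-m.txt", then attach the resulting file.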
Ah, I think I see the problem. Try setting pagecache.maxpercent to 15 via "echo 1 15 15 > /proc/sys/vm/pagecache" and see if this helps.

Larry

It does indeed. I'm now seeing minimal swapping with the same workload. Thanks!

I have come across this same problem, but even if I set vm.pagecache to 2 10 20, I still get more than 20% of memory used as cache. Here is /proc/meminfo:

    [grma@shane 59] ~ > cat /proc/meminfo
            total:     used:     free:   shared:  buffers:   cached:
    Mem:   525836288 445038592  80797696         0  31404032 187092992
    Swap: 2146787328  30904320 2115883008
    MemTotal:        513512 kB
    MemFree:          78904 kB
    MemShared:            0 kB
    Buffers:          30668 kB
    Cached:          181156 kB
    SwapCached:        1552 kB
    Active:          347540 kB
    ActiveAnon:      204324 kB
    ActiveCache:     143216 kB
    Inact_dirty:      32452 kB
    Inact_laundry:    27108 kB
    Inact_clean:       5700 kB
    Inact_target:     82560 kB
    HighTotal:            0 kB
    HighFree:             0 kB
    LowTotal:        513512 kB
    LowFree:          78904 kB
    SwapTotal:      2096472 kB
    SwapFree:       2066292 kB
    HugePages_Total:      0
    HugePages_Free:       0
    Hugepagesize:      4096 kB

My system seems to grind to a halt due to there being no memory available to the applications because it is all being used for cache. Any ideas what I can do to prevent this?

Can someone please explain why Red Hat seems to be doing nothing about a MAJOR issue with their RHEL 3 flagship product? I have been doing some research into this and it is definitely a problem that people are coming across. It appears that there is no workaround, so where is the updated kernel? This problem has been around for a long time now. Where is the support that we pay £000's for??

The pagecache parameter (specifically the third one: maxpercent) does not stop the system from allocating more than that amount of memory in the pagecache. Instead, when the system runs out of memory and starts reclaiming, it forces the pagecache to give up memory until it is down below /proc/sys/vm/pagecache.maxpercent before it starts reclaiming anonymous memory and therefore starts paging.

If your system is not responding to lowering this value like Joshua's did, please get me some AltSysrq-M outputs (like Joshua did) and attach them to this bug. I'm more than happy to figure out what's going on in your system.

Thanks, Larry

Created attachment 98693 [details]
swap usage when there's still free (although cached) memory
AltSysRq-M output for memory problem
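
To compare a reading like the /proc/meminfo listing above with the percentages in vm.pagecache, the Cached figure can be expressed as a share of MemTotal. The following is a minimal sketch: the helper name cache_percent is hypothetical, and since this thread does not say whether the kernel also counts Buffers toward the pagecache limit, only Cached is used.

```shell
#!/bin/sh
# Sketch: print Cached as a whole-number percentage of MemTotal.
# cache_percent is an illustrative name (assumption, not from the thread).
# Reads a file in /proc/meminfo format; defaults to /proc/meminfo itself.
cache_percent() {
    awk '
        $1 == "MemTotal:" { total = $2 }    # total RAM in kB
        $1 == "Cached:"   { cached = $2 }   # page cache in kB
        END {
            if (total > 0)
                printf "%d\n", cached * 100 / total   # truncated percentage
            else
                exit 1                                # no MemTotal line found
        }
    ' "${1:-/proc/meminfo}"
}
```

With the values posted above (MemTotal 513512 kB, Cached 181156 kB) this prints 35. As Larry explains, a reading above maxpercent is not by itself a malfunction, because the limit only decides what gets reclaimed first once memory runs short.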
grmansell, could you please try "echo 30 > /proc/sys/vm/inactive_clean_percent"? (the new default for RHEL3 U2)

I have added the new value for inactive_clean_percent and will report back if I get a problem again. I must say that the machine has not locked up since I logged the call report - typical !!!

For the record, I already had /proc/sys/vm/inactive_clean_percent set to 30, due to bug 115438.

Joshua, is this problem now fixed with inactive_clean_percent set to 30, or have you still seen your system swap excessively?

Thanks, Larry Woodman

Sorry if I wasn't clear. When I opened this bug (i.e. when I was seeing the swapping), I already had inactive_clean_percent set to 30. On your advice in this bug, I set pagecache to "1 15 15", and that fixed the problem. Thanks again.

A fix for this problem has just been committed to the RHEL3 U3 patch pool this evening (in kernel version 2.4.21-15.10.EL).

*** Bug 127240 has been marked as a duplicate of this bug. ***

Created attachment 101654 [details]
patch to evict page cache faster
If the patch that is currently applied to the tree isn't aggressive enough to
completely resolve your problem, this patch might help a bit more. By evicting
the page cache more aggressively, more memory should be left over for
applications.

Setting vm.pagecache to "1 15 15" helped a bit but not completely. And I was unable to check Rik's patch because of some kernel compilation problems (bug #127365). We cannot wait for U3. These swapping issues have been causing us performance problems since we updated from Redhat 7.3 to RHES3.

In response to comment #18, the RHEL3 U3 beta will begin in a few weeks. You are welcome to try the fixed kernel during the beta period. From now until then, the fixed kernel is undergoing Q/A. I wouldn't recommend running a kernel that has not yet been through Q/A. Also, Rik has answered bug #127365.

Applied Rik's patch. Swapping went back to normal. Thanks. Should I change vm.pagecache back from "1 15 15" to something else? But one problem still persists. From time to time we are getting errors:

    oracleMOON: error while loading shared libraries: /ora/product/9.2.0/lib/libjox9.so: cannot make segment writable for relocation: Cannot allocate memory

It looks like a Linux error, not an Oracle one. As a reminder of the setup: the kernel is hugemem, and Oracle is relinked to be able to use a 2.7GB SGA. Should I open a new bugzilla entry about this bug?

Good to hear that my patch brings swapping back to normal. The Oracle problem does indeed look like a bug, could you please open a bugzilla entry about it?

An errata has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2004-433.html