Bug 160033
Summary: | Kernel swaps out Oracle instead of releasing cache | ||||||
---|---|---|---|---|---|---|---|
Product: | Red Hat Enterprise Linux 4 | Reporter: | Dirk Gfroerer <dirk.gfroerer> | ||||
Component: | kernel | Assignee: | Larry Woodman <lwoodman> | ||||
Status: | CLOSED NOTABUG | QA Contact: | Brian Brock <bbrock> | ||||
Severity: | medium | Docs Contact: | |||||
Priority: | medium | ||||||
Version: | 4.0 | CC: | cchan, jplans, jwest, lgranquist, rich, riel | ||||
Target Milestone: | --- | ||||||
Target Release: | --- | ||||||
Hardware: | i686 | ||||||
OS: | Linux | ||||||
Whiteboard: | |||||||
Fixed In Version: | Doc Type: | Bug Fix | |||||
Doc Text: | Story Points: | --- | |||||
Clone Of: | Environment: | ||||||
Last Closed: | 2010-06-07 05:44:11 UTC | Type: | --- | ||||
Description
Dirk Gfroerer
2005-06-10 06:53:25 UTC
RHEL4 works differently than RHEL3. Please try lowering /proc/sys/vm/swappiness from the default of 60 to 50, then 40, then 30, until your system reclaims pagecache pages rather than swapping out Oracle. Please let me know how this goes.

Thanks, Larry Woodman

Sorry for the delay. The machine has been running with /proc/sys/vm/swappiness set to 0 for some time now (with the U2 kernel in the meantime) but was still using a large amount of swap. To be sure my memory calculations are correct, I have now turned swap off completely for the machine. I haven't seen any "out of memory" problems on the machine, and we put quite some load on it during this time. However, I don't consider this to be a workaround. Do you think it will make a difference if we set swappiness to 30 or 40?

I've switched back to the default value of 60 and then used 50, 40, 30, ... It didn't make a difference.

Today I installed Oracle 10g R2 on an AMD64/EM64T machine with RHEL 4U2. Same behaviour. The machine has 4GB of RAM. Oracle is configured to use about 2GB of RAM. The machine just started to use the page file (currently 90 MB are in use).

I've encountered this as well:

AMD Athlon XP 3000+
1GB of RAM
kernel-2.6.9-11.EL
Red Hat Enterprise Linux ES release 4 (Nahant Update 2)
vm.swappiness = 0

The box is a mail gateway that runs sendmail and spamassassin. The system was originally on the verge of collapse due to very high iowait%. Two other servers do exactly the same job as this one, running RHEL 2.1 and RHEL 3.0, with none of these problems. When I first logged on, the server was at 30% swap with only 40% physical memory usage. I set swappiness to zero, rebooted, and the server is now running at 10% swap with only 25% physical memory used. On a busy mail server running IDE, even 10% of swap is noticeable. The end result is that this will be yet another server we have to revert to RHEL 3 because of performance issues. I'd rather not disable swap either. Any suggestions?
Out of the box, RHEL 4 isn't doing so well against RHEL 3 (or even RHEL 2.1), and we deploy thousands of servers.

I just verified this again on another RHEL4 server, 2.6.9-22, with vm.swappiness = 0, sampled at 10-minute intervals:

memory used | swap used
---|---
62.89 | 6.20
49.72 | 6.79
42.68 | 6.91
47.42 | 7.24
48.62 | 7.83
48.67 | 8.11
52.29 | 9.35
49.86 | 11.10
47.71 | 11.73
49.64 | 12.28
50.68 | 12.45
55.14 | 13.16
53.79 | 13.79
48.72 | 13.81
52.02 | 15.44
53.37 | 16.20
48.14 | 18.67

We're having to roll back the kernel to 2.6.9-11 again.

Is this thread still alive? If I can offer any more information or assistance then please do let me know.

I've opened a support ticket in the meantime, and after some discussion with Red Hat they told me to open a support ticket with Oracle. Oracle told me this is the expected behaviour (!) and that I have to set LOCK_SGA=true in the initXXX.ora file, which locks the Oracle SGA into RAM. I'm not seeing this issue any more on my machine(s). However, I'm not really satisfied, since I have to be very careful when setting up a new database: if I assign too much RAM, it will blow up with out of memory. Also, no one has come up with an explanation of why application memory is being paged out when vm.swappiness is set to 0.

Created attachment 129064 [details]
memory stats
Comment on attachment 129064 [details]
memory stats
RHEL4 vm behaviour leaves much to be desired. Converting boxes over to FC4 due
to the performance hit.
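
For reference, the 10-minute samples posted earlier in this thread (vm.swappiness = 0, kernel 2.6.9-22) can be summarised with a short Python sketch. It assumes the two columns are percentages of memory and swap in use, which the original comment does not state explicitly, and the function name `swap_growth` is hypothetical.

```python
# (memory used, swap used) pairs copied from the comment earlier in this thread.
# Assumption: both columns are percentages; the poster did not label the units.
SAMPLES = [
    (62.89, 6.20), (49.72, 6.79), (42.68, 6.91), (47.42, 7.24),
    (48.62, 7.83), (48.67, 8.11), (52.29, 9.35), (49.86, 11.10),
    (47.71, 11.73), (49.64, 12.28), (50.68, 12.45), (55.14, 13.16),
    (53.79, 13.79), (48.72, 13.81), (52.02, 15.44), (53.37, 16.20),
    (48.14, 18.67),
]
INTERVAL_MINUTES = 10  # sampling interval stated in the comment


def swap_growth(samples, interval_minutes):
    """Summarise how swap usage changes across equally spaced samples."""
    memory = [m for m, _ in samples]
    swap = [s for _, s in samples]
    elapsed = (len(swap) - 1) * interval_minutes
    delta = swap[-1] - swap[0]
    return {
        "elapsed_minutes": elapsed,
        "swap_delta": round(delta, 2),
        "swap_delta_per_hour": round(delta / elapsed * 60, 2),
        # True when swap usage never drops between consecutive samples.
        "swap_never_decreases": all(b >= a for a, b in zip(swap, swap[1:])),
        "memory_min": min(memory),
        "memory_max": max(memory),
    }


if __name__ == "__main__":
    for key, value in swap_growth(SAMPLES, INTERVAL_MINUTES).items():
        print(f"{key}: {value}")
```

The point of the summary is that swap usage rises at every sample while memory usage stays in a band well below 100, which is the pattern the reporters describe.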
What is the state of this bug now? Is the system still swapping out the Oracle SGA, or have the issues been resolved with the tuning? If the system is still swapping out the SGA, please get me several Alt-SysRq-M outputs while the swapping is occurring.

Thanks, Larry Woodman

I had a similar problem, and in my case it helped to turn on hugepages (e.g. in /etc/sysctl.conf set vm.nr_hugepages = 5120). Before, I had (out of 16GB RAM) 5.5GB cached (containing 5.3GB swap cache) and the system was swapping out over 2GB. With hugepages enabled there is no swapping, cached is only 400MB, and I still have a few gigs of free memory.

I'm seeing behavior similar to this on RHEL4 (2.6.9-89.ELsmp) and on RHEL5 kernels (2.6.18-128.2.1.el5 installed on a RHEL4 O/S), with swappiness set to 0, in a Java application. The servers have 16GB of RAM with a 10GB heap and about 7-8GB of tenured generation, which holds live objects, some of which are very infrequently accessed. The app mmap()s a lot of very large files (~50GB of mmap()'d files in the VMA space), and the VM is clearly scavenging less-used pages in the tenured generation and evicting them to swap, while it almost seems to be pinning the mmap()'d pages (however, after a lot of investigation I don't see any calls to mmap() or mlock() which would actually pin the pages). After a couple of hours a FullGC kicks in and walks through all the objects in the tenured generation, and I'm seeing 15-30 minute FullGC stops as opposed to 30 second FullGC stops. I'm not positive what the actual VM pressure on the mmap()'d files is, but it's plausible that the evicted tenured heap pages are only accessed every few hours by the FullGC. The disks tend to be ~20% utilized on a 2-disk RAID1 SAS 10k array (Dell 1950). Using 'swapoff -a' might be a workaround and I'm testing that now...

I suspect that swappiness=0 is still too aggressive in swapping out anonymous pages for server apps.
An anonymous page that hasn't been hit in days is still potentially more valuable to me than a buffer cache page that got hit a minute ago (some of my apps have stable enough GC behavior that they will FullGC *very* infrequently, but when they do walk through the tenured gen it is a disaster to have to pull those pages in from swap). I do need to do more work on the CMS and Garbage First collectors in Java, and more work on dealing with lots (GBs) of very infrequently accessed objects in the tenured gen, but I can't eliminate this pattern of memory access from these servers completely. So far I don't have a good synthetic test case.

The basic design of the VM in RHEL 4 and RHEL 5 will not allow a complete fix for this issue, only tweaks to make it behave better most of the time. In the upstream kernel (and for RHEL 6), this problem has been addressed with the split LRU VM, which was merged in 2.6.28 and continues to get small fixes and tweaks. In RHEL 6 this issue should be resolved.

Lowering swappiness is likely to make the situation described here worse. The swappiness tunable controls how aggressively the system deactivates active pages that are mapped. Since mmap()'d file pages and anonymous pages are both mapped into virtual address spaces, lowering swappiness tells the system not to deactivate either mmap()'d file pages or anonymous pages until the system is under significantly more memory pressure.

Larry Woodman

Thanks for the comments. I had assumed that "swappiness" would treat anonymous and file-backed pages differently; I didn't realize the balance was between VMA-mapped and pure page cache pages. Are there any important /proc tweakables in the 2.6.30.1 kernel for the LRU VM to control how aggressively anon pages are swapped out?

In 2.6.30, swappiness does what you expect :)
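
On a split-LRU kernel, one way to watch the anonymous versus file-backed balance discussed above is to compare the per-type LRU counters in /proc/meminfo. The following is a minimal Python sketch, not a tool from this report. It assumes the `Active(anon)`, `Inactive(anon)`, `Active(file)` and `Inactive(file)` fields are present (they are missing on the RHEL 4 and RHEL 5 kernels, in which case the sketch reports zeros), and the sample values and the helper names `parse_meminfo` and `lru_summary` are made up for illustration.

```python
import os

# Illustrative values only; not taken from any machine in this report.
SAMPLE_MEMINFO = """\
MemTotal:       16000000 kB
MemFree:          500000 kB
Active(anon):    6000000 kB
Inactive(anon):  2000000 kB
Active(file):    3000000 kB
Inactive(file):  4000000 kB
SwapTotal:       4000000 kB
SwapFree:        3000000 kB
"""


def parse_meminfo(text):
    """Return a dict mapping /proc/meminfo field names to integer values."""
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        if sep and parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def lru_summary(fields):
    """Total anonymous and file-backed LRU pages, plus swap in use (all in kB)."""
    anon = fields.get("Active(anon)", 0) + fields.get("Inactive(anon)", 0)
    file_backed = fields.get("Active(file)", 0) + fields.get("Inactive(file)", 0)
    swap_used = fields.get("SwapTotal", 0) - fields.get("SwapFree", 0)
    return {"anon_kb": anon, "file_kb": file_backed, "swap_used_kb": swap_used}


if __name__ == "__main__":
    # Use the live counters when available, otherwise fall back to the sample.
    if os.path.exists("/proc/meminfo"):
        with open("/proc/meminfo") as f:
            text = f.read()
    else:
        text = SAMPLE_MEMINFO
    print(lru_summary(parse_meminfo(text)))
    if os.path.exists("/proc/sys/vm/swappiness"):
        with open("/proc/sys/vm/swappiness") as f:
            print("vm.swappiness =", f.read().strip())
```

Sampling this periodically while changing vm.swappiness would show whether swap in use grows while the file-backed total stays large, which is the symptom reported in this bug.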