Description of problem:
The kernel appears to be too conservative in its use of system memory. It writes pages to swap even when plenty of unused RAM is available.

Version-Release number of selected component (if applicable):
kernel-smp-2.6.9-34.EL

How reproducible:
Always

Steps to Reproduce:
1. Install kernel-smp-2.6.9-34.EL on a fully updated RHEL4/U3 system
2. Run the system for several days
3. Note increasing use of swap even with what appears to be unused system RAM

Actual results:
Increasing use of swap even with lots of unused memory

Expected results:
Program pages should be kept in system RAM unless there is pressure from other programs or the disk cache.

Additional info:
This probably seems on its face to be another case of a newbie who doesn't understand that memory not used by running programs is used for disk caching. It's not. I have plenty of RAM on this system, and most of it remains free all the time, not even used for disk caching. For example:

# uptime
 08:05:19 up 11 days, 23:53,  1 user,  load average: 0.00, 0.00, 0.00
# free
             total       used       free     shared    buffers     cached
Mem:       1035468     239864     795604          0       8000      79308
-/+ buffers/cache:     152556     882912
Swap:      1767128     120620    1646508

The increase in swap use is not gradual. The swap use tends to jump when the nightly cron jobs are run, probably because slocate and virus checking cause more RAM to be used for the disk cache.

Maybe there isn't a kernel problem, just a problem in the display of the statistics. But "free" usually shows this system, which has 1024MB of RAM, as having ~800MB unused.
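For readers checking the numbers, here is a minimal sketch of how the "-/+ buffers/cache" line in the `free` output above relates to the "Mem:" line. The function name `buffers_cache_line` is made up for illustration; the values are copied from the report.

```python
# Values in kB, copied from the "Mem:" line of the `free` output in the report.
total, used, free = 1035468, 239864, 795604
buffers, cached = 8000, 79308


def buffers_cache_line(used, free, buffers, cached):
    """Return (used, free) as shown on free's '-/+ buffers/cache' line.

    Buffers and page cache are reclaimable, so they are subtracted from
    'used' and added to 'free'. (Hypothetical helper, for illustration only.)
    """
    reclaimable = buffers + cached
    return used - reclaimable, free + reclaimable


app_used, effectively_free = buffers_cache_line(used, free, buffers, cached)
print(app_used, effectively_free)  # matches the reported 152556 and 882912

# Share of RAM that is completely idle (not even used as cache).
idle_percent = 100 * free / total
print(f"{idle_percent:.1f}% of RAM is idle")
```

The point of the report is the last figure: roughly three quarters of RAM is not even holding cache, while about 120MB sits in swap.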
More info:

# cat /proc/meminfo
MemTotal:      1035468 kB
MemFree:        800832 kB
Buffers:          7112 kB
Cached:          74948 kB
SwapCached:      10420 kB
Active:         155468 kB
Inactive:        53996 kB
HighTotal:      131008 kB
HighFree:          672 kB
LowTotal:       904460 kB
LowFree:        800160 kB
SwapTotal:     1767128 kB
SwapFree:      1646424 kB
Dirty:              16 kB
Writeback:           0 kB
Mapped:         139724 kB
Slab:            16752 kB
Committed_AS:   363688 kB
PageTables:       1908 kB
VmallocTotal:   106488 kB
VmallocUsed:      2552 kB
VmallocChunk:   103584 kB
HugePages_Total:     0
HugePages_Free:      0
Hugepagesize:     4096 kB

That "LowFree: 800160 kB" suggests that it is not "free" that is erroneously displaying the state of the memory. It appears that ~120MB worth of pages of program data has been flushed to swap even while plenty of unused RAM remains.

And finally, this:

# cat /proc/sys/vm/swappiness
60

Thank you.
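The symptom described above (swap in use while most of RAM is idle) can be checked mechanically. The following is a rough sketch, not a diagnostic tool: `parse_meminfo`, `swapping_with_idle_ram`, and the 50% threshold are assumptions chosen for illustration, and the sample text is an excerpt of the /proc/meminfo output above.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into {field: int}; kB fields stay in kB."""
    info = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        if not sep or not parts:
            continue  # skip blank or malformed lines
        info[name.strip()] = int(parts[0])
    return info


def swapping_with_idle_ram(info, free_fraction=0.5):
    """True if swap is in use while at least `free_fraction` of RAM is free.

    The 0.5 default is an arbitrary illustrative threshold, not a kernel value.
    """
    swap_used = info["SwapTotal"] - info["SwapFree"]
    idle = info["MemFree"] / info["MemTotal"]
    return swap_used > 0 and idle >= free_fraction


# Excerpt of the /proc/meminfo output from the report.
SAMPLE = """\
MemTotal:      1035468 kB
MemFree:        800832 kB
SwapTotal:     1767128 kB
SwapFree:      1646424 kB
"""

info = parse_meminfo(SAMPLE)
print(info["SwapTotal"] - info["SwapFree"], "kB of swap in use")
print("suspicious:", swapping_with_idle_ram(info))
```

On a live system the same functions could be fed the contents of /proc/meminfo read with `open("/proc/meminfo").read()`.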
what architecture are you running on?
i686 Specifically dual Pentium3 processors, running the kernel-smp-2.6.9-34.EL.i686 kernel.
ok. thanks. if possible can you try the test kernel at: http://people.redhat.com/~jbaron/bz185110/ and report back?
Err, isn't this the wrong architecture? I don't think my 32-bit CPUs will be happy with a 64-bit kernel. Also, how much risk am I taking with this kernel? The machine that I am seeing the problem on is in production and stability is important. Still, I do want to get this problem resolved since it appears that the bulk of my system RAM is being kept unused. Can I get a relative risk estimate? Thanks.
Sorry about that. That is the wrong kernel. I'll build you the correct one. What I'll do is take the kernel you're currently running, -34, and change the one patch that I believe is causing this problem. While I can't say that there is no risk, I have been maintaining the RHEL kernels for quite some time and would say it is very low risk and will likely resolve this issue for you. Will update soon. Thanks.
Is this problem SMP-specific? I've got another Pentium3 machine running RHEL/U3 that is less mission critical. If a uni-processor system will demonstrate the problem and fix I'll use that instead of the SMP machine. Also, I am comfortable patching the kernel. If the tentative fix can be provided in the form of a patch file I'm willing to do my own build. Thanks.
Created attachment 127441 [details]
remove this patch

Not sure if this is SMP-specific... I wouldn't think so. What I wanted to try was reverting a patch we already have in the kernel, i.e. apply the attachment in reverse with patch -p1 -R.
Status: I've rebuilt the kernel after applying the above patch. No errors or new warnings seen at boot time with the rebuilt kernel. (I'm running the changed kernel on the SMP machine that I reported the problem on, just to minimize the variables for this test.) I'll get back to you in about a week with the results of the modification.
ok. thanks for keeping us posted.
I guess I don't need a week to report back. After 2 days of uptime, I am seeing the kind of behavior I expect, certainly the sort of behavior I'm used to seeing in the Fedora Core kernels. Note below how there is only ~11MB of free system memory rather than the ~800MB I've been seeing with the RHEL4 kernels. (Again, this is with 1024MB of RAM installed.) It appears that the system memory is almost entirely used by running programs and disk cache.

# uptime
 08:00:55 up 1 day, 16:09,  1 user,  load average: 0.00, 0.00, 0.00
# free
             total       used       free     shared    buffers     cached
Mem:       1035500    1023604      11896          0     216236     382436
-/+ buffers/cache:     424932     610568
Swap:      1767128          0    1767128
# cat /proc/meminfo
MemTotal:      1035500 kB
MemFree:         11704 kB
Buffers:        216372 kB
Cached:         382436 kB
SwapCached:          0 kB
Active:         199108 kB
Inactive:       490848 kB
HighTotal:      131008 kB
HighFree:          252 kB
LowTotal:       904492 kB
LowFree:         11452 kB
SwapTotal:     1767128 kB
SwapFree:      1767128 kB
Dirty:             236 kB
Writeback:           0 kB
Mapped:         107148 kB
Slab:           325608 kB
Committed_AS:   214284 kB
PageTables:       1732 kB
VmallocTotal:   106488 kB
VmallocUsed:      2544 kB
VmallocChunk:   103764 kB
HugePages_Total:     0
HugePages_Free:      0
Hugepagesize:     4096 kB
A further report: after 5 days it still looks good. Backing out that patch seems to have fixed the problem I was seeing. Thanks.

# uptime
 07:56:51 up 5 days, 16:05,  1 user,  load average: 0.00, 0.00, 0.00
# free
             total       used       free     shared    buffers     cached
Mem:       1035500    1011840      23660          0     196300     349672
-/+ buffers/cache:     465868     569632
Swap:      1767128         88    1767040
# cat /proc/meminfo
MemTotal:      1035500 kB
MemFree:         23660 kB
Buffers:        196308 kB
Cached:         349664 kB
SwapCached:          0 kB
Active:         254348 kB
Inactive:       421232 kB
HighTotal:      131008 kB
HighFree:          252 kB
LowTotal:       904492 kB
LowFree:         23408 kB
SwapTotal:     1767128 kB
SwapFree:      1767040 kB
Dirty:              28 kB
Writeback:           0 kB
Mapped:         145716 kB
Slab:           327944 kB
Committed_AS:   251708 kB
PageTables:       1792 kB
VmallocTotal:   106488 kB
VmallocUsed:      2568 kB
VmallocChunk:   103764 kB
HugePages_Total:     0
HugePages_Free:      0
Hugepagesize:     4096 kB
committed in stream U4 build 34.20. A test kernel with this patch is available from http://people.redhat.com/~jbaron/rhel4/
Will this fix be incorporated into future U3 releases, or will it not be part of the standard RHEL4 kernel until U4?
wouldn't be included until U4
*** Bug 193696 has been marked as a duplicate of this bug. ***
Verified. The following reproduces the problem fairly easily on a 1GB system:

dd if=/dev/zero of=/dev/null conv=swab bs=500M count=1
/etc/cron.daily/slocate.cron

When running -34, the system will be in swap within minutes, and only about 30-40% of the memory will be used, remaining relatively unchanged. When running -40, the system will still hit swap but uses less of it; the memory as reported by free shows about 70% utilized, and the utilization slowly increases to about 90% of memory used.
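To follow the utilization figures quoted in the verification above while the reproducer runs, something like the sketch below could sample /proc/meminfo at intervals. This is an assumption-laden illustration, not the method used for the verification: the names `read_sample`, `utilization_percent`, `first_swap_use`, and `watch` are hypothetical, and "utilization" here simply means (MemTotal - MemFree) / MemTotal, which counts buffers and cache as used.

```python
import time

FIELDS = ("MemTotal", "MemFree", "SwapTotal", "SwapFree")


def read_sample(path="/proc/meminfo"):
    """Read the four fields of interest from /proc/meminfo (values in kB)."""
    values = {}
    with open(path) as f:
        for line in f:
            name, _, rest = line.partition(":")
            if name in FIELDS:
                values[name] = int(rest.split()[0])
    return values


def utilization_percent(sample):
    """Percent of RAM that is not free (buffers and cache count as used)."""
    return 100.0 * (sample["MemTotal"] - sample["MemFree"]) / sample["MemTotal"]


def first_swap_use(samples):
    """Index of the first sample with any swap in use, or None."""
    for i, s in enumerate(samples):
        if s["SwapTotal"] - s["SwapFree"] > 0:
            return i
    return None


def watch(count, interval_s=60, reader=read_sample):
    """Collect `count` samples, sleeping `interval_s` seconds after each one."""
    samples = []
    for _ in range(count):
        samples.append(reader())
        time.sleep(interval_s)
    return samples
```

A possible use on a Linux test machine: start the dd and slocate commands in another shell, call `samples = watch(30)`, then print `utilization_percent(s)` for each sample and `first_swap_use(samples)` to see when swap was first touched.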
Is there any idea of when this fix is going to be included in the next kernel release? We have just upgraded 6 boxes to fix an earlier vulnerability in the older kernel, and we have just noticed that one of our customers, with 1GB of RAM + 2.6.9-34.0.2 + SQL, is getting very high load on his server since the upgrade. We are getting the same symptoms as the reporter above, except that because the server is getting so many SQL queries and they aren't being served from RAM, the disk is getting thrashed, causing very high load, which then sometimes causes the box to hang completely.
This problem was fixed in RHEL4-U4/kernel-2.6.9-42. If this does not fix the problem where Lowmem is getting prematurely swapped out, please let me know. Larry Woodman
Hi Larry,

Red Hat ES4 lists the latest kernel as kernel-devel-2.6.9-34.0.2.EL on up2date. I also cannot see that version when searching through the packages on RHN, even when doing an all-channel search (which includes all the beta ones). Is the only way of upgrading to download the RPM and do a manual rpm -i? If so, where can I source it from, and is it a commercial release? How come this is not on up2date yet, if the broken one is? :)

Regards,
Anthony
When is RHEL4-U4 getting released? I don't see it.
An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you. http://rhn.redhat.com/errata/RHSA-2006-0575.html