Bug 407211 - The VFS cache is not freed when there is not enough free memory to allocate
Summary: The VFS cache is not freed when there is not enough free memory to allocate
Keywords:
Status: CLOSED WONTFIX
Alias: None
Product: Red Hat Enterprise Linux 4
Classification: Red Hat
Component: kernel
Version: 4.4
Hardware: All
OS: Linux
Priority: low
Severity: medium
Target Milestone: ---
Assignee: Red Hat Kernel Manager
QA Contact: Martin Jenner
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2007-12-01 10:02 UTC by Need Real Name
Modified: 2012-06-20 15:57 UTC
CC List: 4 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2012-06-20 15:57:24 UTC
Target Upstream Version:
Embargoed:


Attachments
Attached: lengthy log of the kernel messages after disabling the OOM killer, showing eventual allocation success (43.20 KB, text/plain)
2007-12-01 10:16 UTC, Need Real Name

Description Need Real Name 2007-12-01 10:02:30 UTC
Description of problem:

When a large memory allocation is attempted (e.g. 256 MB), the VFS cache (in my 
case 5.5 GB worth, on my 8 GB machine) is not freed, so the allocation fails and 
the OOM killer "takes out" the process requesting the memory.

Disabling the OOM killer causes a lot of kernel warnings and other ghastly-
looking messages, but everything seems to work fine regardless (the allocation 
eventually succeeds, nothing gets killed, everything keeps working).

Version-Release number of selected component (if applicable):

4u4

How reproducible:

Always

Steps to Reproduce:
1. run your kernel for a while (eg: days) so that the VFS cache has time to 
ascimilate all free memory on your machine
2. try to allocate more memory than is "free" - hint - cat /proc/meminfo
3. Here's someone elses code and patch: http://lkml.org/lkml/2006/11/22/17
4. Another way to reproduce this is to run vmware server 1.0.x and power up 
some VMs, wait a few days, then try powering up 1 more VM - it will always 
fail 100% of the time (will get OOM killed).  If the VFS cache is not used 
(eg: after a fresh reboot), the "1 more VM" will always succeed 100% of the 
time.  Disabling the OOM killer will also succeed 100% of the time
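
A minimal user-space sketch of the allocation in step 2 (not part of the
original report; it assumes a plain malloc-and-touch is enough to hit the
path described, and the 256 MB size simply matches the figure in the
description above):

/* alloc_touch.c - allocate a large block and touch every byte so the kernel
 * must hand over physical pages; on the affected kernels this is where the
 * OOM killer fires instead of the page cache being shrunk. */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(void)
{
    size_t size = 256UL * 1024 * 1024;   /* 256 MB, as in the description */
    char *buf = malloc(size);

    if (buf == NULL) {
        perror("malloc");
        return 1;
    }

    /* Touching each byte forces real allocation rather than just reserving
     * address space. */
    memset(buf, 0xAA, size);

    printf("allocated and touched %lu bytes\n", (unsigned long)size);
    free(buf);
    return 0;
}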
  
Actual results:

Process killed

Expected results:

The allocation should succeed after sufficient VFS cache has been freed.

Additional info:

Comment 1 Need Real Name 2007-12-01 10:10:02 UTC
Example log from a server with 5.5 GB of memory "used" in the VFS cache, where 
the OOM killer killed a process that wanted only slightly more memory than was 
marked "free":

*Approx* memory situation:-

[root@svr log]# cat /proc/meminfo
MemTotal:      8247880 kB
MemFree:        145884 kB  <==  145M free
Buffers:          3808 kB
Cached:        5635476 kB  <== 5634M in VFS cache
SwapCached:     446796 kB
Active:        6908460 kB
Inactive:       614712 kB
HighTotal:     7405512 kB
HighFree:        14080 kB
LowTotal:       842368 kB
LowFree:        131804 kB
SwapTotal:    12345936 kB
SwapFree:     11718012 kB
Dirty:            6852 kB
Writeback:           0 kB
Mapped:        5535544 kB
Slab:           118536 kB
CommitLimit:  16469876 kB
Committed_AS:  4290788 kB
PageTables:      22872 kB
VmallocTotal:   106488 kB
VmallocUsed:      9732 kB
VmallocChunk:    95872 kB
HugePages_Total:     0
HugePages_Free:      0
Hugepagesize:     2048 kB
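
As a rough sketch (not from the original report), the fields above can be read
programmatically; this treats Buffers + Cached as the memory the kernel could
in principle reclaim, which is exactly what the report argues should happen
before anything is OOM-killed:

#include <stdio.h>

int main(void)
{
    FILE *fp = fopen("/proc/meminfo", "r");
    char line[128];
    unsigned long free_kb = 0, buffers_kb = 0, cached_kb = 0;

    if (fp == NULL) {
        perror("fopen /proc/meminfo");
        return 1;
    }
    /* sscanf leaves a variable untouched when the line does not match. */
    while (fgets(line, sizeof(line), fp) != NULL) {
        sscanf(line, "MemFree: %lu kB", &free_kb);
        sscanf(line, "Buffers: %lu kB", &buffers_kb);
        sscanf(line, "Cached: %lu kB", &cached_kb);
    }
    fclose(fp);

    printf("free: %lu kB, buffers+cached (mostly reclaimable): %lu kB\n",
           free_kb, buffers_kb + cached_kb);
    return 0;
}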

OOM-Kill messages:-

Nov 29 19:57:41 svr kernel: oom-killer: gfp_mask=0xd0
Nov 29 19:57:41 svr kernel: Mem-info:
Nov 29 19:57:41 svr kernel: DMA per-cpu:
Nov 29 19:57:41 svr kernel: cpu 0 hot: low 2, high 6, batch 1
Nov 29 19:57:41 svr kernel: cpu 0 cold: low 0, high 2, batch 1
Nov 29 19:57:41 svr kernel: cpu 1 hot: low 2, high 6, batch 1
Nov 29 19:57:41 svr kernel: cpu 1 cold: low 0, high 2, batch 1
Nov 29 19:57:41 svr kernel: cpu 2 hot: low 2, high 6, batch 1
Nov 29 19:57:41 svr kernel: cpu 2 cold: low 0, high 2, batch 1
Nov 29 19:57:41 svr kernel: cpu 3 hot: low 2, high 6, batch 1
Nov 29 19:57:41 svr kernel: cpu 3 cold: low 0, high 2, batch 1
Nov 29 19:57:41 svr kernel: Normal per-cpu:
Nov 29 19:57:41 svr kernel: cpu 0 hot: low 32, high 96, batch 16
Nov 29 19:57:42 svr kernel: cpu 0 cold: low 0, high 32, batch 16
Nov 29 19:57:42 svr kernel: cpu 1 hot: low 32, high 96, batch 16
Nov 29 19:57:42 svr kernel: cpu 1 cold: low 0, high 32, batch 16
Nov 29 19:57:43 svr kernel: cpu 2 hot: low 32, high 96, batch 16
Nov 29 19:57:43 svr kernel: cpu 2 cold: low 0, high 32, batch 16
Nov 29 19:57:43 svr kernel: cpu 3 hot: low 32, high 96, batch 16
Nov 29 19:57:43 svr kernel: cpu 3 cold: low 0, high 32, batch 16
Nov 29 19:57:43 svr kernel: HighMem per-cpu:
Nov 29 19:57:43 svr kernel: cpu 0 hot: low 32, high 96, batch 16
Nov 29 19:57:43 svr kernel: cpu 0 cold: low 0, high 32, batch 16
Nov 29 19:57:43 svr kernel: cpu 1 hot: low 32, high 96, batch 16
Nov 29 19:57:43 svr kernel: cpu 1 cold: low 0, high 32, batch 16
Nov 29 19:57:43 svr kernel: cpu 2 hot: low 32, high 96, batch 16
Nov 29 19:57:43 svr kernel: cpu 2 cold: low 0, high 32, batch 16
Nov 29 19:57:43 svr kernel: cpu 3 hot: low 32, high 96, batch 16
Nov 29 19:57:43 svr kernel: cpu 3 cold: low 0, high 32, batch 16
Nov 29 19:57:43 svr kernel: 
Nov 29 19:57:43 svr kernel: Free pages:       22820kB (640kB HighMem)
Nov 29 19:57:43 svr kernel: Active:1648084 inactive:275240 dirty:4738 
writeback:0 unstable:0 free:5705 slab:29408 mapped:1137675 pagetables:5507
Nov 29 19:57:43 svr kernel: DMA free:12156kB min:180kB low:360kB high:540kB 
active:0kB inactive:0kB present:16384kB pages_scanned:918953 
all_unreclaimable? yes
Nov 29 19:57:43 svr kernel: protections[]: 0 0 0
Nov 29 19:57:43 svr kernel: Normal free:10024kB min:10056kB low:20112kB 
high:30168kB active:2248kB inactive:529652kB present:901120kB 
pages_scanned:752004 all_unreclaimable? yes
Nov 29 19:57:43 svr kernel: protections[]: 0 0 0
Nov 29 19:57:43 svr kernel: HighMem free:640kB min:512kB low:1024kB 
high:1536kB active:6590088kB inactive:571308kB present:7929852kB 
pages_scanned:0 all_unreclaimable? no
Nov 29 19:57:43 svr kernel: protections[]: 0 0 0
Nov 29 19:57:43 svr kernel: DMA: 5*4kB 3*8kB 3*16kB 3*32kB 3*64kB 0*128kB 
0*256kB 1*512kB 1*1024kB 1*2048kB 2*4096kB = 12156kB
Nov 29 19:57:43 svr kernel: Normal: 0*4kB 5*8kB 130*16kB 11*32kB 2*64kB 
0*128kB 1*256kB 0*512kB 1*1024kB 1*2048kB 1*4096kB = 10024kB
Nov 29 19:57:43 svr kernel: HighMem: 32*4kB 0*8kB 0*16kB 0*32kB 0*64kB 2*128kB 
1*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 640kB
Nov 29 19:57:43 svr kernel: Swap cache: add 1374857, delete 1267113, find 
556283/712083, race 0+140
Nov 29 19:57:43 svr kernel: 0 bounce buffer pages
Nov 29 19:57:43 svr kernel: Free swap:       11679648kB
Nov 29 19:57:43 svr kernel: 2211839 pages of RAM
Nov 29 19:57:43 svr kernel: 1851378 pages of HIGHMEM
Nov 29 19:57:44 svr kernel: 149935 reserved pages
Nov 29 19:57:44 svr kernel: 1461767 pages shared
Nov 29 19:57:44 svr kernel: 107753 pages swap cached
Nov 29 19:57:44 svr kernel: Out of Memory: Killed process 25203 (vmware-vmx).


Comment 2 Need Real Name 2007-12-01 10:16:54 UTC
Created attachment 274671 [details]
Attached: lengthy log of the kernel messages after disabling the OOM killer, showing eventual allocation success

As above - after:

echo 0 > /proc/sys/vm/oom-kill

Attached is a lengthy log of the kernel messages after disabling the OOM killer
and re-running the same process as above. Note that the kernel warns about half
a dozen times that it *would* have OOM-killed; even though it does not,
everything works itself out in the end.
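
The same knob can be flipped from C rather than the shell; a sketch under the
assumption that /proc/sys/vm/oom-kill behaves as shown above (the knob is
specific to Red Hat kernels of this vintage and needs root to write):

#include <stdio.h>

int main(void)
{
    FILE *fp = fopen("/proc/sys/vm/oom-kill", "w");

    if (fp == NULL) {
        perror("fopen /proc/sys/vm/oom-kill");
        return 1;
    }
    /* Writing "0" disables the OOM killer, exactly like the echo above. */
    if (fprintf(fp, "0\n") < 0) {
        perror("write");
        fclose(fp);
        return 1;
    }
    fclose(fp);
    return 0;
}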

Comment 3 Linda Wang 2008-08-12 16:13:25 UTC
echo 0 > /proc/sys/vm/oom-kill

So, per the last comment, it sounds like setting /proc/sys/vm/oom-kill did
help...

If this works, would it be an acceptable workaround to tune your system this
way by default?

Thanks.

Comment 4 Need Real Name 2008-08-12 23:46:45 UTC
No.  Setting /proc/sys/vm/oom-kill does not generally help - if you are lucky, your host machine will almost grind to a halt for an hour or two but will eventually recover.  Normally, however, the host grinds to a halt for about 24 hours before locking up completely.

I did a massive amount of testing with different servers, different software, and different operating systems.  I also attempted to debug the problem and patch the kernel OOM code, and I filed numerous bug reports.  I cannot find the exact problem, and neither I nor anyone else seems able or willing to fix this problem on Red Hat.

The workaround I reluctantly had no choice but to adopt was to switch to openSUSE - it is the only stable host OS I found that does not have the OOM bug; all the Red Hat and other Linux and BSD distros I tested exhibit the same behaviour.

Comment 5 Jiri Pallich 2012-06-20 15:57:24 UTC
Thank you for submitting this issue for consideration in Red Hat Enterprise Linux. The release you asked us to review has now reached End of Life.
Please see https://access.redhat.com/support/policy/updates/errata/

If you would like Red Hat to reconsider your feature request for an active release, please re-open the request via the appropriate support channels and provide additional supporting details about the importance of this issue.

