Created attachment 468712 [details]
Description of problem:
On a system with 4GB of memory, we end up with OOM kills and 443178 active objects in the avtab_node slab cache.
Version-Release number of selected component (if applicable):
Steps to Reproduce:
(on this 4g box, oops)
(need copyten.csh & copyten.main in /root/copten)
# Prepare files for copying
mkdir -p /files/tenfiles
for I in 205265992 43076496 33975530 45280566 12828649 8482669 46831855 14404182 22119710 103650696;
do fallocate -l $I /files/tenfiles/file-$I;
done
mount /dev/ram1 /mnt/test
./copyten.csh -vv /mnt/test
OOM killer kills processes, and the system never recovers
I'll attach the shell scripts, /proc/meminfo, /proc/slabinfo and dmesg output.
Created attachment 468713 [details]
Created attachment 468714 [details]
Created attachment 468715 [details]
Created attachment 468716 [details]
Larry, would you mind taking a look at this to see if you can figure out where the memory has gone?
This actually copied from an ext2 fs on /dev/ram0 (created by the script) to another one on /dev/ram1.
I think you need swap space to back up a ramdisk...
OK, so it's a misconfiguration? I'm just surprised that after the ramdisks were unmounted, the memory was still not reclaimable. Does that sound right?
This request was evaluated by Red Hat Product Management for
inclusion in the current release of Red Hat Enterprise Linux.
Because the affected component is not scheduled to be updated
in the current release, Red Hat is unfortunately unable to
address this request at this time. Red Hat invites you to
ask your support representative to propose this request, if
appropriate and relevant, in the next release of Red Hat
Enterprise Linux. If you would like it considered as an
exception in the current release, please ask your support
This request was erroneously denied for the current release of Red Hat
Enterprise Linux. The error has been fixed and this request has been
re-proposed for the current release.
Should we close this as notabug?
(In reply to comment #13)
> Should we close this as notabug?
No, I'd like Larry's opinion on why the memory remained unreclaimable. See comment number 9.
I couldn't reproduce this when I had swap space late last year. I'll try again.
OK, I finally did reproduce this using both ramfs and tmpfs with no swap space. In both cases the pagecache memory for these filesystems is not reclaimable, so if you overcommit memory with them, the reclaim code moves all the pagecache pages from the active file LRU to the inactive file LRU list and then to the unevictable list, because the memory is essentially wired. Also, these pages are not visible via /proc/meminfo. Since there is no memory left, the system OOMkills processes and still makes no progress freeing memory, because it is all unevictable.
<about 3GB used by ramfs pages are in pagecache>
active_anon:30852 inactive_anon:532 isolated_anon:0
active_file:23171 inactive_file:626164 isolated_file:0
unevictable:0 dirty:25 writeback:0 unstable:0
free:227598 slab_reclaimable:6504 slab_unreclaimable:22340
<after overcommitting all memory, pages are moved to the unevictable list>
active_anon:3499 inactive_anon:8877 isolated_anon:0
active_file:1216 inactive_file:3829 isolated_file:0
unevictable:524288 dirty:4 writeback:0 unstable:0
free:371127 slab_reclaimable:2859 slab_unreclaimable:22517
When I unmount the filesystem, all the pages are moved from the unevictable list back to the free list, but by then the system may have killed just about every process in an attempt to reclaim memory.
<after unmounting ramfs filesystem pages are removed from unevictable & freed>
active_anon:3549 inactive_anon:10305 isolated_anon:0
active_file:1279 inactive_file:4873 isolated_file:0
unevictable:0 dirty:10 writeback:0 unstable:0
free:893840 slab_reclaimable:1648 slab_unreclaimable:22334
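To make the dumps above easier to compare, here is a rough sketch that converts the page counts to MiB. It assumes 4 KiB pages, which the dumps themselves do not state; the helper name pages_to_mib is made up for illustration.

```shell
#!/bin/sh
# Convert a page count from a memory dump to MiB.
# Assumption: 4 KiB pages. Pass another page size in bytes as the second
# argument (for example the output of `getconf PAGESIZE`) if that is wrong.
pages_to_mib() {
    pages=$1
    page_size=${2:-4096}
    # Integer arithmetic, so the result is rounded down to whole MiB.
    echo $(( pages * page_size / 1048576 ))
}

pages_to_mib 626164   # inactive_file while the ramfs pages sit in pagecache
pages_to_mib 524288   # unevictable after overcommitting
pages_to_mib 893840   # free after unmounting
```

Under that assumption, the 524288 unevictable pages come to exactly 2048 MiB.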
The bottom line is I don't think we can support using either ramfs or tmpfs with no swap space while over-committing memory with those file systems. In other words, don't use the size=4G mount option on a 4GB system and expect it to work correctly if you use everything.
What do you think, Jeff?
Actually, after looking at your dmesg output, there is nothing on the unevictable list. I can't reproduce that behavior, can you?
(In reply to comment #19)
> Actually, after looking at your dmesg output, there is nothing on the unevictable
> list. I can't reproduce that behavior, can you?
I'll give it another try tomorrow and update the bug. Thanks for looking into this, Larry!
OK, after talking with Larry, we agree that this is just the result of a misconfiguration. Don't do that.
Specifically, the ramdisk driver allocates the pages for the ramdisk using alloc_pages and holds them in a private cache forever (until the system reboots). Since this example overcommits the RAM in 2 ramdisks, memory is exhausted and the system OOMkills everything until it finally panics because there is nothing else to OOMkill.
The bottom line is that you can't overcommit RAM using ramdisk, ramfs or tmpfs with no swap space, or the system will OOMkill everything until it panics or hangs.
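As a rough pre-flight check along those lines, something like the following could compare the total size you plan to put in ramdisk, ramfs or tmpfs against MemTotal + SwapTotal from /proc/meminfo. This is only a sketch: the function name fits_in_ram_plus_swap is made up for illustration, and passing the check does not guarantee safety, since the kernel and running processes need memory as well.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if the requested size (in kB) is
# strictly below MemTotal + SwapTotal. Reads meminfo-format text on stdin.
fits_in_ram_plus_swap() {
    want_kb=$1
    awk -v want="$want_kb" '
        /^MemTotal:/  { total += $2 }
        /^SwapTotal:/ { total += $2 }
        END { exit (want < total) ? 0 : 1 }
    '
}

# Example: would 4 GiB of RAM-backed storage fit on this machine?
if [ -r /proc/meminfo ]; then
    if fits_in_ram_plus_swap $((4 * 1024 * 1024)) < /proc/meminfo; then
        echo "4 GiB is below RAM + swap"
    else
        echo "4 GiB would overcommit RAM + swap"
    fi
fi
```

On a 4GB box with no swap, MemTotal is somewhat below 4 GiB, so a check like this would be expected to fail for the configuration in this bug.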