Description of problem:

Jan 24 11:17:02 franky. kernel: qemu-kvm invoked oom-killer: gfp_mask=0xd0, order=0, oom_score_adj=0
Jan 24 11:17:02 franky. kernel: qemu-kvm cpuset=emulator mems_allowed=0
Jan 24 11:17:02 franky. kernel: Pid: 13888, comm: qemu-kvm Tainted: GF W 3.7.2-204.fc18.x86_64 #1
Jan 24 11:17:02 franky. kernel: Call Trace:
Jan 24 11:17:02 franky. kernel: [<ffffffff810d2a81>] ? cpuset_print_task_mems_allowed+0x91/0xa0
Jan 24 11:17:02 franky. kernel: [<ffffffff8162c8b4>] dump_header+0x79/0x1b3
Jan 24 11:17:02 franky. kernel: [<ffffffff811330e7>] ? find_lock_task_mm+0x27/0x70
Jan 24 11:17:02 franky. kernel: [<ffffffff8118af70>] ? try_get_mem_cgroup_from_mm+0x50/0x70
Jan 24 11:17:02 franky. kernel: [<ffffffff812ec783>] ? ___ratelimit+0xa3/0x120
Jan 24 11:17:02 franky. kernel: [<ffffffff811334b7>] oom_kill_process+0x1c7/0x310
Jan 24 11:17:02 franky. kernel: [<ffffffff8106acd5>] ? has_ns_capability_noaudit+0x15/0x20
Jan 24 11:17:02 franky. kernel: [<ffffffff8118dcdf>] mem_cgroup_out_of_memory+0x29f/0x2c0
Jan 24 11:17:02 franky. kernel: [<ffffffff8118e473>] __mem_cgroup_try_charge+0x773/0x8d0
Jan 24 11:17:02 franky. kernel: [<ffffffff814c265a>] ? dm_get_live_table+0x4a/0x60
Jan 24 11:17:02 franky. kernel: [<ffffffff8118d6c0>] ? mem_cgroup_write+0x3b0/0x3b0
Jan 24 11:17:02 franky. kernel: [<ffffffff8118ed9e>] mem_cgroup_charge_common+0x8e/0x120
Jan 24 11:17:02 franky. kernel: [<ffffffff8118f86a>] mem_cgroup_cache_charge+0x7a/0xa0
Jan 24 11:17:02 franky. kernel: [<ffffffff81130447>] add_to_page_cache_locked+0x67/0x1a0
Jan 24 11:17:02 franky. kernel: [<ffffffff8113059a>] add_to_page_cache_lru+0x1a/0x40
Jan 24 11:17:02 franky. kernel: [<ffffffff812fb7e1>] ? list_del+0x11/0x40
Jan 24 11:17:02 franky. kernel: [<ffffffff811d0845>] mpage_readpages+0xa5/0x140
Jan 24 11:17:02 franky. kernel: [<ffffffff81219b80>] ? noalloc_get_block_write+0x30/0x30
Jan 24 11:17:02 franky. kernel: [<ffffffff81219b80>] ? noalloc_get_block_write+0x30/0x30
Jan 24 11:17:02 franky. kernel: [<ffffffff8112f7c7>] ? __page_cache_alloc+0xd7/0x100
Jan 24 11:17:02 franky. kernel: [<ffffffff812158fd>] ext4_readpages+0x1d/0x20
Jan 24 11:17:02 franky. kernel: [<ffffffff8113c0e7>] __do_page_cache_readahead+0x1b7/0x240
Jan 24 11:17:02 franky. kernel: [<ffffffff8113c491>] ra_submit+0x21/0x30
Jan 24 11:17:02 franky. kernel: [<ffffffff8113c5b5>] ondemand_readahead+0x115/0x240
Jan 24 11:17:02 franky. kernel: [<ffffffff8113c7b3>] page_cache_sync_readahead+0x33/0x50
Jan 24 11:17:02 franky. kernel: [<ffffffff81131a10>] generic_file_aio_read+0x540/0x6c0
Jan 24 11:17:02 franky. kernel: [<ffffffff81194b27>] do_sync_read+0xa7/0xe0
Jan 24 11:17:02 franky. kernel: [<ffffffff81195279>] vfs_read+0xa9/0x180
Jan 24 11:17:02 franky. kernel: [<ffffffff8119552a>] sys_pread64+0x9a/0xb0
Jan 24 11:17:02 franky. kernel: [<ffffffff8163ded9>] system_call_fastpath+0x16/0x1b
Jan 24 11:17:02 franky. kernel: Task in /libvirt/qemu/Server2008r2 killed as a result of limit of /libvirt/qemu/Server2008r2
Jan 24 11:17:02 franky. kernel: memory: usage 4549836kB, limit 4549836kB, failcnt 565605
Jan 24 11:17:02 franky. kernel: memory+swap: usage 0kB, limit 9007199254740991kB, failcnt 0
Jan 24 11:17:02 franky. kernel: Mem-Info:
Jan 24 11:17:02 franky. kernel: Node 0 DMA per-cpu:
Jan 24 11:17:02 franky. kernel: CPU 0: hi: 0, btch: 1 usd: 0
Jan 24 11:17:02 franky. kernel: CPU 1: hi: 0, btch: 1 usd: 0
Jan 24 11:17:02 franky. kernel: CPU 2: hi: 0, btch: 1 usd: 0
Jan 24 11:17:02 franky. kernel: CPU 3: hi: 0, btch: 1 usd: 0
Jan 24 11:17:02 franky. kernel: CPU 4: hi: 0, btch: 1 usd: 0
Jan 24 11:17:02 franky. kernel: CPU 5: hi: 0, btch: 1 usd: 0
Jan 24 11:17:02 franky. kernel: CPU 6: hi: 0, btch: 1 usd: 0
Jan 24 11:17:02 franky. kernel: CPU 7: hi: 0, btch: 1 usd: 0
Jan 24 11:17:02 franky. kernel: Node 0 DMA32 per-cpu:
Jan 24 11:17:02 franky. kernel: CPU 0: hi: 186, btch: 31 usd: 195
Jan 24 11:17:02 franky. kernel: CPU 1: hi: 186, btch: 31 usd: 192
Jan 24 11:17:02 franky. kernel: CPU 2: hi: 186, btch: 31 usd: 0
Jan 24 11:17:02 franky. kernel: CPU 3: hi: 186, btch: 31 usd: 18
Jan 24 11:17:02 franky. kernel: CPU 4: hi: 186, btch: 31 usd: 30
Jan 24 11:17:02 franky. kernel: CPU 5: hi: 186, btch: 31 usd: 176
Jan 24 11:17:02 franky. kernel: CPU 6: hi: 186, btch: 31 usd: 0
Jan 24 11:17:02 franky. kernel: CPU 7: hi: 186, btch: 31 usd: 30
Jan 24 11:17:02 franky. kernel: Node 0 Normal per-cpu:
Jan 24 11:17:02 franky. kernel: CPU 0: hi: 186, btch: 31 usd: 71
Jan 24 11:17:02 franky. kernel: CPU 1: hi: 186, btch: 31 usd: 91
Jan 24 11:17:02 franky. kernel: CPU 2: hi: 186, btch: 31 usd: 146
Jan 24 11:17:02 franky. kernel: CPU 3: hi: 186, btch: 31 usd: 130
Jan 24 11:17:02 franky. kernel: CPU 4: hi: 186, btch: 31 usd: 110
Jan 24 11:17:02 franky. kernel: CPU 5: hi: 186, btch: 31 usd: 130
Jan 24 11:17:02 franky. kernel: CPU 6: hi: 186, btch: 31 usd: 49
Jan 24 11:17:02 franky. kernel: CPU 7: hi: 186, btch: 31 usd: 52
Jan 24 11:17:02 franky. kernel: [291B blob data]
Jan 24 11:17:02 franky. kernel: Node 0 DMA free:15888kB min:64kB low:80kB high:96kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):...nreclaimable? no
Jan 24 11:17:02 franky. kernel: lowmem_reserve[]: 0 2139 15997 15997
Jan 24 11:17:02 franky. kernel: Node 0 DMA32 free:2175560kB min:9028kB low:11284kB high:13540kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB iso...nreclaimable? no
Jan 24 11:17:02 franky. kernel: lowmem_reserve[]: 0 0 13858 13858
Jan 24 11:17:02 franky. kernel: Node 0 Normal free:4780740kB min:58488kB low:73108kB high:87732kB active_anon:5275760kB inactive_anon:206408kB active_file:1650712kB inactive_file:1709500kB unevictable:4...nreclaimable? no
Jan 24 11:17:02 franky. kernel: lowmem_reserve[]: 0 0 0 0
Jan 24 11:17:02 franky. kernel: Node 0 DMA: 2*4kB 1*8kB 2*16kB 1*32kB 1*64kB 1*128kB 1*256kB 0*512kB 1*1024kB 1*2048kB 3*4096kB = 15888kB
Jan 24 11:17:02 franky. kernel: Node 0 DMA32: 14*4kB 10*8kB 10*16kB 9*32kB 16*64kB 8*128kB 12*256kB 6*512kB 4*1024kB 6*2048kB 525*4096kB = 2175560kB
Jan 24 11:17:02 franky. kernel: Node 0 Normal: 368*4kB 591*8kB 283*16kB 157*32kB 421*64kB 601*128kB 237*256kB 162*512kB 177*1024kB 23*2048kB 1047*4096kB = 4780104kB
Jan 24 11:17:02 franky. kernel: 862236 total pagecache pages
Jan 24 11:17:02 franky. kernel: 7631 pages in swap cache
Jan 24 11:17:02 franky. kernel: Swap cache stats: add 18439, delete 10808, find 3479/3479
Jan 24 11:17:02 franky. kernel: Free swap = 8145088kB
Jan 24 11:17:02 franky. kernel: Total swap = 8175612kB
Jan 24 11:17:02 franky. kernel: 4193776 pages RAM
Jan 24 11:17:02 franky. kernel: 107207 pages reserved
Jan 24 11:17:02 franky. kernel: 2680404 pages shared
Jan 24 11:17:02 franky. kernel: 1862358 pages non-shared
Jan 24 11:17:02 franky. kernel: [ pid ]   uid  tgid total_vm     rss nr_ptes swapents oom_score_adj name
Jan 24 11:17:02 franky. kernel: [13623]   107 13623  1275536 1059312    2236     6839             0 qemu-kvm
Jan 24 11:17:02 franky. kernel: Memory cgroup out of memory: Kill process 13993 (qemu-kvm) score 335 or sacrifice child
Jan 24 11:17:02 franky. kernel: Killed process 13993 (qemu-kvm) total-vm:5102144kB, anon-rss:4226008kB, file-rss:11240kB

free reports:

                 total       used       free     shared    buffers     cached
    Mem:      16346276    5134592   11211684          0     276768    3143840
    -/+ buffers/cache:    1713984   14632292
    Swap:      8175612          0    8175612

The VM in question was allocated 4096MB of RAM. Where is this cgroup limit coming from?
It appears that setting disk cache mode to "None" corrects this issue.
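For reference, the cache mode is set on the disk's <driver> element in the libvirt domain XML; a minimal sketch follows (the source path and target device are illustrative, not taken from this report):

```xml
<disk type='file' device='disk'>
  <!-- cache='none' makes qemu open the image with O_DIRECT, bypassing the
       host page cache, so guest disk I/O is no longer charged against the
       cgroup's page-cache accounting -->
  <driver name='qemu' type='raw' cache='none'/>
  <source file='/var/lib/libvirt/images/Server2008r2.img'/>
  <target dev='vda' bus='virtio'/>
</disk>
```

That fits the stack trace above, which shows the OOM kill being triggered from page-cache charging (mem_cgroup_cache_charge) during readahead rather than from an anonymous-memory allocation.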
There was a bug in libvirt (fixed upstream) where it set the cgroup memory limit too low, causing qemu to die under load. I believe the fix for this is:

commit 3c83df679e8feab939c08b1f97c48f9290a5b8cd
Author: Michal Privoznik <mprivozn>
Date:   Tue Jan 8 10:15:49 2013 +0100

    qemu: Relax hard RSS limit

    Currently, if there's no hard memory limit defined for a domain,
    libvirt tries to calculate one based on the domain definition and a
    magic equation, and sets it upon domain startup. The rationale was
    that if there's a memory leak or exploit in qemu, we should prevent
    the host system from thrashing. However, the equation was too tight,
    as it didn't reflect what the kernel counts into the memory used by a
    process. Since many hosts do have swap, nobody noticed anything,
    because when the hard memory limit is reached the process can
    continue allocating memory in swap. However, if there is no swap on
    the host, the process gets killed by the OOM killer; in our case,
    the qemu process. To prevent this, we need to relax the hard RSS
    limit. Moreover, we should reflect more precisely the kernel's way of
    accounting memory for a process: even the kernel caches are counted
    within the memory used by a process (within cgroups, at least).
    Hence the magic equation has to be changed:

        limit = 1.5 * (domain memory + total video memory) +
                (32MB for cache per each disk) + 200MB

(By the way -- the commit's description is wrong. It *does* affect you even if you have swap.)

There's also bug 891324 in libvirtd (actually, in libselinux) where a memory leak causes libvirtd to grow and grow until there's no memory left to run guests; however, that's probably not the bug you are hitting.
Thanks Rich, that sounds like the culprit.
libvirt-0.10.2.3-1.fc18 has been submitted as an update for Fedora 18. https://admin.fedoraproject.org/updates/libvirt-0.10.2.3-1.fc18
libvirt-0.10.2.3-1.fc18 has been pushed to the Fedora 18 stable repository. If problems still persist, please make note of it in this bug report.
Possibly this bug has started to happen again (bug 966939).