Bug 966939
| Field | Value |
|---|---|
| Summary | Memory cgroup out of memory: Kill process <PID> (qemu-system-x86) score 302 or sacrifice child |
| Product | Fedora |
| Component | libvirt |
| Version | 19 |
| Hardware | Unspecified |
| OS | Unspecified |
| Status | CLOSED ERRATA |
| Keywords | Reopened |
| Severity | unspecified |
| Priority | unspecified |
| Reporter | Richard W.M. Jones <rjones> |
| Assignee | Libvirt Maintainers <libvirt-maint> |
| QA Contact | Fedora Extras Quality Assurance <extras-qa> |
| CC | berrange, clalancette, crobinso, itamar, jforbes, jyang, laine, libvirt-maint, lloyd.cobb, mprivozn, rjones, rlpowell, veillard, virt-maint |
| Fixed In Version | libvirt-1.0.5.7-1.fc19 |
| Doc Type | Bug Fix |
| Type | Bug |
| Last Closed | 2013-11-15 20:36:38 UTC |
Description
Richard W.M. Jones
2013-05-24 10:34:41 UTC
So I can pretty reliably make qemu be killed by running `make -C tests/md check` over and over inside the guest. I'm not sure if it's this specific test that is causing it to die or just any use of libguestfs. Here is `top` on the host showing qemu every 10 seconds until it died. It doesn't appear to grow very much.

```
  PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
14456 qemu 20 0 4641m 1.8g 12m S 101.9 11.7 2:44.89 qemu-system-x86
14456 qemu 20 0 4641m 1.8g 12m S 97.4 11.7 2:54.59 qemu-system-x86
14456 qemu 20 0 4641m 1.8g 12m S 97.3 11.7 3:04.50 qemu-system-x86
14456 qemu 20 0 4641m 1.8g 12m S 94.3 11.7 3:15.01 qemu-system-x86
14456 qemu 20 0 4641m 1.8g 12m S 6.5 11.7 3:23.54 qemu-system-x86
14456 qemu 20 0 4625m 1.8g 12m S 97.3 11.7 3:25.58 qemu-system-x86
14456 qemu 20 0 4625m 1.8g 12m S 102.2 11.7 3:34.90 qemu-system-x86
14456 qemu 20 0 4625m 1.8g 12m S 97.3 11.7 3:44.76 qemu-system-x86
14456 qemu 20 0 4625m 1.8g 12m S 64.8 11.7 3:54.48 qemu-system-x86
14456 qemu 20 0 4625m 1.8g 12m S 97.4 11.7 4:03.68 qemu-system-x86
14456 qemu 20 0 4625m 1.8g 12m S 97.3 11.7 4:13.14 qemu-system-x86
14456 qemu 20 0 4625m 1.8g 12m S 103.5 11.7 4:23.24 qemu-system-x86
14456 qemu 20 0 4625m 1.8g 12m S 97.3 11.7 4:32.95 qemu-system-x86
14456 qemu 20 0 4625m 1.8g 12m S 96.7 11.7 4:42.87 qemu-system-x86
14456 qemu 20 0 4625m 1.8g 12m S 97.0 11.7 4:52.02 qemu-system-x86
14456 qemu 20 0 4625m 1.8g 12m S 97.3 11.7 5:01.97 qemu-system-x86
[... omitted lots of uninteresting lines in the middle ...]
14456 qemu 20 0 4697m 2.0g 12m S 0.0 13.3 30:18.25 qemu-system-x86
14456 qemu 20 0 4673m 2.0g 12m S 0.0 13.3 30:18.33 qemu-system-x86
14456 qemu 20 0 4673m 2.0g 12m S 6.2 13.3 30:18.48 qemu-system-x86
14456 qemu 20 0 4649m 2.0g 12m S 0.0 13.3 30:18.63 qemu-system-x86
14456 qemu 20 0 4649m 2.0g 12m S 0.0 13.3 30:18.71 qemu-system-x86
14456 qemu 20 0 4625m 2.0g 12m S 0.0 13.3 30:18.85 qemu-system-x86
14456 qemu 20 0 4625m 2.0g 12m S 0.0 13.3 30:18.96 qemu-system-x86
14456 qemu 20 0 4625m 2.0g 12m S 0.0 13.3 30:19.11 qemu-system-x86
14456 qemu 20 0 4625m 2.0g 12m S 0.0 13.3 30:19.19 qemu-system-x86
14456 qemu 20 0 4665m 2.0g 12m S 0.0 13.3 30:19.26 qemu-system-x86
14456 qemu 20 0 4625m 2.0g 12m S 0.0 13.3 30:19.33 qemu-system-x86
14456 qemu 20 0 4625m 2.0g 12m S 0.0 13.3 30:19.46 qemu-system-x86
14456 qemu 20 0 4625m 2.0g 12m S 0.0 13.3 30:19.59 qemu-system-x86
14456 qemu 20 0 4625m 2.0g 12m S 0.0 13.3 30:19.73 qemu-system-x86
14456 qemu 20 0 4625m 2.0g 12m S 0.0 13.3 30:19.84 qemu-system-x86
14456 qemu 20 0 4649m 2.0g 12m S 97.0 13.3 30:25.40 qemu-system-x86
14456 qemu 20 0 4649m 2.0g 12m S 97.0 13.3 30:34.78 qemu-system-x86
14456 qemu 20 0 4649m 2.0g 12m S 12.4 13.3 30:44.45 qemu-system-x86
14456 qemu 20 0 4649m 2.0g 12m S 97.2 13.3 30:54.40 qemu-system-x86
14456 qemu 20 0 4649m 2.0g 12m S 103.9 13.3 31:04.36 qemu-system-x86
14456 qemu 20 0 4649m 2.0g 12m S 97.3 13.3 31:13.93 qemu-system-x86
14456 qemu 20 0 4649m 2.0g 12m S 97.3 13.3 31:23.94 qemu-system-x86
14456 qemu 20 0 4649m 2.0g 12m S 96.1 13.3 31:33.97 qemu-system-x86
14456 qemu 20 0 4649m 2.0g 12m S 103.3 13.3 31:43.98 qemu-system-x86
14456 qemu 20 0 4649m 2.0g 12m S 97.4 13.3 31:54.12 qemu-system-x86
14456 qemu 20 0 4649m 2.0g 12m S 97.3 13.3 32:03.28 qemu-system-x86
14456 qemu 20 0 4641m 2.0g 12m S 97.4 13.3 32:13.35 qemu-system-x86
14456 qemu 20 0 4689m 2.0g 12m S 103.4 13.3 32:23.58 qemu-system-x86
14456 qemu 20 0 4689m 2.0g 12m S 97.3 13.3 32:33.25 qemu-system-x86
14456 qemu 20 0 4689m 2.0g 12m S 97.3 13.3 32:43.46 qemu-system-x86
14456 qemu 20 0 0 0 0 Z 0.0 0.0 32:50.19 qemu-system-x86   <--- killed here by oom-killer
```

Created attachment 752549 [details]
guest XML
Created attachment 752560 [details]
/proc/<PID>/maps from qemu
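For anyone reproducing this, samples like the ones above can be captured with `top` in batch mode. A minimal sketch (the `pgrep` match is an assumption; adjust it to whichever qemu instance you care about):

```sh
# Sample the qemu process every 10 seconds in batch mode and keep only
# its lines; top and pgrep are standard procps utilities.
top -b -d 10 -p "$(pgrep -f qemu-system-x86 | head -n1)" | grep qemu-system-x86
```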
I reran it, running top every 1 second instead of every 10 seconds. It's not very interesting, but here are the top results right before the crash:

```
16578 qemu 20 0 4643m 1.8g 12m S 103.9 12.0 12:30.27 qemu-system-x86
16578 qemu 20 0 4643m 1.8g 12m S 114.8 12.0 12:31.64 qemu-system-x86
16578 qemu 20 0 4643m 1.8g 12m R 96.1 12.0 12:32.89 qemu-system-x86
16578 qemu 20 0 4643m 1.8g 12m S 102.2 12.0 12:34.20 qemu-system-x86
16578 qemu 20 0 4643m 1.8g 12m S 96.6 12.0 12:35.44 qemu-system-x86
16578 qemu 20 0 4643m 1.8g 12m S 102.1 12.0 12:36.60 qemu-system-x86
16578 qemu 20 0 4643m 1.8g 12m S 102.1 12.0 12:37.76 qemu-system-x86
16578 qemu 20 0 4643m 1.8g 12m S 97.3 12.0 12:38.91 qemu-system-x86
16578 qemu 20 0 4643m 1.8g 12m S 102.4 12.0 12:40.07 qemu-system-x86
16578 qemu 20 0 4643m 1.8g 12m S 95.2 12.0 12:41.22 qemu-system-x86
16578 qemu 20 0 4643m 1.8g 12m S 102.7 12.0 12:42.38 qemu-system-x86
16578 qemu 20 0 4643m 1.8g 12m S 103.8 12.0 12:43.55 qemu-system-x86
16578 qemu 20 0 4643m 1.8g 12m S 97.3 12.0 12:44.70 qemu-system-x86
16578 qemu 20 0 4643m 1.8g 12m S 97.3 12.0 12:45.76 qemu-system-x86
16578 qemu 20 0 4643m 1.8g 12m S 97.4 12.0 12:46.90 qemu-system-x86
16578 qemu 20 0 4643m 1.8g 12m S 103.5 12.0 12:48.06 qemu-system-x86
16578 qemu 20 0 4643m 1.8g 12m S 97.3 12.0 12:49.21 qemu-system-x86
16578 qemu 20 0 4643m 1.8g 12m S 103.7 12.0 12:50.37 qemu-system-x86
16578 qemu 20 0 4643m 1.8g 12m S 12.6 12.0 12:51.25 qemu-system-x86
16578 qemu 20 0 4643m 1.8g 12m S 103.8 12.0 12:51.94 qemu-system-x86
16578 qemu 20 0 0 0 0 R 24.8 0.0 12:52.95 qemu-system-x86
16578 qemu 20 0 0 0 0 D 0.0 0.0 12:53.30 qemu-system-x86
16578 qemu 20 0 0 0 0 D 0.0 0.0 12:53.31 qemu-system-x86
16578 qemu 20 0 0 0 0 D 0.0 0.0 12:53.31 qemu-system-x86
16578 qemu 20 0 0 0 0 D 0.0 0.0 12:53.31 qemu-system-x86
16578 qemu 20 0 0 0 0 D 0.0 0.0 12:53.31 qemu-system-x86
16578 qemu 20 0 0 0 0 D 0.0 0.0 12:53.31 qemu-system-x86
16578 qemu 20 0 0 0 0 D 0.0 0.0 12:53.31 qemu-system-x86
16578 qemu 20 0 0 0 0 D 0.0 0.0 12:53.31 qemu-system-x86
16578 qemu 20 0 0 0 0 D 0.0 0.0 12:53.31 qemu-system-x86
16578 qemu 20 0 0 0 0 D 0.0 0.0 12:53.31 qemu-system-x86
16578 qemu 20 0 0 0 0 D 0.0 0.0 12:53.31 qemu-system-x86
```

Created attachment 752563 [details]
/proc/<PID>/smaps from qemu
Created attachment 752564 [details]
/proc/<PID>/status from qemu
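As a side note, per-mapping dumps like the smaps attachment can be totalled quickly. A minimal sketch (QEMU_PID is a placeholder for the real process ID):

```sh
# Sum the Rss field of every mapping in the process's smaps;
# the kernel reports these sizes in kB.
awk '/^Rss:/ { sum += $2 } END { print sum " kB resident" }' "/proc/$QEMU_PID/smaps"
```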
The limits were introduced as a fix for bug 771424. Maybe we shouldn't introduce them at all.

This bug appears to have been reported against 'rawhide' during the Fedora 20 development cycle. Changing version to '20'. More information and the reason for this action are here: https://fedoraproject.org/wiki/BugZappers/HouseKeeping/Fedora20

The VIRT for the qemu process has always been *significantly* larger than the RAM limit I set for the VM (like, ~3500MiB vs. 2048MiB), and I don't know why, and I'd sure like to, but that's off topic for this ticket. :)

What is on topic is that I'm running Fedora 19, and I've been overtaxing my poor hypervisor for years, and everything was fine until I noticed that it was swapping the qemus out even with plenty of RAM left. I set swappiness to 0 (and later to 1; the following happened while it was at 1), which led to:

```
Nov 5 14:20:18 basti kernel: [110395.859780] qemu-system-x86 invoked oom-killer: gfp_mask=0xd0, order=0, oom_score_adj=0
Nov 5 14:20:18 basti kernel: [110395.909095] qemu-system-x86 cpuset=emulator mems_allowed=0
Nov 5 14:20:18 basti kernel: [110395.956727] CPU: 1 PID: 15602 Comm: qemu-system-x86 Not tainted 3.10.3-300.fc19.x86_64 #1
Nov 5 14:20:18 basti kernel: [110396.004113] Hardware name: Dell Inc. Precision WorkStation 390 /0DN075, BIOS 2.1.2 11/30/2006
Nov 5 14:20:18 basti kernel: [110396.051614] ffff88021edc1000 ffff880209d0f9c8 ffffffff81643216 ffff880209d0fa30
Nov 5 14:20:18 basti kernel: [110396.098018] ffffffff81640274 ffff880209d0f9f8 ffffffff8118c058 ffff880230136320
Nov 5 14:20:18 basti kernel: [110396.142619] 0000000000000000 ffff880200000000 0000000000000206 ffff88022f4d16e0
Nov 5 14:20:18 basti kernel: [110396.186463] Call Trace:
Nov 5 14:20:18 basti kernel: [110396.229494] [<ffffffff81643216>] dump_stack+0x19/0x1b
Nov 5 14:20:18 basti kernel: [110396.272976] [<ffffffff81640274>] dump_header+0x7a/0x1b6
Nov 5 14:20:18 basti kernel: [110396.315418] [<ffffffff8118c058>] ? try_get_mem_cgroup_from_mm+0x28/0x60
Nov 5 14:20:18 basti kernel: [110396.358233] [<ffffffff811319ce>] oom_kill_process+0x1be/0x310
Nov 5 14:20:18 basti kernel: [110396.400675] [<ffffffff8118c227>] ? mem_cgroup_iter+0x197/0x2f0
Nov 5 14:20:18 basti kernel: [110396.442810] [<ffffffff8118f15e>] __mem_cgroup_try_charge+0xade/0xb60
Nov 5 14:20:18 basti kernel: [110396.484348] [<ffffffff8118fa40>] ? mem_cgroup_charge_common+0x120/0x120
Nov 5 14:20:18 basti kernel: [110396.525722] [<ffffffff8118f9a6>] mem_cgroup_charge_common+0x86/0x120
Nov 5 14:20:18 basti kernel: [110396.566974] [<ffffffff8119172a>] mem_cgroup_cache_charge+0x7a/0xa0
Nov 5 14:20:18 basti kernel: [110396.608311] [<ffffffff8112e958>] add_to_page_cache_locked+0x58/0x1d0
Nov 5 14:20:18 basti kernel: [110396.649999] [<ffffffff8112eaea>] add_to_page_cache_lru+0x1a/0x40
Nov 5 14:20:18 basti kernel: [110396.691548] [<ffffffff812fb61d>] ? list_del+0xd/0x30
Nov 5 14:20:18 basti kernel: [110396.732642] [<ffffffff8113a75d>] __do_page_cache_readahead+0x21d/0x240
Nov 5 14:20:18 basti kernel: [110396.773957] [<ffffffff8113aba6>] ondemand_readahead+0x126/0x250
Nov 5 14:20:18 basti kernel: [110396.815872] [<ffffffff8113ad03>] page_cache_sync_readahead+0x33/0x50
Nov 5 14:20:18 basti kernel: [110396.858487] [<ffffffff8112fcd5>] generic_file_aio_read+0x4b5/0x700
Nov 5 14:20:18 basti kernel: [110396.901378] [<ffffffff811cd5ec>] blkdev_aio_read+0x4c/0x70
Nov 5 14:20:18 basti kernel: [110396.943950] [<ffffffff81196fd0>] do_sync_read+0x80/0xb0
Nov 5 14:20:18 basti kernel: [110396.986213] [<ffffffff811975dc>] vfs_read+0x9c/0x170
Nov 5 14:20:18 basti kernel: [110397.028857] [<ffffffff81198202>] SyS_pread64+0x72/0xb0
Nov 5 14:20:18 basti kernel: [110397.071044] [<ffffffff81651819>] system_call_fastpath+0x16/0x1b
Nov 5 14:20:18 basti kernel: [110397.113626] Task in /machine/stodi.libvirt-qemu killed as a result of limit of /machine/stodi.libvirt-qemu
Nov 5 14:20:18 basti kernel: [110397.158316] memory: usage 2047992kB, limit 2048000kB, failcnt 208905
Nov 5 14:20:18 basti kernel: [110397.203441] memory+swap: usage 0kB, limit 9007199254740991kB, failcnt 0
Nov 5 14:20:18 basti kernel: [110397.248441] kmem: usage 0kB, limit 9007199254740991kB, failcnt 0
Nov 5 14:20:19 basti kernel: [110397.294061] Memory cgroup stats for /machine/stodi.libvirt-qemu: cache:6488KB rss:2041400KB rss_huge:651264KB mapped_file:40KB inactive_anon:579124KB active_anon:1462288KB inactive_file:3308KB active_file:3168KB unevictable:0KB
Nov 5 14:20:19 basti kernel: [110397.392164] [ pid ] uid tgid total_vm rss nr_ptes swapents oom_score_adj name
Nov 5 14:20:19 basti kernel: [110397.442373] [ 9188] 107 9188 895249 512501 1208 0 0 qemu-system-x86
Nov 5 14:20:19 basti kernel: [110397.494000] Memory cgroup out of memory: Kill process 16089 (qemu-system-x86) score 1003 or sacrifice child
Nov 5 14:20:19 basti kernel: [110397.546583] Killed process 16089 (qemu-system-x86) total-vm:3580996kB, anon-rss:2039868kB, file-rss:10136kB
```

which I believe to be the issue under discussion. It does not strike me as OK that a silently-chosen limit I didn't know about should cause a VM to be OOMed when there's plenty of RAM left on the hypervisor, just because I tweaked swappiness. On the plus side, now I know about virsh memtune, and that seems to fix things nicely.

-Robin
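For anyone hitting this before the fix, the cgroup limit named in that log can be inspected and overridden. A minimal sketch, assuming the cgroup v1 layout of that era and the domain name ("stodi") taken from the log above; the 4 GiB value is an arbitrary example:

```sh
# The OOM message names the cgroup; under cgroup v1 its hard limit and
# failure count are exposed as files under the memory controller.
cat /sys/fs/cgroup/memory/machine/stodi.libvirt-qemu/memory.limit_in_bytes
cat /sys/fs/cgroup/memory/machine/stodi.libvirt-qemu/memory.failcnt

# Setting an explicit hard limit (in KiB) replaces libvirt's guessed value.
virsh memtune stodi --hard-limit 4194304
```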
This issue has been fixed upstream for a while now (and I forgot to update this bug, sorry):

```
commit 16bcb3b61675a88bff00317336b9610080c31000
Author:     Michal Privoznik <mprivozn>
AuthorDate: Fri Aug 9 14:46:54 2013 +0200
Commit:     Michal Privoznik <mprivozn>
CommitDate: Mon Aug 19 11:16:58 2013 +0200

    qemu: Drop qemuDomainMemoryLimit

    This function is to guess the correct limit for maximal memory
    usage by qemu for given domain. This can never be guessed
    correctly, not to mention all the pains and sleepless nights
    this code has caused. Once somebody discovers algorithm to
    solve the Halting Problem, we can compute the limit
    algorithmically. But till then, this code should never see
    the light of the release again.
```

```
$ git describe --contains 16bcb3b61675a88bff00317336b9610080c31000
v1.1.2-rc1~86
```

Therefore, I am closing this bug.
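With the auto-computed limit gone, a cap only exists if the admin configures one explicitly. A sketch of the domain XML element involved (the value is an arbitrary example, not a recommendation):

```xml
<!-- Explicit memory tuning in the domain XML; after the fix, libvirt no
     longer invents a hard_limit, so this element is only present if added
     by hand (or via virsh memtune --config). -->
<memtune>
  <hard_limit unit='KiB'>4194304</hard_limit>
</memtune>
```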
Since comment #9 is talking about Fedora 19, we should backport that patch.

libvirt-1.0.5.7-1.fc19 has been submitted as an update for Fedora 19. https://admin.fedoraproject.org/updates/libvirt-1.0.5.7-1.fc19

Confirmed that no surprise memtune is added for a host that doesn't have one. Thanks!

-Robin

Package libvirt-1.0.5.7-1.fc19:
* should fix your issue,
* was pushed to the Fedora 19 testing repository,
* should be available at your local mirror within two days.

Update it with:

```
# su -c 'yum update --enablerepo=updates-testing libvirt-1.0.5.7-1.fc19'
```

as soon as you are able to. Please go to the following URL: https://admin.fedoraproject.org/updates/FEDORA-2013-20798/libvirt-1.0.5.7-1.fc19 then log in and leave karma (feedback).

libvirt-1.0.5.7-1.fc19 has been pushed to the Fedora 19 stable repository. If problems still persist, please make note of it in this bug report.
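To double-check a host after the update, something along these lines should do (a sketch; "DOMAIN" is a placeholder for a guest name, and the exact memtune output format varies between libvirt versions):

```sh
# The fix shipped in libvirt-1.0.5.7-1.fc19 (upstream v1.1.2-rc1).
rpm -q libvirt

# A domain with no explicit <memtune> in its XML should no longer
# report a surprise hard_limit after the update.
virsh memtune DOMAIN
```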