Bug 1013758 - Cgroups memory limit is causing the virt to be terminated unexpectedly
Summary: Cgroups memory limit is causing the virt to be terminated unexpectedly
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: libvirt
Version: 6.4
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: rc
Target Release: ---
Assignee: Michal Privoznik
QA Contact: Virtualization Bugs
URL:
Whiteboard:
Depends On: 891653
Blocks:
 
Reported: 2013-09-30 17:41 UTC by Michal Privoznik
Modified: 2013-11-21 09:11 UTC
CC List: 17 users

Fixed In Version: libvirt-0.10.2-28.el6
Doc Type: Bug Fix
Doc Text:
Cause: Previously, libvirt contained a heuristic to determine the maximum memory usage limit for a qemu process. If the limit was reached, the kernel killed the qemu process and hence the domain as well. This limit, however, cannot be guessed correctly. Consequence: Domains were killed seemingly at random. Fix: The heuristic was dropped. Result: Domains are no longer killed by the kernel.
Clone Of: 891653
Environment:
Last Closed: 2013-11-21 09:11:45 UTC
Target Upstream Version:
Embargoed:


Attachments


Links
System ID Private Priority Status Summary Last Updated
Red Hat Product Errata RHBA-2013:1581 0 normal SHIPPED_LIVE libvirt bug fix and enhancement update 2013-11-21 01:11:35 UTC

Description Michal Privoznik 2013-09-30 17:41:06 UTC
+++ This bug was initially created as a clone of Bug #891653 +++

[...]

--- Additional comment from xingxing on 2013-09-30 09:19:34 CEST ---

libvirt-0.10.2-18.el6_4.14.x86_64
The issue is still there with this version.

--- Additional comment from Thomas Lee on 2013-09-30 19:11:53 CEST ---

We recently had what appears to be this bug occur on one of our critical production machines, running RHEL 6.4.  This issue is marked CLOSED, but if indeed what we've experienced is the same issue, I don't think it's truly resolved.

# rpm -q libvirt
libvirt-0.10.2-18.el6_4.9.x86_64

# virsh dumpxml oasis-replica.1
<domain type='kvm' id='12'>
  <name>oasis-replica.1</name>
  <uuid>bb6756a5-3396-3307-92ca-92e75d453a7b</uuid>
  <memory unit='KiB'>1048576</memory>
  <currentMemory unit='KiB'>1048576</currentMemory>
  <vcpu placement='static'>1</vcpu>
  <os>
    <type arch='x86_64' machine='rhel6.3.0'>hvm</type>
    <boot dev='hd'/>
  </os>
  <features>
    <acpi/>
    <apic/>
    <pae/>
  </features>
  <clock offset='utc'/>
  <on_poweroff>destroy</on_poweroff>
  <on_reboot>restart</on_reboot>
  <on_crash>restart</on_crash>
  <devices>
    <emulator>/usr/libexec/qemu-kvm</emulator>
    <disk type='file' device='disk'>
      <driver name='qemu' type='qcow2' cache='none'/>
      <source file='/var/lib/libvirt/images/oasis-replica.1-hda.qcow2'/>
      <target dev='vda' bus='virtio'/>
      <alias name='virtio-disk0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x05' function='0x0'/>
    </disk>
    <disk type='file' device='disk'>
      <driver name='qemu' type='qcow2' cache='none'/>
      <source file='/var/lib/libvirt/images/oasis-replica.1-hdb.qcow2'/>
      <target dev='vdb' bus='virtio'/>
      <alias name='virtio-disk1'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x06' function='0x0'/>
    </disk>
    <disk type='file' device='disk'>
      <driver name='qemu' type='qcow2' cache='none'/>
      <source file='/net/nas01/srv/oasis-replica.1-hdc.qcow2'/>
      <target dev='hdc' bus='ide'/>
      <alias name='ide0-1-0'/>
      <address type='drive' controller='0' bus='1' target='0' unit='0'/>
    </disk>
    <controller type='usb' index='0'>
      <alias name='usb0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x2'/>
    </controller>
    <controller type='ide' index='0'>
      <alias name='ide0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x1'/>
    </controller>
    <interface type='bridge'>
      <mac address='52:54:00:08:35:a4'/>
      <source bridge='br0'/>
      <target dev='vnet2'/>
      <model type='virtio'/>
      <alias name='net0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x0'/>
    </interface>
    <interface type='bridge'>
      <mac address='52:54:00:0a:61:6b'/>
      <source bridge='br1'/>
      <target dev='vnet3'/>
      <model type='virtio'/>
      <alias name='net1'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x04' function='0x0'/>
    </interface>
    <serial type='pty'>
      <source path='/dev/pts/3'/>
      <target port='0'/>
      <alias name='serial0'/>
    </serial>
    <console type='pty' tty='/dev/pts/3'>
      <source path='/dev/pts/3'/>
      <target type='serial' port='0'/>
      <alias name='serial0'/>
    </console>
    <input type='mouse' bus='ps2'/>
    <graphics type='vnc' port='5901' autoport='yes' listen='127.0.0.1' keymap='en-us'>
      <listen type='address' address='127.0.0.1'/>
    </graphics>
    <video>
      <model type='cirrus' vram='9216' heads='1'/>
      <alias name='video0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x0'/>
    </video>
    <memballoon model='virtio'>
      <alias name='balloon0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x07' function='0x0'/>
    </memballoon>
  </devices>
  <seclabel type='none'/>
</domain>

Excerpt from /var/log/messages:

Sep 27 12:12:01 vm07 kernel: qemu-kvm invoked oom-killer: gfp_mask=0xd0, order=0, oom_adj=0, oom_score_adj=0
Sep 27 12:12:01 vm07 kernel: qemu-kvm cpuset=emulator mems_allowed=0-1
Sep 27 12:12:01 vm07 kernel: Pid: 56054, comm: qemu-kvm Not tainted 2.6.32-358.18.1.el6.x86_64 #1
Sep 27 12:12:01 vm07 kernel: Call Trace:
Sep 27 12:12:01 vm07 kernel: [<ffffffff810cb641>] ? cpuset_print_task_mems_allowed+0x91/0xb0
Sep 27 12:12:01 vm07 kernel: [<ffffffff8111ce40>] ? dump_header+0x90/0x1b0
Sep 27 12:12:01 vm07 kernel: [<ffffffff811725e1>] ? task_in_mem_cgroup+0xe1/0x120
Sep 27 12:12:01 vm07 kernel: [<ffffffff8111d2c2>] ? oom_kill_process+0x82/0x2a0
Sep 27 12:12:01 vm07 kernel: [<ffffffff8111d1be>] ? select_bad_process+0x9e/0x120
Sep 27 12:12:01 vm07 kernel: [<ffffffff8111da42>] ? mem_cgroup_out_of_memory+0x92/0xb0
Sep 27 12:12:01 vm07 kernel: [<ffffffff81173824>] ? mem_cgroup_handle_oom+0x274/0x2a0
Sep 27 12:12:01 vm07 kernel: [<ffffffff81171260>] ? memcg_oom_wake_function+0x0/0xa0
Sep 27 12:12:01 vm07 kernel: [<ffffffff81173e09>] ? __mem_cgroup_try_charge+0x5b9/0x5d0
Sep 27 12:12:01 vm07 kernel: [<ffffffff81175187>] ? mem_cgroup_charge_common+0x87/0xd0
Sep 27 12:12:01 vm07 kernel: [<ffffffff81175218>] ? mem_cgroup_newpage_charge+0x48/0x50
Sep 27 12:12:01 vm07 kernel: [<ffffffff81142ac4>] ? do_wp_page+0x1a4/0x920
Sep 27 12:12:01 vm07 kernel: [<ffffffff81143a3d>] ? handle_pte_fault+0x2cd/0xb50
Sep 27 12:12:01 vm07 kernel: [<ffffffffa0441bdc>] ? nfs_direct_req_free+0x3c/0x50 [nfs]
Sep 27 12:12:01 vm07 kernel: [<ffffffff8127a9e7>] ? kref_put+0x37/0x70
Sep 27 12:12:01 vm07 kernel: [<ffffffffa0441f31>] ? nfs_file_direct_read+0x1c1/0x230 [nfs]
Sep 27 12:12:01 vm07 kernel: [<ffffffff811444fa>] ? handle_mm_fault+0x23a/0x310
Sep 27 12:12:01 vm07 kernel: [<ffffffff810474e9>] ? __do_page_fault+0x139/0x480
Sep 27 12:12:01 vm07 kernel: [<ffffffff810874f6>] ? group_send_sig_info+0x56/0x70
Sep 27 12:12:01 vm07 kernel: [<ffffffff8108754f>] ? kill_pid_info+0x3f/0x60
Sep 27 12:12:01 vm07 kernel: [<ffffffff81513b6e>] ? do_page_fault+0x3e/0xa0
Sep 27 12:12:01 vm07 kernel: [<ffffffff81510f25>] ? page_fault+0x25/0x30
Sep 27 12:12:01 vm07 kernel: Task in /libvirt/qemu/oasis-replica.1 killed as a result of limit of /libvirt/qemu/oasis-replica.1
Sep 27 12:12:01 vm07 kernel: memory: usage 1889792kB, limit 1889792kB, failcnt 90358
Sep 27 12:12:01 vm07 kernel: memory+swap: usage 2879872kB, limit 9007199254740991kB, failcnt 0
Sep 27 12:12:01 vm07 kernel: Mem-Info:
Sep 27 12:12:01 vm07 kernel: Node 0 DMA per-cpu:
Sep 27 12:12:01 vm07 kernel: CPU    0: hi:    0, btch:   1 usd:   0
Sep 27 12:12:01 vm07 kernel: CPU    1: hi:    0, btch:   1 usd:   0
Sep 27 12:12:01 vm07 kernel: CPU    2: hi:    0, btch:   1 usd:   0
Sep 27 12:12:01 vm07 kernel: CPU    3: hi:    0, btch:   1 usd:   0
Sep 27 12:12:01 vm07 kernel: CPU    4: hi:    0, btch:   1 usd:   0
Sep 27 12:12:01 vm07 kernel: CPU    5: hi:    0, btch:   1 usd:   0
Sep 27 12:12:01 vm07 kernel: CPU    6: hi:    0, btch:   1 usd:   0
Sep 27 12:12:01 vm07 kernel: CPU    7: hi:    0, btch:   1 usd:   0
Sep 27 12:12:01 vm07 kernel: CPU    8: hi:    0, btch:   1 usd:   0
Sep 27 12:12:01 vm07 kernel: CPU    9: hi:    0, btch:   1 usd:   0
Sep 27 12:12:01 vm07 kernel: CPU   10: hi:    0, btch:   1 usd:   0
Sep 27 12:12:01 vm07 kernel: CPU   11: hi:    0, btch:   1 usd:   0
Sep 27 12:12:01 vm07 kernel: CPU   12: hi:    0, btch:   1 usd:   0
Sep 27 12:12:01 vm07 kernel: CPU   13: hi:    0, btch:   1 usd:   0
Sep 27 12:12:01 vm07 kernel: CPU   14: hi:    0, btch:   1 usd:   0
Sep 27 12:12:01 vm07 kernel: CPU   15: hi:    0, btch:   1 usd:   0
Sep 27 12:12:01 vm07 kernel: Node 0 DMA32 per-cpu:
Sep 27 12:12:01 vm07 kernel: CPU    0: hi:  186, btch:  31 usd: 184
Sep 27 12:12:01 vm07 kernel: CPU    1: hi:  186, btch:  31 usd: 169
Sep 27 12:12:01 vm07 kernel: CPU    2: hi:  186, btch:  31 usd: 164
Sep 27 12:12:01 vm07 kernel: CPU    3: hi:  186, btch:  31 usd: 160
Sep 27 12:12:01 vm07 kernel: CPU    4: hi:  186, btch:  31 usd: 117
Sep 27 12:12:01 vm07 kernel: CPU    5: hi:  186, btch:  31 usd:  48
Sep 27 12:12:01 vm07 kernel: CPU    6: hi:  186, btch:  31 usd:  60
Sep 27 12:12:01 vm07 kernel: CPU    7: hi:  186, btch:  31 usd:  30
Sep 27 12:12:01 vm07 kernel: CPU    8: hi:  186, btch:  31 usd:  56
Sep 27 12:12:01 vm07 kernel: CPU    9: hi:  186, btch:  31 usd:  23
Sep 27 12:12:01 vm07 kernel: CPU   10: hi:  186, btch:  31 usd:   0
Sep 27 12:12:01 vm07 kernel: CPU   11: hi:  186, btch:  31 usd:   0
Sep 27 12:12:01 vm07 kernel: CPU   12: hi:  186, btch:  31 usd:   0
Sep 27 12:12:01 vm07 kernel: CPU   13: hi:  186, btch:  31 usd:   0
Sep 27 12:12:01 vm07 kernel: CPU   14: hi:  186, btch:  31 usd:   0
Sep 27 12:12:01 vm07 kernel: CPU   15: hi:  186, btch:  31 usd:   0
Sep 27 12:12:01 vm07 kernel: Node 0 Normal per-cpu:
Sep 27 12:12:01 vm07 kernel: CPU    0: hi:  186, btch:  31 usd:  54
Sep 27 12:12:01 vm07 kernel: CPU    1: hi:  186, btch:  31 usd:  35
Sep 27 12:12:01 vm07 kernel: CPU    2: hi:  186, btch:  31 usd:  36
Sep 27 12:12:01 vm07 kernel: CPU    3: hi:  186, btch:  31 usd:  57
Sep 27 12:12:01 vm07 kernel: CPU    4: hi:  186, btch:  31 usd: 162
Sep 27 12:12:01 vm07 kernel: CPU    5: hi:  186, btch:  31 usd: 156
Sep 27 12:12:01 vm07 kernel: CPU    6: hi:  186, btch:  31 usd: 143
Sep 27 12:12:01 vm07 kernel: CPU    7: hi:  186, btch:  31 usd:  54
Sep 27 12:12:01 vm07 kernel: CPU    8: hi:  186, btch:  31 usd: 161
Sep 27 12:12:01 vm07 kernel: CPU    9: hi:  186, btch:  31 usd: 167
Sep 27 12:12:01 vm07 kernel: CPU   10: hi:  186, btch:  31 usd: 176
Sep 27 12:12:01 vm07 kernel: CPU   11: hi:  186, btch:  31 usd: 137
Sep 27 12:12:01 vm07 kernel: CPU   12: hi:  186, btch:  31 usd:   0
Sep 27 12:12:01 vm07 kernel: CPU   13: hi:  186, btch:  31 usd:  96
Sep 27 12:12:01 vm07 kernel: CPU   14: hi:  186, btch:  31 usd: 173
Sep 27 12:12:01 vm07 kernel: CPU   15: hi:  186, btch:  31 usd:  29
Sep 27 12:12:01 vm07 kernel: Node 1 Normal per-cpu:
Sep 27 12:12:01 vm07 kernel: CPU    0: hi:  186, btch:  31 usd: 161
Sep 27 12:12:01 vm07 kernel: CPU    1: hi:  186, btch:  31 usd: 171
Sep 27 12:12:01 vm07 kernel: CPU    2: hi:  186, btch:  31 usd: 174
Sep 27 12:12:01 vm07 kernel: CPU    3: hi:  186, btch:  31 usd: 162
Sep 27 12:12:01 vm07 kernel: CPU    4: hi:  186, btch:  31 usd:   0
Sep 27 12:12:01 vm07 kernel: CPU    5: hi:  186, btch:  31 usd:   0
Sep 27 12:12:01 vm07 kernel: CPU    6: hi:  186, btch:  31 usd:   0
Sep 27 12:12:01 vm07 kernel: CPU    7: hi:  186, btch:  31 usd: 133
Sep 27 12:12:01 vm07 kernel: CPU    8: hi:  186, btch:  31 usd:  22
Sep 27 12:12:01 vm07 kernel: CPU    9: hi:  186, btch:  31 usd:  52
Sep 27 12:12:01 vm07 kernel: CPU   10: hi:  186, btch:  31 usd:  61
Sep 27 12:12:01 vm07 kernel: CPU   11: hi:  186, btch:  31 usd: 100
Sep 27 12:12:01 vm07 kernel: CPU   12: hi:  186, btch:  31 usd: 161
Sep 27 12:12:01 vm07 kernel: CPU   13: hi:  186, btch:  31 usd:  51
Sep 27 12:12:01 vm07 kernel: CPU   14: hi:  186, btch:  31 usd: 178
Sep 27 12:12:01 vm07 kernel: CPU   15: hi:  186, btch:  31 usd: 170
Sep 27 12:12:01 vm07 kernel: active_anon:4527685 inactive_anon:1525557 isolated_anon:0
Sep 27 12:12:01 vm07 kernel: active_file:294282 inactive_file:1265201 isolated_file:0
Sep 27 12:12:01 vm07 kernel: unevictable:0 dirty:2 writeback:0 unstable:0
Sep 27 12:12:01 vm07 kernel: free:315084 slab_reclaimable:22633 slab_unreclaimable:139023
Sep 27 12:12:01 vm07 kernel: mapped:7695 shmem:51 pagetables:17243 bounce:0
Sep 27 12:12:01 vm07 kernel: Node 0 DMA free:15672kB min:40kB low:48kB high:60kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15288kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes
Sep 27 12:12:01 vm07 kernel: lowmem_reserve[]: 0 2990 16120 16120
Sep 27 12:12:01 vm07 kernel: Node 0 DMA32 free:572920kB min:8348kB low:10432kB high:12520kB active_anon:979480kB inactive_anon:577852kB active_file:86620kB inactive_file:345100kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:3062596kB mlocked:0kB dirty:0kB writeback:0kB mapped:124kB shmem:0kB slab_reclaimable:5360kB slab_unreclaimable:100120kB kernel_stack:0kB pagetables:60kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
Sep 27 12:12:01 vm07 kernel: lowmem_reserve[]: 0 0 13130 13130
Sep 27 12:12:01 vm07 kernel: Node 0 Normal free:53336kB min:36652kB low:45812kB high:54976kB active_anon:8752624kB inactive_anon:2641696kB active_file:525668kB inactive_file:1182224kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:13445120kB mlocked:0kB dirty:8kB writeback:0kB mapped:12888kB shmem:72kB slab_reclaimable:39944kB slab_unreclaimable:169032kB kernel_stack:4640kB pagetables:36948kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
Sep 27 12:12:01 vm07 kernel: lowmem_reserve[]: 0 0 0 0
Sep 27 12:12:01 vm07 kernel: Node 1 Normal free:618408kB min:45064kB low:56328kB high:67596kB active_anon:8378636kB inactive_anon:2882680kB active_file:564840kB inactive_file:3533480kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:16531680kB mlocked:0kB dirty:0kB writeback:0kB mapped:17768kB shmem:132kB slab_reclaimable:45228kB slab_unreclaimable:286940kB kernel_stack:824kB pagetables:31964kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
Sep 27 12:12:01 vm07 kernel: lowmem_reserve[]: 0 0 0 0
Sep 27 12:12:01 vm07 kernel: Node 0 DMA: 0*4kB 1*8kB 1*16kB 1*32kB 2*64kB 1*128kB 0*256kB 0*512kB 1*1024kB 1*2048kB 3*4096kB = 15672kB
Sep 27 12:12:01 vm07 kernel: Node 0 DMA32: 13618*4kB 5805*8kB 3368*16kB 1946*32kB 1058*64kB 395*128kB 265*256kB 176*512kB 76*1024kB 1*2048kB 0*4096kB = 573168kB
Sep 27 12:12:01 vm07 kernel: Node 0 Normal: 1951*4kB 502*8kB 179*16kB 58*32kB 17*64kB 12*128kB 6*256kB 19*512kB 22*1024kB 0*2048kB 0*4096kB = 52956kB
Sep 27 12:12:01 vm07 kernel: Node 1 Normal: 304*4kB 1496*8kB 5642*16kB 2618*32kB 1445*64kB 778*128kB 339*256kB 148*512kB 75*1024kB 0*2048kB 0*4096kB = 618656kB
Sep 27 12:12:01 vm07 kernel: 1824637 total pagecache pages
Sep 27 12:12:01 vm07 kernel: 264866 pages in swap cache
Sep 27 12:12:01 vm07 kernel: Swap cache stats: add 1685236, delete 1420370, find 798807/863738
Sep 27 12:12:01 vm07 kernel: Free swap  = 0kB
Sep 27 12:12:01 vm07 kernel: Total swap = 2097144kB
Sep 27 12:12:01 vm07 kernel: 8384511 pages RAM
Sep 27 12:12:01 vm07 kernel: 172522 pages reserved
Sep 27 12:12:01 vm07 kernel: 1130444 pages shared
Sep 27 12:12:01 vm07 kernel: 7610200 pages non-shared
Sep 27 12:12:01 vm07 kernel: [ pid ]   uid  tgid total_vm      rss cpu oom_adj oom_score_adj name
Sep 27 12:12:01 vm07 kernel: [ 3670]   107  3670  1206371   239985   1       0             0 qemu-kvm
Sep 27 12:12:01 vm07 kernel: Memory cgroup out of memory: Kill process 3670 (qemu-kvm) score 611 or sacrifice child
Sep 27 12:12:01 vm07 kernel: Killed process 3670, UID 107, (qemu-kvm) total-vm:4825484kB, anon-rss:955344kB, file-rss:4596kB
Sep 27 12:12:01 vm07 kernel: Kill process 3694 (vhost-3670) sharing same memory
Sep 27 12:12:01 vm07 kernel: Kill process 3695 (vhost-3670) sharing same memory
Sep 27 12:12:01 vm07 kernel: br0: port 3(vnet2) entering disabled state
Sep 27 12:12:01 vm07 kernel: device vnet2 left promiscuous mode
Sep 27 12:12:01 vm07 kernel: br0: port 3(vnet2) entering disabled state
Sep 27 12:12:01 vm07 kernel: br1: port 3(vnet3) entering disabled state
Sep 27 12:12:01 vm07 kernel: device vnet3 left promiscuous mode
Sep 27 12:12:01 vm07 kernel: br1: port 3(vnet3) entering disabled state
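
The cgroup limit that triggered this kill (1889792 kB for a guest configured with 1048576 KiB) can be read back directly from the memory controller. A minimal sketch; the /cgroup/memory mount point and the libvirt/qemu/<domain> path are assumptions based on RHEL 6 defaults and the "killed as a result of limit" line above:

/* Sketch: read the memory cgroup hard limit applied to a libvirt domain.
 * The mount point and path are assumptions; adjust to your host's layout. */
#include <stdio.h>

int main(void)
{
    const char *path =
        "/cgroup/memory/libvirt/qemu/oasis-replica.1/memory.limit_in_bytes";
    char buf[64];
    FILE *f = fopen(path, "r");

    if (!f) {
        perror(path);
        return 1;
    }
    if (fgets(buf, sizeof(buf), f))
        printf("memory.limit_in_bytes = %s", buf);
    fclose(f);
    return 0;
}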

Comment 2 Michal Privoznik 2013-09-30 18:17:01 UTC
I'm removing the heuristic that guesses the max mem limit for a qemu process. I've proposed the patches and hence I'm moving this to POST:

http://post-office.corp.redhat.com/archives/rhvirt-patches/2013-September/msg00967.html

Comment 4 Jincheng Miao 2013-10-08 16:34:47 UTC
This bug can be verified. I rebuilt qemu-kvm myself and inserted a memory leak to check the memory limit that libvirtd sets, like this:

int kvm_run(CPUState *env)
{
...
+    int num = 1024*1024*100; 
+    void* ptr;
   again:
+    ptr = malloc(num);
+    printf("malloc %p %d\n", ptr, num);
...
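
The same kind of pressure can be produced without rebuilding qemu-kvm. A minimal standalone sketch (a hypothetical equivalent, not the patch above) that keeps allocating and touching memory until a cgroup hard limit triggers the OOM killer:

/* Sketch: allocate and touch 100 MiB per iteration so the memory cgroup
 * actually charges the pages, until its hard limit invokes the OOM killer. */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(void)
{
    const size_t chunk = 100UL * 1024 * 1024;   /* 100 MiB per iteration */

    for (;;) {
        void *ptr = malloc(chunk);
        if (!ptr) {
            perror("malloc");
            return 1;
        }
        memset(ptr, 0xab, chunk);   /* touch the pages so they are charged */
        printf("allocated and touched %zu bytes at %p\n", chunk, ptr);
    }
}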


With the previous libvirt-0.10.2-27.el6, the heuristic limit causes qemu-kvm to be terminated earlier. The memory in use before it quit:
# free
             total       used       free     shared    buffers     cached
Mem:       8001120    2958312    5042808          0      31964     205540
-/+ buffers/cache:    2720808    5280312
Swap:            0          0          0


libvirt-0.10.2-28.el6 drops the heuristic limit, so the OOM kill does not happen as easily; qemu-kvm uses almost the entire host memory before it quits:
# free
             total       used       free     shared    buffers     cached
Mem:       8001120    7879944     121176          0       9948      64100
-/+ buffers/cache:    7805896     195224
Swap:            0          0          0

So I am changing the status to VERIFIED.

Comment 5 Jincheng Miao 2013-10-30 10:50:58 UTC
Since the patch set includes a fix related to mlock,
-    if (mlock)
-        virCommandSetMaxMemLock(cmd, def->mem.hard_limit * 1024);
+    if (mlock) {
+        unsigned long long memKB;
+
+        /* VFIO requires all of the guest's memory to be
+         * locked resident, plus some amount for IO
+         * space. Alex Williamson suggested adding 1GiB for IO
+         * space just to be safe (some finer tuning might be
+         * nice, though).
+         */
+        memKB = def->mem.hard_limit ?
+            def->mem.hard_limit : def->mem.max_balloon + 1024 * 1024;
+        virCommandSetMaxMemLock(cmd, memKB * 1024);
+    }

I did the following additional tests:

1. Start a guest with memoryBacking locked and no hard_limit set

# virsh dumpxml r6
<domain type='kvm' id='5'>
  <name>r6</name>
  <uuid>f14754c8-7d43-9808-6eba-e279c503713b</uuid>
  <memory unit='KiB'>1048576</memory>
  <currentMemory unit='KiB'>1048576</currentMemory>
  <memoryBacking>
    <locked/>
  </memoryBacking>
...

We can see that 'Max locked memory' has been set according to the memory balloon size plus 1 GiB of IO space (2147483648 bytes).
# cat /proc/`pidof qemu-kvm`/limits 
Limit                     Soft Limit           Hard Limit           Units     
Max cpu time              unlimited            unlimited            seconds   
Max file size             unlimited            unlimited            bytes     
Max data size             unlimited            unlimited            bytes     
Max stack size            10485760             unlimited            bytes     
Max core file size        0                    unlimited            bytes     
Max resident set          unlimited            unlimited            bytes     
Max processes             62347                62347                processes 
Max open files            1024                 4096                 files     
Max locked memory         2147483648           2147483648           bytes     
Max address space         unlimited            unlimited            bytes     
Max file locks            unlimited            unlimited            locks     
Max pending signals       62347                62347                signals   
Max msgqueue size         819200               819200               bytes     
Max nice priority         0                    0                    
Max realtime priority     0                    0                    
Max realtime timeout      unlimited            unlimited            us        

2. Start a guest with memoryBacking locked and hard_limit set to 2000000 KiB
# virsh dumpxml r6
<domain type='kvm' id='5'>
  <name>r6</name>
  <uuid>f14754c8-7d43-9808-6eba-e279c503713b</uuid>
  <memory unit='KiB'>1048576</memory>
  <currentMemory unit='KiB'>1048576</currentMemory>
  <memtune>
    <hard_limit unit='KiB'>2000000</hard_limit>
  </memtune>
  <memoryBacking>
    <locked/>
  </memoryBacking>
...

'Max locked memory' has been set to the hard_limit value (2048000000 bytes).
# cat /proc/`pidof qemu-kvm`/limits 
Limit                     Soft Limit           Hard Limit           Units     
Max cpu time              unlimited            unlimited            seconds   
Max file size             unlimited            unlimited            bytes     
Max data size             unlimited            unlimited            bytes     
Max stack size            10485760             unlimited            bytes     
Max core file size        0                    unlimited            bytes     
Max resident set          unlimited            unlimited            bytes     
Max processes             62347                62347                processes 
Max open files            1024                 4096                 files     
Max locked memory         2048000000           2048000000           bytes     
Max address space         unlimited            unlimited            bytes     
Max file locks            unlimited            unlimited            locks     
Max pending signals       62347                62347                signals   
Max msgqueue size         819200               819200               bytes     
Max nice priority         0                    0                    
Max realtime priority     0                    0                    
Max realtime timeout      unlimited            unlimited            us
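
For reference, both 'Max locked memory' values above match the arithmetic in the patch; a minimal sketch, with the values taken from the r6 domain XML:

/* Sketch: the limit arithmetic from the patch quoted above, applied to the
 * two test scenarios in this comment. */
#include <stdio.h>

int main(void)
{
    unsigned long long max_balloon = 1048576;   /* <memory unit='KiB'> of r6 */
    unsigned long long hard_limit;
    unsigned long long memKB;

    /* Case 1: no hard_limit -> balloon size plus 1 GiB of IO space. */
    hard_limit = 0;
    memKB = hard_limit ? hard_limit : max_balloon + 1024 * 1024;
    printf("case 1: %llu bytes\n", memKB * 1024);

    /* Case 2: hard_limit of 2000000 KiB takes precedence. */
    hard_limit = 2000000;
    memKB = hard_limit ? hard_limit : max_balloon + 1024 * 1024;
    printf("case 2: %llu bytes\n", memKB * 1024);

    return 0;
}

Case 1 prints 2147483648 bytes, matching the first limits output; case 2 prints 2048000000 bytes, matching the second.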

Comment 7 errata-xmlrpc 2013-11-21 09:11:45 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHBA-2013-1581.html

