Bug 615228

Summary: oom in vhost_dev_start
Product: Red Hat Enterprise Linux 6 Reporter: Gerd Hoffmann <kraxel>
Component: qemu-kvm Assignee: Michael S. Tsirkin <mst>
Status: CLOSED CURRENTRELEASE QA Contact: Virtualization Bugs <virt-bugs>
Severity: medium Docs Contact:
Priority: low    
Version: 6.0 CC: llim, mkenneth, syeghiay, szhou, tburke, virt-maint
Target Milestone: rc   
Target Release: ---   
Hardware: All   
OS: Linux   
Whiteboard:
Fixed In Version: qemu-kvm-0.12.1.2-2.97.el6 Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2010-11-10 21:26:39 UTC Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Gerd Hoffmann 2010-07-16 08:49:33 UTC
Description of problem:
qemu aborts due to the oom check triggering

Version-Release number of selected component (if applicable):
qemu-kvm-0.12.1.2-2.91.el6.x86_64

Steps to Reproduce:
Boot a win2k8 64-bit guest with a virtio NIC and vhost enabled.
Start migration right after boot.
  
Actual results:
qemu aborts

Expected results:
qemu continues

Additional info:
(gdb) bt
#0  0x0000003710e329b5 in raise () from /lib64/libc.so.6
#1  0x0000003710e34195 in abort () from /lib64/libc.so.6
#2  0x00000000004755e5 in oom_check (size=<value optimized out>) at qemu-malloc.c:30
#3  qemu_malloc (size=<value optimized out>) at qemu-malloc.c:59
#4  0x00000000004756b6 in qemu_mallocz (size=562949953421312) at qemu-malloc.c:75
#5  0x0000000000422f05 in vhost_dev_start (hdev=0x11e3140, vdev=0x1741140)
    at /usr/src/debug/qemu-kvm-0.12.1.2/hw/vhost.c:665
#6  0x00000000004222a5 in vhost_net_start (net=0x11e3140, dev=<value optimized out>)
    at /usr/src/debug/qemu-kvm-0.12.1.2/hw/vhost_net.c:131
#7  0x000000000041fa8f in virtio_net_set_status (vdev=0x1741140, status=<value optimized out>)
    at /usr/src/debug/qemu-kvm-0.12.1.2/hw/virtio-net.c:867
#8  0x00000000004206b4 in virtio_set_status (opaque=0x1240a30, addr=<value optimized out>, val=7)
    at /usr/src/debug/qemu-kvm-0.12.1.2/hw/virtio.h:129
#9  virtio_ioport_write (opaque=0x1240a30, addr=<value optimized out>, val=7)
    at /usr/src/debug/qemu-kvm-0.12.1.2/hw/virtio-pci.c:223
#10 0x000000000042a053 in kvm_handle_io (env=0x11f9560)
    at /usr/src/debug/qemu-kvm-0.12.1.2/kvm-all.c:535
#11 kvm_run (env=0x11f9560) at /usr/src/debug/qemu-kvm-0.12.1.2/qemu-kvm.c:975
#12 0x000000000042a239 in kvm_cpu_exec (env=<value optimized out>)
    at /usr/src/debug/qemu-kvm-0.12.1.2/qemu-kvm.c:1658
#13 0x000000000042ae5f in kvm_main_loop_cpu (_env=0x11f9560)
    at /usr/src/debug/qemu-kvm-0.12.1.2/qemu-kvm.c:1900
#14 ap_main_loop (_env=0x11f9560) at /usr/src/debug/qemu-kvm-0.12.1.2/qemu-kvm.c:1950
#15 0x00000037116077e1 in start_thread () from /lib64/libpthread.so.0
#16 0x0000003710ee151d in clone () from /lib64/libc.so.6
(gdb) up
#1  0x0000003710e34195 in abort () from /lib64/libc.so.6
(gdb) up
#2  0x00000000004755e5 in oom_check (size=<value optimized out>) at qemu-malloc.c:30
30              abort();
(gdb) up
#3  qemu_malloc (size=<value optimized out>) at qemu-malloc.c:59
59          return oom_check(malloc(size ? size : 1));
(gdb) up
#4  0x00000000004756b6 in qemu_mallocz (size=562949953421312) at qemu-malloc.c:75
75          ptr = qemu_malloc(size);
(gdb) up
#5  0x0000000000422f05 in vhost_dev_start (hdev=0x11e3140, vdev=0x1741140)
    at /usr/src/debug/qemu-kvm-0.12.1.2/hw/vhost.c:665
665                 qemu_mallocz(hdev->log_size * sizeof *hdev->log) : NULL;
(gdb) print *hdev
$1 = {client = {set_memory = 0x423650 <vhost_client_set_memory>, 
    sync_dirty_bitmap = 0x422860 <vhost_client_sync_dirty_bitmap>, 
    migration_log = 0x422b40 <vhost_client_migration_log>, list = {le_next = 0x0, 
      le_prev = 0xbeb550}}, control = 10, mem = 0x11ea420, vqs = 0x11e31b8, nvqs = 2, 
  features = 1023410176, acked_features = 0, backend_features = 0, started = false, 
  log_enabled = true, log = 0x0, log_size = 70368744177664}
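
The two numbers in the trace are consistent with each other: the size passed to qemu_mallocz in frame #4 is log_size from the struct dump multiplied by the size of one log entry. A minimal sketch of that arithmetic, assuming each log entry is a 64-bit word (sizeof *hdev->log == 8); the names log_entry_t and log_alloc_bytes are made up for illustration and are not from the qemu source:

```c
#include <stdint.h>

/* Assumption: one dirty-log entry is a 64-bit word (8 bytes). */
typedef uint64_t log_entry_t;

/* Bytes requested for a log with the given number of entries.
 * Mirrors the expression on vhost.c:665 in the trace:
 *   hdev->log_size * sizeof *hdev->log
 */
static uint64_t log_alloc_bytes(uint64_t log_size)
{
    return log_size * (uint64_t)sizeof(log_entry_t);
}
```

With log_size = 70368744177664 (2^46) this gives 562949953421312 bytes (2^49, i.e. 512 TiB), which is the size seen in frame #4 and far beyond anything malloc can satisfy, so oom_check aborts.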

Comment 2 RHEL Program Management 2010-07-16 09:17:40 UTC
This issue has been proposed when we are only considering blocker
issues in the current Red Hat Enterprise Linux release. It has
been denied for the current Red Hat Enterprise Linux release.

** If you would still like this issue considered for the current
release, ask your support representative to file as a blocker on
your behalf. Otherwise ask that it be considered for the next
Red Hat Enterprise Linux release. **

Comment 4 Michael S. Tsirkin 2010-07-16 11:24:38 UTC
Please post qemu command line.

Comment 5 Gerd Hoffmann 2010-07-16 11:30:41 UTC
/mort/guests/bin/qemu-rhel6 -name win2k8 -m 1G -monitor unix:/mort/guests/sockets/xeni-win2k8/monitor,server,nowait -qmp unix:/mort/guests/sockets/xeni-win2k8/qmp,server,nowait -serial unix:/mort/guests/sockets/xeni-win2k8/serial,server,nowait -enable-kvm -spice port=5920,disable-ticketing -hda /mort/guests/image/win2k8.img -cdrom /usr/share/virtio-win/virtio-win.iso -netdev tap,id=net0,script=/mort/guests/bin/vm-ifup,downscript=/mort/guests/bin/vm-ifdown,vhost=on -device virtio-net-pci,mac=52:54:00:ff:00:05,netdev=net0 -vga cirrus -usbdevice tablet -cpu qemu64,+sse2,+x2apic -rtc-td-hack -no-kvm-pit-reinjection -L /usr/share/qemu-kvm

'start guest; sleep 5; start migration-script' triggers this in most of the cases for me.

Comment 9 Michael S. Tsirkin 2010-07-16 12:18:15 UTC
can you come on #kvm to discuss this?

Comment 10 Michael S. Tsirkin 2010-07-16 12:23:04 UTC
Am I right in guessing that this only happens if you migrate
during guest boot? What if you remove -vga cirrus
and/or tablet?

Comment 12 Michael S. Tsirkin 2010-07-16 14:44:59 UTC
ok, so the bug is in start during migration: we try to
init the log size without setting the ring layout first,
which gives insane results.
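
One way the observed log_size of 70368744177664 could fall out of an unset ring layout is sketched below. This is a simplified model, not the qemu code: it assumes one log entry is a 64-bit bitmap word with one bit per 4 KiB page (so one entry covers 4096 * 64 bytes), and that the log must reach the last byte of the used ring. All sketch_* names are hypothetical, and the guarded variant is only an illustration, not the fix that shipped.

```c
#include <stdint.h>

/* Assumptions: 4 KiB pages, 64 pages tracked per log entry. */
#define SKETCH_LOG_PAGE  UINT64_C(0x1000)
#define SKETCH_LOG_BITS  UINT64_C(64)
#define SKETCH_LOG_CHUNK (SKETCH_LOG_PAGE * SKETCH_LOG_BITS)

/* Hypothetical stand-in for the part of the ring layout that matters. */
struct sketch_ring {
    uint64_t used_phys; /* guest-physical address of the used ring */
    uint64_t used_size; /* size of the used ring in bytes */
};

/* Log entries needed to cover the last byte of the used ring.
 * If the layout was never set (both fields 0), "last" wraps around
 * to UINT64_MAX and the result becomes enormous. */
static uint64_t sketch_log_entries(const struct sketch_ring *r)
{
    uint64_t last = r->used_phys + r->used_size - 1;
    return last / SKETCH_LOG_CHUNK + 1;
}

/* Illustrative guard: a ring without a layout contributes nothing. */
static uint64_t sketch_log_entries_checked(const struct sketch_ring *r)
{
    if (r->used_size == 0) {
        return 0;
    }
    return sketch_log_entries(r);
}
```

Under these assumptions a zeroed ring yields (2^64 - 1) / 2^18 + 1 = 2^46 = 70368744177664 entries, matching the log_size in the struct dump, while a ring that has its layout set needs only a handful of entries.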

Comment 17 Shirley Zhou 2010-07-30 08:21:36 UTC
Reproduced this bug in qemu-kvm-0.12.1.2-2.91.el6. It happens when migration is started as Windows enters the "start windows normally" stage.

Verified with qemu-kvm-0.12.1.2-2.104.el6: the bug no longer occurs.

Changing bug status to VERIFIED.

Comment 18 releng-rhel@redhat.com 2010-11-10 21:26:39 UTC
Red Hat Enterprise Linux 6.0 is now available and should resolve
the problem described in this bug report. This report is therefore being closed
with a resolution of CURRENTRELEASE. You may reopen this bug report if the
solution does not work for you.