Bug 623735
Summary: | hot unplug of vhost net virtio NIC causes qemu segfault | ||
---|---|---|---|
Product: | Red Hat Enterprise Linux 6 | Reporter: | Alex Williamson <alex.williamson> |
Component: | qemu-kvm | Assignee: | jason wang <jasowang> |
Status: | CLOSED ERRATA | QA Contact: | Virtualization Bugs <virt-bugs> |
Severity: | high | Docs Contact: | |
Priority: | high | ||
Version: | 6.0 | CC: | akong, alex.williamson, berrange, chayang, clalance, gcosta, jasowang, khong, lihuang, llim, michen, mkenneth, szhou, tburke, virt-maint |
Target Milestone: | beta | ||
Target Release: | --- | ||
Hardware: | All | ||
OS: | Linux | ||
Whiteboard: | |||
Fixed In Version: | qemu-kvm-0.12.1.2-2.128.el6 | Doc Type: | Bug Fix |
Doc Text: |
Cause: A bug in the vhost start/stop code inside qemu-kvm.
Consequence: Hot-unplugging a virtio NIC with vhost as its backend would crash qemu-kvm.
Fix: The vhost/virtio-net start/stop code was fixed.
Result: Hot-unplugging a virtio NIC works without crashing qemu-kvm.
|
Last Closed: | 2011-05-19 11:26:19 UTC | Type: | --- |
Bug Blocks: | 580951, 642595 |
Description
Alex Williamson
2010-08-12 16:18:21 UTC
Chris, Daniel - I can imagine the answer, but should libvirt be polling for the device_del to have actually completed before we do the netdev_del? I'm not sure any of the drivers are expecting the backend to go away while the frontend is still running. Shouldn't an OS be able to NAK the hot unplug and still have the device working?

This time cc'd. Chris, Daniel - please see comment 2.

This issue has been proposed when we are only considering blocker issues in the current Red Hat Enterprise Linux release.
** If you would still like this issue considered for the current release, ask your support representative to file as a blocker on your behalf. Otherwise ask that it be considered for the next Red Hat Enterprise Linux release. **

QEMU offers no way to determine whether device_del has completed, doesn't block on completion, and doesn't return an error if the guest refuses. Regardless, QEMU shouldn't crash if you remove the backend of a device. We explicitly want to be able to do that, so that the network backend can be changed on the fly without changing the frontend. So yes, we should try to wait for completion, but that's not possible with QEMU today.

Re-assigning back to Michael. We either need some kind of synchronization with libvirt to delay the netdev_del until after the device_del completes, or vhost needs to be robust enough to handle the netdev going away before the device_del completes.

Reproduced this bug on qemu-kvm-0.12.1.2-2.113.el6_0.1.x86_64.
(gdb) bt
#0 tap_set_offload (nc=0x0, csum=1, tso4=1, tso6=1, ecn=1, ufo=1) at net/tap.c:252
#1 0x00000000004205a3 in virtio_net_set_features (vdev=0x37b2560, features=269484003) at /usr/src/debug/qemu-kvm-0.12.1.2/hw/virtio-net.c:220
#2 0x000000000042102e in virtio_ioport_write (opaque=<value optimized out>, addr=<value optimized out>, val=269484003) at /usr/src/debug/qemu-kvm-0.12.1.2/hw/virtio-pci.c:204
#3 0x000000000042ab48 in kvm_handle_io (env=0x2d75010) at /usr/src/debug/qemu-kvm-0.12.1.2/kvm-all.c:541
#4 kvm_run (env=0x2d75010) at /usr/src/debug/qemu-kvm-0.12.1.2/qemu-kvm.c:975
#5 0x000000000042ac09 in kvm_cpu_exec (env=<value optimized out>) at /usr/src/debug/qemu-kvm-0.12.1.2/qemu-kvm.c:1658
#6 0x000000000042b82f in kvm_main_loop_cpu (_env=0x2d75010) at /usr/src/debug/qemu-kvm-0.12.1.2/qemu-kvm.c:1900
#7 ap_main_loop (_env=0x2d75010) at /usr/src/debug/qemu-kvm-0.12.1.2/qemu-kvm.c:1950
#8 0x00000035896077e1 in start_thread () from /lib64/libpthread.so.0
#9 0x0000003588ee153d in clone () from /lib64/libc.so.6

*** Bug 637505 has been marked as a duplicate of this bug. ***

Tested hot-unplug of a virtio NIC with vhost=on via virt-manager using the package from https://brewweb.devel.redhat.com/taskinfo?taskID=2841906. qemu-kvm dumped core again, but the backtrace is different from the one in comment 8.
(gdb) bt
#0 0x00000034c8e75782 in malloc_consolidate () from /lib64/libc.so.6
#1 0x00000034c8e78612 in _int_malloc () from /lib64/libc.so.6
#2 0x00000034c8e79a3d in malloc () from /lib64/libc.so.6
#3 0x0000000000475b35 in qemu_malloc (size=<value optimized out>) at qemu-malloc.c:59
#4 0x0000000000475c16 in qemu_mallocz (size=4120) at qemu-malloc.c:75
#5 0x0000000000494aae in qdict_new () at qdict.c:38
#6 0x0000000000495f95 in json_message_process_token (lexer=0x11cefa0, token=0x12052a0, type=JSON_OPERATOR, x=1, y=63) at json-streamer.c:45
#7 0x0000000000495da3 in json_lexer_feed_char (lexer=0x11cefa0, ch=123 '{') at json-lexer.c:299
#8 0x0000000000495ed7 in json_lexer_feed (lexer=0x11cefa0, buffer=0x7fffea30e020 "{", size=1) at json-lexer.c:322
#9 0x00000000004124d2 in monitor_control_read (opaque=<value optimized out>, buf=<value optimized out>, size=<value optimized out>) at /usr/src/debug/qemu-kvm-0.12.1.2/monitor.c:4478
#10 0x00000000004b6f2a in qemu_chr_read (opaque=0x10d6640) at qemu-char.c:154
#11 tcp_chr_read (opaque=0x10d6640) at qemu-char.c:2072
#12 0x000000000040b4af in main_loop_wait (timeout=1000) at /usr/src/debug/qemu-kvm-0.12.1.2/vl.c:4234
#13 0x0000000000428cfa in kvm_main_loop () at /usr/src/debug/qemu-kvm-0.12.1.2/qemu-kvm.c:2133
#14 0x000000000040e5cb in main_loop (argc=<value optimized out>, argv=<value optimized out>, envp=<value optimized out>) at /usr/src/debug/qemu-kvm-0.12.1.2/vl.c:4444
#15 main (argc=<value optimized out>, argv=<value optimized out>, envp=<value optimized out>) at /usr/src/debug/qemu-kvm-0.12.1.2/vl.c:6601

Moved to VERIFIED based on comment 22.

Technical note added. If any revisions are required, please edit the "Technical Notes" field accordingly. All revisions will be proofread by the Engineering Content Services team.

New Contents:
Cause: A bug in the vhost start/stop code inside qemu-kvm.
Consequence: Hot-unplugging a virtio NIC with vhost as its backend would crash qemu-kvm.
Fix: The vhost/virtio-net start/stop code was fixed.
Result: Hot-unplugging a virtio NIC works without crashing qemu-kvm.

An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2011-0534.html
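
For readers following the first backtrace, where tap_set_offload is entered with nc=0x0, the failure pattern can be sketched with a small toy model. This is not QEMU code and not the fix that shipped in the erratum. The class names, the function names, and the backend name "hostnet0" are all hypothetical, chosen only to show a frontend that outlives its backend and what a guard against the missing backend might look like.

```python
"""Toy model of a frontend outliving its backend. Not QEMU code."""


class Backend:
    """Hypothetical stand-in for the host-side network backend (the netdev)."""

    def __init__(self, name):
        self.name = name
        self.offload = None

    def set_offload(self, csum, tso4, tso6, ecn, ufo):
        self.offload = (csum, tso4, tso6, ecn, ufo)


class Frontend:
    """Hypothetical stand-in for the guest-visible NIC."""

    def __init__(self, backend):
        self.backend = backend

    def set_features_unguarded(self, flags):
        # Assumes the backend is always present, like the failing call path.
        self.backend.set_offload(*flags)

    def set_features_guarded(self, flags):
        # Tolerates a missing backend instead of dereferencing it.
        if self.backend is None:
            return False
        self.backend.set_offload(*flags)
        return True


def netdev_del(frontend):
    """Remove the backend while the frontend is still alive (toy version)."""
    frontend.backend = None


if __name__ == "__main__":
    nic = Frontend(Backend("hostnet0"))
    netdev_del(nic)  # backend removed before the guest has released the NIC
    flags = (1, 1, 1, 1, 1)  # same flag values as in the first backtrace
    try:
        nic.set_features_unguarded(flags)
    except AttributeError as exc:
        print("unguarded path failed:", exc)
    print("guarded path applied offload:", nic.set_features_guarded(flags))
```

In this toy model the unguarded path raises an exception once the backend is gone, which is the Python analogue of the NULL dereference in frame #0, while the guarded path simply reports that nothing was applied. Whether a guard, synchronization with libvirt, or a change to the start/stop sequencing is the right remedy is the question debated in the comments above.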