| Summary: | hotunplug scsi device will cause qemu coredump | ||||||
|---|---|---|---|---|---|---|---|
| Product: | Red Hat Enterprise Linux 7 | Reporter: | Yang Meng <meyang> | ||||
| Component: | qemu-kvm-rhev | Assignee: | Markus Armbruster <armbru> | ||||
| Status: | CLOSED DUPLICATE | QA Contact: | Virtualization Bugs <virt-bugs> | ||||
| Severity: | unspecified | Docs Contact: | |||||
| Priority: | unspecified | ||||||
| Version: | 7.3 | CC: | aliang, chayang, huding, juzhang, knoel, meyang, ngu, pezhang, pingl, shuang, virt-maint, xuwei | ||||
| Target Milestone: | rc | ||||||
| Target Release: | --- | ||||||
| Hardware: | Unspecified | ||||||
| OS: | Unspecified | ||||||
| Whiteboard: | |||||||
| Fixed In Version: | Doc Type: | Bug Fix | |||||
| Doc Text: | Story Points: | --- | |||||
| Clone Of: | Environment: | ||||||
| Last Closed: | 2016-09-09 09:13:33 UTC | Type: | Bug | ||||
| Regression: | --- | Mount Type: | --- | ||||
| Documentation: | --- | CRM: | |||||
| Verified Versions: | Category: | --- | |||||
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |||||
| Cloudforms Team: | --- | Target Upstream Version: | |||||
| Attachments: |
core file location:
http://fileshare.englab.nay.redhat.com/pub/section2/images_backup/rhel7/bug1327377/

Could be a duplicate of bug 1318181.

I suspect this is a duplicate of bug 1341531. We fixed that one in qemu-kvm-rhev-2.6.0-12.el7. Could you please retest this bug with that version? If it appears to be fixed there, also testing the version before would be nice.

Hi, you can refer to this bug: https://bugzilla.redhat.com/show_bug.cgi?id=1362084, comment #1. I copy the comment from the author (areis):
blockdev-add API is still experimental upstream and not supported by libvirt. But instead of closing this BZ, I'm reassigning to Kevin for further evaluation, while deferring it to 7.4.

Yes, blockdev-add is experimental, but the stack backtrace shows a crash in drive_del. It might be reproducible even without blockdev-add.

We've recently fixed a crash bug in drive_del (bug 1341531) and a related crash bug in drive_add (bug 1352865). Let's exclude them before we dig deeper. Please retest with qemu-kvm-rhev-2.6.0-16.el7 (contains both fixes). If it works, retest with qemu-kvm-rhev-2.6.0-12.el7 (just the drive_del fix). If it works, retest with qemu-kvm-rhev-2.6.0-11.el7. As usual, provide a stack backtrace when you observe a crash.

(In reply to Markus Armbruster from comment #13)
> Yes, blockdev-add is experimental, but the stack backtrace shows a crash in
> drive_del. It might be reproducible even without blockdev-add.
>
> We've recently fixed a crash bug in drive_del (bug 1341531) and a related
> crash bug in drive_add (bug 1352865). Let's exclude them before we dig
> deeper. Please retest with qemu-kvm-rhev-2.6.0-16.el7 (contains both
> fixes). If it works, retest with qemu-kvm-rhev-2.6.0-12.el7 (just the
> drive_del fix). If it works, retest with qemu-kvm-rhev-2.6.0-11.el7. As
> usual, provide a stack backtrace when you observe a crash.

The reproduction results are attached as a file: every version I tried still crashes.
The coredump files:
http://fileshare.englab.nay.redhat.com/pub/section2/images_backup/rhel7/meyang-blockdev-add/

Created attachment 1198944: reproduce-log

The crashes observed in comment#14 are almost certainly bug 1362084, as Yang Meng suggested in comment#12. The difference is that the invalid input triggering the crash is '"detect-zeroes": true' for bug 1362084, and "'writeback': false" here.

But now I wonder about comment#0... I'm pretty sure qemu-kvm-rhev-2.5.0-4.el7.x86_64 also crashes on "'writeback': false". How could it ever get past that and crash in drive_del?

Let's fix bug 1362084 first, then see whether we can still reproduce anything bad here.

Yang Meng kindly verified that with bug 1362084 fixed, the incorrect blockdev-add command is rejected cleanly.

I can't reproduce the crash reported in comment#0 step 4 __com.redhat_drive_del, because as far as I can tell, all versions of qemu-kvm either crash or reject in step 3 blockdev-add.

I tried omitting the invalid part of blockdev-add ('writeback': false), no dice. But perhaps I'm doing something wrong.

Yang Meng, could you double-check for me? Use

{ "execute": "blockdev-add", "arguments": {'options' : {'driver': 'raw', 'id':'drive-disk1', 'discard':'unmap', 'rerror':'stop', 'werror':'stop', 'file': {'driver': 'host_device', 'filename': '/dev/sdb'}, 'cache': { 'direct': true, 'no-flush': false }}} }

If this works now, we can close the bug.

(In reply to Markus Armbruster from comment #20)
> Yang Meng kindly verified that with bug 1362084 fixed, the incorrect
> blockdev-add command is rejected cleanly.
>
> I can't reproduce the crash reported in comment#0 step 4
> __com.redhat_drive_del, because as far as I can tell, all versions of
> qemu-kvm either crash or reject in step 3 blockdev-add.
>
> I tried omitting the invalid part of blockdev-add ('writeback':
> false), no dice. But perhaps I'm doing something wrong.
>
> Yang Meng, could you double-check for me? Use
>
> { "execute": "blockdev-add", "arguments": {'options' : {'driver': 'raw',
> 'id':'drive-disk1', 'discard':'unmap', 'rerror':'stop', 'werror':'stop',
> 'file': {'driver': 'host_device', 'filename': '/dev/sdb'}, 'cache': {
> 'direct': true, 'no-flush': false }}} }
>
> If this works now, we can close the bug.

Hi, using your command:

{ "execute": "blockdev-add", "arguments": {'options' : {'driver': 'raw', 'id':'drive-disk1', 'discard':'unmap', 'rerror':'stop', 'werror':'stop', 'file': {'driver': 'host_device', 'filename': '/dev/sdb'}, 'cache': { 'direct': true, 'no-flush': false }}} }

the hotplug works fine. I just tried your test build from comment #17.

Thanks, Yang Meng! This bug is certainly a duplicate of bug 1362084 and probably also a duplicate of bug 1341531. Closing as duplicate of the former.

*** This bug has been marked as a duplicate of bug 1362084 ***

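For anyone retesting, the only difference between the blockdev-add command in comment #0 step 3 and the one that worked in comment #20 is the 'writeback' key inside 'cache'. The following is a minimal sketch of that edit, not a tool used in this report. The set of accepted cache keys is an assumption taken from the working command above, not from the QAPI schema of any particular qemu-kvm-rhev build, and the helper name is made up for illustration.

```python
import copy
import json

# Assumption: these are the cache keys accepted by the fixed builds, taken
# from the command that worked in comment #20 (not read from a QAPI schema).
ACCEPTED_CACHE_KEYS = {"direct", "no-flush"}

# The blockdev-add command from comment #0 step 3, written as strict JSON data.
ORIGINAL_BLOCKDEV_ADD = {
    "execute": "blockdev-add",
    "arguments": {
        "options": {
            "driver": "raw",
            "id": "drive-disk1",
            "discard": "unmap",
            "rerror": "stop",
            "werror": "stop",
            "file": {"driver": "host_device", "filename": "/dev/sdb"},
            "cache": {"writeback": False, "direct": True, "no-flush": False},
        }
    },
}


def drop_unaccepted_cache_keys(command):
    """Return a copy of `command` whose 'cache' dict keeps only accepted keys.

    Hypothetical helper for illustration; the input is left unmodified.
    """
    fixed = copy.deepcopy(command)
    cache = fixed["arguments"]["options"]["cache"]
    for key in list(cache):
        if key not in ACCEPTED_CACHE_KEYS:
            del cache[key]
    return fixed


if __name__ == "__main__":
    # Print one line of JSON that can be pasted into the QMP monitor.
    print(json.dumps(drop_unaccepted_cache_keys(ORIGINAL_BLOCKDEV_ADD)))
```

The printed line uses double quotes throughout, which QMP accepts just like the single-quoted form used in the comments.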
Description of problem:
hotunplug scsi device will cause qemu coredump

Version-Release number of selected component (if applicable):
hostinfo:
qemu-kvm-rhev-2.5.0-4.el7.x86_64
kernel-3.10.0-370.el7.x86_64
guest info:
kernel-3.10.0-327.10.1.el7.x86_64

How reproducible:
100%

Steps to Reproduce:
1.modprobe scsi_debug dev_size_mb=1024 lbpu=1
[root@ibm-x3650m4-06 job-results]# lsscsi | grep scsi_debug
[12:0:0:0] disk Linux scsi_debug 0004 /dev/sdb

2.boot up qemu:
/usr/libexec/qemu-kvm \
-S \
-name 'avocado-vt-vm1' \
-sandbox off \
-machine pc \
-nodefaults \
-vga std \
-chardev socket,id=qmp_id_qmpmonitor1,path=/var/tmp/monitor-qmpmonitor1-20160411-140419-Gg5YcEa4,server,nowait \
-mon chardev=qmp_id_qmpmonitor1,mode=control \
-chardev socket,id=qmp_id_catch_monitor,path=/var/tmp/monitor-catch_monitor-20160411-140419-Gg5YcEa4,server,nowait \
-mon chardev=qmp_id_catch_monitor,mode=control \
-device pvpanic,ioport=0x505,id=idSAyD5R \
-chardev socket,id=serial_id_serial0,path=/var/tmp/serial-serial0-20160411-140419-Gg5YcEa4,server,nowait \
-device isa-serial,chardev=serial_id_serial0 \
-chardev socket,id=seabioslog_id_20160411-140419-Gg5YcEa4,path=/var/tmp/seabios-20160411-140419-Gg5YcEa4,server,nowait \
-device isa-debugcon,chardev=seabioslog_id_20160411-140419-Gg5YcEa4,iobase=0x402 \
-device ich9-usb-ehci1,id=usb1,addr=1d.7,multifunction=on,bus=pci.0 \
-device ich9-usb-uhci1,id=usb1.0,multifunction=on,masterbus=usb1.0,addr=1d.0,firstport=0,bus=pci.0 \
-device ich9-usb-uhci2,id=usb1.1,multifunction=on,masterbus=usb1.0,addr=1d.2,firstport=2,bus=pci.0 \
-device ich9-usb-uhci3,id=usb1.2,multifunction=on,masterbus=usb1.0,addr=1d.4,firstport=4,bus=pci.0 \
-device virtio-scsi-pci,id=scsi,indirect_desc=off,event_idx=off,bus=pci.0 \
-drive file=/home/RHEL-Server-7.2-64-virtio-scsi.qcow2,if=none,id=drive-hd-disk,format=qcow2,cache=none,werror=stop,rerror=stop,discard=on \
-device scsi-hd,drive=drive-hd-disk,id=scsi_disk \
-device virtio-net-pci,mac=9a:f2:f3:f4:f5:f6,id=idcXgxuy,vectors=4,netdev=id7Hlkh4,bus=pci.0,addr=04,disable-legacy=off,disable-modern=on \
-netdev tap,id=id7Hlkh4,vhost=on \
-m 32768 \
-smp 16,maxcpus=16,cores=8,threads=1,sockets=2 \
-cpu 'SandyBridge' \
-device usb-tablet,id=usb-tablet1,bus=usb1.0,port=1 \
-vnc :0 \
-rtc base=localtime,clock=host,driftfix=slew \
-boot order=cdn,once=c,menu=off,strict=off \
-enable-kvm \
-monitor stdio

3.hotplug the disk:
nc -U /var/tmp/monitor-qmpmonitor1-20160411-140419-Gg5YcEa4
{"execute": "qmp_capabilities"}
{ "execute": "blockdev-add", "arguments": {'options' : {'driver': 'raw', 'id':'drive-disk1', 'discard':'unmap', 'rerror':'stop', 'werror':'stop', 'file': {'driver': 'host_device', 'filename': '/dev/sdb'}, 'cache': { 'writeback': false, 'direct': true, 'no-flush': false }}} }
{"execute":"device_add","arguments":{"driver":"virtio-scsi-pci","id":"scsi1","bus":"pci.0","addr":"0x8"}}
{"execute":"device_add","arguments":{"driver":"scsi-hd","drive":"drive-disk1","id":"data-disk1","bus":"scsi1.0"}}

4.hotunplug the disk:
{"execute":"device_del","arguments":{"id":"data-disk1"} }
{"execute":"device_del","arguments":{"id":"scsi1"} }
{"execute":"__com.redhat_drive_del","arguments":{"id":"drive-disk1"} }

5.qemu coredumped
(gdb) bt full
#0  0x00007fa33c8fdfc6 in __strcmp_sse42 () from /lib64/libc.so.6
No symbol table info available.
#1  0x00007fa344b6fc74 in qdict_find (qdict=qdict@entry=0x7ffe6c3eb3f8, key=key@entry=0x7fa344bfda9d "id", bucket=<optimized out>) at qobject/qdict.c:115
        entry = 0x7fa346490460
#2  0x00007fa344b70136 in qdict_get (qdict=0x7ffe6c3eb3f8, key=key@entry=0x7fa344bfda9d "id") at qobject/qdict.c:161
        entry = <optimized out>
#3  0x00007fa344b70319 in qdict_get_str (qdict=<optimized out>, key=key@entry=0x7fa344bfda9d "id") at qobject/qdict.c:285
No locals.
#4  0x00007fa3449cb3f7 in hmp_drive_del (mon=<optimized out>, qdict=<optimized out>) at blockdev.c:2741
        id = <optimized out>
        blk = <optimized out>
        bs = <optimized out>
        aio_context = <optimized out>
        local_err = 0x7fa346cf4400
#5  0x00007fa3449076eb in handle_qmp_command (parser=<optimized out>, tokens=<optimized out>) at /usr/src/debug/qemu-2.5.0/monitor.c:3905
        local_err = 0x0
        obj = <optimized out>
        data = 0x0
        input = <optimized out>
        args = 0x7fa346cf3200
        cmd_name = <optimized out>
        mon = 0x7fa3464c1500
        __func__ = "handle_qmp_command"
#6  0x00007fa344b71db0 in json_message_process_token (lexer=0x7fa3464c1568, input=0x7fa3464d4b60, type=JSON_RCURLY, x=71, y=8) at qobject/json-streamer.c:93
        parser = 0x7fa3464c1560
        token = 0x7fa34895c8b0
#7  0x00007fa344b85ab3 in json_lexer_feed_char (lexer=lexer@entry=0x7fa3464c1568, ch=125 '}', flush=flush@entry=false) at qobject/json-lexer.c:310
        new_state = <optimized out>
        __PRETTY_FUNCTION__ = "json_lexer_feed_char"
#8  0x00007fa344b85b7e in json_lexer_feed (lexer=0x7fa3464c1568, buffer=<optimized out>, size=<optimized out>) at qobject/json-lexer.c:360
        err = <optimized out>
        i = <optimized out>
#9  0x00007fa344b71ea9 in json_message_parser_feed (parser=<optimized out>, buffer=<optimized out>, size=<optimized out>) at qobject/json-streamer.c:113
No locals.
#10 0x00007fa3449059eb in monitor_qmp_read (opaque=<optimized out>, buf=<optimized out>, size=<optimized out>) at /usr/src/debug/qemu-2.5.0/monitor.c:3921
        old_mon = 0x0
#11 0x00007fa3449d310e in qemu_chr_be_write (len=<optimized out>, buf=0x7ffe6c3eb530 "}\265>l\376\177", s=0x7fa3464be880) at qemu-char.c:280
No locals.
#12 tcp_chr_read (chan=<optimized out>, cond=<optimized out>, opaque=0x7fa3464be880) at qemu-char.c:2902
        chr = 0x7fa3464be880
        s = 0x7fa3464945b0
        buf = "}\265>l\376\177\000\000\240\031\267D\243\177\000\000\030\003\000\000\000\000\000\000\365\003\267D\243\177\000\000 \347\355G\243\177\000\000\203\372\266D\243\177\000\000 \347\355G\243\177", '\000' <repeats 18 times>, "\364\026\267D\243\177\000\000\220ǥG\243\177\000\000\020\000\000\000\000\000\000\000\001\000\000\000\000\000\000\000\001\000\000\000\243\177\000\000 \347\355G\243\177\000\000\214\272a=\243\177\000\000\017\000\000\000\000\000\000\000\200MMF\243\177\000\000\000ȥG\243\177\000\000P\271>l\376\177\000\000\000\255\245G\243\177\000\000\000m\324G\243\177\000\000hd\324G\243\177\000\000\"\376`=\243\177\000\000\000\255\245G\243\177\000\000\r\000\000\000\000\000\000\000"...
        len = <optimized out>
        size = <optimized out>
#13 0x00007fa33d5ff79a in g_main_context_dispatch () from /lib64/libglib-2.0.so.0
No symbol table info available.
#14 0x00007fa344afd100 in glib_pollfds_poll () at main-loop.c:211
        context = 0x7fa346457080
        pfds = <optimized out>
#15 os_host_main_loop_wait (timeout=<optimized out>) at main-loop.c:256
        ret = 2
        spin_counter = 0
#16 main_loop_wait (nonblocking=<optimized out>) at main-loop.c:504
        ret = 2
        timeout = 4294967295
        timeout_ns = <optimized out>
#17 0x00007fa3448db6ee in main_loop () at vl.c:1923
        nonblocking = <optimized out>
        last_io = 2
#18 main (argc=<optimized out>, argv=<optimized out>, envp=<optimized out>) at vl.c:4695
        i = <optimized out>
        snapshot = <optimized out>
        linux_boot = <optimized out>
        initrd_filename = <optimized out>
        kernel_filename = <optimized out>
        kernel_cmdline = <optimized out>
        boot_order = 0x7fa346416948 "cdn"
        boot_once = 0x7fa346416958 "c"
        cyls = <optimized out>
        heads = <optimized out>
        secs = <optimized out>
        translation = <optimized out>
        hda_opts = <optimized out>
        opts = <optimized out>
        machine_opts = <optimized out>
        icount_opts = <optimized out>
        olist = <optimized out>
        optind = 64
        optarg = 0x7fa3464164e0 "pc"
        loadvm = <optimized out>
        machine_class = <optimized out>
        cpu_model = <optimized out>
        vga_model = 0x7ffe6c3edc71 "std"
        qtest_chrdev = <optimized out>
        qtest_log = <optimized out>
        pid_file = <optimized out>
        incoming = <optimized out>
        show_vnc_port = <optimized out>
        defconfig = <optimized out>
        userconfig = 113
        log_mask = <optimized out>
        log_file = <optimized out>
        trace_events = <optimized out>
        trace_file = <optimized out>
        maxram_size = <optimized out>
        ram_slots = <optimized out>
        vmstate_dump_file = <optimized out>
        main_loop_err = 0x0
        err = 0x0
        __func__ = "main"

Actual results:
qemu coredumped

Expected results:
no error, hotunplug successfully.

Additional info:
cpuinfo:
processor : 23
vendor_id : GenuineIntel
cpu family : 6
model : 45
model name : Intel(R) Xeon(R) CPU E5-2620 0 @ 2.00GHz
stepping : 7
microcode : 0x710
cpu MHz : 2403.593
cache size : 15360 KB
physical id : 1
siblings : 12
core id : 5
cpu cores : 6
apicid : 43
initial apicid : 43
fpu : yes
fpu_exception : yes
cpuid level : 13
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf eagerfpu pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr pdcm pcid dca sse4_1 sse4_2 x2apic popcnt tsc_deadline_timer aes xsave avx lahf_lm ida arat epb pln pts dtherm tpr_shadow vnmi flexpriority ept vpid xsaveopt
bogomips : 4004.81
clflush size : 64
cache_alignment : 64
address sizes : 46 bits physical, 48 bits virtual
power management:
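
The hot-unplug commands from step 4 can also be replayed from a script instead of nc. The following is a rough sketch only, not the test harness used for this report. It assumes the QMP monitor sends one JSON object per line, and it treats a closed connection as a sign that QEMU went away. The function names are made up for illustration; the socket path and the device ids are the ones from steps 2 to 4. The sketch sends the commands back to back as the steps do; it skips asynchronous events while waiting for each reply and does not wait for DEVICE_DELETED.

```python
import json
import socket


def qmp_command(name, **arguments):
    """Build a QMP command dict; 'arguments' is omitted when empty."""
    command = {"execute": name}
    if arguments:
        command["arguments"] = arguments
    return command


def hotunplug_sequence(disk_id, controller_id, drive_id):
    """Return the three commands of step 4, in the order given there."""
    return [
        qmp_command("device_del", id=disk_id),
        qmp_command("device_del", id=controller_id),
        qmp_command("__com.redhat_drive_del", id=drive_id),
    ]


def send_and_wait(sock_file, command):
    """Send one command and return its reply, skipping event messages.

    Assumes newline-delimited JSON from the monitor (an assumption of this
    sketch). Raises ConnectionError if the peer closes the socket.
    """
    sock_file.write(json.dumps(command).encode() + b"\n")
    sock_file.flush()
    while True:
        line = sock_file.readline()
        if not line:
            raise ConnectionError("QMP connection closed (did QEMU crash?)")
        reply = json.loads(line)
        if "return" in reply or "error" in reply:
            return reply


def replay_hotunplug(socket_path):
    """Connect to the QMP socket and run the step 4 commands in order."""
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as sock:
        sock.connect(socket_path)
        sock_file = sock.makefile("rwb")
        sock_file.readline()  # discard the QMP greeting banner
        send_and_wait(sock_file, qmp_command("qmp_capabilities"))
        for command in hotunplug_sequence("data-disk1", "scsi1", "drive-disk1"):
            print(command["execute"], "->", send_and_wait(sock_file, command))


if __name__ == "__main__":
    # Socket path taken from the qemu-kvm command line in step 2.
    replay_hotunplug("/var/tmp/monitor-qmpmonitor1-20160411-140419-Gg5YcEa4")
```

If the connection drops before the last reply arrives, check for a core file and collect a backtrace as requested in the comments.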