Bug 869586
Summary:          core dump happens when quitting qemu via monitor
Product:          Red Hat Enterprise Linux 6
Reporter:         Xiaoqing Wei <xwei>
Component:        qemu-kvm
Assignee:         Laszlo Ersek <lersek>
Status:           CLOSED ERRATA
QA Contact:       Virtualization Bugs <virt-bugs>
Severity:         medium
Priority:         medium
Version:          6.4
CC:               acathrow, areis, bsarathy, dron, flang, juzhang, kwolf, lnovich, michen, mkenneth, pbonzini, qzhang, shuang, stefanha, virt-maint
Target Milestone: rc
Hardware:         Unspecified
OS:               Unspecified
Fixed In Version: qemu-kvm-0.12.1.2-2.370.el6
Doc Type:         Bug Fix
Cloned To:        959102 (view as bug list)
Last Closed:      2013-11-21 05:54:48 UTC
Type:             Bug
Bug Blocks:       959102
Description
Xiaoqing Wei
2012-10-24 10:43:59 UTC
Created attachment 632688 [details]
thread apply all bt full
FWIW, this is unlikely to be a monitor/qmp bug. My impression is that either something is corrupting memory and the BH list got corrupted, or a bottom half got destroyed but was never removed from the BH list. Kevin, I remember seeing something similar involving quit in a block layer bug; can you take a quick look at this one? Xiaoqing, do you remember what you were doing when the bug happened?

Doesn't really ring a bell. I once had problems with nested event loops, where the inner qemu_bh_poll() would free the BH while the outer one was still working on it, but this should be fixed with 648fb0ea (which RHEL 6 has had since February). Paolo fixed a similar bug in aio.c, which RHEL doesn't have, but that one would crash in a different place. I would agree, though, that it's most likely not a monitor bug.

(In reply to comment #2)
Hi Luiz,

> Xiaoqing, do you remember what you were doing when the bug happened?

The only operation in the guest was shutting down (the guest was running without any workload); since the shutdown process was too slow, autotest tried to kill it via the monitor. Nothing else.

Xiaoqing

So, you do a shutdown in the guest and run quit in the monitor in parallel. I'll try that.

https://bugzilla.redhat.com/show_bug.cgi?id=952687#c4
Hi Luiz, we have another similar bz; could you please check whether it is a duplicate? Thanks.

*** Bug 952687 has been marked as a duplicate of this bug. ***

(In reply to comment #1)
> Created attachment 632688 [details]
> thread apply all bt full

Thread 6: kvm VCPU thread, running guest code
Thread 5: QXL thread (red_worker_main)
Thread 4: kvm VCPU thread, having just exited guest code and waiting for the BQL
Thread 3: POSIX AIO thread, waiting for a new AIO request
Thread 2: POSIX AIO thread, waiting for a new AIO request
Thread 1: qemu IO thread, finishing off an iteration of the main loop, crashing in qemu_bh_poll()

qemu_bh_poll() doesn't seem to be reentrant by different threads.
I can see many qemu_aio_flush() / qemu_aio_wait() calls around the source, both of which call qemu_bh_poll(). Are we sure no thread other than the IO thread massages the BH list? For example, in the above set of threads, could Thread 3 or Thread 2 (the AIO threads) interfere with the BH list?

qemu_bh_poll() is actually reentrant; see the "nesting" variable. It is also safe to call qemu_bh_schedule/cancel/delete while another thread is calling qemu_bh_poll(), though for cancel the results are of course unpredictable.

(In reply to comment #13)
> qemu_bh_poll() is reentrant actually, see the "nesting" variable.

It is reentrant from within the same thread; it's not reentrant for multiple threads without synchronization. (The "nesting" variable itself is not synchronized.) From comment #10:

> qemu_bh_poll() doesn't seem to be reentrant by different threads.

Then,

> It is also safe to call qemu_bh_schedule/cancel/delete while another thread
> is calling qemu_bh_poll(),

How so? The first three functions modify fields ("scheduled" / "deleted") of the referenced QEMUBH object, while the other thread calling qemu_bh_poll() reads those fields. I can't see any synchronization. Without synchronization, changes to these fields made by thread T1 are not required to be visible to thread T2 for any length of time. If T1 destroys some "business logic" object and deletes a pending, associated BH with qemu_bh_delete(), T2 may not see the "deleted" field as nonzero in qemu_bh_poll(), and may call the callback that supposedly operates on the (already nonexistent) "business logic" object.

Created attachment 739640 [details]
core dump file - 0
Since the file share can only keep the file for a limited period of time, I am attaching the core dump file here.
Created attachment 739641 [details]
core dump file - 1
Created attachment 739643 [details]
core dump file - 2
(In reply to comment #10)
> Thread 6: kvm VCPU thread, running guest code
> Thread 5: QXL thread (red_worker_main)
> Thread 4: kvm VCPU thread, having just exited guest code & waiting for the BQL
> Thread 3: Posix AIO thread, waiting for a new AIO request
> Thread 2: Posix AIO thread, waiting for a new AIO request
> Thread 1: qemu IO thread, finishing off an iteration of the main loop,
> crashing in qemu_bh_poll()
>
> qemu_bh_poll() doesn't seem to be reentrant by different threads.

Agreed. It must be called with the global mutex held.

> I can see many qemu_aio_flush() / qemu_aio_wait() calls around the source,
> both of which functions call qemu_bh_poll(). Are we sure no other thread
> than the IO thread massages the BH list? For example, in the above set of
> threads, could Thread 3 or Thread 2 (AIO threads) interfere with the BH list?

The chance is low. The POSIX AIO workers really are just preadv()/pwritev()/fdatasync()/ioctl() worker functions; they don't call back into QEMU code.

What I find interesting is:

#0 qemu_bh_poll () at async.c:65
bh = 0xff575700ffff57

The address looks very suspicious, since a heap-allocated address should be aligned to sizeof(void*). This is probably memory corruption.

> > qemu_bh_poll() is reentrant actually, see the "nesting" variable.
>
> It is reentrant from within the same thread; it's not reentrant for multiple
> threads without synchronization. (The "nesting" variable itself is not
> synchronized.) From comment #10:

Right, you're supposed to call it within the "big QEMU lock", which is almost always held.

> > It is also safe to call qemu_bh_schedule/cancel/delete while another thread
> > is calling qemu_bh_poll(),
>
> How so? The first three functions modify fields ("scheduled" / "deleted") of
> the referenced QEMUBH object, while the other thread calling qemu_bh_poll()
> reads those fields. I can't see any synchronization.
> Without synchronization, changes to these fields made by thread T1 are not
> required to be visible to thread T2 for any length of time.

The qemu_notify_event() ensures that qemu_bh_poll() is executed from the beginning once more after qemu_bh_schedule(). (It is reasonable to assume that a system call includes a memory barrier.)

> If T1 destroys some "business logic" object and deletes a pending, associated
> BH with qemu_bh_delete(), T2 may not see the "deleted" field as nonzero in
> qemu_bh_poll(), and call the callback that supposedly operates on the
> (already nonexistent) "business logic" object.

This is beyond the scope of the BH machinery. It should be handled with some kind of reference counting, or by deferring the actual deletion of the object to a(nother) bottom half or another callback that is guaranteed to be called under the big QEMU lock. Anyway, this doesn't happen in RHEL 6, which is almost exclusively single-threaded.

Upstream commits related to the bug that Kevin mentioned in comment 3:

- commit 4231c88d27d9e46e6ad6e6b7bbb6e442bcf9cd05
- commit 2db2bfc0ccac5fd68dbf0ceb70fbc372c5d8a8c7
- commit 00f78533326c5ba2e62fafada16655aa558a5520

but this is unlikely to be the same corruption.

Created attachment 744712 [details]
BH list from coredump in comments 17-19
The coredump attached in comments 17-19 captured a slightly different crash.
There are 5 threads; 4 VCPU threads are waiting for the qemu mutex. The IO
thread crashes as follows:
#0 qemu_bh_delete (bh=0x2) at /usr/src/debug/qemu-kvm-0.12.1.2/async.c:118
#1 0x00007f81bfe3adbf in virtio_blk_dma_restart_bh (opaque=<value optimized out>)
at /usr/src/debug/qemu-kvm-0.12.1.2/hw/virtio-blk.c:465
#2 0x00007f81bfe5d031 in qemu_bh_poll () at /usr/src/debug/qemu-kvm-0.12.1.2/async.c:70
#3 0x00007f81bfe26419 in main_loop_wait (timeout=1000) at /usr/src/debug/qemu-kvm-0.12.1.2/vl.c:4017
#4 0x00007f81bfe4898a in kvm_main_loop () at /usr/src/debug/qemu-kvm-0.12.1.2/qemu-kvm.c:2244
#5 0x00007f81bfe29018 in main_loop (argc=50, argv=<value optimized out>, envp=<value optimized out>)
at /usr/src/debug/qemu-kvm-0.12.1.2/vl.c:4187
#6 main (argc=50, argv=<value optimized out>, envp=<value optimized out>)
at /usr/src/debug/qemu-kvm-0.12.1.2/vl.c:6526
(gdb) thread 1
[Switching to thread 1 (Thread 0x7f81bfd9d980 (LWP 15318))]#0 qemu_bh_delete (bh=0x2)
at /usr/src/debug/qemu-kvm-0.12.1.2/async.c:118
118 bh->scheduled = 0;
(gdb) frame 1
#1 0x00007f81bfe3adbf in virtio_blk_dma_restart_bh (opaque=<value optimized out>)
at /usr/src/debug/qemu-kvm-0.12.1.2/hw/virtio-blk.c:465
465 qemu_bh_delete(s->bh);
(gdb) print s
$1 = (VirtIOBlock *) 0x7f81c0c28270
The BH chain seems more or less sound, but either the VirtIOBlock objects
linked via bh->opaque are garbled, or the opaque link itself is messed up.
We crash under the 5th BH (opaque == 0x7f81c0c28270).
See the attached graph.
Are so many virtio_blk_dma_restart_bh() BHs okay for a virtual machine that
has only two disks?
Could upstream commit 69b302b2044a9a0f6d157d25b39a91ff7124c61f be related?

The virtlab job linked in comment 16 says "block_hotplug" many times. Bug 822386, also linked from the virtlab job, has the same stackdump as comment 22 here. Luiz reproduced the problem with the upstream tree on 2012-Aug-16 (see bug 822386 comment 23), but upstream commit 69b302b2044a9a0f6d157d25b39a91ff7124c61f is from Feb 2013.

My reason for suspecting 69b302b2 is this: when we unplug the virtio-blk disk, the vmstate change handler it registered in virtio_blk_init() is not torn down in RHEL-6. So when we issue a shutdown request (do_quit() --> qemu_system_shutdown_request()), the vmstate change handler, virtio_blk_dma_restart_cb(),

- accesses invalid memory directly,
- registers a BH (= virtio_blk_dma_restart_bh) for the nonexistent device.

This seems consistent with the DDD data chart in comment 22, with several pointer chains leading to "storage0.qcow2", which I suppose is the disk that's repeatedly (un)plugged. (I would prefer to confirm this against "block_hotplug.one_pci.fmt_qcow2.block_virtio" but I don't know how to download the steps for that test case.)

I found a crazy long list of virtio_blk_dma_restart_cb() handlers in the core dump, hanging off "vm_change_state_head".

*** Bug 822386 has been marked as a duplicate of this bug. ***

Created attachment 745608 [details]
virtio-blk: fix unplug + virsh reboot (RHEL-6 backport of 69b302b2)
Reproduced this bug with the following versions:

Host:
# uname -r
2.6.32-358.6.1.el6.x86_64
# rpm -q qemu-kvm
qemu-kvm-0.12.1.2-2.356.el6.x86_64

Guest: RHEL 5.8-64, 2.6.18-348.el5

Because this bug's scenario is very difficult to reproduce, the scenario from bug 822386 was used instead. Steps are the same as in the duplicate bug 822386 (see its description):

1. Boot the guest:

/usr/libexec/qemu-kvm -monitor stdio -S -chardev socket,id=serial_id_20120515-041452-KkUY,path=/tmp/serial-20120515-041452-KkUY,server,nowait -device isa-serial,chardev=serial_id_20120515-041452-KkUY -device ich9-usb-uhci1,id=usb1,bus=pci.0,addr=0x4 -drive file=/home/RHEL-Server-5.8-64.qcow2,index=0,if=none,id=drive-virtio-disk1,media=disk,cache=none,boot=on,snapshot=off,readonly=off,format=qcow2,aio=native -device virtio-blk-pci,bus=pci.0,addr=0x5,drive=drive-virtio-disk1,id=virtio-disk1 -device virtio-net-pci,netdev=idLYjg29,mac=9a:6e:47:a6:d8:f9,id=ndev00idLYjg29,bus=pci.0,addr=0x3 -netdev tap,id=idLYjg29,vhost=on -m 2048 -smp 4,cores=2,threads=1,sockets=2 -cpu 'Opteron_G4' -drive index=1,if=none,id=drive-ide0-0-0,media=cdrom,boot=off,snapshot=off,readonly=on,format=raw -device ide-drive,bus=ide.0,unit=0,drive=drive-ide0-0-0,id=ide0-0-0 -device usb-tablet,id=usb-tablet1,bus=usb1.0 -spice port=8000,disable-ticketing -vga qxl -rtc base=utc,clock=host,driftfix=slew -M rhel6.3.0 -boot order=cdn,once=c,menu=off -no-kvm-pit-reinjection -enable-kvm

2. In the guest:
# modprobe acpiphp

3. Hot-plug a virtio-blk device:
(qemu) __com.redhat_drive_add id=hot1,file=/root/hot1.raw,format=raw,media=disk
(qemu) device_add virtio-blk-pci,id=hot_virtio,drive=hot1

4. Delete the virtio-blk device:
(qemu) device_del hot_virtio

5. Stop/cont the guest:
(qemu) stop

Results:
...
(qemu) __com.redhat_drive_add id=hot1,file=/root/hot1.raw,format=raw,media=disk
(qemu) device_add virtio-blk-pci,id=hot_virtio,drive=hot1
[New Thread 0x7fffe53fb700 (LWP 8481)]
(qemu) device_del hot_virtio
(qemu) stop
(qemu) con[Thread 0x7fffe53fb700 (LWP 8481) exited]
cont
(qemu)
Program received signal SIGSEGV, Segmentation fault.
qemu_bh_delete (bh=0x390) at /usr/src/debug/qemu-kvm-0.12.1.2/async.c:118
118 bh->scheduled = 0;
(gdb) bt
#0 qemu_bh_delete (bh=0x390) at /usr/src/debug/qemu-kvm-0.12.1.2/async.c:118
#1 0x00007ffff7df3daf in virtio_blk_dma_restart_bh (opaque=<value optimized out>)
   at /usr/src/debug/qemu-kvm-0.12.1.2/hw/virtio-blk.c:465
#2 0x00007ffff7e15ff1 in qemu_bh_poll () at /usr/src/debug/qemu-kvm-0.12.1.2/async.c:70
#3 0x00007ffff7ddf419 in main_loop_wait (timeout=1000) at /usr/src/debug/qemu-kvm-0.12.1.2/vl.c:4017
#4 0x00007ffff7e0197a in kvm_main_loop () at /usr/src/debug/qemu-kvm-0.12.1.2/qemu-kvm.c:2244
#5 0x00007ffff7de2008 in main_loop (argc=42, argv=<value optimized out>, envp=<value optimized out>)
   at /usr/src/debug/qemu-kvm-0.12.1.2/vl.c:4187

Verified this bug with the following versions:

Host:
# uname -r
2.6.32-392.el6.x86_64
# rpm -q qemu-kvm
qemu-kvm-0.12.1.2-2.376.el6.x86_64

Guest: 2.6.18-348.el5

Steps are the same as in the reproduction. Because this bug is difficult to reproduce, verification was done with a script:

# cat bug869586-loop.sh
#!/bin/bash
while true
do
    echo "info pci" | nc -U /tmp/monitor-unix
    echo "__com.redhat_drive_add id=hot1,file=/root/hot1.raw,format=raw,media=disk" | nc -U /tmp/monitor-unix
    echo "device_add virtio-blk-pci,id=hot_virtio,drive=hot1" | nc -U /tmp/monitor-unix; sleep 3
    echo "info pci" | nc -U /tmp/monitor-unix
    echo "device_del hot_virtio" | nc -U /tmp/monitor-unix
    echo "info pci" | nc -U /tmp/monitor-unix
    echo "stop" | nc -U /tmp/monitor-unix
    echo "cont" | nc -U /tmp/monitor-unix
    sleep 3
done

Results: Tried more than 400 iterations without hitting any problem; the guest works well. So this bug is fixed.
*** Bug 918798 has been marked as a duplicate of this bug. ***

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHSA-2013-1553.html