Bug 1123698 - qemu-kvm core dump when hot-plug a virtio-scsi disk to guest and reboot
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: qemu-kvm
Version: 6.6
Hardware: x86_64
OS: Linux
Priority: urgent
Severity: high
Target Milestone: rc
Target Release: ---
Assignee: Fam Zheng
QA Contact: Virtualization Bugs
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2014-07-28 02:16 UTC by Xiaomei Gao
Modified: 2014-10-14 07:03 UTC

Fixed In Version: qemu-kvm-0.12.1.2-2.433.el6
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2014-10-14 07:03:02 UTC


Links
System | ID | Priority | Status | Summary | Last Updated
Red Hat Product Errata | RHBA-2014:1490 | normal | SHIPPED_LIVE | qemu-kvm bug fix and enhancement update | 2014-10-14 01:28:27 UTC

Description Xiaomei Gao 2014-07-28 02:16:57 UTC
Description of problem:
qemu-kvm dumps core when a disk is hot-plugged and the guest is then rebooted. After downgrading to qemu-kvm-0.12.1.2-2.428.el6.x86_64.rpm, it works well, so this is a regression introduced by "Enable ioeventfd for virtio-scsi-pci".

Version-Release number of selected component (if applicable):
Host version:
kernel-2.6.32-490.el6.x86_64
qemu-kvm-0.12.1.2-2.430.el6.x86_64

Guest version:
qemu-kvm-0.12.1.2-2.430.el6.x86_64

How reproducible:
100%

Steps to Reproduce:
1. Boot guest
/usr/libexec/qemu-kvm  \
    -name 'virt-tests-vm1' \
    -nodefaults \
    -device virtio-scsi-pci,id=virtio_scsi_pci0,addr=0x4 \
    -drive file='/home/RHEL-Server-6.6-64bit.qcow2',if=none,id=virtio-scsi0-id0,media=disk,cache=none,snapshot=off,format=qcow2,aio=threads \
    -device scsi-hd,bus=virtio_scsi_pci0.0,drive=virtio-scsi0-id0,id=scsi0,bootindex=0 \
    -device e1000,netdev=idIeF4BM,mac='9a:38:38:38:38:8e',bus=pci.0,addr=0x5,id='idwsOlVH' \
    -netdev tap,id=idIeF4BM \
    -m 8192 \
    -smp 8,maxcpus=8,cores=4,threads=1,sockets=2 \
    -cpu 'Westmere' \
    -M rhel6.5.0 \
    -enable-kvm \
    -monitor stdio \
    -qmp tcp:0:6666,server,nowait

2. Hot-plug a data disk
(QMP){"execute":"__com.redhat_drive_add", "arguments": {"file":"/home/storage1.qcow2","format":"qcow2","id":"virtio-scsi0-id1"}}

(QMP){"execute":"device_add","arguments":{"driver":"virtio-scsi-pci","id":"virtio_scsi_pci1"}}

(QMP){"execute":"device_add","arguments":{"driver":"scsi-hd","drive":"virtio-scsi0-id1","id":"scsi1"}}

3. Reboot in guest
# reboot
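
For scripted reproduction, step 2 can also be driven over the QMP socket opened by `-qmp tcp:0:6666,server,nowait`. The following is only a sketch, not something used in this report: the function names and the 127.0.0.1 host are illustrative assumptions, and it assumes the monitor sends the usual greeting banner and expects `qmp_capabilities` before any other command.

```python
import json
import socket


def hotplug_commands(image_path, drive_id, controller_id, disk_id):
    """Build the three QMP commands from step 2, in the order they are sent."""
    return [
        {"execute": "__com.redhat_drive_add",
         "arguments": {"file": image_path, "format": "qcow2", "id": drive_id}},
        {"execute": "device_add",
         "arguments": {"driver": "virtio-scsi-pci", "id": controller_id}},
        {"execute": "device_add",
         "arguments": {"driver": "scsi-hd", "drive": drive_id, "id": disk_id}},
    ]


def read_reply(stream):
    """Return the next QMP message, skipping asynchronous event messages."""
    while True:
        line = stream.readline()
        if not line:
            raise ConnectionError("QMP connection closed")
        message = json.loads(line)
        if "event" not in message:
            return message


def run_hotplug(host="127.0.0.1", port=6666):
    """Send the hot-plug sequence; host and port are assumptions for this sketch."""
    commands = [{"execute": "qmp_capabilities"}] + hotplug_commands(
        "/home/storage1.qcow2", "virtio-scsi0-id1", "virtio_scsi_pci1", "scsi1")
    replies = []
    with socket.create_connection((host, port)) as sock:
        with sock.makefile("rw", encoding="utf-8", newline="\n") as stream:
            read_reply(stream)  # greeting banner sent by the monitor on connect
            for command in commands:
                stream.write(json.dumps(command) + "\n")
                stream.flush()
                reply = read_reply(stream)
                if "error" in reply:
                    raise RuntimeError(
                        "%s failed: %s" % (command["execute"], reply["error"]))
                replies.append(reply)
    return replies
```

Calling `run_hotplug()` against the running guest and then rebooting inside the guest corresponds to steps 2 and 3.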

Actual results:
Qemu-kvm core dump.
(gdb) bt
#0  0x00007ffff4843915 in raise () from /lib64/libc.so.6
#1  0x00007ffff48450f5 in abort () from /lib64/libc.so.6
#2  0x00007ffff483ca3e in __assert_fail_base () from /lib64/libc.so.6
#3  0x00007ffff483cb00 in __assert_fail () from /lib64/libc.so.6
#4  0x00007ffff7dd1c85 in virtio_pci_stop_ioeventfd (proxy=0x7ffff8a436f0) at /usr/src/debug/qemu-kvm-0.12.1.2/hw/virtio-pci.c:314
#5  0x00007ffff7dd2b25 in virtio_ioport_write (opaque=0x7ffff8a436f0, addr=<value optimized out>, val=0)
    at /usr/src/debug/qemu-kvm-0.12.1.2/hw/virtio-pci.c:368
#6  0x00007ffff7ddf0d7 in kvm_handle_io (env=0x7ffff88649d0) at /usr/src/debug/qemu-kvm-0.12.1.2/kvm-all.c:145
#7  kvm_run (env=0x7ffff88649d0) at /usr/src/debug/qemu-kvm-0.12.1.2/qemu-kvm.c:1061
#8  0x00007ffff7ddf2c9 in kvm_cpu_exec (env=0x7ffff88649d0) at /usr/src/debug/qemu-kvm-0.12.1.2/qemu-kvm.c:1756
#9  0x00007ffff7de01bd in kvm_main_loop_cpu (_env=0x7ffff88649d0) at /usr/src/debug/qemu-kvm-0.12.1.2/qemu-kvm.c:2018
#10 ap_main_loop (_env=0x7ffff88649d0) at /usr/src/debug/qemu-kvm-0.12.1.2/qemu-kvm.c:2074
#11 0x00007ffff76ef9d1 in start_thread () from /lib64/libpthread.so.0
#12 0x00007ffff48f9ccd in clone () from /lib64/libc.so.6

Expected results:
Guest works happily.

Additional info:

Comment 1 Xiaomei Gao 2014-07-28 02:20:07 UTC
(In reply to Xiaomei Gao from comment #0)
> Guest version:
> qemu-kvm-0.12.1.2-2.430.el6.x86_64

Here is correct guest version
kernel-2.6.32-490.el6.x86_64

Comment 4 Xiaomei Gao 2014-07-28 03:18:39 UTC
We also tried hot-plugging a virtio-blk disk, and that did not trigger the issue.

Btw, if the hot-plugged disk shares a SCSI controller with the system disk, the issue is not triggered either.

Comment 5 Fam Zheng 2014-07-29 01:35:19 UTC
The root cause of the issue is explained below.

Hot-plugged devices are initialized by the guest after the existing devices are already running. When the virtio-scsi controller has just been plugged, the guest initializes it with a newly allocated BAR address.

When the guest reboots, the device is seen at boot time. It is initialized together with all the other devices, in whatever order the guest prefers.

In other words, in the first case virtio-scsi is initialized after all the other devices, but not in the second case. This results in a different BAR in RHEL 6 (not quite so in RHEL 7, though).

So on reboot, the BAR is changed and the device is reset.

Currently in qemu-kvm, we set up ioeventfd with an address computed from the BAR at the time of device initialization, and tear it down on device reset with the same address. The issue arises when the BAR changes: we lose the original ioeventfd address and use the new BAR to unassign it. Hence the assert on the success of the unassign fails, with -ENOENT.

The fix is to stop ioeventfd before the BAR change and start it again afterwards.

Thanks to Paolo for helping debug this issue.

Fam
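
The address mismatch described above can be illustrated with a toy model. This is a sketch only, not qemu-kvm code: the class names and the BAR values are made up for illustration, and a plain set stands in for the kernel's ioeventfd registry. The only value taken from real headers is the legacy virtio-pci queue-notify offset of 16.

```python
import errno

VIRTIO_PCI_QUEUE_NOTIFY = 16  # legacy virtio-pci notify register offset


class IoeventfdTable:
    """Toy stand-in for the registry of ioeventfd addresses."""

    def __init__(self):
        self.addresses = set()

    def assign(self, addr):
        self.addresses.add(addr)
        return 0

    def unassign(self, addr):
        if addr not in self.addresses:
            return -errno.ENOENT  # nothing registered at this address
        self.addresses.remove(addr)
        return 0


class VirtioPciModel:
    """Hypothetical device model that derives the notify address from its BAR."""

    def __init__(self, table, bar):
        self.table = table
        self.bar = bar
        self.started = False

    def notify_addr(self):
        return self.bar + VIRTIO_PCI_QUEUE_NOTIFY

    def start_ioeventfd(self):
        if self.started:
            return 0
        r = self.table.assign(self.notify_addr())
        self.started = (r == 0)
        return r

    def stop_ioeventfd(self):
        if not self.started:
            return 0
        r = self.table.unassign(self.notify_addr())
        if r == 0:
            self.started = False
        return r

    def remap_bar_buggy(self, new_bar):
        # The old notify address is forgotten while ioeventfd is still active.
        self.bar = new_bar

    def remap_bar_fixed(self, new_bar):
        # Stop at the old address, move the BAR, restart at the new address.
        was_started = self.started
        self.stop_ioeventfd()
        self.bar = new_bar
        if was_started:
            self.start_ioeventfd()
```

In this model, a device that is started, remapped with `remap_bar_buggy`, and then stopped gets `-errno.ENOENT` back from `stop_ioeventfd`, which is the condition the assertion in the backtrace trips on. With `remap_bar_fixed` the same sequence returns 0.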

Comment 6 Fam Zheng 2014-07-29 12:02:07 UTC
Normally, ioeventfd is not started when the guest changes the BAR. It is still started here because of another bug: qemu-kvm forgets to reset virtio-scsi on VM reboot.

So two separate bugs need to be fixed here.
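
The second bug can be pictured the same way. The comment does not say why the reset was missed, so the sketch below simply models it as a reset handler that was never registered. That is an assumption made for illustration, and all names are hypothetical.

```python
class Controller:
    """Toy virtio-scsi controller that only tracks ioeventfd state."""

    def __init__(self, name):
        self.name = name
        self.ioeventfd_started = False

    def driver_ok(self):
        # Guest driver finished setup, so the device starts ioeventfd.
        self.ioeventfd_started = True

    def reset(self):
        # A device reset tears ioeventfd down again.
        self.ioeventfd_started = False


class Machine:
    """Toy machine that resets every registered device on reboot."""

    def __init__(self):
        self.reset_handlers = []

    def plug(self, device, register_reset=True):
        # register_reset=False models the build that forgets the reset.
        if register_reset:
            self.reset_handlers.append(device.reset)
        device.driver_ok()
        return device

    def reboot(self):
        for handler in self.reset_handlers:
            handler()
```

With `register_reset=False` the controller still has `ioeventfd_started == True` after `reboot()`, so ioeventfd is live when the guest reprograms the BAR. With the handler registered, it is already stopped by then.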

Comment 7 Qunfang Zhang 2014-07-30 01:18:00 UTC
(In reply to Fam Zheng from comment #6)
> Normally, ioeventfd is not started when guest changes BAR. Another bug in
> qemu-kvm forgets to reset virtio-scsi on vm reboot.
> 
> So two separate bugs need to be fixed here.

Hi, Fam

Could we cover both of the two fixes with the scenario in comment 0 when we get the fixed build?

Comment 8 Fam Zheng 2014-07-30 01:40:00 UTC
Yes, please test the scenario in comment 0.

Comment 11 Jeff Nelson 2014-07-30 21:55:56 UTC
Fix included in qemu-kvm-0.12.1.2-2.433.el6

Comment 13 Xiaomei Gao 2014-08-13 05:39:08 UTC
- Reproduce the issue on qemu-kvm-0.12.1.2-2.430.el6.x86_64.

1. Prepare a data disk which will be hot-plugged
# qemu-img create -f qcow2 /home/storage1.qcow2  10G

2. Boot guest
# /usr/libexec/qemu-kvm  \
    -device virtio-scsi-pci,id=virtio_scsi_pci0,addr=0x4 \
    -drive file='/home/RHEL-Server-6.6-64bit.qcow2',if=none,id=virtio-scsi0-id0,media=disk,cache=none,snapshot=off,format=qcow2,aio=threads \
    -device scsi-hd,drive=virtio-scsi0-id0 \
    -qmp tcp:0:6666,server,nowait

3. Hot-plug the data disk
(QMP){"execute":"__com.redhat_drive_add", "arguments": {"file":"/home/storage1.qcow2","format":"qcow2","id":"virtio-scsi0-id1"}}
(QMP){"execute":"device_add","arguments":{"driver":"virtio-scsi-pci","id":"virtio_scsi_pci1", "addr":"0x6"}}
(QMP){"execute":"device_add","arguments":{"driver":"scsi-hd","drive":"virtio-scsi0-id1","id":"scsi1"}}

4. Reboot guest
# reboot

5. Results
Qemu-kvm core dump
(qemu) qemu-kvm: virtio_pci_set_host_notifier_internal: unable to unmap ioeventfd: -2
qemu-kvm: /builddir/build/BUILD/qemu-kvm-0.12.1.2/hw/virtio-pci.c:314: virtio_pci_stop_ioeventfd: Assertion `r >= 0' failed

- Verify the bug on qemu-kvm-0.12.1.2-2.436.el6.x86_64.

1. Host kernel : kernel-2.6.32-496.el6.x86_64
   Guest kernel version : kernel-2.6.32-496.el6.x86_64

2. Repeat the above steps 3 times.

3. Results:
Qemu-kvm works happily and smoothly

Based on the above test results, the bug has been verified.

Comment 14 errata-xmlrpc 2014-10-14 07:03:02 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHBA-2014-1490.html

