Bug 1123698

Summary: qemu-kvm core dump when hot-plug a virtio-scsi disk to guest and reboot
Product: Red Hat Enterprise Linux 6 Reporter: Xiaomei Gao <xigao>
Component: qemu-kvm Assignee: Fam Zheng <famz>
Status: CLOSED ERRATA QA Contact: Virtualization Bugs <virt-bugs>
Severity: high Docs Contact:
Priority: urgent    
Version: 6.6 CC: bsarathy, chayang, famz, juzhang, mazhang, michen, mkenneth, qzhang, rbalakri, tlavigne, virt-maint, wquan, xigao
Target Milestone: rc Keywords: Regression
Target Release: ---   
Hardware: x86_64   
OS: Linux   
Whiteboard:
Fixed In Version: qemu-kvm-0.12.1.2-2.433.el6 Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2014-10-14 07:03:02 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Xiaomei Gao 2014-07-28 02:16:57 UTC
Description of problem:
Qemu-kvm core dumps when a disk is hot-plugged and the guest is then rebooted. After downgrading to qemu-kvm-0.12.1.2-2.428.el6.x86_64.rpm, it works well, so this is a regression introduced by "Enable ioeventfd for virtio-scsi-pci".

Version-Release number of selected component (if applicable):
Host version:
kernel-2.6.32-490.el6.x86_64
qemu-kvm-0.12.1.2-2.430.el6.x86_64

Guest version:
qemu-kvm-0.12.1.2-2.430.el6.x86_64

How reproducible:
100%

Steps to Reproduce:
1. Boot guest
/usr/libexec/qemu-kvm  \
    -name 'virt-tests-vm1' \
    -nodefaults \
    -device virtio-scsi-pci,id=virtio_scsi_pci0,addr=0x4 \
    -drive file='/home/RHEL-Server-6.6-64bit.qcow2',if=none,id=virtio-scsi0-id0,media=disk,cache=none,snapshot=off,format=qcow2,aio=threads \
    -device scsi-hd,bus=virtio_scsi_pci0.0,drive=virtio-scsi0-id0,id=scsi0,bootindex=0 \
    -device e1000,netdev=idIeF4BM,mac='9a:38:38:38:38:8e',bus=pci.0,addr=0x5,id='idwsOlVH' \
    -netdev tap,id=idIeF4BM \
    -m 8192 \
    -smp 8,maxcpus=8,cores=4,threads=1,sockets=2 \
    -cpu 'Westmere' \
    -M rhel6.5.0 \
    -enable-kvm \
    -monitor stdio \
    -qmp tcp:0:6666,server,nowait

2. Hot-plug a data disk
(QMP){"execute":"__com.redhat_drive_add", "arguments": {"file":"/home/storage1.qcow2","format":"qcow2","id":"virtio-scsi0-id1"}}

(QMP){"execute":"device_add","arguments":{"driver":"virtio-scsi-pci","id":"virtio_scsi_pci1"}}

(QMP){"execute":"device_add","arguments":{"driver":"scsi-hd","drive":"virtio-scsi0-id1","id":"scsi1"}}

3. Reboot in guest
# reboot
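
For repeated runs, step 2 can be scripted instead of typed by hand. The following is a minimal sketch, assuming the QMP socket from step 1 (tcp port 6666) is reachable on the local host and that each QMP message arrives on a single line. The helper names build_hotplug_commands and run_qmp are hypothetical and only for illustration; the command names and arguments are the ones shown in step 2.

```python
import json
import socket


def build_hotplug_commands(image, drive_id, controller_id, disk_id):
    """Return the three QMP commands from step 2, in the same order."""
    return [
        {"execute": "__com.redhat_drive_add",
         "arguments": {"file": image, "format": "qcow2", "id": drive_id}},
        {"execute": "device_add",
         "arguments": {"driver": "virtio-scsi-pci", "id": controller_id}},
        {"execute": "device_add",
         "arguments": {"driver": "scsi-hd", "drive": drive_id, "id": disk_id}},
    ]


def run_qmp(commands, host="127.0.0.1", port=6666):
    """Send commands over a QMP TCP socket and return one reply per command.

    Assumption: the peer speaks line-delimited JSON (one message per line).
    """
    replies = []
    with socket.create_connection((host, port)) as sock:
        stream = sock.makefile("rw")
        stream.readline()  # discard the QMP greeting banner
        # QMP requires capabilities negotiation before any other command.
        for cmd in [{"execute": "qmp_capabilities"}] + list(commands):
            stream.write(json.dumps(cmd) + "\n")
            stream.flush()
            while True:
                line = stream.readline()
                if not line:
                    raise ConnectionError("QMP socket closed unexpectedly")
                msg = json.loads(line)
                # Skip asynchronous events; a reply has "return" or "error".
                if "return" in msg or "error" in msg:
                    replies.append(msg)
                    break
    return replies
```

Calling run_qmp(build_hotplug_commands("/home/storage1.qcow2", "virtio-scsi0-id1", "virtio_scsi_pci1", "scsi1")) would issue the same sequence as step 2; the first reply in the returned list belongs to qmp_capabilities.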

Actual results:
Qemu-kvm core dump.
(gdb) bt
#0  0x00007ffff4843915 in raise () from /lib64/libc.so.6
#1  0x00007ffff48450f5 in abort () from /lib64/libc.so.6
#2  0x00007ffff483ca3e in __assert_fail_base () from /lib64/libc.so.6
#3  0x00007ffff483cb00 in __assert_fail () from /lib64/libc.so.6
#4  0x00007ffff7dd1c85 in virtio_pci_stop_ioeventfd (proxy=0x7ffff8a436f0) at /usr/src/debug/qemu-kvm-0.12.1.2/hw/virtio-pci.c:314
#5  0x00007ffff7dd2b25 in virtio_ioport_write (opaque=0x7ffff8a436f0, addr=<value optimized out>, val=0)
    at /usr/src/debug/qemu-kvm-0.12.1.2/hw/virtio-pci.c:368
#6  0x00007ffff7ddf0d7 in kvm_handle_io (env=0x7ffff88649d0) at /usr/src/debug/qemu-kvm-0.12.1.2/kvm-all.c:145
#7  kvm_run (env=0x7ffff88649d0) at /usr/src/debug/qemu-kvm-0.12.1.2/qemu-kvm.c:1061
#8  0x00007ffff7ddf2c9 in kvm_cpu_exec (env=0x7ffff88649d0) at /usr/src/debug/qemu-kvm-0.12.1.2/qemu-kvm.c:1756
#9  0x00007ffff7de01bd in kvm_main_loop_cpu (_env=0x7ffff88649d0) at /usr/src/debug/qemu-kvm-0.12.1.2/qemu-kvm.c:2018
#10 ap_main_loop (_env=0x7ffff88649d0) at /usr/src/debug/qemu-kvm-0.12.1.2/qemu-kvm.c:2074
#11 0x00007ffff76ef9d1 in start_thread () from /lib64/libpthread.so.0
#12 0x00007ffff48f9ccd in clone () from /lib64/libc.so.6

Expected results:
Guest works happily.

Additional info:

Comment 1 Xiaomei Gao 2014-07-28 02:20:07 UTC
(In reply to Xiaomei Gao from comment #0)
> Guest version:
> qemu-kvm-0.12.1.2-2.430.el6.x86_64

Here is the correct guest version:
kernel-2.6.32-490.el6.x86_64

Comment 4 Xiaomei Gao 2014-07-28 03:18:39 UTC
We tried hot-plugging a virtio-blk disk and it did not trigger the issue.

By the way, if the hot-plugged disk shares a SCSI controller with the system disk, the issue is not triggered.

Comment 5 Fam Zheng 2014-07-29 01:35:19 UTC
The root cause of the issue is described below.

Hot-plugged devices are initialized by the guest after the existing devices are already running. When the virtio-scsi device has just been plugged, the guest initializes it with a newly allocated BAR address.

When the guest reboots, the device is seen at boot time. It is initialized together with all the other devices, in whatever order the guest prefers.

In other words, in the first case, virtio-scsi is initialized after all the other devices, but not in the second case. This results in a different BAR in RHEL 6 (though not quite in RHEL 7).

So on reboot, the BAR changes and the device is reset.

Currently in qemu-kvm, we need to set up ioeventfd with an address computed from the BAR at the time of device initialization, and tear it down on device reset with the same address. The problem arises when the BAR changes: we lose the original ioeventfd address and use the new BAR to unassign it. Hence the unassign fails with -ENOENT, and the assertion that it succeeded aborts qemu-kvm.

The fix is to stop ioeventfd before BAR change and start it after.
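
The failure mode and the fix described above can be illustrated with a small toy model. This is only a sketch, not qemu-kvm code: FakeKvm and FakeProxy are hypothetical stand-ins, the BAR values 0xc000 and 0xc100 are made up, and the notify offset of 16 is used purely to show that the registered address is derived from the BAR.

```python
import errno

QUEUE_NOTIFY_OFFSET = 16  # illustrative offset of the notify register in the BAR


class FakeKvm:
    """Stand-in for the kernel's ioeventfd table, keyed by I/O address."""

    def __init__(self):
        self.registered = set()

    def assign(self, addr):
        self.registered.add(addr)
        return 0

    def unassign(self, addr):
        if addr not in self.registered:
            return -errno.ENOENT  # nothing registered at this address
        self.registered.remove(addr)
        return 0


class FakeProxy:
    """Stand-in for a virtio-pci device whose notify address follows its BAR."""

    def __init__(self, kvm, bar):
        self.kvm = kvm
        self.bar = bar
        self.started = False

    def notify_addr(self):
        return self.bar + QUEUE_NOTIFY_OFFSET

    def start_ioeventfd(self):
        if not self.started:
            self.kvm.assign(self.notify_addr())
            self.started = True

    def stop_ioeventfd(self):
        if not self.started:
            return 0
        r = self.kvm.unassign(self.notify_addr())
        self.started = False
        return r

    def move_bar_buggy(self, new_bar):
        # The old notify address is forgotten while ioeventfd is still active.
        self.bar = new_bar

    def move_bar_fixed(self, new_bar):
        # Stop ioeventfd before the BAR change and start it again after.
        was_started = self.started
        self.stop_ioeventfd()
        self.bar = new_bar
        if was_started:
            self.start_ioeventfd()


def reset_after_bar_move(fixed):
    """Return the result of the unassign done at device reset."""
    proxy = FakeProxy(FakeKvm(), bar=0xC000)
    proxy.start_ioeventfd()
    if fixed:
        proxy.move_bar_fixed(0xC100)
    else:
        proxy.move_bar_buggy(0xC100)
    return proxy.stop_ioeventfd()
```

In this model, reset_after_bar_move(fixed=False) returns -errno.ENOENT because the unassign is attempted at the new address, while reset_after_bar_move(fixed=True) returns 0 because the registration follows the BAR.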

Thanks to Paolo for helping debug this issue.

Fam

Comment 6 Fam Zheng 2014-07-29 12:02:07 UTC
Normally, ioeventfd is not started when guest changes BAR. Another bug in qemu-kvm forgets to reset virtio-scsi on vm reboot.

So two separate bugs need to be fixed here.

Comment 7 Qunfang Zhang 2014-07-30 01:18:00 UTC
(In reply to Fam Zheng from comment #6)
> Normally, ioeventfd is not started when guest changes BAR. Another bug in
> qemu-kvm forgets to reset virtio-scsi on vm reboot.
> 
> So two separate bugs need to be fixed here.

Hi, Fam

Could we cover both fixes with the scenario in comment 0 once we get the fixed build?

Comment 8 Fam Zheng 2014-07-30 01:40:00 UTC
Yes, please test the scenario in comment 0.

Comment 11 Jeff Nelson 2014-07-30 21:55:56 UTC
Fix included in qemu-kvm-0.12.1.2-2.433.el6

Comment 13 Xiaomei Gao 2014-08-13 05:39:08 UTC
- Reproduce the issue on qemu-kvm-0.12.1.2-2.430.el6.x86_64.

1. Prepare a data disk which will be hot-plugged
# qemu-img create -f qcow2 /home/storage1.qcow2  10G

2. Boot guest
# /usr/libexec/qemu-kvm \
    -device virtio-scsi-pci,id=virtio_scsi_pci0,addr=0x4 \
    -drive file='/home/RHEL-Server-6.6-64bit.qcow2',if=none,id=virtio-scsi0-id0,media=disk,cache=none,snapshot=off,format=qcow2,aio=threads \
    -device scsi-hd,drive=virtio-scsi0-id0 \
    -qmp tcp:0:6666,server,nowait

3. Hot-plug the data disk
(QMP){"execute":"__com.redhat_drive_add", "arguments": {"file":"/home/storage1.qcow2","format":"qcow2","id":"virtio-scsi0-id1"}}
(QMP){"execute":"device_add","arguments":{"driver":"virtio-scsi-pci","id":"virtio_scsi_pci1", "addr":"0x6"}}
(QMP){"execute":"device_add","arguments":{"driver":"scsi-hd","drive":"virtio-scsi0-id1","id":"scsi1"}}

4. Reboot guest
# reboot

5. Results
Qemu-kvm core dump
(qemu) qemu-kvm: virtio_pci_set_host_notifier_internal: unable to unmap ioeventfd: -2
qemu-kvm: /builddir/build/BUILD/qemu-kvm-0.12.1.2/hw/virtio-pci.c:314: virtio_pci_stop_ioeventfd: Assertion `r >= 0' failed

- Verify the bug on qemu-kvm-0.12.1.2-2.436.el6.x86_64.

1. Host kernel : kernel-2.6.32-496.el6.x86_64
   Guest kernel version : kernel-2.6.32-496.el6.x86_64

2. Repeat the above steps 3 times.

3. Results:
Qemu-kvm works happily and smoothly

Based on the above test results, the bug has been verified.

Comment 14 errata-xmlrpc 2014-10-14 07:03:02 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHBA-2014-1490.html