Bug 827284 - QEMU core dump when resuming guest after NFS server recovery with iozone tool running in the guest
Status: CLOSED DUPLICATE of bug 808664
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: qemu-kvm
Version: 6.3
Hardware: x86_64 Linux
Priority: unspecified
Severity: unspecified
Target Milestone: rc
Target Release: ---
Assigned To: Paolo Bonzini
QA Contact: Virtualization Bugs
Keywords: TestOnly
Depends On: 808664
Blocks:
 
Reported: 2012-05-31 23:02 EDT by Sibiao Luo
Modified: 2012-07-20 13:55 EDT
CC List: 13 users

See Also:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2012-07-20 13:55:38 EDT
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


Attachments: None
Description Sibiao Luo 2012-05-31 23:02:17 EDT
Description of problem:
Boot a guest with its image located on NFS storage using virtio-scsi-pci and run the iozone tool in the guest. Stop the NFS server to generate an EIO, then restart NFS and resume the paused guest by running "cont" in the QEMU monitor. QEMU core dumps and the guest fails to resume.

Version-Release number of selected component (if applicable):
host info:
# uname -r && rpm -q qemu-kvm-rhev
2.6.32-274.el6.x86_64
qemu-kvm-rhev-0.12.1.2-2.295.el6.x86_64
guest info:
guest name: RHEL-6.3-Snapshot-5-20120523.1-x86_64
application: iozone

How reproducible:
100%

Steps to Reproduce:
1. Configure the NFS export and mount it.
# cat /etc/exports 
/home *(rw,no_root_squash,sync)
...
# mount -o soft,timeo=15,retrans=3,nosharecache $ip_nfs_server:/home/ /mnt/
2. Boot a guest with the image located on NFS storage using virtio-scsi-pci (werror=stop,rerror=stop makes QEMU pause the VM on I/O errors instead of reporting them to the guest).
eg: <qemu-kvm-commands>-drive file=/mnt/RHEL-6.3-Snapshot-5-20120523.1-x86_64.qcow2,if=none,id=scsi_drive,format=qcow2,aio=native,cache=none,werror=stop,rerror=stop -device virtio-scsi-pci,id=scsi0 -device scsi-hd,drive=scsi_drive,scsi-id=1,lun=0,id=blk_image,bootindex=1
3. Run iozone in the guest. <-----NOTE: This step is very important.
eg: # iozone -a
4. Stop the NFS server to generate an EIO and check the VM status.
# service nfs stop

(qemu) block I/O error in device 'scsi_drive': Input/output error (5)
block I/O error in device 'scsi_drive': Input/output error (5)
block I/O error in device 'scsi_drive': Input/output error (5)
block I/O error in device 'scsi_drive': Input/output error (5)
...
(qemu) info status
VM status: paused (io-error)
5. Restart the NFS server.
# service nfs restart
6. Resume the paused guest by running "cont" in the monitor.
(qemu) cont
  
Actual results:
After step 6, QEMU core dumps:
(qemu) cont
(qemu) qemu-kvm: /builddir/build/BUILD/qemu-kvm-0.12.1.2/hw/scsi-disk.c:369: scsi_write_data: Assertion `r->req.aiocb == ((void *)0)' failed.

Program received signal SIGABRT, Aborted.
0x00007ffff57768a5 in raise () from /lib64/libc.so.6

(gdb) bt
#0  0x00007ffff57768a5 in raise () from /lib64/libc.so.6
#1  0x00007ffff5778085 in abort () from /lib64/libc.so.6
#2  0x00007ffff576fa1e in __assert_fail_base () from /lib64/libc.so.6
#3  0x00007ffff576fae0 in __assert_fail () from /lib64/libc.so.6
#4  0x00007ffff7e5ec51 in scsi_write_data (req=0x7fff4767d170) at /usr/src/debug/qemu-kvm-0.12.1.2/hw/scsi-disk.c:369
#5  0x00007ffff7e5e246 in scsi_dma_restart_bh (opaque=<value optimized out>) at /usr/src/debug/qemu-kvm-0.12.1.2/hw/scsi-bus.c:53
#6  0x00007ffff7e1ccd1 in qemu_bh_poll () at async.c:70
#7  0x00007ffff7dea6c9 in main_loop_wait (timeout=1000) at /usr/src/debug/qemu-kvm-0.12.1.2/vl.c:4032
#8  0x00007ffff7e0bdfa in kvm_main_loop () at /usr/src/debug/qemu-kvm-0.12.1.2/qemu-kvm.c:2244
#9  0x00007ffff7ded09c in main_loop (argc=20, argv=<value optimized out>, envp=<value optimized out>)
    at /usr/src/debug/qemu-kvm-0.12.1.2/vl.c:4202
#10 main (argc=20, argv=<value optimized out>, envp=<value optimized out>) at /usr/src/debug/qemu-kvm-0.12.1.2/vl.c:6427
(gdb)
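
The two frames of interest are scsi_dma_restart_bh() (frame #5) and scsi_write_data() (frame #4): after "cont", a bottom half re-issues the requests that were stopped by werror=stop, and scsi_write_data() asserts that no async I/O is still attached to the request it is restarting. Below is a small standalone C model of that interaction. It is only an illustration of one plausible reading of the trace, not qemu-kvm-0.12.1.2 source; the struct, its fields, and the restart loop are invented for the model.

/* Standalone model of frames #4/#5 -- build with: gcc -std=c99 -o model model.c */
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

struct scsi_req {
    void *aiocb;      /* non-NULL while an async I/O is in flight */
    int   stopped;    /* set when werror=stop pauses the VM on EIO */
};

/* Frame #4: re-issuing a write requires that no AIO is attached. */
static void scsi_write_data(struct scsi_req *r)
{
    assert(r->aiocb == NULL);   /* corresponds to scsi-disk.c:369 */
    printf("write re-submitted\n");
}

/* Frame #5: bottom half that restarts stopped requests after "cont". */
static void scsi_dma_restart_bh(struct scsi_req *reqs, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (reqs[i].stopped) {
            scsi_write_data(&reqs[i]);
        }
    }
}

int main(void)
{
    /* In this model the EIO path paused the VM without clearing the
     * request's aiocb, so the restart trips the assertion -- the same
     * abort pattern shown in the backtrace above. */
    struct scsi_req req = { .aiocb = (void *)0x1, .stopped = 1 };
    scsi_dma_restart_bh(&req, 1);
    return 0;
}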

Expected results:
The VM resumes successfully without any error, and the guest works correctly.

Additional info:
my host cpuinfo:
# lscpu 
Architecture:          x86_64
CPU op-mode(s):        32-bit, 64-bit
Byte Order:            Little Endian
CPU(s):                8
On-line CPU(s) list:   0-7
Thread(s) per core:    2
Core(s) per socket:    4
CPU socket(s):         1
NUMA node(s):          1
Vendor ID:             GenuineIntel
CPU family:            6
Model:                 42
Stepping:              7
CPU MHz:               1600.000
BogoMIPS:              6782.72
Virtualization:        VT-x
L1d cache:             32K
L1i cache:             32K
L2 cache:              256K
L3 cache:              8192K
NUMA node0 CPU(s):     0-7
Comment 1 Sibiao Luo 2012-05-31 23:24:10 EDT
(In reply to comment #0)
> Description of problem:
> boot a guest with the image located on NFS storage and virtio-scsi-pci, run
> iozone tool in the guest, and then disconnect the NFS server to generate a
> EIO, after that reconnect NFS, and resume the paused guest by running "cont"
> in QEMU monitor, but the QEMU core dump, fail to resume.
> 
I also tested the ide-drive and virtio-blk interfaces; they do not have this issue.

I forgot to paste my qemu-kvm command line:
# /usr/libexec/qemu-kvm -M rhel6.3.0 -cpu SandyBridge -enable-kvm -smp 2 -m 2G -usb -device usb-tablet,id=input0 -name test_sluo -uuid `uuidgen` -drive file=/mnt/RHEL-6.3-Snapshot-5-20120523.1-x86_64.qcow2,if=none,id=scsi_drive,format=qcow2,aio=native,cache=none,werror=stop,rerror=stop -device virtio-scsi-pci,id=scsi0 -device scsi-hd,drive=scsi_drive,scsi-id=1,lun=0,id=blk_image,bootindex=1 -netdev tap,script=/etc/qemu-ifup,id=netdev0 -device virtio-net-pci,netdev=netdev0,id=device-net0 -vnc :1 -balloon none -monitor stdio
Comment 2 Paolo Bonzini 2012-07-20 09:50:09 EDT
Same as bug 808664.
Comment 3 Ademar Reis 2012-07-20 13:55:38 EDT
(In reply to comment #2)
> Same as bug 808664.

Marking as a duplicate. QE will test this scenario to verify Bug 808664.

*** This bug has been marked as a duplicate of bug 808664 ***
