Bug 990504 - qemu core dump when stop/restart the NFS server for IDE with werror/rerror=stop
Status: CLOSED CURRENTRELEASE
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: qemu-kvm
Version: 7.0
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: rc
Target Release: ---
Assigned To: Kevin Wolf
QA Contact: Virtualization Bugs
Keywords: Regression
Depends On:
Blocks:
 
Reported: 2013-07-31 06:27 EDT by Sibiao Luo
Modified: 2014-06-17 23:32 EDT (History)
11 users

See Also:
Fixed In Version: qemu-kvm-rhev-1.5.3-49.el7.x86_64
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2014-06-13 05:44:07 EDT
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


Attachments: None
Description Sibiao Luo 2013-07-31 06:27:40 EDT
Description of problem:
Configure an NFS export and mount it, boot a guest whose IDE disk image is located on the NFS storage, and stop the NFS server while the guest is doing disk I/O so that the guest pauses. After restarting the NFS server, resume the paused guest by running "cont" in the monitor; qemu then core dumps.

Version-Release number of selected component (if applicable):
host info:
3.10.0-3.el7.x86_64
qemu-kvm-1.5.2-1.el7.x86_64
seabios-1.7.2.2-2.el7.x86_64
guest info:
3.10.0-3.el7.x86_64

How reproducible:
1/1

Steps to Reproduce:
1. Configure the NFS export and mount it.
# cat /etc/exports
/home *(rw,no_root_squash,sync)
# mount -o soft,timeo=15,retrans=3,nosharecache $nfs_server_ip:/home/ /mnt/
2. Boot a guest with an IDE disk (werror=stop,rerror=stop) located on the NFS storage, e.g.:
...-drive file=/mnt/RHEL-7.0-20130628.0-Server-x86_64.qcow3bk,if=none,id=drive-system-disk,format=qcow2,cache=none,aio=native,werror=stop,rerror=stop,serial="QEMU-DISK1" -device ide-hd,drive=drive-system-disk,id=system-disk,bootindex=1
3. Stop the NFS service so that the VM pauses on I/O error.
# /bin/systemctl stop nfs.service
4. Restart the NFS service and resume the VM with "cont".
# /bin/systemctl restart nfs.service
(qemu) cont

Actual results:
After step 3, the VM status changed to paused, but the block I/O error (EIO) message was not generated; refer to bug 895797.
(qemu) info status 
VM status: running
(qemu) info status 
VM status: running
(qemu) info status 
VM status: paused (io-error)
(qemu) info status 
VM status: paused (io-error)
(qemu) info status 
After step 4, qemu core dumps.
(qemu) cont
(qemu) qemu-kvm: hw/ide/pci.c:313: bmdma_cmd_writeb: Assertion `bm->bus->dma->aiocb == ((void *)0)' failed.
Aborted (core dumped)
(gdb) bt
#0  0x00007f3ff5065a19 in raise () from /lib64/libc.so.6
#1  0x00007f3ff5067128 in abort () from /lib64/libc.so.6
#2  0x00007f3ff505e986 in __assert_fail_base () from /lib64/libc.so.6
#3  0x00007f3ff505ea32 in __assert_fail () from /lib64/libc.so.6
#4  0x00007f3ff984df4f in bmdma_cmd_writeb (bm=0x7f3ffbacf0d0, val=8) at hw/ide/pci.c:313
#5  0x00007f3ff996e7b2 in access_with_adjusted_size (addr=addr@entry=0, value=value@entry=0x7f3fe7ffeb58, size=1, 
    access_size_min=<optimized out>, access_size_max=<optimized out>, access=access@entry=
    0x7f3ff996ed70 <memory_region_write_accessor>, opaque=opaque@entry=0x7f3ffbacf1f8)
    at /usr/src/debug/qemu-1.5.2/memory.c:364
#6  0x00007f3ff996fc87 in memory_region_iorange_write (iorange=<optimized out>, offset=0, width=1, data=8)
    at /usr/src/debug/qemu-1.5.2/memory.c:439
#7  0x00007f3ff996d322 in kvm_handle_io (count=1, size=1, direction=1, data=<optimized out>, port=49256)
    at /usr/src/debug/qemu-1.5.2/kvm-all.c:1482
#8  kvm_cpu_exec (env=env@entry=0x7f3ffba624d0) at /usr/src/debug/qemu-1.5.2/kvm-all.c:1634
#9  0x00007f3ff99189b5 in qemu_kvm_cpu_thread_fn (arg=0x7f3ffba624d0) at /usr/src/debug/qemu-1.5.2/cpus.c:759
#10 0x00007f3ff799ec53 in start_thread (arg=0x7f3fe7fff700) at pthread_create.c:308
#11 0x00007f3ff512513d in clone () from /lib64/libc.so.6
(gdb)

Expected results:
The guest can be resumed successfully.

Additional info:
# /usr/libexec/qemu-kvm -S -cpu SandyBridge -enable-kvm -m 4096 -smp 4,sockets=2,cores=2,threads=1 -no-kvm-pit-reinjection -name sluo -uuid 43425b70-86e5-4664-bf2c-3b76699a8aec -rtc base=localtime,clock=host,driftfix=slew -device virtio-serial-pci,id=virtio-serial0,max_ports=16,vectors=0,bus=pci.0,addr=0x3 -chardev socket,id=channel1,path=/tmp/helloworld1,server,nowait -device virtserialport,chardev=channel1,name=com.redhat.rhevm.vdsm.1,bus=virtio-serial0.0,id=port1,nr=1 -chardev socket,id=channel2,path=/tmp/helloworld2,server,nowait -device virtserialport,chardev=channel2,name=com.redhat.rhevm.vdsm.2,bus=virtio-serial0.0,id=port2,nr=2 -drive file=/mnt/RHEL-7.0-20130628.0-Server-x86_64.qcow3bk,if=none,id=drive-system-disk,format=qcow2,cache=none,aio=native,werror=stop,rerror=stop,serial="QEMU-DISK1" -device ide-hd,drive=drive-system-disk,id=system-disk,bootindex=1 -device virtio-balloon-pci,id=ballooning,bus=pci.0,addr=0x5 -global PIIX4_PM.disable_s3=0 -global PIIX4_PM.disable_s4=0 -netdev tap,id=hostnet0,vhost=off,script=/etc/qemu-ifup -device virtio-net-pci,netdev=hostnet0,id=virtio-net-pci0,mac=2C:41:38:B6:32:21,bus=pci.0,addr=0x6,bootindex=2 -k en-us -boot menu=on -vnc :1 -spice port=5931,disable-ticketing -qmp tcp:0:4444,server,nowait -monitor stdio
Comment 1 Sibiao Luo 2013-07-31 06:30:14 EDT
(gdb) bt full
#0  0x00007f3ff5065a19 in raise () from /lib64/libc.so.6
No symbol table info available.
#1  0x00007f3ff5067128 in abort () from /lib64/libc.so.6
No symbol table info available.
#2  0x00007f3ff505e986 in __assert_fail_base () from /lib64/libc.so.6
No symbol table info available.
#3  0x00007f3ff505ea32 in __assert_fail () from /lib64/libc.so.6
No symbol table info available.
#4  0x00007f3ff984df4f in bmdma_cmd_writeb (bm=0x7f3ffbacf0d0, val=8) at hw/ide/pci.c:313
        __PRETTY_FUNCTION__ = "bmdma_cmd_writeb"
#5  0x00007f3ff996e7b2 in access_with_adjusted_size (addr=addr@entry=0, value=value@entry=0x7f3fe7ffeb58, size=1, 
    access_size_min=<optimized out>, access_size_max=<optimized out>, access=access@entry=
    0x7f3ff996ed70 <memory_region_write_accessor>, opaque=opaque@entry=0x7f3ffbacf1f8)
    at /usr/src/debug/qemu-1.5.2/memory.c:364
        access_mask = 255
        access_size = 1
        i = <optimized out>
#6  0x00007f3ff996fc87 in memory_region_iorange_write (iorange=<optimized out>, offset=0, width=1, data=8)
    at /usr/src/debug/qemu-1.5.2/memory.c:439
        mrio = <optimized out>
        mr = 0x7f3ffbacf1f8
        __PRETTY_FUNCTION__ = "memory_region_iorange_write"
#7  0x00007f3ff996d322 in kvm_handle_io (count=1, size=1, direction=1, data=<optimized out>, port=49256)
    at /usr/src/debug/qemu-1.5.2/kvm-all.c:1482
        i = 0
        ptr = 0x7f3ff9606000 <Address 0x7f3ff9606000 out of bounds>
#8  kvm_cpu_exec (env=env@entry=0x7f3ffba624d0) at /usr/src/debug/qemu-1.5.2/kvm-all.c:1634
        cpu = 0x7f3ffba623c0
        __func__ = "kvm_cpu_exec"
        run = 0x7f3ff9605000
        ret = <optimized out>
        run_ret = <optimized out>
#9  0x00007f3ff99189b5 in qemu_kvm_cpu_thread_fn (arg=0x7f3ffba624d0) at /usr/src/debug/qemu-1.5.2/cpus.c:759
        cpu = 0x7f3ffba623c0
        __func__ = "qemu_kvm_cpu_thread_fn"
        r = <optimized out>
#10 0x00007f3ff799ec53 in start_thread (arg=0x7f3fe7fff700) at pthread_create.c:308
        __res = <optimized out>
        pd = 0x7f3fe7fff700
        now = <optimized out>
        unwind_buf = {cancel_jmp_buf = {{jmp_buf = {139912451979008, 5520618557473219602, 0, 139912451979712, 
                139912451979008, 139912781636816, -5556629762893487086, -5556665239318764526}, mask_was_saved = 0}}, 
          priv = {pad = {0x0, 0x0, 0x0, 0x0}, data = {prev = 0x0, cleanup = 0x0, canceltype = 0}}}
        not_first_call = <optimized out>
        pagesize_m1 = <optimized out>
        sp = <optimized out>
        freesize = <optimized out>
#11 0x00007f3ff512513d in clone () from /lib64/libc.so.6
No symbol table info available.
(gdb)
Comment 3 Sibiao Luo 2013-07-31 06:36:19 EDT
Also tested on a RHEL 6.5 host, which did not hit this issue; the VM resumed successfully there.
host info:
2.6.32-402.el6.x86_64
qemu-kvm-0.12.1.2-2.381.el6.x86_64

(qemu) info status
VM status: running
(qemu) info status
VM status: running
(qemu) block I/O error in device 'drive-system-disk': Input/output error (5)

(qemu) info status 
VM status: paused (io-error)
(qemu) info status 
VM status: paused (io-error)
(qemu) cont
(qemu) info status 
VM status: running

Best Regards,
sluo
Comment 5 Kevin Wolf 2014-02-25 10:33:28 EST
Is this bug reproducible? My attempts to reproduce it failed.
Comment 6 Sibiao Luo 2014-02-25 21:39:00 EST
(In reply to Kevin Wolf from comment #5)
> Is this bug reproducible? My attempts to reproduce it failed.

No longer hit with the same testing as comment #0 on the latest qemu-kvm-rhev-1.5.3-49.el7.x86_64; the VM resumes successfully without any qemu core dump.

host info:
# uname -r && rpm -q qemu-kvm-rhev
3.10.0-95.el7.x86_64
qemu-kvm-rhev-1.5.3-49.el7.x86_64

(qemu) info status 
VM status: running

# /bin/systemctl stop  nfs.service

(qemu) block I/O error in device 'drive-system-disk': Input/output error (5)

(qemu) info status 
VM status: paused (io-error)
(qemu) info status 
VM status: paused (io-error)

# /bin/systemctl restart  nfs.service

(qemu) cont
(qemu) 
(qemu) info status 
VM status: running

Best Regards,
sluo
Comment 7 Sibiao Luo 2014-03-19 00:26:15 EDT
Verified this issue on qemu-kvm-1.5.3-53.el7.x86_64 with the same steps as comment #0.

host info:
# uname -r && rpm -q qemu-kvm
3.10.0-98.el7.x86_64
qemu-kvm-1.5.3-53.el7.x86_64
guest info:
# uname -r
3.10.0-98.el7.x86_64

Steps and Results:
(qemu) info status 
VM status: running

# /bin/systemctl stop  nfs.service

(qemu) block I/O error in device 'drive-system-disk': Input/output error (5)

(qemu) info status 
VM status: paused (io-error)
(qemu) info status 
VM status: paused (io-error)

# /bin/systemctl restart  nfs.service

(qemu) cont
(qemu) 
(qemu) info status 
VM status: running

Based on the above, this issue has been fixed correctly; moving to VERIFIED status.

Best Regards,
sluo
Comment 8 Ludek Smid 2014-06-13 05:44:07 EDT
This request was resolved in Red Hat Enterprise Linux 7.0.

Contact your manager or support representative in case you have further questions about the request.
