Bug 887844 - AHCI does not support restarting requests (i.e. rerror=stop and werror=stop/enospc) [NEEDINFO]
Status: ASSIGNED
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: qemu-kvm-rhev
Version: 7.0
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: rc
Target Release: ---
Assigned To: John Snow
QA Contact: jingzhao
Depends On: 953062
Blocks: 1227278 1305606 1313485
Reported: 2012-12-17 08:02 EST by Paolo Bonzini
Modified: 2016-08-22 01:17 EDT
Fixed In Version: qemu-kvm-rhev-2.5.0-1.el7
Doc Type: Bug Fix
Type: Bug
jinzhao: needinfo? (jsnow)


Description Paolo Bonzini 2012-12-17 08:02:26 EST
Description of problem:
AHCI provides only a dummy implementation of restart_cb. As a result, AHCI requests are not restarted properly when the VM resumes after being paused on an I/O error (including ENOSPC).

Version-Release number of selected component (if applicable):
1.3.0
Comment 4 John Snow 2015-04-06 20:04:23 EDT
The patches that enable this have gone upstream, but it's impossible to meaningfully test them without AHCI migration, so this issue is now blocked on #901631.
Comment 7 John Snow 2016-01-13 16:08:27 EST
Included in rebase to qemu 2.5.0.

Instructions for testing:

- Set werror=stop or rerror=stop on the -drive of the disk attached to AHCI.
- Trigger an ENOSPC error, either through blkdebug or through a genuine out-of-space condition.
- Observe that the VM has paused.
- If not using the blkdebug method, correct the ENOSPC condition.
- Resume the VM.
- Observe that the VM is running and the transfer(s) have succeeded.

The qemu/tests/ahci-test suite already tests this workflow.
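
The "observe paused, then resume" steps above can also be driven over QMP rather than typed into the monitor. Below is a minimal sketch in Python, assuming QEMU was started with a QMP TCP socket (for example `-qmp tcp:0:6666,server,nowait`; the host and port are assumptions). The helper names `qmp_command`, `read_reply`, `is_paused_on_io_error` and `check_and_resume` are made up for this illustration; `qmp_capabilities`, `query-status` and `cont` are standard QMP commands.

```python
import json
import socket


def qmp_command(name, arguments=None):
    """Serialize one QMP command as a single JSON line."""
    msg = {"execute": name}
    if arguments:
        msg["arguments"] = arguments
    return (json.dumps(msg) + "\r\n").encode()


def read_reply(reader):
    """Return the next command reply, skipping asynchronous events."""
    while True:
        line = reader.readline()
        if not line:
            raise ConnectionError("QMP connection closed")
        msg = json.loads(line)
        if "return" in msg or "error" in msg:
            return msg


def is_paused_on_io_error(status_reply):
    """True if a query-status reply shows the VM stopped in state io-error."""
    ret = status_reply.get("return", {})
    return ret.get("running") is False and ret.get("status") == "io-error"


def check_and_resume(host="127.0.0.1", port=6666):
    """Resume the VM if it is paused on an I/O error; return the final status."""
    with socket.create_connection((host, port)) as sock:
        reader = sock.makefile("rb")
        reader.readline()  # discard the QMP greeting banner
        sock.sendall(qmp_command("qmp_capabilities"))
        read_reply(reader)
        sock.sendall(qmp_command("query-status"))
        if is_paused_on_io_error(read_reply(reader)):
            sock.sendall(qmp_command("cont"))
            read_reply(reader)
        sock.sendall(qmp_command("query-status"))
        return read_reply(reader)
```

Calling `check_and_resume()` once the guest has stopped should return a `query-status` reply with `"running": true` if the restart path works; the guest-side transfer still has to be checked separately.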
Comment 9 jingzhao 2016-08-17 06:04:01 EDT
Reproduced the bug on host-kernel-3.10.0-327.8.1.el7.x86_64 and qemu-kvm-rhev-2.3.0-31.el7_2.21.x86_64

1. Boot the guest and trigger an I/O error

2. Check the VM status in HMP
(qemu) info status
VM status: running

3. QEMU aborts and dumps core
(qemu) qemu-kvm: hw/ide/core.c:672: ide_handle_rw_error: Assertion `s->bus->retry_unit == s->unit' failed.
pc.sh: line 28:  5019 Aborted                 (core dumped) /usr/libexec/qemu-kvm -M pc -cpu SandyBridge -nodefaults -rtc base=utc -m 4G -smp 2,sockets=2,cores=1,threads=1 -enable-kvm -name rhel7.3 -uuid 990ea161-6b67-47b2-b803-19fb01d30d12 -smbios type=1,manufacturer='Red Hat',product='RHEV Hypervisor',version=el6,serial=koTUXQrb,uuid=feebc8fd-f8b0-4e75-abc3-e63fcdb67170 -k en-us -nodefaults -serial unix:/tmp/serial0,server,nowait -boot menu=on -bios /usr/share/seabios/bios.bin -chardev file,path=/home/seabios.log,id=seabios -device isa-debugcon,chardev=seabios,iobase=0x402 -qmp tcp:0:6666,server,nowait -device VGA,id=video -vnc :2 -drive file=/home/big.img,if=none,id=drive-virtio-disk0,format=qcow2,cache=none,werror=stop,rerror=stop -device virtio-blk-pci,scsi=off,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1 -device virtio-net-pci,netdev=tap10,mac=9a:6a:6b:6c:6d:6e -netdev tap,id=tap10 -device ahci,id=ahci0 -drive file=/dev/sdb,if=none,id=drive-virtio-disk1,format=qcow2,werror=stop,rerror=stop -device ide-hd,drive=drive-virtio-disk1,id=virtio-disk1,bus=ahci0.0 -monitor stdio

(gdb) bt
#0  0x00007f2678b555f7 in raise () from /lib64/libc.so.6
#1  0x00007f2678b56ce8 in abort () from /lib64/libc.so.6
#2  0x00007f2678b4e566 in __assert_fail_base () from /lib64/libc.so.6
#3  0x00007f2678b4e612 in __assert_fail () from /lib64/libc.so.6
#4  0x00007f2680865616 in ide_handle_rw_error (s=0x7f2684b4a0b8, error=28, op=64) at hw/ide/core.c:672
#5  0x00007f26808656ac in ide_flush_cb (opaque=0x7f2684b4a0b8, ret=<optimized out>) at hw/ide/core.c:915
#6  0x00007f26808ff5ce in bdrv_co_em_bh (opaque=0x7f2683ffd510) at block.c:5001
#7  0x00007f26808f8734 in aio_bh_poll (ctx=ctx@entry=0x7f2681d80840) at async.c:85
#8  0x00007f2680907869 in aio_dispatch_clients (ctx=0x7f2681d80840, client_mask=client_mask@entry=-1)
    at aio-posix.c:139
#9  0x00007f2680907c2a in aio_dispatch (ctx=<optimized out>) at aio-posix.c:194
#10 0x00007f26808f85ae in aio_ctx_dispatch (source=<optimized out>, callback=<optimized out>, 
    user_data=<optimized out>) at async.c:219
#11 0x00007f267eb3d79a in g_main_context_dispatch () from /lib64/libglib-2.0.so.0
#12 0x00007f26809066e8 in glib_pollfds_poll () at main-loop.c:209
#13 os_host_main_loop_wait (timeout=<optimized out>) at main-loop.c:254
#14 main_loop_wait (nonblocking=<optimized out>) at main-loop.c:503
#15 0x00007f2680704bfe in main_loop () at vl.c:1818
#16 main (argc=<optimized out>, argv=<optimized out>, envp=<optimized out>) at vl.c:4391
(gdb) 
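
For reference, the `error=28` argument in frame #4 above is errno 28, which on Linux is ENOSPC, so the assertion fired while handling the out-of-space error reported to the flush callback (`ide_flush_cb` in frame #5). A quick way to confirm the mapping, sketched in Python purely for illustration (it is not part of the original reproducer):

```python
import errno
import os

# Value taken from frame #4 of the backtrace (ide_handle_rw_error, error=28).
code = 28

# Print the symbolic errno name and the platform's description for it.
print(errno.errorcode[code], "-", os.strerror(code))
```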

Tested on host-kernel-3.10.0-492.el7.x86_64 and qemu-kvm-rhev-2.6.0-20.el7.x86_64; the test also failed there.

1. Boot the guest and trigger an I/O error

2. QEMU crashes with a segmentation fault and dumps core
(qemu) 887844sh: line 28:  7582 Segmentation fault      (core dumped) /usr/libexec/qemu-kvm -M pc -cpu SandyBridge -nodefaults -rtc base=utc -m 4G -smp 2,sockets=2,cores=1,threads=1 -enable-kvm -name rhel7.3 -uuid 990ea161-6b67-47b2-b803-19fb01d30d12 -smbios type=1,manufacturer='Red Hat',product='RHEV Hypervisor',version=el6,serial=koTUXQrb,uuid=feebc8fd-f8b0-4e75-abc3-e63fcdb67170 -k en-us -nodefaults -serial unix:/tmp/serial0,server,nowait -boot menu=on -bios /usr/share/seabios/bios.bin -chardev file,path=/home/seabios.log,id=seabios -device isa-debugcon,chardev=seabios,iobase=0x402 -qmp tcp:0:6666,server,nowait -device VGA,id=video -vnc :2 -drive file=/home/bug/big.img,if=none,id=drive-virtio-disk0,format=qcow2,cache=none,werror=stop,rerror=stop -device virtio-blk-pci,scsi=off,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1 -device virtio-net-pci,netdev=tap10,mac=9a:6a:6b:6c:6d:6e -netdev tap,id=tap10 -device ahci,id=ahci0 -drive file=/dev/sdc,if=none,id=drive-virtio-disk1,format=qcow2,werror=stop,rerror=stop -device ide-hd,drive=drive-virtio-disk1,id=virtio-disk1,bus=ahci0.0 -monitor stdio

(gdb) bt
#0  0x00007f638d863de0 in ?? ()
#1  0x00007f6388db872a in bdrv_aio_cancel_async (acb=0x7f638d862520) at block/io.c:2060
#2  bdrv_aio_cancel (acb=0x7f638d862520) at block/io.c:2041
#3  0x00007f6388dad7c5 in blk_aio_cancel (acb=<optimized out>) at block/block-backend.c:1044
#4  0x00007f6388ccb7bd in ahci_reset_port (port=<optimized out>, s=<optimized out>) at hw/ide/ahci.c:599
#5  0x00007f6388ccc734 in ahci_port_write (val=<optimized out>, offset=<optimized out>, port=0, s=0x7f638c953e90)
    at hw/ide/ahci.c:301
#6  ahci_mem_write (opaque=0x7f638c953e90, addr=<optimized out>, val=768, size=<optimized out>) at hw/ide/ahci.c:435
#7  0x00007f6388b8dfa3 in memory_region_write_accessor (mr=<optimized out>, addr=<optimized out>, 
    value=<optimized out>, size=<optimized out>, shift=<optimized out>, mask=<optimized out>, attrs=...)
    at /usr/src/debug/qemu-2.6.0/memory.c:525
#8  0x00007f6388b8bf09 in access_with_adjusted_size (addr=addr@entry=300, value=value@entry=0x7f6373d36878, 
    size=size@entry=4, access_size_min=<optimized out>, access_size_max=<optimized out>, 
    access=access@entry=0x7f6388b8df60 <memory_region_write_accessor>, mr=mr@entry=0x7f638c953eb8, 
    attrs=attrs@entry=...) at /usr/src/debug/qemu-2.6.0/memory.c:591
#9  0x00007f6388b8f725 in memory_region_dispatch_write (mr=mr@entry=0x7f638c953eb8, addr=addr@entry=300, data=768, 
    size=size@entry=4, attrs=attrs@entry=...) at /usr/src/debug/qemu-2.6.0/memory.c:1273
#10 0x00007f6388b52329 in address_space_write_continue (mr=0x7f638c953eb8, l=4, addr1=300, len=4, 
    buf=0x7f6388956028 <Address 0x7f6388956028 out of bounds>, attrs=..., addr=4273811756, 
    as=0x7f638939bb60 <address_space_memory>) at /usr/src/debug/qemu-2.6.0/exec.c:2593
#11 address_space_write (as=<optimized out>, addr=<optimized out>, attrs=..., buf=<optimized out>, 
    len=<optimized out>) at /usr/src/debug/qemu-2.6.0/exec.c:2651
#12 0x00007f6388b5289d in address_space_rw (as=<optimized out>, addr=<optimized out>, attrs=..., attrs@entry=..., 
    buf=buf@entry=0x7f6388956028 <Address 0x7f6388956028 out of bounds>, len=<optimized out>, 
    is_write=<optimized out>) at /usr/src/debug/qemu-2.6.0/exec.c:2754
#13 0x00007f6388b8b0e0 in kvm_cpu_exec (cpu=cpu@entry=0x7f638c932000) at /usr/src/debug/qemu-2.6.0/kvm-all.c:1950
#14 0x00007f6388b79bd6 in qemu_kvm_cpu_thread_fn (arg=0x7f638c932000) at /usr/src/debug/qemu-2.6.0/cpus.c:1076
#15 0x00007f637e801dc5 in start_thread () from /lib64/libpthread.so.0
#16 0x00007f637cf3c73d in clone () from /lib64/libc.so.6

So I think the bug has not been fixed.

Thanks
Jing Zhao
