Bug 1184363 - Qemu process fails to start with a multipath device with all paths failed
Summary: Qemu process fails to start with a multipath device with all paths failed
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: qemu-kvm
Version: 7.0
Hardware: x86_64
OS: Linux
Priority: medium
Severity: medium
Target Milestone: rc
Target Release: ---
Assignee: Stefan Hajnoczi
QA Contact: Virtualization Bugs
URL:
Whiteboard:
Depends On:
Blocks: 1194774
 
Reported: 2015-01-21 08:24 UTC by Roman Hodain
Modified: 2015-11-19 04:57 UTC (History)

Fixed In Version: qemu-kvm-1.5.3-87.el7
Doc Type: Bug Fix
Doc Text:
Clone Of:
Clones: 1194774
Environment:
Last Closed: 2015-11-19 04:57:47 UTC
Target Upstream Version:

Links
System: Red Hat Product Errata
ID: RHBA-2015:2213
Priority: normal
Status: SHIPPED_LIVE
Summary: qemu-kvm bug fix and enhancement update
Last Updated: 2015-11-19 08:16:10 UTC

Description Roman Hodain 2015-01-21 08:24:20 UTC
Description of problem:
     It is not possible to start a VM which has a failed multipath device connected.


Version-Release number of selected component (if applicable):
     qemu-kvm-1.5.3-60.el7_0.11

How reproducible:
     100%

Steps to Reproduce:
1. Create a VM with a multipath device attached as a disk:

    <disk type='block' device='disk'>
      <driver name='qemu' type='raw' cache='none' io='native'/>
      <source dev='/dev/mapper/3600140519ce508030c64008be5be247d'/>
      <target dev='vdb' bus='virtio'/>
      <boot order='1'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x0a' function='0x0'/>
    </disk>

2. Mark all the paths of the multipath device as failed:
    # multipath -ll
      ...
          3600140519ce508030c64008be5be247d dm-12 LIO-ORG ,iscsi17         
          size=10G features='0' hwhandler='0' wp=rw
          `-+- policy='service-time 0' prio=1 status=active
            `- 7:0:0:16 sds 65:32 failed ready  running

    # multipathd -c fail path sds

    # multipath -ll
      ...
          3600140519ce508030c64008be5be247d dm-12 LIO-ORG ,iscsi17         
          size=10G features='0' hwhandler='0' wp=rw
          `-+- policy='service-time 0' prio=1 status=enabled
            `- 7:0:0:16 sds 65:32 failed ready  running

3. Start the VM

Actual results:
error: Failed to start domain Host_boot_test
error: internal error: early end of file from monitor: possible problem:
Warning: option deprecated, use lost_tick_policy property of kvm-pit instead.
qemu-kvm: block.c:849: bdrv_open_common: Assertion `bdrv_opt_mem_align(bs) != 0' failed.


Expected results:
VM is started

Comment 1 Stefan Hajnoczi 2015-01-21 12:45:29 UTC
(In reply to Roman Hodain from comment #0)
> Description of problem:
>      It is not possible to start a VM which has a failed multipath device
> connected.
...
> Actual results:
> error: Failed to start domain Host_boot_test
> error: internal error: early end of file from monitor: possible problem:
> Warning: option deprecated, use lost_tick_policy property of kvm-pit instead.
> qemu-kvm: block.c:849: bdrv_open_common: Assertion `bdrv_opt_mem_align(bs)
> != 0' failed.

The multipath device fails all I/O requests so QEMU is unable to probe the O_DIRECT memory and request alignment by attempting reads on the device.

Regarding the expected behavior: do you want the VM to start, with all I/O returning errors until the path comes back online?
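
The request-size half of the probing described above can be approximated from the shell. This is a rough sketch, not QEMU's implementation: it assumes GNU dd (for `iflag=direct`), and the function names `try_direct_read` and `probe_request_align` and the list of candidate sizes are hypothetical choices made for illustration. It does not probe memory-buffer alignment.

```shell
# try_direct_read DEV SIZE: attempt a single O_DIRECT read of SIZE bytes
# from DEV. Succeeds only if the kernel accepts the request and the read
# itself does not fail.
try_direct_read() {
    dd if="$1" of=/dev/null bs="$2" count=1 iflag=direct 2>/dev/null
}

# probe_request_align DEV [READER]: print the smallest candidate size for
# which READER succeeds. READER defaults to try_direct_read; passing another
# function name allows the loop to be exercised without a real device.
probe_request_align() {
    dev=$1
    reader=${2:-try_direct_read}
    for size in 512 1024 2048 4096; do
        if "$reader" "$dev" "$size"; then
            echo "$size"
            return 0
        fi
    done
    # Every candidate failed: this is the situation a device with all
    # paths failed produces, since no read can succeed at any size.
    echo "could not find working O_DIRECT alignment for $dev" >&2
    return 1
}
```

Running `probe_request_align /dev/mapper/3600140519ce508030c64008be5be247d` as root against the failed device would be expected to take the error branch, because every read returns an I/O error regardless of size.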

Comment 2 Roman Hodain 2015-01-23 10:43:20 UTC
(In reply to Stefan Hajnoczi from comment #1)
> (In reply to Roman Hodain from comment #0)
> > Description of problem:
> >      It is not possible to start a VM which has a failed multipath device
> > connected.
> ...
> > Actual results:
> > error: Failed to start domain Host_boot_test
> > error: internal error: early end of file from monitor: possible problem:
> > Warning: option deprecated, use lost_tick_policy property of kvm-pit instead.
> > qemu-kvm: block.c:849: bdrv_open_common: Assertion `bdrv_opt_mem_align(bs)
> > != 0' failed.
> 
> The multipath device fails all I/O requests so QEMU is unable to probe the
> O_DIRECT memory and request alignment by attempting reads on the device.
> 
> Regarding the expected behavior, do you want the VM to start but all I/Os
> will return errors until the path comes online again?

Hi,

Yes. The use case here is disaster recovery: the VM has two mirrored LUNs connected and one of them is down. This prevents the VM from starting even though the second LUN is available and the VM would otherwise be able to start.

Roman

Comment 3 Stefan Hajnoczi 2015-02-19 14:16:11 UTC
This bug can be reproduced with:

$ echo "0 $((8 * 1024 * 1024 * 1024 / 512)) error" | sudo dmsetup create eiodev
$ qemu-system-x86_64 -enable-kvm \
                     -drive if=virtio,file=/dev/mapper/eiodev,cache=none
qemu-system-x86_64: block.c:860: bdrv_open_common: Assertion `bdrv_opt_mem_align(bs) != 0' failed.

(Use "dmsetup remove eiodev" to clean up after the test.)
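
For reference, a device-mapper table line has the form "start length target", with start and length given in 512-byte sectors, so the expression in the reproducer evaluates to 16777216 sectors for an 8 GiB device. The helper below only makes that arithmetic explicit; the name `dm_error_table` is hypothetical.

```shell
# dm_error_table SIZE_GIB: print a device-mapper table line that maps a
# device of SIZE_GIB gibibytes to the "error" target, which fails every
# I/O request. Lengths in the table are counted in 512-byte sectors.
dm_error_table() {
    sectors=$(( $1 * 1024 * 1024 * 1024 / 512 ))
    echo "0 $sectors error"
}

dm_error_table 8   # prints: 0 16777216 error
```

The output can be piped into dmsetup exactly as in the reproducer, for example `dm_error_table 8 | sudo dmsetup create eiodev`.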

I have backported the error message from upstream. This replaces the assertion failure with a clean error at startup, but QEMU still cannot start with a failed disk:

qemu-system-x86_64: -drive if=none,file=/dev/mapper/eiodev,cache.direct=on,id=drive0: could not open disk image /dev/mapper/eiodev: Could not find working O_DIRECT alignment. Try cache.direct=off.

Comment 4 Sibiao Luo 2015-02-27 06:34:35 UTC
Reproduced it with qemu-kvm-1.5.3-60.el7.
host info:
# uname -r && rpm -q qemu-kvm
3.10.0-229.el7.x86_64
qemu-kvm-1.5.3-60.el7

e.g:...-drive file=/dev/mapper/eiodev,if=none,id=drive-system-disk,format=raw,cache=none,aio=native,werror=stop,rerror=stop -device ide-drive,bus=ide.0,unit=0,drive=drive-system-disk,id=system-disk,bootindex=0
qemu-kvm: block.c:849: bdrv_open_common: Assertion `bdrv_opt_mem_align(bs) != 0' failed.
Aborted (core dumped)

########

Tried it on qemu-kvm-rhev-2.1.2-9.el7.x86_64, which did not hit the assertion and reported an error instead.
host info:
# uname -r && rpm -q qemu-kvm-rhev
3.10.0-229.el7.x86_64
qemu-kvm-rhev-2.1.2-9.el7.x86_64

e.g:...-drive file=/dev/mapper/eiodev,if=none,id=drive-system-disk,format=raw,cache=none,aio=native,werror=stop,rerror=stop -device ide-drive,bus=ide.0,unit=0,drive=drive-system-disk,id=system-disk,bootindex=0
qemu-kvm: -drive file=/dev/mapper/eiodev,if=none,id=drive-system-disk,format=raw,cache=none,aio=native,werror=stop,rerror=stop: could not open disk image /dev/mapper/eiodev: Could not find working O_DIRECT alignment. Try cache.direct=off.

Comment 6 Miroslav Rezanina 2015-03-18 11:24:16 UTC
Fix included in qemu-kvm-1.5.3-87.el7

Comment 7 huiqingding 2015-03-23 09:54:48 UTC
Reproduced this bug using the following versions:
qemu-kvm-1.5.3-86.el7.x86_64
kernel-3.10.0-232.el7.x86_64

Test steps:
1. Create a device-mapper device that fails all I/O, simulating a multipath device with all paths failed:
#  echo "0 $((8 * 1024 * 1024 * 1024 / 512)) error" | sudo dmsetup create eiodev

2. Boot a VM with the above device attached:
# /usr/libexec/qemu-kvm -enable-kvm  -drive if=virtio,file=/dev/mapper/eiodev,cache=none

Actual result:
qemu-kvm: block.c:860: bdrv_open_common: Assertion `bdrv_opt_mem_align(bs) != 0' failed.
Aborted (core dumped)

Comment 8 huiqingding 2015-03-30 07:25:21 UTC
Tested this bug using the following versions:
qemu-kvm-1.5.3-87.el7.x86_64
kernel-3.10.0-232.el7.x86_64

Test steps:
1. Create a device-mapper device that fails all I/O, simulating a multipath device with all paths failed:
#  echo "0 $((8 * 1024 * 1024 * 1024 / 512)) error" | sudo dmsetup create eiodev

2. Boot a VM with the above device attached:
# /usr/libexec/qemu-kvm -enable-kvm  -drive if=virtio,file=/dev/mapper/eiodev,cache=none

Actual result:
qemu-kvm: -drive if=virtio,file=/dev/mapper/eiodev,cache=none: could not open disk image /dev/mapper/eiodev: Could not find working O_DIRECT alignment. Try cache.direct=off.

Based on the above results, this bug has been fixed.
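
The before/after comparison in the two test runs can be scripted by matching on the two messages seen in this report. This is only a sketch: the function name `classify_qemu_log` and the three labels it prints are hypothetical, and it looks for nothing beyond the strings quoted in the comments above.

```shell
# classify_qemu_log: read QEMU's error output on stdin and print one of
#   crash    - the bdrv_open_common assertion fired (unfixed build)
#   graceful - QEMU reported the O_DIRECT alignment error (fixed build)
#   other    - neither message was found
classify_qemu_log() {
    log=$(cat)
    case $log in
        *"bdrv_open_common: Assertion"*) echo crash ;;
        *"Could not find working O_DIRECT alignment"*) echo graceful ;;
        *) echo other ;;
    esac
}
```

For example, `/usr/libexec/qemu-kvm -enable-kvm -drive if=virtio,file=/dev/mapper/eiodev,cache=none 2>&1 | classify_qemu_log` would be expected to print "crash" on qemu-kvm-1.5.3-86.el7 and "graceful" on qemu-kvm-1.5.3-87.el7, matching the results reported in comments 7 and 8.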

Comment 12 errata-xmlrpc 2015-11-19 04:57:47 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2015-2213.html

