Bug 1194774 - qemu refuses to start if a disk is offline
Summary: qemu refuses to start if a disk is offline
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: qemu-kvm
Version: 7.0
Hardware: x86_64
OS: Linux
Priority: medium
Severity: medium
Target Milestone: rc
Target Release: ---
Assignee: Stefan Hajnoczi
QA Contact: CongLi
URL:
Whiteboard:
Depends On: 1184363
Blocks:
 
Reported: 2015-02-20 17:42 UTC by Stefan Hajnoczi
Modified: 2017-12-08 17:03 UTC
CC List: 19 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of: 1184363
Environment:
Last Closed: 2017-12-08 17:03:26 UTC
Target Upstream Version:
Embargoed:


Attachments

Description Stefan Hajnoczi 2015-02-20 17:42:05 UTC
The assertion failure mentioned in the original bug has been addressed, but QEMU still cannot launch a guest whose disk is offline (all I/O returns -EIO).

This clone is for making QEMU capable of starting with an offline disk.


+++ This bug was initially created as a clone of Bug #1184363 +++

Description of problem:
     It is not possible to start a VM which has a failed multipath device connected.


Version-Release number of selected component (if applicable):
     qemu-kvm-1.5.3-60.el7_0.11

How reproducible:
     100%

Steps to Reproduce:
1. Create a VM and a multipath device as a disk:

    <disk type='block' device='disk'>
      <driver name='qemu' type='raw' cache='none' io='native'/>
      <source dev='/dev/mapper/3600140519ce508030c64008be5be247d'/>
      <target dev='vdb' bus='virtio'/>
      <boot order='1'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x0a' function='0x0'/>
    </disk>

2. Mark all the paths of the multipath device as failed:
    # multipath -ll
      ...
          3600140519ce508030c64008be5be247d dm-12 LIO-ORG ,iscsi17         
          size=10G features='0' hwhandler='0' wp=rw
          `-+- policy='service-time 0' prio=1 status=active
            `- 7:0:0:16 sds 65:32 failed ready  running

    # multipathd -c fail path sds

    # multipath -ll
      ...
          3600140519ce508030c64008be5be247d dm-12 LIO-ORG ,iscsi17         
          size=10G features='0' hwhandler='0' wp=rw
          `-+- policy='service-time 0' prio=1 status=enabled
            `- 7:0:0:16 sds 65:32 failed ready  running

3. Start the VM

Actual results:
error: Failed to start domain Host_boot_test
error: internal error: early end of file from monitor: possible problem:
Warning: option deprecated, use lost_tick_policy property of kvm-pit instead.
qemu-kvm: block.c:849: bdrv_open_common: Assertion `bdrv_opt_mem_align(bs) != 0' failed.


Expected results:
VM is started

--- Additional comment from Stefan Hajnoczi on 2015-01-21 07:45:29 EST ---

(In reply to Roman Hodain from comment #0)
> Description of problem:
>      It is not possible to start a VM which has a failed multipath device
> connected.
...
> Actual results:
> error: Failed to start domain Host_boot_test
> error: internal error: early end of file from monitor: possible problem:
> Warning: option deprecated, use lost_tick_policy property of kvm-pit instead.
> qemu-kvm: block.c:849: bdrv_open_common: Assertion `bdrv_opt_mem_align(bs)
> != 0' failed.

The multipath device fails all I/O requests so QEMU is unable to probe the O_DIRECT memory and request alignment by attempting reads on the device.
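
For illustration only, and not QEMU's actual code (the file name probe-align.c and the program itself are made up for this sketch): the following minimal C program shows the kind of probing described above. It attempts a small O_DIRECT read with progressively larger buffer alignments and reports the first one that works; on a device whose paths are all failed, every probe read returns EIO, so no usable alignment can be found.

/* probe-align.c: illustrative sketch, not QEMU source.
 * Build: gcc -D_GNU_SOURCE -o probe-align probe-align.c
 * Usage: ./probe-align /dev/mapper/<multipath-device>
 */
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

int main(int argc, char **argv)
{
    if (argc != 2) {
        fprintf(stderr, "usage: %s <block-device>\n", argv[0]);
        return 1;
    }

    int fd = open(argv[1], O_RDONLY | O_DIRECT);
    if (fd < 0) {
        perror("open");
        return 1;
    }

    /* Try progressively larger buffer alignments until a read succeeds. */
    for (size_t align = 512; align <= 4096; align *= 2) {
        void *buf;
        if (posix_memalign(&buf, align, 4096) != 0) {
            fprintf(stderr, "posix_memalign failed\n");
            close(fd);
            return 1;
        }
        ssize_t ret = pread(fd, buf, 4096, 0);
        int err = errno;
        free(buf);
        if (ret >= 0) {
            printf("working O_DIRECT buffer alignment: %zu bytes\n", align);
            close(fd);
            return 0;
        }
        fprintf(stderr, "read with %zu-byte buffer alignment failed: %s\n",
                align, strerror(err));
    }

    /* With an offline disk every probe read returns EIO, so we fall through
     * to here: the same dead end that tripped the bdrv_open_common() assertion. */
    fprintf(stderr, "could not find a working O_DIRECT alignment\n");
    close(fd);
    return 1;
}

Against a healthy device the 512-byte probe usually succeeds immediately. Against the failed multipath device, or the dm "error" target used in the reproducer further down in this description, every pread() returns EIO and the program gives up, which is the same dead end QEMU reaches before it can compute bdrv_opt_mem_align().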

Regarding the expected behavior, do you want the VM to start but all I/Os will return errors until the path comes online again?

--- Additional comment from Roman Hodain on 2015-01-23 05:43:20 EST ---

(In reply to Stefan Hajnoczi from comment #1)
> (In reply to Roman Hodain from comment #0)
> > Description of problem:
> >      It is not possible to start a VM which has a failed multipath device
> > connected.
> ...
> > Actual results:
> > error: Failed to start domain Host_boot_test
> > error: internal error: early end of file from monitor: possible problem:
> > Warning: option deprecated, use lost_tick_policy property of kvm-pit instead.
> > qemu-kvm: block.c:849: bdrv_open_common: Assertion `bdrv_opt_mem_align(bs)
> > != 0' failed.
> 
> The multipath device fails all I/O requests so QEMU is unable to probe the
> O_DIRECT memory and request alignment by attempting reads on the device.
> 
> Regarding the expected behavior, do you want the VM to start but all I/Os
> will return errors until the path comes online again?

Hi,

Yes, the use case here is disaster recovery: the VM has two mirrored LUNs connected and one of them is down. This prevents the VM from starting even though the second LUN is available and the VM would otherwise be able to boot from it.

Roman

--- Additional comment from Stefan Hajnoczi on 2015-02-19 09:16:11 EST ---

This bug can be reproduced with:

$ echo "0 $((8 * 1024 * 1024 * 1024 / 512)) error" | sudo dmsetup create eiodev
$ qemu-system-x86_64 -enable-kvm \
                     -drive if=virtio,file=/dev/mapper/eiodev,cache=none
qemu-system-x86_64: block.c:860: bdrv_open_common: Assertion `bdrv_opt_mem_align(bs) != 0' failed.

(Use "dmsetup remove eiodev" to clean up after the test.)

I have backported the error message from upstream.  This prevents an assertion failure but does not support starting QEMU with failed disks yet:

qemu-system-x86_64: -drive if=none,file=/dev/mapper/eiodev,cache.direct=on,id=drive0: could not open disk image /dev/mapper/eiodev: Could not find working O_DIRECT alignment. Try cache.direct=off.

Comment 1 Sibiao Luo 2015-02-27 06:35:08 UTC
Reproduced it with qemu-kvm-1.5.3-60.el7.
host info:
# uname -r && rpm -q qemu-kvm
3.10.0-229.el7.x86_64
qemu-kvm-1.5.3-60.el7

e.g:...-drive file=/dev/mapper/eiodev,if=none,id=drive-system-disk,format=raw,cache=none,aio=native,werror=stop,rerror=stop -device ide-drive,bus=ide.0,unit=0,drive=drive-system-disk,id=system-disk,bootindex=0
qemu-kvm: block.c:849: bdrv_open_common: Assertion `bdrv_opt_mem_align(bs) != 0' failed.
Aborted (core dumped)

########

Tried it on qemu-kvm-rhev-2.1.2-9.el7.x86_64 which did not hit it.
host info:
# uname -r && rpm -q qemu-kvm-rhev
3.10.0-229.el7.x86_64
qemu-kvm-rhev-2.1.2-9.el7.x86_64

e.g:...-drive file=/dev/mapper/eiodev,if=none,id=drive-system-disk,format=raw,cache=none,aio=native,werror=stop,rerror=stop -device ide-drive,bus=ide.0,unit=0,drive=drive-system-disk,id=system-disk,bootindex=0
qemu-kvm: -drive file=/dev/mapper/eiodev,if=none,id=drive-system-disk,format=raw,cache=none,aio=native,werror=stop,rerror=stop: could not open disk image /dev/mapper/eiodev: Could not find working O_DIRECT alignment. Try cache.direct=off.

Comment 3 Stefan Hajnoczi 2015-08-04 13:49:16 UTC
Deferring to 7.3.  Need to investigate offline disk behavior so QEMU can emulate that correctly.  In a virtio-scsi scenario this should be quite doable, but there may be no good mapping for the offline state in virtio-blk and other storage controllers.

Comment 5 Stefan Hajnoczi 2016-01-29 14:03:58 UTC
Moving to RHEL 7.4. This is a rare configuration, and after discussing it with Hannes Reinecke, it is clear that changing this behavior will be quite involved.

Comment 6 Stefan Hajnoczi 2017-01-16 16:25:12 UTC
No new requests for starting a guest with failed multipath.  Working on higher priority items in RHEL 7.4, deferring this to RHEL 7.5.

Comment 8 Stefan Hajnoczi 2017-11-28 17:16:02 UTC
Deferring to RHEL 7.6.  No activity or requests for this feature upstream.

I may close this BZ next release if there is no demand for it.

Comment 10 Stefan Hajnoczi 2017-12-08 17:03:26 UTC
There has been no activity or demand for this recently.  It is therefore out of scope and unlikely to be addressed.

