Bug 972675
Summary: | Fail migration when VM get paused due to EIO | |||
---|---|---|---|---|
Product: | Red Hat Enterprise Linux 6 | Reporter: | Michal Skrivanek <michal.skrivanek> | |
Component: | libvirt | Assignee: | Peter Krempa <pkrempa> | |
Status: | CLOSED ERRATA | QA Contact: | Virtualization Bugs <virt-bugs> | |
Severity: | high | Docs Contact: | ||
Priority: | unspecified | |||
Version: | 6.5 | CC: | acathrow, berrange, dallan, dyuan, huding, juzhang, weizhan, ydu, zpeng | |
Target Milestone: | rc | |||
Target Release: | --- | |||
Hardware: | Unspecified | |||
OS: | Unspecified | |||
Whiteboard: | ||||
Fixed In Version: | libvirt-0.10.2-20.el6 | Doc Type: | Bug Fix | |
Doc Text: | Story Points: | --- | ||
Clone Of: | ||||
: | 1045833 | Environment: | |
Last Closed: | 2013-11-21 09:02:39 UTC | Type: | Bug | |
Regression: | --- | Mount Type: | --- | |
Documentation: | --- | CRM: | ||
Verified Versions: | Category: | --- | ||
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | ||
Cloudforms Team: | --- | Target Upstream Version: | ||
Embargoed: | ||||
Bug Depends On: | ||||
Bug Blocks: | 886416, 961154, 1045833 |
Description
Michal Skrivanek
2013-06-10 11:51:38 UTC
Sounds like it would be best for QEMU itself to refuse to accept the 'migrate' command if it is paused in EIO, and have it fail an ongoing migration if EIO occurs. Doing it in libvirt is somewhat racy since event notifications are asynchronous.

The support for canceling ongoing migration was committed upstream a while ago in commits:

commit 5379bb0f33f1529f530a40958a10e8f02eb868bb
Author: Peter Krempa <pkrempa>
Date: Wed Jun 12 16:11:22 2013 +0200

    migration: Don't propagate VIR_MIGRATE_ABORT_ON_ERROR

    This flag is meant for errors happening on the source of the migration and isn't used on the destination. To allow better migration compatibility, don't propagate it to the destination.

commit cf6d56ac433273b7e4e087bb861ebced0680cec3
Author: Peter Krempa <pkrempa>
Date: Wed Jun 12 16:11:21 2013 +0200

    migration: Make erroring out on I/O error controllable by flag

    Paolo Bonzini pointed out that it's actually possible to migrate a qemu instance that was paused due to I/O error and it will be able to work on the destination if the storage is accessible. This patch introduces flag VIR_MIGRATE_ABORT_ON_ERROR that cancels the migration in case an I/O error happens while it's being performed and allows migration without this flag. This flag can possibly be used for other error reasons that may be introduced in the future.

commit 5f719f217ebf89668ca3c404e4b8288179c26c92
Author: Peter Krempa <pkrempa>
Date: Mon Jun 10 16:30:48 2013 +0200

    qemu: Forbid migration of machines with I/O errors

    Such a machine can't be successfully migrated unless the I/O error has recovered, and migrating it might lead to data corruption. Forbid this kind of migration.

commit caa467db626c8691d993e8e15d2cbb0bb043312c
Author: Peter Krempa <pkrempa>
Date: Mon Jun 10 16:05:45 2013 +0200

    qemu: Cancel migration if guest encounters I/O error while migrating

    During a live migration the guest may receive a disk access I/O error. In this state the guest is unable to continue running on a remote host after migration as some state may be present in the kernel and not migrated. With this patch, the migration is canceled in such a case so the guest can either continue on the source if the I/O issues are recovered or has to be destroyed anyway.

Verified with build: libvirt-0.10.2-21.el6.x86_64

Steps:
1. Prepare two machines, source and target.
2. Create a guest on the source with shared NFS storage.

Test one: do the migration after the guest hits EIO.

Pause the guest with an I/O error:
# virsh domstate $guest --reason
paused (I/O error)

# virsh migrate --live spice qemu+ssh://10.66.106.31/system --verbose --unsafe
root.106.31's password:
error: cannot open file '/var/lib/libvirt/migrate/xuzhang-Graph.img': Input/output error

Check the guest state:
# virsh domstate $guest --reason
paused (I/O error)

Check the target: no guest exists.

Test two: the guest receives EIO during migration.

# virsh domstate spice
running

Do the migration:
# virsh migrate --live spice qemu+ssh://10.66.106.31/system --verbose --unsafe
root.106.31's password:

Before the migration finishes, stop the NFS server:
Migration: [ 96 %]error: Unable to read from monitor: Connection reset by peer

The job stopped. Check the guest state:
# virsh domstate spice --reason
paused (I/O error)

Check the target: no guest exists.

Worked as expected, moving to VERIFIED.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHBA-2013-1581.html
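The verification above checks the output of `virsh domstate --reason` by hand before and after each migration attempt. A minimal sketch of automating that check follows. The helper names (`parse_domstate`, `is_paused_on_io_error`, `domstate`) are hypothetical and not part of libvirt, and the parser assumes the `state (reason)` output shape shown in the test steps.

```python
import re
import subprocess

# Matches "paused (I/O error)" as well as a bare state such as "running".
_DOMSTATE_RE = re.compile(r"^(?P<state>[^()]+?)\s*(?:\((?P<reason>.*)\))?$")


def parse_domstate(output):
    """Split `virsh domstate --reason` output into (state, reason).

    reason is None when virsh printed no parenthesised reason.
    """
    text = output.strip()
    line = text.splitlines()[0].strip() if text else ""
    match = _DOMSTATE_RE.match(line)
    if not match:
        raise ValueError("unrecognised domstate output: %r" % output)
    return match.group("state").strip(), match.group("reason")


def is_paused_on_io_error(output):
    """True if the guest is paused with the 'I/O error' reason."""
    state, reason = parse_domstate(output)
    return state == "paused" and reason == "I/O error"


def domstate(guest):
    """Run `virsh domstate <guest> --reason` and return its stdout.

    Requires virsh on PATH; this is the same command used in the test steps.
    """
    result = subprocess.run(
        ["virsh", "domstate", guest, "--reason"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout
```

For example, `is_paused_on_io_error(domstate("spice"))` could be asserted on the source host after a failed migration to confirm the guest stayed paused there.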