Bug 1090079 - vdsm reports guest as paused on any IO error, even if libvirt/qemu policy is set to "report"
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Virtualization Manager
Classification: Red Hat
Component: vdsm
Version: 3.2.0
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: high
Target Milestone: ---
Target Release: 3.3.3
Assignee: Francesco Romani
QA Contact: Pavel Novotny
URL:
Whiteboard: virt
Depends On: 1064630
Blocks:
Reported: 2014-04-22 13:58 UTC by rhev-integ
Modified: 2019-08-15 03:52 UTC (History)

Fixed In Version: vdsm-4.13.2-0.14.el6ev
Doc Type: Bug Fix
Doc Text:
Previously, VDSM would report that virtual machines experiencing any I/O error were in a paused state. This was caused by the logic used by VDSM to check I/O errors received from libvirt. Now, the logic used to check such errors has been revised so that VDSM detects the nature of the error, allowing I/O errors to be correctly reported and handled.
Clone Of: 1064630
Environment:
Last Closed: 2014-05-27 08:57:34 UTC
oVirt Team: ---
Target Upstream Version:
Embargoed:


Links
System ID Private Priority Status Summary Last Updated
Red Hat Product Errata RHBA-2014:0548 0 normal SHIPPED_LIVE vdsm 3.3.3 bug fix update 2014-05-27 12:56:53 UTC
oVirt gerrit 25157 0 None None None Never
oVirt gerrit 26023 0 None None None Never
oVirt gerrit 27016 0 None None None Never

Comment 3 Pavel Novotny 2014-05-05 14:54:40 UTC
Verified in vdsm-4.13.2-0.14.el6ev.x86_64 (is36).

Verification steps:

I used two local storage domains on the host:
a "regular" one under `/mnt/localstorage/`,
and a "flakey" one under `/mnt/errstorage/`, which simulates I/O errors using the `dmsetup` utility:
-~-
# dd if=/dev/zero of=/tmp/virtualblock.img bs=4096 count=1M
1048576+0 records in
1048576+0 records out
4294967296 bytes (4,3 GB) copied, 50,7873 s, 84,6 MB/s
# losetup /dev/loop7 /tmp/virtualblock.img 
# mkfs.ext4 /dev/loop7 
mke2fs 1.41.12 (17-May-2010)
Discarding device blocks: done                            
...
...
### the following command creates a flakey device that fails I/O periodically (up for 9 s, then down for 1 s)
# dmsetup create errdev0
0 8388608 flakey /dev/loop7 0 9 1
# mkdir /mnt/errstorage
# chown -R vdsm:kvm /mnt/errstorage
# mount /dev/mapper/errdev0 /mnt/errstorage/
-~-
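
The `8388608` in the table line above is the device length in 512-byte sectors, and it has to match the size of the backing loop device (4294967296 bytes here). A minimal sketch of that arithmetic follows; the helper name `flakey_table` is made up for illustration and is not part of `dmsetup`:

```shell
#!/bin/sh
# Build a dm-flakey table line for a backing device of a given size in bytes.
# Table layout: <start> <length_in_sectors> flakey <device> <offset> <up_secs> <down_secs>
# (flakey_table is a hypothetical helper, not a dmsetup command.)
flakey_table() {
    dev=$1; bytes=$2; up=$3; down=$4
    sectors=$(( bytes / 512 ))   # device-mapper counts in 512-byte sectors
    echo "0 $sectors flakey $dev 0 $up $down"
}

flakey_table /dev/loop7 4294967296 9 1
# prints: 0 8388608 flakey /dev/loop7 0 9 1
```

On a real host the sector count can also be read directly with `blockdev --getsz /dev/loop7` instead of being computed from the image size.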

In RHEVM GUI, add both local storage domains (/mnt/localstorage as master SD).
Create a new VM with two disks - one "healthy" disk on the `localstorage` domain and a second "flakey" disk (1G) on the `errstorage` domain.

In the RHEVM DB, update both disks to propagate errors to the guest: psql: UPDATE base_disks SET propagate_errors = 'On';
Restart the ovirt-engine service.

In RHEVM GUI, install the guest OS *on the healthy disk* (I used Fedora 19).
In the guest, mount the second flakey disk to `/mnt/errdisk/` and run some I/O operation on it.
I used `dd`: # dd if=/dev/zero of=/mnt/errdisk/test bs=1000 count=1M
and after a few seconds I got a burst of I/O errors "Buffer I/O error on device vdb, logical block ...".
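
To confirm inside the guest that the errors really come from the flakey disk, the kernel log can be filtered per device. This is only a sketch: `count_io_errors` is a hypothetical helper, and it assumes the message format quoted above.

```shell
#!/bin/sh
# Count "Buffer I/O error" lines for one block device in kernel log text on stdin.
# (count_io_errors is a hypothetical helper; the message format is taken from the
# error text quoted in this comment.)
count_io_errors() {
    # grep -c prints the number of matching lines; "|| true" keeps the exit
    # status at 0 when there are no matches.
    grep -c "Buffer I/O error on device $1," || true
}

# Usage in the guest:
#   dmesg | count_io_errors vdb
```

A non-zero count for `vdb` and zero for the system disk would match the setup described here.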

Results:
The qemu process runs with the correct parameter 'werror=enospc'.
After the I/O errors, the guest is still running.
Both QEMU/VDSM and RHEVM also report the guest as running.
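
One way to check the `werror` setting on the host is to pull it out of the QEMU command line. The sketch below is an assumption about how that check could be scripted: `werror_policies` is a hypothetical helper, and the process name `qemu-kvm` may differ on other hosts.

```shell
#!/bin/sh
# Print each distinct werror= policy found in QEMU command-line text on stdin.
# (werror_policies is a hypothetical helper name.)
werror_policies() {
    grep -o 'werror=[a-z]*' | sort -u
}

# Usage on the host (assumes the emulator process is named qemu-kvm):
#   ps -o args= -C qemu-kvm | werror_policies
```

With error propagation enabled on both disks, the expected result is a single line, `werror=enospc`.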

Comment 5 errata-xmlrpc 2014-05-27 08:57:34 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHBA-2014-0548.html

