Red Hat Bugzilla – Bug 1119784
QMP: extend block events with error information
Last modified: 2015-03-05 02:41:15 EST
This BZ is about the libvirt side work that will be required when the RHEL-only BLOCK_IO_ERROR event extensions are added upstream as described below.

+++ This bug was initially created as a clone of Bug #1117445 +++

In RHEL6, we have extended the BLOCK_IO_ERROR event to contain the following fields:

o __com.redhat_reason: string representing an enum value (i.e. "eio", "eperm", "enospc" or "eother")
o __com.redhat_debug_info.errno: errno value as an integer
o __com.redhat_debug_info.message: error message returned by strerror()

Since then we have carried those extensions forward:

o RHEL6 (original request): bug 586349 and bug 624607
o RHEL7.0: bug 971938 and bug 895041
o RHEL7.1: bug 1116772

It's time to add this feature upstream, possibly for BLOCK_JOB_ERROR too. I'll add design ideas in the comments.

--- Additional comment from RHEL Product and Program Management on 2014-07-08 13:37:54 EDT ---

Since this bug report was entered in bugzilla, the release flag has been set to ? to ensure that it is properly evaluated for this release.

--- Additional comment from Luiz Capitulino on 2014-07-08 13:52:53 EDT ---

Here are some design considerations when doing this for upstream:

o We may want the extension(s) in both the BLOCK_IO_ERROR and BLOCK_JOB_ERROR events
o query-block and query-block-jobs must contain this info too
o The errno integer should be dropped
o Having the error string from strerror() is probably fine
o For the "reason" field, we have two options: a QAPI enum containing the most common errnos; or, instead of having a "reason" at all, we could only distinguish between ENOSPC and all the other errors, say, by having a "no-space-error" bool
o If doing the QAPI enum containing the most common errnos, then we should have a catch-all for unknown errnos (e.g. "unknown-error-code")

Here goes an example (taking the QAPI enum as the solution for the "reason" field):

{ "event": "BLOCK_IO_ERROR",
  "data": { "device": "ide0-hd1",
            "operation": "write",
            "action": "stop",
            "error-reason": "eio",
            "error-message": "I/O error" },
  "timestamp": { "seconds": 1265044230, "microseconds": 450486 } }
Upstream patch proposed. https://www.redhat.com/archives/libvir-list/2014-October/msg00124.html
Tested with the following builds; this bug is verified.

libvirt-1.2.8-7.el7.x86_64
qemu-kvm-rhev-2.1.2-10.el7.x86_64
kernel-3.10.0-205.el7.x86_64

Scenario 1: report the reason of BLOCK_IO_ERROR while unplugging the device

1. Start a guest with a USB disk.

# virsh dumpxml rhel7|grep disk -A5
<disk type='block' device='disk'>
  <driver name='qemu' type='raw'/>
  <source dev='/dev/sdc'/>
  <backingStore/>
  <target dev='vdb' bus='virtio'/>
</disk>

# virsh start rhel7
Domain rhel7 started

2. Run the following dd operation inside the guest.

# dd if=/dev/urandom of=/dev/vdb bs=1M

3. Unplug the USB stick from the host.

4. The BLOCK_IO_ERROR events in libvirtd.log now carry "reason": "Input/output error":

2014-11-20 12:28:24.613+0000: 2245: debug : qemuMonitorIOProcess:399 : QEMU_MONITOR_IO_PROCESS: mon=0x7fc168001570 buf={"timestamp": {"seconds": 1416486504, "microseconds": 612675}, "event": "BLOCK_IO_ERROR", "data": {"device": "drive-virtio-disk1", "com.redhat_debug_info": {"message": "Input/output error", "errno": 5}, "nospace": false, "com.redhat_reason": "eio", "reason": "Input/output error", "operation": "write", "action": "report"}}^M
{"timestamp": {"seconds": 1416486504, "microseconds": 612796}, "event": "BLOCK_IO_ERROR", "data": {"device": "drive-virtio-disk1", "com.redhat_debug_info": {"message": "Input/output error", "errno": 5}, "nospace": false, "com.redhat_reason": "eio", "reason": "Input/output error", "operation": "write", "action": "report"}}^M
{"timestamp": {"seconds": 1416486504, "microseconds": 612889}, "event": "BLOCK_IO_ERROR", "data": {"device": "drive-virtio-disk1", "com.redhat_debug_info": {"message": "Input/output error", "errno": 5}, "nospace": false, "com.redhat_reason": "eio", "reason": "Input/output error", "operation": "write", "action": "report"}}^M
{"timestamp": {"seconds": 1416486504 len=1023

Scenario 2: report the reason of BLOCK_IO_ERROR when there is no space left

1. Create one LVM logical volume of 500M.

LV Path                /dev/vg_flang/lv_flang
LV Name                lv_flang
VG Name                vg_flang
LV UUID                NhJ6o5-B49l-tqjj-3Hsa-XaNj-jA8T-dhijel
LV Write Access        read/write
LV Creation host, time localhost.localdomain, 2014-11-21 15:59:51 +0800
LV Status              available
# open                 0
LV Size                500.00 MiB
Current LE             125
Segments               1
Allocation             inherit
Read ahead sectors     auto
- currently set to     8192
Block device           253:3

2. Create a new guest on this logical volume, choosing qcow2 as the disk type. The disk dumpxml should look like the following:

<disk type='block' device='disk'>
  <driver name='qemu' type='qcow2' cache='none' io='native'/>
  <source dev='/dev/vg_flang/lv_flang'/>
  <target dev='vda' bus='virtio'/>
  <address type='pci' domain='0x0000' bus='0x00' slot='0x09' function='0x0'/>
</disk>

3. The guest is paused, and "reason": "No space left on device" is reported in libvirtd.log as expected.

2014-11-21 08:37:19.263+0000: 7613: debug : qemuMonitorIOProcess:399 : QEMU_MONITOR_IO_PROCESS: mon=0x7f4af00095b0 buf={"timestamp": {"seconds": 1416559039, "microseconds": 262858}, "event": "BLOCK_IO_ERROR", "data": {"device": "drive-virtio-disk0", "__com.redhat_debug_info": {"message": "No space left on device", "errno": 28}, "nospace": true, "__com.redhat_reason": "enospc", "reason": "No space left on device", "operation": "write", "action": "stop"}}
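When repeating this verification, the check for the "reason" and "nospace" fields can be scripted instead of read by eye. The following Python sketch is only an illustration: the helper name first_block_io_error is hypothetical, and it assumes the log line layout shown above, where the monitor buffer follows "buf=". It decodes only the first JSON object in the buffer, so a buffer holding several events or a truncated tail (as in Scenario 1) would need extra handling.

```python
import json


def first_block_io_error(log_line):
    """Return the "data" dict of the first QMP event after 'buf=' if that
    event is a BLOCK_IO_ERROR, otherwise None."""
    start = log_line.find("buf=")
    if start == -1:
        return None
    start += len("buf=")
    try:
        # raw_decode parses one JSON value and ignores whatever follows it.
        event, _end = json.JSONDecoder().raw_decode(log_line, start)
    except json.JSONDecodeError:
        return None
    if not isinstance(event, dict) or event.get("event") != "BLOCK_IO_ERROR":
        return None
    return event.get("data")


# Log line copied from Scenario 2.
sample = (
    '2014-11-21 08:37:19.263+0000: 7613: debug : qemuMonitorIOProcess:399 : '
    'QEMU_MONITOR_IO_PROCESS: mon=0x7f4af00095b0 '
    'buf={"timestamp": {"seconds": 1416559039, "microseconds": 262858}, '
    '"event": "BLOCK_IO_ERROR", "data": {"device": "drive-virtio-disk0", '
    '"__com.redhat_debug_info": {"message": "No space left on device", "errno": 28}, '
    '"nospace": true, "__com.redhat_reason": "enospc", '
    '"reason": "No space left on device", "operation": "write", "action": "stop"}}'
)

data = first_block_io_error(sample)
print(data["reason"], data["nospace"])  # No space left on device True
```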
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://rhn.redhat.com/errata/RHSA-2015-0323.html