1119784 – QMP: extend block events with error information



Keywords:
Status:	CLOSED ERRATA
Alias:	None
Product:	Red Hat Enterprise Linux 7
Classification:	Red Hat
Component:	libvirt
Sub Component:
Version:	7.1
Hardware:	Unspecified
OS:	Unspecified
Priority:	unspecified
Severity:	unspecified
Target Milestone:	rc
Target Release:	---
Assignee:	Jiri Denemark
QA Contact:	Virtualization Bugs
Docs Contact:
URL:
Whiteboard:
Depends On:	1117445
Blocks:

Reported:	2014-07-15 13:47 UTC by Luiz Capitulino
Modified:	2015-03-05 07:41 UTC
CC List:	20 users
Fixed In Version:	libvirt-1.2.8-6.el7
Doc Type:	Bug Fix
Doc Text:
Clone Of:	1117445
Environment:
Last Closed:	2015-03-05 07:41:15 UTC
Target Upstream Version:
Embargoed:


Links
System	ID	Private	Priority	Status	Summary	Last Updated
Red Hat Product Errata	RHSA-2015:0323	0	normal	SHIPPED_LIVE	Low: libvirt security, bug fix, and enhancement update	2015-03-05 12:10:54 UTC

Description Luiz Capitulino 2014-07-15 13:47:56 UTC

This BZ is about the libvirt side work that will be required when the RHEL-only BLOCK_IO_ERROR event extensions are added upstream as described below.

+++ This bug was initially created as a clone of Bug #1117445 +++

In RHEL6, we have extended the BLOCK_IO_ERROR event to contain the following fields:

o __com.redhat_reason: string representing an enum value (i.e. "eio", "eperm", "enospc" or "eother")
o __com.redhat_debug_info.errno: errno value as an integer
o __com.redhat_debug_info.message: error message returned by strerror()
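
As a rough illustration of how the three fields listed above relate to one another, the following Python sketch derives them from an errno value. The helper name `classify_errno` and the lookup table are assumptions made for this example; the actual QEMU code is written in C and is not shown in this report.

```python
import errno
import os

# Reason strings as listed in the report; any other errno maps to "eother".
_REASONS = {
    errno.EIO: "eio",
    errno.EPERM: "eperm",
    errno.ENOSPC: "enospc",
}


def classify_errno(err):
    """Build the RHEL-style extension fields for a positive errno value.

    Hypothetical helper, for illustration only.
    """
    return {
        "__com.redhat_reason": _REASONS.get(err, "eother"),
        "__com.redhat_debug_info": {
            "errno": err,
            # strerror() text is platform dependent.
            "message": os.strerror(err),
        },
    }


if __name__ == "__main__":
    print(classify_errno(errno.ENOSPC))
```

The nesting of `errno` and `message` under `__com.redhat_debug_info` follows the dotted field names in the list.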

Since then we have carried those extensions forward:

o RHEL6 (original request): bug 586349 and bug 624607
o RHEL7.0: bug 971938 and bug 895041
o RHEL7.1: bug 1116772

It's time to add this feature upstream, possibly for BLOCK_JOB_ERROR too. I'll add design ideas in the comments.

--- Additional comment from RHEL Product and Program Management on 2014-07-08 13:37:54 EDT ---

Since this bug report was entered in bugzilla, the release flag has been
set to ? to ensure that it is properly evaluated for this release.

--- Additional comment from Luiz Capitulino on 2014-07-08 13:52:53 EDT ---

Here are some design considerations when doing this for upstream:

o We may want the extension(s) in BLOCK_IO_ERROR and BLOCK_JOB_ERROR events
o query-block and query-block-jobs must contain this info too
o The errno integer should be dropped
o Having the error string from strerror() is probably fine
o For the "reason" field, we have two options: a QAPI enum containing the most common errnos; or, instead of having a "reason" at all, we could distinguish only between ENOSPC and all other errors, say, with a "no-space-error" bool
o If we go with the QAPI enum containing the most common errnos, then we should have a catch-all for unknown errnos (e.g. "unknown-error-code")

Here goes an example (taking the QAPI enum as solution for the "reason" field):

{ "event": "BLOCK_IO_ERROR",
    "data": { "device": "ide0-hd1",
              "operation": "write",
              "action": "stop",
              "error-reason": "eio",
              "error-message": "I/O error" },
    "timestamp": { "seconds": 1265044230, "microseconds": 450486 } }
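
A client consuming the proposed event might look roughly like the sketch below. It assumes the field names "error-reason" and "error-message" from the example above, which were only a design proposal at this point. The helper `describe_block_error` and the set of reasons the client recognizes are hypothetical.

```python
import json

# Reason values this client recognizes; the set is an assumption for the sketch.
KNOWN_REASONS = {"eio", "eperm", "enospc"}
CATCH_ALL = "unknown-error-code"

EXAMPLE = """
{ "event": "BLOCK_IO_ERROR",
  "data": { "device": "ide0-hd1",
            "operation": "write",
            "action": "stop",
            "error-reason": "eio",
            "error-message": "I/O error" },
  "timestamp": { "seconds": 1265044230, "microseconds": 450486 } }
"""


def describe_block_error(raw):
    """Return (device, reason, message) for a BLOCK_IO_ERROR event, else None."""
    event = json.loads(raw)
    if event.get("event") != "BLOCK_IO_ERROR":
        return None
    data = event.get("data", {})
    reason = data.get("error-reason", CATCH_ALL)
    if reason not in KNOWN_REASONS:
        # Values this client does not recognize are folded into the catch-all.
        reason = CATCH_ALL
    return data.get("device"), reason, data.get("error-message", "")


if __name__ == "__main__":
    print(describe_block_error(EXAMPLE))
```

Folding unrecognized values into the catch-all keeps the client working if the enum later gains members it does not know about.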

Comment 1 Eric Blake 2014-10-03 15:03:28 UTC

Upstream patch proposed.
https://www.redhat.com/archives/libvir-list/2014-October/msg00124.html

Comment 4 Xuesong Zhang 2014-11-21 08:46:06 UTC

Tested with the following builds; this bug is verified.

libvirt-1.2.8-7.el7.x86_64
qemu-kvm-rhev-2.1.2-10.el7.x86_64
kernel-3.10.0-205.el7.x86_64

Scenario 1: report the reason for BLOCK_IO_ERROR when the device is unplugged
1. Start a guest with a USB disk.
# virsh dumpxml rhel7|grep disk  -A5
    <disk type='block' device='disk'>
      <driver name='qemu' type='raw'/>
      <source dev='/dev/sdc'/>
      <backingStore/>
      <target dev='vdb' bus='virtio'/>
    </disk>
# virsh start rhel7
Domain rhel7 started

2. Run the following dd operation inside the guest.
# dd if=/dev/urandom of=/dev/vdb bs=1M
3. Unplug the USB stick from the host.

4. The BLOCK_IO_ERROR events in libvirtd.log now carry "reason": "Input/output error":
2014-11-20 12:28:24.613+0000: 2245: debug : qemuMonitorIOProcess:399 : QEMU_MONITOR_IO_PROCESS: mon=0x7fc168001570 buf={"timestamp": {"seconds": 1416486504, "microseconds": 612675}, "event": "BLOCK_IO_ERROR", "data": {"device": "drive-virtio-disk1", "com.redhat_debug_info": {"message": "Input/output error", "errno": 5}, "nospace": false, "com.redhat_reason": "eio", "reason": "Input/output error", "operation": "write", "action": "report"}}^M
{"timestamp": {"seconds": 1416486504, "microseconds": 612796}, "event": "BLOCK_IO_ERROR", "data": {"device": "drive-virtio-disk1", "com.redhat_debug_info": {"message": "Input/output error", "errno": 5}, "nospace": false, "com.redhat_reason": "eio", "reason": "Input/output error", "operation": "write", "action": "report"}}^M
{"timestamp": {"seconds": 1416486504, "microseconds": 612889}, "event": "BLOCK_IO_ERROR", "data": {"device": "drive-virtio-disk1", "com.redhat_debug_info": {"message": "Input/output error", "errno": 5}, "nospace": false, "com.redhat_reason": "eio", "reason": "Input/output error", "operation": "write", "action": "report"}}^M
{"timestamp": {"seconds": 1416486504 len=1023


Scenario 2: report the reason for BLOCK_IO_ERROR when the device runs out of space
1. Create a 500 MiB LVM logical volume.
  LV Path                /dev/vg_flang/lv_flang
  LV Name                lv_flang
  VG Name                vg_flang
  LV UUID                NhJ6o5-B49l-tqjj-3Hsa-XaNj-jA8T-dhijel
  LV Write Access        read/write
  LV Creation host, time localhost.localdomain, 2014-11-21 15:59:51 +0800
  LV Status              available
  # open                 0
  LV Size                500.00 MiB
  Current LE             125
  Segments               1
  Allocation             inherit
  Read ahead sectors     auto
  - currently set to     8192
  Block device           253:3

2. Create a new guest on this logical volume and choose qcow2 as the disk type. The disk element in the dumpxml output should look like the following:

    <disk type='block' device='disk'>
      <driver name='qemu' type='qcow2' cache='none' io='native'/>
      <source dev='/dev/vg_flang/lv_flang'/>
      <target dev='vda' bus='virtio'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x09' function='0x0'/>
    </disk>

3. The guest is paused, and the expected "reason": "No space left on device" appears in libvirtd.log:
2014-11-21 08:37:19.263+0000: 7613: debug : qemuMonitorIOProcess:399 : QEMU_MONITOR_IO_PROCESS: mon=0x7f4af00095b0 buf={"timestamp": {"seconds": 1416559039, "microseconds": 262858}, "event": "BLOCK_IO_ERROR", "data": {"device": "drive-virtio-disk0", "__com.redhat_debug_info": {"message": "No space left on device", "errno": 28}, "nospace": true, "__com.redhat_reason": "enospc", "reason": "No space left on device", "operation": "write", "action": "stop"}}
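
Checks like the two scenarios above could be scripted along the following lines. This is a minimal sketch that assumes the libvirtd debug line format seen in the excerpts ("buf=" followed by one or more JSON objects, possibly truncated at the end) and reads only the "reason" and "nospace" fields. The function names are hypothetical, and the sample line is an abbreviated copy of the Scenario 2 log entry.

```python
import json

MARKER = "buf="


def _skip_separators(buf, pos):
    """Skip whitespace and literal '^M' (a carriage return as shown by some viewers)."""
    while pos < len(buf):
        if buf[pos].isspace():
            pos += 1
        elif buf.startswith("^M", pos):
            pos += 2
        else:
            break
    return pos


def events_from_log_line(line):
    """Yield each complete JSON value that follows 'buf=' in a debug line."""
    start = line.find(MARKER)
    if start < 0:
        return
    buf = line[start + len(MARKER):]
    decoder = json.JSONDecoder()
    pos = 0
    while True:
        pos = _skip_separators(buf, pos)
        if pos >= len(buf):
            return
        try:
            obj, pos = decoder.raw_decode(buf, pos)
        except json.JSONDecodeError:
            return  # truncated or non-JSON tail, stop here
        yield obj


def io_error_reasons(line):
    """Return (device, reason, nospace) for each BLOCK_IO_ERROR event in the line."""
    found = []
    for event in events_from_log_line(line):
        if not isinstance(event, dict) or event.get("event") != "BLOCK_IO_ERROR":
            continue
        data = event.get("data")
        if isinstance(data, dict):
            found.append((data.get("device"), data.get("reason"), data.get("nospace")))
    return found


SAMPLE = (
    '2014-11-21 08:37:19.263+0000: 7613: debug : qemuMonitorIOProcess:399 : '
    'QEMU_MONITOR_IO_PROCESS: mon=0x7f4af00095b0 buf={"timestamp": {"seconds": '
    '1416559039, "microseconds": 262858}, "event": "BLOCK_IO_ERROR", "data": '
    '{"device": "drive-virtio-disk0", "nospace": true, "reason": '
    '"No space left on device", "operation": "write", "action": "stop"}}'
)

if __name__ == "__main__":
    print(io_error_reasons(SAMPLE))
```

Using `raw_decode` rather than `json.loads` lets the parser consume several back-to-back objects from one buffer and stop cleanly at a truncated tail.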

Comment 6 errata-xmlrpc 2015-03-05 07:41:15 UTC

Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHSA-2015-0323.html
