Bug 1117445 - QMP: extend block events with error information
Status: CLOSED ERRATA
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: qemu-kvm-rhev
Version: 7.1
Hardware: Unspecified Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: rc
Target Release: ---
Assigned To: Luiz Capitulino
QA Contact: Virtualization Bugs
Blocks: 1119784
Reported: 2014-07-08 13:36 EDT by Luiz Capitulino
Modified: 2015-03-05 04:48 EST
Fixed In Version: qemu-kvm-rhev-2.1.2-2.el7
Doc Type: Bug Fix
Last Closed: 2015-03-05 04:48:02 EST
Type: Bug
External Trackers:
Red Hat Product Errata RHSA-2015:0624 (priority: normal, status: SHIPPED_LIVE, last updated 2015-03-05 09:37:36 EST): Important: qemu-kvm-rhev security, bug fix, and enhancement update

Description Luiz Capitulino 2014-07-08 13:36:33 EDT
In RHEL6, we have extended the BLOCK_IO_ERROR event to contain the following fields:

o __com.redhat_reason: string representing an enum value (i.e. "eio", "eperm", "enospc" or "eother")
o __com.redhat_debug_info.errno: errno value as an integer
o __com.redhat_debug_info.message: error message returned by strerror()

Since then we have carried those extensions forward:

o RHEL6 (original request): bug 586349 and bug 624607
o RHEL7.0: bug 971938 and bug 895041
o RHEL7.1: bug 1116772

It's time to add this feature upstream, possibly for BLOCK_JOB_ERROR too. I'll add design ideas in the comments.
Comment 2 Luiz Capitulino 2014-07-08 13:52:53 EDT
Here are some design considerations when doing this for upstream:

o We may want the extension(s) in BLOCK_IO_ERROR and BLOCK_JOB_ERROR events
o query-block and query-block-jobs must contain this info too
o The errno integer should be dropped
o Having the error string from strerror() is probably fine
o For the "reason" field, we have two options: a QAPI enum containing the most common errnos; or, instead of having a "reason" at all, we could only distinguish between ENOSPC and all the other errors. Say, having "no-space-error" bool
o If doing the QAPI enum containing the most common errnos, then we should have a catch-all for unknown errnos (e.g. "unknown-error-code")
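
The two options for the "reason" field can be compared side by side. What follows is a minimal Python sketch, not QEMU code. The function names are hypothetical, the enum values reuse the downstream strings plus the suggested catch-all, and the key names "error-reason", "error-message" and "no-space-error" are only the ones floated in this comment, not a merged interface.

```python
import errno
import os

# Option 1: a small enum of common errnos with a catch-all.
# The values reuse the downstream strings; "unknown-error-code" is the
# catch-all name suggested in this comment (an assumption, not an API).
COMMON_REASONS = {
    errno.EIO: "eio",
    errno.EPERM: "eperm",
    errno.ENOSPC: "enospc",
}
CATCH_ALL = "unknown-error-code"


def reason_as_enum(err):
    """Option 1: enum-style reason plus the strerror() text."""
    return {
        "error-reason": COMMON_REASONS.get(err, CATCH_ALL),
        "error-message": os.strerror(err),
    }


def reason_as_bool(err):
    """Option 2: only distinguish ENOSPC from every other error."""
    return {
        "no-space-error": err == errno.ENOSPC,
        "error-message": os.strerror(err),
    }


if __name__ == "__main__":
    for err in (errno.EIO, errno.ENOSPC, errno.EINVAL):
        print(reason_as_enum(err), reason_as_bool(err))
```

Option 2 keeps the wire format smaller and avoids maintaining an enum, at the cost of clients having to parse the message text if they ever need to tell other errors apart.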

Here goes an example (taking the QAPI enum as solution for the "reason" field):

{ "event": "BLOCK_IO_ERROR",
    "data": { "device": "ide0-hd1",
              "operation": "write",
              "action": "stop",
              "error-reason": "eio",
              "error-message": "I/O error" },
    "timestamp": { "seconds": 1265044230, "microseconds": 450486 } }
Comment 3 Markus Armbruster 2014-07-16 10:22:05 EDT
Not relevant for RHEL-7, but might be wanted upstream anyway: make the information added to query-block and query-block-jobs visible in info block and info block-jobs.
Comment 4 Luiz Capitulino 2014-07-23 15:57:39 EDT
RFC series posted upstream:

http://lists.nongnu.org/archive/html/qemu-devel/2014-07/msg03235.html
Comment 5 Luiz Capitulino 2014-09-11 09:45:48 EDT
Posted v1 some time ago. It has already been applied in the block tree:

http://lists.nongnu.org/archive/html/qemu-devel/2014-08/msg05346.html
Comment 6 Luiz Capitulino 2014-10-09 09:17:03 EDT
This is actually about qemu-kvm-rhev, so fixing the component.
Comment 10 Miroslav Rezanina 2014-10-10 03:34:11 EDT
Fix included in qemu-kvm-rhev-2.1.2-2.el7
Comment 12 langfang 2014-10-28 04:06:00 EDT
Reproduced this bug with the following versions:
Host:
# uname -r
3.10.0-191.el7.x86_64
# rpm -q qemu-kvm-rhev
qemu-kvm-rhev-2.1.0-3.el7.x86_64

Steps:
1. Boot the guest with /dev/sdb (a USB storage device on the host).

2. Run the following dd operation inside the guest:
# dd if=/dev/urandom of=/dev/vda bs=1M

3. Unplug the USB storage device from the host.

Results:
QEMU:
...
block I/O error in device 'drive-virtio-disk1': Input/output error (5)
block I/O error in device 'drive-virtio-disk1': Input/output error (5)
block I/O error in device 'drive-virtio-disk1': Input/output error (5)
...

#telnet $ip 4444
...
{"timestamp": {"seconds": 1414478362, "microseconds": 665870}, "event": "BLOCK_IO_ERROR", "data": {"device": "drive-virtio-disk1", "__com.redhat_debug_info": {"message": "Input/output error", "errno": 5}, "__com.redhat_reason": "eio", "operation": "write", "action": "stop"}}
{"timestamp": {"seconds": 1414478362, "microseconds": 665908}, "event": "BLOCK_IO_ERROR", "data": {"device": "drive-virtio-disk1", "__com.redhat_debug_info": {"message": "Input/output error", "errno": 5}, "__com.redhat_reason": "eio", "operation": "write", "action": "stop"}}
....

{"execute":"query-block"}
...
{"io-status": "failed", "device": "drive-virtio-disk1", "locked": false, "removable": false, "inserted": {"iops_rd": 0, "detect_zeroes": "off", "image": {"virtual-size": 15513354240, "filename": "/dev/sdd", "format": "raw", "actual-size": 0, "dirty-flag": false}, "iops_wr": 0, "ro": false, "backing_file_depth": 0, "drv": "raw", "iops": 0, "bps_wr": 0, "encrypted": false, "bps": 0, "bps_rd": 0, "file": "/dev/sdd", "encryption_key_missing": false}, "type": "unknown"}...
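
The telnet session above can also be scripted. The sketch below is an illustration only: it assumes the monitor is a QMP TCP server on port 4444 (as the telnet command suggests) and that each QMP message arrives as one JSON object per line. The helper names are hypothetical. QMP expects a qmp_capabilities command after the greeting before it accepts other commands.

```python
import json
import socket


def read_message(rfile):
    """Read one line-delimited JSON message from the monitor."""
    line = rfile.readline()
    if not line:
        raise ConnectionError("QMP connection closed")
    return json.loads(line)


def run_command(rfile, wfile, name, arguments=None):
    """Send one command; return (reply, events seen while waiting)."""
    request = {"execute": name}
    if arguments is not None:
        request["arguments"] = arguments
    wfile.write(json.dumps(request).encode() + b"\r\n")
    wfile.flush()
    events = []
    while True:
        message = read_message(rfile)
        if "event" in message:
            # Asynchronous events such as BLOCK_IO_ERROR can arrive
            # between a command and its reply.
            events.append(message)
        else:
            return message, events


def query_block(host, port=4444):
    """Connect, negotiate capabilities and run query-block."""
    with socket.create_connection((host, port)) as sock:
        rfile = sock.makefile("rb")
        wfile = sock.makefile("wb")
        read_message(rfile)  # greeting: {"QMP": {...}}
        run_command(rfile, wfile, "qmp_capabilities")
        reply, events = run_command(rfile, wfile, "query-block")
        return reply.get("return"), events
```

Calling query_block with the host address used for telnet would return the query-block result together with any events that arrived in the meantime.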


Tested on the latest version:

Version:
# uname -r
3.10.0-191.el7.x86_64
# rpm -q qemu-kvm-rhev
qemu-kvm-rhev-2.1.2-5.el7.x86_64

Steps: same as in the reproduction above.

Results:
QEMU:
...
block I/O error in device 'drive-virtio-disk1': Input/output error (5)
block I/O error in device 'drive-virtio-disk1': Input/output error (5)
block I/O error in device 'drive-virtio-disk1': Input/output error (5)
block I/O error in device 'drive-virtio-disk1': Input/output error (5)
block I/O error in device 'drive-virtio-disk1': Input/output error (5)
...
#telnet $IP 4444
....
{"timestamp": {"seconds": 1414479655, "microseconds": 267117}, "event": "BLOCK_IO_ERROR", "data": {"device": "drive-virtio-disk1", "__com.redhat_debug_info": {"message": "Input/output error", "errno": 5}, "nospace": false, "__com.redhat_reason": "eio", "reason": "Input/output error", "operation": "write", "action": "stop"}}
{"timestamp": {"seconds": 1414479655, "microseconds": 267165}, "event": "BLOCK_IO_ERROR", "data": {"device": "drive-virtio-disk1", "__com.redhat_debug_info": {"message": "Input/output error", "errno": 5}, "nospace": false, "__com.redhat_reason": "eio", "reason": "Input/output error", "operation": "write", "action": "stop"}}
{"timestamp": {"seconds": 1414479655, "microseconds": 267206}, "event": "BLOCK_IO_ERROR", "data": {"device": "drive-virtio-disk1", "__com.redhat_debug_info": {"message": "Input/output error", "errno": 5}, "nospace": false, "__com.redhat_reason": "eio", "reason": "Input/output error", "operation": "write", "action": "stop"}}
....

{"execute":"query-block"}
..
 {"io-status": "failed", "device": "drive-virtio-disk1", "locked": false, "removable": false, "inserted": {"iops_rd": 0, "detect_zeroes": "off", "image": {"virtual-size": 15513354240, "filename": "/dev/sdd", "format": "raw", "actual-size": 0, "dirty-flag": false}, "iops_wr": 0, "ro": false, "backing_file_depth": 0, "drv": "raw", "iops": 0, "bps_wr": 0, "encrypted": false, "bps": 0, "bps_rd": 0, "file": "/dev/sdd", "encryption_key_missing": false}, "type": "unknown"}...


Checking the results against comment 2:
o We may want the extension(s) in BLOCK_IO_ERROR and BLOCK_JOB_ERROR events

o query-block and query-block-jobs must contain this info too ---> {"execute":"query-block"} returns the same info on the unfixed and fixed versions ---> not fixed

o The errno integer should be dropped ---> the errno integer is not dropped; the event still shows "errno": 5 ---> not fixed

o Having the error string from strerror() is probably fine ---> the event shows "reason": "Input/output error" ---> fixed

o For the "reason" field, we have two options: a QAPI enum containing the most common errnos; or, instead of having a "reason" at all, we could only distinguish between ENOSPC and all the other errors. Say, having "no-space-error" bool ---> the event shows "nospace": false ---> fixed
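
The checklist above can be turned into a small check against a captured event. This is only a sketch: the function name and result keys are hypothetical, and the sample is the first BLOCK_IO_ERROR event captured on qemu-kvm-rhev-2.1.2-5.el7 above.

```python
# First BLOCK_IO_ERROR event from the fixed build, copied from the log above.
SAMPLE_EVENT = {
    "timestamp": {"seconds": 1414479655, "microseconds": 267117},
    "event": "BLOCK_IO_ERROR",
    "data": {
        "device": "drive-virtio-disk1",
        "__com.redhat_debug_info": {"message": "Input/output error", "errno": 5},
        "nospace": False,
        "__com.redhat_reason": "eio",
        "reason": "Input/output error",
        "operation": "write",
        "action": "stop",
    },
}


def check_block_io_error(event):
    """Report which of the comment 2 expectations a BLOCK_IO_ERROR event meets."""
    data = event["data"]
    debug = data.get("__com.redhat_debug_info", {})
    return {
        "reason_is_string": isinstance(data.get("reason"), str),
        "nospace_is_bool": isinstance(data.get("nospace"), bool),
        "errno_dropped": "errno" not in debug,
    }


if __name__ == "__main__":
    for name, ok in check_block_io_error(SAMPLE_EVENT).items():
        print(name, "ok" if ok else "not met")
```

For the sample event, the reason and nospace checks pass and the errno check does not, which matches the manual result listed above.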



Additional info:

1) Tried to trigger a "No space left on device" error

Results:
(qemu) info status 
VM status: paused (prelaunch)

#telnet $ip 4444
..
{"timestamp": {"seconds": 1414481471, "microseconds": 854961}, "event": "BLOCK_IO_ERROR", "data": {"device": "drive-virtio-disk", "__com.redhat_debug_info": {"message": "No space left on device", "errno": 28}, "nospace": true, "__com.redhat_reason": "enospc", "reason": "No space left on device", "operation": "write", "action": "stop"}}

2) Tried to trigger a block job error

Steps:
1. Boot the guest with a USB device:
 ...-drive file=/dev/sdd,if=none,id=drive-virtio-disk1,format=raw,cache=none,werror=stop,rerror=stop,aio=native -device virtio-blk-pci,scsi=on,bus=pci.0,addr=0x6,drive=drive-virtio-disk1,id=virtio-disk1,bootindex=2..

2. Run blockdev-snapshot-sync over QMP:
#telnet $ip 4444
{ "execute": "blockdev-snapshot-sync", "arguments": { "device": "drive-virtio-disk1","snapshot-file":"/root/sn1","format": "qcow2" } }


3. Start block-stream on the USB device and hot-unplug the device while the job is running:
{ "execute": "block-stream", "arguments": { "device": "drive-virtio-disk1"}}


Results:
...
{"timestamp": {"seconds": 1414482202, "microseconds": 191604}, "event": "BLOCK_JOB_ERROR", "data": {"device": "drive-virtio-disk1", "operation": "read", "action": "report"}} ---> seems to be missing some info (e.g. reason, debug info)
..
{"timestamp": {"seconds": 1414482202, "microseconds": 191735}, "event": "BLOCK_JOB_COMPLETED", "data": {"device": "drive-virtio-disk1", "len": 15513354240, "offset": 134217728, "speed": 0, "type": "stream", "error": "Input/output error"}}
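
In this log the BLOCK_JOB_ERROR event carries no error text, but the BLOCK_JOB_COMPLETED event that follows for the same device does. A client could pair the two, as in the sketch below. The helper is hypothetical, and it assumes that the completion event for a device arrives after that device's error events, as it does here.

```python
def job_errors_with_text(events):
    """Pair each BLOCK_JOB_ERROR with the error text of the next
    BLOCK_JOB_COMPLETED event for the same device."""
    pending = {}  # device -> BLOCK_JOB_ERROR payloads awaiting completion
    results = []
    for event in events:
        data = event.get("data", {})
        device = data.get("device")
        if event.get("event") == "BLOCK_JOB_ERROR":
            pending.setdefault(device, []).append(data)
        elif event.get("event") == "BLOCK_JOB_COMPLETED":
            # "error" is only present when the job failed.
            text = data.get("error")
            for failed in pending.pop(device, []):
                results.append({
                    "device": device,
                    "operation": failed.get("operation"),
                    "action": failed.get("action"),
                    "error": text,
                })
    return results
```

Applied to the two events above, this yields one entry for drive-virtio-disk1 with operation "read", action "report" and the error text "Input/output error".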

Hi Luiz,

   Please take a look at the tests above; according to comment 2, this does not seem to be fully fixed. Thanks.

Best regards,
fang lang
Comment 13 Luiz Capitulino 2014-10-29 12:57:23 EDT
The design changed on upstream. We ended up merging a simpler implementation which adds keys "nospace" and "reason" only to the BLOCK_IO_ERROR event.
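
For a client that has to handle both older and newer builds, one possible approach is to prefer the upstream "reason" and "nospace" keys and fall back to the downstream __com.redhat_* keys when they are absent. A sketch under that assumption (the function name and the shape of the returned dict are hypothetical):

```python
def error_details(event):
    """Extract error text and the no-space flag from a BLOCK_IO_ERROR
    event, preferring upstream keys over the downstream extensions."""
    data = event.get("data", {})
    debug = data.get("__com.redhat_debug_info", {})

    if "reason" in data:
        message = data["reason"]          # upstream: human-readable text
    else:
        message = debug.get("message")    # downstream strerror() text

    if "nospace" in data:
        nospace = data["nospace"]         # upstream boolean
    elif "__com.redhat_reason" in data:
        nospace = data["__com.redhat_reason"] == "enospc"
    else:
        nospace = None                    # no information available

    return {"device": data.get("device"), "message": message, "nospace": nospace}
```

Events that carry neither set of keys, such as BLOCK_JOB_ERROR, come back with message and nospace set to None.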
Comment 14 langfang 2014-10-29 21:27:47 EDT
(In reply to Luiz Capitulino from comment #13)
> The design changed on upstream. We ended up merging a simpler implementation
> which adds keys "nospace" and "reason" only to the BLOCK_IO_ERROR event.


Results:

1) "BLOCK_IO_ERROR"
..
{"timestamp": {"seconds": 1414479655, "microseconds": 267117}, "event": "BLOCK_IO_ERROR", "data": {"device": "drive-virtio-disk1", "__com.redhat_debug_info": {"message": "Input/output error", "errno": 5}, "nospace": false, "__com.redhat_reason": "eio", "reason": "Input/output error", "operation": "write", "action": "stop"}}
...

{"timestamp": {"seconds": 1414481471, "microseconds": 854961}, "event": "BLOCK_IO_ERROR", "data": {"device": "drive-virtio-disk", "__com.redhat_debug_info": {"message": "No space left on device", "errno": 28}, "nospace": true, "__com.redhat_reason": "enospc", "reason": "No space left on device", "operation": "write", "action": "stop"}}

2) "BLOCK_JOB_ERROR"
...
{"timestamp": {"seconds": 1414482202, "microseconds": 191604}, "event": "BLOCK_JOB_ERROR", "data": {"device": "drive-virtio-disk1", "operation": "read", "action": "report"}}
...

3)"query-block"
....
{"io-status": "failed", "device": "drive-virtio-disk1", "locked": false, "removable": false, "inserted": {"iops_rd": 0, "detect_zeroes": "off", "image": {"virtual-size": 15513354240, "filename": "/dev/sdd", "format": "raw", "actual-size": 0, "dirty-flag": false}, "iops_wr": 0, "ro": false, "backing_file_depth": 0, "drv": "raw", "iops": 0, "bps_wr": 0, "encrypted": false, "bps": 0, "bps_rd": 0, "file": "/dev/sdd", "encryption_key_missing": false}, "type": "unknown"} ---> same info on the unfixed and fixed versions
...


Hi Luiz,
   Thanks for your review. Given what you said, are these the expected results for this bug? Is there any plan to fix the others in the future (e.g. "BLOCK_JOB_ERROR" info)? Thanks.
Comment 15 Luiz Capitulino 2014-10-30 10:19:23 EDT
>    thanks for your review,as you said ,are the expect results for this bug?

Yes.

> Is there any plan to  fix others in the feature( EG:"BLOCK_JOB_ERROR" info)?

Not at this moment.
Comment 16 langfang 2014-10-30 23:30:14 EDT
(In reply to Luiz Capitulino from comment #15)
> >    thanks for your review,as you said ,are the expect results for this bug?
> 
> Yes.
> 
> > Is there any plan to  fix others in the feature( EG:"BLOCK_JOB_ERROR" info)?
> 
> Not at this moment.

As per comment 15, we can verify this bug. Thanks.
Comment 19 errata-xmlrpc 2015-03-05 04:48:02 EST
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHSA-2015-0624.html
