Bug 1266430 - regression since 2.2.0: qemu did not raise error number when the underlying storage unexpectedly removed
Status: CLOSED NOTABUG
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: qemu-kvm-rhev
Version: 7.2
Hardware: Unspecified All
Priority: unspecified Severity: high
Target Milestone: rc
Target Release: ---
Assigned To: Markus Armbruster
QA Contact: Virtualization Bugs
Keywords: Regression
Depends On:
Blocks:
Reported: 2015-09-25 05:19 EDT by Xiaoqing Wei
Modified: 2016-03-28 05:35 EDT

See Also:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2015-10-02 07:30:27 EDT
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


Attachments: None
Description Xiaoqing Wei 2015-09-25 05:19:27 EDT
Description of problem:

Regression since 2.2.0: qemu does not report the error number when the underlying storage is unexpectedly removed.

Version-Release number of selected component (if applicable):
qemu-kvm-rhev-2.3.0-24.el7.x86_64

How reproducible:
100%

Steps to Reproduce:
1. Plug in a USB storage device (a 1 TB disk in my test); the host discovers it as /dev/sdb.

Boot a VM with it as the storage back end:

    -drive id=drive_image2,if=none,snapshot=off,aio=native,format=raw,file='/dev/sdb',cache=none,aio=native,werror=stop,rerror=stop \
    -device virtio-blk-pci,drive=drive_image2,id=scsi-usb \
Then initialize QMP:
{ "execute": "qmp_capabilities" }
2. Log in to the guest and find the corresponding device path: /dev/vda in this virtio-blk test,
and /dev/sda in the scsi/usb test.

3. In the guest, run: dd if=/dev/zero of=/dev/vda bs=1M

4. Unplug the USB storage device on the host.
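
For anyone re-running these steps, the QMP side can be watched with a small script instead of an interactive session. The following is a minimal sketch, not part of the original reproducer: it assumes qemu was also started with a QMP UNIX socket (for example -qmp unix:/tmp/qmp.sock,server,nowait; the path is hypothetical), and the helper names qmp_events and watch are made up for illustration.

```python
import json
import socket


def qmp_events(lines, wanted="BLOCK_IO_ERROR"):
    """Yield QMP events named `wanted` from an iterable of JSON lines.

    Greetings, command replies, and other events are skipped.
    """
    for line in lines:
        line = line.strip()
        if not line:
            continue
        msg = json.loads(line)
        if msg.get("event") == wanted:
            yield msg


def watch(path):
    """Connect to a QMP UNIX socket and print block I/O error events.

    `path` is an assumption: the socket given to qemu's -qmp option.
    """
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as sock:
        sock.connect(path)
        stream = sock.makefile("rw", encoding="utf-8", newline="\n")
        json.loads(stream.readline())  # greeting: {"QMP": {...}}
        # Same command as in step 1; its {"return": {}} reply is
        # skipped by qmp_events() because it carries no "event" key.
        stream.write(json.dumps({"execute": "qmp_capabilities"}) + "\n")
        stream.flush()
        for event in qmp_events(stream):
            data = event["data"]
            print(data["device"], data["operation"], data.get("reason"))
```

Calling watch("/tmp/qmp.sock") before step 4 should print one line per BLOCK_IO_ERROR event once the device is unplugged.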

Actual results:
No error message is printed in the monitor (HMP), and the QMP event carries no error number.
(qemu) info status 
VM status: running
(qemu) info block
drive_image1: /home/w39.qcow2 (qcow2)
    Cache mode:       writeback, direct
    Backing file:     /home/staf-kvm-devel/autotest-devel/client/tests/virt/shared/data/images/RHEL-Server-7.2-64-virtio-scsi.qcow2 (chain depth: 1)

drive_image2: /dev/sdb (raw)
    Cache mode:       writeback, direct
(qemu) info status 
VM status: paused (io-error)
(qemu) info block
drive_image1: /home/w39.qcow2 (qcow2)
    Cache mode:       writeback, direct
    Backing file:     /home/staf-kvm-devel/autotest-devel/client/tests/virt/shared/data/images/RHEL-Server-7.2-64-virtio-scsi.qcow2 (chain depth: 1)

drive_image2: /dev/sdb (raw)
    I/O status:       failed
    Cache mode:       writeback, direct
(qemu) q
[root@dhcp-11-50 staf-kvm-devel]# rpm -q qemu-kvm-rhev
qemu-kvm-rhev-2.3.0-24.el7.x86_64


QMP does report an error event, but without the error code ("errno": 5) shown under Expected results below:

{"timestamp": {"seconds": 1443172550, "microseconds": 199442}, "event": "BLOCK_IO_ERROR", "data": {"device": "drive_image2", "nospace": false, "reason": "Input/output error", "operation": "write", "action": "stop"}}





Expected results:
Both HMP and QMP should report the error.
hmp:
(qemu) 
(qemu) block I/O error in device 'drive_image2': Input/output error (5)
block I/O error in device 'drive_image2': Input/output error (5)
block I/O error in device 'drive_image2': Input/output error (5)
block I/O error in device 'drive_image2': Input/output error (5)
block I/O error in device 'drive_image2': Input/output error (5)
.........
block I/O error in device 'drive_image2': Input/output error (5)



info status 
VM status: paused (io-error)
(qemu) info block
drive_image1: /home/w39.qcow2 (qcow2)
    Backing file:     /home/staf-kvm-devel/autotest-devel/client/tests/virt/shared/data/images/RHEL-Server-7.2-64-virtio-scsi.qcow2 (chain depth: 1)

drive_image2: /dev/sdb (raw)
    I/O status:       failed
(qemu) info version 
2.1.2 (qemu-kvm-rhev-2.1.2-23.el7)


qmp:
{"timestamp": {"seconds": 1443172147, "microseconds": 731167}, "event": "BLOCK_IO_ERROR", "data": {"device": "drive_image2", "__com.redhat_debug_info": {"message": "Input/output error", "errno": 5}, "nospace": false, "__com.redhat_reason": "eio", "reason": "Input/output error", "operation": "write", "action": "stop"}}
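
The difference between the two QMP events quoted in this report can be checked mechanically. This is a small sketch that only compares the "data" members of the two events pasted above; the names ACTUAL_2_3_0, EXPECTED_2_1_2 and missing_keys are made up for illustration.

```python
import json

# Event seen with qemu-kvm-rhev-2.3.0-24.el7 (copied from "Actual results").
ACTUAL_2_3_0 = json.loads(
    '{"timestamp": {"seconds": 1443172550, "microseconds": 199442}, '
    '"event": "BLOCK_IO_ERROR", "data": {"device": "drive_image2", '
    '"nospace": false, "reason": "Input/output error", '
    '"operation": "write", "action": "stop"}}'
)

# Event seen with qemu-kvm-rhev-2.1.2-23.el7 (copied from "Expected results").
EXPECTED_2_1_2 = json.loads(
    '{"timestamp": {"seconds": 1443172147, "microseconds": 731167}, '
    '"event": "BLOCK_IO_ERROR", "data": {"device": "drive_image2", '
    '"__com.redhat_debug_info": {"message": "Input/output error", '
    '"errno": 5}, "nospace": false, "__com.redhat_reason": "eio", '
    '"reason": "Input/output error", "operation": "write", '
    '"action": "stop"}}'
)


def missing_keys(expected, actual):
    """Return the "data" keys present in `expected` but not in `actual`."""
    return sorted(set(expected["data"]) - set(actual["data"]))


print(missing_keys(EXPECTED_2_1_2, ACTUAL_2_3_0))
```

For the two events as pasted here, the keys reported as missing are "__com.redhat_debug_info" and "__com.redhat_reason"; all other "data" members are present in both.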



Additional info:
The latest working version from brew is
qemu-kvm-rhev-2.1.2-23.el7

So this is a regression.
Comment 3 Markus Armbruster 2015-09-28 09:53:51 EDT
Quick question on the reproducer: does step 3's dd complete before you
unplug (step 4)?

On actual and expected results: please provide output of dd in the
guest, and the relevant tail of the guest's dmesg.
Comment 4 Xiaoqing Wei 2015-09-28 23:04:55 EDT
(In reply to Markus Armbruster from comment #3)
> Quick question on the reproducer: does step 3's dd complete before you
> unplug (step 4)?
> 

No. The point of the test is to check that the VM stops (it did) and that an error about the storage is raised.


> On actual and expected results: please provide output of dd in the
> guest, and the relevant tail of the guest's dmesg.

No output; the VM had already paused.
Comment 5 Markus Armbruster 2015-09-29 02:03:42 EDT
I'm confused.  Under "Actual results" you wrote the guest isn't paused, but now you tell me it is.  Please advise.
Comment 6 Markus Armbruster 2015-09-29 12:27:22 EDT
My attempts to reproduce this failed.  When I unplug my memory stick, the guest is paused due to I/O error as expected.
Comment 7 Xiaoqing Wei 2015-09-30 01:58:40 EDT
(In reply to Markus Armbruster from comment #5)
> I'm confused.  Under "Actual results" you wrote the guest isn't paused, but
> now you tell me it is.  Please advise.

My fault. I wanted to show that the guest was running without problems before I unplugged the USB storage, so I included both the running and the paused status. I have separated them below.
------------------------------------- guest running well, usb storage plugged as /dev/sdb on host

VM status: running
(qemu) info block
drive_image1: /home/w39.qcow2 (qcow2)
    Cache mode:       writeback, direct
    Backing file:     /home/staf-kvm-devel/autotest-devel/client/tests/virt/shared/data/images/RHEL-Server-7.2-64-virtio-scsi.qcow2 (chain depth: 1)

drive_image2: /dev/sdb (raw)
    Cache mode:       writeback, direct



====================================== guest paused, nothing output from qemu monitor


(qemu) info status 
VM status: paused (io-error)
(qemu) info block
drive_image1: /home/w39.qcow2 (qcow2)
    Cache mode:       writeback, direct
    Backing file:     /home/staf-kvm-devel/autotest-devel/client/tests/virt/shared/data/images/RHEL-Server-7.2-64-virtio-scsi.qcow2 (chain depth: 1)

drive_image2: /dev/sdb (raw)
    I/O status:       failed
    Cache mode:       writeback, direct
(qemu) q
Comment 8 Markus Armbruster 2015-09-30 11:38:02 EDT
All right, let's pick apart actual and expected results.

We have:

1. VM status, as reported by "info status"

2. I/O status of block backend drive_image2, as reported by "info block"

3. QMP event BLOCK_IO_ERROR

4. Additional messages to the (HMP) monitor and/or stderr

Actual results, please correct misunderstandings:

1. VM status: paused (io-error)

2. I/O status:       failed

3. QMP event {"timestamp": {"seconds": 1443172550, "microseconds": 199442}, "event": "BLOCK_IO_ERROR", "data": {"device": "drive_image2", "nospace": false, "reason": "Input/output error", "operation": "write", "action": "stop"}}

4. Nothing

Expected results, please correct misunderstandings:

1. Like actual result

2. Like actual result

3. Actual result plus "__com.redhat_debug_info": {"message": "Input/output error", "errno": 5} in the value of "data"

4. A bunch of "block I/O error in device 'drive_image2': Input/output error (5)"

Is this correct?

Anything missing?
Comment 9 Ademar Reis 2015-10-01 14:24:06 EDT
(In reply to Markus Armbruster from comment #8)
(bunch of <snips>)
> We have:
> 
> 3. QMP event BLOCK_IO_ERROR
> 
> Actual results:
> 
> 3. QMP event {"timestamp": {"seconds": 1443172550, "microseconds": 199442},
> "event": "BLOCK_IO_ERROR", "data": {"device": "drive_image2", "nospace":
> false, "reason": "Input/output error", "operation": "write", "action":
> "stop"}}
> 
> Expected results, please correct misunderstandings:
> 
> 3. Actual result plus "__com.redhat_debug_info": {"message": "Input/output
> error", "errno": 5} in the value of "data"
> 

That's my understanding as well. The lack of the error code in the QMP message is the actual regression that affects libvirt (or its users).
Comment 10 Markus Armbruster 2015-10-02 07:30:27 EDT
Both QMP key "__com.redhat_debug_info" (item 3) and the "block I/O
error in device" messages (item 4) are RHEL-only extensions.

These originally come from RHEL-6 commit 1a2b98f:

    Based on a suggestion from Kevin Wolf, this commit adds a new
    json-object member to the event, containing the errno value and an
    error string from strerror(). Additionally, an error message is
    printed to stderr.
    
    All this new information is meant for human debugging only and should
    not be used by libvirt (as stated in the event's documentation entry).

They extend the prior RHEL extension QMP key "__com.redhat_reason"
from commit a635efd.

We forward-ported them both to RHEL-7 qemu-kvm, in commit 771a3a3 and
commit bfea65d.

Upstream finally added QMP key "reason" in commit 624ff57, v2.2.
qemu-kvm-rhev got it via rebase, obviating our "__com.redhat_reason".
We want to keep "__com.redhat_reason" around for now anyway, to avoid
undue conflicts with libvirt versions that still use it.  We therefore
forward-ported it in commit 44f3e45.

We decided not to forward the other commit, because we feel the
debugging aids it provides are no longer necessary:

* "__com.redhat_debug_info" provides two values "message" and "errno".
  The former is identical to "reason".  The latter can be trivially
  derived from "reason".

* Spewing messages to stderr has always been a poor substitute for
  proper logging.  Libvirt's logging should be good enough now.
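
As an illustration of the first point, here is a minimal sketch of deriving the number from "reason". It assumes the consumer runs against a C library and locale that produce the same strerror() text as the qemu host; errno_from_reason is a made-up helper name, not an existing API.

```python
import errno
import os


def errno_from_reason(reason):
    """Return the errno whose strerror() text equals `reason`, or None.

    Assumption: the local C library and locale yield the same message
    strings as the host that generated the QMP event.
    """
    for code in sorted(errno.errorcode):
        if os.strerror(code) == reason:
            return code
    return None


# "reason" value taken from the BLOCK_IO_ERROR event in this report.
print(errno_from_reason("Input/output error"))
```

On a typical Linux host with an English locale, the lookup for "Input/output error" is expected to match EIO; an unknown or translated message yields None, so callers should handle that case.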

See the review thread
http://post-office.corp.redhat.com/archives/rhvirt-patches/2015-May/msg00008.html

I'm therefore closing this NOTABUG.  If you think we need to
forward-port these debugging aids, please reopen the bug and explain
why they are still needed.
