Bug 1266430 - regression since 2.2.0: qemu did not raise error number when the underlying storage unexpectedly removed
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: qemu-kvm-rhev
Version: 7.2
Hardware: Unspecified
OS: All
Priority: unspecified
Severity: high
Target Milestone: rc
Target Release: ---
Assignee: Markus Armbruster
QA Contact: Virtualization Bugs
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2015-09-25 09:19 UTC by Xiaoqing Wei
Modified: 2016-03-28 09:35 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2015-10-02 11:30:27 UTC
Target Upstream Version:
Embargoed:



Description Xiaoqing Wei 2015-09-25 09:19:27 UTC
Description of problem:

regression since 2.2.0: qemu did not raise error number when the underlying storage unexpectedly removed

Version-Release number of selected component (if applicable):
qemu-kvm-rhev-2.3.0-24.el7.x86_64

How reproducible:
100%

Steps to Reproduce:
1. Plug in a USB storage device (a 1 TB disk in my test); the host discovered it as /dev/sdb.

Boot a VM with it as the storage back end:

    -drive id=drive_image2,if=none,snapshot=off,aio=native,format=raw,file='/dev/sdb',cache=none,aio=native,werror=stop,rerror=stop \
    -device virtio-blk-pci,drive=drive_image2,id=scsi-usb \
and initialize QMP:
{ "execute": "qmp_capabilities" }
2. Log in to the guest and find its corresponding device path: /dev/vda in this virtio-blk test,
and /dev/sda in the scsi/usb test.

3. in guest: dd if=/dev/zero of=/dev/vda bs=1M

4. unplug the usb-storage on host
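
To watch the QMP side of the steps above, a minimal Python sketch could look like the following. It assumes QMP is exposed on a stream that carries one JSON object per line. The function name wait_for_block_io_error is hypothetical and is not part of the original report.

```python
import json


def wait_for_block_io_error(reader, writer):
    """Negotiate QMP capabilities, then return the first BLOCK_IO_ERROR event.

    reader and writer are text-mode file objects (assumption: one JSON
    message per line). Returns None if the stream ends without the event.
    """
    # The server speaks first with a greeting of the form {"QMP": {...}}.
    json.loads(reader.readline())

    # Leave capabilities negotiation mode, as in step 1 of the reproducer.
    writer.write(json.dumps({"execute": "qmp_capabilities"}) + "\n")
    writer.flush()

    for line in reader:
        if not line.strip():
            continue
        message = json.loads(line)
        # Command replies carry "return" or "error"; events carry "event".
        if message.get("event") == "BLOCK_IO_ERROR":
            return message
    return None
```

With a UNIX socket this could be driven as f = sock.makefile("rw") followed by wait_for_block_io_error(f, f). The qemu command line above does not show how QMP was exposed, so the transport is an assumption.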

Actual results:
no error raised in monitor, either hmp or qmp
(qemu) info status 
VM status: running
(qemu) info block
drive_image1: /home/w39.qcow2 (qcow2)
    Cache mode:       writeback, direct
    Backing file:     /home/staf-kvm-devel/autotest-devel/client/tests/virt/shared/data/images/RHEL-Server-7.2-64-virtio-scsi.qcow2 (chain depth: 1)

drive_image2: /dev/sdb (raw)
    Cache mode:       writeback, direct
(qemu) info status 
VM status: paused (io-error)
(qemu) info block
drive_image1: /home/w39.qcow2 (qcow2)
    Cache mode:       writeback, direct
    Backing file:     /home/staf-kvm-devel/autotest-devel/client/tests/virt/shared/data/images/RHEL-Server-7.2-64-virtio-scsi.qcow2 (chain depth: 1)

drive_image2: /dev/sdb (raw)
    I/O status:       failed
    Cache mode:       writeback, direct
(qemu) q
[root@dhcp-11-50 staf-kvm-devel]# rpm -q qemu-kvm-rhev
qemu-kvm-rhev-2.3.0-24.el7.x86_64


QMP does get an error event, but without the error code ("errno": 5; see expected results below):

{"timestamp": {"seconds": 1443172550, "microseconds": 199442}, "event": "BLOCK_IO_ERROR", "data": {"device": "drive_image2", "nospace": false, "reason": "Input/output error", "operation": "write", "action": "stop"}}





Expected results:
both hmp and qmp should raise error,
hmp
(qemu) 
(qemu) block I/O error in device 'drive_image2': Input/output error (5)
block I/O error in device 'drive_image2': Input/output error (5)
block I/O error in device 'drive_image2': Input/output error (5)
block I/O error in device 'drive_image2': Input/output error (5)
block I/O error in device 'drive_image2': Input/output error (5)
.........
block I/O error in device 'drive_image2': Input/output error (5)



info status 
VM status: paused (io-error)
(qemu) info block
drive_image1: /home/w39.qcow2 (qcow2)
    Backing file:     /home/staf-kvm-devel/autotest-devel/client/tests/virt/shared/data/images/RHEL-Server-7.2-64-virtio-scsi.qcow2 (chain depth: 1)

drive_image2: /dev/sdb (raw)
    I/O status:       failed
(qemu) info version 
2.1.2 (qemu-kvm-rhev-2.1.2-23.el7)


qmp:
{"timestamp": {"seconds": 1443172147, "microseconds": 731167}, "event": "BLOCK_IO_ERROR", "data": {"device": "drive_image2", "__com.redhat_debug_info": {"message": "Input/output error", "errno": 5}, "nospace": false, "__com.redhat_reason": "eio", "reason": "Input/output error", "operation": "write", "action": "stop"}}



Additional info:
latest working version from brew is
qemu-kvm-rhev-2.1.2-23.el7

so, this is a regression

Comment 3 Markus Armbruster 2015-09-28 13:53:51 UTC
Quick question on the reproducer: does step 3's dd complete before you
unplug (step 4)?

On actual and expected results: please provide output of dd in the
guest, and the relevant tail of the guest's dmesg.

Comment 4 Xiaoqing Wei 2015-09-29 03:04:55 UTC
(In reply to Markus Armbruster from comment #3)
> Quick question on the reproducer: does step 3's dd complete before you
> unplug (step 4)?
> 

No, the point of the test is to check that the VM stops (it did) and raises an error about the storage.


> On actual and expected results: please provide output of dd in the
> guest, and the relevant tail of the guest's dmesg.

No output; the VM had already paused.

Comment 5 Markus Armbruster 2015-09-29 06:03:42 UTC
I'm confused.  Under "Actual results" you wrote the guest isn't paused, but now you tell me it is.  Please advise.

Comment 6 Markus Armbruster 2015-09-29 16:27:22 UTC
My attempts to reproduce this failed.  When I unplug my memory stick, the guest is paused due to I/O error as expected.

Comment 7 Xiaoqing Wei 2015-09-30 05:58:40 UTC
(In reply to Markus Armbruster from comment #5)
> I'm confused.  Under "Actual results" you wrote the guest isn't paused, but
> now you tell me it is.  Please advise.

My fault. I wanted to show that the guest was running without problems before I unplugged the USB storage, so I included both the running and the paused status. I have now separated them below:
------------------------------------- guest running well, usb storage plugged as /dev/sdb on host

VM status: running
(qemu) info block
drive_image1: /home/w39.qcow2 (qcow2)
    Cache mode:       writeback, direct
    Backing file:     /home/staf-kvm-devel/autotest-devel/client/tests/virt/shared/data/images/RHEL-Server-7.2-64-virtio-scsi.qcow2 (chain depth: 1)

drive_image2: /dev/sdb (raw)
    Cache mode:       writeback, direct



====================================== guest paused, nothing output from qemu monitor


(qemu) info status 
VM status: paused (io-error)
(qemu) info block
drive_image1: /home/w39.qcow2 (qcow2)
    Cache mode:       writeback, direct
    Backing file:     /home/staf-kvm-devel/autotest-devel/client/tests/virt/shared/data/images/RHEL-Server-7.2-64-virtio-scsi.qcow2 (chain depth: 1)

drive_image2: /dev/sdb (raw)
    I/O status:       failed
    Cache mode:       writeback, direct
(qemu) q

Comment 8 Markus Armbruster 2015-09-30 15:38:02 UTC
All right, let's pick apart actual and expected results.

We have:

1. VM status, as reported by "info status"

2. I/O status of block backend drive_image2, as reported by "info block"

3. QMP event BLOCK_IO_ERROR

4. Additional messages to the (HMP) monitor and/or stderr

Actual results, please correct misunderstandings:

1. VM status: paused (io-error)

2. I/O status:       failed

3. QMP event {"timestamp": {"seconds": 1443172550, "microseconds": 199442}, "event": "BLOCK_IO_ERROR", "data": {"device": "drive_image2", "nospace": false, "reason": "Input/output error", "operation": "write", "action": "stop"}}

4. Nothing

Expected results, please correct misunderstandings:

1. Like actual result

2. Like actual result

3. Actual result plus "__com.redhat_debug_info": {"message": "Input/output error", "errno": 5} in the value of "data"

4. A bunch of "block I/O error in device 'drive_image2': Input/output error (5)"

Is this correct?

Anything missing?
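
The comparison in item 3 can be checked mechanically. The sketch below diffs the "data" members of the two BLOCK_IO_ERROR events exactly as quoted in the description. The helper name missing_data_keys is hypothetical.

```python
import json

# Event quoted under "Actual results" (qemu-kvm-rhev-2.3.0-24.el7).
ACTUAL = '{"timestamp": {"seconds": 1443172550, "microseconds": 199442}, "event": "BLOCK_IO_ERROR", "data": {"device": "drive_image2", "nospace": false, "reason": "Input/output error", "operation": "write", "action": "stop"}}'

# Event quoted under "Expected results" (qemu-kvm-rhev-2.1.2-23.el7).
EXPECTED = '{"timestamp": {"seconds": 1443172147, "microseconds": 731167}, "event": "BLOCK_IO_ERROR", "data": {"device": "drive_image2", "__com.redhat_debug_info": {"message": "Input/output error", "errno": 5}, "nospace": false, "__com.redhat_reason": "eio", "reason": "Input/output error", "operation": "write", "action": "stop"}}'


def missing_data_keys(actual_event, expected_event):
    """Return the sorted keys present in expected "data" but absent from actual "data"."""
    actual = json.loads(actual_event)["data"]
    expected = json.loads(expected_event)["data"]
    return sorted(set(expected) - set(actual))


print(missing_data_keys(ACTUAL, EXPECTED))
# prints ['__com.redhat_debug_info', '__com.redhat_reason']
```

As quoted, the 2.3.0 event lacks "__com.redhat_reason" in addition to "__com.redhat_debug_info".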

Comment 9 Ademar Reis 2015-10-01 18:24:06 UTC
(In reply to Markus Armbruster from comment #8)
(bunch of <snips>)
> We have:
> 
> 3. QMP event BLOCK_IO_ERROR
> 
> Actual results:
> 
> 3. QMP event {"timestamp": {"seconds": 1443172550, "microseconds": 199442},
> "event": "BLOCK_IO_ERROR", "data": {"device": "drive_image2", "nospace":
> false, "reason": "Input/output error", "operation": "write", "action":
> "stop"}}
> 
> Expected results, please correct misunderstandings:
> 
> 3. Actual result plus "__com.redhat_debug_info": {"message": "Input/output
> error", "errno": 5} in the value of "data"
> 

That's my understanding as well. And the lack of the error code in the QMP message is the actual regression that affects libvirt (or its users).

Comment 10 Markus Armbruster 2015-10-02 11:30:27 UTC
Both QMP key "__com.redhat_debug_info" (item 3) and the "block I/O
error in device" messages (item 4) are RHEL-only extensions.

These originally come from RHEL-6 commit 1a2b98f:

    Based on a suggestion from Kevin Wolf, this commit adds a new
    json-object member to the event, containing the errno value and an
    error string from strerror(). Additionally, an error message is
    printed to stderr.
    
    All this new information is meant for human debugging only and should
    not be used by libvirt (as stated in the event's documentation entry).

They extend the prior RHEL extension QMP key "__com.redhat_reason"
from commit a635efd.

We forward-ported them both to RHEL-7 qemu-kvm, in commit 771a3a3 and
commit bfea65d.

Upstream finally added QMP key "reason" in commit 624ff57, v2.2.
qemu-kvm-rhev got it via rebase, obviating our "__com.redhat_reason".
We want to keep "__com.redhat_reason" around for now anyway, to avoid
undue conflicts with libvirt versions that still use it.  We therefore
forward-ported it in commit 44f3e45.

We decided not to forward-port the other commit, because we feel the
debugging aids it provides are no longer necessary:

* "__com.redhat_debug_info" provides two values "message" and "errno".
  The former is identical to "reason".  The latter can be trivially
  derived from "reason".

* Spewing messages to stderr has always been a poor substitute for
  proper logging.  Libvirt's logging should be good enough now.

See the review thread
http://post-office.corp.redhat.com/archives/rhvirt-patches/2015-May/msg00008.html

I'm therefore closing this NOTABUG.  If you think we need to
forward-port these debugging aids, please reopen the bug and explain
why they are still needed.
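
The remark in comment 10 that "errno" can be derived from "reason" could be sketched as below. It assumes "reason" is an unmodified strerror() message and that the consumer uses the same C library and locale as QEMU. The helper name errno_from_reason is hypothetical.

```python
import errno
import os


def errno_from_reason(reason):
    """Map a strerror()-style message back to its errno value.

    Returns None when no known errno produces the message. If several
    errno values share a message, the lowest one is returned.
    """
    for code in sorted(errno.errorcode):
        if os.strerror(code) == reason:
            return code
    return None


# On a glibc system in the C locale, the "reason" from the event above
# ("Input/output error") is the strerror() text for EIO.
print(errno_from_reason(os.strerror(errno.EIO)) == errno.EIO)
```

The reverse lookup is only as reliable as the match between the two strerror() tables, which is why it is stated here as an assumption.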

