Bug 1004154 - drive_del cause "fdisk -l" fail inside RHEL7 guest
Status: CLOSED NOTABUG
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: qemu-kvm
Version: 7.0
Hardware: x86_64 Linux
Priority: medium
Severity: medium
Target Milestone: rc
Target Release: ---
Assigned To: Markus Armbruster
QA Contact: Virtualization Bugs
Reported: 2013-09-04 01:56 EDT by Jun Li
Modified: 2014-01-15 09:04 EST (History)
Doc Type: Bug Fix
Last Closed: 2014-01-13 09:29:45 EST
Type: Bug


Description Jun Li 2013-09-04 01:56:46 EDT
Description of problem:
"fdisk -l" fails inside a RHEL7 guest after running "drive_del drive0-1" in the qemu monitor.


Version-Release number of selected component (if applicable):
3.10.0-11.el7.x86_64
qemu-kvm-1.5.2-4.el7.x86_64

How reproducible:
100%

Steps to Reproduce:
1. Boot the guest with a data disk.
# /usr/libexec/qemu-kvm -S -M pc-i440fx-rhel7.0.0 \
-cpu SandyBridge -enable-kvm \
-m 4G -smp 4,sockets=2,cores=2,threads=1 \
-name juli -uuid 355a2475-4e03-4cdd-bf7b-5d6a59edaa61 \
-rtc base=localtime,clock=host,driftfix=slew \
-drive file=/home/rhel7base_from_qiguo.qcow2_v3,if=none,cache=none,aio=native,format=qcow2,rerror=stop,werror=stop,id=drive0 \
-device virtio-blk-pci,bus=pci.0,addr=0x8,drive=drive0,id=sys-disk,bootindex=0 \
-drive file=/home/sdb-0903.raw,if=none,cache=none,aio=native,format=raw,rerror=stop,werror=stop,id=drive0-1  \
-device virtio-blk-pci,bus=pci.0,drive=drive0-1,id=data-disk-1 \
-device virtio-balloon-pci,id=ballooning -global PIIX4_PM.disable_s3=0 -global PIIX4_PM.disable_s4=0 \
-netdev tap,id=hostnet0,vhost=off,queues=4,script=/etc/qemu-ifup -device virtio-net-pci,mq=on,vectors=17,netdev=hostnet0,id=virtio-net-pci0,mac=24:be:05:14:0d:82,addr=0x17,bootindex=2 -k en-us -boot menu=on,reboot-timeout=-1,strict=on \
-qmp tcp:0:4445,server,nowait -serial unix:/tmp/ttyS0,server,nowait \
-vnc :3 -spice port=5932,disable-ticketing \
-vga cirrus -monitor stdio -monitor tcp:0:7445,server,nowait \
-monitor unix:/tmp/monitor1,server,nowait
2. Hot-unplug the data disk in the qemu monitor using "drive_del".
(qemu) drive_del drive0-1
3. Check the result inside the guest using "fdisk -l".

Actual results:
# fdisk -l
fdisk: cannot open /dev/vda: Input/output error


Expected results:
"fdisk -l" works as normal.

Additional info:
I also tested on a RHEL6 host: "fdisk -l" inside a RHEL6 guest works as normal, but the qemu monitor reports an I/O error.
Comment 2 Jun Li 2013-09-04 02:50:09 EDT
Is the "fdisk -l" failure in a RHEL7 guest a new issue?
"drive_del" causes an "Input/output error", and this error makes "fdisk -l" fail inside RHEL7.
I am not sure whether it is a new issue. Do I need to file a new bug to track it for the RHEL7 guest?
Comment 3 Markus Armbruster 2013-09-10 04:59:36 EDT
drive_del rips out the block backend forcibly.  To the guest, this
should look like hardware failure.

Background info: drive_del is useful when you want to revoke the
guest's access to an image, and a hot unplug is impossible (disk
device doesn't support it) or doesn't work (e.g. because the guest
doesn't cooperate).

The observed result (fdisk gets an I/O error and fails) is exactly
what I expect, which suggests NOTABUG.  See also RHEL-6 bug 970159 and
upstream commit 5810174.

Your "Additional Info" for RHEL-6 reports different behavior.  We
changed error handling after drive_del recently (bug 970159, fixed in
qemu-kvm-0.12.1.2-2.387.el6 by backporting said commit 5810174).
Please make sure to test with the current version, and report back.
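
The difference between the two monitor commands discussed here can be sketched as QMP messages. This is a minimal illustration and not part of the original report: qmp_device_del and qmp_drive_del are hypothetical helper names, the IDs data-disk-1 and drive0-1 are taken from the reproducer's command line, and drive_del is wrapped in human-monitor-command on the assumption that this qemu build offers it only as a human monitor (HMP) command.

```shell
#!/bin/sh
# Hypothetical helpers that print QMP messages; they do not talk to qemu.

# Cooperative path: device_del asks the guest to release the device.
# It only completes if the device supports hot unplug and the guest
# cooperates.
qmp_device_del() {
    printf '{"execute":"device_del","arguments":{"id":"%s"}}\n' "$1"
}

# Forced path: drive_del revokes the block backend regardless of the
# guest. Assumption: it is an HMP command, so it is passed through
# human-monitor-command when sent over QMP.
qmp_drive_del() {
    printf '{"execute":"human-monitor-command","arguments":{"command-line":"drive_del %s"}}\n' "$1"
}
```

For example, qmp_device_del data-disk-1 prints the cooperative unplug request and qmp_drive_del drive0-1 prints the forced removal used in this report. Sending either one to the reproducer's QMP socket (tcp port 4445) would first require the qmp_capabilities handshake.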
Comment 4 Jun Li 2013-09-11 05:43:39 EDT
(In reply to Markus Armbruster from comment #3)
> drive_del rips out the block backend forcibly.  To the guest, this
> should look like hardware failure.
> 
> Background info: drive_del is useful when you want to revoke the
> guest's access to an image, and a hot unplug is impossible (disk
> device doesn't support it) or doesn't work (e.g. because the guest
> doesn't cooperate).
> 
> The observed result (fdisk gets an I/O error and fails) is exactly
> what I expect, which suggests NOTABUG.  See also RHEL-6 bug 970159 and
> upstream commit 5810174.
> 
I have read that bug. But I did not delete the device (device_del); I only deleted the drive (drive_del). After deleting the drive (drive_del), "fdisk -l" cannot run correctly inside the RHEL7 guest. With a RHEL6 guest, however, "fdisk -l" runs correctly.

> Your "Additional Info" for RHEL-6 reports different behavior.  We
> changed error handling after drive_del recently (bug 970159, fixed in
> qemu-kvm-0.12.1.2-2.387.el6 by backporting said commit 5810174).
> Please make sure to test with the current version, and report back.

I tested a RHEL 6.4 guest and a RHEL7 guest on a RHEL6.5 host.

qemu-kvm and host kernel version:
qemu-kvm-0.12.1.2-2.399.el6.x86_64
2.6.32-416.el6.x86_64

For the RHEL6.4 guest, after deleting the drive (drive_del), "fdisk -l" runs as normal inside the guest.
For the RHEL7 guest, after deleting the drive (drive_del), "fdisk -l" cannot run correctly inside the guest.
Comment 5 Markus Armbruster 2014-01-13 09:29:45 EST
The guest cannot do *anything* with the block device after its backend
was ripped off with drive_del.  The device is *wrecked*.

The guest will notice the device failed when it next tries to read or
write.  That may not happen on the very first guest operation on the
device.  A guest operation may execute fully within the guest OS's
cache.  Only when the OS has to read/write the actual device will it
realize that you wrecked it.

"fdisk -l" merely lists the partition table.  If the partition table
information is already cached, and the fdisk program makes no attempt
to bypass the cache, then it can succeed even after you broke the
device.
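
The cache effect described above can be probed from inside the guest with a read that bypasses the page cache. The following is a minimal sketch under stated assumptions: probe_read is a hypothetical helper, it relies on GNU dd's iflag=direct (which opens the input with O_DIRECT), and the 4096-byte block size is chosen only to satisfy typical O_DIRECT alignment requirements.

```shell
#!/bin/sh
# probe_read DEVICE [direct]
# Read the first 4 KiB of DEVICE and report whether the read succeeded.
# Without "direct", the read may be served from the guest's page cache.
# With "direct", dd uses O_DIRECT, so the request has to reach the
# (virtual) device itself.
probe_read() {
    dev=$1
    mode=${2:-buffered}
    flags=
    if [ "$mode" = direct ]; then
        flags="iflag=direct"
    fi
    # $flags is intentionally unquoted so that it vanishes when empty.
    if dd if="$dev" of=/dev/null bs=4096 count=1 $flags 2>/dev/null; then
        echo "$dev: $mode read ok"
    else
        echo "$dev: $mode read FAILED"
        return 1
    fi
}
```

Running probe_read /dev/vdb and then probe_read /dev/vdb direct after drive_del would show whether only the uncached read fails. The buffered result depends on what the guest happens to have cached, so it can legitimately differ between guests and between runs.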

To sum up:

* The reported expected behavior is wrong.

* The reported actual behavior with a RHEL-7 guest is expected.

* The reported actual behavior with a RHEL-6 guest is inconclusive.

I'm therefore closing this NOTABUG.

If you have questions, don't hesitate to ask.
Comment 6 Jun Li 2014-01-14 00:39:48 EST
(In reply to Markus Armbruster from comment #5)
> The guest cannot do *anything* with the block device after its backend
> was ripped off with drive_del.  The device is *wrecked*.
> 
> The guest will notice the device failed when it next tries to read or
> write.  That may not happen on the very first guest operation on the
> device.  A guest operation may execute fully within the guest OS's
> cache.  Only when the OS has to read/write the actual device will it
> realize that you wrecked it.
> 
> "fdisk -l" merely lists the partition table.  If the partition table
> information is already cached, and the fdisk program makes no attempt
> to bypass the cache, then it can succeed even after you broke the
> device.
"fdisk -l" merely lists the partition table, and every disk has its own partition table. After hot-unplugging the drive of the second disk, the second disk's partition table cannot be read, but the partition table of the first disk (the system disk) is intact. So in my opinion, "fdisk -l" should still list the partition table of the first disk (the system disk) correctly.
BTW, if one disk's partition table has a problem, does that affect the partition table of the system disk as well?
When you have time, could you give me some explanation? Thank you very much.

Best Regards,
Jun Li
> 
> To sum up:
> 
> * The reported expected behavior is wrong.
> 
> * The reported actual behavior with a RHEL-7 guest is expected.
> 
> * The reported actual behavior with a RHEL-6 guest is inconclusive.
> 
> I'm therefore closing this NOTABUG.
> 
> If you have questions, don't hesitate to ask.
Comment 7 Markus Armbruster 2014-01-15 09:04:31 EST
You are right.  I missed one crucial fact: you reported the guest
reports I/O error on /dev/vda after destruction of /dev/vdb by
drive_del.

However, I can't reproduce this.  "fdisk -l /dev/vda" works, "fdisk -l
/dev/vdb" triggers kernel errors on vdb, and "fdisk -l" does both.
Works as expected.  Same behavior with a Fedora 20 guest.

I'm leaving the bug CLOSED/NOTABUG on the hypothesis that you misread
the error messages in your initial testing.

If you can reproduce I/O errors being reported for vda, please reopen
the bug.
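
To check which disk actually reports the error, each device can be listed separately rather than relying on a bare "fdisk -l". The sketch below is an illustration only: check_disks and the LIST_CMD variable are hypothetical names, and it assumes that "fdisk -l DEVICE" exits non-zero when it cannot open the device.

```shell
#!/bin/sh
# check_disks DEVICE...
# Run the listing command on each device separately and print one status
# line per device, so an error on one disk is not attributed to another.
# LIST_CMD is a hypothetical knob; it defaults to "fdisk -l".
check_disks() {
    list_cmd=${LIST_CMD:-fdisk -l}
    rc=0
    for dev in "$@"; do
        # $list_cmd is intentionally unquoted so "fdisk -l" splits into
        # a command and its option.
        if $list_cmd "$dev" >/dev/null 2>&1; then
            echo "$dev: ok"
        else
            echo "$dev: FAILED"
            rc=1
        fi
    done
    return $rc
}
```

Inside the guest, check_disks /dev/vda /dev/vdb would print one line per disk. With the behavior described in this comment, the expectation is "ok" for vda and "FAILED" for vdb after drive_del drive0-1.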
