Bug 499616

Summary: virsh erroneously reported successful disk-detach
Product: Red Hat Enterprise Linux 5 Reporter: Chris Lalancette <clalance>
Component: xen Assignee: Miroslav Rezanina <mrezanin>
Status: CLOSED ERRATA QA Contact: Virtualization Bugs <virt-bugs>
Severity: medium Docs Contact:
Priority: low    
Version: 5.4 CC: leiwang, llim, minovotn, mrezanin, virt-maint, xen-maint
Target Milestone: rc   
Target Release: ---   
Hardware: All   
OS: Linux   
Whiteboard:
Fixed In Version: xen-3.0.3-109.el5 Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2011-01-13 22:16:58 UTC Type: ---
Bug Blocks: 514499    
Attachments:
Description: Backport of upstream c/s 15716 Flags: none

Description Chris Lalancette 2009-05-07 13:02:49 UTC
Description of problem:
I'm running RHEL-5.4 preview packages, libvirt version 0.6.3-2.el5.  I've been testing an F-11 domU for the Fedora Test Day.  One of the tests is to do a virsh attach-disk and then a virsh detach-disk on the domU.  I successfully did a virsh attach-disk to the domU, and then created a partition table and an ext4 filesystem inside the guest.  Then I mounted the disk and wrote some data to it.

Next, I tried "virsh detach-disk" on that same disk.  virsh reported that the disk disconnected successfully, but further probing showed that it actually did *not* disconnect.  In dmesg inside the guest, I saw:

vbd vbd-51728: 16 Device in use; refusing to close

Which is right, because it was still mounted inside the guest.  So this bug is basically that virsh reported success of disconnecting the disk device, when in fact the disconnect failed.
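
For anyone trying to match the vbd number in that message to a guest disk, here is a minimal sketch of the decoding. It assumes the classic (non-extended) Xen encoding of major << 8 | minor, the xvd block major 202, and 16 minors per disk; the helper names are made up for illustration. Under those assumptions vbd-51728 would be xvdb.

```python
XVD_MAJOR = 202        # assumed: Linux block major used for Xen xvd* devices
MINORS_PER_DISK = 16   # assumed: whole disk plus up to 15 partitions


def decode_vbd(devid):
    """Split a classic Xen vbd device number into (major, minor)."""
    return devid >> 8, devid & 0xFF


def xvd_name(devid):
    """Return the xvd* name for a vbd number (hypothetical helper)."""
    major, minor = decode_vbd(devid)
    if major != XVD_MAJOR:
        raise ValueError("not an xvd device: major %d" % major)
    disk, part = divmod(minor, MINORS_PER_DISK)
    name = "xvd" + chr(ord("a") + disk)
    # Partition 0 means the whole disk, so no numeric suffix is added.
    return name + (str(part) if part else "")


if __name__ == "__main__":
    print(xvd_name(51728))
```

With the name in hand, checking /proc/mounts inside the guest for that device before detaching would show whether the frontend is likely to refuse the close.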

We should probably strongly consider fixing this for 5.4.  An unwitting customer could use virsh detach-disk, think that their disk has been disconnected, and then do other manipulations on the disk from the dom0, resulting in corruption.

Comment 1 Chris Lalancette 2009-05-07 13:14:46 UTC
Jirka mentions that he thinks this is probably a bug in the underlying Xen package, so changing component to reflect that.

Chris Lalancette

Comment 5 Jiri Denemark 2009-05-14 14:09:56 UTC
Created attachment 343970
Backport of upstream c/s 15716

So I backported the device-destroy-timeout patch from upstream, and now I see it's not enough. This patch should fix the bug reported by Chris, but when a device cannot be disconnected (e.g., when it is mounted in a guest) it waits until the timeout expires, which is pretty stupid, although it's identical to upstream behavior.

And it's even more stupid when one looks into xenstore, where the reason the device cannot be disconnected is written immediately: "Device in use; refusing to close".

To avoid this unfortunate wait for the timeout, xend would need to read /local/domain/ID/error/device/DEVCLASS/DEVID/error. In addition, xend would have to reset that error node before trying to disconnect a device, so that it wouldn't get confused by a previous error.
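
A rough sketch of that idea, not the actual xend code: clear the error node, request the detach, then poll for either the device disappearing or an error showing up, and fall back to the timeout only if neither happens. The FakeStore class, the request_detach and device_gone callbacks, and the DetachError name are all hypothetical stand-ins; only the error path layout comes from the comment above.

```python
import time


class DetachError(Exception):
    """Raised when the guest refuses the detach or the wait times out."""


class FakeStore(dict):
    """Hypothetical in-memory stand-in for xenstore (not the real API)."""

    def read(self, path):
        return self.get(path)

    def rm(self, path):
        self.pop(path, None)


def error_path(domid, devclass, devid):
    # Path layout taken from the comment above.
    return "/local/domain/%s/error/device/%s/%s/error" % (domid, devclass, devid)


def detach_device(store, domid, devclass, devid,
                  request_detach, device_gone, timeout=10.0, poll=0.1):
    path = error_path(domid, devclass, devid)
    store.rm(path)        # reset any stale error from an earlier attempt
    request_detach()      # ask the frontend to close the device
    deadline = time.monotonic() + timeout
    while True:
        if device_gone():
            return        # the device really went away
        err = store.read(path)
        if err:
            # Fail right away instead of waiting for the timeout.
            raise DetachError(err)
        if time.monotonic() >= deadline:
            raise DetachError("timed out waiting for device %s" % devid)
        time.sleep(poll)


if __name__ == "__main__":
    store = FakeStore()
    path = error_path(1, "vbd", 51728)

    def refuse():
        # Simulate the guest writing the error seen in this bug.
        store[path] = "Device in use; refusing to close"

    try:
        detach_device(store, 1, "vbd", 51728, refuse, lambda: False, timeout=1.0)
    except DetachError as exc:
        print("detach failed:", exc)
```

The reset before the request is what keeps a leftover error from an earlier failed attempt from being reported against a detach that actually succeeds.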

Comment 16 errata-xmlrpc 2011-01-13 22:16:58 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2011-0031.html