Red Hat Bugzilla – Bug 499616
virsh erroneously reported successful disk-detach
Last modified: 2011-01-13 17:16:58 EST
Description of problem:
I'm running RHEL-5.4 preview packages, libvirt version 0.6.3-2.el5. I've been testing an F-11 domU for the Fedora Test Day. One of the tests is to do a virsh attach-disk and then a virsh detach-disk on the domU. I successfully did a virsh attach-disk to the domU, and then created a partition table and an ext4 filesystem inside the guest. Then I mounted the disk, and wrote some data to it.
Next, I tried "virsh detach-disk" of that same disk. virsh reported that the disk disconnected successfully, but further probing showed that it actually did *not* disconnect. In the guest's dmesg output, I saw:
vbd vbd-51728: 16 Device in use; refusing to close
That refusal is correct, because the disk was still mounted inside the guest. So this bug is basically that virsh reported a successful disconnect of the disk device when in fact the disconnect had failed.
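For anyone matching that dmesg line to a guest disk, here is a minimal sketch of decoding the vbd device number. It assumes the usual Linux encoding (major << 8 | minor) and the xvd layout of major 202 with 16 minors per disk; the function name decode_vbd is made up for illustration.

```python
import re

XVD_MAJOR = 202  # assumption: Linux block major used by Xen virtual disks (xvd*)


def decode_vbd(line):
    """Return (devid, major, minor, name) for a 'vbd vbd-N: ...' dmesg line."""
    match = re.search(r"vbd-(\d+)", line)
    if match is None:
        raise ValueError("no vbd device id in line")
    devid = int(match.group(1))
    # Assumed encoding: high bits are the major, low 8 bits are the minor.
    major, minor = devid >> 8, devid & 0xFF
    name = None
    if major == XVD_MAJOR:
        disk, part = divmod(minor, 16)  # 16 minors per xvd disk
        name = "xvd" + chr(ord("a") + disk) + (str(part) if part else "")
    return devid, major, minor, name


print(decode_vbd("vbd vbd-51728: 16 Device in use; refusing to close"))
```

Under those assumptions, vbd-51728 is major 202, minor 16, i.e. the whole disk xvdb in the guest.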
We should probably strongly consider fixing this for 5.4. An unwitting customer could use virsh detach-disk, think that their disk has been disconnected, and then do other manipulations on the disk from the dom0, resulting in corruption.
Jirka mentions that he thinks this is probably a bug in the underlying Xen package, so changing component to reflect that.
Created attachment 343970 [details]
Backport of upstream c/s 15716
So I backported the device-destroy-timeout patch from upstream, and now I see it's not enough. This patch should fix the bug reported by Chris, but it waits for the full timeout whenever a device cannot be disconnected (e.g., when it is mounted in a guest), which is pretty stupid, although it's identical to upstream behavior.
And it's even more stupid when one looks into xenstore where the reason why the device cannot be disconnected is written immediately: "Device in use; refusing to close".
To fix this unfortunate waiting for timeout, xend would need to read /local/domain/ID/error/device/DEVCLASS/DEVID/error. In addition, xend would have to reset that error node before trying to disconnect a device, so that it wouldn't be confused by a previous error.
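A rough sketch of that idea follows. It is not xend code: the store is a plain dict standing in for xenstore, and wait_for_detach, request_detach and device_key are hypothetical names. Only the error path layout comes from the comment above.

```python
import time


def error_path(domid, devclass, devid):
    # Path layout as given above: /local/domain/ID/error/device/DEVCLASS/DEVID/error
    return "/local/domain/%s/error/device/%s/%s/error" % (domid, devclass, devid)


def wait_for_detach(store, domid, devclass, devid, device_key, request_detach,
                    timeout=5.0, interval=0.05):
    """Return (True, None) once the device is gone, else (False, reason).

    store is a dict standing in for xenstore (an assumption for this sketch);
    device_key is whatever key disappears when the detach succeeds.
    """
    err = error_path(domid, devclass, devid)
    store.pop(err, None)   # reset any stale error before asking for the detach
    request_detach()       # hypothetical callback that triggers the disconnect
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if device_key not in store:
            return True, None
        message = store.get(err)
        if message:
            return False, message  # fail right away instead of waiting for the timeout
        time.sleep(interval)
    return False, "timed out waiting for device to disconnect"


# Simulate a guest that refuses to close the device.
store = {"dev": "attached"}
busy = "16 Device in use; refusing to close"
result = wait_for_detach(
    store, 5, "vbd", 51728, "dev",
    lambda: store.__setitem__(error_path(5, "vbd", 51728), busy))
print(result)
```

The point of the design is ordering: the error node is cleared first, so anything read from it afterwards belongs to the current detach attempt.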
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.