+++ This bug was initially created as a clone of Bug #217243 +++
Description of problem:
When running xm block-attach to a PV domU, if the attach fails, the device is
not un-set properly. In particular, the following sequence causes a problem:
1. Start up RHEL-5 domU.
2. In the domU, run "modprobe sd_mod"
3. On the dom0, run "xm block-attach rhel5-file file://tmp/testblock.img"
The block-attach command will seemingly succeed. However, in the domU, you will
see the following error messages:
Registering block device major 8
register_blkdev: cannot get major 8 for sd
xen_blk: can't get major 8 with name sd
vbd vbd-2160: 19 xlvbd_add at /local/domain/0/backend/vbd/1/2160
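For reference, the failure is a block-major clash: sd_mod from step 2 already owns major 8 ("sd"), so xen_blk cannot register the same major for the new vbd. A minimal sketch of that check (the major_taken helper is only an illustration, not part of any Xen tooling):

```shell
# Check /proc/devices-style input for an already-registered major number.
# Illustrative helper only, not part of Xen.
major_taken() {
    want="$1"
    awk -v m="$want" '$1 == m { found = 1 } END { exit !found }'
}

# In the domU: major_taken 8 < /proc/devices && echo "major 8 already in use"
```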
Now, trying to run "xm block-detach rhel5-file /dev/sda" on the dom0 will say:
Error: Device /dev/sda not connected
Usage: xm block-detach <Domain> <DevId>
Destroy a domain's virtual block device.
It *thinks* the block device isn't set up, but trying to re-attach with the same
file spits out an error (I don't have it right this moment; I'll attach it later).
So there is no (easy) way to detach the broken block device.
I did a little bit of debugging on this. On the domU side, when the "cannot get
major 8 with name sd" message is printed, it looks like the kernel correctly
writes an error into the xenstore. The problem is that the scripts on the dom0
side never check for the error in the xenstore, and hence never know that it
wasn't set up properly. In particular the /etc/xen/scripts/block script doesn't
actually check for any errors.
I think the solution here is to properly check for errors during block-attach,
and on failure tear down the loop device and remove the xenstore entries on the
dom0. That would also get rid of the xm block-detach problem.
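A rough sketch of what that error check and teardown could look like for /etc/xen/scripts/block; the "error" node name and the xenstore_read/xenstore_rm helper names are assumptions about the hotplug environment, not the verified script API:

```shell
# Hedged sketch of a post-attach error check for /etc/xen/scripts/block.
# The "error" node and the xenstore_read/xenstore_rm helpers are assumed.

# Succeeds (returns 0) if the frontend recorded an error for this device.
vbd_attach_failed() {
    xspath="$1"                        # e.g. backend/vbd/1/2160
    err=$(xenstore_read "$xspath/error" 2>/dev/null)
    [ -n "$err" ]                      # non-empty error node => attach failed
}

# On failure, release the loop device and drop the half-created entries.
vbd_cleanup() {
    xspath="$1"; loopdev="$2"
    losetup -d "$loopdev" 2>/dev/null
    xenstore_rm "$xspath"
}
```

With something like this in place, a failed attach would leave no loop device and no stale xenstore entries behind, so a later block-detach or re-attach would start clean.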
-- Additional comment from firstname.lastname@example.org on 2006-11-29 14:32 EST --
Chris, what do the relevant parts of xenstore-ls look like?
I'm also unable to detach them, but due to a different problem: the kernel gives
the error message, yet hotplug-status in xenstore appears as "connected" (which
partially explains it).
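That node can be inspected from dom0 with the standard xenstore-read tool; a small path helper (an illustration only, with the domain and device IDs taken from the error message in the report):

```shell
# Build the backend xenstore path for a vbd; layout as seen in the
# "vbd vbd-2160" message above. Illustrative helper, not Xen API.
vbd_backend_path() {
    domid="$1"; devid="$2"
    echo "/local/domain/0/backend/vbd/$domid/$devid"
}

# From dom0:
#   xenstore-read "$(vbd_backend_path 1 2160)/hotplug-status"
# If this prints "connected" even though the frontend failed, xend will
# refuse the block-detach, matching the behaviour described above.
```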
Created attachment 142776 [details]
fix for it
Briefly, the problem is: when the frontend finds any problem, it begins the
Closing protocol. The same thing happens on migration, and the backend has no
simple way to differentiate between the two cases. This is done through an
"online = 1" state in xenstore; however, the frontend is not able to set its
own state to "online = 0", as that node lives in the backend.
The solution is to check whether the frontend is okay in the transition from
Closing to Closed, and properly unregister the device as we would do in case of a
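For anyone following the transition in xenstore: Closing and Closed are numeric XenbusState values from xen's public xenbus.h. A small sketch to decode them (the example path in the comment is an assumption about the frontend layout):

```shell
# Map a XenbusState number (as stored in a device's "state" node) to its
# name; values per xen's public io/xenbus.h.
xenbus_state_name() {
    case "$1" in
        1) echo Initialising ;;
        2) echo InitWait ;;
        3) echo Initialised ;;
        4) echo Connected ;;
        5) echo Closing ;;
        6) echo Closed ;;
        *) echo Unknown ;;
    esac
}

# e.g. xenbus_state_name "$(xenstore-read /local/domain/1/device/vbd/2160/state)"
```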
Created attachment 143570 [details]
addition of --force option
The upstream solution will most probably go through the addition of a --force
option. Here's the patch for it. Waiting for upstream status...
QE ack for RHEL5.
xen-3.0.3-22.el5 included in 20070125.0.