Bug 447315 - parted error: Can't open /dev/xvda while probing disks during installation
Summary: parted error: Can't open /dev/xvda while probing disks during installation
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 4
Classification: Red Hat
Component: kernel
Version: 4.7
Hardware: All
OS: Linux
Priority: low
Severity: high
Target Milestone: rc
Target Release: ---
Assignee: Don Dutile (Red Hat)
QA Contact: Martin Jenner
URL:
Whiteboard:
Duplicates: 451390
Depends On:
Blocks:
Reported: 2008-05-19 13:55 UTC by Alexander Todorov
Modified: 2008-07-24 19:29 UTC (History)

Fixed In Version: RHSA-2008-0665
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2008-07-24 19:29:54 UTC
Target Upstream Version:
Embargoed:


Attachments
screen shot of error message (149.37 KB, image/png)
2008-05-19 13:55 UTC, Alexander Todorov
Kudzu reports the xvda and xvdb drives (10.48 KB, image/png)
2008-05-20 07:49 UTC, Martin Sivák
output from dmesg (7.50 KB, text/plain)
2008-05-20 15:00 UTC, Alexander Todorov
Fix for anaconda/kudzu failure (4.57 KB, patch)
2008-06-17 20:37 UTC, Don Dutile (Red Hat)


Links
System: Red Hat Product Errata
ID: RHSA-2008:0665
Private: 0
Priority: normal
Status: SHIPPED_LIVE
Summary: Moderate: Updated kernel packages for Red Hat Enterprise Linux 4.7
Last Updated: 2008-07-24 16:41:06 UTC

Description Alexander Todorov 2008-05-19 13:55:55 UTC
Description of problem:
Anaconda (stage2) can't open device /dev/xvda (the disk) while probing for
available disk space (this is at the beginning of the install). This is a parted
error. Trying to open the device with parted/fdisk from the console (while in
the install environment) doesn't work either. One of the anaconda folks told me
that this is probably the kernel's fault, as the device permissions seemed OK to him.

Version of component:
tree: RHEL4-U7-re20080515.0
kernel-2.6.9-70.EL
anaconda-10.1.1.89-1

How reproducible:
100% on x86_64

Steps to reproduce:
1. Boot a full virt Xen guest and proceed with install
2. At the beginning of stage2, anaconda tries to probe the disks -> the parted error appears

Additional info: 
ia64 - FV guest works fine
x86_64 - FV guest leads to the above error

Comment 1 Alexander Todorov 2008-05-19 13:55:55 UTC
Created attachment 305943 [details]
screen shot of error message

Comment 2 Alexander Todorov 2008-05-19 14:01:02 UTC
Chris,
replying here to your comment at:
https://bugzilla.redhat.com/show_bug.cgi?id=442538#c14

I used the GUI of virt-manager.
1) Created a new guest
2) Selected full virt
3) Selected ISO image (boot.iso)
4) Selected image file and network/mem/cpu settings
5) Proceeded with HTTP install from within anaconda



Comment 7 Chris Lalancette 2008-05-19 15:56:52 UTC
Well, after a quick look, it doesn't look like FV installs are completely
broken.  If you hit "Cancel" at that screen, you can get by and do the rest of
the install.  Not pretty, but at least we can test.  That being said, it *does*
look like this has to do with the PV-on-HVM drivers.  My uninformed opinion is
that probably /dev/xvda device is being created (since the drivers are
available), but since there is no device backing it, when anaconda goes to probe
/dev/xvda it gets an error.  There are probably 3 possible solutions:

1. Anaconda just ignores this error and goes on without popping up a dialog
2. Anaconda doesn't load the xenblk drivers at all at this stage, unless it
finds a node in xenbus (not sure how exactly that would be done).
3.  Have the xen-blk driver fail to load at modprobe time if there are no
backing devices; I'm not sure if that is viable, though, especially with the
ability to hotplug things later.

Chris Lalancette

Comment 8 Chris Lalancette 2008-05-19 20:54:28 UTC
OK.  This is definitely an anaconda bug.  Here's what's going on:

During findrootparts in anaconda, at some point it calls into openDevices to
find all devices.  The list of devices in this case is self.driveList(), which seems
to contain /dev/hda, /dev/xvda, and /dev/xvdb.  I have *no* idea where those
values are coming from.  Anyway, openDevices ends up calling into some parted
code which attempts to open the device file and poke around on the device to do
some detection.  This fails in the case of /dev/xvda (because there is no
backing file), which causes the parted stuff to jump into the exception handler
installed in gui.py

It looks to me that the self.driveList() is probably wrong here; I'm not sure
why it is choosing hda, xvda, and xvdb (what about hdb, xvdc, xvdd, etc?).  But
I'll leave that judgement up to the anaconda people who know much more about it
than I do.
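
As a rough illustration of the failure mode described above, here is a minimal Python sketch of a probe loop that records devices it cannot open instead of letting the first failure reach an exception handler. It is not anaconda's code: `probe_drives` and its `dev_dir` parameter are hypothetical names, it uses plain `os.open` rather than parted, and it is written for current Python rather than the interpreter shipped with RHEL 4.

```python
import errno
import os


def probe_drives(drive_names, dev_dir="/dev"):
    """Split drive names into those that can be opened and those that cannot.

    Returns (usable, skipped), where skipped maps a drive name to the
    symbolic errno name (for example "ENODEV") reported by open().
    """
    usable = []
    skipped = {}
    for name in drive_names:
        path = os.path.join(dev_dir, name)
        try:
            fd = os.open(path, os.O_RDONLY)
        except OSError as exc:
            # Record the failure instead of raising, so one dead node
            # does not abort the whole probe.
            skipped[name] = errno.errorcode.get(exc.errno, str(exc.errno))
            continue
        os.close(fd)
        usable.append(name)
    return usable, skipped


if __name__ == "__main__":
    # Drive names taken from the list mentioned in this comment.
    print(probe_drives(["hda", "xvda", "xvdb"]))
```

Whether silently skipping a device is acceptable is a separate question; the sketch only shows where the error could be caught.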

Chris Lalancette

Comment 9 Martin Sivák 2008-05-20 07:49:01 UTC
Created attachment 306086 [details]
Kudzu reports the xvda and xvdb drives

Screenshot of terminal, which demonstrates where we got the drive list from.

This screenshot shows the results of two commands used during the probe. The
first, partedUtils.DiskSet(), is the command called from anaconda. It then
calls isys.hardDriveDict and isys.driveDict, which use kudzu.probe. The results
of the kudzu probe are in the screenshot too.

Comment 10 Alexander Todorov 2008-05-20 08:13:55 UTC
Don, Chris,
can you advise if this is kudzu or kernel?

Thanks.

Comment 13 Alexander Todorov 2008-05-20 11:14:20 UTC
changing component to kudzu

Comment 14 Bill Nottingham 2008-05-20 14:11:33 UTC
How do you tell that the device actually has a backing file? Can you post the
kernel messages from bootup?

Comment 15 Chris Lalancette 2008-05-20 14:41:43 UTC
OK, after talking this over with several people, here's what we have up until now:

The dom0 explicitly exports the block device as both an IDE disk (hda) and a
PV-on-HVM disk (xvda); that way the guest can choose which way to access the
disk, assuming it has drivers for both.  Additionally Xen FV guests can't boot
to anything except IDE, so you definitely need the disk exported as IDE for at
least boot.

So the question is: why is attempting to access /dev/xvda in anaconda responding
with -ENODEV?  One possible answer is that since the device is already "in-use"
by the IDE driver, the PV-on-HVM drivers are noticing this fact and refusing to
access the same disk.  That's purely speculation, though; someone needs to go
into the PV-on-HVM drivers and find out what is happening for sure.

Chris Lalancette

Comment 16 Alexander Todorov 2008-05-20 15:00:22 UTC
Created attachment 306137 [details]
output from dmesg

is the above information enough? anything else that could help?

thanks!

Comment 17 Bill Nottingham 2008-05-20 15:06:34 UTC
Well, it shows that the xenblk/xennet drivers are remarkably quiet. Doesn't seem
to help with solving this.

Comment 18 Bill Nottingham 2008-05-20 16:10:28 UTC
Assigning to kernel - there's nothing anaconda or kudzu can do here without
additional kernel help (from conversation with Don.) This can be worked around
by booting the installer with ide0=noprobe.
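
To confirm from inside the install environment that the workaround option was actually passed, the kernel command line can be inspected. The following is a small Python sketch; `has_boot_option` is a hypothetical helper name, and it assumes `/proc/cmdline` is readable, as it normally is on Linux.

```python
def has_boot_option(option, cmdline_path="/proc/cmdline"):
    """Return True if the exact option token appears on the kernel command line."""
    with open(cmdline_path) as f:
        # Options are whitespace-separated, so compare whole tokens.
        return option in f.read().split()


if __name__ == "__main__":
    print(has_boot_option("ide0=noprobe"))
```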

Comment 19 Chris Lalancette 2008-05-22 09:49:33 UTC
Ug.  Always the most obvious problems.  So, if you boot the installer in U7
under Xen FV guest, and then switch to Alt-F2, you can do a little debugging. 
If you do "dmesg | grep -i xen", you'll see a bunch of messages saying:

XENBUS: Device with no driver: device/vbd/768

or similar.  And, in point of fact, it isn't lying; the drivers needed to drive
these Xen devices are not included in the anaconda stage2 image (at least, doing
modprobe xen-vbd didn't seem to do anything).  This explains why kudzu and/or
anaconda gets ENODEV while trying to probe these devices, because there is no
driver loaded.
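
The dmesg check above can be scripted. This Python sketch extracts the driverless XENBUS devices from dmesg text and decodes the numeric id. The function names are hypothetical, and the decoding assumes Xen's conventional legacy encoding of a virtual block device id as (major << 8) | minor. Under that assumption, 768 decodes to major 3, minor 0, the traditional numbers for hda.

```python
import re

# Matches lines such as "XENBUS: Device with no driver: device/vbd/768".
NO_DRIVER = re.compile(r"XENBUS: Device with no driver: device/(\w+)/(\d+)")


def driverless_xen_devices(dmesg_text):
    """Return (device_type, device_id) pairs for devices with no driver bound."""
    found = []
    for line in dmesg_text.splitlines():
        match = NO_DRIVER.search(line)
        if match:
            found.append((match.group(1), int(match.group(2))))
    return found


def decode_legacy_vbd(device_id):
    """Split a legacy-encoded vbd id into (major, minor)."""
    return device_id >> 8, device_id & 0xFF


if __name__ == "__main__":
    sample = "XENBUS: Device with no driver: device/vbd/768"
    for dev_type, dev_id in driverless_xen_devices(sample):
        print(dev_type, dev_id, decode_legacy_vbd(dev_id))
```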

Digging a little further, what I found is xen-platform-pci (which is the first
stage of the PV-on-HVM drivers, needed to do *anything* at all with them) is
built-in to the kernel, while the netfront and blkfront drivers are modules. 
So, xen-platform-pci does come up during boot and do initialization.  What I
don't quite understand is how/why it creates the /dev/xvd nodes; I would *think*
that would be relegated to the individual device drivers (in this case, xen-vbd)
to actually register/create those nodes, but I'm not entirely sure.  And further
on this, it seems odd that a whole slew of device nodes is created; in
particular, we have /dev/xvda, /dev/xvda1->14, /dev/xvdb, and /dev/xvdb1->14.

I guess at this point we need to answer a few questions, and then try to come up
with some possible solutions.

Questions:

Bill, do you know how /dev device nodes are created during anaconda time?  I
don't see udev running (and I don't remember seeing it during anaconda startup),
so there must be some other mechanism that runs around creating these /dev nodes.

Possible Solutions:
1.  Have anaconda/kudzu explicitly ignore /dev/xvd devices during install time.
 Since we can't reasonably expect users to always remember to add "ide0=noprobe"
while installing, we *have* to do the install via IDE emulation (since it's
already loaded/running by the time anaconda starts, and we can't unload it). 
Just ignoring the /dev/xvd devices is ugly, but it will work as a fallback.

2.  Determine why those /dev/xvd devices are being created in the first place,
and try to remedy that.  This seems like more of the correct solution, but will
depend on how these are being probed/created as to whether we can fix it.

Oh, and one last thing (which will probably need a new bug); *if* anaconda
detects it is running on Xen on 4.7 or later, anaconda should automatically set
up the guest to use the PV-on-HVM drivers (including adding any kernel
command-line configuration, like ide0=noprobe).  This will make the user
experience much better out of the box.

Chris Lalancette

Comment 20 Bill Nottingham 2008-05-22 16:00:49 UTC
CC'ing Jeremy for anaconda + virt, but...

Device nodes are created by anaconda when it thinks it has a device of a certain
type. The xvda device nodes are created because a Xen block device is found in
/sys/bus/xen/devices/xvd-* ; if those sysfs entries aren't there, anaconda won't
create the devices.
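
The sysfs check described here can be sketched in Python as follows. This is an illustration only, not kudzu's or anaconda's code: `xen_block_devices` and its `sysfs_root` parameter are hypothetical, and the only detail taken from this comment is the `/sys/bus/xen/devices/xvd-*` pattern.

```python
import glob
import os


def xen_block_devices(sysfs_root="/sys"):
    """List the names of Xen block device entries visible in sysfs."""
    pattern = os.path.join(sysfs_root, "bus", "xen", "devices", "xvd-*")
    return sorted(os.path.basename(path) for path in glob.glob(pattern))


if __name__ == "__main__":
    entries = xen_block_devices()
    if entries:
        print("Xen block devices reported by the kernel:", entries)
    else:
        print("No xvd-* entries; no xvd device nodes would be created.")
```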

The Xen code in 4.7 kudzu is based around using xennet/xenblk as the driver, not
xen-vbd. Are there multiple drivers for the same types of devices?

Comment 21 Jeremy Katz 2008-05-22 17:08:05 UTC
(In reply to comment #19)
> Ug.  Always the most obvious problems.  So, if you boot the installer in U7
> under Xen FV guest, and then switch to Alt-F2, you can do a little debugging. 
> If you do "dmesg | grep -i xen", you'll see a bunch of messages saying:
> 
> XENBUS: Device with no driver: device/vbd/768

These messages are expected.  Basically, it says there isn't a driver for that
device loaded at that time.  When you later load xenblk or whatnot, the device
gets attached to the driver.
 
> And further
> on this, it seems odd that a whole slew of device nodes is created; in
> particular, we have /dev/xvda, /dev/xvda1->14, /dev/xvdb, and /dev/xvdb1->14.

anaconda in RHEL4 (anaconda before F9 in fact) creates devices for all possible
partitions of a disk rather than just having the device nodes created "on demand".

> Bill, do you know how /dev device nodes are created during anaconda time?  I
> don't see udev running (and I don't remember seeing it during anaconda startup),
> so there must be some other mechanism that runs around creating these /dev nodes.

As Bill said, kudzu tells anaconda the disk is there, so anaconda creates device
nodes.
 
> Possible Solutions:
> 1.  Have anaconda/kudzu explicitly ignore /dev/xvd devices during install time.
>  Since we can't reasonably expect users to always remember to add "ide0=noprobe"
> while installing, we *have* to do the install via IDE emulation (since it's
> already loaded/running by the time anaconda starts, and we can't unload it). 
> Just ignoring the /dev/xvd devices is ugly, but it will work as a fallback.

This would break for users who are installing PV RHEL4.

> 2.  Determine why those /dev/xvd devices are being created in the first place,
> and try to remedy that.  This seems like more of the correct solution, but will
> depend on how these are being probed/created as to whether we can fix it.

They're being created because the kernel is saying the devices are present.

Comment 22 Jeremy Katz 2008-06-02 18:27:47 UTC
(In reply to comment #21)
> (In reply to comment #19)
> > Possible Solutions:
> > 1.  Have anaconda/kudzu explicitly ignore /dev/xvd devices during install time.
> >  Since we can't reasonably expect users to always remember to add "ide0=noprobe"
> > while installing, we *have* to do the install via IDE emulation (since it's
> > already loaded/running by the time anaconda starts, and we can't unload it). 
> > Just ignoring the /dev/xvd devices is ugly, but it will work as a fallback.
> 
> This would break for users who are installing PV RHEL4.

Although I guess we could conditionalize this path based on whether you're
running the PV kernel or a "normal" kernel.  But ew.
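
A minimal Python sketch of that conditional, purely for illustration: `filter_drives` is a hypothetical helper, and how the installer would decide that it is running the PV kernel is left to the caller because this report does not say.

```python
def filter_drives(drives, running_pv_kernel):
    """Drop xvd* drives unless the installer is running the PV kernel."""
    if running_pv_kernel:
        # PV guests only have xvd devices, so keep everything.
        return list(drives)
    return [name for name in drives if not name.startswith("xvd")]


if __name__ == "__main__":
    print(filter_drives(["hda", "xvda", "xvdb"], running_pv_kernel=False))
    print(filter_drives(["xvda", "xvdb"], running_pv_kernel=True))
```

Note that a fully virtualized guest booted with ide0=noprobe would depend on the xvd devices, so a filter like this would also need to account for that case.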

Comment 23 Bill Burns 2008-06-14 10:30:47 UTC
*** Bug 451390 has been marked as a duplicate of this bug. ***

Comment 25 Don Dutile (Red Hat) 2008-06-17 20:37:55 UTC
Created attachment 309672 [details]
Fix for anaconda/kudzu failure

Also fixes block-detach failure.

Comment 26 Vivek Goyal 2008-06-19 03:39:20 UTC
Committed in 74.EL. RPMs are available at http://people.redhat.com/vgoyal/rhel4/

Comment 29 errata-xmlrpc 2008-07-24 19:29:54 UTC
An advisory has been issued which should resolve the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2008-0665.html

