Description of problem:
Anaconda (stage2) can't open device /dev/xvda (the disk) while probing for available disk space at the beginning of the install. This is a parted error. Trying to open the device with parted/fdisk from the console (while in the install environment) doesn't work either. One of the anaconda folks told me that this is probably the kernel's fault, as the device permissions seemed OK to him.

Version of component:
tree: RHEL4-U7-re20080515.0
kernel-2.6.9-70.EL
anaconda-10.1.1.89-1

How reproducible:
100% on x86_64

Steps to reproduce:
1. Boot a full virt Xen guest and proceed with the install
2. At the beginning of stage2, anaconda tries to probe disks -> BUM

Additional info:
ia64 - FV guest works fine
x86_64 - FV guest leads to the above error
Created attachment 305943 [details] screen shot of error message
Chris, replying here to your comment at: https://bugzilla.redhat.com/show_bug.cgi?id=442538#c14 I used the GUI of virt-manager. 1) Created a new guest 2) Selected full virt 3) Selected ISO image (boot.iso) 4) Selected image file and network/mem/cpu settings 5) Proceeded with HTTP install from within anaconda
Well, after a quick look, it doesn't look like FV installs are completely broken. If you hit "Cancel" at that screen, you can get by and do the rest of the install. Not pretty, but at least we can test. That being said, it *does* look like this has to do with the PV-on-HVM drivers. My uninformed opinion is that the /dev/xvda device is probably being created (since the drivers are available), but since there is no device backing it, anaconda gets an error when it goes to probe /dev/xvda. There are probably 3 possible solutions: 1. Anaconda just ignores this error and goes on without popping up a dialog. 2. Anaconda doesn't load the xenblk drivers at all at this stage, unless it finds a node in xenbus (not sure how exactly that would be done). 3. Have the xen-blk driver fail to load at modprobe time if there are no backing devices; I'm not sure if that is viable, though, especially with the ability to hotplug things later. Chris Lalancette
OK. This is definitely an anaconda bug. Here's what's going on: During findrootparts in anaconda, at some point it calls into openDevices to find all devices. The device list in this case is self.driveList(), which seems to contain /dev/hda, /dev/xvda, and /dev/xvdb. I have *no* idea where those values are coming from. Anyway, openDevices ends up calling into some parted code which attempts to open the device file and poke around on the device to do some detection. This fails in the case of /dev/xvda (because there is no backing file), which causes the parted code to jump into the exception handler installed in gui.py. It looks to me like self.driveList() is probably wrong here; I'm not sure why it is choosing hda, xvda, and xvdb (what about hdb, xvdc, xvdd, etc?). But I'll leave that judgement up to the anaconda people who know much more about it than I do. Chris Lalancette
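To illustrate the first of the proposed fixes (ignore the probe error instead of raising a dialog), here is a minimal sketch. It is not anaconda's or parted's actual code: `try_open` and `probe_drives` are hypothetical names, and a plain `os.open` stands in for the parted probing described above.

```python
import errno
import os


def try_open(path):
    """Open a device node read-only and close it again; raises OSError on failure."""
    fd = os.open(path, os.O_RDONLY)
    os.close(fd)


def probe_drives(paths, opener=try_open):
    """Return (usable, failed): paths that opened, and a map of path -> errno name."""
    usable = []
    failed = {}
    for path in paths:
        try:
            opener(path)
        except OSError as exc:
            # Record the failure instead of letting it reach a UI exception handler.
            failed[path] = errno.errorcode.get(exc.errno, "unknown")
            continue
        usable.append(path)
    return usable, failed
```

Calling `probe_drives(["/dev/hda", "/dev/xvda", "/dev/xvdb"])` inside the installer environment would then return the drives that can be opened plus the errno name for each one that cannot, leaving the caller to decide whether a failure is worth a dialog.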
Created attachment 306086 [details] Kudzu reports the xvda and xvdb drives Screenshot of terminal, which demonstrates where we got the drive list from. This screenshot shows the results of two commands used during the probe. The first, partedUtils.DiskSet(), is the call made from anaconda. It then calls isys.hardDriveDict and isys.driveDict, which use kudzu.probe. The results of the kudzu probe are in the screenshot too.
Don, Chris, can you advise if this is kudzu or kernel? Thanks.
changing component to kudzu
How do you tell that the device actually has a backing file? Can you post the kernel messages from bootup?
OK, after talking this over with several people, here's what we have up until now: The dom0 explicitly exports the block device as both an IDE disk (hda) and a PV-on-HVM disk (xvda); that way the guest can choose which way to access the disk, assuming it has drivers for both. Additionally Xen FV guests can't boot to anything except IDE, so you definitely need the disk exported as IDE for at least boot. So the question is: why is attempting to access /dev/xvda in anaconda responding with -ENODEV? One possible answer is that since the device is already "in-use" by the IDE driver, the PV-on-HVM drivers are noticing this fact and refusing to access the same disk. That's purely speculation, though; someone needs to go into the PV-on-HVM drivers and find out what is happening for sure. Chris Lalancette
Created attachment 306137 [details] output from dmesg is the above information enough? anything else that could help? thanks!
Well, it shows that the xenblk/xennet drivers are remarkably quiet. Doesn't seem to help with solving this.
Assigning to kernel - there's nothing anaconda or kudzu can do here without additional kernel help (from conversation with Don.) This can be worked around by booting the installer with ide0=noprobe.
Ug. Always the most obvious problems. So, if you boot the installer in U7 under Xen FV guest, and then switch to Alt-F2, you can do a little debugging. If you do "dmesg | grep -i xen", you'll see a bunch of messages saying: XENBUS: Device with no driver: device/vbd/768 or similar. And, in point of fact, it isn't lying; the drivers needed to drive these Xen devices are not included in the anaconda stage2 image (at least, doing modprobe xen-vbd didn't seem to do anything). This explains why kudzu and/or anaconda gets ENODEV while trying to probe these devices, because there is no driver loaded. Digging a little further, what I found is xen-platform-pci (which is the first stage of the PV-on-HVM drivers, needed to do *anything* at all with them) is built-in to the kernel, while the netfront and blkfront drivers are modules. So, xen-platform-pci does come up during boot and do initialization. What I don't quite understand is how/why it creates the /dev/xvd nodes; I would *think* that would be relegated to the individual device drivers (in this case, xen-vbd) to actually register/create those nodes, but I'm not entirely sure. And further on this, it seems odd that a whole slew of device nodes is created; in particular, we have /dev/xvda, /dev/xvda1->14, /dev/xvdb, and /dev/xvdb1->14. I guess at this point we need to answer a few questions, and then try to come up with some possible solutions. Questions: Bill, do you know how /dev device nodes are created during anaconda time? I don't see udev running (and I don't remember seeing it during anaconda startup), so there must be some other mechanism that runs around creating these /dev nodes. Possible Solutions: 1. Have anaconda/kudzu explicitly ignore /dev/xvd devices during install time. Since we can't reasonably expect users to always remember to add "ide0=noprobe" while installing, we *have* to do the install via IDE emulation (since it's already loaded/running by the time anaconda starts, and we can't unload it). 
Just ignoring the /dev/xvd devices is ugly, but it will work as a fallback. 2. Determine why those /dev/xvd devices are being created in the first place, and try to remedy that. This seems like more of the correct solution, but will depend on how these are being probed/created as to whether we can fix it. Oh, and one last thing (which will probably need a new bug); *if* anaconda detects it is running on Xen on 4.7 or later, anaconda should automatically set up the guest to use the PV-on-HVM drivers (including adding any kernel command-line configuration, like ide0=noprobe). This will make the user experience much better out of the box. Chris Lalancette
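The `dmesg | grep -i xen` check above can also be scripted. This is a small sketch, assuming the message format is exactly the one quoted in this comment; `devices_without_driver` is a hypothetical helper name, not part of any existing tool.

```python
import re

# Matches lines such as: XENBUS: Device with no driver: device/vbd/768
NO_DRIVER = re.compile(r"XENBUS: Device with no driver: device/(\w+)/(\d+)")


def devices_without_driver(dmesg_text):
    """Return (device_type, device_id) pairs for each matching XENBUS line."""
    return [(m.group(1), int(m.group(2))) for m in NO_DRIVER.finditer(dmesg_text)]
```

Feeding it the captured dmesg output gives a list of the Xen devices the kernel announced but no driver has claimed yet, which is the condition suspected here of producing ENODEV.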
CC'ing Jeremy for anaconda + virt, but... Device nodes are created by anaconda when it thinks it has a device of a certain type. The xvda device nodes are created because a Xen block device is found in /sys/bus/xen/devices/xvd-* ; if those sysfs entries aren't there, anaconda won't create the devices. The Xen code in 4.7 kudzu is based around using xennet/xenblk as the driver, not xen-vbd. Are there multiple drivers for the same types of devices?
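As a rough illustration of the sysfs check described above, the following sketch lists entries matching `xvd-*` under the Xen bus directory. It is not kudzu's code; `xen_block_entries` is a hypothetical name, and the directory is a parameter so the function can be pointed at a test tree.

```python
from pathlib import Path


def xen_block_entries(sysfs_dir="/sys/bus/xen/devices"):
    """Return the sorted names of xvd-* entries, or [] if the directory is absent."""
    root = Path(sysfs_dir)
    if not root.is_dir():
        return []
    return sorted(entry.name for entry in root.glob("xvd-*"))
```

An empty result would correspond to the case where, per this comment, anaconda would not create the xvd device nodes.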
(In reply to comment #19)
> Ug. Always the most obvious problems. So, if you boot the installer in U7
> under Xen FV guest, and then switch to Alt-F2, you can do a little debugging.
> If you do "dmesg | grep -i xen", you'll see a bunch of messages saying:
>
> XENBUS: Device with no driver: device/vbd/768

These messages are expected. Basically, it says there isn't a driver for that device loaded at that time. When you later load xenblk or whatnot, the device gets attached to the driver.

> And further
> on this, it seems odd that a whole slew of device nodes is created; in
> particular, we have /dev/xvda, /dev/xvda1->14, /dev/xvdb, and /dev/xvdb1->14.

anaconda in RHEL4 (anaconda before F9 in fact) creates devices for all possible partitions of a disk rather than just having the device nodes created "on demand".

> Bill, do you know how /dev device nodes are created during anaconda time? I
> don't see udev running (and I don't remember seeing it during anaconda startup),
> so there must be some other mechanism that runs around creating these /dev nodes.

As Bill said, kudzu tells anaconda the disk is there, so anaconda creates device nodes.

> Possible Solutions:
> 1. Have anaconda/kudzu explicitly ignore /dev/xvd devices during install time.
> Since we can't reasonably expect users to always remember to add "ide0=noprobe"
> while installing, we *have* to do the install via IDE emulation (since it's
> already loaded/running by the time anaconda starts, and we can't unload it).
> Just ignoring the /dev/xvd devices is ugly, but it will work as a fallback.

This would break for users who are installing PV RHEL4.

> 2. Determine why those /dev/xvd devices are being created in the first place,
> and try to remedy that. This seems like more of the correct solution, but will
> depend on how these are being probed/created as to whether we can fix it.

They're being created because the kernel is saying the devices are present.
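The "all possible partitions" behaviour can be pictured with a short sketch. It only builds the node names and creates nothing on disk; `partition_nodes` is a hypothetical helper, and the default of 14 partitions is taken from the /dev/xvda1->14 range observed in this thread rather than from anaconda's source.

```python
def partition_nodes(disk, max_partitions=14):
    """Return the whole-disk node name followed by one name per possible partition."""
    nodes = ["/dev/" + disk]
    nodes.extend("/dev/%s%d" % (disk, n) for n in range(1, max_partitions + 1))
    return nodes
```

For a disk named `xvda` this yields the whole-disk node and then `/dev/xvda1` through `/dev/xvda14`, which matches the slew of nodes noticed in comment #19.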
(In reply to comment #21)
> (In reply to comment #19)
> > Possible Solutions:
> > 1. Have anaconda/kudzu explicitly ignore /dev/xvd devices during install time.
> > Since we can't reasonably expect users to always remember to add "ide0=noprobe"
> > while installing, we *have* to do the install via IDE emulation (since it's
> > already loaded/running by the time anaconda starts, and we can't unload it).
> > Just ignoring the /dev/xvd devices is ugly, but it will work as a fallback.
>
> This would break for users who are installing PV RHEL4.

Although I guess we could conditionalize this path based on whether you're running the PV kernel or a "normal" kernel. But ew.
*** Bug 451390 has been marked as a duplicate of this bug. ***
Created attachment 309672 [details] Fix for anaconda/kudzu failure Also fixes block-detach failure.
Committed in 74.EL. RPMS are available at http://people.redhat.com/vgoyal/rhel4/
An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you. http://rhn.redhat.com/errata/RHSA-2008-0665.html