Description of problem:
virtual disk shows up as both hda (handled by xen-vbd) and sda (handled by libata).

Version-Release number of selected component (if applicable):
kernel-xen-2.6.18-105.el5

How reproducible:
- install rhel5.2
- update to latest nightly (may not be required).
- install 105 kernel from dzickus people.r.c page
- mkinitrd -v -f --preload xen-vbd /boot/initrd-2.6.18-105.el5.img 2.6.18-105.el5
- edit /boot/grub/menu.lst, add ide=disable to the kernel cmd line.
- reboot, watch both drivers find the disk.
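The menu.lst edit in the steps above can be scripted. The following is a minimal sketch, assuming the legacy GRUB format in which the guest kernel is named on a line starting with `kernel`; the helper name `add_kernel_param` and the sample stanza (including its root= value) are made up for illustration and are not taken from the affected system.

```python
def add_kernel_param(menu_lst_text, param):
    """Append `param` to every GRUB-legacy `kernel` line that lacks it."""
    out = []
    for line in menu_lst_text.splitlines():
        words = line.split()
        # Only touch lines whose first word is the `kernel` directive.
        if words and words[0] == "kernel" and param not in words[1:]:
            line = line.rstrip() + " " + param
        out.append(line)
    return "\n".join(out)


# Hypothetical menu.lst stanza, for illustration only.
sample = (
    "title Red Hat Enterprise Linux (2.6.18-105.el5)\n"
    "\troot (hd0,0)\n"
    "\tkernel /vmlinuz-2.6.18-105.el5 ro root=/dev/VolGroup00/LogVol00\n"
    "\tinitrd /initrd-2.6.18-105.el5.img"
)
patched = add_kernel_param(sample, "ide=disable")
print(patched)
```

The helper is idempotent, so running it twice does not add the parameter a second time.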
Created attachment 474065
untested patch to add ide=disable support in ata_piix

Or just hack it with a patch like this (untested).
(In reply to comment #6)
> Created attachment 474065
> untested patch to add ide=disable support in ata_piix
>
> Or just hack it with a patch like this (untested).

So adding a second ide=disable scan/setup in ata_piix will keep the disk from being attached as sda? The current ide=disable keeps the IDE subsystem from attaching to the qemu-defined hda device, which enables the pv-hvm block driver to grab it.
Yes. Here's an alternative way to test it that doesn't require the PV drivers: boot an old kernel, without PV drivers in the initrd, with "root=/dev/sda3 ide=disable" and it will work; boot a patched kernel with the same command line and it will panic.
This request was evaluated by Red Hat Product Management for inclusion in a Red Hat Enterprise Linux maintenance release. Product Management has requested further review of this request by Red Hat Engineering, for potential inclusion in a Red Hat Enterprise Linux Update release for currently deployed products. This request is not yet committed for inclusion in an Update release.
Patch(es) available in kernel-2.6.18-254.el5 You can download this test kernel (or newer) from http://people.redhat.com/jwilson/el5 Detailed testing feedback is always welcomed.
Reproduced this on kernel -238, in a RHEL5 HVM guest. I first appended "ide=disable" in grub boot.conf and booted the guest. After booting successfully, I get both 'hda' and 'sda' in the guest:

$ ls /dev/*da
/dev/hda /dev/sda

After upgrading to kernel -261, the same setup leads to a guest kernel panic, as predicted by Paolo in comment 8.

Then I rebuilt the initrd, adding the xen-vbd PV driver:

$ mkinitrd -v -f --preload xen-vbd /boot/initrd-2.6.18-261.el5.test.img 2.6.18-261.el5

and tested again. This time the guest boots fine and only /dev/hda is present; /dev/sda is gone. As a result, I'm putting this to VERIFIED.
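The check used in this verification (look for hda and sda at the same time) can be written down as a small helper. This is only a sketch under the assumption that the guest has a single virtual disk, so that seeing both names means the same disk was claimed twice; on a guest with a genuine second disk the heuristic would misfire. The function names are illustrative.

```python
import os


def disk_nodes(dev_dir="/dev", names=("hda", "sda")):
    """Return which of the given whole-disk nodes exist under dev_dir."""
    return [n for n in names if os.path.exists(os.path.join(dev_dir, n))]


def classify(present):
    """Map the set of visible disk nodes to a short verdict string."""
    present = set(present)
    if {"hda", "sda"} <= present:
        return "duplicate: disk visible as both hda and sda"
    if present == {"hda"}:
        return "ok: only hda"
    return "other: " + (", ".join(sorted(present)) or "no disk node")


if __name__ == "__main__":
    print(classify(disk_nodes()))
```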
In order to understand the patches for bug 698732 better, I picked up the thread here. I started to experiment with a rhel56-64bit-hvm guest, kernel build -267 (i.e. with attachment 474065 applied). The vm config file says

disk = [ "file:/var/lib/xen/images/rhel56-64bit-hvm.img,hda,w", ",hdc:cdrom,r" ]

See the test results / summary here and the details at the end.

- "ide=disable" -- this turns off both libata and the generic IDE driver
  - if xen-vbd is present in initrd
    - blkfront takes the disk -- All OK
  - if xen-vbd is absent from initrd
    - panic (no driver)

- "ide=disable" is absent
  - both the generic IDE driver and libata are active
  - "piix.intel_via_libata=1" is absent
    - the generic IDE driver claims the disk, *with* PIIX_IDE support
    - since the IDE driver registered the "hda" major devnum, blkfront will fail in xlbd_alloc_major_info() (if the xen-vbd module is present at all in the initrd)
    - since the IDE driver claimed the disk, the libata driver will skip it
    - single driver / major devnum (= hda), All OK
  - "piix.intel_via_libata=1" is present
    - the generic IDE driver claims the disk, but without PIIX_IDE support
    - xlbd_alloc_major_info() will fail as above
    - libata will try to provide PCI support, I believe, but the disk will remain attached as "hda" -- All OK
  - "piix.intel_via_libata=1" is present, with "ide0=noprobe ide1=noprobe" additionally
    - see commit 7f93a6cb / bug 230541 for ideX=noprobe,
      - hits in ide_setup() -- earliest init phase,
    - the generic IDE driver will completely skip the disk,
    - xlbd_alloc_major_info() will *succeed* and register blkfront for hda
    - libata will *also* claim the disk and attach it as sda
    - two major devnums and two drivers for the same disk (verified with "fdisk -l"), BADNESS

Details:

The IDE driver is built into the kernel statically. Normally, the initial ramdisk contains the libata driver as a module (lib/ata_piix.ko, otherwise installed as drivers/ata/ata_piix.ko).
The initialization sequence is as follows:

(1) drivers/ide/ide.c::ide_setup() is called first, very early, basically out of the blue sky. This function sets "disable_ide = 1" if "ide=disable" is passed on the command line.

(2) Then we have a loop in do_initcalls(), iterating over... well... initialization functions:

    init() [kernel/init/main.c]
    -> do_basic_setup()
      -> do_initcalls()

(2a) At some point the loop tries to initialize the statically built-in PIIX_IDE driver,

    -> piix_ide_init() [drivers/ide/pci/piix.c]

This function checks "intel_via_libata" (on the command line: "piix.intel_via_libata=1"). If set, it skips the PIIX_IDE driver registration, but returns with 0 -- success! This seems to allow the disk to be claimed by the generic (?) IDE driver, but without PIIX_IDE support.

If "intel_via_libata" is not set, then piix_ide_init() attempts to register the PIIX_IDE driver:

    -> ide_pci_register_driver() [include/linux/ide.h]
      -> __ide_pci_register_driver() [drivers/ide/setup-pci.c]

(2b) __ide_pci_register_driver() checks "disable_ide" (optionally set in step (1)). If it's set, not only is the registration of the PIIX_IDE (or any other) driver aborted, but -ENODEV is also returned and propagated outwards by piix_ide_init() in step (2a). This stops the generic IDE driver from working with the disk.

(2c) Later, but still in the same do_initcalls() loop, the generic IDE driver (?) initialization is executed (suppose from here on that "disable_ide" is false):

    -> ide_init() [drivers/ide/ide.c]

The following is logged. The section marked PIIX_IDE is dependent on intel_via_libata being false (see (2a)).
[generic]  Uniform Multi-Platform E-IDE driver Revision: 7.00alpha2
[generic]  ide: Assuming 33MHz system bus speed for PIO modes; override with idebus=xx
[PIIX_IDE] PIIX3: IDE controller at PCI slot 0000:00:01.1
[PIIX_IDE] PIIX3: chipset revision 0
[PIIX_IDE] PIIX3: not 100% native mode: will probe irqs later
[PIIX_IDE] PCI: Setting latency timer of device 0000:00:01.1 to 64
[PIIX_IDE] ide0: BM-DMA at 0xc000-0xc007, BIOS settings: hda:pio, hdb:pio
[PIIX_IDE] ide1: BM-DMA at 0xc008-0xc00f, BIOS settings: hdc:pio, hdd:pio
[generic]  Probing IDE interface ide0...
[generic]  hda: QEMU HARDDISK, ATA DISK drive
[generic]  Probing IDE interface ide1...
[generic]  hdc: QEMU CD-ROM, ATAPI CD/DVD-ROM drive
[generic]  ide0 at 0x1f0-0x1f7,0x3f6 on irq 14
[generic]  ide1 at 0x170-0x177,0x376 on irq 15
[generic]  hda: max request size: 512KiB
[generic]  hda: 20480000 sectors (10485 MB) w/256KiB Cache, CHS=16383/255/63
[generic]  hda: cache flushes supported
[generic]  hda: hda1

(4) (skipping (3) intentionally) Then at some point the libata driver is loaded from the initrd and initialized:

    (SCSI subsystem initialized)
    (libata version 3.00 loaded.)

    sys_init_module() [kernel/module.c]
    -> piix_init() [drivers/ata/ata_piix.c]

If the IDE driver hasn't claimed the disk(s), then this module will.

(3) If we add "--preload xen-vbd" to the mkinitrd command line, then the blkfront driver intervenes as step (3) -- it is initialized after the built-in IDE driver, but before the ata_piix module. Here's the call chain:

    backend_changed() [drivers/xen/blkfront/blkfront.c]
    -> connect()
      -> xlvbd_add() [/drivers/xen/blkfront/vbd.c]
        -> xlvbd_alloc_gendisk()
          -> xlbd_get_major_info()

When xlbd_get_major_info() is called for the first time wrt. a specific major device number, it tries to register the blkdev itself:

            -> xlbd_alloc_major_info()
              -> register_blkdev()

The major devnum is dependent on what the sysadmin specified in the vm config file: hda, sda, or xvda.
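To make the major-number conflict concrete, here is a toy user-space model of the ordering described above (built-in IDE first, blkfront second). It is not kernel code. The `BlkdevRegistry` class, the `boot` helper and the owner strings are invented for illustration. Major 3 is the standard Linux major for hda. That the real register_blkdev() refuses an already-registered major with -EBUSY is my understanding of the block layer and should be read as an assumption here.

```python
import errno

IDE0_MAJOR = 3  # Linux block major used for hda/hdb


class BlkdevRegistry:
    """Toy stand-in for the kernel's table of registered block majors."""

    def __init__(self):
        self._owners = {}

    def register_blkdev(self, major, name):
        """Return 0 on success, -EBUSY if the major already has an owner."""
        if major in self._owners:
            return -errno.EBUSY
        self._owners[major] = name
        return 0

    def owner(self, major):
        return self._owners.get(major)


def boot(ide_claims_hda, xen_vbd_preloaded):
    """Replay the init order: built-in IDE first, then blkfront (if present)."""
    reg = BlkdevRegistry()
    results = {}
    if ide_claims_hda:
        results["ide"] = reg.register_blkdev(IDE0_MAJOR, "ide0")
    if xen_vbd_preloaded:
        # Corresponds to xlbd_alloc_major_info() -> register_blkdev().
        results["blkfront"] = reg.register_blkdev(IDE0_MAJOR, "blkfront")
    return reg.owner(IDE0_MAJOR), results


print(boot(ide_claims_hda=True, xen_vbd_preloaded=True))
print(boot(ide_claims_hda=False, xen_vbd_preloaded=True))
```

In the first call the IDE driver keeps the major and blkfront's registration is rejected; in the second, with IDE out of the way, blkfront gets it.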
So the single problematic case is "piix.intel_via_libata=1 ide0=noprobe ide1=noprobe" without "ide=disable", and with xen-vbd in the initrd. I believe we can ignore it, can't we?
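For reference, the case list from the test summary can be condensed into a small lookup. This is only a restatement of those observations for the -267 guest with the disk configured as hda, not a model of the drivers. The function name `predict` and its boolean parameters are mine. Combinations the summary does not cover return None rather than a guess, and treating "ide=disable" as overriding the other switches is an assumption.

```python
from itertools import product


def predict(ide_disable, intel_via_libata, ide_noprobe, xen_vbd):
    """Return (device names, note) for one combination of switches,
    or None if the test summary did not cover that combination."""
    if ide_disable:
        # Both the generic IDE driver and the patched libata stay away.
        if xen_vbd:
            return (["hda"], "blkfront takes the disk")
        return ([], "panic: no driver for the disk")
    if ide_noprobe:
        if intel_via_libata and xen_vbd:
            return (["hda", "sda"], "blkfront and libata both claim the disk")
        return None  # not tested in the summary
    # IDE probes normally and claims the disk; blkfront and libata back off.
    return (["hda"], "generic IDE driver owns the disk")


# Enumerate every combination and keep those where the disk shows up twice.
bad = []
for combo in product([False, True], repeat=4):
    result = predict(*combo)
    if result is not None and len(result[0]) > 1:
        bad.append(combo)

print(bad)
```

Each tuple in `bad` is ordered as (ide_disable, intel_via_libata, ide_noprobe, xen_vbd).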
An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2011-1065.html