Created attachment 396262 [details]
boot process with error

Description of problem:
When I have /boot on a virtio drive and / on an iSCSI target, boot crashes with:

ERROR: Interface setup failed: pupSetupInterface failed: get link - 19: No such device.

When I use the default network model (not virtio), everything works fine and the system boots. This happened to me on RHEL 5.4 and RHEL 5.5; it didn't happen on Fedora 12 x86_64.

Version-Release number of selected component (if applicable):
RHEL5.5-Server-20100211.0 x86_64
kvm-83-157.el5.x86_64

How reproducible:
always

Steps to Reproduce:
1. prepare iscsi target
2. start installation using command:
   /usr/libexec/qemu-kvm -m 1024 -net tap -net nic,model=virtio -drive if=virtio,file=boot.img,boot=on -cdrom path_to_iso -boot d
3. when asked, add the prepared iscsi target and use the automatic layout
4. complete the installation and shut down the machine
5. start the virtual machine using command:
   /usr/libexec/qemu-kvm -m 1024 -net tap -net nic,model=virtio -drive if=virtio,file=boot.img,boot=on -cdrom path_to_iso -boot c

Actual results:
System fails to boot (failure screenshot in attachment).

Expected results:
System should boot successfully.

Additional info:
The system installed into the virtual machine was RHEL5.5-Client-20100217.0 x86_64. This shouldn't be a problem with the installed OS though, because, as I mentioned in the description, an identical system booted successfully on Fedora 12 x86_64.
Created attachment 396263 [details] another screenshot
Isn't it a KVM bug?
Yes, it seems like a KVM bug (see component set as KVM).
I just tested it using a RHEL-5.4 guest (I am still downloading a RHEL5.5 ISO) and the guest booted successfully and mounted / on iscsi, using the following command line:

/usr/libexec/qemu-kvm -m 1024 -net tap -net nic,model=virtio -drive if=virtio,file=/mnt/common/images/rhel6-iscsi.img,boot=on -cdrom /dev/cdrom -boot dc -vnc :2

On the other hand, I could easily reproduce a similar boot failure by changing the NIC model to rtl8139:

/usr/libexec/qemu-kvm -m 1024 -net tap -net nic,model=rtl8139 -drive if=virtio,file=/mnt/common/images/rhel6-iscsi.img,boot=on -cdrom /dev/cdrom -boot dc -vnc :2

This is suspicious:

> When i use default network model (not virtio), everything works fine, and system boots.

You are not supposed to be able to boot the guest if you change the NIC model, as the NIC driver needs to be inside the initrd image. Are you sure the guest was not installed using the default NIC model instead of virtio?

In case you can still reproduce it, can you attach a copy of the guest initrd, so we can check if everything required to boot the guest is present?
I could reproduce the bug installing RHEL5.5-Client-20100322.0-x86_64 as a guest. I will check what can be wrong on the initrd image installed by the RHEL5.5 guest that makes it unable to initialize the network, as the RHEL-5.4 worked as expected.
Found the cause: the initrd generated by RHEL-5.5 doesn't load the virtio_ring and virtio_pci modules before loading virtio_net and initializing the network, hence the virtio_net device is not initialized.

This is the diff between the RHEL5.4 and RHEL5.5 initrd /init files:

--- i54/initrd/init 2010-09-29 17:03:12.000000000 -0300
+++ i55/initrd/init 2010-09-29 17:24:01.000000000 -0300
@@ -52,14 +52,6 @@
 insmod /lib/jbd.ko
 echo "Loading ext3.ko module"
 insmod /lib/ext3.ko
-echo "Loading virtio.ko module"
-insmod /lib/virtio.ko
-echo "Loading virtio_ring.ko module"
-insmod /lib/virtio_ring.ko
-echo "Loading virtio_pci.ko module"
-insmod /lib/virtio_pci.ko
-echo "Loading virtio_blk.ko module"
-insmod /lib/virtio_blk.ko
 echo "Loading scsi_mod.ko module"
 insmod /lib/scsi_mod.ko
 echo "Loading sd_mod.ko module"
@@ -74,6 +66,8 @@
 insmod /lib/libiscsi_tcp.ko
 echo "Loading iscsi_tcp.ko module"
 insmod /lib/iscsi_tcp.ko
+echo "Loading virtio.ko module"
+insmod /lib/virtio.ko
 echo "Loading virtio_net.ko module"
 insmod /lib/virtio_net.ko
 echo Bringing up eth0
@@ -81,7 +75,13 @@
 network --device eth0 --bootproto dhcp
 rename /var/lib/dhclient/dhclient.leases /var/lib/dhclient/dhclient-eth0.leases
 echo Attaching to iSCSI storage
-/bin/iscsistart -t iqn.2001-04.net.raisama:my-first-iscsi-vol -i iqn.1994-05.com.rhel:01.502165 -g 1 -a 172.31.74.35
+/bin/iscsistart -t iqn.2001-04.net.raisama:my-first-iscsi-vol -i iqn.1994-05.com.rhel:01.a262aa -g 1 -a 172.31.74.35
+echo "Loading virtio_ring.ko module"
+insmod /lib/virtio_ring.ko
+echo "Loading virtio_pci.ko module"
+insmod /lib/virtio_pci.ko
+echo "Loading virtio_blk.ko module"
+insmod /lib/virtio_blk.ko
 echo "Loading libata.ko module"
 insmod /lib/libata.ko
 echo "Loading ata_piix.ko module"

If I add the missing virtio_ring and virtio_pci insmod lines to the RHEL5.5 initrd before the virtio_net one, iscsi is initialized successfully by the initrd.
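A quick way to check an extracted init script for the ordering problem described above is sketched here. This is only an illustration: the function names insmod_line and virtio_order_ok are made up for this example, and it assumes the init script uses plain "insmod /lib/<module>.ko" lines like the ones in the diff.

```shell
#!/bin/sh
# Hypothetical helpers (not part of mkinitrd) for inspecting an extracted
# initrd "init" script.

# Print the line number of the first "insmod .../<module>.ko" line,
# or nothing if the module is never loaded.
# $1 = path to the init script, $2 = module name (e.g. virtio_pci)
insmod_line() {
    grep -n "^insmod .*/$2\.ko" "$1" | head -n 1 | cut -d: -f1
}

# Return 0 if virtio_ring and virtio_pci are both loaded before virtio_net
# (or if virtio_net is not loaded at all); return 1 otherwise.
virtio_order_ok() {
    init=$1
    net=$(insmod_line "$init" virtio_net)
    [ -n "$net" ] || return 0          # no virtio_net: nothing to check
    for mod in virtio_ring virtio_pci; do
        line=$(insmod_line "$init" "$mod")
        # Fail if the module is missing or is loaded after virtio_net.
        [ -n "$line" ] && [ "$line" -lt "$net" ] || return 1
    done
    return 0
}
```

With an init script unpacked from the guest's initrd, something like `virtio_order_ok ./init || echo "virtio_net is loaded too early"` would flag the RHEL5.5 layout shown in the diff.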
I can't see how the same guest could have worked under a Fedora 12 host (as claimed on comment #0), as the guest initrd is not loading the required modules to initialize the virtio network interface.
Being a guest bug, this piece of information is important: the RHEL ISO used to install the guest was RHEL5.5-Client-20100322.0-x86_64. I don't know if the latest RHEL-5.6 snapshot still has the bug. I will check that soon.
I could reproduce it on the latest RHEL-5.6 snapshot on a RHEL6 host, using the following command:

virt-install -l http://download.devel.redhat.com/rel-eng/RHEL5.6-Server-20100928.0/5/x86_64/os/ --name rhel56-iscsi --disk /root/iscsi-boot-56.img,size=1 --prompt --vnclisten=0.0.0.0 --vnc --vncport=5903 --os-type=linux --os-variant=rhel5 --network bridge=eth0,model=virtio -r 1024
Created attachment 450783 [details] tgz dump of boot partition after RHEL-5.6 install Attaching initrd file generated by the RHEL-5.6 guest install.
*** Bug 645831 has been marked as a duplicate of this bug. ***
The module dependencies for virtio_net don't show a dependency on virtio_pci:

pjones4:~/Download/tmp$ modprobe -d $PWD --set-version 2.6.18-228.el5.kpq2 --show-depends virtio_net
insmod /home/pjones/Download/tmp/lib/modules/2.6.18-228.el5.kpq2/kernel/drivers/virtio/virtio.ko
insmod /home/pjones/Download/tmp/lib/modules/2.6.18-228.el5.kpq2/kernel/drivers/net/virtio_net.ko
pjones4:~/Download/tmp$ modprobe -d $PWD --set-version 2.6.18-228.el5.kpq2 --show-depends virtio
insmod /home/pjones/Download/tmp/lib/modules/2.6.18-228.el5.kpq2/kernel/drivers/virtio/virtio.ko
pjones4:~/Download/tmp$

Without a kernel dependency there, there is no strict ordering requirement expressed to mkinitrd, and it has no way of knowing if one module must be loaded before any other. If there's a dependency here, the modules must say so.

One potential workaround is to run mkinitrd with "--with virtio_pci".
These modules can be loaded in any order. There's no dependency on virtio-pci because virtio-blk and virtio-net are devices on the virtio bus. virtio-pci is normally loaded by hotplug and that creates a virtio-blk device on the virtio bus.
I agree with Michael. virtio-pci is just like a USB host controller. There is never going to be a dependency on it by actual device drivers. It is up to the mkinitrd tool to deal with this by always including modules like that if it is to support booting from these classes of devices.
Well, then somebody needs to supply me with a programmatic way to determine when it should be loaded. Currently for storage we check the device paths of the slaves/* devices for storage devices we're installing on to. Something similar would work here, but obviously that hack is storage-only.
Can somebody please test with the packages at http://brewweb.devel.redhat.com/brew/taskinfo?taskID=2859104 and see if this hack fixes the problem?
(In reply to comment #23)
> Can somebody please test with the packages at
> http://brewweb.devel.redhat.com/brew/taskinfo?taskID=2859104 and see if this
> hack fixes the problem?

Same result, but I found the fix. The ramdisk created by this version of mkinitrd contains all the important bits; virtio_pci (and virtio_ring before virtio_pci) has to be loaded right before virtio_net. So the 'insmod' sequence is:

echo "Loading virtio.ko module"
insmod /lib/virtio.ko
echo "Loading virtio_ring.ko module"
insmod /lib/virtio_ring.ko
echo "Loading virtio_pci.ko module"
insmod /lib/virtio_pci.ko
echo "Loading virtio_net.ko module"
insmod /lib/virtio_net.ko
.... (network configuration is starting here)
(In reply to comment #21)
> Well, then somebody needs to supply me with a programmatic way to determine
> when it should be loaded. Currently for storage we check the device paths of
> the slaves/* devices for storage devices we're installing on to. Something
> similar would work here, but obviously that hack is storage-only.

I went to check the /sys/ structure for virtio-pci and it looks like there is no information there indicating that the devices inside /sys/devices/virtio-pci actually correspond to PCI devices that are handled by the virtio-pci module.

So, it looks like all we can do with the current ABI is to check if the device backing the network interface is in /sys/devices/virtio-pci, and manually add virtio-pci to the list of required modules. Something like the pseudocode below should work:

require_device(dev) {
    driver=readlink("$dev/driver")
    module=readlink("$driver/module")
    required_modules = $module;
    if (dev ~= "devices/virtio-pci/.*")
        required_modules += "virtio-pci";
    return $required_modules;
}

interface_for_iscsi=eth0
netdev=readlink("/sys/class/net/$interface_for_iscsi/device")
modules_for_iscsi += require_device($netdev)
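For illustration, the pseudocode above might translate into shell roughly as follows. This is a sketch under assumptions, not what mkinitrd does: the function name required_modules_for_iface and its optional sysfs-root argument are invented here (the second argument only exists so the function can be tried against a fake tree), and it assumes the driver directory exposes a "module" symlink as the pseudocode does.

```shell
#!/bin/sh
# Hypothetical helper: print the kernel modules needed for the device that
# backs a network interface, adding virtio_pci when the device lives under
# /sys/devices/virtio-pci.
# $1 = interface name (e.g. eth0)
# $2 = sysfs root, defaults to /sys (an assumption made for testability)
required_modules_for_iface() {
    iface=$1
    sysfs=${2:-/sys}
    devlink="$sysfs/class/net/$iface/device"
    [ -d "$devlink" ] || return 1

    # Resolve the "device" symlink to the real device directory.
    dev=$(cd -P "$devlink" && pwd -P) || return 1

    modules=""
    # driver -> .../drivers/<driver>, and <driver>/module -> .../module/<name>
    if [ -L "$dev/driver/module" ]; then
        modules=$(basename "$(readlink "$dev/driver/module")")
    fi

    # Devices on the virtio bus sit below devices/virtio-pci/.
    case "$dev" in
        */devices/virtio-pci/*) modules="$modules virtio_pci" ;;
    esac

    # Unquoted on purpose so a leading space is dropped.
    echo $modules
}
```

On a guest with a virtio NIC, `required_modules_for_iface eth0` would be expected to list the NIC driver followed by virtio_pci, provided the sysfs layout matches what the comment describes.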
What's "readlink /sys/class/net/$device/device/bus" going to show me here?
In both anaconda and on installed system: ../../../bus/virtio
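The bus check discussed above can be wrapped in a small helper, sketched here. The names net_bus and needs_virtio_pci are hypothetical, as is the optional sysfs-root argument; the sketch relies on the device/bus symlink being present, as reported for this kernel in the comment above.

```shell
#!/bin/sh
# Hypothetical helper: print the bus name of the device behind a network
# interface, taken from the last component of the device/bus symlink.
# $1 = interface name, $2 = sysfs root (defaults to /sys; an assumption
# made so the function can be exercised against a fake tree)
net_bus() {
    iface=$1
    sysfs=${2:-/sys}
    link=$(readlink "$sysfs/class/net/$iface/device/bus" 2>/dev/null) || return 1
    basename "$link"
}

# Return 0 if the interface sits on the virtio bus, i.e. virtio_pci has to
# be added to the initrd explicitly.
needs_virtio_pci() {
    [ "$(net_bus "$1" "$2")" = "virtio" ]
}
```

With the symlink target quoted above (../../../bus/virtio), `net_bus eth0` prints `virtio`.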
Alright, that should work then. Let's give this another go with the packages built at: https://brewweb.devel.redhat.com/taskinfo?taskID=2898028
I just tested it, and it didn't work because mkinitrd enters the "if [ -f /sys/class/net/$device/device/modalias ]" branch. The following additional change, however, works:

--- /sbin/mkinitrd 2010-11-16 14:00:12.000000000 -0500
+++ /sbin/mkinitrd.hack 2010-11-18 10:48:17.000000000 -0500
@@ -479,12 +479,12 @@
             done
         elif [ "$(basename $(readlink /sys/class/net/$device/device/bus) 2>/dev/null)" = "xen" ]; then
             findmodule xennet # FIXME: hack for xennet sucking
-        elif [ "$(basename $(readlink /sys/class/net/$device/device/bus) 2>/dev/null)" = "virtio" ]; then
-            findmodule virtio_pci # of course virtio sucks the same way xennet
-            findmodule virtio_net # does...
         else
             findmodule $(ethtool -i $device | awk '/^driver:/ { print $2 }')
         fi
+        if [ "$(basename $(readlink /sys/class/net/$device/device/bus) 2>/dev/null)" = "virtio" ]; then
+            findmodule virtio_pci # of course virtio sucks the same way xennet
+        fi
     done
 }
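The control-flow issue can be seen in isolation with a toy model. Everything here is a stand-in: findmodule only prints its argument, the two select_* functions are hypothetical, and the "modalias exists" condition is passed in as a flag instead of being read from sysfs.

```shell
#!/bin/sh
# Stand-in for mkinitrd's findmodule: here it only prints the module name.
findmodule() { echo "$1"; }

# Shape of the first hack: the virtio check is an elif, so it is skipped
# whenever the modalias branch is taken.
# $1 = "yes" if the modalias file exists, $2 = bus name
select_elif() {
    if [ "$1" = yes ]; then
        findmodule virtio_net      # stands in for the modalias lookup
    elif [ "$2" = virtio ]; then
        findmodule virtio_pci
        findmodule virtio_net
    fi
}

# Shape of the change in this comment: the virtio check is a separate
# if, so it runs regardless of which branch picked the NIC driver.
select_separate() {
    if [ "$1" = yes ]; then
        findmodule virtio_net      # stands in for the modalias lookup
    fi
    if [ "$2" = virtio ]; then
        findmodule virtio_pci
    fi
}
```

Calling `select_elif yes virtio` never requests virtio_pci, while `select_separate yes virtio` does, which is the difference the patch above makes.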
While reproducing the bug and testing the fix from comment 29, we experienced random behaviour.

We were sometimes able to boot:
RHEL-5-Server-U5 x86_64 (physical machine)
RHEL5.6-Server-20101110.0 x86_64 (virtual machine) with root on iscsi
where the initrd contained the correct sequence:

grep virtio init | grep -v ^echo
insmod /lib/virtio.ko
insmod /lib/virtio_ring.ko
insmod /lib/virtio_pci.ko
insmod /lib/virtio_blk.ko
insmod /lib/virtio_net.ko

But most of the time not:
RHEL-5-Server-U5 x86_64 (physical machine)
RHEL5.6-Server-20101110.0 x86_64 (virtual machine) with root on iscsi
where the initrd contained this sequence (virtio_net loaded before virtio_ring and virtio_pci):

grep virtio init | grep -v ^echo
insmod /lib/virtio.ko
insmod /lib/virtio_net.ko
insmod /lib/virtio_ring.ko
insmod /lib/virtio_pci.ko
insmod /lib/virtio_blk.ko

This behaviour was observed on the same physical machines with the same (both kickstart and manual) installations.
(In reply to comment #30)

Both sequences are supposed to work. If loading virtio_pci after virtio_net doesn't work, it is either a virtio_net or virtio_pci bug.

However, if loading virtio_pci first is safer, doing it looks better. Diff replacing the one on comment #29 is below.

I didn't get any random behavior by mkinitrd after doing the change below, and virtio_pci seems to be always added before virtio_net. After applying the change below, I am always seeing this on the mkinitrd -v output:

Adding module virtio
Adding module virtio_ring
Adding module virtio_pci
Adding module virtio_net

--- /sbin/mkinitrd 2010-11-16 14:00:12.000000000 -0500
+++ /sbin/mkinitrd.hack 2010-11-18 12:41:29.000000000 -0500
@@ -471,6 +471,9 @@
                 continue ;;
             *) handleddevices="$handleddevices $device" ;;
         esac
+        if [ "$(basename $(readlink /sys/class/net/$device/device/bus) 2>/dev/null)" = "virtio" ]; then
+            findmodule virtio_pci # of course virtio sucks the same way xennet
+        fi
         if [ -f /sys/class/net/$device/device/modalias ]; then
             modalias=$(cat /sys/class/net/$device/device/modalias)
             moduledep $modalias
@@ -479,9 +482,6 @@
             done
         elif [ "$(basename $(readlink /sys/class/net/$device/device/bus) 2>/dev/null)" = "xen" ]; then
             findmodule xennet # FIXME: hack for xennet sucking
-        elif [ "$(basename $(readlink /sys/class/net/$device/device/bus) 2>/dev/null)" = "virtio" ]; then
-            findmodule virtio_pci # of course virtio sucks the same way xennet
-            findmodule virtio_net # does...
         else
             findmodule $(ethtool -i $device | awk '/^driver:/ { print $2 }')
         fi
Should be fixed in mkinitrd-5.1.19.6-66.el5 then. Here's the brew link: http://brewweb.devel.redhat.com/brew/taskinfo?taskID=2902249 .
Thanks, packages provided from comment 32 seem to fix this issue. I've tried several installations with these packages and all have successfully booted.
mkinitrd-5.1.19.6-66.el5 from comment #32 is included in snapshot #3 (-1124.1). Comment #33 says the bug is fixed. Moving to VERIFIED.
Bug hasn't been properly verified. I'll verify it today.
OK, performed several more installations of x86_64 RHEL5.6-Server-20101124.1 and didn't hit this bug. Moving to verified.
Technical note added. If any revisions are required, please edit the "Technical Notes" field accordingly. All revisions will be proofread by the Engineering Content Services team. New Contents: Previously, virtual machines using iscsi could not boot correctly after installation. With this update booting works correctly.
An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2011-0110.html
*** Bug 645719 has been marked as a duplicate of this bug. ***
*** Bug 547670 has been marked as a duplicate of this bug. ***