Created attachment 361994 [details] Machine cpuinfo from RHEL5.4 kernel

Description of problem:
F12 Virt Test Day liveCD fails to boot on Weybridge devel machines. This machine boots rhel5.4 successfully, and supports device assignment when intel_iommu=on is set.

Version-Release number of selected component (if applicable):
kernel: 2.6.31-12.fc12.x86_64 (installed from liveCD)
liveCD: desktop-x86_64-20090915.15.iso

How reproducible:
Every boot

Steps to Reproduce:
1. Insert liveCD into CD drive of Weybridge machine
2. Reboot existing OS or power cycle machine

Actual results:
Fails with the following messages on the console screen:
No root device found.
Boot has failed, sleeping forever.

Expected results:
System boots to the liveCD login screen and is able to install F12 onto the machine.

Additional info:
I edited the boot cmdline to remove "quiet" and add "debug", and it gave some additional device config output, but ended in the same failure. Also tried with intel_iommu=off and had the same result/problem.
Created attachment 361995 [details] lspci -vvv output from rhel5.4 kernel
Created attachment 361996 [details] dmidecode from rhel5.4 kernel on this machine
Created attachment 361997 [details] dmesg from rhel5.4 system w/intel_iommu=on
Don: can you get full dmesg output over serial for the failing boot? There aren't many clues to go on here. You could also try a more recent boot.iso and see if it's still broken: http://download.fedoraproject.org/pub/fedora/linux//development/x86_64/os/images/boot.iso
hmmm... title of bz is wrong. This is about the Weybridge machine, not the virtlab Nehalem machines. Anyhow, I was able to boot the boot.iso; but during F12 Virt Test Day, we were given liveCDs to test with, not boot.iso's. Now, if liveCDs start with the same boot.iso, then this is an improvement, since I can get to the installation screens to select partitions to load F12 on, etc. (which I will do shortly, once I re-figure out which partition is rhel5 & which is fedora! ;-) ). If I can get some time on the Nehalem virt lab machines (from cdub), I'll try the boot.iso there as well.
okay, that's useful data. The latest nightly live CD composes get dumped here: http://alt.fedoraproject.org/pub/alt/nightly-composes/desktop/ Maybe you could try out the i386 one there? If that doesn't work, we should capture the log and move the bug to livecd-tools.
I was able to boot w/the boot.iso listed in c#4. Able to do an upgrade on a Fedora installation that had gone from f9 to f10 to f11. One 'new' feature: when doing an upgrade to f12, it nuked/removed/deleted all other fedora kernels on the system.... sigh.... Anyhow, after installation, rebooted successfully. Will try kvm guest (with & without (nic) dev assignment) next. Re-assigning to liveCD, since it appears that multiple installations fail with a similar error, which looks like other dracut errors. note: The Weybridge under test has 10 partitions, multiple LVMs, some partitions not in LVMs, some partitions labeled, but not all of them .... so quite the 'variety' of partitions & their uses.
Not sure that moving this to LiveCD will help. If anything, this appears to be either a kernel or dracut problem.
Moving to dracut for now.
is this still an issue?
I don't know. Haven't tried an updated liveCD of F12 on a Nehalem; I've seen lots of chatter about anaconda trying to open all filesystems on an install (vs update), and how that logic is not so well liked (for other reasons like encrypted filesystems). I recommend a test on a Nehalem (or Tylersburg) system (virtlab16->virtlab17) with latest F12 liveCD to see if it's still busted. - Don
(In reply to comment #11) > I recommend a test on a Nehalem (or Tylersburg) system > (virtlab16->virtlab17) with latest F12 liveCD to see if > it's still busted. Please let us know when you re-test
I tested desktop-x86_64-20091109.15.iso late yesterday and it still fails with 'no root device found' on virtlab17 (Tylersburg machines; the bz for the Tylersburg machine is BZ 527529). I did not get a chance to test on the Weybridge machine; I'll test that machine on Thursday. On the Tylersburg machines, it no longer crashes in the graphics driver, so it is better than the Virt Test Day version. It now fails more like the Weybridge machine did, but with some new wrinkles, so I'm adding my test results here, since they may be relevant to the F12 LiveCD problem. I got the following (when removing "quiet", adding "debug" to kernel cmdline):
dracut: Starting plymouth daemon
: Starting plymouth daemon
: rd-NO-MD: removing MD RAID activation
: rd-NO-MDIMSM: no MD RAID for imsm/isw raids
: scanning sda2 for LVM volume groups
: Reading all physical volumes. This may take a while
: Found volume group "VolGroup00" using metadata type lvm2
: 2 logical volume(s) in volume group "VolGroup00" now active
Note that the order of this output wrt various driver init completions varies from boot run to boot run. Also note, I'm being a bit lazy, and all above lines are prefixed by "dracut:". Another oddity -- the last 4 lines of the above dracut output repeat, but the second block is often separated from the above block by other driver init completions. It is after this second block of dracut messages that the famous "No root device found" msg comes out, followed by "Boot has failed, sleeping forever".
I also tried booting w/rhgb removed and saw no difference. I also tried with irqpoll, and that ordered the driver output a bit differently, but same end result. I tried with intel_iommu=off and the system never gets to the No root device found msg; it always hangs after various messages related to the mpt2sas (LSILOGIC SCSI) driver. I tried w/iommu=pt and got the No root device found failure. I tried w/iommu=soft and it hung after mpt2sas's msg of 'version 01.100.04.00 loaded'. So, on the Tylersburg systems, it now appears to be an mpt2sas driver problem. One more test I think I'll try on those machines is with mem=2G to see if it gets the mpt2sas driver to configure successfully.
In summary:
-- Tylersburg machine better, but still failing
-- need to re-test Weybridge tomorrow & report results in this bz.
Test results on Weybridge machine: Essentially, it is the same as for the virtlab17 machine: I always end up with 'No root device found'. I tried (all with quiet & rhgb removed, debug set on kernel cmdline):
(a) intel_iommu=off
(b) iommu=pt
(c) iommu=soft
(d) mem=2G
(e) (a) & (d) together
(f) removed all the no-raid & no-luks cmdline switches
As on the Tylersburg machines, dracut saw the VolGroups on 2 different disks on that system. Graphics worked as well when rhgb was left on; it just died with "unable to remove a fb that we didn't own" after the 'No root device found'. btw -- it was interesting to see that despite the message "Boot failed, sleeping forever", if I toggled my console-KVM, I'd see USB unplug, plug-in messages come out.... so it could be woken up! ;-) So, although the Tylersburg systems may be seeing an mpt2sas issue, there is no such device on a Weybridge, and I'm getting similar 'No root device found' problems. Given that a generic boot.iso works, it still points to further dracut problems. I still don't understand why VTd systems fail with the liveCD, given that it fails even when VTd is forced off (intel_iommu=off or iommu=soft).
try adding "rd_info rd_initdebug" and removing "rhgb quiet" to/from the kernel command line. btw, what is your kernel command line?
(In reply to comment #15) > try adding "rd_info rd_initdebug" and removing "rhgb quiet" to/from the kernel > command line. of course it should be "rdinfo rdinitdebug" ... -ENOCOFFEE > > btw, what is your kernel command line?
Attached is the boot up log with the following kernel command line:
Command line: initrd=initrd0.img root=live:CDLABEL=desktop-x86_64-20091109.15 rootfstype=auto ro liveimg debug console=ttyS0,115200 console=tty0 rdinfo rdinitdebug rd_NO_LUKS rd_NO_MD noiswmd BOOT_IMAGE=vmlinuz0
I removed the "quiet rhgb" and added "debug console=.... rdinitdebug" so I could get the output on a serial console & saved as attachment. The rest of the command line comes from the LiveCD image. Note, I added 'debug' in order to get dracut output in boot-up log; w/o it, no dracut output would occur, just the last two log sections after the No root device found. Let me know if I can provide more (testing) info. - Don
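To make it easier to see which root device specification the initramfs is being handed, here is a minimal sketch that pulls the `root=` argument out of a kernel command line string. The function name `get_root_arg` is hypothetical and not part of dracut; the command line is the one quoted in this comment, and on a running system the same string could be read from `/proc/cmdline`.

```shell
#!/bin/sh
# Hypothetical helper (not part of dracut): print the value of the last
# root= argument found in a kernel command line string.
get_root_arg() {
    root_value=""
    # Rely on word splitting: kernel arguments are separated by spaces.
    # (Assumes the command line contains no shell glob characters.)
    for arg in $1; do
        case "$arg" in
            # Matches "root=..." only; "rootfstype=..." does not match.
            root=*) root_value="${arg#root=}" ;;
        esac
    done
    printf '%s\n' "$root_value"
}

cmdline='initrd=initrd0.img root=live:CDLABEL=desktop-x86_64-20091109.15 rootfstype=auto ro liveimg debug console=ttyS0,115200 console=tty0 rdinfo rdinitdebug rd_NO_LUKS rd_NO_MD noiswmd BOOT_IMAGE=vmlinuz0'
get_root_arg "$cmdline"
```

For the command line above this prints `live:CDLABEL=desktop-x86_64-20091109.15`, which is the label the boot has to locate.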
Created attachment 369260 [details] Boot log of failed F12 LiveCD of 20091109.15 x86_64 on Weybridge machine
seems like it cannot find a device/partition with a filesystem LABEL "desktop-x86_64-20091109.15"
how is desktop-x86_64-20090915.15.iso presented to the system?
What do you mean '..presented to the system?' ??? If I put the CD into my w/s, it says its label is: "desktop-x86_64-20091109.15" and its filesystem is iso9660. Note that in the boot log, it states "CDLABEL=desktop-x86_64-20091109.15" and not "LABEL=desktop-x86_64-20091109.15". Are they equivalent ???
How is the iso image bound in the system? real CDROM? virtual CDROM?
I think it's pretty clear - he's inserting a physical CDROM into his machine and booting from it
scsi 1:0:0:0: CD-ROM PLEXTOR DVDR PX-755A 1.04 PQ: 0 ANSI: 5
sda8sr0: scsi3-mmc drive: 40x/40x writer cd/rw xa/form2 cdda tray
Uniform CD-ROM driver Revision: 3.20
sr 1:0:0:0: Attached scsi CD-ROM sr0
ok, then please run with the cdrom inserted:
$ blkid /dev/cdrom
and/or
$ /lib/udev/vol_id /dev/cdrom
to check the filesystem label of the cdrom.
This bug appears to have been reported against 'rawhide' during the Fedora 12 development cycle. Changing version to '12'. More information and reason for this action is here: http://fedoraproject.org/wiki/BugZappers/HouseKeeping
On my w/s, which uses a DVD reader....
# blkid /dev/dvd
/dev/dvd: LABEL="desktop-x86_64-20091109.15" TYPE="iso9660"
# /lib/udev/vol_id /dev/dvd
ID_FS_USAGE=filesystem
ID_FS_TYPE=iso9660
ID_FS_VERSION=
ID_FS_UUID=
ID_FS_LABEL=desktop-x86_64-20091109.15
ID_FS_LABEL_SAFE=desktop-x86_64-20091109.15
In case there was something odd w/my Weybridge's DVD reader, I ran the same cmds on the CD there and it generated the same output as well.
ok, looks good. Now boot with the live CD, and add to the kernel command line: "rdinfo rdinitdebug rdshell" and you will get a shell in case of a failed boot. Run blkid on the cdrom again:
# blkid /dev/cdrom
then run:
# ls /dev/disk/by-label
and please provide the output of dmesg
(In reply to comment #28) > and please provide the output of dmesg scratch that
# blkid /dev/scd0 (/dev/cdrom doesn't exist)
LABEL = "desktop-x86_64-20091109.15" TYPE="iso9660"
# ls /dev/disk/by-label
SWAP-sda3 \x2f \x2fboot \x2fboot_el5_32 \x2fboot_el5_64 \x2fboot_f9_32 \x2fguest_images \x2froot_el5_32 \x2froot_el5_64 \x2froot_f9_32
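The listing above has no entry for the CD's label, and labels containing "/" appear with the slash written as `\x2f`. As a minimal sketch, assuming the initramfs resolves `CDLABEL=`/`LABEL=` through `/dev/disk/by-label` (an assumption, not confirmed in this report), the check can be scripted. `label_to_link_name` and `has_label` are hypothetical helper names.

```shell
#!/bin/sh
# Hypothetical helpers for checking a filesystem label against a
# by-label style directory.

# Encode "/" the way the listing above shows it (\x2f). Other characters
# that udev may escape are not handled in this sketch.
label_to_link_name() {
    printf '%s\n' "$1" | sed 's,/,\\x2f,g'
}

# Succeed if directory $1 contains an entry for label $2.
has_label() {
    dir=$1
    name=$(label_to_link_name "$2")
    [ -e "$dir/$name" ] || [ -L "$dir/$name" ]
}

if has_label /dev/disk/by-label 'desktop-x86_64-20091109.15'; then
    echo "label present"
else
    echo "label missing"
fi
```

Run from the rdshell, a "label missing" result together with a good `blkid` result would point at udev not creating the symlink rather than at the disc itself.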
/dev/cdrom does not exist? very strange... something is wrong with udev and your CDROM. Workaround: specify on the kernel command line root=/dev/scd0
(In reply to comment #31) > /dev/cdrom does not exist? very strange... something is wrong with udev and > your CDROM. > > Workaround: specify on the kernel command line " root=/dev/scd0 liveimg"
also, I think, specifying both "liveimg" _and_ "root=live:" might create a problem. either: "liveimg root=CDLABEL=desktop-x86_64-20091109.15" or "root=live:CDLABEL=desktop-x86_64-20091109.15"
(In reply to comment #30) > # blkid /dev/scd0 (/dev/cdrom doesn't exist) > LABEL = "desktop-x86_64-20091109.15" TYPE="iso9660" > > # ls /dev/disk/by-label > SWAP-sda3 \x2f \x2fboot \x2fboot_el5_32 \x2fboot_el5_64 \x2fboot_f9_32 > \x2fguest_images \x2froot_el5_32 \x2froot_el5_64 \x2froot_f9_32 it would be very interesting to see the output of a boot with "rdudevdebug" added to the kernel command line.
None of the recommendations in c#32 or c#33 helped. In all cases, it ended with root device not found. note: when trying " root=/dev/scd0 liveimg" it did generate a new/extra output: failed: you must specify filesystem type then the all-too-common Can't mount root filesystem When adding 'rdudevdebug' to the kernel cmd line, it generated so much output that it overran my multi-megabyte screen buffer, and it took over 3 mins to get it to the boot failure. Do you want me to reconfigure my console screen so I can catch the full rdudevdebug output, and post it here?
Does it work if you boot with "rdshell" and mount the cdrom by hand? "failed: you must specify filesystem type" looks like something is wrong...
If I remove rhgb & quiet and add debug & rdshell, it fails to boot & drops into rdshell. At rdshell, I can mount the cdrom using
mount /dev/scd0 /
and it comes back:
mount block device /dev/sr0 is write-protected, mounting read-only
ISO 9660 Extensions: Microsoft Joliet Level 3
ISO 9660 Extensions: RRIP_1991A
note, an ls -l of /dev/scd0 shows /dev/scd0 -> sr0
ok, before we dig deeper, did you try the final images also?
I burned desktop_x86_64-20091122.16.iso onto a CD and tried it. Same results: no root device found. No /dev/cdrom device file; only /dev/sr0 Able to mount /dev/sr0 as /
ok, then let's try to debug why udev does not like it. Add "rdshell", boot, be dropped to a shell, and run:
# ls /etc/udev/rules.d /lib/udev/rules.d
# udevadm info --query=all --name=/dev/sr0
You can also attach photos of the screen instead of retyping it.
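`udevadm info --query=all` prints the device's properties as lines prefixed with `E:`. A minimal sketch for reducing that output to the filesystem-related properties, which by-label symlinks are normally derived from; `fs_props` is a hypothetical helper name, and the `ID_FS_*` property names are the ones `vol_id` reported earlier in this report.

```shell
#!/bin/sh
# Hypothetical helper: filter `udevadm info --query=all` output (read from
# stdin) down to the ID_FS_* properties, printed as NAME=value.
fs_props() {
    sed -n 's/^E: \(ID_FS_[A-Z_]*=.*\)$/\1/p'
}

# Usage in the rdshell (device name taken from this report):
#   udevadm info --query=all --name=/dev/sr0 | fs_props
# An empty result would suggest udev never probed the filesystem on the disc.
```

Comparing the filtered output with the `vol_id` output from the workstation is one way to spot which property is missing in the initramfs.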
dracut-004-4.fc12 has been submitted as an update for Fedora 12. http://admin.fedoraproject.org/updates/dracut-004-4.fc12
dracut-004-4.fc12 has been pushed to the Fedora 12 testing repository. If problems still persist, please make note of it in this bug report. If you want to test the update, you can install it with su -c 'yum --enablerepo=updates-testing update dracut'. You can provide feedback for this update here: http://admin.fedoraproject.org/updates/F12/FEDORA-2010-1088
dracut-004-4.fc12 has been pushed to the Fedora 12 stable repository. If problems still persist, please make note of it in this bug report.