Grub2 cannot boot a kernel in a qemu environment. Even an 'ls' on grub2's command line is enough to get it stuck in an infinite loop.

The following is the output of 'ls' on grub2's command line with debug enabled. The last lines repeat indefinitely.

GNU GRUB version 2.00

Minimal BASH-like line editing is supported. For the first word, TAB lists possible command completions. Anywhere else TAB lists possible device or file completions. ESC at any time exits.

grub> set pager=1
grub> set debug=all
script/script.c:65: free 0x7ffd4660
script/script.c:65: free 0x7ffd4710
script/script.c:65: free 0x7ffd4730
script/script.c:65: free 0x7ffd40d0
script/script.c:65: free 0x7ffd4100
script/script.c:65: free 0x7ffd4120
script/script.c:65: free 0x7ffd4150
script/script.c:65: free 0x7ffd4450
script/script.c:65: free 0x7ffd4570
script/script.c:65: free 0x7ffd4680
script/script.c:65: free 0x7ffd46b0
script/script.c:65: free 0x7ffd46d0
grub> ls
script/lexer.c:318: token 288 text [ls]
script/script.c:50: malloc 0x7ffd4260
script/script.c:50: malloc 0x7ffd4240
script/script.c:163: arglist
script/script.c:50: malloc 0x7ffd4150
script/lexer.c:318: token 259 text [ ]
script/script.c:50: malloc 0x7ffd4120
script/script.c:50: malloc 0x7ffd4100
script/script.c:198: cmdline
script/script.c:50: malloc 0x7ffd7850
script/lexer.c:318: token 0 text []
script/script.c:50: malloc 0x7ffd40d0
script/script.c:50: malloc 0x7ffd7830
script/script.c:294: append command
script/script.c:50: malloc 0x7ffd68f0
kern/disk.c:230: Opening `ieee1275/disk,msdos2'...
disk/ieee1275/ofdisk.c:330: Opening `disk'.
partmap/msdos.c:181: partition 0: flag 0x80, type 0x41, start 0x800, len 0x2000
partmap/msdos.c:181: partition 1: flag 0x0, type 0x83, start 0x2800, len 0xfa000
kern/fs.c:55: Detecting ext2...
kern/disk.c:326: Closing `ieee1275/disk'.
kern/dl.c:602: module at 0x7ffde7b0, size 0x1638
kern/dl.c:626: relocating to 0x7ffd46e0
kern/dl.c:590: flushing 0x1695 bytes at 0x7ffdd100
kern/dl.c:649: module name: ls
kern/dl.c:650: init function: 0x7ffdd750
kern/ieee1275/openfw.c:155: devalias name = scsi
kern/ieee1275/openfw.c:155: devalias name = cdrom
disk/ieee1275/ofdisk.c:126: disk name = cdrom, path = /vdevice/v-scsi@1002/cdrom@2,0
disk/ieee1275/ofdisk.c:90: devpath = cdrom, canonical = /vdevice/v-scsi@1002/cdrom@2,0
kern/ieee1275/openfw.c:155: devalias name = disk
disk/ieee1275/ofdisk.c:126: disk name = disk, path = /vdevice/v-scsi@1002/disk@0,0
disk/ieee1275/ofdisk.c:90: devpath = disk, canonical = /vdevice/v-scsi@1002/disk@0,0
kern/ieee1275/openfw.c:155: devalias name = net
kern/ieee1275/openfw.c:155: devalias name = hvterm
kern/ieee1275/openfw.c:155: devalias name = name
disk/ieee1275/ofdisk.c:126: disk name = /vdevice/v-scsi@1002/disk@d8, path = /vdevice/v-scsi@1002/disk@d8
disk/ieee1275/ofdisk.c:90: devpath = /vdevice/v-scsi@1002/disk@d8, canonical = /vdevice/v-scsi@1002/disk@d8
disk/ieee1275/ofdisk.c:126: disk name = /vdevice/v-scsi@1002/disk@88, path = /vdevice/v-scsi@1002/disk@88
disk/ieee1275/ofdisk.c:90: devpath = /vdevice/v-scsi@1002/disk@88, canonical = /vdevice/v-scsi@1002/disk@88
disk/ieee1275/ofdisk.c:126: disk name = /vdevice/v-scsi@1002/disk@d8, path = /vdevice/v-scsi@1002/disk@d8
disk/ieee1275/ofdisk.c:126: disk name = /vdevice/v-scsi@1002/disk@88, path = /vdevice/v-scsi@1002/disk@88
disk/ieee1275/ofdisk.c:126: disk name = /vdevice/v-scsi@1002/disk@d8, path = /vdevice/v-scsi@1002/disk@d8
disk/ieee1275/ofdisk.c:126: disk name = /vdevice/v-scsi@1002/disk@88, path = /vdevice/v-scsi@1002/disk@88
disk/ieee1275/ofdisk.c:126: disk name = /vdevice/v-scsi@1002/disk@d8, path = /vdevice/v-scsi@1002/disk@d8
disk/ieee1275/ofdisk.c:126: disk name = /vdevice/v-scsi@1002/disk@88, path = /vdevice/v-scsi@1002/disk@88
disk/ieee1275/ofdisk.c:126: disk name = /vdevice/v-scsi@1002/disk@d8, path = /vdevice/v-scsi@1002/disk@d8
disk/ieee1275/ofdisk.c:126: disk name = /vdevice/v-scsi@1002/disk@88, path = /vdevice/v-scsi@1002/disk@88
(continues indefinitely)

Steps to reproduce:
1) Get latest upstream qemu code: git clone git://git.qemu.org/qemu.git
2) Configure it: ./configure --target-list=ppc64-softmmu
3) Build it: make
4) Create an image: ./qemu-img create -f raw fedora.img 10G
5) Run qemu: ./ppc64-softmmu/qemu-system-ppc64 -M pseries -m 1024 -nographic -cdrom Fedora-17-ppc64-DVD.iso fedora.img
6) Follow the installation instructions
7) Reboot the VM and try to boot the installed kernel

Expected Results: Grub2 should boot the installed kernel

Actual Results: Grub2 gets stuck trying to boot the installed kernel

Additional info: The same results happen both with grub2-2.0-0.36.beta6.fc17 and grub2-2.00-1.fc18.

As for the cause of the problem itself, I am not sure, but from a quick look it seems like the unit address (the @xxxx part of the device path) is wrong. My understanding is that grub2 is messing around with unit addresses and tries to "know" how to build them up from scratch. The reality is that unit addresses are pretty device specific, and thus such an algorithm can only be very fragile. In this case, it tries to build a unit address that might work with the IBM open firmware vscsi driver under pHyp but doesn't with SLOF (the former uses a custom format, the latter uses the old OFW standard for SCSI addresses of @id,lun).

I think grub2 should be less pro-active at messing around with these, i.e., only do that when installed if it knows for sure that it will have to access files outside of the device it was loaded from. In 99% of the cases, it will not have to do that and can just use the path it was loaded from as a device-path with a working unit address. This is how yaboot does it as well and it works reliably.
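To illustrate the "just use the path it was loaded from" idea, here is a minimal C sketch that reuses the device part of a boot path instead of rebuilding a unit address. This is not grub2 or yaboot code. The function name `of_device_from_bootpath` and the example boot path (including its partition and file arguments) are hypothetical, and the sketch assumes that only the final path component carries `:arguments`, so everything before the first ':' is the device path.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not from grub2 or yaboot): copy the device part of an
 * Open Firmware boot path into `out`, dropping the ":partition,file"
 * arguments. ASSUMPTION: only the last component carries arguments, so the
 * first ':' in the string marks where the device path ends.
 * Returns the number of characters copied, or 0 if `out` is too small.
 */
static size_t of_device_from_bootpath(const char *bootpath, char *out,
                                      size_t out_len)
{
    const char *colon = strchr(bootpath, ':');
    size_t n = colon ? (size_t)(colon - bootpath) : strlen(bootpath);

    if (n + 1 > out_len)
        return 0;               /* not enough room for the path plus NUL */

    memcpy(out, bootpath, n);   /* unit address is kept exactly as given */
    out[n] = '\0';
    return n;
}
```

With a boot path such as `/vdevice/v-scsi@1002/disk@0,0:2,\boot\grub\core.elf` (the arguments are made up for illustration), the helper yields `/vdevice/v-scsi@1002/disk@0,0`, so whatever unit address the firmware used to load the boot loader is preserved untouched.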
------- Comment From bherren.com 2012-07-18 22:51 EDT-------
So I'm going to use this bug as a place for a more complete discussion of the problem of having the appropriate unit addresses (the last @xxx part) in OFW paths. Can somebody make sure we have the right grub2 people CCed on the Red Hat side?

So the unit address is more or less adapter specific. Unfortunately, each adapter firmware has its own way of encoding it. We can start building specific knowledge about each adapter type into our grub2 configuration script (fortunately we have a limited number of supported adapters with OF firmwares on them), but it might be better to seek a solution involving the kernel drivers knowing about the methods used by the firmware for the specific adapters they drive. I'm adding Brian on CC, who might help discuss that from a SCSI driver perspective. Ideally we'd want to add a "devspec" attribute to the sysfs nodes of the disks.

So I've collected some info about a couple of common adapters.

First VSCSI: The unit address for vscsi is the SRP "LUN" value (which is not the same as the SCSI LUN). The formula to calculate it can be found in the Linux driver:

    static inline u16 lun_from_dev(struct scsi_device *dev)
    {
        return (0x2 << 14) | (dev->id << 8) | (dev->channel << 5) | dev->lun;
    }

Currently, however, SLOF in qemu doesn't use the above formula, but I will change it ASAP so grub2 doesn't have to deal with two different methods for vscsi.

Then we have IPR/Obsidian. This itself falls into two categories: the newer "SIS64" variants, which you can recognize via the presence of an "ibm,sis64" property in the adapter device node, and the older "SIS32" variants, which don't have this property.
For SIS32, the unit address is a "resource address" of the form (bus << 16) | (id << 8) | lun. However, bus can be 0xff when using HW RAID.

For SIS64, the unit address is the SAS WWN of the disk (though a ",lun" might be appended to it when applicable; we need to double check that).

So here we'll have to detect the adapter type. We'll also need to be careful that below the adapter PCI device in the device-tree there can be a "functional" sub node to differentiate SAS from SATA, which some of those adapters support (for the optical drive). I don't know what the address encoding scheme is for SATA, by the way; I'll try to find out later.

Due to the above complexity, it's clearly a piece of logic that is best located in the IPR driver itself, which could either expose an ioctl to retrieve a disk unit address or, better, create sysfs devspec attributes in the disk sysfs directories, but that won't happen immediately.

I'm still trying to get more info about other supported adapters, such as our fiber channel ones.

Additionally, there are some methods that our adapter firmwares provide that can be called within the OFW environment to retrieve lists of attached devices. Those are used by the SMS menu system, and would be handy for grub2 to be able to use as well in some cases, to display fallback menus of devices maybe, that sort of thing. I'm in the process of obtaining the documentation for these and will update this BZ when I have it.
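As a worked example of the two formulas quoted in this comment, the following C sketch computes a vscsi unit address and a SIS32 resource address from plain integers. All function names are made up for illustration. Two points are assumptions rather than statements from this comment: that the 16-bit SRP LUN value sits in the top 16 bits of a 64-bit unit address (this matches the `disk@8000000000000000` form that ofpathname prints for the pHyp format later in this bug), and that each SIS32 field is 8 bits wide.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Same expression as lun_from_dev() in the Linux vscsi driver, taking plain
 * integers instead of a struct scsi_device. */
static uint16_t vscsi_srp_lun16(unsigned id, unsigned channel, unsigned lun)
{
    return (uint16_t)((0x2u << 14) | (id << 8) | (channel << 5) | lun);
}

/* ASSUMPTION: the 16-bit value occupies the top 16 bits of a 64-bit SRP LUN,
 * which is what appears after the '@' in the pHyp-style device path. */
static uint64_t vscsi_unit_address(unsigned id, unsigned channel, unsigned lun)
{
    return (uint64_t)vscsi_srp_lun16(id, channel, lun) << 48;
}

/* SIS32 "resource address": (bus << 16) | (id << 8) | lun.
 * ASSUMPTION: each field is 8 bits wide, hence the masks. */
static uint32_t ipr_sis32_resource_address(unsigned bus, unsigned id,
                                           unsigned lun)
{
    return ((uint32_t)(bus & 0xff) << 16) |
           ((uint32_t)(id & 0xff) << 8) |
           (uint32_t)(lun & 0xff);
}

/* Hypothetical helper: write "disk@<unit address in hex>" into buf and
 * return the snprintf() result. */
static int format_vscsi_disk_node(char *buf, size_t len, unsigned id,
                                  unsigned channel, unsigned lun)
{
    assert(buf != NULL && len > 0);
    return snprintf(buf, len, "disk@%" PRIx64,
                    vscsi_unit_address(id, channel, lun));
}
```

Under these assumptions, SCSI id 0, channel 0, lun 0 gives the 16-bit value 0x8000 and the node name `disk@8000000000000000`, while SIS32 with bus 0xff, id 1, lun 0 gives 0xff0100.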
------- Comment From bjking1.com 2012-07-18 23:04 EDT-------
Is there any reason that grub2 can't use the ofpathname script that is included in powerpc-utils to translate from a logical device name (e.g. /dev/sda) to an OF path name? There are far too many different pieces of code trying to do this translation, which makes it nearly impossible to keep them all up to date when the OF binding changes.

As for doing something in the ipr driver itself, the ipr driver does expose some attributes in sysfs that the ofpathname script uses in order to be able to build the OF path, but they still require some intelligence to know how to use them. However, I'm not convinced that adding a devspec to the ipr driver for each device is the right answer, since there are plenty of other I/O adapters that need special treatment as well: LSI SAS, QLogic FC, Emulex FC, VSCSI, VFC, and more.
------- Comment From pfsmorigo.com 2012-07-19 00:31 EDT-------
I tested ofpathname here and it shows the id in pHyp's OFW format:

[root@localhost target0:0:0]# ofpathname /dev/sda1
/vdevice/v-scsi@1002/disk@8000000000000000

I discovered that the boot-device was blank after the installation:

0 > printenv
---environment variable--------current value-------------default value------
use-axon-ddr?                  true                      true
real-mode?                     true                      true
direct-serial?                 false                     false
use-nvramrc?                   false                     false
selftest-#megs                 0                         0
security-password
security-mode                  0                         0
security-#badlogins            0                         0
screen-#rows                   200                       200
screen-#columns                200                       200
output-device
oem-logo?                      false                     false
oem-logo
oem-banner?                    false                     false
oem-banner
nvramrc
input-device
fcode-debug?                   true                      true
diag-switch?                   false                     false
diag-file
diag-device
boot-command                   boot                      boot
boot-file
boot-device
auto-boot?                     true                      true

The device tree and the aliases:

0 > ls
3e597a18 : /vdevice
3e597c98 : |-- vty@1000
3e597e70 : |-- l-lan@1001
3e5981f8 : +-- v-scsi@1002
3e5b7368 :     |-- disk@0,0
3e5b7a58 :     +-- cdrom@2,0
ok
0 > devalias
scsi : /vdevice/v-scsi@1002
cdrom : /vdevice/v-scsi@1002/cdrom@2,0
disk : /vdevice/v-scsi@1002/disk@0,0
net : /vdevice/l-lan@1001
hvterm : /vdevice/vty@1000
ok

I set the boot-device and the system booted:

setenv boot-device disk
------- Comment From bherren.com 2012-07-19 01:10 EDT-------
So to begin with, I wasn't even aware we had an ofpathname script in powerpc-utils :-) As for the problem with qemu vs. OFW using a different format for vscsi, as I said in my previous post, I will fix that in qemu/SLOF, hopefully later today.

------- Comment From bherren.com 2012-07-19 01:13 EDT-------
BTW, we should also make yaboot use ofpathname instead of its own built-in ofpath. That might cause some "interesting" dependencies, but it's probably the way to go: a single script to rule them all and in the darkness of Forth bind them!
------- Comment From bherren.com 2012-07-19 06:58 EDT-------
I've now fixed qemu to behave like vscsi. I've also added a vscsi-report-luns method (which, unlike the OFW one, supports multiple SCSI IDs, and which from what I can tell grub will parse properly). This makes grub2 work in qemu for me. I've pushed the fixes to github and will submit a qemu patch to get a new build of SLOF upstream.
So can we close this out, then?