Description of problem:
I've installed F16 on a two-HD configuration.

-----------
/dev/sda and /dev/sdb are partitioned identically:

   Device Boot      Start         End      Blocks   Id  System
/dev/sda1   *        2048      411647      204800   fd  Linux raid autodetect
/dev/sda2          411648    17188863     8388608   fd  Linux raid autodetect
/dev/sda3        17188864    59131903    20971520   fd  Linux raid autodetect
/dev/sda4        59131904  1465149167   703008632    5  Extended
/dev/sda5        59133952   101076991    20971520   fd  Linux raid autodetect
/dev/sda6       101079040   143022079    20971520   fd  Linux raid autodetect
/dev/sda7       143024128   184967167    20971520   fd  Linux raid autodetect
/dev/sda8       184969216   226912255    20971520   fd  Linux raid autodetect
/dev/sda9       226914304  1065775103   419430400   fd  Linux raid autodetect
/dev/sda10     1065777152  1465149167   199686008   fd  Linux raid autodetect

-----------
mdadm --detail /dev/md0

        Version : 1.0
  Creation Time : Tue Nov  1 09:41:40 2011
     Raid Level : raid1
     Array Size : 204788 (200.02 MiB 209.70 MB)
  Used Dev Size : 204788 (200.02 MiB 209.70 MB)
   Raid Devices : 2
  Total Devices : 2
    Persistence : Superblock is persistent

    Update Time : Wed Nov  2 12:35:04 2011
          State : clean
 Active Devices : 2
Working Devices : 2
 Failed Devices : 0
  Spare Devices : 0

           Name : XXX:0  (local to host XXX)
           UUID : 6c948a3a:856c099c:ce87bf53:864eb3f9
         Events : 25

    Number   Major   Minor   RaidDevice State
       0       8        1        0      active sync   /dev/sda1
       1       8       17        1      active sync   /dev/sdb1

-----------
df -h /dev/md0
/dev/md0              194M   84M  100M  46% /boot
-----------

After an installation that reported no errors, grub cannot boot the raid1 /boot partition. It prints "No such device:" followed by an id number and then drops to a rescue prompt.

Workaround:
Boot from the DVD in rescue mode (Troubleshooting), chroot to /mnt/sysimage, edit /boot/grub2/grub.cfg and add these lines:

insmod raid
insmod mdraid09
insmod mdraid1x

I added all three lines, but this one may be sufficient:

insmod mdraid09

Then run

grub2-install /dev/sda
grub2-install /dev/sdb

and this may be necessary too:

dracut

Then the system boots correctly.

Version-Release number of selected component (if applicable):

How reproducible:
always

Steps to Reproduce:
1.
2.
3.

Actual results:
grub cannot boot the raid1 /boot partition

Expected results:
grub can boot the raid1 /boot partition

Additional info:
This looks like a blocker bug.
Please attach /var/log/anaconda/* from the installed system (or /tmp/program.log, /tmp/syslog, /tmp/anaconda.log, and /tmp/storage.log from the installation environment at the end of installation) to this bug report.
Created attachment 531446 [details] anaconda.program.log
Created attachment 531447 [details] anaconda.ifcfg.log
Created attachment 531448 [details] anaconda.log
Created attachment 531449 [details] anaconda.storage.log
Created attachment 531450 [details] anaconda.syslog
Created attachment 531451 [details] anaconda.xlog
Created attachment 531452 [details] anaconda.yum.log
*** Bug 751481 has been marked as a duplicate of this bug. ***
*** Bug 751587 has been marked as a duplicate of this bug. ***
-- Fedora Bugzappers volunteer triage team https://fedoraproject.org/wiki/BugZappers
I can confirm the workaround: you only need to add this line (for raid1 with metadata 0.90 or 1.0) to /boot/grub2/grub.cfg:

insmod mdraid09

I don't know if this is necessary:

grub2-install /dev/sda
grub2-install /dev/sdb

but it works only when you run it from the install DVD in rescue mode. When I did it from the running F16 system, I got an unbootable system again. Something is really broken there.
Please correct it on https://fedoraproject.org/wiki/Common_F16_bugs#boot-on-softraid Thanks
Upgraded from FC15. The workaround doesn't work for me:

[root@srv1 ~]# rpm -q grub2
grub2-1.99-12.fc16.x86_64
[root@srv1 ~]# cat /proc/mdstat |grep md0
md0 : active raid1 sda1[0] sdb1[1]
[root@srv1 ~]# mount|grep md0
/dev/md0 on /boot type ext3 (rw,relatime,errors=continue,user_xattr,acl,barrier=0,data=ordered)
[root@srv1 ~]# grub2-install /dev/md0
/sbin/grub2-setup: warn: Attempting to install GRUB to a partitionless disk or to a partition. This is a BAD idea..
/sbin/grub2-setup: warn: Embedding is not possible. GRUB can only be installed in this setup by using blocklists. However, blocklists are UNRELIABLE and their use is discouraged..
/sbin/grub2-setup: error: will not proceed with blocklists.
[root@srv1 ~]# grub2-install /dev/md0 --force
/sbin/grub2-setup: warn: Attempting to install GRUB to a partitionless disk or to a partition. This is a BAD idea..
/sbin/grub2-setup: warn: Embedding is not possible. GRUB can only be installed in this setup by using blocklists. However, blocklists are UNRELIABLE and their use is discouraged..
Installation finished. No error reported.

[--force didn't help, I still see the old grub screen with the old config]

Found this page: http://dione.no-ip.org/AlexisWiki/DebianSqueezeRaid1AndGrub2

[root@srv1 ~]# grub2-install /dev/sda
/sbin/grub2-setup: warn: Your core.img is unusually large. It won't fit in the embedding area..
/sbin/grub2-setup: warn: Embedding is not possible. GRUB can only be installed in this setup by using blocklists. However, blocklists are UNRELIABLE and their use is discouraged..
/sbin/grub2-setup: error: will not proceed with blocklists.
[root@srv1 ~]# grub2-install /dev/sdb
/sbin/grub2-setup: warn: Your core.img is unusually large. It won't fit in the embedding area..
/sbin/grub2-setup: error: embedding is not possible, but this is required for cross-disk install.

[this didn't help either, but I didn't see the same warning for sdb]
George: There shouldn't be any need to run grub2-install if the boot loader has already been installed correctly and you just edited grub.cfg. If the boot loader hasn't been installed correctly, then you have a different problem than the one discussed here. Your problems installing the boot loader might be more like Bug 737508.
yup, agree with mads: that sounds like 737508.
(In reply to comment #13)
> Please correct it on
> https://fedoraproject.org/wiki/Common_F16_bugs#boot-on-softraid

Right now that link instructs you to edit a file that is marked 'DO NOT EDIT'. Would it not be better to add the line

GRUB_PRELOAD_MODULES="raid mdraid09 mdraid1x"

to /etc/default/grub? FWIW, it worked on the box where I am experimenting with this bug.

If yes, I'll be happy to adjust the wiki, but as I have little grub2 experience I'd like a confirmation or denial first.
(In reply to comment #17)
> Would it not be better to add the line
> GRUB_PRELOAD_MODULES="raid mdraid09 mdraid1x"
> to /etc/default/grub?

Yes, that is in some ways a better way to do it - but remember to run
grub2-mkconfig -o /boot/grub2/grub.cfg .

The necessary drivers should however have been detected automatically by grub2-install and built into core.img.

Does something like
grub2-probe --target=device /boot/grub2
grub2-probe --target=abstraction --device /dev/XXX
look reasonable?

Running something like
bash -x grub2-mkinstall /dev/sda
might also help figuring out what is going on.
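The /etc/default/grub edit discussed above could be scripted roughly as follows. This is only a sketch: the helper name set_preload_modules is hypothetical, it rewrites nothing but the GRUB_PRELOAD_MODULES line of the file it is given, and it does not run grub2-mkconfig, which the comment above says is still needed afterwards.

```shell
# Hypothetical helper: set GRUB_PRELOAD_MODULES in a grub defaults file.
# Usage: set_preload_modules FILE "mod1 mod2 ..."
set_preload_modules() {
    spm_file=$1
    spm_modules=$2
    spm_tmp=$(mktemp) || return 1
    # Keep every line except an existing GRUB_PRELOAD_MODULES assignment.
    grep -v '^GRUB_PRELOAD_MODULES=' "$spm_file" > "$spm_tmp" || true
    # Append the new assignment.
    printf 'GRUB_PRELOAD_MODULES="%s"\n' "$spm_modules" >> "$spm_tmp"
    # Write back in place so the file keeps its owner and mode.
    cat "$spm_tmp" > "$spm_file"
    rm -f "$spm_tmp"
}
```

For example: set_preload_modules /etc/default/grub "raid mdraid09 mdraid1x", followed by grub2-mkconfig -o /boot/grub2/grub.cfg.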
(In reply to comment #18)
> (In reply to comment #17)
> > Would it not be better to add the line
> > GRUB_PRELOAD_MODULES="raid mdraid09 mdraid1x"
> > to /etc/default/grub?
>
> Yes, that is in some ways a better way to do it - but remember to run
> grub2-mkconfig -o /boot/grub2/grub.cfg .

Hmm, for me the edit of /etc/default/grub followed by

grub2-install --no-floppy /dev/sdX

with X for each of my disks, was enough.

# grep raid /boot/grub2/grub.cfg

comes up empty, yet the system boots. Now I'm puzzled.

> The necessary drivers should however have been detected automatically by
> grub2-install and built into core.img.

OK, so if I remove the PRELOAD from /etc/default/grub and do a grub2-install to, say, sdc (simply so that I can easily test booting from there), my system should still come up, right?

> Do something like
> grub2-probe --target=device /boot/grub2
> grub2-probe --target=abstraction --device /dev/XXX
> look reasonable?
>
> Running something like
> bash -x grub2-mkinstall /dev/sda
> might also help figuring out what is going on.

I presume you meant grub2-install. Mind you, the -x output is not too pleasant, but I do see the raid modules being copied in.
grub2-install will never read /etc/default/grub or write /boot/grub2/grub.cfg, but grub2-install can include modules in core.img which otherwise would have to be loaded explicitly with insmod in grub.cfg. It seems like the core issue here shows the same pattern as Bug 748121: The initial grub2 commands run from anaconda didn't do the right thing, but running exactly the same commands after a reboot works just fine.
(In reply to comment #18)
> (In reply to comment #17)
> The necessary drivers should however have been detected automatically by
> grub2-install and built into core.img.
>
> Do something like
> grub2-probe --target=device /boot/grub2
> grub2-probe --target=abstraction --device /dev/XXX
> look reasonable?

My results look confusing.

It returns nothing for /boot on Fedora 16 (/dev/md0, metadata 1.0):

# grub2-probe --target=device /boot/grub2
/dev/md0
# grub2-probe --target=abstraction --device /dev/md0
# mdadm --detail /dev/md0
/dev/md0:
        Version : 1.0
     Raid Level : raid1

But it works for / on Fedora 16 (/dev/md2, metadata 1.2):

# grub2-probe --target=abstraction --device /dev/md2
raid mdraid1x
# mdadm --detail /dev/md2
/dev/md2:
        Version : 1.2
     Raid Level : raid1

And it works! for / on Fedora 15 (/dev/md3, metadata 1.0):

# grub2-probe --target=abstraction --device /dev/md3
raid mdraid1x
# mdadm --detail /dev/md3
/dev/md3:
        Version : 1.0
     Raid Level : raid1

I don't see a difference between /dev/md0 and /dev/md3 (the same metadata but a different result). How is that possible?
(In reply to comment #18)
> Running something like
> bash -x grub2-mkinstall /dev/sda
> might also help figuring out what is going on.

You wanted to write
bash -x /sbin/grub2-install /dev/sda
right?

I have tried it. Because detection fails on /dev/md0, /sbin/grub2-install creates core.img with:

/usr/bin/grub2-mkimage -c /boot/grub2/load.cfg -d /usr/lib/grub2/i386-pc -O i386-pc --output=/boot/grub2/core.img --prefix=/grub2 biosdisk ext2 search_fs_uuid

instead of

/usr/bin/grub2-mkimage -c /boot/grub2/load.cfg -d /usr/lib/grub2/i386-pc -O i386-pc --output=/boot/grub2/core.img --prefix=/grub2 biosdisk ext2 mdraid09 search_fs_uuid

or

/usr/bin/grub2-mkimage -c /boot/grub2/load.cfg -d /usr/lib/grub2/i386-pc -O i386-pc --output=/boot/grub2/core.img --prefix=/grub2 biosdisk ext2 raid mdraid1x search_fs_uuid

either of which should produce a working configuration.
(In reply to comment #12)
> I can confirm workaround:
>
> you need add only this line (for raid1 with metadata 0.90 or 1.0) to
> /boot/grub2/grub.cfg:
>
> insmod mdraid09
>
> I don't know if this is necessary:
> grub2-install /dev/sda
> grub2-install /dev/sdb
>
> but it works only when you run it from install DVD in rescue mode. When I did
> it in f16 I got unbootable system again. Something is really broken there.

I have sw raid1 mirrored / and /boot partitions here, and hit the same grub2 boot problems after the F15 to F16 upgrade. My workaround is to reboot into rescue mode from DVD or USB stick, run 'chroot /mnt/sysimage' and add 'insmod raid' under the kernel menuentry in /etc/grub2.cfg. I mean something like:

menuentry 'Fedora Linux, with Linux 3.1.0-7.fc16.x86_64' --class fedora --class gnu-linux --class gnu --class os {
        load_video
        set gfxpayload=keep
        insmod gzio
        insmod ext2
        # The following fixes sw raid boot problem:
        insmod raid

Save, exit and reboot. It works.
Hi, just a quick update: I wrote a QA testcase for RAID 1 /boot with the Anaconda team and AdamW's blessing: https://fedoraproject.org/wiki/QA:Testcase_Partitioning_On_Software_RAID Adam is currently proposing to revise the /boot-on-raid release criterion for Fedora 17 beta and we are hoping to have it fixed then. Sorry that you ran into this, folks.
I think there's a second bug here. Manually adding the "insmod"s just before "set timeout" in grub.cfg didn't fix it for me. But rerunning grub2-mkconfig after installation produced a slightly different grub.cfg that worked. Comparing the before and after, I see these:

@@ -9,7 +9,7 @@ if [ -s $prefix/grubenv ]; then
   load_env
 fi
-set default="${saved_entry}"
+set default="0"
 if [ "${prev_saved_entry}" ]; then
   set saved_entry="${prev_saved_entry}"
   save_env saved_entry

But most importantly:

 menuentry 'Fedora Linux, with Linux 3.1.0-7.fc16.x86_64' --class fedora --class gnu-linux --class gnu --class os {
        load_video
        set gfxpayload=keep
        insmod gzio
+       insmod raid
+       insmod mdraid1x
+       insmod part_msdos
+       insmod part_msdos
        insmod ext2
-       set root='(md124)'
+       set root='(mduuid/7ae940a5bd5a0d0f666adf78e0c5b498)'
        search --no-floppy --fs-uuid --set=root adfb46ed-a89e-4d95-bf3e-7b7da7fc82fe
        echo 'Loading Linux 3.1.0-7.fc16.x86_64 ...'

The grub.cfg generated from anaconda had the "set root=(md124)", referencing the mdraid device that Anaconda started, which is not of much use during a regular system boot. Rerunning grub2-mkconfig replaced it with an apparent reference to the UUID of the raid1 volume holding /boot.
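To spot the pattern described above in an existing grub.cfg, something like the following sketch may help. The helper name count_hardcoded_md_roots is hypothetical, and it only counts "set root" lines that name an md device by number. Whether such a line actually breaks booting is disputed in the replies, so treat a non-zero count as a hint that the file was generated in the problematic environment, not as proof.

```shell
# Hypothetical helper: count "set root" lines that name an md device by
# number, e.g. set root='(md124)', as opposed to (mduuid/...).
# Usage: count_hardcoded_md_roots /boot/grub2/grub.cfg
count_hardcoded_md_roots() {
    # grep -c prints the number of matching lines (0 if none match).
    grep -cE "set root='?\(md[0-9]+\)'?" "$1"
}
```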
(In reply to comment #26) > The grub.cfg generated from anaconda had the "set root=(md124)", referencing > the mdraid mount that Anaconda started. I think that is just another example of how the grub2 detection of what is going on at the storage level for some reason doesn't work correctly while and immediately after anaconda has been running. That doesn't look like a grub2 bug to me. The bug must be that something leaves the system in an inconsistent state - and that confuses grub2.
I had this problem too. My RAID1 setup:

sda1,sdb1: bios-boot
sda2,sdb2: root (md1)
sda3,sdb3: swap (md0)

After installation grub failed to boot. I booted from fedora-rescue and added all missing modules to /etc/default/grub like this:

GRUB_PRELOAD_MODULES="part_gpt raid mdraid1x mdraid09"

Running grub2-mkconfig gave me a functioning grub.cfg at the next reboot - all fine now...

BUT: running the same grub2-mkconfig command from within the booted system messes things up again. Some insmod lines are missing in the output file again, but in a strange manner: while the 00_header output is ok and contains all modules, the 10_linux output is missing these modules and grub stops working.

1. So there is a difference between running grub2-mkconfig from the rescue CD and from the booted system!
2. Why does grub.cfg need all modules twice, in the 00_header output and in all 10_linux boot entries?
arturj, please attach
* grub.cfg generated on fedora-rescue
* grub.cfg generated on booted system

Did you have the same /etc/default/grub in both cases? If not, how did it look in each case?

Did your /boot/grub2/device.map look the same in both cases? Or will grub2-mkdevicemap overwrite it with something else?
Based on upgrades of multiple machines that have /boot on mdraid, I recommend that http://fedoraproject.org/wiki/Common_F16_bugs#Cannot_boot_with_.2Fboot_partition_on_a_software_RAID_array should be changed to recommend a different fix: reboot the installation image in rescue mode, let it find the just-upgraded system, drop into a rescue shell, do a chroot /mnt/sysimage, then run grub2-mkconfig, followed by grub2-install.

This has always worked. Manually adding the insmods, as Common_F16_Bugs now recommends, does not always work, because "set root=" is also broken. See comment 26.

I just updated another machine -- and the same exact thing. The initial Anaconda installer produced a broken grub.cfg. But just rebooting the /same/ Anaconda installer, this time in rescue mode, and letting it find the updated system, results in grub2-mkconfig generating a working configuration.

There's something different between the way mdraid partitions come up during an update path and the way they come up in rescue mode. In the former, grub2-mkconfig can't make heads or tails of the setup, doesn't know which mdraid modules to load, and emits a "set root=(mdXXX)" in each stanza, which is utterly useless. In the latter, grub2-mkconfig produces a well-rounded list of modules that are needed to find everything at boot time, and puts the correct UUID into "set root=".
(In reply to comment #30)
> rescue shell, doing a chroot /mnt/sysimage, then running grub2-mkconfig,
> following by grub2-install.

Sam, out of curiosity: Have you seen any indication that you really have to run grub2-install again?

> "set root=" is also broken. See comment 26.

The 'set root' line shouldn't be relevant at all. It just hardcodes the name the device had when grub2-mkconfig was run and uses that as a fallback. The following 'search ... --set=root' will always overwrite it if it finds the specified UUID. I think it is very unlikely that the fallback will ever do the right thing if the UUID search fails.
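Since the 'search ... --set=root' line is what decides the root device, it can be useful to pull the UUIDs out of grub.cfg and compare them with what blkid reports for the /boot filesystem. A minimal sketch, assuming the search lines have the shape quoted in this bug; the helper name search_uuids is hypothetical:

```shell
# Hypothetical helper: print the distinct filesystem UUIDs that grub.cfg
# passes to "search ... --set=root <uuid>".
# Usage: search_uuids /boot/grub2/grub.cfg
search_uuids() {
    sed -n 's/^[[:space:]]*search .*--set=root[[:space:]]\{1,\}\([0-9A-Fa-f-]\{1,\}\)[[:space:]]*$/\1/p' "$1" | sort -u
}
```

The result can then be compared by eye with the UUID shown by blkid for the md device that holds /boot.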
Perhaps someone should update http://fedoraproject.org/wiki/QA:Testcase_Partitioning_On_Software_RAID to reflect /boot on raid so that this bug won't be missed if it returns?
It's my understanding that grub fails to boot because it's not loading the raid modules, which were not included in the early stages of grub's boot, and rerunning grub2-install, after regenerating grub.cfg, ends up adding the necessary modules and installing them in the boot area. So, I think that grub2-install is necessary. Furthermore, in the specific case of RAID-1, Anaconda only installs grub on sda, so grub still needs to be installed on sdb, if the goal here is to have two drives and either one of them can be used as a boot drive.
(In reply to comment #33) > It's my understanding that grub fails to boot because it's not loading the raid > modules, which were not included in the early stages of grub's boot, and > rerunning grub2-install, after regenerating grub.cfg, ends up adding the > necessary modules and installing them in the boot area. So, I think that > grub2-install is necessary. grub2-install creates core.img and includes some modules in it and installs it as the boot loader. It _could_ build some of these modules into core.img so they don't have to be loaded from grub.cfg, just like in comment 23. It should however not do that if the modules just could be loaded dynamically (which they can), so in theory and as far as I can see that shouldn't be a reason to run grub2-install again. But I agree that if the original grub2-mkconfig failed then it is very likely that grub2-install also got it wrong, so running it again is a good idea in this workaround. It should however be slightly better to run grub2-install first. grub2-mkconfig looks at some files installed by grub2-install, but grub2-install doesn't look at anything from grub2-mkconfig. I think I have seen indications that anaconda _will_ run grub2-install on all raid 1 disks.
(In reply to comment #29)
> arturj, please attach
> * grub.cfg generated on fedora-rescue
> * grub.cfg generated on booted system
>
> Did you have the same /etc/default/grub in both cases? Or how did it look when?
>
> Did you /boot/grub2/device.map look the same in both cases? Or will
> grub2-mkdevicemap overwrite it with something else?

This is it: the device.map generated by anaconda during the original install is different from the one created afterwards. Issuing a grub2-mkdevicemap solves all problems. I don't even need those insmods (raid mdraid09 mdraid1x) anymore, as grub2-mkconfig now inserts them automatically. So these are the steps: boot rescue and issue these commands:

grub2-mkdevicemap
grub2-mkconfig -o /path/to/grub.cfg
grub2-install /path/to/devices

NO modifications to any config files needed!

On a root-server I additionally had to rebuild the initramfs by issuing "dracut --force /path/to/initramfs version" for the boot sequence not to hang.
(In reply to comment #35) > This is it: the device.map generated by anaconda during original install is > different from the one created afterwards. Issueing an grub2-mkdevicemap solves > all problems, don't even need those insmods (raid mkraid09 mkraid1x) anymore as > grub2-mkconfig now insertes them automatically. So this are the steps > > boot rescue and ... As described here there is no evidence that the difference is caused by grub2-mkdevicemap - it could also be caused by rebooting or booting differently. Can anybody confirm that running these commands in a terminal immediately after anaconda has terminated and before rebooting also solves the problem?
(In reply to comment #36) > Can anybody confirm that running these commands in a terminal immediately after > anaconda has terminated and before rebooting also solves the problem? No it does not. That's exactly what I did when I upgraded my last machine. After all the packages were updated, and Anaconda prompted for a reboot, before the reboot I flipped to a shell, and chroot-ed to /mnt/sysimage. Running grub2-mkconfig produced the same, identical, non-functional grub.cfg that was just created by Anaconda. Only after rebooting back into Anaconda, and this time going into the rescue option, letting it mount the just-updated Fedora, only then chroot-ing into /mnt/sysimage resulted in grub2-mkconfig generating a working grub.cfg
(In reply to comment #37) > Running > grub2-mkconfig produced the same, identical, non-functional grub.cfg that was > just created by Anaconda. Did you run grub2-mkdevicemap first?
No.
(In reply to comment #37)
> (In reply to comment #36)
> Only after rebooting back into Anaconda, and this time going into the rescue
> option, letting it mount the just-updated Fedora, only then chroot-ing into
> /mnt/sysimage resulted in grub2-mkconfig generating a working grub.cfg

I had a similar experience. After doing a yum upgrade of the DVD install, my RAID1 build would not boot. I ran a chroot /mnt/sysimage and then re-ran the commands:

grub2-mkdevicemap
grub2-mkconfig -o /boot/grub2/grub.cfg
grub2-install /dev/sda
grub2-install /dev/sdb

The system will now boot again. However, it is a concern as to what might happen the next time there are major updates!
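The command sequence reported above can be written down as a dry run that only prints what would be executed, one grub2-install per member disk. This is a sketch, not a verified fix: the helper name print_repair_steps is hypothetical, the commands are meant to be run inside the rescue-mode chroot, and an earlier comment suggests that running grub2-install before grub2-mkconfig may be slightly better.

```shell
# Hypothetical helper: print (do not run) the repair sequence reported in
# this bug for the given member disks.
# Usage: print_repair_steps /dev/sda /dev/sdb
print_repair_steps() {
    echo "grub2-mkdevicemap"
    echo "grub2-mkconfig -o /boot/grub2/grub.cfg"
    # One grub2-install per disk so either disk can boot on its own.
    for prs_disk in "$@"; do
        echo "grub2-install $prs_disk"
    done
}
```

Review the printed lines first; piping them to sh from inside chroot /mnt/sysimage would then execute them.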
I think the problem still persists in grub2-probe, which is (I suppose) used by grub2-mkdevicemap.

Now I have this in /boot/grub2/device.map, which was made (I suppose) in rescue mode:

(hd0) /dev/sdb
(hd1) /dev/sda
(md0) /dev/md0

grub2-mkdevicemap on the running system (F16) gives only two lines:

(hd0) /dev/disk/by-id/ata-WDC_WD7500AADS-00M2B0_WD-WCAV59044433
(hd1) /dev/disk/by-id/ata-ST3750640NS_5QD29MEH

So re-running grub2-mkdevicemap in F16 couldn't help.

It looks like grub2-probe and grub2-mkdevicemap give different results in rescue mode versus the F16 system.
(In reply to comment #40) > I had similar experience. After doing a yum upgrade of the DVD install, my > RAID1 build would not boot. That sounds like a different problem. You should file a new bug and attach grub.cfg.
sam: the common bugs page is a wiki; if you're confident you can improve on an entry in it and your information is not incorrect (or at least no _less_ correct than what was there before :>), do go ahead and change it. thanks!
*** Bug 769070 has been marked as a duplicate of this bug. ***
Today, I freshly installed Fedora 16 (32 bit) from DVD with RAID-1 for /boot and / (as I always do). No GPT but traditional MBR.

I followed the instructions in https://fedoraproject.org/wiki/Common_F16_bugs#boot-on-softraid before rebooting the installer. Didn't work. :-( People should be warned that those instructions are NOT correct!

It only worked to boot from DVD into rescue mode and (as grub.cfg was already changed before) run "chroot /mnt/sysimage" and "grub2-install /dev/sda" and "grub2-install /dev/sdb".

However, the DVD rescue mode always assumes the BIOS clock to be UTC, which results in wrong timestamps, fsck warnings and automatic superblock (last mount time) fixes. But that's basically cosmetic. Unfortunately, DVD rescue mode also messes up SELinux and causes a full relabel on the next system boot (which runs almost an hour). DVD rescue mode is no fun.

The system now boots from RAID-1 /boot. But if I run "grub2-install /dev/sda" and "grub2-install /dev/sdb" on the running system, it makes the RAID-1 /boot unusable again. (Can only be fixed in DVD rescue mode.)

At no point did I run grub2-mkconfig or other grub2* commands, so my /boot/grub2/grub.cfg is the default from the installation (of course, with the three additional "insmod" statements according to Common_F16_bugs). It looks reasonable (eg, the UUIDs are correct, also "set root='(mdX)'" is correct).

As stated by others, there are some differences between "grub2-install" during installation or on a running system versus "grub2-install" during DVD rescue mode.

Installation/running system: grub2-install builds a /boot/grub2/core.img that is about 4 kB smaller than in DVD rescue mode. It also creates a previously non-existing /boot/grub2/load.cfg file (but removing it again doesn't change anything).

DVD rescue mode: grub2-install builds a /boot/grub2/core.img that is about 4 kB larger. It does not create a /boot/grub2/load.cfg file (doesn't seem to be needed).

Everything else in /boot is exactly the same. Of course, the resulting MBR is different (that's because core.img is different, right?)

The current situation is a little bit unsatisfying because the next time anybody runs grub2-install for some reason, it can only be fixed in DVD rescue mode.

I've read through all the comments but I'm not quite sure if it's possible at all to fix GRUB2/MBR from a running F16 system or during installation (as Common_F16_bugs wrongly suggests). Although this bug shouldn't happen at all, it would be a nice first step if it could be fixed from within F16 itself *without* temporarily booting from DVD into rescue mode (including timestamp and SELinux troubles).

I'm confused why it doesn't work during Fedora installation (right before the reboot) because to my understanding the installation process comes pretty close to the rescue mode. What's so different between these two?
From within a running F16 system, I managed to make "grub2-install" build a correct core.img + MBR if I added '--modules="part_msdos"' to the command line. The grub2-install in DVD rescue mode adds this automatically.

I did a fresh installation, and in addition to the instructions given in Common_F16_bugs I also put "insmod part_msdos" into grub.cfg, and grub2-install was called with the arguments '--no-floppy' and '--modules="part_msdos raid mdraid09 mdraid1x"'. (After the Anaconda installation, but before the reboot.)

This time, GRUB2 didn't put me into rescue mode. Nice. I could see some graphical progress screen (where that Fedora logo fills up), but then Dracut dropped me into a debug shell. On that screen, the last thing F16 did was loading the initial ramdisk. But then it failed:

Booting 'Fedora Linux, with Linux 3.1.0-7.fc16.i686.PAE'
Loading Linux 3.1.0-7.fc16.i686.PAE ...
Loading initial ramdisk ...
dracut Warning: No root device "block:/dev/disk/by-uuid/<uuid of md root partition>" found
Dropping to debug shell.

Well, that debug shell didn't help me much. "ls -l /dev/disk/by-uuid" only shows me two entries, "md5" (/boot) and "md10" (swap), but nothing else. No root device (md6, /) or any other (md7, md8, md9). The UUID for the root device that cannot be found is correct. I don't know why it can detect "/boot" and "swap" (both RAID-1 MD), but not "/" (also RAID-1 MD).

I tried to fix the mess in DVD rescue mode, but running grub2-install with the arguments mentioned above no longer gives me a bootable system at all. Then F16 goes into GRUB rescue mode without even mentioning any UUID ("error: no such disk.")

Tried again in DVD rescue mode and just did "grub2-install /dev/sda" and "grub2-install /dev/sdb" without any fancy additional parameters. This time, GRUB2 happily worked, but now I get dropped into the Dracut debug shell again.

Currently, I don't get a working system at all. I then tried to build a new initramfs with "dracut --force /boot/initramfs-`uname -r`.PAE.img `uname -r`.PAE", but that didn't change anything. Still getting dropped into the Dracut debug shell.

Solved (?) one problem with GRUB2, got a new one with Dracut. I don't understand why DVD rescue mode doesn't fix the system any longer. Yes, sure, there must be something different now, but I've really no idea what it is.
I cannot explain all the problems I ran into during the last two weeks, but currently the following procedure works, and during several test installations I couldn't find a way to make it fail (well, that doesn't mean very much, but the old procedure didn't work at all for me, so this is a big step forward).

Installation of Fedora 16 i686 (32 bit) from DVD with MSDOS disk label format (MBR). The DVD is booted with the additional kernel option "nogpt". /boot is an ext4 filesystem on RAID-1.

When the first stage of the installation is finished and Anaconda asks to press the "Reboot" button, switch to a text console and edit "/mnt/sysimage/boot/grub2/grub.cfg". Add these lines to each kernel section (regular + recovery):

insmod part_msdos
insmod raid
insmod mdraid09
insmod mdraid1x

Without "part_msdos" it won't work at all (might be different on GPT installations). If there are no old MD devices, "mdraid09" is not necessary (but it won't hurt to add it).

Now run these commands:

chroot /mnt/sysimage
grub2-install --modules="part_msdos raid mdraid09 mdraid1x" /dev/sda
grub2-install --modules="part_msdos raid mdraid09 mdraid1x" /dev/sdb

Note the additional --modules statement, because some of the modules would be missing otherwise. Funny thing is that in DVD rescue mode the module "part_msdos" is added automatically. But for some reason it is not added during installation and also not added on a running Fedora system. You'll notice that difference if you run "bash -x .../grub2-install ..."

I don't know if that enhanced workaround will work for everyone here, but it won't make things worse - only better. Now I no longer need DVD rescue mode (breaks SELinux) and I can also run grub2-install from a running Fedora 16 system.
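After editing grub.cfg by hand as described above, a quick check for forgotten insmod lines might look like the sketch below. The helper name missing_insmods is hypothetical. Note the limitation: it looks at the file as a whole, not at each kernel section separately, so a module present in one menuentry but absent from another is not reported.

```shell
# Hypothetical helper: print each listed module that has no "insmod <module>"
# line anywhere in the given grub.cfg.
# Usage: missing_insmods /boot/grub2/grub.cfg part_msdos raid mdraid09 mdraid1x
missing_insmods() {
    mi_cfg=$1
    shift
    for mi_mod in "$@"; do
        # Match the whole line so "raid" does not match "insmod mdraid09".
        grep -Eq "^[[:space:]]*insmod[[:space:]]+$mi_mod[[:space:]]*\$" "$mi_cfg" || echo "$mi_mod"
    done
}
```

No output means every listed module appears at least once in the file.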
*** Bug 753328 has been marked as a duplicate of this bug. ***
I'm also struggling with this on a server that has two disks used to form a RAID1 /boot (md0) and a RAID1 / (md1). No GPT. Somehow I had this working after the install, but now I am trying to upgrade the disks one at a time.

Having found this bug, I suspect that with the various approaches outlined above I can "solve" this issue the next time I can bring the server down to retry this.

In the meantime, I think I have found something of significance. Looking over the output of "bash -x grub2-install", I concur with a comment above that suggests the core of this problem is probably:

# /sbin/grub2-probe --device-map=/boot/grub2/device.map --target=abstraction --device /dev/md0

That command produces no output, but you would expect it to list some RAID modules to include in the MBR. Let's try it again on another RAID1 on the same machine (I have no interest in booting from this one, but just for fun):

# /sbin/grub2-probe --device-map=/boot/grub2/device.map --target=abstraction --device /dev/md1
raid mdraid1x

That one works.

My /boot/grub2/device.map has contents:

# this device map was generated by anaconda
(hd0) /dev/sda
(hd1) /dev/sdb
(md0) /dev/md0

If I remove the md0 line from device.map and retry the grub2-probe command:

# /sbin/grub2-probe --device-map=/boot/grub2/device.map --target=abstraction --device /dev/md0
raid mdraid1x

That now produces the expected output. So, the presence of md0 in device.map seems to influence the decisions made by grub2-probe when deciding which modules are needed. In this case md0 needs to be excluded from device.map before grub2-probe makes the right decision. Comments above suggest that device.map is unused in grub2, but I think I have illustrated that's not true in this case.

If I run grub2-mkdevicemap, my device.map now has contents:

(hd0) /dev/disk/by-id/ata-WDC_WD5000AAKS-00UU3A0_WD-WCAYU7177856
(hd1) /dev/disk/by-id/ata-ST1000DM003-9YN162_Z1D0JBHS

i.e. grub2-mkdevicemap also doesn't consider the md* devices for inclusion in device.map. This also explains why some other people above have claimed that grub2-mkdevicemap followed by grub2-install solves the issue.

Having removed md0 from device.map, "bash -x grub2-install" now looks more promising - it is clearly including the RAID modules in the MBR.

And now, running grub2-mkconfig produces something more promising: the new config loads the RAID modules (the old one didn't mention them), and it also changed the root= assignment:

- set root='(md0)'
+ set root='(mduuid/c1df93687c31ff9dfd673a6e4d40a94f)'

The "search" line (which overrides the above root= assignment, if it finds something) stays the same:

search --no-floppy --fs-uuid --set=root 66b89385-951d-46f5-b6a0-f4aec3083815

Just for reference, to correlate those UUIDs with the "blkid" output:

/dev/sda1: UUID="c1df9368-7c31-ff9d-fd67-3a6e4d40a94f" UUID_SUB="3ee1be50-a0ff-d016-0e0d-e10f6aff6b6e" LABEL="fzt-server:0" TYPE="linux_raid_member"
/dev/sdb1: UUID="c1df9368-7c31-ff9d-fd67-3a6e4d40a94f" UUID_SUB="4730af0e-85c7-0111-2d79-8486e7c11790" LABEL="fzt-server:0" TYPE="linux_raid_member"
/dev/md0: UUID="66b89385-951d-46f5-b6a0-f4aec3083815" TYPE="ext4"

(sda1 and sdb1 are the two partitions included in the md0 RAID1)

So I think I've written a good MBR now, but I can't test it just yet (server is in use). I'll assume it's fixed though - the above looks promising. I'll post another comment after trying it.

To move forward on this bug I think we need to have a clear understanding of how exactly grub uses device.map. Here's some guesswork (totally unverified): device.map is only supposed to reference physical disks that really exist (e.g. sda), not virtual things such as RAIDs. When grub2 sees a disk listed in device.map, maybe it assumes that it's really a fixed disk (therefore it wouldn't even bother trying to see if it's a RAID).

If this guesswork is true, it would mean that anaconda is at fault for sticking md0 in device.map. Can anyone verify this understanding of device.map?
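The correlation between the blkid output and the new "set root" line is simply the RAID member UUID with its dashes removed. A tiny sketch of that conversion, with a hypothetical helper name; it is based only on the one example in this comment and is not a statement about every metadata version:

```shell
# Hypothetical helper: turn a RAID member UUID as printed by blkid into the
# form seen in set root='(mduuid/...)' by dropping the dashes.
# Usage: member_uuid_to_mduuid c1df9368-7c31-ff9d-fd67-3a6e4d40a94f
member_uuid_to_mduuid() {
    printf '%s\n' "$1" | sed 's/-//g'
}
```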
It worked. I spent 5 minutes exploring the above assumption. I didn't manage to completely verify it, but I have at least identified the change in behaviour at the code level. In util/probe.c:probe_raid_level(): When md0 is listed in device.map, disk->dev->id is GRUB_DISK_DEVICE_BIOSDISK_ID hence we don't even bother probing for raid level. When md0 is not listed in device.map, disk->dev->id is GRUB_DISK_DEVICE_RAID_ID. This is with grub2-1.99-13.fc16.2.x86_64. In my eyes the next steps are to understand this code better, and/or ask for clarification of the role of device.map on the grub mailing list.
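For anyone who wants to check whether their own system is affected, a small sketch follows. It only looks for md entries in a device.map file and does not call grub2-probe. The function name list_md_entries is hypothetical, and the pattern assumes entries look like the "(md0) /dev/md0" line shown earlier.

```shell
#!/bin/sh
# Hypothetical helper: print any device.map lines that map an md device.
# Exit status is 0 if at least one such line exists, non-zero otherwise.
# Assumption: entries have the form "(md0) /dev/md0" at the start of a line.
list_md_entries() {
    grep -E '^[[:space:]]*\(md[^)]*\)' "$1"
}

# Usage (path as reported in this bug):
#   list_md_entries /boot/grub2/device.map && echo "md entries present"
```

If this prints anything, the analysis above suggests grub2-probe may treat that md device as a plain BIOS disk and skip the RAID modules.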
device.map should not be needed with the grub2 that is used in f17 ... but it will still use device.map if it is present and the file will thus do more harm than good if it is wrong. If the probing used in grub2-mkdevicemap got it wrong then the new auto detection in the tools might get it wrong too. I haven't seen this bug reported for f17, so I guess it is fine, but it might deserve some extra testing.
The device.map that comes installed on a system contains as the first line: # this device map was generated by anaconda So I don't think the file is being generated by grub2-mkdevicemap (if it were, I don't think we'd be having this issue).
I kickstarted FC16 on a brand-new (never used) system with md mirrored /boot and md mirrored/LVM root, and ended up with this at reboot time:

---
GRUB loading.
Welcome to GRUB!
error: no such device: 1897bd7b-fa22-4475-ba21-12dd246f64d5.
Entering rescue mode...
grub rescue>
----

So, rescue mode...

---
sh-4.2# df /boot
Filesystem 1K-blocks Used Available Use% Mounted on
/dev/md127 253859 53058 187695 23% /boot
sh-4.2# blkid /dev/md127
/dev/md127: UUID="1897bd7b-fa22-4475-ba21-12dd246f64d5" TYPE="ext3"
sh-4.2# cat /boot/grub2/device.map
# this device map was generated by anaconda
(hd0) /dev/sda
(hd1) /dev/sdb
(md0) /dev/md0

After lots of futzing around in rescue mode and reading the above, the magic I needed to make it boot was this:

# grub2-mkdevicemap
# cat device.map
(hd0) /dev/disk/by-id/ata-HITACHI_H7210CA30SUN1.0T_1131AVZPBL_JPW9K0J81VZPBL
(hd1) /dev/disk/by-id/ata-HITACHI_H7210CA30SUN1.0T_1131AVZPDL_JPW9K0J81VZPDL
# grub2-install --modules="search_fs_uuid" /dev/sda
# grub2-install --modules="search_fs_uuid" /dev/sdb
# vi /etc/default/grub
# add this: GRUB_PRELOAD_MODULES="part_gpt raid mdraid1x"
# grub2-mkconfig -o /boot/grub2/grub.cfg
----

and now it boots. I'm not sure the device.map changes or the GRUB_PRELOAD_MODULES were required; I think the main thing that was missing was the search_fs_uuid module, hence the complaint. I was initially hitting the well-known segfault when running grub2-install on /dev/md0, but I don't get that when running it on the underlying devices, as above.
(In FC15 and prior, 'grub-install /dev/md0' worked fine, or so it seemed.)

Below is the relevant bit of the kickstart.cfg that led me to this:

--
bootloader --location=mbr
zerombr
clearpart --all --initlabel
part biosboot --fstype=biosboot --size=1 --ondisk=sda
part biosboot --fstype=biosboot --size=1 --ondisk=sdb
part raid.01 --size=512 --ondisk=sda
part raid.02 --size=512 --ondisk=sdb
part raid.03 --size=940000 --ondisk=sda
part raid.04 --size=940000 --ondisk=sdb
raid /boot --fstype ext3 --level=RAID1 --device=md0 raid.01 raid.02
raid pv.01 --fstype ext3 --level=RAID1 --device=md1 raid.03 raid.04
volgroup VolGroup00 pv.01
logvol swap --fstype swap --name=LogVol00 --vgname=VolGroup00 --size=16384
logvol /lvmbackup --fstype ext3 --name=backup --vgname=VolGroup00 --size=16384
logvol / --fstype ext4 --name=LogVol01 --vgname=VolGroup00 --size=1 --grow
--
(In reply to comment #53)
> I kickstarted FC16 on a brand-new (never used) system with

Was that by using original f16 install media?

There have been several reports here that something weird was going on with some install media on some machines. It is too late to get that fixed now.

What matters now is: Do you see the same problem with f17? In that case some effort should be put into understanding and fixing the issue there.
No, it was over ethernet from a server with a copy of F16 "Everything".
(In reply to comment #54)
> (In reply to comment #53)
> > I kickstarted FC16 on a brand-new (never used) system with
>
> Was that by using an original f16 install media?
>
> There has been several reports here that something weird was going on with some
> install media on some machines. It is too late to get that fixed now.
>
> What matters now is: Do you see the same problem with f17? In that case some
> effort should be put into understanding and fixing the issue there.

I've been doing tests with the fc17 beta over the last week (migrating our deploy/build structure to the new release; previously it was fc14) and have very much been seeing the issues mentioned in this bug.

After many attempts at fixing it, including following the references to the fc16 bugs on the wiki, the solution looks to be to remove the md0 line from /boot/grub2/device.map and then regenerate the config with grub2-mkconfig (following comment 49). It then correctly probes and adds the correct modules for raid access, and the subsequent grub2-installs work without a problem. There seems to be no grub2-mkdevicemap in fc17, so I omitted that step.

It's been mentioned that device.map isn't required in fc17, but it's still created, and it looks like the values in that file are stopping a raid1 /boot from installing cleanly.

I'm now trying to put these commands into the kickstart %post to make things easier, but I'm sure I'll still hit an error message in the install.
(In reply to comment #56) Did anaconda create new boot loader configuration when you upgraded? Including a new incorrect device.map? Anyway: It seems like anaconda still has an issue: it should of course stop creating invalid device.map entries, but it should also be taught to only create device.map when it is needed and with the needed entries - which apparently would be some setups with complex stacking of the /boot device.
After a clean install of F17-beta, there's a /boot/grub2/device.map containing /dev/md0 and there are no insmod lines for the raid in /boot/grub2/grub.cfg. The / filesystem is on raid1 and there is no separate /boot partition.
I think I've just linked this bug with bug 788830 and bug 809111. If I remove the md0 line from device.map then grub2-probe doesn't segfault.
(In reply to comment #57)
> (In reply to comment #56)
>
> Did anaconda create new boot loader configuration when you upgraded? Including
> a new incorrect device.map?
>
> Anyway: It seems like anaconda still has an issue: it should of course stop
> creating invalid device.map entries, but it should also be taught to only
> create device.map when it is needed and with the needed entries - which
> apparently would be some setups with complex stacking of the /boot device.

Sorry, forgot to specify: it's a clean wipe/install using kickstart, and the incorrect device.map is created as part of it.

My workaround seems to give me a bootable system. In the %post section of the kickstart config I have the following:

-------------------
# delete md0 reference from device.map
sed -i "/md0/d" /boot/grub2/device.map
# reconfigure grub with working device.map
grub2-mkconfig > /boot/grub2/grub.cfg
# install on both drives
grub2-install /dev/sda
grub2-install /dev/sdb
-------------------

I didn't need to use --modules on the grub2-install commands, as grub2-probe seems to correctly find what it needs.
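A slightly more defensive variant of the sed step above might look like the sketch below. It is not what was tested in this comment. The function name strip_md_entries is hypothetical, the pattern is anchored so that only lines starting with an "(md...)" entry are removed, and it assumes GNU sed (for -i and -E), which is what Fedora ships.

```shell
#!/bin/sh
# Hypothetical helper: remove md entries from a device.map, keeping a backup
# copy next to it as <file>.bak. Assumes GNU sed (-i and -E options).
strip_md_entries() {
    map_file="$1"
    cp -p "$map_file" "$map_file.bak" &&
        sed -i -E '/^[[:space:]]*\(md[^)]*\)/d' "$map_file"
}

# Usage inside %post (path as reported in this bug), followed by the same
# grub2-mkconfig and grub2-install steps as above:
#   strip_md_entries /boot/grub2/device.map
```

Unlike a plain "/md0/d" match, this leaves comment lines and any hd entries untouched even if they happen to contain the string md0, and it also covers other md numbers such as md127.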
Another good report of apparently the same problem is Bug 755647 "grub2-mkdevicemap creates bad /boot/grub2/device.map on RAID configuration"
Fedora-17 (beta) has the same issue.
F17 TC4 installs fine for me with /boot on raid. It's actually / on the raid, so if it makes any difference I could try with a separate /boot raid partition as well.
F17 TC4 does NOT work for me when placing /boot on an md raid 1 (/dev/md0) when using UEFI. This is with /boot/efi on its own partition (the first partition on the first disk) as recommended by anaconda. Also, it appears to use grub-efi (rather than grub2-efi). This might be intentional; however, it seemed a bit strange.
It is intentional. We don't consider grub2-efi ready for prime time. I'd say file a separate bug for that one. thanks! -- Fedora Bugzappers volunteer triage team https://fedoraproject.org/wiki/BugZappers
(In reply to comment #65) > It is intentional. We don't consider grub2-efi ready for prime time. > > I'd say file a separate bug for that one. thanks! Here it is: https://bugzilla.redhat.com/show_bug.cgi?id=816742 It also relates to the first part of my post, however: "F17 TC4 does NOT work for me when placing /boot on an md raid 1 (/dev/md0) when using UEFI. This is with /boot/efi on it's own partition (the first partition on the first disk) as recommended by anaconda."
F17 does NOT work for me when placing / on an md raid 1 (/dev/md0) when using standard mbr.
(In reply to comment #67) > F17 does NOT work for me There is no F17 yet. Which F17 pre-release version with which grub2 version doesn't work for you - and what problem do you see? Please try with the latest TC with grub2 beta5 and see if you still see the problem (assuming you have been testing some older version).
> Please try with the latest TC with grub2 beta5 and see if you still see the
> problem (assuming you have been testing some older version).

It does not work...

OK, I found how to repair it in the case where root and boot are on the same md0 volume. Maybe it will be useful for other configurations, but I did not check that.

## Boot into rescue mode
## Save the root volume info into a file, for example /a
## root is on the md0 device, but it shows up as /dev/md127
blkid | grep md127 > /a
#
# Edit grub configuration
#
---
vi /mnt/sysimage/boot/grub2/grub.cfg

insmod gzip
# ADD
+ insmod part_msdos
+ insmod mdraid09
+ insmod mdraid1x
+ insmod ext2
+ insmod linux
+ set root='(md/0)'
# ==> We need the volume UUID, read it from file /a
# ==> `<esc> :r /a`
# ==> root UUID is 387b9a7e-837d-47e7-9af3-ce81202dee1f
# ==> Edit the string (delete the extra info) and add the result to the 'search' line
+ search --no-floppy --fs-uuid --set=root 387b9a7e-837d-47e7-9af3-ce81202dee1f
# ==> Repair the kernel parameter root=
# ==> In my case UUID=387b9a7e-837d-47e7-9af3-ce81202dee1f
# ==> Add selinux=0 and text mode runlevel 3; imho, it is useful
# ==> Also, I have deleted rhgb quiet rd.dm=0 from the kernel parameters
# ==> Resulting line
linux /boot/vmlinuz-3.3.4-5.fc17.i686.PAE root=UUID=6bcf2c1e-f57c-47b9-bf55-6e41ba2e9170 ro LANG=ru_RU.UTF-8 SYSFONT=False rd.lvm=0 rd.md.uuid=b708bc45:19ffb3fb:fb99b39d:f9f21a95 rd.luks=0 rd.md.uuid=c1756b66:b047221e:220d58ed:455a1c98 KEYTABLE=us-acentos selinux=0
# ==> That is all, save it

# The chroot environment has no device nodes
# Bind them from the rescue CD
mount --bind /dev /mnt/sysimage/dev
# Chroot
chroot /mnt/sysimage
# ==> It is strange, but the --target parameter is required
# ==> Install grub to both devices
grub2-install --target=i386-pc /dev/vda
grub2-install --target=i386-pc /dev/vdb
# exit from chroot
exit
# Unmount /mnt/sysimage/dev
umount /mnt/sysimage/dev
exit
reboot

Now the repaired system should boot as expected.
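The manual step of reading /a into vi and trimming it down to the UUID could be scripted roughly as follows. This is a sketch only: the function name fs_uuid_from_blkid_line is hypothetical, and the sample line reuses the UUID from this comment with an assumed TYPE field, since the real blkid output was not posted.

```shell
#!/bin/sh
# Hypothetical helper: extract the value of the UUID="..." field from one
# line of blkid output. Fields such as UUID_SUB="..." are not matched,
# because the pattern requires whitespace directly before UUID=.
fs_uuid_from_blkid_line() {
    printf '%s\n' "$1" | sed -n 's/.*[[:space:]]UUID="\([^"]*\)".*/\1/p'
}

# Illustrative input: the UUID is the one quoted above, the TYPE is assumed.
fs_uuid_from_blkid_line '/dev/md127: UUID="387b9a7e-837d-47e7-9af3-ce81202dee1f" TYPE="ext4"'
```

On systems where it is available, "blkid -s UUID -o value /dev/md127" should print the same value directly, which avoids parsing altogether.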
(In reply to comment #69)

Please state explicitly which grub2 version you are using.

What does your /boot/grub2/device.map look like?

Did you run grub2-mkconfig after upgrading grub2?

Please try inserting a 'set -x' line at the beginning of /etc/grub.d/00_header, run "grub2-mkconfig -o /boot/grub2/grub.cfg 2> log", and attach the log file and the generated grub.cfg. Please also show the full blkid output.
(In reply to comment #70)
> (In reply to comment #69)
>
> Please state explicitly which grub2 version you are using.
>

Hm ... good question. Yesterday I installed f17b with the 'updates' repository included. It does not help - the system does not boot. Is it important exactly which version is in updates?

> How does your /boot/grub2/device.map look like?

I do not know. Is it important enough to install yet another virtual machine to check?

> Did you run grub2-mkconfig after upgrading grub2?

No. Once I had fixed the issue, I did not rebuild the config. Yes, in the future that can cause issues...

> Please try to insert a 'set -x' line at the beginning of /etc/grub.d/00_header
> and run "grub2-mkconfig -o /boot/grub2/grub.cfg 2> log" and attach the log file
> and the generated grub.cfg. Please also show the full blkid output.

Sorry Mads, I have reproduced that anaconda/grub from f17b+updates as of 19.05.2012 is not able to install root on raid1. I have described how to work around the problem. I have solved the problem for myself and for anybody who is interested in it for F17b. I hope F17 will be able to do it...
The 'updates' repository has nothing in it, for pre-releases. So if you install from a Beta DVD and enable the 'updates' repository, what you'll get is Beta. You need to enable the 'Fedora 17' repository (the main one), and optionally, updates-testing (though this should be fixed by what's in stable). Or just use RC2 - http://dl.fedoraproject.org/pub/alt/stage/17.RC2/ .
> > Or just use RC2 - http://dl.fedoraproject.org/pub/alt/stage/17.RC2/
>

I found F17RC4. I can confirm, f17rc4 can be installed on RAID.

[root@f7rc4 ~]# mount | grep /dev/md
/dev/md1 on / type ext4 (rw,relatime,seclabel,user_xattr,barrier=1,data=ordered)
/dev/md3 on /var type ext4 (rw,relatime,seclabel,user_xattr,barrier=1,stripe=256,data=ordered)
/dev/md2 on /tmp type ext4 (rw,relatime,seclabel,user_xattr,barrier=1,stripe=256,data=ordered)

PS: Issue is fixed...
Just to put a nail in it, can you post the contents of /boot/grub2/device.map from such an install? Thanks.
(In reply to comment #74)
> Just to put a nail in it, can you post the contents of
> /boot/grub2/device.map from such an install? Thanks.

[root@localhost grub2]# cat device.map
# this device map was generated by anaconda
(hd0) /dev/vda
(hd1) /dev/vdb
Hmm, and vda and/or vdb are part of the RAID? Assuming yes, that's not quite what I was expecting. But maybe the f17 fix was in another part of the system. Can anyone clarify what exactly was changed in F17 that is expected to fix this bug?
(In reply to comment #76)

There are no wrong entries for /dev/md* - that's progress! ;-)
Ah yes, I was getting confused. Please ignore my comment above. Indeed, having vda/vdb in there is expected (I guess), the important thing is that md* has gone. I'm convinced of the fix. Thanks Mads and those who fixed and tested this :)
(In reply to comment #78) > Indeed, having vda/vdb in there is expected (I guess) I doubt they are necessary in this simple case, and I think it would be better if we only created a device.map when it _really_ was necessary. FWIW.
I have upgraded from f14 via f16 (32bit) to f17 (64bit) and the system won't boot.

dracut warning: no root device “block: devmapper vg_i7-lv_root” found
dracut warning: crypto LUKS UUID lc8f42f4-454c-45f5-ac78-318f2c9a6fa1 not found
dracut warning: lvm vg_i7l_root not found
Dropping to debug shell.
Sh: can't access tty; job control turned off

The installation process reported success, but after trying all the suggestions in this thread over the last 2 days I got nowhere. Is the Grub for f17 different? Should I be looking at a different bug?

The Grub menu reports the kernels as f16, but I know it is trying to boot f17 because it shows the same horizontal flash and the same messages

cannot open font file True
cannot open font file True

that I got when I installed it as dual boot on the laptop.

The first hard drive(s) in the BIOS are Marvell hardware raid and encrypted. The second (single) hard drive is partitioned roughly 50/50 ntfs/ext4 for windows7 and backup. The installation process saw all these partitions, asked for the encryption keys and installed to the raid. Grub sees windows7, and the progress bubble almost completes before dropping to the dracut warnings, but it does not ask for the encryption keys.

Help please!
(In reply to comment #73)
> > Or just use RC2 - http://dl.fedoraproject.org/pub/alt/stage/17.RC2/
> >
>
> I found F17RC4.
> I can confirm, f17rc4 can be installed on RAID.
>
> [root@f7rc4 ~]# mount | grep /dev/md
> /dev/md1 on / type ext4
> (rw,relatime,seclabel,user_xattr,barrier=1,data=ordered)
> /dev/md3 on /var type ext4
> (rw,relatime,seclabel,user_xattr,barrier=1,stripe=256,data=ordered)
> /dev/md2 on /tmp type ext4
> (rw,relatime,seclabel,user_xattr,barrier=1,stripe=256,data=ordered)
>
> PS: Issue is fixed...

I'm closing this out, since it appears to have been resolved.