Bug 743273
Field | Value
---|---
Summary | grub2 fails to install on IMSM raid device
Product | [Fedora] Fedora
Component | grub2
Version | 16
Hardware | x86_64
OS | Unspecified
Status | CLOSED WONTFIX
Severity | high
Priority | unspecified
Reporter | Jes Sorensen <Jes.Sorensen>
Assignee | Peter Jones <pjones>
QA Contact | Fedora Extras Quality Assurance <extras-qa>
CC | agajan, awilliam, dennis, dledford, jensting, mads, mattyclarkson, pjones, vserbine, xaphir
Doc Type | Bug Fix
Last Closed | 2013-02-13 21:09:52 UTC

Description
Jes Sorensen
2011-10-04 13:03:05 UTC
Forgot to include the grub version number: grub2-1.99-6.fc16.x86_64

I did complete a successful install from F16 Beta DVD media to an Intel BIOS RAID-1 array while testing https://bugzilla.redhat.com/show_bug.cgi?id=742226 - the RAID-1 array was the only install target, so it got the bootloader on there somehow, though I'm not sure precisely what command the installer ran. It would help if you could test this from the installer rather than manually.

Trust me, I have *tried* doing this from the installer, about 20 times, but it isn't possible due to BZ#742888. I did manage somehow to get the installer to run it at some point, as an upgrade I think, but even the grub2 install failed.
Jes

What if you add md126 to /boot/grub2/device.map (assuming it's not already there) and then try again?

jes: so, you can't reproduce 742888 any more, and the other bug you hit there (731356) has a known workaround (delete the LANG= parameter), so can you try again with a clean install and report the result?

or specify the device to install to in the format used in device.map (on my system it seems to use /dev/disk/by-id names)?

Discussed at 2011-10-07 blocker review meeting. Agreed that it's unclear what's wrong here (if anything) and we need more information to evaluate the blocker status of this bug. Jes, please test more and provide more data (at least grub config), and pjones when you can, let us know what's going wrong. thanks!

*** Bug 744054 has been marked as a duplicate of this bug. ***

I hit a similar error message when testing on my laptop so it does seem like there's a real issue here, though it would be good to know jes can still reproduce and test, as I can't (I had to have my laptop working so I converted it to soft RAID with a separate /boot partition). jes, can you confirm you're still able to test this?

Discussed at 2011-10-14 blocker review meeting. Agreed to punt on this again as we really need pjones to take a look at what's going wrong here.
*** Bug 746460 has been marked as a duplicate of this bug. ***

Jes: 746460 was strictly software raid; there was no bios raid involved.

Jes: I was able to fix the problem by downloading a system rescue cd (http://www.sysresccd.org) and doing a chroot on the raid array. Then I had to run "rmmod floppy" to get past an fd0 error that grub2-install generates when the bios floppy controller is enabled but no floppy drive is present. Disabling the floppy controller in the bios fixed that. (grub2-install will fail to recover from the fd0 error if it occurs.) Then you run grub2-install --recheck /dev/sda from the chroot terminal. After that, the array should boot. The original uuid parameters in /boot/grub2/grub.cfg and in /etc were all there as the F16 installer left them.

I'm pretty sure that didn't help my laptop case, but again, I can't test that one any more. :/

Sorry for the late reply, yes I can still reproduce this problem. Unlike xaphir's case, I don't have a fake floppy controller in this system, so that didn't make a difference. However, adding the md device to /boot/grub2/device.map makes the problem go away. It looks like grub2 has issues handling missing devices. I will upload my grub.cfg and my device.map files in a moment. Note these are the updated files, and also note that in this test case I was trying to put grub onto a different partition than the one specified in grub.cfg, for testing purposes.
Cheers, Jes

Created attachment 528736 [details]
device.map file, with the md device added
Created attachment 528737 [details]
grub.cfg
Note the grub.cfg specifies a different install device than the one causing
the failure. I was unable to install grub onto the raid device from anaconda
so I installed on a different partition and reproduced the error manually
on the command line instead.
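
The device.map workaround reported above (adding the md device so that grub2-install can resolve it) can be sketched as a small shell helper. This is a minimal sketch under stated assumptions, not what the reporter actually ran: `ensure_device_map_entry` is a hypothetical name, the `(hdN)` index it picks is simply the first unused one, and it compares device paths literally, so a map that uses /dev/disk/by-id names (as mentioned earlier in the thread) would need the by-id path passed in instead.

```shell
# Hypothetical helper: add DEVICE to a GRUB device.map if it is not listed yet.
# Each device.map line has the form "(hdN) /path/to/device".
ensure_device_map_entry() {
    map_file=$1
    device=$2

    # Nothing to do if some line already maps a drive to exactly this path.
    if awk -v dev="$device" '$2 == dev { found = 1 } END { exit !found }' \
        "$map_file" 2>/dev/null; then
        return 0
    fi

    # Pick the first (hdN) index that is not used in the file yet.
    next=0
    while grep -q "^(hd${next})" "$map_file" 2>/dev/null; do
        next=$((next + 1))
    done

    # Append the new mapping; this creates the file if it does not exist.
    printf '(hd%d)\t%s\n' "$next" "$device" >> "$map_file"
}
```

On the system described in this bug it would be called as `ensure_device_map_entry /boot/grub2/device.map /dev/md126` (as root) before retrying the install. The helper is idempotent, so running it twice leaves a single entry.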
neither pjones nor I can reproduce with a simple supported test case:
1) install f15 to Intel BIOS RAID
2) upgrade from an F16 DVD
for both of us this works: the upgraded system has a working (bootable) grub2 config. The case where I hit a similar error, I did a yum upgrade. It's possible there's still a bug lurking here but we may need more detail to figure out exactly what it is. As things stand I don't think we can say for sure there's a blocker here.
-- Fedora Bugzappers volunteer triage team https://fedoraproject.org/wiki/BugZappers

Ok, I ran some more testing - here is the data. My system is setup as follows:
4 x 500GB SATA drives
/dev/sd[ab] are assembled as a raid1 (/dev/md126)
/dev/sd[cd] are left as standalone drives
Fedora-16-Beta-TC1 iso put onto a USB stick using livecd-iso-to-disk
During install, I create two regular partitions on /dev/md126 for boot and /
Anaconda is not allowing me to install anaconda onto /dev/md126, but only offers me to put it onto the /boot partition (/dev/md126p1). Everything installs fine, no errors. Post boot I use the BIOS boot menu to ask for boot from the raid device, rather than one of the standalone disks. At this point it just hangs - I get a flashing cursor and nothing..... :(
Jes

"Anaconda is not allowing me to install anaconda onto /dev/md126, but only offers me to put it onto the /boot partition (/dev/md126p1)."

This is a separate bug - https://bugzilla.redhat.com/show_bug.cgi?id=744088

"Post boot I use the BIOS boot menu to ask for boot from the raid device, rather than one of the standalone disks. At this point it just hangs - I get a flashing cursor and nothing..... :("

Well, yes. There's no bootloader on the MBR of the RAID device. Of course it ain't going to work. 744088 should be 'fixed' in Final TC2 by popping up another disk selection screen earlier in installation that lets you pick the bootloader target disk. Can you please try with Final TC2 and let us know how it goes?
-- Fedora Bugzappers volunteer triage team https://fedoraproject.org/wiki/BugZappers

Discussed at 2011-10-24 QA meeting, functioning as a blocker review meeting. This is reading more and more like a niche case and/or pilot error, but we're punting on it again just until this afternoon, when pjones hopes to have more data.
-- Fedora Bugzappers volunteer triage team https://fedoraproject.org/wiki/BugZappers

Jes, can you please provide the feedback requested asap? we're short on time for RC...

I am happy to provide feedback, but you haven't told me what you want me to provide. I am pretty sure this is not pilot error, but it might be masked by 744088 at this point. I cannot confirm that before I can get access to an iso test image.

See comment #20. As I understand your description of your previous test, you installed the bootloader to the first partition on the disk, not to the MBR, so the disk was left with no MBR bootloader; naturally you can't boot from it. What I'd like you to do is do a test which doesn't hit that bug so we can tell whether anything is actually broken.
-- Fedora Bugzappers volunteer triage team https://fedoraproject.org/wiki/BugZappers

I already said that 744088 is fixed in TC2, so just test with TC2. http://dl.fedoraproject.org/pub/alt/stage/16.TC2/

It works if I install to a regular disk partition, but I haven't been able to install to the MBR as explained above. This is the first I have heard of TC2, so I will pull that now and report back as soon as I get it down.

Ok I retested using TC2. The results are mixed, but less bad than before:
1) If I boot the installer and let Anaconda do auto-partitioning onto the raid device, grub installs and boots ok.
2) If I do custom partitioning, it fails with grub saying the size of the ELF header of stage1 is wrong, or similar.
The custom partitioning included creating the magic boot partition that isn't referenced anywhere, but that adamw told me about on irc (I never needed this when I installed anything else on this system, so at least having a slightly more informative help/error message would be kinda good). I didn't save the logs from the failed boot, but I can try to reproduce it later and save them. This might be good enough for release, even if it isn't ideal.
Jes

I tried running a few more tests on this, and I cannot reproduce the issue with grub2 getting into a weird state post install, at least for now. It may have to do with using pre-created partitions at the initial install. However I wiped the partition table since then, so reproducing the exact same scenario will be hard. It seems to work for most cases with the latest fixes in place, so I recommend we don't keep this as a blocker for F16. Worth noting that TC2 pretty much hangs solid every time, upon reboot once the install has finished.
Jes

Reporter has requested this be un-proposed, so un-proposing.

I am using IMSM on the boot drive. My typical partitioning is as follows:
partition boot
partition boot2
logical volume lv0/root
logical volume lv0/root2
logical volume lv0/home
logical volume lv0/opt
Then, I alternate boot and root partitions with each new installation. So I might use boot and root for F15 and LVs boot2 and root2 for F16. This way, I don't have to restore my home directory from backup. Should I expect problems if I install F16 and keep this custom partitioning? Will it work if I install grub2 to the boot partition instead of the MBR? By the way, I don't really understand statement 2) in Comment #27.

from what we know right now I'd expect it to most likely work, but we really don't have a huge amount of data. the only problem you might hit would be https://bugzilla.redhat.com/show_bug.cgi?id=737508 ; you might want to check the alignment of the first boot partition, I guess.
But even if you hit that it should be workaround-able. But really, we only have maybe 5-6 different reports from IMSM so far.
-- Fedora Bugzappers volunteer triage team https://fedoraproject.org/wiki/BugZappers

Was this really fixed for f16? Or is it still a problem in f17?

I just tried a clean install of F17 using the Anaconda GUI install and it fails to install the bootloader on my Intel RAID Mirror 1. What information can I provide to help fix this bug? I used custom partitioning and have Windows 7 already installed. I tried installing GRUB2 via the LiveCD:

[root@localhost liveuser]# mdadm --misc --detail /dev/md126
/dev/md126:
Container : /dev/md127, member 0
Raid Level : raid1
Array Size : 488383488 (465.76 GiB 500.10 GB)
Used Dev Size : 488383620 (465.76 GiB 500.10 GB)
Raid Devices : 2
Total Devices : 2
State : clean, resyncing
Active Devices : 2
Working Devices : 2
Failed Devices : 0
Spare Devices : 0
Resync Status : 22% complete
UUID : a55f0c47:51511d71:fbb9100b:b6f83315
Number Major Minor RaidDevice State
1 8 0 0 active sync /dev/sda
0 8 16 1 active sync /dev/sdb

[root@localhost liveuser]# grub2-install /dev/md126
/usr/share/grub/grub-mkconfig_lib: line 53: 1790 Segmentation fault (core dumped) "${grub_probe}" -t fs "$path" > /dev/null 2>&1
Path `/boot/grub2' is not readable by GRUB on boot. Installation is impossible. Aborting.

That looks like a different problem than what is tracked here. It is probably a bug that has been fixed in http://koji.fedoraproject.org/koji/buildinfo?buildID=322368 - please give that a try and file a new issue if you see the same problem with that version.

Thanks, Mads. I updated - currently have grub2-2.0.0.beta4 and I get the following error:

[root@localhost /]# grub2-install /dev/md126
/usr/sbin/grub2-bios-setup: warning: disk isn't LDM.
/usr/sbin/grub2-bios-setup: warning: Embedding is not possible. GRUB can only be installed in this setup by using blocklists. However, blocklists are UNRELIABLE and their use is discouraged..
/usr/sbin/grub2-bios-setup: error: will not proceed with blocklists.

I used the following guide www.webtechquery.com/index.php/2010/04/install-grub2-from-live-cd//usr to try and install grub2. Will beta6 be in one of the nine fedora repos and fix the above problem? Or is this a user (my) error?
Thanks, Matt

(In reply to comment #38)
Please clarify: Is that with beta4 or beta6? You say Intel raid - it looks more like software raid to me. I am not up to date with raid, but AFAIK you shouldn't install to the raid device but to each of the disks. A raid that spans whole disks might however not leave any room for installing a boot loader and is thus not a good idea for a bootable disk. Please research this elsewhere - I am probably wrong. Yes, this looks like a user error - or at least an error unrelated to the issue reported here.

That was with beta4. I updated grub2 using rawhide and tried beta5 as well - same error. It's Intel RAID - maybe it is getting picked up wrongly. P55 chipset. Thanks for the info and feedback, it's been helpful. I'll see what I can do from here on out.

Mads: Intel firmware RAID uses mdraid.

(In reply to comment #38)
> I updated - currently have grub2-2.0.0.beta4 and I get the following error:
>
> [root@localhost /]# grub2-install /dev/md126
> /usr/sbin/grub2-bios-setup: warning: disk isn't LDM.
This is bad. It's basically an assert failure. For some reason GRUB thinks that you use LDM and then it sees that it's not the case. Can you try
> grub2-install --debug /dev/md126

*** Bug 832872 has been marked as a duplicate of this bug. ***

This message is a reminder that Fedora 16 is nearing its end of life. Approximately 4 (four) weeks from now Fedora will stop maintaining and issuing updates for Fedora 16. It is Fedora's policy to close all bug reports from releases that are no longer maintained. At that time this bug will be closed as WONTFIX if it remains open with a Fedora 'version' of '16'.
Package Maintainer: If you wish for this bug to remain open because you plan to fix it in a currently maintained version, simply change the 'version' to a later Fedora version prior to Fedora 16's end of life.

Bug Reporter: Thank you for reporting this issue and we are sorry that we may not be able to fix it before Fedora 16 is end of life. If you would still like to see this bug fixed and are able to reproduce it against a later version of Fedora, you are encouraged to click on "Clone This Bug" and open it against that version of Fedora.

Although we aim to fix as many bugs as possible during every release's lifetime, sometimes those efforts are overtaken by events. Often a more recent Fedora release includes newer upstream software that fixes bugs or makes them obsolete.

The process we are following is described here: http://fedoraproject.org/wiki/BugZappers/HouseKeeping

Fedora 16 changed to end-of-life (EOL) status on 2013-02-12. Fedora 16 is no longer maintained, which means that it will not receive any further security or bug fix updates. As a result we are closing this bug. If you can reproduce this bug against a currently maintained version of Fedora please feel free to reopen this bug against that version. Thank you for reporting this bug and we are sorry it could not be fixed.
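
For anyone retesting this on a maintained release: one diagnosis earlier in the thread was that a disk with no bootloader in its MBR cannot boot. A rough way to check that before rebooting is to look at the 440-byte boot code area at the start of the device. The sketch below is a heuristic under stated assumptions: `mbr_has_boot_code` is a hypothetical helper, it only reports whether that area contains any non-zero byte, and it cannot tell GRUB apart from any other boot code.

```shell
# Hypothetical helper: succeed if the MBR boot code area (the first 440 bytes)
# of a block device or disk image contains at least one non-zero byte.
mbr_has_boot_code() {
    target=$1

    # Dump the first 440 bytes as hex, then delete spaces, newlines and every
    # '0' digit. Anything left over means some byte was not 0x00.
    nonzero=$(dd if="$target" bs=440 count=1 2>/dev/null \
        | od -An -v -tx1 \
        | tr -d ' \n0')

    [ -n "$nonzero" ]
}
```

A call such as `mbr_has_boot_code /dev/md126 && echo "boot code present"` needs read access to the device, so it would normally be run as root. An empty result only shows that nothing was written to the MBR; a non-empty one does not prove that the installed bootloader works.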