Bug 533621
Summary: | Can't Boot After F12 b2 DVD Upgrade on system with RAID1 /boot | ||||||||
---|---|---|---|---|---|---|---|---|---|
Product: | [Fedora] Fedora | Reporter: | Justin Newman <eqisow> | ||||||
Component: | anaconda | Assignee: | Radek Vykydal <rvykydal> | ||||||
Status: | CLOSED RAWHIDE | QA Contact: | Fedora Extras Quality Assurance <extras-qa> | ||||||
Severity: | high | Docs Contact: | |||||||
Priority: | low | ||||||||
Version: | 12 | CC: | awilliam, ddumas, eqisow, jlaska, lili, vanmeeuwen+fedora | ||||||
Target Milestone: | --- | ||||||||
Target Release: | --- | ||||||||
Hardware: | All | ||||||||
OS: | Linux | ||||||||
Whiteboard: | |||||||||
Fixed In Version: | anaconda-13.9-1 | Doc Type: | Bug Fix | ||||||
Doc Text: | Story Points: | --- | |||||||
Clone Of: | Environment: | ||||||||
Last Closed: | 2010-02-23 19:51:35 UTC | Type: | --- | ||||||
Regression: | --- | Mount Type: | --- | ||||||
Documentation: | --- | CRM: | |||||||
Verified Versions: | Category: | --- | |||||||
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |||||||
Cloudforms Team: | --- | Target Upstream Version: | |||||||
Description
Justin Newman
2009-11-07 22:37:08 UTC
there is no 'beta 2', can you be more specific what you tested with?

-- Fedora Bugzappers volunteer triage team https://fedoraproject.org/wiki/BugZappers

Sorry, my mistake. I don't know how I got the two stuck in my head; it was just the beta.
http://mirrors.kernel.org/fedora/releases/test/12-Beta/Fedora/x86_64/iso/Fedora-12-Beta-x86_64-netinst.iso

Liam, can you please check whether you can reproduce this during your installer testing? Thanks.

looks like this may be the same as 533533.

more questions: is this hardware, BIOS or software RAID? what type exactly?

Justin, could you describe how you remapped the devices and where grub was installed in F11 - boot partition or mbr (F11 anaconda would install it in mbr for both boot partition and mbr choices, but you may have fixed it manually)?

These log files from the upgraded machine can be helpful:
/boot/grub/grub.conf
/boot/grub/device.map
/root/anaconda-ks.cfg
/var/log/program.log
/var/log/storage.log
/var/log/anaconda.log
/var/log/anaconda.syslog
/etc/sysconfig/grub

Also some more detailed description of your configuration could be helpful. Isn't it the same as that of bug #533545?

Created attachment 368266 [details]
requested logs and configs

To answer the first question, this is software raid with mdadm.

Sure, to fix it afterwards I booted into a live disk, assembled the /boot raid and mounted it under /boot in the live system (so I didn't have to bother assembling my root fs). Then 'grub-install --recheck /dev/sdc'. Without the recheck parameter I received the error, "does not have any corresponding BIOS drive." I thought that was interesting, because in the case of bug #533545 the --recheck parameter was not necessary to fix it.
(Short answer, it was installed on the MBR of sdc)

Yes, the system configuration is the same as bug #533545. Requested files are attached.

Thanks for the logs, I can read this from them:

F11 was installed with this driveorder specified: sdc,sdd,sda,sdb,sde,sdf. Grub was installed into the mbr of sdc, which was mapped to (hd0). Anaconda generated /boot/grub/device.map containing this mapping.

The present /boot/grub/device.map is different:

(fd0) /dev/fd0
(hd0) /dev/sda
(hd1) /dev/sdb
(hd2) /dev/sdc
(hd3) /dev/sdd

It maps /dev/sdc to (hd2), and was probably created as a result of "grub-install --recheck /dev/sdc". The need for the --recheck option suggests that the former device.map was not valid.

The log from upgrade says:
and grub update failed using mapping /dev/sdc to (hd2):

GNU GRUB version 0.97 (640K lower / 3072K upper memory)

[ Minimal BASH-like line editing is supported. For the first word, TAB lists possible command completions. Anywhere else TAB lists the possible completions of a device/filename.]

grub> root (hd2,0)

Error 21: Selected disk does not exist

grub> install --stage2=/boot/grub/stage2 /grub/stage1 d (hd2) /grub/stage2 p (hd2,0)/grub/grub.conf

Error 12: Invalid device requested

grub>

Anaconda read from /etc/sysconfig/grub that grub (stage1) was installed on /dev/sdc which, according to the driveorder detected by anaconda during upgrade, is mapped to (hd2). It seems like the grub shell from the log above was using the device.map generated when installing F11 (with driveorder sdc,sda,sdb,sde,sdf) where (hd2) would correspond to sdb.

Anaconda should have updated the original device.map if it didn't correspond to the detected order, but I can't say if it really happened correctly. Could you collect /boot/grub/device.map.backup and /boot/grub/device.map.rpmsave from your system if they are present? They should give me the information.

Created attachment 368398 [details]
rpmsave and backup devicemap
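
For readers following the device.map discussion, here is a rough sketch of how the mapping can be read: it looks up which GRUB drive a map assigns to a device. The function name map_lookup and the temporary file are made up for illustration and are not part of grub or anaconda; the map contents are the ones quoted from the upgraded system.

```shell
# Hypothetical helper (not part of grub or anaconda): print the GRUB drive,
# e.g. (hd2), that a device.map assigns to a device. Returns non-zero when
# the device has no entry in the map.
map_lookup() {
    # $1 = path to a device.map file, $2 = device such as /dev/sdc
    awk -v dev="$2" '
        /^[[:space:]]*#/ { next }                  # skip comment lines
        $2 == dev { print $1; found = 1; exit }    # first matching entry wins
        END { exit (found ? 0 : 1) }
    ' "$1"
}

# Demo with the device.map quoted from the upgraded system.
map=$(mktemp)
cat > "$map" <<'EOF'
(fd0) /dev/fd0
(hd0) /dev/sda
(hd1) /dev/sdb
(hd2) /dev/sdc
(hd3) /dev/sdd
EOF
map_lookup "$map" /dev/sdc    # prints (hd2)
rm -f "$map"
```

A device that is absent from the map produces no output and a non-zero exit status, which is the situation grub-install reports as "does not have any corresponding BIOS drive."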
Now I see - the problem is that the original F11 install device.map (/boot/grub/device.map.rpmsave) is:

# this device map was generated by anaconda
(hd0) /dev/sdc
(hd1) /dev/sdd

Anaconda puts only devices used during grub install into it. During upgrade, only grub values already present in the file are updated, so I imagine the contents of the updated file would be something like this (/boot/grub/device.map.backup contains garbage, or is binary?):

# file updated by anaconda
# this device map was generated by anaconda
(hd0) /dev/sda
(hd1) /dev/sdb

/dev/sdc is missing - I'll come up with a patch fixing the update of device.map that should solve the issue.

Radek, is this an issue that will hit many people, or does it depend on having a complex disk layout? Is it an F12 regression or would it have been the same in F11? Just trying to assess its blocker-ness. thanks!

So, Hans de Goede states that this will not hit many people; possibly several who have /boot on software RAID, but that is not a huge amount, and not all of them. He also states it's probably not a regression since F11. Given the above, I think we can not consider it a blocker for F12 release. Will check with jkeating/notting/jlaska etc what they think.

(In reply to comment #12)
> Given the above,
> I think we can not consider it a blocker for F12 release. Will check with
> jkeating/notting/jlaska etc what they think.

Nice job getting to the bottom of this. I don't have any objections, let's toss it on Common_F12_Bugs for the users that will hit it.

for that I'd need an explanation I can understand =) do you get the explanation, jlaska? If not, could a kind anaconda person provide an explanation ratcheted down to a level a dumb bug monkey can understand?
thanks :)

(In reply to comment #14)
> for that I'd need an explanation I can understand =) do you get the
> explanation, jlaska? If not, could a kind anaconda person provide an
> explanation ratcheted down to a level a dumb bug monkey can understand? thanks
> :)

Same boat for me. I've got a queue of entries to write, so I'll be happy to take this one.

Radek ... can you help outline the problem so that we can document this issue including details on:

* how someone can determine if they're affected by this
* and how to workaround the issue

(In reply to comment #15)
>
> Radek ... can you help outline the problem so that we can document this issue
> including details on:

As Hans said, this bug is not a regression, but it is independent of having /boot on mdraid; the cause of the bug is a change of driveorder between install and upgrade (for specific configurations).

* Simple reproducer:
1) Install F11 on a machine with 2 drives, putting the /boot partition on the second disk (BIOS drive order) and the bootloader in the MBR of the second disk too. This can be achieved for example by specifying something like --driveorder=sdb,sda in ks.
2) Upgrade to F12.

* The cause and explanation:
An incomplete device.map file is the cause. The file is generated by anaconda during install and contains only info for drives used for grub - that is, drives containing the /boot partition, drives where grub stage1 will go, and drives containing chainloaded bootloaders - so in the case of the reproducer, it contains only "(hd0) /dev/sdb". When upgrading grub, if the detected driveorder is different (we can't take into account driveorder information from installation), the records are updated, so (hd0) becomes /dev/sda in the reproducer. Anaconda wants to upgrade grub which is in /dev/sdb (this info is stored in /etc/sysconfig/grub), but there is no record for /dev/sdb in device.map and so it fails.
* Fix:
Generate a more complete device.map during upgrade.

> * how someone can determine if they're affected by this

The upgraded system won't boot (probably just a cursor on screen, or the message from the description), and when investigating it from rescue mode, the /mnt/sysimage/boot/grub/device.map file will not contain the <DEVICE> given in /mnt/sysimage/etc/sysconfig/grub on the line boot=<DEVICE>.

> * and how to workaround the issue

chroot /mnt/sysimage
grub-install --recheck <DEVICE>

Also, a symptom of this bug is that just "grub-install <DEVICE>" without the --recheck option gives the error message "<DEVICE> does not have any corresponding BIOS drive."

This should be fixed in version 13.9 of anaconda.
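
The "am I affected" check described above could be scripted roughly as follows. This is only a sketch under stated assumptions: the function name affected_by_bug is made up, it assumes /etc/sysconfig/grub carries a plain boot=<DEVICE> line, and it assumes device.map entries look like "(hd0) /dev/sda". The demo builds a throwaway directory mirroring the two-drive reproducer instead of touching a real system.

```shell
# Hypothetical check (not an anaconda or grub tool). Exit status:
#   0 = boot= device is missing from device.map (matches this bug)
#   1 = boot= device is present in device.map
#   2 = no boot= line found, cannot tell
affected_by_bug() {
    root=${1:-/mnt/sysimage}    # rescue mode mounts the system here
    # boot=<DEVICE> records where grub stage1 was installed
    dev=$(sed -n 's/^boot=//p' "$root/etc/sysconfig/grub" | head -n 1)
    [ -n "$dev" ] || return 2
    # look for the device in the second column of a non-comment line
    if awk -v dev="$dev" '
           /^[[:space:]]*#/ { next }
           $2 == dev { found = 1 }
           END { exit (found ? 0 : 1) }
       ' "$root/boot/grub/device.map"; then
        return 1
    fi
    return 0
}

# Demo: fake root reproducing the state after upgrading the reproducer,
# where grub lives on /dev/sdb but device.map only knows /dev/sda.
demo=$(mktemp -d)
mkdir -p "$demo/etc/sysconfig" "$demo/boot/grub"
echo 'boot=/dev/sdb' > "$demo/etc/sysconfig/grub"
echo '(hd0) /dev/sda' > "$demo/boot/grub/device.map"
if affected_by_bug "$demo"; then
    echo "affected: boot= device is missing from device.map"
fi
rm -rf "$demo"
```

From rescue mode the call would be affected_by_bug /mnt/sysimage; if it reports the system as affected, the chroot and grub-install --recheck workaround above applies.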