Bug 533545 - Fedora 11 preupgrade to F12/rawhide destroys grub on raid (warning about grub on RAID not displayed)
Keywords:
Status: CLOSED WONTFIX
Alias: None
Product: Fedora
Classification: Fedora
Component: preupgrade
Version: 11
Hardware: All
OS: Linux
Priority: low
Severity: high
Target Milestone: ---
Assignee: Seth Vidal
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard: https://fedoraproject.org/wiki/Common...
Depends On:
Blocks:
Reported: 2009-11-07 06:45 UTC by Justin Newman
Modified: 2014-01-21 23:12 UTC (History)

Fixed In Version:
Clone Of:
Environment:
Last Closed: 2010-06-28 15:24:45 UTC
Type: ---
Embargoed:



Description Justin Newman 2009-11-07 06:45:40 UTC
Description of problem: I ran Fedora 11's preupgrade to rawhide/Fedora 12. When it finished, it prompted me to reboot; I rebooted and got nothing. No GRUB errors at all, just a blinking cursor as if the disk were blank. I tried booting from all four drives to no avail. After booting into the Fedora 11 live disk I was able to repair GRUB; however, my raid1 on /boot would not assemble (bad superblock on /dev/sdc1), so I had to assemble it degraded.

After repairing GRUB my system was able to boot, but each of my three RAID arrays was missing a partition (each from a different disk). The degraded arrays are currently in the process of rebuilding.

I now have a /boot/upgrade folder, but the system never rebooted into Anaconda and my menu.lst remains unchanged.

My partition layout is a bit complicated, which I believe may have caused the issue, and is as follows:

/dev/sda
- /dev/sda1 - swap
- /dev/sda2 - md2

/dev/sdb
- /dev/sdb1 - swap
- /dev/sdb2 - md2

/dev/sdc
- /dev/sdc1 - md0
- /dev/sdc2 - md2
- /dev/sdc3 - md1
- /dev/sdc5 - swap

/dev/sdd
- /dev/sdd1 - md0
- /dev/sdd2 - md2
- /dev/sdd3 - md1
- /dev/sdd5 - swap

/dev/md0, raid1 --> /boot
/dev/md1, raid1 --> LVM1 --> /, /home
/dev/md2, raid5 --> LVM2 --> /mnt/storage

GRUB was installed on /dev/sdc and /dev/sdd, the drives with the /boot array, which were the first two in my boot order. After the reboot, the following partitions had been removed from their arrays:

/dev/md0 --> /dev/sdd1
/dev/md1 --> /dev/sdc3
/dev/md2 --> /dev/sda2

As you can see, all of the missing partitions are from different disks. Also, all of the drives test OK. Also interesting is that /dev/sdd1 is *not* the one that supposedly had a bad superblock in the F11 live disk.
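
For anyone checking a similar layout, degraded arrays like the ones listed above can be spotted from /proc/mdstat. The following is a minimal sketch, assuming the usual mdstat layout in which each array header line is followed by a status line containing an "[expected/active]" counter. The sample text is hypothetical and is not output from the reporter's machine, and the function name is made up for illustration.

```python
import re

# Hypothetical sample in /proc/mdstat format (NOT taken from the reporter's system).
SAMPLE_MDSTAT = """\
Personalities : [raid1] [raid6] [raid5] [raid4]
md0 : active raid1 sdc1[0]
      200704 blocks [2/1] [U_]

md1 : active raid1 sdd3[1]
      104857536 blocks [2/1] [_U]

md2 : active raid5 sdb2[1] sdc2[2] sdd2[3]
      2930279808 blocks level 5, 64k chunk, algorithm 2 [4/3] [_UUU]

unused devices: <none>
"""


def degraded_arrays(mdstat_text):
    """Return {array_name: (expected_members, active_members)} for degraded arrays."""
    result = {}
    current = None
    for line in mdstat_text.splitlines():
        # Array header line, e.g. "md0 : active raid1 sdc1[0]"
        header = re.match(r"^(md\d+)\s*:", line)
        if header:
            current = header.group(1)
            continue
        # Status line carries a counter such as "[2/1]" (expected/active)
        status = re.search(r"\[(\d+)/(\d+)\]", line)
        if current and status:
            expected, active = int(status.group(1)), int(status.group(2))
            if active < expected:
                result[current] = (expected, active)
            current = None
    return result


if __name__ == "__main__":
    for name, (expected, active) in sorted(degraded_arrays(SAMPLE_MDSTAT).items()):
        print(f"{name}: {active} of {expected} members active")
```

On a live system the same function could be fed the contents of /proc/mdstat instead of the sample string.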

Version-Release number of selected component (if applicable): preupgrade-1.1.2-1.fc11.noarch


How reproducible: Umm.... shrug? I suppose the best way to try would be to duplicate my partition layout and attempt to run preupgrade. I haven't been brave enough to attempt the upgrade again.


Actual results: Nearly destroys both GRUB and my raid arrays


Expected results: Upgrades to F12?

Comment 1 Justin Newman 2009-11-07 15:52:56 UTC
After looking at the following bugs:

https://bugzilla.redhat.com/show_bug.cgi?id=444497
https://bugzilla.redhat.com/show_bug.cgi?id=500004
https://bugzilla.redhat.com/show_bug.cgi?id=504826

I've begun to understand better what happened. From the comments I gather that preupgrade on raid *should* work now, though perhaps not, since two of the bugs are still open/assigned.

Either way, if I understand correctly stage1 (initrd.img) should have booted no matter what my raid situation was like and the erasing of the MBR is something new, so I'm leaving this report open at the moment.

It may also be worth noting that my LVM volumes (root, /home, and /mnt/storage) are also LUKS encrypted volumes.

Comment 2 Adam Williamson 2009-11-08 07:26:13 UTC
Liam, can you see if you can reproduce this during your installer testing? I tried to reproduce in a VM but it hung while installing F11...

CC'ing jlaska for info.

-- 
Fedora Bugzappers volunteer triage team
https://fedoraproject.org/wiki/BugZappers

Comment 3 Adam Williamson 2009-11-09 16:50:59 UTC
Will Woods states that preupgrade expressly doesn't support the /boot on RAID case. In fact, it's supposed to detect if you have this set up and refuse to run, so if anything, that's the real bug here. That code would be in F11, so adjusting this bug to be against 11 and dropping from F12 Blocker list, as the fix doesn't actually go into F12 and can go as an F11 update.

Is this a BIOS RAID set, using dmraid rather than mdraid?
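
To illustrate the kind of check described here, the following is a minimal sketch of detecting /boot on an md device from mount-table text. It is NOT preupgrade's actual code: the function names are hypothetical, it assumes /proc/mounts-style lines ("device mountpoint fstype ..."), and it only recognizes mdraid devices named /dev/md*, not dmraid sets.

```python
def boot_device(mounts_text):
    """Return the device mounted on /boot, or the one on / if /boot is not separate."""
    by_mountpoint = {}
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) >= 2:
            # Later entries override earlier ones, as with stacked mounts
            by_mountpoint[fields[1]] = fields[0]
    return by_mountpoint.get("/boot") or by_mountpoint.get("/")


def boot_is_on_mdraid(mounts_text):
    """True if the filesystem holding /boot sits on a /dev/md* device (assumed naming)."""
    device = boot_device(mounts_text)
    return device is not None and device.startswith("/dev/md")


if __name__ == "__main__":
    # Hypothetical mount table loosely modeled on the layout in this report
    sample = (
        "/dev/mapper/vg-root / ext3 rw 0 0\n"
        "/dev/md0 /boot ext3 rw 0 0\n"
    )
    if boot_is_on_mdraid(sample):
        print("warning: /boot is on software RAID")
```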


Comment 4 Justin Newman 2009-11-09 18:12:09 UTC
No, this is mdraid. It is the F11 version of preupgrade, of course, but since that's what you use to upgrade to F12 it made sense to me for it to be an F12 blocker. It's obviously going to affect the F12 release.

Comment 5 seth vidal 2009-11-09 18:25:53 UTC
what version of preupgrade is it specifically?

rpm -q preupgrade

Comment 6 Adam Williamson 2009-11-09 18:27:52 UTC
justin: but it would get fixed in f11, not f12, so we don't have to block the f12 compose and related processes for it. Will has tested in his test VM and the warning came up correctly, so either you're using an old preupgrade, or there's something specific about your case which caused the test to fail.


Comment 7 Justin Newman 2009-11-09 18:41:41 UTC
Of course it's not F12 code and needs to be fixed in F11. It shouldn't interrupt the compose, etc for F12, but it *should* be fixed before GA.

Also, to be clear, I *did* get a warning, but the warning was just that it would pull the needed package after reboot and that a network connection would be required. It didn't refuse to do anything and in fact indicated that it would work fine.

The initial report states version preupgrade-1.1.2-1.fc11.noarch, which is what 'rpm -q preupgrade' returns. Also, it seems to be the newest version on both Koji and the mirrors.

I'm just concerned about the F12 release being as smooth as possible as I've already gotten my system upgraded with the DVD installer. :)

Comment 8 James Laska 2009-11-11 20:03:12 UTC
Tagging for addition to the Common_F12_Bugs wiki page

Comment 9 Will Woods 2009-11-11 20:36:52 UTC
Interestingly, I can prevent this from happening by running:
  sudo grub-install /dev/md1
(where md1 is my /boot partition).

Since we turned off printing "GRUB" at grub startup in either F10 or F11, I realize this might be the same root cause as bug 450143 (which is sort of a trainwreck now, but you get the idea). 

Here's the difference. Before rewriting the boot sector, running 'file' on the disk device yields:

x86 boot sector; GRand Unified Bootloader, stage1 version 0x3, boot drive 0x80, 1st sector stage2 0x5dc41, GRUB version 0.94; ...

Afterward it says:

x86 boot sector; GRand Unified Bootloader, stage1 version 0x3, stage2 address 0x2000, stage2 segment 0x200, GRUB version 0.94; ...

Notice that the method used to find stage2 seems to have changed. The latter will happily boot after running preupgrade; the former will get stuck in GRUB.
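
The two 'file' outputs above can be told apart mechanically. The following is a minimal sketch that classifies a boot-sector description by which stage2 field it contains. The labels "sector" and "address" and the function name are made up for illustration, and the matching relies only on the two strings quoted in this comment, so other GRUB versions may be described differently.

```python
import re

# The two descriptions quoted in this comment (truncated with "..." as in the original)
BEFORE = ("x86 boot sector; GRand Unified Bootloader, stage1 version 0x3, "
          "boot drive 0x80, 1st sector stage2 0x5dc41, GRUB version 0.94; ...")
AFTER = ("x86 boot sector; GRand Unified Bootloader, stage1 version 0x3, "
         "stage2 address 0x2000, stage2 segment 0x200, GRUB version 0.94; ...")


def stage2_lookup_method(file_output):
    """Classify how the boot sector describes the location of stage2."""
    if re.search(r"1st sector stage2 0x[0-9a-fA-F]+", file_output):
        return "sector"   # the form reported to get stuck after preupgrade
    if re.search(r"stage2 address 0x[0-9a-fA-F]+", file_output):
        return "address"  # the form reported to boot after preupgrade
    return "unknown"


if __name__ == "__main__":
    print(stage2_lookup_method(BEFORE))
    print(stage2_lookup_method(AFTER))
```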

Comment 10 Adam Williamson 2009-11-17 07:17:34 UTC
it doesn't seem right to add this to common_bugs if the fixed preupgrade will be in F11 and F10 tomorrow. is that going to happen, will?


Comment 11 James Laska 2009-11-17 13:23:33 UTC
(In reply to comment #10)
> it doesn't seem right to add this to common_bugs if the fixed preupgrade will
> be in F11 and F10 tomorrow. is that going to happen, will?

I don't see harm in having it there since people will likely not read the instructions for preupgrade and will perform the update without the latest version.

Comment 12 Will Woods 2009-11-17 14:48:45 UTC
1) the fixed preupgrade is still in updates-testing, not updates,
2) even if it was a stable update, it might be worth leaving a note that says "update preupgrade before you upgrade, for pete's sake", and
3) the actual "writing to grub.conf kills grub" bug is extremely hard to reproduce or trace and therefore remains undiagnosed and unfixed. The traceback when /boot is on RAID is fixed, though.

Comment 13 Bug Zapper 2010-04-28 11:09:03 UTC
This message is a reminder that Fedora 11 is nearing its end of life.
Approximately 30 (thirty) days from now Fedora will stop maintaining
and issuing updates for Fedora 11.  It is Fedora's policy to close all
bug reports from releases that are no longer maintained.  At that time
this bug will be closed as WONTFIX if it remains open with a Fedora 
'version' of '11'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version' 
to a later Fedora version prior to Fedora 11's end of life.

Bug Reporter: Thank you for reporting this issue and we are sorry that 
we may not be able to fix it before Fedora 11 is end of life.  If you 
would still like to see this bug fixed and are able to reproduce it 
against a later version of Fedora please change the 'version' of this 
bug to the applicable version.  If you are unable to change the version, 
please add a comment here and someone will do it for you.

Although we aim to fix as many bugs as possible during every release's 
lifetime, sometimes those efforts are overtaken by events.  Often a 
more recent Fedora release includes newer upstream software that fixes 
bugs or makes them obsolete.

The process we are following is described here: 
http://fedoraproject.org/wiki/BugZappers/HouseKeeping

Comment 14 Bug Zapper 2010-06-28 15:24:45 UTC
Fedora 11 changed to end-of-life (EOL) status on 2010-06-25. Fedora 11 is 
no longer maintained, which means that it will not receive any further 
security or bug fix updates. As a result we are closing this bug.

If you can reproduce this bug against a currently maintained version of 
Fedora please feel free to reopen this bug against that version.

Thank you for reporting this bug and we are sorry it could not be fixed.

