Bug 494995 - Aborting upgrade of F10 to F11-beta leaves system unbootable
Status: CLOSED RAWHIDE
Product: Fedora
Classification: Fedora
Component: anaconda
Version: rawhide
Hardware: i686 Linux
Priority: low
Severity: high
Assigned To: Anaconda Maintenance Team
QA Contact: Fedora Extras Quality Assurance
Reported: 2009-04-08 20:05 EDT by Dick Franks
Modified: 2009-05-01 17:06 EDT (History)
Doc Type: Bug Fix
Last Closed: 2009-05-01 17:06:32 EDT


Description Dick Franks 2009-04-08 20:05:54 EDT
Description of problem:

Upgrade of D610 running kernel 2.6.27.21-170.2.56.fc10.i686 using Fedora-11-Beta-i386 DVD ISO misidentifies system architecture as i586.

Loader has already been destroyed before user is invited to abort upgrade process.


Version-Release number of selected component (if applicable):


How reproducible:
100%

Steps to Reproduce:
1. Boot F11-Beta-i386 DVD
2. Follow upgrade existing system route
3. After error message:
"The arch of the release of Fedora you are upgrading to appears to be i586 which does not match your previously installed arch of i686. This is likely to not succeed. Are you sure you wish to continue the upgrade process?"
Answer "No" to abort process.

4. Remove DVD
5. Reboot system


Expected results:

Original F10 system should remain undamaged until user commits to install.


Actual results:

F10 loader destroyed.
System attempts to start but never reaches a login prompt.
Get [FAIL] for large number of init scripts.


Additional info:

Recover using Fedora-10-i386 DVD, selecting "Upgrade existing system" and "Install new loader". Recover /boot/grub/grub.conf from grub.conf.rpmsave to run latest kernel.
Comment 1 Chris Lumens 2009-04-09 11:15:36 EDT
Can you try this with rawhide and see if you're still experiencing a problem?  A quick test here did not reproduce your issue, and we do not do any bootloader writing until after you would have gotten the upgrade arch message.

Incidentally, the warning you're getting there is essentially harmless for you, and you can safely answer "yes" to it.
Comment 2 Jerry Amundson 2009-04-29 13:02:16 EDT
(In reply to comment #0)
> Steps to Reproduce:
> 1. Boot F11-Beta-i386 DVD
> 2. Follow upgrade existing system route

Did you, by chance, use ext4migrate and switch / to ext4?
Comment 3 Jerry Amundson 2009-04-29 17:09:57 EDT
The problem (for me) is that by this time /etc/fstab has already been modified with "ext4" for the selected partitions, but with the abort no migration is done to the actual partition.

I suggest this as a blocker.
Comment 4 Chris Lumens 2009-04-29 17:17:00 EDT
Considering you have to use a special hidden parameter to get support for migrate, therefore making it not available by default, I don't see how that could be a blocker for this release.  If we still have weird behavior on whatever release where we enable fs migration by default, then yes this could be a blocker candidate there.
Comment 5 Jerry Amundson 2009-04-29 17:29:19 EDT
(In reply to comment #4)
> Considering you have to use a special hidden parameter to get support for
> migrate, therefore making it not available by default, I don't see how that
> could be a blocker for this release.  If we still have weird behavior on
> whatever release where we enable fs migration by default, then yes this could
> be a blocker candidate there.  

Heh, you are right, of course. My brain forgot that important little word I had typed. :)

FWIW, I recovered by booting Repair, "vi /mnt/sysimage/etc/fstab" and changing ext4 to ext3.
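
The manual fstab edit described above can also be sketched in Python. This is only an illustration, not something anaconda or the rescue environment provides: the helper name `revert_fstype` is hypothetical, and it assumes the standard whitespace-separated fstab layout in which the third field is the filesystem type.

```python
def revert_fstype(fstab_text, old="ext4", new="ext3"):
    """Return fstab_text with the filesystem-type field changed from old to new.

    Hypothetical helper: only the third field of each entry is inspected,
    so the string "ext4" inside comments or mount options is left alone.
    """
    out = []
    for line in fstab_text.splitlines():
        fields = line.split()
        # Leave comments, blank lines, and short/malformed entries untouched.
        if line.lstrip().startswith("#") or len(fields) < 3 or fields[2] != old:
            out.append(line)
            continue
        fields[2] = new
        # Rewritten entries are re-joined with tabs; column alignment is lost.
        out.append("\t".join(fields))
    trailing = "\n" if fstab_text.endswith("\n") else ""
    return "\n".join(out) + trailing
```

From the rescue environment, the function would be applied to the contents of /mnt/sysimage/etc/fstab and the result written back, ideally after saving a copy of the original file.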
Comment 6 Dick Franks 2009-05-01 10:08:31 EDT
NOTE: My original complaint is about changes being made to the system BEFORE I gave permission.

Response to #1 request for info:

Problem still exists in F11-Preview-i386 DVD ISO. Quickest way to reproduce is:

1: Boot F10-i686-Live on Dell Latitude D610 with new disk.

2: Click "Install to Hard Drive" on desktop.

3: Take all default options (except for UK keyboard).

4: Reboot and check system runs ok.

5: Boot F11-Preview-i386 DVD.

6: Select "Upgrade existing system".

7: Select "Update Boot Loader"

8: At error message (arch mismatch), answer "No" to abort upgrade.

In my view, no change should have been made to user's disk at this point. This is clearly not the case. 

9: Reboot (F10) system.

System appears to boot, then hangs. Sometimes a flood of messages appears, too fast to read.
.
.
init: prefdm main process (2054) terminated with status 1
init: prefdm main process ended, respawning
init: tty4 respawning too fast, stopped
init: tty5 respawning too fast, stopped
init: tty2 respawning too fast, stopped
init: tty3 respawning too fast, stopped
init: tty6 respawning too fast, stopped
init: prefdm main process (2074) terminated with status 1
init: prefdm main process ended, respawning
init: tty4 respawning too fast, stopped
init: tty5 respawning too fast, stopped
init: tty2 respawning too fast, stopped
init: tty3 respawning too fast, stopped
init: tty6 respawning too fast, stopped
init: prefdm main process (2096) terminated with status 1
init: prefdm main process ended, respawning
init: tty4 respawning too fast, stopped
init: tty5 respawning too fast, stopped
init: tty2 respawning too fast, stopped
init: tty3 respawning too fast, stopped
init: tty6 respawning too fast, stopped
init: prefdm main process (2096) terminated with status 1
init: prefdm main process ended, respawning
init: prefdm respawning too fast, stopped


Getting into this mess without having made any non-default choice is unacceptable. For a humble end-user this is a catastrophe!
Comment 7 Chris Lumens 2009-05-01 10:26:55 EDT
Okay, I see what the problem is now.  There are two (!) checks for whether the repo you're upgrading with is the right arch, and I had only noticed the first one.  The second one is the one you are running into, and that happens after the bootloader screen has been presented.  Still, I don't see how it's possible for the bootloader config to be written until much later in the process.  What are the differences between your original grub.conf and the new one?
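
One way to answer the grub.conf question is to produce a unified diff of the two files. The sketch below uses Python's standard `difflib`; the function name `config_diff` and the default file labels are assumptions for illustration (the labels follow the grub.conf and grub.conf.rpmsave names mentioned in the original report), and which file counts as the "original" has to be decided by whoever runs it.

```python
import difflib


def config_diff(original_text, new_text,
                original_name="grub.conf.rpmsave", new_name="grub.conf"):
    """Return a unified diff of two config file contents as one string.

    Hypothetical helper: the default names are only labels for the diff
    header; no files are read or written here.
    """
    diff_lines = difflib.unified_diff(
        original_text.splitlines(keepends=True),
        new_text.splitlines(keepends=True),
        fromfile=original_name,
        tofile=new_name,
    )
    # An empty string means the two configurations are identical.
    return "".join(diff_lines)
```

An empty result would support the statement that the bootloader config is not written before the abort point.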
Comment 8 Dick Franks 2009-05-01 10:57:28 EDT
(In reply to comment #7)
> Okay, I see what the problem is now.  There are two (!) checks for whether the
> repo you're upgrading with is the right arch, and I had only noticed the first
> one.  The second one is the one you are running into, and that happens after
> the bootloader screen has been presented.  Still, I don't see how it's possible
> for the bootloader config to be written until much later in the process.  What
> are the differences between your original grub.conf and the new one?  

Focus here is not on architecture; that was addressed in Bug #494990.

Suggest you read (and possibly do) steps listed in #6
Comment 9 Dick Franks 2009-05-01 10:58:09 EDT
(In reply to comment #3)
> The problem (for me) is that by this time /etc/fstab has already been modified
> with "ext4" for the selected partitions, but with the abort no migration is
> done to the actual partition.
> I suggest this as a blocker.  

It is!

Messing with /etc/fstab before user commits to upgrade is wrong, with or
without magic hidden parameters!

Robust solution would be for Anaconda to mount existing partitions read-only
whilst gathering system information.

After user commits, remount system partitions read-write to perform upgrade.

If things go pear-shaped after that, user has to accept the consequences.
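
The read-only-then-remount proposal above can be sketched as follows. This is not anaconda's code: the class `UpgradeMounts` and its method names are hypothetical, and the only real interfaces assumed are the standard `mount -o ro`, `mount -o remount,rw` and `umount` commands. The command runner is injectable so the sequence can be inspected without root privileges.

```python
import subprocess


class UpgradeMounts:
    """Hypothetical sketch: inspect read-only, go read-write only on commit."""

    def __init__(self, run=subprocess.check_call):
        self.run = run          # callable that executes one argv list
        self.mounted = []       # mountpoints currently mounted by us
        self.committed = False

    def inspect(self, device, mountpoint):
        # Gather system information without being able to modify the disk.
        self.run(["mount", "-o", "ro", device, mountpoint])
        self.mounted.append(mountpoint)

    def commit(self):
        # Called only after the user has confirmed the upgrade.
        for mountpoint in self.mounted:
            self.run(["mount", "-o", "remount,rw", mountpoint])
        self.committed = True

    def abort(self):
        # Unmount in reverse order so nested mountpoints go first.
        for mountpoint in reversed(self.mounted):
            self.run(["umount", mountpoint])
        self.mounted = []
```

With this ordering, answering "No" at any prompt before `commit()` leads to `abort()` while every partition is still mounted read-only.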
Comment 10 Will Woods 2009-05-01 14:06:03 EDT
Reproduced here, as per comment #6. As far as I can tell /etc/fstab is unmodified. It has the same contents and timestamp as it did before I started the F11 install.

Reading the boot failures more carefully I'm nearly certain this is an SELinux problem - most of the failures are due to denials while trying to read libraries etc. 

Booting F10 after creating '/.autorelabel' fixes the problem. Booting with 'autorelabel' should do the same thing.

So, something in the upgrade process before that (second) arch check is causing some SELinux labels to be modified. If we can't easily avoid that, creating '/.autorelabel' immediately after the r/w mount - and removing it at the end of the install - is a possible workaround.
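
The '/.autorelabel' workaround suggested above might look roughly like this. It is a sketch of the idea only, not the patch that was eventually committed: the name `autorelabel_guard` is hypothetical, and it assumes an aborted upgrade surfaces as an exception inside the `with` block.

```python
import contextlib
import os


@contextlib.contextmanager
def autorelabel_guard(root):
    """Create <root>/.autorelabel now; remove it only if the body succeeds.

    Hypothetical helper. If the body raises (for example on an aborted
    upgrade), the marker is deliberately left in place so that the next
    boot of the installed system relabels the filesystem.
    """
    marker = os.path.join(root, ".autorelabel")
    # Create the marker immediately after the read-write mount.
    open(marker, "a").close()
    yield marker
    # Reached only on a clean exit from the with-block.
    os.remove(marker)
```

Here `root` would be the mountpoint of the installed system, such as /mnt/sysimage. A marker that already existed before the upgrade started would also be removed on success, which a fuller version would need to handle.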
Comment 11 Chris Lumens 2009-05-01 17:06:32 EDT
Yup, contexts were being set incorrectly on mount.  I have committed a patch for this and it will be fixed in the next build of anaconda.  Thanks for the bug report and help debugging.
