Bug 1096358

Summary: System fails to boot with "unaligned pointer" error after live install
Product: [Fedora] Fedora Reporter: Gene Czarcinski <gczarcinski>
Component: grub2 Assignee: Peter Jones <pjones>
Status: CLOSED RAWHIDE QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: high Docs Contact:
Priority: unspecified    
Version: rawhide CC: anaconda-maint-list, awilliam, bcl, dennis, g.kaviyarasu, jonathan, lkundrak, mads, pjones, robatino, vanmeeuwen+fedora
Target Milestone: ---   
Target Release: ---   
Hardware: x86_64   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2014-05-22 14:51:21 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 1043119    
Attachments:
Description | Flags
screenshot of virtual system failing to boot | none
pkglist in lxde-livecd | none
kickstart file used to build the livecd iso | none
program.log from a live install with grubby debugging enabled | none
strace output from grub2-install on an affected system | none
strace output from grub2-install on a non-affected system | none

Description Gene Czarcinski 2014-05-09 18:06:28 UTC
Created attachment 894140 [details]
screenshot of virtual system failing to boot

Created a new livecd-lxde based on the "fresh as of 20140509 rawhide" [the ks and pkglist are attached]

booted up on qemu-kvm virtual system and ran /usr/bin/liveinst

selected custom partitioning, deleted everything and did auto btrfs config

installed and then rebooted ... oops

screenshot attached

Comment 1 Gene Czarcinski 2014-05-09 18:07:37 UTC
Created attachment 894143 [details]
pkglist in lxde-livecd

Comment 2 Gene Czarcinski 2014-05-09 18:10:01 UTC
Created attachment 894145 [details]
kickstart file used to build the livecd iso

I will keep this virtual system around a while in case you need something off it.

I believe you should be able to re-create the problem, as it happens every time for me.

Comment 3 Adam Williamson 2014-05-13 00:22:55 UTC
satellit (Thomas Gilliard) has reported seeing this too, on VMs and bare metal: https://fedoraproject.org/wiki/Test_Results:Fedora_21_Rawhide_2014_05_Install#Live_image . I confirm seeing it in a Rawhide-on-Rawhide VM (using a live image composed with today's Rawhide).

Nominating as an Alpha blocker, criterion https://fedoraproject.org/wiki/Fedora_21_Alpha_Release_Criteria#Expected_installed_system_boot_behavior - "A system installed with a release-blocking desktop must boot to a log in screen where it is possible to log in to a working desktop using a user account created during installation or a 'first boot' utility." It doesn't seem to do so, at least when installing from a live image.

Comment 4 Gene Czarcinski 2014-05-13 02:17:34 UTC
1.  After the install, running grub2-install and grub2-mkconfig results in a bootable system ... but that should not be necessary.

2. Before being "fixed", the /boot/grub2/grub.cfg file is strange ... the rescue entry is first and the regular kernel is second ... I have never seen that before.
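
For anyone who wants to check the entry order from point 2 on their own install, here is a rough sketch. It assumes the menu entries in /boot/grub2/grub.cfg start at column 0 and quote their titles with single quotes; the function name list_menuentries is made up for this example.

```shell
# list_menuentries FILE
# Print the title of each top-level menuentry in FILE, in file order.
# Assumption: titles are single-quoted, e.g.  menuentry 'Fedora ...' --class fedora {
list_menuentries() {
    sed -n "s/^menuentry '\([^']*\)'.*/\1/p" "$1"
}

# Example use on an installed system:
#   list_menuentries /boot/grub2/grub.cfg
```

If the first line printed is the rescue entry and the regular kernel comes second, the file has the odd ordering described above.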

Comment 5 Gene Czarcinski 2014-05-13 02:22:18 UTC
If the grubby package is one where my patch is included and DEBUG is enabled (or even if my patch is not included but DEBUG is enabled), we should be able to see whether grubby is involved, since the messages it spews are captured in the anaconda logs.

Comment 6 Adam Williamson 2014-05-13 02:42:49 UTC
I have an image with a debug-enabled grubby here, I'll try and attach the relevant anaconda logs later tonight.

Comment 7 Adam Williamson 2014-05-13 06:40:18 UTC
Created attachment 895010 [details]
program.log from a live install with grubby debugging enabled

Here's program.log from an install affected by the bug, with grubby debugging enabled. This is with a grubby build with Gene's patch for https://bugzilla.redhat.com/show_bug.cgi?id=1094489 applied, but I don't believe that patch is relevant here: reports indicate the bug has affected stock images too. I'm attaching this just for the debug output.

Comment 8 Adam Williamson 2014-05-20 21:21:33 UTC
pjones has been working on this, but it doesn't look like an easy fix at all. We're currently having trouble identifying the problem.

Just for the record, here's some of the data we've found on affected / not affected configurations:

1) No non-live images seem to be affected by this.
2) Builds from both 'fedora-livecd-desktop.ks' (the old "Desktop" kickstart) and 'fedora-live-workstation.ks' (the new "Workstation" kickstart) seem to be affected.
3) SoaS images do *not* appear to be affected: grub2 runs successfully after an install from Fedora-Live-SoaS-x86_64-rawhide-20140520.iso , but not from Fedora-Live-Workstation-x86_64-rawhide-20140520.iso .
4) Not currently sure about KDE, Xfce etc.
5) The issue showed up some time between 2014-05-05 and 2014-05-07, according to satellit's testing.

Comment 9 Adam Williamson 2014-05-20 21:27:38 UTC
On 2014-05-05, there was a new kernel build: kernel-3.15.0-0.rc3.git5.3.fc21 . On 2014-05-06, there was a new anaconda build: anaconda-21.35-1.fc21 . Possibly significant changelog entry: "install: Move Payload postInstall() after bootloader (walters)". There was another new kernel build: kernel-3.15.0-0.rc4.git0.1.fc21 . On 2014-05-07, there was another new kernel build: kernel-3.15.0-0.rc4.git1.1.fc21 . There was also a new gcc - gcc-4.9.0-3.fc21 - but I don't believe the new kernel would've been built with the new gcc.

Comment 10 Adam Williamson 2014-05-21 02:11:32 UTC
This also affects Fedora-Live-KDE-x86_64-rawhide-20140520.iso .

Comment 11 Adam Williamson 2014-05-21 06:15:55 UTC
Created attachment 897840 [details]
strace output from grub2-install on an affected system

As requested by pjones, this is the output of:

strace -s64 -v -f grub2-install /dev/sda > grub2-install.strace 2>&1

from rescue mode on an affected system (installed from the 05-20 KDE live image), after chrooting and deleting /boot/grub2/i386-pc and /boot/grub2/grubenv . The bug is still present after doing this. Will attach similar output from a non-affected install (SoaS) for comparison.
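
The steps above can be sketched roughly as follows. This is only an illustration, not the exact session: the mount point /mnt/sysimage and the output path /root/grub2-install.strace are assumptions, and the helper names run and capture_grub_install_trace are made up. By default it only prints the commands; set DRY_RUN=0 to actually execute them.

```shell
# Hypothetical helper: print the command (DRY_RUN=1, the default) or run it.
run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# capture_grub_install_trace [ROOT] [DISK]
# ROOT: where the installed system is mounted (assumed: /mnt/sysimage)
# DISK: bootloader target (/dev/sda, as in the command above)
capture_grub_install_trace() {
    root=${1:-/mnt/sysimage}
    disk=${2:-/dev/sda}
    # Delete the previously installed grub2 modules and environment block.
    run chroot "$root" rm -rf /boot/grub2/i386-pc /boot/grub2/grubenv
    # Re-run grub2-install under strace, keeping stdout and stderr together.
    # The output path inside the chroot is an assumption.
    run chroot "$root" sh -c "strace -s64 -v -f grub2-install $disk > /root/grub2-install.strace 2>&1"
}
```

Running capture_grub_install_trace with no arguments prints the two chroot commands so they can be reviewed before anything is deleted.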

Comment 12 Adam Williamson 2014-05-21 06:43:06 UTC
Created attachment 897842 [details]
strace output from grub2-install on a non-affected system

Here's the same output from a non-affected case (an install of the 05-20 SoaS image).
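
One possible way to compare the two attached traces is to reduce each to a per-syscall count and diff the results. This is a sketch only: it assumes the usual strace line layout (an optional "[pid N] " prefix followed by the syscall name and an opening parenthesis), and the function name syscall_counts is made up.

```shell
# syscall_counts FILE
# Print "name count" for every syscall name found in a strace log, sorted by name.
# Lines that do not start with a syscall (signals, exit notices, program output)
# are ignored.
syscall_counts() {
    sed 's/^\[pid *[0-9]*\] //' "$1" |
        sed -n 's/^\([a-z_0-9][a-z_0-9]*\)(.*/\1/p' |
        sort | uniq -c | awk '{print $2, $1}'
}

# Example use (file names are placeholders for the two attachments):
#   syscall_counts affected.strace    > affected.counts
#   syscall_counts nonaffected.strace > nonaffected.counts
#   diff affected.counts nonaffected.counts
```

Counts alone will not pinpoint the fault, but a syscall that shows up in only one trace is a reasonable place to start reading.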

Comment 13 Adam Williamson 2014-05-22 00:16:11 UTC
http://koji.fedoraproject.org/koji/buildinfo?buildID=518131 seems to resolve this, for me. I built a Workstation-ish live image to test, and boot of the installed system succeeds. I also tested that an SoaS-ish live image still works.