Bug 470905

Summary: anaconda installs the wrong kernel for i686 xen guests
Product: [Fedora] Fedora
Reporter: John Poelstra <poelstra>
Component: anaconda
Assignee: Anaconda Maintenance Team <anaconda-maint-list>
Status: CLOSED RAWHIDE
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: medium
Priority: medium
Version: rawhide
CC: anaconda-maint-list, berrange, bill-bugzilla.redhat.com, clalance, dcantrell, markmc, mishu, stickster, tcallawa, wwoods, xen-maint
Target Milestone: ---
Flags: stickster: fedora_requires_release_note?
Target Release: ---   
Hardware: All   
OS: Linux   
Fixed In Version: Doc Type: Bug Fix
Last Closed: 2009-03-24 14:18:21 EDT Type: ---
Bug Depends On:    
Bug Blocks: 438944, 480593    
Description Flags
virt-manager.log none

Description John Poelstra 2008-11-10 15:10:48 EST
Description of problem:
After successfully installing today's rawhide (Fedora 10) on a Xen guest and rebooting, the guest will not start and virt-manager returns a traceback.

Version-Release number of selected component (if applicable):
$ rpm -qa | egrep 'xen|virt' | sort

$ uname -a
Linux screamer 2.6.18-122.el5xen #1 SMP Mon Nov 3 18:49:46 EST 2008 i686 i686 i386 GNU/Linux

Additional info:
Traceback (most recent call last):
  File "/usr/share/virt-manager/virtManager/engine.py", line 514, in run_domain
  File "/usr/share/virt-manager/virtManager/domain.py", line 379, in startup
  File "/usr/lib/python2.4/site-packages/libvirt.py", line 228, in create
    if ret == -1: raise libvirtError ('virDomainCreate() failed', dom=self)
libvirtError: virDomainCreate() failed POST operation failed: (xend.err "Error creating domain: (2, 'Invalid kernel', 'elf_xen_note_check: ERROR: Will only load images built for the generic loader or Linux images')")
Comment 1 John Poelstra 2008-11-10 17:55:12 EST
FWIW... a guest install of the latest RHEL5.3 beta installs and boots fine
Comment 2 Daniel Berrange 2008-11-11 05:16:14 EST
Please provide the log files for

Comment 3 John Poelstra 2008-11-11 12:41:24 EST
Created attachment 323205 [details]
Comment 4 John Poelstra 2008-11-11 12:41:52 EST
Created attachment 323206 [details]
Comment 5 John Poelstra 2008-11-11 12:42:20 EST
Created attachment 323207 [details]
Comment 6 John Poelstra 2008-11-11 12:42:48 EST
Logs added.  Curious.... could this be a Fedora bug?
Comment 7 Chris Lalancette 2008-11-17 10:04:31 EST
Well, what it seems like is that the wrong kernel was installed for some reason.  I've definitely installed F-10 guests on RHEL-5.3 before, so something weird is going on.  This has the hallmarks of installing the non-PAE kernel inside the guest, which doesn't, I believe, have pv_ops turned on.  John, can you mount the guest disk loopback and find out which kernel anaconda installed?

Chris Lalancette
Comment 8 Mark McLoughlin 2008-11-17 10:10:44 EST
Yeah, that's the error you get if you try to boot a non-PAE i386 Fedora kernel in a Xen guest. See also bug #471268
Comment 9 John Poelstra 2008-11-18 12:58:35 EST
Here is what grub.conf looks like for the guest: 

title Red Hat Enterprise Linux Client (2.6.18-122.el5xen)
        root (hd0,0)
        kernel /vmlinuz-2.6.18-122.el5xen ro root=/dev/VolGroup00/LogVol00 rhgb quiet
        initrd /initrd-2.6.18-122.el5xen.img
Comment 10 John Poelstra 2008-11-18 13:01:05 EST
Disregard comment #9; that information is from the wrong image.
Comment 11 John Poelstra 2008-11-18 13:07:55 EST
Here is grub.conf from F10 guest that was installed on RHEL5.3 dom0

title Fedora (
        root (hd0,0)
        kernel /vmlinuz- ro root=UUID=3dfba3e5-65bc-4466-9499-f3bac2782f86 rhgb quiet
        initrd /initrd-
Comment 12 Chris Lalancette 2008-11-18 13:35:48 EST
OK, yeah.  That does confirm that anaconda chose the wrong kernel.  Odd, because I had done an F-10 install a couple of months ago and it chose the right one.  This is *probably* an anaconda bug against F-10.

Chris Lalancette
Comment 13 Chris Lalancette 2008-11-19 05:08:57 EST
OK.  I'm not exactly sure how the anaconda people would want to fix this, but the problem is in yuminstall.py, selectBestKernel(), here:

        # FIXME: this is a bit of a hack.  we shouldn't hard-code and
        # instead check by provides.  but alas.
        for k in ("kernel", "kernel-smp", "kernel-PAE"):
            if len(self.ayum.tsInfo.matchNaevr(name=k)) > 0:
                self.selectModulePackages(anaconda, k)
                foundkernel = True

        if not foundkernel and (isys.smpAvailable() or isys.htavailable()):
            try:
                ksmp = getBestKernelByArch("kernel-smp", self.ayum)
            except PackageSackError:
                ksmp = None
                log.debug("no kernel-smp package")

            if ksmp and ksmp.returnSimple("arch") == kpkg.returnSimple("arch"):
                foundkernel = True
                log.info("selected kernel-smp package for kernel")
                self.selectModulePackages(anaconda, ksmp.name)

                if len(self.ayum.tsInfo.matchNaevr(name="gcc")) > 0:
                    log.debug("selecting kernel-smp-devel")
                    self.selectPackage("kernel-smp-devel.%s" % (kpkg.arch,))

        if not foundkernel and isys.isPaeAvailable():
            try:
                kpae = getBestKernelByArch("kernel-PAE", self.ayum)
            except PackageSackError:
                kpae = None
                log.debug("no kernel-PAE package")

            if kpae and kpae.returnSimple("arch") == kpkg.returnSimple("arch"):
                foundkernel = True
                log.info("select kernel-PAE package for kernel")
                self.selectModulePackages(anaconda, kpae.name)

                if len(self.ayum.tsInfo.matchNaevr(name="gcc")) > 0:
                    log.debug("selecting kernel-PAE-devel")
                    self.selectPackage("kernel-PAE-devel.%s" % (kpkg.arch,))

        if not foundkernel:
            log.info("selected kernel package for kernel")
            self.selectModulePackages(anaconda, kpkg.name)

            if len(self.ayum.tsInfo.matchNaevr(name="gcc")) > 0:
                log.debug("selecting kernel-devel")
                self.selectPackage("kernel-devel.%s" % (kpkg.arch,))

Basically, we get out of that first loop without setting foundkernel to True (I'm not entirely sure what that first loop does).  Then we do the isys.smpAvailable() check, which is true, but that branch selects nothing because there is no kernel-smp package available.  Next, we do the isys.isPaeAvailable() check, but this doesn't fire because the guest has less than 4G of memory.  Finally, we fall through to the default, which is just to select "kernel".  This is confirmed by the anaconda logs:

08:58:03 INFO    : moving (1) to step postselection
08:58:03 DEBUG   : no kernel-smp package
08:58:03 INFO    : selected kernel package for kernel
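
The fall-through described above can be reproduced with a small stand-alone model. This is only a sketch, not anaconda code: select_kernel() and its parameters are hypothetical names, the first loop over already-selected packages is left out, and the arch comparison against kpkg is ignored.

```python
def select_kernel(available, smp_capable, pae_available):
    """Simplified model of the fallback chain quoted above.

    available     -- set of kernel package names present in the repo
    smp_capable   -- stands in for isys.smpAvailable() or isys.htavailable()
    pae_available -- stands in for isys.isPaeAvailable()
    """
    # SMP branch: only taken if a kernel-smp package actually exists.
    if smp_capable and "kernel-smp" in available:
        return "kernel-smp"
    # PAE branch: skipped whenever the PAE check reports False.
    if pae_available and "kernel-PAE" in available:
        return "kernel-PAE"
    # Default: the plain kernel package.
    return "kernel"


# The situation in this bug: no kernel-smp package, PAE check is False.
choice = select_kernel({"kernel", "kernel-PAE"}, smp_capable=True, pae_available=False)
print(choice)
```

With these inputs the model ends up at the default branch, which matches the "selected kernel package for kernel" log line.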

So, one way to fix this (suggested by Mark McLoughlin) would be to change:

        if not foundkernel and isys.isPaeAvailable():

to something like:

        if not foundkernel and (isys.isPaeAvailable() or running_PAE_kernel):

Where running_PAE_kernel would be set to True in the case that the anaconda installer is running on a PAE kernel.  We could determine running_PAE_kernel by doing "os.uname()[2]", and looking for the substring PAE.  I'm sure there are other solutions, but nothing is springing to mind at the moment.
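
A minimal sketch of that check, assuming the substring test on the kernel release is sufficient. The helper names is_pae_release() and running_pae_kernel() are hypothetical, and the release strings in the usage lines are illustrative only, not taken from this bug.

```python
import os


def is_pae_release(release):
    """Return True if a kernel release string contains the substring PAE."""
    return "PAE" in release


def running_pae_kernel():
    """Check the kernel the installer itself is currently running on."""
    # os.uname()[2] is the kernel release, the same value `uname -r` prints.
    return is_pae_release(os.uname()[2])


# Illustrative release strings (assumed format, not from the logs here).
print(is_pae_release("2.6.27.5-117.fc10.i686.PAE"))
print(is_pae_release("2.6.27.5-117.fc10.i686"))
```

The result of running_pae_kernel() would then be the value used for running_PAE_kernel in the modified condition.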

Chris Lalancette
Comment 14 Jesse Keating 2008-11-19 11:09:21 EST
What happens if you manually add say kernel-PAE to a kickstart file?  I suspect you'll get the right kernel (or both) and this would be an OKish workaround for F10.
Comment 15 Mark McLoughlin 2008-11-19 11:26:15 EST
(In reply to comment #14)
> What happens if you manually add say kernel-PAE to a kickstart file?  I suspect
> you'll get the right kernel (or both) and this would be an OKish workaround for
> F10.

We haven't tried it yet, but yeah that would probably be our workaround if it doesn't get fixed for F10.
Comment 16 Paul W. Frields 2008-11-19 12:10:12 EST
Should we add a release note for this for the 0-day update?
Comment 17 John Poelstra 2008-11-19 12:29:02 EST
I haven't done a kickstart install in a long time and I think it is an unreasonably high requirement for a "workaround".  This is also a regression from Fedora 9.
Comment 18 Jesse Keating 2008-11-19 12:42:41 EST
Given that you can't run Fedora as a Xen host, which means you need to have $SOMETHING_ELSE installed to run into this, I think it's unreasonable to consider this a blocker bug.  For now, using a kickstart script is likely a workable workaround.  In the near future we could have this fixed with an updates.img

The class of people who are going to have access to a Xen-capable setup, but not know how to do a basic kickstart install (especially given the Fedora and RHEL documentation on doing them), is going to be quite small in my opinion.  I do believe we should have some release notes for this issue, be they notes about using kickstart, using an updates.img, or both.
Comment 19 Will Woods 2008-11-24 17:34:17 EST
Release note added:


As I understand it, this only affects 32-bit Xen hosts, which are far less common than 64-bit hosts. Still, it would probably be good if we had an updates.img for this problem soon.
Comment 20 Bug Zapper 2008-11-26 00:09:08 EST
This bug appears to have been reported against 'rawhide' during the Fedora 10 development cycle.
Changing version to '10'.

More information and reason for this action is here:
Comment 21 Daniel Berrange 2009-03-24 14:18:21 EDT
Installed a new guest  on RHEL-5 Xen host with:

# virt-install --paravirt --location http://download.fedora.redhat.com/pub/fedora/linux/development/i386/os/  --name f11i686xen --file /var/lib/xen/images/f11i686.img --file-size 5 --vnc --ram 900  --noautoconsole

And post-install the guest has

# uname -r

So anaconda is installing the correct kernel now.
Comment 22 Bill McGonigle 2009-05-26 06:29:55 EDT
If the kernel choices showed up in the Base package selection list the user could just fix this as he goes.  I recently installed a virtual machine with 2GB using HVM and the standard anaconda install method, so I got the non-PAE kernel.  The workaround was to boot the machine again with HVM and then yum install the PAE kernel, adjust the yum timeout and default kernel, and then finally boot the machine PVM.  One side effect of moving Xen to PAE is that one kernel can't boot HVM and PVM anymore, so perhaps if anaconda knew something about the HVM VM's fingerprint it could just install both, anticipating a common usage scenario.