Description of problem:
After successfully installing today's rawhide (Fedora 10) on a Xen guest and rebooting, the guest will not start and virt-manager returns a traceback.

Version-Release number of selected component (if applicable):
$ rpm -qa | egrep 'xen|virt' | sort
kernel-xen-2.6.18-121.el5
kernel-xen-2.6.18-122.el5
libvirt-0.3.3-14.el5
libvirt-python-0.3.3-14.el5
python-virtinst-0.300.2-11.el5
virt-manager-0.5.3-10.el5
virt-viewer-0.0.2-2.el5
xen-3.0.3-73.el5
xen-libs-3.0.3-73.el5

$ uname -a
Linux screamer 2.6.18-122.el5xen #1 SMP Mon Nov 3 18:49:46 EST 2008 i686 i686 i386 GNU/Linux

Additional info:
Traceback (most recent call last):
  File "/usr/share/virt-manager/virtManager/engine.py", line 514, in run_domain
    vm.startup()
  File "/usr/share/virt-manager/virtManager/domain.py", line 379, in startup
    self.vm.create()
  File "/usr/lib/python2.4/site-packages/libvirt.py", line 228, in create
    if ret == -1: raise libvirtError ('virDomainCreate() failed', dom=self)
libvirtError: virDomainCreate() failed POST operation failed: (xend.err "Error creating domain: (2, 'Invalid kernel', 'elf_xen_note_check: ERROR: Will only load images built for the generic loader or Linux images')")
FWIW... a guest install of the latest RHEL5.3 beta installs and boots fine
Please provide the following log files:
/root/.virt-manager/virt-manager.log
/var/log/xen/xend.log
/var/log/xen/domain-builder-ng.log
Created attachment 323205 domain-builder-ng.log
Created attachment 323206 xend.log
Created attachment 323207 virt-manager.log
Logs added. Curious.... could this be a Fedora bug?
Well, what it seems like is that the wrong kernel was installed for some reason. I've definitely installed F-10 guests on RHEL-5.3 before, so something weird is going on. This has the hallmarks of installing the non-PAE kernel inside the guest, which doesn't, I believe, have pv_ops turned on. John, can you mount the guest disk loopback and find out which kernel anaconda installed?

Chris Lalancette
Yeah, that's the error you get if you try to boot a non-PAE i386 Fedora kernel in a Xen guest. See also bug #471268
Here is what grub.conf looks like for the guest:

default=0
timeout=5
splashimage=(hd0,0)/grub/splash.xpm.gz
hiddenmenu
title Red Hat Enterprise Linux Client (2.6.18-122.el5xen)
        root (hd0,0)
        kernel /vmlinuz-2.6.18-122.el5xen ro root=/dev/VolGroup00/LogVol00 rhgb quiet
        initrd /initrd-2.6.18-122.el5xen.img
Disregard comment #9; the information is from the wrong image.
Here is grub.conf from the F10 guest that was installed on the RHEL5.3 dom0:

default=0
timeout=0
chaintimeout=5
splashimage=(hd0,0)/grub/splash.xpm.gz
hiddenmenu
title Fedora (2.6.27.4-79.fc10.i686)
        root (hd0,0)
        kernel /vmlinuz-2.6.27.4-79.fc10.i686 ro root=UUID=3dfba3e5-65bc-4466-9499-f3bac2782f86 rhgb quiet
        initrd /initrd-2.6.27.4-79.fc10.i686.img
OK, yeah. That does confirm that anaconda chose the wrong kernel. Odd, because I had done an F-10 install a couple of months ago and it chose the right one. This is *probably* an anaconda bug against F-10.

Chris Lalancette
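For anyone repeating the check above on a loop-mounted guest image, here is a minimal sketch of reading the default kernel out of a legacy grub.conf. The function name default_kernel is made up for illustration, and the sketch assumes a numeric "default=" value and exactly one "kernel" line per title (which does not hold for Xen dom0 entries that boot xen.gz with "module" lines). It is written for current Python rather than the Python 2.4 shipped on the host.

```python
def default_kernel(grub_conf_text):
    """Return the kernel path used by the default entry of a legacy grub.conf.

    Assumptions (hypothetical helper, not part of any tool in this bug):
    "default=" is numeric and every title has exactly one "kernel" line.
    """
    default = 0
    kernels = []
    for raw in grub_conf_text.splitlines():
        parts = raw.split()
        if not parts:
            continue
        if parts[0].startswith("default="):
            default = int(parts[0].split("=", 1)[1])
        elif parts[0] == "kernel" and len(parts) > 1:
            kernels.append(parts[1])
    return kernels[default]


# The grub.conf posted for the F10 guest in this bug.
GRUB_CONF = """\
default=0
timeout=0
splashimage=(hd0,0)/grub/splash.xpm.gz
hiddenmenu
title Fedora (2.6.27.4-79.fc10.i686)
        root (hd0,0)
        kernel /vmlinuz-2.6.27.4-79.fc10.i686 ro root=UUID=3dfba3e5-65bc-4466-9499-f3bac2782f86 rhgb quiet
        initrd /initrd-2.6.27.4-79.fc10.i686.img
"""

kernel = default_kernel(GRUB_CONF)
# A kernel-PAE package carries "PAE" in its release string.
print(kernel, "PAE" in kernel)
```

For the guest in this bug the default entry has no "PAE" in its name, which matches the diagnosis that the plain "kernel" package was installed.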
OK. I'm not exactly sure how the anaconda people would want to fix this, but the problem is in yuminstall.py, selectBestKernel(), here:

    # FIXME: this is a bit of a hack. we shouldn't hard-code and
    # instead check by provides. but alas.
    for k in ("kernel", "kernel-smp", "kernel-PAE"):
        if len(self.ayum.tsInfo.matchNaevr(name=k)) > 0:
            self.selectModulePackages(anaconda, k)
            foundkernel = True

    if not foundkernel and (isys.smpAvailable() or isys.htavailable()):
        try:
            ksmp = getBestKernelByArch("kernel-smp", self.ayum)
        except PackageSackError:
            ksmp = None
            log.debug("no kernel-smp package")

        if ksmp and ksmp.returnSimple("arch") == kpkg.returnSimple("arch"):
            foundkernel = True
            log.info("selected kernel-smp package for kernel")
            self.ayum.install(po=ksmp)
            self.selectModulePackages(anaconda, ksmp.name)

            if len(self.ayum.tsInfo.matchNaevr(name="gcc")) > 0:
                log.debug("selecting kernel-smp-devel")
                self.selectPackage("kernel-smp-devel.%s" % (kpkg.arch,))

    if not foundkernel and isys.isPaeAvailable():
        try:
            kpae = getBestKernelByArch("kernel-PAE", self.ayum)
        except PackageSackError:
            kpae = None
            log.debug("no kernel-PAE package")

        if kpae and kpae.returnSimple("arch") == kpkg.returnSimple("arch"):
            foundkernel = True
            log.info("select kernel-PAE package for kernel")
            self.ayum.install(po=kpae)
            self.selectModulePackages(anaconda, kpae.name)

            if len(self.ayum.tsInfo.matchNaevr(name="gcc")) > 0:
                log.debug("selecting kernel-PAE-devel")
                self.selectPackage("kernel-PAE-devel.%s" % (kpkg.arch,))

    if not foundkernel:
        log.info("selected kernel package for kernel")
        self.ayum.install(po=kpkg)
        self.selectModulePackages(anaconda, kpkg.name)

        if len(self.ayum.tsInfo.matchNaevr(name="gcc")) > 0:
            log.debug("selecting kernel-devel")
            self.selectPackage("kernel-devel.%s" % (kpkg.arch,))

Basically, we get out of that first loop without setting foundkernel to True (I'm not entirely sure what that first loop does).
Then we do the isys.smpAvailable() check, which is true, but that branch selects nothing because there is no kernel-smp package available. Next, we do the isys.isPaeAvailable() check, but this doesn't fire because we are using < 4G of memory here. Finally, we fall through to the default, which is just to select "kernel". This is confirmed by the anaconda logs:

08:58:03 INFO    : moving (1) to step postselection
08:58:03 DEBUG   : no kernel-smp package
08:58:03 INFO    : selected kernel package for kernel

So, one way to fix this (suggested by Mark McLoughlin) would be to change:

    if not foundkernel and isys.isPaeAvailable():

to something like:

    if not foundkernel and (isys.isPaeAvailable() or running_PAE_kernel):

where running_PAE_kernel would be set to True when the anaconda installer itself is running on a PAE kernel. We could determine running_PAE_kernel by checking "os.uname()[2]" for the substring PAE. I'm sure there are other solutions, but nothing is springing to mind at the moment.

Chris Lalancette
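A rough, standalone sketch of that suggestion, for illustration only. It is not the anaconda patch: running_pae_kernel() and choose_kernel() are hypothetical names, the isys calls are replaced by plain booleans, and the package lookup is reduced to a set of names. It is written for current Python rather than the Python 2.4 of the installer.

```python
import os


def running_pae_kernel(release=None):
    """Return True if the kernel release string contains "PAE".

    With no argument, the release of the currently running kernel
    (os.uname()[2]) is inspected, as proposed in the comment above.
    """
    if release is None:
        release = os.uname()[2]
    return "PAE" in release


def choose_kernel(available, smp, pae_available, running_pae):
    """Simplified model of the fall-through in selectBestKernel().

    available     -- set of kernel package names found in the repository
    smp           -- stands in for isys.smpAvailable() or isys.htavailable()
    pae_available -- stands in for isys.isPaeAvailable()
    running_pae   -- the proposed extra condition (hypothetical)
    """
    if smp and "kernel-smp" in available:
        return "kernel-smp"
    if (pae_available or running_pae) and "kernel-PAE" in available:
        return "kernel-PAE"
    return "kernel"


# F10 ships no kernel-smp package, so only these two are available.
repo = {"kernel", "kernel-PAE"}

# Behaviour described in this bug: SMP detected, PAE check does not fire.
print(choose_kernel(repo, True, False, False))

# Same guest, but the installer notices it is itself running a PAE kernel.
installer_release = "2.6.29-0.258.rc8.git2.fc11.i686.PAE"  # example string
print(choose_kernel(repo, True, False, running_pae_kernel(installer_release)))
```

The first call falls through to "kernel", as in the anaconda log; the second selects "kernel-PAE". A substring match on the release string is deliberately crude, and checking by package provides (as the FIXME in the original code suggests) may be the more robust route.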
What happens if you manually add say kernel-PAE to a kickstart file? I suspect you'll get the right kernel (or both) and this would be an OKish workaround for F10.
(In reply to comment #14)
> What happens if you manually add say kernel-PAE to a kickstart file? I suspect
> you'll get the right kernel (or both) and this would be an OKish workaround for
> F10.

We haven't tried it yet, but yeah that would probably be our workaround if it doesn't get fixed for F10.
Should we add a release note for this for the 0-day update?
I haven't done a kickstart install in a long time and I think it is an unreasonably high requirement for a "workaround". This is also a regression from Fedora 9.
Given that you can't run Fedora as a Xen host, which means you need to have $SOMETHING_ELSE installed to run into this, I think it's unreasonable to consider this a blocker bug. For now, using a kickstart script is likely a workable workaround. In the near future we could have this fixed with an updates.img.

The class of people who will have access to a Xen-capable setup, but not know how to do a basic kickstart install (especially given the Fedora and RHEL documentation on doing them), is going to be quite small in my opinion.

I do believe we should have some release notes for this issue, be they notes about using kickstart, using an updates.img, or both.
Release note added:
https://fedoraproject.org/wiki/Common_F10_bugs#Fedora_10_i686_Xen_guest_won.27t_boot

As I understand it, this only affects 32-bit Xen hosts, which are far less common than 64-bit hosts. Still, it would probably be good if we had an updates.img for this problem soon.
This bug appears to have been reported against 'rawhide' during the Fedora 10 development cycle. Changing version to '10'. More information and reason for this action is here: http://fedoraproject.org/wiki/BugZappers/HouseKeeping
Installed a new guest on a RHEL-5 Xen host with:

# virt-install --paravirt --location http://download.fedora.redhat.com/pub/fedora/linux/development/i386/os/ --name f11i686xen --file /var/lib/xen/images/f11i686.img --file-size 5 --vnc --ram 900 --noautoconsole

And post-install the guest has:

# uname -r
2.6.29-0.258.rc8.git2.fc11.i686.PAE

So anaconda is installing the correct kernel now.
If the kernel choices showed up in the Base package selection list, the user could just fix this as he goes. I recently installed a virtual machine with 2GB using HVM and the standard anaconda install method, so I got the non-PAE kernel. The workaround was to boot the machine again with HVM, yum install the PAE kernel, adjust the grub timeout and default kernel, and then finally boot the machine PVM.

One side effect of moving Xen to PAE is that one kernel can't boot both HVM and PVM anymore, so perhaps if anaconda knew something about the HVM VM's fingerprint it could just install both, anticipating a common usage scenario.