Description of problem:
After upgrading to the latest 5.3 kernel-xen, the machine doesn't boot; it hangs immediately after the kernel is selected at the grub prompt. No messages are printed (even without the "rhgb quiet" options), just a black screen and a hang.

Version-Release number of selected component (if applicable):
Reproduced with kernel-xen-2.6.18-120.el5 and -118. I might have time tomorrow to bisect which build caused this.

How reproducible:
always

Steps to Reproduce:
1. install a recent 5.3 kernel
2. boot

Actual results:
hang

Expected results:
clean boot

Additional info:
The regular non-xen kernel (-120) works fine on the same machine. The kernel-xen that shipped in 5.2 (-92) also works OK.
What kind of system is this happening on? Architecture, how many CPUs, how much memory, etc? Can you provide the grub file? Thanks.
Ahh, the system type is in the subject line. Still, please provide the other info.
Created attachment 321258 [details] grub config file
Created attachment 321259 [details] dmidecode output The system is x86 (32 bit), 1 CPU, 2GB of memory. Attaching output of dmidecode.
Created attachment 321260 [details] /proc/cpuinfo output
Any chance you can capture the serial console? (Add "com1=115200,8n1 console=com1" to the hv line in grub and "console=ttyS0,115200,n8" to the kernel line.) You haven't installed an x86_64 kernel or something by mistake, have you?
(In reply to comment #6)
> Any chance you can capture serial console? (add "com1=115200,8n1 console=com1"
> to the hv line in grub and "console=ttyS0,115200,n8" to the kernel line.

OK, I'll try.

> You have not put on an x86_64 kernel or something by mistake have you?

No, everything is i686:

# rpm -q kernel-xen --queryformat '%{name}-%{version}-%{release}.%{arch}\n'
kernel-xen-2.6.18-118.el5.i686
kernel-xen-2.6.18-120.el5.i686
kernel-xen-2.6.18-92.el5.i686
I mentioned to Jakub on IRC that we haven't done a whole lot of mucking with early initialization code between 5.2 and 5.3, so this is a little surprising. However, looking briefly through the kernel changelog, the two biggest possibilities seem to be:

1) EPT/2MB stuff
2) GDT expansion stuff

I've asked Jakub to try -111 (which is right before the EPT stuff went in) to see if that did it. I also asked him to try -106, which is right before the GDT changes went in. If neither of those points to the culprit, then we'll have to do a full bisection. I'll leave this in needinfo until Jakub gets a chance to do the test.

Chris Lalancette
So I ended up doing an almost full bisection and, oddly enough, the breakage appears to be between -115 and -116. IOW, kernel-xen-2.6.18-115.el5 boots, kernel-xen-2.6.18-116.el5 does not boot.
Ug. OK, well, that really only leaves a single hypervisor patch, which is one of mine. So it must be that patch, although I have a hard time seeing how it could cause a problem that early in boot. I'll have to get on there at some point (or find another similar machine) and try some things. Chris Lalancette
Arg! I think I see the problem in the patch. I'll spin a test kernel with a fix for you to try. Chris Lalancette
This bugzilla has Keywords: Regression. Since no regressions are allowed between releases, it is also being proposed as a blocker for this release. Please resolve ASAP.
Created attachment 321397 [details]
Patch that will probably fix this issue

OK, I missed something when I did the backport of the CR4 TSC hiding patches. One of the things the CR4 TSC patches do is to add a read of the EFER MSR in the boot path for the boot processor and all other processors. Unfortunately, older processors (like the P-4) do not have the EFER MSR, so they are basically generating a fault very early on in boot. Upstream c/s 16378 addresses this by checking for the existence of the EFER before accessing it. The attached patch is a backport of this, and should solve the problem. I'm building a test kernel now with this patch; I'll give download details once it is done building.
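For readers following along, here is a minimal user-space sketch of the kind of "does EFER exist?" check described above. It is not the attached patch or the upstream changeset. It assumes that EFER can be treated as present when CPUID leaf 0x80000001 advertises SYSCALL, NX, or long mode in EDX, and it relies on GCC/Clang's <cpuid.h> on an x86 build host. The function and macro names are hypothetical.

```c
/* Sketch only: decide whether it is safe to touch the EFER MSR.
 * Assumption (not taken from the patch): EFER exists if any of the
 * SYSCALL, NX, or LM bits is set in CPUID.80000001H:EDX.
 * Builds only on x86 with GCC or Clang, which provide <cpuid.h>. */
#include <cpuid.h>
#include <stdint.h>

#define EXT_FEAT_SYSCALL (1u << 11) /* SYSCALL/SYSRET */
#define EXT_FEAT_NX      (1u << 20) /* no-execute page protection */
#define EXT_FEAT_LM      (1u << 29) /* long mode (64-bit) */

/* Pure helper: interpret the EDX value of extended leaf 0x80000001. */
int edx_implies_efer(uint32_t ext_edx)
{
    return (ext_edx & (EXT_FEAT_SYSCALL | EXT_FEAT_NX | EXT_FEAT_LM)) != 0;
}

/* Query the running CPU. __get_cpuid() returns 0 when the requested
 * leaf is not supported, in which case we conservatively say "no". */
int cpu_has_efer(void)
{
    unsigned int eax, ebx, ecx, edx;

    if (!__get_cpuid(0x80000001u, &eax, &ebx, &ecx, &edx))
        return 0;
    return edx_implies_efer(edx);
}
```

In privileged code, a caller would guard the MSR read with this predicate (read EFER only if cpu_has_efer() is nonzero). A CPU without the register then skips the access rather than faulting on it.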
OK. I've built a kernel with the patch. It's available here:

http://people.redhat.com/clalance/bz468083/

Please download it and give it a try, and report back the results. I need to have testing results by early next week to make sure we can get this in as soon as possible.

Thanks!
Chris Lalancette
(In reply to comment #15)
> OK. I've built a kernel with the patch. It's available here:
>
> http://people.redhat.com/clalance/bz468083/
>
> Please download it and give it a try, and report back the results. I need to
> have testing results by early next week to make sure we can get this in as soon
> as possible.
>

Seems like it does the trick - boots and runs just fine! Thanks, Chris!
OK, thanks for the bug report and the testing. I'll get this queued up for 5.3.

Chris Lalancette
Release note added. If any revisions are required, please set the "requires_release_notes" flag to "?" and edit the "Release Notes" field accordingly. All revisions will be proofread by the Engineering Content Services team. New Contents: The Xen kernel will not boot on some older i686 systems that lack the EFER MSR. This issue will be fixed in a beta snapshot.
Release note updated. If any revisions are required, please set the "requires_release_notes" flag to "?" and edit the "Release Notes" field accordingly. All revisions will be proofread by the Engineering Content Services team.

Diffed Contents:
@@ -1 +1 @@
-The Xen kernel will not boot on some older i686 systems that lack the EFER MSR. This issue will be fixed in a beta snapshot.
+In this beta release, the virtualized kernel may not boot on some older x86 systems that lack the EFER MSR. Refer to Red Hat Bugzilla #468083 for more on this issue.
Release note updated. If any revisions are required, please set the "requires_release_notes" flag to "?" and edit the "Release Notes" field accordingly. All revisions will be proofread by the Engineering Content Services team.

Diffed Contents:
@@ -1 +1 @@
-In this beta release, the virtualized kernel may not boot on some older x86 systems that lack the EFER MSR. Refer to Red Hat Bugzilla #468083 for more on this issue.
+In this beta release, the virtualized kernel may not boot on some older x86 systems that lack the EFER MSR. Refer to Red Hat Bugzilla #468083 for more information on this issue.
*** Bug 469237 has been marked as a duplicate of this bug. ***
in kernel-2.6.18-122.el5

You can download this test kernel from http://people.redhat.com/dzickus/el5
*** Bug 470535 has been marked as a duplicate of this bug. ***
This bug also applies to 2.6.18-92.1.17.el5xen, which is being pushed as a current RHEL 5.2 update. Please note that this is hitting production systems as well, not just lab systems using the beta.
(In reply to comment #29)
> This bug also applies to 2.6.18-92.1.17.el5xen which is being pushed as a
> current RH5.2 update.
> Please take note that this is hitting production systems as well and not just
> lab systems using the beta.

This is why I filed bug 470535 (against RHEL 5.2):

https://bugzilla.redhat.com/show_bug.cgi?id=470535

Can Red Hat confirm the problem and tell us when we can expect a fixed xen kernel?
------- Comment From santwana.samantray.com 2008-11-16 12:57 EDT-------
Hi,

I was able to boot successfully on a 32-bit machine installed with the Xen kernel in RHEL5.3-Snap2.

[root@x360a ~]# uname -a
Linux x360a.in.ibm.com 2.6.18-122.el5xen #1 SMP Mon Nov 3 18:49:46 EST 2008 i686 i686 i386 GNU/Linux

Thanks,
Santwana

This event sent from IssueTracker by jkachuck
issue 234219
An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2009-0225.html