Bug 468083 - kernel-xen doesn't boot on Dell Optiplex GX280
Summary: kernel-xen doesn't boot on Dell Optiplex GX280
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: kernel-xen
Version: 5.3
Hardware: All
OS: Linux
Priority: urgent
Severity: high
Target Milestone: rc
Target Release: ---
Assignee: Chris Lalancette
QA Contact: Martin Jenner
URL:
Whiteboard:
Duplicates: 469237 470535
Depends On:
Blocks: RHEL5u3_relnotes 470040
 
Reported: 2008-10-22 17:40 UTC by Jakub Hrozek
Modified: 2018-10-20 02:15 UTC

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
In this beta release, the virtualized kernel may not boot on some older x86 systems that lack the EFER MSR. Refer to Red Hat Bugzilla #468083 for more information on this issue.
Clone Of:
Environment:
Last Closed: 2009-01-20 19:35:20 UTC
Target Upstream Version:
Embargoed:


Attachments
grub config file (1.17 KB, text/plain)
2008-10-23 09:05 UTC, Jakub Hrozek
dmidecode output (17.54 KB, text/plain)
2008-10-23 09:07 UTC, Jakub Hrozek
/proc/cpuinfo output (1.04 KB, text/plain)
2008-10-23 09:09 UTC, Jakub Hrozek
Patch that will probably fix this issue (3.23 KB, patch)
2008-10-24 11:42 UTC, Chris Lalancette


Links
System | ID | Private | Priority | Status | Summary | Last Updated
CentOS | 3230 | 0 | None | None | None | Never
Red Hat Product Errata | RHSA-2009:0225 | 0 | normal | SHIPPED_LIVE | Important: Red Hat Enterprise Linux 5.3 kernel security and bug fix update | 2009-01-20 16:06:24 UTC

Description Jakub Hrozek 2008-10-22 17:40:26 UTC
Description of problem:
After upgrading to the latest 5.3 kernel-xen, the machine no longer boots: it hangs immediately after the kernel is selected at the grub prompt. No messages are printed (even with the rhgb and quiet options removed), just a black screen.

Version-Release number of selected component (if applicable):
Reproduced with kernel-xen-2.6.18-120.el5 and -118. I might have time to bisect the build that caused this tomorrow.

How reproducible:
always

Steps to Reproduce:
1. install a recent 5.3 kernel
2. boot

  
Actual results:
hang 

Expected results:
clean boot

Additional info:
Regular non-xen kernel works fine on the same machine (-120). kernel-xen that was shipped in 5.2 (-92) also works OK.

Comment 1 Bill Burns 2008-10-23 00:50:29 UTC
What kind of system is this happening on? Architecture, how many CPUs, how much memory, etc? Can you provide the grub file? Thanks.

Comment 2 Bill Burns 2008-10-23 00:52:07 UTC
Ahh, system type in the subject line...still please provide the other info.

Comment 3 Jakub Hrozek 2008-10-23 09:05:16 UTC
Created attachment 321258 [details]
grub config file

Comment 4 Jakub Hrozek 2008-10-23 09:07:40 UTC
Created attachment 321259 [details]
dmidecode output

The system is x86 (32 bit), 1 CPU, 2GB of memory. Attaching output of dmidecode.

Comment 5 Jakub Hrozek 2008-10-23 09:09:34 UTC
Created attachment 321260 [details]
/proc/cpuinfo output

Comment 6 Bill Burns 2008-10-23 11:46:53 UTC
Any chance you can capture the serial console output? (Add "com1=115200,8n1 console=com1"
to the hv line in grub and "console=ttyS0,115200n8" to the kernel line.)
You have not put on an x86_64 kernel or something by mistake have you?

Comment 7 Jakub Hrozek 2008-10-23 14:37:07 UTC
(In reply to comment #6)
> Any chance you can capture the serial console output? (Add "com1=115200,8n1 console=com1"
> to the hv line in grub and "console=ttyS0,115200n8" to the kernel line.)

OK, I'll try. 

> You have not put on an x86_64 kernel or something by mistake have you?

No, everything is i686:
# rpm -q kernel-xen --queryformat '%{name}-%{version}-%{release}.%{arch}\n'
kernel-xen-2.6.18-118.el5.i686
kernel-xen-2.6.18-120.el5.i686
kernel-xen-2.6.18-92.el5.i686

Comment 8 Chris Lalancette 2008-10-23 15:05:08 UTC
I mentioned to Jakub in IRC that we haven't done a whole lot of mucking with early initialization code between 5.2 and 5.3, so this is a little surprising.  However, looking briefly through the kernel changelog, the two biggest possibilities seem to be:

1)  EPT/2MB stuff
2)  GDT expansion stuff

I've asked Jakub to try -111 (which is right before the EPT stuff went in) to see if that did it.  I also asked him to try -106, which is right before the GDT changes went in.  If neither of those works, then we'll have to do a full bisection.  I'll leave this in needinfo until Jakub gets a chance to do the test.

Chris Lalancette

Comment 9 Jakub Hrozek 2008-10-23 16:32:17 UTC
So I ended up doing almost full bisection and oddly enough, the breakage appears to be between -115 and -116. IOW, kernel-xen-2.6.18-115.el5 boots, kernel-xen-2.6.18-116.el5 does not boot.
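
For readers following along, the bisection over build numbers can be sketched in C. This is only an illustration of the search, not a tool used in this bug: `first_bad_build` and `boots_per_report` are hypothetical names, and the predicate simply encodes the result reported here (-115 boots, -116 does not) in place of a real boot test.

```c
#include <stdbool.h>

/* Hypothetical predicate type: returns true if the given build boots. */
typedef bool (*boots_fn)(int build);

/*
 * Return the first build in (good, bad] that fails to boot.
 * Assumes `good` boots, `bad` does not, and that every build after
 * the first failing one also fails.
 */
int first_bad_build(int good, int bad, boots_fn boots)
{
    while (bad - good > 1) {
        int mid = good + (bad - good) / 2;
        if (boots(mid))
            good = mid;  /* the regression was introduced after mid */
        else
            bad = mid;   /* the regression is at or before mid */
    }
    return bad;
}

/* Stand-in for a real boot test, encoding the outcome reported in this bug. */
bool boots_per_report(int build)
{
    return build <= 115;
}
```

Starting from the known-good 5.2 build (-92) and the known-bad -120, `first_bad_build(92, 120, boots_per_report)` narrows the range to -116 in about five boot tests rather than one per build.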

Comment 10 Chris Lalancette 2008-10-23 20:19:20 UTC
Ug.  OK, well, that really only leaves a single hypervisor patch, which is one of mine.  So it must be that patch, although I have a hard time seeing how it could cause a problem that early in boot.  I'll have to get on there at some point (or find another similar machine) and try some things.

Chris Lalancette

Comment 11 Chris Lalancette 2008-10-24 09:38:45 UTC
Arg!  I think I see the problem in the patch.  I'll spin a test kernel with a fix for you to try.

Chris Lalancette

Comment 12 RHEL Program Management 2008-10-24 10:19:13 UTC
This bugzilla has Keywords: Regression.  

Since no regressions are allowed between releases, 
it is also being proposed as a blocker for this release.  

Please resolve ASAP.

Comment 14 Chris Lalancette 2008-10-24 11:42:03 UTC
Created attachment 321397 [details]
Patch that will probably fix this issue

OK, I missed something when I did the backport of the CR4 TSC hiding patches.  One of the things the CR4 TSC patches do is to add a read of the EFER MSR in the boot path for the boot processor and all other processors.  Unfortunately, older processors (like the P-4) do not have the EFER MSR, so they are basically generating a fault very early on in boot.  Upstream c/s 16378 addresses this by checking for the existence of the EFER before accessing it.  The attached patch is a backport of this, and should solve the problem.  I'm building a test kernel now with this patch; I'll give download details once it is done building.
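
The kind of check described above can be sketched in C. This is an illustrative sketch, not the attached patch or upstream c/s 16378. It assumes that the presence of EFER can be inferred from the extended CPUID feature bits (leaf 0x80000001, EDX) for SYSCALL, NX, and long mode, each of which is enabled through EFER. The names `efer_msr_present`, `read_efer_or_zero`, and `fake_rdmsr_efer` are hypothetical, and the privileged MSR read is replaced by a stand-in so the sketch can run in user space.

```c
#include <stdbool.h>
#include <stdint.h>

/* CPUID leaf 0x80000001, EDX feature bits whose controls live in EFER. */
#define EXT_FEAT_SYSCALL (UINT32_C(1) << 11)  /* SYSCALL/SYSRET, EFER.SCE */
#define EXT_FEAT_NX      (UINT32_C(1) << 20)  /* no-execute,     EFER.NXE */
#define EXT_FEAT_LM      (UINT32_C(1) << 29)  /* long mode,      EFER.LME */

/* Hypothetical helper: decide from raw CPUID values whether EFER may be read. */
bool efer_msr_present(uint32_t max_ext_leaf, uint32_t ext_edx)
{
    /* Without leaf 0x80000001 there are no extended feature bits to trust. */
    if (max_ext_leaf < UINT32_C(0x80000001))
        return false;
    return (ext_edx & (EXT_FEAT_SYSCALL | EXT_FEAT_NX | EXT_FEAT_LM)) != 0;
}

/* Hypothetical boot-path wrapper: only touch the MSR when it exists. */
uint64_t read_efer_or_zero(uint32_t max_ext_leaf, uint32_t ext_edx,
                           uint64_t (*rdmsr_efer)(void))
{
    return efer_msr_present(max_ext_leaf, ext_edx) ? rdmsr_efer() : 0;
}

/* Stand-in for the privileged rdmsr instruction; pretends EFER.NXE is set. */
uint64_t fake_rdmsr_efer(void)
{
    return UINT64_C(1) << 11;
}
```

On a CPU that reports none of the three feature bits, `read_efer_or_zero` returns 0 without calling the read function, which is the behaviour the fix needs: the unguarded read is what faulted early in boot.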

Comment 15 Chris Lalancette 2008-10-24 14:08:18 UTC
OK.  I've built a kernel with the patch.  It's available here:

http://people.redhat.com/clalance/bz468083/

Please download it and give it a try, and report back the results.  I need to have testing results by early next week to make sure we can get this in as soon as possible.

Thanks!
Chris Lalancette

Comment 16 Jakub Hrozek 2008-10-27 10:36:07 UTC
(In reply to comment #15)
> OK.  I've built a kernel with the patch.  It's available here:
> 
> http://people.redhat.com/clalance/bz468083/
> 
> Please download it and give it a try, and report back the results.  I need to
> have testing results by early next week to make sure we can get this in as soon
> as possible.
>

Seems like it does the trick - boots and runs just fine! Thanks, Chris!

Comment 17 Chris Lalancette 2008-10-27 10:39:21 UTC
OK, thanks for the bug report and the testing.  I'll get this queued up for 5.3

Chris Lalancette

Comment 18 Bill Burns 2008-10-27 13:50:24 UTC
Release note added. If any revisions are required, please set the 
"requires_release_notes" flag to "?" and edit the "Release Notes" field accordingly.
All revisions will be proofread by the Engineering Content Services team.

New Contents:
The Xen kernel will not boot on some older i686 systems that lack the EFER MSR. This issue will be fixed in a beta snapshot.

Comment 21 Ryan Lerch 2008-10-28 05:10:52 UTC
Release note updated. If any revisions are required, please set the 
"requires_release_notes"  flag to "?" and edit the "Release Notes" field accordingly.
All revisions will be proofread by the Engineering Content Services team.

Diffed Contents:
@@ -1 +1 @@
-The Xen kernel will not boot on some older i686 systems that lack the EFER MSR. This issue will be fixed in a beta snapshot.
+In this beta release, the virtualized kernel may not boot on some older x86 systems that lack the EFER MSR. Refer to Red Hat Bugzilla #468083 for more on this issue.

Comment 22 Ryan Lerch 2008-10-28 05:11:49 UTC
Release note updated. If any revisions are required, please set the 
"requires_release_notes"  flag to "?" and edit the "Release Notes" field accordingly.
All revisions will be proofread by the Engineering Content Services team.

Diffed Contents:
@@ -1 +1 @@
-In this beta release, the virtualized kernel may not boot on some older x86 systems that lack the EFER MSR. Refer to Red Hat Bugzilla #468083 for more on this issue.
+In this beta release, the virtualized kernel may not boot on some older x86 systems that lack the EFER MSR. Refer to Red Hat Bugzilla #468083 for more information on this issue.

Comment 23 Chris Lalancette 2008-10-31 13:32:47 UTC
*** Bug 469237 has been marked as a duplicate of this bug. ***

Comment 24 Don Zickus 2008-11-04 16:51:13 UTC
in kernel-2.6.18-122.el5
You can download this test kernel from http://people.redhat.com/dzickus/el5

Comment 28 Bill Burns 2008-11-07 16:59:13 UTC
*** Bug 470535 has been marked as a duplicate of this bug. ***

Comment 29 Andreas Thienemann 2008-11-09 23:57:04 UTC
This bug also applies to 2.6.18-92.1.17.el5xen which is being pushed as a current RH5.2 update.
Please take note that this is hitting production systems as well and not just lab systems using the beta.

Comment 30 Alexander Lindqvist 2008-11-12 08:00:36 UTC
(In reply to comment #29)
> This bug also applies to 2.6.18-92.1.17.el5xen which is being pushed as a
> current RH5.2 update.
> Please take note that this is hitting production systems as well and not just
> lab systems using the beta.

This is why I filed bug 470535 against RHEL 5.2:

https://bugzilla.redhat.com/show_bug.cgi?id=470535

Can Red Hat confirm the problem and tell us when we can expect a fixed Xen kernel?

Comment 32 Issue Tracker 2008-11-17 15:01:34 UTC
------- Comment From santwana.samantray.com 2008-11-16 12:57 EDT -------
Hi,

I was able to boot successfully on a 32-bit machine with the Xen
kernel installed, on RHEL5.3-Snap2.
[root@x360a ~]# uname -a
Linux x360a.in.ibm.com 2.6.18-122.el5xen #1 SMP Mon Nov 3 18:49:46 EST 2008 i686 i686 i386 GNU/Linux

Thanks,
Santwana


This event sent from IssueTracker by jkachuck 
 issue 234219

Comment 36 errata-xmlrpc 2009-01-20 19:35:20 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2009-0225.html

