Bug 459107 - [RHEL5.3]: Hang when booting an i386 domU on an i386 HV
Summary: [RHEL5.3]: Hang when booting an i386 domU on an i386 HV
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: kernel-xen
Version: 5.3
Hardware: All
OS: Linux
Priority: medium
Severity: medium
Target Milestone: rc
Target Release: ---
Assignee: Markus Armbruster
QA Contact: Martin Jenner
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2008-08-14 14:21 UTC by Chris Lalancette
Modified: 2009-01-20 20:07 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2009-01-20 20:07:28 UTC
Target Upstream Version:
Embargoed:




Links
System: Red Hat Product Errata
ID: RHSA-2009:0225
Private: 0
Priority: normal
Status: SHIPPED_LIVE
Summary: Important: Red Hat Enterprise Linux 5.3 kernel security and bug fix update
Last Updated: 2009-01-20 16:06:24 UTC

Description Chris Lalancette 2008-08-14 14:21:41 UTC
Description of problem:
I was trying to boot a 2.6.18-104 i386 kernel under an i386 HV, and it kept hanging during boot.  I bisected it, and found out that it started happening in 2.6.18-98.  Bisecting further, I found that this patch is the culprit:

[xen] PVFB probe & suspend fixes

Reverting that patch causes it to boot properly again.  Interestingly enough, if I let the guest hang around long enough, it will eventually spew out some softlockup warnings, and even longer, will finally boot.  The stack traces from the softlockups look like:

BUG: soft lockup - CPU#0 stuck for 10s! [swapper:0]

Pid: 0, comm:              swapper
EIP: 0061:[<c055301b>] CPU: 0
EIP is at input_handler+0x39/0x13a
 EFLAGS: 00000212    Not tainted  (2.6.18.4 #6)
EAX: 003c4c47 EBX: c0edaba8 ECX: 00000033 EDX: 000007d0
ESI: ed7a2560 EDI: 0c033257 EBP: c0eda000 DS: 007b ES: 007b
CR0: 8005003b CR2: f5305000 CR3: 00f2c000 CR4: 00000660
 [<c041b570>] __wake_up+0x2a/0x3d
 [<c05497c4>] unmask_evtchn+0x26/0xba
 [<c044736f>] handle_IRQ_event+0x27/0x51
 [<c044741d>] __do_IRQ+0x84/0xd6
 [<c0406e6e>] do_IRQ+0x93/0xae
 [<c0549e27>] evtchn_do_upcall+0x64/0x9b
 [<c04055d9>] hypervisor_callback+0x3d/0x48
 [<c0408632>] raw_safe_halt+0x8c/0xaf
 [<c040321a>] xen_idle+0x22/0x2e
 [<c0403339>] cpu_idle+0x91/0xab
 [<c06ec9f5>] start_kernel+0x37a/0x381
 =======================

Comment 1 RHEL Program Management 2008-08-14 14:23:03 UTC
This request was evaluated by Red Hat Product Management for inclusion in a Red
Hat Enterprise Linux maintenance release.  Product Management has requested
further review of this request by Red Hat Engineering, for potential
inclusion in a Red Hat Enterprise Linux Update release for currently deployed
products.  This request is not yet committed for inclusion in an Update
release.

Comment 2 RHEL Program Management 2008-08-14 14:36:27 UTC
This bugzilla has Keywords: Regression.  

Since no regressions are allowed between releases, 
it is also being proposed as a blocker for this release.  

Please resolve ASAP.

Comment 3 Chris Lalancette 2008-08-14 14:54:12 UTC
+       info->page = (void *)__get_free_page(GFP_KERNEL || __GFP_ZERO);

Oops.  That probably wants to be (GFP_KERNEL | __GFP_ZERO)

And that is the bug; using | instead of || makes it boot just fine on i386.  We
also need to make sure this is not broken upstream.

Chris Lalancette
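
For illustration, here is a minimal userspace sketch of why the two operators behave so differently. The DEMO_* constants are hypothetical stand-ins for the kernel's GFP bits (their values are assumptions chosen for this demo, not taken from the kernel headers), and the helper names are made up for the example. The point is only that a logical OR of two nonzero masks collapses to the integer 1, so the zeroing bit is never passed to the allocator, whereas a bitwise OR keeps both masks.

```c
/* Hypothetical stand-ins for the GFP flag bits; values are assumed for this demo. */
#define DEMO_GFP_KERNEL 0x00d0u
#define DEMO_GFP_ZERO   0x8000u

/* Logical OR: any two nonzero operands yield exactly 1, discarding both masks. */
unsigned int combine_logical(unsigned int a, unsigned int b)
{
    return a || b;
}

/* Bitwise OR: the result carries every bit set in either mask. */
unsigned int combine_bitwise(unsigned int a, unsigned int b)
{
    return a | b;
}

/* Nonzero if the (demo) zeroing bit is present in the combined flags. */
int has_zero_flag(unsigned int flags)
{
    return (flags & DEMO_GFP_ZERO) != 0;
}
```

With these demo values, combine_logical(DEMO_GFP_KERNEL, DEMO_GFP_ZERO) is 1 and has_zero_flag() reports the zeroing bit as absent, while combine_bitwise() returns 0x80d0 with the bit present. In the driver this would mean the page is requested with an unintended flag word and is not guaranteed to be zeroed.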

Comment 5 Markus Armbruster 2008-08-15 18:42:08 UTC
Upstream is fine, both pvops and XS's 2.6.18.

Comment 7 Don Zickus 2008-09-03 03:41:43 UTC
in kernel-2.6.18-107.el5
You can download this test kernel from http://people.redhat.com/dzickus/el5

Comment 10 Ryan Lerch 2008-11-07 00:17:19 UTC
This bug has been marked for inclusion in the Red Hat Enterprise Linux 5.3
Release Notes.

To aid in the development of relevant and accurate release notes, please fill
out the "Release Notes" field above with the following 4 pieces of information:


Cause:   What actions or circumstances cause this bug to present.

Consequence:  What happens when the bug presents.

Fix:   What was done to fix the bug.

Result:  What now happens when the actions or circumstances above occur. (NB:
this is not the same as 'the bug doesn't present anymore')

Comment 11 Chris Lalancette 2008-11-07 08:08:47 UTC
This one shouldn't have a release note; it was introduced, and fixed, during 5.3 beta testing.

Chris Lalancette

Comment 14 errata-xmlrpc 2009-01-20 20:07:28 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2009-0225.html

