Bug 445450 - [RHEL4]: Kernel crash when booting -68.33 or later under KVM on AMD systems
Summary: [RHEL4]: Kernel crash when booting -68.33 or later under KVM on AMD systems
Keywords:
Status: CLOSED WONTFIX
Alias: None
Product: Red Hat Enterprise Linux 4
Classification: Red Hat
Component: kernel
Version: 4.7
Hardware: All
OS: Linux
Priority: high
Severity: high
Target Milestone: rc
Target Release: ---
Assignee: Chris Lalancette
QA Contact: Martin Jenner
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2008-05-06 21:05 UTC by Chris Lalancette
Modified: 2010-07-19 13:49 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2008-08-18 21:21:00 UTC
Target Upstream Version:
Embargoed:


Attachments
test patch (604 bytes, patch)
2008-05-12 15:43 UTC, Aristeu Rozanski
updated test patch (604 bytes, patch)
2008-05-13 15:56 UTC, Aristeu Rozanski

Description Chris Lalancette 2008-05-06 21:05:14 UTC
Description of problem:
I've been testing RHEL-4 guests under KVM (Fedora 8 host).  Prior to
2.6.9-68.33, RHEL-4 guests booted just fine.  With 2.6.9-68.33 or later,
however, the guest crashes during startup, and the stack trace I can actually
capture is bogus.  I'm pretty sure this is related to the recent NMI watchdog
fixes in 68.33: if I pass "nmi_watchdog=0" (NMI watchdog disabled) or
"nmi_watchdog=1" (IO-APIC watchdog), the guest boots fine.  Only the default
of nmi_watchdog=2 (performance counters) causes the problem.

I believe the problem is related to KVM's emulation of performance counters.  At
this point, KVM basically doesn't emulate performance counters, so if the guest
does a wrmsr to a perf counter MSR, KVM injects a general protection fault (GPF).
Previous RHEL-4 kernels tolerated this, because they did a "checking_wrmsrl()"
with a fixup section before trying to write.  I think the solution here is
probably to go back to a "checking_wrmsrl" type solution, so that we don't crash
if either the platform doesn't support the counters or we are running under KVM.
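
To illustrate the "checking_wrmsrl" idea, here is a minimal user-space sketch in C. It is not the attached patch and not kernel code: struct fake_cpu, checked_wrmsr() and setup_perfctr_watchdog() are hypothetical names made up for this illustration, and the faulting wrmsr is only simulated by an error return. The two MSR numbers are the AMD K7 event-select and counter registers. The point is that the setup path probes with a write that can fail, and gives up on the performance-counter watchdog instead of crashing.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* AMD K7-style performance counter MSRs. */
#define MSR_K7_EVNTSEL0 0xC0010000u
#define MSR_K7_PERFCTR0 0xC0010004u

/* Values accepted by the nmi_watchdog= boot parameter. */
enum watchdog_mode { NMI_NONE = 0, NMI_IO_APIC = 1, NMI_LOCAL_APIC = 2 };

/* Hypothetical stand-in for the CPU or hypervisor (assumption, not a real API). */
struct fake_cpu {
    bool perfctr_emulated;   /* false models a platform without usable counters */
    uint64_t evntsel0;
    uint64_t perfctr0;
};

/*
 * Simulates a checked MSR write: returns 0 on success and -EFAULT where a
 * plain wrmsr would have taken a general protection fault.  In the kernel
 * the fault is caught by an exception fixup entry; here it is only modeled.
 */
static int checked_wrmsr(struct fake_cpu *cpu, uint32_t msr, uint64_t val)
{
    if (!cpu->perfctr_emulated)
        return -EFAULT;

    switch (msr) {
    case MSR_K7_EVNTSEL0:
        cpu->evntsel0 = val;
        return 0;
    case MSR_K7_PERFCTR0:
        cpu->perfctr0 = val;
        return 0;
    default:
        return -EFAULT;
    }
}

/*
 * Probe both MSRs before relying on them.  If either write fails, report
 * that no watchdog is available rather than crashing.
 */
static enum watchdog_mode setup_perfctr_watchdog(struct fake_cpu *cpu)
{
    if (checked_wrmsr(cpu, MSR_K7_EVNTSEL0, 0) != 0)
        return NMI_NONE;
    if (checked_wrmsr(cpu, MSR_K7_PERFCTR0, 0) != 0)
        return NMI_NONE;
    return NMI_LOCAL_APIC;
}
```

With perfctr_emulated set to false, setup_perfctr_watchdog() returns NMI_NONE and the caller can carry on booting without the watchdog. With it set to true, the function returns NMI_LOCAL_APIC.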

Comment 1 Aristeu Rozanski 2008-05-12 15:43:37 UTC
Created attachment 305138
test patch

Comment 2 Aristeu Rozanski 2008-05-12 15:44:49 UTC
Comment on attachment 305138
test patch

not upstream

Comment 3 Aristeu Rozanski 2008-05-13 15:56:17 UTC
Created attachment 305246
updated test patch

Comment 4 Chris Lalancette 2008-05-14 09:28:26 UTC
Yep, that second patch seems to do it for me under KVM.  The RHEL-4 kernel boots
up fine with it, although the NMI watchdog itself does not work (which is not
surprising, since KVM doesn't properly emulate the performance counters at the
moment).

Chris Lalancette

Comment 6 Chris Lalancette 2008-08-18 21:21:00 UTC
OK.  This should now be fixed in upstream KVM, so we don't actually need a patch for RHEL-4.  If this turns out not to be true in the future, we can re-open this, but for now I'm going to close as WONTFIX.

Chris Lalancette

Comment 7 Chris Lalancette 2010-07-19 13:49:58 UTC
Clearing out old flags for reporting purposes.

Chris Lalancette

