Bug 445450 - [RHEL4]: Kernel crash when booting -68.33 or later under KVM on AMD systems
Status: CLOSED WONTFIX
Product: Red Hat Enterprise Linux 4
Classification: Red Hat
Component: kernel
Version: 4.7
Hardware: All
OS: Linux
Priority: high
Severity: high
Target Milestone: rc
Target Release: ---
Assigned To: Chris Lalancette
QA Contact: Martin Jenner
Reported: 2008-05-06 17:05 EDT by Chris Lalancette
Modified: 2010-07-19 09:49 EDT (History)
Doc Type: Bug Fix
Last Closed: 2008-08-18 17:21:00 EDT

Attachments
test patch (604 bytes, patch)
2008-05-12 11:43 EDT, Aristeu Rozanski
updated test patch (604 bytes, patch)
2008-05-13 11:56 EDT, Aristeu Rozanski

Description Chris Lalancette 2008-05-06 17:05:14 EDT
Description of problem:
I've been testing RHEL-4 guests under KVM (Fedora 8 host).  Prior to
2.6.9-68.33, RHEL-4 guests booted just fine.  On 2.6.9-68.33 or later, however,
the guest kernel crashes during startup, and the stack trace I can actually
capture is really bogus.  I'm pretty sure this is related to the recent NMI
watchdog fixes in 68.33: if I pass "nmi_watchdog=0" (NMI watchdog disabled) or
"nmi_watchdog=1" (IO-APIC watchdog), the guest boots fine.  Only the default of
nmi_watchdog=2 (performance counters) causes the problem.
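
For reference, the three nmi_watchdog settings mentioned above can be written
down as a small C sketch.  This is a userspace illustration only, not kernel
code: the helper names (nmi_mode_from_cmdline, nmi_mode_name) are made up for
this example, and the default of 2 is taken from this report rather than from
any particular kernel source.

```c
#include <stdlib.h>
#include <string.h>

/* Modes as described in this report. */
enum nmi_mode { NMI_DISABLED = 0, NMI_IO_APIC = 1, NMI_PERFCTR = 2 };

/*
 * Hypothetical helper: pick the watchdog mode out of a kernel command line.
 * Falls back to NMI_PERFCTR (2), the default this report describes.
 */
int nmi_mode_from_cmdline(const char *cmdline)
{
    static const char key[] = "nmi_watchdog=";
    const size_t keylen = sizeof(key) - 1;
    const char *p = cmdline;

    while ((p = strstr(p, key)) != NULL) {
        /* Accept the key only at the start of a space-separated token. */
        if (p == cmdline || p[-1] == ' ') {
            char *end;
            long v = strtol(p + keylen, &end, 10);
            if (end != p + keylen && v >= 0 && v <= 2)
                return (int)v;
        }
        p += keylen;
    }
    return NMI_PERFCTR;
}

/* Human-readable name for a mode value. */
const char *nmi_mode_name(int mode)
{
    switch (mode) {
    case NMI_DISABLED: return "disabled";
    case NMI_IO_APIC:  return "IO-APIC";
    case NMI_PERFCTR:  return "performance counters";
    default:           return "unknown";
    }
}
```

With this sketch, a command line containing "nmi_watchdog=0" or
"nmi_watchdog=1" maps to one of the two settings that boot fine, and a command
line without the option maps to the performance-counter mode that crashes.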

I believe the problem is related to KVM's emulation of performance counters.  At
this point, KVM basically doesn't emulate them, so if the guest does a wrmsr to
a perf counter MSR, KVM injects a general protection fault (GPF).  In previous
RHEL-4 kernels this was actually OK, because they did a "checking_wrmsrl()" (a
wrmsr with a fixup section) before trying to write.  I think the solution here
is probably to go back to a "checking_wrmsrl"-style approach, so that we don't
crash if either the platform doesn't support the counters or we are running
under KVM.
Comment 1 Aristeu Rozanski 2008-05-12 11:43:37 EDT
Created attachment 305138
test patch
Comment 2 Aristeu Rozanski 2008-05-12 11:44:49 EDT
Comment on attachment 305138
test patch

not upstream
Comment 3 Aristeu Rozanski 2008-05-13 11:56:17 EDT
Created attachment 305246
updated test patch
Comment 4 Chris Lalancette 2008-05-14 05:28:26 EDT
Yep, that second patch seems to do it for me under KVM.  The RHEL-4 kernel boots
up fine with it, although the NMI watchdog itself does not work (which is not
surprising, since KVM doesn't properly emulate the performance counters at the
moment).

Chris Lalancette
Comment 6 Chris Lalancette 2008-08-18 17:21:00 EDT
OK.  This should now be fixed in upstream KVM, so we don't actually need a patch for RHEL-4.  If this turns out not to be true in the future, we can re-open this, but for now I'm going to close as WONTFIX.

Chris Lalancette
Comment 7 Chris Lalancette 2010-07-19 09:49:58 EDT
Clearing out old flags for reporting purposes.

Chris Lalancette
