Bug 445450

Summary: [RHEL4]: Kernel crash when booting -68.33 or later under KVM on AMD systems
Product: Red Hat Enterprise Linux 4
Component: kernel
Version: 4.7
Status: CLOSED WONTFIX
Severity: high
Priority: high
Reporter: Chris Lalancette <clalance>
Assignee: Chris Lalancette <clalance>
QA Contact: Martin Jenner <mjenner>
CC: vgoyal
Target Milestone: rc
Hardware: All
OS: Linux
Doc Type: Bug Fix
Last Closed: 2008-08-18 21:21:00 UTC
Attachments:
Description          Flags
test patch           none
updated test patch   none

Description Chris Lalancette 2008-05-06 21:05:14 UTC
Description of problem:
I've been testing RHEL-4 guests under KVM (Fedora 8 host).  Prior to
2.6.9-68.33, RHEL-4 guests booted just fine.  However, on 2.6.9-68.33 or later,
the guest kernel crashes during startup, and the stack trace I can capture is
bogus.  I'm pretty sure this has to do with the recent NMI watchdog fixes in
68.33; if I pass "nmi_watchdog=0" (NMI watchdog disabled) or "nmi_watchdog=1"
(IO-APIC watchdog), then the guest boots fine.  Only the default of
nmi_watchdog=2 (performance counters) causes the problem.

I believe the problem is related to KVM's emulation of performance counters.  At
this point, KVM basically doesn't emulate performance counters, so if the guest
does a wrmsr to a perf counter, KVM injects a general protection fault (GPF).
In previous RHEL-4 kernels this was OK, because they first did a
"checking_wrmsrl()", which has a fixup section, before trying to write.  I
think the solution here is probably to go back to a "checking_wrmsrl"-type
approach, so that we don't crash if either the platform doesn't support the
counters or we are running under KVM.

Comment 1 Aristeu Rozanski 2008-05-12 15:43:37 UTC
Created attachment 305138 [details]
test patch

Comment 2 Aristeu Rozanski 2008-05-12 15:44:49 UTC
Comment on attachment 305138 [details]
test patch

not upstream

Comment 3 Aristeu Rozanski 2008-05-13 15:56:17 UTC
Created attachment 305246 [details]
updated test patch

Comment 4 Chris Lalancette 2008-05-14 09:28:26 UTC
Yep, that second patch seems to do it for me under KVM.  The RHEL-4 kernel boots
up fine with it, although the NMI watchdog itself does not work (which is not
surprising, since KVM doesn't properly emulate the performance counters at the
moment).

Chris Lalancette

Comment 6 Chris Lalancette 2008-08-18 21:21:00 UTC
OK.  This should now be fixed in upstream KVM, so we don't actually need a patch for RHEL-4.  If this turns out not to be true in the future, we can re-open this, but for now I'm going to close as WONTFIX.

Chris Lalancette

Comment 7 Chris Lalancette 2010-07-19 13:49:58 UTC
Clearing out old flags for reporting purposes.

Chris Lalancette