Bug 447636

Summary: Cannot install RHEL4 x86_64 guest on AMD
Product: [Fedora] Fedora
Component: kvm
Version: 9
Hardware: All
OS: Linux
Status: CLOSED CURRENTRELEASE
Severity: low
Priority: low
Reporter: Daniel Berrange <berrange>
Assignee: Glauber Costa <gcosta>
QA Contact: Fedora Extras Quality Assurance <extras-qa>
CC: berrange, clalance, djuran, dsmith, katzj, mniranja
Target Milestone: ---
Target Release: ---
Doc Type: Bug Fix
Last Closed: 2008-10-20 07:17:57 EDT
Attachments:
upstream msr write fix (flags: none)

Description Daniel Berrange 2008-05-20 17:54:07 EDT
Description of problem:
Cannot install a RHEL4 x86_64 guest on AMD; it fails at boot with:

CPU: QEMU Virtual CPU version 0.9.1 stepping 03
general protection fault: 0000 [1] 
CPU 0 
Modules linked in:
Pid: 1, comm: swapper Not tainted 2.6.9-70.EL
RIP: 0010:[<ffffffff8011a423>] <ffffffff8011a423>{setup_k7_watchdog+26}
RSP: 0000:000001001fa13eb8  EFLAGS: 00010246
RAX: 0000000000000000 RBX: 00000000c0010000 RCX: 00000000c0010004
RDX: 0000000000000000 RSI: 0000000000000000 RDI: 00000000000003e8
RBP: 0000000000000014 R08: 000000011fb139c0 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000000 R12: 00000000c0010004
R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
FS:  0000000000000000(0000) GS:ffffffff80572700(0000) knlGS:0000000000000000
CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
CR2: 0000000000000000 CR3: 0000000000101000 CR4: 00000000000006e0
Process swapper (pid: 1, threadinfo 000001001fa12000, task 000001001fa11110)
Stack: 0000000000000000 00000000000003e8 0000000000000000 ffffffff8011a83c 
       0000000000000000 ffffffff8011ea5c 00000000c0010000 ffffffff805804e0 
       0000000000000000 0000000000000800 
Call Trace:<ffffffff8011a83c>{lapic_watchdog_init+27}
       <ffffffff8010c4fe>{init+214} <ffffffff8010c428>{init+0} 
       <ffffffff80111657>{child_rip+8} <ffffffff8010c428>{init+0} 

Code: 0f 30 b8 76 00 13 00 89 d9 0f 30 48 c7 c6 18 11 38 80 89 fa 
RIP <ffffffff8011a423>{setup_k7_watchdog+26} RSP <000001001fa13eb8>
 <0>Kernel panic - not syncing: Oops

Version-Release number of selected component (if applicable):
# rpm -q kvm
[root@dhcp-100-19-138 ~]# uname -a
Linux dhcp-100-19-138.bos.redhat.com 2.6.25-14.fc9.x86_64 #1 SMP Thu May 1
06:06:21 EDT 2008 x86_64 x86_64 x86_64 GNU/Linu

How reproducible:

Steps to Reproduce:
1.  Boot the RHEL-4 update 7 x86_64 installer under KVM
Actual results:

Expected results:

Additional info:
Comment 1 Chris Wright 2008-05-20 19:02:41 EDT
Created attachment 306190 [details]
upstream msr write fix

This was fixed in upstream commit 854d17ee91e87903dc42e8b4506ffd9d023ed47a.
Launching scratch build to verify fix:
Comment 2 Chris Lalancette 2008-05-21 03:04:26 EDT
What Chris says is true; however, I believe the problem is deeper.  I believe
the above fix works for AMD, but you will still have problems on Intel, in a
similar piece of code.  This started happening on RHEL-4 U7 because of the
backported NMI code, which was brought in basically so it could share the
perfctr registers with oprofile.  Before U7, RHEL-4 would attempt to access
the EVNTSEL register; since KVM didn't support it, KVM would inject a GPF,
which RHEL-4 would catch, then give up on NMI watchdog support and go on with
life.  In U7, however, that attempt to access the EVNTSEL register no longer
catches the GPF, so when you actually hit the GPF, it crashes the guest.  My
suggestion is two-fold:

1.  Fix RHEL-4 U7 (and probably upstream) to re-add the "check MSR with GPF
fixup", so that if you boot on a version of KVM (or bare-metal, or any other
virtualization solution for that matter) that doesn't support these MSRs, the
guest just won't enable the NMI watchdog and won't crash.

2.  Fix KVM on the Intel side so that it properly emulates and drops writes to
these MSRs, so we won't have this problem going forward.

Chris Lalancette
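
Suggestion 2 above (emulate and drop the writes) can be illustrated with a small
user-space model. This is only a sketch under stated assumptions, not KVM's
actual code and not the content of the upstream commit: struct vcpu_model,
emulate_wrmsr() and the drop_perf_writes switch are hypothetical names invented
for illustration. The MSR numbers are the K7 event-select and counter bases,
which match RBX (c0010000) and RCX (c0010004) in the oops above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* AMD K7/K8 performance-monitoring MSRs: EVNTSEL0..3, then PERFCTR0..3. */
#define MSR_K7_EVNTSEL0 0xc0010000u
#define MSR_K7_PERFCTR0 0xc0010004u
#define K7_NUM_COUNTERS 4u

/* What the (hypothetical) hypervisor does in response to a guest WRMSR. */
enum wrmsr_result { WRMSR_OK, WRMSR_INJECT_GP };

/* Hypothetical per-vCPU state; not a real KVM structure. */
struct vcpu_model {
    bool drop_perf_writes;   /* true models the "emulate and drop" behaviour */
    unsigned dropped_writes; /* how many writes were silently discarded */
};

static bool is_k7_perf_msr(uint32_t msr)
{
    return msr >= MSR_K7_EVNTSEL0 && msr < MSR_K7_PERFCTR0 + K7_NUM_COUNTERS;
}

/*
 * Model of a WRMSR exit handler. Writes to the K7 perfctr MSRs are either
 * discarded (the guest continues) or answered with #GP (an unprotected
 * guest wrmsr then oopses, as in this report). Every other MSR is treated
 * as unknown in this sketch.
 */
enum wrmsr_result emulate_wrmsr(struct vcpu_model *vcpu, uint32_t msr,
                                uint64_t value)
{
    assert(vcpu != NULL);
    (void)value; /* the value is intentionally thrown away */

    if (is_k7_perf_msr(msr) && vcpu->drop_perf_writes) {
        vcpu->dropped_writes++;
        return WRMSR_OK;
    }
    return WRMSR_INJECT_GP;
}
```

With drop_perf_writes set, a write to MSR_K7_PERFCTR0 returns WRMSR_OK and the
guest never sees a fault; with it cleared, the same write yields
WRMSR_INJECT_GP. Dropping the write means the guest's NMI watchdog counter
never actually runs, which is the trade-off this approach accepts.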
Comment 3 Glauber Costa 2008-05-21 09:42:36 EDT
I checked "The Book" (tm) for some light, and it says, under the rdpmc
instruction, that a GPF is generated if the instruction attempts to access an
invalid perfctr index. So, although nothing at all is said about perfctrs being
optional, the number of them seems to vary across architectures.

Given this, it seems to me that the correct thing is to check for a possible
GPF when reading it.
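
The guest-side idea of checking for a possible GPF can be sketched as a
user-space model. Everything here is an assumption made for illustration:
struct pmu_model, rdpmc_checked() and setup_watchdog() are hypothetical names,
and the boolean return value stands in for the fault that a real kernel would
catch through an exception-table fixup (in later mainline kernels, helpers
such as rdmsr_safe/wrmsr_safe play a comparable role for MSR accesses).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define PMU_MODEL_MAX_COUNTERS 4u

/* Hypothetical model of the performance counters a (virtual) CPU exposes. */
struct pmu_model {
    unsigned num_counters; /* 0 models a hypervisor that exposes none */
    uint64_t counter[PMU_MODEL_MAX_COUNTERS];
};

/*
 * Models a checked RDPMC: returns false where the hardware would raise #GP
 * for an invalid counter index, instead of letting the fault propagate.
 */
bool rdpmc_checked(const struct pmu_model *pmu, unsigned index,
                   uint64_t *value)
{
    assert(pmu != NULL && value != NULL);

    if (index >= pmu->num_counters || index >= PMU_MODEL_MAX_COUNTERS)
        return false; /* would have been a #GP */

    *value = pmu->counter[index];
    return true;
}

/*
 * Watchdog setup that probes counter 0 first. On failure it reports that
 * the NMI watchdog stays disabled rather than crashing the guest.
 */
bool setup_watchdog(const struct pmu_model *pmu)
{
    uint64_t probe;

    if (!rdpmc_checked(pmu, 0, &probe))
        return false; /* no usable counter: carry on without the watchdog */
    return true;
}
```

On a model with num_counters set to 0, setup_watchdog() returns false and
boot would continue without the watchdog, which is the pre-U7 behaviour
described in comment 2.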
Comment 4 David Juran 2008-10-20 07:17:57 EDT
RHEL4 kernel-2.6.9-78.0.5 now works fine with kvm-65-9.fc9