Bug 639208 - qemu-kvm crashes early in the BIOS, apparently in one of the extension ROMs
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: Fedora
Classification: Fedora
Component: kernel
Version: 14
Hardware: x86_64
OS: Linux
Priority: low
Severity: high
Target Milestone: ---
Assignee: Kernel Maintainer List
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Duplicates: 659196
Depends On:
Blocks:
Reported: 2010-10-01 05:36 UTC by H. Peter Anvin
Modified: 2013-01-09 11:40 UTC (History)

Fixed In Version:
Clone Of:
Environment:
Last Closed: 2010-12-11 01:52:56 UTC
Type: ---
Embargoed:


Attachments
vmxcap (6.62 KB, application/octet-stream)
2010-10-03 09:49 UTC, Avi Kivity
vmxcap output (3.43 KB, text/plain)
2010-10-13 05:23 UTC, H. Peter Anvin

Description H. Peter Anvin 2010-10-01 05:36:23 UTC
Description of problem:
When running qemu-kvm on a Nehalem box, KVM crashes early in the BIOS; based on the value of CS, the crash appears to be inside the video BIOS:

QEMU 0.12.5 monitor - type 'help' for more information
(qemu) KVM internal error. Suberror: 2
extra data[0]: 80000010
extra data[1]: 80000b0d
rax 0000000000000e0a rbx 0000000000000007 rcx 0000000000000000 rdx 000000000000ffff
rsi 00000000000002ce rdi 0000000000000000 rsp 0000000000006e70 rbp 0000000000000000
r8  0000000000000000 r9  0000000000000000 r10 0000000000000000 r11 0000000000000000
r12 0000000000000000 r13 0000000000000000 r14 0000000000000000 r15 0000000000000000
rip 00000000000004a7 rflags 00010002
cs ca00 (000ca000/0000ffff p 1 dpl 3 db 0 s 1 type 3 l 0 g 0 avl 0)
ds ca00 (000ca000/0000ffff p 1 dpl 3 db 0 s 1 type 3 l 0 g 0 avl 0)
es f000 (000f0000/0000ffff p 1 dpl 3 db 0 s 1 type 3 l 0 g 0 avl 0)
ss 0000 (00000000/0000ffff p 1 dpl 3 db 0 s 1 type 3 l 0 g 0 avl 0)
fs 0000 (00000000/0000ffff p 1 dpl 3 db 0 s 1 type 3 l 0 g 0 avl 0)
gs ffff (000ffff0/0000ffff p 1 dpl 3 db 0 s 1 type 3 l 0 g 0 avl 0)
tr 0000 (feffd000/00002088 p 1 dpl 0 db 0 s 0 type b l 0 g 0 avl 0)
ldt 0000 (00000000/0000ffff p 1 dpl 0 db 0 s 0 type 2 l 0 g 0 avl 0)                                                  
gdt f7240/37                                                                                                          
idt 0/3ff                                                                                                             
cr0 10 cr2 0 cr3 0 cr4 0 cr8 0 efer 0 

(qemu) info registers
EAX=00000e0a EBX=00000007 ECX=00000000 EDX=0000ffff
ESI=000002ce EDI=00000000 EBP=00000000 ESP=00006e70
EIP=000004a7 EFL=00010002 [-------] CPL=3 II=0 A20=1 SMM=0 HLT=0
ES =f000 000f0000 0000ffff 0000f300
CS =ca00 000ca000 0000ffff 0000f300
SS =0000 00000000 0000ffff 0000f300
DS =ca00 000ca000 0000ffff 0000f300
FS =0000 00000000 0000ffff 0000f300
GS =ffff 000ffff0 0000ffff 0000f300
LDT=0000 00000000 0000ffff 00008200
TR =0000 feffd000 00002088 00008b00
GDT=     000f7240 00000037
IDT=     00000000 000003ff
CR0=00000010 CR2=00000000 CR3=00000000 CR4=00000000
DR0=00000000 DR1=00000000 DR2=00000000 DR3=00000000 
DR6=ffff0ff0 DR7=00000400
FCW=037f FSW=0000 [ST=0] FTW=00 MXCSR=00000000
FPR0=0000000000000000 0000 FPR1=0000000000000000 0000
FPR2=0000000000000000 0000 FPR3=0000000000000000 0000
FPR4=0000000000000000 0000 FPR5=0000000000000000 0000
FPR6=0000000000000000 0000 FPR7=0000000000000000 0000
XMM00=00000000000000000000000000000000 XMM01=00000000000000000000000000000000
XMM02=00000000000000000000000000000000 XMM03=00000000000000000000000000000000
XMM04=00000000000000000000000000000000 XMM05=00000000000000000000000000000000
XMM06=00000000000000000000000000000000 XMM07=00000000000000000000000000000000

The exact same VM with the exact same installed components runs fine on a Penryn box.

Version-Release number of selected component (if applicable):

qemu-system-x86-0.12.5-1.fc13.x86_64
seabios-bin-0.6.0-1.fc13.noarch
vgabios-0.6b-3.fc12.noarch
kernel-2.6.34.7-56.fc13.x86_64

How reproducible:
100%

Steps to Reproduce:
#!/bin/bash
here=$(dirname "$0")
image=$here/$(basename "$0" .run).img
macaddr=ca:a6:29:4a:5d:06
uuid=624a7e60-31f9-4caa-8e0d-6439af089f2d

netif=$(sudo /usr/bin/tunctl -b -u $(whoami))
qemu-kvm \
    -enable-kvm \
    -smp 4 -m 2048 \
    -drive file="$image",if=virtio,boot=on \
    -vga std \
    -net nic,model=virtio,macaddr=$macaddr \
    -net tap,ifname=$netif,script=/home/hpa/qemu/net/qemu-ifup,downscript=no \
    -rtc base=utc -uuid $uuid \
    -usb -usbdevice tablet -soundhw ac97 \
    -balloon virtio \
    -monitor stdio "$@"
sudo /usr/bin/tunctl -d $netif

The VM image used is a CentOS 5.5 image, but the VM never gets anywhere close to booting the operating system.
  
Actual results:


Expected results:


Additional info:

Comment 1 H. Peter Anvin 2010-10-01 05:38:43 UTC
CPU information (times 8):

processor       : 0
vendor_id       : GenuineIntel
cpu family      : 6
model           : 26
model name      : Genuine Intel(R) CPU           @ 0000 @ 2.93GHz
stepping        : 2
cpu MHz         : 1596.000
cache size      : 8192 KB
physical id     : 0
siblings        : 8
core id         : 0
cpu cores       : 4
apicid          : 0
initial apicid  : 0
fpu             : yes
fpu_exception   : yes
cpuid level     : 11
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx rdtscp lm constant_tsc arch_perfmon pebs bts rep_good xtopology nonstop_tsc aperfmperf pni dtes64 monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr pdcm dca sse4_1 sse4_2 x2apic popcnt lahf_lm ida tpr_shadow vnmi flexpriority ept vpid
bogomips        : 5866.18
clflush size    : 64
cache_alignment : 64
address sizes   : 36 bits physical, 48 bits virtual
power management:

Comment 2 H. Peter Anvin 2010-10-01 05:51:30 UTC
Correction: the CS looks to be right *after* the video BIOS, which presumably means it is in one of the option ROMs, i.e. extboot or gPXE.
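
For reference, a minimal sketch of the address arithmetic behind this observation, using the CS (ca00) and RIP (04a7) values from the register dump. The helper names and the 0xC0000-0xDFFFF window are assumptions made for illustration (that range is the conventional legacy option ROM area); they are not taken from SeaBIOS or qemu.

```c
#include <stdint.h>

/* Real-mode (and vm86) linear address: segment * 16 + offset. */
static inline uint32_t real_mode_linear(uint16_t seg, uint16_t off)
{
        return ((uint32_t)seg << 4) + off;
}

/* Assumed bounds of the conventional legacy option ROM window. */
#define OPTION_ROM_START 0xC0000u
#define OPTION_ROM_END   0xE0000u        /* exclusive */

/* Hypothetical helper: is a linear address inside that window? */
static inline int in_option_rom_window(uint32_t linear)
{
        return linear >= OPTION_ROM_START && linear < OPTION_ROM_END;
}
```

With the values from the dump, real_mode_linear(0xca00, 0x04a7) gives 0xca4a7, which falls in the option ROM window above the 0xC0000 start where the video BIOS is conventionally mapped.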

Comment 3 H. Peter Anvin 2010-10-01 21:14:23 UTC
Setting severity to high since this is a significant loss of functionality.

Comment 4 H. Peter Anvin 2010-10-02 23:04:34 UTC
Update on this: the failure appears to happen inside the gPXE option ROM, which isn't actually used in this scenario but which does not seem possible to disable.

gpxe-roms-qemu-1.0.1-1.fc13.noarch

Comment 5 H. Peter Anvin 2010-10-02 23:11:31 UTC
The failure happens specifically after returning from int 0x10, which implies an interaction with the video BIOS:

0000049A  EB0F              jmp short 0x4ab
0000049C  BB0700            mov bx,0x7
0000049F  B40E              mov ah,0xe
000004A1  3C0A              cmp al,0xa
000004A3  7504              jnz 0x4a9
000004A5  CD10              int 0x10
000004A7  B00D              mov al,0xd        <--- crash here
000004A9  CD10              int 0x10
000004AB  5D                pop bp
000004AC  5B                pop bx
000004AD  58                pop ax
000004AE  C3                ret
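
A small sketch of the arithmetic implied by the listing and the register dump, for anyone following along. It assumes the standard real-mode interrupt vector table layout (4 bytes per vector, starting at the IDT base) and the two-byte encoding of "int imm8" (CD ib) visible in the disassembly; the function names are hypothetical.

```c
#include <stdint.h>

/* Each real-mode IVT entry is 4 bytes: 16-bit offset, then 16-bit segment. */
static inline uint32_t ivt_entry_addr(uint32_t idt_base, uint8_t vector)
{
        return idt_base + 4u * vector;
}

/* The last byte of the entry must lie within the IDT limit. */
static inline int ivt_entry_in_limit(uint32_t idt_limit, uint8_t vector)
{
        return 4u * vector + 3u <= idt_limit;
}

/* Return IP pushed by a two-byte "int imm8" located at int_ip. */
static inline uint16_t int_return_ip(uint16_t int_ip)
{
        return (uint16_t)(int_ip + 2);
}
```

With idt 0/3ff from the dump, vector 0x10 is read from linear address 0x40 and is within the limit, and the int 0x10 at 0x4a5 returns to 0x4a7, which matches the reported rip.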

Comment 6 Avi Kivity 2010-10-03 08:08:07 UTC
While the processor was trying to inject software interrupt 0x10 (extra data[0]: 80000010), it encountered a #GP (extra data[1]: 80000b0d).

However, extra data[0] doesn't look right: it should have bits 10:8 == 4 for a software interrupt.

Can you trace vmx_queue_exception() to see which path it takes and what value it writes into VM_ENTRY_INTR_INFO_FIELD?  Looks like the software interrupt was converted into a hardware interrupt somehow.


static void vmx_queue_exception(struct kvm_vcpu *vcpu, unsigned nr,
                                bool has_error_code, u32 error_code,
                                bool reinject)
{
        struct vcpu_vmx *vmx = to_vmx(vcpu);
        u32 intr_info = nr | INTR_INFO_VALID_MASK;

        if (has_error_code) {
                vmcs_write32(VM_ENTRY_EXCEPTION_ERROR_CODE, error_code);
                intr_info |= INTR_INFO_DELIVER_CODE_MASK;
        }

        if (vmx->rmode.vm86_active) {
                /* <--- expected path */
                vmx->rmode.irq.pending = true;
                vmx->rmode.irq.vector = nr;
                vmx->rmode.irq.rip = kvm_rip_read(vcpu);
                if (kvm_exception_is_soft(nr))
                        vmx->rmode.irq.rip +=
                                vmx->vcpu.arch.event_exit_inst_len;
                intr_info |= INTR_TYPE_SOFT_INTR;
                vmcs_write32(VM_ENTRY_INTR_INFO_FIELD, intr_info);
                vmcs_write32(VM_ENTRY_INSTRUCTION_LEN, 1);
                kvm_rip_write(vcpu, vmx->rmode.irq.rip - 1);
                return;
        }
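
The two extra data words above can be decoded with a short sketch like the following. It assumes the interruption-information layout described in the Intel SDM (bits 7:0 vector, bits 10:8 type, bit 11 error-code flag, bit 31 valid); the struct and function names are made up for illustration and are not the kernel's.

```c
#include <stdint.h>

/* Interruption types in bits 10:8 (per the Intel SDM layout). */
enum intr_type {
        INTR_EXTERNAL       = 0,
        INTR_NMI            = 2,
        INTR_HW_EXCEPTION   = 3,
        INTR_SOFT_INTERRUPT = 4,
        INTR_SOFT_EXCEPTION = 6,
};

/* Hypothetical decoded view of one 32-bit interruption-information word. */
struct intr_info {
        uint8_t vector;         /* bits 7:0  */
        uint8_t type;           /* bits 10:8 */
        int has_error_code;     /* bit 11    */
        int valid;              /* bit 31    */
};

static inline struct intr_info decode_intr_info(uint32_t raw)
{
        struct intr_info info;

        info.vector = (uint8_t)(raw & 0xffu);
        info.type = (uint8_t)((raw >> 8) & 0x7u);
        info.has_error_code = (int)((raw >> 11) & 1u);
        info.valid = (int)((raw >> 31) & 1u);
        return info;
}
```

Under that layout, 0x80000010 decodes to a valid event with vector 0x10 and type 0 (external interrupt) rather than type 4, and 0x80000b0d decodes to a valid hardware exception (type 3) with vector 0x0d (#GP) and an error code.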

Comment 7 Avi Kivity 2010-10-03 09:49:09 UTC
Created attachment 451258
vmxcap

Unable to reproduce with the same components on a Nehalem-EX here.

Please provide the output of vmxcap (attached).

Comment 8 H. Peter Anvin 2010-10-13 05:23:03 UTC
Created attachment 453083
vmxcap output

Sorry for the late reply... here is the output.

Comment 9 Avi Kivity 2010-10-13 06:35:36 UTC
Ok, no unrestricted guest support.

Please provide the traces requested in #6.

Comment 10 H. Peter Anvin 2010-12-09 21:33:56 UTC
*** Bug 659196 has been marked as a duplicate of this bug. ***

Comment 11 H. Peter Anvin 2010-12-09 21:34:30 UTC
Bump to Fedora 14

Comment 12 H. Peter Anvin 2010-12-11 01:52:56 UTC
Problem root-caused to a defective CPU.  I incorrectly thought this was a qualified system; in fact it was not.

