Bug 1088784

Summary: qemu 'KVM internal error. Suberror: 1' when querying CPUs frequently during PXE boot on Intel "Q95xx" host
Product: Red Hat Enterprise Linux 7
Reporter: Qian Guo <qiguo>
Component: kernel
Assignee: Paolo Bonzini <pbonzini>
Status: CLOSED ERRATA
QA Contact: Virtualization Bugs <virt-bugs>
Severity: medium
Docs Contact:
Priority: urgent
Version: 7.0
CC: alex.williamson, bdas, hhuang, juzhang, knoel, lersek, michen, mtosatti, pbonzini, qiguo
Target Milestone: rc
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version: kernel-3.10.0-143.el7
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
: 1097363
Environment:
Last Closed: 2015-03-05 11:55:54 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On: 1116936    
Bug Blocks: 1078775, 1097363, 1113511    
Attachments:
Description                        Flags
dmidecode of my host with Q9500    none
/proc/cpuinfo of host with q9500   none

Description Qian Guo 2014-04-17 07:44:39 UTC
Created attachment 887066 [details]
dmidecode of my host with Q9500

Description of problem:
When the CPUs are queried frequently while the guest PXE boots, qemu crashes. This only occurs on hosts with the following CPUs (the ones I used and hit this bug on):
'Intel(R) Core(TM)2 Quad CPU    Q9500  @ 2.83GHz' 
'Intel(R) Core(TM)2 Quad CPU    Q9550  @ 2.83GHz'

Version-Release number of selected component (if applicable):
ipxe-roms-qemu-20130517-5.gitc4bce43.el7.noarch

How reproducible:
100%

Steps to Reproduce:
1. Boot the guest with a network device:
# /usr/libexec/qemu-kvm -cpu Penryn -m 4G -smp 4,sockets=1,cores=4,threads=1 -M pc -enable-kvm  -device piix3-usb-uhci,id=usb -name rhel7 -nodefaults -nodefconfig  -device virtio-balloon-pci,id=balloon0  -vnc :10 -vga std -global PIIX4_PM.disable_s3=0 -global PIIX4_PM.disable_s4=0   -monitor stdio     -drive file=test,if=none,media=disk,format=raw,rerror=stop,werror=stop,aio=native,id=scsi-disk0 -device virtio-scsi-pci,id=bus2 -device scsi-hd,bus=bus2.0,drive=scsi-disk0,id=disk0 -netdev tap,id=netdev0,vhost=on,script=/etc/qemu-ifup -device virtio-net-pci,netdev=netdev0,id=vn1,mac=52:54:a0:0b:00:01 -boot menu=on -monitor unix:/tmp/m1,server,nowait -S

2. In another host session, query the CPUs frequently:
# while true; do echo "info cpus" |nc -U /tmp/m1 ; done

3. Resume the guest (it was started paused with -S) so that it boots.

Actual results:
qemu prints the following:

(qemu) KVM internal error. Suberror: 1
emulation failure
EAX=00000011 EBX=e5f8dfff ECX=00000030 EDX=00002ca8
ESI=40176888 EDI=00000000 EBP=00009cf2 ESP=00002ca8
EIP=00000213 EFL=00000006 [-----P-] CPL=0 II=0 A20=1 SMM=0 HLT=0
ES =9cf2 0009cf20 ffffffff 00809300 DPL=0 DS16 [-WA]
CS =9c7b 0009c7b0 ffffffff 00809b00 DPL=0 CS16 [-RA]
SS =9cf2 0009cf20 ffffffff 00809300 DPL=0 DS16 [-WA]
DS =9cf2 0009cf20 ffffffff 00809300 DPL=0 DS16 [-WA]
FS =9cf2 0009cf20 ffffffff 00809300 DPL=0 DS16 [-WA]
GS =9cf2 0009cf20 ffffffff 00809300 DPL=0 DS16 [-WA]
LDT=0000 00000000 0000ffff 00008200 DPL=0 LDT
TR =0000 00000000 0000ffff 00008b00 DPL=0 TSS32-busy
GDT=     0009cf30 00000037
IDT=     00000000 0000ffff
CR0=00000011 CR2=00000000 CR3=00000000 CR4=00000000
DR0=0000000000000000 DR1=0000000000000000 DR2=0000000000000000 DR3=0000000000000000 
DR6=00000000ffff0ff0 DR7=0000000000000400
EFER=0000000000000000
Code=66 0f 01 16 10 00 66 0f 01 1e 48 00 0f 20 c0 0c 01 0f 22 c0 <66> ea a4 00 00 00 08 00 0f 20 c0 24 fe 0f 22 c0 ff 2e 4e 00 2e a1 be 06 8e d8 8e c0 8e e0


The same failure is printed repeatedly.
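
For reading the dump, here is a minimal sketch, assuming a POSIX shell, that decodes two values from it. The helper names cr0_pe and linear_ip are made up for illustration and are not part of qemu or KVM. CR0=00000011 has bit 0 (PE) set, so the vCPU has already enabled protected mode while CS still holds the real-mode selector 9c7b.

```shell
# Illustrative helpers only (hypothetical names, not a qemu or KVM interface).

# Print 1 if CR0.PE (bit 0, protected mode enable) is set, else 0.
cr0_pe() {
    echo $(( $1 & 1 ))
}

# Print the linear address of the current instruction: segment base + EIP.
linear_ip() {
    printf '0x%x\n' $(( $1 + $2 ))
}

cr0_pe 0x00000011                  # CR0 from the dump; prints 1
linear_ip 0x0009c7b0 0x00000213    # CS base and EIP from the dump; prints 0x9c9c3
```

The bytes just before the marked `<66>` in the Code= line are `0f 22 c0`, a write to CR0, which is consistent with PE having just been set.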

Expected results:
qemu-kvm works without errors.

Additional info:
1. If at this point I quit the "info cpus" loop and run system_reset in HMP, the guest reboots successfully, and checking the guest status in HMP shows that it is running.

2. I tested this case on several hosts. Only the hosts with the CPUs 'Intel(R) Core(TM)2 Quad CPU    Q9500  @ 2.83GHz' and 'Intel(R) Core(TM)2 Quad CPU    Q9550  @ 2.83GHz' hit this issue. The following is the host information for the host with 'Intel(R) Core(TM)2 Quad CPU    Q9500  @ 2.83GHz'.

# lscpu
Architecture:          x86_64
CPU op-mode(s):        32-bit, 64-bit
Byte Order:            Little Endian
CPU(s):                4
On-line CPU(s) list:   0-3
Thread(s) per core:    1
Core(s) per socket:    4
Socket(s):             1
NUMA node(s):          1
Vendor ID:             GenuineIntel
CPU family:            6
Model:                 23
Model name:            Intel(R) Core(TM)2 Quad CPU    Q9500  @ 2.83GHz
Stepping:              10
CPU MHz:               2833.000
BogoMIPS:              5653.07
Virtualization:        VT-x
L1d cache:             32K
L1i cache:             32K
L2 cache:              3072K
NUMA node0 CPU(s):     0-3


I will also attach the dmidecode output and /proc/cpuinfo of this host to this bug.

3. The other hosts I tested, which do not hit this issue, have the following CPUs:
3.1.Model name:            Intel(R) Core(TM) i5-2400 CPU @ 3.10GHz
3.2.Model name:            Intel(R) Core(TM) i7-3770 CPU @ 3.40GHz


4. From my testing, this bug does not depend on the qemu-kvm or kernel build.

I tested qemu-kvm-1.5.3-60.el7.x86_64, qemu-kvm-1.5.3-50.el7.x86_64 and qemu-kvm-1.5.3-49.el7.x86_64, with kernel-3.10.0-95.el7.x86_64 and kernel-3.10.0-121.el7.x86_64.
With all of the above builds, the bug cannot be reproduced when ipxe-roms-qemu-20130517-4.gitc4bce43.el7.noarch is installed.

So, according to the above, this bug is a regression in ipxe: it is hit with ipxe-roms-qemu-20130517-5.gitc4bce43.el7.noarch and not hit with ipxe-roms-qemu-20130517-4.gitc4bce43.el7.noarch.

HIGHLIGHT: this bug can only be reproduced on hosts with Intel Q9500/Q9550 series CPUs.

Comment 1 Qian Guo 2014-04-17 07:45:26 UTC
Created attachment 887067 [details]
/proc/cpuinfo of host with q9500

Comment 4 juzhang 2014-04-17 08:24:56 UTC
> 
> So according to above, this bug is a regression bug of ipxe, hit with
> ipxe-roms-qemu-20130517-5.gitc4bce43.el7.noarch and can not hit with
> ipxe-roms-qemu-20130517-4.gitc4bce43.el7.noarch.

Based on this comment, adding the Regression keyword.

> 
> HIGHLIGHT: this bug only can reproduce on hosts with cpu intel q9500/q9550
> serials.

Please note that QE tested several Intel hosts, and this issue only happens on Q9500/Q9550 so far.

Setting the priority to urgent since this is a regression. Setting the severity to medium since the issue only happens on Q9500/Q9550 so far.

Comment 16 Paolo Bonzini 2014-05-12 15:07:53 UTC
100% reproducible indeed even with the Fedora iPXE.  The end of the trace is as follows:

kvm_emulate_insn:     9c7a0:20e: 0f 20 c0
kvm_entry:            vcpu 0
kvm_emulate_insn:     9c7a0:211: 0c 01
kvm_entry:            vcpu 0
kvm_emulate_insn:     9c7a0:213: 0f 22 c0
kvm_userspace_exit:   reason KVM_EXIT_INTR (10)
kvm_entry:            vcpu 0
kvm_emulate_insn:     9c7a0:216: 0f 22 c0
kvm_emulate_insn:     9c7a0:216: 0f 22 c0 FAIL

From a first look, the KVM_EXIT_INTR causes the VM to re-enter with the wrong instruction pointer.

Comment 17 Paolo Bonzini 2014-05-13 11:52:10 UTC
The repeated dump at offset 0x216 is a bug in the kvm plugin of trace-cmd.  Disabling it (trace-cmd report -N) shows that even the first byte of the instruction fails to be fetched:

 kvm_emulate_insn:     9c7a0:216: (prot16) failed

The reason is that "info cpus" causes the KVM_SET_SREGS ioctl to be triggered at exactly the wrong time, when CR0.PE = 1 but the real mode segment is still in CS.  The KVM_SET_SREGS ioctl resets the cached CPL value (which is 0), and the next call to vmx_get_cpl thinks that the CPL is 2 in my case or 3 in RHEL (that's bits 0-1 of CS).  Thus the bug is sensitive to the code size.  If it happens that CS's bits 0-1 are 0, the bug doesn't show up.
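
The CPL values mentioned here follow directly from the low two bits of the CS selector. A minimal sketch, assuming a POSIX shell and using a made-up helper name (selector_rpl), with the selectors visible in this report: 9c7b from the RHEL register dump and 9c7a from the CS base 9c7a0 in the Fedora trace. The third selector is a hypothetical value chosen only to show the case where the bug would stay hidden.

```shell
# Hypothetical helper: print bits 0-1 of a segment selector, which is the
# value that was mistaken for the CPL after the cached CPL was reset.
selector_rpl() {
    echo $(( $1 & 3 ))
}

selector_rpl 0x9c7b   # RHEL iPXE build: prints 3
selector_rpl 0x9c7a   # Fedora iPXE build: prints 2
selector_rpl 0x9c78   # hypothetical selector with low bits 0: prints 0
```

Because the real-mode selector depends on where the iPXE code is loaded, a small change in code size can move these two bits, which matches the bug appearing with one ipxe-roms-qemu build and not the other.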

Fixing it is not exactly trivial, but not too hard either.  We need to hijack the cs.padding field of kvm_segment to host the CPL, and QEMU needs to get and set the CPL too (which it stores in bits 0-1 of hflags).  The padding is currently ignored, so we also need a new VM capability that can be enabled with KVM_ENABLE_CAP.

In addition, vmx_set_cr0 must force CPL=0 always when CR0.PE=0, not just if VM86 mode is in use.

Comment 18 Paolo Bonzini 2014-05-13 11:52:52 UTC
The last sentence should have been "In addition, vmx_set_cr0 must force CPL=0 always when CR0.PE becomes 1, not just if VM86 mode is in use".

Comment 19 Paolo Bonzini 2014-05-14 14:40:25 UTC
Simpler patch at http://article.gmane.org/gmane.comp.emulators.kvm.devel/121884/raw

Comment 21 Jarod Wilson 2014-08-07 20:54:38 UTC
Patch(es) available on kernel-3.10.0-143.el7
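
To check whether a host kernel is at or past that build, something like the following could be used. This is only a sketch: has_fix is a made-up helper name, and it assumes the RHEL 7 release string format 3.10.0-N.el7 seen in this report. A kernel with a lower build number might still carry a backport, which this check does not account for.

```shell
# Hypothetical helper: succeed if a RHEL 7 kernel release string is at or past
# build 143 of 3.10.0 (the "Fixed In Version" of this bug).
# Returns 2 if the string is not a 3.10.0-based release.
has_fix() {
    build=${1#3.10.0-}                  # e.g. "143.el7.x86_64"
    [ "$build" != "$1" ] || return 2    # prefix did not match
    build=${build%%.*}                  # keep only the leading build number
    [ "$build" -ge 143 ]
}

has_fix "$(uname -r)" && echo "fix expected" || echo "fix not expected or unknown"
```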

Comment 24 Qian Guo 2014-10-30 06:24:32 UTC
Reproduced this bug with kernel-3.10.0-140.el7.x86_64.

Steps:
1. Boot the guest on a Q9500 host:
/usr/libexec/qemu-kvm -cpu Penryn -m 4G -smp 4,sockets=1,cores=4,threads=1 -M pc -enable-kvm  -device piix3-usb-uhci,id=usb -name rhel7 -nodefaults -nodefconfig  -device virtio-balloon-pci,id=balloon0  -vnc :10 -vga std -global PIIX4_PM.disable_s3=0 -global PIIX4_PM.disable_s4=0   -monitor stdio     -drive file=test,if=none,media=disk,format=raw,rerror=stop,werror=stop,aio=native,id=scsi-disk0 -device virtio-scsi-pci,id=bus2 -device scsi-hd,bus=bus2.0,drive=scsi-disk0,id=disk0 -netdev tap,id=netdev0,vhost=on,script=/etc/qemu-ifup -device virtio-net-pci,netdev=netdev0,id=vn1,mac=52:54:a0:0b:00:01 -boot menu=on -monitor unix:/tmp/m1,server,nowait -S

2. Run "info cpus" repeatedly:
while true; do echo "info cpus" |nc -U /tmp/m1 ; done

3. Resume the guest:
(qemu) c

Result: qemu crashed:
KVM internal error. Suberror: 1
emulation failure
EAX=00000011 EBX=00010063 ECX=00000030 EDX=00002ca8
ESI=401a7f78 EDI=b10a0000 EBP=00009cf2 ESP=00002ca8
EIP=00000213 EFL=00000006 [-----P-] CPL=0 II=0 A20=1 SMM=0 HLT=0
ES =9cf2 0009cf20 ffffffff 00809300 DPL=0 DS16 [-WA]
CS =9c7b 0009c7b0 ffffffff 00809b00 DPL=0 CS16 [-RA]
SS =9cf2 0009cf20 ffffffff 00809300 DPL=0 DS16 [-WA]
DS =9cf2 0009cf20 ffffffff 00809300 DPL=0 DS16 [-WA]
FS =9cf2 0009cf20 ffffffff 00809300 DPL=0 DS16 [-WA]
GS =9cf2 0009cf20 ffffffff 00809300 DPL=0 DS16 [-WA]
LDT=0000 00000000 0000ffff 00008200 DPL=0 LDT
TR =0000 00000000 0000ffff 00008b00 DPL=0 TSS32-busy
GDT=     0009cf30 00000037
IDT=     00000000 0000ffff
CR0=00000011 CR2=00000000 CR3=00000000 CR4=00000000
DR0=0000000000000000 DR1=0000000000000000 DR2=0000000000000000 DR3=0000000000000000 
DR6=00000000ffff0ff0 DR7=0000000000000400
EFER=0000000000000000
Code=66 0f 01 16 10 00 66 0f 01 1e 48 00 0f 20 c0 0c 01 0f 22 c0 <66> ea a4 00 00 00 08 00 0f 20 c0 24 fe 0f 22 c0 ff 2e 4e 00 2e a1 be 06 8e d8 8e c0 8e e0
.....


So this bug is reproduced.

Verified this bug with kernel-3.10.0-196.el7.x86_64.

Steps: same as above.

Result: qemu works without errors, and the guest can access the BIOS and the boot device.

So this bug is fixed.

Comment 26 errata-xmlrpc 2015-03-05 11:55:54 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHSA-2015-0290.html