Bug 1117542

Summary: Support for movntdq
Product: Red Hat Enterprise Linux 7
Component: kernel
kernel sub component: KVM
Version: 7.0
Status: CLOSED ERRATA
Severity: unspecified
Priority: unspecified
Hardware: Unspecified
OS: Unspecified
Target Milestone: rc
Reporter: Alex Williamson <alex.williamson>
Assignee: Paolo Bonzini <pbonzini>
QA Contact: Virtualization Bugs <virt-bugs>
CC: alex.williamson, chayang, huding, juzhang, knoel, michen, pbonzini, qzhang, rbalakri, virt-maint, xfu, xwei
Fixed In Version: kernel-3.10.0-205.el7
Doc Type: Bug Fix
Type: Bug
Last Closed: 2015-03-05 12:28:33 UTC

Description Alex Williamson 2014-07-08 22:55:24 UTC
Description of problem:
qemu-kvm exits with:

KVM internal error. Suberror: 1
emulation failure
RAX=0000000000000008 RBX=ffffd000222d4000 RCX=ffffd00022354000 RDX=fffff000d31ea000
RSI=ffffe000716bd5e0 RDI=ffffe0007169a000 RBP=ffffd000221052c0 RSP=ffffd00022105208
R8 =0000000000080000 R9 =0000000000004000 R10=fffff800564433fc R11=ffffd000222d4000
R12=0000000000000000 R13=ffffe000716c2690 R14=0000000000000000 R15=ffffe00071e2f010
RIP=fffff800566099ac RFL=00010286 [--S--P-] CPL=0 II=0 A20=1 SMM=0 HLT=0
ES =002b 0000000000000000 ffffffff 00c0f300 DPL=3 DS   [-WA]
CS =0010 0000000000000000 00000000 00209b00 DPL=0 CS64 [-RA]
SS =0018 0000000000000000 ffffffff 00c09300 DPL=0 DS   [-WA]
DS =002b 0000000000000000 ffffffff 00c0f300 DPL=3 DS   [-WA]
FS =0053 000000004fa70000 00003c00 0040f300 DPL=3 DS   [-WA]
GS =002b fffff8038e971000 ffffffff 00c0f300 DPL=3 DS   [-WA]
LDT=0000 0000000000000000 ffffffff 00c00000
TR =0040 fffff8038fe74080 00000067 00008b00 DPL=0 TSS64-busy
GDT=     fffff8038fe73000 0000007f
IDT=     fffff8038fe73080 00000fff
CR0=80050033 CR2=fffff800569ab724 CR3=0000000102573000 CR4=001506f8
DR0=0000000000000000 DR1=0000000000000000 DR2=0000000000000000 DR3=0000000000000000 
DR6=00000000ffff0ff0 DR7=0000000000000400
EFER=0000000000000d01
Code=02 00 00 b8 08 00 00 00 f3 0f 6f 44 0a f0 f3 0f 6f 4c 0a e0 <66> 0f e7 41 f0 66 0f e7 49 e0 48 83 e9 40 f3 0f 6f 44 0a 10 f3 0f 6f 0c 0a 66 0f e7 41 10

Paolo decodes this to:

$ as -o a.out
        .section .text
        .byte 0x66, 0x0f, 0xe7, 0x41, 0xf0
        .byte 0x66, 0x0f, 0xe7, 0x49, 0xe0
$ objdump -d a.out
    0:  66 0f e7 41 f0          movntdq %xmm0,-0x10(%rcx)
    5:  66 0f e7 49 e0          movntdq %xmm1,-0x20(%rcx)
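
The same decode can be cross-checked without binutils. The following is only a minimal sketch: it assumes the bracketed byte in the Code= line marks the byte at RIP, and it handles just the `66 0f e7 /r` form with an 8-bit displacement and no REX prefix or SIB byte. The helper names `bytes_at_rip` and `decode_movntdq` are made up for illustration; this is not a general x86 decoder.

```python
# Code= line copied from the register dump in this report.
# Assumption: the byte wrapped in <...> is the byte at RIP.
CODE_DUMP = (
    "02 00 00 b8 08 00 00 00 f3 0f 6f 44 0a f0 f3 0f 6f 4c 0a e0 "
    "<66> 0f e7 41 f0 66 0f e7 49 e0 48 83 e9 40 f3 0f 6f 44 0a 10 "
    "f3 0f 6f 0c 0a 66 0f e7 41 10"
)

# 64-bit register names indexed by the 3-bit ModRM.rm field (no REX prefix).
REG64 = ["rax", "rcx", "rdx", "rbx", "rsp", "rbp", "rsi", "rdi"]


def bytes_at_rip(dump):
    """Return the dumped bytes starting at the <xx> marker."""
    tokens = dump.split()
    start = next(i for i, tok in enumerate(tokens) if tok.startswith("<"))
    return bytes(int(tok.strip("<>"), 16) for tok in tokens[start:])


def decode_movntdq(code):
    """Decode one `66 0f e7 /r` instruction with mod=01 (base + disp8).

    Returns (AT&T-syntax text, instruction length in bytes).
    Any other encoding raises ValueError.
    """
    if code[:3] != b"\x66\x0f\xe7":
        raise ValueError("not a 66 0f e7 (movntdq) encoding")
    modrm = code[3]
    mod, reg, rm = modrm >> 6, (modrm >> 3) & 7, modrm & 7
    if mod != 1 or rm == 4:  # rm == 4 would require a SIB byte
        raise ValueError("only the mod=01, no-SIB form is handled")
    # The displacement is a signed 8-bit value.
    disp = code[4] - 256 if code[4] >= 0x80 else code[4]
    sign = "-" if disp < 0 else ""
    return f"movntdq %xmm{reg},{sign}{abs(disp):#x}(%{REG64[rm]})", 5


if __name__ == "__main__":
    code = bytes_at_rip(CODE_DUMP)
    offset = 0
    for _ in range(2):  # the two stores Paolo decoded
        text, length = decode_movntdq(code[offset:])
        print(f"{offset:5x}:  {text}")
        offset += length
```

Running it prints the two stores starting at RIP, which can be compared against the objdump output above.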

Version-Release number of selected component (if applicable):
Found upstream, relevant to RHEL as well.

How reproducible:
100%

Steps to Reproduce:
1. There is no minimal test case yet; it is probably best if Paolo can come up with one. My test is to attempt to boot a Windows 8.1 guest with an assigned GeForce card using OVMF firmware (not supported by RHEL).

Actual results:
KVM go boom

Expected results:
Windows go boom

Additional info:

Comment 2 Paolo Bonzini 2014-08-29 12:59:51 UTC
http://article.gmane.org/gmane.linux.kernel/1745010

Comment 3 Jarod Wilson 2014-11-17 14:14:34 UTC
Patch(es) available on kernel-3.10.0-205.el7

Comment 6 Xiaoqing Wei 2014-12-02 03:34:55 UTC
Hi Alex,

I am trying to verify this bz.

I would like to know the exact model of card and host you are using.
Or will any kind of VGA passthrough do?
Then I can look for a host with the required hardware configuration. Thank you.

Best Regards,
Xiaoqing Wei.

Comment 7 Alex Williamson 2014-12-02 05:34:56 UTC
(In reply to Xiaoqing Wei from comment #6)
> Hi Alex,
> 
> I am trying to verify this bz,
> 
> would like to know what exact model of card & host you're using.
> or any kind of VGA passthru will do ?
> then I can seek for a host with hw cfg needed. thank you.

Likely an Nvidia GT635 on an Intel host, but note that the original problem was not found on a RHEL kernel.  GeForce assignment is not supported on RHEL at all.  VFIO VGA support is explicitly disabled and no attempt has been made to make OVMF-based GPU assignment work on RHEL.  RHEL7 only supports Nvidia Quadro/GRID/Tesla assignment.  I'd suggest a synthetic test that is not dependent on device assignment to validate this bug.

Comment 8 Xiaoqing Wei 2014-12-02 08:47:31 UTC
(In reply to Alex Williamson from comment #7)

Hi Paolo,

Alex suggested not testing with device assignment.
Do you have any suggestion for triggering it without device assignment?

Ideally the scenario would be one that is supported in RHEL/RHEV.

QE needs to verify the fix: trigger the failure on the old version, then make sure the same steps do not trigger it on the fixed version.


Best Regards,
Xiaoqing Wei.

Comment 9 Xiaoqing Wei 2014-12-02 08:53:21 UTC
Or would writing a program that uses the "movntdq" instruction work? I am guessing it might.


https://bugzilla.redhat.com/buglist.cgi?quicksearch=suberror&list_id=3053572

I searched the bz database and found several similar bzs; most of them share a common step, which is running 'system_reset' in the qemu monitor.
So I ran hundreds of rounds of tests with Win8.1 x86_64 on different hosts, both AMD and Intel.
Unfortunately, none of them gave me a reproducer.

:-(

Comment 10 Paolo Bonzini 2014-12-02 10:09:48 UTC
Right now there is no such testcase in kvm-unit-tests.  Because it uses SSE, it is a bit harder to write one.  I can work on it for 7.2, but for now I suggest sanity checking only.

Comment 12 juzhang 2014-12-08 02:19:27 UTC
According to comment 10 and comment 11, setting this issue to verified.

Comment 13 Xiaoqing Wei 2014-12-08 03:07:45 UTC
Update to comment 11:

Tests were run on AMD and Intel machines with kernel 211.

1600+ rounds of tests completed (each round runs system_reset on the VM 20 times).

The bug did not reproduce.

The sanity test passes.

Comment 15 errata-xmlrpc 2015-03-05 12:28:33 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHSA-2015-0290.html