Bug 1566258

Summary:	Running a VM causes thousands of vmwrite error: reg 401e value 2021 (err 12)
Product:	[Fedora] Fedora	Reporter:	pzeppegno
Component:	kernel	Assignee:	Kernel Maintainer List <kernel-maint>
Status:	CLOSED ERRATA	QA Contact:	Fedora Extras Quality Assurance <extras-qa>
Severity:	high	Docs Contact:
Priority:	unspecified
Version:	28	CC:	airlied, bskeggs, dowdle, ewk, fche, fedora-kernel-kvm, hdegoede, ichavero, itamar, jarodwilson, jcline, jglisse, john.j5live, jonathan, josef, kernel-maint, labbott, linville, mchehab, mjg59, nberrehouc, pbonzini, redhat, sean.j.christopherson, steved
Target Milestone:	---	Keywords:	Reopened
Target Release:	---
Hardware:	x86_64
OS:	Linux
Whiteboard:
Fixed In Version:	kernel-4.16.8-300.fc28	Doc Type:	If docs needed, set a value
Doc Text:		Story Points:	---
Clone Of:		Environment:
Last Closed:	2018-09-27 14:15:28 UTC	Type:	Bug
Regression:	---	Mount Type:	---
Documentation:	---	CRM:
Verified Versions:		Category:	---
oVirt Team:	---	RHEL 7.3 requirements from Atomic Host:
Cloudforms Team:	---	Target Upstream Version:
Embargoed:

Description pzeppegno 2018-04-11 21:38:54 UTC

Description of problem:


Version-Release number of selected component (if applicable):
4.16.0-300.fc28.x86_64

How reproducible:
always

Steps to Reproduce:
1. Start a virtual machine under virt-manager (windows 7 or XP)
2. See it running slowly
3. Check dmesg and see zillions of warnings

Actual results:
VM runs slow
zillions of errors in dmesg

[35238.492760] vmwrite error: reg 401e value 2021 (err 12)
[35238.492765] CPU: 0 PID: 21396 Comm: CPU 0/KVM Tainted: P        W  OE    4.16.1-300.fc28.x86_64 #1
[35238.492766] Hardware name: System manufacturer System Product Name/P5B-Deluxe, BIOS 1238    09/30/2008
[35238.492767] Call Trace:
[35238.492774]  dump_stack+0x5c/0x85
[35238.492780]  vmx_set_cr4+0x1dc/0x200 [kvm_intel]
[35238.492803]  kvm_set_cr4+0x18b/0x2e0 [kvm]
[35238.492823]  rsm_enter_protected_mode+0x7f/0x110 [kvm]
[35238.492844]  rsm_load_state_64+0x6cb/0x8b0 [kvm]
[35238.492863]  ? emulator_task_switch+0x290/0x4a0 [kvm]
[35238.492883]  em_rsm+0x258/0x2a0 [kvm]
[35238.492903]  x86_emulate_insn+0x15a/0x10a0 [kvm]
[35238.492908]  ? vmx_get_cs_db_l_bits+0x17/0x30 [kvm_intel]
[35238.492927]  x86_emulate_instruction+0x14d/0x670 [kvm]
[35238.492932]  handle_exception+0x112/0x330 [kvm_intel]
[35238.492937]  vmx_handle_exit+0xab/0xe10 [kvm_intel]
[35238.492941]  ? vmx_vcpu_run+0x7b7/0x990 [kvm_intel]
[35238.492961]  ? kvm_arch_vcpu_ioctl_run+0x504/0x1720 [kvm]
[35238.492964]  ? __schedule+0x23f/0x850
[35238.492967]  ? __update_load_avg_se.isra.33+0x1c5/0x1d0
[35238.492984]  ? kvm_vcpu_ioctl+0x2b8/0x5e0 [kvm]
[35238.492986]  ? set_next_entity+0x96/0x1c0
[35238.492988]  ? pick_next_task_fair+0x2f7/0x5a0
[35238.492991]  ? do_vfs_ioctl+0xa4/0x610
[35238.492993]  ? SyS_ioctl+0x74/0x80
[35238.492995]  ? do_syscall_64+0x74/0x180
[35238.492997]  ? entry_SYSCALL_64_after_hwframe+0x3d/0xa2

same errors also with nouveau. nvidia taint is not the cause

Expected results:
Runs fast with no warnings

Additional info:
Old processor, same one hit previously by kernel regression

processor	: 1
vendor_id	: GenuineIntel
cpu family	: 6
model		: 15
model name	: Intel(R) Core(TM)2 CPU          6600  @ 2.40GHz

Comment 1 Sean Christopherson 2018-04-17 15:10:49 UTC

v4.16 is doing an unconditional RMW on SECONDARY_VM_EXEC_CONTROL in the vmx_set_cr4 path.  The SECONDARY_VM_EXEC_CONTROL VMCS field is not guaranteed to exist on older processors, e.g. most Core 2 SKUs, causing VMWRITE to fail.  KVM's VMREAD function doesn't warn on failure and so only the VMWRITE failure is logged.  Because the VMREAD fails in the RMW sequence, the value reported in the VMWRITE error path is garbage, i.e. whatever happened to be in the GPR prior to VMREAD.  In this case it's more than likely the value that is actually destined to be written to CR4.

This error is benign in the sense that it won't actually affect KVM functionality, though it undoubtedly slows down the system due to flooding the kernel log.

Comment 2 pzeppegno 2018-04-18 17:47:41 UTC

I confirm that since 4.16 I cannot use VMs with kvm anymore, again.

I have to switch back to using VirtualBox, that while nice enough, is not my preferred solution.

If removing the RMW for that set of old processors won't affect functionality, then I would ask to at least make it conditional or roll the change back as it has created a regression.

Comment 3 Gerhard Wiesinger 2018-04-24 16:27:17 UTC

Also happens with kernel-4.16.3-200.fc27. In my setup kernel stalls then completely (e.g. after bootup 3 pings, afterwards no ping replay from the hosts). Around 10 VMs are started.

See https://bodhi.fedoraproject.org/updates/FEDORA-2018-b6be90818d for details.

Possible fix: https://patchwork.kernel.org/patch/10353377/

Testbuild at: https://koji.fedoraproject.org/koji/taskinfo?taskID=26538797

Comment 4 pzeppegno 2018-04-25 00:17:55 UTC

I have installed and tested on my PC the testbuild kernel from:

https://koji.fedoraproject.org/koji/taskinfo?taskID=26538797

It works fine in my PC. No issues to report. Good job everyone!

Comment 5 Nicolas Berrehouc 2018-04-28 08:12:33 UTC

Same problem and same fix with the testbuild kernel https://koji.fedoraproject.org/koji/taskinfo?taskID=26538797

Works fine again. Good job!

Comment 6 pzeppegno 2018-05-04 17:04:58 UTC

I have seen various comments about the correctness of the patch, just wondering if there's a consensus about it and therefore when the patch would be applied upstream and made available in an official fedora build. Right now I have to choose between using the fixed but stale koji build and a broken fresh release, that is not the ideal situation.

Thanks

Comment 7 Laura Abbott 2018-05-04 17:22:30 UTC

There was a second version posted on Monday https://patchwork.kernel.org/patch/10372151/ and tagged for stable. I haven't seen it applied to the kvm tree yet but I would expect it to happen sometime this coming week. If you see the maintainer ack it or mention it is applied feel free to let us know so we can pick it up.

Comment 8 Jeremy Cline 2018-05-07 13:16:22 UTC

*** Bug 1575625 has been marked as a duplicate of this bug. ***

Comment 9 Scott Dowdle 2018-05-10 18:11:04 UTC

I don't know if this is related or not... but I haven't found a better bug match yet...

I have 16GB of RAM and 4GB of swap.  I start a KVM VM that is assigned 2GB of RAM.  I have plenty of free RAM available (although a significant amount of it might be used as buffer/cache) but the system goes into a swap storm.  It is like getting a request to 2GB of RAM for the VM somehow made the system overreact and decide to start swapping all my desktop applications to swap.  swap goes from 0 to 2GB+ in a couple of minutes, the hard drive light is constantly on... and switching between applications is impossible... unless I'm extra patient and wait several minutes.  Looking at free -m I have 10GB+ of RAM available but it is in a heavy swap situation.

Just rebooted tot he 4.16.7-300 kernel and am hoping not to run into that again but I started having it on 4.15.x and all of the 4.16.x releases.

Comment 10 Fedora Update System 2018-05-11 14:05:59 UTC

kernel-4.16.8-300.fc28 has been submitted as an update to Fedora 28. https://bodhi.fedoraproject.org/updates/FEDORA-2018-ac3b4c7605

Comment 11 Jeremy Cline 2018-05-11 16:19:17 UTC

*** Bug 1577283 has been marked as a duplicate of this bug. ***

Comment 12 Fedora Update System 2018-05-12 19:35:43 UTC

kernel-4.16.8-300.fc28 has been pushed to the Fedora 28 testing repository. If problems still persist, please make note of it in this bug report.
See https://fedoraproject.org/wiki/QA:Updates_Testing for
instructions on how to install test updates.
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2018-ac3b4c7605

Comment 13 Fedora Update System 2018-05-14 17:55:44 UTC

kernel-4.16.8-300.fc28 has been pushed to the Fedora 28 stable repository. If problems still persist, please make note of it in this bug report.

Comment 14 Frank Ch. Eigler 2018-05-21 21:19:53 UTC

Please issue an fc27 update, since it too is affected.

Comment 15 Gerhard Wiesinger 2018-09-24 18:43:55 UTC

Kernel 4.17.19-200.fc28.x86_64 is ok
but Kernel 4.18.9-200.fc28.x86_64 is again NOT ok.

Maybe the patch is missing.
Is the upstream kernel integration maybe missing, too?

Thnx

Comment 16 Nicolas Berrehouc 2018-09-25 17:27:42 UTC

Indeed problem is still present with kernel 4.18.9-200.fc28.x86_64.

Errors in dmesg:

[207786.536493] vmwrite error: reg 401e value 0 (err 12)
[207786.546141] CPU: 6 PID: 7719 Comm: CPU 0/KVM Not tainted 4.18.9-200.fc28.x86_64 #1
[207786.556149] Hardware name: Dell Inc. PowerEdge 2900/0TM757, BIOS 2.7.0 10/30/2010
[207786.566105] Call Trace:
[207786.575939]  dump_stack+0x5c/0x80
[207786.585904]  vmx_set_virtual_apic_mode+0x1c5/0x270 [kvm_intel]
[207786.595926]  ? kvm_make_all_cpus_request+0x4d/0x70 [kvm]
[207786.605794]  kvm_lapic_set_base+0x7b/0x190 [kvm]
[207786.615583]  kvm_set_apic_base+0xac/0xd0 [kvm]
[207786.625218]  kvm_set_msr_common+0x829/0xc30 [kvm]
[207786.635258]  ? vmx_vcpu_run+0x404/0xb80 [kvm_intel]
[207786.644796]  ? vmx_vcpu_run+0x410/0xb80 [kvm_intel]
[207786.654695]  ? vmx_vcpu_run+0x404/0xb80 [kvm_intel]
[207786.664250]  ? vmx_vcpu_run+0x410/0xb80 [kvm_intel]
[207786.673532]  ? vmx_vcpu_run+0x404/0xb80 [kvm_intel]
[207786.683044]  handle_wrmsr+0x55/0x140 [kvm_intel]
[207786.693016]  kvm_arch_vcpu_ioctl_run+0x56e/0x1a10 [kvm]
[207786.702666]  kvm_vcpu_ioctl+0x2b8/0x5e0 [kvm]
[207786.712577]  do_vfs_ioctl+0xa4/0x620
[207786.722416]  ? apic_timer_interrupt+0xa/0x20
[207786.732097]  ksys_ioctl+0x60/0x90
[207786.742037]  __x64_sys_ioctl+0x16/0x20
[207786.751679]  do_syscall_64+0x5b/0x160
[207786.761662]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[207786.771391] RIP: 0033:0x7ff61abf4c57
[207786.781515] Code: 00 00 90 48 8b 05 49 82 2c 00 64 c7 00 26 00 00 00 48 c7 c0 ff ff ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 19 82 2c 00 f7 d8 64 89 01 48 
[207786.802966] RSP: 002b:00007ff60ba37648 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[207786.813995] RAX: ffffffffffffffda RBX: 00007ff624ac7002 RCX: 00007ff61abf4c57
[207786.824446] RDX: 0000000000000000 RSI: 000000000000ae80 RDI: 0000000000000017
[207786.834947] RBP: 0000000000000001 R08: 000055b035f73730 R09: 000000000000ffff
[207786.844994] R10: 000055b035f5d490 R11: 0000000000000246 R12: 000055b035f59760
[207786.855174] R13: 0000000000000000 R14: 00007ff624ac6000 R15: 000055b0386308b0

Comment 17 pzeppegno 2018-09-26 21:47:59 UTC

Confirmed there's something fishy again with kernel-4.18.9-200.fc28.x86_64

[  +0.163969] vmwrite error: reg 401e value 0 (err 12)
[  +0.000008] CPU: 0 PID: 8329 Comm: CPU 1/KVM Tainted: P           OE     4.18.9-200.fc28.x86_64 #1
[  +0.000001] Hardware name: System manufacturer System Product Name/P5B-Deluxe, BIOS 1238    09/30/2008
[  +0.000001] Call Trace:
[  +0.000011]  dump_stack+0x5c/0x80
[  +0.000010]  vmx_set_virtual_apic_mode+0x1c5/0x270 [kvm_intel]
[  +0.000039]  ? kvm_make_all_cpus_request+0x4d/0x70 [kvm]
[  +0.000019]  kvm_lapic_set_base+0x7b/0x190 [kvm]
[  +0.000019]  kvm_set_apic_base+0xac/0xd0 [kvm]
[  +0.000018]  kvm_set_msr_common+0x829/0xc30 [kvm]
[  +0.000005]  ? vmx_vcpu_run+0x404/0xb80 [kvm_intel]
[  +0.000005]  ? vmx_vcpu_run+0x410/0xb80 [kvm_intel]
[  +0.000004]  ? vmx_vcpu_run+0x404/0xb80 [kvm_intel]
[  +0.000004]  ? vmx_vcpu_run+0x410/0xb80 [kvm_intel]
[  +0.000004]  ? vmx_vcpu_run+0x404/0xb80 [kvm_intel]
[  +0.000004]  handle_wrmsr+0x55/0x140 [kvm_intel]
[  +0.000020]  kvm_arch_vcpu_ioctl_run+0x56e/0x1a10 [kvm]
[  +0.000016]  kvm_vcpu_ioctl+0x2b8/0x5e0 [kvm]
[  +0.000004]  ? do_signal+0x1a8/0x610
[  +0.000003]  ? __fpu__restore_sig+0x75/0x440
[  +0.000002]  do_vfs_ioctl+0xa4/0x620
[  +0.000003]  ksys_ioctl+0x60/0x90
[  +0.000001]  __x64_sys_ioctl+0x16/0x20
[  +0.000003]  do_syscall_64+0x5b/0x160
[  +0.000003]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[  +0.000003] RIP: 0033:0x7f1fcf961c57
[  +0.000000] Code: 00 00 90 48 8b 05 49 82 2c 00 64 c7 00 26 00 00 00 48 c7 c0 ff ff ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 19 82 2c 00 f7 d8 64 89 01 48 
[  +0.000035] RSP: 002b:00007f1fbb7fd648 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[  +0.000002] RAX: ffffffffffffffda RBX: 0000562f67310920 RCX: 00007f1fcf961c57
[  +0.000001] RDX: 0000000000000000 RSI: 000000000000ae80 RDI: 0000000000000017
[  +0.000001] RBP: 0000562f6568aa60 R08: 0000562f650ac730 R09: 0000562f656bb640
[  +0.000002] R10: 0000000000000000 R11: 0000000000000246 R12: 0000562f65092760
[  +0.000001] R13: 00007ffe00301fbf R14: 00007f1fd96a6000 R15: 0000562f67310920

However the behavior seems a bit different than before:

My windows 7 VM goes into suspended state after a while and I cannot resume it, while my fedora 28 VM throws some of these errors but not the zillions we had previously.

This is clearly a bummer, can someone take a look at it?

Comment 18 Jeremy Cline 2018-09-27 14:15:28 UTC

Let's track this on the new bug (https://bugzilla.redhat.com/show_bug.cgi?id=1632426) - it's not a regression of the fix for this bug as far as I can tell, just a new unconditional write to a field that presumably doesn't exist.