Bug 1572446 - KVM: entry failed, hardware error 0xffffffff when booting rhel 7.5 guest in valgrind environment
Keywords:
Status: CLOSED DUPLICATE of bug 1572447
Alias: None
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: qemu-kvm-rhev
Version: 7.6
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: medium
Target Milestone: rc
Target Release: ---
Assignee: Virtualization Maintenance
QA Contact: Virtualization Bugs
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2018-04-27 03:18 UTC by Guo, Zhiyi
Modified: 2018-04-27 05:22 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2018-04-27 05:22:33 UTC
Target Upstream Version:


Description Guo, Zhiyi 2018-04-27 03:18:29 UTC
Description of problem:
Booting a RHEL 7.5 guest with qemu-kvm running under valgrind fails with "KVM: entry failed, hardware error 0xffffffff".

Version-Release number of selected component (if applicable):
qemu-kvm-rhev-2.10.0-21.el7.x86_64
kernel-3.10.0-862.el7.x86_64
valgrind-3.13.0-10.el7.x86_64

How reproducible:
100%

Steps to Reproduce:
1. Boot the guest with the following command line:
valgrind /usr/libexec/qemu-kvm -name nice -m 4G \
        -cpu Opteron_G5,enforce \
        -smp 4,cores=2 \
        -monitor stdio \
        -qmp unix:/tmp/qmp,server,nowait \
        -device cirrus-vga,vgamem_mb=4 \
        -serial unix:/tmp/console,server,nowait \
        -uuid 115e11b2-a869-41b5-91cd-6a32a907be7e \
        -drive file=rhel75.qcow2,if=none,id=drive-scsi-disk0,format=qcow2,cache=none,werror=stop,rerror=stop \
        -device ide-hd,drive=drive-scsi-disk0,id=scsi-disk0 \
        -vnc :0 \
        -cdrom RHEL-7.5-20180322.0-Server-x86_64-dvd1.iso


Actual results:
KVM: entry failed:
KVM: entry failed, hardware error 0xffffffff
RAX=0000000000000005 RBX=0000000000000000 RCX=0000000000000003 RDX=ffff8a65ffd80000
RSI=0000000000000048 RDI=0000000000000003 RBP=ffff8a64f547f960 RSP=ffff8a64f547f958
R8 =ffff8a65fff3a280 R9 =0000000000000100 R10=0000000000000100 R11=000142fd00c0aba0
R12=0000000000000282 R13=0000000000000107 R14=0000000000000000 R15=ffff8a65f96c4000
RIP=ffffffffae267ae7 RFL=00000046 [---Z-P-] CPL=0 II=0 A20=1 SMM=0 HLT=0
ES =0000 0000000000000000 ffffffff 00800000
CS =0010 0000000000000000 ffffffff 00a09b00 DPL=0 CS64 [-RA]
SS =0018 0000000000000000 ffffffff 00c09300 DPL=0 DS   [-WA]
DS =0000 0000000000000000 ffffffff 00800000
FS =0000 00007f76b0c408c0 ffffffff 00800000
GS =0000 ffff8a65ffc00000 ffffffff 00800000
LDT=0000 0000000000000000 0000ffff 00000000
TR =0040 ffff8a65ffc04000 00002087 00008b00 DPL=0 TSS64-busy
GDT=     ffff8a65ffc0c000 0000007f
IDT=     ffffffffff528000 00000fff
CR0=8005003b CR2=00005617250df000 CR3=0000000035dba000 CR4=000406f0
DR0=0000000000000000 DR1=0000000000000000 DR2=0000000000000000 DR3=0000000000000000 
DR6=00000000ffff0ff0 DR7=0000000000000400
EFER=0000000000000d01
Code=fd a0 53 f3 ae 48 89 e5 53 31 db 0f b7 0c 10 b8 05 00 00 00 <0f> 01 c1 5b 5d c3 0f 1f 00 0f 1f 44 00 00 55 48 89 e5 41 57 41 56 41 55 41 54 53 48 83 ec
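
For triage, the bytes at RIP can be read straight from the Code= line, where the `<..>` marker flags the byte the guest was about to execute. The following is a minimal sketch, not part of the original report: the helper names `bytes_at_rip` and `describe` are hypothetical, and it assumes the usual QEMU dump layout of space-separated hex bytes. It only compares the first three bytes with the two hypercall encodings (0f 01 c1 for VMCALL, 0f 01 d9 for VMMCALL).

```python
# Code= line copied from the QEMU register dump above; "<..>" marks the byte at RIP.
CODE_LINE = (
    "Code=fd a0 53 f3 ae 48 89 e5 53 31 db 0f b7 0c 10 b8 05 00 00 00 "
    "<0f> 01 c1 5b 5d c3 0f 1f 00 0f 1f 44 00 00 55 48 89 e5 41 57 41 56 "
    "41 55 41 54 53 48 83 ec"
)

# Three-byte encodings of the two hypercall instructions.
HYPERCALL_OPCODES = {
    bytes([0x0F, 0x01, 0xC1]): "VMCALL (Intel VMX encoding)",
    bytes([0x0F, 0x01, 0xD9]): "VMMCALL (AMD SVM encoding)",
}


def bytes_at_rip(code_line, count=3):
    """Return `count` bytes starting at the byte wrapped in <..>."""
    tokens = code_line.split("=", 1)[1].split()
    start = next(i for i, tok in enumerate(tokens) if tok.startswith("<"))
    cleaned = [tok.strip("<>") for tok in tokens[start:start + count]]
    return bytes(int(tok, 16) for tok in cleaned)


def describe(code_line):
    """Name the instruction at RIP if it is one of the known hypercall encodings."""
    insn = bytes_at_rip(code_line)
    return HYPERCALL_OPCODES.get(insn, "unknown: " + insn.hex())


print(describe(CODE_LINE))
```

If the result is VMCALL, that would be consistent with the ud_interception frames in the dmesg log, since VMCALL is not a valid instruction on AMD processors and raises #UD there. This is one reading of the dump, not a confirmed root cause.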


Expected results:
Guest can boot without problem

Additional info:
So far, I can reproduce this issue only on some AMD hosts, for example:
Model name:            AMD Opteron(tm) X3421 APU
Model name:            AMD Ryzen 5 PRO 1500 Quad-Core Processor

I cannot reproduce this issue on an AMD EPYC host.

The issue does not occur when qemu-kvm is run without valgrind.

Dmesg log:
[216539.375264] ------------[ cut here ]------------
[216539.375298] WARNING: CPU: 0 PID: 1498 at arch/x86/kvm/emulate.c:5653 x86_emulate_insn+0x38b/0xd00 [kvm]
[216539.375300] Modules linked in: vfat fat amd64_edac_mod edac_mce_amd kvm_amd kvm irqbypass crc32_pclmul ghash_clmulni_intel aesni_intel lrw gf128mul glue_helper ablk_helper cryptd sg joydev pcspkr k10temp fam15h_power i2c_piix4 shpchp pinctrl_amd video i2c_designware_platform i2c_designware_core acpi_cpufreq ip_tables xfs libcrc32c sd_mod crc_t10dif crct10dif_generic amdkfd amd_iommu_v2 amdgpu i2c_algo_bit drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ahci ttm libahci drm libata crct10dif_pclmul crct10dif_common tg3 crc32c_intel ptp pps_core i2c_hid i2c_core dm_mirror dm_region_hash dm_log dm_mod
[216539.375339] CPU: 0 PID: 1498 Comm: memcheck-amd64- Kdump: loaded Not tainted 3.10.0-862.el7.x86_64 #1
[216539.375340] Hardware name: HPE ProLiant MicroServer Gen10/ProLiant MicroServer Gen10, BIOS 5.12 03/02/2018
[216539.375342] Call Trace:
[216539.375350]  [<ffffffffb050d768>] dump_stack+0x19/0x1b
[216539.375355]  [<ffffffffafe916d8>] __warn+0xd8/0x100
[216539.375357]  [<ffffffffafe9181d>] warn_slowpath_null+0x1d/0x20
[216539.375370]  [<ffffffffc081854b>] x86_emulate_insn+0x38b/0xd00 [kvm]
[216539.375381]  [<ffffffffc07fa15d>] x86_emulate_instruction+0x1cd/0x700 [kvm]
[216539.375386]  [<ffffffffc0951e6e>] ud_interception+0x1e/0x40 [kvm_amd]
[216539.375389]  [<ffffffffc0957a54>] handle_exit+0x224/0xab0 [kvm_amd]
[216539.375400]  [<ffffffffc07f0ccc>] ? kvm_set_cr8+0x1c/0x20 [kvm]
[216539.375403]  [<ffffffffc0952f3a>] ? svm_vcpu_run+0x37a/0x610 [kvm_amd]
[216539.375413]  [<ffffffffc07f671d>] vcpu_enter_guest+0x64d/0x12c0 [kvm]
[216539.375425]  [<ffffffffc07fde58>] kvm_arch_vcpu_ioctl_run+0x358/0x480 [kvm]
[216539.375434]  [<ffffffffc07e3441>] kvm_vcpu_ioctl+0x2b1/0x650 [kvm]
[216539.375439]  [<ffffffffb002fb90>] do_vfs_ioctl+0x350/0x560
[216539.375442]  [<ffffffffafea5b0b>] ? recalc_sigpending+0x1b/0x70
[216539.375446]  [<ffffffffb00d82bf>] ? file_has_perm+0x9f/0xb0
[216539.375448]  [<ffffffffb002fe41>] SyS_ioctl+0xa1/0xc0
[216539.375452]  [<ffffffffb051f7d5>] system_call_fastpath+0x1c/0x21
[216539.375454] ---[ end trace 7758fa556d251181 ]---
[216539.375456] ------------[ cut here ]------------
[216539.375466] WARNING: CPU: 0 PID: 1498 at arch/x86/kvm/x86.c:364 exception_type+0x49/0x50 [kvm]
[216539.375467] Modules linked in: vfat fat amd64_edac_mod edac_mce_amd kvm_amd kvm irqbypass crc32_pclmul ghash_clmulni_intel aesni_intel lrw gf128mul glue_helper ablk_helper cryptd sg joydev pcspkr k10temp fam15h_power i2c_piix4 shpchp pinctrl_amd video i2c_designware_platform i2c_designware_core acpi_cpufreq ip_tables xfs libcrc32c sd_mod crc_t10dif crct10dif_generic amdkfd amd_iommu_v2 amdgpu i2c_algo_bit drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ahci ttm libahci drm libata crct10dif_pclmul crct10dif_common tg3 crc32c_intel ptp pps_core i2c_hid i2c_core dm_mirror dm_region_hash dm_log dm_mod
[216539.375491] CPU: 0 PID: 1498 Comm: memcheck-amd64- Kdump: loaded Tainted: G        W      ------------   3.10.0-862.el7.x86_64 #1
[216539.375493] Hardware name: HPE ProLiant MicroServer Gen10/ProLiant MicroServer Gen10, BIOS 5.12 03/02/2018
[216539.375494] Call Trace:
[216539.375496]  [<ffffffffb050d768>] dump_stack+0x19/0x1b
[216539.375499]  [<ffffffffafe916d8>] __warn+0xd8/0x100
[216539.375501]  [<ffffffffafe9181d>] warn_slowpath_null+0x1d/0x20
[216539.375511]  [<ffffffffc07f01e9>] exception_type+0x49/0x50 [kvm]
[216539.375521]  [<ffffffffc07fa36b>] x86_emulate_instruction+0x3db/0x700 [kvm]
[216539.375525]  [<ffffffffc0951e6e>] ud_interception+0x1e/0x40 [kvm_amd]
[216539.375528]  [<ffffffffc0957a54>] handle_exit+0x224/0xab0 [kvm_amd]
[216539.375538]  [<ffffffffc07f0ccc>] ? kvm_set_cr8+0x1c/0x20 [kvm]
[216539.375540]  [<ffffffffc0952f3a>] ? svm_vcpu_run+0x37a/0x610 [kvm_amd]
[216539.375551]  [<ffffffffc07f671d>] vcpu_enter_guest+0x64d/0x12c0 [kvm]
[216539.375562]  [<ffffffffc07fde58>] kvm_arch_vcpu_ioctl_run+0x358/0x480 [kvm]
[216539.375581]  [<ffffffffc07e3441>] kvm_vcpu_ioctl+0x2b1/0x650 [kvm]
[216539.375584]  [<ffffffffb002fb90>] do_vfs_ioctl+0x350/0x560
[216539.375586]  [<ffffffffafea5b0b>] ? recalc_sigpending+0x1b/0x70
[216539.375588]  [<ffffffffb00d82bf>] ? file_has_perm+0x9f/0xb0
[216539.375591]  [<ffffffffb002fe41>] SyS_ioctl+0xa1/0xc0
[216539.375594]  [<ffffffffb051f7d5>] system_call_fastpath+0x1c/0x21
[216539.375596] ---[ end trace 7758fa556d251182 ]---
[216539.375612] ------------[ cut here ]------------
[216539.375623] WARNING: CPU: 0 PID: 1498 at arch/x86/kvm/x86.c:364 exception_type+0x49/0x50 [kvm]
[216539.375624] Modules linked in: vfat fat amd64_edac_mod edac_mce_amd kvm_amd kvm irqbypass crc32_pclmul ghash_clmulni_intel aesni_intel lrw gf128mul glue_helper ablk_helper cryptd sg joydev pcspkr k10temp fam15h_power i2c_piix4 shpchp pinctrl_amd video i2c_designware_platform i2c_designware_core acpi_cpufreq ip_tables xfs libcrc32c sd_mod crc_t10dif crct10dif_generic amdkfd amd_iommu_v2 amdgpu i2c_algo_bit drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ahci ttm libahci drm libata crct10dif_pclmul crct10dif_common tg3 crc32c_intel ptp pps_core i2c_hid i2c_core dm_mirror dm_region_hash dm_log dm_mod
[216539.375647] CPU: 0 PID: 1498 Comm: memcheck-amd64- Kdump: loaded Tainted: G        W      ------------   3.10.0-862.el7.x86_64 #1
[216539.375649] Hardware name: HPE ProLiant MicroServer Gen10/ProLiant MicroServer Gen10, BIOS 5.12 03/02/2018
[216539.375650] Call Trace:
[216539.375652]  [<ffffffffb050d768>] dump_stack+0x19/0x1b
[216539.375654]  [<ffffffffafe916d8>] __warn+0xd8/0x100
[216539.375657]  [<ffffffffafe9181d>] warn_slowpath_null+0x1d/0x20
[216539.375666]  [<ffffffffc07f01e9>] exception_type+0x49/0x50 [kvm]
[216539.375677]  [<ffffffffc07f6cd8>] vcpu_enter_guest+0xc08/0x12c0 [kvm]
[216539.375688]  [<ffffffffc07fde58>] kvm_arch_vcpu_ioctl_run+0x358/0x480 [kvm]
[216539.375697]  [<ffffffffc07e3441>] kvm_vcpu_ioctl+0x2b1/0x650 [kvm]
[216539.375700]  [<ffffffffb002fb90>] do_vfs_ioctl+0x350/0x560
[216539.375702]  [<ffffffffafea5b0b>] ? recalc_sigpending+0x1b/0x70
[216539.375703]  [<ffffffffb00d82bf>] ? file_has_perm+0x9f/0xb0
[216539.375706]  [<ffffffffb002fe41>] SyS_ioctl+0xa1/0xc0
[216539.375708]  [<ffffffffb051f7d5>] system_call_fastpath+0x1c/0x21
[216539.375710] ---[ end trace 7758fa556d251183 ]---
[216539.375713] SVM: KVM: FAILED VMRUN WITH VMCB:
[216539.375737] SVM: VMCB Control Area:
[216539.375749] SVM: cr_read:            0010
[216539.375763] SVM: cr_write:           0010
[216539.375776] SVM: dr_read:            00ff
[216539.375790] SVM: dr_write:           00ff
[216539.375803] SVM: exceptions:         00060042
[216539.375817] SVM: intercepts:         00002e7fbdc48037
[216539.375834] SVM: pause filter count: 3000
[216539.375847] SVM: iopm_base_pa:       00000003e4a44000
[216539.375876] SVM: msrpm_base_pa:      00000003d8768000
[216539.375893] SVM: tsc_offset:         fffe63858e4b02f2
[216539.375909] SVM: asid:               58
[216539.375922] SVM: tlb_ctl:            0
[216539.375935] SVM: int_ctl:            010f0100
[216539.375949] SVM: int_vector:         00000000
[216539.375963] SVM: int_state:          00000000
[216539.375977] SVM: exit_code:          ffffffff
[216539.375992] SVM: exit_info1:         0000000000000000
[216539.376008] SVM: exit_info2:         0000000000000000
[216539.376025] SVM: exit_int_info:      00000000
[216539.376042] SVM: exit_int_info_err:  00000000
[216539.376057] SVM: nested_ctl:         1
[216539.376071] SVM: nested_cr3:         00000000d78f4000
[216539.376088] SVM: avic_vapic_bar:     0000000000000000
[216539.376106] SVM: event_inj:          800003ff
[216539.376122] SVM: event_inj_err:      00000000
[216539.376138] SVM: virt_ext:           0
[216539.376153] SVM: next_rip:           0000000000000000
[216539.376171] SVM: avic_backing_page:  0000000000000000
[216539.376188] SVM: avic_logical_id:    0000000000000000
[216539.376204] SVM: avic_physical_id:   0000000000000000
[216539.376220] SVM: VMCB State Save Area:
[216539.376233] SVM: es:   s: 0000 a: 0000 l: ffffffff b: 0000000000000000
[216539.376253] SVM: cs:   s: 0010 a: 029b l: ffffffff b: 0000000000000000
[216539.376273] SVM: ss:   s: 0018 a: 0c93 l: ffffffff b: 0000000000000000
[216539.376294] SVM: ds:   s: 0000 a: 0000 l: ffffffff b: 0000000000000000
[216539.376314] SVM: fs:   s: 0000 a: 0000 l: ffffffff b: 00007f76b0c408c0
[216539.376334] SVM: gs:   s: 0000 a: 0000 l: ffffffff b: ffff8a65ffc00000
[216539.376355] SVM: gdtr: s: 0000 a: 0000 l: 0000007f b: ffff8a65ffc0c000
[216539.376376] SVM: ldtr: s: 0000 a: 0000 l: 0000ffff b: 0000000000000000
[216539.376397] SVM: idtr: s: 0000 a: 0000 l: 00000fff b: ffffffffff528000
[216539.376417] SVM: tr:   s: 0040 a: 008b l: 00002087 b: ffff8a65ffc04000
[216539.376437] SVM: cpl:            0                efer:         0000000000001d01
[216539.376816] SVM: cr0:            000000008005003b cr2:          00005617250df000
[216539.377231] SVM: cr3:            0000000035dba000 cr4:          00000000000406f0
[216539.377621] SVM: dr6:            00000000ffff0ff0 dr7:          0000000000000400
[216539.378033] SVM: rip:            ffffffffae267ae7 rflags:       0000000000000046
[216539.378430] SVM: rsp:            ffff8a64f547f958 rax:          0000000000000005
[216539.378821] SVM: star:           0023001000000000 lstar:        ffffffffae91f670
[216539.379235] SVM: cstar:          ffffffffae9237d0 sfmask:       0000000000043700
[216539.379694] SVM: kernel_gs_base: 0000000000000000 sysenter_cs:  0000000000000010
[216539.380126] SVM: sysenter_esp:   0000000000000000 sysenter_eip: 00000000ae923450
[216539.380511] SVM: gpat:           0007050600070106 dbgctl:       0000000000000000
[216539.380901] SVM: br_from:        0000000000000000 br_to:        0000000000000000
[216539.381287] SVM: excp_from:      0000000000000000 excp_to:      0000000000000000
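
Two fields in the VMCB control area above are bit-packed and easier to read once decoded: event_inj and the exceptions intercept bitmap. The sketch below is an illustration, not part of the original report. The function names are hypothetical, and the field layout is an assumption taken from the EVENTINJ description in the AMD64 Architecture Programmer's Manual (bits 7:0 vector, bits 10:8 type, bit 11 error-code valid, bit 31 valid).

```python
# Values copied from the VMCB control area in the dmesg log above.
EVENT_INJ = 0x800003FF
EXCEPTIONS = 0x00060042

# Assumed EVENTINJ type encodings; other values are treated as reserved.
EVENT_TYPES = {
    0: "external interrupt",
    2: "NMI",
    3: "exception",
    4: "software interrupt",
}


def decode_event_inj(value):
    """Split an EVENTINJ value into its assumed bit fields."""
    return {
        "vector": value & 0xFF,
        "type": (value >> 8) & 0x7,
        "error_code_valid": bool(value & (1 << 11)),
        "valid": bool(value & (1 << 31)),
    }


def intercepted_vectors(bitmap):
    """List the exception vectors whose intercept bit is set."""
    return [bit for bit in range(32) if bitmap & (1 << bit)]


inj = decode_event_inj(EVENT_INJ)
print(inj)
print(EVENT_TYPES.get(inj["type"], "reserved"))
print(intercepted_vectors(EXCEPTIONS))
```

Under these assumptions, event_inj decodes to a valid injection of type "exception" with vector 255, which lies outside the architectural exception range of 0 to 31, and the intercept bitmap has bit 6 (#UD) set, which fits the ud_interception frames in the call traces. An out-of-range exception vector would be a plausible reason for VMRUN to fail with exit_code ffffffff, but the report itself does not confirm this.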

Comment 1 Guo, Zhiyi 2018-04-27 05:22:33 UTC
Bugzilla seems to have had a problem when I reported this bug, so I am marking this one as a duplicate of bug 1572447.

*** This bug has been marked as a duplicate of bug 1572447 ***

