Bug 1760607
Summary: | Corrupted EAX values due to missing brackets at CPUID[0x80000008] code | ||
---|---|---|---|
Product: | Red Hat Enterprise Linux 7 | Reporter: | Eduardo Habkost <ehabkost> |
Component: | qemu-kvm | Assignee: | Eduardo Habkost <ehabkost> |
Status: | CLOSED ERRATA | QA Contact: | Yumei Huang <yuhuang> |
Severity: | unspecified | Docs Contact: | |
Priority: | unspecified | ||
Version: | 7.0 | CC: | andbartl, chayang, jinzhao, juzhang, knoel, virt-maint, yuhuang |
Target Milestone: | rc | ||
Target Release: | --- | ||
Hardware: | Unspecified | ||
OS: | Unspecified | ||
Whiteboard: | |||
Fixed In Version: | qemu-kvm-1.5.3-171.el7 | Doc Type: | If docs needed, set a value |
Doc Text: | Story Points: | --- | |
Clone Of: | Environment: | ||
Last Closed: | 2020-03-31 20:02:00 UTC | Type: | --- |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | Category: | --- | |
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: |
Description
Eduardo Habkost
2019-10-10 22:12:41 UTC
Hi Eduardo,

Is there a reproducer for the issue? If no, QE will do patch review to verify.

(In reply to Yumei Huang from comment #3)
> Hi Eduardo,
>
> Is there a reproducer for the issue? If no, QE will do patch review to
> verify.

It is possible to reproduce the bug only if the host xlevel is less than 0x80000008. In this case, the physical address size shown in the guest /proc/cpuinfo may not match the host /proc/cpuinfo.

If the physical address size in the guest already matches the host, this means the bug is not reproducible on that host.

(In reply to Eduardo Habkost from comment #4)
> (In reply to Yumei Huang from comment #3)
> > Hi Eduardo,
> >
> > Is there a reproducer for the issue? If no, QE will do patch review to
> > verify.
>
> It is possible to reproduce the bug only if the host xlevel is less than
> 0x80000008.

Does that mean the initial EAX value should be always less than 0x80000008 in cpuid output? Do you happen to know what kind of host meet the requirement ?

> In this case, the physical address size shown in the guest
> /proc/cpuinfo may not match the host /proc/cpuinfo.
>
> If the physical address size in the guest already matches the host, this
> means the bug is not reproducible on that host.

(In reply to Yumei Huang from comment #5)
> (In reply to Eduardo Habkost from comment #4)
> > (In reply to Yumei Huang from comment #3)
> > > Hi Eduardo,
> > >
> > > Is there a reproducer for the issue? If no, QE will do patch review to
> > > verify.
> >
> > It is possible to reproduce the bug only if the host xlevel is less than
> > 0x80000008.
>
> Does that mean the initial EAX value should be always less than 0x80000008
> in cpuid output? Do you happen to know what kind of host meet the
> requirement ?

I thought xlevel would appear on /proc/cpuinfo, but it doesn't. xlevel can be checked manually using `cpuid -1 -r -l 0x80000000` on the command line.
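The xlevel condition discussed above can be written down as a small check. This is only a sketch: the function names are hypothetical (they are not taken from QEMU or the kernel), and the caller is assumed to have already read EAX of CPUID leaf 0x80000000, for example from the `cpuid` command output, and to pass it in as an integer.

```python
# Sketch only: names are hypothetical, not taken from QEMU or the kernel.
# xlevel is EAX of CPUID leaf 0x80000000, i.e. the highest extended leaf
# the CPU implements.

ADDRESS_SIZE_LEAF = 0x80000008  # leaf that reports physical/virtual address sizes


def has_address_size_leaf(xlevel: int) -> bool:
    """Return True if the CPU implements CPUID leaf 0x80000008."""
    return xlevel >= ADDRESS_SIZE_LEAF


def bug_can_reproduce(host_xlevel: int) -> bool:
    """Per the comment above, only hosts with xlevel below 0x80000008 can show the bug."""
    return not has_address_size_leaf(host_xlevel)


if __name__ == "__main__":
    # Example xlevel values chosen for illustration only.
    for xlevel in (0x80000007, 0x80000008, 0x8000001F):
        print(f"xlevel={xlevel:#010x} reproducible={bug_can_reproduce(xlevel)}")
```

On a host where this returns False, comparing the guest and host physical address sizes can only confirm that the normal path still works, not that the bug is gone.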
However, I couldn't find any host with xlevel less than 0x80000008, so maybe this is only reproducible in theory but not in practice.

It should be possible to create a virtual machine with xlevel=0x80000007 using nested virt, run QEMU inside the VM, and check the physical address size inside the L2 guest. But I don't think this is worth the effort for a trivial fix. Checking if host physical address size is still being exposed correctly to the guest under normal circumstances is enough to validate this BZ.

(In reply to Eduardo Habkost from comment #6)
> I thought xlevel would appear on /proc/cpuinfo, but it doesn't. xlevel can
> be checked manually using `cpuid -1 -r -l 0x80000000` on the command line.
>
> However, I couldn't find any host with xlevel less than 0x80000008, so maybe
> this is only reproducible in theory but not in practice.
>
> It should be possible to create a virtual machine with xlevel=0x80000007
> using nested virt, run QEMU inside the VM, and check the physical address
> size inside the L2 guest. But I don't think this is worth the effort for a
> trivial fix. Checking if host physical address size is still being exposed
> correctly to the guest under normal circumstances is enough to validate this
> BZ.

I tested on both EPYC host and Cascadelake-Server host, compared physical address size between guest and host, and got different results. Would you please help check if it is expected, thanks!

qemu-kvm-1.5.3-171.el7
guest kernel: 3.10.0-1105.el7.x86_64
host kernel:

1) On EPYC host, guest physical address size is different from host.

QEMU cli: -cpu EPYC

Host address size:
# cat /proc/cpuinfo | grep address
address sizes : 43 bits physical, 48 bits virtual

Guest address size:
# cat /proc/cpuinfo | grep address
address sizes : 48 bits physical, 48 bits virtual

2) On Cascadelake host, guest physical address size is same to host.
QEMU cli: -cpu Cascadelake-Server

Host address size:
# cat /proc/cpuinfo | grep address
address sizes : 46 bits physical, 48 bits virtual

Guest address size:
# cat /proc/cpuinfo | grep address
address sizes : 46 bits physical, 48 bits virtual

Missed host kernel in comment 7.

Host kernel: 3.10.0-1107.el7.x86_64

(In reply to Yumei Huang from comment #7)
> 1) On EPYC host, guest physical address size is different from host.
>
> QEMU cli: -cpu EPYC
>
> Host address size:
> # cat /proc/cpuinfo | grep address
> address sizes : 43 bits physical, 48 bits virtual
>
> Guest address size:
> # cat /proc/cpuinfo | grep address
> address sizes : 48 bits physical, 48 bits virtual

This is not expected. Please send more detailed host info, including hostname in our lab and full 'x86info -r' output.

(In reply to Eduardo Habkost from comment #9)
> (In reply to Yumei Huang from comment #7)
> > 1) On EPYC host, guest physical address size is different from host.
> >
> > QEMU cli: -cpu EPYC
> >
> > Host address size:
> > # cat /proc/cpuinfo | grep address
> > address sizes : 43 bits physical, 48 bits virtual
> >
> > Guest address size:
> > # cat /proc/cpuinfo | grep address
> > address sizes : 48 bits physical, 48 bits virtual
>
> This is not expected. Please send more detailed host info, including
> hostname in our lab and full 'x86info -r' output.

EPYC host info:
hp-dl385g10-02.lab.eng.pek2.redhat.com

# x86info -r
x86info v1.30. Dave Jones 2001-2011
Feedback to <davej>.
Unknown CPU family: 0x17
Unknown CPU family: 0x17
Unknown CPU family: 0x17
Unknown CPU family: 0x17
Unknown CPU family: 0x17
Unknown CPU family: 0x17
Unknown CPU family: 0x17
Unknown CPU family: 0x17
Unknown CPU family: 0x17
Unknown CPU family: 0x17
Unknown CPU family: 0x17
Unknown CPU family: 0x17
Unknown CPU family: 0x17
Unknown CPU family: 0x17
Unknown CPU family: 0x17
Unknown CPU family: 0x17
Unknown CPU family: 0x17
Unknown CPU family: 0x17
Unknown CPU family: 0x17
Unknown CPU family: 0x17
Unknown CPU family: 0x17
Unknown CPU family: 0x17
Unknown CPU family: 0x17
Unknown CPU family: 0x17
Unknown CPU family: 0x17
Unknown CPU family: 0x17
Unknown CPU family: 0x17
Unknown CPU family: 0x17
Unknown CPU family: 0x17
Unknown CPU family: 0x17
Unknown CPU family: 0x17
Unknown CPU family: 0x17
Found 32 identical CPUs
Extended Family: 8 Extended Model: 0 Family: 15 Model: 1 Stepping: 2
CPU Model (x86info's best guess):
Processor name string (BIOS programmed): AMD EPYC 7251 8-Core Processor

Monitor/Mwait: min/max line size 64/64, ecx bit 0 support, enumeration extension
SVM: revision 1, 32768 ASIDs
Address Size: 48 bits virtual, 48 bits physical
The physical package has 8 of 2 possible cores implemented.
eax in: 0x00000000, eax = 0000000d ebx = 68747541 ecx = 444d4163 edx = 69746e65
eax in: 0x00000001, eax = 00800f12 ebx = 00100800 ecx = 7ed8320b edx = 178bfbff
eax in: 0x00000002, eax = 00000000 ebx = 00000000 ecx = 00000000 edx = 00000000
eax in: 0x00000003, eax = 00000000 ebx = 00000000 ecx = 00000000 edx = 00000000
eax in: 0x00000004, eax = 00000000 ebx = 00000000 ecx = 00000000 edx = 00000000
eax in: 0x00000005, eax = 00000040 ebx = 00000040 ecx = 00000003 edx = 00000011
eax in: 0x00000006, eax = 00000004 ebx = 00000000 ecx = 00000001 edx = 00000000
eax in: 0x00000007, eax = 00000000 ebx = 209c01a9 ecx = 00000000 edx = 00000000
eax in: 0x00000008, eax = 00000000 ebx = 00000000 ecx = 00000000 edx = 00000000
eax in: 0x00000009, eax = 00000000 ebx = 00000000 ecx = 00000000 edx = 00000000
eax in: 0x0000000a, eax = 00000000 ebx = 00000000 ecx = 00000000 edx = 00000000
eax in: 0x0000000b, eax = 00000000 ebx = 00000000 ecx = 00000000 edx = 00000000
eax in: 0x0000000c, eax = 00000000 ebx = 00000000 ecx = 00000000 edx = 00000000
eax in: 0x0000000d, eax = 00000007 ebx = 00000340 ecx = 00000340 edx = 00000000
eax in: 0x80000000, eax = 8000001f ebx = 68747541 ecx = 444d4163 edx = 69746e65
eax in: 0x80000001, eax = 00800f12 ebx = 40000000 ecx = 35c233ff edx = 2fd3fbff
eax in: 0x80000002, eax = 20444d41 ebx = 43595045 ecx = 35323720 edx = 2d382031
eax in: 0x80000003, eax = 65726f43 ebx = 6f725020 ecx = 73736563 edx = 2020726f
eax in: 0x80000004, eax = 20202020 ebx = 20202020 ecx = 20202020 edx = 00202020
eax in: 0x80000005, eax = ff40ff40 ebx = ff40ff40 ecx = 20080140 edx = 40040140
eax in: 0x80000006, eax = 36006400 ebx = 56006400 ecx = 02006140 edx = 0100c140
eax in: 0x80000007, eax = 00000000 ebx = 0000001b ecx = 00000000 edx = 00006799
eax in: 0x80000008, eax = 00003030 ebx = 00001007 ecx = 0000600f edx = 00000000
eax in: 0x80000009, eax = 00000000 ebx = 00000000 ecx = 00000000 edx = 00000000
eax in: 0x8000000a, eax = 00000001 ebx = 00008000 ecx = 00000000 edx = 0001bcff
eax in: 0x8000000b, eax = 00000000 ebx = 00000000 ecx = 00000000 edx = 00000000
eax in: 0x8000000c, eax = 00000000 ebx = 00000000 ecx = 00000000 edx = 00000000
eax in: 0x8000000d, eax = 00000000 ebx = 00000000 ecx = 00000000 edx = 00000000
eax in: 0x8000000e, eax = 00000000 ebx = 00000000 ecx = 00000000 edx = 00000000
eax in: 0x8000000f, eax = 00000000 ebx = 00000000 ecx = 00000000 edx = 00000000
eax in: 0x80000010, eax = 00000000 ebx = 00000000 ecx = 00000000 edx = 00000000
eax in: 0x80000011, eax = 00000000 ebx = 00000000 ecx = 00000000 edx = 00000000
eax in: 0x80000012, eax = 00000000 ebx = 00000000 ecx = 00000000 edx = 00000000
eax in: 0x80000013, eax = 00000000 ebx = 00000000 ecx = 00000000 edx = 00000000
eax in: 0x80000014, eax = 00000000 ebx = 00000000 ecx = 00000000 edx = 00000000
eax in: 0x80000015, eax = 00000000 ebx = 00000000 ecx = 00000000 edx = 00000000
eax in: 0x80000016, eax = 00000000 ebx = 00000000 ecx = 00000000 edx = 00000000
eax in: 0x80000017, eax = 00000000 ebx = 00000000 ecx = 00000000 edx = 00000000
eax in: 0x80000018, eax = 00000000 ebx = 00000000 ecx = 00000000 edx = 00000000
eax in: 0x80000019, eax = f040f040 ebx = 00000000 ecx = 00000000 edx = 00000000
eax in: 0x8000001a, eax = 00000003 ebx = 00000000 ecx = 00000000 edx = 00000000
eax in: 0x8000001b, eax = 000003ff ebx = 00000000 ecx = 00000000 edx = 00000000
eax in: 0x8000001c, eax = 00000000 ebx = 00000000 ecx = 00000000 edx = 00000000
eax in: 0x8000001d, eax = 00004121 ebx = 01c0003f ecx = 0000003f edx = 00000000
eax in: 0x8000001e, eax = 00000000 ebx = 00000100 ecx = 00000300 edx = 00000000
eax in: 0x8000001f, eax = 0000000f ebx = 0000016f ecx = 0000000f edx = 00000008

running at an estimated 2.10GHz

(In reply to Yumei Huang from comment #10)
> eax in: 0x8000001f, eax = 0000000f ebx = 0000016f ecx = 0000000f edx =
> 00000008

The 5 bit reduction in /proc/cpuinfo is caused by CPUID[0x8000001f].EBX[11:6]. See the X86_FEATURE_SME checks at arch/x86/kernel/cpu/amd.c:early_init_amd().
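The arithmetic behind the 43 versus 48 bit difference can be checked from the raw register values in the dump above. The following is a sketch with hypothetical function names; it assumes the host kernel subtracts the EBX[11:6] field from the raw physical width when SME is active, which is how the comment above describes it.

```python
# Sketch only: function names are hypothetical. The register values are the
# ones shown in the x86info dump above.


def address_sizes(leaf_80000008_eax: int) -> tuple[int, int]:
    """Split CPUID[0x80000008].EAX into (physical_bits, virtual_bits)."""
    physical_bits = leaf_80000008_eax & 0xFF        # EAX[7:0]
    virtual_bits = (leaf_80000008_eax >> 8) & 0xFF  # EAX[15:8]
    return physical_bits, virtual_bits


def phys_addr_reduction(leaf_8000001f_ebx: int) -> int:
    """Extract CPUID[0x8000001f].EBX[11:6], the physical address bit reduction."""
    return (leaf_8000001f_ebx >> 6) & 0x3F


physical_bits, virtual_bits = address_sizes(0x00003030)
reduction = phys_addr_reduction(0x0000016F)

# Assumption: the host kernel reports the raw width minus the reduction.
host_reported_bits = physical_bits - reduction

print(physical_bits, virtual_bits, reduction, host_reported_bits)  # 48 48 5 43
```

The raw leaf value (48 bits) matches what the guest shows, and the reduced value (43 bits) matches the host /proc/cpuinfo.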
CPUID[0x8000001f].EBX[11:6] is documented as affecting only host physical addresses, not guest physical addresses.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2020:1116