Bug 1633150
Summary: | Cross migration from RHEL7.5 to RHEL7.6 fails with cpu flag stibp | |||
---|---|---|---|---|
Product: | Red Hat Enterprise Linux 7 | Reporter: | Fangge Jin <fjin> | |
Component: | qemu-kvm-rhev | Assignee: | Eduardo Habkost <ehabkost> | |
Status: | CLOSED ERRATA | QA Contact: | jingzhao <jinzhao> | |
Severity: | urgent | Docs Contact: | ||
Priority: | urgent | |||
Version: | 7.6 | CC: | chayang, dgilbert, ehabkost, hhuang, jdenemar, jherrman, jinzhao, jiyan, juzhang, kchamart, mrezanin, mtessun, salmy, toneata, virt-maint, xuzhang, zhguo | |
Target Milestone: | rc | Keywords: | Regression, ZStream | |
Target Release: | --- | |||
Hardware: | x86_64 | |||
OS: | Linux | |||
Whiteboard: | ||||
Fixed In Version: | qemu-kvm-rhev-2.12.0-19.el7 | Doc Type: | If docs needed, set a value | |
Doc Text: |
Previously, migrating virtual machines (VMs) from a Red Hat Enterprise Linux 7.5 host failed in some cases when the Single Thread Indirect Branch Predictors (STIBP) flag was set. This update ensures that the flag is consistently added to VMs with an AMD64 or Intel 64 virtual CPU (vCPU), which prevents the described problem from occurring.
|
Story Points: | --- | |
Clone Of: | ||||
: | 1638077 1639446 | Environment: | |
Last Closed: | 2019-08-22 09:19:58 UTC | Type: | Bug | |
Regression: | --- | Mount Type: | --- | |
Documentation: | --- | CRM: | ||
Verified Versions: | Category: | --- | ||
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | ||
Cloudforms Team: | --- | Target Upstream Version: | ||
Embargoed: | ||||
Bug Depends On: | ||||
Bug Blocks: | 1638077, 1651787 |
Description
Fangge Jin
2018-09-26 09:50:37 UTC
Is the feature actually enabled for the running guest or does QEMU complain that it can't be enabled?

You should be able to check this with libvirt by starting a domain with host-model and running "virsh dumpxml $DOMAIN". The XML should contain a CPU model and features which were actually enabled by QEMU.

(In reply to Jiri Denemark from comment #3)
> Is the feature actually enabled for the running guest or does QEMU complain
> that it can't be enabled?
>
> You should be able to check this with libvirt by starting a domain with
> host-model and running "virsh dumpxml $DOMAIN". The XML should contain a CPU
> model and features which were actually enabled by QEMU.

After guest starts:
# virsh dumpxml rhel7-min
<cpu mode='custom' match='exact' check='full'>
  <model fallback='forbid'>IvyBridge-IBRS</model>
  <vendor>Intel</vendor>
  <feature policy='require' name='ss'/>
  <feature policy='require' name='pcid'/>
  <feature policy='require' name='hypervisor'/>
  <feature policy='require' name='arat'/>
  <feature policy='require' name='tsc_adjust'/>
  <feature policy='require' name='stibp'/>
  <feature policy='require' name='ssbd'/>
  <feature policy='require' name='xsaveopt'/>

OK, so it confirms the feature was actually enabled by QEMU. And libvirt is trying to make sure the feature does not disappear once the domain is migrated.

Verified it with qemu-kvm-rhev-2.12.0-20.el7.x86_64

Detailed info:
1. Boot guest with "cpu IvyBridge,ss=on,pcid=on,hypervisor=on,arat=on,tsc_adjust=on,stibp=on,ssbd=on,xsaveopt=on"
2. Migrate from RHEL.7.5 to RHEL.7.6
3. Migrate successfully

Changed to verified according to above test result

Hi Eduardo,
I encountered similar problems in 7.6.z also, could you please have a look at it? thank you!
Description:
Computing the CPU baseline from "virsh capabilities" output fails in RHEL-7.6.z because of "intel_pt", while it works well with "virsh domcapabilities" output.

How reproducible:
100%

Version:
RHEL-7.7 host
# cat /etc/redhat-release
Red Hat Enterprise Linux Server release 7.7 Beta (Maipo)
# rpm -qa libvirt qemu-kvm-rhev kernel
libvirt-4.5.0-22.el7.x86_64
kernel-3.10.0-1053.el7.x86_64
qemu-kvm-rhev-2.12.0-32.el7.x86_64

RHEL-7.6 host
# cat /etc/redhat-release
Red Hat Enterprise Linux Server release 7.6 (Maipo)
# rpm -qa libvirt qemu-kvm-rhev kernel
qemu-kvm-rhev-2.12.0-18.el7_6.7.x86_64
kernel-3.10.0-957.21.2.el7.x86_64
libvirt-4.5.0-10.el7_6.10.x86_64

Steps:
1. In rhel-7.7:
# virsh capabilities > cap1.xml
# virsh domcapabilities > dom1.xml

2. In rhel-7.6:
# virsh domcapabilities > dom2.xml
# virsh capabilities > cap2.xml

3. In rhel-7.7 compute cpu baseline:
# cat cap1.xml cap2.xml >> capall.xml
# cat dom1.xml dom2.xml >> domall.xml

# virsh hypervisor-cpu-baseline capall.xml
<cpu mode='custom' match='exact'>
  <model fallback='forbid'>Skylake-Server-IBRS</model>
  <vendor>Intel</vendor>
  <feature policy='require' name='ds'/>
  <feature policy='require' name='acpi'/>
  <feature policy='require' name='ss'/>
  <feature policy='require' name='ht'/>
  <feature policy='require' name='tm'/>
  <feature policy='require' name='pbe'/>
  <feature policy='require' name='dtes64'/>
  <feature policy='require' name='monitor'/>
  <feature policy='require' name='ds_cpl'/>
  <feature policy='require' name='vmx'/>
  <feature policy='require' name='smx'/>
  <feature policy='require' name='est'/>
  <feature policy='require' name='tm2'/>
  <feature policy='require' name='xtpr'/>
  <feature policy='require' name='pdcm'/>
  <feature policy='require' name='dca'/>
  <feature policy='require' name='osxsave'/>
  <feature policy='require' name='tsc_adjust'/>
  <feature policy='require' name='clflushopt'/>
  <feature policy='require' name='pku'/>
  <feature policy='require' name='ospke'/>
  <feature policy='require' name='md-clear'/>
  <feature policy='require' name='stibp'/>
  <feature policy='require' name='ssbd'/>
  <feature policy='require' name='xsaves'/>
  <feature policy='require' name='invtsc'/>
</cpu>

# virsh hypervisor-cpu-baseline domall.xml
<cpu mode='custom' match='exact'>
  <model fallback='forbid'>Skylake-Server-IBRS</model>
  <vendor>Intel</vendor>
  <feature policy='require' name='ss'/>
  <feature policy='require' name='hypervisor'/>
  <feature policy='require' name='tsc_adjust'/>
  <feature policy='require' name='clflushopt'/>
  <feature policy='require' name='pku'/>
  <feature policy='require' name='md-clear'/>
  <feature policy='require' name='stibp'/>
  <feature policy='require' name='ssbd'/>
  <feature policy='require' name='invtsc'/>
</cpu>

4. In rhel-7.6 compute cpu baseline:
# cat cap1.xml cap2.xml >> capall.xml
# cat dom1.xml dom2.xml >> domall.xml

# virsh hypervisor-cpu-baseline capall.xml
error: internal error: Unknown CPU feature intel-pt

# virsh hypervisor-cpu-baseline domall.xml
<cpu mode='custom' match='exact'>
  <model fallback='forbid'>Skylake-Server-IBRS</model>
  <vendor>Intel</vendor>
  <feature policy='require' name='ss'/>
  <feature policy='require' name='hypervisor'/>
  <feature policy='require' name='tsc_adjust'/>
  <feature policy='require' name='clflushopt'/>
  <feature policy='require' name='pku'/>
  <feature policy='require' name='md-clear'/>
  <feature policy='require' name='stibp'/>
  <feature policy='require' name='ssbd'/>
  <feature policy='require' name='invtsc'/>
</cpu>

Actual result:
As step 4 shows.

Expected result:
Since "virsh hypervisor-cpu-baseline" accepts the output of both "virsh capabilities" and "virsh domcapabilities", the results should be the same.
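The difference between the two step-3 results can be made explicit by comparing the required feature names. The following is a minimal sketch, assuming Python 3 and abbreviated copies of the two outputs (only a few of the reported features are reproduced). `required_features` is a hypothetical helper written for this illustration, not part of libvirt.

```python
import xml.etree.ElementTree as ET


def required_features(cpu_xml):
    """Return the names of all <feature policy='require'> elements in a <cpu> XML."""
    root = ET.fromstring(cpu_xml)
    return {
        feature.get("name")
        for feature in root.iter("feature")
        if feature.get("policy") == "require"
    }


# Abbreviated copy of the baseline computed from capall.xml (assumption: subset only).
CAP_BASELINE = """<cpu mode='custom' match='exact'>
  <model fallback='forbid'>Skylake-Server-IBRS</model>
  <feature policy='require' name='ss'/>
  <feature policy='require' name='vmx'/>
  <feature policy='require' name='stibp'/>
</cpu>"""

# Abbreviated copy of the baseline computed from domall.xml (assumption: subset only).
DOM_BASELINE = """<cpu mode='custom' match='exact'>
  <model fallback='forbid'>Skylake-Server-IBRS</model>
  <feature policy='require' name='ss'/>
  <feature policy='require' name='hypervisor'/>
  <feature policy='require' name='stibp'/>
</cpu>"""

cap = required_features(CAP_BASELINE)
dom = required_features(DOM_BASELINE)
only_in_cap = cap - dom   # features reported only by the capabilities-based baseline
only_in_dom = dom - cap   # features reported only by the domcapabilities-based baseline
common = cap & dom        # features present in both results
```

With the full outputs saved to files, the same helper could be fed the file contents instead of the inline strings.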
Additional info:
RHEL-7.6
# lscpu |grep intel
Flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc art arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf eagerfpu pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid dca sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm 3dnowprefetch epb cat_l3 cdp_l3 intel_ppin intel_pt ssbd mba ibrs ibpb stibp tpr_shadow vnmi flexpriority ept vpid fsgsbase tsc_adjust bmi1 hle avx2 smep bmi2 erms invpcid rtm cqm mpx rdt_a avx512f avx512dq rdseed adx smap clflushopt clwb avx512cd avx512bw avx512vl xsaveopt xsavec xgetbv1 cqm_llc cqm_occup_llc cqm_mbm_total cqm_mbm_local dtherm ida arat pln pts pku ospke md_clear spec_ctrl intel_stibp flush_l1d

RHEL-7.7
# lscpu |grep intel
Flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc art arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf eagerfpu pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid dca sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm 3dnowprefetch epb cat_l3 cdp_l3 intel_ppin intel_pt ssbd mba ibrs ibpb stibp tpr_shadow vnmi flexpriority ept vpid fsgsbase tsc_adjust bmi1 hle avx2 smep bmi2 erms invpcid rtm cqm mpx rdt_a avx512f avx512dq rdseed adx smap clflushopt clwb avx512cd avx512bw avx512vl xsaveopt xsavec xgetbv1 cqm_llc cqm_occup_llc cqm_mbm_total cqm_mbm_local dtherm ida arat pln pts pku ospke md_clear spec_ctrl intel_stibp flush_l1d

(In reply to jiyan from comment #14)

Hi, there are several issues in your test case. The expectation is incorrect.
The hypervisor-cpu-baseline command accepts any host CPU XML (from capabilities or domain capabilities), but that doesn't mean the result should be the same, because the host CPU models are different. And the documentation says domain capabilities are the best source for the best result.

If you're calling baseline on CPU models gathered from hosts which do not have the same versions of packages, you logically need to run the baseline API on the newest host, since the older host(s) may not know some new features reported only by new versions. These features will not be included in the result because they are not supported on all hosts.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2019:2553
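The reasoning in the reply to comment #14 (run the baseline on the newest host, and features missing on any host drop out of the result) can be illustrated with a simplified model. This is a sketch in Python 3 and is not libvirt's actual algorithm: the function `baseline`, the host names, and the small feature sets are all hypothetical and chosen only to mirror the "Unknown CPU feature intel-pt" error seen in step 4.

```python
def baseline(host_features, known_features):
    """Simplified model of a CPU baseline (assumption, not libvirt's implementation).

    host_features:  dict mapping a host name to the set of features it reports.
    known_features: set of feature names understood by the host doing the computation.
    """
    # A host running older packages cannot parse feature names it has never heard of.
    for host, features in host_features.items():
        unknown = features - known_features
        if unknown:
            raise ValueError("Unknown CPU feature " + sorted(unknown)[0])
    # Only features supported on every host survive in the baseline.
    sets = list(host_features.values())
    return set.intersection(*sets) if sets else set()


# Hypothetical, heavily abbreviated feature sets for the two hosts.
HOSTS = {
    "rhel-7.6": {"ss", "stibp", "ssbd"},
    "rhel-7.7": {"ss", "stibp", "ssbd", "intel-pt"},
}
KNOWN_ON_NEW_HOST = {"ss", "stibp", "ssbd", "intel-pt"}
KNOWN_ON_OLD_HOST = {"ss", "stibp", "ssbd"}

# Computed on the newer host: succeeds, and intel-pt is dropped from the result.
result_on_new = baseline(HOSTS, KNOWN_ON_NEW_HOST)

# Computed on the older host: fails before any intersection is attempted.
try:
    baseline(HOSTS, KNOWN_ON_OLD_HOST)
    failed_on_old = False
except ValueError:
    failed_on_old = True
```

In this model the newer host both understands every feature name and still excludes the ones the older host lacks, which is why the computation belongs on the newest host.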