Bug 2184282
Summary: | [AMD] CPU flags don't match between QEMU cmdline and Libvirt xml(dumpxml) on some AMD hosts | ||
---|---|---|---|
Product: | Red Hat Enterprise Linux 9 | Reporter: | liunana <nanliu> |
Component: | qemu-kvm | Assignee: | Bandan Das <bdas> |
qemu-kvm sub component: | CPU Models | QA Contact: | liunana <nanliu> |
Status: | CLOSED MIGRATED | Docs Contact: | |
Severity: | medium | ||
Priority: | medium | CC: | bdas, chayang, coli, jdenemar, jinzhao, juzhang, nilal, virt-maint, yuhuang, zixchen |
Version: | 9.3 | Keywords: | MigratedToJIRA, Triaged |
Target Milestone: | rc | Flags: | pm-rhel: mirror+ |
Target Release: | --- | ||
Hardware: | x86_64 | ||
OS: | Linux | ||
Whiteboard: | |||
Fixed In Version: | Doc Type: | If docs needed, set a value | |
Doc Text: | Story Points: | --- | |
Clone Of: | Environment: | ||
Last Closed: | 2023-09-27 17:13:03 UTC | Type: | --- |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | Category: | --- | |
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: |
Description
liunana
2023-04-04 06:15:48 UTC
I think the first part is a bug in that the value in the XML should be reflected in the QEMU command line. But since all of these are virt-related features(?), they are all meaningless since svm=off. Jiri, any thoughts?

Coming to:

(qemu) qemu-kvm: warning: This feature depends on other features that were not requested: CPUID.8000000AH:EDX.lbrv [bit 1]
qemu-kvm: warning: This feature depends on other features that were not requested: CPUID.8000000AH:EDX.tsc-scale [bit 4]
qemu-kvm: warning: This feature depends on other features that were not requested: CPUID.8000000AH:EDX.vmcb-clean [bit 5]
qemu-kvm: warning: This feature depends on other features that were not requested: CPUID.8000000AH:EDX.pause-filter [bit 10]
qemu-kvm: warning: This feature depends on other features that were not requested: CPUID.8000000AH:EDX.pfthreshold [bit 12]
qemu-kvm: warning: This feature depends on other features that were not requested: CPUID.8000000AH:EDX.v-vmsave-vmload [bit 15]
qemu-kvm: warning: This feature depends on other features that were not requested: CPUID.8000000AH:EDX.vgif [bit 16]

I just picked one random bit from the list above, which is vGIF. This feature allows STGI and CLGI in guest mode, so even if it's available, it's not meaningful unless svm is enabled. I think that's what the warning message above says. I would think the only harm this bug does is print unexpected information.

Well, the problem here is that <cpu mode='host-model'/> does not really work as expected. We probe QEMU for the features that would be enabled with -cpu host and transform that into a CPU model and a list of enabled/disabled features, so that the domain can safely be migrated as long as the destination host is able to provide the same vCPU. The probing is done using CPU model expansion on the "host" model, which reports all these features, including "svm", as enabled. But we know (or rather think) that the EPYC-Milan CPU model already includes svm, so we do not ask for it explicitly on the command line.
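The host-model expansion described here can be illustrated with a minimal Python sketch. It is not libvirt's or QEMU's actual code: the function names, the "name=on" argument format, and the tiny feature sets are all assumptions made up for illustration. Only the seven feature names from the warnings and the EPYC-Milan model name come from this report. The sketch shows why "svm" is left off the command line, and why the features that depend on it end up unsatisfied if the running model does not actually provide "svm".

```python
# Hypothetical illustration only; not libvirt or QEMU code.

# Features from the warnings in this report; per the discussion they depend on "svm".
SVM_DEPENDENT = {
    "lbrv", "tsc-scale", "vmcb-clean", "pause-filter",
    "pfthreshold", "v-vmsave-vmload", "vgif",
}

# Made-up stand-in for the many other features of the model.
BASE = {"abm", "sse4a", "lahf_lm"}

# What expanding "-cpu host" reports (assumed): everything enabled, svm included.
HOST_FEATURES = BASE | {"svm"} | SVM_DEPENDENT
# Static EPYC-Milan definition: includes svm.
STATIC_MODEL = BASE | {"svm"}
# What the model provides when QEMU actually runs (assumed): svm is missing.
RUNTIME_MODEL = BASE


def explicit_features(host, static_model):
    """Features to request explicitly: enabled on the host, absent from the model."""
    return host - static_model


def cpu_argument(model_name, extra):
    """Build a simplified -cpu value from a model name and extra features."""
    return ",".join([model_name] + [f"{name}=on" for name in sorted(extra)])


def unsatisfied(extra, runtime_model):
    """Requested features whose dependency on svm is not met at runtime."""
    enabled = runtime_model | extra
    if "svm" in enabled:
        return set()
    return extra & SVM_DEPENDENT


EXTRA = explicit_features(HOST_FEATURES, STATIC_MODEL)
print(cpu_argument("EPYC-Milan", EXTRA))
print(sorted(unsatisfied(EXTRA, RUNTIME_MODEL)))
# Requesting svm explicitly satisfies the dependency in this toy model.
print(sorted(unsatisfied(EXTRA | {"svm"}, RUNTIME_MODEL)))
```

In this toy model, the diff is computed against the static definition, so "svm" never appears in the generated argument, while all seven dependent features do.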
Even looking at the QEMU definition of the CPU model confirms this. Unfortunately, when running QEMU with -cpu EPYC-Milan, "svm" is not actually enabled because of some runtime magic that disables it despite it being specified in the CPU model definition. And because we explicitly ask for some additional features which are not part of EPYC-Milan and depend on "svm", QEMU logically complains about them and refuses to enable them. This is all correctly reflected in the domain XML once QEMU is running. So this part works OK.

If "svm" is explicitly requested on the command line, everything works as expected and QEMU does not complain about anything. But this is not what happens with host-model.

So the question is why "svm" gets disabled when it's not explicitly requested even though EPYC-Milan is defined as

    .name = "EPYC-Milan",
    .level = 0xd,
    .vendor = CPUID_VENDOR_AMD,
    .family = 25,
    .model = 1,
    .stepping = 1,
    ...
    .features[FEAT_8000_0001_ECX] =
        CPUID_EXT3_OSVW | CPUID_EXT3_3DNOWPREFETCH |
        CPUID_EXT3_MISALIGNSSE | CPUID_EXT3_SSE4A | CPUID_EXT3_ABM |
        CPUID_EXT3_CR8LEG | CPUID_EXT3_SVM | CPUID_EXT3_LAHF_LM |    /* <-- SVM enabled here */
        CPUID_EXT3_TOPOEXT | CPUID_EXT3_PERFCORE,
    ...

Theoretically we might be able to fix this on the libvirt side, but our options are quite limited:

- Ideally we would ask QEMU what features would be enabled for EPYC-Milan at runtime (because just reporting the static definition is not enough in this case), but we would need to do so at the time we probe QEMU capabilities, i.e., when QEMU is started with "-machine none". Knowing the exact definition, we could explicitly request all features enabled for -cpu host but not included in the model definition. AFAIK this is not supported by QEMU.
- We could explicitly ask for all enabled features, even those included in a given CPU model. This would result in a terribly long command line.
- We could selectively request some features explicitly even though they are included in a given CPU model.
  But the question is which features would need such treatment. Just "svm" or more features? And do we need to do so for all CPU models or just some of them?

I don't see a clear winner here, except for the first option, which is not doable without non-trivial (I expect) work in QEMU.

Hi Bandan,

Could you please help to check Comment 4?
Do we have a solution for this bug? Thanks.

Best regards
Nana

(In reply to liunana from comment #5)
> Hi Bandan,
>
> Could you please help to check Comment 4?
> Do we have a solution for this bug? Thanks.
>
> Best regards
> Nana

Nana, unfortunately, I have to keep this in the backlog for now because I am not sure what the right approach for upstream is. I will keep track of it and update if something changes.

Issue migration from Bugzilla to Jira is in process at this time. This will be the last message in Jira copied from the Bugzilla bug.

This BZ has been automatically migrated to the issues.redhat.com Red Hat Issue Tracker. All future work related to this report will be managed there.

Due to differences in account names between systems, some fields were not replicated. Be sure to add yourself to the Jira issue's "Watchers" field to continue receiving updates and add others to the "Need Info From" field to continue requesting information.

To find the migrated issue, look in the "Links" section for a direct link to the new issue location. The issue key will have an icon of 2 footprints next to it, and begin with "RHEL-" followed by an integer. You can also find this issue by visiting https://issues.redhat.com/issues/?jql= and searching the "Bugzilla Bug" field for this BZ's number, e.g. a search like:

"Bugzilla Bug" = 1234567

In the event you have trouble locating or viewing this issue, you can file an issue by sending mail to rh-issues. You can also visit https://access.redhat.com/articles/7032570 for general account information.