Bug 2174605 - [EDK2] disable dynamic mmio window
Summary: [EDK2] disable dynamic mmio window
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 9
Classification: Red Hat
Component: edk2
Version: 9.2
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: rc
Target Release: ---
Assignee: Gerd Hoffmann
QA Contact: Xueqiang Wei
URL:
Whiteboard:
Depends On:
Blocks: 2176920
 
Reported: 2023-03-02 00:03 UTC by Nitesh Narayan Lal
Modified: 2023-05-09 08:02 UTC (History)

Fixed In Version: edk2-20221207gitfff6d81270b5-8.el9_2
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
: 2174749 2176920 (view as bug list)
Environment:
Last Closed: 2023-05-09 07:25:12 UTC
Type: Bug
Target Upstream Version:
Embargoed:




Links
System | ID | Private | Priority | Status | Summary | Last Updated
Gitlab | redhat/centos-stream/src edk2 merge_requests 29 | 0 | None | opened | OvmfPkg: disable dynamic mmio window (rhel only) | 2023-03-02 11:41:56 UTC
Red Hat Issue Tracker | RHELPLAN-150361 | 0 | None | None | None | 2023-03-02 00:05:24 UTC
Red Hat Product Errata | RHSA-2023:2165 | 0 | None | None | None | 2023-05-09 07:25:43 UTC

Description Nitesh Narayan Lal 2023-03-02 00:03:55 UTC
This bug was initially created as a copy of Bug #2171860

I am copying this bug because: 



Description of problem:
VM migration failed with "failed to set MSR 0x202 to 0x380000000000"

Version-Release number of selected component (if applicable):
source and target host:
libvirt-9.0.0-6.el9.x86_64
qemu-kvm-7.2.0-9.el9.x86_64
kernel-5.14.0-268.el9.x86_64

How reproducible:
100%

Steps to Reproduce:
1. Prepare a source host with CPU model "Cascadelake-Server-noTSX" and a target host with CPU model "Skylake-Client-noTSX-IBRS".
Mount the same NFS share on both hosts at /var/lib/libvirt/migrate/.
On the source host, run:
# virsh domcapabilities > /var/lib/libvirt/migrate/cpu.xml
On the target host, run:
# virsh domcapabilities >> /var/lib/libvirt/migrate/cpu.xml

On the source host, generate the baseline CPU with:
# virsh hypervisor-cpu-baseline /var/lib/libvirt/migrate/cpu.xml --migratable
<cpu mode='custom' match='exact'>
  <model fallback='forbid'>Skylake-Client-IBRS</model>
  <vendor>Intel</vendor>
  <feature policy='require' name='ss'/>
  <feature policy='require' name='vmx'/>
  <feature policy='require' name='pdcm'/>
  <feature policy='require' name='hypervisor'/>
  <feature policy='require' name='tsc_adjust'/>
  <feature policy='require' name='clflushopt'/>
  <feature policy='require' name='umip'/>
  <feature policy='require' name='md-clear'/>
  <feature policy='require' name='stibp'/>
  <feature policy='require' name='arch-capabilities'/>
  <feature policy='require' name='ssbd'/>
  <feature policy='require' name='xsaves'/>
  <feature policy='require' name='pdpe1gb'/>
  <feature policy='require' name='ibpb'/>
  <feature policy='require' name='ibrs'/>
  <feature policy='require' name='amd-stibp'/>
  <feature policy='require' name='amd-ssbd'/>
  <feature policy='require' name='skip-l1dfl-vmentry'/>
  <feature policy='require' name='pschange-mc-no'/>
  <feature policy='disable' name='hle'/>
  <feature policy='disable' name='rtm'/>
</cpu>

2. Start a VM on the source host with the CPU configuration above, and try to migrate it to the target host:
# virsh migrate rhel --live --verbose qemu+ssh://{$target_host}/system --p2p --persistent --undefinesource
Migration: [100 %]error: operation failed: job 'migration out' unexpectedly failed

Check the libvirtd log on the target host:
2023-02-18 10:15:47.792+0000: 7216: error : qemuProcessReportLogError:1971 : internal error: qemu unexpectedly closed the monitor: 2023-02-18T10:15:47.735537Z qemu-kvm: warning: TSC frequency mismatch between VM (2194843 kHz) and host (2903990 kHz), and TSC scaling unavailable
2023-02-18T10:15:47.735651Z qemu-kvm: error: failed to set MSR 0x202 to 0x380000000000
qemu-kvm: ../target/i386/kvm/kvm.c:3177: int kvm_buf_set_msrs(X86CPU *): Assertion `ret == cpu->kvm_msr_buf->nmsrs' failed.

Actual results:
VM migration fails when the VM uses the baseline CPU.

Expected results:
VM migration should succeed

Additional info:
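As a reading aid for the error above, here is a minimal sketch that decodes the failing MSR write. It assumes the architectural x86 layout in which the variable-range MTRRs start at MSR 0x200 and alternate between base and mask registers; the helper names are hypothetical and this is not code from QEMU or KVM.

```python
# Sketch only: interpret "failed to set MSR 0x202 to 0x380000000000".
# Assumption: variable-range MTRRs begin at MSR 0x200 and alternate
# PHYSBASEn (even offset) and PHYSMASKn (odd offset).
MTRR_PHYSBASE0 = 0x200


def mtrr_name(msr_index: int) -> str:
    """Name the variable-range MTRR register behind an MSR index."""
    offset = msr_index - MTRR_PHYSBASE0
    kind = "PHYSBASE" if offset % 2 == 0 else "PHYSMASK"
    return f"IA32_MTRR_{kind}{offset // 2}"


def phys_bits_needed(msr_value: int) -> int:
    """Number of physical address bits needed to represent the base."""
    # Bits 0-11 carry the memory type and flags; the rest is the address.
    return (msr_value & ~0xFFF).bit_length()


print(mtrr_name(0x202), phys_bits_needed(0x380000000000))
```

Under these assumptions the write targets the second variable MTRR base register and carries an address that needs 46 physical address bits. A destination CPU that implements fewer bits would see reserved bits set, which would be consistent with the rejected write, although the log alone does not prove it.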

Comment 1 Gerd Hoffmann 2023-03-02 11:22:51 UTC
Problem: recent OVMF builds started using the full available physical address space.
See https://bugzilla.redhat.com//show_bug.cgi?id=2084533
and https://issues.redhat.com/browse/RHEL-60

libvirt host capabilities (for live migration compatibility)
do not include the physical address space size though, so this
causes problems in heterogeneous clusters.

PLAN: disable for 9.2, enable again for 9.3 (and eventually 9.2.z),
after libvirt has been fixed.
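Until that information is part of the libvirt data, hosts in a mixed cluster can be compared by hand. The sketch below assumes the x86 Linux `address sizes` line in /proc/cpuinfo; the function names and the two sample lines (46 and 39 bits) are hypothetical illustrations, not values taken from the hosts in this bug.

```python
import re


def physical_address_bits(cpuinfo_text: str) -> int:
    """Extract the physical address width from /proc/cpuinfo text."""
    match = re.search(
        r"^address sizes\s*:\s*(\d+) bits physical",
        cpuinfo_text,
        re.MULTILINE,
    )
    if match is None:
        raise ValueError("no 'address sizes' line found")
    return int(match.group(1))


def smallest_physical_bits(cpuinfo_by_host: dict) -> int:
    """Width that every host in the cluster can handle."""
    return min(physical_address_bits(text) for text in cpuinfo_by_host.values())


# Hypothetical sample data; on a real host pass open("/proc/cpuinfo").read().
samples = {
    "source": "address sizes\t: 46 bits physical, 48 bits virtual\n",
    "target": "address sizes\t: 39 bits physical, 48 bits virtual\n",
}
print(smallest_physical_bits(samples))
```

A guest whose firmware placed MMIO windows above the smallest width reported here would not be expected to migrate cleanly to the narrower host.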

Comment 2 Gerd Hoffmann 2023-03-02 11:27:50 UTC
test build: https://kojihub.stream.rdu2.redhat.com/koji/taskinfo?taskID=2040397

Comment 6 Li Xiaohui 2023-03-03 10:58:44 UTC
Hi Yalan, could you please try again with the scratch build in Comment 5?

Comment 7 Li Xiaohui 2023-03-06 03:36:22 UTC
Hi Xueqiang,

Reproduced the bug below with qemu-kvm-7.2.0-10.el9.x86_64 and edk2-ovmf-20221207gitfff6d81270b5-7.el9.noarch when migrating a VM from a Xeon(R) Silver 4110 host to a Xeon(R) CPU E3-1240 v5 host; the destination QEMU dumps core when migration finishes:
(qemu) 2023-03-06T03:21:58.968405Z qemu-kvm: warning: TSC frequency mismatch between VM (2095072 kHz) and host (3503988 kHz), and TSC scaling unavailable
2023-03-06T03:21:58.968512Z qemu-kvm: error: failed to set MSR 0x202 to 0xe000000000
qemu-kvm: ../target/i386/kvm/kvm.c:3177: int kvm_buf_set_msrs(X86CPU *): Assertion `ret == cpu->kvm_msr_buf->nmsrs' failed.

Bug 2171860 - migration: larger->E3: vm failed with "failed to set MSR 0x202 to 0x380000000000"


After upgrading only edk2 to edk2-ovmf-20221207gitfff6d81270b5-7.el9.bz2174605.20230302.1201.noarch, with the rest of the environment and the QEMU command line unchanged, migration succeeds.

Note: the CPU option used was -cpu Skylake-Client-v4



So the scratch build should fix Bug 2171860

Comment 12 Xueqiang Wei 2023-03-10 09:01:15 UTC
(In reply to Li Xiaohui from comment #7)
Thank you Xiaohui. Could you please double check with the final build edk2-20221207gitfff6d81270b5-8.el9_2? I will do the regression test. Many thanks.

Comment 13 Li Xiaohui 2023-03-10 11:27:27 UTC
Retested this bug per Comment 7 on edk2-20221207gitfff6d81270b5-8.el9_2, which has the fix: migration succeeds, and QEMU works well on both the source and destination hosts.


The only thing observed is some "TSC unstable" messages in the guest's dmesg after migration:

[   75.412616] clocksource: timekeeping watchdog on CPU3: Marking clocksource 'tsc' as unstable because the skew is too large:
[   75.413853] clocksource:                       'kvm-clock' wd_nsec: 504016065 wd_now: 15f0d2b67c wd_last: 15d2c809bb mask: ffffffffffffffff
[   75.415213] clocksource:                       'tsc' cs_nsec: 842961982 cs_now: 2e8f76fb7e cs_last: 2e2632f39c mask: ffffffffffffffff
[   75.416490] clocksource:                       'kvm-clock' (not 'tsc') is current clocksource.
[   75.417405] tsc: Marking TSC unstable due to clocksource watchdog
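
A quick arithmetic check on the watchdog numbers above, as a sketch: it compares the tsc/kvm-clock interval ratio from this dmesg output with the host/VM TSC frequency ratio from the warning in Comment 7. It assumes this retest ran on the same pair of hosts as Comment 7; the variable names are illustrative.

```python
# Intervals reported by the clocksource watchdog in the guest dmesg above.
wd_nsec = 504016065   # elapsed time according to kvm-clock
cs_nsec = 842961982   # elapsed time according to tsc

# TSC frequencies from the QEMU warning in Comment 7 (assumed same hosts).
vm_tsc_khz = 2095072    # frequency the guest was started with
host_tsc_khz = 3503988  # frequency of the destination host

interval_ratio = cs_nsec / wd_nsec
frequency_ratio = host_tsc_khz / vm_tsc_khz

print(f"interval ratio:  {interval_ratio:.4f}")
print(f"frequency ratio: {frequency_ratio:.4f}")
```

The two ratios agree closely (both about 1.67), which suggests the skew comes from the TSC frequency mismatch without TSC scaling rather than from the edk2 change. That reading is an inference from the numbers, not something confirmed in this report.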

Comment 14 Xueqiang Wei 2023-03-10 17:20:57 UTC
QE bot(pre verify): Set 'Verified:Tested,SanityOnly' as gating/tier1 test pass.

Comment 17 yalzhang@redhat.com 2023-03-10 21:00:57 UTC
The migration succeeds with edk2-ovmf-20221207gitfff6d81270b5-8.el9_2.noarch. The issue is fixed.

Comment 18 Xueqiang Wei 2023-03-12 09:10:00 UTC
Ran regression tests; no new bugs were found.

Versions:
kernel-5.14.0-284.el9.x86_64
qemu-kvm-7.2.0-11.el9_2
edk2-ovmf-20221207gitfff6d81270b5-8.el9_2.noarch


1. Ran the QEMU gating test; it passed.
Job link: http://virtqetools.lab.eng.pek2.redhat.com/kvm_autotest_job_log/?jobid=7613910

2. Ran the edk2 test loop; it passed.
Job link: http://virtqetools.lab.eng.pek2.redhat.com/kvm_autotest_job_log/?jobid=7615585

Comment 19 Xueqiang Wei 2023-03-12 09:16:30 UTC
Thank you, Xiaohui and Yalan. Based on Comment 13, Comment 17, and Comment 18, setting the status to VERIFIED. Thanks, all.

Comment 23 errata-xmlrpc 2023-05-09 07:25:12 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Important: edk2 security, bug fix, and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2023:2165

