Bug 2219309

Summary: [Hyper-V][RHEL9.3] Nested Hyper-V on KVM: L1 Windows VM fails to boot up if host uses intel alderlake CPU
Product: Red Hat Enterprise Linux 9
Reporter: xuli <xuli>
Component: qemu-kvm
Assignee: Vitaly Kuznetsov <vkuznets>
qemu-kvm sub component: CPU Models
QA Contact: xuli <xuli>
Status: CLOSED MIGRATED
Docs Contact:
Severity: unspecified
Priority: unspecified
CC: andavis, bdas, cavery, litian, mlevitsk, nilal, virt-maint, vkuznets, xuli, xxiong, yacao, yuxisun
Version: 9.3
Keywords: MigratedToJIRA
Target Milestone: rc
Target Release: ---
Flags: pm-rhel: mirror+
Hardware: Unspecified
OS: Linux
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2023-09-22 13:42:20 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description xuli 2023-07-03 08:18:21 UTC
Description of problem:

A Windows 11 L1 guest cannot boot after the Hyper-V role is enabled when the host uses an Alder Lake CPU. This reproduces both with copies of existing Windows 11 images and with a freshly installed image. A 2022-bios.qcow2 image (a Hyper-V 2022 L1 VM) that boots fine on an Intel(R) Xeon(R) CPU E5-2650 host was also copied to the Alder Lake host, where it likewise fails to boot.

Refer to upstream bug 
https://bugzilla.kernel.org/show_bug.cgi?id=217307 -  windows guest entering boot loop when nested virtualization enabled and hyperv installed


Version-Release number of selected component (if applicable):

L0:  Intel(R) Corporation, 12th Gen Intel(R) Core(TM) i7-12700E

    RHEL9.3 with kernel 5.14.0-331.el9.x86_64
    qemu-kvm-8.0.0-5.el9.x86_64
    edk2-ovmf-20230301gitf80f052277c8-5.el9.noarch

L1: Windows 11 on UEFI firmware with the Hyper-V role enabled

How reproducible:
100%

Steps to Reproduce:

1. Make sure L0 is running with nested virtualization enabled:

# cat /sys/module/kvm_intel/parameters/nested
Y
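If the parameter reads N instead, nested support has to be turned on before the Hyper-V role can work in L1. A minimal sketch of persisting the module option (the /tmp path below is illustrative only; on a real host the file belongs in /etc/modprobe.d/ and kvm_intel must be reloaded with no VMs running):

```shell
# Persist nested=1 for kvm_intel (illustrative path; normally
# /etc/modprobe.d/kvm-nested.conf)
conf=/tmp/kvm-nested.conf
echo "options kvm_intel nested=1" > "$conf"

# The module would then be reloaded to pick up the option:
#   modprobe -r kvm_intel && modprobe kvm_intel
grep -q "nested=1" "$conf" && echo "nested option written"
```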

2. Start the L1 Windows 11 VM on UEFI firmware with the Hyper-V role enabled, using qemu-kvm, e.g.:

mkdir /tmp/mytpm
/usr/bin/swtpm_setup --tpm2 --tpmstate /tmp/mytpm --create-ek-cert --create-platform-cert --overwrite --lock-nvram
/usr/bin/swtpm socket --daemon --ctrl type=unixio,path=/tmp/guest-swtpm.sock,mode=0600 --tpmstate dir=/tmp/mytpm,mode=0600 --tpm2



/usr/libexec/qemu-kvm -name w11-uefi -m 8G -smp 8 \
-rtc base=localtime,driftfix=none  -boot order=cd,menu=on -monitor stdio -M q35,smm=on,accel=kvm -vga std -vnc :8 \
-global driver=cfi.pflash01,property=secure,value=on \
-drive file=/usr/share/OVMF/OVMF_CODE.secboot.fd,if=pflash,format=raw,readonly=on,unit=0 \
-drive file=/home/test/OVMF_VARSw11.fd,if=pflash,format=raw,unit=1 \
-netdev tap,script=/etc/qemu-ifup,downscript=no,id=hostnet0,vhost=on,queues=4 \
-device pcie-root-port,port=0x10,chassis=1,id=pci.1,bus=pcie.0,multifunction=on,addr=0x3 \
-device pcie-root-port,port=0x11,chassis=2,id=pci.2,bus=pcie.0,addr=0x3.0x1  \
-device virtio-net-pci,netdev=hostnet0,id=net0,mac=00:52:11:36:1f:61,bus=pci.2,mq=on,vectors=10 \
-blockdev \
driver=file,cache.direct=off,cache.no-flush=on,filename=/home/images/w11-uefi.qcow2,node-name=system_file \
-blockdev driver=qcow2,node-name=drive_system_disk,file=system_file -object iothread,id=thread0 \
-device virtio-blk-pci,iothread=thread0,drive=drive_system_disk,id=system_disk,bootindex=0,bus=pci.1 \
-usb -device usb-tablet \
-cpu \
host,hv_relaxed,hv_vapic,hv_spinlocks=0x1fff,hv_vpindex,hv_runtime,hv_crash,hv_time,hv_synic,hv_stimer,hv_tlbflush,hv_ipi,hv_reset,hv_frequencies,hv_reenlightenment,hv_stimer_direct,hv_evmcs,hv_emsr_bitmap \
-enable-kvm \
-tpmdev emulator,id=tpm-tpm0,chardev=chrtpm \
	-chardev socket,id=chrtpm,path=/tmp/guest-swtpm.sock \
	-device tpm-crb,tpmdev=tpm-tpm0,id=tpm0

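Since the upstream report points at nested Hyper-V behavior, one way to narrow down the trigger is to retry the command above with the hv_evmcs and hv_emsr_bitmap enlightenments removed from the -cpu string (a diagnostic sketch, not a confirmed workaround):

```shell
# Strip hv_evmcs/hv_emsr_bitmap from the enlightenment list used above, to
# test whether Enlightened VMCS is involved (an assumption to verify)
CPU='host,hv_relaxed,hv_vapic,hv_spinlocks=0x1fff,hv_vpindex,hv_runtime,hv_crash,hv_time,hv_synic,hv_stimer,hv_tlbflush,hv_ipi,hv_reset,hv_frequencies,hv_reenlightenment,hv_stimer_direct,hv_evmcs,hv_emsr_bitmap'
REDUCED=$(printf '%s' "$CPU" | sed -e 's/,hv_evmcs//' -e 's/,hv_emsr_bitmap//')
echo "$REDUCED"   # pass this to qemu-kvm as: -cpu "$REDUCED"
```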

Actual results:
The Windows 11 L1 guest cannot boot after the Hyper-V role is enabled when the host uses an Alder Lake CPU.

Expected results:
All L1 Windows guests should boot normally on the Alder Lake host.

Note:
1. The Windows 11 L1 guest can boot on the Alder Lake host when the Hyper-V role is disabled.
2. A Hyper-V 2022 L1 guest likewise cannot boot on the Alder Lake host once the Hyper-V role is enabled.
3. Per https://bugzilla.kernel.org/show_bug.cgi?id=217307#c11, almost anyone with a 12th or 13th gen Intel host should be able to reproduce this.

[root@intel-alderlake-s-iotg-02 ~]# lscpu
Architecture:            x86_64
  CPU op-mode(s):        32-bit, 64-bit
  Address sizes:         46 bits physical, 48 bits virtual
  Byte Order:            Little Endian
CPU(s):                  20
  On-line CPU(s) list:   0-19
Vendor ID:               GenuineIntel
  BIOS Vendor ID:        Intel(R) Corporation
  Model name:            12th Gen Intel(R) Core(TM) i7-12700E
    BIOS Model name:     12th Gen Intel(R) Core(TM) i7-12700E
    CPU family:          6
    Model:               151
    Thread(s) per core:  2
    Core(s) per socket:  12
    Socket(s):           1
    Stepping:            2
    CPU max MHz:         4800.0000
    CPU min MHz:         800.0000
    BogoMIPS:            4224.00
    Flags:               fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1
                         gb rdtscp lm constant_tsc art arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc cpuid aperfmperf tsc_known_freq pni pclmulqdq dt
                         es64 monitor ds_cpl vmx smx est tm2 ssse3 sdbg fma cx16 xtpr pdcm sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16
                         c rdrand lahf_lm abm 3dnowprefetch cpuid_fault epb cat_l2 cdp_l2 ssbd ibrs ibpb stibp ibrs_enhanced tpr_shadow vnmi flexpriority ept vpi
                         d ept_ad fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid rdt_a rdseed adx smap clflushopt clwb intel_pt sha_ni xsaveopt xsavec xget
                         bv1 xsaves split_lock_detect avx_vnni dtherm ida arat pln pts hwp hwp_notify hwp_act_window hwp_epp hwp_pkg_req hfi umip pku ospke waitp
                         kg gfni vaes vpclmulqdq tme rdpid movdiri movdir64b fsrm md_clear serialize
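The host generation called out in note 3 can also be confirmed without full lscpu output by parsing /proc/cpuinfo. A sketch using a sample line that mirrors the host in this report (per the lscpu output above, this part is family 6, model 151):

```shell
# Extract the CPU model name from cpuinfo-style input; the sample line below
# stands in for the real file (on a host, pipe /proc/cpuinfo in instead)
printf 'model name\t: 12th Gen Intel(R) Core(TM) i7-12700E\n' \
  | awk -F': ' '/^model name/ {print $2; exit}'
```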

Comment 1 RHEL Program Management 2023-09-22 13:39:52 UTC
Issue migration from Bugzilla to Jira is in process at this time. This will be the last message in Jira copied from the Bugzilla bug.

Comment 2 RHEL Program Management 2023-09-22 13:42:20 UTC
This BZ has been automatically migrated to the issues.redhat.com Red Hat Issue Tracker. All future work related to this report will be managed there.

Due to differences in account names between systems, some fields were not replicated.  Be sure to add yourself to Jira issue's "Watchers" field to continue receiving updates and add others to the "Need Info From" field to continue requesting information.

To find the migrated issue, look in the "Links" section for a direct link to the new issue location. The issue key will have an icon of 2 footprints next to it, and begin with "RHEL-" followed by an integer.  You can also find this issue by visiting https://issues.redhat.com/issues/?jql= and searching the "Bugzilla Bug" field for this BZ's number, e.g. a search like:

"Bugzilla Bug" = 1234567

In the event you have trouble locating or viewing this issue, you can file an issue by sending mail to rh-issues. You can also visit https://access.redhat.com/articles/7032570 for general account information.