Bug 2074149 - RHV VM with Q35 UEFI and 64 CPUs is running but without boot screen, console and network.
Keywords:
Status: CLOSED DUPLICATE of bug 2075486
Alias: None
Product: Red Hat Enterprise Virtualization Manager
Classification: Red Hat
Component: ovirt-engine
Version: 4.5.0
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: ---
Assignee: Nobody
QA Contact: meital avital
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2022-04-11 15:50 UTC by Nisim Simsolo
Modified: 2022-04-22 12:21 UTC
CC: 12 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2022-04-18 15:40:58 UTC
oVirt Team: ---
Target Upstream Version:
Embargoed:




Links:
Red Hat Issue Tracker RHV-45807 (last updated 2022-04-18 15:50:29 UTC)

Description Nisim Simsolo 2022-04-11 15:50:16 UTC
Description of problem:
- When running an RHV VM with Q35/UEFI and more than 48 CPUs (the RHV host has 64 CPUs), there is no VM console (black screen, no TianoCore screen during boot) and the VM gets no IP address.
- When running the same VM with the CPU count reduced to 48, the VM runs properly, with console, network address, etc.
- Running the VM with Q35/BIOS and 64 CPUs also works properly, with console and network address.
- The issue also reproduces on a different setup with a different host; there it is hit with more than 28 CPUs (RHV host with 48 CPUs).

Version-Release number of selected component (if applicable):
ovirt-engine-4.5.0-0.237.el8ev
vdsm-4.50.0.10-1.el8ev.x86_64
qemu-kvm-6.2.0-11.module+el8.6.0+14707+5aa4b42d.x86_64
libvirt-daemon-8.0.0-5.module+el8.6.0+14480+c0a3aa0f.x86_64

How reproducible:
100%

Steps to Reproduce:
1. Using a host with 64 CPUs (2 virtual sockets, 16 cores per socket, 2 threads per core), try to run a VM with 64 CPUs and the Q35 chipset with UEFI.

Actual results:
The VM is running, but there is no boot screen (TianoCore), the console is black, and there are no IP addresses.

Expected results:
The VM should boot with a working console, network, etc.

Additional info:
vdsm.log attached (the VM with 48 CPUs was run at 2022-04-11 10:51:42,151-0400, the VM with 64 CPUs at 2022-04-11 10:58:01,881-0400).
engine.log, libvirt/qemu.log, and the VMs' domain XMLs are also attached.
VM names: rhel8_VM_48_CPUs and rhel8_VM_64_CPUs

Comment 7 liunana 2022-04-12 08:37:20 UTC
I can reproduce this bug with edk2 when using 'maxcpus=512'.

I also captured the firmware log while booting the guest; it seems to get stuck in edk2 at this point:
.......
CPU[02F]  APIC ID=002F  SMBASE=7FC11000  SaveState=7FC20C00  Size=00000400
ASSERT /builddir/build/BUILD/edk2-bb1bba3d77/UefiCpuPkg/PiSmmCpuDxeSmm/PiSmmCpuDxeSmm.c(920): Stacks != ((void *) 0)


I will upload the full firmware log as attachment "firmware.log" later.



Test Env:
    kernel-4.18.0-372.7.1.el8.x86_64
    edk2-ovmf-20220126gitbb1bba3d77-2.el8.noarch
    qemu-kvm-6.2.0-11.module+el8.6.0+14707+5aa4b42d.x86_64
    CPU(s):              48
Guest: 4.18.0-372.7.1.el8.x86_64


QEMU cmdline:
/usr/libexec/qemu-kvm \
    -name 'avocado-vt-vm1'  \
    -sandbox on  \
    -blockdev node-name=file_ovmf_code,driver=file,filename=/usr/share/OVMF/OVMF_CODE.secboot.fd,auto-read-only=on,discard=unmap \
    -blockdev node-name=drive_ovmf_code,driver=raw,read-only=on,file=file_ovmf_code \
    -blockdev node-name=file_ovmf_vars,driver=file,filename=/home/rhel860.qcow2_VARS.fd,auto-read-only=on,discard=unmap \
    -blockdev node-name=drive_ovmf_vars,driver=raw,read-only=off,file=file_ovmf_vars \
    -machine q35,memory-backend=mem-machine_mem,pflash0=drive_ovmf_code,pflash1=drive_ovmf_vars,kernel_irqchip=split \
    -device pcie-root-port,id=pcie-root-port-0,multifunction=on,bus=pcie.0,addr=0x1,chassis=1 \
    -device pcie-pci-bridge,id=pcie-pci-bridge-0,addr=0x0,bus=pcie-root-port-0  \
    -nodefaults \
    -device VGA,bus=pcie.0,addr=0x2 \
    -m 8G \
    -object memory-backend-ram,size=8G,id=mem-machine_mem \
    -device intel-iommu,intremap=on,caching-mode=on,eim=on \
    -smp 48,maxcpus=512,cores=16,threads=2,dies=1,sockets=16 \
    -cpu Icelake-Server,ss=on,vmx=on,pdcm=on,hypervisor=on,tsc-adjust=on,avx512ifma=on,sha-ni=on,rdpid=on,fsrm=on,md-clear=on,stibp=on,arch-capabilities=on,xsaves=on,ibpb=on,ibrs=on,amd-stibp=on,amd-ssbd=on,rdctl-no=on,ibrs-all=on,skip-l1dfl-vmentry=on,mds-no=on,pschange-mc-no=on,tsx-ctrl=on,hle=off,rtm=off,mpx=off,intel-pt=off \
    -device pvpanic,ioport=0x505,id=idmPEwZc \
    -chardev socket,server=on,id=chardev_serial0,wait=off,path=/tmp/serial0 \
    -device isa-serial,id=serial0,chardev=chardev_serial0  \
    -device pcie-root-port,id=pcie-root-port-1,port=0x1,addr=0x1.0x1,bus=pcie.0,chassis=2 \
    -device qemu-xhci,id=usb1,bus=pcie-root-port-1,addr=0x0 \
    -device usb-tablet,id=usb-tablet1,bus=usb1.0,port=1 \
    -device pcie-root-port,id=pcie-root-port-2,port=0x2,addr=0x1.0x2,bus=pcie.0,chassis=3 \
    -device virtio-scsi-pci,id=virtio_scsi_pci0,bus=pcie-root-port-2,addr=0x0 \
    -blockdev node-name=file_image1,driver=file,auto-read-only=on,discard=unmap,aio=threads,filename=/home/RHEL-8.6-x86_64-latest-ovmf.qcow2,cache.direct=on,cache.no-flush=off \
    -blockdev node-name=drive_image1,driver=qcow2,read-only=off,cache.direct=on,cache.no-flush=off,file=file_image1 \
    -device scsi-hd,id=image1,drive=drive_image1,write-cache=on \
    -device pcie-root-port,id=pcie-root-port-3,port=0x3,addr=0x1.0x3,bus=pcie.0,chassis=4 \
    -device virtio-net-pci,mac=9a:73:c4:14:c7:1a,id=id1He0A4,netdev=id86E9yE,bus=pcie-root-port-3,addr=0x0  \
    -netdev tap,id=id86E9yE,vhost=on \
    -vnc :0  \
    -rtc base=utc,clock=host,driftfix=slew  \
    -boot menu=off,order=cdn,once=d,strict=off  \
    -enable-kvm \
    -monitor stdio \
    -chardev file,id=firmware,path=/tmp/edk2.log \
    -device isa-debugcon,iobase=0x402,chardev=firmware



Hi Cong,

This seems to be related to an edk2 issue; could you please help check it?
Thanks a lot!



Best regards
Liu Nana

Comment 17 Milan Zamazal 2022-04-14 09:53:49 UTC
Thank you for the explanations. Enabling SMM and setting TSEG size to 48 MB fixes the problem observed in the RHV environment.

As I understand https://bugzilla.redhat.com/show_bug.cgi?id=1469338#c30, there is no single good solution (although I'd think some mechanism allowing the VM at least to boot into the standard OVMF, or showing a clear error or warning in case that might not be possible, would be much better than a black screen without any obvious hint). Since "trial-and-error changing the value until the VM boots successfully" is not acceptable in RHV, we will have to rely on some guesswork and set the value big enough proactively. As far as I understand it, the TSEG is simply taken from the RAM available to the guest OS, which shouldn't be a real problem on hosts running large (in any sense) VMs.

But I wonder about the following sentence from the comment above: "However, just because your TSEG is large, it does not guarantee that you can boot with a huge VCPU count -- there are many other limiting factors."

Laszlo, are there any further possible surprises we should be aware of, assuming we can boot (some) large VMs already?

Comment 18 Laszlo Ersek 2022-04-14 14:49:55 UTC
I don't remember the "many" other limiting factors I may have had in mind at that time; I do remember *one*:

RHEL8: https://bugzilla.redhat.com/show_bug.cgi?id=1982176
RHEL9: https://bugzilla.redhat.com/show_bug.cgi?id=1983086

Comment 19 Milan Zamazal 2022-04-14 15:34:07 UTC
Thank you for the clarification. Both 1024 vCPUs and 8 TB of RAM are beyond the limits permitted/supported by RHV, so we are currently safe regarding that issue.

Comment 20 John Ferlan 2022-04-18 14:13:29 UTC
Klaus - this seems to be more of a firmware-type issue for your team to handle.

Comment 21 Klaus Heinrich Kiwi 2022-04-18 14:41:57 UTC
Thanks John.

I do need some clarification from the reporter though: Is there still an issue (i.e., product non-conformity) that we need to address here?

SMBIOS 3.0 and support for a higher number of CPUs is being tracked as an RFE for RHEL 9 through Bug 1983086, so I need to know if there's anything "else" needed, especially since this bug was filed against RHEL 8 while we're only planning to support it from RHEL 9.1 onwards (see Bug 1982176).

Comment 22 Laszlo Ersek 2022-04-18 15:26:39 UTC
IMO, for the stated guest characteristics, SMBIOS 3.0 is not needed just yet; it's enough to raise the TSEG size in RHV (in the domain XML and/or the QEMU cmdline that ovirt-engine generates) to, say, 48 MiB.
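
For illustration, a minimal sketch of what that could look like; the -machine line below is the one from the reproducer in comment 7 with smm=on made explicit, and the exact options ovirt-engine ends up generating (via libvirt's <smm state='on'> feature and its <tseg unit='MiB'> sub-element) may differ:

    # sketch only: enable SMM explicitly and raise the extended TSEG to 48 MiB
    -machine q35,smm=on,memory-backend=mem-machine_mem,pflash0=drive_ovmf_code,pflash1=drive_ovmf_vars,kernel_irqchip=split \
    -global mch.extended-tseg-mbytes=48

mch.extended-tseg-mbytes is the QEMU property that libvirt's <tseg> element maps to.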

Comment 23 Laszlo Ersek 2022-04-18 15:29:40 UTC
(NB I'm unsure if I should have selected "RHEV-M / ovirt-engine" *or* "ovirt-engine / BLL Virt" -- it's unclear to me what product & component track upstream vs. downstream ovirt-engine development. Please feel free to reclassify; I just wanted to express that "Component: qemu-kvm" was not appropriate. Thanks.)

Comment 24 Nisim Simsolo 2022-04-18 15:32:41 UTC
(In reply to Klaus Heinrich Kiwi from comment #21)
> Thanks John.
> 
> I do need some clarification from the reported though: Is there still an
> issue (i.e., product non-conformity) that we need to address here?
No, Laszlo's suggestion from https://bugzilla.redhat.com/show_bug.cgi?id=2074149#c15 solves the issue for RHV.
I also filed a bug on RHV: https://bugzilla.redhat.com/show_bug.cgi?id=2075486

Comment 25 Laszlo Ersek 2022-04-18 15:40:58 UTC
Ah OK, then we have bug 2075486 tracking this issue for the proper components; we don't need this open any longer.

*** This bug has been marked as a duplicate of bug 2075486 ***

Comment 26 Klaus Heinrich Kiwi 2022-04-18 15:47:17 UTC
Thanks everyone! Seems like we're done here then!

Comment 27 Milan Zamazal 2022-04-21 09:41:56 UTC
Laszlo, two more questions about the recommended TSEG size of 48 MB:

- According to https://bugzilla.redhat.com/show_bug.cgi?id=1469338#c7, 8 MB should be added for each 1 TB of address space. It indicates that 48 MB may not be enough for VMs with ~4 TB of maximum RAM and many vCPUs (which is within the limits supported by RHV). Should the TSEG size be increased beyond 48 MB for such VMs?

- If NVDIMM is present, its size should be considered too, right?

Comment 28 Laszlo Ersek 2022-04-22 06:34:58 UTC
Hi Milan,

(In reply to Milan Zamazal from comment #27)
> Laszlo, two more question about the recommended TSEG size of 48 MB:
> 
> - According to https://bugzilla.redhat.com/show_bug.cgi?id=1469338#c7, 8 MB
> should be added for each 1 TB of address space. It indicates that 48 MB may
> not be enough for VMs with ~4 TB of maximum RAM and many vCPUs (which is
> within the limits supported by RHV). Should the TSEG size be increased
> beyond 48 MB for such VMs?

I can't say for sure without testing such a beast. But anyway, for a 4TB guest, nobody's going to care about spending an extra 16MB on TSEG, so use 64MB rather than 48MB I guess.

> - If NVDIMM is present, its size should be considered too, right?

Sorry, I'm not familiar with NVDIMM. I'm not sure what QEMU device model that means and how it affects the guest address space.

If you mean memory hotplug, then yes; the memory hotplug area shifts the 64-bit PCI MMIO aperture up, so it does expand the guest address space. It means that the page tables built for SMM will have to cover a larger address range, and so a bit more SMRAM (TSEG) may be necessary.

Comment 29 Milan Zamazal 2022-04-22 12:21:07 UTC
Thank you, Laszlo, for the clarification. We will then stick to the rule of 8 MB of TSEG per 1 TB of address space (plus some extra in the case of many vCPUs). And since the NVDIMM size must be included when specifying the maximum memory of a VM, I think it's part of the address space and we will count it too.
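
As a rough illustration of that rule (not a tested value; the 64 MiB figure for a ~4 TiB guest is Laszlo's suggestion in comment 28), the corresponding QEMU knob would simply be raised, e.g.:

    # sketch only: ~4 TiB of maximum guest address space (NVDIMM/hotplug range included)
    # plus many vCPUs -> round the TSEG up from 48 MiB to 64 MiB, per comment 28
    -global mch.extended-tseg-mbytes=64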

