Created attachment 1721842
QEMU command line
Description of problem:
When a VM is started with the q35 chipset, UEFI BIOS, and maxmem of at least 16 GB on certain hardware, it starts but does not boot.
Version-Release number of selected component (if applicable):
How reproducible:
It was observed, and is always reproducible, only on two machines with Intel(R) Xeon(R) CPU E3-1230 V2 and 8 GB RAM. It could not be reproduced elsewhere.
Steps to Reproduce:
1. Start a VM with an installed guest OS from RHV, see the attached qemu-kvm command line.
2. Connect to the VM using SPICE -- the VM is stuck with a black screen and doesn't boot. Pinging the VM also doesn't work.
Actual results:
The VM gets stuck immediately after starting. It doesn't even reach the bootloader screen, and the qemu-kvm process consumes 100% CPU.
Expected results:
The VM starts normally.
Additional info:
When maxmem is reduced to e.g. 12 GB, the VM starts normally on the same machine. With a non-UEFI BIOS and 16 GB maxmem, it starts and reaches the BIOS. When the same VM is started the same way on a different kind of host with the same amount of RAM, it starts normally.
Assigned to Amnon for initial triage per bz process and age of bug created or assigned to virt-maint without triage.
This looks like an issue specific to a certain combination of machine type and memory size. It is not clear which model(s) would allow boot to continue, or whether changing some specific parameter(s) would make a difference.
This is weirdly similar to the issues we had when trying to go beyond ~700 VCPUs when testing the BZs related to bug 1788991. I will investigate.
Eduardo, did you find out anything?
The issue seems unrelated to the ones in bug 1788991. The CPUs where this bug can be reproduced have a small physical address size (36 bits), and I believe that's the root cause.
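For anyone who wants to compare hosts: on Linux x86 the physical address width is reported on the "address sizes" line of /proc/cpuinfo. The sketch below reads that value and converts it to addressable space (36 bits works out to 2^36 bytes = 64 GiB). The helper names phys_bits_from_cpuinfo and addressable_gib are made up for illustration, and the script only reports the host limit; it does not show whether the guest firmware actually exceeds it.

```shell
#!/bin/sh
# Hypothetical helper: read cpuinfo-style text on stdin and print the
# physical address width (in bits) from the first "address sizes" line.
phys_bits_from_cpuinfo() {
    sed -n 's/^address sizes[[:space:]]*:[[:space:]]*\([0-9][0-9]*\) bits physical.*/\1/p' | head -n 1
}

# Hypothetical helper: convert an address width in bits to addressable GiB.
# 2^bits bytes / 2^30 bytes per GiB = 2^(bits - 30); assumes bits >= 30.
addressable_gib() {
    echo $(( 1 << ($1 - 30) ))
}

# Report the value for the current host when /proc/cpuinfo is available.
if [ -r /proc/cpuinfo ]; then
    bits=$(phys_bits_from_cpuinfo < /proc/cpuinfo)
    if [ -n "$bits" ]; then
        echo "host: $bits physical address bits, $(addressable_gib "$bits") GiB addressable"
    fi
fi
```

Running this on an affected host and on a host where the VM boots should make any difference in address width easy to spot.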
Laszlo, any suggestion on where to look? Do you think OVMF might be using more than 36 bits of physical address space somehow?
(1) General correction for the QEMU command line (not related to this particular symptom, but required for actually securing Secure Boot) -- the following option *must* be appended:
-global driver=cfi.pflash01,property=secure,value=on \
(2) Regarding the specific symptom, please capture the OVMF debug log, and attach it to this BZ. QEMU options for that:
-chardev file,id=debugfile,path=ovmf.log \
-device isa-debugcon,iobase=0x402,chardev=debugfile \
Once you have the OVMF debug log attached, please set needinfo on me again. Thanks.