Note: This bug is displayed in read-only format because the product is no longer active in Red Hat Bugzilla.

RHEL Engineering is moving the tracking of its product development work on RHEL 6 through RHEL 9 to Red Hat Jira (issues.redhat.com). If you're a Red Hat customer, please continue to file support cases via the Red Hat customer portal. If you're not, please head to the "RHEL project" in Red Hat Jira and file new tickets here. Individual Bugzilla bugs in the statuses "NEW", "ASSIGNED", and "POST" are being migrated throughout September 2023. Bugs of Red Hat partners with an assigned Engineering Partner Manager (EPM) are migrated in late September as per pre-agreed dates. Bugs against components "kernel", "kernel-rt", and "kpatch" are only migrated if still in "NEW" or "ASSIGNED". If you cannot log in to RH Jira, please consult article #7032570. That failing, please send an e-mail to the RH Jira admins at rh-issues@redhat.com to troubleshoot your issue as a user management inquiry. The email creates a ServiceNow ticket with Red Hat. Individual Bugzilla bugs that are migrated will be moved to status "CLOSED", resolution "MIGRATED", and set with "MigratedToJIRA" in "Keywords". The link to the successor Jira issue will be found under "Links", have a little "two-footprint" icon next to it, and direct you to the "RHEL project" in Red Hat Jira (issue links are of type "https://issues.redhat.com/browse/RHEL-XXXX", where "X" is a digit). This same link will be available in a blue banner at the top of the page informing you that that bug has been migrated.

Bug 2114637

Summary:	[QEMU7.1 - rc0] Install VM failed when use firmware ovmf
Product:	Red Hat Enterprise Linux 9	Reporter:	Yanan Fu <yfu>
Component:	qemu-kvm	Assignee:	Gerd Hoffmann <kraxel>
qemu-kvm sub component:	General	QA Contact:	Yanan Fu <yfu>
Status:	CLOSED CURRENTRELEASE	Docs Contact:
Severity:	high
Priority:	high	CC:	berrange, coli, hongzliu, jinzhao, juzhang, kkiwi, kraxel, lersek, lijin, lkotek, mdeng, meili, mrezanin, osteffen, virt-maint, xuwei, yfu
Version:	9.1	Keywords:	Triaged, Upstream
Target Milestone:	rc	Flags:	pm-rhel: mirror+
Target Release:	---
Hardware:	x86_64
OS:	Linux
Whiteboard:
Fixed In Version:	qemu-7.1	Doc Type:	If docs needed, set a value
Doc Text:		Story Points:	---
Clone Of:		Environment:
Last Closed:	2022-09-03 13:35:45 UTC	Type:	---
Regression:	---	Mount Type:	---
Documentation:	---	CRM:
Verified Versions:		Category:	---
oVirt Team:	---	RHEL 7.3 requirements from Atomic Host:
Cloudforms Team:	---	Target Upstream Version:
Embargoed:

Description Yanan Fu 2022-08-03 02:22:02 UTC

Description of problem:
As subject, when test with the QEMU 7.1 rc0 build:
http://batcave.lab.eng.brq.redhat.com/repos/preview/RHEL-9/RHEL-9.2.0/qemu-kvm/220727_qemu-kvm-7.1.0rc0/x86_64/


Install VM will stuck if use firmware ovmf

Version-Release number of selected component (if applicable):
qemu:  qemu-kvm-7.1.0-9.rc0.el9.mrezanin202207280252.x86_64
edk2:  edk2-ovmf-20220526git16779ede2d36-2.el9.noarch
Host compose: RHEL-9.1.0-20220801.1
Guest iso: RHEL-9.1.0-20220802.1-x86_64-dvd1.iso


How reproducible:
100%

Steps to Reproduce:
1. Launch qemu process to install VM 
2. From the screen, it stuck at: EFI stub: Loaded initrd from LINUX_EFI_INITRD_MEDIA_GUID device path



Actual results:
Install VM failed

Expected results:
Install VM success


Additional info:
1. This is ovmf only, test with seabios success.
2. QEMU cli
  /usr/libexec/qemu-kvm \
-S  \
-name 'avocado-vt-vm1'  \
-sandbox off  \
-blockdev node-name=file_ovmf_code,driver=file,filename=/usr/share/OVMF/OVMF_CODE.secboot.fd,auto-read-only=on,discard=unmap \
-blockdev node-name=drive_ovmf_code,driver=raw,read-only=on,file=file_ovmf_code \
-blockdev node-name=file_ovmf_vars,driver=file,filename=/root/avocado/data/avocado-vt/avocado-vt-vm1_rhel910-64-virtio-scsi_qcow2_filesystem_VARS.fd,auto-read-only=on,discard=unmap \
-blockdev node-name=drive_ovmf_vars,driver=raw,read-only=off,file=file_ovmf_vars \
-machine q35,memory-backend=mem-machine_mem,pflash0=drive_ovmf_code,pflash1=drive_ovmf_vars \
-nodefaults \
-device VGA,bus=pcie.0,addr=0x2 \
-m 14336 \
-object memory-backend-ram,size=14336M,id=mem-machine_mem  \
-smp 24,maxcpus=24,cores=12,threads=1,dies=1,sockets=2  \
-cpu host \
-chardev socket,path=/var/tmp/avocado_mn6_hwsz/serial-serial0-20220802-001950-4Z7bf4lu,wait=off,server=on,id=chardev_serial0 \
-device isa-serial,id=serial0,chardev=chardev_serial0 \
-chardev socket,id=seabioslog_id_20220802-001950-4Z7bf4lu,path=/var/tmp/avocado_mn6_hwsz/seabios-20220802-001950-4Z7bf4lu,server=on,wait=off \
-device isa-debugcon,chardev=seabioslog_id_20220802-001950-4Z7bf4lu,iobase=0x402 \
-device pcie-root-port,id=pcie-root-port-3,port=0x3,addr=0x1.0x3,bus=pcie.0,chassis=4 \
-device virtio-scsi-pci,id=virtio_scsi_pci0,bus=pcie-root-port-3,addr=0x0 \
-blockdev node-name=file_image1,driver=file,auto-read-only=on,discard=unmap,aio=threads,filename=/home/kvm_autotest_root/images/rhel910-64-virtio-scsi.qcow2,cache.direct=off,cache.no-flush=on \
-blockdev node-name=drive_image1,driver=qcow2,read-only=off,cache.direct=off,cache.no-flush=on,file=file_image1 \
-device scsi-hd,id=image1,drive=drive_image1,write-cache=on \
-kernel '/home/kvm_autotest_root/images/rhel910-64/vmlinuz'  \
-append 'inst.sshd inst.repo=cdrom inst.ks=cdrom:/ks.cfg net.ifnames=0 console=ttyS0,115200'  \
-initrd '/home/kvm_autotest_root/images/rhel910-64/initrd.img'  \
-vnc :0  \
-rtc base=utc,clock=host,driftfix=slew  \
-boot menu=on,order=dc \
-enable-kvm \
-monitor stdio

Comment 3 Daniel Berrangé 2022-08-03 10:54:58 UTC

This sounds like it the same issue reported by Intel with the recent RNG seed patches to PC machine types:

https://lists.gnu.org/archive/html/qemu-devel/2022-08/msg00148.html

Comment 4 Laszlo Ersek 2022-08-03 13:58:55 UTC

The crash is triggered somewhere in (or after) the kernel's EFI stub.

You can see this from the serial log in comment 2:

> EFI stub: Loaded initrd from LINUX_EFI_INITRD_MEDIA_GUID device path
> [... register dump ...]

At that time, the kernel's EFI stub has taken control, but we're still in firmware-land because the CPU exception handler is still from OVMF.


Furthermore, the serial log contains

> Find image based on IP(0x726F7C35) (No PDB)  (ImageBase=0000000071C35000, EntryPoint=00000000726F72D5)

Normally, in such situations it is useful to collect, from the *firmware log* from comment 1, lines matching the pattern

> Loading driver at 0x0007E39C000 EntryPoint=0x0007E39D0FA NullMemoryTestDxe.efi

These lines tell us what DXE driver was loaded at what address, with what entry point address.

Then these lines can be correlated with the ImageBase / EntryPoint information from the crash message. If you find a match, that gives you the crashing DXE driver.

Yet further, reusing the original edk2 Build directory, you can disassemble ("objdump -S") the culprit driver binary (using the above example: that would be "NullMemoryTestDxe.debug"). This objdump command provides a disassembly with the C-language source code intermixed, with instruction offsets relative to the image load base. In the present case, RIP (from the register dump) is 00000000726F7C35, therefore the relative offset would be

  RIP - ImageBase = 00000000726F7C35 - 0000000071C35000 = AC2C35

Looking up this relative insn offset in the disassembly usually provides a pretty good hint about the problem.

Now, consider the following snippet from the firmware log:

> [Security] 3rd party image[0] can be loaded after EndOfDxe: VenMedia(1428F772-B64A-441E-B8C3-9EBDD7F893C7)/kernel.
> [...]
> Loading driver at 0x00071C35000 EntryPoint=0x000726F72D5 
> [...]
> QemuLoadKernelImage: command line: "inst.sshd inst.repo=cdrom inst.ks=cdrom:/ks.cfg net.ifnames=0 console=ttyS0,115200 initrd=initrd"

This shows that the kernel image (basically a UEFI application, due to the presence of the UEFI stub) is loaded at 00071C35000.

Therefore the crash happens genuinely in (or after) the UEFI stub of the kernel, at offset 0xAC2C35.

(1) I suggest adding

  earlyprintk=ttyS0,115200n8 ignore_loglevel

to the kernel command line.

(2) My reading of QEMU commit 67f7e426e538 ("hw/i386: pass RNG seed via setup_data entry", 2022-07-22) is that it enables setup_data block chaining -- in fact, *any* setup_data blocks -- for the first time ever for OVMF guests. In OVMF, the whole setup area is handled by commit 0758a8e979be ("OvmfPkg/QemuKernelLoaderFsDxe: add support for the kernel setup block", 2020-03-05). It's possible that OVMF does not concatenate the setup area with the kernel block correctly, and the QEMU commit in question (introducing the first setup_data linkage) is what is triggering the problem now, for the first time ever.

FWIW, from the firmware log, it seems that the setup block size is 31 sectors (of 512 byte each) -- 15872 bytes.

OVMF places the setup area first, and the kernel second. The QEMU commit in question seems to append a setup_data block to the end of the kernel area though?...

Comment 5 Oliver Steffen 2022-08-05 13:49:26 UTC

(In reply to Laszlo Ersek from comment #4)
> The crash is triggered somewhere in (or after) the kernel's EFI stub.
> 
> You can see this from the serial log in comment 2:
> 
> > EFI stub: Loaded initrd from LINUX_EFI_INITRD_MEDIA_GUID device path
> > [... register dump ...]
> 
> At that time, the kernel's EFI stub has taken control, but we're still in
> firmware-land because the CPU exception handler is still from OVMF.
> 
[...]
> 
> Now, consider the following snippet from the firmware log:
> 
> > [Security] 3rd party image[0] can be loaded after EndOfDxe: VenMedia(1428F772-B64A-441E-B8C3-9EBDD7F893C7)/kernel.
> > [...]
> > Loading driver at 0x00071C35000 EntryPoint=0x000726F72D5 
> > [...]
> > QemuLoadKernelImage: command line: "inst.sshd inst.repo=cdrom inst.ks=cdrom:/ks.cfg net.ifnames=0 console=ttyS0,115200 initrd=initrd"
> 
> This shows that the kernel image (basically a UEFI application, due to the
> presence of the UEFI stub) is loaded at 00071C35000.
> 
> Therefore the crash happens genuinely in (or after) the UEFI stub of the
> kernel, at offset 0xAC2C35.

Sounds right. 

> 
> (1) I suggest adding
> 
>   earlyprintk=ttyS0,115200n8 ignore_loglevel

I tried that. It did not change anything, output looks the same.

> to the kernel command line.
> 
> (2) My reading of QEMU commit 67f7e426e538 ("hw/i386: pass RNG seed via
> setup_data entry", 2022-07-22) is that it enables setup_data block chaining
> -- in fact, *any* setup_data blocks -- for the first time ever for OVMF
> guests. In OVMF, the whole setup area is handled by commit 0758a8e979be
> ("OvmfPkg/QemuKernelLoaderFsDxe: add support for the kernel setup block",
> 2020-03-05). It's possible that OVMF does not concatenate the setup area
> with the kernel block correctly, and the QEMU commit in question
> (introducing the first setup_data linkage) is what is triggering the problem
> now, for the first time ever.

I looked at these commits but I do not understand how they go together yet.

> 
> FWIW, from the firmware log, it seems that the setup block size is 31
> sectors (of 512 byte each) -- 15872 bytes.

Where do you see that in the log?

Comment 6 Laszlo Ersek 2022-08-05 14:21:14 UTC

(In reply to Oliver Steffen from comment #5)
> (In reply to Laszlo Ersek from comment #4)

> > FWIW, from the firmware log, it seems that the setup block size is 31
> > sectors (of 512 byte each) -- 15872 bytes.
> 
> Where do you see that in the log?

From here:

QemuFwCfg interface (DMA) is supported.
Select Item: 0x17
Select Item: 0x8
FetchBlob: loading 11313296 bytes for "kernel"
Select Item: 0x18
FetchBlob: 0 bytes remaining for "kernel" (0)
Select Item: 0x11
FetchBlob: 10248848 bytes remaining for "kernel" (1)
FetchBlob: 9200272 bytes remaining for "kernel" (1)
FetchBlob: 8151696 bytes remaining for "kernel" (1)
FetchBlob: 7103120 bytes remaining for "kernel" (1)
FetchBlob: 6054544 bytes remaining for "kernel" (1)
FetchBlob: 5005968 bytes remaining for "kernel" (1)
FetchBlob: 3957392 bytes remaining for "kernel" (1)
FetchBlob: 2908816 bytes remaining for "kernel" (1)
FetchBlob: 1860240 bytes remaining for "kernel" (1)
FetchBlob: 811664 bytes remaining for "kernel" (1)
FetchBlob: 0 bytes remaining for "kernel" (1)

This matches FetchBlob() [OvmfPkg/QemuKernelLoaderFsDxe/QemuKernelLoaderFsDxe.c].

For "kernel" -- see "mKernelBlob" [OvmfPkg/QemuKernelLoaderFsDxe/QemuKernelLoaderFsDxe.c] --, we first grab QemuFwCfgItemKernelSetupSize (0x0017), then QemuFwCfgItemKernelSize (0x0008). Their sum is 11313296 bytes.

Then we download QemuFwCfgItemKernelSetupData (0x0018), for FwCfgItem 0. There is no particular information about this, as the download loop only logs a message once every full megabyte.

Then we download QemuFwCfgItemKernelData (0x0011), for FwCfgItem 1. Here, after we download the first MB, we have 10248848 bytes left. Therefore QemuFwCfgItemKernelSize is 10248848 + 1024*1024 = 11297424.

Further, QemuFwCfgItemKernelSize is 11313296 - 11297424 = 15872 = 31 sectors.

Anyway... the issue is now understood, but designing the fix is proving really difficult. Not even the component needing the fix has been decided.

http://mid.mail-archive.com/20220804030012-mutt-send-email-mst@kernel.org

Comment 7 Laszlo Ersek 2022-08-05 14:24:11 UTC

(In reply to Laszlo Ersek from comment #6)

> Further, QemuFwCfgItemKernelSize is 11313296 - 11297424 = 15872 = 31 sectors.

Obviously I had to screw up the last statement; the calculation is correct, but the name is QemuFwCfgItemKernelSetupSize.

In summary:

QemuFwCfgItemKernelSetupSize (0x0017) = ?
QemuFwCfgItemKernelSize (0x0008) = ?

QemuFwCfgItemKernelSetupSize (0x0017) + QemuFwCfgItemKernelSize (0x0008) = 11313296

QemuFwCfgItemKernelSize (0x0008) - 1024*1024 = 10248848

--> QemuFwCfgItemKernelSize (0x0008) = 11297424
--> QemuFwCfgItemKernelSetupSize (0x0017) = 11313296 - QemuFwCfgItemKernelSize (0x0008) = 11313296 - 11297424 = 15872 = 31 sectors.

Comment 8 Oliver Steffen 2022-08-09 09:02:21 UTC

Ah, OK. Makes sense. I noticed the FetchBlob lines but did not think further.

Thank you for explaining, Laszlo!

Comment 9 Oliver Steffen 2022-08-09 13:56:03 UTC

While the discussion around a solution is ongoing upstream,
can we disable the RNG seed feature or drop that patch downstream
until the problem is solved?
Or do we rely on it in some way?

Comment 10 Daniel Berrangé 2022-08-09 14:06:56 UTC

IIUC, we've not actually merged the 7.1 rebase yet, so technically RHEL/CentOS Stream are not yet affected.

When we do merge it, we should be disabling the RNG feature in *existing* machine types for back compat reasons, so they also should not be affected.

The first point at which RHEL / CentOS Stream become affected is if & when we add a new rhel-9.1.0 machine type. I'm not sure if we're even going to add that, since last time we discussed the view was coming towards only adding new machine types in EUS streams (9.0, 9.2, 9.4, etc).

Comment 11 Gerd Hoffmann 2022-08-15 10:41:36 UTC

Note: Possibly the whole direct kernel boot process can be made more robust using the systemd efi stub (see "man systemd-stub") which allows to pack up everything (kernel, initrd, cmdline) into a single efi binary.
Unfortunately not packaged up for rhel, we have a request open for that though (bug 2099278).

Comment 12 Gerd Hoffmann 2022-08-17 08:46:17 UTC

https://patchwork.ozlabs.org/project/qemu-devel/list/?series=314119

Comment 15 John Ferlan 2022-09-03 13:35:45 UTC

Fixed by commit id 3824e25db1a84fadc50b88dfbe27047aa2f7f85d

Closing this as current release since it's resolved in qemu-7.1