Bug 2114637
| Summary: | [QEMU7.1 - rc0] Install VM failed when use firmware ovmf | ||
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 9 | Reporter: | Yanan Fu <yfu> |
| Component: | qemu-kvm | Assignee: | Gerd Hoffmann <kraxel> |
| qemu-kvm sub component: | General | QA Contact: | Yanan Fu <yfu> |
| Status: | CLOSED CURRENTRELEASE | Docs Contact: | |
| Severity: | high | ||
| Priority: | high | CC: | berrange, coli, hongzliu, jinzhao, juzhang, kkiwi, kraxel, lersek, lijin, lkotek, mdeng, meili, mrezanin, osteffen, virt-maint, xuwei, yfu |
| Version: | 9.1 | Keywords: | Triaged, Upstream |
| Target Milestone: | rc | Flags: | pm-rhel:
mirror+
|
| Target Release: | --- | ||
| Hardware: | x86_64 | ||
| OS: | Linux | ||
| Whiteboard: | |||
| Fixed In Version: | qemu-7.1 | Doc Type: | If docs needed, set a value |
| Doc Text: | Story Points: | --- | |
| Clone Of: | Environment: | ||
| Last Closed: | 2022-09-03 13:35:45 UTC | Type: | --- |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | Category: | --- | |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
|
Description
Yanan Fu
2022-08-03 02:22:02 UTC
This sounds like it the same issue reported by Intel with the recent RNG seed patches to PC machine types: https://lists.gnu.org/archive/html/qemu-devel/2022-08/msg00148.html The crash is triggered somewhere in (or after) the kernel's EFI stub. You can see this from the serial log in comment 2: > EFI stub: Loaded initrd from LINUX_EFI_INITRD_MEDIA_GUID device path > [... register dump ...] At that time, the kernel's EFI stub has taken control, but we're still in firmware-land because the CPU exception handler is still from OVMF. Furthermore, the serial log contains > Find image based on IP(0x726F7C35) (No PDB) (ImageBase=0000000071C35000, EntryPoint=00000000726F72D5) Normally, in such situations it is useful to collect, from the *firmware log* from comment 1, lines matching the pattern > Loading driver at 0x0007E39C000 EntryPoint=0x0007E39D0FA NullMemoryTestDxe.efi These lines tell us what DXE driver was loaded at what address, with what entry point address. Then these lines can be correlated with the ImageBase / EntryPoint information from the crash message. If you find a match, that gives you the crashing DXE driver. Yet further, reusing the original edk2 Build directory, you can disassemble ("objdump -S") the culprit driver binary (using the above example: that would be "NullMemoryTestDxe.debug"). This objdump command provides a disassembly with the C-language source code intermixed, with instruction offsets relative to the image load base. In the present case, RIP (from the register dump) is 00000000726F7C35, therefore the relative offset would be RIP - ImageBase = 00000000726F7C35 - 0000000071C35000 = AC2C35 Looking up this relative insn offset in the disassembly usually provides a pretty good hint about the problem. Now, consider the following snippet from the firmware log: > [Security] 3rd party image[0] can be loaded after EndOfDxe: VenMedia(1428F772-B64A-441E-B8C3-9EBDD7F893C7)/kernel. > [...] > Loading driver at 0x00071C35000 EntryPoint=0x000726F72D5 > [...] > QemuLoadKernelImage: command line: "inst.sshd inst.repo=cdrom inst.ks=cdrom:/ks.cfg net.ifnames=0 console=ttyS0,115200 initrd=initrd" This shows that the kernel image (basically a UEFI application, due to the presence of the UEFI stub) is loaded at 00071C35000. Therefore the crash happens genuinely in (or after) the UEFI stub of the kernel, at offset 0xAC2C35. (1) I suggest adding earlyprintk=ttyS0,115200n8 ignore_loglevel to the kernel command line. (2) My reading of QEMU commit 67f7e426e538 ("hw/i386: pass RNG seed via setup_data entry", 2022-07-22) is that it enables setup_data block chaining -- in fact, *any* setup_data blocks -- for the first time ever for OVMF guests. In OVMF, the whole setup area is handled by commit 0758a8e979be ("OvmfPkg/QemuKernelLoaderFsDxe: add support for the kernel setup block", 2020-03-05). It's possible that OVMF does not concatenate the setup area with the kernel block correctly, and the QEMU commit in question (introducing the first setup_data linkage) is what is triggering the problem now, for the first time ever. FWIW, from the firmware log, it seems that the setup block size is 31 sectors (of 512 byte each) -- 15872 bytes. OVMF places the setup area first, and the kernel second. The QEMU commit in question seems to append a setup_data block to the end of the kernel area though?... (In reply to Laszlo Ersek from comment #4) > The crash is triggered somewhere in (or after) the kernel's EFI stub. > > You can see this from the serial log in comment 2: > > > EFI stub: Loaded initrd from LINUX_EFI_INITRD_MEDIA_GUID device path > > [... register dump ...] > > At that time, the kernel's EFI stub has taken control, but we're still in > firmware-land because the CPU exception handler is still from OVMF. > [...] > > Now, consider the following snippet from the firmware log: > > > [Security] 3rd party image[0] can be loaded after EndOfDxe: VenMedia(1428F772-B64A-441E-B8C3-9EBDD7F893C7)/kernel. > > [...] > > Loading driver at 0x00071C35000 EntryPoint=0x000726F72D5 > > [...] > > QemuLoadKernelImage: command line: "inst.sshd inst.repo=cdrom inst.ks=cdrom:/ks.cfg net.ifnames=0 console=ttyS0,115200 initrd=initrd" > > This shows that the kernel image (basically a UEFI application, due to the > presence of the UEFI stub) is loaded at 00071C35000. > > Therefore the crash happens genuinely in (or after) the UEFI stub of the > kernel, at offset 0xAC2C35. Sounds right. > > (1) I suggest adding > > earlyprintk=ttyS0,115200n8 ignore_loglevel I tried that. It did not change anything, output looks the same. > to the kernel command line. > > (2) My reading of QEMU commit 67f7e426e538 ("hw/i386: pass RNG seed via > setup_data entry", 2022-07-22) is that it enables setup_data block chaining > -- in fact, *any* setup_data blocks -- for the first time ever for OVMF > guests. In OVMF, the whole setup area is handled by commit 0758a8e979be > ("OvmfPkg/QemuKernelLoaderFsDxe: add support for the kernel setup block", > 2020-03-05). It's possible that OVMF does not concatenate the setup area > with the kernel block correctly, and the QEMU commit in question > (introducing the first setup_data linkage) is what is triggering the problem > now, for the first time ever. I looked at these commits but I do not understand how they go together yet. > > FWIW, from the firmware log, it seems that the setup block size is 31 > sectors (of 512 byte each) -- 15872 bytes. Where do you see that in the log? (In reply to Oliver Steffen from comment #5) > (In reply to Laszlo Ersek from comment #4) > > FWIW, from the firmware log, it seems that the setup block size is 31 > > sectors (of 512 byte each) -- 15872 bytes. > > Where do you see that in the log? From here: QemuFwCfg interface (DMA) is supported. Select Item: 0x17 Select Item: 0x8 FetchBlob: loading 11313296 bytes for "kernel" Select Item: 0x18 FetchBlob: 0 bytes remaining for "kernel" (0) Select Item: 0x11 FetchBlob: 10248848 bytes remaining for "kernel" (1) FetchBlob: 9200272 bytes remaining for "kernel" (1) FetchBlob: 8151696 bytes remaining for "kernel" (1) FetchBlob: 7103120 bytes remaining for "kernel" (1) FetchBlob: 6054544 bytes remaining for "kernel" (1) FetchBlob: 5005968 bytes remaining for "kernel" (1) FetchBlob: 3957392 bytes remaining for "kernel" (1) FetchBlob: 2908816 bytes remaining for "kernel" (1) FetchBlob: 1860240 bytes remaining for "kernel" (1) FetchBlob: 811664 bytes remaining for "kernel" (1) FetchBlob: 0 bytes remaining for "kernel" (1) This matches FetchBlob() [OvmfPkg/QemuKernelLoaderFsDxe/QemuKernelLoaderFsDxe.c]. For "kernel" -- see "mKernelBlob" [OvmfPkg/QemuKernelLoaderFsDxe/QemuKernelLoaderFsDxe.c] --, we first grab QemuFwCfgItemKernelSetupSize (0x0017), then QemuFwCfgItemKernelSize (0x0008). Their sum is 11313296 bytes. Then we download QemuFwCfgItemKernelSetupData (0x0018), for FwCfgItem 0. There is no particular information about this, as the download loop only logs a message once every full megabyte. Then we download QemuFwCfgItemKernelData (0x0011), for FwCfgItem 1. Here, after we download the first MB, we have 10248848 bytes left. Therefore QemuFwCfgItemKernelSize is 10248848 + 1024*1024 = 11297424. Further, QemuFwCfgItemKernelSize is 11313296 - 11297424 = 15872 = 31 sectors. Anyway... the issue is now understood, but designing the fix is proving really difficult. Not even the component needing the fix has been decided. http://mid.mail-archive.com/20220804030012-mutt-send-email-mst@kernel.org (In reply to Laszlo Ersek from comment #6) > Further, QemuFwCfgItemKernelSize is 11313296 - 11297424 = 15872 = 31 sectors. Obviously I had to screw up the last statement; the calculation is correct, but the name is QemuFwCfgItemKernelSetupSize. In summary: QemuFwCfgItemKernelSetupSize (0x0017) = ? QemuFwCfgItemKernelSize (0x0008) = ? QemuFwCfgItemKernelSetupSize (0x0017) + QemuFwCfgItemKernelSize (0x0008) = 11313296 QemuFwCfgItemKernelSize (0x0008) - 1024*1024 = 10248848 --> QemuFwCfgItemKernelSize (0x0008) = 11297424 --> QemuFwCfgItemKernelSetupSize (0x0017) = 11313296 - QemuFwCfgItemKernelSize (0x0008) = 11313296 - 11297424 = 15872 = 31 sectors. Ah, OK. Makes sense. I noticed the FetchBlob lines but did not think further. Thank you for explaining, Laszlo! While the discussion around a solution is ongoing upstream, can we disable the RNG seed feature or drop that patch downstream until the problem is solved? Or do we rely on it in some way? IIUC, we've not actually merged the 7.1 rebase yet, so technically RHEL/CentOS Stream are not yet affected. When we do merge it, we should be disabling the RNG feature in *existing* machine types for back compat reasons, so they also should not be affected. The first point at which RHEL / CentOS Stream become affected is if & when we add a new rhel-9.1.0 machine type. I'm not sure if we're even going to add that, since last time we discussed the view was coming towards only adding new machine types in EUS streams (9.0, 9.2, 9.4, etc). Note: Possibly the whole direct kernel boot process can be made more robust using the systemd efi stub (see "man systemd-stub") which allows to pack up everything (kernel, initrd, cmdline) into a single efi binary. Unfortunately not packaged up for rhel, we have a request open for that though (bug 2099278). Fixed by commit id 3824e25db1a84fadc50b88dfbe27047aa2f7f85d Closing this as current release since it's resolved in qemu-7.1 |