Bug 2196178
| Summary: | libvirt: Changes to firmware selection |
|---|---|
| Product: | Red Hat Enterprise Linux 9 |
| Reporter: | Andrea Bolognani <abologna> |
| Component: | libvirt |
| Assignee: | Andrea Bolognani <abologna> |
| libvirt sub component: | General |
| QA Contact: | Meina Li <meili> |
| Status: | CLOSED ERRATA |
| Severity: | unspecified |
| Priority: | unspecified |
| CC: | dzheng, jdenemar, jsuchane, jwboyer, lmen, shdunne, virt-maint |
| Version: | 9.1 |
| Keywords: | AutomationTriaged, Triaged |
| Target Milestone: | rc |
| Flags: | pm-rhel: mirror+ |
| Hardware: | Unspecified |
| OS: | Unspecified |
| Fixed In Version: | libvirt-9.5.0-6.el9 |
| Last Closed: | 2023-11-07 08:31:17 UTC |
| Type: | Bug |
| Target Upstream Version: | 9.7.0 |
Description
Andrea Bolognani
2023-05-08 09:02:24 UTC
Created attachment 1963449
libvirtd.log
Hi Andrea,
I've tested this bug and many test scenarios passed. However, there are still three scenarios that I suspect are bugs, unless that's how it was designed. Could you help review these scenarios again? Thank you very much.
Test Version:
libvirt-9.3.0-1.el9.x86_64
qemu-kvm-8.0.0-1.el9.x86_64
S1: Define a guest only with loader and file backed nvram template
Test Steps:
1. Prepare a guest xml with the following os xml:
<os>
<type arch="x86_64" machine="pc-q35-rhel9.2.0">hvm</type>
<loader readonly="yes" type="pflash">/usr/share/OVMF/OVMF_CODE.secboot.fd</loader>
<nvram template="/usr/share/OVMF/OVMF_VARS.secboot.fd" type="file">
<source file="/tmp/rhel_VARS.fd"/>
</nvram>
<boot dev="hd"/>
</os>
2. Define the guest.
# virsh define rhel.xml
Domain 'rhel' defined from rhel.xml
3. Check the dumpxml.
# virsh dumpxml rhel --xpath //os
<os>
<type arch="x86_64" machine="pc-q35-rhel9.2.0">hvm</type>
<loader readonly="yes" secure="yes" type="pflash">/usr/share/OVMF/OVMF_CODE.secboot.fd</loader>
<nvram template="/usr/share/OVMF/OVMF_VARS.fd" type="file">
<source file="/tmp/rhel_VARS.fd"/>
</nvram>
<boot dev="hd"/>
</os>
------>> We can't get the firmware features automatically. Also, the nvram template has been changed from OVMF_VARS.secboot.fd to OVMF_VARS.fd.
S2: Define a guest with efi firmware and file backed nvram template.
1. Prepare a guest xml with the following os xml:
<os firmware='efi'>
<type arch="x86_64" machine="pc-q35-rhel9.2.0">hvm</type>
<nvram template="/usr/share/OVMF/OVMF_VARS.secboot.fd" type="file">
<source file="/tmp/rhel_VARS.fd"/>
</nvram>
<boot dev="hd"/>
</os>
2. Define the guest.
# virsh define rhel.xml
error: Disconnected from qemu:///system due to end of file
error: Failed to define domain from rhel.xml
error: End of file while reading data: Input/output error
------>> An unexpected error occurs, with no coredump and no error info in libvirtd.log (see attachment).
S3: Define a guest with readonly='no'.
1. Prepare a guest xml with readonly='no' in os loader.
<os>
<type arch="x86_64" machine="pc-q35-rhel9.2.0">hvm</type>
<loader readonly="no" secure="yes" type="pflash">/usr/share/edk2/ovmf/OVMF_CODE.secboot.fd</loader>
<boot dev="hd"/>
</os>
2. Define the guest.
# virsh define rhel.xml
Domain 'rhel' defined from rhel.xml
3. Check the dumpxml.
# virsh dumpxml rhel --xpath //os
<os firmware="efi">
<type arch="x86_64" machine="pc-q35-rhel9.2.0">hvm</type>
<firmware>
<feature enabled="yes" name="enrolled-keys"/>
<feature enabled="yes" name="secure-boot"/>
</firmware>
<loader readonly="yes" secure="yes" type="pflash">/usr/share/edk2/ovmf/OVMF_CODE.secboot.fd</loader>
<nvram template="/usr/share/edk2/ovmf/OVMF_VARS.secboot.fd">/var/lib/libvirt/qemu/nvram/rhel_VARS.fd</nvram>
<boot dev="hd"/>
</os>
------>> The value of readonly is changed to 'yes'. Is this expected?
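
The by-hand comparison between the XML passed to `virsh define` and the output of `virsh dumpxml` in the scenarios above can be sketched as a small script. This is only an illustration, not part of libvirt or of the test suite used here: the helper name `changed_attributes` is hypothetical, and the two fragments are abbreviated copies of the S3 input and output.

```python
import xml.etree.ElementTree as ET


def changed_attributes(defined_xml: str, dumped_xml: str, tag: str) -> dict:
    """Return {attribute: (before, after)} for attributes of the first
    <tag> child of <os> whose value differs between the two documents."""
    before = ET.fromstring(defined_xml).find(tag)
    after = ET.fromstring(dumped_xml).find(tag)
    if before is None or after is None:
        return {}
    return {
        name: (value, after.get(name))
        for name, value in before.attrib.items()
        if after.get(name) != value
    }


# Abbreviated <os> fragments taken from scenario S3 (input, then dumpxml).
S3_INPUT = """<os>
  <loader readonly="no" secure="yes" type="pflash">/usr/share/edk2/ovmf/OVMF_CODE.secboot.fd</loader>
</os>"""

S3_DUMPED = """<os firmware="efi">
  <loader readonly="yes" secure="yes" type="pflash">/usr/share/edk2/ovmf/OVMF_CODE.secboot.fd</loader>
</os>"""

if __name__ == "__main__":
    # Reports the user-provided attribute values that were overridden.
    print(changed_attributes(S3_INPUT, S3_DUMPED, "loader"))
```

Run against the S3 fragments, it reports that `readonly` went from `no` to `yes`. The same helper with `tag="nvram"` would flag the changed `template` attribute from S1.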
(In reply to Meina Li from comment #3)
> Hi Andrea,
>
> I've tested this bug and many scenarios tests passed. But there are still
> three test scenarios that I suspect are bugs. Or that's how it was designed.
> Could you help to review this scenarios again? Thank you very much.

Thanks for the report! These all seem unexpected. I'll look into them and get back to you.

I have managed to reproduce all the scenarios that you've so very nicely and accurately described, and yeah they're all undesired. I'll start working on patches. In the meantime, moving the bug back to ASSIGNED.

Another unreasonable test scenario.
Test Version:
libvirt-9.3.0-2.el9.x86_64
qemu-kvm-8.0.0-3.el9.x86_64
Test Steps:
1. Prepare a guest xml with /usr/share/edk2/ovmf/OVMF_CODE.secboot.fd loader and noexist template.
<os>
<type arch='x86_64' machine='pc-q35-rhel9.2.0'>hvm</type>
<loader readonly='yes' secure='yes' type='pflash'>/usr/share/edk2/ovmf/OVMF_CODE.secboot.fd</loader>
<nvram template='noexist'>/var/lib/libvirt/qemu/nvram/test_VARS.fd</nvram>
<boot dev='hd'/>
</os>
2. Define the guest.
# virsh define test.xml
Domain 'test' defined from test.xml
# virsh dumpxml test --xpath //os
<os>
<type arch="x86_64" machine="pc-q35-rhel9.2.0">hvm</type>
<loader readonly="yes" secure="yes" type="pflash">/usr/share/edk2/ovmf/OVMF_CODE.secboot.fd</loader>
<nvram template="noexist">/var/lib/libvirt/qemu/nvram/test_VARS.fd</nvram>
<boot dev="hd"/>
</os>
3. Start the guest.
# virsh start test
error: Failed to start domain 'test'
error: Failed to open file 'noexist': No such file or directory
------> I think this is expected.
But if we use /usr/share/OVMF/OVMF_CODE.secboot.fd as the loader path, the result is different, which is strange.
1. Prepare a guest xml with /usr/share/OVMF/OVMF_CODE.secboot.fd loader and noexist template.
<os>
<type arch='x86_64' machine='pc-q35-rhel9.2.0'>hvm</type>
<loader readonly='yes' secure='yes' type='pflash'>/usr/share/OVMF/OVMF_CODE.secboot.fd</loader>
<nvram template='noexist'>/var/lib/libvirt/qemu/nvram/test_VARS.fd</nvram>
<boot dev='hd'/>
</os>
2. Define the guest.
# virsh define test.xml
Domain 'test' defined from test.xml
# virsh dumpxml test --xpath //os
<os>
<type arch="x86_64" machine="pc-q35-rhel9.2.0">hvm</type>
<loader readonly="yes" secure="yes" type="pflash">/usr/share/OVMF/OVMF_CODE.secboot.fd</loader>
<nvram template="/usr/share/OVMF/OVMF_VARS.fd">/var/lib/libvirt/qemu/nvram/test_VARS.fd</nvram>
<boot dev="hd"/>
</os>
------> After defining the guest, the template is changed, so the guest then starts successfully.
The relationship between /usr/share/OVMF/OVMF_CODE.secboot.fd and /usr/share/edk2/ovmf/OVMF_CODE.secboot.fd:
# ll /usr/share/OVMF/OVMF_CODE.secboot.fd
lrwxrwxrwx. 1 root root 33 May 22 04:38 /usr/share/OVMF/OVMF_CODE.secboot.fd -> ../edk2/ovmf/OVMF_CODE.secboot.fd
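
To make the observation concrete, here is a minimal sketch of how two paths can name the same file and still fail an exact string match. It does not reproduce libvirt's lookup code. It rebuilds the symlink layout shown above inside a temporary directory, and the set `known_paths` is a hypothetical stand-in for any list of firmware paths that is matched by string.

```python
import os
import tempfile


def same_file(path_a: str, path_b: str) -> bool:
    """True if both paths resolve to the same file once symlinks are followed."""
    return os.path.realpath(path_a) == os.path.realpath(path_b)


def demo() -> tuple:
    """Recreate the legacy -> modern symlink in a scratch directory and
    return (strings_equal, same_underlying_file, legacy_in_known_paths)."""
    with tempfile.TemporaryDirectory() as root:
        modern_dir = os.path.join(root, "edk2", "ovmf")
        legacy_dir = os.path.join(root, "OVMF")
        os.makedirs(modern_dir)
        os.makedirs(legacy_dir)

        modern = os.path.join(modern_dir, "OVMF_CODE.secboot.fd")
        legacy = os.path.join(legacy_dir, "OVMF_CODE.secboot.fd")
        with open(modern, "wb"):
            pass  # an empty placeholder file stands in for the firmware image
        # Same relative target as in the `ll` output above.
        os.symlink("../edk2/ovmf/OVMF_CODE.secboot.fd", legacy)

        # Hypothetical list that only mentions the modern path.
        known_paths = {modern}
        return (legacy == modern, same_file(legacy, modern), legacy in known_paths)


if __name__ == "__main__":
    print(demo())
```

The second element of the result is true while the first and third are false: the files are identical, but anything keyed on the path string sees two unrelated entries.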
(In reply to Meina Li from comment #7)
> ------> I think this is expected.
>
> But if we use /usr/share/OVMF/OVMF_CODE.secboot.fd as the loader path. It
> will have different result, which is strange.

Thanks for the additional testing! I'm still working on patches that will hopefully improve the situation, and I will keep these scenarios in mind too.

Patches posted upstream.
https://listman.redhat.com/archives/libvir-list/2023-August/241171.html

A few clarifications.

(In reply to Meina Li from comment #3)
> S1: Define a guest only with loader and file backed nvram template
> Test Steps:
> 1. Prepare a guest xml with the following os xml:
> <os>
> <type arch="x86_64" machine="pc-q35-rhel9.2.0">hvm</type>
> <loader readonly="yes" type="pflash">/usr/share/OVMF/OVMF_CODE.secboot.fd</loader>
> <nvram template="/usr/share/OVMF/OVMF_VARS.secboot.fd" type="file">
> <source file="/tmp/rhel_VARS.fd"/>
> </nvram>
> <boot dev="hd"/>
> </os>
> 2. Define the guest.
> # virsh define rhel.xml
> Domain 'rhel' defined from rhel.xml
> 3. Check the dumpxml.
> # virsh dumpxml rhel --xpath //os
> <os>
> <type arch="x86_64" machine="pc-q35-rhel9.2.0">hvm</type>
> <loader readonly="yes" secure="yes" type="pflash">/usr/share/OVMF/OVMF_CODE.secboot.fd</loader>
> <nvram template="/usr/share/OVMF/OVMF_VARS.fd" type="file">
> <source file="/tmp/rhel_VARS.fd"/>
> </nvram>
> <boot dev="hd"/>
> </os>
> ------>> We can't get the firmware features automatically. And the nvram
> template has been changed from OVMF_VARS.secboot.fd to OVMF_VARS.fd.

The fact that the NVRAM template has been changed from OVMF_VARS.secboot.fd to OVMF_VARS.fd is definitely undesired and shouldn't happen.
The fact that firmware features are not automatically discovered, however, is expected: that process relies on a corresponding JSON firmware descriptor file existing on the system, and as of RHEL 9 those files only include references to modern paths such as /usr/share/edk2/ovmf/OVMF_CODE.secboot.fd, not to legacy paths such as /usr/share/OVMF/OVMF_CODE.secboot.fd. So using the legacy paths will result in firmware features not being automatically discovered and added to the XML.

> S3: Define a guest with readonly='no'.
> 1. Prepare a guest xml with readonly='no' in os loader.
> <os>
> <type arch="x86_64" machine="pc-q35-rhel9.2.0">hvm</type>
> <loader readonly="no" secure="yes" type="pflash">/usr/share/edk2/ovmf/OVMF_CODE.secboot.fd</loader>
> <boot dev="hd"/>
> </os>
> 2. Define the guest.
> # virsh define rhel.xml
> Domain 'rhel' defined from rhel.xml
> 3. Check the dumpxml.
> # virsh dumpxml rhel --xpath //os
> <os firmware="efi">
> <type arch="x86_64" machine="pc-q35-rhel9.2.0">hvm</type>
> <firmware>
> <feature enabled="yes" name="enrolled-keys"/>
> <feature enabled="yes" name="secure-boot"/>
> </firmware>
> <loader readonly="yes" secure="yes" type="pflash">/usr/share/edk2/ovmf/OVMF_CODE.secboot.fd</loader>
> <nvram template="/usr/share/edk2/ovmf/OVMF_VARS.secboot.fd">/var/lib/libvirt/qemu/nvram/rhel_VARS.fd</nvram>
> <boot dev="hd"/>
> </os>
> ------>> The value of readonly is changed to 'yes', is it expected?

loader.readonly=no should be used for combined firmware images, that is, those where both the CODE and VARS parts are included in the same file. That's something that's not really used anymore, if it ever has been, because it prevents sharing the CODE part among different VMs. Even if you wanted to use it for some reason, you'd have to bring your own combined build - RHEL doesn't come with one. So the configuration you're testing is fundamentally incorrect.
That said, the fact that libvirt will happily override some user-provided value is still a problem.

(In reply to Meina Li from comment #7)
> But if we use /usr/share/OVMF/OVMF_CODE.secboot.fd as the loader path. It
> will have different result, which is strange.
> [...]
>
> The relationship between /usr/share/OVMF/OVMF_CODE.secboot.fd and
> /usr/share/edk2/ovmf/OVMF_CODE.secboot.fd:
> # ll /usr/share/OVMF/OVMF_CODE.secboot.fd
> lrwxrwxrwx. 1 root root 33 May 22 04:38 /usr/share/OVMF/OVMF_CODE.secboot.fd
> -> ../edk2/ovmf/OVMF_CODE.secboot.fd

See above. Even though the two paths refer to the same file, as far as libvirt is concerned they are completely different. Specifically, the modern path (/usr/share/edk2/ovmf/OVMF_CODE.secboot.fd) is mentioned in a JSON firmware descriptor file, while the legacy one (/usr/share/OVMF/OVMF_CODE.secboot.fd) is not and is instead included in the DEFAULT_LOADER_NVRAM list, which acts as the default value for the "nvram" key in qemu.conf if that hasn't been provided by the admin.

Fixes pushed upstream.
07b6189ef4 NEWS: Mention fixes to firmware selection
7c328b6cf4 tests: Reintroduce firmware-auto-efi-format-mismatch
48e5fe7af4 tests: Rename firmware-auto-efi-format-loader-qcow2-nvram-path
10a8997cbb conf: Don't default to raw format for loader/NVRAM
b845e376a4 qemu: Match NVRAM template extension for new domains
e96e322725 qemu: Filter firmware based on loader.readonly
ccbb987707 qemu: Generate NVRAM path in more cases
4a49114ff4 qemu: Don't overwrite NVRAM template for legacy firmware
1b3e9c67e3 tests: Include microvm in firmwaretest
da6b98394b tests: Drop tags from BIOS firmware descriptor
a97c56888c tests: Update firmware descriptor files
e930f62a02 tests: Add more tests for firmware selection
87d91e9e24 tests: Add some more DO_TEST*ABI_UPDATE* macros
ac76386eda qemu: Fix lookup against stateless/combined pflash
d917883b30 qemu: Fix return value for qemuFirmwareFillDomainLegacy()
4ba04107d9 tests: Rename firmware-auto-efi-nvram-path
8627ec167c tests: Turn abi-update.xml into a symlink
1773526224 tests: Consistently use /path/to/guest_VARS.fd
5c129c8e7a tests: Use virt-4.0 machine type for aarch64
8c326914d8 tests: Switch to firmware autoselection for hvf
751b0e6dbf tests: Use DO_TEST_CAPS_*_ABI_UPDATE() for ppc64

v9.6.0-71-g07b6189ef4

CentOS Stream 9 backport prepared.
https://gitlab.com/redhat/centos-stream/src/libvirt/-/merge_requests/7

This was reviewed by the RHEL voting members and approved.

Pre-verified Version:
libvirt-9.5.0-6.el9.x86_64
qemu-kvm-8.0.0-13.el9.x86_64
edk2-ovmf-20230524-3.el9.noarch
Pre-verified Steps:
S1: Define and start a guest with readonly='no'.
1. Prepare a guest xml with readonly='no' in os loader.
<os>
<type arch="x86_64" machine="pc-q35-rhel9.2.0">hvm</type>
<loader readonly="no" secure="yes" type="pflash">/usr/share/edk2/ovmf/OVMF_CODE.secboot.fd</loader>
<boot dev="hd"/>
</os>
2. Define the guest and check the dumpxml.
# virsh define rhel.xml
Domain 'rhel' defined from rhel.xml
# virsh dumpxml rhel --xpath //os
<os>
<type arch="x86_64" machine="pc-q35-rhel9.2.0">hvm</type>
<loader readonly="no" type="pflash">/usr/share/edk2/ovmf/OVMF_CODE.secboot.fd</loader>
<boot dev="hd"/>
</os>
3. Start the guest.
# virsh start rhel
error: Failed to start domain 'rhel'
error: internal error: QEMU unexpectedly closed the monitor (vm='rhel'): 2023-08-30T06:18:52.502400Z qemu-kvm: Could not open '/usr/share/edk2/ovmf/OVMF_CODE.secboot.fd': Permission denied
4. Repeat the test with the legacy loader path; this gives the same expected error message: qemu-kvm: Could not open '/usr/share/edk2/ovmf/OVMF_CODE.secboot.fd': Permission denied.
5. Repeat the test with only readonly='no' and efi firmware; this gives the expected error:
error: operation failed: Unable to find any firmware to satisfy 'efi'.
S2: Define and start a guest with a nonexistent template.
1. Prepare a guest xml with the legacy loader path and a nonexistent template.
<os>
<type arch="x86_64" machine="pc-q35-rhel9.2.0">hvm</type>
<loader readonly="yes" secure="yes" type="pflash">/usr/share/OVMF/OVMF_CODE.secboot.fd</loader>
<nvram template="noexist">/var/lib/libvirt/qemu/nvram/test_VARS.fd</nvram>
<boot dev="hd"/>
</os>
2. Define the guest and check dumpxml.
# virsh define rhel.xml
Domain 'rhel' defined from rhel.xml
# virsh dumpxml rhel --xpath //os
<os>
<type arch="x86_64" machine="pc-q35-rhel9.2.0">hvm</type>
<loader readonly="yes" secure="yes" type="pflash">/usr/share/OVMF/OVMF_CODE.secboot.fd</loader>
<nvram template="noexist">/var/lib/libvirt/qemu/nvram/test_VARS.fd</nvram>
<boot dev="hd"/>
</os>
3. Start the guest.
# virsh start rhel
error: Failed to start domain 'rhel'
error: Failed to open file 'noexist': No such file or directory
4. Repeat the test with the modern loader path and a nonexistent template; this gives the same expected error.
S3: Start a guest only with loader and file backed nvram template.
1. Prepare a guest xml with a file backed nvram template.
<os>
<type arch="x86_64" machine="pc-q35-rhel9.2.0">hvm</type>
<loader readonly='yes' secure='yes' type='pflash'>/usr/share/edk2/ovmf/OVMF_CODE.secboot.fd</loader>
<nvram template='/usr/share/edk2/ovmf/OVMF_VARS.secboot.fd' type='file'>
<source file='/tmp/rhel_VARS.fd'/>
</nvram>
<boot dev="hd"/>
</os>
2. Define and start the guest.
# virsh define rhel.xml
Domain 'rhel' defined from rhel.xml
# virsh start rhel
Domain 'rhel' started
# virsh dumpxml rhel --xpath //os
<os firmware="efi">
<type arch="x86_64" machine="pc-q35-rhel9.2.0">hvm</type>
<firmware>
<feature enabled="yes" name="enrolled-keys"/>
<feature enabled="yes" name="secure-boot"/>
</firmware>
<loader readonly="yes" secure="yes" type="pflash">/usr/share/edk2/ovmf/OVMF_CODE.secboot.fd</loader>
<nvram template="/usr/share/edk2/ovmf/OVMF_VARS.secboot.fd" type="file">
<source file="/tmp/rhel_VARS.fd"/>
</nvram>
<boot dev="hd"/>
</os>
S4: Start a guest with efi firmware and file backed nvram template.
1. Prepare a guest xml with efi firmware and file backed nvram template.
<os firmware='efi'>
<type arch='x86_64' machine='pc-q35-rhel9.2.0'>hvm</type>
<nvram template="/usr/share/edk2/ovmf/OVMF_VARS.secboot.fd" type="file">
<source file="/tmp/rhel_VARS.fd"/>
</nvram>
<boot dev='hd'/>
</os>
2. Define and start the guest.
# virsh define rhel.xml
Domain 'rhel' defined from rhel.xml
# virsh start rhel
Domain 'rhel' started
# virsh dumpxml rhel --xpath //os
<os firmware="efi">
<type arch="x86_64" machine="pc-q35-rhel9.2.0">hvm</type>
<firmware>
<feature enabled="yes" name="enrolled-keys"/>
<feature enabled="yes" name="secure-boot"/>
</firmware>
<loader readonly="yes" secure="yes" type="pflash">/usr/share/edk2/ovmf/OVMF_CODE.secboot.fd</loader>
<nvram template="/usr/share/edk2/ovmf/OVMF_VARS.secboot.fd" type="file">
<source file="/tmp/rhel_VARS.fd"/>
</nvram>
<boot dev="hd"/>
</os>
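
The checks applied to the dumped XML in S3 and S4 can be expressed as a short script. This is a sketch under stated assumptions rather than the tooling actually used for verification: the helper names `enabled_firmware_features` and `nvram_template` are hypothetical, and `S4_DUMPED` is a copy of the S4 output above.

```python
import xml.etree.ElementTree as ET


def enabled_firmware_features(os_xml: str) -> set:
    """Names of the <firmware><feature> entries marked enabled="yes"."""
    root = ET.fromstring(os_xml)
    return {
        feature.get("name")
        for feature in root.findall("./firmware/feature")
        if feature.get("enabled") == "yes"
    }


def nvram_template(os_xml: str):
    """The template attribute of <nvram>, or None if the element is absent."""
    nvram = ET.fromstring(os_xml).find("nvram")
    return None if nvram is None else nvram.get("template")


# Copy of the <os> element printed by `virsh dumpxml` in S4.
S4_DUMPED = """<os firmware="efi">
  <type arch="x86_64" machine="pc-q35-rhel9.2.0">hvm</type>
  <firmware>
    <feature enabled="yes" name="enrolled-keys"/>
    <feature enabled="yes" name="secure-boot"/>
  </firmware>
  <loader readonly="yes" secure="yes" type="pflash">/usr/share/edk2/ovmf/OVMF_CODE.secboot.fd</loader>
  <nvram template="/usr/share/edk2/ovmf/OVMF_VARS.secboot.fd" type="file">
    <source file="/tmp/rhel_VARS.fd"/>
  </nvram>
  <boot dev="hd"/>
</os>"""

if __name__ == "__main__":
    print(sorted(enabled_firmware_features(S4_DUMPED)))
    print(nvram_template(S4_DUMPED))
```

For the S4 output it finds both `enrolled-keys` and `secure-boot` enabled and confirms that the user-provided template path was preserved.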
Hi Andrea,

I found there are some patches in comment 13 related to Fedora 38's edk2 package. For example, the 4MB qcow2 format builds have been introduced on x86_64 in those new firmware descriptor files. We can't verify them now, and that will block the verification of this bug.

So could we open another bug to track those patches until the new edk2 package comes?

Thanks.

(In reply to Meina Li from comment #20)
> Hi Andrea,
>
> I found there are some patches in comment 13 related to the Fedora 38's edk2
> package. For example, the 4MB qcow2 format builds have been introduced on
> x86_64 in those new firmware descriptor files. Now we can't verify them. And
> it will block the verification of this bug.
>
> So could we open another bug to track those patches until the new edk2
> package comes?

There are no plans to switch x86_64 builds to qcow2 in RHEL. We did the switch in Fedora because that gave us the chance to also switch from 2M builds to 4M builds at the same time, which was long overdue and we just didn't have a good mechanism for until now. In RHEL x86_64 builds are already 4M, and unlike aarch64 there are no benefits in terms of memory usage to using qcow2, so switching would be pointless.

So, for any scenario in which upstream/Fedora uses qcow2 on x86_64, RHEL should get raw instead. And for those in which qcow2 is explicitly requested on x86_64, RHEL should report a failure.

Hope this helps. Please let me know if you need more information!

Sorry for the misunderstanding, as I thought the qcow2 firmware would also be supported on RHEL x86_64. So based on comment 19 and comment 21, move this bug to Verified: Tested.

(In reply to Meina Li from comment #22)
> Sorry for the misunderstanding as I thought the qcow2 firmware will
> also be supported on rhel x86_64.

No worries, it's a perfectly reasonable conclusion to reach based on looking at the libvirt changes.

> So based on comment 19 and comment 21, move this bug to
> Verified: Tested.
Thanks :)

Verification test passed with the test scenarios in comment 19.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Moderate: libvirt security, bug fix, and enhancement update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.
https://access.redhat.com/errata/RHSA-2023:6409