As part of adding support for QCOW2 firmware images (Bug 2161965) I have had to make some changes to the way libvirt performs firmware selection. Specifically: * firmware selection now happens only once, when the VM is defined, instead of each time it is started. This improves guest ABI stability; * firmware features (secure-boot, enrolled-keys) discovered via looking at JSON descriptors are now reflected in the XML. While QCOW2 firmware images are only going to be used on aarch64, the changes described above affect x86_64 as well. All of this is included in libvirt 9.2.0.
Created attachment 1963449 [details] libvirtd.log
Hi Andrea, I've tested this bug and many scenarios tests passed. But there are still three test scenarios that I suspect are bugs. Or that's how it was designed. Could you help to review this scenarios again? Thank you very much. Test Version: libvirt-9.3.0-1.el9.x86_64 qemu-kvm-8.0.0-1.el9.x86_64 S1: Define a guest only with loader and file backed nvram template Test Steps: 1. Prepare a guest xml with the following os xml: <os> <type arch="x86_64" machine="pc-q35-rhel9.2.0">hvm</type> <loader readonly="yes" type="pflash">/usr/share/OVMF/OVMF_CODE.secboot.fd</loader> <nvram template="/usr/share/OVMF/OVMF_VARS.secboot.fd" type="file"> <source file="/tmp/rhel_VARS.fd"/> </nvram> <boot dev="hd"/> </os> 2. Define the guest. # virsh define rhel.xml Domain 'rhel' defined from rhel.xml 3. Check the dumpxml. # virsh dumpxml rhel --xpath //os <os> <type arch="x86_64" machine="pc-q35-rhel9.2.0">hvm</type> <loader readonly="yes" secure="yes" type="pflash">/usr/share/OVMF/OVMF_CODE.secboot.fd</loader> <nvram template="/usr/share/OVMF/OVMF_VARS.fd" type="file"> <source file="/tmp/rhel_VARS.fd"/> </nvram> <boot dev="hd"/> </os> ------>> We can't get the firmware features automatically. And the nvram template has been changed from OVMF_CODE.secboot.fd to OVMF_VARS.fd. S2: Define a guest with efi firmware and file backed nvram template. 1. Prepare a guest xml with the following os xml: <os firmware='efi'> <type arch="x86_64" machine="pc-q35-rhel9.2.0">hvm</type> <nvram template="/usr/share/OVMF/OVMF_VARS.secboot.fd" type="file"> <source file="/tmp/rhel_VARS.fd"/> </nvram> <boot dev="hd"/> </os> 2. Define the guest. # virsh define rhel.xml error: Disconnected from qemu:///system due to end of file error: Failed to define domain from rhel.xml error: End of file while reading data: Input/output error ------>> Get unexpected error with no coredump and no error info in libvirtd.log(look at attachment.) S3: Define a guest with readonly='no'. 1. Prepare a guest xml with readonly='no' in os loader. <os> <type arch="x86_64" machine="pc-q35-rhel9.2.0">hvm</type> <loader readonly="no" secure="yes" type="pflash">/usr/share/edk2/ovmf/OVMF_CODE.secboot.fd</loader> <boot dev="hd"/> </os> 2. Define the guest. # virsh define rhel.xml Domain 'rhel' defined from rhel.xml 3. Check the dumpxml. # virsh dumpxml rhel --xpath //os <os firmware="efi"> <type arch="x86_64" machine="pc-q35-rhel9.2.0">hvm</type> <firmware> <feature enabled="yes" name="enrolled-keys"/> <feature enabled="yes" name="secure-boot"/> </firmware> <loader readonly="yes" secure="yes" type="pflash">/usr/share/edk2/ovmf/OVMF_CODE.secboot.fd</loader> <nvram template="/usr/share/edk2/ovmf/OVMF_VARS.secboot.fd">/var/lib/libvirt/qemu/nvram/rhel_VARS.fd</nvram> <boot dev="hd"/> </os> ------>> The value of readonly is changed to 'yes', is it expected?
(In reply to Meina Li from comment #3) > Hi Andrea, > > I've tested this bug and many scenarios tests passed. But there are still > three test scenarios that I suspect are bugs. Or that's how it was designed. > Could you help to review this scenarios again? Thank you very much. Thanks for the report! These all seem unexpected. I'll look into them and get back to you.
I have manage to reproduce all the scenarios that you've so very nicely and accurately described, and yeah they're all undesired. I'll start working on patches. In the meantime, moving the bug back to ASSIGNED.
Another unreasonable test scenario. Test Version: libvirt-9.3.0-2.el9.x86_64 qemu-kvm-8.0.0-3.el9.x86_64 Test Steps: 1. Prepare a guest xml with /usr/share/edk2/ovmf/OVMF_CODE.secboot.fd loader and noexist template. <os> <type arch='x86_64' machine='pc-q35-rhel9.2.0'>hvm</type> <loader readonly='yes' secure='yes' type='pflash'>/usr/share/edk2/ovmf/OVMF_CODE.secboot.fd</loader> <nvram template='noexist'>/var/lib/libvirt/qemu/nvram/test_VARS.fd</nvram> <boot dev='hd'/> </os> 2. Define the guest. # virsh define test.xml Domain 'test' defined from test.xml # virsh dumpxml test --xpath //os <os> <type arch="x86_64" machine="pc-q35-rhel9.2.0">hvm</type> <loader readonly="yes" secure="yes" type="pflash">/usr/share/edk2/ovmf/OVMF_CODE.secboot.fd</loader> <nvram template="noexist">/var/lib/libvirt/qemu/nvram/test_VARS.fd</nvram> <boot dev="hd"/> </os> 3. Start the guest. # virsh start test error: Failed to start domain 'test' error: Failed to open file 'noexist': No such file or directory ------> I think this is expected. But if we use /usr/share/OVMF/OVMF_CODE.secboot.fd as the loader path. It will have different result, which is strange. 1. Prepare a guest xml with /usr/share/OVMF/OVMF_CODE.secboot.fd loader and noexist template. <os> <type arch='x86_64' machine='pc-q35-rhel9.2.0'>hvm</type> <loader readonly='yes' secure='yes' type='pflash'>/usr/share/OVMF/OVMF_CODE.secboot.fd</loader> <nvram template='noexist'>/var/lib/libvirt/qemu/nvram/test_VARS.fd</nvram> <boot dev='hd'/> </os> 2. Define the guest. # virsh define test.xml Domain 'test' defined from test.xml # virsh dumpxml test --xpath //os <os> <type arch="x86_64" machine="pc-q35-rhel9.2.0">hvm</type> <loader readonly="yes" secure="yes" type="pflash">/usr/share/OVMF/OVMF_CODE.secboot.fd</loader> <nvram template="/usr/share/OVMF/OVMF_VARS.fd">/var/lib/libvirt/qemu/nvram/test_VARS.fd</nvram> <boot dev="hd"/> </os> ------> After we defining the guest, the template will be changed and then certainly the guest start succssfully. The relationship between /usr/share/OVMF/OVMF_CODE.secboot.fd and /usr/share/edk2/ovmf/OVMF_CODE.secboot.fd: # ll /usr/share/OVMF/OVMF_CODE.secboot.fd lrwxrwxrwx. 1 root root 33 May 22 04:38 /usr/share/OVMF/OVMF_CODE.secboot.fd -> ../edk2/ovmf/OVMF_CODE.secboot.fd
(In reply to Meina Li from comment #7) > ------> I think this is expected. > > But if we use /usr/share/OVMF/OVMF_CODE.secboot.fd as the loader path. It > will have different result, which is strange. Thanks for the additional testing! I'm still working on patches that will hopefully improve the situation, and I will keep these scenarios in mind too.
Patches posted upstream. https://listman.redhat.com/archives/libvir-list/2023-August/241171.html
A few clarifications. (In reply to Meina Li from comment #3) > S1: Define a guest only with loader and file backed nvram template > Test Steps: > 1. Prepare a guest xml with the following os xml: > <os> > <type arch="x86_64" machine="pc-q35-rhel9.2.0">hvm</type> > <loader readonly="yes" type="pflash">/usr/share/OVMF/OVMF_CODE.secboot.fd</loader> > <nvram template="/usr/share/OVMF/OVMF_VARS.secboot.fd" type="file"> > <source file="/tmp/rhel_VARS.fd"/> > </nvram> > <boot dev="hd"/> > </os> > 2. Define the guest. > # virsh define rhel.xml > Domain 'rhel' defined from rhel.xml > 3. Check the dumpxml. > # virsh dumpxml rhel --xpath //os > <os> > <type arch="x86_64" machine="pc-q35-rhel9.2.0">hvm</type> > <loader readonly="yes" secure="yes" type="pflash">/usr/share/OVMF/OVMF_CODE.secboot.fd</loader> > <nvram template="/usr/share/OVMF/OVMF_VARS.fd" type="file"> > <source file="/tmp/rhel_VARS.fd"/> > </nvram> > <boot dev="hd"/> > </os> > ------>> We can't get the firmware features automatically. And the nvram > template has been changed from OVMF_CODE.secboot.fd to OVMF_VARS.fd. The fact that the NVRAM template has been changed from OVMF_CODE.secboot.fd to OVMF_VARS.fd is definitely undesired and shouldn't happen. The fact that firmware features are not automatically discovered, however, is expected: that process relies on a corresponding JSON firmware descriptor file existing on the system, and as of RHEL 9 those files only include references to modern paths such as /usr/share/edk2/ovmf/OVMF_CODE.secboot.fd, not to legacy paths such as /usr/share/OVMF/OVMF_CODE.secboot.fd. So using the legacy paths will result in firmware features not being automatically discovered and added to the XML. > S3: Define a guest with readonly='no'. > 1. Prepare a guest xml with readonly='no' in os loader. > <os> > <type arch="x86_64" machine="pc-q35-rhel9.2.0">hvm</type> > <loader readonly="no" secure="yes" type="pflash">/usr/share/edk2/ovmf/OVMF_CODE.secboot.fd</loader> > <boot dev="hd"/> > </os> > 2. Define the guest. > # virsh define rhel.xml > Domain 'rhel' defined from rhel.xml > 3. Check the dumpxml. > # virsh dumpxml rhel --xpath //os > <os firmware="efi"> > <type arch="x86_64" machine="pc-q35-rhel9.2.0">hvm</type> > <firmware> > <feature enabled="yes" name="enrolled-keys"/> > <feature enabled="yes" name="secure-boot"/> > </firmware> > <loader readonly="yes" secure="yes" type="pflash">/usr/share/edk2/ovmf/OVMF_CODE.secboot.fd</loader> > <nvram> template="/usr/share/edk2/ovmf/OVMF_VARS.secboot.fd">/var/lib/libvirt/qemu/nvram/rhel_VARS.fd</nvram> > <boot dev="hd"/> > </os> > ------>> The value of readonly is changed to 'yes', is it expected? loader.readonly=no should be used for combined firmware images, that is, those where both the CODE and VARS parts are included in the same file. That's something that's not really used anymore, if it ever has been, because it prevents sharing the CODE part among different VMs. Even if you wanted to use it for some reason, you'd have to bring your own combined build - RHEL doesn't come with one. So the configuration you're testing is fundamentally incorrect. That said, the fact that libvirt will happily override some user-provided value is still a problem. (In reply to Meina Li from comment #7) > But if we use /usr/share/OVMF/OVMF_CODE.secboot.fd as the loader path. It > will have different result, which is strange. > [...] > > The relationship between /usr/share/OVMF/OVMF_CODE.secboot.fd and > /usr/share/edk2/ovmf/OVMF_CODE.secboot.fd: > # ll /usr/share/OVMF/OVMF_CODE.secboot.fd > lrwxrwxrwx. 1 root root 33 May 22 04:38 /usr/share/OVMF/OVMF_CODE.secboot.fd > -> ../edk2/ovmf/OVMF_CODE.secboot.fd See above. Even though the two paths refer to the same file, as far as libvirt is concerned they are completely different. Specifically, the modern path (/usr/share/edk2/ovmf/OVMF_CODE.secboot.fd) is mentioned in a JSON firmware descriptor file, while the legacy one (/usr/share/OVMF/OVMF_CODE.secboot.fd) is not and is instead included in the DEFAULT_LOADER_NVRAM list, which acts as the default value for the "nvram" key in qemu.conf if that hasn't been provided by the admin.