Bug 2196178 - libvirt: Changes to firmware selection [NEEDINFO]
Summary: libvirt: Changes to firmware selection
Keywords:
Status: ASSIGNED
Alias: None
Product: Red Hat Enterprise Linux 9
Classification: Red Hat
Component: libvirt
Version: 9.1
Hardware: Unspecified
OS: Unspecified
unspecified
unspecified
Target Milestone: rc
: ---
Assignee: Andrea Bolognani
QA Contact: Meina Li
URL:
Whiteboard:
Depends On:
Blocks:
TreeView+ depends on / blocked
 
Reported: 2023-05-08 09:02 UTC by Andrea Bolognani
Modified: 2023-08-15 18:51 UTC (History)
5 users (show)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed:
Type: Bug
Target Upstream Version:
Embargoed:
jferlan: needinfo? (abologna)


Attachments (Terms of Use)
libvirtd.log (378.20 KB, text/plain)
2023-05-09 03:19 UTC, Meina Li
no flags Details


Links
System ID Private Priority Status Summary Last Updated
Red Hat Issue Tracker RHELPLAN-156590 0 None None None 2023-05-08 09:05:36 UTC

Description Andrea Bolognani 2023-05-08 09:02:24 UTC
As part of adding support for QCOW2 firmware images (Bug 2161965) I
have had to make some changes to the way libvirt performs firmware
selection.

Specifically:

  * firmware selection now happens only once, when the VM is defined,
    instead of each time it is started. This improves guest ABI
    stability;

  * firmware features (secure-boot, enrolled-keys) discovered via
    looking at JSON descriptors are now reflected in the XML.

While QCOW2 firmware images are only going to be used on aarch64, the
changes described above affect x86_64 as well.

All of this is included in libvirt 9.2.0.

Comment 2 Meina Li 2023-05-09 03:19:51 UTC
Created attachment 1963449 [details]
libvirtd.log

Comment 3 Meina Li 2023-05-09 03:21:29 UTC
Hi Andrea,

I've tested this bug and many scenarios tests passed. But there are still three test scenarios that I suspect are bugs. Or that's how it was designed. Could you help to review this scenarios again? Thank you very much.

Test Version:
libvirt-9.3.0-1.el9.x86_64
qemu-kvm-8.0.0-1.el9.x86_64

S1: Define a guest only with loader and file backed nvram template
Test Steps:
1. Prepare a guest xml with the following os xml:
<os>
  <type arch="x86_64" machine="pc-q35-rhel9.2.0">hvm</type>
  <loader readonly="yes" type="pflash">/usr/share/OVMF/OVMF_CODE.secboot.fd</loader>
  <nvram template="/usr/share/OVMF/OVMF_VARS.secboot.fd" type="file">
    <source file="/tmp/rhel_VARS.fd"/>
  </nvram>
  <boot dev="hd"/>
</os>
2. Define the guest.
# virsh define rhel.xml 
Domain 'rhel' defined from rhel.xml
3. Check the dumpxml.
# virsh dumpxml rhel --xpath //os
<os>
  <type arch="x86_64" machine="pc-q35-rhel9.2.0">hvm</type>
  <loader readonly="yes" secure="yes" type="pflash">/usr/share/OVMF/OVMF_CODE.secboot.fd</loader>
  <nvram template="/usr/share/OVMF/OVMF_VARS.fd" type="file">
    <source file="/tmp/rhel_VARS.fd"/>
  </nvram>
  <boot dev="hd"/>
</os>
------>> We can't get the firmware features automatically. And the nvram template has been changed from OVMF_CODE.secboot.fd to OVMF_VARS.fd.

S2: Define a guest with efi firmware and file backed nvram template. 
1. Prepare a guest xml with the following os xml:
<os firmware='efi'>
  <type arch="x86_64" machine="pc-q35-rhel9.2.0">hvm</type>
  <nvram template="/usr/share/OVMF/OVMF_VARS.secboot.fd" type="file">
  <source file="/tmp/rhel_VARS.fd"/>
  </nvram>
  <boot dev="hd"/>
</os>
2. Define the guest.
# virsh define rhel.xml 
error: Disconnected from qemu:///system due to end of file
error: Failed to define domain from rhel.xml
error: End of file while reading data: Input/output error
------>> Get unexpected error with no coredump and no error info in libvirtd.log(look at attachment.)

S3: Define a guest with readonly='no'.
1. Prepare a guest xml with readonly='no' in os loader.
<os>
  <type arch="x86_64" machine="pc-q35-rhel9.2.0">hvm</type>
  <loader readonly="no" secure="yes" type="pflash">/usr/share/edk2/ovmf/OVMF_CODE.secboot.fd</loader>
  <boot dev="hd"/>
</os>
2. Define the guest.
# virsh define rhel.xml 
Domain 'rhel' defined from rhel.xml
3. Check the dumpxml.
# virsh dumpxml rhel --xpath //os
<os firmware="efi">
  <type arch="x86_64" machine="pc-q35-rhel9.2.0">hvm</type>
  <firmware>
    <feature enabled="yes" name="enrolled-keys"/>
    <feature enabled="yes" name="secure-boot"/>
  </firmware>
  <loader readonly="yes" secure="yes" type="pflash">/usr/share/edk2/ovmf/OVMF_CODE.secboot.fd</loader>
  <nvram template="/usr/share/edk2/ovmf/OVMF_VARS.secboot.fd">/var/lib/libvirt/qemu/nvram/rhel_VARS.fd</nvram>
  <boot dev="hd"/>
</os>
------>> The value of readonly is changed to 'yes', is it expected?

Comment 4 Andrea Bolognani 2023-05-09 09:09:34 UTC
(In reply to Meina Li from comment #3)
> Hi Andrea,
> 
> I've tested this bug and many scenarios tests passed. But there are still
> three test scenarios that I suspect are bugs. Or that's how it was designed.
> Could you help to review this scenarios again? Thank you very much.

Thanks for the report! These all seem unexpected. I'll look into
them and get back to you.

Comment 5 Andrea Bolognani 2023-05-11 13:32:26 UTC
I have manage to reproduce all the scenarios that you've so very
nicely and accurately described, and yeah they're all undesired.

I'll start working on patches. In the meantime, moving the bug back
to ASSIGNED.

Comment 7 Meina Li 2023-05-24 02:26:12 UTC
Another unreasonable test scenario.

Test Version:
libvirt-9.3.0-2.el9.x86_64
qemu-kvm-8.0.0-3.el9.x86_64

Test Steps:
1. Prepare a guest xml with /usr/share/edk2/ovmf/OVMF_CODE.secboot.fd loader and noexist template.
  <os>
    <type arch='x86_64' machine='pc-q35-rhel9.2.0'>hvm</type>
    <loader readonly='yes' secure='yes' type='pflash'>/usr/share/edk2/ovmf/OVMF_CODE.secboot.fd</loader>
    <nvram template='noexist'>/var/lib/libvirt/qemu/nvram/test_VARS.fd</nvram>
    <boot dev='hd'/>
  </os>
2. Define the guest.
# virsh define test.xml 
Domain 'test' defined from test.xml
# virsh dumpxml test --xpath //os
<os>
  <type arch="x86_64" machine="pc-q35-rhel9.2.0">hvm</type>
  <loader readonly="yes" secure="yes" type="pflash">/usr/share/edk2/ovmf/OVMF_CODE.secboot.fd</loader>
  <nvram template="noexist">/var/lib/libvirt/qemu/nvram/test_VARS.fd</nvram>
  <boot dev="hd"/>
</os>
3. Start the guest.
# virsh start test
error: Failed to start domain 'test'
error: Failed to open file 'noexist': No such file or directory
------> I think this is expected.

But if we use /usr/share/OVMF/OVMF_CODE.secboot.fd as the loader path. It will have different result, which is strange.
1. Prepare a guest xml with /usr/share/OVMF/OVMF_CODE.secboot.fd loader and noexist template.
  <os>
    <type arch='x86_64' machine='pc-q35-rhel9.2.0'>hvm</type>
    <loader readonly='yes' secure='yes' type='pflash'>/usr/share/OVMF/OVMF_CODE.secboot.fd</loader>
    <nvram template='noexist'>/var/lib/libvirt/qemu/nvram/test_VARS.fd</nvram>
    <boot dev='hd'/>
  </os>
2. Define the guest.
# virsh define test.xml 
Domain 'test' defined from test.xml
# virsh dumpxml test --xpath //os
<os>
  <type arch="x86_64" machine="pc-q35-rhel9.2.0">hvm</type>
  <loader readonly="yes" secure="yes" type="pflash">/usr/share/OVMF/OVMF_CODE.secboot.fd</loader>
  <nvram template="/usr/share/OVMF/OVMF_VARS.fd">/var/lib/libvirt/qemu/nvram/test_VARS.fd</nvram>
  <boot dev="hd"/>
</os>

------> After we defining the guest, the template will be changed and then certainly the guest start succssfully.

The relationship between /usr/share/OVMF/OVMF_CODE.secboot.fd and /usr/share/edk2/ovmf/OVMF_CODE.secboot.fd:
# ll /usr/share/OVMF/OVMF_CODE.secboot.fd
lrwxrwxrwx. 1 root root 33 May 22 04:38 /usr/share/OVMF/OVMF_CODE.secboot.fd -> ../edk2/ovmf/OVMF_CODE.secboot.fd

Comment 8 Andrea Bolognani 2023-05-25 13:55:47 UTC
(In reply to Meina Li from comment #7)
> ------> I think this is expected.
> 
> But if we use /usr/share/OVMF/OVMF_CODE.secboot.fd as the loader path. It
> will have different result, which is strange.

Thanks for the additional testing! I'm still working on patches that
will hopefully improve the situation, and I will keep these scenarios
in mind too.

Comment 9 Andrea Bolognani 2023-08-10 16:53:33 UTC
Patches posted upstream.

https://listman.redhat.com/archives/libvir-list/2023-August/241171.html

Comment 10 Andrea Bolognani 2023-08-10 17:15:01 UTC
A few clarifications.

(In reply to Meina Li from comment #3)
> S1: Define a guest only with loader and file backed nvram template
> Test Steps:
> 1. Prepare a guest xml with the following os xml:
> <os>
>   <type arch="x86_64" machine="pc-q35-rhel9.2.0">hvm</type>
>   <loader readonly="yes" type="pflash">/usr/share/OVMF/OVMF_CODE.secboot.fd</loader>
>   <nvram template="/usr/share/OVMF/OVMF_VARS.secboot.fd" type="file">
>     <source file="/tmp/rhel_VARS.fd"/>
>   </nvram>
>   <boot dev="hd"/>
> </os>
> 2. Define the guest.
> # virsh define rhel.xml 
> Domain 'rhel' defined from rhel.xml
> 3. Check the dumpxml.
> # virsh dumpxml rhel --xpath //os
> <os>
>   <type arch="x86_64" machine="pc-q35-rhel9.2.0">hvm</type>
>   <loader readonly="yes" secure="yes" type="pflash">/usr/share/OVMF/OVMF_CODE.secboot.fd</loader>
>   <nvram template="/usr/share/OVMF/OVMF_VARS.fd" type="file">
>     <source file="/tmp/rhel_VARS.fd"/>
>   </nvram>
>   <boot dev="hd"/>
> </os>
> ------>> We can't get the firmware features automatically. And the nvram
> template has been changed from OVMF_CODE.secboot.fd to OVMF_VARS.fd.

The fact that the NVRAM template has been changed from
OVMF_CODE.secboot.fd to OVMF_VARS.fd is definitely undesired and
shouldn't happen.

The fact that firmware features are not automatically discovered,
however, is expected: that process relies on a corresponding JSON
firmware descriptor file existing on the system, and as of RHEL 9
those files only include references to modern paths such as
/usr/share/edk2/ovmf/OVMF_CODE.secboot.fd, not to legacy paths such
as /usr/share/OVMF/OVMF_CODE.secboot.fd. So using the legacy paths
will result in firmware features not being automatically discovered
and added to the XML.

> S3: Define a guest with readonly='no'.
> 1. Prepare a guest xml with readonly='no' in os loader.
> <os>
>   <type arch="x86_64" machine="pc-q35-rhel9.2.0">hvm</type>
>   <loader readonly="no" secure="yes" type="pflash">/usr/share/edk2/ovmf/OVMF_CODE.secboot.fd</loader>
>   <boot dev="hd"/>
> </os>
> 2. Define the guest.
> # virsh define rhel.xml
> Domain 'rhel' defined from rhel.xml
> 3. Check the dumpxml.
> # virsh dumpxml rhel --xpath //os
> <os firmware="efi">
>   <type arch="x86_64" machine="pc-q35-rhel9.2.0">hvm</type>
>   <firmware>
>     <feature enabled="yes" name="enrolled-keys"/>
>     <feature enabled="yes" name="secure-boot"/>
>   </firmware>
>   <loader readonly="yes" secure="yes" type="pflash">/usr/share/edk2/ovmf/OVMF_CODE.secboot.fd</loader>
>   <nvram> template="/usr/share/edk2/ovmf/OVMF_VARS.secboot.fd">/var/lib/libvirt/qemu/nvram/rhel_VARS.fd</nvram>
>   <boot dev="hd"/>
> </os>
> ------>> The value of readonly is changed to 'yes', is it expected?

loader.readonly=no should be used for combined firmware images, that
is, those where both the CODE and VARS parts are included in the same
file. That's something that's not really used anymore, if it ever has
been, because it prevents sharing the CODE part among different VMs.

Even if you wanted to use it for some reason, you'd have to bring
your own combined build - RHEL doesn't come with one. So the
configuration you're testing is fundamentally incorrect.

That said, the fact that libvirt will happily override some
user-provided value is still a problem.

(In reply to Meina Li from comment #7)
> But if we use /usr/share/OVMF/OVMF_CODE.secboot.fd as the loader path. It
> will have different result, which is strange.
>
[...]
>
> The relationship between /usr/share/OVMF/OVMF_CODE.secboot.fd and
> /usr/share/edk2/ovmf/OVMF_CODE.secboot.fd:
> # ll /usr/share/OVMF/OVMF_CODE.secboot.fd
> lrwxrwxrwx. 1 root root 33 May 22 04:38 /usr/share/OVMF/OVMF_CODE.secboot.fd > -> ../edk2/ovmf/OVMF_CODE.secboot.fd

See above. Even though the two paths refer to the same file, as far
as libvirt is concerned they are completely different. Specifically,
the modern path (/usr/share/edk2/ovmf/OVMF_CODE.secboot.fd) is
mentioned in a JSON firmware descriptor file, while the legacy one
(/usr/share/OVMF/OVMF_CODE.secboot.fd) is not and is instead
included in the DEFAULT_LOADER_NVRAM list, which acts as the default
value for the "nvram" key in qemu.conf if that hasn't been provided
by the admin.


Note You need to log in before you can comment on or make changes to this bug.