Bug 1564270

Summary: RFE: QEMU firmware metadata format - libvirt support [RHEL-8]
Product: Red Hat Enterprise Linux Advanced Virtualization
Reporter: Ademar Reis <areis>
Component: libvirt
Assignee: Michal Privoznik <mprivozn>
Status: CLOSED ERRATA
QA Contact: Meina Li <meili>
Severity: unspecified
Docs Contact:
Priority: unspecified
Version: 8.0
CC: berrange, chayang, crobinso, dyuan, edacval, hpopal, jdenemar, jsuchane, juzhang, kchamart, knoel, lersek, lmen, meili, mprivozn, virt-bugs, virt-maint, xuzhang
Target Milestone: rc
Keywords: FutureFeature, Upstream
Target Release: 8.0
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version: libvirt-5.3.0-1.el8
Doc Type: Enhancement
Doc Text:
Story Points: ---
Clone Of: 1546084
Environment:
Last Closed: 2019-11-06 07:11:30 UTC
Type: Feature Request
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On: 1546084
Bug Blocks: 1217444, 1369007
Attachments (description: flags):
 - The xml of ovmf guest: none
 - Debug messages: none

Description Ademar Reis 2018-04-05 21:00:12 UTC
libvirt will need to support it somehow.

+++ This bug was initially created as a clone of Bug #1546084 +++

(Suggested by: Gerd Hoffmann and Dan Berrangé on different lists.)

For each firmware file we need a metadata file in a well-defined
location, e.g. /usr/share/qemu/bios/, that lists details such as:

 - Path to the firmware binary
 - Path to the pre-built OVMF 'vars' file (if any)
 - Supported architectures
 - Associated QEMU feature flags (Secure Boot)
 - Whether the binary provides / requires SMM (System Management Mode)

Libvirt can read these metadata files and then pick the correct firmware
image based on the settings for the guest.

Essentially, QEMU would define the file format and provide metadata
files for any ROMs it ships directly. If vendors ship extra ROMs such as
OVMF, the vendor (distribution) should provide suitable metadata
files.
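
As a rough illustration of the proposal above, the selection step could look like the following sketch. Everything here is hypothetical: the keys (binary, vars_template, architectures, features, requires_smm), the function pick_firmware, and the example entries are made up for illustration and are not the format QEMU later defined nor libvirt's implementation.

```python
import json

# Hypothetical metadata entries; the keys are illustrative only and are
# not the schema that QEMU eventually defined.
EXAMPLE_METADATA = json.loads("""
[
  {"binary": "/usr/share/edk2/ovmf/OVMF_CODE.secboot.fd",
   "vars_template": "/usr/share/edk2/ovmf/OVMF_VARS.secboot.fd",
   "architectures": ["x86_64"],
   "features": ["secure-boot"],
   "requires_smm": true},
  {"binary": "/usr/share/seabios/bios-256k.bin",
   "vars_template": null,
   "architectures": ["i386", "x86_64"],
   "features": [],
   "requires_smm": false}
]
""")


def pick_firmware(entries, arch, wanted_features=(), smm_available=True):
    """Return the first entry that satisfies the guest settings, or None."""
    for entry in entries:
        if arch not in entry["architectures"]:
            continue  # firmware not built for this guest architecture
        if not set(wanted_features) <= set(entry["features"]):
            continue  # a requested feature (e.g. Secure Boot) is missing
        if entry["requires_smm"] and not smm_available:
            continue  # firmware needs SMM but the guest cannot provide it
        return entry
    return None
```

The point of the sketch is only that the management layer matches guest settings against declarative metadata instead of carrying its own list of firmware paths.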
   

References:
 - https://www.redhat.com/archives/virt-tools-list/2014-September/msg00145.html
   Discussion from 2014 about the OVMF metadata format, where Gerd
   suggested the idea of a firmware registry format for libvirt

--- Additional comment from Kashyap Chamarthy on 2018-02-16 08:07:58 BRST ---

There's an additional consideration that Laszlo brought up elsewhere on this topic:

    Would this [the idea presented in bug description] be flexible 
    enough to tell apart OVMF binaries that can be used interchangeably
    w.r.t. the QEMU command line, but have different firmware features 
    built into them?

--- Additional comment from Laszlo Ersek on 2018-04-03 11:11:19 BRT ---

http://mid.mail-archive.com/20180307144951.d75lo5rgzi2vf27z@eukaryote

Comment 2 Laszlo Ersek 2018-07-27 12:38:37 UTC
*** Bug 1609225 has been marked as a duplicate of this bug. ***

Comment 3 Daniel Berrangé 2019-02-01 15:49:11 UTC
FYI, from the libvirt side, this entire effort in QEMU stems from the original libvirt patch series at https://www.redhat.com/archives/libvir-list/2016-October/msg00045.html, where we concluded that we wanted to replace libvirt's list of OVMF files with metadata before continuing.

Comment 4 Michal Privoznik 2019-02-27 10:06:58 UTC
First version posted onto the list:

https://www.redhat.com/archives/libvir-list/2019-February/msg01503.html

Comment 6 Michal Privoznik 2019-03-12 15:11:55 UTC
I've just merged patches upstream:

68ade25372 qemu: Enable firmware autoselection
d433f3cdd8 qemuDomainDefValidate: Don't require SMM if automatic firmware selection enabled
43527af27c qemu_process: Call qemuFirmwareFillDomain
804d2003e6 qemu_firmware: Introduce qemuFirmwareFillDomain()
31eb3093c0 qemufirmwaretest: Test qemuFirmwareFetchConfigs()
3c876d2428 qemu_firmware: Introduce qemuFirmwareFetchConfigs
04406d87d2 test: Introduce qemufirmwaretest
8b5b80f4c5 qemu: Introduce basic skeleton for parsing firmware description
d947fa8a08 conf: Introduce firmware attribute to <os/>
d21f89cc1a conf: Introduce VIR_DOMAIN_LOADER_TYPE_NONE
cdd592553a virDomainLoaderDefParseXML: Allow loader path to be NULL
849a0cfef1 qemu_capabilities: Expose qemu <-> libvirt arch translators
23018c0823 qemu_domain: Separate NVRAM VAR store file name generation

v5.1.0-191-g68ade25372

Comment 8 Michal Privoznik 2019-04-05 08:02:00 UTC
I've sent some more patches that expose the feature in domain capabilities:

https://www.redhat.com/archives/libvir-list/2019-April/msg00460.html

They might be worth backporting too.

Comment 9 Michal Privoznik 2019-04-10 16:12:21 UTC
Patches are now pushed upstream:

947ea8665e tests: Fix MinGW build for domaincapstest
5b9819eedc domain capabilities: Expose firmware auto selection feature
9c0d73bf49 qemu_firmware: Introduce qemuFirmwareGetSupported
2337309e04 qemu_firmware: Separate machine and arch matching into a function
15e0b76480 qemu_firmware: Separate firmware loading into a function

v5.2.0-163-g947ea8665e

Comment 11 Meina Li 2019-07-02 03:24:09 UTC
Hi Michal,

I can't start a guest with firmware='bios'. Can you help me check this issue? Thank you very much.

Test Version:
libvirt-5.4.0-1.module+el8.1.0+3304+7eb41d5f.x86_64
qemu-kvm-4.0.0-4.module+el8.1.0+3356+cda7f1ee.x86_64
kernel-4.18.0-107.el8.x86_64

Test Steps:
1. Start a guest with the following os elements with 'bios' firmware:
...
<os firmware='bios'>
  <type arch='x86_64' machine='pc-q35-rhel8.0.0'>hvm</type>
  <boot dev='hd'/>
  <bootmenu enable='yes'/>
</os>
...
# virsh start lmn
error: Failed to start domain lmn
error: operation failed: Unable to find any firmware to satisfy 'bios'
# rpm -ql seabios-bin-1.11.1-3.module+el8.1.0+2983+b2ae9c0a.noarch
/usr/share/seabios
/usr/share/seabios/bios-256k.bin
/usr/share/seabios/bios.bin

2. Start a guest with the following os elements with 'efi' firmware:
...
<os firmware='efi'>
  <type arch='x86_64' machine='pc-q35-rhel8.0.0'>hvm</type>
  <boot dev='hd'/>
</os>
...
# virsh start ovmf 
Domain ovmf started
# virsh dumpxml ovmf | grep os -B6
...
  <os>
    <type arch='x86_64' machine='pc-q35-rhel8.0.0'>hvm</type>
    <loader readonly='yes' type='pflash'>/usr/share/edk2/ovmf/OVMF_CODE.secboot.fd</loader>
    <nvram template='/usr/share/edk2/ovmf/OVMF_VARS.secboot.fd'>/var/lib/libvirt/qemu/nvram/ovmf_VARS.fd</nvram>
    <boot dev='hd'/>
    <bootmenu enable='yes'/>
  </os>

Comment 12 Michal Privoznik 2019-07-08 09:14:35 UTC
(In reply to Meina Li from comment #11)
> Hi Michal,
> 
> I can't start guest with firmware='bios', can you help me check this issue?
> Thank you very much.
> 
> Test Version:
> libvirt-5.4.0-1.module+el8.1.0+3304+7eb41d5f.x86_64
> qemu-kvm-4.0.0-4.module+el8.1.0+3356+cda7f1ee.x86_64
> kernel-4.18.0-107.el8.x86_64
> 
> Test Steps:
> 1. Start a guest with the following os elements with 'bios' firmware:
> ...
> <os firmware='bios'>
>   <type arch='x86_64' machine='pc-q35-rhel8.0.0'>hvm</type>
>   <boot dev='hd'/>
>   <bootmenu enable='yes'/>
> </os>
> ...
> # virsh start lmn
> error: Failed to start domain lmn
> error: operation failed: Unable to find any firmware to satisfy 'bios'
> # rpm -ql seabios-bin-1.11.1-3.module+el8.1.0+2983+b2ae9c0a.noarch
> /usr/share/seabios
> /usr/share/seabios/bios-256k.bin
> /usr/share/seabios/bios.bin


Having the firmware binaries installed is not enough. For 'firmware=' to work you also need so-called firmware descriptor files, which live in /usr/share/qemu/firmware/. Libvirt reports which firmware types it found descriptors for in "virsh domcapabilities":

<domainCapabilities>
  ...
  <os supported='yes'>
    <enum name='firmware'>
      <value>efi</value>
    </enum>
  </os>
  ...
</domainCapabilities>

In this example, libvirt found only 'efi' descriptors and thus only "firmware='efi'" will work.
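
A small sketch of checking this programmatically: it parses domain capabilities XML (as printed by "virsh domcapabilities") and returns the values of the 'firmware' enum. The function name supported_firmware is an assumption for illustration; the XML layout is the one shown above.

```python
import xml.etree.ElementTree as ET


def supported_firmware(domcaps_xml):
    """Return the firmware types listed in <os><enum name='firmware'>."""
    root = ET.fromstring(domcaps_xml)
    enum = root.find("./os/enum[@name='firmware']")
    if enum is None:
        return []  # libvirt too old to report the enum at all
    # An empty <enum name='firmware'/> yields an empty list, meaning no
    # matching descriptor was found for this machine type.
    return [value.text for value in enum.findall("value")]
```

An empty result means that firmware='...' cannot be satisfied for the queried machine type, even if firmware binaries are installed.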

Comment 13 Meina Li 2019-07-09 10:11:17 UTC
Hi Michal,

Thanks for your reply. I still have two more questions:

1) For 'firmware=', does this mean that only 'efi' is supported and that 'bios' has no actual effect for now? Will firmware descriptor meta-files for bios be provided in the future?

2) With 'virsh domcapabilities', I also can't get the efi value:
# rpm -qa libvirt qemu-kvm
libvirt-5.5.0-1.module+el8.1.0+3580+d7f6488d.x86_64
qemu-kvm-4.0.0-4.module+el8.1.0+3523+b348b848.x86_64
# virsh domcapabilities
...
 <os supported='yes'>
    <enum name='firmware'/>
    <loader supported='yes'>
      <value>/usr/share/OVMF/OVMF_CODE.secboot.fd</value>
      <enum name='type'>
        <value>rom</value>
        <value>pflash</value>
      </enum>
      <enum name='readonly'>
        <value>yes</value>
        <value>no</value>
      </enum>
      <enum name='secure'>
        <value>no</value>
      </enum>
    </loader>
  </os>
...

Thank you very much.

Comment 14 Michal Privoznik 2019-07-09 13:48:53 UTC
(In reply to Meina Li from comment #13)
> Hi Michal,
> 
> Thanks for your reply, and I still have another two issues:
> 
> 1) For 'firmware=', does it means we only support 'efi' and 'bios' doesn't
> have actual meaning now?

Libvirt's code is written so that it's capable of dealing with both. It's up to distros to ship FW descriptor files.

> Will we provide firmware descriptor meta-files for
> bios in the future?

This is a very good question and I don't know the answer. Perhaps Laszlo does?

> 
> 2) For 'virsh domcapabilities', I also can't get the efi value:
> # rpm -qa libvirt qemu-kvm
> libvirt-5.5.0-1.module+el8.1.0+3580+d7f6488d.x86_64
> qemu-kvm-4.0.0-4.module+el8.1.0+3523+b348b848.x86_64
> # virsh domcapabilities
> ...
>  <os supported='yes'>
>     <enum name='firmware'/>
>     <loader supported='yes'>
>       <value>/usr/share/OVMF/OVMF_CODE.secboot.fd</value>
>       <enum name='type'>
>         <value>rom</value>
>         <value>pflash</value>
>       </enum>
>       <enum name='readonly'>
>         <value>yes</value>
>         <value>no</value>
>       </enum>
>       <enum name='secure'>
>         <value>no</value>
>       </enum>
>     </loader>
>   </os>
> ...
> 
> Thank you very much.

This is very likely because of a machine type mismatch. Try specifying 'domcapabilities --machine q35' - AFAICT, in RHEL we support UEFI only for q35, but the default machine type is pc.

Comment 15 Laszlo Ersek 2019-07-10 14:54:13 UTC
(In reply to Michal Privoznik from comment #14)
> (In reply to Meina Li from comment #13)

> > Will we provide firmware descriptor meta-files for
> > bios in the future?
> 
> This is a very good question and I don't know the answer. Perhaps Laszlo
> does?

I'm unaware of such a plan, at this time.

> > 2) For 'virsh domcapabilities', I also can't get the efi value:
> > # rpm -qa libvirt qemu-kvm
> > libvirt-5.5.0-1.module+el8.1.0+3580+d7f6488d.x86_64
> > qemu-kvm-4.0.0-4.module+el8.1.0+3523+b348b848.x86_64
> > # virsh domcapabilities
> > ...
> >  <os supported='yes'>
> >     <enum name='firmware'/>
> >     <loader supported='yes'>
> >       <value>/usr/share/OVMF/OVMF_CODE.secboot.fd</value>
> >       <enum name='type'>
> >         <value>rom</value>
> >         <value>pflash</value>
> >       </enum>
> >       <enum name='readonly'>
> >         <value>yes</value>
> >         <value>no</value>
> >       </enum>
> >       <enum name='secure'>
> >         <value>no</value>
> >       </enum>
> >     </loader>
> >   </os>
> > ...
> > 
> > Thank you very much.
> 
> This is very likely because machine type mismatch. Try specifying
> 'domcapabilities --machine q35' - AFAICT, in RHEL we support UEFI only for
> q35 but the default machine type is pc.

Correct:

$ rpm -ql edk2-ovmf | fgrep /usr/share/qemu/firmware/

/usr/share/qemu/firmware/40-edk2-ovmf-sb.json
/usr/share/qemu/firmware/50-edk2-ovmf.json

And in both of those files, we have:

    "targets": [
        {
            "architecture": "x86_64",
            "machines": [
                "pc-q35-*"
            ]
        }
    ],

Thanks.
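
The effect of the "targets" stanza can be sketched as below. This is an illustration only: it assumes the machine entries behave like shell-style glob patterns and that the caller passes a full machine type name (for example pc-q35-rhel8.0.0, not an alias such as q35). The function name target_matches is hypothetical.

```python
from fnmatch import fnmatchcase

# The "targets" list as quoted from the edk2-ovmf descriptors above.
TARGETS = [{"architecture": "x86_64", "machines": ["pc-q35-*"]}]


def target_matches(targets, arch, machine):
    """True if any target covers the given architecture and machine type."""
    return any(
        target["architecture"] == arch
        and any(fnmatchcase(machine, pattern) for pattern in target["machines"])
        for target in targets
    )
```

With these targets, a pc-q35-* machine matches while a pc-i440fx-* machine does not, which is consistent with the empty 'firmware' enum reported for the default pc machine type.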

Comment 16 Meina Li 2019-07-11 06:00:19 UTC
According to comment 14 and comment 15, verification of this bug will mainly test efi firmware on the q35 machine type.

Verified Version:
libvirt-5.5.0-1.module+el8.1.0+3580+d7f6488d.x86_64
qemu-kvm-4.0.0-4.module+el8.1.0+3523+b348b848.x86_64
kernel-4.18.0-107.el8.x86_64

Verified Steps:
Test Scenario 1: Define/start an OVMF guest with efi firmware but without secure element
1. Prepare an OVMF guest with the following xml:
…
 <os firmware='efi'>
    <type arch='x86_64' machine='pc-q35-rhel8.0.0'>hvm</type>
    <boot dev='hd'/>
    <bootmenu enable='yes'/>
  </os>
...
2. Define and start the guest, check the guest xml.
# virsh define ovmf.xml
Domain ovmf defined from ovmf.xml
# virsh start ovmf
Domain ovmf started
# virsh dumpxml ovmf | grep os -B 5
…
 <os>
    <type arch='x86_64' machine='pc-q35-rhel8.0.0'>hvm</type>
    <loader readonly='yes' type='pflash'>/usr/share/edk2/ovmf/OVMF_CODE.secboot.fd</loader>
    <nvram template='/usr/share/edk2/ovmf/OVMF_VARS.secboot.fd'>/var/lib/libvirt/qemu/nvram/ovmf_VARS.fd</nvram>
    <boot dev='hd'/>
    <bootmenu enable='yes'/>
  </os>
3. Log in to the guest and check.
(guest)# efibootmgr
BootCurrent: 0002
Timeout: 0 seconds
BootOrder: 0002,0001,0000
Boot0000* UiApp
Boot0001* UEFI Misc Device
Boot0002* Red Hat Enterprise Linux

Test Scenario 2: Define/start an OVMF guest with efi firmware and secure='yes|no'
1. Prepare an OVMF guest with the following xml:
…
 <os firmware='efi'>
    <type arch='x86_64' machine='pc-q35-rhel8.0.0'>hvm</type>
    <loader secure='yes'/>                      --or use ‘no’ to test
    <boot dev='hd'/>
    <bootmenu enable='yes'/>
  </os>
…
2. Define and start the guest, check the guest xml.
# virsh define ovmf.xml
Domain ovmf defined from ovmf.xml
# virsh start ovmf
Domain ovmf started
# virsh dumpxml ovmf | grep os -B 6
...
  <os>
    <type arch='x86_64' machine='pc-q35-rhel8.0.0'>hvm</type>
    <loader readonly='yes' secure='yes' type='pflash'>/usr/share/edk2/ovmf/OVMF_CODE.secboot.fd</loader>
    <nvram template='/usr/share/edk2/ovmf/OVMF_VARS.secboot.fd'>/var/lib/libvirt/qemu/nvram/ovmf_VARS.fd</nvram>
    <boot dev='hd'/>
    <bootmenu enable='yes'/>
  </os>

Test Scenario 3: Define/start an OVMF guest with efi firmware but without smm element
1. Prepare an OVMF guest with the following xml:
…
  <features>
    <acpi/>
    <apic/>
    <vmport state='off'/>
  </features>
…
 <os firmware='efi'>
    <type arch='x86_64' machine='pc-q35-rhel8.0.0'>hvm</type>
    <loader secure='yes'/>                      
    <boot dev='hd'/>
    <bootmenu enable='yes'/>
  </os>
…
2. Define and start the guest, check the guest xml.
# virsh define ovmf.xml
Domain ovmf defined from ovmf.xml
# virsh start ovmf
Domain ovmf started
# virsh dumpxml ovmf
…
  <os>
    <type arch='x86_64' machine='pc-q35-rhel8.0.0'>hvm</type>
    <loader readonly='yes' secure='yes' type='pflash'>/usr/share/edk2/ovmf/OVMF_CODE.secboot.fd</loader>
    <nvram template='/usr/share/edk2/ovmf/OVMF_VARS.secboot.fd'>/var/lib/libvirt/qemu/nvram/ovmf_VARS.fd</nvram>
    <boot dev='hd'/>
    <bootmenu enable='yes'/>
  </os>
  <features>
    <acpi/>
    <apic/>
    <vmport state='off'/>
    <smm state='on'/>
  </features>
…

Test Scenario 4: Use domcapabilities to check q35 firmware.
# virsh domcapabilities --machine q35 | grep efi -a5
  <arch>x86_64</arch>
  <vcpu max='384'/>
  <iothreads supported='yes'/>
  <os supported='yes'>
    <enum name='firmware'>
      <value>efi</value>
    </enum>
    <loader supported='yes'>
      <value>/usr/share/OVMF/OVMF_CODE.secboot.fd</value>
      <enum name='type'>
        <value>rom</value>

Test Scenario 5: [negative] Negative tests when the description doesn't match the given domain
1. Define an OVMF guest with an invalid value for firmware
1) Prepare an OVMF guest with the following xml:
…
 <os firmware='uefi'>
    <type arch='x86_64' machine='pc-q35-rhel8.0.0'>hvm</type>
    <loader secure='yes'/>                      
    <boot dev='hd'/>
    <bootmenu enable='yes'/>
  </os>
…
2) Define the guest.
# virsh define ovmf.xml
error: Failed to define domain from ovmf.xml
error: XML error: unknown firmware value uefi
2. Start a bios guest without descriptor files
1) Prepare a guest with the following xml:
...
<os firmware='bios'>
  <type arch='x86_64' machine='pc-q35-rhel8.0.0'>hvm</type>
  <boot dev='hd'/>
  <bootmenu enable='yes'/>
</os>
…
2) Define and start the guest.
# virsh define ovmf.xml
Domain ovmf defined from ovmf.xml
# virsh start ovmf
error: Failed to start domain ovmf
error: operation failed: Unable to find any firmware to satisfy 'bios'
3. Start a uefi guest with smm disabled.
1) Prepare a guest with the following xml:
…
<smm state='off'/>
  </features>
 <os firmware='efi'>
    <type arch='x86_64' machine='pc-q35-rhel8.0.0'>hvm</type>
    <loader secure='yes'/>                      
    <boot dev='hd'/>
    <bootmenu enable='yes'/>
  </os>
…
2) Define and start the guest.
# virsh define ovmf.xml
Domain ovmf defined from ovmf.xml
# virsh start ovmf
error: Failed to start domain ovmf
error: Requested operation is not valid: domain has SMM turned off but chosen firmware requires it
4. Start uefi guest with unsupported flash format.
1) Change flash format “raw” to unsupported format “qcow2” in /usr/share/qemu/firmware/40-edk2-ovmf-sb.json.
2) Start the guest with the following xml:
…
 <os firmware='efi'>
    <type arch='x86_64' machine='pc-q35-rhel8.0.0'>hvm</type>                    
    <boot dev='hd'/>
  </os>
…
# virsh start ovmf
error: Failed to start domain ovmf
error: Operation not supported: unsupported flash format 'qcow2'
5. Start uefi guest with unsupported nvram template format.
1) Change nvram template format “raw” to unsupported format “qcow2” in /usr/share/qemu/firmware/40-edk2-ovmf-sb.json.
2) Start the guest with the following xml:
…
 <os firmware='efi'>
    <type arch='x86_64' machine='pc-q35-rhel8.0.0'>hvm</type>                    
    <boot dev='hd'/>
  </os>
…
# virsh start ovmf
error: Failed to start domain ovmf
error: Operation not supported: unsupported nvram template format 'qcow2'

Comment 17 Meina Li 2019-09-05 08:38:32 UTC
Hi Michal,

I found some new issues with this feature. Can you help review them? Thank you in advance.

First, undefining the ovmf guest does not behave as expected:

1)# virsh undefine ovmf
Domain ovmf has been undefined

# ll /var/lib/libvirt/qemu/nvram/
total 528
-rw-------. 1 root root 540672 Sep  5 04:16 ovmf_VARS.fd

But expected result:
# virsh undefine ovmf
error: Failed to undefine domain ovmf
error: Requested operation is not valid: cannot undefine domain with nvram

Or, since there is no nvram element in the domain xml, is it expected that we can undefine the domain without --nvram or --keep-nvram?

2)# virsh destroy ovmf 
Domain ovmf destroyed
# virsh undefine ovmf --nvram
Domain ovmf has been undefined
# ll /var/lib/libvirt/qemu/nvram/
total 528
-rw-------. 1 root root 540672 Sep  5 04:16 ovmf_VARS.fd

But expected result:
The ovmf_VARS.fd file should be removed.

Second, the guest can't boot successfully:
1) Confirm that no ovmf_VARS.fd file exists before defining the guest.
----During my previous test in comment 16, I directly edited the existing ovmf guest, replacing its loader and nvram elements with firmware='efi', then generated an xml with dumpxml and defined it. As a result, the ovmf_VARS.fd file was never removed, and both of my tests reused the previous ovmf_VARS.fd file instead of generating a new one.

2) Define and start the guest, connect the console:
# virsh console ovmf
Connected to domain ovmf
Escape character is ^]
error:
../../grub-core/loader/i386/efi/linux.c:208:(hd0,gpt2)/vmlinuz-4.18.0-135.el8.x
86_64 has invalid signature.
error: ../../grub-core/loader/i386/efi/linux.c:93:you need to load the kernel
first.
Press any key to continue...
-----I know this error is usually caused by enabling secure boot for a non-released os image, but during my test I didn't enable the secure function.

Comment 18 Michal Privoznik 2019-09-10 09:04:53 UTC
(In reply to Meina Li from comment #17)
> Hi Michal,
> 
> For this new feature, I found some new issues on it. Can you help review
> them again? Thank you in advance.
> 
> First, we can't undefine ovmf guest as expected:
> 
> 1)# virsh undefine ovmf
> Domain ovmf has been undefined
> 
> # ll /var/lib/libvirt/qemu/nvram/
> total 528
> -rw-------. 1 root root 540672 Sep  5 04:16 ovmf_VARS.fd
> 
> But expected result:
> # virsh undefine ovmf
> error: Failed to undefine domain ovmf
> error: Requested operation is not valid: cannot undefine domain with nvram
> 
> Or there's no nvram in domain xml, so is it expected that we can undefine
> the domain without --nvram or --keep-nvram?

Yep, this is a bug. Please open a new bug for it.

> 
> 2)# virsh destroy ovmf 
> Domain ovmf destroyed
> # virsh undefine ovmf --nvram
> Domain ovmf has been undefined
> # ll /var/lib/libvirt/qemu/nvram/
> total 528
> -rw-------. 1 root root 540672 Sep  5 04:16 ovmf_VARS.fd
> 
> But expected result:
> The ovmf_VARS.fd file should be removed.

This is the same issue as above.

> 
> Then, we can't boot the guest successfully:
> 1) Should confirm there's no ovmf_VARS.fd file exist before define the guest.
> ----During my previous test in comment 16, I directly edit the exist ovmf
> guest with loader and nvram element to firmware='efi', then generate a xml
> by dumpxml guest and define it. So eventually the ovmf_VARS.fd file can't be
> removed and both of my test used the previous ovmf_VARS.fd file but not
> generate a new one.
> 
> 2) Define and start the guest, connect the console:
> # virsh console ovmf
> Connected to domain ovmf
> Escape character is ^]
> error:
> ../../grub-core/loader/i386/efi/linux.c:208:(hd0,gpt2)/vmlinuz-4.18.0-135.
> el8.x
> 86_64 has invalid signature.
> error: ../../grub-core/loader/i386/efi/linux.c:93:you need to load the kernel
> first.
> Press any key to continue...
> -----I know this error usually caused when enable secure for a non-released
> os image, but during my test I didn't enable secure function.

This looks like a separate bug. What are the steps to reproduce it, please?

Comment 19 Meina Li 2019-09-10 10:17:18 UTC
I think this issue may be related to nvram.
Reproduced Scenario (please refer to the guest xml in attachment if necessary):
1. Prepare a guest xml with firmware='efi':
# cat ovmf.xml
...
 <os firmware='efi'>
    <type arch='x86_64' machine='pc-q35-rhel8.1.0'>hvm</type>
    <boot dev='hd'/>
  </os>
...
or directly with the default loader and nvram generated after starting the guest:
# cat ovmf.xml
...
<type arch='x86_64' machine='pc-q35-rhel8.1.0'>hvm</type>
 <loader readonly='yes' secure='yes' type='pflash'>/usr/share/edk2/ovmf/OVMF_CODE.secboot.fd</loader>
 <nvram template='/usr/share/edk2/ovmf/OVMF_VARS.secboot.fd'>/var/lib/libvirt/qemu/nvram/ovmf_VARS.fd</nvram>
 <boot dev='hd'/>
...
2. Check nvram file
# ll /var/lib/libvirt/qemu/nvram/
total 0
3. Define and start guest
# virsh define ovmf.xml 
Domain ovmf defined from ovmf.xml
# virsh start ovmf 
Domain ovmf started
# ll /var/lib/libvirt/qemu/nvram/
total 532
-rw-------. 1 qemu qemu 540672 Sep 10 05:15 ovmf_VARS.fd
# virsh console ovmf 
Connected to domain ovmf
Escape character is ^]
error:
../../grub-core/loader/i386/efi/linux.c:208:(hd0,gpt2)/vmlinuz-4.18.0-135.el8.x
86_64 has invalid signature.
error: ../../grub-core/loader/i386/efi/linux.c:93:you need to load the kernel
first.

Press any key to continue...

There are two scenarios in which the guest boots successfully.
SC1: Boot the guest without a template (guests installed by virt-manager don't include a template)
# cat ovmf.xml
...
<type arch='x86_64' machine='pc-q35-rhel8.1.0'>hvm</type>
 <loader readonly='yes' secure='yes' type='pflash'>/usr/share/OVMF/OVMF_CODE.secboot.fd</loader>
 <nvram>/var/lib/libvirt/qemu/nvram/ovmf_VARS.fd</nvram>           
 <boot dev='hd'/>
...

SC2: Boot the guest with a template and an existing nvram file that was generated by a bootable guest.
# ll /var/lib/libvirt/qemu/nvram/
total 532
-rw-------. 1 qemu qemu 540672 Sep 10 05:15 ovmf_VARS.fd      ---generated in bootable guest
# cat ovmf.xml
...
 <os firmware='efi'>
    <type arch='x86_64' machine='pc-q35-rhel8.1.0'>hvm</type>
    <boot dev='hd'/>
  </os>
...
Define and start the guest, and check the console:
# virsh console ovmf 
Connected to domain ovmf
Escape character is ^]

Red Hat Enterprise Linux 8.1 Beta (Ootpa)
Kernel 4.18.0-135.el8.x86_64 on an x86_64

localhost login:

Comment 20 Meina Li 2019-09-10 10:18:19 UTC
Created attachment 1613540 [details]
The xml of ovmf guest

Comment 21 Michal Privoznik 2019-09-10 13:29:25 UTC
(In reply to Meina Li from comment #19)
>

This smells like an OVMF bug to me. When starting a domain, libvirt does nothing more than copy the template into the domain-specific _VARS file if it doesn't exist yet. The fact that you're unable to boot the domain afterwards suggests that the image libvirt copied the data from might be corrupted. If you copy /usr/share/OVMF/OVMF_VARS.secboot.fd into /var/lib/libvirt/qemu/nvram/ovmf_VARS.fd (overwriting the domain-specific file while the domain is shut off) and then try to boot it, does it work?
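
The copy-if-missing behaviour described in this comment can be sketched as follows. This is a simplified illustration, not libvirt's code: the function name ensure_nvram is hypothetical, and ownership, permissions, and labelling of the new file are ignored.

```python
import os
import shutil


def ensure_nvram(template_path, nvram_path):
    """Create the per-domain varstore from the template only if it is missing.

    Returns True if the template was copied, False if an existing
    varstore was left untouched.
    """
    if os.path.exists(nvram_path):
        return False  # an existing (possibly stale) varstore is reused as-is
    shutil.copyfile(template_path, nvram_path)
    return True
```

One consequence is that a leftover ovmf_VARS.fd from an earlier domain of the same name is reused rather than regenerated from the template.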

Comment 22 Meina Li 2019-09-11 08:04:33 UTC
(In reply to Michal Privoznik from comment #21)
> (In reply to Meina Li from comment #19)
> >
> 
> This smells like an OVMF bug to me. When starting a domain libvirt does
> nothing more than it copies the template into domain specific _VARS file if
> it doesn't exist yet. And the fact that you're unable to boot the domain
> afterwards means that the image libvirt used to copy data from might be
> corrupted. If you copy /usr/share/OVMF/OVMF_VARS.secboot.fd info
> /var/lib/libvirt/qemu/nvram/ovmf_VARS.fd (overwrite the domain specific
> file, while the domain is shut off) and try to boot it then, does it work?

Do you mean copying /usr/share/OVMF/OVMF_VARS.secboot.fd into /var/lib/libvirt/qemu/nvram/ovmf_VARS.fd? That doesn't work either. If my understanding is wrong, please correct me, thanks.
# virsh list --all
 Id   Name             State
---------------------------------
 -    ovmf   shut off
# cp /usr/share/OVMF/OVMF_VARS.secboot.fd  /var/lib/libvirt/qemu/nvram/ovmf_VARS.fd 
cp: overwrite '/var/lib/libvirt/qemu/nvram/ovmf_VARS.fd'? y
# virsh start ovmf 
Domain ovmf started
# virsh console ovmf 
Connected to domain ovmf
Escape character is ^]
error:
../../grub-core/loader/i386/efi/linux.c:208:(hd0,gpt2)/vmlinuz-4.18.0-135.el8.x
86_64 has invalid signature.
error: ../../grub-core/loader/i386/efi/linux.c:93:you need to load the kernel
first.

Press any key to continue...

Comment 23 Michal Privoznik 2019-09-11 12:37:33 UTC
(In reply to Meina Li from comment #22)
> (In reply to Michal Privoznik from comment #21)
> > (In reply to Meina Li from comment #19)
> > >
> > 
> > This smells like an OVMF bug to me. When starting a domain libvirt does
> > nothing more than it copies the template into domain specific _VARS file if
> > it doesn't exist yet. And the fact that you're unable to boot the domain
> > afterwards means that the image libvirt used to copy data from might be
> > corrupted. If you copy /usr/share/OVMF/OVMF_VARS.secboot.fd info
> > /var/lib/libvirt/qemu/nvram/ovmf_VARS.fd (overwrite the domain specific
> > file, while the domain is shut off) and try to boot it then, does it work?
> 
> Are you mean to copy /usr/share/OVMF/OVMF_VARS.secboot.fd into
> /var/lib/libvirt/qemu/nvram/ovmf_VARS.fd? 

Yes, I mean exactly that.

> It also doesn't work. And if my
> unstanding is wrong, please give a correct one, thanks.
> # virsh list --all
>  Id   Name             State
> ---------------------------------
>  -    ovmf   shut off
> # cp /usr/share/OVMF/OVMF_VARS.secboot.fd 
> /var/lib/libvirt/qemu/nvram/ovmf_VARS.fd 
> cp: overwrite '/var/lib/libvirt/qemu/nvram/ovmf_VARS.fd'? y
> # virsh start ovmf 
> Domain ovmf started
> # virsh console ovmf 
> Connected to domain ovmf
> Escape character is ^]
> error:
> ../../grub-core/loader/i386/efi/linux.c:208:(hd0,gpt2)/vmlinuz-4.18.0-135.
> el8.x
> 86_64 has invalid signature.
> error: ../../grub-core/loader/i386/efi/linux.c:93:you need to load the kernel
> first.
> 
> Press any key to continue...

So this is not a libvirt issue then. In this example you are using the _VARS file as provided by the OVMF package, and the guest is unable to boot. Laszlo, do you recognize this perhaps?

Comment 24 Laszlo Ersek 2019-09-11 16:33:12 UTC
Sure; the root cause was already given by Meina Li at the end of comment 17.

Namely, "OVMF_VARS.secboot.fd" is the varstore template file that has some certificates pre-enrolled, and the Secure Boot mode enabled. Therefore, if you attempt to boot an incorrectly signed kernel, things will fail.

I don't see any bug here; things seem to be working by design.

If you want a new domain that does not have SB enabled at once, use the other varstore template file ("OVMF_VARS.fd").

Note that, for bug 1600230, the edk2-ovmf package gained the following two descriptor files:

/usr/share/qemu/firmware/40-edk2-ovmf-sb.json
/usr/share/qemu/firmware/50-edk2-ovmf.json

This means that the system-wide priority is assigned to "edk2-ovmf-sb" (prefix 40), and not to "edk2-ovmf" (prefix 50). This is why the auto-selection feature picks "OVMF_VARS.secboot.fd" over "OVMF_VARS.fd".

If you want to override that priority order for testing, you can for example hide the "40-edk2-ovmf-sb.json" file, by running:

- as a sysadmin:

  mkdir -p /etc/qemu/firmware/
  cat /dev/null > /etc/qemu/firmware/40-edk2-ovmf-sb.json

- as a user:

  mkdir -p $HOME/.config/qemu/firmware/
  cat /dev/null > $HOME/.config/qemu/firmware/40-edk2-ovmf-sb.json

Either of these will create a zero-length file, named "40-edk2-ovmf-sb.json", in a more specific directory than "/usr/share/qemu/firmware/". Therefore "/usr/share/qemu/firmware/40-edk2-ovmf-sb.json" will be hidden from the search, and "/usr/share/qemu/firmware/50-edk2-ovmf.json" will take effect. That descriptor specifies the "OVMF_VARS.fd" template, which will not enable SB at domain creation. Then you can boot unsigned media (or media signed with test keys).
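
The lookup described in this comment can be sketched roughly as follows. It is a simplified illustration, not libvirt's actual code: the function name effective_descriptors is hypothetical, directories are passed least specific first, and only the behaviours stated above are modelled (a same-named file in a more specific directory replaces the one from a less specific directory, a zero-length file hides the descriptor, and a numerically lower file name prefix means higher priority, which is assumed here to come from sorting by file name).

```python
import os


def effective_descriptors(dirs):
    """Return descriptor paths in priority order.

    dirs lists the search directories from least to most specific, e.g.
    /usr/share/qemu/firmware, /etc/qemu/firmware,
    $HOME/.config/qemu/firmware.
    """
    chosen = {}
    for directory in dirs:
        if not os.path.isdir(directory):
            continue
        for name in os.listdir(directory):
            if name.endswith(".json"):
                # A later (more specific) directory overrides by file name.
                chosen[name] = os.path.join(directory, name)
    # Zero-length files only mask a descriptor; they are never used themselves.
    return [
        path
        for _name, path in sorted(chosen.items())
        if os.path.getsize(path) > 0
    ]
```

With an empty /etc/qemu/firmware/40-edk2-ovmf-sb.json in place, this sketch returns only the 50-edk2-ovmf.json descriptor from /usr/share/qemu/firmware/.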

Comment 26 Laszlo Ersek 2019-09-11 16:39:29 UTC
To summarize:

- for bug 1600230, we added two descriptor files, and the one with higher priority (= numerically lower prefix) enables SB at domain creation
- and, the guest you were trying to launch was signed with a Beta key (or some other non-accepted key)

In order to remedy the situation, change one of the above factors. Either override the priority order, as described in comment 24, or else use a correctly signed guest.

Comment 27 Meina Li 2019-09-12 08:22:08 UTC
Hi Laszlo,

Thanks for your reply, but I am still confused.

First, the guest I used wasn't signed with a key. And, as the comment suggests, when I use the "OVMF_VARS.fd" file directly, it works well.
...
<loader readonly='yes' secure='yes' type='pflash'>/usr/share/edk2/ovmf/OVMF_CODE.secboot.fd</loader>
<nvram template='/usr/share/edk2/ovmf/OVMF_VARS.fd'>/var/lib/libvirt/qemu/nvram/ovmf_VARS.fd</nvram>

But when I use firmware='efi' to test, it always uses template='/usr/share/edk2/ovmf/OVMF_VARS.secboot.fd' by default, even though I hid the "40-edk2-ovmf-sb.json" file, so the guest can't boot.
1) Create a zero-length file named "40-edk2-ovmf-sb.json"
# mkdir -p /etc/qemu/firmware/
# cat /dev/null > /etc/qemu/firmware/40-edk2-ovmf-sb.json
2) Define and start a guest with firmware='efi'
# cat ovmf.xml
...
 <os firmware='efi'>
    <type arch='x86_64' machine='pc-q35-rhel8.1.0'>hvm</type>
    <boot dev='hd'/>
 </os>
...
# virsh define ovmf.xml 
Domain ovmf defined from ovmf.xml
# virsh start ovmf 
Domain ovmf started
# virsh dumpxml ovmf
…
  <os>
    <type arch='x86_64' machine='pc-q35-rhel8.0.0'>hvm</type>
    <loader readonly='yes' secure='yes' type='pflash'>/usr/share/edk2/ovmf/OVMF_CODE.secboot.fd</loader>
    <nvram template='/usr/share/edk2/ovmf/OVMF_VARS.secboot.fd'>/var/lib/libvirt/qemu/nvram/ovmf_VARS.fd</nvram>
    <boot dev='hd'/>
...

In addition, if the auto-selection feature picks "OVMF_VARS.secboot.fd" over "OVMF_VARS.fd" by design, we need extra steps to hide the "40-edk2-ovmf-sb.json" file whenever we define/start an unsigned guest with firmware='efi', which increases complexity. Will this be described in the related documentation?

Comment 28 Laszlo Ersek 2019-09-12 15:32:33 UTC
(In reply to Meina Li from comment #27)

> But when I use firmware='efi' to test, it always uses
> template='/usr/share/edk2/ovmf/OVMF_VARS.secboot.fd' by default even
> though I hide the "40-edk2-ovmf-sb.json" file, so it can't boot.

Hmm, this doesn't sound good.

The libvirt patch set linked in comment 6 added a bunch of helpful
VIR_DEBUG() messages to the daemon. Can you please enable debug logging
with "log_filters" and "log_outputs" in "/etc/libvirt/libvirtd.conf",
restart libvirtd, and attach the debug log?
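For illustration, a minimal sketch of such a configuration fragment. The filter pattern and the log file path below are assumptions chosen for this example, not values taken from this bug; adjust them to the host.

```
# /etc/libvirt/libvirtd.conf
# Assumption: priority 1 (debug) for all log sources matching "qemu",
# which should include the firmware auto-selection messages.
log_filters="1:qemu"
# Assumption: write everything from debug level up to this file.
log_outputs="1:file:/var/log/libvirt/libvirtd.log"
```

After editing the file, libvirtd has to be restarted (for example with "systemctl restart libvirtd") before the new settings take effect.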

> In addition, if the auto-selection feature picks
> "OVMF_VARS.secboot.fd" over "OVMF_VARS.fd" by design, we should run
> extra steps to hide the "40-edk2-ovmf-sb.json" file when we
> define/start an unsigned guest with firmware='efi', which will increase
> complexity. Will we have some description in related doc?

I'm not sure about that.

The design principle / requirement behind this feature was that the
end-user see as few firmware details as possible. The idea was that a
single config knob, such as "firmware='efi'", give them a UEFI domain.
Whether that would be with SB enabled automatically, or with SB
disabled, was considered a detail that should be explicitly hidden from
the end-user. It is considered system policy, i.e. a job for the
operator of the layered product that sits on top of QEMU and/or libvirt
(such as OpenStack or RHV).

It is not expected that you, as an end-user, will want to pick and
choose "SB enabled" vs. "SB disabled" every time you create a new UEFI
domain -- that was specifically what the feature request wanted to
*avoid*, if I understand correctly. If you, as an end-user, wanted this
level of control, you could already specify the @template attribute for
the <nvram> element, with an explicit pathname. The idea was to save
end-users from that burden.

So, if you want to pick and choose SB-enabled / SB-disabled every time,
then you're not supposed to use this feature. You're certainly not
supposed to work with symlinks or empty files (to override priorities)
every time you define a new domain. Instead, just stick with the manual
setting for the @template attribute. (Note that even "virt-install"
supports this, with "--boot uefi,[...],nvram_template=[...]".)

Therefore, I don't think we've planned extra RHEL documentation for
this. Of course, if someone were to write such RHEL documentation, I
wouldn't oppose it. I'm just saying that thus far it hasn't appeared
necessary.

(The documentation does *exist* BTW, in the file
"docs/interop/firmware.json", in the QEMU source tree, under the heading
"@Firmware:":

<https://git.qemu.org/?p=qemu.git;a=blob;f=docs/interop/firmware.json;h=8ffb7856d2c3;hb=6d2fdde42c33#l297>)

Rather than in end-user documentation, I think this should be described
in the libvirt test plan.

So, possible further steps are:

- File an RHBZ for user-facing documentation.

  (I don't think that would be justified, but others may disagree.
  Discuss with Dan, primarily.)

- Update the libvirt test plan.

- File an RHBZ for reversing the relative order of "edk2-ovmf-sb.json"
  and "edk2-ovmf.json".

  (I don't think that's a good idea, because all the UEFI guests that we
  (are going to) officially support with RHV / RHOSP should boot fine
  with SB enabled. Discuss with PM.)

Thanks,
Laszlo

Comment 29 Meina Li 2019-09-16 06:13:53 UTC
Created attachment 1615450 [details]
Debug messages

Comment 30 Laszlo Ersek 2019-09-16 09:38:37 UTC
(In reply to Meina Li from comment #29)
> Created attachment 1615450 [details]
> Debug messages

Thanks, I'll look at them soon.

Please be mindful of attachment sizes (and download sizes) in Bugzilla. This is a plaintext log file, 6.4MB in size. But when compressed with xz, it's only 116KB. Please always try to see if compression would help save space (and download/upload bandwidth) in Bugzilla. Thanks.

Comment 31 Laszlo Ersek 2019-09-16 10:10:25 UTC
(In reply to Meina Li from comment #29)
> Created attachment 1615450 [details]
> Debug messages

The log file from comment 29 is conclusive; it explains the problem.

There is a file called
"/usr/share/qemu/firmware/40-edk2-ovmf-sb.json.bak" in the host
filesystem -- you must have edited the file
"/usr/share/qemu/firmware/40-edk2-ovmf-sb.json", which was installed as
part of the edk2-ovmf RPM, in-place. Then, your text editor must have
auto-created a backup file called
"/usr/share/qemu/firmware/40-edk2-ovmf-sb.json.bak".

The "40-edk2-ovmf-sb.json" file would be masked alright by the
zero-length file (with identical filename) under /etc. However, the BAK
file is not masked, and it still has prefix 40. Therefore it takes
priority over "50-edk2-ovmf.json".

> qemuFirmwareFetchConfigs:1041 : firmware description path
>                                 '/etc/qemu/firmware/40-edk2-ovmf-sb.json'
>                                 len=0
>
> qemuFirmwareFetchConfigs:1041 : firmware description path
>                                 '/usr/share/qemu/firmware/40-edk2-ovmf-sb.json.bak'
>                                 len=770
>
> qemuFirmwareFetchConfigs:1041 : firmware description path
>                                 '/usr/share/qemu/firmware/50-edk2-ovmf.json'
>                                 len=722
>
> qemuFirmwareInterfaceParse:327 : firmware description path
>                                  '/usr/share/qemu/firmware/40-edk2-ovmf-sb.json.bak'
>                                  supported interfaces:  uefi
>
> qemuFirmwareInterfaceParse:327 : firmware description path
>                                  '/usr/share/qemu/firmware/50-edk2-ovmf.json'
>                                  supported interfaces:  uefi
>
> qemuFirmwareMatchDomain:1161 : Firmware
>                                '/usr/share/qemu/firmware/40-edk2-ovmf-sb.json.bak'
>                                matches domain requirements
>
> qemuFirmwareFillDomain:1385 : Found matching firmware (description path
>                               '/usr/share/qemu/firmware/40-edk2-ovmf-sb.json.bak')
>
>
> qemuFirmwareEnableFeatures:1215 : decided on firmware
>                                   '/usr/share/edk2/ovmf/OVMF_CODE.secboot.fd'
>                                   varstore template
>                                   '/usr/share/edk2/ovmf/OVMF_VARS.secboot.fd'
>

Please remove the BAK file, reinstall edk2-ovmf from scratch, and retest.
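
The masking and priority behavior described above can be sketched roughly as follows. This is a hypothetical Python illustration, not libvirt's actual code: the function names are made up, only two of the search directories are modeled, and, like the libvirt build in this bug, the sketch does not filter on the ".json" suffix. It assumes that a same-named file in a higher-priority directory overrides the one in a lower-priority directory, that a zero-length file masks the descriptor, and that the remaining files are tried in filename order.

```python
import tempfile
from pathlib import Path


def effective_descriptors(dirs):
    """Return descriptor paths in the order they would be tried.

    `dirs` is ordered from highest to lowest priority (hypothetical helper).
    """
    chosen = {}
    for directory in dirs:
        d = Path(directory)
        if not d.is_dir():
            continue
        for entry in d.iterdir():
            if entry.is_file():
                # The first (highest-priority) directory wins per filename.
                chosen.setdefault(entry.name, entry)
    # Sort by filename, so a numerically lower prefix is tried first.
    ordered = [chosen[name] for name in sorted(chosen)]
    # A zero-length file masks the descriptor with that exact filename.
    return [p for p in ordered if p.stat().st_size > 0]


def demo():
    """Rebuild the situation from this bug in a temporary directory."""
    with tempfile.TemporaryDirectory() as root:
        etc = Path(root, "etc/qemu/firmware")
        usr = Path(root, "usr/share/qemu/firmware")
        etc.mkdir(parents=True)
        usr.mkdir(parents=True)
        # Zero-length mask file under /etc.
        (etc / "40-edk2-ovmf-sb.json").write_text("")
        # Packaged descriptors plus the stray editor backup file.
        for name in ("40-edk2-ovmf-sb.json",
                     "40-edk2-ovmf-sb.json.bak",
                     "50-edk2-ovmf.json"):
            (usr / name).write_text("{}")
        return [p.name for p in effective_descriptors([etc, usr])]


if __name__ == "__main__":
    print(demo())
```

In this model the empty file under /etc hides only "40-edk2-ovmf-sb.json"; the BAK file has a different filename, so it survives and still sorts ahead of "50-edk2-ovmf.json".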

Comment 32 Meina Li 2019-09-16 10:41:24 UTC
The retest passed after removing the BAK file.
# virsh dumpxml ovmf --inactive | grep /os -B3
  <os firmware='efi'>
    <type arch='x86_64' machine='pc-q35-rhel8.1.0'>hvm</type>
    <boot dev='hd'/>
  </os>
# virsh dumpxml ovmf | grep /os -B5
  <os>
    <type arch='x86_64' machine='pc-q35-rhel8.1.0'>hvm</type>
    <loader readonly='yes' type='pflash'>/usr/share/edk2/ovmf/OVMF_CODE.secboot.fd</loader>
    <nvram template='/usr/share/edk2/ovmf/OVMF_VARS.fd'>/var/lib/libvirt/qemu/nvram/ovmf_VARS.fd</nvram>
    <boot dev='hd'/>
  </os>

Thanks, Laszlo, for your detailed explanation of this issue and for the reminder about attachment sizes in Bugzilla. I'll update the test plan for this issue.

Comment 34 errata-xmlrpc 2019-11-06 07:11:30 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2019:3723