RHEL Engineering is moving the tracking of its product development work on RHEL 6 through RHEL 9 to Red Hat Jira (issues.redhat.com). If you're a Red Hat customer, please continue to file support cases via the Red Hat customer portal. If you're not, please head to the "RHEL project" in Red Hat Jira and file new tickets here. Individual Bugzilla bugs in the statuses "NEW", "ASSIGNED", and "POST" are being migrated throughout September 2023. Bugs of Red Hat partners with an assigned Engineering Partner Manager (EPM) are migrated in late September as per pre-agreed dates. Bugs against components "kernel", "kernel-rt", and "kpatch" are only migrated if still in "NEW" or "ASSIGNED". If you cannot log in to RH Jira, please consult article #7032570. That failing, please send an e-mail to the RH Jira admins at rh-issues@redhat.com to troubleshoot your issue as a user management inquiry. The email creates a ServiceNow ticket with Red Hat. Individual Bugzilla bugs that are migrated will be moved to status "CLOSED", resolution "MIGRATED", and set with "MigratedToJIRA" in "Keywords". The link to the successor Jira issue will be found under "Links", have a little "two-footprint" icon next to it, and direct you to the "RHEL project" in Red Hat Jira (issue links are of type "https://issues.redhat.com/browse/RHEL-XXXX", where "X" is a digit). This same link will be available in a blue banner at the top of the page informing you that that bug has been migrated.
Bug 2218196 - Add vtpm devices with OVMF.amdsev.fd causes VM reset
Summary: Add vtpm devices with OVMF.amdsev.fd causes VM reset
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 9
Classification: Red Hat
Component: edk2
Version: 9.3
Hardware: x86_64
OS: Linux
medium
high
Target Milestone: rc
: ---
Assignee: Gerd Hoffmann
QA Contact: zixchen
URL:
Whiteboard:
Depends On:
Blocks:
TreeView+ depends on / blocked
 
Reported: 2023-06-28 12:23 UTC by zixchen
Modified: 2023-11-07 09:04 UTC (History)
12 users (show)

Fixed In Version: edk2-20230524-3.el9
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2023-11-07 08:24:30 UTC
Type: Bug
Target Upstream Version:
Embargoed:


Attachments (Terms of Use)


Links
System ID Private Priority Status Summary Last Updated
Gitlab redhat/centos-stream/src edk2 merge_requests 46 0 None opened OvmfPkg/AmdSevDxe: Shim Reboot workaround (RHEL only) 2023-08-23 14:57:06 UTC
Red Hat Issue Tracker RHELPLAN-161066 0 None None None 2023-06-28 12:24:22 UTC
Red Hat Product Errata RHSA-2023:6330 0 None None None 2023-11-07 08:24:45 UTC

Description zixchen 2023-06-28 12:23:35 UTC
Description of problem:
Add vtpm devices with OVMF.amdsev.fd causes VM reset, this is an issue for SEV-ES guest, as it doesn't support reboot/reset, so sev-es VM will never boot automatically. But manually select 'continue boot' on system reset screen, sev-es VM can boot.  

Version-Release number of selected component (if applicable):
edk2-ovmf-20230301gitf80f052277c8-5.el9.noarch
qemu-kvm-8.0.0-6.el9.x86_64
kernel-5.14.0-331.el9.x86_64

How reproducible:
100%

Steps to Reproduce:
1. Boot a sev-es with a vtpm device and /usr/share/edk2/ovmf/OVMF.amdsev.fd
# /usr/bin/swtpm socket --ctrl type=unixio,path=/root/avocado/data/avocado-vt/swtpm/avocado-vt-vm1_tpm0_swtpm.sock,mode=0600 --tpmstate dir=/root/avocado/data/avocado-vt/swtpm/avocado-vt-vm1_tpm0_state,mode=0600 --terminate --tpm2 --log file=/root/avocado/job-results/job-2023-06-28T03.49-3724c4a/test-results/1-Host_RHEL.m9.u3.ovmf.qcow2.virtio_scsi.up.virtio_net.Guest.RHEL.9.3.0.x86_64.io-github-autotest-qemu.boot.q35/vtpm_avocado-vt-vm1_tpm0_swtpm.log

qemu cmdline:
/usr/libexec/qemu-kvm \
     -S  \
     -name 'avocado-vt-vm1'  \
     -sandbox on  \
     -blockdev '{"node-name": "file_ovmf_code", "driver": "file", "filename": "/usr/share/edk2/ovmf/OVMF.amdsev.fd", "auto-read-only": true, "discard": "unmap"}' \
     -blockdev '{"node-name": "drive_ovmf_code", "driver": "raw", "read-only": true, "file": "file_ovmf_code"}' \
     -machine q35,pflash0=drive_ovmf_code,confidential-guest-support=lsec0,memory-backend=mem-machine_mem \
     -object sev-guest,id=lsec0,policy=7,cbitpos=51,reduced-phys-bits=1  \
     -device '{"id": "pcie-root-port-0", "driver": "pcie-root-port", "multifunction": true, "bus": "pcie.0", "addr": "0x1", "chassis": 1}' \
     -device '{"id": "pcie-pci-bridge-0", "driver": "pcie-pci-bridge", "addr": "0x0", "bus": "pcie-root-port-0"}' \
     -nodefaults \
     -device '{"driver": "VGA", "bus": "pcie.0", "addr": "0x2"}' \
     -m 4096 \
     -object '{"size": 4294967296, "id": "mem-machine_mem", "qom-type": "memory-backend-ram"}'  \
     -smp 48,maxcpus=48,cores=24,threads=1,dies=1,sockets=2  \
     -cpu 'EPYC-Milan',x2apic=on,tsc-deadline=on,hypervisor=on,tsc-adjust=on,vaes=on,vpclmulqdq=on,spec-ctrl=on,stibp=on,arch-capabilities=on,ssbd=on,cmp-legacy=on,virt-ssbd=on,lbrv=on,tsc-scale=on,vmcb-clean=on,pause-filter=on,pfthreshold=on,v-vmsave-vmload=on,vgif=on,rdctl-no=on,skip-l1dfl-vmentry=on,mds-no=on,pschange-mc-no=on,kvm_pv_unhalt=on \
     -chardev socket,server=on,wait=off,id=qmp_id_qmpmonitor1,path=/var/tmp/avocado_as4vdz88/monitor-qmpmonitor1-20230628-030129-xipuDste  \
     -mon chardev=qmp_id_qmpmonitor1,mode=control \
     -chardev socket,server=on,wait=off,id=qmp_id_catch_monitor,path=/var/tmp/avocado_as4vdz88/monitor-catch_monitor-20230628-030129-xipuDste  \
     -mon chardev=qmp_id_catch_monitor,mode=control \
     -device '{"ioport": 1285, "driver": "pvpanic", "id": "idr8y3qB"}' \
     -chardev socket,server=on,wait=off,id=chardev_serial0,path=/var/tmp/avocado_as4vdz88/serial-serial0-20230628-030129-xipuDste \
     -device '{"id": "serial0", "driver": "isa-serial", "chardev": "chardev_serial0"}'  \
     -chardev socket,id=seabioslog_id_20230628-030129-xipuDste,path=/var/tmp/avocado_as4vdz88/seabios-20230628-030129-xipuDste,server=on,wait=off \
     -device isa-debugcon,chardev=seabioslog_id_20230628-030129-xipuDste,iobase=0x402 \
     -device '{"id": "pcie-root-port-1", "port": 1, "driver": "pcie-root-port", "addr": "0x1.0x1", "bus": "pcie.0", "chassis": 2}' \
     -device '{"driver": "qemu-xhci", "id": "usb1", "bus": "pcie-root-port-1", "addr": "0x0"}' \
     -device '{"driver": "usb-tablet", "id": "usb-tablet1", "bus": "usb1.0", "port": "1"}' \
     -device '{"id": "pcie-root-port-2", "port": 2, "driver": "pcie-root-port", "addr": "0x1.0x2", "bus": "pcie.0", "chassis": 3}' \
     -device '{"id": "virtio_scsi_pci0", "driver": "virtio-scsi-pci", "bus": "pcie-root-port-2", "addr": "0x0"}' \
     -blockdev '{"node-name": "file_image1", "driver": "file", "auto-read-only": true, "discard": "unmap", "aio": "threads", "filename": "/home/kvm_autotest_root/images/rhel930-64-virtio-scsi-ovmf.qcow2", "cache": {"direct": true, "no-flush": false}}' \
     -blockdev '{"node-name": "drive_image1", "driver": "qcow2", "read-only": false, "cache": {"direct": true, "no-flush": false}, "file": "file_image1"}' \
     -device '{"driver": "scsi-hd", "id": "image1", "drive": "drive_image1", "write-cache": "on"}' \
     -device '{"id": "pcie-root-port-3", "port": 3, "driver": "pcie-root-port", "addr": "0x1.0x3", "bus": "pcie.0", "chassis": 4}' \
     -device '{"driver": "virtio-net-pci", "mac": "9a:8f:81:2a:7b:a8", "id": "idkuetK7", "netdev": "idHAEXFR", "bus": "pcie-root-port-3", "addr": "0x0"}'  \
     -netdev tap,id=idHAEXFR,vhost=on  \
     -chardev socket,id=char_vtpm_avocado-vt-vm1_tpm0,path=/root/avocado/data/avocado-vt/swtpm/avocado-vt-vm1_tpm0_swtpm.sock \
     -tpmdev emulator,chardev=char_vtpm_avocado-vt-vm1_tpm0,id=emulator_vtpm_avocado-vt-vm1_tpm0 \
     -device '{"id": "tpm-crb_vtpm_avocado-vt-vm1_tpm0", "tpmdev": "emulator_vtpm_avocado-vt-vm1_tpm0", "driver": "tpm-crb"}' \
     -monitor stdio \
     -vnc :0  \
     -rtc base=utc,clock=host,driftfix=slew  \
     -boot menu=off,order=cdn,once=c,strict=off \
     -enable-kvm \


Actual results:
System reset automatically

Expected results:
VM can boot

Additional info:
Not a regression, test the first OVMF.amdsev.fd edk2 build edk2-ovmf-202202gitb24306f15d-1.el9.noarch also reproduces this issue.

Comment 1 Gerd Hoffmann 2023-06-29 13:04:00 UTC
> Actual results:
> System reset automatically

Yes, sev-es does not support reboot, so the guest is powered off instead.

> Expected results:
> VM can boot

It should boot fine if you start it again with the identical configuration.

(Might be needed multiple times, in case fallback.efi restores the boot
 configuration it will try reboot the guest too).

Comment 2 zixchen 2023-06-30 09:03:10 UTC
(In reply to Gerd Hoffmann from comment #1)
> > Actual results:
> > System reset automatically
> 
> Yes, sev-es does not support reboot, so the guest is powered off instead.
> 
> > Expected results:
> > VM can boot
> 
> It should boot fine if you start it again with the identical configuration.
> 
> (Might be needed multiple times, in case fallback.efi restores the boot
>  configuration it will try reboot the guest too).
In this case, repeat booting the guest with same configuration always goes to system reset, I've tried repeating to start the vm more than 10 times after vm automatically quit.
This is only the case when tpm + OVMF.amdsev.fd, tpm + OVMF_CODE.fd & nvram VAR file doesn't reproduce this issue.

Comment 3 Gerd Hoffmann 2023-07-06 09:53:11 UTC
> > It should boot fine if you start it again with the identical configuration.
> > 
> > (Might be needed multiple times, in case fallback.efi restores the boot
> >  configuration it will try reboot the guest too).
> In this case, repeat booting the guest with same configuration always goes
> to system reset, I've tried repeating to start the vm more than 10 times
> after vm automatically quit.
> This is only the case when tpm + OVMF.amdsev.fd, tpm + OVMF_CODE.fd & nvram
> VAR file doesn't reproduce this issue.

Can you attach the firmware log to this bug?

Comment 4 zixchen 2023-07-12 02:57:09 UTC
Failed case with OVMF.amdsev.fd firmware log: ovmf.log
Worked case with OVMF_CODE.fd and OVMF_VARS.log: ovmf_nvram.log

Comment 7 Oliver Steffen 2023-08-02 16:02:44 UTC
My suspicion is this:

1) The AmdSev firmware build uses a ephemeral/stateless UEFI variable store:
   there is no separate "OVMF_VARS.fd" file, and the "OVMF.amdsev.fd" (code binary)
   is added to the VM as read-only. The firmware uses a memory-backed write overlay internally.
   This means any changes made to it will be lost when Qemu is stopped.

2) AmdSev VMs can not be rebooted. Qemu needs to be stop and restarted to perform a reboot, losing
   any changes made to the variable store.

3) shim's fallback mechanism checks the variable store when booting. If it is empty it sets the
   default EFI boot option (to bypass the BOOTX64.efi default).
   If a TPM is present, the machine is reset after this in order to keep the boot chain clean
   for measured boot.
   In case of an ephemeral variable store, this creates an endless reboot loop. To offer a way out
   the blue box with the message and the option to skip the reboot is presented.

See also:
- https://github.com/rhboot/shim/issues/418
- https://github.com/rhboot/shim/pull/445

Comment 8 Oliver Steffen 2023-08-03 14:25:35 UTC
I was able to test this on an AMD SEV-ES capable machine and I can confirm that EFI variables are not preserved across reboots.

Inside the SEV-ES guest:

# dmesg | grep -i sev
[    0.316049] Memory Encryption Features active: AMD SEV SEV-ES
[    3.842543] kvm_amd: KVM is unsupported when running as an SEV guest
# echo "test" > data
# NAME=$(uuidgen)-test
# echo $NAME
e41b4422-7d52-4a9d-b0eb-2f87da469d55-test
# efivar -w --name=$NAME -f data
# efivar -l
e41b4422-7d52-4a9d-b0eb-2f87da469d55-test
8be4df61-93ca-11d2-aa0d-00e098032b8c-BootOrder
8be4df61-93ca-11d2-aa0d-00e098032b8c-Boot0001
8be4df61-93ca-11d2-aa0d-00e098032b8c-Boot0003
8be4df61-93ca-11d2-aa0d-00e098032b8c-ConIn
8be4df61-93ca-11d2-aa0d-00e098032b8c-ErrOut
8be4df61-93ca-11d2-aa0d-00e098032b8c-ConOut
04b37fe8-f6ae-480b-bdd5-37d98c5e89aa-VarErrorFlag
8be4df61-93ca-11d2-aa0d-00e098032b8c-Lang
8be4df61-93ca-11d2-aa0d-00e098032b8c-PlatformLang
8be4df61-93ca-11d2-aa0d-00e098032b8c-Timeout
8be4df61-93ca-11d2-aa0d-00e098032b8c-Boot0000
eb704011-1402-11d3-8e77-00a0c969723b-MTC
605dab50-e046-4300-abb6-3dd810dd8b23-MokListTrustedRT
605dab50-e046-4300-abb6-3dd810dd8b23-SbatLevelRT
605dab50-e046-4300-abb6-3dd810dd8b23-MokListXRT
605dab50-e046-4300-abb6-3dd810dd8b23-MokListRT
8be4df61-93ca-11d2-aa0d-00e098032b8c-BootCurrent
8be4df61-93ca-11d2-aa0d-00e098032b8c-ConInDev
8be4df61-93ca-11d2-aa0d-00e098032b8c-ErrOutDev
8be4df61-93ca-11d2-aa0d-00e098032b8c-ConOutDev
8be4df61-93ca-11d2-aa0d-00e098032b8c-PlatformRecovery0000
8be4df61-93ca-11d2-aa0d-00e098032b8c-PlatformLangCodes
8be4df61-93ca-11d2-aa0d-00e098032b8c-LangCodes
8be4df61-93ca-11d2-aa0d-00e098032b8c-BootOptionSupport
8be4df61-93ca-11d2-aa0d-00e098032b8c-OsIndicationsSupported
[root@ibm-p8-kvm-03-guest-02 ~]# efivar -p --name=$NAME
GUID: e41b4422-7d52-4a9d-b0eb-2f87da469d55
Name: "test"
Attributes:
	Non-Volatile
	Boot Service Access
	Runtime Service Access
Value:
00000000  74 65 73 74 0a                                    |test.           |

# reboot
[...]

# efivar -l
8be4df61-93ca-11d2-aa0d-00e098032b8c-BootOrder
8be4df61-93ca-11d2-aa0d-00e098032b8c-Boot0001
8be4df61-93ca-11d2-aa0d-00e098032b8c-Boot0003
8be4df61-93ca-11d2-aa0d-00e098032b8c-ConIn
8be4df61-93ca-11d2-aa0d-00e098032b8c-ErrOut
8be4df61-93ca-11d2-aa0d-00e098032b8c-ConOut
04b37fe8-f6ae-480b-bdd5-37d98c5e89aa-VarErrorFlag
8be4df61-93ca-11d2-aa0d-00e098032b8c-Lang
8be4df61-93ca-11d2-aa0d-00e098032b8c-PlatformLang
8be4df61-93ca-11d2-aa0d-00e098032b8c-Timeout
8be4df61-93ca-11d2-aa0d-00e098032b8c-Boot0000
eb704011-1402-11d3-8e77-00a0c969723b-MTC
605dab50-e046-4300-abb6-3dd810dd8b23-MokListTrustedRT
605dab50-e046-4300-abb6-3dd810dd8b23-SbatLevelRT
605dab50-e046-4300-abb6-3dd810dd8b23-MokListXRT
605dab50-e046-4300-abb6-3dd810dd8b23-MokListRT
8be4df61-93ca-11d2-aa0d-00e098032b8c-BootCurrent
8be4df61-93ca-11d2-aa0d-00e098032b8c-ConInDev
8be4df61-93ca-11d2-aa0d-00e098032b8c-ErrOutDev
8be4df61-93ca-11d2-aa0d-00e098032b8c-ConOutDev
8be4df61-93ca-11d2-aa0d-00e098032b8c-PlatformRecovery0000
8be4df61-93ca-11d2-aa0d-00e098032b8c-PlatformLangCodes
8be4df61-93ca-11d2-aa0d-00e098032b8c-LangCodes
8be4df61-93ca-11d2-aa0d-00e098032b8c-BootOptionSupport
8be4df61-93ca-11d2-aa0d-00e098032b8c-OsIndicationsSupported

-> variable is gone

Comment 9 Daniel Berrangé 2023-08-03 14:59:48 UTC
(In reply to Oliver Steffen from comment #8)
> I was able to test this on an AMD SEV-ES capable machine and I can confirm
> that EFI variables are not preserved across reboots.

This is true of all of the confidential virt tech in general - TDX, SEV, SEV-ES, SEV-SNP.

In all of them, while the EFI code is covered by the confidential VM attestation, the EFI variables cannot be covered as they're inherently variable data and under control of the host OS admin.

This forces an initial model where the EFI builds are stateless, such that variable changes are never persisted.

It will eventually be possible to address this in the SEV-SNP case. It would require taking advantage of SVSM, to do an early pre-(firmware|OS)-boot attestation to acquire keys to provision and unlock encrypted state for EFI variable storage and vTPM NVRAM storage. 

In the absence of SVSM pre-(firmware|OS)-boot attestation though, the guest OS image cannot rely on being able to persist EFI vars.

IOW, the shim fallback impl is not currently supportable with confidential virt and TPMs, unless a persistence mechanism can be provided.

Some options

 * Modify shim to NOT do a reboot for fallback mode when in a confidential VM, instead just make it boot directly. The TPM measurements are still usable as they would be the same on every boot by virtue of the EFI being stateless. Wouldnt want this change in an SVSM environment though
 * Instead of a single EFI firmware blob for code have the hypervisor provisioning tool maintain a set of known good firmware for each supported guest OS, with the EFI specific boot entry hardcoded avoiding shim fallback being triggered
 * Declare vTPM to be an unsupported feature unless EFI variable persistence is available.


In the case of SEV/SEV-ES specifically though, I think there's a case to be made that vTPM should be an unsupported configuration. It makes little sense to go to the trouble of using a confidential VM, and then rely on a vTPM impl that's entirely under control of the untrusted host OS admin.

Comment 10 zixchen 2023-08-08 02:12:21 UTC
Thanks Oliver and Daniel. From QE view, if this could be addressed with SEV-SNP eventually, I agree to doc a limitation of vtpm reboot with sev-es VM.

Comment 11 Oliver Steffen 2023-08-14 11:37:49 UTC
zixchen: I am still not sure how to proceed, discussion is ongoing.


Quoting an email from Cole <crobinso>:

> Behavior was added to shim here:
> https://github.com/rhboot/shim/commit/431b8a2e75a71a0b1f47d47d3f045b1e3efbce53
>
> commit 431b8a2e75a71a0b1f47d47d3f045b1e3efbce53
> Author: Peter Jones <pjones>
> Date:   Mon Jul 31 13:10:41 2017 -0400
>
>     Make fallback aware of tpm measurements, and reboot if tpm is used.
> 
>     Since booting the entry with fallback in the stack of things that got
>     measured will result in all the wrong PCR values, in the cases where TPM
>     is present and enabled, use ->Reset() instead of loading the Boot####
>     variable and executing its target.
> 
> Note this behavior also causes some weirdness with even non-SEV
> virt-install installations with TPM attached, the reset is unexpected
> and tricks virt-install into thinking the install was completed:
> https://bugzilla.redhat.com/show_bug.cgi?id=2133525
> 
> This later shim commit provides an escape hatch with an NV variable. Any
> way to use that for SEV and non SEV cases?
> https://github.com/rhboot/shim/commit/a5db51a52e8d4cae938fc807b991383309dffca7

The "escape hatch" is setting the EFI variable "FB_NO_REBOOT". Maybe the OVMF 
could set this variable in the store on startup?

Comment 12 Vivek Goyal 2023-08-14 18:53:04 UTC
(In reply to Daniel Berrangé from comment #9)
> (In reply to Oliver Steffen from comment #8)
> > I was able to test this on an AMD SEV-ES capable machine and I can confirm
> > that EFI variables are not preserved across reboots.
> 
> This is true of all of the confidential virt tech in general - TDX, SEV,
> SEV-ES, SEV-SNP.
> 
> In all of them, while the EFI code is covered by the confidential VM
> attestation, the EFI variables cannot be covered as they're inherently
> variable data and under control of the host OS admin.
> 
> This forces an initial model where the EFI builds are stateless, such that
> variable changes are never persisted.
> 
> It will eventually be possible to address this in the SEV-SNP case. It would
> require taking advantage of SVSM, to do an early pre-(firmware|OS)-boot
> attestation to acquire keys to provision and unlock encrypted state for EFI
> variable storage and vTPM NVRAM storage. 
> 
> In the absence of SVSM pre-(firmware|OS)-boot attestation though, the guest
> OS image cannot rely on being able to persist EFI vars.
> 
> IOW, the shim fallback impl is not currently supportable with confidential
> virt and TPMs, unless a persistence mechanism can be provided.
> 
> Some options
> 
>  * Modify shim to NOT do a reboot for fallback mode when in a confidential
> VM, instead just make it boot directly. The TPM measurements are still
> usable as they would be the same on every boot by virtue of the EFI being
> stateless. Wouldnt want this change in an SVSM environment though


Will be good if we can make it work without having to reset/reboot.

>  * Instead of a single EFI firmware blob for code have the hypervisor
> provisioning tool maintain a set of known good firmware for each supported
> guest OS, with the EFI specific boot entry hardcoded avoiding shim fallback
> being triggered
>  * Declare vTPM to be an unsupported feature unless EFI variable persistence
> is available.
> 

EFI variable persistence is a desired feature but nobody is working on that AFAIK
in coconut-SVSM. Also most likely vTPM support will come first without EFI
variable persistence and we will like it to work.

So it probably will be good to not rely on EFI variable persistence being available
and we should be able to work without that.

Peter, do you any thoughts on this. We don't have EFI variable persistence available
in case of confidential VMs yet and it might not happen for a long time. Is there
a way to make it work even without persistence EFI variable support.

> 
> In the case of SEV/SEV-ES specifically though, I think there's a case to be
> made that vTPM should be an unsupported configuration. It makes little sense
> to go to the trouble of using a confidential VM, and then rely on a vTPM
> impl that's entirely under control of the untrusted host OS admin.

Fair enough. I am more worried about SEV-SNP + SVSM + vTPM + no-persitent-EFI-variable-store configuration. I suspect that's going to be a supported use case for a while.

Comment 13 Oliver Steffen 2023-08-21 13:41:40 UTC
Investigating further the workaround from the firmware side, as an intermediate
solution.

Setting the EFI variable "FB_NO_REBOOT" to 1 (binary uint32, little endian)
under the shim guid should cause shin fallback to skip the reboot (see
shim's fallback.c, around lines 1061, 1077, 1187).

Although the variable store of the AMD SEV firmware is volatile, the firmware
image still contains a variable store section. The virt-fw-vars tool can be
used to set variables in the store of the combined firmware image (thanks Gerd
for the hint!) The tool even has a shortcut to set the no-reboot for shim. The
image can be modified like this:

$ virt-fw-vars -v \
    -i /usr/share/edk2/ovmf/OVMF.amdsev.fd \
    --set-fallback-no-reboot \
    -o OVMF.amdsev.noreboot.fd
INFO: reading raw edk2 varstore from /usr/share/edk2/ovmf/OVMF.amdsev.fd
INFO: var store range: 0x64 -> 0xe000
INFO: create variable FB_NO_REBOOT
INFO: set variable FB_NO_REBOOT: 1
INFO: writing raw edk2 varstore to OVMF.amdsev.noreboot.fd


Checking what is in the varstore:

$ virt-fw-vars -v -i ./OVMF.amdsev.noreboot.fd --output-json -
INFO: reading raw edk2 varstore from ./OVMF.amdsev.noreboot.fd
INFO: var store range: 0x64 -> 0xe000
{
    "version": 2,
    "variables": [
        {
            "name": "FB_NO_REBOOT",
            "guid": "605dab50-e046-4300-abb6-3dd810dd8b23",
            "attr": 7,
            "data": "01000000"
        }
    ]
}

I would now expect shim to boot straight on into RHEL.
But I still get the countdown, followed by a reboot.

Booting through manually and examining the varstore from userspace I see:

# ls /sys/firmware/efi/efivars/
Boot0000-8be4df61-93ca-11d2-aa0d-00e098032b8c
Boot0001-8be4df61-93ca-11d2-aa0d-00e098032b8c
Boot0003-8be4df61-93ca-11d2-aa0d-00e098032b8c
BootCurrent-8be4df61-93ca-11d2-aa0d-00e098032b8c
BootOptionSupport-8be4df61-93ca-11d2-aa0d-00e098032b8c
BootOrder-8be4df61-93ca-11d2-aa0d-00e098032b8c
ConIn-8be4df61-93ca-11d2-aa0d-00e098032b8c
ConInDev-8be4df61-93ca-11d2-aa0d-00e098032b8c
ConOut-8be4df61-93ca-11d2-aa0d-00e098032b8c
ConOutDev-8be4df61-93ca-11d2-aa0d-00e098032b8c
ErrOut-8be4df61-93ca-11d2-aa0d-00e098032b8c
ErrOutDev-8be4df61-93ca-11d2-aa0d-00e098032b8c
Lang-8be4df61-93ca-11d2-aa0d-00e098032b8c
LangCodes-8be4df61-93ca-11d2-aa0d-00e098032b8c
MTC-eb704011-1402-11d3-8e77-00a0c969723b
MokListRT-605dab50-e046-4300-abb6-3dd810dd8b23
MokListTrustedRT-605dab50-e046-4300-abb6-3dd810dd8b23
MokListXRT-605dab50-e046-4300-abb6-3dd810dd8b23
OsIndicationsSupported-8be4df61-93ca-11d2-aa0d-00e098032b8c
PlatformLang-8be4df61-93ca-11d2-aa0d-00e098032b8c
PlatformLangCodes-8be4df61-93ca-11d2-aa0d-00e098032b8c
PlatformRecovery0000-8be4df61-93ca-11d2-aa0d-00e098032b8c
SbatLevelRT-605dab50-e046-4300-abb6-3dd810dd8b23
Timeout-8be4df61-93ca-11d2-aa0d-00e098032b8c
VarErrorFlag-04b37fe8-f6ae-480b-bdd5-37d98c5e89aa

The FB_NO_REBOOT variable is missing.

Maybe the content of the in-flash varstore is not used at all?


Another option could be to add some code to the SEV firmware part that inserts
the variable into the store (via the regular Variable Protocol), before
handing over to the efi applications.

Comment 14 Daniel Berrangé 2023-08-21 13:48:36 UTC
(In reply to Oliver Steffen from comment #13)
> Investigating further the workaround from the firmware side, as an
> intermediate
> solution.
> 
> Setting the EFI variable "FB_NO_REBOOT" to 1 (binary uint32, little endian)
> under the shim guid should cause shin fallback to skip the reboot (see
> shim's fallback.c, around lines 1061, 1077, 1187).
> 
> Although the variable store of the AMD SEV firmware is volatile, the firmware
> image still contains a variable store section. The virt-fw-vars tool can be
> used to set variables in the store of the combined firmware image (thanks
> Gerd
> for the hint!) The tool even has a shortcut to set the no-reboot for shim.
> The
> image can be modified like this:
> 
> $ virt-fw-vars -v \
>     -i /usr/share/edk2/ovmf/OVMF.amdsev.fd \
>     --set-fallback-no-reboot \
>     -o OVMF.amdsev.noreboot.fd
> INFO: reading raw edk2 varstore from /usr/share/edk2/ovmf/OVMF.amdsev.fd
> INFO: var store range: 0x64 -> 0xe000
> INFO: create variable FB_NO_REBOOT
> INFO: set variable FB_NO_REBOOT: 1
> INFO: writing raw edk2 varstore to OVMF.amdsev.noreboot.fd
> 
> 
> Checking what is in the varstore:
> 
> $ virt-fw-vars -v -i ./OVMF.amdsev.noreboot.fd --output-json -
> INFO: reading raw edk2 varstore from ./OVMF.amdsev.noreboot.fd
> INFO: var store range: 0x64 -> 0xe000
> {
>     "version": 2,
>     "variables": [
>         {
>             "name": "FB_NO_REBOOT",
>             "guid": "605dab50-e046-4300-abb6-3dd810dd8b23",
>             "attr": 7,
>             "data": "01000000"
>         }
>     ]
> }
> 
> I would now expect shim to boot straight on into RHEL.
> But I still get the countdown, followed by a reboot.
> 
> Booting through manually and examining the varstore from userspace I see:
> 
> # ls /sys/firmware/efi/efivars/
...snip...

> The FB_NO_REBOOT variable is missing.
> 
> Maybe the content of the in-flash varstore is not used at all?

I hope it is, because we already rely on that working when we enroll the secureboot certs & DBX in edk2.spec

# experimental stateless builds
virt-fw-vars --input   Fedora/experimental/OVMF.stateless.fd \
             --output  Fedora/experimental/OVMF.stateless.secboot.fd \
             --set-dbx DBXUpdate-%{DBXDATE}.x64.bin \
             --enroll-redhat --secure-boot

Ideally we should only need to add the   --set-fallback-no-reboot flag to this command

Comment 15 Gerd Hoffmann 2023-08-23 07:58:48 UTC
> > Maybe the content of the in-flash varstore is not used at all?
> 
> I hope it is, because we already rely on that working when we enroll the
> secureboot certs & DBX in edk2.spec
> 
> # experimental stateless builds
> virt-fw-vars --input   Fedora/experimental/OVMF.stateless.fd \
>              --output  Fedora/experimental/OVMF.stateless.secboot.fd \
>              --set-dbx DBXUpdate-%{DBXDATE}.x64.bin \
>              --enroll-redhat --secure-boot
> 
> Ideally we should only need to add the   --set-fallback-no-reboot flag to
> this command

This is about the OVMF.amdsev.fd build I think.

One difference between them is that the stateless build has secure boot
support enabled.

Secure boot enabled builds copy over the pristine variable store from ROM or
readonly flash to RAM, to make sure the guest can not change the variables
in RAM directly, reboot, and circumvent secure boot settings that way (see
OvmfPkg/PlatformPei/Platform.c, ReserveEmuVariableNvStore())

Builds without secure boot support don't do that, which gives us a somewhat
persistent variable store even without flash (It will survive reset but
will not survive poweroff).

The latter could probably improved to check whenever there is a valid variable
store in RAM and should that not be the case copy over too.

For the stateless builds just adding --set-fallback-no-reboot should indeed
do the trick.

Comment 16 Oliver Steffen 2023-08-23 15:12:36 UTC
See this MR
https://gitlab.com/redhat/centos-stream/src/edk2/-/merge_requests/46
for a patch for EDK2 that sets the FB_NO_REBOOT variable at the end of the Dxe phase
of the AmdSev firmware. In my tests I was able to boot a plain RHEL9 guest in SEV-ES
with a swtpm device present.

Is this acceptable as a short-term workaround?

Comment 17 Gerd Hoffmann 2023-08-24 07:42:10 UTC
> The latter could probably improved to check whenever there is a valid
> variable store in RAM and should that not be the case copy over too.

[ not tested yet other than compiling ]

--- a/OvmfPkg/PlatformPei/Platform.c
+++ b/OvmfPkg/PlatformPei/Platform.c
@@ -16,6 +16,7 @@
 //
 // The Library classes this module consumes
 //
+#include <Guid/SystemNvDataGuid.h>
 #include <Library/BaseMemoryLib.h>
 #include <Library/BaseLib.h>
 #include <Library/DebugLib.h>
@@ -216,13 +217,18 @@ VOID
 ReserveEmuVariableNvStore (
   )
 {
-  EFI_PHYSICAL_ADDRESS  VariableStore;
-  RETURN_STATUS         PcdStatus;
+  EFI_PHYSICAL_ADDRESS        VariableStore;
+  RETURN_STATUS               PcdStatus;
+  EFI_FIRMWARE_VOLUME_HEADER  *FvHdr;
+  STATIC EFI_GUID             FvHdrGUID = EFI_SYSTEM_NV_DATA_FV_GUID;
 
-  VariableStore = (EFI_PHYSICAL_ADDRESS)(UINTN)PlatformReserveEmuVariableNvStore ();
+  FvHdr         = PlatformReserveEmuVariableNvStore ();
+  VariableStore = (EFI_PHYSICAL_ADDRESS)(UINTN)FvHdr;
   PcdStatus     = PcdSet64S (PcdEmuVariableNvStoreReserved, VariableStore);
 
-  if (FeaturePcdGet (PcdSecureBootSupported)) {
+  if (FeaturePcdGet (PcdSecureBootSupported) ||
+      !CompareGuid (&FvHdrGUID, &FvHdr->FileSystemGuid))
+  {
     // restore emulated VarStore from pristine ROM copy
     PlatformInitEmuVariableNvStore ((VOID *)(UINTN)VariableStore);
   }

I think this is more flexible and also has a better chance of being accepted upstream.

I'm also fine with the MR in comment 16 for 9.3, we will need some downstream-only
solution anyway for 9.3, the deadlines are so close that we can't wait for upsteam
to settle on something.

Comment 21 Xueqiang Wei 2023-08-25 06:31:17 UTC
QE bot(pre verify): Set 'Verified:Tested,SanityOnly' as gating/tier1 test pass.

Comment 22 zixchen 2023-08-25 08:13:39 UTC
Tested edk2-20230524-3.el9, no issue found.

Version:
edk2-ovmf-20230524-3.el9
qemu-kvm-8.0.0-13.el9.x86_64
kernel-5.14.0-360.el9.x86_64

Steps:
1. /usr/bin/swtpm socket --ctrl type=unixio,path=/root/avocado/data/avocado-vt/swtpm/avocado-vt-vm1_tpm0_swtpm.sock,mode=0600 --tpmstate dir=/root/avocado/data/avocado-vt/swtpm/avocado-vt-vm1_tpm0_state,mode=0600 --terminate --tpm2 --log file=/vtpm_avocado-vt-vm1_tpm0_swtpm.log
2. /usr/libexec/qemu-kvm \
     -S  \
     -name 'avocado-vt-vm1'  \
     -sandbox on  \
     -blockdev '{"node-name": "file_ovmf_code", "driver": "file", "filename": "/usr/share/edk2/ovmf/OVMF.amdsev.fd", "auto-read-only": true, "discard": "unmap"}' \
     -blockdev '{"node-name": "drive_ovmf_code", "driver": "raw", "read-only": true, "file": "file_ovmf_code"}' \
     -machine q35,pflash0=drive_ovmf_code,confidential-guest-support=lsec0,memory-backend=mem-machine_mem \
     -device '{"id": "pcie-root-port-0", "driver": "pcie-root-port", "multifunction": true, "bus": "pcie.0", "addr": "0x1", "chassis": 1}' \
     -device '{"id": "pcie-pci-bridge-0", "driver": "pcie-pci-bridge", "addr": "0x0", "bus": "pcie-root-port-0"}' \
     -object sev-guest,id=lsec0,policy=7,cbitpos=51,reduced-phys-bits=1  \
     -nodefaults \
     -device '{"driver": "VGA", "bus": "pcie.0", "addr": "0x2"}' \
     -device '{"driver": "i6300esb", "bus": "pcie-pci-bridge-0", "addr": "0x1"}' \
     -watchdog-action reset \
     -m 4096 \
     -object '{"size": 4294967296, "id": "mem-machine_mem", "qom-type": "memory-backend-ram"}'  \
     -smp 48,maxcpus=48,cores=24,threads=1,dies=1,sockets=2  \
     -cpu 'EPYC-Milan',x2apic=on,tsc-deadline=on,hypervisor=on,tsc-adjust=on,vaes=on,vpclmulqdq=on,spec-ctrl=on,stibp=on,arch-capabilities=on,ssbd=on,cmp-legacy=on,virt-ssbd=on,lbrv=on,tsc-scale=on,vmcb-clean=on,pause-filter=on,pfthreshold=on,v-vmsave-vmload=on,vgif=on,rdctl-no=on,skip-l1dfl-vmentry=on,mds-no=on,pschange-mc-no=on,kvm_pv_unhalt=on \
     -device '{"driver": "intel-hda", "bus": "pcie-pci-bridge-0", "addr": "0x2"}' \
     -device '{"driver": "hda-duplex"}' \
     -chardev socket,id=qmp_id_qmpmonitor1,server=on,path=/var/tmp/avocado_4jctbnnb/monitor-qmpmonitor1-20230824-232922-FItchEWx,wait=off  \
     -mon chardev=qmp_id_qmpmonitor1,mode=control \
     -chardev socket,id=qmp_id_catch_monitor,server=on,path=/var/tmp/avocado_4jctbnnb/monitor-catch_monitor-20230824-232922-FItchEWx,wait=off  \
     -mon chardev=qmp_id_catch_monitor,mode=control \
     -device '{"ioport": 1285, "driver": "pvpanic", "id": "iddIqPyA"}' \
     -chardev socket,id=chardev_serial0,server=on,path=/var/tmp/avocado_4jctbnnb/serial-serial0-20230824-232922-FItchEWx,wait=off \
     -device '{"id": "serial0", "driver": "isa-serial", "chardev": "chardev_serial0"}' \
     -object '{"qom-type": "rng-random", "filename": "/dev/urandom", "id": "passthrough-XB6ReiR9"}' \
     -device '{"id": "pcie-root-port-1", "port": 1, "driver": "pcie-root-port", "addr": "0x1.0x1", "bus": "pcie.0", "chassis": 2}' \
     -device '{"driver": "virtio-rng-pci", "id": "virtio-rng-YIeepccp", "rng": "passthrough-XB6ReiR9", "bus": "pcie-root-port-1", "addr": "0x0"}'  \
     -chardev socket,id=seabioslog_id_20230824-232922-FItchEWx,path=/var/tmp/avocado_4jctbnnb/seabios-20230824-232922-FItchEWx,server=on,wait=off \
     -device isa-debugcon,chardev=seabioslog_id_20230824-232922-FItchEWx,iobase=0x402 \
     -device '{"driver": "ich9-usb-ehci1", "id": "usb1", "addr": "0x1d.0x7", "multifunction": true, "bus": "pcie.0"}' \
     -device '{"driver": "ich9-usb-uhci1", "id": "usb1.0", "multifunction": true, "masterbus": "usb1.0", "addr": "0x1d.0x0", "firstport": 0, "bus": "pcie.0"}' \
     -device '{"driver": "ich9-usb-uhci2", "id": "usb1.1", "multifunction": true, "masterbus": "usb1.0", "addr": "0x1d.0x2", "firstport": 2, "bus": "pcie.0"}' \
     -device '{"driver": "ich9-usb-uhci3", "id": "usb1.2", "multifunction": true, "masterbus": "usb1.0", "addr": "0x1d.0x4", "firstport": 4, "bus": "pcie.0"}' \
     -device '{"id": "pcie-root-port-2", "port": 2, "driver": "pcie-root-port", "addr": "0x1.0x2", "bus": "pcie.0", "chassis": 3}' \
     -device '{"driver": "qemu-xhci", "id": "usb2", "bus": "pcie-root-port-2", "addr": "0x0"}' \
     -device '{"driver": "usb-tablet", "id": "usb-tablet1", "bus": "usb2.0", "port": "1"}' \
     -device '{"id": "pcie-root-port-3", "port": 3, "driver": "pcie-root-port", "addr": "0x1.0x3", "bus": "pcie.0", "chassis": 4}' \
     -device '{"id": "virtio_scsi_pci0", "driver": "virtio-scsi-pci", "bus": "pcie-root-port-3", "addr": "0x0"}' \
     -blockdev '{"node-name": "file_image1", "driver": "file", "auto-read-only": true, "discard": "unmap", "aio": "threads", "filename": "/home/rhel93_sev.qcow2", "cache": {"direct": true, "no-flush": false}}' \
     -blockdev '{"node-name": "drive_image1", "driver": "qcow2", "read-only": false, "cache": {"direct": true, "no-flush": false}, "file": "file_image1"}' \
     -device '{"driver": "scsi-hd", "id": "image1", "drive": "drive_image1", "write-cache": "on"}' \
     -device '{"id": "pcie-root-port-4", "port": 4, "driver": "pcie-root-port", "addr": "0x1.0x4", "bus": "pcie.0", "chassis": 5}' \
     -device '{"driver": "virtio-net-pci", "mac": "9a:f0:c3:82:7e:1c", "id": "idHN48sN", "netdev": "idVMi8hy", "bus": "pcie-root-port-4", "addr": "0x0"}'  \
     -netdev tap,id=idVMi8hy  \
     -vnc :0  \
     -rtc base=utc,clock=host,driftfix=slew  \
     -boot menu=off,order=cdn,once=c,strict=off \

Result:
Sev-es guest can boot with vtpm devices without system_reset.

Comment 25 Xueqiang Wei 2023-08-27 18:30:16 UTC
Tested edk2 and qemu gating test loops, no new bug was found.

Versions:
kernel-5.14.0-356.el9.x86_64
qemu-kvm-8.0.0-13.el9
edk2-ovmf-20230524-3.el9.noarch

1. Edk2_test_with_edk2-ovmf-20230524-3.el9
Job link: http://fileshare.hosts.qa.psi.pek2.redhat.com/pub/logs/edk2_test_with_edk2-ovmf-20230524-3.el9/results.html
2. Qemu_gating_test_rhel9_with_edk2_test_with_edk2-ovmf-20230524-3.el9
Job link: http://fileshare.hosts.qa.psi.pek2.redhat.com/pub/logs/qemu_gating_test_rhel9_with_edk2_test_with_edk2-ovmf-20230524-3.el9/results.html

Hit a system poweroff issue, tracked by Bug 2234782 - system failed to poweroff

Comment 26 zixchen 2023-08-28 06:05:46 UTC
SEV regression test passed with edk2-ovmf-20230524-3.el9.noarch

Version:
kernel-5.14.0-360.el9.x86_64
qemu-kvm-8.0.0-13.el9
edk2-ovmf-20230524-3.el9.noarch

Sanity sev test log:
https://beaker.engineering.redhat.com/jobs/8225293

Comment 28 errata-xmlrpc 2023-11-07 08:24:30 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: edk2 security, bug fix, and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2023:6330


Note You need to log in before you can comment on or make changes to this bug.