Bug 2376851 - TDX enable guest has immediate failure on kernel boot
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Fedora
Classification: Fedora
Component: edk2
Version: rawhide
Hardware: Unspecified
OS: Linux
Priority: unspecified
Severity: medium
Target Milestone: ---
Assignee: Paolo Bonzini
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2025-07-07 15:52 UTC by Daniel Berrangé
Modified: 2025-07-16 00:57 UTC (History)

Fixed In Version: edk2-20250523-11.fc42
Clone Of:
Environment:
Last Closed: 2025-07-16 00:57:36 UTC
Type: ---
Embargoed:



Description Daniel Berrangé 2025-07-07 15:52:10 UTC
After updating to edk2-ovmf-20250523-6.fc43.noarch I am no longer able to successfully run any TDX enabled guests.

The previous build edk2-ovmf-20250221-8.fc42.noarch.rpm worked fine, as do the builds from CentOS Stream 9 (edk2-ovmf-20241117-4.el9.noarch.rpm) and Stream 10 (edk2-ovmf-20250523-2.el10.noarch.rpm). The latter ought to be the same base version of EDK2 as in Fedora rawhide, so it is particularly strange that the c10s build works but the rawhide build fails.

I enabled isa-debugcon and compared the EDK2 logs from 20250523 vs 20250221. The 20250523 log stops immediately at the point where you'd expect to see some TDX initialization:

@@ -3,2238 +3,20 @@
 ResourceAttribute: 0x7
 PhysicalStart: 0x0
 ResourceLength: 0x800000
 Owner: 00000000-0000-0000-0000-000000000000
 
 ResourceAttribute: 0x7
 PhysicalStart: 0x806000
 ResourceLength: 0x3000
 Owner: 00000000-0000-0000-0000-000000000000
 
 ResourceAttribute: 0x7
 PhysicalStart: 0x80D000
 ResourceLength: 0x3000
 Owner: 00000000-0000-0000-0000-000000000000
 
 ResourceAttribute: 0x7
 PhysicalStart: 0x820000
 ResourceLength: 0x9D3E0000
 Owner: 00000000-0000-0000-0000-000000000000
 
-SecCoreStartupWithStack(0xFFFCC000, 0x820000)
-SecMtrrSetup: Skip TD-Guest
-Tdx started with(Hob: 0x809000, Gpaw: 0x34, Cpus: 1)
-LowMemory Start and End: 820000, 9DC00000
-HobList: 820000
-InitializePlatform in Pei-less boot
-CMOS:
-00: 56 00 22 00 15 00 02 07 07 25 26 02 10 80 00 00
-10: 00 00 00 00 06 80 02 FF FF 00 00 00 00 00 00 00
-20: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
-30: FF FF 20 00 C0 9C 00 20 30 00 00 00 00 12 00 00
-40: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
-50: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
-60: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
-70: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
-HostBridgeDeviceId = 0x29C0
-Select Item: 0x19
-Select Item: 0x0
-FW CFG Signature: 0x554D4551
-Select Item: 0x1
-FW CFG Revision: 0x3
-QemuFwCfg interface is supported.
-Select Item: 0x19
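
A minimal sketch of the comparison described above, assuming both isa-debugcon logs have already been read into lists of lines. The function name and the sample lines are illustrative only; it reports the first position at which the working and broken logs differ, which is where the broken log ends when it simply stops early.

```python
from itertools import zip_longest


def first_divergence(good_lines, bad_lines):
    """Return (index, good_line, bad_line) for the first differing line.

    A side that has run out of lines is reported as None.
    Returns None when both logs are identical.
    """
    for index, (good, bad) in enumerate(zip_longest(good_lines, bad_lines)):
        if good != bad:
            return index, good, bad
    return None


# Hypothetical excerpts: the broken log ends where the working one continues.
good = [
    "Owner: 00000000-0000-0000-0000-000000000000",
    "",
    "SecCoreStartupWithStack(0xFFFCC000, 0x820000)",
    "SecMtrrSetup: Skip TD-Guest",
]
bad = [
    "Owner: 00000000-0000-0000-0000-000000000000",
    "",
]

print(first_divergence(good, bad))
```

A result whose third element is None means the broken log ended rather than printing something different, which is also what a truncated log file would look like.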


Reproducible: Always

Steps to Reproduce:
1. Power on any TDX guest
2. Select a kernel from grub to boot

Actual Results:
VM immediately powers off before the kernel emits any console messages.

Comment 1 Daniel Berrangé 2025-07-07 15:53:18 UTC
Created attachment 2096380 [details]
Broken EDK2 log from edk2-ovmf-20250523-6.fc43.noarch

Comment 2 Daniel Berrangé 2025-07-07 15:53:56 UTC
Created attachment 2096381 [details]
Working EDK2 log from edk2-ovmf-20250221-8.fc42.noarch

Comment 3 Daniel Berrangé 2025-07-07 19:45:08 UTC
The following change makes the build use options identical to RHEL's, and it works as expected:

diff --git a/edk2-build.fedora b/edk2-build.fedora
index 957a28b..7c1b609 100644
--- a/edk2-build.fedora
+++ b/edk2-build.fedora
@@ -181,13 +181,13 @@ dest = Fedora/ovmf
 cpy1 = FV/OVMF.fd OVMF.amdsev.fd
 
 [build.ovmf.inteltdx]
-desc = ovmf build for IntelTdx (2MB)
+desc = ovmf build for IntelTdx (4MB)
 conf = OvmfPkg/IntelTdx/IntelTdxX64.dsc
 arch = X64
 opts = ovmf.common
-       ovmf.2m
+       ovmf.4m
        ovmf.sb.stateless
-pcds = nx.strict
+pcds = nx.compat.x64
        la57
 plat = IntelTdx
 dest = Fedora/ovmf


This more minimal change which exclusively re-enables TPM/CC options does NOT work

diff --git a/edk2-build.fedora b/edk2-build.fedora
index 957a28b..f3366f1 100644
--- a/edk2-build.fedora
+++ b/edk2-build.fedora
@@ -17,8 +17,8 @@ FD_SIZE_4MB              = TRUE
 FD_SIZE_2MB              = TRUE
 NETWORK_ISCSI_ENABLE     = FALSE
 NETWORK_TLS_ENABLE       = FALSE
-CC_MEASUREMENT_ENABLE    = FALSE
-TPM2_ENABLE              = FALSE
+#CC_MEASUREMENT_ENABLE    = FALSE
+#TPM2_ENABLE              = FALSE
 #BUILD_SHELL              = FALSE
 
 [opts.ovmf.sb.smm]


Similarly this change which exclusively switches to 4m builds does NOT work

diff --git a/edk2-build.fedora b/edk2-build.fedora
index 957a28b..1a6b49b 100644
--- a/edk2-build.fedora
+++ b/edk2-build.fedora
@@ -181,11 +181,11 @@ dest = Fedora/ovmf
 cpy1 = FV/OVMF.fd OVMF.amdsev.fd
 
 [build.ovmf.inteltdx]
-desc = ovmf build for IntelTdx (2MB)
+desc = ovmf build for IntelTdx (4MB)
 conf = OvmfPkg/IntelTdx/IntelTdxX64.dsc
 arch = X64
 opts = ovmf.common
-       ovmf.2m
+       ovmf.4m
        ovmf.sb.stateless
 pcds = nx.strict
        la57


The most minimal change that makes it work is this:


diff --git a/edk2-build.fedora b/edk2-build.fedora
index 957a28b..17b735f 100644
--- a/edk2-build.fedora
+++ b/edk2-build.fedora
@@ -187,7 +187,7 @@ arch = X64
 opts = ovmf.common
        ovmf.2m
        ovmf.sb.stateless
-pcds = nx.strict
+pcds = nx.compat.x64
        la57
 plat = IntelTdx
 dest = Fedora/ovmf
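
To check which pcds a build section ends up with, the section can be read with Python's configparser. This is only a sketch, assuming edk2-build.fedora is an INI-style file with indented continuation lines, as the diff context suggests; the SNIPPET text is copied from the diff above and list_values is a hypothetical helper.

```python
import configparser

# Section text as it appears in the diff context (before the change).
SNIPPET = """
[build.ovmf.inteltdx]
desc = ovmf build for IntelTdx (2MB)
conf = OvmfPkg/IntelTdx/IntelTdxX64.dsc
arch = X64
opts = ovmf.common
       ovmf.2m
       ovmf.sb.stateless
pcds = nx.strict
       la57
plat = IntelTdx
dest = Fedora/ovmf
"""


def list_values(cfg, section, key):
    """Split a multi-line INI value into its whitespace-separated items."""
    return cfg[section][key].split()


cfg = configparser.ConfigParser()
cfg.read_string(SNIPPET)

print(list_values(cfg, "build.ovmf.inteltdx", "pcds"))
print(list_values(cfg, "build.ovmf.inteltdx", "opts"))
```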


This is rather confusing as AFAICT the use of 'nx.strict' was already present in the edk2-ovmf-20250221-8.fc42.noarch.rpm build, which worked correctly.

So it looks like something in the rebase to 'edk2-ovmf-20250523' has caused 'nx.strict' to take effect in a way that it did not previously do.

Comment 4 Gerd Hoffmann 2025-07-08 07:06:10 UTC
Can you try https://copr.fedorainfracloud.org/coprs/kraxel/edk2.testbuilds/ ?
Also add 'grep PageFault $firmwarelog' output to this bug please.  Thanks.

Comment 5 Daniel Berrangé 2025-07-08 08:54:25 UTC
(In reply to Gerd Hoffmann from comment #4)
> Can you try https://copr.fedorainfracloud.org/coprs/kraxel/edk2.testbuilds/ ?

With edk2-ovmf-20250523-9.copr9233614.noarch I get new failure behaviour - a pagefault dump on the guest serial console

!!!! X64 Exception Type - 0E(#PF - Page-Fault)  CPU Apic ID - 00000000 !!!!
ExceptionData - 0000000000000003  I:0 R:0 U:0 W:1 P:1 PK:0 SS:0 SGX:0
RIP  - 00000000020062E0, CS  - 0000000000000038, RFLAGS - 0000000000210046
RAX  - 0000000002022000, RCX - 0000000002022000, RDX - 0000000000000000
RBX  - 000000009A1C4B18, RSP - 000000009C89DF28, RBP - 000000009C46D018
RSI  - 0000000000000000, RDI - 0000000002065078
R8   - 0000000000000000, R9  - 000000007FDA4195, R10 - 00000000990C6828
R11  - 0000000000000000, R12 - 000000007FFFF000, R13 - 0000000000000000
R14  - 000000009992E6E8, R15 - 000000009992E6F0
DS   - 0000000000000030, ES  - 0000000000000030, FS  - 0000000000000030
GS   - 0000000000000030, SS  - 0000000000000030
CR0  - 0000000080010031, CR2 - 0000000002022000, CR3 - 000000009C601000
CR4  - 0000000000000268, CR8 - 0000000000000000
DR0  - 0000000000000000, DR1 - 0000000000000000, DR2 - 0000000000000000
DR3  - 0000000000000000, DR6 - 00000000FFFF0FF0, DR7 - 0000000000000400
GDTR - 000000009C45D000 0000000000000047, LDTR - 0000000000000000
IDTR - 000000009B114018 0000000000000FFF,   TR - 0000000000000000
FXSAVE_STATE - 000000009C89DB80
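
For reference, the ExceptionData value in the dump can be decoded by hand. The sketch below assumes the standard x86 page-fault error code bit layout (P = bit 0, W = bit 1, U = bit 2, R = bit 3, I = bit 4, PK = bit 5, SS = bit 6, SGX = bit 15) and prints the flags in the same order as the firmware does; the function names are illustrative.

```python
# (bit position, label) in the order the firmware exception dump prints them.
PF_ERROR_BITS = [
    (4, "I"),     # instruction fetch
    (3, "R"),     # reserved bit set in a paging structure
    (2, "U"),     # access from user mode
    (1, "W"),     # write access
    (0, "P"),     # page was present (protection violation, not a missing page)
    (5, "PK"),    # protection key violation
    (6, "SS"),    # shadow stack access
    (15, "SGX"),  # SGX access control violation
]


def decode_pf_error(code):
    """Map each flag label to 0 or 1 for the given #PF error code."""
    return {label: (code >> bit) & 1 for bit, label in PF_ERROR_BITS}


def format_pf_error(code):
    """Render the flags as 'I:0 R:0 ...', mirroring the dump layout."""
    return " ".join(f"{label}:{value}" for label, value in decode_pf_error(code).items())


print(format_pf_error(0x3))
```

With ExceptionData = 0x3 the flags read W:1 P:1 I:0, i.e. a write to a page that is present but not writable, rather than an instruction fetch from a no-execute page.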


> Also add 'grep PageFault $firmwarelog' output to this bug please.  Thanks.

No results from that in either of the logs I've attached to this bug, nor in the logs from the copr build above.

Comment 6 Gerd Hoffmann 2025-07-08 10:23:49 UTC
> The most minimal change that makes it work is this:

> -pcds = nx.strict
> +pcds = nx.compat.x64

> This is rather confusing as AFAICT the use of 'nx.strict' was already
> present in the edk2-ovmf-20250221-8.fc42.noarch.rpm  build which worked
> correctly.

Indeed, especially as the firmware doesn't do any NX stuff that early at boot.
(didn't notice it is failing /that/ early when checking the bug the first time).

So it might be something totally unrelated, which is triggered by good/bad
luck, maybe due to changed image layout.

The firmware simply hangs?  Could be here with the messages you get:

    [ in OvmfPkg/IntelTdx/Sec/SecMain.c ]
    if (TdxHelperProcessTdHob () != EFI_SUCCESS) {
      CpuDeadLoop ();
    }


> With edk2-ovmf-20250523-9.copr9233614.noarch I get new failure behaviour - a
> pagefault dump on the guest serial console

Thanks.  First, strange that it makes it that far, there are no changes
in the early TDX code path.

> !!!! X64 Exception Type - 0E(#PF - Page-Fault)  CPU Apic ID - 00000000 !!!!
> ExceptionData - 0000000000000003  I:0 R:0 U:0 W:1 P:1 PK:0 SS:0 SGX:0

Known grub bug if the EFI_MEMORY_ATTRIBUTE_PROTOCOL is present.
The build has a custom page fault handler to fixup NX faults
(and warn about them, like selinux in permissive mode), which
apparently is not active in TDX mode.  Need to check why.

There is a runtime switch to turn off EFI_MEMORY_ATTRIBUTE_PROTOCOL (downstream builds only):
-fw_cfg name=opt/org.tianocore/UninstallMemAttrProtocol,string=yes
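
A small sketch for scripting test runs with and without the switch: it only builds the extra QEMU arguments in the -fw_cfg name=...,string=... form quoted above. The helper name is hypothetical, and the fw_cfg key is the one given in this comment (downstream builds only).

```python
# fw_cfg key quoted in this comment; honoured by downstream builds only.
UNINSTALL_KEY = "opt/org.tianocore/UninstallMemAttrProtocol"


def fw_cfg_args(name, value):
    """Return the QEMU argument pair for one string-valued fw_cfg entry."""
    return ["-fw_cfg", f"name={name},string={value}"]


# Arguments for a run with EFI_MEMORY_ATTRIBUTE_PROTOCOL turned off.
print(fw_cfg_args(UNINSTALL_KEY, "yes"))
```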

Comment 7 Gerd Hoffmann 2025-07-08 10:30:43 UTC
new copr test builds underway [compiling still]

Comment 8 Daniel Berrangé 2025-07-08 10:54:48 UTC
(In reply to Gerd Hoffmann from comment #6)
> > The most minimal change that makes it work is this:
> 
> > -pcds = nx.strict
> > +pcds = nx.compat.x64
> 
> > This is rather confusing as AFAICT the use of 'nx.strict' was already
> > present in the edk2-ovmf-20250221-8.fc42.noarch.rpm  build which worked
> > correctly.
> 
> Indeed, especially as the firmware doesn't do any NX stuff that early at
> boot.
> (didn't notice it is failing /that/ early when checking the bug the first
> time).
> 
> So it might be something totally unrelated, which is triggered by good/bad
> luck, maybe due to changed image layout.
> 
> The firmware simply hangs?

Actually it isn't a hang - the whole VM resets - libvirt receives this
from QEMU:

  {"timestamp": {"seconds": 1751971878, "microseconds": 54790}, "event": "SHUTDOWN", "data": {"guest": true, "reason": "guest-reset"}}

NB: TDX can't do normal resets, so QEMU always shuts down for resets, and libvirt has to re-create the whole VM
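
A minimal sketch for spotting this condition in captured QMP output, assuming one JSON event per line in the form shown above. The function name is illustrative; the field names are taken from the quoted event.

```python
import json


def is_guest_reset(event_line):
    """True if the QMP event is a SHUTDOWN caused by a guest-initiated reset."""
    event = json.loads(event_line)
    data = event.get("data", {})
    return (
        event.get("event") == "SHUTDOWN"
        and data.get("guest") is True
        and data.get("reason") == "guest-reset"
    )


# The event quoted in this comment.
line = (
    '{"timestamp": {"seconds": 1751971878, "microseconds": 54790}, '
    '"event": "SHUTDOWN", "data": {"guest": true, "reason": "guest-reset"}}'
)
print(is_guest_reset(line))
```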

> > With edk2-ovmf-20250523-9.copr9233614.noarch I get new failure behaviour - a
> > pagefault dump on the guest serial console
> 
> Thanks.  First, strange that it makes it that far, there are no changes
> in the early TDX code path.
> 
> > !!!! X64 Exception Type - 0E(#PF - Page-Fault)  CPU Apic ID - 00000000 !!!!
> > ExceptionData - 0000000000000003  I:0 R:0 U:0 W:1 P:1 PK:0 SS:0 SGX:0
> 
> Known grub bug if the EFI_MEMORY_ATTRIBUTE_PROTOCOL is present.
> The build has a custom page fault handler to fixup NX faults
> (and warn about them, like selinux in permissive mode), which
> apparently is not active in TDX mode.  Need to check why.
> 
> There is a runtime switch to turn off EFI_MEMORY_ATTRIBUTE_PROTOCOL
> (downstream builds only):
> -fw_cfg name=opt/org.tianocore/UninstallMemAttrProtocol,string=yes

Setting that fw_cfg flag, the copr build exhibits the same failure mode
as current rawhide VMs - the VM immediately shuts down due to a guest reset,
either in EFI stub or Linux early boot.

Comment 9 Gerd Hoffmann 2025-07-08 12:24:26 UTC
> > The firmware simply hangs?
> 
> Actually it isn't a hang - the whole VM resets - libvirt receives this
> from QEMU:
> 
>   {"timestamp": {"seconds": 1751971878, "microseconds": 54790}, "event":
> "SHUTDOWN", "data": {"guest": true, "reason": "guest-reset"}}

OK, so the firmware does NOT sit in a CpuDeadLoop() due to unrecoverable errors.
Might get a fault it can't handle -> triple-fault -> reset.
Can we get details from kvm on what happened?  Or is that confidential in TDX mode?

> NB: TDX can't do normal resets, so QEMU always shuts down for resets, and
> libvirt has to re-create the whole VM

Yes, much like 'qemu -no-reboot' on non-cc guests.

> > There is a runtime switch to turn off EFI_MEMORY_ATTRIBUTE_PROTOCOL
> > (downstream builds only):
> > -fw_cfg name=opt/org.tianocore/UninstallMemAttrProtocol,string=yes
> 
> Setting that fw_cfg flag, the copr build exhibits the same failure mode
> as current rawhide VMs - the VM immediately shuts down due to a guest reset,
> either in EFI stub or Linux early boot.

So, with '-fw_cfg name=opt/org.tianocore/UninstallMemAttrProtocol,string=yes' you get an almost instant reset (comment #1 logfile)?

And with '-fw_cfg name=opt/org.tianocore/UninstallMemAttrProtocol,string=no' you get the page fault on the serial line like in comment #5?  [ side note: With the latest copr build this should change into 'PageFault' messages in the firmware log ]

Comment 10 Daniel Berrangé 2025-07-08 13:17:48 UTC
(In reply to Gerd Hoffmann from comment #9)
> > > The firmware simply hangs?
> > 
> > Actually it isn't a hang - the whole VM resets - libvirt receives this
> > from QEMU:
> > 
> >   {"timestamp": {"seconds": 1751971878, "microseconds": 54790}, "event":
> > "SHUTDOWN", "data": {"guest": true, "reason": "guest-reset"}}
> 
> OK, so the firmware does NOT sit in a CpuDeadLoop() due to unrecoverable
> errors.
> Might get a fault it can't handle -> triple-fault -> reset.
> Can we get details from kvm on what happened?  Or is that confidential in TDX
> mode?

I've now traced it in QEMU and got back to kvm_cpu_exec

	        case KVM_EXIT_SHUTDOWN:
	            qemu_system_reset_request(SHUTDOWN_CAUSE_GUEST_RESET);

which IIUC will happen when the guest exits with a triple-fault

> > > There is a runtime switch to turn off EFI_MEMORY_ATTRIBUTE_PROTOCOL
> > > (downstream builds only):
> > > -fw_cfg name=opt/org.tianocore/UninstallMemAttrProtocol,string=yes
> > 
> > Setting that fw_cfg flag, the copr build exhibits the same failure mode
> > as current rawhide VMs - the VM immediately shuts down due to a guest reset,
> > either in EFI stub or Linux early boot.
> 
> So, with '-fw_cfg
> name=opt/org.tianocore/UninstallMemAttrProtocol,string=yes' you get an
> almost instant reset (comment #1 logfile)?

Correct.

> And with '-fw_cfg name=opt/org.tianocore/UninstallMemAttrProtocol,string=no'
> you get the page fault on the serial line like in comment #5?  [ side note:
> With the latest copr build this should change into 'PageFault' messages in
> the firmware log ]

Correct, or to be more precise I simply don't set that -fw_cfg feature at all.

Comment 11 Gerd Hoffmann 2025-07-09 13:48:41 UTC
> > So, with '-fw_cfg
> > name=opt/org.tianocore/UninstallMemAttrProtocol,string=yes' you get an
> > almost instant reset (comment #1 logfile)?
> 
> Correct.

The log is truncated.  Append mode seems to be 'off' by default, so libvirt restarting
the guest will zap the old log content.  That explains quite a bit of the confusion ;)

So, key log message is this (on the serial console, with EFI_MEMORY_ATTRIBUTE_PROTOCOL disabled):

  EFI stub: WARNING: Unable to unprotect memory range [9a0a0000,9a0a1000]: 8000000000000003
  EFI stub: WARNING: Unable to unprotect memory range [59c00000,5b200000]: 8000000000000003

First range is the trampoline to turn on 5-level paging, second range is the kernel image.
EFI stub can not clear NX + set RW -> page fault -> boom.  Dunno why this happens with TDX
enabled only, this should not be TDX-specific at all.
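
The status value in those warnings can be decoded with the UEFI status code convention: on a 64-bit platform the top bit marks an error and the remaining bits carry the code. A minimal sketch with a partial table of error names from the UEFI specification; the function name is illustrative.

```python
EFI_ERROR_BIT = 1 << 63  # high bit set means "error" for a 64-bit EFI_STATUS

# Partial table of UEFI error codes (value with the high bit stripped).
EFI_ERROR_NAMES = {
    1: "EFI_LOAD_ERROR",
    2: "EFI_INVALID_PARAMETER",
    3: "EFI_UNSUPPORTED",
    4: "EFI_BAD_BUFFER_SIZE",
    5: "EFI_BUFFER_TOO_SMALL",
    6: "EFI_NOT_READY",
    7: "EFI_DEVICE_ERROR",
    8: "EFI_WRITE_PROTECTED",
    9: "EFI_OUT_OF_RESOURCES",
    14: "EFI_NOT_FOUND",
    15: "EFI_ACCESS_DENIED",
}


def decode_efi_status(status):
    """Return a readable name for a 64-bit EFI_STATUS value."""
    if status == 0:
        return "EFI_SUCCESS"
    if status & EFI_ERROR_BIT:
        code = status & ~EFI_ERROR_BIT
        return EFI_ERROR_NAMES.get(code, f"unknown error {code}")
    return f"warning {status}"


print(decode_efi_status(0x8000000000000003))
```

So 8000000000000003 reads as EFI_UNSUPPORTED: the call the EFI stub made to unprotect the range was rejected rather than failing partway.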

With EFI_MEMORY_ATTRIBUTE_PROTOCOL enabled this works fine, but requires the page fault
handler which compensates for the NX bug in grub, leaving this trail in the firmware log:

PageFaultInit: StrictNX disabled - installing page fault handler
PageFaultInit: mCpu->RegisterInterruptHandler: Success
PageFaultInit: gBS->CreateEvent: Success
PageFaultHandler: CR2: 000000000208A000 - RIP: 000000000206F920 - ID:0 WR:1 P:1 [0x3]
PageFaultHandler: setting RW for page 0x208A000 [large pte]
PageFaultExitBoot: fixups: 0 NX, 1 RW
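
A sketch of the log check requested earlier in this bug, extended to pull out the faulting address and instruction pointer. It assumes the PageFaultHandler line format shown in the trail above; the regular expression and function name are illustrative and may need adjusting for other builds.

```python
import re

# Matches lines such as:
# PageFaultHandler: CR2: 000000000208A000 - RIP: 000000000206F920 - ID:0 WR:1 P:1 [0x3]
FAULT_RE = re.compile(
    r"PageFaultHandler: CR2: ([0-9A-Fa-f]+) - RIP: ([0-9A-Fa-f]+)"
)


def page_faults(log_text):
    """Return a list of (cr2, rip) integer pairs for each handled fault."""
    return [(int(cr2, 16), int(rip, 16)) for cr2, rip in FAULT_RE.findall(log_text)]


# Log trail quoted in this comment.
sample = """\
PageFaultInit: StrictNX disabled - installing page fault handler
PageFaultInit: mCpu->RegisterInterruptHandler: Success
PageFaultInit: gBS->CreateEvent: Success
PageFaultHandler: CR2: 000000000208A000 - RIP: 000000000206F920 - ID:0 WR:1 P:1 [0x3]
PageFaultHandler: setting RW for page 0x208A000 [large pte]
PageFaultExitBoot: fixups: 0 NX, 1 RW
"""

for cr2, rip in page_faults(sample):
    print(f"fault at {cr2:#x} from rip {rip:#x}")
```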

Comment 12 Daniel Berrangé 2025-07-09 15:56:59 UTC
(In reply to Gerd Hoffmann from comment #11)
> > > So, with '-fw_cfg
> > > name=opt/org.tianocore/UninstallMemAttrProtocol,string=yes' you get an
> > > almost instant reset (comment #1 logfile)?
> > 
> > Correct.
> 
> Log is truncated.  append mode seems to be 'off' by default and libvirt
> restarting
> the guest will zap the old log content.  That explains quite a bit of the
> confusion ;)
> 
> So, key log message is this (on the serial console, with
> EFI_MEMORY_ATTRIBUTE_PROTOCOL disabled):
> 
>   EFI stub: WARNING: Unable to unprotect memory range [9a0a0000,9a0a1000]:
> 8000000000000003
>   EFI stub: WARNING: Unable to unprotect memory range [59c00000,5b200000]:
> 8000000000000003
> 
> First range is the trampoline to turn on 5-level paging, second range is the
> kernel image.
> EFI stub can not clear NX + set RW -> page fault -> boom.  Dunno why this
> happens with TDX
> enabled only, this should not be TDX-specific at all.
>

Urgh, I'm sorry, I don't know why I forgot to copy those lines of console
output from EFI stub into the initial description :-(

Comment 13 Fedora Update System 2025-07-11 12:22:09 UTC
FEDORA-2025-7e2a69db6b (edk2-20250523-11.fc42) has been submitted as an update to Fedora 42.
https://bodhi.fedoraproject.org/updates/FEDORA-2025-7e2a69db6b

Comment 15 Fedora Update System 2025-07-12 04:20:59 UTC
FEDORA-2025-7e2a69db6b has been pushed to the Fedora 42 testing repository.
Soon you'll be able to install the update with the following command:
`sudo dnf upgrade --enablerepo=updates-testing --refresh --advisory=FEDORA-2025-7e2a69db6b`
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2025-7e2a69db6b

See also https://fedoraproject.org/wiki/QA:Updates_Testing for more information on how to test updates.

Comment 16 Fedora Update System 2025-07-16 00:57:36 UTC
FEDORA-2025-7e2a69db6b (edk2-20250523-11.fc42) has been pushed to the Fedora 42 stable repository.
If problem still persists, please make note of it in this bug report.

