2420062 – requesting updated build to fix issues with AMD APU platforms

Summary: requesting updated build to fix issues with AMD APU platforms

Keywords:
Status:	ON_QA
Alias:	None
Product:	Fedora
Classification:	Fedora
Component:	linux-firmware
Sub Component:
Version:	rawhide
Hardware:	Unspecified
OS:	Unspecified
Priority:	unspecified
Severity:	unspecified
Target Milestone:	---
Assignee:	David Woodhouse
QA Contact:	Fedora Extras Quality Assurance
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:

Reported:	2025-12-08 17:33 UTC by Tim Flink
Modified:	2026-01-13 08:27 UTC
CC List:	14 users
Fixed In Version:
Clone Of:
Environment:
Last Closed:
Type:	Bug
Embargoed:
Dependent Products:


Description Tim Flink 2025-12-08 17:33:24 UTC

There has been at least one report of this in Fedora, as negative karma on the latest build:

https://bodhi.fedoraproject.org/updates/FEDORA-2025-698dc1bbfa

There have been various issues filed upstream about some newer AMD platforms (Strix Point and Strix Halo APUs):

https://gitlab.freedesktop.org/drm/amd/-/issues/4751
https://gitlab.freedesktop.org/drm/amd/-/issues/4738
https://gitlab.freedesktop.org/drm/amd/-/issues/4737

The commits that address those issues are:

https://git.kernel.org/pub/scm/linux/kernel/git/firmware/linux-firmware.git/commit/?id=3d5c8135206cef364e7d353711b3e7358a90d152
https://git.kernel.org/pub/scm/linux/kernel/git/firmware/linux-firmware.git/commit/?id=c092c7487eb7c3d58697f490ff605bc38f4cc947
https://git.kernel.org/pub/scm/linux/kernel/git/firmware/linux-firmware.git/commit/?id=baf6c2f67a247eba7f298ed74bc471de43ad632d

Please make a new build with at least those commits in it to address those upstream issues.

Comment 1 Peter Robinson 2025-12-08 17:47:57 UTC

Groan, so good to see the AMD CI is non-existent :-/

Comment 2 Mario Limonciello 2025-12-08 18:13:33 UTC

Unfortunately, linux-firmware is totally a blind spot.  There is ROCm CI for all things ROCm, and there is IGT for all things GFX.

These interactions are where the problems are :/

I think the right answer is going to be adding runners to the ROCm CI that linux-firmware CI can contact.

Comment 3 louisgtwo 2025-12-11 18:27:26 UTC

I think I'm hitting this bug on Fedora 43. With amd-ucode-firmware-20251125 and amd-gpu-firmware-20251125, GNOME was crashing multiple times. I tried KDE and Xfce with the same result. When I downgraded to amd-ucode-firmware-20251021 and amd-gpu-firmware-20251021 (the firmware that was released with Fedora 43) and rebuilt the initramfs, the system was stable. This is the second time AMD firmware has bitten me.

Comment 4 Peter Robinson 2025-12-11 18:45:36 UTC

I don't think amd-ucode-firmware has anything to do with the crashing; I suspect if you upgrade amd-ucode-firmware and leave the GPU FW downgraded you'll be fine.

But also this bug is about issues with ROCm, so I'm not sure if your issue is directly related.

Comment 5 Tim Flink 2025-12-11 18:49:58 UTC

2 of the linked issues are ROCm-specific and are why I started digging into the negative karma, but https://gitlab.freedesktop.org/drm/amd/-/issues/4737 is more general, AFAIK. It details crashes and system freezes during normal graphical usage.

Comment 6 Mario Limonciello 2025-12-11 19:06:39 UTC

> I don't think amd-ucode-firmware has anything to do with the crashing; I suspect if you upgrade amd-ucode-firmware and leave the GPU FW downgraded you'll be fine.

I agree.

> But also this bug is about issues with ROCm, so I'm not sure if your issue is directly related.
> 2 of the linked issues are rocm specific and are why I started digging into the negative karma but https://gitlab.freedesktop.org/drm/amd/-/issues/4737 is more general, AFAIK. It details crashes and system freezes during normal graphical usage.

There's definitely a real issue.  ROCm probably just tickled it more easily.

Comment 7 luke 2025-12-26 00:09:40 UTC

Hello everyone, 
Merry Christmas!

Is there any way we could push out the reverted version from Mario? I am currently stuck with old packages thanks to an atomic system (silverblue).

I am new to the processes here - if there is anything I can do to help out / expedite, just leave me some info.

Comment 8 Peter Robinson 2025-12-26 05:13:41 UTC

> Is there any way we could push out the reverted version from Mario? I am
> currently stuck with old packages thanks to an atomic system (silverblue).

You should be able to do a dnf downgrade to drop back to working firmware. The FW will be updated when things are coordinated; it is holiday season, so things take a little longer at times.

Comment 9 Donato Capitella 2026-01-07 12:12:06 UTC

Happy New Year! Just checking in to see if there's an update on the expected timeline. I am part of a community of Strix Halo users and have written many tutorials based on Fedora, and right now every user who's installing an updated version of Fedora has a broken ROCm implementation. I have been advising users to downgrade the Linux firmware, but as you can imagine this creates a lot of confusion.

Comment 10 Peter Robinson 2026-01-07 12:14:53 UTC

Aiming for Friday

Comment 11 Sid 2026-01-11 02:09:01 UTC

@Tim Flink (and AMD team) -- A new release came thru today but failed

```
sid@vega:~$ ./run-rocm-smoketest.sh 
ggml_cuda_init: found 1 ROCm devices:
  Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
| model                          |       size |     params | backend    | ngl | fa | mmap |            test |                  t/s |
| ------------------------------ | ---------: | ---------: | ---------- | --: | -: | ---: | --------------: | -------------------: |
Memory access fault by GPU node-1 (Agent handle: 0x2ccbbe20) on address 0x7f45a612a000. Reason: Page not present or supervisor privilege.
sid@vega:~$ rpm-ostree status
State: idle
Deployments:
● fedora:fedora/43/x86_64/silverblue
                  Version: 43.20260110.0 (2026-01-10T00:28:27Z)
                   Commit: 3a0477ec79f1edb269ba6ed2844d86777b2d5a0d70624be1efa4c5530c9161c6
             GPGSignature: Valid signature by C6E7F081CF80E13146676E88829B606631645531

  fedora:fedora/43/x86_64/silverblue
                  Version: 43.1.6 (2025-10-23T03:11:18Z)
                   Commit: 4d40d281be93a88f3d559b5756df602f454f932f3c809a6a4250b91049ce40e8
             GPGSignature: Valid signature by C6E7F081CF80E13146676E88829B606631645531

  fedora:fedora/43/x86_64/silverblue
                  Version: 43.1.6 (2025-10-23T03:11:18Z)
                   Commit: 4d40d281be93a88f3d559b5756df602f454f932f3c809a6a4250b91049ce40e8
             GPGSignature: Valid signature by C6E7F081CF80E13146676E88829B606631645531
                   Pinned: yes
sid@vega:~$
```

Rolling back makes it work again:
```
sid@vega:~$ rpm-ostree status
State: idle
Deployments:
● fedora:fedora/43/x86_64/silverblue
                  Version: 43.1.6 (2025-10-23T03:11:18Z)
                   Commit: 4d40d281be93a88f3d559b5756df602f454f932f3c809a6a4250b91049ce40e8
             GPGSignature: Valid signature by C6E7F081CF80E13146676E88829B606631645531

  fedora:fedora/43/x86_64/silverblue
                  Version: 43.20260110.0 (2026-01-10T00:28:27Z)
                   Commit: 3a0477ec79f1edb269ba6ed2844d86777b2d5a0d70624be1efa4c5530c9161c6
             GPGSignature: Valid signature by C6E7F081CF80E13146676E88829B606631645531

  fedora:fedora/43/x86_64/silverblue
                  Version: 43.1.6 (2025-10-23T03:11:18Z)
                   Commit: 4d40d281be93a88f3d559b5756df602f454f932f3c809a6a4250b91049ce40e8
             GPGSignature: Valid signature by C6E7F081CF80E13146676E88829B606631645531
                   Pinned: yes
sid@vega:~$ ./run-rocm-smoketest.sh 
ggml_cuda_init: found 1 ROCm devices:
  Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
| model                          |       size |     params | backend    | ngl | fa | mmap |            test |                  t/s |
| ------------------------------ | ---------: | ---------: | ---------- | --: | -: | ---: | --------------: | -------------------: |
| gemma3 4B Q4_K - Medium        |   2.31 GiB |     3.88 B | ROCm       |  99 |  1 |    0 |           pp512 |      2541.22 ± 29.97 |
| gemma3 4B Q4_K - Medium        |   2.31 GiB |     3.88 B | ROCm       |  99 |  1 |    0 |           tg128 |         68.67 ± 0.10 |

build: 9e41884dc (7687)
sid@vega:~$
```
What the quick smoke test runs:
```
sid@vega:~$ cat ./run-rocm-smoketest.sh 
#!/usr/bin/env bash
set -uo pipefail

toolbox run -c llama-rocm-7.1.1 -- /usr/local/bin/llama-bench  -fa 1 -ngl 99 -mmp 0 -m /mnt/data/models/hub/models--ggml-org--gemma-3-4b-it-GGUF/snapshots/d0976223747697cb51e056d85c532013931fe52e/gemma-3-4b-it-Q4_K_M.gguf
```

Comment 12 Fedora Update System 2026-01-11 04:14:13 UTC

FEDORA-2026-1d240112ff (linux-firmware-20260110-1.fc42) has been submitted as an update to Fedora 42.
https://bodhi.fedoraproject.org/updates/FEDORA-2026-1d240112ff

Comment 13 Fedora Update System 2026-01-11 04:14:26 UTC

FEDORA-2026-2cebf295af (linux-firmware-20260110-1.fc43) has been submitted as an update to Fedora 43.
https://bodhi.fedoraproject.org/updates/FEDORA-2026-2cebf295af

Comment 14 Mario Limonciello 2026-01-11 18:17:58 UTC

> @Tim Flink (and AMD team) -- A new release came thru today but failed

New release of what?  A Silverblue snapshot?  I need to know what details are in this "release".
* MES F/W version
* rocr-runtime version (7.1.1-XXX)  What's the XXX?  It needs to be -2 or newer to pick up the GFX1151 patch IIUC.

Comment 15 Peter Robinson 2026-01-12 00:27:58 UTC

(In reply to Mario Limonciello from comment #14)
> > @Tim Flink (and AMD team) -- A new release came thru today but failed
> 
> New release of what?  A Silverblue snapshot?  I need to know what details

I suspect they mean the new upstream linux-firmware.

Comment 16 Peter Robinson 2026-01-12 00:34:00 UTC

It's in updates-testing so will be a stable update later this week

Comment 17 Fedora Update System 2026-01-12 01:34:11 UTC

FEDORA-2026-2cebf295af has been pushed to the Fedora 43 testing repository.
Soon you'll be able to install the update with the following command:
`sudo dnf upgrade --enablerepo=updates-testing --refresh --advisory=FEDORA-2026-2cebf295af`
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2026-2cebf295af

See also https://fedoraproject.org/wiki/QA:Updates_Testing for more information on how to test updates.

Comment 18 Fedora Update System 2026-01-12 01:55:46 UTC

FEDORA-2026-1d240112ff has been pushed to the Fedora 42 testing repository.
Soon you'll be able to install the update with the following command:
`sudo dnf upgrade --enablerepo=updates-testing --refresh --advisory=FEDORA-2026-1d240112ff`
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2026-1d240112ff

See also https://fedoraproject.org/wiki/QA:Updates_Testing for more information on how to test updates.

Comment 19 Peter Robinson 2026-01-12 10:33:47 UTC

Tim: can you confirm this looks good to you?

Comment 20 Sid 2026-01-12 17:40:06 UTC

@mario - I meant the new Silverblue upgrade/snapshot that was pushed out; it was 43.20260110.0 when I originally messaged. Since then 43.20260112.0 has been pushed out, which is also broken. I'll list the underlying components, but as a Silverblue user I'm treating it as "one release". From what I'm reading, this could be an issue in the kernel (6.18.4-200.fc43.x86_64 vs 6.17.1-300.fc43.x86_64). Details below.

Side note: we actually moved this lab machine from Workstation to Silverblue for "greater stability" (which is sort of true? we can roll back/upgrade trivially). I'll do my best to gather more helpful info, but that AMD Strix Halo box is the only one right now, so it's intrusive to stop everything -> upgrade -> test -> roll back and resume actual workloads.
-------------------------------------------------------------
Working:
Fedora 43 Silverblue snapshot: 43.1.6
sid@vega:~$ rpm-ostree status
State: idle
Deployments:
● fedora:fedora/43/x86_64/silverblue
                  Version: 43.1.6 (2025-10-23T03:11:18Z)
                   Commit: 4d40d281be93a88f3d559b5756df602f454f932f3c809a6a4250b91049ce40e8
             GPGSignature: Valid signature by C6E7F081CF80E13146676E88829B606631645531

sid@vega:~$ rpm -q linux-firmware
Kernel: Linux 6.17.1-300.fc43.x86_64
Firmware: linux-firmware-20251021-1.fc43.noarch
-------------------------------------------------------------
Broken:
Fedora 43 Silverblue snapshot: 43.20260112.0 
sid@vega:~$ rpm-ostree status
State: idle
Deployments:
● fedora:fedora/43/x86_64/silverblue
                  Version: 43.20260112.0 (2026-01-12T00:27:07Z)
                   Commit: 15edf9df6181db7fe6d70cd704d4dfda85edaf64f95317698386495b6f00e99a
             GPGSignature: Valid signature by C6E7F081CF80E13146676E88829B606631645531

Kernel: Linux 6.18.4-200.fc43.x86_64
Firmware: linux-firmware-20251125-1.fc43.noarch

Failure Rate: 100% (10/10; immediately)
sid@vega:~$ toolbox enter llama-rocm-7.1.1
⬢ [sid@toolbx ~]$ llama-bench  -fa 1 -ngl 99 -mmp 0 -m /mnt/data/projects/ai/models/hub/models--ggml-org--gemma-3-4b-it-GGUF/snapshots/d0976223747697cb51e056d85c532013931fe52e/gemma-3-4b-it-Q4_K_M.gguf
ggml_cuda_init: found 1 ROCm devices:
  Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
| model                          |       size |     params | backend    | ngl | fa | mmap |            test |                  t/s |
| ------------------------------ | ---------: | ---------: | ---------- | --: | -: | ---: | --------------: | -------------------: |
Segmentation fault         (core dumped) llama-bench -fa 1 -ngl 99 -mmp 0 -m /mnt/data/projects/ai/models/hub/models--ggml-org--gemma-3-4b-it-GGUF/snapshots/d0976223747697cb51e056d85c532013931fe52e/gemma-3-4b-it-Q4_K_M.gguf
⬢ [sid@toolbx ~]$ llama-bench  -fa 1 -ngl 99 -mmp 0 -m /mnt/data/projects/ai/models/hub/models--ggml-org--gemma-3-4b-it-GGUF/snapshots/d0976223747697cb51e056d85c532013931fe52e/gemma-3-4b-it-Q4_K_M.gguf
ggml_cuda_init: found 1 ROCm devices:
  Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
| model                          |       size |     params | backend    | ngl | fa | mmap |            test |                  t/s |
| ------------------------------ | ---------: | ---------: | ---------- | --: | -: | ---: | --------------: | -------------------: |
Segmentation fault         (core dumped) llama-bench -fa 1 -ngl 99 -mmp 0 -m /mnt/data/projects/ai/models/hub/models--ggml-org--gemma-3-4b-it-GGUF/snapshots/d0976223747697cb51e056d85c532013931fe52e/gemma-3-4b-it-Q4_K_M.gguf
⬢ [sid@toolbx ~]$
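
A failure rate like the 10/10 above could be gathered with a small loop. This is only a sketch: `failure_rate` is a hypothetical helper, not part of any tool mentioned in this bug, and it assumes the command under test exits non-zero on failure (as a segfaulting `llama-bench` would).

```shell
#!/usr/bin/env bash
# Hypothetical helper: run a command N times and print "failed/total".
# Assumes the command signals failure through a non-zero exit status.
failure_rate() {
    local runs="$1" failed=0 i
    shift
    for (( i = 0; i < runs; i++ )); do
        # Discard output; only the exit status matters here.
        "$@" >/dev/null 2>&1 || failed=$(( failed + 1 ))
    done
    echo "${failed}/${runs}"
}
```

For example, `failure_rate 10 ./run-rocm-smoketest.sh` would repeat the smoke test from comment 11 ten times and print the count of failed runs.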

Comment 21 Mario Limonciello 2026-01-12 18:48:18 UTC

OK, that confirms you don't have the updated linux-firmware with the fix in the broken image.  Once it migrates out of testing and you get a new snapshot you /should/ be good to go.

Comment 22 Sid 2026-01-12 19:16:17 UTC

Thanks Mario. Any indicators of that 'stable build'? Like a version # (of immutable silverblue snapshot or linux-firmware) or an expected date?

Could I also recommend a gatekeeper test around llama-bench in your CI? It's quick yet stressful, and relatively isolated for end-to-end tests since llama-cpp is a single-folder app. kyuz0's `amd-strix-halo-toolboxes` make runtime switching trivial. And even a smaller model would work (e.g. gemma3 4b Q4_K_M is 2.4GB).

Comment 23 Tim Flink 2026-01-12 19:17:00 UTC

(In reply to Peter Robinson from comment #19)
> Tim: can you confirm this looks good to you?

The changelog has all the relevant changes listed so everything should be good.

I don't have access to any of the relevant HW myself but I'm trying to find someone who can at least run some basic tests to confirm that the quickly-testable issues have disappeared.

Comment 24 Tim Flink 2026-01-12 19:21:01 UTC

(In reply to Sid from comment #22)
> Thanks Mario. Any indicators of that 'stable build'? Like a version # (of
> immutable silverblue snapshot or linux-firmware) or an expected date?
> 
> Could I also recommend a gatekeeper test around llama-bench on your CI? It's
> quick, yet stressful, relatively isolated for end to end tests as llama-cpp
> is a single folder app. kyuz0's `amd-strix-halo-toolboxes` make it very
> trivial for runtime switching. And even smaller model would work (e.g.
> gemma3 4b Q4_K_M is 2.4GB).

I'm not terribly familiar with silverblue but I believe that it's built on the bits that have gone stable in the relevant Fedora release repos. I doubt that you'll see a change in silverblue until this linux-firmware update has gone stable but once it does get pushed stable, I imagine that the next silverblue build/update after that will have the firmware changes in it.

As far as an indication, the "rpm-ostree status" command shows the linux-firmware build used. Once that says "linux-firmware-20260110-1.fc43.noarch", the firmware fixes should be present.
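
A minimal sketch of such a check, assuming the installed build string is obtained with `rpm -q linux-firmware` (the command used in comment 20) and that the date stamp in the version can be compared numerically. The function name `fw_has_fix` is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeeds if a linux-firmware package string carries a
# date stamp at or after the build named in this bug (20260110 by default).
fw_has_fix() {
    local nvr="$1" min="${2:-20260110}" stamp
    # Drop the "linux-firmware-" prefix, then everything after the date stamp.
    stamp="${nvr#linux-firmware-}"
    stamp="${stamp%%-*}"
    # Return 2 for anything that is not an 8-digit date stamp.
    [[ "$stamp" =~ ^[0-9]{8}$ ]] || return 2
    # Force base 10 so a leading zero is never read as octal.
    (( 10#$stamp >= 10#$min ))
}
```

On a live system this could be called as `fw_has_fix "$(rpm -q linux-firmware)" && echo "firmware fixes should be present"`.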

Comment 25 Peter Robinson 2026-01-13 00:46:35 UTC

That is correct, silverblue is behind the rest of Fedora. It goes updates-testing -> updates -> silverblue. That last bit is at some point in the future because they have their own testing cycles. We are currently at updates-testing; I will push it to updates later in the week once I am happy the firmware update as a whole has had wide enough testing. Go and ask the silverblue people what happens from there because it's out of scope for this bug.

Comment 26 Peter Robinson 2026-01-13 00:49:47 UTC

(In reply to Sid from comment #20)
> @mario - I meant the new silverblue upgrade/snapshot pushed out, it was
> 43.20260110.0 when originally messaged. Since then 43.20260112.0 has been

For future reference: as a silverblue user, if the bug you want fixed is in any state other than CLOSED -> ERRATA, you won't have the fix. The fact this is currently ON_QA means it's not even in Fedora stable updates yet. Silverblue always trails.

Comment 27 Peter Robinson 2026-01-13 02:37:15 UTC

So I've clarified that atomic desktops will get the update when it goes stable with the rest of Fedora. CoreOS releases every two weeks, so it will get the update on its next release after the update goes stable.

Comment 28 Arthur Sore 2026-01-13 08:27:52 UTC

Thanks Peter, 20260110-1 is working on my workflows, on 44 rawhide with rocm 7.11.0.

Sid: I've also run llama-bench with your parameters without issue.
