Bug 2362696 - kernel: amdgpu 0000:64:00.0: [drm] *ERROR* dc_dmub_srv_log_diagnostic_data: DMCUB error - collecting diagnostic data
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Fedora
Classification: Fedora
Component: linux-firmware
Version: 41
Hardware: Unspecified
OS: Linux
Priority: unspecified
Severity: urgent
Target Milestone: ---
Assignee: David Woodhouse
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2025-04-28 13:23 UTC by Michal Nowak
Modified: 2025-07-03 19:48 UTC

Fixed In Version: linux-firmware-20250627-1.fc42 linux-firmware-20250627-1.fc41
Clone Of:
Environment:
Last Closed: 2025-06-30 02:21:44 UTC
Type: ---
Embargoed:


Attachments:
dnf history info 115 from Apr 20 (26.63 KB, text/plain)
2025-04-28 15:22 UTC, Michal Nowak

Description Michal Nowak 2025-04-28 13:23:17 UTC
After linux-firmware was updated to 20250410, my Lenovo P14s (a 2024 model) with Radeon 780M Graphics hangs when I attach or detach the USB-C charger:

Apr 27 18:17:36 fedora kernel: amdgpu 0000:64:00.0: [drm] *ERROR* dc_dmub_srv_log_diagnostic_data: DMCUB error - collecting diagnostic data
Apr 27 18:17:36 fedora kernel: amdgpu 0000:64:00.0: [drm] *ERROR* dc_dmub_srv_log_diagnostic_data: DMCUB error - collecting diagnostic data
Apr 27 18:17:37 fedora kernel: amdgpu 0000:64:00.0: [drm] *ERROR* dc_dmub_srv_log_diagnostic_data: DMCUB error - collecting diagnostic data
Apr 27 18:17:37 fedora kernel: amdgpu 0000:64:00.0: [drm] REG_WAIT timeout 1us * 100000 tries - mpc2_assert_idle_mpcc line:481
Apr 27 18:17:47 fedora kernel: amdgpu 0000:64:00.0: [drm] *ERROR* [CRTC:79:crtc-0] flip_done timed out
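
For anyone checking whether they hit the same failure, here is a minimal sketch for pulling these messages out of a kernel log. The helper name `amdgpu_display_errors` is hypothetical, and the pattern list only covers the messages quoted in this report, so it may miss other variants:

```shell
# Hypothetical helper: keep only the amdgpu display errors quoted in this report.
# Reads log lines on stdin and prints the matching ones.
amdgpu_display_errors() {
    # "|| true" keeps the exit status at 0 when nothing matches.
    grep -E 'amdgpu .*(DMCUB error|REG_WAIT timeout|flip_done timed out|commit wait timed out)' || true
}

# Example (assumes a persistent systemd journal; -b -1 selects the previous boot):
#   journalctl -k -b -1 --no-pager | amdgpu_display_errors
```

Reading from stdin keeps the filter usable on saved log files as well as on live journal output.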

Here's the upstream error: https://gitlab.freedesktop.org/drm/amd/-/issues/3913

And a fix: https://gitlab.com/kernel-firmware/linux-firmware/-/merge_requests/420

SUSE deployed the fix in February: https://bugzilla.suse.com/show_bug.cgi?id=1236196

I just downgraded amd-gpu-firmware to 20241017 and am testing the workaround.

Reproducible: Sometimes

Steps to Reproduce:
Run Fedora 41 GNOME on a laptop for a while, suspend/resume, attach/detach the charging cable.



Additional Information:
After a second crash yesterday, I lost all open Firefox tabs and windows, even with the "Open previous windows and tabs" Firefox option enabled.

Comment 1 Peter Robinson 2025-04-28 13:34:10 UTC
> And a fix:
> https://gitlab.com/kernel-firmware/linux-firmware/-/merge_requests/420
> 
> SUSE deployed the fix in February:
> https://bugzilla.suse.com/show_bug.cgi?id=1236196

That fix has been in Fedora since the 20250211 release, which includes the above MR.

> I just downgraded amd-gpu-firmware to 20241017 and am testing the workaround.

We have the upstream fix. There have been other updates since:

$ git log --format=oneline  amdgpu/dcn_3_1_4_dmcub.bin
152e5e12df704b78d1fda9e29d9c893d76db615d amdgpu: update dcn 3.1.4 firmware to 8.0.78.0
c2c0e64a1b022724dc3b1b10bba9a4ab1b60587d amdgpu: DMCUB updates for various ASICs
61d257d5a8b3303a0159ade514138d98a154248b amdgpu: DMCUB updates for various ASICs
0e16f416fa296f66c83187c2bfa2984ef0be47a0 amdgpu: revert DMCUB 3.1.4 firmware

$ git tag --contains 0e16f416fa296f66c83187c2bfa2984ef0be47a0
20250211
20250311
20250410
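
A similar check can be approximated on an installed system, since the Fedora package versions in this report use the same date strings as the upstream tags. This is a minimal sketch under that assumption; `fw_at_least` is a hypothetical helper name, not part of any tool:

```shell
# Hypothetical helper: succeed if a date-style firmware version (YYYYMMDD)
# is the same as or newer than a given release tag.
fw_at_least() {
    # Drop any "-release.dist" suffix, e.g. 20250410-1.fc41 -> 20250410.
    installed=${1%%-*}
    [ "$installed" -ge "$2" ]
}

# Example (assumes the amd-gpu-firmware package is installed):
#   fw_at_least "$(rpm -q --qf '%{VERSION}' amd-gpu-firmware)" 20250211 \
#       && echo "at or after 20250211"
```

The numeric comparison only works because the versions are plain dates; it says nothing about what a given release actually contains.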

Are you sure it's that upstream problem and not another one?

Comment 2 Michal Nowak 2025-04-28 15:14:28 UTC
You must be right: I wasn't on linux-firmware 20241017 when it worked, but on 20250311:

Upgrade  amd-gpu-firmware-0:20250410-1.fc41.noarch              Group         updates
Replaced amd-gpu-firmware-0:20250311-1.fc41.noarch              Group         @System
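
For reference, a minimal sketch of how lines like these can be picked out of a longer transaction listing. The helper name `firmware_changes` is hypothetical, and the pattern assumes the "Upgrade"/"Replaced" line layout shown above:

```shell
# Hypothetical helper: from "dnf history info" output on stdin, keep only the
# Upgrade/Replaced lines whose package name contains "firmware".
firmware_changes() {
    grep -E '^[[:space:]]*(Upgrade|Replaced)[[:space:]]+[^[:space:]]*firmware' || true
}

# Example (115 is the transaction ID from the attachment):
#   dnf history info 115 | firmware_changes
```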

This is when it happened the first time (Linux version 6.13.9-200.fc41.x86_64):

Apr 20 15:46:49 fedora kernel: amdgpu 0000:64:00.0: [drm] *ERROR* dc_dmub_srv_log_diagnostic_data: DMCUB error - collecting diagnostic data
Apr 20 15:46:49 fedora kernel: amdgpu 0000:64:00.0: [drm] *ERROR* dc_dmub_srv_log_diagnostic_data: DMCUB error - collecting diagnostic data
Apr 20 15:46:50 fedora kernel: amdgpu 0000:64:00.0: [drm] *ERROR* dc_dmub_srv_log_diagnostic_data: DMCUB error - collecting diagnostic data
Apr 20 15:46:50 fedora kernel: amdgpu 0000:64:00.0: [drm] REG_WAIT timeout 1us * 100000 tries - mpc2_assert_idle_mpcc line:481
Apr 20 15:47:00 fedora kernel: amdgpu 0000:64:00.0: [drm] *ERROR* [CRTC:79:crtc-0] flip_done timed out

The screen was light grey. The Caps Lock LED could still be turned on and off, but I could not interact with the system by any other means. I closed the laptop lid.

Apr 20 15:48:05 fedora kernel: amdgpu 0000:64:00.0: [drm] *ERROR* flip_done timed out
Apr 20 15:48:05 fedora kernel: amdgpu 0000:64:00.0: [drm] *ERROR* [CRTC:79:crtc-0] commit wait timed out
Apr 20 15:48:05 fedora kernel: amdgpu 0000:64:00.0: [drm] *ERROR* flip_done timed out
Apr 20 15:48:05 fedora kernel: amdgpu 0000:64:00.0: [drm] *ERROR* [PLANE:58:plane-3] commit wait timed out
Apr 20 15:48:05 fedora kernel: Freezing user space processes failed after 20.006 seconds (1 tasks refusing to freeze, wq_busy=0):
Apr 20 15:48:05 fedora kernel: task:KMS thread      state:D stack:0     pid:7359  tgid:7344  ppid:2744   flags:0x00004006
Apr 20 15:48:05 fedora kernel: Call Trace:
Apr 20 15:48:05 fedora kernel:  <TASK>
Apr 20 15:48:05 fedora kernel:  __schedule+0x2ad/0x5f0
Apr 20 15:48:05 fedora kernel:  schedule+0x27/0xa0
Apr 20 15:48:05 fedora kernel:  schedule_timeout+0x84/0x100
Apr 20 15:48:05 fedora kernel:  ? __pfx_process_timeout+0x10/0x10
Apr 20 15:48:05 fedora kernel:  __wait_for_common+0x8e/0x1c0
Apr 20 15:48:05 fedora kernel:  ? __pfx_schedule_timeout+0x10/0x10
Apr 20 15:48:05 fedora kernel:  drm_crtc_commit_wait+0x36/0x50
Apr 20 15:48:05 fedora kernel:  drm_atomic_helper_wait_for_dependencies+0xd2/0x100
Apr 20 15:48:05 fedora kernel:  commit_tail+0x3e/0x160
Apr 20 15:48:05 fedora kernel:  drm_atomic_helper_commit+0x11a/0x140
Apr 20 15:48:05 fedora kernel:  drm_atomic_commit+0xaf/0xe0
Apr 20 15:48:05 fedora kernel:  ? __pfx___drm_printfn_info+0x10/0x10
Apr 20 15:48:05 fedora kernel:  drm_mode_atomic_ioctl+0x70b/0x7c0
Apr 20 15:48:05 fedora kernel:  ? __pfx_drm_mode_atomic_ioctl+0x10/0x10
Apr 20 15:48:05 fedora kernel:  drm_ioctl_kernel+0xad/0x100
Apr 20 15:48:05 fedora kernel:  drm_ioctl+0x288/0x530
Apr 20 15:48:05 fedora kernel:  ? __pfx_drm_mode_atomic_ioctl+0x10/0x10
Apr 20 15:48:05 fedora kernel:  amdgpu_drm_ioctl+0x4b/0x80 [amdgpu]
Apr 20 15:48:05 fedora kernel:  __x64_sys_ioctl+0x94/0xc0
Apr 20 15:48:05 fedora kernel:  do_syscall_64+0x82/0x160
Apr 20 15:48:05 fedora kernel:  ? srso_alias_return_thunk+0x5/0xfbef5
Apr 20 15:48:05 fedora kernel:  ? rseq_get_rseq_cs+0x1d/0x220
Apr 20 15:48:05 fedora kernel:  ? srso_alias_return_thunk+0x5/0xfbef5
Apr 20 15:48:05 fedora kernel:  ? rseq_ip_fixup+0x8d/0x1d0
Apr 20 15:48:05 fedora kernel:  ? srso_alias_return_thunk+0x5/0xfbef5
Apr 20 15:48:05 fedora kernel:  ? __x64_sys_ppoll+0xf4/0x160
Apr 20 15:48:05 fedora kernel:  ? srso_alias_return_thunk+0x5/0xfbef5
Apr 20 15:48:05 fedora kernel:  ? eventfd_read+0xdf/0x230
Apr 20 15:48:05 fedora kernel:  ? srso_alias_return_thunk+0x5/0xfbef5
Apr 20 15:48:05 fedora kernel:  ? vfs_read+0x299/0x370
Apr 20 15:48:05 fedora kernel:  ? srso_alias_return_thunk+0x5/0xfbef5
Apr 20 15:48:05 fedora kernel:  ? srso_alias_return_thunk+0x5/0xfbef5
Apr 20 15:48:05 fedora kernel:  ? syscall_exit_to_user_mode+0x10/0x210
Apr 20 15:48:05 fedora kernel:  ? srso_alias_return_thunk+0x5/0xfbef5
Apr 20 15:48:05 fedora kernel:  ? do_syscall_64+0x8e/0x160
Apr 20 15:48:05 fedora kernel:  ? srso_alias_return_thunk+0x5/0xfbef5
Apr 20 15:48:05 fedora kernel:  ? do_syscall_64+0x8e/0x160
Apr 20 15:48:05 fedora kernel:  ? srso_alias_return_thunk+0x5/0xfbef5
Apr 20 15:48:05 fedora kernel:  ? syscall_exit_to_user_mode+0x10/0x210
Apr 20 15:48:05 fedora kernel:  ? srso_alias_return_thunk+0x5/0xfbef5
Apr 20 15:48:05 fedora kernel:  ? do_syscall_64+0x8e/0x160
Apr 20 15:48:05 fedora kernel:  ? srso_alias_return_thunk+0x5/0xfbef5
Apr 20 15:48:05 fedora kernel:  ? __irq_exit_rcu+0x4c/0xe0
Apr 20 15:48:05 fedora kernel:  entry_SYSCALL_64_after_hwframe+0x76/0x7e
Apr 20 15:48:05 fedora kernel: RIP: 0033:0x7f4c4d2fd4ad
Apr 20 15:48:05 fedora kernel: RSP: 002b:00007f4c31f55bf0 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
Apr 20 15:48:05 fedora kernel: RAX: ffffffffffffffda RBX: 00007f4c1000d8d0 RCX: 00007f4c4d2fd4ad
Apr 20 15:48:05 fedora kernel: RDX: 00007f4c31f55c90 RSI: 00000000c03864bc RDI: 000000000000000c
Apr 20 15:48:05 fedora kernel: RBP: 00007f4c31f55c40 R08: 0000000000000130 R09: 0000000000000001
Apr 20 15:48:05 fedora kernel: R10: 0000000000000015 R11: 0000000000000246 R12: 00007f4c31f55c90
Apr 20 15:48:05 fedora kernel: R13: 00000000c03864bc R14: 000000000000000c R15: 00007f4c1003a960
...
Apr 20 15:48:05 fedora kernel: PM: suspend exit
Apr 20 15:48:05 fedora kernel: PM: suspend entry (s2idle)
Apr 20 15:48:05 fedora bluetoothd[1570]: Controller resume with wake event 0x0
Apr 20 15:48:06 fedora kernel: Filesystems sync: 0.011 seconds

But the laptop would not wake up, so I held the power button for a while to reset it.

I found two similar reports. Mine could be a duplicate of #2360956.

https://bugzilla.redhat.com/show_bug.cgi?id=2360956
https://bugzilla.redhat.com/show_bug.cgi?id=2312366

Btw, I just upgraded to Fedora 42.

Comment 3 Michal Nowak 2025-04-28 15:22:14 UTC
Created attachment 2087627
dnf history info 115 from Apr 20

Comment 4 Michal Nowak 2025-05-01 16:31:45 UTC
Now it happened on Fedora 42. I downgraded amd-gpu-firmware to 20250311-1.fc42 and versionlocked it in dnf.
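
A minimal sketch of the downgrade-and-lock step, for anyone who wants to try the same thing. `pin_firmware` is a hypothetical helper name; the sketch assumes the older build is still available in an enabled repository and that the `versionlock` command is available (on DNF 4 it comes from a plugin):

```shell
# Hypothetical helper: downgrade a package to a given version-release and lock it.
# Nothing runs until the function is called.
pin_firmware() {
    pkg=$1
    evr=$2
    # DNF can be overridden (e.g. DNF=echo) to preview the commands first.
    ${DNF:-sudo dnf} downgrade "$pkg-$evr" &&
        ${DNF:-sudo dnf} versionlock add "$pkg"
}

# Preview, then apply:
#   DNF=echo pin_firmware amd-gpu-firmware 20250311-1.fc42
#   pin_firmware amd-gpu-firmware 20250311-1.fc42
```

Remember to remove the lock later (`dnf versionlock delete amd-gpu-firmware`), otherwise fixed firmware builds will be held back too.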

Comment 5 Michal Nowak 2025-05-02 20:20:32 UTC
I downgraded all linux-firmware-related packages to 20250311 but it did not help.

Comment 6 Michal Nowak 2025-05-02 20:24:54 UTC
Unsure if it's just a coincidence, but loupe - the new GNOME image viewer - always shows up in the log shortly before the freeze:

    loupe[10758]: vkAcquireNextImageKHR(): A swapchain no longer matches the surface properties exactly, but can still be used to present to the surface successfully. (VK_SUBOPTIMAL_KHR) (1000001003)

Or:

May 02 21:05:39 fedora systemd[2659]: Started dbus-:1.2-org.gnome.evince.Daemon.
May 02 21:05:46 fedora systemd[2659]: Started dbus-:1.2-org.gnome.Loupe.
May 02 21:05:48 fedora systemd[2659]: dbus-:1.2-org.gnome.Loupe: Consumed 1.177s CPU time, 251.7M memory peak.
May 02 21:05:49 fedora systemd[2659]: Started dbus-:1.2-org.gnome.Loupe.
May 02 21:06:30 fedora systemd[2659]: dbus-:1.2-org.gnome.Loupe: Consumed 3.854s CPU time, 329.1M memory peak.
May 02 21:06:33 fedora systemd[2659]: Started dbus-:1.2-org.gnome.Loupe.
May 02 21:06:37 fedora systemd[2659]: dbus-:1.2-org.gnome.Loupe: Consumed 2.666s CPU time, 397.5M memory peak.
May 02 21:06:40 fedora systemd[2659]: Started dbus-:1.2-org.gnome.Loupe.
May 02 21:06:46 fedora systemd[2659]: dbus-:1.2-org.gnome.Loupe: Consumed 2.785s CPU time, 401.4M memory peak.
May 02 21:06:49 fedora systemd[2659]: Started dbus-:1.2-org.gnome.Loupe.
May 02 21:06:54 fedora systemd[2659]: dbus-:1.2-org.gnome.Loupe: Consumed 2.625s CPU time, 407.3M memory peak.
May 02 21:06:55 fedora systemd[2659]: Started dbus-:1.2-org.gnome.Loupe.
May 02 21:07:09 fedora systemd[2659]: dbus-:1.2-org.gnome.Loupe: Consumed 3.102s CPU time, 397M memory peak.
May 02 21:07:14 fedora systemd[2659]: Started dbus-:1.2-org.gnome.Loupe.
May 02 21:07:25 fedora systemd[2659]: dbus-:1.2-org.gnome.Loupe: Consumed 2.601s CPU time, 326.1M memory peak.
May 02 21:07:34 fedora systemd[2659]: Started dbus-:1.2-org.gnome.Loupe.
May 02 21:07:36 fedora loupe[21117]: vkAcquireNextImageKHR(): A swapchain no longer matches the surface properties exactly, but can still be used to present to the surface successfully. (VK_SUBOPTIMAL_KHR) (1000001003)

Uninstalled loupe and back to eog.

Comment 7 Michal Nowak 2025-05-03 06:47:10 UTC
These two look similar to my issue:

https://gitlab.freedesktop.org/drm/amd/-/issues/2950
https://gitlab.freedesktop.org/drm/amd/-/issues/3926

Comment 8 Michal Nowak 2025-05-13 14:49:33 UTC
> Uninstalled loupe and back to eog.

Actually, I use gThumb now, but getting rid of loupe worked around the problem for me.

Comment 9 Masoud 2025-06-25 09:00:22 UTC
I have the same problem; only after downgrading the following two packages can I get past the LUKS passphrase screen:
```
amd-gpu-firmware
amd-ucode-firmware
```

You can find some more relevant info here:
https://discussion.fedoraproject.org/t/system-not-booting-after-installation-of-kernel-0-6-15-3-200-fc42-x86-64-with-the-amdgpu-drm-error/156338/25

Comment 10 Fedora Update System 2025-06-27 19:12:29 UTC
FEDORA-2025-7feed8b25a (kernel-6.15.4-200.fc42 and linux-firmware-20250627-1.fc42) has been submitted as an update to Fedora 42.
https://bodhi.fedoraproject.org/updates/FEDORA-2025-7feed8b25a

Comment 11 Fedora Update System 2025-06-27 19:12:33 UTC
FEDORA-2025-f6f8526a43 (kernel-6.15.4-100.fc41 and linux-firmware-20250627-1.fc41) has been submitted as an update to Fedora 41.
https://bodhi.fedoraproject.org/updates/FEDORA-2025-f6f8526a43

Comment 12 Fedora Update System 2025-06-28 02:08:48 UTC
FEDORA-2025-7feed8b25a has been pushed to the Fedora 42 testing repository.
Soon you'll be able to install the update with the following command:
`sudo dnf upgrade --enablerepo=updates-testing --refresh --advisory=FEDORA-2025-7feed8b25a`
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2025-7feed8b25a

See also https://fedoraproject.org/wiki/QA:Updates_Testing for more information on how to test updates.

Comment 13 Fedora Update System 2025-06-28 02:33:42 UTC
FEDORA-2025-f6f8526a43 has been pushed to the Fedora 41 testing repository.
Soon you'll be able to install the update with the following command:
`sudo dnf upgrade --enablerepo=updates-testing --refresh --advisory=FEDORA-2025-f6f8526a43`
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2025-f6f8526a43

See also https://fedoraproject.org/wiki/QA:Updates_Testing for more information on how to test updates.

Comment 14 Fedora Update System 2025-06-30 02:21:44 UTC
FEDORA-2025-7feed8b25a (kernel-6.15.4-200.fc42 and linux-firmware-20250627-1.fc42) has been pushed to the Fedora 42 stable repository.
If the problem still persists, please make a note of it in this bug report.

Comment 15 Fedora Update System 2025-06-30 02:46:06 UTC
FEDORA-2025-f6f8526a43 (kernel-6.15.4-100.fc41 and linux-firmware-20250627-1.fc41) has been pushed to the Fedora 41 stable repository.
If the problem still persists, please make a note of it in this bug report.

