Bug 2189186 - [abrt] commit_tail: WARNING: CPU: 5 PID: 9398 at drivers/gpu/drm/amd/amdgpu/../display/amdgpu_dm/amdgpu_dm.c:7902 amdgpu_dm_atomic_commit_tail+0x2c5e/0x2ce0 [amdgpu]
Summary: [abrt] commit_tail: WARNING: CPU: 5 PID: 9398 at drivers/gpu/drm/amd/amdgpu/....
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: Fedora
Classification: Fedora
Component: kernel
Version: 38
Hardware: x86_64
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Assignee: Kernel Maintainer List
QA Contact: Fedora Extras Quality Assurance
URL: https://retrace.fedoraproject.org/faf...
Whiteboard: abrt_hash:f1187b7c674d3c3fd637219368f...
Depends On:
Blocks:
Reported: 2023-04-24 11:20 UTC by Lars Geiger
Modified: 2023-07-13 12:34 UTC

Fixed In Version:
Doc Type: ---
Doc Text:
Clone Of:
Environment:
Last Closed: 2023-07-13 12:31:24 UTC
Type: ---
Embargoed:


Attachments
File: dmesg (116.05 KB, text/plain)
2023-04-24 11:20 UTC, Lars Geiger

Description Lars Geiger 2023-04-24 11:20:13 UTC
Description of problem:
I was browsing some web pages with Firefox in a GNOME Wayland session, nothing unusual.

Ever since the update to kernel 6.2 with Fedora 37 and now with Fedora 38, I have been experiencing this kind of freeze/oops.
Under Fedora 37, I stayed on kernel 6.1, which had been rock solid the months before.
The freezes seem to happen more frequently when I am running AnyDesk (as a client) from Flathub; however, that was not the case this time.

Additional info:
reporter:       libreport-2.17.9
WARNING: CPU: 5 PID: 9398 at drivers/gpu/drm/amd/amdgpu/../display/amdgpu_dm/amdgpu_dm.c:7902 amdgpu_dm_atomic_commit_tail+0x2c5e/0x2ce0 [amdgpu]
Modules linked in: uinput michael_mic rfcomm snd_seq_dummy snd_hrtimer nf_conntrack_netbios_ns nf_conntrack_broadcast nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet nf_reject_ipv4 nf_reject_ipv6 nft_reject nft_ct nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 ip_set nf_tables nfnetlink qrtr_mhi bnep sunrpc binfmt_misc vfat fat qrtr ath11k_pci snd_soc_dmic snd_acp6x_pdm_dma snd_soc_acp6x_mach snd_sof_amd_rembrandt ath11k snd_sof_amd_renoir snd_sof_amd_acp snd_sof_pci snd_sof_xtensa_dsp snd_ctl_led snd_sof snd_hda_codec_realtek qmi_helpers snd_sof_utils snd_hda_codec_generic snd_hda_codec_hdmi snd_soc_core mac80211 snd_hda_intel intel_rapl_msr snd_intel_dspcfg intel_rapl_common snd_intel_sdw_acpi snd_compress btusb uvcvideo snd_hda_codec ac97_bus edac_mce_amd snd_pcm_dmaengine btrtl snd_pci_ps snd_rpl_pci_acp6x btbcm videobuf2_vmalloc snd_hda_core videobuf2_memops btintel videobuf2_v4l2 snd_pci_acp6x kvm_amd snd_hwdep videobuf2_common btmtk
 libarc4 snd_seq snd_pci_acp5x kvm cfg80211 bluetooth videodev snd_seq_device thinkpad_acpi irqbypass snd_rn_pci_acp3x think_lmi snd_pcm mc rapl ledtrig_audio snd_acp_config platform_profile firmware_attributes_class snd_timer rfkill snd_soc_acpi wmi_bmof snd k10temp snd_pci_acp3x i2c_piix4 mhi soundcore acpi_cpufreq amd_pmc acpi_tad joydev loop zram amdgpu drm_ttm_helper ttm nvme iommu_v2 drm_buddy nvme_core gpu_sched drm_display_helper crct10dif_pclmul crc32_pclmul crc32c_intel video polyval_clmulni ucsi_acpi hid_multitouch polyval_generic ghash_clmulni_intel typec_ucsi sha512_ssse3 ccp r8169 sp5100_tco cec nvme_common typec wmi i2c_hid_acpi i2c_hid serio_raw ip6_tables ip_tables fuse ecryptfs
CPU: 5 PID: 9398 Comm: kworker/5:0 Not tainted 6.2.11-300.fc38.x86_64 #1
Hardware name: LENOVO 21CHCTO1WW/21CHCTO1WW, BIOS R23ET65W (1.35 ) 03/21/2023
Workqueue: events fbcon_register_existing_fbs
RIP: 0010:amdgpu_dm_atomic_commit_tail+0x2c5e/0x2ce0 [amdgpu]
Code: e8 47 4c f4 ed 4c 8b 9d 08 fd ff ff 48 83 c4 18 83 bd 10 fd ff ff 02 77 37 c7 85 10 fd ff ff 02 00 00 00 e9 f4 fd ff ff 0f 0b <0f> 0b e9 c4 f5 ff ff 0f 0b e9 53 f5 ff ff 0f 0b e9 d5 f5 ff ff 8b
RSP: 0018:ffffba8c8d52b870 EFLAGS: 00010002
RAX: 0000000000000286 RBX: 0000000000000286 RCX: 0000000000000000
RDX: 0000000000000001 RSI: 0000000000000297 RDI: 0000000000000000
RBP: ffffba8c8d52bbd0 R08: 0000000000000002 R09: 0000000000000001
R10: 0000000000000000 R11: ffff8d4b8c8ac118 R12: ffff8d4b8c8ac118
R13: 0000000000000000 R14: ffff8d4b8bc89200 R15: ffff8d4b8c8ac000
FS:  0000000000000000(0000) GS:ffff8d529ef40000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 000055de9ccaf274 CR3: 000000061b010000 CR4: 0000000000750ee0
PKRU: 55555554
Call Trace:
 <TASK>
 commit_tail+0x94/0x130
 drm_atomic_helper_commit+0x116/0x140
 drm_atomic_commit+0x96/0xc0
 ? __pfx___drm_printfn_info+0x10/0x10
 drm_client_modeset_commit_atomic+0x203/0x250
 drm_client_modeset_commit_locked+0x56/0x160
 drm_client_modeset_commit+0x21/0x40
 drm_fb_helper_set_par+0x9e/0x100
 fbcon_init+0x248/0x560
 visual_init+0xcc/0x120
 do_bind_con_driver.isra.0+0x19d/0x3c0
 do_take_over_console+0x148/0x180
 do_fbcon_takeover+0x5a/0xc0
 fbcon_register_existing_fbs+0x3f/0x70
 process_one_work+0x1c7/0x3d0
 worker_thread+0x4d/0x380
 ? _raw_spin_lock_irqsave+0x23/0x50
 ? __pfx_worker_thread+0x10/0x10
 kthread+0xe9/0x110
 ? __pfx_kthread+0x10/0x10
 ret_from_fork+0x2c/0x50
 </TASK>

Comment 1 Lars Geiger 2023-04-24 11:20:18 UTC
Created attachment 1959514
File: dmesg

Comment 2 Lars Geiger 2023-07-13 12:31:24 UTC
A post in Lenovo's forum pointed towards the DMCUB firmware needing an update: https://forums.lenovo.com/t5/Fedora/drm-amdgpu-job-timedout-amdgpu-ERROR-ring-sdma0-timeout/m-p/5227959?page=1#6018762

I pulled the latest file from git and installed it as described there. After I did that, dmesg reported "[drm] Loading DMUB firmware via PSP: version=0x0400003C" (before, it was "version=0x0400002E") and I have not experienced a freeze in the last three weeks.

On July 6, I undid these changes, and then let dnf update to amd-gpu-firmware-20230625-151.fc38.noarch (from amd-gpu-firmware-20230515-150.fc38.noarch before). Afterwards, the DM(C)UB version was still 0x0400003C and the system continues to be stable.

So, from my point of view, this bug can be closed (I would do it myself, but I am not sure what I should select as the reason for closing).
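
For anyone who wants to repeat the version check described above, here is a minimal Python sketch that pulls the DMUB firmware version out of dmesg text and compares it with the two values quoted in this comment. The function names and the constants KNOWN_BAD and KNOWN_GOOD are made up for illustration. The comparison also assumes that a numerically higher version value means newer firmware, which this report does not confirm, and nothing here says whether versions between 0x0400002E and 0x0400003C are affected.

```python
import re

# Matches the line quoted above, e.g.
# "[drm] Loading DMUB firmware via PSP: version=0x0400003C"
DMUB_RE = re.compile(r"Loading DMUB firmware via PSP: version=(0x[0-9A-Fa-f]+)")

# Values taken from this report; the names are hypothetical.
KNOWN_BAD = 0x0400002E   # reported before the firmware update (freezes)
KNOWN_GOOD = 0x0400003C  # reported after the firmware update (stable)


def dmub_versions(log_text):
    """Return every DMUB firmware version found in the text, as integers."""
    return [int(m.group(1), 16) for m in DMUB_RE.finditer(log_text)]


def at_least_known_good(log_text):
    """True if the last DMUB version in the log is >= KNOWN_GOOD.

    Assumption: a higher numeric value means newer firmware.
    Returns False when no DMUB line is present at all.
    """
    versions = dmub_versions(log_text)
    return bool(versions) and versions[-1] >= KNOWN_GOOD


sample = "[drm] Loading DMUB firmware via PSP: version=0x0400003C"
print(at_least_known_good(sample))
```

To check a real system, the saved output of dmesg can be passed to at_least_known_good() as a string in place of the sample line.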

Comment 3 Lars Geiger 2023-07-13 12:34:53 UTC
(Ah, I forgot to undo the state change from when I was looking at the possible resolutions before I saved my comment. I guess it is closed now. :) Please feel free to set a more appropriate resolution, though.)

