Bug 2184855

Summary: amdgpu: NULL pointer dereference at drm_dp_add_payload_part2+0xca/0x100
Product: [Fedora] Fedora
Reporter: Jeff Layton <jlayton>
Component: kernel
Assignee: Kernel Maintainer List <kernel-maint>
Status: NEW
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: unspecified
Priority: unspecified
Version: 37
CC: acaringi, adscvr, airlied, alciregi, bskeggs, hdegoede, hpa, jarodwilson, jens+redhat, jglisse, josef, kernel-maint, lgoncalv, linville, marc, mario.limonciello, masami256, mchehab, ptalbert, steved
Hardware: Unspecified
OS: Unspecified
Type: Bug

Description Jeff Layton 2023-04-06 00:28:33 UTC
I've had several crashes in recent weeks (starting with the 6.2 kernels):

Apr 05 18:23:19 tleilax kernel: amdgpu 0000:30:00.0: [drm] Failed to create MST payload for port 000000003eed65e6: -5
Apr 05 18:23:19 tleilax kernel: BUG: kernel NULL pointer dereference, address: 0000000000000008
Apr 05 18:23:19 tleilax kernel: #PF: supervisor read access in kernel mode
Apr 05 18:23:19 tleilax kernel: #PF: error_code(0x0000) - not-present page
Apr 05 18:23:19 tleilax kernel: PGD 0 P4D 0 
Apr 05 18:23:19 tleilax kernel: Oops: 0000 [#1] PREEMPT SMP NOPTI
Apr 05 18:23:19 tleilax kernel: CPU: 1 PID: 3228 Comm: gnome-shell Kdump: loaded Not tainted 6.2.8-200.fc37.x86_64 #1
Apr 05 18:23:19 tleilax kernel: Hardware name: Micro-Star International Co., Ltd. MS-7A33/X370 SLI PLUS (MS-7A33), BIOS 3.JR 11/29/2019
Apr 05 18:23:19 tleilax kernel: RIP: 0010:drm_dp_add_payload_part2+0xca/0x100 [drm_display_helper]
Apr 05 18:23:19 tleilax kernel: Code: 8b 7e 08 44 89 e9 4c 89 c2 48 c7 c6 60 82 70 c0 e8 5b ba 39 c5 44 89 e8 5b 5d 41 5c 41 5d e9 ed b3 87 c5 48 8b 80 60 05 00 00 <48> 8b 76 08 4c 8b 40 60 48 85 f6 74 04 48 8b 76 08>
Apr 05 18:23:19 tleilax kernel: RSP: 0018:ffffb0374a0875f0 EFLAGS: 00010246
Apr 05 18:23:19 tleilax kernel: RAX: ffff8fc511664000 RBX: ffff8fc511664000 RCX: ffffffffc0707a98
Apr 05 18:23:19 tleilax kernel: RDX: ffff8fc5190e7480 RSI: 0000000000000000 RDI: ffff8fc511650568
Apr 05 18:23:19 tleilax kernel: RBP: 0000000000000001 R08: 00000000fffffffb R09: 0000000000000000
Apr 05 18:23:19 tleilax kernel: R10: 0000000000000002 R11: 0000000000000100 R12: ffff8fc511650000
Apr 05 18:23:19 tleilax kernel: R13: ffff8fc503acc420 R14: ffffffffc0f2c980 R15: ffff8fc512ee4390
Apr 05 18:23:19 tleilax kernel: FS:  00007f30b6bfc5c0(0000) GS:ffff8fd3dee40000(0000) knlGS:0000000000000000
Apr 05 18:23:19 tleilax kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Apr 05 18:23:19 tleilax kernel: CR2: 0000000000000008 CR3: 000000010842a000 CR4: 00000000003506e0
Apr 05 18:23:19 tleilax kernel: Call Trace:
Apr 05 18:23:19 tleilax kernel:  <TASK>
Apr 05 18:23:19 tleilax kernel:  dm_helpers_dp_mst_send_payload_allocation+0x83/0xb0 [amdgpu]
Apr 05 18:23:19 tleilax kernel:  dc_link_allocate_mst_payload+0x16d/0x280 [amdgpu]
Apr 05 18:23:19 tleilax kernel:  core_link_enable_stream+0x8ec/0xa10 [amdgpu]
Apr 05 18:23:19 tleilax kernel:  ? optc1_set_drr+0x136/0x1e0 [amdgpu]
Apr 05 18:23:19 tleilax kernel:  dce110_apply_ctx_to_hw+0x61b/0x670 [amdgpu]
Apr 05 18:23:19 tleilax kernel:  dc_commit_state_no_check+0x39b/0xcd0 [amdgpu]
Apr 05 18:23:19 tleilax kernel:  dc_commit_state+0x107/0x120 [amdgpu]
Apr 05 18:23:19 tleilax kernel:  amdgpu_dm_atomic_commit_tail+0x5bf/0x2d20 [amdgpu]
Apr 05 18:23:19 tleilax kernel:  ? cpufreq_this_cpu_can_update+0x12/0x60
Apr 05 18:23:19 tleilax kernel:  ? sugov_update_single_freq+0x62/0x180
Apr 05 18:23:19 tleilax kernel:  ? _raw_spin_lock+0x13/0x40
Apr 05 18:23:19 tleilax kernel:  ? raw_spin_rq_lock_nested+0x1e/0x70
Apr 05 18:23:19 tleilax kernel:  ? psi_group_change+0x168/0x400
Apr 05 18:23:19 tleilax kernel:  ? _raw_spin_unlock+0x15/0x30
Apr 05 18:23:19 tleilax kernel:  ? finish_task_switch.isra.0+0x9b/0x300
Apr 05 18:23:19 tleilax kernel:  ? __switch_to+0x106/0x410
Apr 05 18:23:19 tleilax kernel:  ? __schedule+0x3d4/0x13c0
Apr 05 18:23:19 tleilax kernel:  ? dma_resv_get_fences+0x11b/0x220
Apr 05 18:23:19 tleilax kernel:  ? schedule+0x67/0xe0
Apr 05 18:23:19 tleilax kernel:  ? schedule_timeout+0x14d/0x160
Apr 05 18:23:19 tleilax kernel:  ? preempt_count_add+0x6a/0xa0
Apr 05 18:23:19 tleilax kernel:  ? preempt_count_add+0x6a/0xa0
Apr 05 18:23:19 tleilax kernel:  ? _raw_spin_lock_irq+0x19/0x40
Apr 05 18:23:19 tleilax kernel:  ? _raw_spin_unlock_irq+0x1b/0x40
Apr 05 18:23:19 tleilax kernel:  ? wait_for_completion_timeout+0x13a/0x170
Apr 05 18:23:19 tleilax kernel:  ? wait_for_completion_interruptible+0x135/0x1e0
Apr 05 18:23:19 tleilax kernel:  ? __pfx_dma_fence_default_wait_cb+0x10/0x10
Apr 05 18:23:19 tleilax kernel:  commit_tail+0x94/0x130
Apr 05 18:23:19 tleilax kernel:  drm_atomic_helper_commit+0x112/0x140
Apr 05 18:23:19 tleilax kernel:  drm_atomic_commit+0x96/0xc0
Apr 05 18:23:19 tleilax kernel:  ? __pfx___drm_printfn_info+0x10/0x10
Apr 05 18:23:19 tleilax kernel:  drm_mode_atomic_ioctl+0x959/0xb50
Apr 05 18:23:19 tleilax kernel:  ? __pfx_drm_mode_atomic_ioctl+0x10/0x10
Apr 05 18:23:19 tleilax kernel:  drm_ioctl_kernel+0xc9/0x170
Apr 05 18:23:19 tleilax kernel:  drm_ioctl+0x22f/0x410
Apr 05 18:23:19 tleilax kernel:  ? __pfx_drm_mode_atomic_ioctl+0x10/0x10
Apr 05 18:23:19 tleilax kernel:  amdgpu_drm_ioctl+0x4a/0x80 [amdgpu]
Apr 05 18:23:19 tleilax kernel:  __x64_sys_ioctl+0x90/0xd0
Apr 05 18:23:19 tleilax kernel:  do_syscall_64+0x5b/0x80
Apr 05 18:23:19 tleilax kernel:  ? do_syscall_64+0x67/0x80
Apr 05 18:23:19 tleilax kernel:  ? syscall_exit_to_user_mode+0x17/0x40
Apr 05 18:23:19 tleilax kernel:  ? do_syscall_64+0x67/0x80
Apr 05 18:23:19 tleilax kernel:  ? __irq_exit_rcu+0x3d/0x140
Apr 05 18:23:19 tleilax kernel:  entry_SYSCALL_64_after_hwframe+0x72/0xdc
Apr 05 18:23:19 tleilax kernel: RIP: 0033:0x7f30bc323d6f
Apr 05 18:23:19 tleilax kernel: Code: 00 48 89 44 24 18 31 c0 48 8d 44 24 60 c7 04 24 10 00 00 00 48 89 44 24 08 48 8d 44 24 20 48 89 44 24 10 b8 10 00 00 00 0f 05 <89> c2 3d 00 f0 ff ff 77 18 48 8b 44 24 18 64 48 2b>
Apr 05 18:23:19 tleilax kernel: RSP: 002b:00007ffc86d6c560 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
Apr 05 18:23:19 tleilax kernel: RAX: ffffffffffffffda RBX: 0000560b426ee7a0 RCX: 00007f30bc323d6f
Apr 05 18:23:19 tleilax kernel: RDX: 00007ffc86d6c600 RSI: 00000000c03864bc RDI: 0000000000000008
Apr 05 18:23:19 tleilax kernel: RBP: 00007ffc86d6c600 R08: 0000000000000011 R09: 0000000000000011
Apr 05 18:23:19 tleilax kernel: R10: 0000000000000000 R11: 0000000000000246 R12: 00000000c03864bc
Apr 05 18:23:19 tleilax kernel: R13: 0000000000000008 R14: 0000560b43cf84e0 R15: 0000560b45257720
Apr 05 18:23:19 tleilax kernel:  </TASK>
Apr 05 18:23:19 tleilax kernel: Modules linked in: xpad ff_memless tls uinput xt_mark rfcomm snd_seq_dummy snd_hrtimer rpcrdma rdma_cm iw_cm ib_cm ib_core tun xt_CHECKSUM xt_MASQUERADE xt_conntrack ipt_REJECT nf_nat_>
Apr 05 18:23:19 tleilax kernel:  snd_seq_device videobuf2_memops kvm videobuf2_v4l2 snd_pcm irqbypass bluetooth pl2303 videobuf2_common snd_timer rapl mxm_wmi wmi_bmof pcspkr videodev snd rfkill k10temp i2c_piix4 mc >
Apr 05 18:23:19 tleilax kernel: CR2: 0000000000000008
Apr 05 18:23:19 tleilax kernel: ---[ end trace 0000000000000000 ]---
Apr 05 18:23:19 tleilax kernel: RIP: 0010:drm_dp_add_payload_part2+0xca/0x100 [drm_display_helper]
Apr 05 18:23:19 tleilax kernel: Code: 8b 7e 08 44 89 e9 4c 89 c2 48 c7 c6 60 82 70 c0 e8 5b ba 39 c5 44 89 e8 5b 5d 41 5c 41 5d e9 ed b3 87 c5 48 8b 80 60 05 00 00 <48> 8b 76 08 4c 8b 40 60 48 85 f6 74 04 48 8b 76 08>
Apr 05 18:23:19 tleilax kernel: RSP: 0018:ffffb0374a0875f0 EFLAGS: 00010246
Apr 05 18:23:19 tleilax kernel: RAX: ffff8fc511664000 RBX: ffff8fc511664000 RCX: ffffffffc0707a98
Apr 05 18:23:19 tleilax kernel: RDX: ffff8fc5190e7480 RSI: 0000000000000000 RDI: ffff8fc511650568
Apr 05 18:23:19 tleilax kernel: RBP: 0000000000000001 R08: 00000000fffffffb R09: 0000000000000000
Apr 05 18:23:19 tleilax kernel: R10: 0000000000000002 R11: 0000000000000100 R12: ffff8fc511650000
Apr 05 18:23:19 tleilax kernel: R13: ffff8fc503acc420 R14: ffffffffc0f2c980 R15: ffff8fc512ee4390
Apr 05 18:23:19 tleilax kernel: FS:  00007f30b6bfc5c0(0000) GS:ffff8fd3dee40000(0000) knlGS:0000000000000000
Apr 05 18:23:19 tleilax kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Apr 05 18:23:19 tleilax kernel: CR2: 0000000000000008 CR3: 000000010842a000 CR4: 00000000003506e0
Apr 05 18:23:19 tleilax kernel: note: gnome-shell[3228] exited with irqs disabled
Apr 05 18:23:20 tleilax abrt-dump-journal-oops[1789]: abrt-dump-journal-oops: Found oopses: 1
Apr 05 18:23:20 tleilax abrt-dump-journal-oops[1789]: abrt-dump-journal-oops: Creating problem directories
Apr 05 18:23:21 tleilax abrt-dump-journal-oops[1789]: Reported 1 kernel oopses to Abrt

faddr2line says:

drm_dp_add_payload_part2 at /usr/src/debug/kernel-6.2.9/linux-6.2.9-200.fc37.x86_64/drivers/gpu/drm/display/drm_dp_mst_topology.c:3407
 3402 	{
 3403 		int ret = 0;
 3404 	
 3405 		/* Skip failed payloads */
 3406 		if (payload->vc_start_slot == -1) {
>3407<			drm_dbg_kms(state->dev, "Part 1 of payload creation for %s failed, skipping part 2\n",
 3408 				    payload->port->connector->name);
 3409 			return -EIO;
 3410 		}
 3411 	
 3412 		ret = drm_dp_create_payload_step2(mgr, payload);

%rsi is NULL (and the faulting address in CR2 is 0x8), so I suspect that "state" was NULL here.

It seems to happen mostly when I'm away from my desk. I come back to black screens that can't be woken up. It may be related to the displays going to sleep, but I'm not sure.

I can still ssh into the box when it occurs, but trying to reboot it remotely seems to just hang. Unfortunately, I don't have a serial console set up on this machine.

Comment 1 Jeff Layton 2023-04-13 11:15:55 UTC
Patch posted to dri-devel and the DRI maintainers. We'll see what they say!

    https://lore.kernel.org/dri-devel/20230413111254.22458-1-jlayton@kernel.org/T/#u

Comment 2 Jens 2023-05-15 15:01:47 UTC
I seem to have come across the same bug (at least the stack trace looks almost identical).
I've seen this behaviour several times in the past weeks but can't reproduce it reliably.

The most recent occurrence went as follows:
 - My ThinkPad T14 (AMD) is connected through USB-C to a ThinkPad hub, which drives one display over DisplayPort. A second external display is connected directly to the laptop over HDMI.
 - The laptop lid is closed.
 - I unplugged the USB-C hub and the HDMI cable in quick succession.
 - When I later opened the lid, the screen was blank.

I expected the laptop to go to sleep, which did not happen. Switching to a TTY didn't work, and I had to force the machine to power off.

The logs are below:


Mai 15 16:00:08 jens-init7-t14 kernel: amdgpu 0000:04:00.0: [drm] Failed to create MST payload for port 00000000c0600df9: -5
Mai 15 16:00:08 jens-init7-t14 kernel: BUG: kernel NULL pointer dereference, address: 0000000000000008
Mai 15 16:00:08 jens-init7-t14 kernel: #PF: supervisor read access in kernel mode
Mai 15 16:00:08 jens-init7-t14 kernel: #PF: error_code(0x0000) - not-present page
Mai 15 16:00:08 jens-init7-t14 kernel: PGD 0 P4D 0 
Mai 15 16:00:08 jens-init7-t14 kernel: Oops: 0000 [#1] PREEMPT SMP NOPTI
Mai 15 16:00:08 jens-init7-t14 kernel: CPU: 0 PID: 1520 Comm: Xorg Tainted: G        W          6.1.26-1-MANJARO #1 e1c38db939b1e3efe75f91662b55a6291cec7a5e
Mai 15 16:00:08 jens-init7-t14 kernel: Hardware name: LENOVO 21CFCTO1WW/21CFCTO1WW, BIOS R23ET60W (1.30 ) 09/14/2022
Mai 15 16:00:08 jens-init7-t14 kernel: RIP: 0010:drm_dp_add_payload_part2+0xae/0xe0 [drm_display_helper]
Mai 15 16:00:08 jens-init7-t14 kernel: Code: 89 e9 48 c7 c1 d0 41 c1 c0 ba 02 00 00 00 31 ff e8 d7 7c 3a e3 44 89 e8 5b 5d 41 5c 41 5d c3 cc cc cc cc 48 8b 80 60 05 00 00 <48> 8b 76 08 4c 8b 40 60 48 85 f6 74 04 48 8b 76 08 48 c7 c1 50 41
Mai 15 16:00:08 jens-init7-t14 kernel: RSP: 0018:ffffac5c03a6f5d8 EFLAGS: 00010246
Mai 15 16:00:08 jens-init7-t14 kernel: RAX: ffff9b37915ee000 RBX: ffff9b37915ee000 RCX: ffffffffc0c136b0
Mai 15 16:00:08 jens-init7-t14 kernel: RDX: ffff9b32d53bcc00 RSI: 0000000000000000 RDI: ffff9b31687ba540
Mai 15 16:00:08 jens-init7-t14 kernel: RBP: 0000000000000001 R08: 00000000fffffffb R09: 0000000000000000
Mai 15 16:00:08 jens-init7-t14 kernel: R10: 0000000000000002 R11: 0000000000000000 R12: ffff9b31687ba000
Mai 15 16:00:08 jens-init7-t14 kernel: R13: ffff9b3151461c60 R14: ffffffffc1c252a0 R15: ffff9b3168797390
Mai 15 16:00:08 jens-init7-t14 kernel: FS:  00007f05e6fff400(0000) GS:ffff9b385ee00000(0000) knlGS:0000000000000000
Mai 15 16:00:08 jens-init7-t14 kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Mai 15 16:00:08 jens-init7-t14 kernel: CR2: 0000000000000008 CR3: 0000000101d2e000 CR4: 0000000000750ef0
Mai 15 16:00:08 jens-init7-t14 kernel: PKRU: 55555554
Mai 15 16:00:08 jens-init7-t14 kernel: Call Trace:
Mai 15 16:00:08 jens-init7-t14 kernel:  <TASK>
Mai 15 16:00:08 jens-init7-t14 kernel:  dm_helpers_dp_mst_send_payload_allocation+0x87/0xb0 [amdgpu 0480d40e88acad82771dcf83484cee747c9f72c9]
Mai 15 16:00:08 jens-init7-t14 kernel:  dc_link_allocate_mst_payload+0x171/0x270 [amdgpu 0480d40e88acad82771dcf83484cee747c9f72c9]
Mai 15 16:00:08 jens-init7-t14 kernel:  core_link_enable_stream+0x7c4/0x970 [amdgpu 0480d40e88acad82771dcf83484cee747c9f72c9]
Mai 15 16:00:08 jens-init7-t14 kernel:  ? optc31_set_drr+0x128/0x1d0 [amdgpu 0480d40e88acad82771dcf83484cee747c9f72c9]
Mai 15 16:00:08 jens-init7-t14 kernel:  dce110_apply_ctx_to_hw+0x67b/0x720 [amdgpu 0480d40e88acad82771dcf83484cee747c9f72c9]
Mai 15 16:00:08 jens-init7-t14 kernel:  ? dm_read_reg_func+0x3b/0xb0 [amdgpu 0480d40e88acad82771dcf83484cee747c9f72c9]
Mai 15 16:00:08 jens-init7-t14 kernel:  ? dcn10_verify_allow_pstate_change_high+0x39/0x380 [amdgpu 0480d40e88acad82771dcf83484cee747c9f72c9]
Mai 15 16:00:08 jens-init7-t14 kernel:  ? dcn10_wait_for_mpcc_disconnect+0x3d/0x150 [amdgpu 0480d40e88acad82771dcf83484cee747c9f72c9]
Mai 15 16:00:08 jens-init7-t14 kernel:  dc_commit_state_no_check+0x38c/0xc80 [amdgpu 0480d40e88acad82771dcf83484cee747c9f72c9]
Mai 15 16:00:08 jens-init7-t14 kernel:  ? dc_validate_global_state+0x3b7/0x3e0 [amdgpu 0480d40e88acad82771dcf83484cee747c9f72c9]
Mai 15 16:00:08 jens-init7-t14 kernel:  dc_commit_state+0xe6/0x100 [amdgpu 0480d40e88acad82771dcf83484cee747c9f72c9]
Mai 15 16:00:08 jens-init7-t14 kernel:  amdgpu_dm_atomic_commit_tail+0x5bf/0x2a70 [amdgpu 0480d40e88acad82771dcf83484cee747c9f72c9]
Mai 15 16:00:08 jens-init7-t14 kernel:  ? sysvec_apic_timer_interrupt+0xe/0x90
Mai 15 16:00:08 jens-init7-t14 kernel:  ? asm_sysvec_apic_timer_interrupt+0x1a/0x20
Mai 15 16:00:08 jens-init7-t14 kernel:  ? delay_halt_mwaitx+0x3d/0x50
Mai 15 16:00:08 jens-init7-t14 kernel:  ? delay_halt+0x3f/0x70
Mai 15 16:00:08 jens-init7-t14 kernel:  ? amdgpu_fence_wait_polling+0x2b/0x60 [amdgpu 0480d40e88acad82771dcf83484cee747c9f72c9]
Mai 15 16:00:08 jens-init7-t14 kernel:  ? amdgpu_virt_kiq_reg_write_reg_wait+0xd6/0x180 [amdgpu 0480d40e88acad82771dcf83484cee747c9f72c9]
Mai 15 16:00:08 jens-init7-t14 kernel:  ? dma_resv_iter_first_unlocked+0x66/0x70
Mai 15 16:00:08 jens-init7-t14 kernel:  ? dma_resv_get_fences+0x61/0x220
Mai 15 16:00:08 jens-init7-t14 kernel:  ? wait_for_completion_timeout+0x13e/0x170
Mai 15 16:00:08 jens-init7-t14 kernel:  ? wait_for_completion_interruptible+0x139/0x1e0
Mai 15 16:00:08 jens-init7-t14 kernel:  commit_tail+0x94/0x130
Mai 15 16:00:08 jens-init7-t14 kernel:  drm_atomic_helper_commit+0x116/0x140
Mai 15 16:00:08 jens-init7-t14 kernel:  drm_atomic_commit+0x9a/0xd0
Mai 15 16:00:08 jens-init7-t14 kernel:  ? drm_plane_get_damage_clips.cold+0x1c/0x1c
Mai 15 16:00:08 jens-init7-t14 kernel:  drm_atomic_helper_set_config+0x74/0xb0
Mai 15 16:00:08 jens-init7-t14 kernel:  drm_mode_setcrtc+0x515/0x7e0
Mai 15 16:00:08 jens-init7-t14 kernel:  ? drm_mode_getcrtc+0x180/0x180
Mai 15 16:00:08 jens-init7-t14 kernel:  drm_ioctl_kernel+0xcd/0x170
Mai 15 16:00:08 jens-init7-t14 kernel:  drm_ioctl+0x233/0x410
Mai 15 16:00:08 jens-init7-t14 kernel:  ? drm_mode_getcrtc+0x180/0x180
Mai 15 16:00:08 jens-init7-t14 kernel:  amdgpu_drm_ioctl+0x4e/0x90 [amdgpu 0480d40e88acad82771dcf83484cee747c9f72c9]
Mai 15 16:00:08 jens-init7-t14 kernel:  __x64_sys_ioctl+0x94/0xd0
Mai 15 16:00:08 jens-init7-t14 kernel:  do_syscall_64+0x5f/0x90
Mai 15 16:00:08 jens-init7-t14 kernel:  ? do_syscall_64+0x6b/0x90
Mai 15 16:00:08 jens-init7-t14 kernel:  ? do_syscall_64+0x6b/0x90
Mai 15 16:00:08 jens-init7-t14 kernel:  ? do_syscall_64+0x6b/0x90
Mai 15 16:00:08 jens-init7-t14 kernel:  entry_SYSCALL_64_after_hwframe+0x63/0xcd
Mai 15 16:00:08 jens-init7-t14 kernel: RIP: 0033:0x7f05e79d353f
Mai 15 16:00:08 jens-init7-t14 kernel: Code: 00 48 89 44 24 18 31 c0 48 8d 44 24 60 c7 04 24 10 00 00 00 48 89 44 24 08 48 8d 44 24 20 48 89 44 24 10 b8 10 00 00 00 0f 05 <89> c2 3d 00 f0 ff ff 77 18 48 8b 44 24 18 64 48 2b 04 25 28 00 00
Mai 15 16:00:08 jens-init7-t14 kernel: RSP: 002b:00007ffe3f933650 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
Mai 15 16:00:08 jens-init7-t14 kernel: RAX: ffffffffffffffda RBX: 000055917c83a720 RCX: 00007f05e79d353f
Mai 15 16:00:08 jens-init7-t14 kernel: RDX: 00007ffe3f9336e0 RSI: 00000000c06864a2 RDI: 000000000000000f
Mai 15 16:00:08 jens-init7-t14 kernel: RBP: 00007ffe3f9336e0 R08: 0000000000000000 R09: 000055917c962200
Mai 15 16:00:08 jens-init7-t14 kernel: R10: 0000000000000000 R11: 0000000000000246 R12: 00000000c06864a2
Mai 15 16:00:08 jens-init7-t14 kernel: R13: 000000000000000f R14: 000055917b587590 R15: 000055917b388c30
Mai 15 16:00:08 jens-init7-t14 kernel:  </TASK>
Mai 15 16:00:08 jens-init7-t14 kernel: Modules linked in: tun veth xt_nat xt_tcpudp xt_conntrack xt_MASQUERADE nf_conntrack_netlink nfnetlink iptable_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 xt_addrtype iptable_filter br_netfilter bridge stp llc overlay exfat ses enclosure scsi_transport_sas uas usb_storage snd_seq_dummy snd_seq hid_logitech_hidpp uhid rfcomm ccm michael_mic qrtr_mhi squashfs cmac algif_hash algif_skcipher af_alg bnep snd_soc_dmic snd_soc_acp6x_mach snd_acp6x_pdm_dma snd_sof_amd_rembrandt snd_sof_amd_renoir snd_sof_amd_acp snd_sof_pci qrtr snd_sof ath11k_pci snd_sof_utils snd_soc_core ath11k intel_rapl_msr qmi_helpers snd_ctl_led amdgpu snd_compress intel_rapl_common btusb ac97_bus snd_hda_codec_realtek snd_pcm_dmaengine snd_hda_codec_generic snd_pci_ps edac_mce_amd snd_hda_codec_hdmi snd_rpl_pci_acp6x btrtl snd_acp_pci vfat btbcm mac80211 snd_pci_acp6x fat snd_hda_intel uvcvideo btintel kvm_amd snd_usb_audio snd_intel_dspcfg gpu_sched think_lmi btmtk libarc4 videobuf2_vmalloc
Mai 15 16:00:08 jens-init7-t14 kernel:  snd_intel_sdw_acpi hid_multitouch drm_buddy videobuf2_memops snd_usbmidi_lib wmi_bmof firmware_attributes_class thinkpad_acpi snd_hda_codec bluetooth snd_pci_acp5x r8169 cdc_ether drm_ttm_helper videobuf2_v4l2 ledtrig_audio kvm usbnet cfg80211 snd_rawmidi ucsi_acpi snd_hda_core videobuf2_common snd_rn_pci_acp3x ecdh_generic ttm platform_profile irqbypass snd_hwdep realtek snd_seq_device typec_ucsi snd_acp_config r8152 sp5100_tco videodev snd_soc_acpi drm_display_helper rfkill snd_pcm mdio_devres rapl pcspkr psmouse typec video k10temp i2c_piix4 snd_timer mii mc crc16 cec snd snd_pci_acp3x mhi libphy roles soundcore mousedev i2c_hid_acpi joydev i2c_hid wmi acpi_cpufreq acpi_tad amd_pmc mac_hid dm_multipath sg crypto_user loop fuse bpf_preload ip_tables x_tables btrfs blake2b_generic libcrc32c crc32c_generic xor raid6_pq usbhid dm_crypt cbc encrypted_keys trusted asn1_encoder tee dm_mod serio_raw atkbd crct10dif_pclmul crc32_pclmul crc32c_intel polyval_clmulni polyval_generic
Mai 15 16:00:08 jens-init7-t14 kernel:  libps2 gf128mul ghash_clmulni_intel vivaldi_fmap sha512_ssse3 nvme aesni_intel crypto_simd nvme_core cryptd xhci_pci ccp i8042 nvme_common xhci_pci_renesas serio
Mai 15 16:00:08 jens-init7-t14 kernel: CR2: 0000000000000008
Mai 15 16:00:08 jens-init7-t14 kernel: ---[ end trace 0000000000000000 ]---
Mai 15 16:00:08 jens-init7-t14 kernel: RIP: 0010:drm_dp_add_payload_part2+0xae/0xe0 [drm_display_helper]
Mai 15 16:00:08 jens-init7-t14 kernel: Code: 89 e9 48 c7 c1 d0 41 c1 c0 ba 02 00 00 00 31 ff e8 d7 7c 3a e3 44 89 e8 5b 5d 41 5c 41 5d c3 cc cc cc cc 48 8b 80 60 05 00 00 <48> 8b 76 08 4c 8b 40 60 48 85 f6 74 04 48 8b 76 08 48 c7 c1 50 41
Mai 15 16:00:08 jens-init7-t14 kernel: RSP: 0018:ffffac5c03a6f5d8 EFLAGS: 00010246
Mai 15 16:00:08 jens-init7-t14 kernel: RAX: ffff9b37915ee000 RBX: ffff9b37915ee000 RCX: ffffffffc0c136b0
Mai 15 16:00:08 jens-init7-t14 kernel: RDX: ffff9b32d53bcc00 RSI: 0000000000000000 RDI: ffff9b31687ba540
Mai 15 16:00:08 jens-init7-t14 kernel: RBP: 0000000000000001 R08: 00000000fffffffb R09: 0000000000000000
Mai 15 16:00:08 jens-init7-t14 kernel: R10: 0000000000000002 R11: 0000000000000000 R12: ffff9b31687ba000
Mai 15 16:00:08 jens-init7-t14 kernel: R13: ffff9b3151461c60 R14: ffffffffc1c252a0 R15: ffff9b3168797390
Mai 15 16:00:08 jens-init7-t14 kernel: FS:  00007f05e6fff400(0000) GS:ffff9b385ee00000(0000) knlGS:0000000000000000
Mai 15 16:00:08 jens-init7-t14 kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Mai 15 16:00:08 jens-init7-t14 kernel: CR2: 0000000000000008 CR3: 0000000101d2e000 CR4: 0000000000750ef0
Mai 15 16:00:08 jens-init7-t14 kernel: PKRU: 55555554
Mai 15 16:00:08 jens-init7-t14 kernel: note: Xorg[1520] exited with irqs disabled
Mai 15 16:00:08 jens-init7-t14 kernel: usb 9-1.4.3: USB disconnect, device number 48
Mai 15 16:00:08 jens-init7-t14 kernel: usb 9-1.4.4: USB disconnect, device number 51

I'm no kernel dev, so just wanted to add some info in case it'd help :)

Comment 3 Mario Limonciello 2023-07-05 19:53:38 UTC
Since your fix was accepted upstream in 6.4 (commit 54d217406afe), I've nominated it for the 6.1.y and 6.3.y stable trees.
https://lore.kernel.org/stable/1c04a328-10e2-606a-c1ab-370d785d3534@amd.com/T/#u

If possible, though, I think this should be pulled into the Fedora kernel sooner rather than waiting for those to land.

Comment 4 Jeff Layton 2023-07-06 12:19:03 UTC
That sounds great to me. I hit the same bug as recently as June 29th.