Bug 1852283 - crash/oops in drm_encoder_init, taint, on every boot
Status: NEW
Alias: None
Product: Fedora
Classification: Fedora
Component: kernel
Version: 32
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Assignee: Kernel Maintainer List
QA Contact: Fedora Extras Quality Assurance
Reported: 2020-06-30 03:13 UTC by Dimitris
Modified: 2020-07-24 06:16 UTC (History)

Type: Bug


Attachments
part of kernel logs that seems relevant (10.79 KB, text/plain)
2020-06-30 03:13 UTC, Dimitris


Links
System                  ID                     Priority  Status  Summary  Last Updated
freedesktop.org Gitlab  drm/amd - issues 1108  None      None    None     2020-06-30 15:05:11 UTC

Description Dimitris 2020-06-30 03:13:50 UTC
Created attachment 1699226
part of kernel logs that seems relevant

1. Please describe the problem:

I get an ABRT warning and kernel taint on every boot about a crash/oops in drm_encoder_init.

2. What is the Version-Release number of the kernel:

5.7.6-201.fc32.x86_64

3. Did it work previously in Fedora? If so, what kernel version did the issue
   *first* appear?  Old kernels are available for download at
   https://koji.fedoraproject.org/koji/packageinfo?packageID=8 :

Yes, this is a regression starting on 5.7.6-201.fc32.x86_64


4. Can you reproduce this issue? If so, please provide the steps to reproduce
   the issue below:

Reproduces every time on boot.


5. Does this problem occur with the latest Rawhide kernel? To install the
   Rawhide kernel, run ``sudo dnf install fedora-repos-rawhide`` followed by
   ``sudo dnf update --enablerepo=rawhide kernel``:

Haven't tried rawhide.

6. Are you running any modules that are not shipped directly with Fedora's kernel?:

No.

7. Please attach the kernel logs. You can get the complete kernel log
   for a boot with ``journalctl --no-hostname -k > dmesg.txt``. If the
   issue occurred on a previous boot, use the journalctl ``-b`` flag.

The relevant part of the logs is attached. Possibly relevant: this laptop (ThinkPad T495, Ryzen 3700U) boots while docked, with a DisplayPort monitor attached via the dock.
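
A minimal sketch for trimming a saved kernel log down to the relevant part before attaching it. It assumes the log was saved as dmesg.txt with the journalctl command from the template, and that the report starts with a "WARNING: CPU:" line and ends with an "end trace" marker, as in the excerpt in comment 2. extract_warnings is a hypothetical helper name, not an existing tool.

```shell
# Hypothetical helper: print every block that runs from a "WARNING: CPU:"
# line through the next "end trace" marker, reading the log on stdin.
extract_warnings() {
    awk '/WARNING: CPU:/,/end trace/'
}

# Assumption: dmesg.txt was produced with
#   journalctl --no-hostname -k > dmesg.txt
# Nothing is printed if the file is not present.
if [ -r dmesg.txt ]; then
    extract_warnings < dmesg.txt
fi
```

A kernel oops that is not a WARNING would start with a different line, so the first pattern may need adjusting for other reports.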

Comment 1 Martin Wolf 2020-07-05 10:34:49 UTC
I am also affected by this error, and I would like to add that with this new kernel my system wakes up from standby.
5.6.19 stays asleep like a baby, but 5.7.6 wakes up once.

Comment 2 Luca 2020-07-21 14:11:17 UTC
For me the bug is still there with Kernel 5.8.0-0.rc5.20200717git07a56bb875af.1.fc33.x86_64.

dmesg excerpt:

[    5.192415] WARNING: CPU: 5 PID: 267 at drivers/gpu/drm/drm_mode_object.c:45 drm_mode_object_add+0x7d/0x90 [drm]
[    5.192417] Modules linked in: amdgpu iommu_v2 gpu_sched i2c_algo_bit ttm crct10dif_pclmul crc32_pclmul crc32c_intel drm_kms_helper ghash_clmulni_intel cec drm e1000e xhci_pci xhci_pci_renesas video fuse i2c_dev
[    5.192432] CPU: 5 PID: 267 Comm: kworker/5:2 Tainted: G        W        --------- ---  5.8.0-0.rc5.20200717git07a56bb875af.1.fc33.x86_64 #1
[    5.192435] Hardware name: Gigabyte Technology Co., Ltd. H97-D3H/H97-D3H-CF, BIOS F7 09/19/2015
[    5.192447] Workqueue: events_long drm_dp_mst_link_probe_work [drm_kms_helper]
[    5.192463] RIP: 0010:drm_mode_object_add+0x7d/0x90 [drm]
[    5.192466] Code: 07 89 45 00 44 89 65 04 4c 89 ef e8 ed b2 8e dc 85 db b8 00 00 00 00 0f 4e c3 5b 5d 41 5c 41 5d c3 80 bf a0 00 00 00 00 74 a4 <0f> 0b eb a0 66 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 0f 1f 44
[    5.192469] RSP: 0018:ffffb965c073fad8 EFLAGS: 00010202
[    5.192472] RAX: ffffffffc096f160 RBX: ffff9a87f55d5000 RCX: 0000000000000007
[    5.192474] RDX: 00000000e0e0e0e0 RSI: ffff9a87eec30618 RDI: ffff9a87f55d5000
[    5.192477] RBP: ffff9a87eec30618 R08: 0000000000000000 R09: ffff9a87eec30600
[    5.192479] R10: 0000000000000000 R11: 0000000000000000 R12: 00000000e0e0e0e0
[    5.192481] R13: 0000000000000007 R14: ffffffffc07fde00 R15: ffff9a87eec30618
[    5.192484] FS:  0000000000000000(0000) GS:ffff9a8805800000(0000) knlGS:0000000000000000
[    5.192487] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[    5.192489] CR2: 00007f33a5ef0e60 CR3: 00000005f5f7a003 CR4: 00000000001606e0
[    5.192492] Call Trace:
[    5.192510]  drm_encoder_init+0x49/0x170 [drm]
[    5.192515]  ? trace_kmalloc+0xf2/0x120
[    5.192522]  ? kmem_cache_alloc_trace+0x11a/0x230
[    5.192616]  dm_dp_add_mst_connector+0x11a/0x1f0 [amdgpu]
[    5.192631]  drm_dp_mst_port_add_connector+0x10d/0x1a0 [drm_kms_helper]
[    5.192637]  ? find_held_lock+0x32/0x90
[    5.192647]  ? drm_dp_send_link_address+0x41c/0x990 [drm_kms_helper]
[    5.192659]  ? drm_dp_send_link_address+0x41c/0x990 [drm_kms_helper]
[    5.192662]  ? find_held_lock+0x32/0x90
[    5.192666]  ? sched_clock+0x5/0x10
[    5.192669]  ? sched_clock_cpu+0xc/0xb0
[    5.192679]  ? __mutex_unlock_slowpath+0x35/0x270
[    5.192692]  drm_dp_send_link_address+0x36b/0x990 [drm_kms_helper]
[    5.192730]  ? slab_free_freelist_hook+0x116/0x1d0
[    5.192742]  drm_dp_check_and_send_link_address+0xad/0xd0 [drm_kms_helper]
[    5.192754]  drm_dp_mst_link_probe_work+0x98/0x190 [drm_kms_helper]
[    5.192763]  process_one_work+0x269/0x5c0
[    5.192775]  worker_thread+0x55/0x3c0
[    5.192779]  ? process_one_work+0x5c0/0x5c0
[    5.192785]  kthread+0x138/0x160
[    5.192789]  ? kthread_create_worker_on_cpu+0x40/0x40
[    5.192796]  ret_from_fork+0x1f/0x30
[    5.192812] irq event stamp: 2215
[    5.192815] hardirqs last  enabled at (2221): [<ffffffff9c16f2a7>] console_unlock+0x4b7/0x6c0
[    5.192819] hardirqs last disabled at (2226): [<ffffffff9c16ee9d>] console_unlock+0xad/0x6c0
[    5.192822] softirqs last  enabled at (2160): [<ffffffff9d000377>] __do_softirq+0x377/0x4a2
[    5.192825] softirqs last disabled at (2149): [<ffffffff9ce010cf>] asm_call_on_stack+0xf/0x20
[    5.192828] ---[ end trace e7e1f9f14a3015d6 ]---
[    5.210160] [drm] fb mappable at 0xE0CEE000
[    5.210165] [drm] vram apper at 0xE0000000
[    5.210167] [drm] size 14745600
[    5.210170] [drm] fb depth is 24
[    5.210172] [drm]    pitch is 10240
[    5.210502] fbcon: amdgpudrmfb (fb0) is primary device
[    5.210506] fbcon: Deferring console take-over
[    5.210514] amdgpu 0000:03:00.0: fb0: amdgpudrmfb frame buffer device
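
When reading a trace like the one above, the frames prefixed with "?" are addresses the kernel found on the stack but could not confirm as part of the call chain, so it can help to hide them. The following is only a sketch: reliable_frames is a hypothetical helper name, and it assumes the timestamped format shown in this excerpt.

```shell
# Hypothetical helper: print only the confirmed frames of each "Call Trace:"
# section read from stdin, with the timestamp prefix removed.
reliable_frames() {
    awk '
        /Call Trace:/ { in_trace = 1; next }
        # The trace ends at the first line that is not a symbol+0xOFF/0xSIZE frame.
        in_trace && !/\+0x[0-9a-f]+\/0x[0-9a-f]+/ { in_trace = 0 }
        # Skip "?" frames, strip the "[ seconds ]" prefix, print the rest.
        in_trace && !/\] +\? / { sub(/^\[[^]]*\] +/, ""); print }
    '
}

# Assumption: the excerpt was saved to dmesg.txt; nothing runs otherwise.
if [ -r dmesg.txt ]; then
    reliable_frames < dmesg.txt
fi
```

Applied to this excerpt, the filter should leave the chain that starts at drm_encoder_init and goes back through dm_dp_add_mst_connector and the drm_dp_mst_* helpers to the drm_dp_mst_link_probe_work workqueue item.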

Comment 3 Michel Dänzer 2020-07-21 15:18:51 UTC
(In reply to Luca from comment #2)
> For me the bug is still there with Kernel
> 5.8.0-0.rc5.20200717git07a56bb875af.1.fc33.x86_64.

The fix (3168470142e0 "drm/amdgpu/display: create fake mst encoders ahead of time (v4)") is in 5.8-rc6, and should get backported to 5.7.y as well.

Comment 4 Dimitris 2020-07-24 06:16:40 UTC
Looks like the fix is included in 5.7.10-201.fc32
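
For anyone checking their own machine, here is a rough sketch that compares the running kernel against the build mentioned above. It only judges the Fedora 5.7.y series, on the assumption (taken from this comment) that 5.7.10 and later carry the backport. Every other series, including 5.8 release candidates, is reported as unknown. fix_status is a hypothetical helper name.

```shell
# Hypothetical helper: classify a kernel release string such as
# "5.7.6-201.fc32.x86_64" relative to the 5.7.10 build reported as fixed.
fix_status() {
    version=${1%%-*}                 # "5.7.6-201.fc32.x86_64" -> "5.7.6"
    case $version in
        5.7.*)
            patch=${version##*.}     # "5.7.6" -> "6"
            patch=${patch%%[!0-9]*}  # drop any non-numeric suffix
            if [ "${patch:-0}" -ge 10 ]; then
                echo "has-fix"
            else
                echo "predates-fix"
            fi
            ;;
        *)
            # Assumption: nothing is claimed about other kernel series.
            echo "unknown"
            ;;
    esac
}

fix_status "$(uname -r)"
```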

