Bug 1981039
| Summary: | Black screen when starting Plasma and GNOME on Wayland with the 5.13.1 kernel involving amdgpu driver | | |
|---|---|---|---|
| Product: | [Fedora] Fedora | Reporter: | Matt Fagnani <matt.fagnani> |
| Component: | kernel | Assignee: | Kernel Maintainer List <kernel-maint> |
| Status: | CLOSED EOL | QA Contact: | Fedora Extras Quality Assurance <extras-qa> |
| Severity: | unspecified | Docs Contact: | |
| Priority: | unspecified | | |
| Version: | 34 | CC: | acaringi, adscvr, agurenko, airlied, alciregi, bskeggs, hdegoede, jarodwilson, jeremy, jforbes, jglisse, jonathan, josef, kernel-maint, kheine7, lgoncalv, linville, masami256, mchehab, ptalbert, savelov, steved |
| Target Milestone: | --- | | |
| Target Release: | --- | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| URL: | https://gitlab.freedesktop.org/drm/amd/-/issues/1644 | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2022-06-07 22:41:13 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
| Attachments: | | | |

Description
Matt Fagnani
2021-07-10 20:51:10 UTC
I reported this problem at https://gitlab.freedesktop.org/drm/amd/-/issues/1644

The latest Rawhide kernel kernel-5.14.0-0.rc0.20210709gitf55966571d5e.14.fc35 had this problem also. I heard that most kernel problems were introduced in the merge window, so I tried 5.13.0-0.rc1.13.fc35 from koji first. 5.13.0-0.rc1.13.fc35 had the same problem. The only successful 5.13 merge window Rawhide build 5.13.0-0.rc0.20210428gitacd3d2859453.2.fc35 in koji didn't have the problem. I take that to mean the problem was introduced in the 5.13 merge window after 5.13.0-0.rc0.20210428gitacd3d2859453.2.fc35.

Michel Dänzer suggested using git bisect at https://gitlab.freedesktop.org/drm/amd/-/issues/1644#note_988139

I bisected the 5.13 merge window of the mainline kernel using the Fedora instructions at https://docs.fedoraproject.org/en-US/quick-docs/kernel/troubleshooting/index.html#_bisecting_the_kernel

The first bad commit according to git bisect was the following.

1f928f51593ca07e2b125ca862fcff687e9e498b is the first bad commit
commit 1f928f51593ca07e2b125ca862fcff687e9e498b
Author: Oak Zeng <Oak.Zeng>
Date: Sat Jan 23 11:34:45 2021 -0600

drm/amdgpu: Use physical translation mode to access page table

On A+A platform, CPU write page directory and page table in cached mode. So it is necessary for page table walker to snoop CPU cache. This setting is necessary for page walker to snoop page directory and page table data out of CPU cache.

Signed-off-by: Oak Zeng <Oak.Zeng>
Acked-by: Christian Konig <christian.koenig>
Reviewed-by: Felix Kuehling <felix.kuehling>
Signed-off-by: Alex Deucher <alexander.deucher>

drivers/gpu/drm/amd/amdgpu/gfxhub_v1_0.c | 9 +++++++--
drivers/gpu/drm/amd/amdgpu/mmhub_v1_7.c | 13 +++++++++++--
2 files changed, 18 insertions(+), 4 deletions(-)

https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?h=v5.14-rc1&id=1f928f51593ca07e2b125ca862fcff687e9e498b

5.13.2 is affected by this problem.
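As a rough illustration of the search that git bisect performed in the description above, the sketch below runs the same binary search over a linear list of commits. It is a simplification and not git's implementation: real kernel history is a graph, and in this report each test meant building and booting a kernel by hand. The function name `first_bad_commit`, the `is_bad` callback, and the commit labels are all hypothetical.

```python
from typing import Callable, Sequence, Tuple


def first_bad_commit(
    commits: Sequence[str], is_bad: Callable[[str], bool]
) -> Tuple[str, int]:
    """Return the first bad commit and the number of commits tested.

    Assumes commits[0] is known good, commits[-1] is known bad, and every
    commit after the first bad one is also bad (a single regression).
    """
    good, bad = 0, len(commits) - 1  # indices of the known-good and known-bad ends
    tested = 0
    while bad - good > 1:
        mid = (good + bad) // 2
        tested += 1
        if is_bad(commits[mid]):
            bad = mid   # the regression is at mid or earlier
        else:
            good = mid  # the regression is after mid
    return commits[bad], tested


# Hypothetical linear history of 1000 commits with the regression at index 637.
history = [f"c{i:04d}" for i in range(1000)]
culprit, steps = first_bad_commit(history, lambda c: int(c[1:]) >= 637)
print(culprit, steps)
```

Because the candidate range halves on every test, roughly log2(N) kernels need to be built and booted for N commits, which is what makes bisecting a whole merge window by hand practical.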
I attached the journal for a boot of 5.13.1 with drm.debug=94 as suggested by Michel where I reproduced the black screen problem by logging into Plasma on Wayland at https://gitlab.freedesktop.org/drm/amd/-/issues/1644#note_988861

There were a lot more repeated messages from amdgpu with drm.debug=94 after the login, but I'm not sure which of them might indicate the error. I booted the 5.12.15 kernel with drm.debug=94, and it also had the messages like

[drm:amdgpu_dm_atomic_check [amdgpu]] Atomic check failed with err: -22
[drm:dm_update_crtc_state [amdgpu]] Disabling DRM crtc: 47

So those messages might not be indicative of the problem.

*** Bug 1984686 has been marked as a duplicate of this bug. ***

5.13.4 and earlier 5.13 kernels had this problem.

5.14.0-0.rc2.20210723git8baef6386baa.26.fc35 didn't have this problem with the default kernel command line in a Fedora 34 KDE Plasma installation and Fedora-KDE-Live-x86_64-Rawhide-20210724.n.0.iso from https://koji.fedoraproject.org/koji/buildinfo?buildID=1805690

5.14.0-0.rc2.20210722git3d5895cd3517.25.fc35 had this issue. I bisected the mainline kernel after 5.14-rc2 from 3d5895cd3517 to 8baef6386baa. The first commit that fixed the problem was 6be50f5d83adc9541de3d5be26e968182b5ac150 which fixed a regression in the amdgpu DC on some embedded panels as follows.

6be50f5d83adc9541de3d5be26e968182b5ac150 is the first new commit
commit 6be50f5d83adc9541de3d5be26e968182b5ac150
Author: Stylon Wang <stylon.wang>
Date: Wed Jul 21 12:25:24 2021 +0800

drm/amd/display: Fix ASSR regression on embedded panels

[Why]
Regression found in some embedded panels traces back to the earliest upstreamed ASSR patch. The changed code flow are causing problems with some panels.

[How]
- Change ASSR enabling code while preserving original code flow as much as possible
- Simplify the code on guarding with internal display flag

Bug: https://bugzilla.kernel.org/show_bug.cgi?id=213779
Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1620
Reviewed-by: Alex Deucher <alexander.deucher>
Signed-off-by: Stylon Wang <stylon.wang>
Signed-off-by: Alex Deucher <alexander.deucher>
Cc: stable.org

drivers/gpu/drm/amd/display/dc/core/dc_link_dp.c | 8 +++++---
1 file changed, 5 insertions(+), 3 deletions(-)

Kernel 5.13.12 still has the same issue.

The issue you have linked in upstream was closed as a duplicate of another, which was fixed in 5.13.9. It would probably be worth mentioning that 5.13.9 did not fix your issue in the upstream bug so that they can figure out what the real fix is.

I updated to F35 on my main drive two weeks ago, and 5.14-rc4 to 5.14-rc6 haven't had this problem. 5.13.12 has this problem in a F34 installation on another drive.

Alex Deucher wrote at https://gitlab.freedesktop.org/drm/amd/-/issues/1620#note_1006759

patches are in 5.14, they should be landing in stable soon:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=715bfff397634c44d616e27e11c873be1d442977
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=6be50f5d83adc9541de3d5be26e968182b5ac150

6be50f5d83adc9541de3d5be26e968182b5ac150 was included in 5.13.9 as fad0494f626f1a6b2ea76cd7c6d137d1b4961636 according to https://cdn.kernel.org/pub/linux/kernel/v5.x/ChangeLog-5.13.9

I couldn't find 715bfff397634c44d616e27e11c873be1d442977 drm/amd/display: Revert "Guard ASSR with internal display flag" in the 5.13 repo or changelogs, even though it was cced to stable. Adding 715bfff397634c44d616e27e11c873be1d442977 to 5.13 might be needed to fix this problem. I commented on the above in my upstream report https://gitlab.freedesktop.org/drm/amd/-/issues/1644#note_1038663

Thanks.
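The manual ChangeLog check described above could be scripted roughly as follows. This is a minimal sketch that assumes the usual stable-tree convention in which a backported commit message carries a line such as `commit <sha> upstream.` or `[ Upstream commit <sha> ]`. The function name `upstream_to_stable` is hypothetical, and `sample` is a hand-written stand-in built from the hashes quoted in this report, not real ChangeLog text.

```python
import re
from typing import Dict

# Entry header in a ChangeLog: "commit <40 hex digits>" alone on a line.
_ENTRY = re.compile(r"^commit ([0-9a-f]{40})$", re.MULTILINE)
# Upstream reference inside a stable commit message (assumed conventions).
_UPSTREAM = re.compile(
    r"^\s*(?:commit ([0-9a-f]{40}) upstream\.?"
    r"|\[ Upstream commit ([0-9a-f]{40}) \])\s*$",
    re.MULTILINE,
)


def upstream_to_stable(changelog: str) -> Dict[str, str]:
    """Map each upstream SHA referenced in the changelog to its stable SHA."""
    mapping: Dict[str, str] = {}
    headers = list(_ENTRY.finditer(changelog))
    for i, header in enumerate(headers):
        # An entry's body runs until the next entry header or the end of the text.
        end = headers[i + 1].start() if i + 1 < len(headers) else len(changelog)
        body = changelog[header.end():end]
        for ref in _UPSTREAM.finditer(body):
            mapping[ref.group(1) or ref.group(2)] = header.group(1)
    return mapping


# Hypothetical excerpt in the assumed format (not copied from ChangeLog-5.13.9).
sample = """\
commit fad0494f626f1a6b2ea76cd7c6d137d1b4961636
Author: Stylon Wang <stylon.wang>
Date:   Wed Jul 21 12:25:24 2021 +0800

    drm/amd/display: Fix ASSR regression on embedded panels

    commit 6be50f5d83adc9541de3d5be26e968182b5ac150 upstream.
"""

mapping = upstream_to_stable(sample)
print(mapping.get("6be50f5d83adc9541de3d5be26e968182b5ac150"))
print("715bfff397634c44d616e27e11c873be1d442977" in mapping)
```

Run against the concatenated text of the real 5.13.y ChangeLog files, a lookup like this would show at a glance which of the two upstream fixes had been backported, provided those files follow the assumed format.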
This message is a reminder that Fedora Linux 34 is nearing its end of life. Fedora will stop maintaining and issuing updates for Fedora Linux 34 on 2022-06-07. It is Fedora's policy to close all bug reports from releases that are no longer maintained. At that time this bug will be closed as EOL if it remains open with a 'version' of '34'.

Package Maintainer: If you wish for this bug to remain open because you plan to fix it in a currently maintained version, change the 'version' to a later Fedora Linux version.

Thank you for reporting this issue and we are sorry that we were not able to fix it before Fedora Linux 34 is end of life. If you would still like to see this bug fixed and are able to reproduce it against a later version of Fedora Linux, you are encouraged to change the 'version' to a later version prior to this bug being closed.

Fedora Linux 34 entered end-of-life (EOL) status on 2022-06-07. Fedora Linux 34 is no longer maintained, which means that it will not receive any further security or bug fix updates. As a result we are closing this bug.

If you can reproduce this bug against a currently maintained version of Fedora please feel free to reopen this bug against that version. If you are unable to reopen this bug, please file a new report against the current release.

Thank you for reporting this bug and we are sorry it could not be fixed.