Bug 1727482 - kwin_wayland segmentation faults and black screen when logging out of Plasma on Wayland
Summary: kwin_wayland segmentation faults and black screen when logging out of Plasma ...
Keywords:
Status: CLOSED EOL
Alias: None
Product: Fedora
Classification: Fedora
Component: mesa
Version: 32
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Assignee: Daniel Vrátil
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2019-07-06 21:27 UTC by Matt Fagnani
Modified: 2021-05-25 15:01 UTC
CC List: 16 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2021-05-25 15:01:55 UTC
Type: Bug
Embargoed:


Attachments
journal from logging out of Plasma on Wayland when the black screen issue occurred (74.10 KB, text/plain)
2019-07-06 21:30 UTC, Matt Fagnani


Links
System                    ID                     Private  Priority  Status  Summary  Last Updated
KDE Software Compilation  372789                 0        None      None    None     2019-07-06 21:27:36 UTC
KDE Software Compilation  416147                 0        None      None    None     2020-01-12 06:17:09 UTC
freedesktop.org Gitlab    mesa/mesa/issues/2342  0        None      None    None     2020-01-13 03:43:12 UTC

Description Matt Fagnani 2019-07-06 21:27:36 UTC
Description of problem:

I've logged out of Plasma 5.15.5 on Wayland in F30 many times, and the screen stayed black. sddm didn't show up. I tried to switch to another VT with ctrl-alt-f2 etc., which did nothing. I pressed sysrq+alt+e and then sysrq+alt+i, which terminated and then killed most of the user-space processes. sddm restarted after that.

The journal had an error in systemd-logind at the time of the blank screen. This error has appeared on 12 different occasions when I logged out of Plasma on Wayland.
systemd-logind[1006]: Failed to restore VT, ignoring: Input/output error
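
A minimal sketch of how occurrences of this error could be counted in a journal saved as text. The function name and the second sample line are hypothetical placeholders; only the systemd-logind line is taken from this report.

```python
import re

# Matches the systemd-logind error quoted above; the PID in brackets can differ per boot.
VT_ERROR = re.compile(r"systemd-logind\[\d+\]: Failed to restore VT, ignoring: ")

def count_vt_restore_failures(journal_text):
    """Count the lines of a saved journal that contain the VT restore error."""
    return sum(1 for line in journal_text.splitlines() if VT_ERROR.search(line))

sample = (
    "systemd-logind[1006]: Failed to restore VT, ignoring: Input/output error\n"
    "placeholder: any unrelated journal line\n"  # hypothetical, not from the attachment
)
failures = count_vt_restore_failures(sample)
```

Running this over one saved journal per logout would show whether the error lines up with each black screen.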

I'll attach the journal from when the black screen issue occurred along with a segmentation fault in powerdevil I reported at https://bugzilla.redhat.com/show_bug.cgi?id=1727470

Version-Release number of selected component (if applicable):

kf5-kwayland-0:5.59.0-2.fc30.x86_64
kwayland-integration-0:5.15.5-1.fc30.x86_64
plasma-desktop-0:5.15.5-1.fc30.x86_64
qt5-qtwayland-0:5.12.4-2.fc30.x86_64
sddm-0:0.18.1-2.fc30.x86_64

How reproducible:
The black screen has occurred the majority of the times I've logged out of Plasma 5.15.5 on Wayland. This issue hasn't happened when I've logged out of Plasma on X.

Steps to Reproduce:
1. Boot F30 Plasma spin fully updated with updates-testing enabled
2. Log in to Plasma on Wayland from sddm
3. Log out of Plasma
4. Wait a minute
5. Press sysrq+alt+e, then sysrq+alt+i

Actual results:
Black screen when logging out of Plasma 5.15.5 on Wayland

Expected results:
sddm is shown


Additional info:

The black screen problem seems to have been reported at https://bugs.kde.org/show_bug.cgi?id=372789 

A patch to fix this issue for kwayland-integration was written by David Edmundson for Plasma 5.16.3
https://cgit.kde.org/kwayland-integration.git/commit/?id=bfce3c6727cdc58a2b8ba33c933df05e21914876
https://bugs.kde.org/show_bug.cgi?id=372789#c46

I've seen crashes of powerdevil and kglobalaccel5 at about the same time as this issue, which I reported at
https://bugzilla.redhat.com/show_bug.cgi?id=1727470
https://bugzilla.redhat.com/show_bug.cgi?id=1713467
https://bugzilla.redhat.com/show_bug.cgi?id=1701485#c27

I don't know if those crashes are related to the blank screen issue.

Comment 1 Matt Fagnani 2019-07-06 21:30:07 UTC
Created attachment 1587967
journal from logging out of Plasma on Wayland when the black screen issue occurred

Comment 2 Matt Fagnani 2020-01-03 05:13:08 UTC
When I've logged out of Plasma 5.17.4 on Wayland in rawhide with KF 5.65 and Qt 5.13.2, a black screen happened about 30-50% of the time. I've needed to use sysrq+alt+e then sysrq+alt+i to get back to sddm. I guess that the problem is different from that addressed by the kwayland-integration patches for 5.16.3. I'm reassigning this report to plasma-desktop as I'm unsure where the issue is. The user session might not be ending properly. About 20 KDE programs have aborted every time I've logged out of Plasma on Wayland since 5.16.5. I don't know if those crashes are related to the black screen problem.

Comment 3 Matt Fagnani 2020-01-10 07:41:21 UTC
I've seen kwin_wayland segmentation faults in the journal with black screens when logging out of Plasma 5.17.4 on Wayland in rawhide with KF 5.65 and Qt 5.13.2. The core dumps were 2.1 GB uncompressed and were being truncated because the systemd-coredump default core dump limit was 2 GB. I changed /etc/systemd/coredump.conf to have ProcessSizeMax=3G and ExternalSizeMax=3G, then logged out of Plasma on Wayland again and got the full kwin_wayland core dump. Using coredumpctl gdb, the kwin_wayland segmentation fault in frame #0 was at address 0x0000560ba76632b0, and the value stored there, 0x5a200000, is itself an inaccessible address.
The functions in frames #1-11 appeared to be in Mesa 19.3.1, including the radeonsi driver. The GPU is an integrated AMD Radeon R5 using the radeonsi Mesa driver and the amdgpu kernel driver.

Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x0000560ba76632b0 in ?? ()
[Current thread is 1 (Thread 0x7f49e66c8e00 (LWP 1291))]
(gdb) bt
#0  0x0000560ba76632b0 in  ()
#1  0x00007f49c3418f23 in util_hash_table_remove (ht=0x560ba7717e00, key=0x560ba8ab3740)
    at ../src/gallium/auxiliary/util/u_hash_table.c:213
#2  0x00007f49c2d0ce01 in amdgpu_bo_destroy (_buf=0x560ba85817c0)
    at ../src/gallium/winsys/amdgpu/drm/amdgpu_bo.c:185
#3  0x00007f49c2cb9e6e in pb_destroy (buf=<optimized out>)
    at ../src/gallium/auxiliary/pipebuffer/pb_buffer.h:238
#4  pb_reference (src=0x0, dst=0x560ba87e59e0) at ../src/gallium/auxiliary/pipebuffer/pb_buffer.h:249
#5  si_texture_destroy (screen=<optimized out>, ptex=0x560ba87e5980)
    at ../src/gallium/drivers/radeonsi/si_texture.c:1125
#6  0x00007f49c2ad432c in pipe_resource_reference (src=0x0, dst=0x560ba8aba2e0)
    at ../src/gallium/auxiliary/util/u_inlines.h:148
#7  dri2_destroy_image (img=0x560ba8aba2e0) at ../src/gallium/state_trackers/dri/dri_helpers.c:318
#8  0x00007f49d086e7e7 in dri2_destroy_image_khr
    (drv=<optimized out>, disp=<optimized out>, image=0x560ba8a98ea0)
    at ../src/egl/drivers/dri2/egl_dri2.c:2941
#9  0x00007f49d086b7dd in _eglReleaseDisplayResources (drv=
    0x560ba78b2280, display=display@entry=0x560ba78b1790) at ../src/egl/main/egldisplay.c:483
#10 0x00007f49d08715bd in dri2_terminate (drv=<optimized out>, disp=0x560ba78b1790)
    at ../src/egl/drivers/dri2/egl_dri2.c:1130
#11 0x00007f49d0862b32 in eglTerminate (dpy=0x560ba78b1790) at ../src/egl/main/eglapi.c:675
#12 0x00007f49e7955818 in KWin::Platform::~Platform() (this=0x560ba76395a0, __in_chrg=<optimized out>)
    at /usr/src/debug/kwin-5.17.4-2.fc32.x86_64/platform.cpp:54
#13 0x00007f49d23a2c2d in KWin::DrmBackend::~DrmBackend()
    (this=0x560ba76395a0, __in_chrg=<optimized out>) at /usr/src/debug/kwin-5.17.4-2.fc32.x86_64/plugins/platforms/drm/drm_backend.cpp:88
#14 0x00007f49e696d8bc in QObjectPrivate::deleteChildren() (this=this@entry=0x560ba75ffda0) at kernel/qobject.cpp:2019
#15 0x00007f49e696e80f in QObject::~QObject() (this=<optimized out>, __in_chrg=<optimized out>) at kernel/qobject.cpp:1032
#16 0x00007f49e693e7ae in QCoreApplication::~QCoreApplication() (this=0x7ffd10a44bc0, __in_chrg=<optimized out>) at ../../include/QtCore/../../src/corelib/tools/qstringlist.h:99
#17 0x00007f49e497ce1a in QGuiApplication::~QGuiApplication() (this=0x7ffd10a44bc0, __in_chrg=<optimized out>) at kernel/qguiapplication.cpp:697
#18 0x00007f49e6d744ee in QApplication::~QApplication() (this=0x7ffd10a44bc0, __in_chrg=<optimized out>) at kernel/qapplication.cpp:841
#19 0x0000560ba7142cd2 in main(int, char**) (argc=<optimized out>, argv=<optimized out>) at /usr/include/c++/9/bits/atomic_base.h:326

(gdb) x 0x0000560ba76632b0
0x560ba76632b0: 0x5a200000
(gdb) x 0x5a200000
0x5a200000:     Cannot access memory at address 0x5a200000

I'm reassigning this to kwin, but the underlying problem of the crash above might be in Mesa. I've seen other kwin_wayland segmentation faults when logging out in libraries like libwayland-client, but the core dumps were truncated. 

I'm seeing kwin_wayland aborts and segmentation faults when I shut down or reboot which might be related. I reported those kwin_wayland crashes originally at https://bugzilla.redhat.com/show_bug.cgi?id=1728716
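
A minimal sketch of the size check behind the coredump.conf change described in this comment. It assumes the two options live in the [Coredump] section and that the K/M/G/T suffixes are base 1024; the helper names are hypothetical.

```python
import configparser

# Base-1024 multipliers; assumed to match how systemd reads size suffixes.
_UNITS = {"K": 1024, "M": 1024**2, "G": 1024**3, "T": 1024**4}

def parse_size(text):
    """Convert a value such as '3G' or '2048' to a byte count."""
    text = text.strip()
    if text and text[-1].upper() in _UNITS:
        return int(text[:-1]) * _UNITS[text[-1].upper()]
    return int(text)

def coredump_limits(conf_text):
    """Return ProcessSizeMax and ExternalSizeMax, in bytes, from a coredump.conf body."""
    parser = configparser.ConfigParser()
    parser.optionxform = str  # keep option names case-sensitive
    parser.read_string(conf_text)
    section = parser["Coredump"]
    return {key: parse_size(section[key]) for key in ("ProcessSizeMax", "ExternalSizeMax")}

SAMPLE = """\
[Coredump]
ProcessSizeMax=3G
ExternalSizeMax=3G
"""

limits = coredump_limits(SAMPLE)
dump_size = int(2.1 * 1024**3)  # roughly the uncompressed dump size reported above
fits = all(dump_size <= limit for limit in limits.values())
```

With the 3G values the 2.1 GB dump fits under both limits, whereas a 2G limit would be smaller than the dump, which is consistent with the truncation seen before the change.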

Comment 4 Matt Fagnani 2020-01-12 06:17:10 UTC
Another kwin_wayland segmentation fault when logging out of Plasma on Wayland with KF5 5.66.0 happened in cso_hash_find_node at ../src/gallium/auxiliary/cso_cache/cso_hash.c:271 in mesa-dri-drivers-19.3.2-1.fc32.x86_64. Frames #0-13 were in Mesa.

Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x00007f8c5eb39a0a in cso_hash_find_node (akey=2880521741, hash=0x561c1142a9c0)
    at ../src/gallium/auxiliary/cso_cache/cso_hash.c:271
271        struct cso_node **nextNode = cso_hash_find_node(hash, key);
[Current thread is 1 (Thread 0x7f8c826f2e00 (LWP 1303))]

(gdb) bt
#0  0x00007f8c5eb39a0a in cso_hash_find_node (akey=2880521741, hash=0x561c1142a9c0)
    at ../src/gallium/auxiliary/cso_cache/cso_hash.c:271
#1  cso_hash_find (hash=0x561c1142a9c0, key=2880521741)
    at ../src/gallium/auxiliary/cso_cache/cso_hash.c:271
#2  0x00007f8c5f41af2d in util_hash_table_find_iter
    (ht=0x561c11398b80, ht=0x561c11398b80, key_hash=<optimized out>, key=0x561c12355b80)
    at ../src/gallium/auxiliary/util/u_hash_table.c:215
#3  util_hash_table_remove (ht=0x561c11398b80, key=0x561c12355b80)
    at ../src/gallium/auxiliary/util/u_hash_table.c:215
#4  0x00007f8c5ed0ee01 in amdgpu_bo_destroy (_buf=0x561c124a5030)
    at ../src/gallium/winsys/amdgpu/drm/amdgpu_bo.c:185
#5  0x00007f8c5ecbbe6e in pb_destroy (buf=<optimized out>)
    at ../src/gallium/auxiliary/pipebuffer/pb_buffer.h:238
#6  pb_reference (src=0x0, dst=0x561c120cb820) at ../src/gallium/auxiliary/pipebuffer/pb_buffer.h:249
#7  si_texture_destroy (screen=<optimized out>, ptex=0x561c120cb7c0)
    at ../src/gallium/drivers/radeonsi/si_texture.c:1125
#8  0x00007f8c5ead632c in pipe_resource_reference (src=0x0, dst=0x561c125ab6b0)
    at ../src/gallium/auxiliary/util/u_inlines.h:148
#9  dri2_destroy_image (img=0x561c125ab6b0) at ../src/gallium/state_trackers/dri/dri_helpers.c:318
#10 0x00007f8c6c8c07e7 in dri2_destroy_image_khr
    (drv=<optimized out>, disp=<optimized out>, image=0x561c124e1320)
    at ../src/egl/drivers/dri2/egl_dri2.c:2941
#11 0x00007f8c6c8bd7dd in _eglReleaseDisplayResources (drv=
    0x561c11531f50, display=display@entry=0x561c11531460) at ../src/egl/main/egldisplay.c:483
#12 0x00007f8c6c8c35bd in dri2_terminate (drv=<optimized out>, disp=0x561c11531460)
    at ../src/egl/drivers/dri2/egl_dri2.c:1130
#13 0x00007f8c6c8b4b32 in eglTerminate (dpy=0x561c11531460) at ../src/egl/main/eglapi.c:675
#14 0x00007f8c839af818 in KWin::Platform::~Platform() (this=0x561c112c92b0, __in_chrg=<optimized out>)
    at /usr/src/debug/kwin-5.17.4-2.fc32.x86_64/platform.cpp:54
#15 0x00007f8c6e3f4c2d in KWin::DrmBackend::~DrmBackend()
    (this=0x561c112c92b0, __in_chrg=<optimized out>)
    at /usr/src/debug/kwin-5.17.4-2.fc32.x86_64/plugins/platforms/drm/drm_backend.cpp:88
#16 0x00007f8c829c78bc in QObjectPrivate::deleteChildren() (this=this@entry=0x561c1127eda0)
    at kernel/qobject.cpp:2019
#17 0x00007f8c829c880f in QObject::~QObject() (this=<optimized out>, __in_chrg=<optimized out>)
    at kernel/qobject.cpp:1032
#18 0x00007f8c829987ae in QCoreApplication::~QCoreApplication()
    (this=0x7ffe0d7c0a90, __in_chrg=<optimized out>)
    at ../../include/QtCore/../../src/corelib/tools/qstringlist.h:99
#19 0x00007f8c809d6e1a in QGuiApplication::~QGuiApplication()
    (this=0x7ffe0d7c0a90, __in_chrg=<optimized out>) at kernel/qguiapplication.cpp:697
#20 0x00007f8c82dce4ee in QApplication::~QApplication()
    (this=0x7ffe0d7c0a90, __in_chrg=<optimized out>) at kernel/qapplication.cpp:841
#21 0x0000561c0fd8acd2 in main(int, char**) (argc=<optimized out>, argv=<optimized out>)
    at /usr/include/c++/9/bits/atomic_base.h:326

About 20 KDE programs aborted with errors that the Wayland connection was lost. I saw another kwin_wayland logout crash with the same trace as in comment 3. I reported these crashes at https://bugs.kde.org/show_bug.cgi?id=416147
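
A minimal sketch of how the frames in traces like the ones above could be sorted into Mesa and non-Mesa frames by source path. It assumes Mesa frames are those built from paths under ../src/gallium/ or ../src/egl/; the function names are hypothetical, and the sample frames are copied from comment 3.

```python
import re

# A gdb frame header, with or without the "0x... in" address part.
FRAME = re.compile(r"^#(\d+)\s+(?:0x[0-9a-f]+ in )?([\w:~]+)")
# An "at <file>:<line>" location at the end of a line.
LOCATION = re.compile(r"\bat (\S+):(\d+)\s*$")

def frames_with_sources(backtrace):
    """Return (frame number, function, source path or None) for each frame."""
    frames = []
    for line in backtrace.splitlines():
        header = FRAME.match(line)
        if header:
            frames.append([int(header.group(1)), header.group(2), None])
        location = LOCATION.search(line)
        if location and frames:
            frames[-1][2] = location.group(1)
    return [tuple(frame) for frame in frames]

def is_mesa(path):
    # Assumption: Mesa sources appear under these relative build paths.
    return path is not None and path.startswith(("../src/gallium/", "../src/egl/"))

SAMPLE = """\
#2  0x00007f49c2d0ce01 in amdgpu_bo_destroy (_buf=0x560ba85817c0)
    at ../src/gallium/winsys/amdgpu/drm/amdgpu_bo.c:185
#4  pb_reference (src=0x0, dst=0x560ba87e59e0) at ../src/gallium/auxiliary/pipebuffer/pb_buffer.h:249
#11 0x00007f49d0862b32 in eglTerminate (dpy=0x560ba78b1790) at ../src/egl/main/eglapi.c:675
#12 0x00007f49e7955818 in KWin::Platform::~Platform() (this=0x560ba76395a0, __in_chrg=<optimized out>)
    at /usr/src/debug/kwin-5.17.4-2.fc32.x86_64/platform.cpp:54
"""

frames = frames_with_sources(SAMPLE)
mesa_frames = [number for number, _, path in frames if is_mesa(path)]
```

Frames without symbols, such as frame #0 in comment 3, have no source path and are left unclassified by this sketch.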

Comment 5 Rex Dieter 2020-01-12 19:53:15 UTC
I recommend continuing to report these upstream, at least unless there's evidence that it's a Fedora-specific issue.

That said, these last 2 crashes are inside mesa/radeon-driver

Comment 6 Matt Fagnani 2020-01-13 03:43:12 UTC
(In reply to Rex Dieter from comment #5)
> I recommend continuing to report these upstream, at least unless there's
> evidence that it's a Fedora-specific issue.
> 
> That said, these last 2 crashes are inside mesa/radeon-driver

Thanks Rex. I'm reassigning this report to mesa. I've reported the crashes at https://gitlab.freedesktop.org/mesa/mesa/issues/2342

Comment 7 Ben Cotton 2020-02-11 15:41:10 UTC
This bug appears to have been reported against 'rawhide' during the Fedora 32 development cycle.
Changing version to 32.

Comment 8 Fedora Program Management 2021-04-29 15:56:00 UTC
This message is a reminder that Fedora 32 is nearing its end of life.
Fedora will stop maintaining and issuing updates for Fedora 32 on 2021-05-25.
It is Fedora's policy to close all bug reports from releases that are no longer
maintained. At that time this bug will be closed as EOL if it remains open with a
Fedora 'version' of '32'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version' 
to a later Fedora version.

Thank you for reporting this issue and we are sorry that we were not 
able to fix it before Fedora 32 is end of life. If you would still like 
to see this bug fixed and are able to reproduce it against a later version 
of Fedora, you are encouraged to change the 'version' to a later Fedora 
version before this bug is closed as described in the policy above.

Although we aim to fix as many bugs as possible during every release's 
lifetime, sometimes those efforts are overtaken by events. Often a 
more recent Fedora release includes newer upstream software that fixes 
bugs or makes them obsolete.

Comment 9 Ben Cotton 2021-05-25 15:01:55 UTC
Fedora 32 changed to end-of-life (EOL) status on 2021-05-25. Fedora 32 is
no longer maintained, which means that it will not receive any further
security or bug fix updates. As a result we are closing this bug.

If you can reproduce this bug against a currently maintained version of
Fedora please feel free to reopen this bug against that version. If you
are unable to reopen this bug, please file a new report against the
current release. If you experience problems, please add a comment to this
bug.

Thank you for reporting this bug and we are sorry it could not be fixed.