Bug 1901610

Summary: On system boot, gnome shell crashes after log in
Product: [Fedora] Fedora
Component: gnome-shell
Version: 33
Hardware: x86_64
OS: Linux
Status: CLOSED UPSTREAM
Severity: medium
Priority: unspecified
Type: Bug
Reporter: Brad Smith <bradley.g.smith>
Assignee: Florian Müllner <fmuellner>
QA Contact: Fedora Extras Quality Assurance <extras-qa>
CC: fmuellner, gnome-sig, jadahl, otaylor, philip.wyett, ralph
Last Closed: 2020-11-29 14:57:34 UTC
Attachments (Description | Flags):
journalctl -elf -p err -b > dump.txt | none
backtrace results | none
journalctl -elf -p err -b > dump2.txt | none
backtrace full in gdb session | none
journalctl -elf _PID=4819 -b | none

Description Brad Smith 2020-11-25 16:12:12 UTC
Created attachment 1733390 [details]
journalctl -elf -p err -b > dump.txt

Description of problem: GNOME Shell crashes after authentication and displays the message "Oops" with instructions and a button to log out. After logging out and reauthenticating, GNOME Shell starts, but all extensions are disabled.


Version-Release number of selected component (if applicable):

gnome-shell-extension-horizontal-workspaces-3.38.1-1.fc33.noarch
gnome-shell-extension-background-logo-3.37.3-2.fc33.noarch
chrome-gnome-shell-10.1-10.fc33.x86_64
gnome-shell-extension-window-list-3.38.1-1.fc33.noarch
gnome-shell-theme-selene-3.4.0-19.fc33.noarch
gnome-shell-extension-launch-new-instance-3.38.1-1.fc33.noarch
gnome-shell-extension-user-theme-3.38.1-1.fc33.noarch
gnome-shell-extension-apps-menu-3.38.1-1.fc33.noarch
gnome-shell-extension-places-menu-3.38.1-1.fc33.noarch
gnome-shell-extension-common-3.38.1-1.fc33.noarch
gnome-shell-theme-yaru-20.10.6.1-1.fc33.noarch
gnome-shell-3.38.1-3.fc33.x86_64

How reproducible:

Every boot for last 3 days

Steps to Reproduce:
1. Boot system (or reboot)
2. Log in

Actual results:

After the "Oops" splash screen and a log out, the session starts on the next login, but all GNOME extensions are disabled.

Expected results:

Boot to gnome shell with all extensions working 

Additional info:

This started about 3 days ago. The system has otherwise been very stable. Attached is a file generated by "journalctl -elf -p err -b > dump.txt".

I use the NVIDIA drivers from RPM Fusion. This system was upgraded from F32 about 1 week after the F33 release.

Comment 1 Jonas Ådahl 2020-11-26 15:42:27 UTC
Install the debug symbols:

    sudo dnf debuginfo-install gnome-shell mutter glib2 gtk3

Then add

    export GDK_SYNCHRONIZE=1
    export MUTTER_VERBOSE=1

to ~/.bashrc, log out, then log back in again, then run

    sudo dnf install gdb
    coredumpctl gdb -r gnome-shell

and in the launched gdb session run

    backtrace

Then attach the result of that as a file here, as well as the new journal log content that comes before the crash.

Comment 2 Brad Smith 2020-11-26 16:44:28 UTC
Created attachment 1733842 [details]
backtrace results

As requested in needinfo - backtrace

Comment 3 Brad Smith 2020-11-26 16:45:42 UTC
Created attachment 1733843 [details]
journalctl -elf -p err -b > dump2.txt

result from journalctl -elf -p err -b. Please let me know if additional info is needed

Comment 4 Jonas Ådahl 2020-11-26 17:12:00 UTC
(In reply to Brad Smith from comment #3)
> result from journalctl -elf -p err -b. Please let me know if additional info
> is needed

Could you run the following:

    sudo dnf debuginfo-install libXfixes
    coredumpctl gdb 19007

then in gdb run

    backtrace full

It will hopefully provide us with the parameters sent to the X server that caused the bad request.
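
If attaching the result as a file is easier that way, the same full backtrace can probably be captured without an interactive gdb session. This is only a rough sketch, assuming the core dump is still stored by systemd-coredump; `save_full_backtrace` and its three arguments are hypothetical names made up for illustration, not part of coredumpctl or gdb:

```shell
# Hypothetical helper (not part of coredumpctl or gdb): write "backtrace full"
# for the newest core dump matching MATCH (a PID or a process name) to OUT.
save_full_backtrace() {
    match="$1"   # e.g. 19007 or gnome-shell
    exe="$2"     # path of the crashed binary, e.g. /usr/bin/gnome-shell
    out="$3"     # file to attach to the bug
    core=$(mktemp) || return 1
    # Extract the matching core dump into a temporary file.
    if ! coredumpctl --output="$core" dump "$match"; then
        rm -f "$core"
        return 1
    fi
    # Run gdb in batch mode: print the full backtrace, then exit.
    gdb -batch -ex 'backtrace full' "$exe" "$core" > "$out" 2>&1
    status=$?
    rm -f "$core"
    return "$status"
}
```

It could be called as `save_full_backtrace 19007 /usr/bin/gnome-shell backtrace-full.txt`, where the output file name is just an example.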

Comment 5 Brad Smith 2020-11-26 17:37:15 UTC
Created attachment 1733847 [details]
backtrace full in gdb session

As requested.  Thank you!

Brad

Comment 6 Jonas Ådahl 2020-11-27 09:52:32 UTC
(In reply to Brad Smith from comment #3)
> Created attachment 1733843 [details]
> journalctl -elf -p err -b > dump2.txt
> 
> result from journalctl -elf -p err -b. Please let me know if additional info
> is needed

Can you provide one without '-p err', limited to the gnome-shell process that crashed? For example, 'journalctl _PID=1234' if 1234 was the PID of the crashed gnome-shell. It seems the journal you attached had filtered out most of what gnome-shell should have logged.
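
One possible way to find that PID is to read it from the core dump metadata. The sketch below assumes `coredumpctl info` prints a line of the form "PID: <number> (<name>)"; `pid_from_coredump_info` is a hypothetical helper name, and the output file name in the usage comment is only an example:

```shell
# Hypothetical helper: read `coredumpctl info` output on stdin and print the
# PID of the first entry, taken from its "PID: <number> (<name>)" line.
pid_from_coredump_info() {
    awk '$1 == "PID:" { print $2; exit }'
}

# Usage on a real system (commented out so the sketch has no side effects):
#   pid=$(coredumpctl -1 info gnome-shell | pid_from_coredump_info)
#   journalctl -b _PID="$pid" > gnome-shell-journal.txt
```

Matching on `_PID` without `-p err` keeps every priority level, so warnings and debug messages from the crashed process are included.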

Comment 7 Brad Smith 2020-11-27 16:31:25 UTC
Created attachment 1734160 [details]
journalctl -elf _PID=4819 -b

journalctl -elf _PID=4819 -b

Good morning

this is the PID reported by Problem Reporting app after log in. "gnome-shell quit unexpectedly"

thank you

Comment 8 Jonas Ådahl 2020-11-27 16:48:33 UTC
(In reply to Brad Smith from comment #7)

Seems you have an extension called "Multi Monitors Add-On" enabled; does the problem go away if you disable it?

Either way, the issue seems to be caused by incorrect pointer barriers being created, and the newly attached journal suggests that extension is at fault, since it's the one creating the barrier.

I also created https://gitlab.gnome.org/GNOME/mutter/-/merge_requests/1611 which should log a warning instead of crashing when this happens.
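
To test without the extension, it can be disabled from a terminal with the `gnome-extensions` tool. The extension's exact UUID is not given in this report, so the sketch below matches on a substring instead; `disable_extensions_matching` and the `GNOME_EXTENSIONS` override variable are hypothetical names used only for illustration:

```shell
# Hypothetical helper: disable every enabled extension whose UUID contains
# PATTERN (case-insensitive), printing each UUID it disables.
# GNOME_EXTENSIONS is an assumed override for the command name (default:
# the real gnome-extensions tool).
disable_extensions_matching() {
    pattern="$1"
    ${GNOME_EXTENSIONS:-gnome-extensions} list --enabled | grep -i -- "$pattern" |
    while IFS= read -r uuid; do
        ${GNOME_EXTENSIONS:-gnome-extensions} disable "$uuid" &&
            printf 'disabled %s\n' "$uuid"
    done
}
```

For example, `disable_extensions_matching multi-monitor` would disable any enabled extension with "multi-monitor" in its UUID; check the printed UUIDs to confirm only the intended one was affected.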

Comment 9 Brad Smith 2020-11-29 14:57:34 UTC
Jonas - thank you for your analysis. You are correct, removing the multi-monitor gnome extension solved the problem. I will mark this as closed upstream if that is appropriate.

Comment 10 Jonas Ådahl 2020-11-29 17:21:01 UTC
Thanks for verifying. Mutter 3.38.2 will be less crash-prone here; the issue will instead just cause degraded functionality (missing pointer barriers). Also, if you haven't already, I recommend removing the two lines you added to .bashrc, as they may cause performance degradation.
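
If editing the file by hand is inconvenient, something like the following should remove them. It is a minimal sketch that assumes the lines were added exactly as `export GDK_SYNCHRONIZE=1` and `export MUTTER_VERBOSE=1`; `remove_debug_exports` is a hypothetical helper name:

```shell
# Hypothetical helper: delete the two debugging exports from a shell startup
# file (default: ~/.bashrc), keeping a backup copy with a .bak suffix.
remove_debug_exports() {
    file="${1:-$HOME/.bashrc}"
    sed -i.bak \
        -e '/^[[:space:]]*export GDK_SYNCHRONIZE=1[[:space:]]*$/d' \
        -e '/^[[:space:]]*export MUTTER_VERBOSE=1[[:space:]]*$/d' \
        "$file"
}
```

The change only takes effect for new sessions, so log out and back in afterwards.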

Comment 11 Brad Smith 2020-11-29 17:25:05 UTC
thanks for the reminder!