1636060 – Crash when starting g-c-c on X11 with two external monitors and lid shut

Keywords:
Status:	CLOSED CURRENTRELEASE
Alias:	None
Product:	Fedora
Classification:	Fedora
Component:	mutter
Sub Component:
Version:	29
Hardware:	Unspecified
OS:	Unspecified
Priority:	unspecified
Severity:	unspecified
Target Milestone:	---
Assignee:	Florian Müllner
QA Contact:	Fedora Extras Quality Assurance
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:

Reported:	2018-10-04 11:10 UTC by Benjamin Berg
Modified:	2019-11-13 14:02 UTC
CC List:	6 users
Fixed In Version:
Clone Of:
Environment:
Last Closed:	2019-11-13 14:02:35 UTC
Type:	Bug
Embargoed:
Dependent Products:

Attachments
DBus monitor (680.84 KB, text/plain) 2018-10-04 11:56 UTC, Benjamin Berg	no flags
Monitors XML that triggers the crash (55.37 KB, text/plain) 2018-10-04 15:29 UTC, Benjamin Berg	no flags
log file from a login and starting g-c-c with extra debug output (166.08 KB, text/plain) 2018-10-11 15:15 UTC, Benjamin Berg	no flags

Links
System	ID	Private	Priority	Status	Summary	Last Updated
GNOME Gitlab	https://gitlab.gnome.org/GNOME/mutter/issues/330	0	None	None	None	2018-10-04 13:04:28 UTC

Description Benjamin Berg 2018-10-04 11:10:56 UTC

I am seeing a crash which looks somewhat similar to bug #1630943. This issue can still be seen with mutter 3.30.0-3.

Steps to reproduce:
 * Install mutter 3.30.0-3 (otherwise you will run into #1630943 instead)
 * In GDM, select X11 session
 * Log in user
 * Try to modify the display configuration in control-center

I get crashes from gnome-shell. There isn't much in the logs, and it appears the triggering calls are coming out of gjs.

Some selected log messages:

Oct 04 10:58:14 ben-x1 kernel: gnome-shell[2318]: segfault at 18 ip 00007fda2b1c6d89 sp 00007ffee0fce1f0 error 4 in libmutter-3.so.0.0.0[7fda2b195000+ea000]
Oct 04 10:58:14 ben-x1 kernel: Code: e8 9c ef fd ff 85 c0 74 e8 48 89 ef e8 70 76 fe ff 4c 89 e7 48 89 c6 e8 05 cc fd ff 85 c0 74 d1 48 89 ef e8 e9 76 fe ff 5b 5d <8b> 40 18 41 5>
Oct 04 10:58:15 ben-x1 gnome-session[2221]: gnome-session-binary[2221]: WARNING: Application 'org.gnome.Shell.desktop' killed by signal 11
Oct 04 10:58:16 ben-x1 systemd-coredump[3106]: Process 2318 (gnome-shell) of user 1000 dumped core.
                                               
                                               Stack trace of thread 2318:
                                               #0  0x00007fda2b1c6d89 meta_monitor_manager_get_monitor_for_connector (libmutter-3.so.0)
                                               #1  0x00007fda2a543ace ffi_call_unix64 (libffi.so.6)
                                               #2  0x00007fda2a54348f ffi_call (libffi.so.6)
                                               #3  0x00007fda2b4f6781 n/a (libgjs.so.0)
                                               #42 0x00007fda2be9132a g_signal_emit_valist (libgobject-2.0.so.0)
                                               #43 0x00007fda2be91923 g_signal_emit (libgobject-2.0.so.0)
                                               #44 0x00007fda2b528029 n/a (libgjs.so.0)
                                               #45 0x00007fda2bf94396 n/a (libgio-2.0.so.0)
                                               #46 0x00007fda2bf7bb20 n/a (libgio-2.0.so.0)
                                               #47 0x00007fda2bd8fb7b n/a (libglib-2.0.so.0)
                                               #48 0x00007fda2bd9326d g_main_context_dispatch (libglib-2.0.so.0)
                                               #49 0x00007fda2bd93638 n/a (libglib-2.0.so.0)
                                               #50 0x00007fda2bd93962 g_main_loop_run (libglib-2.0.so.0)
                                               #51 0x00007fda2b1fc1b0 meta_run (libmutter-3.so.0)
                                               #52 0x0000556ca92f3b96 n/a (gnome-shell)
                                               #53 0x00007fda2af76413 __libc_start_main (libc.so.6)
                                               #54 0x0000556ca92f3cee n/a (gnome-shell)

Comment 1 Jonas Ådahl 2018-10-04 11:13:24 UTC

Can I have a backtrace with debug symbols too?

Comment 2 Benjamin Berg 2018-10-04 11:49:14 UTC

So, it appears to only happen if I use a specific monitors.xml. When that is used, I am only presented with one monitor after logging in.

gdb says the following:

0x00007f3899cd7d89 in meta_monitor_manager_get_monitor_for_connector (manager=<optimized out>, connector=0x563947bb3100 "eDP-1") at backends/meta-monitor-manager.c:2894
2894            return meta_monitor_get_logical_monitor (monitor)->number;

Comment 3 Benjamin Berg 2018-10-04 11:50:33 UTC

The crash happens immediately when opening gnome-control-center in the display panel. I guess it has to do with g-c-c submitting the current configuration back for verification.

Comment 4 Benjamin Berg 2018-10-04 11:56:12 UTC

Created attachment 1490485 [details]
DBus monitor

I guess this could be interesting, as the query to check the configuration seems to trigger the crash.

Oh, maybe this is related to the monitor label showing in gnome-shell!

Comment 5 Jonas Ådahl 2018-10-04 13:01:35 UTC

Ah, it'll try to show the label on the disabled monitor. Could be a race condition: g-c-c reads the state while the lid is open and shows the labels, then the lid is closed, but it still wants to show the eDP-1 label. Since that monitor is no longer active, things go bad. We should probably just NULL check to avoid that issue.

Comment 6 Jonas Ådahl 2018-10-04 14:43:03 UTC

Yes, I think the crash comes from showing the monitor label. The issue seems to be that something is causing a monitor (I guess the laptop panel) to be technically active (it has a mode) without being really active (being displayed). This is done on purpose when the lid is closed and there are no external monitors, but when there are external monitors, the laptop panel should be deactivated for real.
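
For illustration, the NULL check suggested in comment 5 could look roughly like the sketch below. This is not mutter's code: the Monitor and LogicalMonitor structs and monitor_number_for_connector() are hypothetical stand-ins, and returning -1 for a monitor that is not displayed is an assumption. The kernel line "segfault at 18" together with the faulting instruction bytes <8b> 40 18 appears consistent with reading a field at offset 0x18 through a NULL pointer, which is the case the check guards against.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for mutter's internal types (not the real structs). */
typedef struct {
    int number;
} LogicalMonitor;

typedef struct {
    const char *connector;
    /* Assumption: NULL when the monitor has a mode but is not displayed. */
    LogicalMonitor *logical_monitor;
} Monitor;

/*
 * Look up the logical monitor number for a connector name.
 * Returns -1 when the connector is unknown or when the matching monitor
 * has no logical monitor, instead of dereferencing a NULL pointer.
 */
int monitor_number_for_connector(const Monitor *monitors, size_t n_monitors,
                                 const char *connector)
{
    for (size_t i = 0; i < n_monitors; i++) {
        const Monitor *monitor = &monitors[i];

        if (strcmp(monitor->connector, connector) != 0)
            continue;

        /* The guard: a matching but undisplayed monitor yields -1. */
        if (monitor->logical_monitor == NULL)
            return -1;

        return monitor->logical_monitor->number;
    }

    return -1;
}
```

With a guard like this, a caller asking for "eDP-1" while the lid is shut would get -1 and could skip showing the label. It only hides the symptom, though; it does not explain why the panel ends up half-active in the first place.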

Comment 7 Jonas Ådahl 2018-10-04 15:26:14 UTC

Could you attach your monitors.xml?

Comment 8 Benjamin Berg 2018-10-04 15:29:08 UTC

Created attachment 1490633 [details]
Monitors XML that triggers the crash

Comment 9 Adam Williamson 2018-10-04 18:46:26 UTC

jadahl: from comment #6, could this have the same root cause as https://bugzilla.redhat.com/show_bug.cgi?id=1630367 , possibly? that one seems to fail in X modesetting, so perhaps it's this same problematic 'technically active' monitor - we wind up trying to set a mode for it when we shouldn't, and X fails on that?

Comment 10 Jonas Ådahl 2018-10-05 06:33:43 UTC

(In reply to Adam Williamson from comment #9)
> jadahl: from comment #6, could this have the same root cause as
> https://bugzilla.redhat.com/show_bug.cgi?id=1630367 , possibly? that one
> seems to fail in X modesetting, so perhaps it's this same problematic
> 'technically active' monitor - we wind up trying to set a mode for it when
> we shouldn't, and X fails on that?

I don't think they are related. That bug seems to be about the X server not being able to set a mode, while this is hardware state representation internal to mutter that seems to have some issue.

Comment 11 Adam Williamson 2018-10-05 06:48:43 UTC

well, my thought was that there must be a *reason* X cannot set a mode, and perhaps that reason is that it's being asked to set a mode on a display it shouldn't be asked to set a mode on (the one that's not really active).

Comment 12 Jonas Ådahl 2018-10-05 09:50:40 UTC

Benjamin, could you try https://koji.fedoraproject.org/koji/taskinfo?taskID=30057987 ? I haven't managed to reproduce locally, but I have a suspicion at least.

Comment 13 Benjamin Berg 2018-10-09 08:17:44 UTC

Hmm, looks like the crash is still happening with mutter-3.30.0-4.fc29 from the koji build.

Thread 1 "gnome-shell" received signal SIGSEGV, Segmentation fault.
0x00007f1b834e5d89 in meta_monitor_manager_get_monitor_for_connector (manager=<optimized out>, 
    connector=0x563ff4bc0ce0 "eDP-1") at backends/meta-monitor-manager.c:2894
2894	        return meta_monitor_get_logical_monitor (monitor)->number;

Comment 14 Benjamin Berg 2018-10-09 12:16:25 UTC

Hm, not sure if this is clear.

With the monitors.xml in question the initial configuration is already broken, i.e. I only get one of my two external displays enabled in the first place rather than both.

Comment 15 Benjamin Berg 2018-10-11 15:15:22 UTC

Created attachment 1492958 [details]
log file from a login and starting g-c-c with extra debug output

Some notes from IRC:

16:44 < jadahl> benzea: thanks, that's the correct crash :)
16:47 < jadahl> benzea: which ones of your monitors are turned on when it crashes?
16:48 < benzea> jadahl: the smaller DP one is turned on, the larger is off
16:48 < benzea> the builtin one, not sure, could try checking without lifting it enough to open the lid
16:49 < benzea> so DELL U2515H is off
16:50 < jadahl> do you know whether /usr/libexec/gdm-x-session[30804] is the one of the login session or gdms?
16:50 < jadahl> or do gdm use wayland?
16:50 < benzea> pretty sure that is the login session
16:50 < benzea> I ran journalctl as the test user there
16:50 < jadahl> then the issue seems to be that the driver fails to modeset
16:50 < jadahl> and mutter assumes it succeeded, resulting in the wrong idea of what is what
16:52 < benzea> yeah, 30804 is the login session, and gdm should be on wayland
16:52 < benzea> hm, not sure if that might be relevant, but the DELL U2515H which does not turn on is the first in the DP daisy chain
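
To make the suspicion from the IRC log concrete (the driver fails to modeset and mutter assumes it succeeded), here is a minimal sketch of recording the result of each modeset instead of assuming success. All names here (Output, ModesetFunc, apply_config, fake_modeset) are hypothetical and do not correspond to mutter or X server APIs. The fake driver hook simply fails for "DP-1" to imitate an output that does not turn on.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical model of one output as the compositor tracks it. */
typedef struct {
    const char *connector;
    bool wanted_on; /* what the configuration asks for */
    bool is_on;     /* what the compositor believes the hardware state is */
} Output;

/* Hypothetical driver hook: returns true if the modeset succeeded. */
typedef bool (*ModesetFunc)(const Output *output);

/* Fake driver for illustration only: the modeset fails for "DP-1". */
bool fake_modeset(const Output *output)
{
    return strcmp(output->connector, "DP-1") != 0;
}

/*
 * Apply the wanted configuration and return the number of failed modesets.
 * An output is marked as on only if its modeset actually succeeded, so the
 * tracked state cannot drift away from the hardware state.
 */
size_t apply_config(Output *outputs, size_t n_outputs, ModesetFunc modeset)
{
    size_t failures = 0;

    for (size_t i = 0; i < n_outputs; i++) {
        if (!outputs[i].wanted_on) {
            outputs[i].is_on = false;
            continue;
        }

        if (modeset(&outputs[i])) {
            outputs[i].is_on = true;
        } else {
            outputs[i].is_on = false;
            failures++;
        }
    }

    return failures;
}
```

A non-zero return value would tell the caller that the configuration was only partially applied, so it could fall back to another configuration rather than carry on with a wrong idea of which outputs are lit.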

Comment 16 Ben Cotton 2019-10-31 20:38:49 UTC

This message is a reminder that Fedora 29 is nearing its end of life.
Fedora will stop maintaining and issuing updates for Fedora 29 on 2019-11-26.
It is Fedora's policy to close all bug reports from releases that are no longer
maintained. At that time this bug will be closed as EOL if it remains open with a
Fedora 'version' of '29'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version' 
to a later Fedora version.

Thank you for reporting this issue and we are sorry that we were not 
able to fix it before Fedora 29 is end of life. If you would still like 
to see this bug fixed and are able to reproduce it against a later version 
of Fedora, you are encouraged to change the 'version' to a later Fedora 
version before this bug is closed as described in the policy above.

Although we aim to fix as many bugs as possible during every release's 
lifetime, sometimes those efforts are overtaken by events. Often a 
more recent Fedora release includes newer upstream software that fixes 
bugs or makes them obsolete.

Comment 17 Kamil Páral 2019-11-13 13:42:39 UTC

I tested this in Fedora 31 and it looks resolved.

Comment 18 Jonas Ådahl 2019-11-13 14:02:35 UTC

Closing as per comment 17.
