Bug 1212702

Summary: allow headless gnome-shell start
Product: Red Hat Enterprise Linux 7 Reporter: Vladimir Benes <vbenes>
Component: mutter Assignee: Florian Müllner <fmuellner>
Status: CLOSED ERRATA QA Contact: Desktop QE <desktop-qa-list>
Severity: unspecified Docs Contact:
Priority: unspecified    
Version: 7.2 CC: ayadav, mclasen, mdomonko, otaylor, rstrode, svashisht, tpelka, vrutkovs
Target Milestone: rc   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
: 1243856 1273984 Environment:
Last Closed: 2015-11-19 07:18:52 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 1243856, 1273984, 1273986    
Attachments:
Description Flags
File: backtrace none
Session log none

Description Vladimir Benes 2015-04-17 07:05:24 UTC
Description of problem:
We need to start gnome-shell without a monitor attached. For now we can use the dummy driver as a workaround for testing, but this has implications: policykit does not treat the session the same way as a real one (more details in 1181816).

There is an upstream bug that deals with this issue for GNOME 3.16, but it has not been resolved.
https://bugzilla.gnome.org/show_bug.cgi?id=730551

Version-Release number of selected component (if applicable):
mutter-3.14.2-1.el7.x86_64
gnome-shell-3.14.2-3.el7.x86_64

How reproducible:
always

Steps to Reproduce:
1. obtain a box with a supported card (nvidia/intel/amd)
2. disconnect the monitor
3. boot to the graphical environment
4. connect the monitor back

Actual results:
crash

Expected results:
no crash


Additional info:

Comment 2 Florian Müllner 2015-07-16 14:59:50 UTC
I should note that I did all testing on a laptop, turning monitors off/on using xrandr, so testing of an actual headless start will be appreciated. However we will need some changes on the shell side as well, so until we get a build for bug 1243856, I expect this to fail.

Comment 3 Michal Domonkos 2015-07-17 11:39:44 UTC
Hey Florian,

I've tried this on a workstation machine with a GPU and no monitors connected but it didn't work.  See https://bugzilla.redhat.com/show_bug.cgi?id=1243856#c2

Comment 4 Ray Strode [halfline] 2015-09-15 16:38:30 UTC
*** Bug 1250128 has been marked as a duplicate of this bug. ***

Comment 5 Ray Strode [halfline] 2015-09-15 16:39:07 UTC
moving back to assigned given comment 3.

Comment 6 Florian Müllner 2015-09-24 11:44:16 UTC
*** Bug 1265316 has been marked as a duplicate of this bug. ***

Comment 7 Vadim Rutkovsky 2015-09-25 10:36:17 UTC
Another user experienced a similar problem:

Crashes on automatic/timed login

Reproduced so far on wlan-r2s36.wlan.rhts.eng.bos.redhat.com only

reporter:       libreport-2.1.11
backtrace_rating: 4
cmdline:        /usr/bin/gnome-shell
crash_function: _meta_window_shared_new
event_log:      
executable:     /usr/bin/gnome-shell
global_pid:     3178
kernel:         3.10.0-319.el7.x86_64
package:        gnome-shell-3.14.4-32.el7
reason:         gnome-shell killed by SIGSEGV
runlevel:       N 3
type:           CCpp
uid:            1000

Comment 8 Vadim Rutkovsky 2015-09-25 10:36:19 UTC
Created attachment 1076988
File: backtrace

Comment 9 Ray Strode [halfline] 2015-09-25 12:14:42 UTC
Just looking quickly at the backtrace this morning while sipping coffee, I see the line that crashes is this one:

g_signal_emit_by_name (window->screen,
                       "window-entered-monitor",
                       window->monitor->number,
                       window);

window is clearly valid, and window->screen is clearly valid or it would have crashed earlier.  That makes me suspect that window->monitor is NULL.

Indeed at the top of the function I see these two lines:

window->monitor = meta_screen_get_monitor_for_window (window->screen, window);

window->preferred_output_winsys_id = window->monitor ?
                                     window->monitor->winsys_id :
                                     -1;

Note how the code checks if window->monitor is NULL and takes countermeasures.  This suggests to me that there is the expectation that window->monitor could be NULL in this function and we need to account for that.

I would need to dive into the code more to confirm that expectation is valid and to find out the right way to account for it.  It might be something easy like just adding

if (window->monitor) above the emit_by_name line.
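
The guard suggested above can be sketched in isolation. This is only a minimal illustration, not mutter's actual code: the Window and MonitorInfo structs, window_assign_monitor() and emit_window_entered_monitor() are hypothetical stand-ins (the real code uses its own window type and g_signal_emit_by_name), and the result of the monitor lookup is passed in as a parameter so that the headless case can be simulated with NULL.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for mutter's types; only the fields used here. */
typedef struct {
    int number;
    int winsys_id;
} MonitorInfo;

typedef struct {
    MonitorInfo *monitor;
    int preferred_output_winsys_id;
} Window;

/* Hypothetical stand-in for the "window-entered-monitor" signal emission.
 * It only records what would have been emitted. */
static int emitted_count = 0;
static int last_monitor_number = -1;

static void emit_window_entered_monitor(int monitor_number)
{
    emitted_count++;
    last_monitor_number = monitor_number;
}

/* Assigns the looked-up monitor (possibly NULL when running headless)
 * and emits the signal only if a monitor exists.
 * Returns 1 if the signal was emitted, 0 if it was skipped. */
static int window_assign_monitor(Window *window, MonitorInfo *found)
{
    assert(window != NULL);

    window->monitor = found;

    /* Same NULL-tolerant pattern as the two lines quoted above. */
    window->preferred_output_winsys_id = window->monitor ?
                                         window->monitor->winsys_id :
                                         -1;

    /* The proposed guard: never dereference a NULL monitor. */
    if (window->monitor == NULL)
        return 0;

    emit_window_entered_monitor(window->monitor->number);
    return 1;
}
```

Calling window_assign_monitor() with NULL leaves preferred_output_winsys_id at -1 and skips the emission instead of dereferencing a NULL pointer. Whether skipping the signal is the right behaviour in mutter itself is exactly the open question raised in this comment.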

Comment 10 Florian Müllner 2015-09-25 13:17:57 UTC
(In reply to Ray Strode [halfline] from comment #9)
> It might be something easy
> like just adding
> 
> if (window->monitor) above the emit_by_name line.

For that particular backtrace: yes, and I've had the patch locally for weeks. But I haven't pushed any updates because I still get crashes with this and a couple other obvious issues fixed.

Comment 11 Ray Strode [halfline] 2015-09-25 13:25:58 UTC
So I just noticed the upstream bug has this patch mentioned:

https://github.com/endlessm/mutter/commit/e9fbb10cfa15cb3ce8cc17b4959b0ebdbdca7f5f

Is it sufficient?

Comment 12 Florian Müllner 2015-09-29 15:23:54 UTC
(In reply to Ray Strode [halfline] from comment #11)
> So I just noticed the upstream bug has this patch mentioned:
> 
> https://github.com/endlessm/mutter/commit/e9fbb10cfa15cb3ce8cc17b4959b0ebdbdca7f5f
> 
> Is it sufficient?

No, that patch is for 3.8. The original patch on this bug is based on that patch, but no longer sufficient with all the movement that has been going on for wayland.

Comment 13 Ray Strode [halfline] 2015-09-29 17:15:35 UTC
from perusing around the github repo, looks like it was updated to 3.16 here:

https://github.com/endlessm/mutter/commit/d8492f1244169c86a90998b5f8aa6d358c7c501b

Comment 14 Matthias Clasen 2015-10-01 12:21:21 UTC
Florian, did you check that out ?

Comment 15 Florian Müllner 2015-10-01 12:45:08 UTC
(In reply to Matthias Clasen from comment #14)
> Florian, did you check that out ?

Only just now. Unfortunately I don't see anything in there that's not already in our patch, but there are a couple of issues I've hit in testing that the patch does not cover.

That said, I've fixed a couple more things [0], which has allowed me to

 (1) turn the display on/off using xrandr
 (2) start up with all outputs disabled (via xorg.conf snippet)

Of course (2) does not allow enabling the output again to simulate plugging in a monitor; however, GDM was apparently functional, as I was able to open a session with <enter>password<enter>.


[0] https://brewweb.devel.redhat.com/taskinfo?taskID=9907057

Comment 16 Matthias Clasen 2015-10-01 14:45:05 UTC
Are we going to include that build in the errata ?

Comment 17 Florian Müllner 2015-10-02 15:51:36 UTC
(In reply to Matthias Clasen from comment #16)
> Are we going to include that build in the errata ?

I was hoping for some testing first :-(

That said, I finally figured out a way to do some limited testing on a laptop:
 - GDM fails to bring up X on F23 without monitors attached, but
   considering the successful VM tests mentioned in comment #15,
   I don't think the version in RHEL is affected
 - a user session using mutter/gnome-shell with RHEL patches starts up
   fine without monitor and plugging/unplugging an external monitor
   works

All in all that doesn't look too bad, so I've done a proper build now and included it in the errata.

Comment 18 Vadim Rutkovsky 2015-10-02 19:14:11 UTC
Still occurs on wlan-r2s36.wlan.rhts.eng.bos.redhat.com
See http://faf-report.itos.redhat.com/reports/11414/

Comment 19 Vadim Rutkovsky 2015-10-02 19:50:53 UTC
Created attachment 1079493
Session log

Comment 20 Michal Domonkos 2015-10-06 09:54:43 UTC
(In reply to Vadim Rutkovsky from comment #18)
> Still occurs on wlan-r2s36.wlan.rhts.eng.bos.redhat.com
> See http://faf-report.itos.redhat.com/reports/11414/

Hey Vadim,

The original issue of gnome-shell crashing has apparently been resolved, see my comment in the related bug:
https://bugzilla.redhat.com/show_bug.cgi?id=1243856#c15

What you were seeing seems to me like a different crash (judging from the backtrace).  The backtrace I was getting originally with this bug looked like this:
https://bugzilla.redhat.com/show_bug.cgi?id=1265316

Anyway, I reserved this machine and tried to reproduce but I couldn't; gnome-shell didn't crash for me.  I was using the same gnome-shell and mutter packages as mentioned in the abrt report you linked above.

Comment 21 Vadim Rutkovsky 2015-10-06 12:52:52 UTC
The machine seems to have a KVM attached. We'll conduct some more testing to see if this is causing the crash.

Comment 22 Michal Domonkos 2015-10-06 16:18:04 UTC
So we managed to reproduce the crash that Vadim first saw in Comment 18 (using a different machine).  It only happens when there's no monitor attached to the video card so it's obviously related to gnome-shell running in headless mode and thus this bug.

Comment 30 Siteshwar Vashisht 2015-11-03 10:15:43 UTC
I am getting similar crashes while trying to access a RHEL 7.2 beta machine using Xming. The crash seems to be fixed by updating to gnome-shell-3.14.4-37 and mutter-3.14.4-17.

Comment 31 errata-xmlrpc 2015-11-19 07:18:52 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2015-2216.html