Bug 1195074 - Fedora 21 + Gnome 3.14 + nvidia 304.125 gdm black screen, no gnome session
Status: CLOSED ERRATA
Product: Fedora
Classification: Fedora
Component: libselinux
Version: rawhide
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Assigned To: Daniel Walsh
QA Contact: Fedora Extras Quality Assurance
Reported: 2015-02-22 12:56 EST by mm19827
Modified: 2015-05-14 09:58 EDT
Fixed In Version: libselinux-2.3-9.fc21
Doc Type: Bug Fix
Last Closed: 2015-04-28 09:10:11 EDT
Type: Bug
Attachments
Xorg log with kdm running (14.36 KB, text/plain)
2015-02-22 12:56 EST, mm19827
Xorg log with gdm (19.35 KB, text/x-vhdl)
2015-02-22 12:57 EST, mm19827
glxinfo output (42.68 KB, text/plain)
2015-02-22 12:58 EST, mm19827
.xsession-errors (66.65 KB, text/plain)
2015-02-22 12:59 EST, mm19827
patch for libselinux (2.11 KB, patch)
2015-04-17 09:45 EDT, Stephen Smalley

Description mm19827 2015-02-22 12:56:28 EST
Created attachment 994288
Xorg log with kdm running

Description of problem:

Upgraded from FC21 to FC21 x86_64 (via fedup)

kdm starts up correctly, but logging in to GNOME Shell fails with the on-screen message "oh no, something has gone wrong..." after a couple of minutes.
Xorg.0.log reports no error, ~/.xsession-errors reports nothing.
After running gnome-session with --debug (in gnome.desktop) ~/.xsession-errors reports debug output, but no errors, instead "emitting SessionIsActive" is logged.
GNOME was starting up correctly on FC20

KDE and xfce both start up correctly.
glxinfo shows GLX loads correctly.

gdm does not start up, only black screen. No errors in journalctl for Xorg.bin

Version-Release number of selected component (if applicable):
Fedora 21
GNOME 3.14
nvidia drivers 304.125 (from nvidia, not rpmfusion)
NVIDIA Quadro FX 1400 (NV41: old, but should be supported)

How reproducible:
always

Using KDM:
Steps to Reproduce:
1. start kdm
2. select GNOME shell
3. try to log in

Actual results:
"oh no, something has gone wrong" on screen message after a couple of minutes. No errors logged.

Expected results:
GNOME shell shows up

Using GDM:
Steps to Reproduce:
1. start gdm

Actual results:
Black screen. No errors logged

Expected results:
GNOME login screen shows up

Additional info:
Made a quick try with nouveau, but it failed to load GLX (undefined symbol in the nouveau driver).
Comment 1 mm19827 2015-02-22 12:57:38 EST
Created attachment 994289
Xorg log with gdm
Comment 2 mm19827 2015-02-22 12:58:27 EST
Created attachment 994290
glxinfo output
Comment 3 mm19827 2015-02-22 12:59:33 EST
Created attachment 994291
.xsession-errors
Comment 4 mm19827 2015-02-22 16:15:49 EST
Trying to get some logging for gnome-shell
gnome-shell --help hangs

gnome-shell 3.14.3
Comment 5 mm19827 2015-02-26 15:23:33 EST
gnome-shell --version  hangs at 100% CPU

backtrace:
(gdb) start --version
Temporary breakpoint 1 at 0x402040
Starting program: /usr/bin/gnome-shell --version
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
^C
Program received signal SIGINT, Interrupt.
0x0000003c36e123d8 in tls_get_addr_tail () from /lib64/ld-linux-x86-64.so.2
(gdb) bt
#0  0x0000003c36e123d8 in tls_get_addr_tail () at /lib64/ld-linux-x86-64.so.2
#1  0x0000003c3920af63 in getprocattrcon_raw () at /lib64/libselinux.so.1
#2  0x0000003c3920a6ce in is_selinux_enabled () at /lib64/libselinux.so.1
#3  0x0000003c45ca5c06 in  () at /lib64/libGL.so.1
#4  0x0000003c45c84c7b in  () at /lib64/libGL.so.1
#5  0x0000003c36e0ff0d in call_init.part () at /lib64/ld-linux-x86-64.so.2
#6  0x0000003c36e1005b in _dl_init_internal () at /lib64/ld-linux-x86-64.so.2
#7  0x0000003c36e00d2a in _dl_start_user () at /lib64/ld-linux-x86-64.so.2
#8  0x0000000000000002 in  ()
#9  0x00007fffffffdfd1 in  ()
#10 0x00007fffffffdfe6 in  ()
#11 0x0000000000000000 in  ()

This problem does not occur if SELinux is disabled: GNOME then starts up correctly. Permissive mode still fails.
Comment 6 adlo 2015-03-03 12:08:20 EST
I can confirm that GDM starts correctly when SELinux is disabled, but does not start correctly and shows a black screen when SELinux is enabled.
Comment 7 zimon 2015-03-09 13:27:04 EDT
Bug 1198915 seems to concern the same issue. Both gdm and gnome-session (started without gdm) fail the same way (i.e. no mouse pointer).
Comment 8 mm19827 2015-03-09 15:25:12 EDT
Reverted to the nouveau driver more thoroughly than I did earlier. Removed all nvidia OpenGL libraries, and reinstalled all mesa libraries.
Gnome shell is starting up correctly, and I have been using it for a few days now.
It looks like the problem could be in the nvidia OpenGL library not linking properly to the latest glibc/libselinux?
Comment 9 zimon 2015-03-10 09:23:12 EDT
(In reply to mm19827 from comment #8)
> Gnome shell is starting up correctly, and I have been using it for a few
> days now.

Have you tried GNOME Wayland session and does it also work with nouveau?
Comment 10 mm19827 2015-03-10 15:26:27 EDT
(In reply to zimon from comment #9)
> (In reply to mm19827 from comment #8)
> > Gnome shell is starting up correctly, and I have been using it for a few
> > days now.
> 
> Have you tried GNOME Wayland session and does it also work with nouveau?

Honestly I am not familiar with Wayland, anyway I gave it a quick try.
A gnome Wayland session started up, but many windows (including gnome terminal) displayed no menu and no title; moreover, after launching a few applications the shell froze up and I had to power down and restart the system.
This seems a different issue than what I reported, though.
Comment 11 Marek Novotny 2015-03-24 18:10:22 EDT
I experience the same issue

Version-Release number of selected component (if applicable):
Fedora 21
GNOME 3.14
nvidia drivers 304.125 (from rpmfusion)
NVIDIA GeForce 9800GT - old driver akmod-nvidia-304xx
01:00.0 VGA compatible controller: NVIDIA Corporation G92 [GeForce 9800 GT] (rev a2)

I use an xfce4 session instead because gnome-shell is not able to start.
Comment 12 mm19827 2015-04-04 13:39:12 EDT
I have been doing some debugging here: execution is stuck in an endless loop in the thread-local storage handling of the ELF dynamic linker, at tls_get_addr_tail in dl-tls.c (line 742):

========
 again:
  /* Make sure that, if a dlopen running in parallel forces the
     variable into static storage, we'll wait until the address in the
     static TLS block is set up, and use that.  If we're undecided
     yet, make sure we make the decision holding the lock as well.  */
  if (__builtin_expect (the_map->l_tls_offset
			!= FORCED_DYNAMIC_TLS_OFFSET, 0))
    {
      __rtld_lock_lock_recursive (GL(dl_load_lock));
      if (__glibc_likely (the_map->l_tls_offset == NO_TLS_OFFSET))
	{
	  the_map->l_tls_offset = FORCED_DYNAMIC_TLS_OFFSET;
	  __rtld_lock_unlock_recursive (GL(dl_load_lock));
	}
      else
	{
	  __rtld_lock_unlock_recursive (GL(dl_load_lock));
	  if (__builtin_expect (the_map->l_tls_offset
				!= FORCED_DYNAMIC_TLS_OFFSET, 1))
	    {
	      void *p = dtv[GET_ADDR_MODULE].pointer.val;
	      if (__glibc_unlikely (p == TLS_DTV_UNALLOCATED))
		goto again;

	      return (char *) p + GET_ADDR_OFFSET;
	    }
	}
    }
=========

It keeps on looping on the 'goto again;' statement.
This occurs during initialization of libGL.so.1, i.e. libGL.so.304.125, which unfortunately is not open source.
Not sure if the problem is due to nvidia libGL.so.304.125 no longer being compatible with the latest dynamic loader, or if this points to a problem in the code above.

I should mention that nvidia libGL.so.304.125 loads and runs correctly with many other executables, including glxgears and gnome-session (gnome-session --version loads libGL.so.304.125 and runs OK). Yet gnome-shell --version keeps on looping.
gnome-shell-3.14.4-2.fc21.x86_64
Comment 13 mm19827 2015-04-11 09:17:44 EDT
I see I need to correct my original post here:

> 
> Upgraded from FC21 to FC21 x86_64 (via fedup)
> 

In fact I upgraded from FC20 x86_64 to FC21 x86_64 (so an upgrade FC20 -> FC21, not FC21 -> FC21, whatever that would be).

Sorry for the typo.
Comment 14 mm19827 2015-04-11 09:34:03 EDT
(In reply to mm19827 from comment #12)

> I have been doing some debugging here, execution is stuck in an endless loop
> in thread local storage handling in the ELF dynamic linker at
> tls_get_addr_tail in dl-tls.c(742):
> 
> ========
>  again:
> ...

It may be worth adding that my debugging above also showed that, at load time for nvidia libGL.so.304.125, the dl-tls.c code above is running in a single thread.
Comment 15 Stephen Smalley 2015-04-13 16:25:54 EDT
Sounds like http://marc.info/?t=142252697100003&r=1&w=2
Comment 16 Stephen Smalley 2015-04-17 09:45:22 EDT
Created attachment 1015592
patch for libselinux

Can you test this patch to see if it resolves your bug?
Comment 17 mm19827 2015-04-18 12:31:39 EDT
(In reply to Stephen Smalley from comment #16)
> Created attachment 1015592 [details]
> patch for libselinux
> 
> Can you test this patch to see if it resolves your bug?

I can imagine that libGL.so.304.125 calls some other selinux routine after is_selinux_enabled(), but nevertheless after applying your patch on my system gnome-shell works fine, and I have the gnome desktop up and running.

So I would answer yes to your question, thanks.
Comment 18 Robert Hinson 2015-04-19 13:44:51 EDT
Seems to be happening in the 32-bit version too.

Just a little extra info.
Comment 19 Fedora Update System 2015-04-23 05:20:40 EDT
libselinux-2.3-9.fc22 has been submitted as an update for Fedora 22.
https://admin.fedoraproject.org/updates/libselinux-2.3-9.fc22
Comment 20 Fedora Update System 2015-04-23 05:21:42 EDT
libselinux-2.3-9.fc21 has been submitted as an update for Fedora 21.
https://admin.fedoraproject.org/updates/libselinux-2.3-9.fc21
Comment 21 Fedora Update System 2015-04-24 18:47:33 EDT
Package libselinux-2.3-9.fc21:
* should fix your issue,
* was pushed to the Fedora 21 testing repository,
* should be available at your local mirror within two days.
Update it with:
# su -c 'yum update --enablerepo=updates-testing libselinux-2.3-9.fc21'
as soon as you are able to.
Please go to the following url:
https://admin.fedoraproject.org/updates/FEDORA-2015-6771/libselinux-2.3-9.fc21
then log in and leave karma (feedback).
Comment 22 mm19827 2015-04-25 13:08:34 EDT
Package libselinux-2.3-9.fc21 tested OK for me.
Comment 23 Ivor Durham 2015-04-25 14:48:30 EDT
Package libselinux-2.3-9.fc21 resolved my issue too. Bug #1198915 is apparently the same issue.
Comment 24 Fedora Update System 2015-04-28 09:10:11 EDT
libselinux-2.3-9.fc22 has been pushed to the Fedora 22 stable repository.  If problems still persist, please make note of it in this bug report.
Comment 25 Russell Odom 2015-05-05 17:09:51 EDT
libselinux-2.3-9.fc21 solved this for me too. Thanks :-)
Comment 26 adlo 2015-05-06 13:05:17 EDT
Bug #1178249 is apparently the same issue.
Comment 27 Fedora Update System 2015-05-10 19:59:20 EDT
libselinux-2.3-9.fc21 has been pushed to the Fedora 21 stable repository.  If problems still persist, please make note of it in this bug report.
Comment 28 Andris Pavenis 2015-05-14 06:55:41 EDT
There seem to be other causes that produce the same symptoms. I have the same problem in Fedora 21. Neither upgrading to the new libselinux package nor disabling SELinux brought any improvement (I have xorg-x11-drv-nvidia-346.59-1.fc21.x86_64, with a GTX-960 connected to the display).

Additional observations:

There are problems with how Xorg reads its configuration.

/var/log/Xorg.0.log says that the following configuration directories are being used:

[    13.112] (==) Using config directory: "/etc/X11/xorg.conf.d"
[    13.112] (==) Using system config directory "/usr/share/X11/xorg.conf.d"

However, only the module path
[    13.112] (==) ModulePath set to "/usr/lib64/xorg/modules"
is searched, even though /etc/X11/xorg.conf.d/99-nvidia.conf contains

Section "Files"
        ModulePath   "/usr/lib64/nvidia/xorg"
        ModulePath   "/usr/lib64/xorg/modules"
EndSection

As a result, the correct GLX module is not found, and Xorg attempts to use the default one and fails. I suspect that this could be the cause of the problems with the GNOME session.

Additionally, a keyboard layout other than US English is specified in /etc/X11/xorg.conf.d/00-keyboard.conf, but it does not get set on the KDM login screen (which still seems to use US English).
Comment 29 Andris Pavenis 2015-05-14 08:10:36 EDT
Appending the contents of /etc/X11/xorg.conf.d/99-nvidia.conf to /etc/X11/xorg.conf works around the problem with the missing module lookup directory, so GLX loads OK. However, it does not solve the problem with the GNOME session (fortunately I'm using KDE).

The keyboard layout problem is not related to this.
Comment 30 adlo 2015-05-14 09:58:02 EDT
It sounds like your issue occurs with the 346 driver, whereas this bug was filed for the 304 driver. I'm guessing your issue probably has a different cause as well. Perhaps a separate bug should be filed for it.
