Bug 1288783

Summary: Fedora 23 XFCE Spin :: Dual Intel / nVidia (w/ Optimus) + Bumblebee: X server hangs when exiting ...
Product: [Fedora] Fedora Reporter: nmvega <nmvega>
Component: kernel Assignee: Kernel Maintainer List <kernel-maint>
Status: CLOSED CANTFIX QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: high Docs Contact:
Priority: unspecified    
Version: 23 CC: gansalmon, itamar, jonathan, kernel-maint, madhu.chinakonda, mchehab, nmvega
Target Milestone: ---   
Target Release: ---   
Hardware: x86_64   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2015-12-08 13:22:23 UTC Type: Bug

Description nmvega 2015-12-06 00:17:31 UTC
Hello Friends:

FIRST.....: A brief preamble.
SECOND....: A description of the problem
THIRD.....: Various outputs from system commands to help you help me. =:)

===================================================================
[ FIRST ] :: PREAMBLE :
===================================================================
I have a new Lenovo Y700-17ISK laptop (seen here http://www.amazon.com/gp/product/B014MIC1EI).

It's a dual-graphics card laptop, with an integrated Intel HD card hardwired to the display;
and an nVidia GPU that is used on an application-by-application basis to do back-end processing.

When I installed Fedora-23/64 bit on this laptop (in particular the XFCE Spin),
I first performed the traditional steps for disabling the "nouveau" driver: removing the package if
it was installed, then blacklisting it both in /etc/modprobe.d/ and on the kernel line in GRUB.
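
The nouveau-disabling steps above can be sketched roughly as follows. This is a minimal sketch, not the exact commands that were run: the function name nouveau_blacklist_conf and the target file name are hypothetical, and the two kernel parameters in the comments are assumptions based on what is commonly used on Fedora for this purpose.

```shell
#!/bin/sh
# Hypothetical helper: print the modprobe.d lines that keep nouveau from loading.
nouveau_blacklist_conf() {
    printf '%s\n' 'blacklist nouveau' 'options nouveau modeset=0'
}

# Print the configuration for review. To apply it (assumed file name), run as root:
#   nouveau_blacklist_conf > /etc/modprobe.d/blacklist-nouveau.conf
# and add these (assumed) parameters to the kernel line in the GRUB configuration:
#   rd.driver.blacklist=nouveau modprobe.blacklist=nouveau
nouveau_blacklist_conf
```

Printing the lines first, rather than writing them straight to /etc, makes it easy to compare them with what is already present in /etc/modprobe.d/.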

Next, I downloaded and ran the nvidia driver from NVIDIA (user$ sudo ./NVIDIA-Linux-x86_64-358.16.run).

Finally, I followed the steps here -- https://fedoraproject.org/wiki/Bumblebee -- to get the Bumblebee
drivers and packages installed (from the Closed-source, Unmanaged option).
==================================================================


==================================================================
[ SECOND ] :: THE PROBLEM:
==================================================================
I can start up X/XFCE just fine, and run applications normally. In fact, I'm typing this while on
that laptop and with XFCE running. The issue is that when I try to exit out of XFCE **IN ANY WAY,
SHAPE or FORM**, the display hangs. This is true whether I press the "Logout" icon or, from an xterm
terminal, issue "user$ sudo reboot". It always just hangs. Whatever is on the display the moment I
try to exit out of the X session (XFCE) remains on the display, and the unit then becomes
unresponsive. My only recourse after that is a force power off by pressing and holding the
power button.

By the way, I have this set up to multi-boot, and Windows-10 works fine with both the INTEL
and NVIDIA processors, switching between the two seamlessly. I mentioned it above but,
according to the NVIDIA Control Panel on Windows-10, the INTEL card is hardwired to the display.
=================================================================



===========================================================================
[ THIRD ] :: SYSTEM INFORMATION:
===========================================================================

===========================================================================
nmvega@y700$ uname -a; cat /etc/redhat-release
===========================================================================
Linux y700 4.2.6-301.fc23.x86_64 #1 SMP Fri Nov 20 22:22:41 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux
Fedora release 23 (Twenty Three)  # Installed from Fedora-23 XFCE Spin.
===========================================================================


===========================================
It has a dual-graphics card configuration:
===========================================
(1) An Intel HD card is hardwired to the laptop display.
(2) An Optimus-enabled nVidia GPU.

user$ lspci -ks 00:02.0
00:02.0 VGA compatible controller: Intel Corporation Device 191b (rev 06)
	Subsystem: Lenovo Device 3802
	Kernel modules: i915

user$ lspci -ks 01:00.0
01:00.0 3D controller: NVIDIA Corporation GM107M [GeForce GTX 960M] (rev a2)
	Subsystem: Lenovo Device 3802
	Kernel modules: nouveau, nvidia
===========================================


===========================================================================
user$ bumblebee-nvidia --check
===========================================================================
--force compile selected via /etc/sysconfig/nvidia/compile-nvidia-driver
Warning! This NVIDIA driver has not compiled successfully before on kernel 4.2.6-301.fc23.x86_64!
Warning! This NVIDIA driver userland
 /usr/lib64/nvidia-bumblebee/libGL.so.1 library is missing!

nvidia.ko compiled into in the kernel tree ok.
modinfo output for NVIDIA:
filename:       /lib/modules/4.2.6-301.fc23.x86_64/extra/nvidia.ko
alias:          char-major-195-*
version:        358.16  <---- Version of the "NVIDIA.run" blob that I downloaded from NVIDIA.
supported:      external
license:        NVIDIA
srcversion:     38681B6CCC0B032F48069CF
alias:          pci:v000010DEd00000E00sv*sd*bc04sc80i00*
alias:          pci:v000010DEd*sv*sd*bc03sc02i00*
alias:          pci:v000010DEd*sv*sd*bc03sc00i00*
depends:        drm
vermagic:       4.2.6-301.fc23.x86_64 SMP mod_unload 
parm:           NVreg_Mobile:int
parm:           NVreg_ResmanDebugLevel:int
parm:           NVreg_RmLogonRC:int
parm:           NVreg_ModifyDeviceFiles:int
parm:           NVreg_DeviceFileUID:int
parm:           NVreg_DeviceFileGID:int
parm:           NVreg_DeviceFileMode:int
parm:           NVreg_UpdateMemoryTypes:int
parm:           NVreg_InitializeSystemMemoryAllocations:int
parm:           NVreg_UsePageAttributeTable:int
parm:           NVreg_MapRegistersEarly:int
parm:           NVreg_RegisterForACPIEvents:int
parm:           NVreg_CheckPCIConfigSpace:int
parm:           NVreg_EnablePCIeGen3:int
parm:           NVreg_EnableMSI:int
parm:           NVreg_TCEBypassMode:int
parm:           NVreg_MemoryPoolSize:int
parm:           NVreg_RegistryDwords:charp
parm:           NVreg_RmMsg:charp
parm:           NVreg_AssignGpus:charp

Check bbswitch kernel module...
bbswitch is loaded into the current kernel ok.

All NVIDIA checks completed, but there were 1 or more failures...
Try running this script with the --debug option to find clues about what has
gone wrong with the NVIDIA driver compile process.
======================================================================

I'm using the closed-source, unmanaged solution:

======================================================================
user$ ls -1 /etc/yum.repos.d/bumblebee*
======================================================================
/etc/yum.repos.d/bumblebee-nonfree-unmanaged.repo  <-- closed source, unmanaged distribution.
/etc/yum.repos.d/bumblebee.repo
======================================================================

=================================================================================
user$ rpm -qa | egrep -i "bumblebee-nvidia|bbswitch-dkms|VirtualGL|primus"
=================================================================================
VirtualGL-2.4-5.fc23.i686
VirtualGL-2.4-5.fc23.x86_64
bumblebee-nvidia-2.0-5.fc23.noarch
primus-1.1.03282015-2.fc23.i686
bbswitch-dkms-0.8.0-2.fc23.x86_64
primus-1.1.03282015-2.fc23.x86_64
=================================================================================

===============================================================
user$ lsmod | egrep -i "nouveau|nvidia"
===============================================================
nvidia_modeset        716800  0
nvidia               8749056  1 nvidia_modeset
drm                   335872  4 i915,drm_kms_helper,nvidia
===============================================================
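
To look only at the module names, without the size and dependency columns, the listing above can be reduced with a small filter. This is a minimal sketch; the function name gpu_modules_loaded is hypothetical, and it only matches module names that begin with "nouveau" or "nvidia".

```shell
#!/bin/sh
# Hypothetical helper: read lsmod-style text on stdin and print the names of
# loaded modules whose name starts with "nouveau" or "nvidia".
gpu_modules_loaded() {
    awk '$1 ~ /^(nouveau|nvidia)/ { print $1 }'
}

# Usage on a live system:
#   lsmod | gpu_modules_loaded
```

Matching on the first column only avoids false hits such as the "drm" line, which mentions nvidia only in its "used by" column.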


=========================================================================
user$ (cd /etc/modprobe.d; \
       cat blacklist-nvidia.conf bumblebee.conf nvidia-installer-disable-nouveau.conf)
=========================================================================
blacklist nvidia   # blacklist-nvidia.conf

blacklist nvidia   # bumblebee.conf
blacklist nouveau  # bumblebee.conf

blacklist nouveau           # nvidia-installer-disable-nouveau.conf
options nouveau modeset=0   # nvidia-installer-disable-nouveau.conf
=========================================================================
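
Since the blacklist entries are spread over several files, it can help to list every "blacklist" directive together with the file it comes from. This is a minimal sketch; the function name list_blacklists is hypothetical, and it assumes the files of interest end in .conf.

```shell
#!/bin/sh
# Hypothetical helper: list every "blacklist" directive under a modprobe.d
# directory (default: /etc/modprobe.d), prefixed with the file it came from.
list_blacklists() {
    dir=${1:-/etc/modprobe.d}
    grep -H -E '^[[:space:]]*blacklist[[:space:]]' "$dir"/*.conf 2>/dev/null
}

# Usage on a live system:
#   list_blacklists
#   list_blacklists /usr/lib/modprobe.d
```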


==========================================================================
user$ optirun glxgears -info | grep GL_VENDOR
    [ERROR]The Bumblebee daemon has not been started yet. Is it running?

user$ sudo service bumblebeed start

user$ optirun glxgears -info | grep GL_VENDOR
    GL_VENDOR     = NVIDIA Corporation   <--- Also, an animation with rotating gears shows up.
========================================================================


# =====================================================================
# This command launches the nVidia Settings UI, even though the nVidia card is not
# physically connected to the display (the Intel card is).
# =====================================================================
user$ optirun -b none nvidia-settings -c :8
# =======================================================================
Xlib:  extension "GLX" missing on display "unix:0.0".
# The nVidia settings GUI launches.

# By the way, is the "GLX" message above significant? You can answer below. =:)
# =====================================================================


Does anyone know why this hang on leaving X/XFCE is happening? It happens 100% of the time.

I have Fedora-23 running in several places (I've been very comfortable with Fedora for years now),
but this is my first time seeing this.

Thank you in advance!

Comment 1 nmvega 2015-12-06 00:56:43 UTC
Correction to above...

Issuing "user$ sudo reboot" does reboot the system; it just takes about 20 seconds. But doing, say, "user$ sudo pkill X" (or killing the equivalent X/Xorg process ID) results in a hung, unresponsive display.

I can still ssh into the laptop remotely (the underlying O/S is still running) and reboot it that way. However, even trying to kill the X server from a remote login results in a display hang, too.
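
Because the machine stays reachable over ssh after the display hangs, the kernel log can be collected remotely and narrowed down to the graphics-related drivers. This is a minimal sketch; the function name gpu_log_lines and the host name in the usage comment are hypothetical, and the list of driver names is an assumption about what is relevant to this setup.

```shell
#!/bin/sh
# Hypothetical helper: keep only log lines that mention the GPU-related
# drivers involved in this setup (case-insensitive match).
gpu_log_lines() {
    grep -i -E 'nvidia|nouveau|bbswitch|i915'
}

# Usage from another machine after the display has hung
# ("user@laptop" is a placeholder):
#   ssh user@laptop 'journalctl -k -b --no-pager' | gpu_log_lines
```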

Comment 2 nmvega 2015-12-06 01:25:06 UTC
More information ...

Just as a temporary trial, I also tried the latest RAWHIDE kernel:
=============================================================
user$ rpm -qa | grep kernel | grep 4.4.0
=============================================================
kernel-modules-extra-4.4.0-0.rc3.git4.1.fc24.x86_64
kernel-modules-4.4.0-0.rc3.git4.1.fc24.x86_64
kernel-devel-4.4.0-0.rc3.git4.1.fc24.x86_64
kernel-core-4.4.0-0.rc3.git4.1.fc24.x86_64
kernel-4.4.0-0.rc3.git4.1.fc24.x86_64
=============================================================

That kernel does allow me to "Exit" the X/XFCE server, but, sadly:

(1) It's a RAWHIDE kernel, and

(2) Compiling the NVIDIA driver (user$ sudo ./NVIDIA-Linux-x86_64-358.16.run) now fails, as shown next. So GPU capability is totally unavailable.

=======================================================================
[ ... /var/log/nvidia-installer.log -- SNIP ... ]
FATAL: modpost: GPL-incompatible module nvidia.ko uses GPL-only symbol 'lock_release'
/usr/src/kernels/4.4.0-0.rc3.git4.1.fc24.x86_64/scripts/Makefile.modpost:91: recipe for target '__modpost' failed
make[3]: *** [__modpost] Error 1
make[3]: Target '_modpost' not remade because of errors.
/usr/src/kernels/4.4.0-0.rc3.git4.1.fc24.x86_64/Makefile:1391: recipe for target 'modules' failed
make[2]: *** [modules] Error 2
make[2]: Leaving directory '/usr/src/kernels/4.4.0-0.rc3.git4.1.fc24.x86_64'
Makefile:146: recipe for target 'sub-make' failed
make[1]: *** [sub-make] Error 2
make[1]: Target 'modules' not remade because of errors.
make[1]: Leaving directory '/usr/src/kernels/4.4.0-0.rc3.git4.1.fc24.x86_64'
Makefile:81: recipe for target 'modules' failed
make: *** [modules] Error 2
======================================================================


I am reverting to the latest released kernel (this was just a trial).
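
The modpost failure in the log above names both the module and the GPL-only kernel symbol it references. To pull that pair out of a longer installer log, a filter along these lines can be used. This is a minimal sketch; the function name gpl_only_symbols is hypothetical, and it assumes the message format shown in the log above.

```shell
#!/bin/sh
# Hypothetical helper: read a build log on stdin and print "<module> <symbol>"
# for every modpost complaint about a GPL-only symbol.
gpl_only_symbols() {
    sed -n "s/.*GPL-incompatible module \([^ ]*\) uses GPL-only symbol '\([^']*\)'.*/\1 \2/p"
}

# Usage on a live system:
#   gpl_only_symbols < /var/log/nvidia-installer.log
```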

Comment 3 Josh Boyer 2015-12-08 13:22:23 UTC
Fedora does not provide or support proprietary modules.  You will need to take this up with whoever is providing those.