Description of problem:
Machine freezes after the password is entered at the GDM login screen when booted with any 4.8.x kernel. When booting the debug kernel, GDM is never launched. Booting to runlevel 3 allows console login, but shortly afterward one CPU core hangs and the system becomes unresponsive. Console shows kernel messages:

Dec 06 08:59:28 libra kernel: NMI watchdog: BUG: soft lockup - CPU#2 stuck for 22s! [kworker/2:2:381]
Dec 06 08:59:56 libra kernel: NMI watchdog: BUG: soft lockup - CPU#2 stuck for 22s! [kworker/2:2:381]
Dec 06 09:00:28 libra kernel: NMI watchdog: BUG: soft lockup - CPU#2 stuck for 22s! [kworker/2:2:381]
Dec 06 09:00:56 libra kernel: NMI watchdog: BUG: soft lockup - CPU#2 stuck for 22s! [kworker/2:2:381]
Dec 06 09:01:24 libra kernel: NMI watchdog: BUG: soft lockup - CPU#2 stuck for 22s! [kworker/2:2:381]

System requires a hard reboot at this point. Kernels 4.7.10-100 and earlier operate normally. Machine has been in continuous use since 10/2015.

Version-Release number of selected component (if applicable):
4.8.8-100
4.8.10-100
4.8.11-100

How reproducible:
Boot machine, choose any 4.8.x kernel from GRUB menu.

Actual results:
System "freezes" upon attempted login to GDM. Grey screen, mouse pointer and KB lock keys non-responsive.

Expected results:
Normal XFCE desktop session.

Additional info:
Lockup appears to be triggered by the attempt to log in (when in GUI mode). Prior to the login attempt, I can ssh to the machine and run console commands indefinitely. Once login is attempted, the remote ssh session remained responsive for a few minutes (I was running a tail on the syslog), then the connection was dropped. After the hard reboot, journalctl showed that the kernel had still been logging events, including the hard shutdown via the power button.
Attached tarball contains 3 complete journalctl dumps as follows:

4.8.11.out: failure when booting with latest kernel image
4.8.11_debug.out: failure when booting with latest debug kernel image
4.7.10.out: successful boot with last 4.7 kernel image
Created attachment 1228629
journalctl dumps from 3 boot attempts (two failed, one OK)
May be same or similar to issue reported in Bug #1397864 (https://bugzilla.redhat.com/show_bug.cgi?id=1397864)

# lspci|grep VGA
00:02.0 VGA compatible controller: Intel Corporation 4th Gen Core Processor Integrated Graphics Controller (rev 06)
01:00.0 VGA compatible controller: NVIDIA Corporation GK107GLM [Quadro K1100M] (rev a1)

# lsmod|grep i915
i915                 1302528  43
i2c_algo_bit           16384  2 i915,nouveau
drm_kms_helper        143360  2 i915,nouveau
drm                   344064  13 ttm,i915,drm_kms_helper,nouveau
video                  40960  3 i915,thinkpad_acpi,nouveau

BIOS is configured to use discrete gfx (NVIDIA) for external monitor, but this machine is always using the native TFT panel only.
Tested with latest kernel builds (4.8.12-100 & 4.8.13-100) and the GDM login screen never even appeared. The system appeared to hang with the blue "f" logo on the black background. However, the SSH daemon had loaded and I was able to connect, but after I entered my password, the remote console became unresponsive. I attempted to open a second SSH connection, but there was no response to the connect request. More correctly, the listening port 22 appears to accept the connection, but no data is exchanged (I'm able to manually telnet to tcp/22, but there's no interaction).

When booting runlevel 3, no login prompt appears, but I can see repeating kernel messages:

[time+000] NMI watchdog: BUG: soft lockup - CPU#5 stuck for 22s! [kworker/5:1:87]
[time+028] NMI watchdog: BUG: soft lockup - CPU#5 stuck for 22s! [kworker/5:1:87]
[time+056] NMI watchdog: BUG: soft lockup - CPU#5 stuck for 22s! [kworker/5:1:87]
[time+084] NMI watchdog: BUG: soft lockup - CPU#5 stuck for 22s! [kworker/5:1:87]
[time+112] NMI watchdog: BUG: soft lockup - CPU#5 stuck for 22s! [kworker/5:1:87]
[time+140] NMI watchdog: BUG: soft lockup - CPU#5 stuck for 22s! [kworker/5:1:87]
[time+155] INFO: rcu_sched self-detected stall on CPU5-...: (240002 ticks this GP) idle=425/140000000000000001/0 softirq=850/850 fqs=59352
           (t=240003 jiffies g=1655 c=1654 q=0)

^^^ This is an example from kernel 4.8.12-100. The stalled CPU# differs with each kernel image booted.
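Since the stalled CPU differs from boot to boot, it may help to tally the soft lockup reports per CPU and worker across the journalctl dumps. The following is only a rough sketch, not something that was run as part of this report: the function name summarize_lockups is made up for illustration, and the regular expression assumes the message layout shown in the excerpts above ("soft lockup - CPU#N stuck for Ns! [comm:pid]").

```python
import re
from collections import Counter

# Matches the watchdog message as it appears in the excerpts above, e.g.
#   "... BUG: soft lockup - CPU#2 stuck for 22s! [kworker/2:2:381]"
# The bracketed field is "<task name>:<pid>"; the task name may itself contain ':'.
LOCKUP_RE = re.compile(r"soft lockup - CPU#(\d+) stuck for (\d+)s! \[(.+):(\d+)\]")


def summarize_lockups(lines):
    """Count soft lockup reports per (cpu, task name, pid).

    `lines` is any iterable of log lines (a list, or an open file).
    Lines that do not contain a soft lockup message are ignored.
    """
    counts = Counter()
    for line in lines:
        match = LOCKUP_RE.search(line)
        if match:
            cpu, _seconds, comm, pid = match.groups()
            counts[(int(cpu), comm, int(pid))] += 1
    return counts
```

For example, calling summarize_lockups(open("4.8.11.out")) on one of the attached dumps would return a Counter keyed by CPU number, worker name and PID, which makes it easy to compare which core and kworker were stuck on each boot.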
Fedora 23 changed to end-of-life (EOL) status on 2016-12-20. Fedora 23 is no longer maintained, which means that it will not receive any further security or bug fix updates. As a result we are closing this bug. If you can reproduce this bug against a currently maintained version of Fedora please feel free to reopen this bug against that version. If you are unable to reopen this bug, please file a new report against the current release. If you experience problems, please add a comment to this bug. Thank you for reporting this bug and we are sorry it could not be fixed.
Reopened against Fedora 24.

Issue persists in all released kernels up to and including 4.8.15-200.fc24.x86_64.
Issue appears to have been resolved with kernel update 4.8.16-200.fc24.x86_64.