Bug 1402066 - kernel 4.8.x causes system to freeze upon login to GDM
Summary: kernel 4.8.x causes system to freeze upon login to GDM
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Fedora
Classification: Fedora
Component: kernel
Version: 24
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: high
Target Milestone: ---
Assignee: Kernel Maintainer List
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2016-12-06 17:21 UTC by Christopher
Modified: 2019-01-09 12:54 UTC (History)

Fixed In Version:
Doc Text:
Clone Of:
Environment:
Fedora 23 XFCE x64
Lenovo Thinkpad W541 (Core i7-4810MQ vPro @ 2.8GHz)
Type/Model: 20EG-S0RA00
CPU: 4 core + HT (8 threads)
CPU family/model/stepping/ucode: 6/60/3/0x21
RAM: 32GB
Video: Intel Corporation 4th Gen Core Processor Integrated Graphics Controller (rev 06)
Video: nVidia GK107GLM [Quadro K1100M] (rev a1)
Last Closed: 2017-01-24 18:53:15 UTC
Type: Bug
Embargoed:


Attachments
journalctl dumps from 3 boot attempts (two failed, one OK) (122.96 KB, application/x-gzip)
2016-12-06 17:22 UTC, Christopher

Description Christopher 2016-12-06 17:21:49 UTC
Description of problem:
The machine freezes after the password is entered at the GDM login screen when booted with any 4.8.x kernel. When booting the debug kernel, GDM is never launched.

Booting to runlevel 3 allows console login, but shortly afterward one CPU core hangs and the system becomes unresponsive. The console shows these kernel messages:
Dec 06 08:59:28 libra kernel: NMI watchdog: BUG: soft lockup - CPU#2 stuck for 22s! [kworker/2:2:381]
Dec 06 08:59:56 libra kernel: NMI watchdog: BUG: soft lockup - CPU#2 stuck for 22s! [kworker/2:2:381]
Dec 06 09:00:28 libra kernel: NMI watchdog: BUG: soft lockup - CPU#2 stuck for 22s! [kworker/2:2:381]
Dec 06 09:00:56 libra kernel: NMI watchdog: BUG: soft lockup - CPU#2 stuck for 22s! [kworker/2:2:381]
Dec 06 09:01:24 libra kernel: NMI watchdog: BUG: soft lockup - CPU#2 stuck for 22s! [kworker/2:2:381]

The system requires a hard reboot at this point.


Kernels 4.7.10-100 and earlier operate normally. The machine has been in continuous use since October 2015.


Version-Release number of selected component (if applicable):
4.8.8-100
4.8.10-100
4.8.11-100


How reproducible:
Boot machine, choose any 4.8.x kernel from GRUB menu.


Actual results:
The system "freezes" upon attempted login to GDM. The screen goes grey, and the mouse pointer and keyboard lock keys are unresponsive.


Expected results:
Normal XFCE desktop session.


Additional info:
The lockup appears to be triggered by the login attempt (when in GUI mode). Before a login attempt, I can ssh to the machine and run console commands indefinitely. Once a login was attempted, the remote ssh session remained responsive for a few minutes (I was running a tail on the syslog), then the connection dropped. After the hard reboot, journalctl showed that the kernel had still been logging events, including the hard shutdown via the power button.

Attached tarball contains 3 complete journalctl dumps as follows:

4.8.11.out: failure when booting with latest kernel image
4.8.11_debug.out: failure when booting with latest debug kernel image
4.7.10.out: successful boot with last 4.7 kernel image

Comment 1 Christopher 2016-12-06 17:22:57 UTC
Created attachment 1228629
journalctl dumps from 3 boot attempts (two failed, one OK)

Comment 2 Christopher 2016-12-07 04:39:51 UTC
This may be the same as, or similar to, the issue reported in Bug #1397864 (https://bugzilla.redhat.com/show_bug.cgi?id=1397864)

# lspci|grep VGA
00:02.0 VGA compatible controller: Intel Corporation 4th Gen Core Processor Integrated Graphics Controller (rev 06)
01:00.0 VGA compatible controller: NVIDIA Corporation GK107GLM [Quadro K1100M] (rev a1)

# lsmod|grep i915
i915                 1302528  43
i2c_algo_bit           16384  2 i915,nouveau
drm_kms_helper        143360  2 i915,nouveau
drm                   344064  13 ttm,i915,drm_kms_helper,nouveau
video                  40960  3 i915,thinkpad_acpi,nouveau

The BIOS is configured to use the discrete graphics (NVIDIA) for an external monitor, but this machine only ever uses its native TFT panel.

Comment 3 Christopher 2016-12-15 15:40:45 UTC
Tested with the latest kernel builds (4.8.12-100 and 4.8.13-100); the GDM login screen never even appeared. The system appeared to hang at the blue "f" logo on the black background.

However, the SSH daemon had loaded and I was able to connect, but after I entered my password the remote console became unresponsive. I attempted to open a second SSH connection, but there was no response to the connect request. More precisely, the listening port 22 appears to accept the connection, but no data is exchanged (I can telnet to tcp/22 manually, but there is no interaction).
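The "port accepts the connection but nothing is exchanged" symptom can be checked without telnet. The following is a minimal sketch, not something from the original test session: the function name `probe_ssh_banner` and its return labels are hypothetical, and it assumes the SSH server normally sends its identification string (such as `SSH-2.0-...`) soon after the TCP connection is established.

```python
import socket


def probe_ssh_banner(host, port=22, timeout=5.0):
    """Connect to host:port and report whether an SSH banner arrives.

    Returns one of (labels are illustrative, not a standard):
      ("unreachable", None)  - the TCP connection could not be established
      ("connected", None)    - the connection was accepted but no data arrived
      ("banner", text)       - the server sent data, returned as text
    """
    try:
        sock = socket.create_connection((host, port), timeout=timeout)
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return ("unreachable", None)

    with sock:
        try:
            # The timeout passed to create_connection also applies to recv.
            data = sock.recv(256)
        except OSError:
            # Timed out (or failed) while waiting for the server to speak.
            return ("connected", None)

    if not data:
        # The peer closed the connection without sending anything.
        return ("connected", None)
    return ("banner", data.decode("ascii", "replace").strip())
```

A result of `("connected", None)` would match the behaviour described above: the kernel still completes the TCP handshake for the listening socket, but the SSH daemon never gets to answer.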

When booting to runlevel 3, no login prompt appears, but I can see repeating kernel messages:

[time+000] NMI watchdog: BUG: soft lockup - CPU#5 stuck for 22s! [kworker/5:1:87]
[time+028] NMI watchdog: BUG: soft lockup - CPU#5 stuck for 22s! [kworker/5:1:87]
[time+056] NMI watchdog: BUG: soft lockup - CPU#5 stuck for 22s! [kworker/5:1:87]
[time+084] NMI watchdog: BUG: soft lockup - CPU#5 stuck for 22s! [kworker/5:1:87]
[time+112] NMI watchdog: BUG: soft lockup - CPU#5 stuck for 22s! [kworker/5:1:87]
[time+140] NMI watchdog: BUG: soft lockup - CPU#5 stuck for 22s! [kworker/5:1:87]
[time+155] INFO: rcu_sched self-detected stall on CPU5-...: (240002 ticks this GP) idle=425/140000000000000001/0 softirq=850/850 fqs=59352 (t=240003 jiffies g=1655 c=1654 q=0)

The above is an example from kernel 4.8.12-100. The stalled CPU number differs with each kernel image booted.
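The repeated watchdog lines quoted in this report can be tallied per CPU and worker with a short script, which makes it easier to compare which CPU stalled on each boot. This is a minimal sketch and not part of the original diagnosis: the function name `summarize_soft_lockups` is hypothetical, and the regular expression assumes the message layout seen in the excerpts (`CPU#<n> stuck for <s>s! [<task>:<pid>]`).

```python
import re
from collections import Counter

# Matches the soft-lockup text regardless of the timestamp prefix in front of it.
# The task name may itself contain colons (e.g. "kworker/2:2"), so the greedy
# ".+" is relied on to leave only the final ":<digits>" as the PID.
SOFT_LOCKUP = re.compile(
    r"BUG: soft lockup - CPU#(?P<cpu>\d+) stuck for (?P<secs>\d+)s! "
    r"\[(?P<task>.+):(?P<pid>\d+)\]"
)


def summarize_soft_lockups(lines):
    """Count soft-lockup messages per (cpu, task name, pid)."""
    counts = Counter()
    for line in lines:
        match = SOFT_LOCKUP.search(line)
        if match:
            key = (int(match["cpu"]), match["task"], int(match["pid"]))
            counts[key] += 1
    return counts
```

Passing it the lines of an extracted journalctl dump (for example the `4.8.11.out` file from the attachment) should give one entry per stalled CPU and worker, with the number of times the watchdog fired.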

Comment 4 Fedora End Of Life 2016-12-20 21:44:14 UTC
Fedora 23 changed to end-of-life (EOL) status on 2016-12-20. Fedora 23 is
no longer maintained, which means that it will not receive any further
security or bug fix updates. As a result we are closing this bug.

If you can reproduce this bug against a currently maintained version of
Fedora please feel free to reopen this bug against that version. If you
are unable to reopen this bug, please file a new report against the
current release. If you experience problems, please add a comment to this
bug.

Thank you for reporting this bug and we are sorry it could not be fixed.

Comment 5 Christopher 2017-01-18 13:12:23 UTC
Reopened against Fedora 24

Issue persists in all released kernels up to and including 4.8.15-200.fc24.x86_64

Comment 6 Christopher 2017-01-19 20:00:02 UTC
Issue appears to have been resolved with kernel update 4.8.16-200.fc24.x86_64
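For anyone checking whether a given machine is running one of the kernels discussed here, the versions mentioned across the comments can be encoded in a small lookup. This is only a sketch under stated assumptions: the names `kernel_tuple` and `reported_status` are hypothetical, the boundaries reflect only what this report states (4.7.10 and earlier working, 4.8.8 through 4.8.15 failing, 4.8.16 apparently resolved), and anything outside those ranges is returned as "unknown" rather than guessed.

```python
import re

# Boundaries taken from the comments in this report; nothing else is assumed.
LAST_REPORTED_WORKING = (4, 7, 10)
FIRST_REPORTED_AFFECTED = (4, 8, 8)
LAST_REPORTED_AFFECTED = (4, 8, 15)
REPORTED_RESOLVED = (4, 8, 16)


def kernel_tuple(release):
    """Turn a release string such as '4.8.16-200.fc24.x86_64' into (4, 8, 16)."""
    match = re.match(r"(\d+)\.(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError(f"unrecognized kernel release: {release!r}")
    return tuple(int(part) for part in match.groups())


def reported_status(release):
    """Classify a kernel release according to what this report says about it."""
    version = kernel_tuple(release)
    if version <= LAST_REPORTED_WORKING:
        return "working"
    if FIRST_REPORTED_AFFECTED <= version <= LAST_REPORTED_AFFECTED:
        return "affected"
    if version == REPORTED_RESOLVED:
        return "resolved"
    # Versions the report does not mention are deliberately not classified.
    return "unknown"
```

The release string would typically come from `uname -r` on the machine in question.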

