Bug 1292423 - current linux-firmware causes system to lockup
Summary: current linux-firmware causes system to lockup
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: Fedora
Classification: Fedora
Component: linux-firmware
Version: 23
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: urgent
Target Milestone: ---
Assignee: David Woodhouse
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2015-12-17 12:07 UTC by Michael Godfrey
Modified: 2015-12-28 15:23 UTC

Fixed In Version:
Clone Of:
Environment:
Last Closed: 2015-12-28 15:23:36 UTC
Type: Bug
Embargoed:


Attachments
dmesg from current (good) system (69.15 KB, text/plain)
2015-12-17 13:29 UTC, Michael Godfrey
dmesg from current system (69.15 KB, text/plain)
2015-12-17 13:34 UTC, Michael Godfrey

Description Michael Godfrey 2015-12-17 12:07:36 UTC
Description of problem:
linux-firmware-20151214-60.gitbbe4917c.fc23.noarch causes screen, mouse,
and keyboard to freeze.

Version-Release number of selected component (if applicable):
linux-firmware-20151214-60.gitbbe4917c.fc23.noarch

How reproducible:
always

Steps to Reproduce:
1. Run dnf update to install the above version of linux-firmware.
2. Use the system for up to an hour or two.

Actual results:
freeze

Expected results:
normal use

Additional info:
The hardware is an Intel NUC5i5RYH. The keyboard and mouse are a USB wireless
set (Havic 2.4GHz Wireless). The screen is a Samsung connected over HDMI.

When the system freezes I can ssh into it from another machine, and
operation seems normal. However, "shutdown -r now" disconnects the remote
login but the system does not reboot. A power cycle is required.

Reverting to linux-firmware-20150904-56.git6ebf5d57.fc23.noarch
(done with dnf downgrade) brings the system back to normal.
When I ran the dnf update that updated linux-firmware, many iwlxx packages
were also installed. I left these installed and the system has worked.
So, the problem definitely seems to be in linux-firmware.
Also, my NUC system is at the current BIOS level, which was last
updated about a week or two ago. I use this machine for my
daily work and normally have no problems.

Comment 1 Josh Boyer 2015-12-17 12:55:11 UTC
Can you attach the output of dmesg or the journal output from the working and non-working boots?  It is rather rare to have linux-firmware prevent a system from booting, particularly if you left the iwlwifi subpackages at the most recent release.
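
For anyone comparing the two boots by hand, the following is a minimal Python sketch of one way to diff a "good" and a "bad" dmesg capture. It is only an illustration: the function names and the idea of stripping the leading kernel timestamps before diffing are assumptions of this sketch, not something requested in this report.

```python
import difflib
import re

# Matches the leading kernel timestamp, e.g. "[   12.345678] ".
TIMESTAMP = re.compile(r"^\[\s*\d+\.\d+\]\s*")


def strip_timestamps(text):
    """Return the log lines with the leading [seconds.micros] prefix removed."""
    return [TIMESTAMP.sub("", line) for line in text.splitlines()]


def diff_dmesg(good_text, bad_text):
    """Return unified-diff lines between two dmesg captures, ignoring timestamps."""
    return list(
        difflib.unified_diff(
            strip_timestamps(good_text),
            strip_timestamps(bad_text),
            fromfile="good",
            tofile="bad",
            lineterm="",
            n=0,  # no context lines: show only what differs
        )
    )
```

Each capture would be read from a saved file (for example, the output of dmesg redirected to a text file on each boot) and passed in as a string. Timestamps are removed first because they differ on every boot and would otherwise mark every line as changed.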

Comment 2 Michael Godfrey 2015-12-17 13:17:44 UTC
I should add one fact: When the system is "frozen" the monitor is
still being refreshed. I can unplug it or power it off and when
turned back on the frozen screen reappears. Only powering off
the computer itself clears the screen.

Comment 3 Michael Godfrey 2015-12-17 13:29:01 UTC
Created attachment 1106722 [details]
dmesg from current (good) system

Here is the dmesg from the current system with the "good" linux-firmware.
I will try to find a window to install the failing linux-firmware,
capture dmesg after a hang, and so on.
Tell me if the attached is sufficient.

I do think that the fact that the system is driving the monitor
when hung is possibly useful.

Comment 4 Michael Godfrey 2015-12-17 13:34:33 UTC
Created attachment 1106723 [details]
dmesg from current system

second attempt to provide attachment

Comment 5 Michael Godfrey 2015-12-23 13:06:26 UTC
Today I installed the latest kernel (Linux pbdsl4 4.2.8-300.fc23.x86_64)
and first ran it with the old linux-firmware. This worked without problems.

Now I have just updated to the latest linux-firmware:
linux-firmware-20151214-60.gitbbe4917c.fc23.noarch

The system has run normally for about an hour. It may be a bit
early to say that the hang problem is gone, but it seems likely.

One thing to note: in the dmesg after booting back to the old
linux-firmware, the following line appeared:
[drm:gen8_irq_handler [i915]] *ERROR* The master control interrupt lied (SDE)!
No *ERROR* lines have appeared in any of the other dmesg outputs.
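
A check like this can be repeated across saved dmesg captures with a few lines of Python. This is a minimal sketch: the function name is hypothetical, and the only assumption is that driver errors carry the literal "*ERROR*" marker, as the i915 line quoted in this comment does.

```python
import sys


def find_error_lines(dmesg_text, marker="*ERROR*"):
    """Return every line of a dmesg capture that contains the marker."""
    return [line for line in dmesg_text.splitlines() if marker in line]


if __name__ == "__main__":
    # Usage: python script.py saved-dmesg.txt [more-captures.txt ...]
    for path in sys.argv[1:]:
        with open(path, encoding="utf-8", errors="replace") as handle:
            for line in find_error_lines(handle.read()):
                print(f"{path}: {line}")
```

A plain substring match is used rather than a regular expression because the marker contains asterisks, which would otherwise need escaping.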

So, one possibility is that something went wrong in the BIOS which was
cleared at some point. I think that I power cycled the Intel box at some
point, but I did not keep a record of exactly when.

In any case, the hangs could have been caused by some IRQ failure in
the BIOS. I am not attaching any of the dmesgs that I captured since
they look normal.

I will post another report after a day or so, or if there is another
hang. In the meantime it appears that this was a transient BIOS-related
failure.

Comment 6 Michael Godfrey 2015-12-23 22:36:47 UTC
I rebooted, checked dmesg, and the system has been up for several hours.
It is clear that this was not a problem in linux-firmware.
So, OK to close this report.

Thanks.

