Bug 1780800

Summary:

[drm:gen8_reset_engines [i915]] *ERROR* rcs0 reset request timed out

Product:

[Fedora] Fedora

Reporter:

Chris Murphy <bugzilla>

Component:

kernel

Assignee:

Kernel Maintainer List <kernel-maint>

Status:

CLOSED CURRENTRELEASE

QA Contact:

Fedora Extras Quality Assurance <extras-qa>

Severity:

urgent

Docs Contact:

Priority:

urgent

Version:

CC:

airlied, amessina, asogukpi, bhoefer, bskeggs, cbredesen, chemobejk, diego.ce, dimhen, dkaylor, fahmed, fcami, fweimer, goodmirek, hdegoede, hkario, ichavero, itamar, ivica.perovic, iweiss, jarodwilson, jcubic, jeremy, jforbes, jglisse, jlmagee, john.j5live, jonathan, josef, kernel-maint, linville, lists, lmiccini, lslebodn, mailinglists35, marcel.raad, marko.bevc, masami256, massi.ergosum, mchehab, mihai, mjg59, mparkins, myeservices+fedoraproject.org, oholy, pachoramos1, pep, pweil, rafsoon, redhat, redhat, sanjay.ankur, sb, sbroz, seldridg, steved, szidek, vitaly, votava, vrutkovs, youling257

Target Milestone:

---

Target Release:

---

Hardware:

Unspecified

OS:

Unspecified

Last Closed:

2020-02-24 08:46:04 UTC

Type:

Bug

Attachments:

Description	Flags
dmesg	none

Description Chris Murphy 2019-12-07 00:50:29 UTC

Created attachment 1642785 [details]
dmesg

1. Please describe the problem:

Complete GUI hang, cannot switch to tty, and cannot ssh into the machine either.


2. What is the Version-Release number of the kernel:
5.4.2-300.fc31.x86_64

3. Did it work previously in Fedora? If so, what kernel version did the issue
   *first* appear?  Old kernels are available for download at
   https://koji.fedoraproject.org/koji/packageinfo?packageID=8 :

I've never had a full lockup like this before 5.4.2-300.fc31.x86_64, though I did only minimal testing with 5.4.0-2.fc32.x86_64.



4. Can you reproduce this issue? If so, please provide the steps to reproduce
   the issue below:

Uncertain so far.


5. Does this problem occur with the latest Rawhide kernel? To install the
   Rawhide kernel, run ``sudo dnf install fedora-repos-rawhide`` followed by
   ``sudo dnf update --enablerepo=rawhide kernel``:

Uncertain of scope.


6. Are you running any modules that are not shipped directly with Fedora's kernel?:

No.

7. Please attach the kernel logs. You can get the complete kernel log
   for a boot with ``journalctl --no-hostname -k > dmesg.txt``. If the
   issue occurred on a previous boot, use the journalctl ``-b`` flag.
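
The log collection described in step 7 can be scripted. The following is a minimal sketch, assuming `journalctl` is on the PATH; the function names and the default output file name are illustrative and not part of the bug template.

```python
import subprocess


def build_journalctl_cmd(boot_offset=0):
    """Build the journalctl argv for the kernel messages of one boot.

    boot_offset 0 means the current boot, -1 the previous boot, and so on.
    """
    cmd = ["journalctl", "--no-hostname", "-k"]
    if boot_offset != 0:
        # The -b flag selects another boot, as mentioned in step 7.
        cmd += ["-b", str(boot_offset)]
    return cmd


def save_kernel_log(path="dmesg.txt", boot_offset=0):
    """Write the kernel log of the selected boot to `path`."""
    with open(path, "w") as out:
        subprocess.run(build_journalctl_cmd(boot_offset), stdout=out, check=True)
```

After a forced power-off and reboot, `save_kernel_log(boot_offset=-1)` would capture the boot in which the hang happened.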

Comment 1 Chris Murphy 2019-12-07 00:51:25 UTC

00:02.0 VGA compatible controller [0300]: Intel Corporation Skylake GT2 [HD Graphics 520] [8086:1916] (rev 07) (prog-if 00 [VGA controller])
	Subsystem: Hewlett-Packard Company Device [103c:81a0]

model name	: Intel(R) Core(TM) i7-6500U CPU @ 2.50GHz

Comment 2 Chris Murphy 2019-12-07 01:09:29 UTC

Excerpts for search.


Dec 06 17:39:54 flap.local kernel: i915 0000:00:02.0: GPU HANG: ecode 9:1:0x00000000, hang on rcs0
Dec 06 17:39:54 flap.local kernel: i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
Dec 06 17:39:54 flap.local kernel: [drm:gen8_reset_engines [i915]] *ERROR* rcs0 reset request timed out: {request: 00000001, RESET_CTL: 00000001}
Dec 06 17:39:57 flap.local kernel: Asynchronous wait on fence i915:gnome-shell[1470]:6952 timed out (hint:intel_atomic_commit_ready+0x0/0x50 [i915])
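
To check whether another log shows the same symptoms, the excerpts above can be turned into search patterns. This is only a sketch: the regular expressions are derived from the four lines quoted in this comment, and the names `SIGNATURES` and `count_signatures` are made up for illustration.

```python
import re

# Patterns derived from the excerpts in this comment (illustrative names).
SIGNATURES = {
    "gpu_hang": re.compile(r"i915 \S+: GPU HANG: ecode (\S+), "),
    "reset": re.compile(r"i915 \S+: Resetting (\w+) for "),
    "reset_timeout": re.compile(r"\*ERROR\* (\w+) reset request timed out"),
    "fence_timeout": re.compile(r"Asynchronous wait on fence i915:"),
}


def count_signatures(lines):
    """Count how many lines match each known i915 hang signature."""
    counts = {name: 0 for name in SIGNATURES}
    for line in lines:
        for name, pattern in SIGNATURES.items():
            if pattern.search(line):
                counts[name] += 1
    return counts
```

Feeding it the lines of a saved `dmesg.txt` gives a quick per-signature tally that can be compared between kernel versions.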

Comment 3 Chris Murphy 2019-12-10 21:24:53 UTC

Remove dup bug ID, replace with actual.

Comment containing commit reference that fixes this:
https://gitlab.freedesktop.org/drm/intel/issues/673#note_359912

Comment 4 Chris Murphy 2019-12-20 03:28:37 UTC

Still happens with 5.4.5-300.fc31.x86_64, but not 5.5.0rc1 or rc2.

Comment 5 Justin M. Forbes 2019-12-23 15:33:45 UTC

The patch was submitted to stable and rejected because it doesn't apply to 5.4.  I will give it a little time to see if it is properly backported before doing a 5.4.6 build.

Comment 6 youling257 2019-12-25 03:40:33 UTC

I have a similar problem with kernel 5.5-rc3.

[  541.644847] Asynchronous wait on fence i915:surfaceflinger[1495]:119c8 timed out (hint:intel_atomic_commit_ready+0x0/0x50 [i915])
[  546.268573] i915 0000:00:02.0: GPU HANG: ecode 8:1:0x84dfbffe, in surfaceflinger [1495], stopped heartbeat on rcs0
[  546.268622] GPU hangs can indicate a bug anywhere in the entire gfx stack, including userspace.
[  546.268689] Please file a _new_ bug report on bugs.freedesktop.org against DRI -> DRM/Intel
[  546.268755] drm/i915 developers can then reassign to the right component if it's not a kernel issue.
[  546.268821] The GPU crash dump is required to analyze GPU hangs, so please always attach it.
[  546.268887] GPU crash dump saved to /sys/class/drm/card0/error
[  546.372596] i915 0000:00:02.0: Resetting rcs0 for stopped heartbeat on rcs0
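
Since the kernel asks for the crash dump to be attached, it may help to copy it out of sysfs before rebooting. A minimal sketch, assuming the dump path printed in the log above; the function name and destination file name are illustrative, and reading the file may require root privileges.

```python
def save_crash_dump(src="/sys/class/drm/card0/error", dest="i915-error.txt"):
    """Copy the GPU crash dump to a regular file.

    Reads in chunks rather than relying on the reported file size,
    and returns the number of bytes written.
    """
    written = 0
    with open(src, "rb") as fin, open(dest, "wb") as fout:
        while True:
            chunk = fin.read(65536)
            if not chunk:
                break
            fout.write(chunk)
            written += len(chunk)
    return written
```

The resulting file is what an upstream report would carry as an attachment.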

Comment 7 Ingo Weiss 2020-01-06 16:00:10 UTC

Hi,

I'm experiencing the same issue on Fedora 31 with kernel 5.4.7-200.

Computer: Lenovo ThinkPad T580
GPU: Intel UHD 620
   00:02.0 VGA compatible controller: Intel Corporation UHD Graphics 620 (rev 07) (prog-if 00 [VGA controller])
           Subsystem: Lenovo Device 225a
           Flags: bus master, fast devsel, latency 0, IRQ 153
           Memory at eb000000 (64-bit, non-prefetchable) [size=16M]
           Memory at a0000000 (64-bit, prefetchable) [size=256M]
           I/O ports at e000 [size=64]
           [virtual] Expansion ROM at 000c0000 [disabled] [size=128K]
           Capabilities: <access denied>
           Kernel driver in use: i915
           Kernel modules: i915
CPU: Intel Core i7-8650U
BIOS version: N27ET36W (1.22)

Reverting to 5.3.16-300 for the time being since it doesn't have this issue.

Comment 8 Chris Murphy 2020-01-06 16:43:01 UTC

The upstream issue reports that backporting the fix from 5.5 to 5.4 is non-trivial. There have now been a few attempts at reverting the change that introduced the problem, so even the revert is apparently not straightforward. Skylake and Kaby Lake CPUs are affected, but I'm not sure whether it's all of them or only a subset.

Comment 9 Diego Vasconcelos 2020-01-07 23:50:24 UTC

kernel: 5.4.8-200.fc31.x86_64
CPU: i5-8400
GPU: UHD Intel 630

i915 0000:00:02.0: GPU HANG: ecode 9:1:0x00000000, hang on rcs0
GPU hangs can indicate a bug anywhere in the entire gfx stack, including userspace.
Please file a _new_ bug report on bugs.freedesktop.org against DRI -> DRM/Intel
drm/i915 developers can then reassign to the right component if it's not a kernel issue.
The GPU crash dump is required to analyze GPU hangs, so please always attach it.
GPU crash dump saved to /sys/class/drm/card0/error
i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
[drm:gen8_reset_engines [i915]] *ERROR* rcs0 reset request timed out: {request: 00000001, RESET_CTL: 00000001}
i915 0000:00:02.0: Resetting chip for hang on rcs0
[drm:gen8_reset_engines [i915]] *ERROR* rcs0 reset request timed out: {request: 00000001, RESET_CTL: 00000001}
[drm:gen8_reset_engines [i915]] *ERROR* rcs0 reset request timed out: {request: 00000001, RESET_CTL: 00000001}

Comment 10 Mirek Svoboda 2020-01-08 14:16:43 UTC

I'm experiencing the same issue. It never happened with the 5.3 kernel.

My kernel:

```
Jan 08 10:04:23 localhost.localdomain kernel: microcode: microcode updated early to revision 0xca, date = 2019-09-26
Jan 08 10:04:23 localhost.localdomain kernel: Linux version 5.4.7-200.fc31.x86_64 (mockbuild.fedoraproject.org) (gcc version 9.2.1 20190827 (Red Hat 9.2.1-1) (GCC)) #1 SMP Tue Dec 31 22:25:12 UTC 2019
Jan 08 10:04:23 localhost.localdomain kernel: Command line: BOOT_IMAGE=(hd0,gpt5)/vmlinuz-5.4.7-200.fc31.x86_64 root=/dev/mapper/luks-b6994190-43c4-42f1-bc49-ab5cd4717038 ro rd.luks.uuid=luks-b6994190-43c4-42f1-bc49-ab5cd4717038 rd.lvm.lv=outer/fedora scsi_mod.use_blk_mq=1 noibrsnoibpb nopti nospectre_v2 nospectre_v1 l1tf=off nospec_store_bypass_disable no_stf_barrier mds=off mitigations=off
```

My hardware is i5-7200U.

Comment 11 Jakub T. Jankiewicz 2020-01-10 20:47:26 UTC

I have a similar issue on Fedora 30.

kernel: 5.4.8-100.fc30.x86_64.
Hardware: Laptop Dell Inspiron 15 5570 i7-8550U 

In my case, though, I was able to switch to a tty after a few tries. It sometimes freezes on the screensaver and sometimes doesn't (it's random). I upgraded from Fedora 29 to Fedora 30 a few days ago and had no such issues with Fedora 29.

Comment 12 Rafal 2020-01-14 16:16:15 UTC

I experienced a complete hang, without being able to do anything, a few times over the past weeks with several 5.4.x kernel versions. Today, after updating to 5.4.10, I experienced a hang that cleared by itself after a few seconds. Logs:

Jan 14 16:56:00 x1 kernel: i915 0000:00:02.0: Resetting rcs0 for stuck wait on rcs0
Jan 14 16:56:03 x1 kernel: Asynchronous wait on fence i915:gnome-shell[2073]:2672e timed out (hint:intel_atomic_commit_ready+0x0/0x50 [i915])
Jan 14 16:56:08 x1 kernel: i915 0000:00:02.0: GPU HANG: ecode 9:1:0x85dfbfff, in code [3378], hang on rcs0
Jan 14 16:56:08 x1 kernel: GPU hangs can indicate a bug anywhere in the entire gfx stack, including userspace.
Jan 14 16:56:08 x1 kernel: Please file a _new_ bug report on bugs.freedesktop.org against DRI -> DRM/Intel
Jan 14 16:56:08 x1 kernel: drm/i915 developers can then reassign to the right component if it's not a kernel issue.
Jan 14 16:56:08 x1 kernel: The GPU crash dump is required to analyze GPU hangs, so please always attach it.
Jan 14 16:56:08 x1 kernel: GPU crash dump saved to /sys/class/drm/card0/error
Jan 14 16:56:08 x1 kernel: i915 0000:00:02.0: Resetting rcs0 for hang on rcs0

Hardware:
Lenovo ThinkPad X1 Carbon 5th Gen

$ lspci -vs 00:02
00:02.0 VGA compatible controller: Intel Corporation HD Graphics 620 (rev 02) (prog-if 00 [VGA controller])
	Subsystem: Lenovo ThinkPad X1 Carbon 5th Gen
	Flags: bus master, fast devsel, latency 0, IRQ 130
	Memory at eb000000 (64-bit, non-prefetchable) [size=16M]
	Memory at 60000000 (64-bit, prefetchable) [size=256M]
	I/O ports at e000 [size=64]
	[virtual] Expansion ROM at 000c0000 [disabled] [size=128K]
	Capabilities: <access denied>
	Kernel driver in use: i915
	Kernel modules: i915

$ lscpu | grep 'Model name'
Model name:                      Intel(R) Core(TM) i5-7300U CPU @ 2.60GHz

Comment 13 Tadej Janež 2020-01-15 12:15:49 UTC

AFAICS, the backport patch "drm/i915/gt: Detect if we miss WaIdleLiteRestore" was added to F31 2 days ago:
https://src.fedoraproject.org/rpms/kernel/c/9607b5faaa81022ed8b97f517c766202f9680744?branch=f31

It should be part of kernel-5.4.11-202.fc31:
https://bodhi.fedoraproject.org/updates/FEDORA-2020-3738c94456

And the new kernel-5.4.12-200.fc31:
https://bodhi.fedoraproject.org/updates/FEDORA-2020-e328697628

Comment 14 Chris Murphy 2020-01-16 20:50:03 UTC

In my opinion this patch should be reverted in Fedora kernels. It makes the problem unquestionably worse: it takes longer to experience the problem, but once it happens, it's a hard crash. I can't ssh in. I can't switch to a VT. System gets hot, fans go to max, and I have to force power off.

Comment 15 Jakub T. Jankiewicz 2020-01-16 22:37:21 UTC

I always install new kernels. I'm not sure which one was running (I think it was 5.4.10-100.fc30.x86_64; I had installed 5.4.11-102.fc30.x86_64 but hadn't rebooted for it to take effect), but I also got a hard crash. Unlike before, I was not able to switch to a TTY with a few key presses.

Comment 16 Rafal 2020-01-17 09:14:50 UTC

I experienced hard crashes exactly like you described (including overheating) with earlier versions too, I think with both 5.4.7 and 5.4.8.

Comment 17 Marc 2020-01-20 13:21:20 UTC

I had this issue on Manjaro with KDE. The problem doesn't seem to happen with the LTS kernel version 4.19.96-1-MANJARO.

Comment 19 Anthony Messina 2020-01-20 22:13:14 UTC

I face similar issues with 5.4.12-200.fc31.x86_64 but NOT with 5.4.10-200.fc31.x86_64

Using xorg-x11-drv-intel-2.99.917-43.20180618.fc31.x86_64

The issue seems to reproduce more easily with google-chrome or KDE Kontact (QtWebEngine).

00:02.0 VGA compatible controller: Intel Corporation Iris Plus Graphics 650 (rev 06) (prog-if 00 [VGA controller])
        DeviceName:  CPU
        Subsystem: Intel Corporation Device 2068
        Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
        Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
        Latency: 0, Cache Line Size: 64 bytes
        Interrupt: pin A routed to IRQ 128
        Region 0: Memory at db000000 (64-bit, non-prefetchable) [size=16M]
        Region 2: Memory at 90000000 (64-bit, prefetchable) [size=256M]
        Region 4: I/O ports at f000 [size=64]
        [virtual] Expansion ROM at 000c0000 [disabled] [size=128K]
        Capabilities: [40] Vendor Specific Information: Len=0c <?>
        Capabilities: [70] Express (v2) Root Complex Integrated Endpoint, MSI 00
                DevCap: MaxPayload 128 bytes, PhantFunc 0
                        ExtTag- RBE+
                DevCtl: CorrErr- NonFatalErr- FatalErr- UnsupReq-
                        RlxdOrd- ExtTag- PhantFunc- AuxPwr- NoSnoop-
                        MaxPayload 128 bytes, MaxReadReq 128 bytes
                DevSta: CorrErr- NonFatalErr- FatalErr- UnsupReq- AuxPwr- TransPend-
                DevCap2: Completion Timeout: Not Supported, TimeoutDis-, LTR-, OBFF Not Supported
                         AtomicOpsCap: 32bit- 64bit- 128bitCAS-
                DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-, LTR-, OBFF Disabled
                         AtomicOpsCtl: ReqEn-
        Capabilities: [ac] MSI: Enable+ Count=1/1 Maskable- 64bit-
                Address: fee00018  Data: 0000
        Capabilities: [d0] Power Management version 2
                Flags: PMEClk- DSI+ D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
                Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME-
        Capabilities: [100 v1] Process Address Space ID (PASID)
                PASIDCap: Exec- Priv-, Max PASID Width: 14
                PASIDCtl: Enable- Exec- Priv-
        Capabilities: [200 v1] Address Translation Service (ATS)
                ATSCap: Invalidate Queue Depth: 00
                ATSCtl: Enable-, Smallest Translation Unit: 00
        Capabilities: [300 v1] Page Request Interface (PRI)
                PRICtl: Enable- Reset-
                PRISta: RF- UPRGI- Stopped+
                Page Request Capacity: 00008000, Page Request Allocation: 00000000
        Kernel driver in use: i915
        Kernel modules: i915



[drm] Reducing the compressed framebuffer size. This may lead to less power savings than a non-reduced-size. Try to increase stolen memory size if available in BIOS.
[drm] Reducing the compressed framebuffer size. This may lead to less power savings than a non-reduced-size. Try to increase stolen memory size if available in BIOS.
perf: interrupt took too long (2512 > 2500), lowering kernel.perf_event_max_sample_rate to 79000
perf: interrupt took too long (3155 > 3140), lowering kernel.perf_event_max_sample_rate to 63000
perf: interrupt took too long (3946 > 3943), lowering kernel.perf_event_max_sample_rate to 50000
i915 0000:00:02.0: GPU HANG: ecode 9:1:0x00000000, hang on rcs0
GPU hangs can indicate a bug anywhere in the entire gfx stack, including userspace.
Please file a _new_ bug report on bugs.freedesktop.org against DRI -> DRM/Intel
drm/i915 developers can then reassign to the right component if it's not a kernel issue.
The GPU crash dump is required to analyze GPU hangs, so please always attach it.
GPU crash dump saved to /sys/class/drm/card0/error
i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
[drm:gen8_reset_engines [i915]] *ERROR* rcs0 reset request timed out: {request: 00000001, RESET_CTL: 00000001}
i915 0000:00:02.0: Resetting chip for hang on rcs0
[drm:gen8_reset_engines [i915]] *ERROR* rcs0 reset request timed out: {request: 00000001, RESET_CTL: 00000001}
[drm:gen8_reset_engines [i915]] *ERROR* rcs0 reset request timed out: {request: 00000001, RESET_CTL: 00000001}
i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
show_signal_msg: 58 callbacks suppressed
GpuWatchdog[17683]: segfault at 0 ip 00005591410d6ded sp 00007f61f5d9b500 error 6 in chrome[55913d19b000+7171000]
Code: 48 c1 c9 03 48 81 f9 af 00 00 00 0f 87 c9 00 00 00 48 8d 15 a9 5a 9c fb f6 04 11 20 0f 84 b8 00 00 00 be 01 00 00 00 ff 50 30 <c7> 04 25 00 00 00 00 37 13 00 00 c6 05 c1 6d a4 03 01 80 7d 8f 00
i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
i915 0000:00:02.0: GPU recovery timed out, cancelling all in-flight rendering.
i915 0000:00:02.0: Resetting chip for hang on rcs0
i915 0000:00:02.0: GPU recovery timed out, cancelling all in-flight rendering.
i915 0000:00:02.0: Resetting chip for hang on rcs0
[drm] Reducing the compressed framebuffer size. This may lead to less power savings than a non-reduced-size. Try to increase stolen memory size if available in BIOS.
GpuWatchdog[18003]: segfault at 0 ip 0000557e368added sp 00007fc750c74500 error 6 in chrome[557e32972000+7171000]
Code: 48 c1 c9 03 48 81 f9 af 00 00 00 0f 87 c9 00 00 00 48 8d 15 a9 5a 9c fb f6 04 11 20 0f 84 b8 00 00 00 be 01 00 00 00 ff 50 30 <c7> 04 25 00 00 00 00 37 13 00 00 c6 05 c1 6d a4 03 01 80 7d 8f 00
[drm] Reducing the compressed framebuffer size. This may lead to less power savings than a non-reduced-size. Try to increase stolen memory size if available in BIOS.
i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
i915 0000:00:02.0: GPU recovery timed out, cancelling all in-flight rendering.
i915 0000:00:02.0: Resetting chip for hang on rcs0
i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
i915 0000:00:02.0: GPU recovery timed out, cancelling all in-flight rendering.
i915 0000:00:02.0: Resetting chip for hang on rcs0
i915 0000:00:02.0: GPU recovery timed out, cancelling all in-flight rendering.
i915 0000:00:02.0: Resetting chip for hang on rcs0
i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
Asynchronous wait on fence i915:kwin_x11[2220]:8ef3a timed out (hint:intel_atomic_commit_ready+0x0/0x50 [i915])
i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
i915 0000:00:02.0: GPU recovery timed out, cancelling all in-flight rendering.
i915 0000:00:02.0: Resetting chip for hang on rcs0
i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
Asynchronous wait on fence i915:Xorg[1360]:cfa86 timed out (hint:intel_atomic_commit_ready+0x0/0x50 [i915])
i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
[drm] Reducing the compressed framebuffer size. This may lead to less power savings than a non-reduced-size. Try to increase stolen memory size if available in BIOS.
Lockdown: Xorg: raw io port access is restricted; see man kernel_lockdown.7
broken atomic modeset userspace detected, disabling atomic
i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
i915 0000:00:02.0: GPU recovery timed out, cancelling all in-flight rendering.
i915 0000:00:02.0: Resetting chip for hang on rcs0
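
A log like the one above is mostly one message repeated many times, which hides the few lines that differ. The sketch below collapses consecutive identical lines into a single line with a repeat count; the function name and the annotation format are illustrative choices, not an existing tool.

```python
from itertools import groupby


def collapse_repeats(lines):
    """Collapse runs of identical consecutive lines into one annotated line."""
    collapsed = []
    for line, run in groupby(lines):
        count = sum(1 for _ in run)
        if count == 1:
            collapsed.append(line)
        else:
            collapsed.append(f"{line}  [repeated {count} times]")
    return collapsed
```

This only works on logs without per-line timestamps, as pasted here; with timestamps, the prefix would have to be stripped before comparing lines.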

Comment 20 Steven Bakker 2020-01-21 12:13:01 UTC

This issue affects me on 5.4.10 and every later version I've tried (it just happened again with 5.4.12). I've reverted to a 5.3.16 kernel, which is stable in this respect.

In my case, it always happens when I have an external monitor connected over USB-C through my Dell TB dock. It may take a few minutes, it may take an hour, but eventually the whole desktop will hang.

It happens on both X11 and Wayland.

I have not seen this happen when running without external monitors, but perhaps I've not used it untethered for long enough.

Perhaps unrelated: since the 5.4 kernels, I often have problems suspending (typical case: time to go home, disconnect dock, press power button to put laptop to sleep, fail).

Comment 21 Jakub T. Jankiewicz 2020-01-21 15:33:48 UTC

With 5.4.10-100.fc30.x86_64 I got a different type of error. The screen was flickering (like a really slow refresh rate). I was able to move the cursor, and it changed shape when I hovered over an input field, but the desktop was otherwise unresponsive (I was not able to move windows). I was able to switch to a TTY, and after I restarted the display-server service, my system continued running.

Here is the end of dmesg:

sty 21 16:00:36 kernel: i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
sty 21 16:00:38 kernel: i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
sty 21 16:00:40 kernel: i915 0000:00:02.0: GPU recovery timed out, cancelling all in-flight rendering.
sty 21 16:00:40 kernel: i915 0000:00:02.0: Resetting chip for hang on rcs0
sty 21 16:00:41 kernel: [drm] PCIE GART of 256M enabled (table at 0x000000F400000000).
sty 21 16:00:41 kernel: amdgpu: [powerplay] can't get the mac of 5
sty 21 16:00:47 kernel: amdgpu: [powerplay] VI should always have 2 performance levels
sty 21 16:00:47 kernel: amdgpu 0000:01:00.0: GPU pci config reset
sty 21 16:00:48 kernel: i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
sty 21 16:00:50 kernel: i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
sty 21 16:00:52 kernel: i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
sty 21 16:00:56 kernel: i915 0000:00:02.0: Resetting rcs0 for no progress on rcs0
sty 21 16:00:58 kernel: i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
sty 21 16:01:00 kernel: i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
sty 21 16:01:02 kernel: i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
sty 21 16:01:04 kernel: i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
sty 21 16:01:06 kernel: i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
sty 21 16:01:08 kernel: i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
sty 21 16:01:10 kernel: i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
sty 21 16:01:12 kernel: i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
sty 21 16:01:14 kernel: i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
sty 21 16:01:16 kernel: i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
sty 21 16:01:18 kernel: i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
sty 21 16:01:20 kernel: i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
sty 21 16:01:22 kernel: i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
sty 21 16:01:24 kernel: i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
sty 21 16:01:26 kernel: i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
sty 21 16:01:28 kernel: i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
sty 21 16:01:30 kernel: i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
sty 21 16:01:32 kernel: i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
sty 21 16:01:34 kernel: i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
sty 21 16:01:36 kernel: i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
sty 21 16:01:38 kernel: i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
sty 21 16:01:40 kernel: i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
sty 21 16:01:42 kernel: i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
sty 21 16:01:44 kernel: i915 0000:00:02.0: GPU recovery timed out, cancelling all in-flight rendering.
sty 21 16:01:44 kernel: i915 0000:00:02.0: Resetting chip for hang on rcs0
sty 21 16:01:45 kernel: [drm] PCIE GART of 256M enabled (table at 0x000000F400000000).
sty 21 16:01:45 kernel: amdgpu: [powerplay] can't get the mac of 5
sty 21 16:01:46 kernel: i915 0000:00:02.0: GPU recovery timed out, cancelling all in-flight rendering.
sty 21 16:01:46 kernel: i915 0000:00:02.0: Resetting chip for no progress on rcs0
sty 21 16:01:52 kernel: amdgpu: [powerplay] VI should always have 2 performance levels
sty 21 16:01:53 kernel: amdgpu 0000:01:00.0: GPU pci config reset
sty 21 16:01:54 kernel: i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
sty 21 16:02:02 kernel: i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
sty 21 16:18:20 kernel: [drm] PCIE GART of 256M enabled (table at 0x000000F400000000).
sty 21 16:18:20 kernel: amdgpu: [powerplay] can't get the mac of 5
sty 21 16:18:32 kernel: amdgpu: [powerplay] VI should always have 2 performance levels
sty 21 16:18:32 kernel: amdgpu 0000:01:00.0: GPU pci config reset

A few lines before that, I got these lines:

sty 21 15:55:05 kernel: GpuWatchdog[2933]: segfault at 0 ip 000055878691877d sp 00007fdd3139e480 error 6 in chrome[5587829dd000+7170000]
sty 21 15:55:05 kernel: Code: 48 c1 c9 03 48 81 f9 af 00 00 00 0f 87 c9 00 00 00 48 8d 15 19 61 9c fb f6 04 11 20 0f 84 b8 00 00 00 be 01 00 00 00 ff 50 30 <c7> 04 25 00 00 00 00 37 13 00 00 c6 05 f1 6b a4 03 01 80 7d 8f 00

I'm not sure if this is related, but for some time I have also been getting CPU temperature warnings. I don't know whether something is wrong with the fan.

sty 16 13:15:09 kernel: mce: CPU4: Core temperature above threshold, cpu clock throttled (total events = 17)
sty 16 13:15:09 kernel: mce: CPU0: Core temperature above threshold, cpu clock throttled (total events = 17)
sty 16 13:15:09 kernel: mce: CPU1: Package temperature above threshold, cpu clock throttled (total events = 20)
sty 16 13:15:09 kernel: mce: CPU5: Package temperature above threshold, cpu clock throttled (total events = 20)
sty 16 13:15:09 kernel: mce: CPU3: Package temperature above threshold, cpu clock throttled (total events = 20)
sty 16 13:15:09 kernel: mce: CPU7: Package temperature above threshold, cpu clock throttled (total events = 20)
sty 16 13:15:09 kernel: mce: CPU2: Package temperature above threshold, cpu clock throttled (total events = 20)
sty 16 13:15:09 kernel: mce: CPU6: Package temperature above threshold, cpu clock throttled (total events = 20)
sty 16 13:15:09 kernel: mce: CPU0: Package temperature above threshold, cpu clock throttled (total events = 20)
sty 16 13:15:09 kernel: mce: CPU4: Package temperature above threshold, cpu clock throttled (total events = 20)
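
For anyone who wants to compare kernels by how often the driver resets, the i915 "Resetting ... for ..." lines shown above can be tallied from the kernel log. This is only a rough sketch: the function name count_i915_resets is made up for illustration, and the pattern is assumed from the messages quoted in this comment, so other kernels may word them differently.

```shell
# Hypothetical helper: count i915 engine/chip reset messages read from stdin.
# The pattern is an assumption based on the log lines quoted in this bug.
count_i915_resets() {
    # grep -c prints the number of matching lines; it exits non-zero when
    # the count is 0, so "|| true" keeps the function itself successful.
    grep -c -E 'i915 .*: Resetting [[:alnum:]]+ for ' || true
}

# Usage: kernel messages from the current boot
#   journalctl -k -b | count_i915_resets
```

A count of zero after a day of normal use would match what the 5.5.x reports in this bug describe, but it does not prove the hang is gone.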

Comment 22 Chris Murphy 2020-01-21 20:53:25 UTC

I've seen the problem with and without an external display connected.

The consistent trigger in my case is a particular chat app (yakyak), which uses npm and Electron. I'd characterize it as "will inevitably result in a GPU hang"; right now I can't think of any other activity that triggers it.
https://github.com/yakyak/yakyak

Comment 23 Christian Kujau 2020-01-22 23:43:32 UTC

I've seen the "hang on rcs0" messages since 5.4.8-200.fc31.x86_64 and they too appear to be triggered by a chat application ("Rambox") using the Electron framework. Now I have 5.4.12-200.fc31.x86_64 and experienced a complete hang of the machine, no log messages and no reaction to sysrq. Netconsole won't work because this laptop is connected via WiFi.

The Freedesktop issue #673 is closed - any chance we can get Linux 5.5 into updates-testing maybe?


# lspci -s 00:02.0 -vvv
00:02.0 VGA compatible controller: Intel Corporation HD Graphics 620 (rev 02) (prog-if 00 [VGA controller])
        Subsystem: Lenovo Device 2245
        Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
        Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
        Latency: 0
        Interrupt: pin A routed to IRQ 129
        Region 0: Memory at eb000000 (64-bit, non-prefetchable) [size=16M]
        Region 2: Memory at a0000000 (64-bit, prefetchable) [size=256M]
        Region 4: I/O ports at e000 [size=64]
        [virtual] Expansion ROM at 000c0000 [disabled] [size=128K]
        Capabilities: [40] Vendor Specific Information: Len=0c <?>
        Capabilities: [70] Express (v2) Root Complex Integrated Endpoint, MSI 00
                DevCap: MaxPayload 128 bytes, PhantFunc 0
                        ExtTag- RBE+
                DevCtl: CorrErr- NonFatalErr- FatalErr- UnsupReq-
                        RlxdOrd- ExtTag- PhantFunc- AuxPwr- NoSnoop-
                        MaxPayload 128 bytes, MaxReadReq 128 bytes
                DevSta: CorrErr- NonFatalErr- FatalErr- UnsupReq- AuxPwr- TransPend-
                DevCap2: Completion Timeout: Not Supported, TimeoutDis-, LTR-, OBFF Not Supported
                         AtomicOpsCap: 32bit- 64bit- 128bitCAS-
                DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-, LTR-, OBFF Disabled
                         AtomicOpsCtl: ReqEn-
        Capabilities: [ac] MSI: Enable+ Count=1/1 Maskable- 64bit-
                Address: fee00018  Data: 0000
        Capabilities: [d0] Power Management version 2
                Flags: PMEClk- DSI+ D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
                Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME-
        Capabilities: [100 v1] Process Address Space ID (PASID)
                PASIDCap: Exec- Priv-, Max PASID Width: 14
                PASIDCtl: Enable- Exec- Priv-
        Capabilities: [200 v1] Address Translation Service (ATS)
                ATSCap: Invalidate Queue Depth: 00
                ATSCtl: Enable-, Smallest Translation Unit: 00
        Capabilities: [300 v1] Page Request Interface (PRI)
                PRICtl: Enable- Reset-
                PRISta: RF- UPRGI- Stopped+
                Page Request Capacity: 00008000, Page Request Allocation: 00000000
        Kernel driver in use: i915
        Kernel modules: i915

Comment 24 Chris Murphy 2020-01-23 01:45:33 UTC

5.5 kernels are only available in rawhide, so they won't go to updates-testing, but you can grab them off koji. They install and run fine on F31 (and I imagine F30 too, but I haven't tested it), which is what I'm doing.
https://koji.fedoraproject.org/koji/packageinfo?packageID=8

My suggestion is to grab the arch-specific kernel, kernel-core, and kernel-modules RPMs, and just do
$ sudo dnf install Downloads/*rpm

The ones with git0 in the name are non-debug builds (the debug version is explicitly named), whereas git1, git2, and git3 are all debug kernels and run a bit slower.
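
If the download directory holds other RPMs as well, the glob above would pick them all up. A minimal sketch of a stricter selection follows. The function name pick_kernel_rpms is hypothetical, and it assumes the files keep the usual name-version-release.arch.rpm naming that koji serves.

```shell
# Hypothetical helper: print the paths of the three packages for one build
# and fail if any of them is missing.
# Arguments: <download dir> <version-release> [arch, default x86_64]
pick_kernel_rpms() {
    local dir="$1"
    local vr="$2"
    local arch="${3:-x86_64}"
    local pkg
    local file
    for pkg in kernel kernel-core kernel-modules; do
        file="$dir/$pkg-$vr.$arch.rpm"
        if [ -f "$file" ]; then
            printf '%s\n' "$file"
        else
            printf 'missing: %s\n' "$file" >&2
            return 1
        fi
    done
}

# Usage (the version-release here is only an example):
#   sudo dnf install $(pick_kernel_rpms ~/Downloads 5.5.2-200.fc31)
```

Listing the files explicitly keeps debug or unrelated packages in the same directory out of the transaction.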

Comment 25 Steven Bakker 2020-01-25 16:13:10 UTC

So, today I tried 5.4.13-201. Took me all of 3 minutes to hang it completely.

At this point, I have given up on the 5.4 kernel series completely. It's garbage. I don't care if it's only the Intel driver that's causing issues. It's making 5.4 unusable and therefore it's garbage.

I tried 5.5 for a while, which works (no hangs), but causes suspend to fail at some point. This is very undesirable, because a running laptop in a snug backpack is a recipe for overheating. The workaround is to shut down when I leave the office and cold boot when I get home. So 5.5 is also garbage, although less toxic than 5.4 (and, incidentally, 5.4 had the same suspend issues).

I am now back on 5.3.16-300. No hangs, no suspend failures.

I'm not installing kernel updates anymore.

Comment 26 Rafal 2020-02-03 10:54:06 UTC

The issue is still observed in 5.4.15, so I'm also staying with 5.3 series for now.

Comment 27 Chris Bredesen 2020-02-03 12:59:07 UTC

I've got this also, with the same error listed above. Happy to provide any debugging information required. Workaround is the same for me, 5.3 kernel.

Comment 31 Stefan Becker 2020-02-08 17:40:50 UTC

So I wasn't imagining things... In my case I've mainly experienced temporary hangs, i.e. I just needed to wait 30-60 seconds for the system to come back. Highly annoying when you are working on something.

I'll install 5.3.16 and 5.5.2 from koji and re-test.

System Information
        Manufacturer: LENOVO
        Product Name: 20NX000EMX
        Version: ThinkPad T490s
...
        SKU Number: LENOVO_MT_20NX_BU_Think_FM_ThinkPad T490s
        Family: ThinkPad T490s

00:02.0 VGA compatible controller: Intel Corporation UHD Graphics 620 (Whiskey Lake) (prog-if 00 [VGA controller])
        Subsystem: Lenovo Device 2286

model name      : Intel(R) Core(TM) i7-8565U CPU @ 1.80GHz

Comment 32 François Cami 2020-02-14 14:24:44 UTC

The 5.5.2 kernel build from https://koji.fedoraproject.org/koji/buildinfo?buildID=1457411 fixed the issue on my Thinkpad T490S which exhibited the issue running 5.4.x.

Comment 33 Chris Bredesen 2020-02-14 14:40:38 UTC

Thank you Francois - did you install only kernel-5.5 or also kernel-modules-*? What version were those? I will probably wait till this hits updates-testing...

Comment 34 François Cami 2020-02-14 14:44:46 UTC

kernel, kernel-core and kernel-modules using rpm -ivh <rpms>, all downloaded from the url above (so same version: 5.5.2-200).

Comment 35 Stefan Becker 2020-02-14 14:45:26 UTC

I'm running kernel-5.5.2-200.fc31.x86_64 on my T490s, uptime about 6 days (OK, with hibernation periods in-between :-). The kernel log doesn't show the error message any more and I haven't experienced any desktop lockups during that time.

Comment 36 Marko Bevc 2020-02-16 17:55:05 UTC

Stefan, exactly the same config here and 5.4 still seems broken :(

Is this going to be backported, or do we need to wait until 5.5?

Thanks.

Comment 37 Wayne L. 2020-02-17 11:37:25 UTC

Hi Stefan, Marko

Same here. Installed kernel 5.5.3-200.fc31.x86_64 and running normally for the past 15 minutes as compared to 5.4.x, which crashes within 5 mins of use.

Comment 38 Marko Bevc 2020-02-17 14:18:16 UTC

Yeah, did the same here yesterday. 5.5.x tree seems fine so far.

Comment 39 Jakub T. Jankiewicz 2020-02-17 14:44:53 UTC

A few days ago I installed 5.0.9, the latest in the main Fedora repo (as shown by dnf), and I also got a crash. In my case it takes a little longer, about a day or two before the crash, but it also happens with the kernel from the stable Fedora repo.

I had hoped that installing 5.0.9 would fix the issue, but I guess I need to try 5.5.x.

Comment 40 John L Magee 2020-02-22 14:12:53 UTC

Yesterday morning I installed kernel-*-5.5.5-200.fc31 from koji.  Haven't had a single issue since.  That kernel appears to be in the updates repo this morning.  FWIW, the system seems a bit snappier than it was on the last few 5.4.x kernels.

This is on a Lenovo P51 Thinkpad with this hardware
lspci
00:00.0 Host bridge: Intel Corporation Xeon E3-1200 v6/7th Gen Core Processor Host Bridge/DRAM Registers (rev 05)
00:01.0 PCI bridge: Intel Corporation Xeon E3-1200 v5/E3-1500 v5/6th Gen Core Processor PCIe Controller (x16) (rev 05)
00:02.0 VGA compatible controller: Intel Corporation HD Graphics 630 (rev 04)
00:08.0 System peripheral: Intel Corporation Xeon E3-1200 v5/v6 / E3-1500 v5 / 6th/7th/8th Gen Core Processor Gaussian Mixture Model
00:14.0 USB controller: Intel Corporation 100 Series/C230 Series Chipset Family USB 3.0 xHCI Controller (rev 31)
00:14.2 Signal processing controller: Intel Corporation 100 Series/C230 Series Chipset Family Thermal Subsystem (rev 31)
00:15.0 Signal processing controller: Intel Corporation 100 Series/C230 Series Chipset Family Serial IO I2C Controller #0 (rev 31)
00:16.0 Communication controller: Intel Corporation 100 Series/C230 Series Chipset Family MEI Controller #1 (rev 31)
00:1b.0 PCI bridge: Intel Corporation 100 Series/C230 Series Chipset Family PCI Express Root Port #17 (rev f1)
00:1c.0 PCI bridge: Intel Corporation 100 Series/C230 Series Chipset Family PCI Express Root Port #1 (rev f1)
00:1c.2 PCI bridge: Intel Corporation 100 Series/C230 Series Chipset Family PCI Express Root Port #3 (rev f1)
00:1c.4 PCI bridge: Intel Corporation 100 Series/C230 Series Chipset Family PCI Express Root Port #5 (rev f1)
00:1d.0 PCI bridge: Intel Corporation 100 Series/C230 Series Chipset Family PCI Express Root Port #9 (rev f1)
00:1d.4 PCI bridge: Intel Corporation 100 Series/C230 Series Chipset Family PCI Express Root Port #13 (rev f1)
00:1f.0 ISA bridge: Intel Corporation CM238 Chipset LPC/eSPI Controller (rev 31)
00:1f.2 Memory controller: Intel Corporation 100 Series/C230 Series Chipset Family Power Management Controller (rev 31)
00:1f.3 Audio device: Intel Corporation CM238 HD Audio Controller (rev 31)
00:1f.4 SMBus: Intel Corporation 100 Series/C230 Series Chipset Family SMBus (rev 31)
00:1f.6 Ethernet controller: Intel Corporation Ethernet Connection (5) I219-LM (rev 31)
01:00.0 3D controller: NVIDIA Corporation GM107GLM [Quadro M1200 Mobile] (rev a2)
01:00.1 Audio device: NVIDIA Corporation Device 0fbc (rev a1)
02:00.0 Non-Volatile memory controller: Samsung Electronics Co Ltd NVMe SSD Controller SM961/PM961
04:00.0 Network controller: Intel Corporation Wireless 8265 / 8275 (rev 78)
3e:00.0 Non-Volatile memory controller: Samsung Electronics Co Ltd NVMe SSD Controller SM981/PM981/PM983
3f:00.0 Unassigned class [ff00]: Realtek Semiconductor Co., Ltd. RTS525A PCI Express Card Reader (rev 01)

Comment 41 François Cami 2020-02-24 08:46:04 UTC

The fix is available in 5.5.x kernels which are available now.
Closing due to comments by Stefan, Wayne, Marko, & John + my own experience.
Please feel free to reopen if you are running a 5.5.x Fedora kernel or later and experience the same issue.
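
Before reopening, it may help to confirm that the running kernel really is 5.5 or later. The following is a rough sketch only: the function name kernel_at_least_5_5 is made up, and it relies on the version ordering of sort -V from GNU coreutils.

```shell
# Hypothetical helper: succeed if the given kernel release (default: the
# running kernel, from uname -r) is version 5.5 or newer.
kernel_at_least_5_5() {
    local release="${1:-$(uname -r)}"
    # Strip the release suffix: "5.5.2-200.fc31.x86_64" -> "5.5.2"
    local version="${release%%-*}"
    local lowest
    # sort -V orders version strings; if "5.5" sorts first (or ties),
    # the tested version is at least 5.5.
    lowest=$(printf '%s\n' "5.5" "$version" | sort -V | head -n 1)
    [ "$lowest" = "5.5" ]
}

# Usage:
#   kernel_at_least_5_5 && echo "running a 5.5.x or later kernel"
```

Passing a release string as the first argument checks that kernel instead of the running one, for example a package version reported by dnf.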

Comment 42 Jakub T. Jankiewicz 2020-02-24 09:29:30 UTC

Is the new kernel only available for Fedora 31? What about Fedora 30? I've just run `dnf update` and was offered `5.4.21-100.fc30`. Do I need to install that one first to get 5.5?