Bug 1618906
Summary: | [regression] nouveau DRM: EVO timeout in kernel 4.15 or later | ||||||
---|---|---|---|---|---|---|---|
Product: | [Fedora] Fedora | Reporter: | Dominik 'Rathann' Mierzejewski <dominik> | ||||
Component: | kernel | Assignee: | Kernel Maintainer List <kernel-maint> | ||||
Status: | NEW --- | QA Contact: | Fedora Extras Quality Assurance <extras-qa> | ||||
Severity: | high | Docs Contact: | |||||
Priority: | unspecified | ||||||
Version: | rawhide | CC: | acaringi, airlied, ajax, alexviiiag, bskeggs, eugenemah, fedora, frival, hdegoede, ichavero, ifont, infrandomness, itamar, jarodwilson, jelledejong, jeremy, jglisse, john.j5live, jonathan, jordi, josef, kernel-maint, lgoncalv, linville, lou, masami256, mchehab, mitch.special, mjg59, mvazquez, ravvle, sam.baskinger, sergio, steeve.mccauley, steved, yferszt | ||||
Target Milestone: | --- | ||||||
Target Release: | --- | ||||||
Hardware: | x86_64 | ||||||
OS: | Linux | ||||||
Last Closed: | Type: | Bug | |||||
Description
Dominik 'Rathann' Mierzejewski
2018-08-18 00:49:23 UTC
Also reproducible on Fedora 28 with kernel 4.17.14-202.fc28. Just for fun, I installed F29 kernel-4.18.1-300.fc29.x86_64 from koji and I get the same issue with slightly different errors with this one:

[ 2.902501] nouveau 0000:01:00.0: NVIDIA G98 (298480a2)
[ 2.977795] nouveau 0000:01:00.0: bios: version 62.98.3c.00.44
[ 3.010065] nouveau 0000:01:00.0: bios: M0203T not found
[ 3.010218] nouveau 0000:01:00.0: bios: M0203E not matched!
[ 3.010363] nouveau 0000:01:00.0: fb: 512 MiB DDR2
[ 3.185260] nouveau 0000:01:00.0: DRM: VRAM: 512 MiB
[ 3.185262] nouveau 0000:01:00.0: DRM: GART: 1048576 MiB
[ 3.185268] nouveau 0000:01:00.0: DRM: TMDS table version 2.0
[ 3.185271] nouveau 0000:01:00.0: DRM: DCB version 4.0
[ 3.185274] nouveau 0000:01:00.0: DRM: DCB outp 00: 01011323 00010034
[ 3.185277] nouveau 0000:01:00.0: DRM: DCB outp 01: 02000300 00000028
[ 3.185280] nouveau 0000:01:00.0: DRM: DCB outp 02: 02022312 00020030
[ 3.185282] nouveau 0000:01:00.0: DRM: DCB conn 00: 00000000
[ 3.185284] nouveau 0000:01:00.0: DRM: DCB conn 01: 00000140
[ 3.185286] nouveau 0000:01:00.0: DRM: DCB conn 02: 00002261
[ 3.185288] nouveau 0000:01:00.0: DRM: DCB conn 07: 00000513
[ 3.195497] nouveau 0000:01:00.0: DRM: MM: using M2MF for buffer copies
[ 3.242999] nouveau 0000:01:00.0: DRM: allocated 1440x900 fb: 0x50000, bo (____ptrval____)
[ 3.257587] fbcon: nouveaufb (fb0) is primary device
[ 5.288414] nouveau 0000:01:00.0: DRM: core notifier timeout
[ 7.288412] nouveau 0000:01:00.0: DRM: base-0: timeout
[ 7.334784] nouveau 0000:01:00.0: fb0: nouveaufb frame buffer device
[ 7.343411] [drm] Initialized nouveau 1.3.1 20120801 for 0000:01:00.0 on minor 0
[ 86.317620] nouveau 0000:01:00.0: DRM: core notifier timeout
[ 88.989806] nouveau 0000:01:00.0: DRM: core notifier timeout
[ 90.909693] nouveau 0000:01:00.0: DRM: base-0: timeout

Bumping to F29, then.
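
For anyone comparing kernels, the two timeout messages quoted above ("core notifier timeout" and "base-N: timeout") can be counted mechanically instead of by eye. The following is only a rough sketch: the function name count_timeouts and the SAMPLE string are made up for illustration, and the regular expression assumes the message format seen in the log above, which may differ in other kernel versions.

```python
import re
from collections import Counter

# Matches the two timeout messages quoted in the log above, e.g.
#   "nouveau 0000:01:00.0: DRM: core notifier timeout"
#   "nouveau 0000:01:00.0: DRM: base-0: timeout"
# The format is an assumption taken from this report only.
TIMEOUT_RE = re.compile(
    r"nouveau (?P<dev>[0-9a-f:.]+): DRM: "
    r"(?P<kind>core notifier timeout|base-\d+: timeout)"
)


def count_timeouts(log_text):
    """Return a Counter mapping each timeout message kind to its count."""
    counts = Counter()
    for match in TIMEOUT_RE.finditer(log_text):
        counts[match.group("kind")] += 1
    return counts


# A few lines copied from the log above (hypothetical test input).
SAMPLE = """\
[ 5.288414] nouveau 0000:01:00.0: DRM: core notifier timeout
[ 7.288412] nouveau 0000:01:00.0: DRM: base-0: timeout
[ 7.334784] nouveau 0000:01:00.0: fb0: nouveaufb frame buffer device
[ 86.317620] nouveau 0000:01:00.0: DRM: core notifier timeout
[ 90.909693] nouveau 0000:01:00.0: DRM: base-0: timeout
"""

if __name__ == "__main__":
    for kind, n in count_timeouts(SAMPLE).items():
        print(f"{kind}: {n}")
```

The same function could be fed saved output of dmesg or journalctl -k from a known-good and a known-bad kernel; a count of zero on one and non-zero on the other is a quick way to confirm which boots are affected.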
Bumping to rawhide after trying kernel-4.19.0-0.rc0.git5.1: [ 7.193842] nouveau 0000:01:00.0: NVIDIA G98 (298480a2) [ 7.253541] nouveau 0000:01:00.0: bios: version 62.98.3c.00.44 [ 7.301129] nouveau 0000:01:00.0: bios: M0203T not found [ 7.301492] nouveau 0000:01:00.0: bios: M0203E not matched! [ 7.301669] nouveau 0000:01:00.0: fb: 512 MiB DDR2 [ 7.719129] nouveau 0000:01:00.0: DRM: VRAM: 512 MiB [ 7.719498] nouveau 0000:01:00.0: DRM: GART: 1048576 MiB [ 7.719681] nouveau 0000:01:00.0: DRM: TMDS table version 2.0 [ 7.719851] nouveau 0000:01:00.0: DRM: DCB version 4.0 [ 7.720014] nouveau 0000:01:00.0: DRM: DCB outp 00: 01011323 00010034 [ 7.720182] nouveau 0000:01:00.0: DRM: DCB outp 01: 02000300 00000028 [ 7.720387] nouveau 0000:01:00.0: DRM: DCB outp 02: 02022312 00020030 [ 7.720567] nouveau 0000:01:00.0: DRM: DCB conn 00: 00000000 [ 7.720736] nouveau 0000:01:00.0: DRM: DCB conn 01: 00000140 [ 7.720903] nouveau 0000:01:00.0: DRM: DCB conn 02: 00002261 [ 7.721066] nouveau 0000:01:00.0: DRM: DCB conn 07: 00000513 [ 7.738669] nouveau 0000:01:00.0: DRM: MM: using M2MF for buffer copies [ 7.784149] nouveau 0000:01:00.0: DRM: DDC responded, but no EDID for VGA-1 [ 7.813006] nouveau 0000:01:00.0: DRM: DDC responded, but no EDID for HDMI-A-1 [ 7.828162] nouveau 0000:01:00.0: DRM: allocated 1440x900 fb: 0x50000, bo (____ptrval____) [ 7.870502] fbcon: nouveaufb (fb0) is primary device [ 7.885068] nouveau 0000:01:00.0: DRM: DDC responded, but no EDID for VGA-1 [ 7.898694] nouveau 0000:01:00.0: DRM: DDC responded, but no EDID for HDMI-A-1 [ 9.902963] nouveau 0000:01:00.0: DRM: core notifier timeout [ 11.903058] nouveau 0000:01:00.0: DRM: base-0: timeout [ 11.908408] nouveau 0000:01:00.0: DRM: DDC responded, but no EDID for VGA-1 [ 11.947124] nouveau 0000:01:00.0: DRM: DDC responded, but no EDID for VGA-1 [ 11.958855] nouveau 0000:01:00.0: DRM: DDC responded, but no EDID for HDMI-A-1 [ 11.961609] nouveau 0000:01:00.0: fb0: nouveaufb frame buffer device [ 11.972641] [drm] 
Initialized nouveau 1.3.1 20120801 for 0000:01:00.0 on minor 0 [ 11.978613] #0: (____ptrval____) (drm_connector_list_iter){.+.+}, at: nouveau_backlight_init+0x63/0x450 [nouveau] [ 22.205362] nouveau 0000:01:00.0: DRM: DDC responded, but no EDID for VGA-1 [ 32.445359] nouveau 0000:01:00.0: DRM: DDC responded, but no EDID for VGA-1 [ 42.685355] nouveau 0000:01:00.0: DRM: DDC responded, but no EDID for VGA-1 [ 52.925595] nouveau 0000:01:00.0: DRM: DDC responded, but no EDID for VGA-1 [ 63.165373] nouveau 0000:01:00.0: DRM: DDC responded, but no EDID for VGA-1 [ 73.405363] nouveau 0000:01:00.0: DRM: DDC responded, but no EDID for VGA-1 [ 83.645378] nouveau 0000:01:00.0: DRM: DDC responded, but no EDID for VGA-1 [ 93.890397] nouveau 0000:01:00.0: DRM: DDC responded, but no EDID for VGA-1 [ 104.125363] nouveau 0000:01:00.0: DRM: DDC responded, but no EDID for VGA-1 [ 107.020185] nouveau 0000:01:00.0: DRM: DDC responded, but no EDID for VGA-1 [ 107.032965] nouveau 0000:01:00.0: DRM: DDC responded, but no EDID for HDMI-A-1 [ 107.074838] nouveau 0000:01:00.0: DRM: DDC responded, but no EDID for VGA-1 [ 107.086752] nouveau 0000:01:00.0: DRM: DDC responded, but no EDID for HDMI-A-1 [ 110.113354] nouveau 0000:01:00.0: DRM: core notifier timeout [ 110.634595] ------------[ cut here ]------------ [ 110.634608] nouveau 0000:01:00.0: DMA-API: device driver tries to sync DMA memory it has not allocated [device address=0x000000010c412000] [size=4096 bytes] [ 110.634630] WARNING: CPU: 1 PID: 1163 at kernel/dma/debug.c:1230 check_sync+0x136/0x670 [ 110.634634] Modules linked in: ip_set nfnetlink ebtable_nat ebtable_broute ccm bridge stp llc ip6table_nat nf_nat_ipv6 ip6table_mangle ip6table_raw ip6table_security iptable_nat nf_nat_ipv4 nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 libcrc32c iptable_mangle iptable_raw iptable_security ebtable_filter ebtables ip6table_filter ip6_tables bnep sunrpc arc4 snd_hda_codec_realtek snd_hda_codec_generic ath9k snd_hda_intel ath9k_common 
snd_hda_codec ath9k_hw snd_hda_core uvcvideo btusb snd_hwdep btrtl snd_seq snd_seq_device btbcm btintel snd_pcm mac80211 videobuf2_vmalloc videobuf2_memops videobuf2_v4l2 ath videobuf2_common cfg80211 videodev media bluetooth snd_timer snd coretemp ecdh_generic joydev r592 soundcore asus_laptop memstick sparse_keymap rfkill input_polldev pcc_cpufreq acpi_cpufreq dm_crypt [ 110.634775] nouveau ata_generic pata_acpi firewire_ohci firewire_core mxm_wmi wmi i2c_algo_bit drm_kms_helper sdhci_pci cqhci sdhci ttm sis190 serio_raw mmc_core mii crc_itu_t drm sata_sis pata_sis video [ 110.634823] CPU: 1 PID: 1163 Comm: Xorg Not tainted 4.19.0-0.rc0.git5.1.fc30.x86_64 #1 [ 110.634827] Hardware name: ASUSTeK Computer Inc. X71SL /X71SL , BIOS 206 11/05/2008 [ 110.634832] RIP: 0010:check_sync+0x136/0x670 [ 110.634837] Code: 48 85 ed 75 04 48 8b 68 10 48 8b 3c 24 e8 e2 38 56 00 48 89 c6 4d 89 e8 4c 89 f9 48 89 ea 48 c7 c7 a8 18 30 b1 e8 ee 77 f6 ff <0f> 0b 8b 05 9a 75 85 01 85 c0 0f 84 81 04 00 00 48 83 c4 28 4c 89 [ 110.634841] RSP: 0018:ffffb980412c7a10 EFLAGS: 00010082 [ 110.634847] RAX: 0000000000000000 RBX: ffffffffb2f33410 RCX: 0000000000000006 [ 110.634851] RDX: 0000000000000007 RSI: 0000000000000001 RDI: ffff9e12fbbd6ba0 [ 110.634855] RBP: ffff9e12f9f82ed0 R08: 0000000000000000 R09: 0000000000000001 [ 110.634859] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000286 [ 110.634863] R13: 0000000000001000 R14: 0000000000010000 R15: 000000010c412000 [ 110.634868] FS: 00007fe0441aeac0(0000) GS:ffff9e12fba00000(0000) knlGS:0000000000000000 [ 110.634873] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 110.634877] CR2: 00007fe03c0c8d90 CR3: 0000000114a6e000 CR4: 00000000000006e0 [ 110.634881] Call Trace: [ 110.634897] debug_dma_sync_single_for_device+0x7b/0x90 [ 110.634915] ? ttm_bo_mem_compat+0x23/0x60 [ttm] [ 110.634925] ? kfree+0x188/0x320 [ 110.634932] ? 
krealloc+0x25/0xa0 [ 110.635040] nouveau_bo_sync_for_device+0x6a/0xb0 [nouveau] [ 110.635098] nouveau_bo_validate+0x71/0x90 [nouveau] [ 110.635154] nouveau_gem_ioctl_pushbuf+0x8a5/0x1ad0 [nouveau] [ 110.635222] ? nouveau_gem_ioctl_new+0xe0/0xe0 [nouveau] [ 110.635240] ? drm_ioctl_kernel+0xa5/0xf0 [drm] [ 110.635240] ? nouveau_gem_ioctl_new+0xe0/0xe0 [nouveau] [ 110.635240] drm_ioctl_kernel+0xa5/0xf0 [drm] [ 110.635240] drm_ioctl+0x1fc/0x390 [drm] [ 110.635240] ? nouveau_gem_ioctl_new+0xe0/0xe0 [nouveau] [ 110.635240] nouveau_drm_ioctl+0x65/0xc0 [nouveau] [ 110.635240] do_vfs_ioctl+0xa5/0x6e0 [ 110.635240] ksys_ioctl+0x60/0x90 [ 110.635240] __x64_sys_ioctl+0x16/0x20 [ 110.635240] do_syscall_64+0x60/0x1f0 [ 110.635240] entry_SYSCALL_64_after_hwframe+0x49/0xbe [ 110.635240] RIP: 0033:0x7fe041422ec7 [ 110.635240] Code: 00 00 90 48 8b 05 d9 7f 2c 00 64 c7 00 26 00 00 00 48 c7 c0 ff ff ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d a9 7f 2c 00 f7 d8 64 89 01 48 [ 110.635240] RSP: 002b:00007ffcd424fc68 EFLAGS: 00000246 ORIG_RAX: 0000000000000010 [ 110.635240] RAX: ffffffffffffffda RBX: 0000000000d5ae98 RCX: 00007fe041422ec7 [ 110.635240] RDX: 00007ffcd424fcd0 RSI: 00000000c0406481 RDI: 000000000000000e [ 110.635240] RBP: 00007ffcd424fcd0 R08: 0000000000000000 R09: 0000000000d59f20 [ 110.635240] R10: 0000000000d6be98 R11: 0000000000000246 R12: 00000000c0406481 [ 110.635240] R13: 000000000000000e R14: 0000000000d5a070 R15: 0000000000d59f20 [ 110.635240] irq event stamp: 0 [ 110.635240] hardirqs last enabled at (0): [<0000000000000000>] (null) [ 110.635240] hardirqs last disabled at (0): [<ffffffffb00bb817>] copy_process.part.28+0x747/0x1e70 [ 110.635240] softirqs last enabled at (0): [<ffffffffb00bb817>] copy_process.part.28+0x747/0x1e70 [ 110.635240] softirqs last disabled at (0): [<0000000000000000>] (null) [ 110.635240] ---[ end trace a1450e59d31d3810 ]--- [ 114.365372] nouveau 0000:01:00.0: DRM: DDC responded, but no 
EDID for VGA-1 [ 117.080247] nouveau 0000:01:00.0: DRM: core notifier timeout [ 119.080664] nouveau 0000:01:00.0: DRM: base-0: timeout [ 122.562843] nouveau 0000:01:00.0: DRM: DDC responded, but no EDID for VGA-1 [ 122.574617] nouveau 0000:01:00.0: DRM: DDC responded, but no EDID for HDMI-A-1 [ 124.605473] nouveau 0000:01:00.0: DRM: DDC responded, but no EDID for VGA-1 [ 134.845626] nouveau 0000:01:00.0: DRM: DDC responded, but no EDID for VGA-1 [ 145.085449] nouveau 0000:01:00.0: DRM: DDC responded, but no EDID for VGA-1 [ 155.325447] nouveau 0000:01:00.0: DRM: DDC responded, but no EDID for VGA-1 [ 165.565443] nouveau 0000:01:00.0: DRM: DDC responded, but no EDID for VGA-1 [ 175.805469] nouveau 0000:01:00.0: DRM: DDC responded, but no EDID for VGA-1 [ 186.045466] nouveau 0000:01:00.0: DRM: DDC responded, but no EDID for VGA-1 [ 196.285425] nouveau 0000:01:00.0: DRM: DDC responded, but no EDID for VGA-1 [ 201.777469] nouveau 0000:01:00.0: DRM: base-0: timeout [ 204.052686] nouveau 0000:01:00.0: DRM: base-0: timeout [ 206.087485] nouveau 0000:01:00.0: DRM: base-0: timeout [ 206.525448] nouveau 0000:01:00.0: DRM: DDC responded, but no EDID for VGA-1 [ 216.765543] nouveau 0000:01:00.0: DRM: DDC responded, but no EDID for VGA-1 [ 227.005455] nouveau 0000:01:00.0: DRM: DDC responded, but no EDID for VGA-1 [ 237.245471] nouveau 0000:01:00.0: DRM: DDC responded, but no EDID for VGA-1 [ 247.485448] nouveau 0000:01:00.0: DRM: DDC responded, but no EDID for VGA-1 [ 257.725448] nouveau 0000:01:00.0: DRM: DDC responded, but no EDID for VGA-1 [ 267.965455] nouveau 0000:01:00.0: DRM: DDC responded, but no EDID for VGA-1 [ 270.806584] perf: interrupt took too long (2512 > 2500), lowering kernel.perf_event_max_sample_rate to 79000 [ 271.508397] nouveau 0000:01:00.0: DRM: base-0: timeout [ 278.205455] nouveau 0000:01:00.0: DRM: DDC responded, but no EDID for VGA-1 [ 288.445453] nouveau 0000:01:00.0: DRM: DDC responded, but no EDID for VGA-1 [ 298.685448] nouveau 0000:01:00.0: DRM: 
DDC responded, but no EDID for VGA-1
[ 308.925453] nouveau 0000:01:00.0: DRM: DDC responded, but no EDID for VGA-1
[ 319.165443] nouveau 0000:01:00.0: DRM: DDC responded, but no EDID for VGA-1
[ 329.405454] nouveau 0000:01:00.0: DRM: DDC responded, but no EDID for VGA-1
[ 331.508453] nouveau 0000:01:00.0: DRM: base-0: timeout
(this keeps repeating)

I'm seeing this problem on Fedora 28. Kernel 4.18.18-200 works fine. Kernels 4.19.2-200 and 4.19.5-200 fail. The GUI doesn't quite lock up, but it's horribly intermittent & jerky. I'm using the kernel nouveau driver on a GeForce GTX 1070 (GP104). System was updated with `dnf upgrade' for both 4.19.2-200 and 4.19.5-200, but for both I had to drop back to 4.18.18-200 to avoid lockup. I've attached a log from journalctl. All goes ok through the boot (up to 17:02:20) but after playing with firefox (63.0.3) for a few minutes, I get near lock-up and start seeing error messages from 17:07:14. The system remains completely responsive when accessed via ssh.

Created attachment 1511170
Journalctl log showing DRM: base-0 timeout error (re. comment #3)

Journalctl log promised in comment #3 (Hafer).

My first bad commit: [fdba46ffb4c203b6e6794163493fd310f98bb4be] x86/apic: Get rid of multi CPU affinity (in kernel 4.15.0-git2)

My second bad commit: [a31e58e129f73ab5b04016330b13ed51fde7a961] x86/apic: Switch all APICs to Fixed delivery mode (in kernel-4.15.0-0.rc6.git1.1)

The commit message [1] says that it fixes fdba46ffb4c2 ("x86/apic: Get rid of multi CPU affinity").

[1] https://bugs.freedesktop.org/attachment.cgi?id=141327

The problem persists as of kernel 4.19.6-200.fc28.x86_64. Still have to drop back to 4.18.18-200 to avoid an unusable GUI.

The problem persists as of kernel 4.19.10-200.fc28.x86_64. Still have to drop back to 4.18.18-200 to avoid an unusable GUI.

I'm also affected by this bug on Fedora 29. I hopped on the kernel mainline and started poking around and noticed that I see this bug on 4.19 but not on 4.20.
So I bisected to find the commit in the 4.20 series that fixes the bug. The fix appears to be from Ben Skeggs (the assignee of this bug, go Ben!):

commit 970a5ee41c72df46e3b0f307528c7d8ef7734a2e
Author: Ben Skeggs <bskeggs>
Date: Wed Dec 12 16:51:17 2018 +1000

    drm/nouveau/kms/nv50-: also flush fb writes when rewinding push buffer

    Should hopefully fix a regression some people have been seeing since
    EVO push buffers were moved to VRAM by default on Pascal GPUs.

    Fixes: d00ddd9da ("drm/nouveau/kms/nv50-: allocate push buffers in vidmem on pascal")
    Signed-off-by: Ben Skeggs <bskeggs>
    Cc: <stable.org> # 4.19+

I can cherry-pick just this commit on top of 4.19 and I get a stable system. Looks like this patch just needs to be pulled into the Fedora kernel.

I am also seeing the nouveau driver freeze: the system still does something, but the screens become unusable.

Linux dw093.wdm.local 3.10.0-957.1.3.el7.x86_64 #1 SMP Thu Nov 29 14:49:43 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
[root@dw093 ~]# cat /etc/centos-release
CentOS Linux release 7.6.1810 (Core)
[root@dw093 ~]# grep "DRM: core notifier timeout" /var/log/messages*
/var/log/messages:Jan 23 09:05:13 dw093 kernel: nouveau 0000:01:00.0: DRM: core notifier timeout
/var/log/messages:Jan 23 09:22:51 dw093 kernel: nouveau 0000:01:00.0: DRM: core notifier timeout
/var/log/messages:Jan 23 09:34:03 dw093 kernel: nouveau 0000:01:00.0: DRM: core notifier timeout
/var/log/messages:Jan 23 09:34:13 dw093 kernel: nouveau 0000:01:00.0: DRM: core notifier timeout
/var/log/messages:Jan 23 13:04:34 dw093 kernel: nouveau 0000:01:00.0: DRM: core notifier timeout
/var/log/messages:Jan 23 13:05:57 dw093 kernel: nouveau 0000:01:00.0: DRM: core notifier timeout
/var/log/messages:Jan 23 13:06:01 dw093 kernel: nouveau 0000:01:00.0: DRM: core notifier timeout
/var/log/messages:Jan 23 13:06:03 dw093 kernel: nouveau 0000:01:00.0: DRM: core notifier timeout
/var/log/messages:Jan 23 13:06:08 dw093 kernel: nouveau 0000:01:00.0: DRM: core notifier timeout
/var/log/messages:Jan 23 13:12:07 dw093 kernel: nouveau 0000:01:00.0: DRM: core notifier timeout /var/log/messages:Jan 23 13:12:09 dw093 kernel: nouveau 0000:01:00.0: DRM: core notifier timeout /var/log/messages:Jan 23 13:12:39 dw093 kernel: nouveau 0000:01:00.0: DRM: core notifier timeout /var/log/messages:Jan 23 13:14:47 dw093 kernel: nouveau 0000:01:00.0: DRM: core notifier timeout /var/log/messages:Jan 23 13:14:49 dw093 kernel: nouveau 0000:01:00.0: DRM: core notifier timeout /var/log/messages:Jan 23 13:14:51 dw093 kernel: nouveau 0000:01:00.0: DRM: core notifier timeout [root@dw093 ~]# modinfo nouveau filename: /lib/modules/3.10.0-957.1.3.el7.x86_64/kernel/drivers/gpu/drm/nouveau/nouveau.ko.xz firmware: nvidia/gp100/gr/sw_method_init.bin firmware: nvidia/gp100/gr/sw_bundle_init.bin firmware: nvidia/gp100/gr/sw_nonctx.bin firmware: nvidia/gp100/gr/sw_ctx.bin firmware: nvidia/gp100/gr/gpccs_sig.bin firmware: nvidia/gp100/gr/gpccs_data.bin firmware: nvidia/gp100/gr/gpccs_inst.bin firmware: nvidia/gp100/gr/gpccs_bl.bin firmware: nvidia/gp100/gr/fecs_sig.bin firmware: nvidia/gp100/gr/fecs_data.bin firmware: nvidia/gp100/gr/fecs_inst.bin firmware: nvidia/gp100/gr/fecs_bl.bin firmware: nvidia/gp100/acr/ucode_unload.bin firmware: nvidia/gp100/acr/ucode_load.bin firmware: nvidia/gp100/acr/bl.bin firmware: nvidia/gm206/gr/sw_method_init.bin firmware: nvidia/gm206/gr/sw_bundle_init.bin firmware: nvidia/gm206/gr/sw_nonctx.bin firmware: nvidia/gm206/gr/sw_ctx.bin firmware: nvidia/gm206/gr/gpccs_sig.bin firmware: nvidia/gm206/gr/gpccs_data.bin firmware: nvidia/gm206/gr/gpccs_inst.bin firmware: nvidia/gm206/gr/gpccs_bl.bin firmware: nvidia/gm206/gr/fecs_sig.bin firmware: nvidia/gm206/gr/fecs_data.bin firmware: nvidia/gm206/gr/fecs_inst.bin firmware: nvidia/gm206/gr/fecs_bl.bin firmware: nvidia/gm206/acr/ucode_unload.bin firmware: nvidia/gm206/acr/ucode_load.bin firmware: nvidia/gm206/acr/bl.bin firmware: nvidia/gm204/gr/sw_method_init.bin firmware: 
nvidia/gm204/gr/sw_bundle_init.bin firmware: nvidia/gm204/gr/sw_nonctx.bin firmware: nvidia/gm204/gr/sw_ctx.bin firmware: nvidia/gm204/gr/gpccs_sig.bin firmware: nvidia/gm204/gr/gpccs_data.bin firmware: nvidia/gm204/gr/gpccs_inst.bin firmware: nvidia/gm204/gr/gpccs_bl.bin firmware: nvidia/gm204/gr/fecs_sig.bin firmware: nvidia/gm204/gr/fecs_data.bin firmware: nvidia/gm204/gr/fecs_inst.bin firmware: nvidia/gm204/gr/fecs_bl.bin firmware: nvidia/gm204/acr/ucode_unload.bin firmware: nvidia/gm204/acr/ucode_load.bin firmware: nvidia/gm204/acr/bl.bin firmware: nvidia/gm200/gr/sw_method_init.bin firmware: nvidia/gm200/gr/sw_bundle_init.bin firmware: nvidia/gm200/gr/sw_nonctx.bin firmware: nvidia/gm200/gr/sw_ctx.bin firmware: nvidia/gm200/gr/gpccs_sig.bin firmware: nvidia/gm200/gr/gpccs_data.bin firmware: nvidia/gm200/gr/gpccs_inst.bin firmware: nvidia/gm200/gr/gpccs_bl.bin firmware: nvidia/gm200/gr/fecs_sig.bin firmware: nvidia/gm200/gr/fecs_data.bin firmware: nvidia/gm200/gr/fecs_inst.bin firmware: nvidia/gm200/gr/fecs_bl.bin firmware: nvidia/gm200/acr/ucode_unload.bin firmware: nvidia/gm200/acr/ucode_load.bin firmware: nvidia/gm200/acr/bl.bin firmware: nvidia/gm20b/pmu/sig.bin firmware: nvidia/gm20b/pmu/image.bin firmware: nvidia/gm20b/pmu/desc.bin firmware: nvidia/gm20b/gr/sw_method_init.bin firmware: nvidia/gm20b/gr/sw_bundle_init.bin firmware: nvidia/gm20b/gr/sw_nonctx.bin firmware: nvidia/gm20b/gr/sw_ctx.bin firmware: nvidia/gm20b/gr/gpccs_data.bin firmware: nvidia/gm20b/gr/gpccs_inst.bin firmware: nvidia/gm20b/gr/fecs_sig.bin firmware: nvidia/gm20b/gr/fecs_data.bin firmware: nvidia/gm20b/gr/fecs_inst.bin firmware: nvidia/gm20b/gr/fecs_bl.bin firmware: nvidia/gm20b/acr/ucode_load.bin firmware: nvidia/gm20b/acr/bl.bin firmware: nvidia/gp107/sec2/sig.bin firmware: nvidia/gp107/sec2/image.bin firmware: nvidia/gp107/sec2/desc.bin firmware: nvidia/gp107/nvdec/scrubber.bin firmware: nvidia/gp107/gr/sw_method_init.bin firmware: nvidia/gp107/gr/sw_bundle_init.bin firmware: 
nvidia/gp107/gr/sw_nonctx.bin firmware: nvidia/gp107/gr/sw_ctx.bin firmware: nvidia/gp107/gr/gpccs_sig.bin firmware: nvidia/gp107/gr/gpccs_data.bin firmware: nvidia/gp107/gr/gpccs_inst.bin firmware: nvidia/gp107/gr/gpccs_bl.bin firmware: nvidia/gp107/gr/fecs_sig.bin firmware: nvidia/gp107/gr/fecs_data.bin firmware: nvidia/gp107/gr/fecs_inst.bin firmware: nvidia/gp107/gr/fecs_bl.bin firmware: nvidia/gp107/acr/ucode_unload.bin firmware: nvidia/gp107/acr/ucode_load.bin firmware: nvidia/gp107/acr/unload_bl.bin firmware: nvidia/gp107/acr/bl.bin firmware: nvidia/gp106/sec2/sig.bin firmware: nvidia/gp106/sec2/image.bin firmware: nvidia/gp106/sec2/desc.bin firmware: nvidia/gp106/nvdec/scrubber.bin firmware: nvidia/gp106/gr/sw_method_init.bin firmware: nvidia/gp106/gr/sw_bundle_init.bin firmware: nvidia/gp106/gr/sw_nonctx.bin firmware: nvidia/gp106/gr/sw_ctx.bin firmware: nvidia/gp106/gr/gpccs_sig.bin firmware: nvidia/gp106/gr/gpccs_data.bin firmware: nvidia/gp106/gr/gpccs_inst.bin firmware: nvidia/gp106/gr/gpccs_bl.bin firmware: nvidia/gp106/gr/fecs_sig.bin firmware: nvidia/gp106/gr/fecs_data.bin firmware: nvidia/gp106/gr/fecs_inst.bin firmware: nvidia/gp106/gr/fecs_bl.bin firmware: nvidia/gp106/acr/ucode_unload.bin firmware: nvidia/gp106/acr/ucode_load.bin firmware: nvidia/gp106/acr/unload_bl.bin firmware: nvidia/gp106/acr/bl.bin firmware: nvidia/gp104/sec2/sig.bin firmware: nvidia/gp104/sec2/image.bin firmware: nvidia/gp104/sec2/desc.bin firmware: nvidia/gp104/nvdec/scrubber.bin firmware: nvidia/gp104/gr/sw_method_init.bin firmware: nvidia/gp104/gr/sw_bundle_init.bin firmware: nvidia/gp104/gr/sw_nonctx.bin firmware: nvidia/gp104/gr/sw_ctx.bin firmware: nvidia/gp104/gr/gpccs_sig.bin firmware: nvidia/gp104/gr/gpccs_data.bin firmware: nvidia/gp104/gr/gpccs_inst.bin firmware: nvidia/gp104/gr/gpccs_bl.bin firmware: nvidia/gp104/gr/fecs_sig.bin firmware: nvidia/gp104/gr/fecs_data.bin firmware: nvidia/gp104/gr/fecs_inst.bin firmware: nvidia/gp104/gr/fecs_bl.bin firmware: 
nvidia/gp104/acr/ucode_unload.bin firmware: nvidia/gp104/acr/ucode_load.bin firmware: nvidia/gp104/acr/unload_bl.bin firmware: nvidia/gp104/acr/bl.bin firmware: nvidia/gp102/sec2/sig.bin firmware: nvidia/gp102/sec2/image.bin firmware: nvidia/gp102/sec2/desc.bin firmware: nvidia/gp102/nvdec/scrubber.bin firmware: nvidia/gp102/gr/sw_method_init.bin firmware: nvidia/gp102/gr/sw_bundle_init.bin firmware: nvidia/gp102/gr/sw_nonctx.bin firmware: nvidia/gp102/gr/sw_ctx.bin firmware: nvidia/gp102/gr/gpccs_sig.bin firmware: nvidia/gp102/gr/gpccs_data.bin firmware: nvidia/gp102/gr/gpccs_inst.bin firmware: nvidia/gp102/gr/gpccs_bl.bin firmware: nvidia/gp102/gr/fecs_sig.bin firmware: nvidia/gp102/gr/fecs_data.bin firmware: nvidia/gp102/gr/fecs_inst.bin firmware: nvidia/gp102/gr/fecs_bl.bin firmware: nvidia/gp102/acr/ucode_unload.bin firmware: nvidia/gp102/acr/ucode_load.bin firmware: nvidia/gp102/acr/unload_bl.bin firmware: nvidia/gp102/acr/bl.bin firmware: nvidia/gv100/sec2/sig.bin firmware: nvidia/gv100/sec2/image.bin firmware: nvidia/gv100/sec2/desc.bin firmware: nvidia/gv100/nvdec/scrubber.bin firmware: nvidia/gv100/gr/sw_method_init.bin firmware: nvidia/gv100/gr/sw_bundle_init.bin firmware: nvidia/gv100/gr/sw_nonctx.bin firmware: nvidia/gv100/gr/sw_ctx.bin firmware: nvidia/gv100/gr/gpccs_sig.bin firmware: nvidia/gv100/gr/gpccs_data.bin firmware: nvidia/gv100/gr/gpccs_inst.bin firmware: nvidia/gv100/gr/gpccs_bl.bin firmware: nvidia/gv100/gr/fecs_sig.bin firmware: nvidia/gv100/gr/fecs_data.bin firmware: nvidia/gv100/gr/fecs_inst.bin firmware: nvidia/gv100/gr/fecs_bl.bin firmware: nvidia/gv100/acr/ucode_unload.bin firmware: nvidia/gv100/acr/ucode_load.bin firmware: nvidia/gv100/acr/unload_bl.bin firmware: nvidia/gv100/acr/bl.bin firmware: nvidia/gp108/sec2/sig.bin firmware: nvidia/gp108/sec2/image.bin firmware: nvidia/gp108/sec2/desc.bin firmware: nvidia/gp108/nvdec/scrubber.bin firmware: nvidia/gp108/gr/sw_method_init.bin firmware: nvidia/gp108/gr/sw_bundle_init.bin 
firmware: nvidia/gp108/gr/sw_nonctx.bin firmware: nvidia/gp108/gr/sw_ctx.bin firmware: nvidia/gp108/gr/gpccs_sig.bin firmware: nvidia/gp108/gr/gpccs_data.bin firmware: nvidia/gp108/gr/gpccs_inst.bin firmware: nvidia/gp108/gr/gpccs_bl.bin firmware: nvidia/gp108/gr/fecs_sig.bin firmware: nvidia/gp108/gr/fecs_data.bin firmware: nvidia/gp108/gr/fecs_inst.bin firmware: nvidia/gp108/gr/fecs_bl.bin firmware: nvidia/gp108/acr/ucode_unload.bin firmware: nvidia/gp108/acr/ucode_load.bin firmware: nvidia/gp108/acr/unload_bl.bin firmware: nvidia/gp108/acr/bl.bin firmware: nvidia/gp10b/pmu/sig.bin firmware: nvidia/gp10b/pmu/image.bin firmware: nvidia/gp10b/pmu/desc.bin firmware: nvidia/gp10b/gr/sw_method_init.bin firmware: nvidia/gp10b/gr/sw_bundle_init.bin firmware: nvidia/gp10b/gr/sw_nonctx.bin firmware: nvidia/gp10b/gr/sw_ctx.bin firmware: nvidia/gp10b/gr/gpccs_sig.bin firmware: nvidia/gp10b/gr/gpccs_data.bin firmware: nvidia/gp10b/gr/gpccs_inst.bin firmware: nvidia/gp10b/gr/gpccs_bl.bin firmware: nvidia/gp10b/gr/fecs_sig.bin firmware: nvidia/gp10b/gr/fecs_data.bin firmware: nvidia/gp10b/gr/fecs_inst.bin firmware: nvidia/gp10b/gr/fecs_bl.bin firmware: nvidia/gp10b/acr/ucode_load.bin firmware: nvidia/gp10b/acr/bl.bin license: GPL and additional rights description: nVidia Riva/TNT/GeForce/Quadro/Tesla/Tegra K1+ author: Nouveau Project retpoline: Y rhelversion: 7.6 srcversion: 464415DA74D2AF7BF0C5E06 alias: pci:v000012D2d*sv*sd*bc03sc*i* alias: pci:v000010DEd*sv*sd*bc03sc*i* depends: drm,drm_kms_helper,ttm,mxm-wmi,wmi,video,i2c-algo-bit intree: Y vermagic: 3.10.0-957.1.3.el7.x86_64 SMP mod_unload modversions signer: CentOS Linux kernel signing key sig_key: E7:CE:F3:61:3A:9B:8B:D0:12:FA:E7:49:82:72:15:9B:B1:87:9C:65 sig_hashalgo: sha256 parm: vram_pushbuf:Create DMA push buffers in VRAM (int) parm: tv_norm:Default TV norm. Supported: PAL, PAL-M, PAL-N, PAL-Nc, NTSC-M, NTSC-J, hd480i, hd480p, hd576i, hd576p, hd720p, hd1080i. 
Default: PAL *NOTE* Ignored for cards with external TV encoders. (charp)
parm: nofbaccel:Disable fbcon acceleration (int)
parm: fbcon_bpp:fbcon bits-per-pixel (default: auto) (int)
parm: mst:Enable DisplayPort multi-stream (default: enabled) (int)
parm: tv_disable:Disable TV-out detection (int)
parm: ignorelid:Ignore ACPI lid status (int)
parm: duallink:Allow dual-link TMDS (default: enabled) (int)
parm: hdmimhz:Force a maximum HDMI pixel clock (in MHz) (int)
parm: config:option string to pass to driver core (charp)
parm: debug:debug string to pass to driver core (charp)
parm: noaccel:disable kernel/abi16 acceleration (int)
parm: modeset:enable driver (default: auto, 0 = disabled, 1 = enabled, 2 = headless) (int)
parm: atomic:Expose atomic ioctl (default: disabled) (int)
parm: runpm:disable (0), force enable (1), optimus only default (-1) (int)

I'm happy to say 4.20.3-200.fc29.x86_64, which is available as a regular "dnf update" on Fedora 29, appears to have the bug fixed. I've been running it for a few days with none of the issues I used to have with the 4.19 series (frequent hangups, as others had reported before). Before trying the 4.20 kernel above, the only one that worked for me was 4.18.16-300.fc29, as Lou had reported before. For completeness, my affected computer has the GP107GL [Quadro P400] Nvidia card.

I believe I'm seeing the same issue on Fedora 29 with 5.0.5-200.fc29.x86_64. This is with hybrid graphics, when nouveau is trying to initialize and times out on each CPU, which locks the system.
$ lspci -d 10de:1cba -vnn
01:00.0 VGA compatible controller [0300]: NVIDIA Corporation GP107GLM [Quadro P2000 Mobile] [10de:1cba] (rev a1) (prog-if 00 [VGA controller])
    Subsystem: Lenovo Device [17aa:2266]
    Flags: bus master, fast devsel, latency 0, IRQ 131
    Memory at a3000000 (32-bit, non-prefetchable) [size=16M]
    Memory at 60000000 (64-bit, prefetchable) [size=256M]
    Memory at 70000000 (64-bit, prefetchable) [size=32M]
    I/O ports at 3000 [size=128]
    Expansion ROM at a4080000 [disabled] [size=512K]
    Capabilities: <access denied>
    Kernel driver in use: nouveau
    Kernel modules: nouveau
$ journalctl --no-hostname -k -b -1 | grep nouveau
Apr 04 10:17:43 kernel: nouveau: detected PR support, will not use DSM
Apr 04 10:17:43 kernel: nouveau 0000:01:00.0: enabling device (0006 -> 0007)
Apr 04 10:17:43 kernel: nouveau 0000:01:00.0: NVIDIA GP107 (137000a1)
Apr 04 10:17:43 kernel: nouveau 0000:01:00.0: bios: version 86.07.63.00.35
Apr 04 10:17:43 kernel: nouveau 0000:01:00.0: fb: 4096 MiB GDDR5
Apr 04 10:17:43 kernel: nouveau 0000:01:00.0: DRM: VRAM: 4096 MiB
Apr 04 10:17:43 kernel: nouveau 0000:01:00.0: DRM: GART: 536870912 MiB
Apr 04 10:17:43 kernel: nouveau 0000:01:00.0: DRM: BIT table 'A' not found
Apr 04 10:17:43 kernel: nouveau 0000:01:00.0: DRM: BIT table 'L' not found
Apr 04 10:17:43 kernel: nouveau 0000:01:00.0: DRM: TMDS table version 2.0
Apr 04 10:17:43 kernel: nouveau 0000:01:00.0: DRM: DCB version 4.1
Apr 04 10:17:43 kernel: nouveau 0000:01:00.0: DRM: DCB outp 00: 02800f76 04600020
Apr 04 10:17:43 kernel: nouveau 0000:01:00.0: DRM: DCB outp 01: 02011f62 00020010
Apr 04 10:17:43 kernel: nouveau 0000:01:00.0: DRM: DCB outp 02: 01022f46 04600010
Apr 04 10:17:43 kernel: nouveau 0000:01:00.0: DRM: DCB outp 03: 01033f56 04600020
Apr 04 10:17:43 kernel: nouveau 0000:01:00.0: DRM: DCB conn 00: 00020047
Apr 04 10:17:43 kernel: nouveau 0000:01:00.0: DRM: DCB conn 01: 00010161
Apr 04 10:17:43 kernel: nouveau 0000:01:00.0: DRM: DCB conn 02: 00001246
Apr 04 10:17:43 kernel: nouveau
0000:01:00.0: DRM: DCB conn 03: 00002346 Apr 04 10:17:43 kernel: nouveau 0000:01:00.0: DRM: MM: using COPY for buffer copies Apr 04 10:17:43 kernel: [drm] Initialized nouveau 1.3.1 20120801 for 0000:01:00.0 on minor 0 Apr 04 17:17:55 kernel: nouveau 0000:01:00.0: vgaarb: changed VGA decodes: olddecodes=io+mem,decodes=none:owns=none Apr 04 17:17:58 kernel: nouveau 0000:01:00.0: bus: MMIO read of 00000000 FAULT at 409800 [ TIMEOUT ] Apr 04 17:18:00 kernel: nouveau 0000:01:00.0: timeout Apr 04 17:18:00 kernel: WARNING: CPU: 4 PID: 1594 at drivers/gpu/drm/nouveau/nvkm/engine/gr/gf100.c:1524 gf100_gr_init_ctxctl_ext+0x323/0x7d0 [nouveau] Apr 04 17:18:00 kernel: videobuf2_memops btintel snd_hda_core videobuf2_v4l2 mdev iwlwifi vfio_iommu_type1 snd_hwdep videobuf2_common bluetooth vfio snd_seq joydev videodev snd_seq_device kvm media snd_pcm cfg80211 wmi_bmof intel_wmi_thunderbolt ecdh_generic idma64 thinkpad_acpi ucsi_acpi mei_me snd_timer processor_thermal_device thunderbolt intel_lpss_pci i2c_i801 typec_ucsi mei ledtrig_audio irqbypass intel_pch_thermal intel_lpss intel_soc_dts_iosf snd typec soundcore rfkill int3403_thermal int340x_thermal_zone pcc_cpufreq acpi_pad int3400_thermal acpi_thermal_rel dm_crypt nouveau crct10dif_pclmul mxm_wmi i2c_algo_bit crc32_pclmul drm_kms_helper ttm crc32c_intel nvme drm e1000e ghash_clmulni_intel nvme_core serio_raw wmi video uas usb_storage Apr 04 17:18:00 kernel: RIP: 0010:gf100_gr_init_ctxctl_ext+0x323/0x7d0 [nouveau] Apr 04 17:18:00 kernel: gf100_gr_init_ctxctl+0x2e/0x2b0 [nouveau] Apr 04 17:18:00 kernel: ? gf100_gr_init+0x53c/0x580 [nouveau] Apr 04 17:18:00 kernel: nvkm_engine_init+0xaa/0x1e0 [nouveau] Apr 04 17:18:00 kernel: nvkm_subdev_init+0xb2/0x200 [nouveau] Apr 04 17:18:00 kernel: nvkm_engine_ref.part.0+0x43/0x60 [nouveau] Apr 04 17:18:00 kernel: nvkm_ioctl_new+0x125/0x220 [nouveau] Apr 04 17:18:00 kernel: ? nvkm_fifo_chan_child_del+0x90/0x90 [nouveau] Apr 04 17:18:00 kernel: ? 
gf100_gr_dtor+0xd0/0xd0 [nouveau] Apr 04 17:18:00 kernel: nvkm_ioctl+0xd8/0x170 [nouveau] Apr 04 17:18:00 kernel: usif_ioctl+0x6a3/0x700 [nouveau] Apr 04 17:18:00 kernel: nouveau_drm_ioctl+0xac/0xc0 [nouveau] Apr 04 17:18:00 kernel: nouveau 0000:01:00.0: gr: init failed, -16 Apr 04 17:18:02 kernel: nouveau 0000:01:00.0: timeout Apr 04 17:18:02 kernel: WARNING: CPU: 10 PID: 1594 at drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmmgf100.c:207 gf100_vmm_flush_+0x17b/0x190 [nouveau] Apr 04 17:18:02 kernel: videobuf2_memops btintel snd_hda_core videobuf2_v4l2 mdev iwlwifi vfio_iommu_type1 snd_hwdep videobuf2_common bluetooth vfio snd_seq joydev videodev snd_seq_device kvm media snd_pcm cfg80211 wmi_bmof intel_wmi_thunderbolt ecdh_generic idma64 thinkpad_acpi ucsi_acpi mei_me snd_timer processor_thermal_device thunderbolt intel_lpss_pci i2c_i801 typec_ucsi mei ledtrig_audio irqbypass intel_pch_thermal intel_lpss intel_soc_dts_iosf snd typec soundcore rfkill int3403_thermal int340x_thermal_zone pcc_cpufreq acpi_pad int3400_thermal acpi_thermal_rel dm_crypt nouveau crct10dif_pclmul mxm_wmi i2c_algo_bit crc32_pclmul drm_kms_helper ttm crc32c_intel nvme drm e1000e ghash_clmulni_intel nvme_core serio_raw wmi video uas usb_storage Apr 04 17:18:02 kernel: RIP: 0010:gf100_vmm_flush_+0x17b/0x190 [nouveau] Apr 04 17:18:02 kernel: nvkm_vmm_iter.constprop.9+0x352/0x810 [nouveau] Apr 04 17:18:02 kernel: ? nvkm_vmm_free_insert+0x80/0x80 [nouveau] Apr 04 17:18:02 kernel: ? gf100_vmm_aper+0x20/0x20 [nouveau] Apr 04 17:18:02 kernel: nvkm_vmm_ptes_unmap_put+0x2a/0x40 [nouveau] Apr 04 17:18:02 kernel: ? 
gf100_vmm_aper+0x20/0x20 [nouveau] Apr 04 17:18:02 kernel: nvkm_vmm_put_locked+0xf5/0x210 [nouveau] Apr 04 17:18:02 kernel: nvkm_uvmm_mthd+0x37b/0x830 [nouveau] Apr 04 17:18:02 kernel: nvkm_ioctl+0xd8/0x170 [nouveau] Apr 04 17:18:02 kernel: nvif_object_mthd+0x108/0x130 [nouveau] Apr 04 17:18:02 kernel: nvif_vmm_put+0x5c/0x80 [nouveau] Apr 04 17:18:02 kernel: nouveau_vma_del+0x70/0xc0 [nouveau] Apr 04 17:18:02 kernel: nouveau_gem_object_close+0x1d4/0x200 [nouveau] Apr 04 17:18:02 kernel: nouveau_drm_ioctl+0x65/0xc0 [nouveau] Apr 04 17:18:04 kernel: nouveau 0000:01:00.0: timeout Apr 04 17:18:04 kernel: WARNING: CPU: 4 PID: 1594 at drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmmgf100.c:207 gf100_vmm_flush_+0x17b/0x190 [nouveau] Apr 04 17:18:04 kernel: videobuf2_memops btintel snd_hda_core videobuf2_v4l2 mdev iwlwifi vfio_iommu_type1 snd_hwdep videobuf2_common bluetooth vfio snd_seq joydev videodev snd_seq_device kvm media snd_pcm cfg80211 wmi_bmof intel_wmi_thunderbolt ecdh_generic idma64 thinkpad_acpi ucsi_acpi mei_me snd_timer processor_thermal_device thunderbolt intel_lpss_pci i2c_i801 typec_ucsi mei ledtrig_audio irqbypass intel_pch_thermal intel_lpss intel_soc_dts_iosf snd typec soundcore rfkill int3403_thermal int340x_thermal_zone pcc_cpufreq acpi_pad int3400_thermal acpi_thermal_rel dm_crypt nouveau crct10dif_pclmul mxm_wmi i2c_algo_bit crc32_pclmul drm_kms_helper ttm crc32c_intel nvme drm e1000e ghash_clmulni_intel nvme_core serio_raw wmi video uas usb_storage Apr 04 17:18:04 kernel: RIP: 0010:gf100_vmm_flush_+0x17b/0x190 [nouveau] Apr 04 17:18:04 kernel: nvkm_vmm_unref_pdes+0xeb/0x1f0 [nouveau] Apr 04 17:18:04 kernel: nvkm_vmm_unref_ptes+0x1bc/0x1f0 [nouveau] Apr 04 17:18:04 kernel: ? nv50_instobj_release+0x74/0xc0 [nouveau] Apr 04 17:18:04 kernel: nvkm_vmm_iter.constprop.9+0x26d/0x810 [nouveau] Apr 04 17:18:04 kernel: ? nvkm_vmm_free_insert+0x80/0x80 [nouveau] Apr 04 17:18:04 kernel: ? 
gf100_vmm_aper+0x20/0x20 [nouveau] Apr 04 17:18:04 kernel: nvkm_vmm_ptes_unmap_put+0x2a/0x40 [nouveau] Apr 04 17:18:04 kernel: ? gf100_vmm_aper+0x20/0x20 [nouveau] Apr 04 17:18:04 kernel: nvkm_vmm_put_locked+0xf5/0x210 [nouveau] Apr 04 17:18:04 kernel: nvkm_uvmm_mthd+0x37b/0x830 [nouveau] Apr 04 17:18:04 kernel: nvkm_ioctl+0xd8/0x170 [nouveau] Apr 04 17:18:04 kernel: nvif_object_mthd+0x108/0x130 [nouveau] Apr 04 17:18:04 kernel: nvif_vmm_put+0x5c/0x80 [nouveau] Apr 04 17:18:04 kernel: nouveau_vma_del+0x70/0xc0 [nouveau] Apr 04 17:18:04 kernel: nouveau_gem_object_close+0x1d4/0x200 [nouveau] Apr 04 17:18:04 kernel: nouveau_drm_ioctl+0x65/0xc0 [nouveau] ... ... ... Apr 04 17:19:09 kernel: nouveau 0000:01:00.0: timeout Apr 04 17:19:09 kernel: WARNING: CPU: 0 PID: 336 at drivers/gpu/drm/nouveau/nvkm/engine/disp/sornv50.c:43 nv50_sor_power_wait+0x99/0xb0 [nouveau] Apr 04 17:19:09 kernel: videobuf2_memops btintel snd_hda_core videobuf2_v4l2 mdev iwlwifi vfio_iommu_type1 snd_hwdep videobuf2_common bluetooth vfio snd_seq joydev videodev snd_seq_device kvm media snd_pcm cfg80211 wmi_bmof intel_wmi_thunderbolt ecdh_generic idma64 thinkpad_acpi ucsi_acpi mei_me snd_timer processor_thermal_device thunderbolt intel_lpss_pci i2c_i801 typec_ucsi mei ledtrig_audio irqbypass intel_pch_thermal intel_lpss intel_soc_dts_iosf snd typec soundcore rfkill int3403_thermal int340x_thermal_zone pcc_cpufreq acpi_pad int3400_thermal acpi_thermal_rel dm_crypt nouveau crct10dif_pclmul mxm_wmi i2c_algo_bit crc32_pclmul drm_kms_helper ttm crc32c_intel nvme drm e1000e ghash_clmulni_intel nvme_core serio_raw wmi video uas usb_storage Apr 04 17:19:09 kernel: RIP: 0010:nv50_sor_power_wait+0x99/0xb0 [nouveau] Apr 04 17:19:09 kernel: nv50_sor_power+0xa6/0x130 [nouveau] Apr 04 17:19:09 kernel: nvkm_disp_init+0xb6/0xd0 [nouveau] Apr 04 17:19:09 kernel: nvkm_engine_init+0xaa/0x1e0 [nouveau] Apr 04 17:19:09 kernel: nvkm_subdev_init+0xb2/0x200 [nouveau] Apr 04 17:19:09 kernel: nvkm_device_fini+0xb7/0x1c0 
[nouveau] Apr 04 17:19:09 kernel: nvkm_udevice_fini+0x4c/0x60 [nouveau] Apr 04 17:19:09 kernel: nvkm_object_fini+0xbc/0x150 [nouveau] Apr 04 17:19:09 kernel: nvkm_object_fini+0x73/0x150 [nouveau] Apr 04 17:19:09 kernel: nouveau_do_suspend+0xfd/0x2c0 [nouveau] Apr 04 17:19:09 kernel: nouveau_pmops_runtime_suspend+0x42/0xa0 [nouveau] Apr 04 17:19:11 kernel: nouveau 0000:01:00.0: timeout Apr 04 17:19:11 kernel: WARNING: CPU: 0 PID: 336 at drivers/gpu/drm/nouveau/nvkm/engine/disp/sornv50.c:63 nv50_sor_power+0x127/0x130 [nouveau] Apr 04 17:19:11 kernel: videobuf2_memops btintel snd_hda_core videobuf2_v4l2 mdev iwlwifi vfio_iommu_type1 snd_hwdep videobuf2_common bluetooth vfio snd_seq joydev videodev snd_seq_device kvm media snd_pcm cfg80211 wmi_bmof intel_wmi_thunderbolt ecdh_generic idma64 thinkpad_acpi ucsi_acpi mei_me snd_timer processor_thermal_device thunderbolt intel_lpss_pci i2c_i801 typec_ucsi mei ledtrig_audio irqbypass intel_pch_thermal intel_lpss intel_soc_dts_iosf snd typec soundcore rfkill int3403_thermal int340x_thermal_zone pcc_cpufreq acpi_pad int3400_thermal acpi_thermal_rel dm_crypt nouveau crct10dif_pclmul mxm_wmi i2c_algo_bit crc32_pclmul drm_kms_helper ttm crc32c_intel nvme drm e1000e ghash_clmulni_intel nvme_core serio_raw wmi video uas usb_storage Apr 04 17:19:11 kernel: RIP: 0010:nv50_sor_power+0x127/0x130 [nouveau] Apr 04 17:19:11 kernel: nvkm_disp_init+0xb6/0xd0 [nouveau] Apr 04 17:19:11 kernel: nvkm_engine_init+0xaa/0x1e0 [nouveau] Apr 04 17:19:11 kernel: nvkm_subdev_init+0xb2/0x200 [nouveau] Apr 04 17:19:11 kernel: nvkm_device_fini+0xb7/0x1c0 [nouveau] Apr 04 17:19:11 kernel: nvkm_udevice_fini+0x4c/0x60 [nouveau] Apr 04 17:19:11 kernel: nvkm_object_fini+0xbc/0x150 [nouveau] Apr 04 17:19:11 kernel: nvkm_object_fini+0x73/0x150 [nouveau] Apr 04 17:19:11 kernel: nouveau_do_suspend+0xfd/0x2c0 [nouveau] Apr 04 17:19:11 kernel: nouveau_pmops_runtime_suspend+0x42/0xa0 [nouveau] Apr 04 17:19:11 kernel: nouveau: DRM-master:00000000:00000080: suspend 
failed with -110 Apr 04 17:19:13 kernel: nouveau 0000:01:00.0: timeout Apr 04 17:19:13 kernel: WARNING: CPU: 0 PID: 336 at drivers/gpu/drm/nouveau/nvkm/engine/disp/piocgf119.c:63 gf119_disp_pioc_init+0xdc/0x130 [nouveau] Apr 04 17:19:13 kernel: videobuf2_memops btintel snd_hda_core videobuf2_v4l2 mdev iwlwifi vfio_iommu_type1 snd_hwdep videobuf2_common bluetooth vfio snd_seq joydev videodev snd_seq_device kvm media snd_pcm cfg80211 wmi_bmof intel_wmi_thunderbolt ecdh_generic idma64 thinkpad_acpi ucsi_acpi mei_me snd_timer processor_thermal_device thunderbolt intel_lpss_pci i2c_i801 typec_ucsi mei ledtrig_audio irqbypass intel_pch_thermal intel_lpss intel_soc_dts_iosf snd typec soundcore rfkill int3403_thermal int340x_thermal_zone pcc_cpufreq acpi_pad int3400_thermal acpi_thermal_rel dm_crypt nouveau crct10dif_pclmul mxm_wmi i2c_algo_bit crc32_pclmul drm_kms_helper ttm crc32c_intel nvme drm e1000e ghash_clmulni_intel nvme_core serio_raw wmi video uas usb_storage Apr 04 17:19:13 kernel: RIP: 0010:gf119_disp_pioc_init+0xdc/0x130 [nouveau] Apr 04 17:19:13 kernel: nvkm_object_init+0x3e/0x100 [nouveau] Apr 04 17:19:13 kernel: nvkm_object_init+0x71/0x100 [nouveau] Apr 04 17:19:13 kernel: nvkm_object_init+0x71/0x100 [nouveau] Apr 04 17:19:13 kernel: nvkm_object_init+0x71/0x100 [nouveau] Apr 04 17:19:13 kernel: nvkm_object_fini+0x137/0x150 [nouveau] Apr 04 17:19:13 kernel: nouveau_do_suspend+0xfd/0x2c0 [nouveau] Apr 04 17:19:13 kernel: nouveau_pmops_runtime_suspend+0x42/0xa0 [nouveau] Apr 04 17:19:13 kernel: nouveau 0000:01:00.0: disp: ch 20 init: bad00100 Apr 04 17:19:13 kernel: nouveau: DRM:00000000:0000917a: init failed with -16 Apr 04 17:19:13 kernel: nouveau: DRM:00000000:00009870: init failed with -16 Apr 04 17:19:13 kernel: nouveau: DRM:00000000:00000080: init failed with -16 Apr 04 17:19:13 kernel: nouveau: DRM-master:00000000:00000000: init failed with -16 Apr 04 17:19:13 kernel: RIP: 0010:evo_wait+0x5a/0x130 [nouveau] Apr 04 17:19:13 kernel: 
core507d_init+0x1d/0x70 [nouveau] Apr 04 17:19:13 kernel: nv50_display_init+0x34/0xf0 [nouveau] Apr 04 17:19:13 kernel: nouveau_display_init+0x36/0xe0 [nouveau] Apr 04 17:19:13 kernel: nouveau_display_resume+0x39/0x250 [nouveau] Apr 04 17:19:13 kernel: nouveau_do_suspend+0x156/0x2c0 [nouveau] Apr 04 17:19:13 kernel: nouveau_pmops_runtime_suspend+0x42/0xa0 [nouveau] Apr 04 17:19:13 kernel: videobuf2_memops btintel snd_hda_core videobuf2_v4l2 mdev iwlwifi vfio_iommu_type1 snd_hwdep videobuf2_common bluetooth vfio snd_seq joydev videodev snd_seq_device kvm media snd_pcm cfg80211 wmi_bmof intel_wmi_thunderbolt ecdh_generic idma64 thinkpad_acpi ucsi_acpi mei_me snd_timer processor_thermal_device thunderbolt intel_lpss_pci i2c_i801 typec_ucsi mei ledtrig_audio irqbypass intel_pch_thermal intel_lpss intel_soc_dts_iosf snd typec soundcore rfkill int3403_thermal int340x_thermal_zone pcc_cpufreq acpi_pad int3400_thermal acpi_thermal_rel dm_crypt nouveau crct10dif_pclmul mxm_wmi i2c_algo_bit crc32_pclmul drm_kms_helper ttm crc32c_intel nvme drm e1000e ghash_clmulni_intel nvme_core serio_raw wmi video uas usb_storage Apr 04 17:19:13 kernel: RIP: 0010:evo_wait+0x5a/0x130 [nouveau] This bug appears to have been reported against 'rawhide' during the Fedora 31 development cycle. Changing version to '31'. Seeing this error coming up on one of my systems with the 5.3.8 kernel on F31. Screen freezes up and I need to reboot the system. After rebooting, the system is functional for some seemingly random period of time before it starts acting up again. ~> lspci -d 10de:05e2 -vnn 03:00.0 VGA compatible controller [0300]: NVIDIA Corporation GT200 [GeForce GTX 260] [10de:05e2] (rev a1) (prog-if 00 [VGA controller]) Subsystem: eVga.com. Corp. 
Device [3842:1255] Flags: bus master, fast devsel, latency 0, IRQ 44 Memory at f8000000 (32-bit, non-prefetchable) [size=16M] Memory at e0000000 (64-bit, prefetchable) [size=256M] Memory at f6000000 (64-bit, non-prefetchable) [size=32M] I/O ports at af00 [size=128] [virtual] Expansion ROM at 000c0000 [disabled] [size=128K] Capabilities: <access denied> Kernel driver in use: nouveau Kernel modules: nouveau From journalctl --no-hostname -k -b 0 |grep nouveau output, the problem seems to start with Nov 03 12:28:20 kernel: nouveau 0000:03:00.0: disp: ERROR 5 [INVALID_STATE] 0b [] chid 0 mthd 0080 data 00000000 and then followed by a stream of these errors Nov 03 12:40:49 kernel: nouveau: evo channel stalled Nov 03 12:40:51 kernel: nouveau 0000:03:00.0: DRM: base-1: timeout From the most recent occurrence on my system with kernel 5.3.13-300 Nov 30 18:41:58 kernel: nouveau 0000:03:00.0: disp: ERROR 1 [PUSHBUFFER_ERR] 01 [] chid 0 mthd 0000 data 00000000 Nov 30 18:42:00 kernel: nouveau 0000:03:00.0: DRM: core notifier timeout Nov 30 18:42:02 kernel: nouveau 0000:03:00.0: DRM: base-1: timeout Nov 30 18:42:04 kernel: nouveau 0000:03:00.0: DRM: base-1: timeout Nov 30 18:42:06 kernel: nouveau: evo channel stalled Nov 30 18:42:08 kernel: nouveau 0000:03:00.0: DRM: base-1: timeout Nov 30 18:42:10 kernel: nouveau 0000:03:00.0: DRM: base-1: timeout Nov 30 18:42:12 kernel: nouveau 0000:03:00.0: DRM: base-1: timeout Nov 30 18:42:14 kernel: nouveau 0000:03:00.0: DRM: core notifier timeout Nov 30 18:42:14 kernel: nouveau 0000:03:00.0: DRM: base-1: timeout Nov 30 18:42:16 kernel: nouveau 0000:03:00.0: DRM: base-0: timeout Nov 30 18:42:18 kernel: nouveau 0000:03:00.0: DRM: base-0: timeout Nov 30 18:42:18 kernel: nouveau 0000:03:00.0: DRM: base-1: timeout Nov 30 18:42:20 kernel: nouveau 0000:03:00.0: DRM: base-0: timeout I see the same issue cat /etc/redhat-release Fedora release 31 (Thirty One) $ uname -a Linux yferszt-fc 5.3.15-300.fc31.x86_64 #1 SMP Thu Dec 5 15:04:01 UTC 2019 x86_64 
x86_64 x86_64 GNU/Linux lspci | grep -e VGA 00:02.0 VGA compatible controller: Intel Corporation HD Graphics 530 (rev 06) 01:00.0 VGA compatible controller: NVIDIA Corporation GM107GLM [Quadro M1000M] (rev a2) ===== Dec 12 12:07:51 yferszt-fc kernel: nouveau: evo channel stalled Dec 12 12:08:02 yferszt-fc kernel: nouveau 0000:01:00.0: DRM: core notifier timeout Dec 12 12:08:12 yferszt-fc kernel: nouveau 0000:01:00.0: DRM: base-0: timeout Dec 12 12:08:15 yferszt-fc kernel: nouveau 0000:01:00.0: DRM: core notifier timeout ===== GUI freezes and ssh is still working. rebooting the system fixes the issue till it happens randomly again. Still reproducible on F31: $ uname -r 5.5.15-200.fc31.x86_64 $ dmesg|grep nouveau [ 6.105904] nouveau 0000:01:00.0: NVIDIA G98 (298480a2) [ 6.196288] nouveau 0000:01:00.0: bios: version 62.98.3c.00.44 [ 6.234673] nouveau 0000:01:00.0: bios: M0203T not found [ 6.234677] nouveau 0000:01:00.0: bios: M0203E not matched! [ 6.234680] nouveau 0000:01:00.0: fb: 512 MiB DDR2 [ 6.347632] nouveau 0000:01:00.0: DRM: VRAM: 512 MiB [ 6.347634] nouveau 0000:01:00.0: DRM: GART: 1048576 MiB [ 6.347640] nouveau 0000:01:00.0: DRM: TMDS table version 2.0 [ 6.347643] nouveau 0000:01:00.0: DRM: DCB version 4.0 [ 6.347646] nouveau 0000:01:00.0: DRM: DCB outp 00: 01011323 00010034 [ 6.347649] nouveau 0000:01:00.0: DRM: DCB outp 01: 02000300 00000028 [ 6.347651] nouveau 0000:01:00.0: DRM: DCB outp 02: 02022312 00020030 [ 6.347653] nouveau 0000:01:00.0: DRM: DCB conn 00: 00000000 [ 6.347655] nouveau 0000:01:00.0: DRM: DCB conn 01: 00000140 [ 6.347657] nouveau 0000:01:00.0: DRM: DCB conn 02: 00002261 [ 6.347659] nouveau 0000:01:00.0: DRM: DCB conn 07: 00000513 [ 6.351313] nouveau 0000:01:00.0: DRM: MM: using M2MF for buffer copies [ 6.440483] nouveau 0000:01:00.0: DRM: allocated 1440x900 fb: 0x50000, bo (____ptrval____) [ 6.440646] fbcon: nouveaudrmfb (fb0) is primary device [ 8.442191] nouveau 0000:01:00.0: DRM: core notifier timeout [ 10.442203] nouveau 
0000:01:00.0: DRM: base-0: timeout [ 10.446983] nouveau 0000:01:00.0: fb0: nouveaudrmfb frame buffer device [ 10.454934] [drm] Initialized nouveau 1.3.1 20120801 for 0000:01:00.0 on minor 0 [ 12.627942] nouveau 0000:01:00.0: DRM: core notifier timeout [ 14.628130] nouveau 0000:01:00.0: DRM: base-0: timeout [ 2126.158743] nouveau 0000:01:00.0: DRM: core notifier timeout [ 2129.529458] nouveau 0000:01:00.0: DRM: core notifier timeout [ 2131.550202] nouveau 0000:01:00.0: DRM: base-0: timeout [ 2153.738236] nouveau 0000:01:00.0: DRM: base-0: timeout [ 2155.738333] nouveau 0000:01:00.0: DRM: core notifier timeout [ 2161.063900] nouveau 0000:01:00.0: DRM: base-0: timeout [ 2163.080106] nouveau 0000:01:00.0: DRM: base-0: timeout [ 2165.182431] nouveau 0000:01:00.0: DRM: base-0: timeout [ 2167.203106] nouveau 0000:01:00.0: DRM: base-0: timeout This is the only GPU in this machine, so I don't have the luxury of using a built-in Intel GPU waiting for a fix. Happening on F31 with Linux ir-pc 5.3.7-301.fc31.x86_64 #1 SMP Mon Oct 21 19:18:58 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux Apr 15 02:01:43 ir-pc kernel: nouveau 0000:1d:00.0: DRM: base-1: timeout Apr 15 02:01:45 ir-pc kernel: nouveau 0000:1d:00.0: DRM: base-1: timeout Apr 15 02:01:47 ir-pc kernel: nouveau 0000:1d:00.0: DRM: base-1: timeout Apr 15 02:01:49 ir-pc kernel: nouveau 0000:1d:00.0: DRM: base-1: timeout Apr 15 02:01:51 ir-pc kernel: nouveau 0000:1d:00.0: DRM: base-1: timeout Apr 15 02:01:53 ir-pc kernel: nouveau 0000:1d:00.0: DRM: base-1: timeout Apr 15 02:01:55 ir-pc kernel: nouveau 0000:1d:00.0: DRM: base-1: timeout Same here! $ cat /etc/system-release Fedora release 31 (Thirty One) $ uname -a Linux 5.5.15-200.fc31.x86_64 #1 SMP Thu Apr 2 19:16:17 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux $ lspci -d 10de:1b80 -vnn 01:00.0 VGA compatible controller [0300]: NVIDIA Corporation GP104 [GeForce GTX 1080] [10de:1b80] (rev a1) (prog-if 00 [VGA controller]) Subsystem: ASUSTeK Computer Inc. 
Device [1043:8592] Flags: bus master, fast devsel, latency 0, IRQ 136 Memory at de000000 (32-bit, non-prefetchable) [size=16M] Memory at c0000000 (64-bit, prefetchable) [size=256M] Memory at d0000000 (64-bit, prefetchable) [size=32M] I/O ports at e000 [size=128] Expansion ROM at 000c0000 [disabled] [size=128K] Capabilities: [60] Power Management version 3 Capabilities: [68] MSI: Enable+ Count=1/1 Maskable- 64bit+ Capabilities: [78] Express Legacy Endpoint, MSI 00 Capabilities: [100] Virtual Channel Capabilities: [250] Latency Tolerance Reporting Capabilities: [128] Power Budgeting <?> Capabilities: [420] Advanced Error Reporting Capabilities: [600] Vendor Specific Information: ID=0001 Rev=1 Len=024 <?> Capabilities: [900] Secondary PCI Express Kernel driver in use: nouveau Kernel modules: nouveau $ dmesg -w | grep nouveau [ 2.115632] fb0: switching to nouveaufb from EFI VGA [ 2.115708] nouveau 0000:01:00.0: NVIDIA GP104 (134000a1) [ 2.221635] nouveau 0000:01:00.0: bios: version 86.04.17.00.1c [ 2.222157] nouveau 0000:01:00.0: fb: 8192 MiB GDDR5X [ 2.229078] nouveau 0000:01:00.0: DRM: VRAM: 8192 MiB [ 2.229079] nouveau 0000:01:00.0: DRM: GART: 536870912 MiB [ 2.229080] nouveau 0000:01:00.0: DRM: BIT table 'A' not found [ 2.229081] nouveau 0000:01:00.0: DRM: BIT table 'L' not found [ 2.229082] nouveau 0000:01:00.0: DRM: TMDS table version 2.0 [ 2.229083] nouveau 0000:01:00.0: DRM: DCB version 4.1 [ 2.229084] nouveau 0000:01:00.0: DRM: DCB outp 00: 01000f42 00020030 [ 2.229085] nouveau 0000:01:00.0: DRM: DCB outp 01: 04811f96 04600020 [ 2.229086] nouveau 0000:01:00.0: DRM: DCB outp 02: 04011f92 00020020 [ 2.229087] nouveau 0000:01:00.0: DRM: DCB outp 03: 04822f86 04600010 [ 2.229088] nouveau 0000:01:00.0: DRM: DCB outp 04: 04022f82 00020010 [ 2.229089] nouveau 0000:01:00.0: DRM: DCB outp 06: 02033f62 00020010 [ 2.229089] nouveau 0000:01:00.0: DRM: DCB outp 08: 02044f72 00020020 [ 2.229090] nouveau 0000:01:00.0: DRM: DCB conn 00: 00001031 [ 2.229091] 
nouveau 0000:01:00.0: DRM: DCB conn 01: 02000146 [ 2.229092] nouveau 0000:01:00.0: DRM: DCB conn 02: 01000246 [ 2.229093] nouveau 0000:01:00.0: DRM: DCB conn 03: 00010361 [ 2.229093] nouveau 0000:01:00.0: DRM: DCB conn 04: 00020461 [ 2.229390] nouveau 0000:01:00.0: DRM: MM: using COPY for buffer copies [ 2.857595] nouveau 0000:01:00.0: DRM: allocated 1920x1080 fb: 0x200000, bo 00000000188c5ccc [ 2.892257] fbcon: nouveaudrmfb (fb0) is primary device [ 2.892259] nouveau 0000:01:00.0: fb0: nouveaudrmfb frame buffer device [ 2.944672] [drm] Initialized nouveau 1.3.1 20120801 for 0000:01:00.0 on minor 0 NVRM: nouveau, rivafb, nvidiafb or rivatv NVRM: nouveau, rivafb, nvidiafb or rivatv NVRM: nouveau, rivafb, nvidiafb or rivatv NVRM: nouveau, rivafb, nvidiafb or rivatv NVRM: nouveau, rivafb, nvidiafb or rivatv NVRM: nouveau, rivafb, nvidiafb or rivatv [33245.082049] nouveau 0000:01:00.0: disp: chid 0 mthd 0080 data 00000002 00005080 00000015 [33245.082050] nouveau 0000:01:00.0: disp: Core: [33245.082054] nouveau 0000:01:00.0: disp: 0080: 00000000 -> 00000002 [33245.082057] nouveau 0000:01:00.0: disp: 0084: 00000000 -> 80000000 [33245.082060] nouveau 0000:01:00.0: disp: 0088: f0000000 [33245.082061] nouveau 0000:01:00.0: disp: Core - DAC 0: [33245.082065] nouveau 0000:01:00.0: disp: 0180: 00000000 [33245.082068] nouveau 0000:01:00.0: disp: 0184: 00000000 [33245.082073] nouveau 0000:01:00.0: disp: 0188: 00000000 [33245.082075] nouveau 0000:01:00.0: disp: 0190: 00000000 [33245.082076] nouveau 0000:01:00.0: disp: Core - DAC 1: [33245.082095] nouveau 0000:01:00.0: disp: 01a0: 00000000 [33245.082099] nouveau 0000:01:00.0: disp: 01a4: 00000000 [33245.082105] nouveau 0000:01:00.0: disp: 01a8: 00000000 [33245.082110] nouveau 0000:01:00.0: disp: 01b0: 00000000 {...} [33267.065439] nouveau 0000:01:00.0: disp: 0e54: 00000000 [33267.065442] nouveau 0000:01:00.0: disp: 0e58: 00000000 [33267.065446] nouveau 0000:01:00.0: disp: 0e5c: 00000001 [33269.548289] nouveau 0000:01:00.0: DRM: 
base-0: timeout [33271.562467] nouveau: evo channel stalled [33273.562548] nouveau 0000:01:00.0: DRM: base-0: timeout [33275.565515] nouveau 0000:01:00.0: DRM: base-0: timeout [33277.568617] nouveau 0000:01:00.0: DRM: base-0: timeout [33279.570770] nouveau 0000:01:00.0: DRM: base-0: timeout [33281.575781] nouveau 0000:01:00.0: DRM: base-0: timeout [33283.595087] nouveau 0000:01:00.0: DRM: base-0: timeout [33285.598140] nouveau 0000:01:00.0: DRM: base-0: timeout [33287.601285] nouveau 0000:01:00.0: DRM: base-0: timeout [33289.604393] nouveau 0000:01:00.0: DRM: base-0: timeout [33291.644816] nouveau 0000:01:00.0: DRM: base-0: timeout {end} I get this issue on Fedora 32 beta with kernel 5.6.5-300.fc32.x86_64; I'm using an Nvidia GeForce 1070. I don't get this issue straight away on startup; I get it at random throughout the day, usually 3 or 4 times a day, and even once while writing this :) Using: 01:00.0 VGA compatible controller: NVIDIA Corporation GP104 [GeForce GTX 1070] (rev a1) In my journalctl logs I get: Apr 23 10:57:24 pc kernel: nouveau 0000:01:00.0: DRM: base-0: timeout Apr 23 10:57:22 pc kernel: nouveau 0000:01:00.0: DRM: base-0: timeout Apr 23 10:57:20 pc kernel: nouveau 0000:01:00.0: DRM: base-0: timeout Apr 23 10:57:18 pc kernel: nouveau 0000:01:00.0: DRM: base-0: timeout Apr 23 10:57:16 pc kernel: nouveau 0000:01:00.0: DRM: base-0: timeout Apr 23 10:57:14 pc kernel: nouveau 0000:01:00.0: DRM: base-0: timeout Apr 23 10:57:12 pc kernel: nouveau 0000:01:00.0: DRM: core notifier timeout A reboot seems to be the only way to fix it. I also hit this problem of a frozen screen with an Nvidia GP107GL [Quadro P400]. The system in the background was working just fine, as confirmed by blindly switching to the next VT (where I was also logged in) and killing a sound-producing program by name. NOTE: it would be nice to have a command handy to try to recover from this problem, e.g. 
when one can still enter the commands as was my case per above, or when ssh connection is an option -- if at all possible, of course Luckily, it was the only manifestation of the problem in about 3 months IIRC. Jul 27 17:40:43 sway[2504]: 2020-07-27 17:40:43 - [sway/commands.c:255] Handling command 'workspace 9' Jul 27 17:40:43 sway[2504]: 2020-07-27 17:40:43 - [sway/commands.c:255] Handling command 'workspace 10' Jul 27 17:40:44 sway[2504]: 2020-07-27 17:40:44 - [sway/commands.c:255] Handling command 'workspace 9' Jul 27 17:40:44 sway[2504]: 2020-07-27 17:40:44 - [sway/commands.c:255] Handling command 'workspace 8' Jul 27 17:40:45 sway[2504]: 2020-07-27 17:40:45 - [sway/commands.c:255] Handling command 'workspace 7' Jul 27 17:40:45 sway[2504]: 2020-07-27 17:40:45 - [sway/commands.c:255] Handling command 'workspace 6' Jul 27 17:40:45 kernel: nouveau 0000:65:00.0: disp: chid 4 mthd 0a00 data 0000a004 10003a00 00000000 Jul 27 17:40:45 kernel: nouveau 0000:65:00.0: disp: chid 4 mthd 0a04 data 0000cf00 10003a04 00000000 Jul 27 17:40:45 kernel: nouveau 0000:65:00.0: disp: chid 4 mthd 0a08 data 00040084 10003a08 00000000 Jul 27 17:40:45 kernel: nouveau 0000:65:00.0: disp: chid 4 mthd 0a0c data 00000010 10003a0c 00000000 Jul 27 17:40:45 kernel: nouveau 0000:65:00.0: disp: chid 4 mthd 0a10 data 000400c0 10003a10 00000000 Jul 27 17:40:45 kernel: nouveau 0000:65:00.0: disp: chid 4 mthd 0a14 data fb0000fe 10003a14 00000000 Jul 27 17:40:45 kernel: nouveau 0000:65:00.0: disp: chid 4 mthd 0a18 data 00140400 10003a18 00000000 Jul 27 17:40:45 kernel: nouveau 0000:65:00.0: disp: chid 4 mthd 0a1c data 002e4000 10003a1c 00000000 Jul 27 17:40:45 kernel: nouveau 0000:65:00.0: disp: chid 4 mthd 0a20 data 00000000 10003a20 00000000 Jul 27 17:40:45 kernel: nouveau 0000:65:00.0: disp: chid 4 mthd 0a24 data 05a00a00 10003a24 00000000 Jul 27 17:40:45 kernel: nouveau 0000:65:00.0: disp: chid 4 mthd 0a28 data 0000a004 10003a28 00000000 Jul 27 17:40:45 kernel: nouveau 0000:65:00.0: disp: chid 4 
mthd 0a2c data 0000cf00 10003a2c 00000000 Jul 27 17:40:45 kernel: nouveau 0000:65:00.0: disp: chid 4 mthd 0a30 data 00040080 10003a30 00000000 Jul 27 17:40:45 kernel: nouveau 0000:65:00.0: disp: chid 4 mthd 0a34 data 00000000 10003a34 00000000 Jul 27 17:40:47 kernel: nouveau 0000:65:00.0: DRM: base-3: timeout Jul 27 17:40:47 sway[2504]: 2020-07-27 17:40:47 - [sway/commands.c:255] Handling command 'workspace 6' Jul 27 17:40:47 sway[2504]: 2020-07-27 17:40:47 - [sway/commands.c:255] Handling command 'workspace 4' Jul 27 17:40:47 sway[2504]: 2020-07-27 17:40:47 - [sway/commands.c:255] Handling command 'workspace 3' Jul 27 17:40:47 sway[2504]: 2020-07-27 17:40:47 - [sway/commands.c:255] Handling command 'workspace 4' Jul 27 17:40:47 kernel: nouveau 0000:65:00.0: disp: chid 4 mthd 0a38 data 000800a0 10003a38 00000000 Jul 27 17:40:47 kernel: nouveau 0000:65:00.0: disp: chid 4 mthd 0a3c data 00000130 10003a3c 00000000 Jul 27 17:40:47 kernel: nouveau 0000:65:00.0: disp: chid 4 mthd 0a40 data f0000000 10003a40 00000000 Jul 27 17:40:47 kernel: nouveau 0000:65:00.0: disp: chid 4 mthd 0a44 data 00040084 10003a44 00000000 Jul 27 17:40:47 kernel: nouveau 0000:65:00.0: disp: chid 4 mthd 0a48 data 00000010 10003a48 00000000 Jul 27 17:40:47 kernel: nouveau 0000:65:00.0: disp: chid 4 mthd 0a4c data 000400c0 10003a4c 00000000 Jul 27 17:40:47 kernel: nouveau 0000:65:00.0: disp: chid 4 mthd 0a50 data fb0000fe 10003a50 00000000 Jul 27 17:40:47 kernel: nouveau 0000:65:00.0: disp: chid 4 mthd 0a54 data 00140400 10003a54 00000000 Jul 27 17:40:47 kernel: nouveau 0000:65:00.0: disp: chid 4 mthd 0a58 data 003c8000 10003a58 00000000 Jul 27 17:40:47 kernel: nouveau 0000:65:00.0: disp: chid 4 mthd 0a5c data 00000000 10003a5c 00000000 Jul 27 17:40:47 kernel: nouveau 0000:65:00.0: disp: chid 4 mthd 0a60 data 05a00a00 10003a60 00000000 Jul 27 17:40:47 kernel: nouveau 0000:65:00.0: disp: chid 4 mthd 0a64 data 0000a004 10003a64 00000000 Jul 27 17:40:47 kernel: nouveau 0000:65:00.0: disp: chid 4 mthd 
0a68 data 0000cf00 10003a68 00000000 Jul 27 17:40:47 kernel: nouveau 0000:65:00.0: disp: chid 4 mthd 0a6c data 00040080 10003a6c 00000000 Jul 27 17:40:47 kernel: nouveau 0000:65:00.0: disp: chid 4 mthd 0a70 data 00000000 10003a70 00000000 Jul 27 17:40:49 kernel: nouveau 0000:65:00.0: DRM: base-3: timeout Jul 27 17:40:49 sway[2504]: 2020-07-27 17:40:49 - [sway/commands.c:255] Handling command 'workspace 4' Jul 27 17:40:50 kernel: nouveau 0000:65:00.0: disp: chid 4 mthd 0a74 data 000800a0 10003a74 00000000 Jul 27 17:40:50 kernel: nouveau 0000:65:00.0: disp: chid 4 mthd 0a78 data 00000120 10003a78 00000000 Jul 27 17:40:50 kernel: nouveau 0000:65:00.0: disp: chid 4 mthd 0a7c data f0000000 10003a7c 00000000 Jul 27 17:40:50 kernel: nouveau 0000:65:00.0: disp: chid 4 mthd 0a80 data 00040084 10003a80 00000000 Jul 27 17:40:50 kernel: nouveau 0000:65:00.0: disp: chid 4 mthd 0a84 data 00000010 10003a84 00000000 Jul 27 17:40:50 kernel: nouveau 0000:65:00.0: disp: chid 4 mthd 0a88 data 000400c0 10003a88 00000000 Jul 27 17:40:50 kernel: nouveau 0000:65:00.0: disp: chid 4 mthd 0a8c data fb0000fe 10003a8c 00000000 Jul 27 17:40:50 kernel: nouveau 0000:65:00.0: disp: chid 4 mthd 0a90 data 00140400 10003a90 00000000 Jul 27 17:40:50 kernel: nouveau 0000:65:00.0: disp: chid 4 mthd 0a94 data 002e4000 10003a94 00000000 Jul 27 17:40:50 kernel: nouveau 0000:65:00.0: disp: chid 4 mthd 0a98 data 00000000 10003a98 00000000 Jul 27 17:40:50 kernel: nouveau 0000:65:00.0: disp: chid 4 mthd 0a9c data 05a00a00 10003a9c 00000000 Jul 27 17:40:50 kernel: nouveau 0000:65:00.0: disp: chid 4 mthd 0aa0 data 0000a004 10003aa0 00000000 Jul 27 17:40:50 kernel: nouveau 0000:65:00.0: disp: chid 4 mthd 0aa4 data 0000cf00 10003aa4 00000000 Jul 27 17:40:50 kernel: nouveau 0000:65:00.0: disp: chid 4 mthd 0aa8 data 00040080 10003aa8 00000000 Jul 27 17:40:50 kernel: nouveau 0000:65:00.0: disp: chid 4 mthd 0aac data 00000000 10003aac 00000000 Jul 27 17:40:52 kernel: nouveau 0000:65:00.0: DRM: base-3: timeout Jul 27 
17:40:52 kernel: nouveau 0000:65:00.0: disp: chid 4 mthd 0ab0 data 000800a0 10003ab0 00000000
Jul 27 17:40:52 kernel: nouveau 0000:65:00.0: disp: chid 4 mthd 0ab4 data 00000130 10003ab4 00000000
Jul 27 17:40:52 kernel: nouveau 0000:65:00.0: disp: chid 4 mthd 0ab8 data f0000000 10003ab8 00000000
Jul 27 17:40:52 kernel: nouveau 0000:65:00.0: disp: chid 4 mthd 0abc data 00040084 10003abc 00000000
Jul 27 17:40:52 kernel: nouveau 0000:65:00.0: disp: chid 4 mthd 0ac0 data 00000010 10003ac0 00000000
Jul 27 17:40:52 kernel: nouveau 0000:65:00.0: disp: chid 4 mthd 0ac4 data 000400c0 10003ac4 00000000
Jul 27 17:40:52 kernel: nouveau 0000:65:00.0: disp: chid 4 mthd 0ac8 data fb0000fe 10003ac8 00000000
Jul 27 17:40:52 kernel: nouveau 0000:65:00.0: disp: chid 4 mthd 0acc data 00140400 10003acc 00000000
Jul 27 17:40:52 kernel: nouveau 0000:65:00.0: disp: chid 4 mthd 0ad0 data 003c8000 10003ad0 00000000
Jul 27 17:40:52 kernel: nouveau 0000:65:00.0: disp: chid 4 mthd 0ad4 data 00000000 10003ad4 00000000
Jul 27 17:40:52 kernel: nouveau 0000:65:00.0: disp: chid 4 mthd 0ad8 data 05a00a00 10003ad8 00000000
Jul 27 17:40:52 kernel: nouveau 0000:65:00.0: disp: chid 4 mthd 0adc data 0000a004 10003adc 00000000
Jul 27 17:40:52 kernel: nouveau 0000:65:00.0: disp: chid 4 mthd 0ae0 data 0000cf00 10003ae0 00000000
Jul 27 17:40:52 kernel: nouveau 0000:65:00.0: disp: chid 4 mthd 0ae4 data 00040080 10003ae4 00000000
Jul 27 17:40:52 kernel: nouveau 0000:65:00.0: disp: chid 4 mthd 0ae8 data 00000001 10003ae8 00000000
Jul 27 17:40:54 kernel: nouveau 0000:65:00.0: DRM: base-3: timeout
Jul 27 17:40:54 kernel: nouveau 0000:65:00.0: disp: chid 4 mthd 0aec data 000800a0 10003aec 00000000
Jul 27 17:40:54 kernel: nouveau 0000:65:00.0: disp: chid 4 mthd 0af0 data 00000120 10003af0 00000000
Jul 27 17:40:54 kernel: nouveau 0000:65:00.0: disp: chid 4 mthd 0af4 data f0000000 10003af4 00000000
Jul 27 17:40:54 kernel: nouveau 0000:65:00.0: disp: chid 4 mthd 0af8 data 00040084 10003af8 00000000
Jul 27 17:40:54 kernel: nouveau 0000:65:00.0: disp: chid 4 mthd 0afc data 00000010 10003afc 00000000
Jul 27 17:40:54 kernel: nouveau 0000:65:00.0: disp: chid 4 mthd 0b00 data 000400c0 10003b00 00000000
Jul 27 17:40:54 kernel: nouveau 0000:65:00.0: disp: chid 4 mthd 0b04 data fb0000fe 10003b04 00000000
Jul 27 17:40:54 kernel: nouveau 0000:65:00.0: disp: chid 4 mthd 0b08 data 00140400 10003b08 00000000
Jul 27 17:40:54 kernel: nouveau 0000:65:00.0: disp: chid 4 mthd 0b0c data 002e4000 10003b0c 00000000
Jul 27 17:40:54 kernel: nouveau 0000:65:00.0: disp: chid 4 mthd 0b10 data 00000000 10003b10 00000000
Jul 27 17:40:54 kernel: nouveau 0000:65:00.0: disp: chid 4 mthd 0b14 data 05a00a00 10003b14 00000000
Jul 27 17:40:54 kernel: nouveau 0000:65:00.0: disp: chid 4 mthd 0b18 data 0000a004 10003b18 00000000
Jul 27 17:40:54 kernel: nouveau 0000:65:00.0: disp: chid 4 mthd 0b1c data 0000cf00 10003b1c 00000000
Jul 27 17:40:54 kernel: nouveau 0000:65:00.0: disp: chid 4 mthd 0b20 data 00040080 10003b20 00000000
Jul 27 17:40:54 kernel: nouveau 0000:65:00.0: disp: chid 4 mthd 0b24 data 00000000 10003b24 00000000
Jul 27 17:40:54 sway[2504]: 2020-07-27 17:40:54 - [sway/commands.c:255] Handling command 'workspace 2'
Jul 27 17:40:56 kernel: nouveau 0000:65:00.0: DRM: base-3: timeout
Jul 27 17:40:56 sway[2504]: 2020-07-27 17:40:56 - [sway/commands.c:255] Handling command 'workspace 2'
Jul 27 17:40:58 kernel: nouveau 0000:65:00.0: DRM: core notifier timeout
Jul 27 17:40:58 kernel: nouveau 0000:65:00.0: disp: chid 4 mthd 0b28 data 000800a0 10003b28 00000000
Jul 27 17:40:58 kernel: nouveau 0000:65:00.0: disp: chid 4 mthd 0b2c data 00000130 10003b2c 00000000
Jul 27 17:40:58 kernel: nouveau 0000:65:00.0: disp: chid 4 mthd 0b30 data f0000000 10003b30 00000000
Jul 27 17:40:58 kernel: nouveau 0000:65:00.0: disp: chid 4 mthd 0b34 data 00040084 10003b34 00000000
Jul 27 17:40:58 kernel: nouveau 0000:65:00.0: disp: chid 4 mthd 0b38 data 00000010 10003b38 00000000
Jul 27 17:40:58 kernel: nouveau 0000:65:00.0: disp: chid 4 mthd 0b3c data 000400c0 10003b3c 00000000
Jul 27 17:40:58 kernel: nouveau 0000:65:00.0: disp: chid 4 mthd 0b40 data fb0000fe 10003b40 00000000
Jul 27 17:40:58 kernel: nouveau 0000:65:00.0: disp: chid 4 mthd 0b44 data 00140400 10003b44 00000000
Jul 27 17:40:58 kernel: nouveau 0000:65:00.0: disp: chid 4 mthd 0b48 data 003c8000 10003b48 00000000
Jul 27 17:40:58 kernel: nouveau 0000:65:00.0: disp: chid 4 mthd 0b4c data 00000000 10003b4c 00000000
Jul 27 17:40:58 kernel: nouveau 0000:65:00.0: disp: chid 4 mthd 0b50 data 05a00a00 10003b50 00000000
Jul 27 17:40:58 kernel: nouveau 0000:65:00.0: disp: chid 4 mthd 0b54 data 0000a004 10003b54 00000000
Jul 27 17:40:58 kernel: nouveau 0000:65:00.0: disp: chid 4 mthd 0b58 data 0000cf00 10003b58 00000000
Jul 27 17:40:58 kernel: nouveau 0000:65:00.0: disp: chid 4 mthd 0b5c data 00040080 10003b5c 00000000
Jul 27 17:40:58 kernel: nouveau 0000:65:00.0: disp: chid 4 mthd 0b60 data 00000000 10003b60 00000000
Jul 27 17:41:00 kernel: nouveau 0000:65:00.0: DRM: base-3: timeout
Jul 27 17:41:02 kernel: nouveau 0000:65:00.0: DRM: core notifier timeout
Jul 27 17:41:02 kernel: nouveau 0000:65:00.0: disp: chid 4 mthd 0b64 data 000800a0 10003b64 00000000
Jul 27 17:41:02 kernel: nouveau 0000:65:00.0: disp: chid 4 mthd 0b68 data 00000120 10003b68 00000000
Jul 27 17:41:02 kernel: nouveau 0000:65:00.0: disp: chid 4 mthd 0b6c data f0000000 10003b6c 00000000
Jul 27 17:41:02 kernel: nouveau 0000:65:00.0: disp: chid 4 mthd 0b70 data 00040084 10003b70 00000000
Jul 27 17:41:02 kernel: nouveau 0000:65:00.0: disp: chid 4 mthd 0b74 data 00000010 10003b74 00000000
Jul 27 17:41:02 kernel: nouveau 0000:65:00.0: disp: chid 4 mthd 0b78 data 000400c0 10003b78 00000000
Jul 27 17:41:02 kernel: nouveau 0000:65:00.0: disp: chid 4 mthd 0b7c data fb0000fe 10003b7c 00000000
Jul 27 17:41:02 kernel: nouveau 0000:65:00.0: disp: chid 4 mthd 0b80 data 00140400 10003b80 00000000
Jul 27 17:41:02 kernel: nouveau 0000:65:00.0: disp: chid 4 mthd 0b84 data 002e4000 10003b84 00000000
Jul 27 17:41:02 kernel: nouveau 0000:65:00.0: disp: chid 4 mthd 0b88 data 00000000 10003b88 00000000
Jul 27 17:41:02 kernel: nouveau 0000:65:00.0: disp: chid 4 mthd 0b8c data 05a00a00 10003b8c 00000000
Jul 27 17:41:02 kernel: nouveau 0000:65:00.0: disp: chid 4 mthd 0b90 data 0000a004 10003b90 00000000
Jul 27 17:41:02 kernel: nouveau 0000:65:00.0: disp: chid 4 mthd 0b94 data 0000cf00 10003b94 00000000
Jul 27 17:41:02 kernel: nouveau 0000:65:00.0: disp: chid 4 mthd 0b98 data 00040080 10003b98 00000000
Jul 27 17:41:02 kernel: nouveau 0000:65:00.0: disp: chid 4 mthd 0b9c data 00000000 10003b9c 00000000
Jul 27 17:41:04 kernel: nouveau 0000:65:00.0: DRM: base-3: timeout
Jul 27 17:41:04 kernel: nouveau 0000:65:00.0: disp: chid 4 mthd 0ba0 data 000800a0 10003ba0 00000000
Jul 27 17:41:04 kernel: nouveau 0000:65:00.0: disp: chid 4 mthd 0ba4 data 00000130 10003ba4 00000000
Jul 27 17:41:04 kernel: nouveau 0000:65:00.0: disp: chid 4 mthd 0ba8 data f0000000 10003ba8 00000000
Jul 27 17:41:04 kernel: nouveau 0000:65:00.0: disp: chid 4 mthd 0bac data 00040084 10003bac 00000000
Jul 27 17:41:04 kernel: nouveau 0000:65:00.0: disp: chid 4 mthd 0bb0 data 00000010 10003bb0 00000000
Jul 27 17:41:04 kernel: nouveau 0000:65:00.0: disp: chid 4 mthd 0bb4 data 000400c0 10003bb4 00000000
Jul 27 17:41:04 kernel: nouveau 0000:65:00.0: disp: chid 4 mthd 0bb8 data fb0000fe 10003bb8 00000000
Jul 27 17:41:04 kernel: nouveau 0000:65:00.0: disp: chid 4 mthd 0bbc data 00140400 10003bbc 00000000
Jul 27 17:41:04 kernel: nouveau 0000:65:00.0: disp: chid 4 mthd 0bc0 data 003c8000 10003bc0 00000000
Jul 27 17:41:04 kernel: nouveau 0000:65:00.0: disp: chid 4 mthd 0bc4 data 00000000 10003bc4 00000000
Jul 27 17:41:04 kernel: nouveau 0000:65:00.0: disp: chid 4 mthd 0bc8 data 05a00a00 10003bc8 00000000
Jul 27 17:41:04 kernel: nouveau 0000:65:00.0: disp: chid 4 mthd 0bcc data 0000a004 10003bcc 00000000
Jul 27 17:41:04 kernel: nouveau 0000:65:00.0: disp: chid 4 mthd 0bd0 data 0000cf00 10003bd0 00000000
Jul 27 17:41:04 kernel: nouveau 0000:65:00.0: disp: chid 4 mthd 0bd4 data 00040080 10003bd4 00000000
Jul 27 17:41:04 kernel: nouveau 0000:65:00.0: disp: chid 4 mthd 0bd8 data 00000000 10003bd8 00000000
Jul 27 17:41:06 kernel: nouveau 0000:65:00.0: DRM: base-3: timeout
Jul 27 17:41:08 kernel: nouveau 0000:65:00.0: DRM: core notifier timeout
Jul 27 17:41:08 kernel: nouveau 0000:65:00.0: disp: chid 4 mthd 0bdc data 000800a0 10003bdc 00000000
[...]

As you can see, the problem may have been provoked by rather heavy workspace (virtual desktop) switching under the Sway WM.

# uname -r
5.7.0-0.rc7.20200529gitb0c3ba31be3e.1.fc33.x86_64

(hence bumped to Rawhide)

# lspci -d 10de:1cb3 -vnn
65:00.0 VGA compatible controller [0300]: NVIDIA Corporation GP107GL [Quadro P400] [10de:1cb3] (rev a1) (prog-if 00 [VGA controller])
        Subsystem: NVIDIA Corporation Device [10de:11be]
        Flags: bus master, fast devsel, latency 0, IRQ 49, NUMA node 0
        Memory at df000000 (32-bit, non-prefetchable) [size=16M]
        Memory at c0000000 (64-bit, prefetchable) [size=256M]
        Memory at d0000000 (64-bit, prefetchable) [size=32M]
        I/O ports at b000 [size=128]
        Expansion ROM at e0000000 [disabled] [size=512K]
        Capabilities: [60] Power Management version 3
        Capabilities: [68] MSI: Enable+ Count=1/1 Maskable- 64bit+
        Capabilities: [78] Express Legacy Endpoint, MSI 00
        Capabilities: [100] Virtual Channel
        Capabilities: [250] Latency Tolerance Reporting
        Capabilities: [128] Power Budgeting <?>
        Capabilities: [420] Advanced Error Reporting
        Capabilities: [600] Vendor Specific Information: ID=0001 Rev=1 Len=024 <?>
        Capabilities: [900] Secondary PCI Express
        Kernel driver in use: nouveau
        Kernel modules: nouveau

Note that I suspect the original component was wrong: this looks like a problem with the Nouveau kernel driver itself, since I use only a minimum of Xorg (in my case the user-space part of the driver stack would be mesa-dri-drivers, if anything, I think). But please correct me if I am wrong.
I ended up using kernel-longterm-4.14 [1], since I can't boot the computer with any newer kernel release ...

[1] https://copr.fedorainfracloud.org/coprs/kwizart/kernel-longterm-4.14/

Thanks, Sergio. I have no problem booting, and everything was working reasonably well until recently, but now I seem to run into the problem rather frequently; I have just observed it again. I use the sway WM (sway-1.4-7.fc33.x86_64 + wlroots-0.10.1-2.fc33.x86_64), and I am starting to suspect the problem is related to operations involving:

- some sort of video playback (vlc, zoom)
- heavy manual dragging of free-floating "surfaces" (the overflow-windowed Zoom app, the VLC control bar with play/pause etc.)
- perhaps, but I am not sure, Firefox (one of the conspiracy theories being that newer Firefox does more chunked redrawing or something like that)

Anyway, I have now also managed to observe the messages below, which I did not see originally. They appeared when switching from TTY1, running sway, to TTY2 with a plain VT and back, whereby sway/wlroots attempted to regain direct screen access:

Jul 28 18:51:06 sway[2476]: 2020-07-28 18:51:06 - [backend/drm/backend.c:124] DRM fd paused
Jul 28 18:51:27 systemd[1]: getty: Succeeded.
[...]
Jul 28 18:51:42 sway[2476]: 2020-07-28 18:51:42 - [backend/drm/backend.c:91] DRM fd resumed
Jul 28 18:51:42 sway[2476]: 2020-07-28 18:51:42 - [backend/drm/drm.c:1272] Scanning DRM connectors
Jul 28 18:51:43 sway[2476]: 2020-07-28 18:51:43 - [backend/drm/drm.c:693] Modesetting 'DP-1' with '2560x1440@59951 mHz'
Jul 28 18:51:45 kernel: nouveau 0000:65:00.0: DRM: base-0: timeout
Jul 28 18:51:48 kernel: nouveau: evo channel stalled

Also, I was able to blindly kill sway and start it anew, with another surprising observation:

Jul 28 18:55:42 sway[44306]: 2020-07-28 18:55:42 - [sway/server.c:207] Running compositor on wayland display 'wayland-0'
[...nothing interesting...]
Jul 28 18:58:34 kernel: INFO: task kworker/u24:4:37180 blocked for more than 122 seconds.
Jul 28 18:58:34 kernel: Tainted: G W O --------- --- 5.7.0-0.rc7.20200529gitb0c3ba31be3e.1.fc33.x86_64 #1
Jul 28 18:58:34 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Jul 28 18:58:34 kernel: kworker/u24:4 D12240 37180 2 0x80004000
Jul 28 18:58:34 kernel: Workqueue: events_unbound nv50_disp_atomic_commit_work [nouveau]
Jul 28 18:58:34 kernel: Call Trace:
Jul 28 18:58:34 kernel: __schedule+0x33e/0xa30
Jul 28 18:58:34 kernel: ? sched_clock+0x5/0x10
Jul 28 18:58:34 kernel: schedule+0x5f/0xd0
Jul 28 18:58:34 kernel: schedule_timeout+0xe4/0x120
Jul 28 18:58:34 kernel: ? mark_held_locks+0x2d/0x80
Jul 28 18:58:34 kernel: ? _raw_spin_unlock_irqrestore+0x46/0x60
Jul 28 18:58:34 kernel: ? lockdep_hardirqs_on+0x11e/0x1b0
Jul 28 18:58:34 kernel: dma_fence_default_wait+0x176/0x210
Jul 28 18:58:34 kernel: ? dma_fence_free+0x20/0x20
Jul 28 18:58:34 kernel: dma_fence_wait_timeout+0x1b2/0x250
Jul 28 18:58:34 kernel: drm_atomic_helper_wait_for_fences+0x7f/0xf0 [drm_kms_helper]
Jul 28 18:58:34 kernel: nv50_disp_atomic_commit_tail+0x79/0x760 [nouveau]
Jul 28 18:58:34 kernel: ? sched_clock+0x5/0x10
Jul 28 18:58:34 kernel: process_one_work+0x269/0x5c0
Jul 28 18:58:34 kernel: worker_thread+0x55/0x3d0
Jul 28 18:58:34 kernel: ? process_one_work+0x5c0/0x5c0
Jul 28 18:58:34 kernel: kthread+0x131/0x150
Jul 28 18:58:34 kernel: ? __kthread_bind_mask+0x60/0x60
Jul 28 18:58:34 kernel: ret_from_fork+0x3a/0x50
Jul 28 18:58:34 kernel: Showing all locks held in the system:
Jul 28 18:58:34 kernel: 1 lock held by khungtaskd/73:
Jul 28 18:58:34 kernel: #0: ffffffffb8a98760 (rcu_read_lock){....}-{1:2}, at: debug_show_all_locks+0x15/0x16f
Jul 28 18:58:34 kernel: 1 lock held by fuse mainloop/3463:
Jul 28 18:58:34 kernel: #0: ffff8e582a085c70 (&pipe->mutex/1){+.+.}-{3:3}, at: do_splice+0x5cb/0x790
Jul 28 18:58:34 kernel: 1 lock held by fuse mainloop/3464:
Jul 28 18:58:34 kernel: #0: ffff8e5829b5ca70 (&pipe->mutex/1){+.+.}-{3:3}, at: do_splice+0x5cb/0x790
Jul 28 18:58:34 kernel: 2 locks held by kworker/u24:4/37180:
Jul 28 18:58:34 kernel: #0: ffff8e58f8411948 ((wq_completion)events_unbound){+.+.}-{0:0}, at: process_one_work+0x1d4/0x5c0
Jul 28 18:58:34 kernel: #1: ffffa310feacfe70 ((work_completion)(&state->commit_work)){+.+.}-{0:0}, at: process_one_work+0x1d4/0x5c0
Jul 28 18:58:34 kernel: 2 locks held by sway/44306:
Jul 28 18:58:34 kernel: #0: ffffa310f5e07d08 (crtc_ww_class_acquire){+.+.}-{0:0}, at: drm_mode_gamma_set_ioctl+0x8f/0x1f0 [drm]
Jul 28 18:58:34 kernel: #1: ffff8e58e934f8c8 (crtc_ww_class_mutex){+.+.}-{3:3}, at: modeset_lock+0xd7/0x1c0 [drm]
Jul 28 18:58:34 kernel:
Jul 28 18:58:34 kernel: =============================================

I hope this stack trace helps to move this forward. Otherwise, please suggest good diagnostic steps that one can run blindly or with a prepared script.

(In reply to Sergio Basto from comment #22)
> I ended up using kernel-longterm-4.14 [1] , since I can't boot the computer
> with any newer kernel release ...
>
> [1] https://copr.fedorainfracloud.org/coprs/kwizart/kernel-longterm-4.14/

Sergio, I also ended up using kwizart's kernel-longterm, but in my case 4.19, which reaches EOL almost one year later than 4.14. Good results so far.
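Regarding the request above for diagnostics that can be run from a prepared script: the following is only a minimal sketch, not an official nouveau tool. It counts the symptom messages quoted in this report (core notifier timeouts, base-N timeouts, "evo channel stalled", hung-task warnings) in a saved kernel log. The script name `nouveau_timeouts.py` and the function name `summarize` are hypothetical, and the patterns are assumptions derived solely from the log lines pasted in this bug; other kernel versions may word the messages differently.

```python
import re
import sys
from collections import Counter

# Patterns are assumptions based only on the messages quoted in this report.
PATTERNS = {
    "core notifier timeout": re.compile(
        r"nouveau [0-9a-f:.]+: DRM: core notifier timeout"),
    "base timeout": re.compile(
        r"nouveau [0-9a-f:.]+: DRM: base-(\d+): timeout"),
    "evo channel stalled": re.compile(r"nouveau: evo channel stalled"),
    "hung task": re.compile(
        r"INFO: task \S+ blocked for more than \d+ seconds"),
}


def summarize(lines):
    """Count how often each symptom message occurs in an iterable of log lines."""
    counts = Counter()
    for line in lines:
        for name, pattern in PATTERNS.items():
            match = pattern.search(line)
            if not match:
                continue
            if name == "base timeout":
                # Keep the channel index (base-0, base-3, ...) in the key.
                counts[f"base-{match.group(1)} timeout"] += 1
            else:
                counts[name] += 1
    return counts


# Only runs when a log file path is given on the command line.
if __name__ == "__main__" and len(sys.argv) > 1:
    with open(sys.argv[1], errors="replace") as log:
        for key, n in sorted(summarize(log).items()):
            print(f"{n:6d}  {key}")
```

One way to use it blindly from another TTY or over SSH would be `journalctl -k -b > kernel.log` followed by `python3 nouveau_timeouts.py kernel.log`. The counts only show which channels time out and how often; they do not identify the cause.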
FWIW, still seeing this on Fedora 33, 5.10.13-200.fc33.x86_64

$ sudo lspci -s 01:00.0 -v
01:00.0 VGA compatible controller: NVIDIA Corporation GF110 [GeForce GTX 560 Ti OEM] (rev a1) (prog-if 00 [VGA controller])
        Subsystem: PC Partner Limited / Sapphire Technology Device 5207
        Flags: bus master, fast devsel, latency 0, IRQ 39
        Memory at fa000000 (32-bit, non-prefetchable) [size=16M]
        Memory at d0000000 (64-bit, prefetchable) [size=128M]
        Memory at d8000000 (64-bit, prefetchable) [size=32M]
        I/O ports at e000 [size=128]
        Expansion ROM at 000c0000 [disabled] [size=128K]
        Capabilities: [60] Power Management version 3
        Capabilities: [68] MSI: Enable+ Count=1/1 Maskable- 64bit+
        Capabilities: [78] Express Endpoint, MSI 00
        Capabilities: [b4] Vendor Specific Information: Len=14 <?>
        Capabilities: [100] Virtual Channel
        Capabilities: [128] Power Budgeting <?>
        Capabilities: [600] Vendor Specific Information: ID=0001 Rev=1 Len=024 <?>
        Kernel driver in use: nouveau
        Kernel modules: nouveau