Bug 1514831

Summary: NVIDIA GTS 450 randomly freezes
Product: [Fedora] Fedora
Reporter: Dustin Spicuzza <dustin>
Component: xorg-x11-drv-nouveau
Assignee: Ben Skeggs <bskeggs>
Status: CLOSED EOL
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: unspecified
Docs Contact:
Priority: unspecified
Version: 35
CC: airlied, ajax, bskeggs, daveb, david.cussans, eggert, jan.public, jglisse, kenorb, me, mkrupcale, pablo.iranzo, rob
Target Milestone: ---
Keywords: Reopened
Target Release: ---
Hardware: x86_64
OS: Linux
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2022-12-13 15:12:21 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Attachments:
  journalctl of system hang (flags: none)

Description Dustin Spicuzza 2017-11-18 19:23:21 UTC
Description of problem: X completely freezes randomly while moving mouse

Version-Release number of selected component (if applicable):

How reproducible: Seems to be random

Actual results: Freezes

Expected results: No freeze

Additional info:

[    0.000000] Linux version 4.13.12-300.fc27.x86_64 (mockbuild.fedoraproject.org) (gcc version 7.2.1 20170915 (Red Hat 7.2.1-2) (GCC)) #1 SMP Wed Nov 8 16:38:01 UTC 2017

[ 3571.488551] nouveau 0000:01:00.0: gr: TRAP ch 14 [003f543000 Xwayland[3020]]
[ 3571.488560] nouveau 0000:01:00.0: gr: GPC0/TPC0/TEX: 80000049
[ 3571.488564] nouveau 0000:01:00.0: gr: GPC0/TPC1/TEX: 80000049
[ 3571.488568] nouveau 0000:01:00.0: gr: GPC0/TPC2/TEX: 80000049
[ 3571.488572] nouveau 0000:01:00.0: gr: GPC0/TPC3/TEX: 80000049
[ 3571.488581] nouveau 0000:01:00.0: fifo: read fault at 0004bbc000 engine 00 [PGRAPH] client 0a [GPC0/] reason 02 [PAGE_NOT_PRESENT] on channel 14 [003f543000 Xwayland[3020]]
[ 3571.488583] nouveau 0000:01:00.0: fifo: gr engine fault on channel 14, recovering...
[ 3571.488779] nouveau 0000:01:00.0: Xwayland[3020]: channel 14 killed!

lspci:

01:00.0 VGA compatible controller: NVIDIA Corporation GF106 [GeForce GTS 450] (rev a1)

packages:

xorg-x11-drv-nouveau-1.0.15-3.fc27

Comment 1 Matthew Krupcale 2017-12-17 04:20:17 UTC
I am also observing these system freezes. In particular, I was attempting to exit the full-screen view of an HTML5 video player in Firefox when I lost control of the system: the video froze, but audio continued to play. I did not experience these freezes under Fedora 26.

Additional info:

$ journalctl
Dec 16 22:51:06 kernel: nouveau 0000:01:00.0: gr: TRAP ch 14 [003f390000 Xwayland[2003]]
Dec 16 22:51:06 kernel: nouveau 0000:01:00.0: gr: GPC0/TPC0/TEX: 80000049
Dec 16 22:51:06 kernel: nouveau 0000:01:00.0: gr: GPC0/TPC1/TEX: 80000049
Dec 16 22:51:06 kernel: nouveau 0000:01:00.0: gr: GPC0/TPC2/TEX: 80000049
Dec 16 22:51:06 kernel: nouveau 0000:01:00.0: gr: GPC1/TPC0/TEX: 80000049
Dec 16 22:51:06 kernel: nouveau 0000:01:00.0: gr: GPC1/TPC1/TEX: 80000049
Dec 16 22:51:06 kernel: nouveau 0000:01:00.0: gr: GPC1/TPC2/TEX: 80000049
Dec 16 22:51:06 kernel: nouveau 0000:01:00.0: gr: GPC1/TPC3/TEX: 80000049
Dec 16 22:51:06 kernel: nouveau 0000:01:00.0: fifo: read fault at 0004096000 engine 00 [PGRAPH] client 04 [GPC1/] reason 02 [PAGE_NOT_PRESENT] on channel 14 [003f390000 Xwayland[2003]]
Dec 16 22:51:06 kernel: nouveau 0000:01:00.0: fifo: gr engine fault on channel 14, recovering...
Dec 16 22:51:06 kernel: nouveau 0000:01:00.0: Xwayland[2003]: channel 14 killed!
Dec 16 22:51:24 org.gnome.Shell.desktop[1976]: nouveau: kernel rejected pushbuf: Device or resource busy
Dec 16 22:51:24 org.gnome.Shell.desktop[1976]: nouveau: ch14: krec 0 pushes 1 bufs 4 relocs 0
Dec 16 22:51:24 org.gnome.Shell.desktop[1976]: nouveau: ch14: buf 00000000 00000002 00000004 00000004 00000000
Dec 16 22:51:24 org.gnome.Shell.desktop[1976]: nouveau: ch14: buf 00000001 00000006 00000004 00000000 00000004
Dec 16 22:51:24 org.gnome.Shell.desktop[1976]: nouveau: ch14: buf 00000002 0000001c 00000002 00000000 00000002
Dec 16 22:51:24 org.gnome.Shell.desktop[1976]: nouveau: ch14: buf 00000003 00000027 00000004 00000004 00000000
Dec 16 22:51:24 org.gnome.Shell.desktop[1976]: nouveau: ch14: psh 00000000 0000058b08 0000058b98
Dec 16 22:51:24 org.gnome.Shell.desktop[1976]: nouveau:         0x20056080
...
Dec 16 22:51:24 org.gnome.Shell.desktop[1976]: nouveau:         0x00000000
Dec 16 22:51:24 kernel: nouveau 0000:01:00.0: Xwayland[2003]: nv50cal_space: -16
...

$ lspci
01:00.0 VGA compatible controller: NVIDIA Corporation GF114 [GeForce GTX 560] (rev a1)

$ rpm -qa xorg-x11-drv-nouveau
xorg-x11-drv-nouveau-1.0.15-3.fc27.x86_64

$ uname -r
4.14.5-300.fc27.x86_64

Comment 2 David Cussans 2017-12-18 11:54:43 UTC
I am also experiencing quite frequent display freezes. This seems to happen when I am manipulating a window (e.g. resizing).

When the display freezes I can still log in remotely (using ssh).

from dmesg:
[ 4590.883720] nouveau 0000:01:00.0: gr: TRAP ch 14 [003f574000 Xwayland[1925]]
[ 4590.883732] nouveau 0000:01:00.0: gr: GPC0/TPC0/TEX: 80000049
[ 4590.883737] nouveau 0000:01:00.0: gr: GPC0/TPC1/TEX: 80000049
[ 4590.883747] nouveau 0000:01:00.0: fifo: read fault at 0005022000 engine 00 [PGRAPH] client 04 [GPC0/] reason 02 [PAGE_NOT_PRESENT] on channel 14 [003f574000 Xwayland[1925]]
[ 4590.883749] nouveau 0000:01:00.0: fifo: gr engine fault on channel 14, recovering...
[ 4590.883965] nouveau 0000:01:00.0: Xwayland[1925]: channel 14 killed!


lspci:
01:00.0 VGA compatible controller: NVIDIA Corporation GF108GL [Quadro 600] (rev a1)

$ rpm -qa xorg-x11-drv-nouveau
xorg-x11-drv-nouveau-1.0.15-3.fc27.x86_64

$ uname -r
4.14.5-300.fc27.x86_64

I didn't experience these freezes when running FC26
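
For anyone comparing these reports, the "fifo: read fault" lines quoted so far share a regular layout, so they can be split into fields (address, engine, client, reason, channel, process). The following is a minimal Python sketch and not part of the original report. The regular expression and the name parse_fault are assumptions derived only from the log lines shown in this bug; other kernel versions may print a different layout, in which case the function simply returns None.

```python
import re

# Layout assumed from the fault lines quoted in this report, e.g.
# "fifo: read fault at 0005022000 engine 00 [PGRAPH] client 04 [GPC0/]
#  reason 02 [PAGE_NOT_PRESENT] on channel 14 [003f574000 Xwayland[1925]]"
FAULT_RE = re.compile(
    r"fifo: (?P<access>read|write) fault at (?P<addr>[0-9a-f]+) "
    r"engine (?P<engine_id>[0-9a-f]+) \[(?P<engine>[^\]]*)\] "
    r"client (?P<client_id>[0-9a-f]+) \[(?P<client>[^\]]*)\] "
    r"reason (?P<reason_id>[0-9a-f]+) \[(?P<reason>[^\]]*)\] "
    r"on channel (?P<channel>-?\d+) \[(?P<inst>[0-9a-f]+) (?P<process>.+)\]\s*$"
)


def parse_fault(line):
    """Return the fields of one nouveau fifo fault line, or None if no match."""
    m = FAULT_RE.search(line)
    if m is None:
        return None
    fields = m.groupdict()
    fields["addr"] = int(fields["addr"], 16)    # faulting address as an integer
    fields["channel"] = int(fields["channel"])  # channel number (may be -1)
    return fields


example = (
    "[ 4590.883747] nouveau 0000:01:00.0: fifo: read fault at 0005022000 "
    "engine 00 [PGRAPH] client 04 [GPC0/] reason 02 [PAGE_NOT_PRESENT] "
    "on channel 14 [003f574000 Xwayland[1925]]"
)
print(parse_fault(example))
```

Feeding it the fault lines from each comment makes it easier to see what they have in common: in the logs quoted so far, the reason is PAGE_NOT_PRESENT and the channel belongs to Xwayland.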

Comment 3 rob 2017-12-31 00:15:05 UTC
I experience this resizing a window. It seems that I can resize a gnome terminal but as soon as I try to resize gvim it freezes.

[ 7249.769172] gnome-terminal-[2578]: Allocating size to GtkScrollbar 0x5573647e62f0 without calling gtk_widget_get_preferred_width/height(). How does the code know the size to allocate?
[ 7249.797734] gnome-terminal-[2578]: Allocating size to GtkScrollbar 0x5573647e62f0 without calling gtk_widget_get_preferred_width/height(). How does the code know the size to allocate?
[ 7249.833225] gnome-terminal-[2578]: Allocating size to GtkScrollbar 0x5573647e62f0 without calling gtk_widget_get_preferred_width/height(). How does the code know the size to allocate?
[ 7250.030695] gnome-terminal-[2578]: Allocating size to GtkScrollbar 0x5573647e62f0 without calling gtk_widget_get_preferred_width/height(). How does the code know the size to allocate?
[ 7250.051195] gnome-terminal-[2578]: Allocating size to GtkScrollbar 0x5573647e62f0 without calling gtk_widget_get_preferred_width/height(). How does the code know the size to allocate?
[ 7250.086170] gnome-terminal-[2578]: Allocating size to GtkScrollbar 0x5573647e62f0 without calling gtk_widget_get_preferred_width/height(). How does the code know the size to allocate?
[ 7250.120499] gnome-terminal-[2578]: Allocating size to GtkScrollbar 0x5573647e62f0 without calling gtk_widget_get_preferred_width/height(). How does the code know the size to allocate?
[ 7250.167768] gnome-terminal-[2578]: Allocating size to GtkScrollbar 0x5573647e62f0 without calling gtk_widget_get_preferred_width/height(). How does the code know the size to allocate?
[ 7250.216739] gnome-terminal-[2578]: Allocating size to GtkScrollbar 0x5573647e62f0 without calling gtk_widget_get_preferred_width/height(). How does the code know the size to allocate?
[ 7250.409368] gnome-terminal-[2578]: Allocating size to GtkScrollbar 0x5573647e62f0 without calling gtk_widget_get_preferred_width/height(). How does the code know the size to allocate?
[ 7250.435035] gnome-terminal-[2578]: Allocating size to GtkScrollbar 0x5573647e62f0 without calling gtk_widget_get_preferred_width/height(). How does the code know the size to allocate?
[ 7250.469725] gnome-terminal-[2578]: Allocating size to GtkScrollbar 0x5573647e62f0 without calling gtk_widget_get_preferred_width/height(). How does the code know the size to allocate?
[ 7252.271859] gnome-shell[1882]: The property brightness doesn't seem to be a normal object property of [0x559e7b895440 StWidget] or a registered special property
[ 7252.272225] gnome-shell[1882]: The property vignette_sharpness doesn't seem to be a normal object property of [0x559e7b895440 StWidget] or a registered special property
[ 7252.272785] gnome-shell[1882]: The property brightness doesn't seem to be a normal object property of [0x559e7b9e9c40 StWidget] or a registered special property
[ 7252.273382] gnome-shell[1882]: The property vignette_sharpness doesn't seem to be a normal object property of [0x559e7b9e9c40 StWidget] or a registered special property
[ 7259.234168] kernel: nouveau 0000:03:00.0: gr: TRAP ch 14 [007f290000 Xwayland[1937]]
[ 7259.234179] kernel: nouveau 0000:03:00.0: gr: GPC0/TPC0/TEX: 80000049
[ 7259.234186] kernel: nouveau 0000:03:00.0: gr: GPC0/TPC1/TEX: 80000049
[ 7259.234193] kernel: nouveau 0000:03:00.0: gr: GPC0/TPC2/TEX: 80000049
[ 7259.234202] kernel: nouveau 0000:03:00.0: gr: GPC1/TPC0/TEX: 80000049
[ 7259.234209] kernel: nouveau 0000:03:00.0: gr: GPC1/TPC1/TEX: 80000049
[ 7259.234216] kernel: nouveau 0000:03:00.0: gr: GPC1/TPC2/TEX: 80000049
[ 7259.234223] kernel: nouveau 0000:03:00.0: gr: GPC1/TPC3/TEX: 80000049
[ 7259.234237] kernel: nouveau 0000:03:00.0: fifo: read fault at 000416c000 engine 00 [PGRAPH] client 0a [GPC1/] reason 02 [PAGE_NOT_PRESENT] on channel 14 [007f290000 Xwayland[1937]]
[ 7259.234239] kernel: nouveau 0000:03:00.0: fifo: gr engine fault on channel 14, recovering...
[ 7259.235014] kernel: nouveau 0000:03:00.0: Xwayland[1937]: channel 14 killed!

lspci:
03:00.0 VGA compatible controller: NVIDIA Corporation GF114 [GeForce GTX 560] (rev a1)

$ rpm -qa xorg-x11-drv-nouveau
xorg-x11-drv-nouveau-1.0.15-3.fc27.x86_64

$ uname -r
4.14.8-300.fc27.x86_64

Don't recall the problem in FC26.
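
To keep track of which hardware is affected, the lspci lines quoted in the comments above can be reduced to a chip codename and a model name. This is a small Python sketch under the assumption that the lines look like the ones pasted in this bug; the pattern and the function name gpu_from_lspci are illustrative, not taken from any tool.

```python
import re

# Assumed layout: "<slot> VGA compatible controller: NVIDIA Corporation <chip> [<model>] (rev ..)"
LSPCI_RE = re.compile(
    r"VGA compatible controller: NVIDIA Corporation (?P<chip>\S+) \[(?P<model>[^\]]+)\]"
)


def gpu_from_lspci(line):
    """Return (chip codename, model name) for an NVIDIA lspci line, else None."""
    m = LSPCI_RE.search(line)
    return (m["chip"], m["model"]) if m else None


# lspci lines as quoted in the description and in comments 1 to 3
reported = [
    "01:00.0 VGA compatible controller: NVIDIA Corporation GF106 [GeForce GTS 450] (rev a1)",
    "01:00.0 VGA compatible controller: NVIDIA Corporation GF114 [GeForce GTX 560] (rev a1)",
    "01:00.0 VGA compatible controller: NVIDIA Corporation GF108GL [Quadro 600] (rev a1)",
    "03:00.0 VGA compatible controller: NVIDIA Corporation GF114 [GeForce GTX 560] (rev a1)",
]
chips = sorted({gpu_from_lspci(line)[0] for line in reported})
print(chips)
```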

Comment 4 Jan Vlug 2017-12-31 11:32:30 UTC
I experienced freezes as well (see bug #1529854 and the corresponding bug at freedesktop.org: https://bugs.freedesktop.org/show_bug.cgi?id=104421).

Until now I could not reproduce the freeze. But I have now installed gvim (vim-X11.x86_64) and started it, and the system froze immediately on resizing gvim.

I have other hardware though:

Graphics:  Card: NVIDIA GP107 [GeForce GTX 1050 Ti]
           Display Server: X.org 1.19.5 drivers: modesetting,fbdev,vesa
           tty size: 80x24 Advanced Data: N/A for root

Comment 5 Paul Eggert 2018-01-26 01:52:04 UTC
*** Bug 1535751 has been marked as a duplicate of this bug. ***

Comment 6 Paul Eggert 2018-01-26 01:56:39 UTC
I have been seeing similar freezes for a month or two, as described in Bug 1535751. I am using an NVIDIA GF106 [GeForce GTS 450] (rev a1), the same as the original reporter of this bug.

Comment 7 Dustin Spicuzza 2018-02-16 20:49:29 UTC
I installed GVIM and tried to resize it. Unfortunately, it did not cause a freeze.

I'm still experiencing the freeze once a day or so. It's pretty annoying. Resizing things definitely tends to cause it... my interactions with Chrome/Firefox/Atom are generally what cause it to hang.

I started looking through the source code for the nouveau kernel driver, and it seems like this sort of error is expected, as each type of driver seems to have fifo error recovery. However, I'm guessing this sort of error isn't tested very much. :)

I'm contemplating adding additional printks in there to see if the recovery routine actually finishes or not. Unfortunately, since this is so difficult to reproduce reliably, it might take a while to get some results from that.
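
Short of adding printks, the messages already in dmesg give only a rough indication: each "gr engine fault on channel N, recovering..." line can be paired with the following "channel N killed!" line for the same channel. The Python sketch below does that pairing. It is an assumption-laden illustration (the function name recovery_delays and both patterns are made up for this example, based on the dmesg lines quoted in this bug), and a matching "killed!" line does not prove that the recovery routine itself completed.

```python
import re

# dmesg-style prefix, e.g. "[ 3571.488583] ..."
TS_RE = re.compile(r"^\[\s*(?P<ts>\d+\.\d+)\]\s+(?P<msg>.*)$")
RECOVER_RE = re.compile(r"fifo: gr engine fault on channel (\d+), recovering\.\.\.")
KILLED_RE = re.compile(r"channel (\d+) killed!")


def recovery_delays(lines):
    """Pair 'recovering...' with the next 'channel N killed!' per channel.

    Returns (delays, unfinished): delays is a list of (channel, seconds),
    unfinished lists channels that never logged a 'killed!' line.
    """
    pending = {}  # channel -> timestamp of its "recovering..." message
    delays = []
    for line in lines:
        m = TS_RE.match(line.strip())
        if m is None:
            continue
        ts, msg = float(m["ts"]), m["msg"]
        started = RECOVER_RE.search(msg)
        killed = KILLED_RE.search(msg)
        if started:
            pending[int(started.group(1))] = ts
        elif killed:
            channel = int(killed.group(1))
            if channel in pending:
                delays.append((channel, ts - pending.pop(channel)))
    return delays, sorted(pending)


log = [
    "[ 3571.488583] nouveau 0000:01:00.0: fifo: gr engine fault on channel 14, recovering...",
    "[ 3571.488779] nouveau 0000:01:00.0: Xwayland[3020]: channel 14 killed!",
]
print(recovery_delays(log))
```

If a channel shows up in the second return value, the log ended (or the machine hung) before the kill message was printed, which would be a hint about where to put the extra printks.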

Comment 8 Jan Vlug 2018-02-16 21:49:38 UTC
For me, resizing gvim also no longer freezes the system, whereas it previously froze my system 100% reproducibly. Moreover, it has been several days, maybe even longer, since I had the last system freeze.

I think I have installed a kernel update once or twice since the last freeze.

Comment 9 Nikolai Gotzmann 2018-02-25 20:49:13 UTC
I get the same bug with some applications.
In the logs below, I started "Boxes" and then my screen froze.

System details:
* i5 4690
* GTX 970
* kernel: 4.15.4-300.fc27.x86_64
* f27

Feb 25 20:57:23 nghome dbus-daemon[1792]: [session uid=1000 pid=1792] Activating service name='org.gnome.Boxes' requested by ':1.17' (uid=1000 pid=1858 comm="/usr/bin/gnome-shell " label="unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023")
Feb 25 20:57:23 nghome dbus-daemon[1792]: [session uid=1000 pid=1792] Successfully activated service 'org.gnome.Boxes'
Feb 25 20:57:23 nghome gnome-shell[1858]: Object Clutter.Clone (0x55d584b08de0), has been already finalized. Impossible to get any property from it.
Feb 25 20:57:23 nghome gnome-shell[1858]: Object Clutter.Clone (0x55d584b08de0), has been already finalized. Impossible to set any property to it.
Feb 25 20:57:23 nghome org.gnome.Shell.desktop[1858]: == Stack trace for context 0x55d581608000 ==
Feb 25 20:57:23 nghome org.gnome.Shell.desktop[1858]: #0 0x7ffc0fe240b0 b   resource:///org/gnome/shell/ui/tweener.js:73 (0x7fe3bc4ddef0 @ 9)
Feb 25 20:57:23 nghome org.gnome.Shell.desktop[1858]: #1 0x7ffc0fe24150 b   resource:///org/gnome/shell/ui/tweener.js:105 (0x7fe3bc4df230 @ 36)
Feb 25 20:57:23 nghome org.gnome.Shell.desktop[1858]: #2 0x7ffc0fe241f0 b   resource:///org/gnome/shell/ui/tweener.js:92 (0x7fe3bc4df098 @ 52)
Feb 25 20:57:23 nghome org.gnome.Shell.desktop[1858]: #3 0x7ffc0fe25160 b   resource:///org/gnome/gjs/modules/tweener/tweener.js:203 (0x7fe3bc4e9cd0 @ 54)
Feb 25 20:57:23 nghome org.gnome.Shell.desktop[1858]: #4 0x7ffc0fe252b0 b   resource:///org/gnome/gjs/modules/tweener/tweener.js:332 (0x7fe3bc4e9d58 @ 1626)
Feb 25 20:57:23 nghome org.gnome.Shell.desktop[1858]: #5 0x7ffc0fe25360 b   resource:///org/gnome/gjs/modules/tweener/tweener.js:345 (0x7fe3bc4e9de0 @ 100)
Feb 25 20:57:23 nghome org.gnome.Shell.desktop[1858]: #6 0x7ffc0fe253f0 b   resource:///org/gnome/gjs/modules/tweener/tweener.js:360 (0x7fe3bc4e9e68 @ 10)
Feb 25 20:57:23 nghome org.gnome.Shell.desktop[1858]: #7 0x7ffc0fe25470 I   resource:///org/gnome/gjs/modules/signals.js:126 (0x7fe3bc4e2b38 @ 386)
Feb 25 20:57:23 nghome org.gnome.Shell.desktop[1858]: #8 0x7ffc0fe25520 b   resource:///org/gnome/shell/ui/tweener.js:208 (0x7fe3bc4df808 @ 159)
Feb 25 20:57:23 nghome org.gnome.Shell.desktop[1858]: #9 0x7ffc0fe25580 I   resource:///org/gnome/gjs/modules/_legacy.js:82 (0x7fe3bc4c2bc0 @ 71)
Feb 25 20:57:23 nghome org.gnome.Shell.desktop[1858]: #10 0x7ffc0fe25580 I   resource:///org/gnome/shell/ui/tweener.js:183 (0x7fe3bc4df780 @ 20)
Feb 25 20:57:23 nghome org.gnome.Shell.desktop[1858]: #11 0x7ffc0fe25610 I   self-hosted:917 (0x7fe3bc4ee5e8 @ 394)
Feb 25 20:57:23 nghome org.gnome.Shell.desktop[1858]: == Stack trace for context 0x55d581608000 ==
Feb 25 20:57:23 nghome org.gnome.Shell.desktop[1858]: #0 0x7ffc0fe240b0 b   resource:///org/gnome/shell/ui/tweener.js:80 (0x7fe3bc4ddef0 @ 82)
Feb 25 20:57:23 nghome org.gnome.Shell.desktop[1858]: #1 0x7ffc0fe24150 b   resource:///org/gnome/shell/ui/tweener.js:105 (0x7fe3bc4df230 @ 36)
Feb 25 20:57:23 nghome org.gnome.Shell.desktop[1858]: #2 0x7ffc0fe241f0 b   resource:///org/gnome/shell/ui/tweener.js:92 (0x7fe3bc4df098 @ 52)
Feb 25 20:57:23 nghome org.gnome.Shell.desktop[1858]: #3 0x7ffc0fe25160 b   resource:///org/gnome/gjs/modules/tweener/tweener.js:203 (0x7fe3bc4e9cd0 @ 54)
Feb 25 20:57:23 nghome org.gnome.Shell.desktop[1858]: #4 0x7ffc0fe252b0 b   resource:///org/gnome/gjs/modules/tweener/tweener.js:332 (0x7fe3bc4e9d58 @ 1626)
Feb 25 20:57:23 nghome org.gnome.Shell.desktop[1858]: #5 0x7ffc0fe25360 b   resource:///org/gnome/gjs/modules/tweener/tweener.js:345 (0x7fe3bc4e9de0 @ 100)
Feb 25 20:57:23 nghome org.gnome.Shell.desktop[1858]: #6 0x7ffc0fe253f0 b   resource:///org/gnome/gjs/modules/tweener/tweener.js:360 (0x7fe3bc4e9e68 @ 10)
Feb 25 20:57:23 nghome org.gnome.Shell.desktop[1858]: #7 0x7ffc0fe25470 I   resource:///org/gnome/gjs/modules/signals.js:126 (0x7fe3bc4e2b38 @ 386)
Feb 25 20:57:23 nghome org.gnome.Shell.desktop[1858]: #8 0x7ffc0fe25520 b   resource:///org/gnome/shell/ui/tweener.js:208 (0x7fe3bc4df808 @ 159)
Feb 25 20:57:23 nghome org.gnome.Shell.desktop[1858]: #9 0x7ffc0fe25580 I   resource:///org/gnome/gjs/modules/_legacy.js:82 (0x7fe3bc4c2bc0 @ 71)
Feb 25 20:57:23 nghome org.gnome.Shell.desktop[1858]: #10 0x7ffc0fe25580 I   resource:///org/gnome/shell/ui/tweener.js:183 (0x7fe3bc4df780 @ 20)
Feb 25 20:57:23 nghome org.gnome.Shell.desktop[1858]: #11 0x7ffc0fe25610 I   self-hosted:917 (0x7fe3bc4ee5e8 @ 394)
Feb 25 20:57:23 nghome dbus-daemon[1072]: [system] Activating via systemd: service name='org.freedesktop.hostname1' unit='dbus-org.freedesktop.hostname1.service' requested by ':1.106' (uid=1000 pid=4504 comm="/usr/bin/gnome-boxes --gapplication-service " label="unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023")
Feb 25 20:57:23 nghome systemd[1]: Starting Hostname Service...
Feb 25 20:57:24 nghome dbus-daemon[1072]: [system] Successfully activated service 'org.freedesktop.hostname1'
Feb 25 20:57:24 nghome systemd[1]: Started Hostname Service.
Feb 25 20:57:24 nghome audit[1]: SERVICE_START pid=1 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:init_t:s0 msg='unit=systemd-hostnamed comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
Feb 25 20:57:24 nghome kernel: nf_conntrack: default automatic helper assignment has been turned off for security reasons and CT-based  firewall rule not found. Use the iptables CT target to attach helpers instead.
Feb 25 20:57:26 nghome kernel: nouveau 0000:01:00.0: gr: FECS 00010000
Feb 25 20:57:26 nghome kernel: nouveau 0000:01:00.0: gr: 409000 - done 00003224
Feb 25 20:57:26 nghome kernel: nouveau 0000:01:00.0: gr: 409000 - stat 00000004 00000000 00050009 00000010
Feb 25 20:57:26 nghome kernel: nouveau 0000:01:00.0: gr: 409000 - stat 00080420 00000000 00000000 00000000
Feb 25 20:57:26 nghome kernel: nouveau 0000:01:00.0: gr: 502000 - done 00007324
Feb 25 20:57:26 nghome kernel: nouveau 0000:01:00.0: gr: 502000 - stat 00000000 00000000 00040009 00000010
Feb 25 20:57:26 nghome kernel: nouveau 0000:01:00.0: gr: 502000 - stat 00080426 00000000 00000000 00000000
Feb 25 20:57:26 nghome kernel: nouveau 0000:01:00.0: gr: 50a000 - done 00007324
Feb 25 20:57:26 nghome kernel: nouveau 0000:01:00.0: gr: 50a000 - stat 00000000 00000000 00040009 00000010
Feb 25 20:57:26 nghome kernel: nouveau 0000:01:00.0: gr: 50a000 - stat 00080436 00000000 00000000 00000000
Feb 25 20:57:26 nghome kernel: nouveau 0000:01:00.0: gr: 512000 - done 00007324
Feb 25 20:57:26 nghome kernel: nouveau 0000:01:00.0: gr: 512000 - stat 00000000 00000000 00040009 00000010
Feb 25 20:57:26 nghome kernel: nouveau 0000:01:00.0: gr: 512000 - stat 00080446 00000000 00000000 00000000
Feb 25 20:57:26 nghome kernel: nouveau 0000:01:00.0: gr: 51a000 - done 00007324
Feb 25 20:57:26 nghome kernel: nouveau 0000:01:00.0: gr: 51a000 - stat 00000000 00000000 00040009 00000010
Feb 25 20:57:26 nghome kernel: nouveau 0000:01:00.0: gr: 51a000 - stat 00080456 00000000 00000000 00000000
Feb 25 20:57:26 nghome kernel: nouveau 0000:01:00.0: fifo: write fault at 0000557000 engine 00 [GR] client 04 [FE] reason 00 [PDE] on channel 14 [103c9b9000 Xwayland[1925]]
Feb 25 20:57:26 nghome kernel: nouveau 0000:01:00.0: fifo: channel 14: killed
Feb 25 20:57:26 nghome kernel: nouveau 0000:01:00.0: fifo: runlist 0: scheduled for recovery
Feb 25 20:57:26 nghome kernel: nouveau 0000:01:00.0: fifo: engine 0: scheduled for recovery
Feb 25 20:57:26 nghome kernel: nouveau 0000:01:00.0: Xwayland[1925]: channel 14 killed!
Feb 25 20:57:26 nghome kernel: nouveau 0000:01:00.0: fifo: read fault at 44ff444000 engine 1f [] client 07 [HOST_CPU] reason 0d [REGION_VIOLATION] on channel -1 [0000000000 unknown]
Feb 25 20:57:26 nghome kernel: nouveau 0000:01:00.0: fifo: PBDMA0: 80004000 [GPPTR SIGNATURE] ch 13 [103cd7f000 systemd-logind[1156]] subc 0 mthd 1b00 data 00000000
Feb 25 20:57:26 nghome kernel: nouveau 0000:01:00.0: fifo: PBDMA0: 00004000 [GPPTR] ch 13 [103cd7f000 systemd-logind[1156]] subc 0 mthd 1b04 data 00224000
Feb 25 20:57:26 nghome kernel: nouveau 0000:01:00.0: gr: FECS 00010000
Feb 25 20:57:26 nghome kernel: nouveau 0000:01:00.0: gr: 409000 - done 00007380
Feb 25 20:57:26 nghome kernel: nouveau 0000:01:00.0: gr: 409000 - stat 00000001 00000000 00090009 00000201
Feb 25 20:57:26 nghome kernel: nouveau 0000:01:00.0: gr: 409000 - stat 00080420 00000000 00000000 00000000
Feb 25 20:57:26 nghome kernel: nouveau 0000:01:00.0: gr: 502000 - done 00000300
Feb 25 20:57:26 nghome kernel: nouveau 0000:01:00.0: gr: 502000 - stat 00000000 00000000 00000001 00000000
Feb 25 20:57:26 nghome kernel: nouveau 0000:01:00.0: gr: 502000 - stat 00080426 00000000 00000000 00000000
Feb 25 20:57:26 nghome kernel: nouveau 0000:01:00.0: gr: 50a000 - done 00000300
Feb 25 20:57:26 nghome kernel: nouveau 0000:01:00.0: gr: 50a000 - stat 00000000 00000000 00000001 00000000
Feb 25 20:57:26 nghome kernel: nouveau 0000:01:00.0: gr: 50a000 - stat 00080436 00000000 00000000 00000000
Feb 25 20:57:26 nghome kernel: nouveau 0000:01:00.0: gr: 512000 - done 00000300
Feb 25 20:57:26 nghome kernel: nouveau 0000:01:00.0: gr: 512000 - stat 00000000 00000000 00000001 00000000
Feb 25 20:57:26 nghome kernel: nouveau 0000:01:00.0: gr: 512000 - stat 00080446 00000000 00000000 00000000
Feb 25 20:57:26 nghome kernel: nouveau 0000:01:00.0: gr: 51a000 - done 00000300
Feb 25 20:57:26 nghome kernel: nouveau 0000:01:00.0: gr: 51a000 - stat 00000000 00000000 00000001 00000000
Feb 25 20:57:26 nghome kernel: nouveau 0000:01:00.0: gr: 51a000 - stat 00080456 00000000 00000000 00000000
Feb 25 20:57:26 nghome kernel: nouveau 0000:01:00.0: fifo: PBDMA0: 00004000 [GPPTR] ch 13 [103cd7f000 systemd-logind[1156]] subc 0 mthd 1b08 data 0000408f
Feb 25 20:57:29 nghome kernel: nouveau 0000:01:00.0: gr: FECS 00080000
Feb 25 20:57:29 nghome kernel: nouveau 0000:01:00.0: gr: 409000 - done 00007b80
Feb 25 20:57:29 nghome kernel: nouveau 0000:01:00.0: gr: 409000 - stat 00000001 00000000 00090009 00000201
Feb 25 20:57:29 nghome kernel: nouveau 0000:01:00.0: gr: 409000 - stat 00080420 00000000 00000000 00000000
Feb 25 20:57:29 nghome kernel: nouveau 0000:01:00.0: gr: 502000 - done 00000300
Feb 25 20:57:29 nghome kernel: nouveau 0000:01:00.0: gr: 502000 - stat 00000000 00000000 00000001 00000000
Feb 25 20:57:29 nghome kernel: nouveau 0000:01:00.0: gr: 502000 - stat 00080426 00000000 00000000 00000000
Feb 25 20:57:29 nghome kernel: nouveau 0000:01:00.0: gr: 50a000 - done 00000300
Feb 25 20:57:29 nghome kernel: nouveau 0000:01:00.0: gr: 50a000 - stat 00000000 00000000 00000001 00000000
Feb 25 20:57:29 nghome kernel: nouveau 0000:01:00.0: gr: 50a000 - stat 00080436 00000000 00000000 00000000
Feb 25 20:57:29 nghome kernel: nouveau 0000:01:00.0: gr: 51a000 - done 00000300
Feb 25 20:57:29 nghome kernel: nouveau 0000:01:00.0: gr: 51a000 - stat 00000000 00000000 00000001 00000000
Feb 25 20:57:29 nghome kernel: nouveau 0000:01:00.0: gr: 51a000 - stat 00080456 00000000 00000000 00000000
Feb 25 20:57:30 nghome kernel: nouveau 0000:01:00.0: fifo: SCHED_ERROR 0a [CTXSW_TIMEOUT]
Feb 25 20:57:30 nghome kernel: nouveau 0000:01:00.0: fifo: runlist 0: scheduled for recovery
Feb 25 20:57:30 nghome kernel: nouveau 0000:01:00.0: fifo: channel 13: killed
Feb 25 20:57:30 nghome kernel: nouveau 0000:01:00.0: fifo: engine 7: scheduled for recovery
Feb 25 20:57:30 nghome kernel: nouveau 0000:01:00.0: fifo: engine 0: scheduled for recovery
Feb 25 20:57:30 nghome kernel: nouveau 0000:01:00.0: systemd-logind[1156]: channel 13 killed!
Feb 25 20:57:47 nghome dbus-daemon[1072]: [system] Failed to activate service 'org.bluez': timed out (service_start_timeout=25000ms)
Feb 25 20:57:54 nghome audit[1]: SERVICE_STOP pid=1 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:init_t:s0 msg='unit=systemd-hostnamed comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
-- Reboot --
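
The journal excerpt above contains "channel N killed!" messages for more than one process. A short Python sketch for listing which process lost which channel follows; the pattern and the name killed_channels are assumptions based on the journal lines in this comment, not on any nouveau tooling.

```python
import re
from collections import defaultdict

# Assumed layout: "... nouveau <pci address>: <process>[<pid>]: channel <n> killed!"
KILL_RE = re.compile(
    r"nouveau [0-9a-f:.]+: (?P<proc>[^\s\[]+)\[(?P<pid>\d+)\]: channel (?P<ch>\d+) killed!"
)


def killed_channels(lines):
    """Map (process name, pid) to the list of channel numbers that were killed."""
    result = defaultdict(list)
    for line in lines:
        m = KILL_RE.search(line)
        if m:
            result[(m["proc"], int(m["pid"]))].append(int(m["ch"]))
    return dict(result)


journal = [
    "Feb 25 20:57:26 nghome kernel: nouveau 0000:01:00.0: fifo: channel 14: killed",
    "Feb 25 20:57:26 nghome kernel: nouveau 0000:01:00.0: Xwayland[1925]: channel 14 killed!",
    "Feb 25 20:57:30 nghome kernel: nouveau 0000:01:00.0: systemd-logind[1156]: channel 13 killed!",
]
print(killed_channels(journal))
```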

Comment 10 Dave B 2018-04-17 16:50:36 UTC
I also had what I believe to be this bug. 

I am new to Linux so I do not yet know where to read the logs for the frequent system hangs or how to interpret them.  Therefore, I made an educated guess that the problem was due to the graphics card and installed the driver directly from Nvidia by following the step by step instructions in the article below.

https://www.if-not-true-then-false.com/2015/fedora-nvidia-guide

Prior to installing the new driver, the system would crash at least once an hour, forcing me to restart my work. It has not crashed since, and I have used the system for at least 5 hours since then.

System:
  Graphics card: GTX 970
  Monitors: 2
  Environments: GNOME 3 (many times). Tried LXDE to eliminate GNOME 3 as the cause; had the same issue.

Causes:
  Making a window take the full screen
  Pressing play on a video within Chrome
  
Hope this additional case helps.

Comment 11 Dave B 2018-05-15 11:30:51 UTC
Created attachment 1436771 [details]
journalctl of system hang

Since I originally added to this bug, the NVIDIA GTX drivers I manually installed on Fedora 27, which had solved the issue, stopped the system from booting when new updates to the kernel were automatically installed. I was able to keep booting using an old version of the kernel. This, however, stopped working about a week after version 28 was released. I was effectively locked out of the system and could not boot with any version of the kernel or the recovery.

I did a clean install of Fedora 28 two days ago and have used the system less than 3 hours since doing so.  

I had two windows docked next to each other, and when I dragged the border between them, the system froze again. One window was Google Chrome; the other was the terminal. I have since learned how to use journalctl and have attached the log covering the period of this freeze.

Comment 12 Ben Cotton 2018-11-27 15:42:22 UTC
This message is a reminder that Fedora 27 is nearing its end of life.
On 2018-Nov-30 Fedora will stop maintaining and issuing updates for
Fedora 27. It is Fedora's policy to close all bug reports from releases
that are no longer maintained. At that time this bug will be closed as
EOL if it remains open with a Fedora 'version' of '27'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version' 
to a later Fedora version.

Thank you for reporting this issue and we are sorry that we were not 
able to fix it before Fedora 27 is end of life. If you would still like 
to see this bug fixed and are able to reproduce it against a later version 
of Fedora, you are encouraged to change the 'version' to a later Fedora
version before this bug is closed as described in the policy above.

Although we aim to fix as many bugs as possible during every release's 
lifetime, sometimes those efforts are overtaken by events. Often a 
more recent Fedora release includes newer upstream software that fixes 
bugs or makes them obsolete.

Comment 13 Paul Eggert 2018-11-27 19:38:57 UTC
Although I often observed the problem in Fedora 27 (see Bug 1535751), I have not observed it on the same hardware recently. I am currently running Fedora 29, so perhaps the bug was fixed in Fedora 29 or Fedora 28.

Comment 14 Ben Cotton 2018-11-30 17:34:02 UTC
Fedora 27 changed to end-of-life (EOL) status on 2018-11-30. Fedora 27 is
no longer maintained, which means that it will not receive any further
security or bug fix updates. As a result we are closing this bug.

If you can reproduce this bug against a currently maintained version of
Fedora please feel free to reopen this bug against that version. If you
are unable to reopen this bug, please file a new report against the
current release. If you experience problems, please add a comment to this
bug.

Thank you for reporting this bug and we are sorry it could not be fixed.

Comment 16 kenorb 2019-01-06 13:59:49 UTC
I have a similar problem on Ubuntu (SCHED_ERROR 0a [CTXSW_TIMEOUT]); details (including a stack trace) are provided at https://bugs.freedesktop.org/show_bug.cgi?id=100567#c18

Comment 17 Pablo Iranzo Gómez 2022-02-17 10:40:32 UTC
Similar fault on another NVIDIA card on F35:

01:00.0 VGA compatible controller: NVIDIA Corporation GM107GLM [Quadro M1000M] (rev a2) (prog-if 00 [VGA controller])
        Subsystem: Lenovo Device 2230
        Flags: bus master, fast devsel, latency 0, IRQ 143
        Memory at b2000000 (32-bit, non-prefetchable) [size=16M]
        Memory at a0000000 (64-bit, prefetchable) [size=256M]
        Memory at b0000000 (64-bit, prefetchable) [size=32M]
        I/O ports at 4000 [size=128]
        Expansion ROM at 000c0000 [disabled] [size=128K]
        Capabilities: [60] Power Management version 3
        Capabilities: [68] MSI: Enable+ Count=1/1 Maskable- 64bit+
        Capabilities: [78] Express Endpoint, MSI 00
        Capabilities: [100] Virtual Channel
        Capabilities: [250] Latency Tolerance Reporting
        Capabilities: [258] L1 PM Substates
        Capabilities: [128] Power Budgeting <?>
        Capabilities: [600] Vendor Specific Information: ID=0001 Rev=1 Len=024 <?>
        Capabilities: [900] Secondary PCI Express
        Kernel driver in use: nouveau
        Kernel modules: nouveau




[ 9981.133381] nouveau 0000:01:00.0: fifo: SCHED_ERROR 0a [CTXSW_TIMEOUT]
[ 9981.133417] nouveau 0000:01:00.0: fifo: runlist 0: scheduled for recovery
[ 9981.133432] nouveau 0000:01:00.0: fifo: channel 6: killed
[ 9981.133444] nouveau 0000:01:00.0: fifo: engine 0: scheduled for recovery
[ 9981.133822] nouveau 0000:01:00.0: Xwayland[4916]: channel 6 killed!
[10017.641411] nouveau 0000:01:00.0: Xwayland[4916]: failed to idle channel 8 [Xwayland[4916]]
[10032.641406] nouveau 0000:01:00.0: Xwayland[4916]: failed to idle channel 8 [Xwayland[4916]]
[10032.641457] nouveau 0000:01:00.0: fifo: fault 00 [READ] at 0000000000013000 engine 07 [HOST0] client 07 [HUB/HOST_CPU] reason 02 [PTE] on channel 8 [00feee3000 Xwayland[4916]]
[10032.641468] nouveau 0000:01:00.0: fifo: channel 8: killed
[10032.641470] nouveau 0000:01:00.0: fifo: runlist 0: scheduled for recovery
[10032.641508] ------------[ cut here ]------------
[10032.641509] WARNING: CPU: 5 PID: 57936 at drivers/gpu/drm/nouveau/nvkm/engine/fifo/gk104.c:284 gk104_fifo_engine_id+0x33/0x50 [nouveau]
[10032.641577] Modules linked in: snd_seq_dummy snd_hrtimer uinput rfcomm xt_mark xt_CHECKSUM xt_conntrack xt_MASQUERADE xt_comment ipt_REJECT nf_nat_tftp tun nft_objref nf_conntrack_tftp nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib snd_usb_audio nft_reject_inet nf_reject_ipv4 nf_reject_ipv6 nft_reject snd_usbmidi_lib snd_rawmidi nft_ct nft_chain_nat nf_tables ebtable_nat ebtable_broute ip6table_nat ip6table_mangle ip6table_raw ip6table_security bridge iptable_nat nf_nat nf_conntrack stp llc nf_defrag_ipv6 nf_defrag_ipv4 iptable_mangle iptable_raw iptable_security ip_set nfnetlink ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter bnep vboxnetadp(OE) vboxnetflt(OE) vboxdrv(OE) btusb btrtl btbcm btintel bluetooth qrtr uvcvideo videobuf2_vmalloc videobuf2_memops videobuf2_v4l2 videobuf2_common ecdh_generic rmi_smbus rmi_core sunrpc intel_rapl_msr intel_rapl_common intel_tcc_cooling x86_pkg_temp_thermal intel_powerclamp coretemp ee1004 iTCO_wdt intel_pmc_bxt
[10032.641613]  iTCO_vendor_support kvm_intel mei_wdt mei_pxp mei_hdcp iwlmvm vfat fat kvm squashfs mac80211 irqbypass snd_ctl_led loop snd_hda_codec_realtek snd_hda_codec_hdmi libarc4 snd_hda_codec_generic rapl snd_hda_intel intel_cstate snd_intel_dspcfg snd_intel_sdw_acpi intel_uncore snd_hda_codec iwlwifi snd_hda_core snd_hwdep think_lmi joydev intel_wmi_thunderbolt wmi_bmof firmware_attributes_class cfg80211 snd_seq snd_seq_device i2c_i801 i2c_smbus thinkpad_acpi snd_pcm mei_me ledtrig_audio platform_profile snd_timer rfkill mei intel_pch_thermal snd soundcore v4l2loopback(OE) videodev mc zram ip_tables xfs dm_crypt hid_logitech_hidpp hid_logitech_dj nouveau rtsx_pci_sdmmc mmc_core crct10dif_pclmul drm_ttm_helper crc32_pclmul crc32c_intel ttm i2c_algo_bit mxm_wmi e1000e drm_kms_helper ghash_clmulni_intel nvme cec serio_raw nvme_core rtsx_pci drm wmi video ipmi_devintf ipmi_msghandler fuse
[10032.641651] CPU: 5 PID: 57936 Comm: tracker-miner-f Tainted: G           OE     5.16.8-200.fc35.x86_64 #1
[10032.641654] RIP: 0010:gk104_fifo_engine_id+0x33/0x50 [nouveau]
[10032.641709] Code: 74 30 8b 97 98 04 00 00 48 85 f6 74 1d 85 d2 7e 19 48 81 c7 98 03 00 00 31 c0 48 39 37 74 18 83 c0 01 48 83 c7 10 39 d0 7c f0 <0f> 0b b8 ff ff ff ff c3 b8 0f 00 00 00 c3 66 66 2e 0f 1f 84 00 00
[10032.641711] RSP: 0000:ffff9fbb8f6ffcd8 EFLAGS: 00010046
[10032.641712] RAX: 0000000000000006 RBX: ffff92e1c1112030 RCX: 0000000000000006
[10032.641714] RDX: 0000000000000006 RSI: ffff92e1c1112010 RDI: ffff92e1c1112400
[10032.641715] RBP: 0000000000000007 R08: 0000000000000000 R09: 0000000000000000
[10032.641715] R10: 0000000000000006 R11: 0000000000000000 R12: 0000000000000046
[10032.641716] R13: ffff92e1c1112010 R14: ffff92e1c1112008 R15: ffff92e1c11122b8
[10032.641718] FS:  00007f1096aa3ac0(0000) GS:ffff92f123f40000(0000) knlGS:0000000000000000
[10032.641719] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[10032.641720] CR2: 000055c6ba6f10a8 CR3: 00000004b076c003 CR4: 00000000003706e0
[10032.641721] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[10032.641722] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[10032.641723] Call Trace:
[10032.641725]  <TASK>
[10032.641726]  gk104_fifo_fault+0x10a/0x230 [nouveau]
[10032.641793]  gm107_fifo_intr_fault+0xd5/0xf0 [nouveau]
[10032.641856]  gk104_fifo_intr+0x294/0x4d0 [nouveau]
[10032.641914]  nvkm_mc_intr+0x129/0x170 [nouveau]
[10032.641975]  nvkm_pci_intr+0x3d/0x80 [nouveau]
[10032.642034]  __handle_irq_event_percpu+0x3a/0x190
[10032.642039]  handle_irq_event+0x45/0x90
[10032.642041]  handle_edge_irq+0x9f/0x240
[10032.642043]  __common_interrupt+0x69/0x100
[10032.642047]  common_interrupt+0x5c/0xd0
[10032.642067]  ? asm_common_interrupt+0x8/0x40
[10032.642069]  asm_common_interrupt+0x1e/0x40
[10032.642071] RIP: 0033:0x7f1098fe6f9f
[10032.642073] Code: 44 00 00 0f be 30 0f b6 40 01 c1 e6 08 09 f0 be 04 00 00 00 48 98 66 89 72 08 48 89 02 89 c8 c3 0f 1f 00 0f be 30 0f b7 40 01 <66> c1 c0 08 c1 e6 10 0f b7 c0 eb d7 0f 1f 44 00 00 0f b6 70 01 0f
[10032.642074] RSP: 002b:00007ffc6d2f9718 EFLAGS: 00000203
[10032.642076] RAX: 000000000000086b RBX: 000055c687a74ba8 RCX: 0000000000000003
[10032.642077] RDX: 00007f1064df5518 RSI: 0000000000000002 RDI: 00007f10990b7ce0
[10032.642078] RBP: 00007f1064df5518 R08: 0000000000000000 R09: 0000000000000001
[10032.642079] R10: 00000000000003ea R11: 0000000000000000 R12: 000055c687ca7e95
[10032.642080] R13: 00007f1064df46d0 R14: 00007f1064001488 R15: 0000000000000003
[10032.642082]  </TASK>
[10032.642083] ---[ end trace b711737f28758890 ]---
[10032.642096] nouveau 0000:01:00.0: Xwayland[4916]: channel 8 killed!

Comment 19 Ben Cotton 2022-11-29 16:45:28 UTC
This message is a reminder that Fedora Linux 35 is nearing its end of life.
Fedora will stop maintaining and issuing updates for Fedora Linux 35 on 2022-12-13.
It is Fedora's policy to close all bug reports from releases that are no longer
maintained. At that time this bug will be closed as EOL if it remains open with a
'version' of '35'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, change the 'version' 
to a later Fedora Linux version.

Thank you for reporting this issue. We are sorry that we were not
able to fix it before Fedora Linux 35 reached end of life. If you would still like
to see this bug fixed and are able to reproduce it against a later version
of Fedora Linux, you are encouraged to change the 'version' to a later version
before this bug is closed.

Comment 20 Ben Cotton 2022-12-13 15:12:21 UTC
Fedora Linux 35 entered end-of-life (EOL) status on 2022-12-13.

Fedora Linux 35 is no longer maintained, which means that it
will not receive any further security or bug fix updates. As a result we
are closing this bug.

If you can reproduce this bug against a currently maintained version of Fedora Linux,
please feel free to reopen this bug against that version. Note that the version
field may be hidden. Click the "Show advanced fields" button if you do not see
the version field.

If you are unable to reopen this bug, please file a new report against an
active release.

Thank you for reporting this bug and we are sorry it could not be fixed.