Bug 1245875 - nouveau PGRAPH TLB flush idle timeout fail (F22)
Summary: nouveau PGRAPH TLB flush idle timeout fail (F22)
Keywords:
Status: CLOSED CANTFIX
Alias: None
Product: Fedora
Classification: Fedora
Component: xorg-x11-drv-nouveau
Version: 22
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: high
Target Milestone: ---
Assignee: Ben Skeggs
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2015-07-23 03:30 UTC by Matt Domsch
Modified: 2015-11-12 02:35 UTC
CC List: 4 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2015-11-12 02:35:27 UTC
Type: Bug
Embargoed:


Attachments
kernel log messages (244.77 KB, application/octet-stream)
2015-07-23 03:30 UTC, Matt Domsch

Description Matt Domsch 2015-07-23 03:30:02 UTC
Created attachment 1055143
kernel log messages

Description of problem:
The system locks up (console unresponsive) while the kernel repeatedly reports the following errors:

[  441.989044] nouveau E[     PGR][0000:01:00.0] vm flush timeout
[  443.995127] nouveau E[     PGR][0000:01:00.0] PGRAPH TLB flush idle timeout fail
[  444.002432] nouveau E[     PGR][0000:01:00.0] PGRAPH_STATUS  : 0x00be0001 BUSY ENG2D RMASK TPC_RAST TPC_PROP TPC_TEX TPC_MP
[  444.013495] nouveau E[     PGR][0000:01:00.0] PGRAPH_VSTATUS0: 0x00000000
[  444.020206] nouveau E[     PGR][0000:01:00.0] PGRAPH_VSTATUS1: 0x0000106d TPC_TEX TPC_MP
[  444.028224] nouveau E[     PGR][0000:01:00.0] PGRAPH_VSTATUS2: 0x00148000 ENG2D
[  446.035627] nouveau E[     PGR][0000:01:00.0] vm flush timeout
[  448.041675] nouveau E[     PGR][0000:01:00.0] PGRAPH TLB flush idle timeout fail
[  448.048978] nouveau E[     PGR][0000:01:00.0] PGRAPH_STATUS  : 0x00be0001 BUSY ENG2D RMASK TPC_RAST TPC_PROP TPC_TEX TPC_MP
[  448.060041] nouveau E[     PGR][0000:01:00.0] PGRAPH_VSTATUS0: 0x00000000
[  448.066749] nouveau E[     PGR][0000:01:00.0] PGRAPH_VSTATUS1: 0x0000106d TPC_TEX TPC_MP
[  448.074763] nouveau E[     PGR][0000:01:00.0] PGRAPH_VSTATUS2: 0x00148000 ENG2D
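
For anyone triaging similar logs, the following is a minimal Python sketch that pulls the timestamp, unit, PCI address and message out of nouveau error lines like the ones above and measures how often a given message repeats. The regular expression and the helper names (`parse_errors`, `repeat_intervals`) are illustrative assumptions based only on the line format shown in this report, not part of any nouveau tooling.

```python
import re

# Matches lines such as:
# [  443.995127] nouveau E[     PGR][0000:01:00.0] PGRAPH TLB flush idle timeout fail
# The pattern is an assumption derived from the log excerpt in this report.
LINE_RE = re.compile(
    r"^\[\s*(?P<ts>\d+\.\d+)\]\s+nouveau\s+E\[\s*(?P<unit>\w+)\]"
    r"\[(?P<pci>[0-9a-f:.]+)\]\s+(?P<msg>.*)$"
)


def parse_errors(lines):
    """Return (timestamp, unit, pci_address, message) for each nouveau error line."""
    events = []
    for line in lines:
        m = LINE_RE.match(line.strip())
        if m:
            events.append((float(m["ts"]), m["unit"], m["pci"], m["msg"]))
    return events


def repeat_intervals(events, needle):
    """Seconds between consecutive messages whose text contains `needle`."""
    stamps = [ts for ts, _unit, _pci, msg in events if needle in msg]
    return [later - earlier for earlier, later in zip(stamps, stamps[1:])]


sample = [
    "[  441.989044] nouveau E[     PGR][0000:01:00.0] vm flush timeout",
    "[  443.995127] nouveau E[     PGR][0000:01:00.0] PGRAPH TLB flush idle timeout fail",
    "[  446.035627] nouveau E[     PGR][0000:01:00.0] vm flush timeout",
    "[  448.041675] nouveau E[     PGR][0000:01:00.0] PGRAPH TLB flush idle timeout fail",
]

events = parse_errors(sample)
intervals = repeat_intervals(events, "TLB flush idle timeout")
for gap in intervals:
    print(f"TLB flush failure repeated after {gap:.3f} s")
```

On the excerpt above this reports a gap of roughly four seconds between consecutive "PGRAPH TLB flush idle timeout fail" messages.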

Version-Release number of selected component (if applicable):
Fedora release 22 (Twenty Two)
Kernel 4.0.4-303.fc22.x86_64 on an x86_64 (ttyS0)
xorg-x11-drv-nouveau-1.0.11-2.fc22.x86_64

How reproducible:
Trivially reproducible with this hardware and Fedora 22. I stopped using GNOME around Fedora 18 because of similar lockups; KDE was fine. With Fedora 22, I had to stop using KDE and switch to LXDE. After running dnf upgrade yesterday to pick up the newest Fedora 22 packages, the failure happens within a few minutes of booting, even with LXDE.

Steps to Reproduce:
1. Boot Fedora 22 on this graphics hardware with the nouveau driver under Xorg and the LXDE desktop
2. Start a web browser
3. The system locks up within a few minutes

Actual results:
system lockup

Expected results:
no lockup

Additional info:
Kernel messages related to nouveau at bootup:

[    3.729680] nouveau  [  DEVICE][0000:01:00.0] BOOT0  : 0x0a3000a2
[    3.738835] nouveau  [  DEVICE][0000:01:00.0] Chipset: GT215 (NVA3)
[    3.748288] nouveau  [  DEVICE][0000:01:00.0] Family : NV50
[    3.880804] nouveau  [   VBIOS][0000:01:00.0] using image from PRAMIN
[    3.887635] nouveau  [   VBIOS][0000:01:00.0] BIT signature found
[    3.893972] nouveau  [   VBIOS][0000:01:00.0] version 70.15.32.00.00
[    3.920937] nouveau  [     PMC][0000:01:00.0] MSI interrupts enabled
[    3.920962] nouveau  [     PFB][0000:01:00.0] RAM type: DDR3
[    3.920963] nouveau  [     PFB][0000:01:00.0] RAM size: 1024 MiB
[    3.920964] nouveau  [     PFB][0000:01:00.0]    ZCOMP: 2048 tags
[    3.923543] nouveau  [    VOLT][0000:01:00.0] GPU voltage: 900000uv
[    3.951128] nouveau  [  PTHERM][0000:01:00.0] FAN control: PWM
[    3.951138] nouveau  [  PTHERM][0000:01:00.0] fan management: automatic
[    3.951156] nouveau  [  PTHERM][0000:01:00.0] internal sensor: yes
[    3.971188] nouveau  [     CLK][0000:01:00.0] 03: core 135 MHz shader 270 MHz memory 135 MHz
[    3.971191] nouveau  [     CLK][0000:01:00.0] 07: core 405 MHz shader 810 MHz memory 324 MHz
[    3.971193] nouveau  [     CLK][0000:01:00.0] 0f: core 550 MHz shader 1340 MHz memory 790 MHz
[    3.971214] nouveau  [     CLK][0000:01:00.0] --: core 405 MHz shader 810 MHz memory 324 MHz
[    3.971398] nouveau  [     DRM] VRAM: 1024 MiB
[    3.971398] nouveau  [     DRM] GART: 1048576 MiB
[    3.971401] nouveau  [     DRM] TMDS table version 2.0
[    3.971402] nouveau  [     DRM] DCB version 4.0
[    3.971403] nouveau  [     DRM] DCB outp 00: 01000302 00020030
[    3.971404] nouveau  [     DRM] DCB outp 01: 02000300 00000000
[    3.971405] nouveau  [     DRM] DCB outp 02: 040113b6 0f220010
[    3.971405] nouveau  [     DRM] DCB outp 03: 04011372 00020010
[    3.971407] nouveau  [     DRM] DCB conn 00: 00001030
[    3.971407] nouveau  [     DRM] DCB conn 01: 00202146
[    4.043294] nouveau  [     DRM] MM: using COPY for buffer copies
[    4.146305] nouveau  [     DRM] allocated 1920x1200 fb: 0x70000, bo ffff880409716c00
[    4.161841] fbcon: nouveaufb (fb0) is primary device
[    4.333332] nouveau 0000:01:00.0: fb0: nouveaufb frame buffer device
[    4.333333] nouveau 0000:01:00.0: registered panic notifier
[    4.350656] [drm] Initialized nouveau 1.2.1 20120801 for 0000:01:00.0 on minor 0
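
As a side note on the BOOT0 and Chipset lines above, here is a small Python sketch that derives the chipset id from the BOOT0 register value. The bit layout (chipset id in bits 20 to 28) is an assumption; it is checked here only against the two BOOT0/Chipset pairs that appear in this bug (0x0a3000a2 with NVA3, and 0x298c00a2 with NV98 from comment 4). The function name `chipset_from_boot0` is hypothetical.

```python
def chipset_from_boot0(boot0: int) -> int:
    """Extract the chipset id, assuming it occupies bits 20..28 of BOOT0."""
    return (boot0 >> 20) & 0x1FF


# BOOT0 values copied from the two dmesg excerpts in this report.
for boot0 in (0x0A3000A2, 0x298C00A2):
    print(f"BOOT0 0x{boot0:08x} -> NV{chipset_from_boot0(boot0):02X}")
# prints:
# BOOT0 0x0a3000a2 -> NVA3
# BOOT0 0x298c00a2 -> NV98
```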

Comment 1 Matt Domsch 2015-07-23 03:36:19 UTC
https://bugzilla.redhat.com/show_bug.cgi?id=754882 asks for the installed versions of related packages, so I include them here:

libdrm-2.4.61-3.fc22.x86_64
libdrm-2.4.61-3.fc22.i686
mesa-dri-drivers-10.6.1-1.20150629.fc22.x86_64
mesa-dri-drivers-10.6.1-1.20150629.fc22.i686

Comment 2 Matt Domsch 2015-07-24 01:22:23 UTC
Ben tried to fix this in the following upstream commit:
commit 464d636bd0a7a905209816d1dee0838ccb79e57a
Author: Ben Skeggs <bskeggs>
Date:   Mon May 13 20:55:46 2013 +1000

    drm/nv50/vm: remove explicit vm knowledge from engines
    
    This reverses the lock ordering between VM and gr/nv84:nvc0.
    
    Signed-off-by: Ben Skeggs <bskeggs>


With reference to 
/* unfortunate hw bug workaround... */

I'm not sure what else to do here, other than perhaps buying a different video card that doesn't crash.

Comment 3 Matt Domsch 2015-07-25 03:58:45 UTC
Running google-chrome with --disable-gpu seems to reduce the crashes. At least, it hasn't crashed in several hours of use today.
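
A minimal Python sketch of that workaround, for anyone who wants to wrap it in a launcher script. The helper name `chrome_command` and the example URL are illustrative; only the `google-chrome` binary name and the `--disable-gpu` flag come from the comment above.

```python
import shlex


def chrome_command(extra_args=()):
    """Build the argument list for launching Chrome with GPU acceleration disabled."""
    return ["google-chrome", "--disable-gpu", *extra_args]


cmd = chrome_command(["https://bugzilla.redhat.com/"])
print(shlex.join(cmd))
# To actually start the browser, pass the list to subprocess.Popen(cmd).
```

Building the command as a list rather than a single string avoids shell quoting problems if extra arguments are added later.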

Comment 4 Eugene Mah 2015-08-04 17:36:18 UTC
I have been running into this issue on my system as well. It never happens while I'm working at the computer. When I leave the computer for a while, I come back to a frozen screensaver and a system that does not respond to the mouse or keyboard.

I see
Aug  4 11:36:12 tungsten kernel: nouveau E[  PGRAPH][0000:09:00.0] PGRAPH TLB flush idle timeout fail
Aug  4 11:36:12 tungsten kernel: nouveau E[  PGRAPH][0000:09:00.0] PGRAPH_STATUS  : 0x01982703 BUSY DISPATCH CTXPROG VFETCH CCACHE_PREGEOM RATTR_APLANE TPC_RAST TPC_PROP TPC_MP ROP
Aug  4 11:36:12 tungsten kernel: nouveau E[  PGRAPH][0000:09:00.0] PGRAPH_VSTATUS0: 0x0000000d CCACHE
Aug  4 11:36:12 tungsten kernel: nouveau E[  PGRAPH][0000:09:00.0] PGRAPH_VSTATUS1: 0x0000102d TPC_MP
Aug  4 11:36:12 tungsten kernel: nouveau E[  PGRAPH][0000:09:00.0] PGRAPH_VSTATUS2: 0x00200028 ROP

These messages repeat about every two seconds in /var/log/messages.
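
To compare status dumps between machines, a rough Python sketch for decoding PGRAPH_STATUS values follows. The bit-to-name table is inferred only from the two PGRAPH_STATUS lines quoted in this bug (0x00be0001 and 0x01982703), under the assumption that the kernel prints the names in ascending bit order. It is deliberately incomplete and is not copied from the driver source; `STATUS_BITS` and `decode_status` are hypothetical names.

```python
# Bit -> name pairs inferred from the two PGRAPH_STATUS lines in this report,
# assuming names are printed in ascending bit order. Not a complete table.
STATUS_BITS = {
    0x00000001: "BUSY",
    0x00000002: "DISPATCH",
    0x00000100: "CTXPROG",
    0x00000200: "VFETCH",
    0x00000400: "CCACHE_PREGEOM",
    0x00002000: "RATTR_APLANE",
    0x00020000: "ENG2D",
    0x00040000: "RMASK",
    0x00080000: "TPC_RAST",
    0x00100000: "TPC_PROP",
    0x00200000: "TPC_TEX",
    0x00800000: "TPC_MP",
    0x01000000: "ROP",
}


def decode_status(value):
    """Return (names of known set bits, mask of set bits missing from the table)."""
    names = []
    known_mask = 0
    for bit, name in sorted(STATUS_BITS.items()):
        known_mask |= bit
        if value & bit:
            names.append(name)
    return names, value & ~known_mask


for status in (0x00BE0001, 0x01982703):
    names, unknown = decode_status(status)
    print(f"0x{status:08x}: {' '.join(names)} (undecoded bits: 0x{unknown:08x})")
```

Any set bit that is not in the table is returned in the second element instead of being silently dropped, so a dump from a different card shows clearly where the table needs extending.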

From dmesg
[    2.791703] nouveau  [  DEVICE][0000:0a:00.0] BOOT0  : 0x298c00a2
[    2.791709] nouveau  [  DEVICE][0000:0a:00.0] Chipset: G98 (NV98)
[    2.791712] nouveau  [  DEVICE][0000:0a:00.0] Family : NV50
[    2.908645] nouveau  [   VBIOS][0000:0a:00.0] using image from PROM
[    2.908802] nouveau  [   VBIOS][0000:0a:00.0] BIT signature found
[    2.908806] nouveau  [   VBIOS][0000:0a:00.0] version 62.98.6f.00.07
[    2.909268] nouveau  [ DEVINIT][0000:0a:00.0] adaptor not initialised
[    2.909276] nouveau  [   VBIOS][0000:0a:00.0] running init tables
[    2.962965] nouveau  [     PMC][0000:0a:00.0] MSI interrupts enabled
[    2.963028] nouveau  [     PFB][0000:0a:00.0] RAM type: GDDR3
[    2.963031] nouveau  [     PFB][0000:0a:00.0] RAM size: 256 MiB
[    2.963034] nouveau  [     PFB][0000:0a:00.0]    ZCOMP: 960 tags
[    3.605032] nouveau  [  PTHERM][0000:0a:00.0] FAN control: none / external
[    3.605054] nouveau  [  PTHERM][0000:0a:00.0] fan management: automatic
[    3.605061] nouveau  [  PTHERM][0000:0a:00.0] internal sensor: yes
[    3.625085] nouveau  [     CLK][0000:0a:00.0] 03: core 169 MHz shader 358 MHz memory 100 MHz
[    3.625091] nouveau  [     CLK][0000:0a:00.0] 0f: core 550 MHz shader 1400 MHz memory 700 MHz
[    3.625160] nouveau  [     CLK][0000:0a:00.0] --: core 550 MHz shader 1400 MHz memory 702 MHz
[    3.625371] nouveau  [     DRM] VRAM: 256 MiB
[    3.625374] nouveau  [     DRM] GART: 1048576 MiB
[    3.625380] nouveau  [     DRM] TMDS table version 2.0
[    3.625383] nouveau  [     DRM] DCB version 4.0
[    3.625387] nouveau  [     DRM] DCB outp 00: 02000386 0f220010
[    3.625391] nouveau  [     DRM] DCB outp 01: 02000302 00020010
[    3.625394] nouveau  [     DRM] DCB outp 02: 040113a6 0f220010
[    3.625397] nouveau  [     DRM] DCB outp 03: 04011312 00020010
[    3.625400] nouveau  [     DRM] DCB conn 00: 00005046
[    3.625404] nouveau  [     DRM] DCB conn 01: 00006146
[    3.680555] nouveau  [     DRM] MM: using M2MF for buffer copies
[    3.680577] [drm] Initialized nouveau 1.2.2 20120801 for 0000:0a:00.0 on minor 1

Currently running Fedora 22
Kernel: 4.1.3-200.fc22.x86_64
xorg-x11-drv-nouveau 1.0.11-2.fc22
libdrm.i686 2.4.61-3.fc22
libdrm.x86_64 2.4.61-3.fc22
mesa-dri-drivers.x86_64 10.6.3-1.20150729.fc22
xorg-x11-server-Xorg.x86_64 1.17.2-2.fc22

Comment 5 Matt Domsch 2015-08-19 17:07:16 UTC
I gave up and replaced my video card with an ATI Radeon HD 5450. Bye bye nouveau.

