Bug 1023849

Summary: Nouveau segfaults on NVE7
Product: [Fedora] Fedora
Reporter: D. Hugh Redelmeier <hugh>
Component: xorg-x11-drv-nouveau
Assignee: Ben Skeggs <bskeggs>
Status: CLOSED EOL
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: unspecified
Docs Contact:
Priority: unspecified
Version: 19
CC: airlied, ajax, bskeggs, otakon, richartjkuak
Target Milestone: ---   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2015-02-17 17:50:01 UTC
Type: Bug
Attachments:
Description                               Flags
dmesg output                              none
Xorg.0.log.old: crash happens at end      none
Xorg.0.log: recovery from crash fails     none

Description D. Hugh Redelmeier 2013-10-28 05:30:27 UTC
Description of problem:
Several times today, X crashed and left an unreadably curdled screen.
I have to ssh in from another machine to shut down cleanly.  Even then, shutdown isn't as quick as normal.

Version-Release number of selected component (if applicable):
xorg-x11-drv-nouveau-1.0.9-1.fc19.x86_64


How reproducible:
Has happened a couple of times, without trying.  The one I'm reporting crashed 984 seconds after the server started.

Steps to Reproduce:
1. Run X on my machine.
2. MAYBE switch to another screen with KVM.  I don't see evidence that this is the problem.
3. Wait.

Actual results:
nouveau segfaults, screen is curdled, console is unresponsive.

Expected results:
no segfaults

Additional info:

- the monitor I use is connected via dual link DVI (2560x1600 resolution)

- the video card is an MSI GeForce GTX 650 with 1G of GDDR5 RAM

- the system is an HP Envy 700-19 (i7, 12G RAM).

- I'm running Firefox when X crashes.  No Flash installed.  Not working the machine hard at all (I'm not actually using it when it crashes).

- I updated my F19 today to see if that would cure the crash.  But I got another crash, the one I'm reporting.

Comment 1 D. Hugh Redelmeier 2013-10-28 05:41:35 UTC
Created attachment 816694 [details]
dmesg output

You can see that at 942 seconds in, nouveau gets into trouble.

The USB messages at 43 seconds in probably reflect my switching the KVM away from this machine.  So I don't expect that the KVM was a trigger.

The USB messages at 31933 seconds in are probably my switching the KVM back to this machine.  Again, not a trigger.

The key messages start:
[  942.353232] nouveau ![   PFIFO][0000:01:00.0] unhandled status 0x01000000
[  985.405692] nouveau E[     DRM] GPU lockup - switching to software fbcon
[ 1000.469072] nouveau E[Xorg[946]] failed to idle channel 0xcccc0001 [Xorg[946]]
[ 1015.475382] nouveau E[Xorg[946]] failed to idle channel 0xcccc0001 [Xorg[946]]
[ 1017.479238] nouveau E[   PFIFO][0000:01:00.0] playlist 0 update timeout
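
For anyone sifting a long dmesg for these messages, here is a minimal sketch that picks out the nouveau lines and splits them into fields. The line shape is an assumption inferred only from the messages quoted in this report, and the names `NOUVEAU_LINE`, `parse_nouveau_line` and `nouveau_events` are made up for illustration.

```python
import re

# Assumed shape, taken from the lines quoted above:
#   [  942.353232] nouveau ![   PFIFO][0000:01:00.0] unhandled status 0x01000000
#   [ 1000.469072] nouveau E[Xorg[946]] failed to idle channel 0xcccc0001 [Xorg[946]]
NOUVEAU_LINE = re.compile(
    r"^\s*\[\s*(?P<ts>\d+\.\d+)\]\s+nouveau\s+"
    r"(?P<level>\S)\[\s*(?P<source>\w[\w-]*(?:\[\d+\])?)\]\s*(?P<rest>.*)$"
)


def parse_nouveau_line(line):
    """Return a dict for a nouveau dmesg line, or None for any other line."""
    m = NOUVEAU_LINE.match(line)
    if m is None:
        return None
    return {
        "seconds": float(m.group("ts")),    # time since boot
        "level": m.group("level"),          # single character, e.g. "E", "W", "!"
        "source": m.group("source"),        # e.g. "PFIFO", "DRM", "Xorg[946]"
        "message": m.group("rest").strip(),  # everything after the source tag
    }


def nouveau_events(lines):
    """Keep only the lines that parse as nouveau messages."""
    return [e for e in map(parse_nouveau_line, lines) if e is not None]
```

Feeding it the lines of a saved dmesg file (for example `nouveau_events(open(path))`, where `path` is wherever the output was saved) gives a list that can be filtered by `level` or `source`.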

Comment 2 D. Hugh Redelmeier 2013-10-28 05:44:24 UTC
Created attachment 816695 [details]
Xorg.0.log.old: crash happens at end

the villain seems to be a segfault here:
[   984.971] (EE) 2: /lib64/libc.so.6 (__memcpy_ssse3_back+0x241c) [0x3373949d0c]
[   984.972] (EE) 3: /usr/lib64/xorg/modules/libexa.so (exaMoveOutPixmap+0xed6) [0x7f13b9c0f406]
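
To compare backtraces between crashes, the frames can be pulled out of the log mechanically. This is a rough sketch that assumes every frame looks like the two lines quoted above; `FRAME` and `parse_frame` are hypothetical names, not part of Xorg.

```python
import re

# Assumed frame shape, taken from the two lines quoted above:
#   [   984.971] (EE) 2: /lib64/libc.so.6 (__memcpy_ssse3_back+0x241c) [0x3373949d0c]
FRAME = re.compile(
    r"\(EE\)\s+(?P<index>\d+):\s+(?P<object>\S+)\s+"
    r"\((?P<symbol>[^+)]+)\+(?P<offset>0x[0-9a-fA-F]+)\)\s+"
    r"\[(?P<address>0x[0-9a-fA-F]+)\]"
)


def parse_frame(line):
    """Return one backtrace frame as a dict, or None if the line is not a frame."""
    m = FRAME.search(line)
    if m is None:
        return None
    return {
        "index": int(m.group("index")),            # position in the backtrace
        "object": m.group("object"),               # shared object or binary
        "symbol": m.group("symbol"),               # nearest symbol name
        "offset": int(m.group("offset"), 16),      # offset from that symbol
        "address": int(m.group("address"), 16),    # absolute address
    }
```

Only the `object`, `symbol` and `offset` fields are worth comparing across runs, since absolute addresses in shared libraries can differ from one run to the next.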

Comment 3 D. Hugh Redelmeier 2013-10-28 05:48:26 UTC
Created attachment 816698 [details]
Xorg.0.log: recovery from crash fails

After the crash, X tries to restart.
It seems to stall partway in, with no subsequent messages.

The last logged message is at 1233 seconds in (dmesg goes to 31976 seconds).

The KVM seems to be behaving: the EDID info looks good.

Comment 4 D. Hugh Redelmeier 2013-10-28 06:50:47 UTC
I just had another crash.  Looks the same.  I never switched the KVM away from this computer.

I booted, logged in, and started firefox (asked it to recover the windows from the previous crash).  And went away.

557 seconds in, dmesg showed the same messages as reported above.  But it does take place over a period of time:

[  557.630564] nouveau W[   PFIFO][0000:01:00.0] unknown status 0x00000100
[  682.134221] nouveau ![   PFIFO][0000:01:00.0] unhandled status 0x01000000
[  769.489579] nouveau E[     DRM] GPU lockup - switching to software fbcon
[  784.553566] nouveau E[Xorg[942]] failed to idle channel 0xcccc0001 [Xorg[942]]
[  799.559785] nouveau E[Xorg[942]] failed to idle channel 0xcccc0001 [Xorg[942]]
[  801.563707] nouveau E[   PFIFO][0000:01:00.0] playlist 0 update timeout
[  816.569861] nouveau E[Xorg[942]] failed to idle channel 0xcccc0000 [Xorg[942]]
[  831.576061] nouveau E[Xorg[942]] failed to idle channel 0xcccc0000 [Xorg[942]]
[  833.577040] nouveau E[   PFIFO][0000:01:00.0] channel 2 [Xorg[942]] kick timeout
[  835.577953] nouveau E[   PFIFO][0000:01:00.0] channel 2 [Xorg[942]] kick timeout
[  837.581498] nouveau E[   PFIFO][0000:01:00.0] playlist 0 update timeout
[  852.657785] nouveau E[gnome-shell[1561]] failed to idle channel 0xcccc0000 [gnome-shell[1561]]
[  867.664238] nouveau E[gnome-shell[1561]] failed to idle channel 0xcccc0000 [gnome-shell[1561]]
[  869.665315] nouveau E[   PFIFO][0000:01:00.0] channel 4 [gnome-shell[1561]] kick timeout
[  871.669106] nouveau E[   PFIFO][0000:01:00.0] playlist 0 update timeout
[  873.807564] nouveau E[   PFIFO][0000:01:00.0] playlist 0 update timeout
[  875.821509] nouveau E[   PFIFO][0000:01:00.0] playlist 0 update timeout
[  893.024405] nouveau E[   PFIFO][0000:01:00.0] playlist 0 update timeout
[  953.278637] nouveau E[gnome-session-c[2216]] failed to idle channel 0xcccc0000 [gnome-session-c[2216]]
[  968.284175] nouveau E[gnome-session-c[2216]] failed to idle channel 0xcccc0000 [gnome-session-c[2216]]
[  983.289953] nouveau E[gnome-session-c[2216]] failed to idle channel 0xcccc0000 [gnome-session-c[2216]]
[  998.295947] nouveau E[gnome-session-c[2216]] failed to idle channel 0xcccc0000 [gnome-session-c[2216]]
[ 1013.302118] nouveau E[gnome-session-c[2216]] failed to idle channel 0xcccc0000 [gnome-session-c[2216]]
[ 1015.303052] nouveau E[   PFIFO][0000:01:00.0] channel 4 [gnome-session-c[2216]] kick timeout
[ 1017.306746] nouveau E[   PFIFO][0000:01:00.0] playlist 0 update timeout
[ 1019.598416] nouveau E[   PFIFO][0000:01:00.0] playlist 0 update timeout
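
To put numbers on "over a period of time", here is a small sketch that measures the delay between the first nouveau warning or error and the "GPU lockup" line, and counts the "failed to idle channel" messages per process. It assumes the level characters W, E and ! seen in this excerpt mark problem lines; `summarize_lockup` is a made-up name.

```python
import re
from collections import Counter

TIMESTAMP = re.compile(r"^\s*\[\s*(\d+\.\d+)\]")
# Assumption: W, E and ! are the level characters that mark problem lines.
PROBLEM = re.compile(r"nouveau [WE!]\[")
IDLE_FAILURE = re.compile(r"nouveau E\[([\w-]+)\[\d+\]\] failed to idle channel")


def summarize_lockup(lines):
    """Summarize one lockup episode from an iterable of dmesg lines."""
    first_problem = None
    lockup = None
    idle_failures = Counter()
    for line in lines:
        stamp = TIMESTAMP.match(line)
        if stamp is None or PROBLEM.search(line) is None:
            continue  # not a timestamped nouveau warning/error
        seconds = float(stamp.group(1))
        if first_problem is None:
            first_problem = seconds
        if lockup is None and "GPU lockup" in line:
            lockup = seconds
        hit = IDLE_FAILURE.search(line)
        if hit is not None:
            idle_failures[hit.group(1)] += 1  # keyed by process name
    delay = None
    if first_problem is not None and lockup is not None:
        delay = lockup - first_problem
    return {
        "first_problem": first_problem,
        "lockup": lockup,
        "delay": delay,
        "idle_failures": dict(idle_failures),
    }
```

On the excerpt above, the first warning is at 557.630564 and the lockup at 769.489579, so the delay is about 212 seconds.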

Comment 5 D. Hugh Redelmeier 2013-11-10 21:07:57 UTC
again

Comment 6 D. Hugh Redelmeier 2013-11-16 19:47:14 UTC
I downgraded to see if things would get better.
  kernel-3.10.11-200.fc19.x86_64.rpm
  kernel-devel-3.10.11-200.fc19.x86_64.rpm
  kernel-modules-extra-3.10.11-200.fc19.x86_64.rpm

While I was away from the keyboard, the screensaver presumably blanked the screen but now the display won't come back.

Unlike in the cases reported above, the desktop didn't restart and thus the programs I left running (firefox, xterm) are still running.

dmesg shows:
 [ 1697.326394] nouveau E[   PFIFO][0000:01:00.0] PFIFO: write fault at 0x0000e50000 [PAGE_NOT_PRESENT] from (unknown enum 0x00000000)/GPC0/(unknown enum 0x0000000f) on channel 0x003facf000 [unknown]

When I poke around trying to switch the VT, I get things like:
 [ 9001.669387] nouveau E[     DRM] GPU lockup - switching to software fbcon
and
 [ 9013.733748] nouveau E[   PDISP][0000:01:00.0][0xc000917c][ffff88030a5e0800] channel stalled

I have no idea if this is the same bug or a different one.  But downgrading the kernel doesn't leave me in a clearly better situation.

Comment 7 D. Hugh Redelmeier 2013-11-17 02:12:37 UTC
Rebooted with current kernel (kernel-3.11.7-200.fc19.x86_64).  Still using the nouveau driver.

Hangs while I'm not using the computer.  I imagine this lonely dmesg line is related:
[ 5536.460775] nouveau E[   PFIFO][0000:01:00.0] PFIFO: write fault at 0x00010a0000 [PAGE_NOT_PRESENT] from (unknown enum 0x00000000)/GPC0/(unknown enum 0x0000000f) on channel 0x003fa6f000 [unknown]
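
To check whether the write faults in this comment and the previous one share an address or channel, the fields can be extracted with a short sketch like this. The pattern is an assumption based only on the two fault lines quoted in this report, and `WRITE_FAULT` and `parse_fault` are illustrative names.

```python
import re

# Assumed shape, taken from the fault lines quoted in this report:
#   ... PFIFO: write fault at 0x00010a0000 [PAGE_NOT_PRESENT] from
#   (unknown enum 0x00000000)/GPC0/(unknown enum 0x0000000f)
#   on channel 0x003fa6f000 [unknown]
WRITE_FAULT = re.compile(
    r"PFIFO: (?P<access>\w+) fault at (?P<address>0x[0-9a-fA-F]+) "
    r"\[(?P<reason>\w+)\] from (?P<origin>.+?) "
    r"on channel (?P<channel>0x[0-9a-fA-F]+) \[(?P<owner>.*)\]\s*$"
)


def parse_fault(line):
    """Return the fields of a PFIFO fault line, or None if the line does not match."""
    m = WRITE_FAULT.search(line)
    if m is None:
        return None
    return {
        "access": m.group("access"),             # "write" in both reported lines
        "address": int(m.group("address"), 16),  # faulting address
        "reason": m.group("reason"),             # e.g. "PAGE_NOT_PRESENT"
        "origin": m.group("origin"),             # unit/client string, kept verbatim
        "channel": int(m.group("channel"), 16),  # channel address
        "owner": m.group("owner"),               # process tag, "unknown" here
    }
```

For the two faults reported here, both the addresses (0x0000e50000 and 0x00010a0000) and the channels (0x003facf000 and 0x003fa6f000) differ, while the reason and origin strings are identical.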

Comment 8 Maximilian Rehkopf 2013-12-13 21:52:22 UTC
I'm getting the exact same behaviour as described above (up to comment #6, including the same nouveau kernel module messages), with similar ways of reproducing (Restore Tabs in Firefox), although it doesn't occur every time.

 - graphics card is a Zotac GeForce GT 640 with 2GB of DDR3 RAM

 - DVI dual-link configuration at 2560x1440 screen resolution

 - running F19 with kernel-3.11.9-200.fc19.x86_64

The issue occurs quite rarely but in bursts. It seems that once it occurs, it will happen again quite soon after every reboot, until the machine is power cycled.

Comment 9 Richard 2013-12-17 13:33:07 UTC
I have the same issue. Sometimes the screen shows color bars before crashing. Sometimes it only turns grey and errors begin to be displayed. There is no way to restart normally.

 - Graphics card is an Nvidia GeForce GTX 660 with 2GB of GDDR5
 - Two monitors. I disabled the DVI one (1280x1024) and left only the HDMI one connected (1920x1080), since in another thread people seem to have issues with GNOME 3 and dual-display configurations. That didn't work, so I tried the reverse, with the same result.
 - Running F19 with the same kernel as Maximilian: kernel-3.11.9-200.fc19.x86_64.

Comment 10 Fedora End Of Life 2015-01-09 20:23:07 UTC
This message is a notice that Fedora 19 is now at end of life. Fedora 
has stopped maintaining and issuing updates for Fedora 19. It is 
Fedora's policy to close all bug reports from releases that are no 
longer maintained. Approximately 4 (four) weeks from now this bug will
be closed as EOL if it remains open with a Fedora 'version' of '19'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version' 
to a later Fedora version.

Thank you for reporting this issue and we are sorry that we were not 
able to fix it before Fedora 19 is end of life. If you would still like 
to see this bug fixed and are able to reproduce it against a later version 
of Fedora, you are encouraged to change the 'version' to a later Fedora
version before this bug is closed as described in the policy above.

Although we aim to fix as many bugs as possible during every release's 
lifetime, sometimes those efforts are overtaken by events. Often a 
more recent Fedora release includes newer upstream software that fixes 
bugs or makes them obsolete.

Comment 11 Fedora End Of Life 2015-02-17 17:50:01 UTC
Fedora 19 changed to end-of-life (EOL) status on 2015-01-06. Fedora 19 is
no longer maintained, which means that it will not receive any further
security or bug fix updates. As a result we are closing this bug.

If you can reproduce this bug against a currently maintained version of
Fedora please feel free to reopen this bug against that version. If you
are unable to reopen this bug, please file a new report against the
current release. If you experience problems, please add a comment to this
bug.

Thank you for reporting this bug and we are sorry it could not be fixed.