Bug 922435 - nouveau and kernel 3.8 breaks suspend/resume
Summary: nouveau and kernel 3.8 breaks suspend/resume
Keywords:
Status: CLOSED WONTFIX
Alias: None
Product: Fedora
Classification: Fedora
Component: xorg-x11-drv-nouveau
Version: 18
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Assignee: Ben Skeggs
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2013-03-16 21:11 UTC by Noel Duffy
Modified: 2014-02-05 20:02 UTC
CC List: 15 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2014-02-05 20:02:27 UTC
Type: Bug
Embargoed:


Attachments
Suspend log (8.24 KB, text/x-log)
2013-03-16 21:11 UTC, Noel Duffy
/var/log/messages from failed suspend (322.20 KB, text/plain)
2013-03-16 21:15 UTC, Noel Duffy

Description Noel Duffy 2013-03-16 21:11:32 UTC
Created attachment 711183 [details]
Suspend log

Description of problem:

Since kernel 3.8 my laptop will no longer suspend or resume. The logs show an error when nouveau is told to suspend the graphics card. Worse still, the laptop screen blanks and will not switch back on until the laptop is rebooted. Suspend and resume work fine in kernel 3.7.9-205.

Version-Release number of selected component (if applicable):

xorg-x11-drv-nouveau-1.0.6-1.fc18.x86_64
kernel-3.8.2-206.fc18.x86_64
kernel-3.8.1-201.fc18.x86_64

How reproducible: always.


Steps to Reproduce:
1. Boot kernel 3.8.1 or kernel 3.8.2. 
2. Suspend by closing lid or by running pm-suspend in a terminal.
3. Screen blanks, but laptop is not asleep. I can ssh to it.
  
Actual results: Laptop does not suspend


Expected results: Laptop should suspend.


Additional info:

pm-suspend.log shows this:

Running hook /usr/lib64/pm-utils/sleep.d/99video hibernate hibernate:
/usr/lib64/pm-utils/sleep.d/99video hibernate hibernate: success.

Sun Mar 17 01:09:58 NZDT 2013: performing hibernate
/usr/lib64/pm-utils/pm-functions: line 323: echo: write error: Device or resource busy
Sun Mar 17 01:10:11 NZDT 2013: Awake.
Sun Mar 17 01:10:11 NZDT 2013: Running hooks for thaw

/var/log/messages show this:

Mar 17 01:53:38 pariah kernel: [51772.393814] nouveau E[   PDISP][0000:02:00.0][0xc000887d][ffff8801152f2180] channel stalled
Mar 17 01:53:38 pariah kernel: [51774.649415] pci_pm_suspend(): nouveau_pmops_suspend+0x0/0x80 [nouveau] returns -16
Mar 17 01:53:38 pariah kernel: [51774.649421] dpm_run_callback(): pci_pm_suspend+0x0/0x140 returns -16
Mar 17 01:53:38 pariah kernel: [51774.649429] PM: Device 0000:02:00.0 failed to suspend async: error -16
Mar 17 01:53:38 pariah kernel: [51774.649477] PM: Some devices failed to suspend
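
For anyone triaging similar logs: the -16 returned by the suspend callbacks is a negated errno value, and on Linux errno 16 is EBUSY, the same "Device or resource busy" that pm-suspend.log reports. Below is a minimal sketch that pulls the failing device out of such lines and decodes the error. The function name decode_pm_failures and the regular expression are illustrative assumptions written against the lines quoted here, not part of any kernel or pm-utils tooling.

```python
import errno
import os
import re

# Assumed pattern, modelled on the PM core summary line quoted above, e.g.
# "PM: Device 0000:02:00.0 failed to suspend async: error -16"
PM_FAIL = re.compile(
    r"PM: Device (?P<dev>\S+) failed to (?P<phase>\w+)(?: async)?: "
    r"error (?P<err>-?\d+)"
)


def decode_pm_failures(lines):
    """Yield (device, phase, errno_name, message) for each PM failure line."""
    for line in lines:
        m = PM_FAIL.search(line)
        if m:
            code = abs(int(m.group("err")))  # kernel reports -errno
            name = errno.errorcode.get(code, "UNKNOWN")
            yield m.group("dev"), m.group("phase"), name, os.strerror(code)


sample = [
    "Mar 17 01:53:38 pariah kernel: [51774.649429] PM: Device 0000:02:00.0 "
    "failed to suspend async: error -16",
    "Mar 17 01:53:38 pariah kernel: [51774.649477] PM: Some devices failed "
    "to suspend",
]

for dev, phase, name, msg in decode_pm_failures(sample):
    print(f"{dev}: {phase} failed with {name} ({msg})")
```

The same pattern should also match the "failed to freeze" variant that shows up on the hibernate path, since only the phase word differs.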


Device 02:00 is:

02:00.0 VGA compatible controller: NVIDIA Corporation C79 [GeForce G102M] (rev b1) (prog-if 00 [VGA controller])
	Subsystem: ASUSTeK Computer Inc. Device 19b4
	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR+ FastB2B- DisINTx-
	Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
	Latency: 0, Cache Line Size: 64 bytes
	Interrupt: pin A routed to IRQ 23
	Region 0: Memory at fb000000 (32-bit, non-prefetchable) [size=16M]
	Region 1: Memory at e0000000 (64-bit, prefetchable) [size=256M]
	Region 3: Memory at f8000000 (64-bit, prefetchable) [size=32M]
	Region 5: I/O ports at dc00 [size=128]
	Expansion ROM at fafe0000 [disabled] [size=128K]
	Capabilities: [60] Power Management version 2
		Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
		Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME-
	Capabilities: [68] MSI: Enable- Count=1/1 Maskable- 64bit+
		Address: 0000000000000000  Data: 0000
	Kernel driver in use: nouveau

Comment 1 Noel Duffy 2013-03-16 21:15:17 UTC
Created attachment 711184 [details]
/var/log/messages from failed suspend


Comment 2 Noel Duffy 2013-03-17 00:49:44 UTC
I just realised that the /var/log/messages snippet I posted does not match the hibernate event in the pm-suspend.log excerpt above. The matching /var/log/messages snippet is:

Mar 17 01:10:10 pariah kernel: [49163.710976] nouveau E[   PDISP][0000:02:00.0][0xc000887c][ffff8801152f2380] fini timeout, 0x8e071008
Mar 17 01:10:10 pariah kernel: [49163.710978] nouveau E[   PDISP][0000:02:00.0][0xc000887c][ffff8801152f2380] failed suspend, -16
Mar 17 01:10:10 pariah kernel: [49163.710981] nouveau E[     DRM] 0xd1500000:0xd15c7c00 suspend failed with -16
Mar 17 01:10:10 pariah kernel: [49163.711030] nouveau E[     DRM] 0xdddddddd:0xd1500000 suspend failed with -16
Mar 17 01:10:10 pariah kernel: [49163.711236] nouveau E[     DRM] 0xffffffff:0xdddddddd suspend failed with -16
Mar 17 01:10:10 pariah kernel: [49163.720807] nouveau E[     DRM] 0xffffffff:0xffffffff suspend failed with -16
Mar 17 01:10:10 pariah kernel: [49163.721064] nouveau  [     DRM] resuming display...
Mar 17 01:10:10 pariah kernel: [49168.236803] pci_pm_freeze(): nouveau_pmops_freeze+0x0/0x20 [nouveau] returns -16
Mar 17 01:10:10 pariah kernel: [49168.236807] dpm_run_callback(): pci_pm_freeze+0x0/0xb0 returns -16
Mar 17 01:10:10 pariah kernel: [49168.236812] PM: Device 0000:02:00.0 failed to freeze async: error -16

Comment 3 Marek Zukal 2013-03-21 10:15:05 UTC
Same problems on GeForce 8400M GS, Dell Vostro 1310

Comment 4 Salvatore Filippone 2013-04-04 09:22:33 UTC
Same problem here, on an Asus with a G210M.

Comment 5 Noel Duffy 2013-04-13 12:21:52 UTC
Changing this to be against the kernel rather than nouveau. Although it is nouveau which reports the error, the version of the nouveau driver has not changed since kernel 3.7.9. I think that the bug lies in the kernel itself.

See also: http://lists.fedoraproject.org/pipermail/devel/2013-March/180260.html

Comment 7 Marek Zukal 2013-05-16 11:59:34 UTC
Seems to be working fine again in kernel 3.9.2 with my 8400M GS.

Comment 8 Salvatore Filippone 2013-05-16 14:57:27 UTC
(In reply to comment #7)
> Seems to be working fine again in kernel 3.9.2 with my 8400M GS.

Seems to be fine for me as well on a G210M

Comment 9 Salvatore Filippone 2013-05-16 15:56:34 UTC
(In reply to comment #8)
> (In reply to comment #7)
> > Seems to be working fine again in kernel 3.9.2 with my 8400M GS.
> 
> Seems to be fine for me as well on a G210M

Hmm. I spoke too early. If I have a laptop with an external screen, suspend, and then resume, after resume the screen control tool (xfce) does *not* show both internal and external screen. Which is obviously not very nice.

Comment 10 Salvatore Filippone 2013-05-16 18:01:55 UTC
(In reply to comment #9)
> (In reply to comment #8)
> > (In reply to comment #7)
> > > Seems to be working fine again in kernel 3.9.2 with my 8400M GS.
> > 
> > Seems to be fine for me as well on a G210M
> 
> Hmm. I spoke too early. If I have a laptop with an external screen, suspend,
> and then resume, after resume the screen control tool (xfce) does *not* show
> both internal and external screen. Which is obviously not very nice.

Even if I suspend from a configuration with only the laptop screen, after the suspend/resume the screen config tool aborts and does not work any longer. So not everything is right...

Comment 11 Noel Duffy 2013-05-20 10:21:32 UTC
Suspend and resume are working again on my G102M with kernel 3.9.2-200. I have not tried with an external monitor, but normal lid close and lid open appear to be working normally.

Comment 12 Noel Duffy 2013-05-21 10:06:34 UTC
While suspend/resume are now working normally, I noticed that after resuming, the nouveau driver becomes highly unstable. It crashed Xorg, leaving this in the logs:

May 21 21:41:24 pariah kernel: [157789.744322] nouveau E[  PGRAPH][0000:02:00.0] TRAP_TPDMA_2D - TP 0 - Unknown fault at address 0042c89000
May 21 21:41:24 pariah kernel: [157789.744329] nouveau E[  PGRAPH][0000:02:00.0] TRAP_TPDMA_2D - TP 0 - e0c: 00000000, e18: 00000000, e1c: 0000029e, e20: 00000011, e24: 0c030000
May 21 21:41:24 pariah kernel: [157789.744332] nouveau E[  PGRAPH][0000:02:00.0]  TRAP
May 21 21:41:24 pariah kernel: [157789.744338] nouveau E[  PGRAPH][0000:02:00.0] ch 2 [0x001fb45000 Xorg[17946]] subc 2 class 0x502d mthd 0x08dc data 0x0000005f
May 21 21:41:24 pariah kernel: [157789.744346] nouveau E[     PFB][0000:02:00.0] trapped write at 0x0042c89000 on channel 0x0001fb45 [Xorg[17946]] PGRAPH/PROP/DST2D reason: PAGE_NOT_PRESENT
May 21 21:41:24 pariah kernel: [157789.744360] nouveau E[  PGRAPH][0000:02:00.0] TRAP_TPDMA_2D - TP 0 - Unknown fault at address 0042cb5600
May 21 21:41:24 pariah kernel: [157789.744363] nouveau E[  PGRAPH][0000:02:00.0] TRAP_TPDMA_2D - TP 0 - e0c: 00000000, e18: 00000000, e1c: 005802a8, e20: 00000011, e24: 0c030000
May 21 21:41:24 pariah kernel: [157789.744365] nouveau E[  PGRAPH][0000:02:00.0]  TRAP
May 21 21:41:24 pariah kernel: [157789.744369] nouveau E[  PGRAPH][0000:02:00.0] ch 2 [0x001fb45000 Xorg[17946]] subc 2 class 0x502d mthd 0x08dc data 0x0000005f
May 21 21:41:24 pariah kernel: [157789.744377] nouveau E[     PFB][0000:02:00.0] trapped write at 0x0042ce0000 on channel 0x0001fb45 [Xorg[17946]] PGRAPH/PROP/DST2D reason: PAGE_NOT_PRESENT
May 21 21:41:24 pariah kernel: [157789.744390] nouveau E[  PGRAPH][0000:02:00.0] TRAP_TPDMA_2D - TP 0 - Unknown fault at address 0042d0b000
May 21 21:41:24 pariah kernel: [157789.744394] nouveau E[  PGRAPH][0000:02:00.0] TRAP_TPDMA_2D - TP 0 - e0c: 00000000, e18: 00000000, e1c: 00c002a8, e20: 00000011, e24: 0c030000
May 21 21:41:24 pariah kernel: [157789.744396] nouveau E[  PGRAPH][0000:02:00.0]  TRAP
May 21 21:41:24 pariah kernel: [157789.744400] nouveau E[  PGRAPH][0000:02:00.0] ch 2 [0x001fb45000 Xorg[17946]] subc 2 class 0x502d mthd 0x08dc data 0x0000005f
May 21 21:41:24 pariah kernel: [157789.744406] nouveau E[     PFB][0000:02:00.0] trapped write at 0x0042d0b900 on channel 0x0001fb45 [Xorg[17946]] PGRAPH/PROP/DST2D reason: PAGE_NOT_PRESENT
May 21 21:41:24 pariah kernel: [157789.744420] nouveau E[  PGRAPH][0000:02:00.0] TRAP_TPDMA_2D - TP 0 - Unknown fault at address 0042d35900
May 21 21:41:24 pariah kernel: [157789.744423] nouveau E[  PGRAPH][0000:02:00.0] TRAP_TPDMA_2D - TP 0 - e0c: 00000000, e18: 00000000, e1c: 0124029e, e20: 00000011, e24: 0c030000
May 21 21:41:24 pariah kernel: [157789.744425] nouveau E[  PGRAPH][0000:02:00.0]  TRAP
May 21 21:41:24 pariah kernel: [157789.744429] nouveau E[  PGRAPH][0000:02:00.0] ch 2 [0x001fb45000 Xorg[17946]] subc 2 class 0x502d mthd 0x08dc data 0x0000005f
May 21 21:41:24 pariah kernel: [157789.744435] nouveau E[     PFB][0000:02:00.0] trapped write at 0x0042d61000 on channel 0x0001fb45 [Xorg[17946]] PGRAPH/PROP/DST2D reason: PAGE_NOT_PRESENT
May 21 21:41:24 pariah kernel: [157789.744448] nouveau E[  PGRAPH][0000:02:00.0] TRAP_TPDMA_2D - TP 0 - Unknown fault at address 0042d8b000
May 21 21:41:24 pariah kernel: [157789.744452] nouveau E[  PGRAPH][0000:02:00.0] TRAP_TPDMA_2D - TP 0 - e0c: 00000000, e18: 00000000, e1c: 0180029e, e20: 00000011, e24: 0c030000

This was happening regularly before suspend/resume broke things. During the weeks when I could not suspend or resume, this bug did not surface once. Now that I am suspending and resuming again, it's back.
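
To compare how often these faults occur before and after a resume, the PFB lines can be tallied by process and reason. The sketch below is only an illustration: the regular expression is an assumption modelled on the "trapped write" lines quoted in this comment, and summarize_faults is a hypothetical helper name, not an existing tool.

```python
import re
from collections import Counter

# Assumed pattern for nouveau PFB fault lines as they appear in this comment.
PFB_TRAP = re.compile(
    r"trapped (?P<op>read|write) at 0x(?P<addr>[0-9a-f]+) "
    r"on channel 0x(?P<chan>[0-9a-f]+) "
    r"\[(?P<proc>[^\[\]]+)\[(?P<pid>\d+)\]\] "
    r"(?P<unit>\S+) reason: (?P<reason>\w+)"
)


def summarize_faults(lines):
    """Count PFB faults keyed by (process name, operation, reason)."""
    counts = Counter()
    for line in lines:
        m = PFB_TRAP.search(line)
        if m:
            key = (m.group("proc"), m.group("op"), m.group("reason"))
            counts[key] += 1
    return counts


sample = [
    "nouveau E[     PFB][0000:02:00.0] trapped write at 0x0042c89000 on "
    "channel 0x0001fb45 [Xorg[17946]] PGRAPH/PROP/DST2D reason: "
    "PAGE_NOT_PRESENT",
    "nouveau E[  PGRAPH][0000:02:00.0]  TRAP",
    "nouveau E[     PFB][0000:02:00.0] trapped write at 0x0042ce0000 on "
    "channel 0x0001fb45 [Xorg[17946]] PGRAPH/PROP/DST2D reason: "
    "PAGE_NOT_PRESENT",
]

for (proc, op, reason), n in summarize_faults(sample).items():
    print(f"{proc}: {n} x {op} fault(s), reason {reason}")
```

Running it over log excerpts from before and after a suspend/resume cycle would give a rough count to attach to a report instead of pasting every line.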

Comment 13 Joel Uckelman 2013-06-03 11:18:54 UTC
I'm seeing something similar to Comment 12 after resuming. I get these as fast as they can be printed until I shut down using CTRL-ALT-DELETE:

Jun  3 13:00:42 scylla kernel: [ 2378.308846] nouveau E[  PGRAPH][0000:01:00.0] SHADER 0xa004021e
Jun  3 13:00:42 scylla kernel: [ 2378.308909] nouveau E[  PGRAPH][0000:01:00.0] TRAP ch 1 [0x003fe10000 Xorg[1027]]
Jun  3 13:00:42 scylla kernel: [ 2378.308917] nouveau E[  PGRAPH][0000:01:00.0] SHADER 0xa004021e
Jun  3 13:00:42 scylla kernel: [ 2378.308983] nouveau E[  PGRAPH][0000:01:00.0] TRAP ch 1 [0x003fe10000 Xorg[1027]]

Comment 14 salvatore.filippone@uniroma2.it 2013-06-12 07:49:40 UTC
With kernel 3.9.4-200 the system resumes, but the xfce4 display settings application does not work: it shows no display controls at all. I do not see anything in either the X or the kernel log.

Comment 15 ahxyz 2013-08-03 21:44:49 UTC
Hi, I'm running Ubuntu 13.04, EFI boot (not BIOS/CSM) on Macbook Pro 6.2, kernel Linux mbp 3.8.0-27-generic #40-Ubuntu SMP Tue Jul 9 00:17:05 UTC 2013 x86_64 x86_64 x86_64 GNU/Linux.

Graphic HW:

00:02.0 VGA compatible controller: Intel Corporation Core Processor Integrated Graphics Controller (rev 18)
01:00.0 VGA compatible controller: NVIDIA Corporation GT216M [GeForce GT 330M] (rev a2)

On my machine, suspend doesn't work at all on lid close.
Suspending with the button from the menu works, however.
Resuming is a problem: the screen doesn't come up properly.
I can switch to a console with ctrl+f2, and after a certain timeout I see many lines like this on the console:

[  346.905150] nouveau E[  PGRAPH][0000:01:00.0] ch -1 [0x000fb3a000] subc 2 class 0x0000 mthd 0x0860 data 0xff4c4c4c
[  346.905161] nouveau E[  PGRAPH][0000:01:00.0]  ILLEGAL_MTHD ILLEGAL_CLASS
[  346.905163] nouveau E[  PGRAPH][0000:01:00.0] ch -1 [0x000fb3a000] subc 2 class 0x0000 mthd 0x0860 data 0xff4c4c4c
[  346.905174] nouveau E[  PGRAPH][0000:01:00.0]  ILLEGAL_MTHD ILLEGAL_CLASS

Then Xorg is restarted and I see the login screen, but when I try to log in the machine hangs completely; even ctrl+f2 no longer works.

Currently I don't have the latest kernel installed, but I previously tried stable 3.10.4, and it looks like the problem persists there too.

Comment 16 Fedora End Of Life 2013-12-21 12:12:52 UTC
This message is a reminder that Fedora 18 is nearing its end of life.
Approximately 4 (four) weeks from now Fedora will stop maintaining
and issuing updates for Fedora 18. It is Fedora's policy to close all
bug reports from releases that are no longer maintained. At that time
this bug will be closed as WONTFIX if it remains open with a Fedora 
'version' of '18'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version' 
to a later Fedora version prior to Fedora 18's end of life.

Thank you for reporting this issue and we are sorry that we may not be 
able to fix it before Fedora 18 is end of life. If you would still like 
to see this bug fixed and are able to reproduce it against a later version 
of Fedora, you are encouraged to change the 'version' to a later Fedora 
version prior to Fedora 18's end of life.

Although we aim to fix as many bugs as possible during every release's 
lifetime, sometimes those efforts are overtaken by events. Often a 
more recent Fedora release includes newer upstream software that fixes 
bugs or makes them obsolete.

Comment 17 Fedora End Of Life 2014-02-05 20:02:27 UTC
Fedora 18 changed to end-of-life (EOL) status on 2014-01-14. Fedora 18 is
no longer maintained, which means that it will not receive any further
security or bug fix updates. As a result we are closing this bug.

If you can reproduce this bug against a currently maintained version of
Fedora please feel free to reopen this bug against that version. If you
are unable to reopen this bug, please file a new report against the
current release. If you experience problems, please add a comment to this
bug.

Thank you for reporting this bug and we are sorry it could not be fixed.

