Bug 601002

Summary: nouveau screen stays black after resume (parsing INIT_ZM_I2C_BYTE fails)
Product: [Fedora] Fedora
Reporter: Mike Snitzer <msnitzer>
Component: kernel
Assignee: Ben Skeggs <bskeggs>
Status: CLOSED RAWHIDE
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: medium
Docs Contact:
Priority: low
Version: rawhide
CC: airlied, anton, dougsland, gansalmon, itamar, jglisse, jonathan, kernel-maint
Target Milestone: ---   
Target Release: ---   
Hardware: All   
OS: Linux   
Whiteboard:
Fixed In Version: kernel-2.6.35-0.54.rc5.git7.fc14
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: 569074
Environment:
Last Closed: 2010-07-22 04:01:34 UTC
Type: ---
Attachments:
Description                                                    Flags
drm.debug=14 messages log                                      none
vbios                                                          none
gzip'd dmesg output                                            none
gzip'd post.iolog                                              none
vbios.rom from 2.6.35-0.36.rc4.git5.fc14.x86_64                none
dmesg from 2.6.35-0.36.rc4.git5.fc14.x86_64 w/ drm.debug=14    none

Description Mike Snitzer 2010-06-07 00:23:59 UTC
+++ This bug was initially created as a clone of Bug #569074 +++

Description of problem:
Resuming after suspend fails on this Ideapad Y550 which has an NV50 (NVIDIA GeForce GT 130M).

Version-Release number of selected component (if applicable):
2.6.34-20.fc14.x86_64

How reproducible:
always

Steps to Reproduce:
1. suspend
2. resume
3. screen stays black (INIT_ZM_I2C_BYTE fails)

--- Additional comment from bskeggs on 2010-06-06 19:03:02 EDT ---

(In reply to comment #6)
> (In reply to comment #5)
> 
> Bad news is the display still stays blank after resume.  Here are the details
> from the log:
> 
>  kernel: [drm] nouveau 0000:01:00.0: 0xD62F: Failed parsing init table opcode:
> INIT_ZM_I2C_BYTE -6
This is likely the reason why.  I'd say there's more setup for your card in that init table that we're skipping because INIT_ZM_I2C_BYTE fails.  Can you file a new bug report against rawhide to track this issue, please?

It'd be great if you could include your dmesg output after a suspend/resume with "drm.debug=14 log_buf_len=1M" appended to your boot options, as well as a vbios image (I may want vbtracetool traces later too, but we'll discuss that when you open a new bug).

Thanks!

Comment 1 Mike Snitzer 2010-06-07 01:09:06 UTC
Created attachment 421689 [details]
drm.debug=14 messages log

Doesn't seem like drm.debug=14 increased the verbosity of the drm debugging.  Not sure what I'm missing.

Comment 2 Ben Skeggs 2010-06-07 01:17:59 UTC
Yeah, you need to use the "dmesg" command.  The debug level messages don't make it to /var/log/messages.
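
A minimal sketch of that capture step, assuming the boot options requested earlier ("drm.debug=14 log_buf_len=1M") are on the kernel command line. The helper name has_drm_debug and the output file name are illustrative only, not part of any tool:

```shell
# Succeeds if the given kernel command line carries a drm.debug= option.
# has_drm_debug is an illustrative helper name (an assumption, not a real tool).
has_drm_debug() {
    case " $1 " in
        *" drm.debug="*) return 0 ;;
        *) return 1 ;;
    esac
}

# Possible use after a suspend/resume cycle (output file name is arbitrary):
#   has_drm_debug "$(cat /proc/cmdline)" && dmesg | gzip > dmesg.txt.gz
```

Checking the command line first avoids attaching a log that was captured without the debug option in effect.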

Comment 3 Mike Snitzer 2010-06-07 01:38:00 UTC
Created attachment 421692 [details]
vbios

vbios collected with: sudo ./vbtracetool -w 2>/tmp/vbios.rom

Comment 4 Mike Snitzer 2010-06-07 01:54:26 UTC
Created attachment 421695 [details]
gzip'd dmesg output

dmesg with drm.debug=14

This dmesg also shows that a NULL pointer dereference (comparable to bz#569074) still occurs, even though rawhide's 2.6.34 kernel uses 2.6.35-rc1's drm and ttm:

BUG: unable to handle kernel NULL pointer dereference at 0000000000000028
IP: [<ffffffffa007e701>] nouveau_ttm_io_mem_reserve+0x8d/0xb5 [nouveau]

Comment 5 Ben Skeggs 2010-06-08 01:11:18 UTC
Thanks for that :)

Can I also get "./vbtracetool -lp 2>post.iolog" from you please?

Comment 6 Mike Snitzer 2010-06-16 23:18:38 UTC
Created attachment 424609 [details]
gzip'd post.iolog

Collected with: ./vbtracetool -lp 2>post.iolog

But it blanked the screen as soon as I issued the command.  I then rebooted and this file is what remained.

Comment 7 Ben Skeggs 2010-07-14 03:46:39 UTC
This is hopefully fixed in http://koji.fedoraproject.org/koji/buildinfo?buildID=183345

Can you confirm?

Comment 8 Mike Snitzer 2010-07-14 23:49:19 UTC
(In reply to comment #7)
> This is hopefully fixed in
> http://koji.fedoraproject.org/koji/buildinfo?buildID=183345
> 
> Can you confirm?    

Unfortunately, the screen still stays black on resume:

nouveau 0000:01:00.0: power state changed by ACPI to D0
nouveau 0000:01:00.0: power state changed by ACPI to D0
nouveau 0000:01:00.0: power state changed by ACPI to D0
nouveau 0000:01:00.0: power state changed by ACPI to D0
nouveau 0000:01:00.0: PCI INT A -> GSI 16 (level, low) -> IRQ 16
nouveau 0000:01:00.0: setting latency timer to 64
[drm] nouveau 0000:01:00.0: POSTing device...
[drm] nouveau 0000:01:00.0: Parsing VBIOS init table 0 at offset 0xD359
[drm] nouveau 0000:01:00.0: 0xD62F: i2c wr fail: -6
[drm] nouveau 0000:01:00.0: Parsing VBIOS init table 1 at offset 0xD8A4
[drm] nouveau 0000:01:00.0: Parsing VBIOS init table 2 at offset 0xE3D3
[drm] nouveau 0000:01:00.0: Parsing VBIOS init table 3 at offset 0xE408
[drm] nouveau 0000:01:00.0: Parsing VBIOS init table 4 at offset 0xE59C
[drm] nouveau 0000:01:00.0: Parsing VBIOS init table at offset 0xE601
[drm] nouveau 0000:01:00.0: Couldn't find matching output script table
[drm] nouveau 0000:01:00.0: 0xC078: parsing output script 0
[drm] nouveau 0000:01:00.0: Reinitialising engines...
[drm] nouveau 0000:01:00.0: Restoring GPU objects...
[drm] nouveau 0000:01:00.0: Restoring mode...
[drm] nouveau 0000:01:00.0: Couldn't find matching output script table
[drm] nouveau 0000:01:00.0: Couldn't find matching output script table
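
For comparing logs between kernels, the failure lines can be pulled out of a full dmesg with a small filter. This is only a sketch: the function name nouveau_resume_errors is hypothetical, and the patterns are taken from the messages quoted in this bug rather than from any complete list of nouveau errors:

```shell
# Reads kernel log text on stdin and prints only the nouveau lines that
# report the failures seen in this bug. The function name is illustrative.
nouveau_resume_errors() {
    grep -E 'nouveau .*(i2c wr fail|Failed parsing init table opcode|find matching output script table)'
}

# Possible use: dmesg | nouveau_resume_errors
```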

Comment 9 Ben Skeggs 2010-07-15 00:04:04 UTC
Hmm, maybe a new issue has snuck in too.

[drm] nouveau 0000:01:00.0: Couldn't find matching output script table
[drm] nouveau 0000:01:00.0: Couldn't find matching output script table 

These weren't in previous logs.

Could I see a drm.debug=14 log again please :)

Comment 10 Ben Skeggs 2010-07-15 03:47:26 UTC
Also, with nouveau loaded, can you mount debugfs (mount -t debugfs debugfs /sys/kernel/debug) and attach /sys/kernel/debug/dri/0/vbios.rom to this bug?

Comment 11 Mike Snitzer 2010-07-16 05:44:43 UTC
Created attachment 432301 [details]
vbios.rom from 2.6.35-0.36.rc4.git5.fc14.x86_64

# cat /sys/kernel/debug/dri/0/name 
nouveau 0000:01:00.0 pci:0000:01:00.0

# cp /sys/kernel/debug/dri/0/vbios.rom vbios.rom.`uname -r`
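
A sketch of the same collection step that first checks whether debugfs is already mounted. The helper name debugfs_mountpoint is hypothetical, and the card index 0 is assumed from the output above:

```shell
# Prints the mount point of the first debugfs entry in a mounts table
# (default /proc/mounts); prints nothing if debugfs is not mounted.
# debugfs_mountpoint is an illustrative helper name.
debugfs_mountpoint() {
    awk '$3 == "debugfs" { print $2; exit }' "${1:-/proc/mounts}"
}

# Possible use (as root):
#   dir=$(debugfs_mountpoint)
#   [ -n "$dir" ] || { mount -t debugfs debugfs /sys/kernel/debug; dir=/sys/kernel/debug; }
#   cp "$dir/dri/0/vbios.rom" "vbios.rom.$(uname -r)"
```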

Comment 12 Mike Snitzer 2010-07-16 05:55:26 UTC
Created attachment 432306 [details]
dmesg from 2.6.35-0.36.rc4.git5.fc14.x86_64 w/ drm.debug=14

Suspend and resume were performed w/ drm.debug=14

Comment 13 Mike Snitzer 2010-07-16 05:59:24 UTC
Ben, just FYI: I also tried the RHEL6 kernel from rhbz#596679, and after resume with that kernel I got a rapidly flickering white screen (rather than the usual all-black screen).

Comment 14 Ben Skeggs 2010-07-16 06:36:56 UTC
Mike, thank you for all of that!  A sneaky regression did indeed sneak in for LVDS on some chipsets.

kernel-2.6.34.1-15.fc13 is building in koji at the moment, and will hopefully fix the problem.

http://koji.fedoraproject.org/koji/taskinfo?taskID=2323487

Comment 15 Mike Snitzer 2010-07-16 22:37:47 UTC
(In reply to comment #14)
> Mike, thank you for all of that!  A sneaky regression did indeed sneak in for
> LVDS on some chipsets.
> 
> kernel-2.6.34.1-15.fc13 is building in koji at the moment, and will hopefully
> fix the problem.
> 
> http://koji.fedoraproject.org/koji/taskinfo?taskID=2323487    

Works great, thanks Ben!

Comment 16 Chuck Ebbert 2010-07-22 02:39:17 UTC
Is this fixed in rawhide 2.6.35-rc kernels?

Comment 17 Ben Skeggs 2010-07-22 04:01:34 UTC
It's fixed in kernel-2.6.35-0.54.rc5.git7.fc14