Bug 1351205 - Blank screen with kernel 4.6.3
Summary: Blank screen with kernel 4.6.3
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Fedora
Classification: Fedora
Component: xorg-x11-drv-nouveau
Version: 24
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: urgent
Target Milestone: ---
Assignee: Ben Skeggs
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks: 1364276
 
Reported: 2016-06-29 13:06 UTC by Sammy
Modified: 2016-08-05 02:13 UTC (History)
CC List: 15 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Clones: 1364276
Environment:
Last Closed: 2016-07-19 22:21:14 UTC
Type: Bug
Embargoed:


Attachments
journalctl with 3.5.7 (274.92 KB, text/x-vhdl)
2016-06-30 23:03 UTC, Sammy
journalctl with non-working 4.6.3 (202.68 KB, text/x-vhdl)
2016-06-30 23:04 UTC, Sammy
No patch 1, patch2 not reverted (202.10 KB, text/plain)
2016-07-01 13:07 UTC, Sammy
Patch 1 present, patch2 reverted (208.02 KB, text/plain)
2016-07-01 13:08 UTC, Sammy
journalctl -b -k -p err (7.74 KB, text/plain)
2016-07-21 00:45 UTC, Philip Kovacs

Description Sammy 2016-06-29 13:06:57 UTC
One of my computers, with an NVIDIA GK107 (0e73d0a2) graphics card, fails to
bring up the display during boot and afterwards. After the GRUB screen the monitor
goes into sleep mode and stays that way. Xorg is still functioning, and the log
files from the two kernels are identical.

After switching to kernel 4.5.7, the boot proceeds normally.

In the system log, the nouveau identification is correct for both kernels, except
that the 4.6.3 kernel has multiple lines of:

nouveau 0000:03:00.0: disp: outp 06:0006:0f48: link training failed

messages. The system is fc24 with all updates and updates-testing applied. I have
3 other computers, identical except for having different NVIDIA cards, that are
working fine. Searching around, I came across:

https://bugs.freedesktop.org/show_bug.cgi?id=91954

which was supposedly resolved earlier. It looks like something has changed in
4.6.x.
Thanks

Comment 1 Sammy 2016-06-30 15:22:31 UTC
I have rebuilt the kernel-4.6.3 with two changes and now it works. The changes
were:

1. Remove patch "0003-drm-nouveau-disp-sor-gf119-both-links-use-the-same-t.patch"

2. Reverted an older patch (see below).

However, later I realized that the reverted patch in item (2) was already applied
in 4.5.x, which was working (4.5.7) in our case. So, the culprit seems to be
the patch in item (1).

This particular GPU, an NVIDIA GK107 (chipset NVE7), has a connector attached that
fans out to 4 HDMI connectors. I am not an expert in this, but if you want me to
send more info, please tell me how to get it.

Thanks.

Reverted patch was:

=====================================================================
From 95664e66fad964c3dd7945d6edfb1d0931844664 Mon Sep 17 00:00:00 2001
From: Ben Skeggs <bskeggs>
Date: Thu, 18 Feb 2016 08:14:19 +1000
Subject: drm/nouveau/disp/dp: ensure sink is powered up before attempting link
 training

This can happen under some annoying circumstances, and is a quick fix
until more substantial changes can be made.

Fixed eDP mode changes on (at least) the Lenovo P50.

Signed-off-by: Ben Skeggs <bskeggs>
Cc: stable.org
---
 drivers/gpu/drm/nouveau/nvkm/engine/disp/dport.c | 10 ++++++++++
 drivers/gpu/drm/nouveau/nvkm/engine/disp/dport.h |  6 ++++++
 2 files changed, 16 insertions(+)

diff --git a/drivers/gpu/drm/nouveau/nvkm/engine/disp/dport.c b/drivers/gpu/drm/nouveau/nvkm/engine/disp/dport.c
index 74e2f7c..9688970 100644
--- a/drivers/gpu/drm/nouveau/nvkm/engine/disp/dport.c
+++ b/drivers/gpu/drm/nouveau/nvkm/engine/disp/dport.c
@@ -328,6 +328,7 @@ nvkm_dp_train(struct work_struct *w)
                .outp = outp,
        }, *dp = &_dp;
        u32 datarate = 0;
+       u8  pwr;
        int ret;
 
        if (!outp->base.info.location && disp->func->sor.magic)
@@ -355,6 +356,15 @@ nvkm_dp_train(struct work_struct *w)
        /* disable link interrupt handling during link training */
        nvkm_notify_put(&outp->irq);
 
+       /* ensure sink is not in a low-power state */
+       if (!nvkm_rdaux(outp->aux, DPCD_SC00, &pwr, 1)) {
+               if ((pwr & DPCD_SC00_SET_POWER) != DPCD_SC00_SET_POWER_D0) {
+                       pwr &= ~DPCD_SC00_SET_POWER;
+                       pwr |=  DPCD_SC00_SET_POWER_D0;
+                       nvkm_wraux(outp->aux, DPCD_SC00, &pwr, 1);
+               }
+       }
+
        /* enable down-spreading and execute pre-train script from vbios */
        dp_link_train_init(dp, outp->dpcd[3] & 0x01);
 
diff --git a/drivers/gpu/drm/nouveau/nvkm/engine/disp/dport.h b/drivers/gpu/drm/nouveau/nvkm/engine/disp/dport.h
index 9596290..6e10c5e 100644
--- a/drivers/gpu/drm/nouveau/nvkm/engine/disp/dport.h
+++ b/drivers/gpu/drm/nouveau/nvkm/engine/disp/dport.h
@@ -71,5 +71,11 @@
 #define DPCD_LS0C_LANE1_POST_CURSOR2                                       0x0c
 #define DPCD_LS0C_LANE0_POST_CURSOR2                                       0x03
 
+/* DPCD Sink Control */
+#define DPCD_SC00                                                       0x00600
+#define DPCD_SC00_SET_POWER                                                0x03
+#define DPCD_SC00_SET_POWER_D0                                             0x01
+#define DPCD_SC00_SET_POWER_D3                                             0x03
+
 void nvkm_dp_train(struct work_struct *);
 #endif
-- 
cgit v0.12
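
For readers unfamiliar with the DPCD power field, the check added by the patch above can be looked at in isolation. The following is a minimal standalone sketch, not kernel code: the helper names `sink_needs_wake` and `sink_wake_value` are hypothetical, the AUX read/write (`nvkm_rdaux`/`nvkm_wraux`) is left out, and only the three `DPCD_SC00_*` values are taken from the dport.h hunk.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Values copied from the dport.h hunk in the patch above. */
#define DPCD_SC00_SET_POWER    0x03 /* mask for the SET_POWER field */
#define DPCD_SC00_SET_POWER_D0 0x01 /* normal operation */
#define DPCD_SC00_SET_POWER_D3 0x03 /* low-power state */

/* Hypothetical helper: true when the SET_POWER field is anything but D0. */
static bool sink_needs_wake(uint8_t sc00)
{
    return (sc00 & DPCD_SC00_SET_POWER) != DPCD_SC00_SET_POWER_D0;
}

/*
 * Hypothetical helper: the value the patch would write back.
 * The SET_POWER field is forced to D0; all other bits are preserved.
 */
static uint8_t sink_wake_value(uint8_t sc00)
{
    uint8_t out = (uint8_t)(sc00 & (uint8_t)~DPCD_SC00_SET_POWER);

    out |= DPCD_SC00_SET_POWER_D0;
    assert(!sink_needs_wake(out)); /* sanity check: result is always D0 */
    return out;
}
```

In the patch itself this logic only runs when the AUX read of the register succeeds, and the write-back is skipped when the sink already reports D0.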

Comment 2 Ben Skeggs 2016-06-30 22:45:26 UTC
Can you post your kernel log for me please?

Comment 3 Sammy 2016-06-30 23:03:10 UTC
Created attachment 1174752 [details]
journalctl with 3.5.7

Comment 4 Sammy 2016-06-30 23:04:06 UTC
Created attachment 1174753 [details]
journalctl with non-working 4.6.3

Comment 5 Ben Skeggs 2016-06-30 23:12:46 UTC
Ok, from an initial look, the "0003-drm-nouveau-disp-sor-gf119-both-links-use-the-same-t.patch" patch shouldn't have caused this issue.  Can you confirm it works with just the second patch reverted?

Comment 6 Sammy 2016-06-30 23:21:19 UTC
I will have to do this tomorrow at the office. I built a kernel with just the
opposite: I removed the gf119 patch but did not revert the second one.
Will test both tomorrow.

Comment 7 Sammy 2016-07-01 13:05:56 UTC
I have done the two tests. The culprit still seems to be the first patch.
Here is the summary:

1. Patch 1 removed, patch2 reverted - WORKING
2. Patch 1 removed, patch2 not reverted - WORKING
3. Patch 1 applied, patch2 reverted - NOT WORKING

I am attaching journalctl for 2 and 3.

Comment 8 Sammy 2016-07-01 13:07:18 UTC
Created attachment 1174919 [details]
No patch 1, patch2 not reverted

Comment 9 Sammy 2016-07-01 13:08:04 UTC
Created attachment 1174921 [details]
Patch 1 present, patch2 reverted

Comment 10 Bill 2016-07-05 19:20:35 UTC
Also experiencing this on a Dell Precision T5500 with an NVIDIA GTX460. At this time I just plan to reinstall and install the NVIDIA driver before I do the update. It was a fresh install to begin with. I'll post again to say how it went.

Comment 11 Ben Skeggs 2016-07-05 21:01:38 UTC
Doh.  I think I know what's happening there now.  I've done this fix for it:

https://github.com/skeggsb/linux/commit/217215041b9285af2193a755b56a8f3ed408bfe2

It's been sent for inclusion, and will make its way into a stable kernel update.

Please let me know if it doesn't help!

Comment 12 Sammy 2016-07-05 21:04:11 UTC
Thanks. Will try when it is out.

Comment 13 Josh Boyer 2016-07-06 13:18:17 UTC
(In reply to Ben Skeggs from comment #11)
> Doh.  I think I know what's happening there now.  I've done this fix for it:
> 
> https://github.com/skeggsb/linux/commit/217215041b9285af2193a755b56a8f3ed408bfe2
> 
> It's been sent for inclusion, and will make its way into a stable kernel
> update.
> 
> Please let me know if it doesn't help!

Do you want us to pull this into the Fedora kernel now?  It will likely be a while before Greg cuts another stable release, and this commit isn't in Linus' tree yet either.

Comment 14 Ben Skeggs 2016-07-06 22:05:25 UTC
(In reply to Josh Boyer from comment #13)
> Do you want us to pull this into the Fedora kernel now?  It will likely be a
> while before Greg cuts another stable release, and this commit isn't in
> Linus' tree yet either.

That would be great if it's going to be a while before the next stable.  I've sent the patch to airlied, he doesn't appear to have picked it up yet though.

Comment 15 Josh Boyer 2016-07-07 12:29:32 UTC
(In reply to Ben Skeggs from comment #14)
> That would be great if it's going to be a while before the next stable. 
> I've sent the patch to airlied, he doesn't appear to have picked it up yet
> though.

Should be in the next build of rawhide-f23.

Comment 16 Sammy 2016-07-12 16:26:48 UTC
Tested 4.6.4-301.fc24 from koji and the problem is solved. Thanks.

Comment 17 Fedora Update System 2016-07-12 18:31:34 UTC
kernel-4.6.4-301.fc24 has been submitted as an update to Fedora 24. https://bodhi.fedoraproject.org/updates/FEDORA-2016-9a16b2e14e

Comment 18 Fedora Update System 2016-07-12 18:34:03 UTC
kernel-4.6.4-201.fc23 has been submitted as an update to Fedora 23. https://bodhi.fedoraproject.org/updates/FEDORA-2016-784d5526d8

Comment 19 Fedora Update System 2016-07-14 01:24:28 UTC
kernel-4.6.4-201.fc23 has been pushed to the Fedora 23 testing repository. If problems still persist, please make note of it in this bug report.
See https://fedoraproject.org/wiki/QA:Updates_Testing for
instructions on how to install test updates.
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2016-784d5526d8

Comment 20 Fedora Update System 2016-07-14 01:55:15 UTC
kernel-4.6.4-301.fc24 has been pushed to the Fedora 24 testing repository. If problems still persist, please make note of it in this bug report.
See https://fedoraproject.org/wiki/QA:Updates_Testing for
instructions on how to install test updates.
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2016-9a16b2e14e

Comment 21 Fedora Update System 2016-07-19 22:19:56 UTC
kernel-4.6.4-201.fc23 has been pushed to the Fedora 23 stable repository. If problems still persist, please make note of it in this bug report.

Comment 22 Fedora Update System 2016-07-20 00:22:08 UTC
kernel-4.6.4-301.fc24 has been pushed to the Fedora 24 stable repository. If problems still persist, please make note of it in this bug report.

Comment 23 Philip Kovacs 2016-07-21 00:42:38 UTC
Still broken here on fed 23.

uname -a
Linux porthos 4.6.4-201.fc23.x86_64 #1 SMP Tue Jul 12 11:43:59 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux

journalctl -b -k -p err >bootlog.txt

is attached.

Comment 24 Philip Kovacs 2016-07-21 00:45:05 UTC
Created attachment 1182303 [details]
journalctl -b -k -p err

Comment 25 Jared Sprague 2016-08-05 02:13:59 UTC
This is also still happening for me with kernel 4.6.4-201 and a GeForce GTX 970. Since this bug was closed, I cloned it here: https://bugzilla.redhat.com/show_bug.cgi?id=1364276

Sorry if that was wrong; this is my first time submitting a Fedora bug.

