Bug 1463368 - [GM206] DRM: EVO timeout
Status: CLOSED WORKSFORME
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: xorg-x11-drv-nouveau
Version: 7.4
Hardware: Unspecified Unspecified
Priority: unspecified Severity: unspecified
Target Milestone: rc
Target Release: ---
Assigned To: Ben Skeggs
QA Contact: Desktop QE
Docs Contact: Mark Flitter
Duplicates: 1260740 1373523 1373527
Depends On:
Blocks: 1260740
 
Reported: 2017-06-20 12:16 EDT by Tomas Pelka
Modified: 2017-11-14 07:35 EST

See Also:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Issues with Nouveau when using multiple displays on nVidia GM20x hardware
Some display combinations connected to nVidia GM20x series devices are subject to many errors, observable in `dmesg`, including the message `DRM: EVO timeout`. This problem is resolved upstream and a backport is under investigation.
Story Points: ---
Clone Of:
Environment:
Last Closed: 2017-08-02 12:52:46 EDT
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


Attachments
dmesg (705.34 KB, text/plain)
2017-06-20 12:16 EDT, Tomas Pelka
dmesg 2 (68.33 KB, text/plain)
2017-08-02 11:15 EDT, Tomas Hudziec

Description Tomas Pelka 2017-06-20 12:16:43 EDT
Created attachment 1289740
dmesg

Description of problem:
KMS seems to be broken on GM206; see the attached dmesg.

Version-Release number of selected component (if applicable):
kernel-3.10.0-680.el7.x86_64
xorg-x11-server-Xorg-1.19.3-7.el7.x86_64
xorg-x11-drv-nouveau-1.0.13-2.el7.x86_64

How reproducible:
100%

Steps to Reproduce:
1. Boot a machine with a GM206.

Actual results:
The kernel gets stuck printing a large number of messages; see dmesg.

Expected results:
The machine should boot into gdm and beyond.

Additional info:
01:00.0 VGA compatible controller [0300]: NVIDIA Corporation GM206 [GeForce GTX 960] [10de:1401] (rev a1)
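
To check whether a saved dmesg log shows the same symptom, a small scan for the message text could look like the sketch below. It only assumes the substring `DRM: EVO timeout` quoted in this bug; the function names and the file name in the usage note are made up for illustration, and no particular log line format is assumed.

```python
MARKER = "DRM: EVO timeout"  # message text quoted in this bug report


def count_evo_timeouts(lines):
    """Return how many of the given log lines contain the marker text."""
    return sum(1 for line in lines if MARKER in line)


def count_in_file(path):
    """Count marker lines in a saved log file (path is supplied by the caller)."""
    # errors="replace" so stray non-UTF-8 bytes in the log do not abort the scan
    with open(path, encoding="utf-8", errors="replace") as f:
        return count_evo_timeouts(f)
```

For example, after saving the log with `dmesg > dmesg.txt`, calling `count_in_file("dmesg.txt")` returns the number of matching lines.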
Comment 1 Tomas Pelka 2017-06-20 12:26:02 EDT
It seems that the machine at least boots when all outputs except DVI are disconnected.

The dmesg output looks the same, but at least it boots into gdm.
Comment 2 Ben Skeggs 2017-06-20 20:36:48 EDT
I *believe* (hard to say for sure with the trimmed log) that this is an issue that affects certain combinations of displays on GM20x and higher GPUs.  NVIDIA basically split something that was a single hardware concept into two different ones, and added a crossbar between them.

Nouveau in 7.4 doesn't deal with the routing and keeps the two identity-mapped, which can lead to overlapping use of the same hardware blocks with some display combinations.

I recently reworked the display part of Nouveau to handle this, which is a *very* invasive change, and only made it for kernel 4.12.  I suspect, at this late stage, we may have to document this issue as known and delay until 7.5?
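
As a rough illustration of the difference between identity mapping and routing through the crossbar, here is a toy model. It is not Nouveau code: the "front-end"/"back-end" names, the block indices, and the notion of a `busy` set are all hypothetical, and how a block ends up busy on real hardware is outside this sketch.

```python
def pick_frontend_identity(backend, busy):
    """Identity mapping: back-end block N always gets front-end block N.

    The busy set is ignored, so the result may already be in use.
    """
    return backend


def pick_frontend_routed(backend, busy, total):
    """Routing-aware choice: return the lowest free front-end block, or None."""
    for frontend in range(total):
        if frontend not in busy:
            return frontend
    return None


busy = {1}  # hypothetical: front-end block 1 is already claimed
identity_choice = pick_frontend_identity(1, busy)  # picks a block that is in the busy set
routed_choice = pick_frontend_routed(1, busy, 4)   # picks a block outside the busy set
```

The point of the sketch is only that a fixed N-to-N mapping cannot step around a block that is already taken, whereas an allocator that knows about the crossbar can.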
Comment 3 Tomas Pelka 2017-06-21 03:15:50 EDT
Sounds fair Ben, thanks.

It might be related to bz1463497.
Comment 8 Tomas Pelka 2017-07-27 08:26:14 EDT
Tomas, would you please check this particular card on the machine with the updated BIOS/firmware?

Thanks
-Tom
Comment 9 Tomas Pelka 2017-07-27 08:26:49 EDT
*** Bug 1373527 has been marked as a duplicate of this bug. ***
Comment 10 Tomas Pelka 2017-07-27 08:26:51 EDT
*** Bug 1373523 has been marked as a duplicate of this bug. ***
Comment 11 Tomas Pelka 2017-07-27 08:27:13 EDT
*** Bug 1260740 has been marked as a duplicate of this bug. ***
Comment 12 Tomas Hudziec 2017-08-02 11:15 EDT
Created attachment 1308295
dmesg 2

The machine enters gdm and it is possible to log in.
Tested with the following outputs in use:
- only DVI
- 1xDP
- 2xDP
The dmesg output looks all right.

kernel-3.10.0-693.el7.x86_64
xorg-x11-server-Xorg-1.19.3-11.el7.x86_64
xorg-x11-drv-nouveau-1.0.13-3.el7.x86_64
Comment 13 Tomas Pelka 2017-08-02 12:52:46 EDT
This seems to be a consequence of the BIOS update that Tomas did on our old test machine; closing.
