Bug 751589

Summary: (badEDID) KMS:: Fails to boot
Product: [Fedora] Fedora
Reporter: Gregory Maxwell <gmaxwell>
Component: xorg-x11-drv-nouveau
Assignee: Ben Skeggs <bskeggs>
Status: CLOSED WONTFIX
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: unspecified
Priority: unspecified
Version: 16
CC: airlied, ajax, bskeggs
Hardware: Unspecified
OS: Unspecified
Doc Type: Bug Fix
Last Closed: 2013-02-13 12:43:43 UTC
Attachments:
syslog from a failing bootup (flags: none)
Syslogs from a working bootup with garbage on the display. (flags: none)

Description Gregory Maxwell 2011-11-06 00:48:12 UTC
I have a couple of panels here that give bad EDID information, connected to a desktop with an nvidia gtx-275 card. In the past this has resulted in some whiny log messages and some minor annoyances, which is why I know about it, but no serious problems for the free software drivers.

With F16 (RC5) the installer and the installed system fail without nomodeset.

The bootup appears to stop in the kernel, with the last displayed message being "[drm:drm_edid_block_valid] *ERROR* EDID checksum is invalid, rema" (it's cut off).

The same message is displayed several times earlier during the boot, those times with a block of hex containing the EDID information (and those times the message is complete, ending in "inder is 161").
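
For readers unfamiliar with the check behind that message, here is a minimal sketch of EDID block checksum validation. It assumes the standard EDID rule that the 128 bytes of a block must sum to zero modulo 256, and it assumes that the "remainder" in the log line is that sum when it is nonzero. The function name and the sample block are hypothetical and are not taken from the attached logs or from the kernel source.

```python
EDID_BLOCK_LENGTH = 128  # an EDID block is 128 bytes long


def edid_checksum_remainder(block: bytes) -> int:
    """Return the sum of all bytes in the block modulo 256.

    A valid EDID block yields 0. Any other value is assumed to be
    what the log reports as the "remainder".
    """
    if len(block) != EDID_BLOCK_LENGTH:
        raise ValueError("expected a 128-byte EDID block")
    return sum(block) % 256


# Hypothetical sample: the fixed 8-byte EDID header followed by zeros.
header = bytes([0x00, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0x00])
body = header + bytes(EDID_BLOCK_LENGTH - len(header) - 1)

# The last byte of a block is chosen so that the total sum is 0 mod 256.
checksum_byte = (-sum(body)) % 256
good_block = body + bytes([checksum_byte])

# Corrupt the last byte so that the block is off by 161.
bad_block = body + bytes([(checksum_byte + 161) % 256])

print(edid_checksum_remainder(good_block))
print(edid_checksum_remainder(bad_block))
```

A display that returns the same nonzero remainder on every read, as described here, is more likely to have a wrong EDID stored than to be suffering from a noisy transfer.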

I can't scroll the console up, but numlock works. Ctrl-alt-delete causes the machine to reboot after a delay of 15 seconds or so.

https://bugzilla.redhat.com/show_bug.cgi?id=533632 looks related, but it was filed back at F12, and the more recent comments seem mostly to be complaints about warnings rather than an inoperable system. Feel free to dupe me onto that as appropriate.

Comment 1 Gregory Maxwell 2011-12-15 07:23:34 UTC
Still failing with fresh updates on kernel 3.1.5-2.fc16.x86_64 / libdrm-2.4.27-2.fc16.x86_64 / mesa-dri-drivers-7.11.2-1.fc16.x86_64.

Please tell me how I can help get this fixed. It's really frustrating to have a system which is completely unusable with Fedora 16 so many weeks after the release, especially due to such an apparently silly issue as it not liking the EDID values provided by the displays.

I'm guessing that if I install the binary-only nvidia drivers it will work, but I don't want them, and once I do install them I'll become useless in getting this fixed in the free software.

Comment 2 Ben Skeggs 2011-12-16 02:56:13 UTC
Can you please provide your full dmesg output from trying to boot?

Comment 3 Gregory Maxwell 2011-12-16 03:24:20 UTC
Created attachment 547582
syslog from a failing bootup

So, it looks like the _complete_ logs aren't quite making it to disk. I believe that the EDID block is displayed three times identically, with the final time being cut off near the end of it.

If you need more of it, I can get a serial console set up, but that will take me a day or so since I'll need to find some way of getting a serial port into the system. So hopefully this much is useful to you.

Comment 4 Ben Skeggs 2011-12-16 04:11:36 UTC
I've pulled a recent patch that was submitted to dri-devel into a Fedora kernel build, which I believe should fix this issue for you.

It hasn't built yet, but keep an eye on koji[1] for it to finish and you'll be able to download and install the build.


[1] http://koji.fedoraproject.org/koji/taskinfo?taskID=3588203

Comment 5 Gregory Maxwell 2011-12-16 07:57:14 UTC
Okay! it built. And works, KINDA!

It comes up, and I get a bunch of EDID errors, then I get the Fedora bubble. Then the second head starts with EDID errors in text but it quickly blanks (I have a second identical panel attached). Then on the first head, and only the first head, I get these rapid blue horizontal lines which are denser towards the right side of the display; they look like some kind of crazy xmms visualization.

X starts, and the crazy blue lines are still visible but mostly only in black areas (e.g. at the black bar at the top of gnome shell).

I'm taking a wild guess that the garbage is coming from probing the edid values over and over again or something like that?

I'll attach new logs. Big improvement at least.

Comment 6 Gregory Maxwell 2011-12-16 07:58:09 UTC
Created attachment 547631
Syslogs from a working bootup with garbage on the display.

Comment 7 Ben Skeggs 2011-12-16 09:03:24 UTC
The lines have nothing at all to do with EDID fetching.  That's something completely different.  Honestly, to me it sounds like you could have bad cables, bad monitors, or a bad GPU.  

That said, there is one bug in the nouveau i2c code that I discovered very recently when I looked into it in a lot of detail. The full set of fixes is upstream, but the particular fix that I'm thinking of is available in this test kernel build (which, again, hasn't finished building). I haven't tested the backport at all, but it shouldn't be any worse than it is now unless I messed something up. I doubt it'll help, but it's worth a try I guess.

http://koji.fedoraproject.org/koji/taskinfo?taskID=3588365

The extra strict header checking issue has been fixed, so I think we should track the extra problems in a new bug report and let this one be closed when the fixed kernel goes out.
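
As a rough illustration of what a less strict header check can look like (a sketch only; this thread does not show the actual patch, and the threshold below is a hypothetical choice): an EDID block starts with the fixed 8-byte pattern 00 FF FF FF FF FF FF 00. Instead of rejecting any block whose header is not an exact match, a tolerant check can count how many header bytes match and repair the header when it is close enough. The function names and the `min_score` parameter are assumptions made for this example.

```python
# The fixed 8-byte pattern that every EDID base block starts with.
EDID_HEADER = bytes([0x00, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0x00])


def header_match_score(block: bytes) -> int:
    """Count how many of the first 8 bytes match the EDID header."""
    return sum(1 for got, want in zip(block[:8], EDID_HEADER) if got == want)


def repair_header(block: bytes, min_score: int = 6) -> bytes:
    """Return the block with a correct header, or raise if it is too damaged.

    min_score is a hypothetical threshold: the number of header bytes
    that must already be right before the rest are overwritten.
    """
    score = header_match_score(block)
    if score == len(EDID_HEADER):
        return bytes(block)  # header already valid, nothing to do
    if score >= min_score:
        return EDID_HEADER + bytes(block[8:])  # close enough: fix it up
    raise ValueError(f"EDID header too damaged ({score}/8 bytes match)")


# Hypothetical block with one corrupted header byte (0x7F instead of 0xFF).
damaged = bytes([0x00, 0xFF, 0x7F, 0xFF, 0xFF, 0xFF, 0xFF, 0x00]) + bytes(120)
print(header_match_score(damaged))
print(repair_header(damaged)[:8] == EDID_HEADER)
```

Note that a repaired header does not make the rest of the block trustworthy; the checksum over the whole block still has to be checked separately.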

Comment 8 Gregory Maxwell 2011-12-16 17:08:09 UTC
Same behavior, sadly. Booting into F14 with the nvidia binary drivers doesn't have the garbage. I'll do some more testing and open another one if I can find something to work from. Thank you very much for your help.


As an aside, when I initially booted into the new kernel it inexplicably swapped two of my md devices, something which I've never seen before. As a result, my ephemerally encrypted swap (the swap option in crypttab(5)) was initialized on the partition containing /home. Due to some crazy fortuitous luck of data structure non-alignment, and the fact that I noticed before it actually swapped anything, I managed to escape without any data loss.

Comment 9 Gregory Maxwell 2011-12-16 17:13:38 UTC
(ah, you can ignore my comment about /dev/mdXX devices swapping, apparently two of my devices didn't have entries in mdadm.conf :( )

Comment 10 Fedora End Of Life 2013-01-16 12:27:39 UTC
This message is a reminder that Fedora 16 is nearing its end of life.
Approximately 4 (four) weeks from now Fedora will stop maintaining
and issuing updates for Fedora 16. It is Fedora's policy to close all
bug reports from releases that are no longer maintained. At that time
this bug will be closed as WONTFIX if it remains open with a Fedora 
'version' of '16'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version' 
to a later Fedora version prior to Fedora 16's end of life.

Bug Reporter: Thank you for reporting this issue and we are sorry that 
we may not be able to fix it before Fedora 16 is end of life. If you 
would still like to see this bug fixed and are able to reproduce it 
against a later version of Fedora, you are encouraged to click on 
"Clone This Bug" and open it against that version of Fedora.

Although we aim to fix as many bugs as possible during every release's 
lifetime, sometimes those efforts are overtaken by events. Often a 
more recent Fedora release includes newer upstream software that fixes 
bugs or makes them obsolete.

The process we are following is described here: 
http://fedoraproject.org/wiki/BugZappers/HouseKeeping

Comment 11 Fedora End Of Life 2013-02-13 12:43:46 UTC
Fedora 16 changed to end-of-life (EOL) status on 2013-02-12. Fedora 16 is 
no longer maintained, which means that it will not receive any further 
security or bug fix updates. As a result we are closing this bug.

If you can reproduce this bug against a currently maintained version of 
Fedora please feel free to reopen this bug against that version.

Thank you for reporting this bug and we are sorry it could not be fixed.