Bug 649949
Summary: | Kernel cannot read EDID (possibly radeon specific) | ||||||||
---|---|---|---|---|---|---|---|---|---|
Product: | [Fedora] Fedora | Reporter: | Paul Flinders <paul> | ||||||
Component: | kernel | Assignee: | Kernel Maintainer List <kernel-maint> | ||||||
Status: | CLOSED DUPLICATE | QA Contact: | Fedora Extras Quality Assurance <extras-qa> | ||||||
Severity: | medium | Docs Contact: | |||||||
Priority: | low | ||||||||
Version: | 14 | CC: | dougsland, gansalmon, itamar, jonathan, kernel-maint, madhu.chinakonda, mailings, mcepl, michal, mikel | ||||||
Target Milestone: | --- | ||||||||
Target Release: | --- | ||||||||
Hardware: | x86_64 | ||||||||
OS: | Linux | ||||||||
Whiteboard: | |||||||||
Fixed In Version: | Doc Type: | Bug Fix | |||||||
Doc Text: | Story Points: | --- | |||||||
Clone Of: | Environment: | ||||||||
Last Closed: | 2010-11-26 18:42:25 UTC | Type: | --- | ||||||
Attachments: |
|
Description
Paul Flinders
2010-11-04 21:08:31 UTC
Upon further investigation I think that the problem is that the monitor supplies extended EDID data when connected through the HDMI interface; however, the switch doesn't pass this back.

The kernel did, in fact, read the base block OK but this isn't very clear from the error messages.

There is (IMO) a kernel bug which is that if the base EDID data is valid but an extension block is invalid the whole EDID data is discarded - it would seem more sensible to at least use the validated base EDID.

I'd like to propose the following patch. With it the only problem is that I get a lot of noise in /var/log/messages as the EDID data is continually probed but otherwise things work just fine.

--- vanilla-2.6.36/drivers/gpu/drm/drm_edid.c	2010-10-20 21:30:22.000000000 +0100
+++ linux-2.6.36.x86_64/drivers/gpu/drm/drm_edid.c	2010-11-08 20:55:44.947043854 +0000
@@ -256,6 +256,7 @@
 {
 	int i, j = 0;
 	u8 *block, *new;
+	int have_base_edid = 0;
 
 	if ((block = kmalloc(EDID_LENGTH, GFP_KERNEL)) == NULL)
 		return NULL;
@@ -270,6 +271,8 @@
 	if (i == 4)
 		goto carp;
 
+	have_base_edid = 1;
+
 	/* if there's no extensions, we're done */
 	if (block[0x7e] == 0)
 		return block;
@@ -297,6 +300,23 @@
 	dev_warn(connector->dev->dev, "%s: EDID block %d invalid.\n",
 		 drm_get_connector_name(connector), j);
 
+	/* Invalid extension block but base EDID block was OK. Some switches
+	   might not be able to pass extension block data. Therefore use what
+	   we do have */
+	if (have_base_edid) {
+		u8 csum = 0;
+
+		dev_warn(connector->dev->dev, "%s: Discarding invalid extension blocks.\n",
+			 drm_get_connector_name(connector));
+		block[0x7e] = 0;
+
+		/* Fix checksum */
+		for (i = 0; i < EDID_LENGTH-1; i++)
+			csum += block[i];
+		block[0x7f] = -csum;
+		return block;
+	}
+
 out:
 	kfree(block);
 	return NULL;

Oh PS - I think there's another bug :-) If there is more than one extension block won't drm_do_probe_ddc_edid fail because the start address in the EDID EPROM goes in an unsigned 8-bit value?

And finally...
I'd like to propose that the hex dump of the EDID data at the end of drm_edid_block_valid is output as a debug message, not an error message, as it contributes massively to the noise in the kernel messages.

I updated a working machine from Fedora 12 to 14 and that totally screwed it over. This machine is remote so I can only watch processes and logs, and not video directly, but there are apparently two possible outcomes.

Either the X server accidentally starts, and then every 10 seconds the following block shows up in /var/log/messages, making this file hardly usable very quickly:

Nov 21 17:35:56 epsilon kernel: [ 114.645928] [drm:drm_edid_block_valid] *ERROR* Raw EDID:
Nov 21 17:35:56 epsilon kernel: [ 114.645989]
Nov 21 17:35:56 epsilon kernel: [ 114.696552] [drm:drm_edid_block_valid] *ERROR* Raw EDID:
Nov 21 17:35:56 epsilon kernel: [ 114.696604]
Nov 21 17:35:56 epsilon kernel: [ 114.747104] [drm:drm_edid_block_valid] *ERROR* Raw EDID:
Nov 21 17:35:56 epsilon kernel: [ 114.747183]
Nov 21 17:35:56 epsilon kernel: [ 114.798712] [drm:drm_edid_block_valid] *ERROR* Raw EDID:
Nov 21 17:35:56 epsilon kernel: [ 114.798733]
Nov 21 17:35:56 epsilon kernel: [ 114.798736] radeon 0000:01:05.0: HDMI Type A-1: EDID block 0 invalid.
Nov 21 17:35:56 epsilon kernel: [ 114.798740] [drm:radeon_dvi_detect] *ERROR* HDMI Type A-1: probed a monitor but no|invalid EDID

This is not even rate limited. Bummer! Moreover, after 'telinit 3' the spamming of /var/log/messages continues unabated, only getting X back looks impossible (see further down).

The other possibility is that X does not start at all. That seems to be a "normal" situation. A few message blocks like the one above show up in /var/log/messages in very quick succession and that is it. No X or gdm running.

That is with kernel-2.6.35.6-48.fc14.x86_64. I tried to install and boot on the same machine also kernel-2.6.34.7-61.fc13.x86_64. With that one I got the same messages but so far with it X did not start even once.
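To make the "not even rate limited" complaint concrete, here is a minimal user-space sketch of the kind of interval/burst limiter that would tame such a message. It is an illustration only: the struct `ratelimit` and the function `ratelimit_allow` are hypothetical names invented for this example, not the kernel's own helpers (the kernel has its own facilities for this, such as printk_ratelimit()).

```c
#include <stdbool.h>

/* Hypothetical user-space limiter, for illustration only. */
struct ratelimit {
	long interval;  /* window length, in seconds */
	int burst;      /* messages allowed per window */
	long begin;     /* start time of the current window */
	int printed;    /* messages let through in this window */
	int missed;     /* messages suppressed in this window */
};

/* Returns true if a message may be logged at time `now` (seconds). */
static bool ratelimit_allow(struct ratelimit *rl, long now)
{
	/* Open a fresh window once the current one has expired. */
	if (now - rl->begin >= rl->interval) {
		rl->begin = now;
		rl->printed = 0;
		rl->missed = 0;
	}

	if (rl->printed < rl->burst) {
		rl->printed++;
		return true;
	}

	/* Over budget: count the drop so it can be summarised later. */
	rl->missed++;
	return false;
}
```

A caller would wrap each "Raw EDID" report in `if (ratelimit_allow(&rl, now))`, so that a connector which fails on every probe produces a bounded number of log lines per window instead of one block per attempt.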
Looking at the Xorg.0.log file from a "lucky" F14 start I see the following lines:

(II) RADEON(0): EDID for output S-video
(II) RADEON(0): EDID for output HDMI-0
(II) RADEON(0): EDID for output DVI-0
(II) RADEON(0): EDID for output VGA-0

An old Xorg.0.log from the F12 installation does not seem to be substantially different from the new one, but only

(II) RADEON(0): EDID for output VGA-0

from these four lines appears in it. This is a sharp regression in comparison with F12.

Created attachment 461906 [details]
Xorg.0.log file from when X just started, once, following an upgrade
This is with "ATI Technologies Inc RS690 [Radeon X1200 Series]" built-in on mobo and behind a bridge; like that:
-[0000:00]-+-00.0  ATI Technologies Inc RS690 Host Bridge
           +-01.0-[01]----05.0  ATI Technologies Inc RS690 [Radeon X1200 Series]
           +-07.0-[02]----00.0  Realtek Semiconductor Co., Ltd. RTL8111/8168B PCI Express Gigabit Ethernet controller
Created attachment 461938 [details]
dmesg for 2.6.35.6-48.fc14.x86_64 with EDID errors
I should have looked sooner. The current status of dmesg is that the whole buffer is completely flooded with stuff like this:
[drm:drm_edid_block_valid] *ERROR* Raw EDID:
<3>00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
<3>00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
<3>00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
<3>00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
<3>00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
<3>00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
<3>00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
<3>00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
Attached is what the boot sequence stored in /var/log/
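A side note on the all-zero dump above: 128 zero bytes sum to zero, so a checksum test alone would accept such a block. What it lacks is the fixed 8-byte pattern that opens every EDID base block. The following is a minimal user-space sketch of both checks; the names `edid_header_ok` and `edid_checksum_remainder` are hypothetical and this is not the kernel's drm_edid_block_valid code.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define EDID_BLOCK_SIZE 128

/* Fixed 8-byte pattern at the start of every EDID base block. */
static const uint8_t edid_header[8] = {
	0x00, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0x00
};

/* True if the block starts with the EDID header pattern. */
static bool edid_header_ok(const uint8_t *block)
{
	return memcmp(block, edid_header, sizeof(edid_header)) == 0;
}

/* Sum of all 128 bytes modulo 256; a well-formed block yields 0. */
static uint8_t edid_checksum_remainder(const uint8_t *block)
{
	uint8_t sum = 0;

	for (size_t i = 0; i < EDID_BLOCK_SIZE; i++)
		sum = (uint8_t)(sum + block[i]);
	return sum;
}
```

Under this sketch a zeroed buffer passes the checksum test but fails the header test, which is one plausible reason a read that returns nothing but zeros is reported as an invalid block 0.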
My log is full of 'probed a monitor but no|invalid EDID'. This also causes stuttering in the responsiveness of my system, like mini 'hang-ups'. I have a dual-head setup of Viewsonic VP2030b screens, with a radeon HD4650. I can set up my resolution OK though, no problem there.

Nov 25 07:35:06 stinkcentre kernel: [59629.140543] radeon 0000:01:00.0: DVI-I-1: EDID block 0 invalid.
Nov 25 07:35:06 stinkcentre kernel: [59629.140545] [drm:radeon_dvi_detect] *ERROR* DVI-I-1: probed a monitor but no|invalid EDID
Nov 25 07:35:17 stinkcentre kernel: [59639.689898] [drm:drm_edid_block_valid] *ERROR* EDID checksum is invalid, remainder is 254
Nov 25 07:35:17 stinkcentre kernel: [59639.689901] [drm:drm_edid_block_valid] *ERROR* Raw EDID:
Nov 25 07:35:17 stinkcentre kernel: [59639.689917]
Nov 25 07:35:17 stinkcentre kernel: [59639.927698] [drm:drm_edid_block_valid] *ERROR* EDID checksum is invalid, remainder is 254
Nov 25 07:35:17 stinkcentre kernel: [59639.927700] [drm:drm_edid_block_valid] *ERROR* Raw EDID:
Nov 25 07:35:17 stinkcentre kernel: [59639.927715]
Nov 25 07:35:17 stinkcentre kernel: [59640.165452] [drm:drm_edid_block_valid] *ERROR* EDID checksum is invalid, remainder is 254
Nov 25 07:35:17 stinkcentre kernel: [59640.165455] [drm:drm_edid_block_valid] *ERROR* Raw EDID:
Nov 25 07:35:17 stinkcentre kernel: [59640.165470]
Nov 25 07:35:17 stinkcentre kernel: [59640.403268] [drm:drm_edid_block_valid] *ERROR* EDID checksum is invalid, remainder is 254
Nov 25 07:35:17 stinkcentre kernel: [59640.403271] [drm:drm_edid_block_valid] *ERROR* Raw EDID:
Nov 25 07:35:17 stinkcentre kernel: [59640.403286]

xorg-x11-drv-ati.x86_64 6.13.1-0.3.20100705git37b348059.fc14
kernel.x86_64 2.6.35.6-48.fc14

01:00.0 VGA compatible controller: ATI Technologies Inc RV730 PRO [Radeon HD 4650] (prog-if 00 [VGA controller])
	Subsystem: PC Partner Limited Device 1391
	Flags: bus master, fast devsel, latency 0, IRQ 45
	Memory at c0000000 (64-bit, prefetchable) [size=256M]
	Memory at d0000000 (64-bit, non-prefetchable) [size=64K]
	I/O ports at 2000 [size=256]
	[virtual] Expansion ROM at d0020000 [disabled] [size=128K]
	Capabilities: [50] Power Management version 3
	Capabilities: [58] Express Legacy Endpoint, MSI 00
	Capabilities: [a0] MSI: Enable+ Count=1/1 Maskable- 64bit+
	Capabilities: [100] Vendor Specific Information: ID=0001 Rev=1 Len=010 <?>
	Kernel driver in use: radeon
	Kernel modules: radeon

I should note that 1 DVI connection uses a regular DVI cable, and the other uses the HDMI connector (and cable). Maybe the HDMI EDID interaction is not working out.

That was not it, the DVI was acting up. When I fired up gnome-display-settings the system went all haywire and very unresponsive. Could start up X normally after that, even after reboot.

I now disconnected the DVI and replaced it with a VGA cable and that seems to work ok. Very much not the preferred way :-)

see also #533632

(In reply to comment #9)
>
> I now disconnected the DVI and replaced it with a VGA cable and that seems to
> work ok. Very much not the preferred way :-)

Try instead to put in /etc/X11/xorg.conf.d/ a file, say 01-radeon.conf, with the following in it:

Section "Device"
	Identifier "Videocard0"
	Driver "radeon"
	Option "IgnoreEDID" "on"
EndSection

That does not stop the relentless spamming of dmesg and /var/log/messages by KMS, and I would be very interested in how one could kill this without recompiling a kernel, but at least video seems to work in my case.

thanks, but the kernel is constantly trying to read the EDID and causing significant impact on system responsiveness. not going that way...

(In reply to comment #11)
> thanks, but the kernel is constantly trying to read the EDID and causing
> significant impact on system responsiveness.

That is one of the reasons why I would like to know how to turn off this turkey. Configuring rsyslog to log to /dev/null has clear disadvantages.

*** This bug has been marked as a duplicate of bug 611149 ***

This is NOT a duplicate of bug 611149, please reopen.
There are two bugs being discussed here:

1) Failure because the kernel requires the whole EDID to be valid and cannot use a valid base EDID if the extensions are corrupt. This is a problem because some KVM switches don't pass the extended EDID blocks, just the base 128-byte block. I provided a patch for this.

2) Spammage of kernel messages with many copies of the error messages for invalid EDIDs. This has been discussed on the kernel mailing list recently, I think, and some solutions proposed.

Neither relates to bug 611149, which seems to be a problem with a specific monitor.

See also bug 668196 (related in particular to comment #4, #5 and #6). |
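Point 1 can be illustrated outside the kernel. The sketch below is a user-space approximation of the fallback idea in the proposed patch, not the patch itself: `block_checksum_ok` and `edid_salvage` are hypothetical names, the buffer is assumed to already hold whatever blocks were read, and the header check that a real validator would also perform is left out for brevity.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define EDID_LENGTH 128

/* A block is accepted here if its 128 bytes sum to 0 modulo 256. */
static bool block_checksum_ok(const uint8_t *block)
{
	uint8_t sum = 0;

	for (size_t i = 0; i < EDID_LENGTH; i++)
		sum = (uint8_t)(sum + block[i]);
	return sum == 0;
}

/*
 * Hypothetical helper: `edid` holds `nblocks` blocks as read from the
 * monitor. Returns the number of usable leading blocks, or 0 if even
 * the base block is bad. If any advertised extension is missing or
 * invalid, the base block is rewritten to advertise no extensions.
 */
static size_t edid_salvage(uint8_t *edid, size_t nblocks)
{
	if (nblocks == 0 || !block_checksum_ok(edid))
		return 0;

	/* Byte 0x7e of the base block is the extension count. */
	size_t wanted = (size_t)edid[0x7e] + 1;

	if (wanted <= nblocks) {
		size_t i;

		for (i = 1; i < wanted; i++)
			if (!block_checksum_ok(edid + i * EDID_LENGTH))
				break;
		if (i == wanted)
			return wanted;  /* everything validated */
	}

	/* Keep only the base block: clear the extension count... */
	edid[0x7e] = 0;

	/* ...and recompute byte 0x7f so the block sums to 0 again. */
	uint8_t sum = 0;

	for (size_t i = 0; i < EDID_LENGTH - 1; i++)
		sum = (uint8_t)(sum + edid[i]);
	edid[0x7f] = (uint8_t)(0 - sum);
	return 1;
}
```

The checksum has to be recomputed because clearing the extension count changes the byte sum; without that step the salvaged base block would itself be rejected by any later validation pass.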