Bug 446620 - intel 82G35 graphics controller on asus p5e-vm produces black screen on firstboot
Keywords:
Status: CLOSED WONTFIX
Alias: None
Product: Fedora
Classification: Fedora
Component: xorg-x11-drv-i810
Version: 9
Hardware: All
OS: Linux
Priority: low
Severity: urgent
Target Milestone: ---
Assignee: Adam Jackson
QA Contact:
URL:
Whiteboard:
Duplicates: 453366
Depends On:
Blocks:
Reported: 2008-05-15 12:16 UTC by rico sec
Modified: 2018-04-11 09:03 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2009-07-14 17:41:06 UTC
Type: ---
Embargoed:


Attachments
logfiles (14.69 KB, application/x-zip-compressed)
2008-05-15 12:16 UTC, rico sec
dmesg.txt from the attachment 305482 (31.05 KB, text/plain)
2008-05-16 21:37 UTC, Matěj Cepl
lspci.txt from the attachment 305482 (2.04 KB, text/plain)
2008-05-16 21:38 UTC, Matěj Cepl
Xorg.0.log from the attachment 305482 (20.60 KB, text/plain)
2008-05-16 21:38 UTC, Matěj Cepl
Xorg.0.log from 2.3.2-2 hard crash (24.12 KB, text/plain)
2008-07-02 18:32 UTC, Holger Lubitz

Description rico sec 2008-05-15 12:16:02 UTC
Description of problem:
X crashes on FC9 firstboot with an Intel 82G35 Graphics Controller (X3500) on an
Asus P5E-VM HDMI motherboard.

Version-Release number of selected component (if applicable):
xorg-x11-server-Xorg-1.4.99.901-29.20080415
kernel-2.6.25.3.18.fc9

How reproducible:
Boot FC9 with an Intel 82G35 Graphics Controller (X3500) on an Asus P5E-VM HDMI
motherboard. The only unusual thing about this system is that it runs software RAID.

Steps to Reproduce:
1. Install Fedora
2. X will crash on firstboot

  
Actual results:


Expected results:


Additional info:
Attachments:

dmesg.txt
lspci.txt
Xorg.0.log

Comment 1 rico sec 2008-05-15 12:16:02 UTC
Created attachment 305482
logfiles

Comment 2 rico sec 2008-05-15 12:19:30 UTC
'X will crash' is a minor inaccuracy; 'the system will freeze' is more accurate.

Comment 3 Matěj Cepl 2008-05-16 21:37:57 UTC
Created attachment 305757
dmesg.txt from the attachment 305482

Comment 4 Matěj Cepl 2008-05-16 21:38:04 UTC
Created attachment 305758
lspci.txt from the attachment 305482

Comment 5 Matěj Cepl 2008-05-16 21:38:11 UTC
Created attachment 305759
Xorg.0.log from the attachment 305482

Comment 6 rico sec 2008-05-17 06:26:06 UTC
Further information to report:

I can get X partially working in runlevel 3 by typing startx. On logout, the 
display is set incorrectly and does not display a console.

In runlevel 5 the system boots until most services are loaded and then 
displays malformed screen artefacts just prior to the system crash. This is 
right before gdm comes up.


Comment 7 Dave Airlie 2008-05-19 03:02:50 UTC
Can you try booting with mem=2048M?

I'm wondering whether the 4GB is causing problems.

Comment 8 Holger Lubitz 2008-05-28 15:19:43 UTC
I am having the same problem. You're right - it is somehow connected to memory
>4 GB and the memory remapping the board does. For me, it is usually not a
freeze, though. Just a garbled screen / blank screen / no signal. Ctrl-Alt-F1
followed by Ctrl-Alt-Del still lets me reboot the system. But I have seen
freezes when trying to use HDMI.

Things that help to get at least working VGA output: 

1. Downgrade to 2 GB. 
2. Disable Memory Remapping in BIOS (for a 4 GB configuration - obviously
doesn't help with 8 GB as there is still memory > 4 GB). 
3. And, surprisingly enough, removing the rhgb kernel boot option. This leaves the
video memory uncached (and makes xv and gl slow), but at least it works.


Comment 9 rico sec 2008-06-07 10:09:00 UTC
Booting with mem=2048M made no difference to the faulty behaviour.

mem=2048 (without the M suffix) caused a kernel halt:
Bug: Int 6: CR2 000000

However, disabling memory remapping in the BIOS did fix things. Thanks, Holger.

Comment 10 Holger Lubitz 2008-06-10 01:37:49 UTC
I wouldn't say disabling memory remapping fixes things. It works around the real
bug if and only if you're on a 4G system. Unfortunately my attempts to contact
airlied and ajax re this bug on #fedora-devel have been unsuccessful so far. If
there's anything I can do to help you resolve this bug, please tell me.

Comment 11 Holger Lubitz 2008-07-02 18:32:04 UTC
Created attachment 310839
Xorg.0.log from 2.3.2-2 hard crash

2.3.2-2 from koji still crashes, but more verbosely. The system tries to start
X a couple times (checkered background, X cursor are visible for a moment or
two), until that crashes. Xorg.0.log from the crash attached.

Fortunately, booting without rhgb still works.

Comment 12 Holger Lubitz 2008-07-04 02:12:56 UTC
I think this may be an mtrr issue. The bug goes away with less than 4G or with 4G
and memory remapping disabled. In the latter case, /proc/mtrr looks like this on
my machine:

reg00: base=0x00000000 (   0MB), size=2048MB: write-back, count=1
reg01: base=0x80000000 (2048MB), size=1024MB: write-back, count=1
reg02: base=0xc0000000 (3072MB), size= 256MB: write-back, count=1
reg03: base=0xcf800000 (3320MB), size=   8MB: uncachable, count=1
reg04: base=0xd0000000 (3328MB), size= 256MB: write-combining, count=1

(even with rhgb, everything works just fine)

If I enable memory remapping for the same configuration, /proc/mtrr changes to this:

reg00: base=0xd0000000 (3328MB), size= 256MB: uncachable, count=1
reg01: base=0xe0000000 (3584MB), size= 512MB: uncachable, count=1
reg02: base=0x00000000 (   0MB), size=4096MB: write-back, count=1
reg03: base=0x100000000 (4096MB), size= 512MB: write-back, count=1
reg04: base=0x120000000 (4608MB), size= 256MB: write-back, count=1
reg05: base=0xcf800000 (3320MB), size=   8MB: uncachable, count=1

X tries to set reg00 to write-combining, but fails. Now one could argue that
this is an unusual order (and so far only Asus boards seem to set it up that
way), but nevertheless I think this is legal. The exceptions should still take
precedence over the larger write-back mapping underneath.

Now I do not know what exactly goes wrong when rhgb is used.  And I have been
unsuccessful in adapting the workaround of manually changing mtrr afterwards -
the kernel always tells me what I'm trying to do is not allowed.

So this may ultimately be a kernel problem, and there is an attempt to fix mtrr
from there: http://lkml.org/lkml/2008/5/1/47

Intel already indicated that they do not feel too responsible for unusual BIOS
mtrr settings and said the vendor should fix this:
https://bugs.freedesktop.org/show_bug.cgi?id=15360

However, I think what Asus does (unlike their example) is actually correct and
legal if maybe unusual, and should be made to work. Also, it seems this not only
affects Intel graphics. Similar reports come from some ATI users.

Bug #453366 seems to be very similar to this one.
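
The overlap argument in this comment can be made concrete with a small Python sketch. It is only an illustration, not kernel or driver code: it applies the overlap rules documented for variable-range MTRRs (uncachable wins over any other type, write-through wins over write-back, any other mix is undefined) to the remapped listing above. The name effective_type and the uncachable default for uncovered addresses are assumptions made for this example.

```python
# Memory types, spelled as they appear in /proc/mtrr.
UC, WC, WT, WB = "uncachable", "write-combining", "write-through", "write-back"
MB = 1 << 20

# (base, size, type) for the listing with memory remapping enabled.
REMAPPED = [
    (0xD0000000, 256 * MB, UC),    # reg00
    (0xE0000000, 512 * MB, UC),    # reg01
    (0x00000000, 4096 * MB, WB),   # reg02
    (0x100000000, 512 * MB, WB),   # reg03
    (0x120000000, 256 * MB, WB),   # reg04
    (0xCF800000, 8 * MB, UC),      # reg05
]

def effective_type(regs, addr, default=UC):
    """Resolve the memory type at addr from overlapping variable ranges.

    default is an assumption: the type used where no register matches.
    """
    types = {t for base, size, t in regs if base <= addr < base + size}
    if not types:
        return default
    if UC in types:
        return UC              # uncachable overrides every other type
    if len(types) == 1:
        return types.pop()     # all matching registers agree
    if types == {WT, WB}:
        return WT              # write-through overrides write-back
    return "undefined"         # e.g. write-combining inside write-back

print(effective_type(REMAPPED, 0xD0000000))  # uncachable
print(effective_type(REMAPPED, 0x00100000))  # write-back
```

Under these rules the small uncachable ranges do take precedence over the 4096MB write-back register, as argued above. A write-combining range placed inside a write-back range, however, falls into the undefined case, which may be why the kernel refuses that request.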

Comment 13 Holger Lubitz 2008-07-04 03:30:51 UTC
Found a workaround - adding this to /etc/rc5.d/S99local seems to help:

echo "disable=2" > /proc/mtrr
echo "disable=0" > /proc/mtrr
echo "base=0x00000000 size=0x80000000 type=write-back" > /proc/mtrr
echo "base=0x80000000 size=0x40000000 type=write-back" > /proc/mtrr
echo "base=0xc0000000 size=0x10000000 type=write-back" > /proc/mtrr
echo "base=0xd0000000 size=0x10000000 type=write-combining" > /proc/mtrr

But this mapping uses all the mtrrs (at least I think there are only 8?) and
wouldn't help for the cases where there's either an additional 1M uncacheable
below the 8M (taking up an additional mtrr) or where the system has 8G (no space
left for those mappings either).

I did not find a way to tell the kernel to create a large write-back mapping and
then a smaller write-combining within the already mapped area.
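
The three write-back lines in the workaround above follow from splitting the range below 0xd0000000 into naturally aligned power-of-two blocks, which is the shape an MTRR can describe. A minimal Python sketch of that split is given here as an illustration only; the function names split_power_of_two and mtrr_commands are made up for this example, and it only formats the strings, it does not write to /proc/mtrr.

```python
def split_power_of_two(base, end):
    """Cover [base, end) with power-of-two sized, size-aligned blocks."""
    blocks = []
    while base < end:
        # Largest power of two that still fits in the remaining space.
        size = 1 << ((end - base).bit_length() - 1)
        # Shrink until the block is aligned to its own size.
        while base % size:
            size >>= 1
        blocks.append((base, size))
        base += size
    return blocks

def mtrr_commands(blocks, mtype):
    """Format blocks in the 'base=... size=... type=...' form used above."""
    return ["base=0x%08x size=0x%08x type=%s" % (b, s, mtype) for b, s in blocks]

for line in mtrr_commands(split_power_of_two(0, 0xD0000000), "write-back"):
    print(line)
# base=0x00000000 size=0x80000000 type=write-back
# base=0x80000000 size=0x40000000 type=write-back
# base=0xc0000000 size=0x10000000 type=write-back
```

Each block costs one register, which is why a layout with more holes or more memory runs out of registers.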

Comment 14 Christian Lupien 2008-07-12 22:49:52 UTC
Similar problems here.
I have a Dell OptiPlex 755 system with a Core 2 Duo and 4G of RAM. I run it with
32bit Fedora 9.

The card from lspci is
Intel Corporation 82Q35 Express Integrated Graphics Controller (rev 02)

From X I get (4G)
(--) PCI:*(0@0:2:0) Intel Corporation 82Q35 Express Integrated Graphics
Controller rev 2, Mem @ 0xfea00000/0, 0xd0000000/0, 0xfeb00000/0, I/O @ 0x0000ec90/0
(--) PCI: (0@0:2:1) Intel Corporation 82Q35 Express Integrated Graphics
Controller rev 2, Mem @ 0xfea80000/0
(II) System resource ranges:
        [0] -1  0       0xffffffff - 0xffffffff (0x1) MX[B]
        [1] -1  0       0x000f0000 - 0x000fffff (0x10000) MX[B]
        [2] -1  0       0x000c0000 - 0x000effff (0x30000) MX[B]
        [3] -1  0       0x00000000 - 0x0009ffff (0xa0000) MX[B]
        [4] -1  0       0x0000ffff - 0x0000ffff (0x1) IX[B]
        [5] -1  0       0x00000000 - 0x00000000 (0x1) IX[B]

My observation is that X can only be started once. So if rhgb starts I get no
X. If I switch to runlevel 3, then I can start 1 X session, using either startx
or just simply X. After ending the first one I get garbled X on the second start
and if I kept running that way it eventually locks up.

The fix of disabling memory remapping in BIOS I could not try (did not find it
in my BIOS settings).

But physically removing 2G of memory worked (the mem=2048M kernel option did not
fix it). I could then start more than one X session in a row (I tried 3).

I also get an mtrr message when X starts:
 mtrr: type mismatch for d0000000,10000000 old: write-back new: write-combining
I got the same message with both 2G and 4G.
With 2G my mtrrs were (the same before and during X):
reg00: base=0x00000000 (   0MB), size=65536MB: write-back, count=1
reg01: base=0x7e600000 (2022MB), size=   2MB: uncachable, count=1
reg02: base=0x7e800000 (2024MB), size=   8MB: uncachable, count=1
reg03: base=0x7f000000 (2032MB), size=  16MB: uncachable, count=1
reg04: base=0x7e500000 (2021MB), size=   1MB: uncachable, count=1
reg05: base=0x80000000 (2048MB), size=2048MB: uncachable, count=1
With 4G:
reg00: base=0x00000000 (   0MB), size=65536MB: write-back, count=1
reg01: base=0xcf600000 (3318MB), size=   2MB: uncachable, count=1
reg02: base=0xcf800000 (3320MB), size=   8MB: uncachable, count=1
reg03: base=0xcf500000 (3317MB), size=   1MB: uncachable, count=1
reg04: base=0xd0000000 (3328MB), size= 256MB: uncachable, count=1
reg05: base=0xe0000000 (3584MB), size= 512MB: uncachable, count=1

Finally the update xorg-x11-drv-i810-2.2.1-24.fc9.i386 made things worse.
Nothing changes initially but after X runs for a while, eventually the machine
locks up completely (keyboard, network all non-functional) and I can only do a
hard reboot. I have had 4 crashes so far and most happened overnight when the
computer was not doing anything. I could not find anything in the logs, they
simply stop. So I have downgraded to xorg-x11-drv-i810-2.3.2-2.fc9.i386

Here are other bugs that seem related that I could find (some already mentioned)
#446591 #450247 #453366 #455127
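
The range in the "type mismatch" message (base d0000000, size 10000000) can be checked against both listings in this comment. This is a rough Python sketch, not the kernel's own check: the regular expression only handles sizes printed in MB, as in the listings here, and parse_mtrr and overlapping are names made up for the example.

```python
import re

# Matches lines such as:
# reg04: base=0xd0000000 (3328MB), size= 256MB: uncachable, count=1
LINE = re.compile(
    r"reg(\d+): base=0x([0-9a-f]+) \(\s*\d+MB\), size=\s*(\d+)MB: ([a-z-]+), count=\d+"
)

def parse_mtrr(text):
    """Return (register, base, size_in_bytes, type) for each listing line."""
    return [
        (int(m.group(1)), int(m.group(2), 16), int(m.group(3)) << 20, m.group(4))
        for m in LINE.finditer(text)
    ]

def overlapping(regs, base, size):
    """Registers whose range intersects [base, base + size)."""
    return [r for r in regs if r[1] < base + size and base < r[1] + r[2]]

LISTING_2G = """\
reg00: base=0x00000000 (   0MB), size=65536MB: write-back, count=1
reg01: base=0x7e600000 (2022MB), size=   2MB: uncachable, count=1
reg02: base=0x7e800000 (2024MB), size=   8MB: uncachable, count=1
reg03: base=0x7f000000 (2032MB), size=  16MB: uncachable, count=1
reg04: base=0x7e500000 (2021MB), size=   1MB: uncachable, count=1
reg05: base=0x80000000 (2048MB), size=2048MB: uncachable, count=1
"""

LISTING_4G = """\
reg00: base=0x00000000 (   0MB), size=65536MB: write-back, count=1
reg01: base=0xcf600000 (3318MB), size=   2MB: uncachable, count=1
reg02: base=0xcf800000 (3320MB), size=   8MB: uncachable, count=1
reg03: base=0xcf500000 (3317MB), size=   1MB: uncachable, count=1
reg04: base=0xd0000000 (3328MB), size= 256MB: uncachable, count=1
reg05: base=0xe0000000 (3584MB), size= 512MB: uncachable, count=1
"""

for name, listing in (("2G", LISTING_2G), ("4G", LISTING_4G)):
    hits = overlapping(parse_mtrr(listing), 0xD0000000, 0x10000000)
    print(name, [reg for reg, _, _, _ in hits])
# 2G [0, 5]
# 4G [0, 4]
```

In both configurations the requested range already lies inside the 65536MB write-back register and an uncachable register, which is consistent with the message appearing with either amount of memory.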

Comment 15 Christian Lupien 2008-07-12 23:26:13 UTC
I forgot to add that the console is running the frame buffer (vga=0x318 kernel
parameter). I think it made my console more stable (coming out of X), but I
could confuse this with my laptop (855GM) which also has some trouble switching
VT (freezes sometimes and no console without the frame buffer).

Comment 16 Chris Sorisio 2008-08-01 21:41:01 UTC
Could this be related to the Foxconn/AMI BIOS issue?

https://bugs.launchpad.net/ubuntu/+source/linux/+bug/251338

Comment 17 Matěj Cepl 2008-09-02 12:19:51 UTC
*** Bug 453366 has been marked as a duplicate of this bug. ***

Comment 18 D. Hugh Redelmeier 2008-09-20 01:00:31 UTC
I have just written a userland C program to reorganize MTRRs.  This may help your situation.  It is new and may well have bugs (including the fact that reorganizing MTRRs is not guaranteed to work).

Have a look at ftp://ftp.cs.utoronto.ca/pub/hugh/mtrr-uncover-2008sept19.tgz
I will update this so look for ones with later dates.

Here's the output of the program for Holger's /proc/mtrr.  Indentation shows nesting and MTRRs are sorted.  The first number on a line is kind of the register number -- 50 and up are newly allocated ones

Initial MTRR configuration:
 2 0x000000000-0x0ffffffff write-back
    5 0x0cf800000-0x0cfffffff uncachable
    0 0x0d0000000-0x0dfffffff uncachable
    1 0x0e0000000-0x0ffffffff uncachable
 3 0x100000000-0x11fffffff write-back
 4 0x120000000-0x12fffffff write-back

Final MTRR configuration:
 2' 0x000000000-0x07fffffff write-back
50' 0x080000000-0x0bfffffff write-back
51' 0x0c0000000-0x0cfffffff write-back
    5 0x0cf800000-0x0cfffffff uncachable
 3 0x100000000-0x11fffffff write-back
 4 0x120000000-0x12fffffff write-back

Commands for /proc/mtrr to make these changes:
disable=0
disable=1
disable=2
base=0x000000000 size=0x080000000 type=write-back
base=0x080000000 size=0x040000000 type=write-back
base=0x0c0000000 size=0x010000000 type=write-back
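
The sorted, indented view under "Initial MTRR configuration" can be approximated in a few lines of Python. This is not the mtrr-uncover program, only a sketch of the display idea: sort the ranges by base address (larger range first on ties) and indent each one by the number of ranges that still enclose it. The name nest is made up for this example, and partially overlapping ranges are assumed not to occur.

```python
MB = 1 << 20

# (register, base, size, type) from Holger's listing with remapping enabled.
REMAPPED = [
    (0, 0xD0000000, 256 * MB, "uncachable"),
    (1, 0xE0000000, 512 * MB, "uncachable"),
    (2, 0x00000000, 4096 * MB, "write-back"),
    (3, 0x100000000, 512 * MB, "write-back"),
    (4, 0x120000000, 256 * MB, "write-back"),
    (5, 0xCF800000, 8 * MB, "uncachable"),
]

def nest(regs):
    """Return (depth, register, base, last_address, type) rows, sorted by base."""
    rows, open_ends = [], []   # open_ends: last addresses of enclosing ranges
    for reg, base, size, mtype in sorted(regs, key=lambda r: (r[1], -r[2])):
        end = base + size - 1
        # Drop enclosing ranges that finish before this one starts.
        while open_ends and open_ends[-1] < base:
            open_ends.pop()
        rows.append((len(open_ends), reg, base, end, mtype))
        open_ends.append(end)
    return rows

for depth, reg, base, end, mtype in nest(REMAPPED):
    print("%s%2d 0x%09x-0x%09x %s" % ("    " * depth, reg, base, end, mtype))
```

For this input, registers 5, 0 and 1 come out nested under register 2, and registers 3 and 4 sit at the top level, matching the initial configuration shown above.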

Comment 19 Bug Zapper 2009-06-10 00:51:21 UTC
This message is a reminder that Fedora 9 is nearing its end of life.
Approximately 30 (thirty) days from now Fedora will stop maintaining
and issuing updates for Fedora 9.  It is Fedora's policy to close all
bug reports from releases that are no longer maintained.  At that time
this bug will be closed as WONTFIX if it remains open with a Fedora 
'version' of '9'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version' 
to a later Fedora version prior to Fedora 9's end of life.

Bug Reporter: Thank you for reporting this issue and we are sorry that 
we may not be able to fix it before Fedora 9 is end of life.  If you 
would still like to see this bug fixed and are able to reproduce it 
against a later version of Fedora please change the 'version' of this 
bug to the applicable version.  If you are unable to change the version, 
please add a comment here and someone will do it for you.

Although we aim to fix as many bugs as possible during every release's 
lifetime, sometimes those efforts are overtaken by events.  Often a 
more recent Fedora release includes newer upstream software that fixes 
bugs or makes them obsolete.

The process we are following is described here: 
http://fedoraproject.org/wiki/BugZappers/HouseKeeping

Comment 20 Bug Zapper 2009-07-14 17:41:06 UTC
Fedora 9 changed to end-of-life (EOL) status on 2009-07-10. Fedora 9 is 
no longer maintained, which means that it will not receive any further 
security or bug fix updates. As a result we are closing this bug.

If you can reproduce this bug against a currently maintained version of 
Fedora please feel free to reopen this bug against that version.

Thank you for reporting this bug and we are sorry it could not be fixed.
