Bug 524419

Summary: KMS:RV740:HD4770 screen corruption under video memory pressure
Product: [Fedora] Fedora
Reporter: Sean Middleditch <sean>
Component: xorg-x11-drv-ati
Assignee: Jérôme Glisse <jglisse>
Status: CLOSED CURRENTRELEASE
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: medium
Priority: low
Version: 13
CC: j, jglisse, mcepl, xgl-maint
Target Milestone: ---
Keywords: Triaged
Target Release: ---
Hardware: All
OS: Linux
Whiteboard: card_R700/M
Doc Type: Bug Fix
Last Closed: 2010-05-10 10:29:00 UTC
Attachments:
Description                                         Flags
my Xorg log from this session (no corruption yet)   none
dmesg after booting with drm.debug=15               none
screenshot of corruption                            none

Description Sean Middleditch 2009-09-19 23:18:23 UTC
Created attachment 361793 [details]
my Xorg log from this session (no corruption yet)

I am seeing corruption issues (and sometimes lockups) on Rawhide x86_64 with the ATI driver on an R600 part (I think) only with the new KMS support.  Firefox is the biggest instigator, though OpenOffice seemed to cause it once too (Firefox was open at the same time, it might have still contributed).  It only happens in Firefox with a large number of tabs open.  If I keep my tab count low, the problem (so far) doesn't happen.

Specifically what I see happen is fonts start to go wacky (one or two glyphs on a page will be garbage, the rest will be normal).  More and more of them get funked up as I scroll or load new pages.  The glyphs outside of the Firefox window (e.g., gnome panel, etc.) will occasionally get wonky too.  Firefox will then start drawing images funny, for example with half the image being clear and the other half garbage.  Eventually all images in Firefox turn into garbage.  Sometimes the whole page view will go to garbage, too (for all tabs).  Eventually various UI elements in Firefox, pure GTK apps, Metacity window borders, and so on all go weird, e.g. the borders around buttons in the GNOME Shutdown Dialog will have noise around them or be weird colors, the Metacity buttons turn into garbage, and so on.  If I don't reboot soon after this starts happening the whole system will eventually lock (I have not tested if it's just X or the whole system as I haven't had another machine handy to SSH in with, and I can't switch to another VC since X has to intercept the ctrl-alt-FX key for that to work).

The only other clue is that the computer fans get louder and louder when this starts to happen, almost like it's starting to overheat.  This is just having a lot of tabs open though, no videos or heavy graphics animation or anything -- 3D/DRI isn't even supported on this card yet in Fedora (with or without KMS).  I'm not able to easily tell if it's the video card fan, CPU fan, or case fans that are doing it -- most of the fans besides the GPU are generally silent in this computer though (literally inaudible at the distance the computer is at) so I'm inclined to believe it's the graphics card fan.

The same machine in Windows 7 does not exhibit any of this behavior, even when playing graphics-heavy games, so I'm fairly sure it's a Linux driver issue and not busted hardware.  Possibly overheating hardware if Linux isn't handling the power for this card right, or a TTM issue allowing graphics memory to get corrupted when enough of it starts being used.  Or just a bug in the DRI/X driver I guess.

Playing long videos does not cause the issue to happen, notably, even when using X video acceleration.

kernel-2.6.31-33.fc12.x86_64
xorg-x11-drv-ati-6.13.0-0.4.20090908git651fe5a47.fc12.x86_64
xorg-x11-server-Xorg-1.6.99.901-3.fc12.x86_64

I have no X config file (oddly, I noticed something about "Using hsync ranges from config file" in my Xorg.0.log, which makes no sense to me).

The hardware is an HD4770:
(--) RADEON(0): Chipset: "ATI Radeon HD 4770" (ChipID = 0x94b3)
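
For anyone collecting hardware details from several Xorg.0.log files, the chipset line above can be picked apart mechanically. This is only a minimal sketch: the regular expression assumes the exact line layout quoted in this report, and `parse_chipset` is a hypothetical helper name, not part of any X.org tool.

```python
import re

# Pattern assumed from the single log line quoted in this report;
# other driver versions may format the line differently.
CHIPSET_RE = re.compile(
    r'Chipset: "(?P<name>[^"]+)" \(ChipID = (?P<chip_id>0x[0-9a-fA-F]+)\)'
)


def parse_chipset(line):
    """Return (chipset name, PCI chip ID as int), or None if the line does not match."""
    match = CHIPSET_RE.search(line)
    if match is None:
        return None
    return match.group("name"), int(match.group("chip_id"), 16)


line = '(--) RADEON(0): Chipset: "ATI Radeon HD 4770" (ChipID = 0x94b3)'
name, chip_id = parse_chipset(line)
print(name, hex(chip_id))
```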

Xorg.0.log attached -- this is from my current session, which has not yet exhibited any traces of screen corruption.

Comment 1 Paul Bolle 2009-09-30 23:25:54 UTC
0) Similar issues - especially the fonts getting gradually more corrupted over time - here (for quite some time now, actually) while running Rawhide on a "Radeon RV250 [Mobility FireGL 9000]" on an i686 system. No hard info on when this first occurred. Could easily be a few weeks or more ago.

1) I am currently forced (by bug #526433) to run with the "nomodeset" kernel option, and with it the issues _seem_ to be resolved. Can the reporter confirm whether that kernel command line parameter "solves" the issues for him too?

Comment 2 Sean Middleditch 2009-10-03 00:46:28 UTC
I will test the nomodeset bit shortly.

With mode setting still enabled, I'm noticing far less corruption now than earlier, with recent package updates.  It still happens here and there, but usually only one or two images on a Firefox page will get corrupt, and they will often "fix" themselves after a while.

Comment 3 Sean Middleditch 2009-10-03 00:52:54 UTC
Ew.  The desktop is slow as hell normally, but with nomodeset it's practically unusable.  I can measure the time it takes for most graphics updates to fully render in whole seconds with nomodeset=1.  And to think, I was going to try that anyway to see if it would fix the performance issues I had with mode setting enabled.  :/  It takes a bit of a while of real use to trigger the bug, but there's no way I can use a desktop with that kind of performance problems... sorry.

Comment 4 Matěj Cepl 2009-10-16 00:32:36 UTC
Most likely this bug has duplicates in bug 528496, bug 529081

Comment 5 Jérôme Glisse 2009-10-28 20:06:15 UTC
Please retest with updating to :
xorg-x11-drv-ati-6.13.0-0.10.20091006git457646d73.fc12.x86_64
xorg-x11-server-Xorg-1.7.0-1.fc12.x86_64

And kernel from :
http://koji.fedoraproject.org/koji/buildinfo?buildID=138707

Report back if it helps.

Comment 6 Paul Bolle 2009-10-28 22:21:34 UTC
(In reply to comment #5)
> Please retest with updating to :
> xorg-x11-drv-ati-6.13.0-0.10.20091006git457646d73.fc12.x86_64
> xorg-x11-server-Xorg-1.7.0-1.fc12.x86_64
> 
> And kernel from :
> http://koji.fedoraproject.org/koji/buildinfo?buildID=138707
> 
> Report back if it helps. 

The font corruption is clearly still there (on a "Radeon RV250 [Mobility FireGL 9000]" on an i686 system). It makes the system hard to use within ten or so minutes.

Comment 7 Dave Airlie 2009-11-04 05:24:32 UTC
can you try with latest -112 kernel? This looks like two bugs that are probably not related though.

Comment 8 Paul Bolle 2009-11-04 08:47:05 UTC
(In reply to comment #7)
> can you try with latest -112 kernel? This looks like two bugs that are probably
> not related though.  

Is the current (at least on my system) update from xorg-x11-server-Xorg-1.7.0-1.fc12 to xorg-x11-server-Xorg-1.7.0-5.fc12.i686 relevant (ie, should I try with one version or with both versions)?

Comment 9 Paul Bolle 2009-11-04 13:36:22 UTC
(In reply to comment #7)
> can you try with latest -112 kernel? This looks like two bugs that are probably
> not related though.  

0) I tried your -112 kernel with both xorg-x11-server-Xorg-1.7.0-1.fc12 and xorg-x11-server-Xorg-1.7.0-5.fc12.i686. The Xorg version did not seem relevant (I saw no difference in behaviour).

1) Things definitely look better now.

2) Font corruption is very sporadic now: eg, I just noticed that (all or most) 'k's turned into a bold 'P' (while before characters used to become just unreadable, very smudgy, rather quickly). Moreover, the current font issues seem to disappear when fonts get "redrawn" (eg, if one moves the pointer over a corrupted text somewhere). So, minor glitches.

3) Firefox seems to have more issues (some pages tend to choose either foreground or a background colour, hard to say which, instead of both; I had the tabs disappear once). These issues also disappear if things get "redrawn". Might be a Firefox issue, but I can't remember seeing that before.

Comment 10 Matěj Cepl 2009-11-05 17:17:01 UTC
Since this bugzilla report was filed, there have been several major updates in various components of the Xorg system, which may have resolved this issue. Users who have experienced this problem are encouraged to upgrade their system to the latest version of their packages (at least F12Beta, but even better if the very latest versions).

If you still experience this problem on an up-to-date system, please let us know in a comment on this bug. Please also let us know if the upgraded system works for you.

If you won't be able to reply in one month, I will have to close this bug as INSUFFICIENT_DATA. Thank you.

[This is a bulk message for all open Fedora Rawhide Xorg-related bugs. I'm adding myself to the CC list for each bug, so I'll see any comments you make after this and do my best to make sure every issue gets proper attention.]

Comment 11 Paul Bolle 2009-11-05 17:40:25 UTC
(In reply to comment #10)
> If you won't be able to reply in one month, I will have to close this bug as
> INSUFFICIENT_DATA.

Needed info already provided in comment #9.

Comment 12 Sean Middleditch 2009-11-08 21:07:59 UTC
Still happens on latest kernel/Mesa in rawhide.

Comment 13 Jérôme Glisse 2009-11-12 13:00:29 UTC
Paul, your issue is more likely a duplicate of:
https://bugzilla.redhat.com/show_bug.cgi?id=529081


Sean please boot kms with drm.debug=15 and attach full dmesg. Thanks.

Comment 14 Bug Zapper 2009-11-16 12:39:24 UTC
This bug appears to have been reported against 'rawhide' during the Fedora 12 development cycle.
Changing version to '12'.

More information and reason for this action is here:
http://fedoraproject.org/wiki/BugZappers/HouseKeeping

Comment 15 Sean Middleditch 2009-11-16 20:56:31 UTC
Created attachment 369782 [details]
dmesg after booting with drm.debug=15

As requested.  I noted that the log says that drm.debug=15 is an unknown option -- was that supposed to be something else?
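
One way to confirm that the option actually reached the kernel command line, and to see which bits a value such as 15 sets, is sketched below. This assumes a kernel in which drm.debug is interpreted as a bitmask; the bit positions are deliberately left unnamed because their meaning depends on the kernel version. The command line string and both helper names are hypothetical and used only for illustration.

```python
def parse_cmdline(cmdline):
    """Split a kernel command line into {name: value}; bare flags map to None."""
    params = {}
    for token in cmdline.split():
        name, sep, value = token.partition("=")
        params[name] = value if sep else None
    return params


def set_bits(mask):
    """Return the positions of the bits set in mask, lowest first."""
    return [bit for bit in range(mask.bit_length()) if mask & (1 << bit)]


# Hypothetical command line; on a live system, read /proc/cmdline instead.
example = "ro root=/dev/sda1 rhgb quiet drm.debug=15"
params = parse_cmdline(example)

if "drm.debug" in params:
    debug_mask = int(params["drm.debug"], 0)
    print("drm.debug bits set:", set_bits(debug_mask))
else:
    print("drm.debug was not passed on the command line")
```

If the parameter shows up here but the kernel log still reports it as unknown, that points at how the option is handed to the module rather than at a typo in the option itself.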

Comment 16 Sean Middleditch 2009-12-10 20:36:44 UTC
Just as an update, this still happens with Rawhide kernels too:

kernel-2.6.32-1.fc13.x86_64
mesa-libGL-7.6-0.15.fc12.x86_64
mesa-dri-drivers-experimental-7.6-0.15.fc12.x86_64
xorg-x11-server-Xorg-1.7.1-9.fc12.x86_64
xorg-x11-drv-ati-6.13.0-0.18.20091201git88a50a30d.fc13.x86_64

Comment 17 Sean Middleditch 2010-01-04 01:55:00 UTC
Still happens, latest packages in Rawhide.

kernel-2.6.32.2-15.fc13.x86_64
mesa-libGL-7.8-0.6.fc13.x86_64
xorg-x11-server-Xorg-1.7.1-9.fc12.x86_64
xorg-x11-drv-ati-6.13.0-0.18.20091221git4b05c47ac.fc13.x86_64

If there is anything else I can provide to help track this down, please let me know.

Comment 18 Sean Middleditch 2010-01-04 09:45:52 UTC
Created attachment 381511 [details]
screenshot of corruption

In case it helps.  The stippled/tiled pattern is pretty common whenever the corruption starts, as well as the glyph corruption and (sometimes) toolbar icon corruption.  You can see the glyph corruption in the clock applet in the upper right, if that matters at all.  Maybe seeing the pattern the corruption shows up in will help narrow down the culprit.

Comment 19 Sean Middleditch 2010-02-12 23:37:35 UTC
Still happening on latest Rawhide, albeit far, far less frequently.  Something got "better" perhaps (or some other variable changed here), but it's definitely still broken.

kernel-2.6.33-0.27.rc6.git1.fc13.x86_64
mesa-libGL-7.8-0.16.fc13.x86_64
xorg-x11-drv-ati-6.13.0-0.22.20100208git4f9d1714a.fc13.x86_64

(haven't tested a newer kernel than that one since none of the newer ones will even boot for me... something in early boot up or possibly the initrd before DRM even gets loaded; not related to this bug, of course)

Comment 20 Sean Middleditch 2010-03-03 20:02:09 UTC
The latest Fedora 13 updates have made this MUCH rarer (I can reliably use the desktop machine now), but it has still happened a few times in the last week.  I haven't noticed any changes in my usage pattern, but maybe there's something I'm missing.  The effects when it kicks in are the same: corrupted fonts first, usually followed by corrupted icons, and eventually corrupted desktop and app windows (sometimes the desktop goes wonky first, but usually not).  The corruption appears less severe now too, though.  For example, the last time it started happening, the right half of all the 'm' letters in my app windows disappeared, but that was it.  (I rebooted not long after so more corruption may well have been on the way.)

Just to be sure, I ran a memtest check on the machine again to convince myself it's not bad memory or the like.  Additionally, Windows 7 on the same machine can run with very heavy load (e.g. high-end commercial games: Borderlands, Bioshock, etc.) without glitches, reaffirming to a good degree that it's not a hardware issue.

kernel-2.6.33-1.fc13.x86_64
mesa-libGL-7.8-0.18.fc13.x86_64
xorg-x11-drv-ati-6.13.0-0.23.20100219gite68d3a389.fc13.x86_64

Comment 21 Bug Zapper 2010-03-15 12:51:03 UTC
This bug appears to have been reported against 'rawhide' during the Fedora 13 development cycle.
Changing version to '13'.

More information and reason for this action is here:
http://fedoraproject.org/wiki/BugZappers/HouseKeeping

Comment 22 Jérôme Glisse 2010-04-13 09:38:26 UTC
Does it work any better with recent packages?

Comment 23 Sean Middleditch 2010-04-14 08:05:54 UTC
Yes, yes it does.  The only glitch I have seen in the last few weeks is (I believe) completely unrelated to this.  Don't know if you're the hero who took care of this or not, but either way, thank you very much!

(I never got a response to this the last time I asked on another bug: is it okay for me to close a bug I reported if I know it's fixed, or should I let you do that?)

Comment 24 Jérôme Glisse 2010-05-10 10:29:00 UTC
Yes, it's OK to close the bug yourself.