Bug 557805

Summary: GPU lockup with latest libdrm update
Product: [Fedora] Fedora
Reporter: Milan Kerslager <milan.kerslager>
Component: xorg-x11-drv-ati
Assignee: Jérôme Glisse <jglisse>
Status: CLOSED CURRENTRELEASE
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: medium
Docs Contact:
Priority: low
Version: 12
CC: ajax, clancy.kieran+redhat, eherget, evins, ikke, jglisse, mcepl, pcfe, redhat, suren, tcwan, xgl-maint
Target Milestone: ---
Keywords: Reopened, Triaged
Target Release: ---
Hardware: x86_64
OS: Linux
Whiteboard:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2010-11-05 19:02:13 UTC
Type: ---
Bug Depends On:    
Bug Blocks: 494832    
Attachments:
Description Flags
Log from X server (first lockup).
none
dmesg log after the second restart
none
Xorg.log after update to: libdrm-2.4.17-1.fc12, xorg-x11-drv-ati-6.13.0-0.20.20091221git4b05c47ac.fc12.x86_64
none
my hw lspci -v
none
some crashes from syslog
none
my xorg, kernel and mesa-* versions
none
Kernel OOPS when libdrm was only updated to libdrm-2.4.17-1.fc12
none
Another kernel OOPS
none
My Xorg.log (without backtrace), just to show my config
none
some latest crash logs
none

Description Milan Kerslager 2010-01-22 16:20:11 UTC
Created attachment 386182
Log from X server (first lockup).

The latest update causes a GPU lockup just after logging in to the GUI with my Radeon HD 2400 XT in an Acer Extensa 5620. This happened with these packages:

libdrm-2.4.17-1.fc12.x86_64
mesa-dri-drivers-7.7-2.fc12.x86_64
mesa-libGL-7.6-0.13.fc12.x86_64
mesa-libGLU-7.6-0.13.fc12.x86_64
xorg-x11-drv-ati-6.13.0-0.20.20091221git4b05c47ac.fc12.x86_64

I had to downgrade to previous packages:

libdrm-2.4.15-8.fc12.x86_64.rpm
mesa-dri-drivers-7.6-0.13.fc12.x86_64.rpm
mesa-libGL-7.6-0.13.fc12.x86_64.rpm
mesa-libGLU-7.6-0.13.fc12.x86_64.rpm
xorg-x11-drv-ati-6.13.0-0.11.20091119git437113124.fc12.x86_64.rpm

Logs from dmesg and the X server are attached.
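
To see which packages actually differ between the failing and the working set listed above, the two lists can be compared mechanically. The following is a minimal sketch, not something from the original report: `changed_packages` is a hypothetical helper name, and the temporary files are placeholders. It only assumes POSIX `sed`, `grep`, and `mktemp`.

```shell
#!/bin/sh
# Sketch: list the packages in one set that are absent from another.
# changed_packages is a hypothetical helper, not part of rpm or yum.
changed_packages() {
    # $1: file with the set under test, $2: file with the reference set.
    ref=$(mktemp) || return 1
    # Drop a trailing ".rpm" so file names and installed names compare equal.
    sed 's/\.rpm$//' "$2" > "$ref"
    # -F fixed strings, -x whole-line match, -v keep lines NOT in the reference.
    sed 's/\.rpm$//' "$1" | grep -Fvx -f "$ref"
    rm -f "$ref"
}

# The two package sets from the description above.
broken=$(mktemp)
working=$(mktemp)
cat > "$broken" <<'EOF'
libdrm-2.4.17-1.fc12.x86_64
mesa-dri-drivers-7.7-2.fc12.x86_64
mesa-libGL-7.6-0.13.fc12.x86_64
mesa-libGLU-7.6-0.13.fc12.x86_64
xorg-x11-drv-ati-6.13.0-0.20.20091221git4b05c47ac.fc12.x86_64
EOF
cat > "$working" <<'EOF'
libdrm-2.4.15-8.fc12.x86_64.rpm
mesa-dri-drivers-7.6-0.13.fc12.x86_64.rpm
mesa-libGL-7.6-0.13.fc12.x86_64.rpm
mesa-libGLU-7.6-0.13.fc12.x86_64.rpm
xorg-x11-drv-ati-6.13.0-0.11.20091119git437113124.fc12.x86_64.rpm
EOF

changed_packages "$broken" "$working"
rm -f "$broken" "$working"
```

For the two lists in the description this prints the libdrm, mesa-dri-drivers, and xorg-x11-drv-ati entries of the failing set; mesa-libGL and mesa-libGLU are identical in both lists.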

Comment 1 Milan Kerslager 2010-01-22 16:26:45 UTC
Created attachment 386184
dmesg log after the second restart

Comment 2 Milan Kerslager 2010-01-22 17:04:35 UTC
The PCI ID of my "Radeon HD 2400 XT in Acer Extensa 5620" is 0300: 1002:94c8

Comment 3 Milan Kerslager 2010-01-23 14:18:42 UTC
I tried a selective update:

libdrm-2.4.17-1.fc12.x86_64
libdrm-2.4.17-1.fc12.i686
xorg-x11-drv-ati-6.13.0-0.20.20091221git4b05c47ac.fc12.x86_64

After this, the system seems to be stable and runs the GUI without trouble.

Comment 4 Milan Kerslager 2010-01-23 14:21:19 UTC
Created attachment 386331
Xorg.log after update to: libdrm-2.4.17-1.fc12, xorg-x11-drv-ati-6.13.0-0.20.20091221git4b05c47ac.fc12.x86_64

Comment 5 Milan Kerslager 2010-01-23 14:56:37 UTC
After the update to mesa-7.7-2, the system seems to be stable too.

mesa-dri-drivers-7.7-2.fc12.x86_64
mesa-libGL-7.7-2.fc12.x86_64
mesa-libGLU-7.7-2.fc12.x86_64

It seems that the kernel update to kernel-2.6.31.12-174.2.3.fc12.x86_64 fixed the problem for me, so the bug could be closed.

I think there are tight dependencies between the X server driver, the DRM library, Mesa, and the kernel that are not declared as dependencies in the RPMs.
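
One way to check what the packages actually declare is to query their requirements with `rpm -q --requires` and look for libdrm entries. This is a hedged sketch only: `drm_requires` is a hypothetical helper name, the package names are taken from this report, and nothing here asserts what the query returns on any given system.

```shell
#!/bin/sh
# Sketch: show the libdrm-related requirements a package declares.
# drm_requires is a hypothetical helper; it filters "rpm -q --requires"
# style output (one requirement per line) read from standard input.
drm_requires() {
    grep -i 'drm'
}

# Only query when rpm is available; "|| true" keeps an empty result
# (no libdrm requirement declared) from being treated as an error.
if command -v rpm >/dev/null 2>&1; then
    for pkg in xorg-x11-drv-ati mesa-dri-drivers; do
        echo "== $pkg"
        rpm -q --requires "$pkg" | drm_requires || true
    done
fi
```

An unversioned library requirement in the output would be consistent with the suspicion above, since it would allow a mismatched libdrm to satisfy the dependency.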

Comment 6 Ilkka Tengvall 2010-01-25 15:42:32 UTC
Please re-open the bug!

I'm having exactly the same problem with similar hardware, and I can't get rid of it unless I disable DRI completely.

At first I had mesa-dri-drivers-experimental installed, but even after removing it I ran into this. I also disabled compiz (desktop effects), but it still crashed. This started to happen right after last week's updates.

I need to run now, but I'm happy to collect whatever logs you want about the issue, and also to debug with guidance. I want my machine back to stable :)

Comment 7 Ilkka Tengvall 2010-01-25 15:43:05 UTC
Created attachment 386654
my hw lspci -v

Comment 8 Ilkka Tengvall 2010-01-25 15:43:33 UTC
Created attachment 386656
some crashes from syslog

Comment 9 Ilkka Tengvall 2010-01-25 15:45:36 UTC
Created attachment 386657
my xorg, kernel and mesa-* versions

Comment 10 Milan Kerslager 2010-01-25 16:16:31 UTC
I experienced lockups again. Reopening.

Comment 11 Milan Kerslager 2010-01-26 15:55:37 UTC
Created attachment 386863
Kernel OOPS when libdrm was only updated to libdrm-2.4.17-1.fc12

The OOPS happened when libdrm was updated and these older, stable components were present in the system (i.e. these were not updated):

mesa-dri-drivers-7.6-0.13.fc12.x86_64
mesa-dri-drivers-7.6-0.13.fc12.i686
mesa-libGL-7.6-0.13.fc12.x86_64
mesa-libGL-7.6-0.13.fc12.i686
mesa-libGLU-7.6-0.13.fc12.x86_64
mesa-libGLU-7.6-0.13.fc12.i686
xorg-x11-drv-ati-6.13.0-0.11.20091119git437113124.fc12.x86_64

Comment 12 Milan Kerslager 2010-01-26 16:00:02 UTC
Created attachment 386864
Another kernel OOPS

I'm not sure which components were present in the system when this OOPS happened.

Comment 13 Milan Kerslager 2010-01-26 16:02:43 UTC
Created attachment 386865
My Xorg.log (without backtrace), just to show my config

Comment 14 Milan Kerslager 2010-01-26 17:59:36 UTC
I'm not sure whether this issue is related to xorg-x11-drv-ati or to libdrm.

Comment 15 Milan Kerslager 2010-01-27 09:29:23 UTC
I have a collection of non-broken packages (the versions just before the latest update,
i.e. more recent than in plain F12) at http://ftp.pslib.cz/pub/local/fedora/12/

I tried the Rawhide packages, but when X starts, the system freezes completely (SSH is not possible) without leaving any traces in the logs.

yum --enablerepo=rawhide update libdrm mesa-dri-drivers mesa-libGL mesa-libGLU xorg-x11-drv-ati

The current kernel from Rawhide does not boot for me either (KMS does not work, so the console stays at the native 25x80, and the system hangs after the two initial lines with OK). So I had to downgrade to the packages from the URL above.

Comment 16 Nick Lamb 2010-02-02 10:26:23 UTC
Similar or same problem here since the latest lockstep upgrade of libdrm, mesa, the ati driver, etc.: X crashes or the machine locks up altogether, which suggests a GPU lockup. This is on a laptop with a

Mobility Radeon HD 3400 Series (1002:95c4)

I am trying the package versions suggested by Milan above, but not using his packages, because some are missing the Fedora GnuPG signature; for those I used identically versioned packages from the install DVD.

Can any of the relevant developers reproduce this? Do they have any hints for what we should try in terms of debugging it? Anything?

Comment 17 Milan Kerslager 2010-02-02 13:41:15 UTC
Changing component to libdrm. I'm not entirely sure, so please check.

Comment 18 Jérôme Glisse 2010-02-02 14:36:03 UTC
We prefer to have GPU issues assigned to the DDX. An upgrade should fix the issue you are experiencing: a dependency was wrong or broken at some point during an API change, which led to a segfault. So please keep updating your system; at some point you should end up with a working set of packages. In the meantime, adding radeon.modeset=0 to the kernel command line should work around the issue.
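
Whether the workaround is active on a running system can be checked from the kernel command line. This is a minimal sketch: `radeon_kms_disabled` is a hypothetical helper name, and it assumes the parameter was added to the kernel line of the boot loader configuration (on a default Fedora 12 install that would be /boot/grub/grub.conf, which is an assumption about the setup, not something stated in this report).

```shell
#!/bin/sh
# Sketch: report whether radeon KMS was disabled on the kernel command line.
# radeon_kms_disabled is a hypothetical helper; $1 is a command-line string.
radeon_kms_disabled() {
    # Pad with spaces so the parameter only matches as a whole word.
    case " $1 " in
        *" radeon.modeset=0 "*) return 0 ;;
        *) return 1 ;;
    esac
}

# /proc/cmdline holds the parameters the running kernel was booted with.
if [ -r /proc/cmdline ]; then
    if radeon_kms_disabled "$(cat /proc/cmdline)"; then
        echo "radeon.modeset=0 is active"
    else
        echo "radeon.modeset=0 is not set"
    fi
fi
```

The parameter is read at boot, so a change to the boot loader entry only takes effect after a reboot.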

Comment 19 Milan Kerslager 2010-02-02 18:46:48 UTC
It seems to help. I can try development packages if somebody wants me to. Thank you for the workaround.

Comment 20 Matěj Cepl 2010-02-03 17:51:43 UTC
*** Bug 546264 has been marked as a duplicate of this bug. ***

Comment 21 Ilkka Tengvall 2010-02-08 07:23:16 UTC
Created attachment 389468
some latest crash logs

I tried it with the latest updated xorg drivers and immediately experienced several crashes of the same kind.

$ rpm -qa '*mesa*'
mesa-libGL-7.7-3.fc12.x86_64
mesa-libGL-devel-7.7-3.fc12.x86_64
mesa-libGL-7.7-3.fc12.i686
mesa-libGLU-7.7-3.fc12.x86_64
mesa-libGLU-devel-7.7-3.fc12.x86_64
mesa-dri-drivers-7.7-3.fc12.i686
mesa-libGLU-7.7-3.fc12.i686
mesa-dri-drivers-7.7-3.fc12.x86_64
mesa-dri-drivers-experimental-7.7-3.fc12.x86_64
$ rpm -qa '*xorg*ati*'
xorg-x11-drv-ati-6.13.0-0.20.20091221git4b05c47ac.fc12.x86_64

@Jerome: Do you mean radeon.modeset=0 would help with problems while X is running, or only with boot-time problems? I haven't tried it yet, since I thought it only affects boot-time KMS crashes. Any hint on which version fixes this, so I know when to try again?

The logs here are from the time when I enabled DRI again by reinstalling mesa-dri-drivers-experimental and removing my xorg.conf that disables DRI. The last two logs are from somewhere along the way while I was trying to disable DRI again, but the system crashed in the meantime.

Here is my xorg.conf:

$ cat /etc/X11/xorg.conf
Section  "Module"
	Disable "dri"
	Disable "dri2"
EndSection

Section  "ServerFlags"
	Option "DRI" "false"
	Option "DRI2" "false"
EndSection

Comment 22 Nick Lamb 2010-02-15 10:12:37 UTC
Still crashes here with all the updates.

Jerome?

Comment 23 Nick Lamb 2010-02-16 18:31:36 UTC
I pushed the "report" button on the abrt dialog I got after the most recent crash, but it occurs to me that I don't know exactly what it does with those reports...

radeon.modeset=0 does seem to work around this issue for me; at least so far I have had zero crashes with that kernel parameter. However, my understanding is that KMS is now the default and the UMS code will slowly die out, which means being stuck without KMS is bad news and wants fixing.

Comment 24 Jérôme Glisse 2010-02-22 13:10:29 UTC
Something is wrong with your packages; somehow one is not properly updated, as I am 100% sure this is related to the ABI breakage we did in libdrm. I don't know how you can solve this besides forcing a reinstall of all the latest libdrm, mesa, xorg-x11-drv-ati, and xorg-x11-server-Xorg packages.
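
A forced reinstall of the whole stack could look roughly like the sketch below. It is an illustration under assumptions, not a command given in this thread: `reinstall_stack` and the `RUN` switch are hypothetical names, and the Mesa subpackage names are the ones that appear in this report, since there is no package called "mesa". By default the script only prints the command; it runs `yum reinstall` only when `RUN=1` is set.

```shell
#!/bin/sh
# Sketch: reinstall the graphics stack packages named in this report.
# STACK, reinstall_stack and RUN are hypothetical names for illustration.
STACK="libdrm mesa-dri-drivers mesa-libGL mesa-libGLU xorg-x11-drv-ati xorg-x11-server-Xorg"

reinstall_stack() {
    if [ "${RUN:-0}" = 1 ]; then
        # $STACK is left unquoted on purpose so it splits into package names.
        yum reinstall $STACK
    else
        # Dry run: show the command instead of executing it.
        echo "yum reinstall $STACK"
    fi
}

reinstall_stack
```

Running it as `RUN=1 sh script.sh` (as root) would perform the reinstall; on a multilib system the .i686 variants may need the same treatment.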

Comment 25 Nick Lamb 2010-02-22 15:20:43 UTC
Here's a list of packages which are installed and verified (including checksums) by RPM. The "mesa" package doesn't exist, so I've listed Mesa related packages that seemed relevant.

libdrm-2.4.17-1.fc12.x86_64
libdrm-2.4.17-1.fc12.i686
mesa-libGL-7.7-3.fc12.x86_64
mesa-libGL-7.7-3.fc12.i686
mesa-dri-drivers-7.7-3.fc12.x86_64
mesa-dri-drivers-7.7-3.fc12.i686
mesa-dri-drivers-experimental-7.7-3.fc12.x86_64
xorg-x11-drv-ati-6.13.0-0.20.20091221git4b05c47ac.fc12.x86_64
xorg-x11-server-Xorg-1.7.4-6.fc12.x86_64

Without a list from you of what counts as "properly updated" I have no way to refute your claim that this isn't "properly updated". All I know is that Fedora's supplied package management software claims there are currently no updates for this set of packages, and it is now three weeks since you claimed a "broken package" was responsible.

So, what packages _should_ I be running, in order for me to either have working KMS or be able to have bug reports treated as something other than user error?
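
The verification mentioned above can be repeated with `rpm -V`, which prints one line per file that fails verification and nothing for files that match. The sketch below narrows that output to digest mismatches, which `rpm -V` marks with a "5" in the third column of the attribute string. `digest_mismatches` is a hypothetical helper name, and the package list is taken from this comment.

```shell
#!/bin/sh
# Sketch: keep only the "rpm -V" lines that report a changed file digest.
# digest_mismatches is a hypothetical helper reading rpm -V output on stdin.
digest_mismatches() {
    # The third character of the attribute string is "5" on a digest mismatch.
    grep '^..5'
}

# Only run the query when rpm is available; "|| true" keeps a clean result
# (no mismatching files) from being reported as a failure.
if command -v rpm >/dev/null 2>&1; then
    rpm -V libdrm mesa-libGL mesa-dri-drivers xorg-x11-drv-ati \
        xorg-x11-server-Xorg | digest_mismatches || true
fi
```

No output means no installed file from these packages differs from the digest recorded in the RPM database.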

Comment 26 Nick Lamb 2010-02-22 16:02:20 UTC
Just tried again (with exactly the packages listed above), to be absolutely, no question, 100% sure.

Specifically the symptom on this particular occasion was a freeze while using Firefox, then the screen briefly goes blank, then an I-beam cursor appears on the black background and nothing further happens. This is after maybe an hour's use or less.

Comment 27 Nick Lamb 2010-03-13 12:07:08 UTC
Still no response from the maintainer.

Jerome, what I need from you is really, really simple. What should work?

Name a set of packages against which you would actually accept a bug report.

Or, if you can't, then we've got to the root of the problem - the ATI driver is now unmaintainable, and should be removed from Fedora until someone develops one that is maintainable, even perhaps at the cost of a great deal of features.

Comment 28 Nick Lamb 2010-03-17 09:46:22 UTC
Same symptoms persist to this day, current packages on the affected system are:

libdrm-2.4.17-1.fc12.x86_64
libdrm-2.4.17-1.fc12.i686
mesa-libGL-7.7-4.fc12.x86_64
mesa-libGL-7.7-4.fc12.i686
mesa-dri-drivers-7.7-4.fc12.x86_64
mesa-dri-drivers-7.7-4.fc12.i686
mesa-dri-drivers-experimental-7.7-4.fc12.x86_64
xorg-x11-drv-ati-6.13.0-0.21.20100219gite68d3a389.fc12.x86_64
xorg-x11-server-Xorg-1.7.5.901-4.fc12.x86_64

As we can see, libdrm is the only thing that hasn't changed since the last time I made such a list. Checking with Koji, I see that this package has indeed not changed (though there is a new candidate from a month ago that wasn't promoted to updates).

So, where's the "not properly updated" package, Jerome?

Let's go back to assuming there is a problem (is that so unlikely, in this huge morass of poorly documented code?) and try to debug it, hmm?

Comment 29 Nick Lamb 2010-06-15 09:04:37 UTC
Bug present to this day.

Still no response from the assignee, assume he is non-contactable.

QA please re-assign to someone who can actually respond meaningfully to bug reports.

Comment 30 Jérôme Glisse 2010-06-15 09:24:47 UTC
You may be suffering from the usual GPU lockup; there is very little we can do about it until we are able to reproduce it locally. I am sorry you are experiencing such things, but there is no easy way for us to fix them. Maybe try F13; it might be better (sometimes we fix a lockup without even knowing it, simply by changing something).

Comment 31 Ilkka Tengvall 2010-06-15 09:48:49 UTC
Hi,

I had the same error on F12, as I reported earlier in this thread. Now, after upgrading to F13 and having the Radeon driver take care of the display, the bug doesn't appear any more. I'm happily running gnome-shell with 3D enabled \o/

It only crashes (kernel) when coming back from suspend to memory. But that's worth a separate bug report, once I get a console connected to read the crash info.

Comment 32 Bug Zapper 2010-11-04 00:13:08 UTC
This message is a reminder that Fedora 12 is nearing its end of life.
Approximately 30 (thirty) days from now Fedora will stop maintaining
and issuing updates for Fedora 12.  It is Fedora's policy to close all
bug reports from releases that are no longer maintained.  At that time
this bug will be closed as WONTFIX if it remains open with a Fedora 
'version' of '12'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version' 
to a later Fedora version prior to Fedora 12's end of life.

Bug Reporter: Thank you for reporting this issue and we are sorry that 
we may not be able to fix it before Fedora 12 is end of life.  If you 
would still like to see this bug fixed and are able to reproduce it 
against a later version of Fedora please change the 'version' of this 
bug to the applicable version.  If you are unable to change the version, 
please add a comment here and someone will do it for you.

Although we aim to fix as many bugs as possible during every release's 
lifetime, sometimes those efforts are overtaken by events.  Often a 
more recent Fedora release includes newer upstream software that fixes 
bugs or makes them obsolete.

The process we are following is described here: 
http://fedoraproject.org/wiki/BugZappers/HouseKeeping