Bug 588845 - UMS:RV515:Radeon X1300 Xorg crashes - GPF in radeon_read_ring_rptr
Summary: UMS:RV515:Radeon X1300 Xorg crashes - GPF in radeon_read_ring_rptr
Keywords:
Status: CLOSED WONTFIX
Alias: None
Product: Fedora
Classification: Fedora
Component: xorg-x11-drv-ati
Version: 13
Hardware: All
OS: Linux
low
medium
Target Milestone: ---
Assignee: Jérôme Glisse
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2010-05-04 17:38 UTC by Tomáš Trnka
Modified: 2011-06-27 16:05 UTC (History)

Fixed In Version:
Clone Of:
Environment:
Last Closed: 2011-06-27 16:05:27 UTC
Type: ---
Embargoed:


Attachments
xorg.conf (342 bytes, text/plain)
2010-05-04 17:43 UTC, Tomáš Trnka
Xorg log (79.95 KB, text/plain)
2010-05-04 17:45 UTC, Tomáš Trnka
Xorg log with bad kernel when monitor goes off (39.43 KB, text/plain)
2010-05-06 14:49 UTC, adriano
Xorg log with working kernel (86.30 KB, text/plain)
2010-05-06 14:49 UTC, adriano

Description Tomáš Trnka 2010-05-04 17:38:15 UTC
Description of problem:
After updating Xorg to 1.8.0-8 (and also -12), X crashes soon after logging in to KDE with compositing enabled (it seems to happen when a popup, tooltip, or combobox dropdown is being closed). 1.8.0-6 works fine, and downgrading to it makes the bug disappear.

Version-Release number of selected component (if applicable):
xorg-x11-server-Xorg-1.8.0-8.fc13.x86_64
mesa-libGL-7.8.1-2.fc13.x86_64
kernel-2.6.33.2-57.fc13.x86_64
kernel-2.6.33.3-72.fc13.x86_64

How reproducible:
As soon as a popup is triggered, X goes down in flames...

Steps to Reproduce:
1. Position your mouse over a taskbar button to make a window thumbnail appear
OR
2. Type anything into Konqueror's address bar (history dropdown appears)
  
Actual results:
general protection fault: 0000 [#1] SMP 
last sysfs file: /sys/devices/virtual/dmi/id/bios_vendor
CPU 0 
Pid: 1625, comm: X Not tainted 2.6.33.2-57.fc13.x86_64 #1 AM2NF3-VSTA/To Be Filled By O.E.M.
RIP: 0010:[<ffffffffa009a12b>]  [<ffffffffa009a12b>] radeon_read_ring_rptr+0x1f/0x34 [radeon]
RSP: 0018:ffff88007b419a90  EFLAGS: 00013202
RAX: ffff88007981fb28 RBX: ffff88007a3418d8 RCX: ffffc90000d80000
RDX: 0000000000000028 RSI: 6b6b6b6b6b6b6b6b RDI: ffff88007a3418d8
RBP: ffff88007b419a90 R08: ffff88007a3b20b0 R09: 0000000000003246
R10: ffff88007a3b20c8 R11: 0000000000000000 R12: 0000000000000010
R13: ffff88007a3b2090 R14: ffff88007a341a58 R15: ffff88007a3b2280
FS:  00007fc321ae3840(0000) GS:ffff880004a00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 0000003cef4cc400 CR3: 000000004e215000 CR4: 00000000000006f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process X (pid: 1625, threadinfo ffff88007b418000, task ffff8800798c4920)
Stack:
ffff88007b419aa0 ffffffffa009a156 ffff88007b419ab8 ffffffffa009a881
 ffff88007a3418d8 ffff88007b419ad8 ffffffffa009b857 ffff88007a3418d8
 ffff88007a3b2090 ffff88007b419af8 ffffffffa009d387 ffff88007a341a58
Call Trace:
[<ffffffffa009a156>] radeon_get_ring_head+0x16/0x41 [radeon]
[<ffffffffa009a881>] radeon_commit_ring+0x4d/0x9c [radeon]
[<ffffffffa009b857>] radeon_do_cp_idle+0x145/0x152 [radeon]
[<ffffffffa009d387>] radeon_do_release+0x9a/0x1ad [radeon]
[<ffffffffa00a4308>] radeon_driver_lastclose+0x52/0x5b [radeon]
[<ffffffffa002df38>] drm_lastclose+0x4f/0x2a0 [drm]
[<ffffffffa002ed83>] drm_release+0x5f1/0x63e [drm]
[<ffffffff811211fe>] __fput+0x12a/0x1df
[<ffffffff811212cd>] fput+0x1a/0x1c
[<ffffffff8111e0c3>] filp_close+0x68/0x72
[<ffffffff81053581>] put_files_struct+0x6a/0xcb
[<ffffffff8105362d>] exit_files+0x4b/0x54
[<ffffffff81054f0c>] do_exit+0x269/0x7a5
[<ffffffff810554cc>] do_group_exit+0x84/0xb0
[<ffffffff8106439a>] get_signal_to_deliver+0x3b0/0x3cf
[<ffffffff81009015>] do_signal+0x72/0x6bc
[<ffffffff81477a8c>] ? printk+0x41/0x45
[<ffffffff81030e55>] ? bad_area+0x47/0x4e
[<ffffffff81009687>] do_notify_resume+0x28/0x86
[<ffffffff8147b09b>] retint_signal+0x4d/0x92
Code: 89 c7 e8 54 87 f9 ff c9 c3 90 90 90 55 48 89 e5 0f 1f 44 00 00 f6 87 0e 04 00 00 08 48 8b 87 10 01 00 00 74 0a 89 f6 48 03 70 18 <8b> 06 eb 0f c1 ee 02 89 f6 48 c1 e6 02 48 03 70 18 8b 06 c9 c3 
RIP  [<ffffffffa009a12b>] radeon_read_ring_rptr+0x1f/0x34 [radeon]
RSP <ffff88007b419a90>

Expected results:
No GPF, X still alive

Comment 1 Tomáš Trnka 2010-05-04 17:43:24 UTC
Created attachment 411348 [details]
xorg.conf

Comment 2 Tomáš Trnka 2010-05-04 17:45:36 UTC
Created attachment 411351 [details]
Xorg log

This is from the working 1.8.0-6, but the log from the crashing -8 wasn't any different.

Comment 3 Tomáš Trnka 2010-05-04 18:35:07 UTC
The corresponding Xorg backtrace:

[New Thread 13530]
Core was generated by `/usr/bin/X -nr -nolisten tcp :0 vt8 -auth /var/run/kdm/A:0-o2m3lb'.
Program terminated with signal 11, Segmentation fault.
#0  0x0000000000445caa in privateExists (privates=0x290, 
    key=<value optimized out>) at privates.c:79
79          return *key && *privates &&

Thread 1 (Thread 13530):
#0  0x0000000000445caa in privateExists (privates=0x290, 
    key=<value optimized out>) at privates.c:79
#1  dixLookupPrivate (privates=0x290, key=<value optimized out>)
    at privates.c:162
#2  0x00007f07100c099b in __glXDRIdrawableDestroy (drawable=0x7c692c0)
    at glxdri.c:233
#3  0x00007f07100b7e6f in DrawableGone (glxPriv=0x7c692c0, 
    xid=<value optimized out>) at glxext.c:174
#4  0x0000000000449570 in FreeResource (id=25168934, skipDeleteFuncType=0)
    at resource.c:560
#5  0x00007f07100b4f19 in DoDestroyDrawable (cl=<value optimized out>, 
    glxdrawable=25168934, type=<value optimized out>) at glxcmds.c:1274
#6  0x00007f07100b7c80 in __glXDispatch (client=0x1eeea50) at glxext.c:601
#7  0x000000000042c32c in Dispatch () at dispatch.c:439
#8  0x00000000004219ca in main (argc=<value optimized out>, 
    argv=0x7fffc0f2a218, envp=<value optimized out>) at main.c:286

Judging from this kernel line (that I apparently failed to include in the Actual results section above), the culprit is privates = 0x290:

X[1625]: segfault at 290 ip 0000000000445caa sp 00007fff93fc1400 error 4 in Xorg[400000+1c2000]

Comment 4 adriano 2010-05-06 14:47:44 UTC
I previously reported bug 586453 (https://bugzilla.redhat.com/show_bug.cgi?id=586453); I think this one looks like my problem:

With the latest kernel and all other updates (since 18 April), graphics are unavailable. I can boot only with nomodeset in the boot parameters, but then I can't use a graphical display. With the previous kernel (kernel-2.6.33.1-24.fc13.x86_64) I don't have problems.
The monitor goes off just after the GRUB menu (I think when udev starts?).
Today (May 6) I updated the system, but nothing has changed. The bug is still present.

Comment 5 adriano 2010-05-06 14:49:05 UTC
Created attachment 412077 [details]
Xorg log with bad kernel when monitor goes off

Comment 6 adriano 2010-05-06 14:49:45 UTC
Created attachment 412078 [details]
Xorg log with working kernel

Comment 7 adriano 2010-05-06 14:50:30 UTC
00:02.0 VGA compatible controller: Intel Corporation 82Q963/Q965 Integrated Graphics Controller (rev 02)
00:02.1 Display controller: Intel Corporation 82Q963/Q965 Integrated Graphics Controller (rev 02)

Comment 8 Tomáš Trnka 2010-05-06 15:07:13 UTC
(In reply to comment #4)
> Previously, I've reported bug in 586453
> https://bugzilla.redhat.com/show_bug.cgi?id=586453, I think this one looks like
> my problem:

What makes you think this bug is related? Almost everything here is different... This is Radeon UMS that works but crashes after some time; you have non-working Intel KMS... This was caused/exposed by an Xorg upgrade, yours by a kernel upgrade...
IMHO your problem is completely unrelated to this one and you should file a separate bug report...

Comment 9 adriano 2010-05-06 17:24:13 UTC
Changed again to 587625!
Sorry

Comment 10 Tomáš Trnka 2010-05-17 19:54:03 UTC
I've investigated this a bit more and here are the results:

1.8.0-6 works flawlessly, 1.8.0-7 crashes (any newer version, too). This is 100% reproducible (tried upgrading/downgrading Xorg several times to make sure this is not caused by anything else)

I'm highly unsure this is a radeon bug, since the Xorg backtrace suggests the crash happens in Xorg's dix/privates.c. (OTOH, the kernel backtrace shows radeon_read_ring_rptr, with exactly the same IP. How is that possible?)

I've upgraded xorg-x11-drv-ati to the latest build (6.13.0-2), no difference.

Comparing Xorg logs between a working 1.8.0-6 session and the crashed ones (-7, -12) did not show any differences (well, except for the package version and AGP ring/texturemap/buffers addresses).

Comment 11 Vedran Miletić 2010-05-24 19:56:23 UTC
Improving summary.

---

Fedora Bugzappers volunteer triage team
https://fedoraproject.org/wiki/BugZappers

[This triage is part of collective effort done by students of University of
Rijeka Department of Informatics.]

Comment 12 Hugo Mildenberger 2010-06-18 12:25:28 UTC
I encountered a very similar problem using an ATI Radeon IGP 9100 (R200) and xorg-server version 1.8.1-901 on Gentoo. The very same actions triggered the crash, while the kernel was unaffected. xorg-server-1.8.0 is the last working release. With xorg-server 1.8.1-901, the stack was:

 #9  <signal handler called>
 #10 0x517d8210 in DrawableGone (glxPriv=0x12b06288, xid=20971921)
     at glxext.c:133
 #11 0x1221cfaf in FreeResource (id=20971921, skipDeleteFuncType=0)
     at resource.c:560
 #12 0x517d5112 in DoDestroyDrawable (cl=<value optimized out>, 
     glxdrawable=20971921, type=1) at glxcmds.c:1275
 #13 0x517d80ab in __glXDispatch (client=0x126a10b0) at glxext.c:601
 #14 0x12200ced in Dispatch () at dispatch.c:439
 #15 0x121f64b5 in main (argc=10, argv=0x5ea4adc4, envp= Cannot access 
     memory at address 0x3e
 
This appears to have a similar fingerprint. In my environment, DrawableGone failed due to an invalid value for glxPriv->pDraw:

 133	    if (glxPriv->drawId != glxPriv->pDraw->id) {
 134		if (xid == glxPriv->drawId)
 135		    FreeResourceByType(glxPriv->pDraw->id, __glXDrawableRes,
                    TRUE);

 print glxPriv->drawId
 $1 = 20971921
 print glxPriv->pDraw
 $2 = (DrawablePtr) 0x3cbaf008
 print glxPriv->pDraw->id
 Cannot access memory at address 0x3cbaf00c


Here is also some info on the resource id and type gained from the FreeResource frame:
  550	#ifdef XSERVER_DTRACE
  551			XSERVER_RESOURCE_FREE(res->id, res->type,
  552				      res->value, TypeNameString(res->type));
  553	#endif		    
  554			*prev = res->next;

  print *res
  $1 = {next = 0x1280d760, id = 20971921, type = 54, value = 0x12b06288}


Bisecting xorg-server between 1.8.0 and 1.8.1-901 finally revealed that

  0460a76b9ae25fe26f683f0cbff1e4157287cf56 is the first bad commit
  commit 0460a76b9ae25fe26f683f0cbff1e4157287cf56
  Author: Kristian Høgsberg <krh>
  Date:   Fri Apr 16 05:55:33 2010 -0400

  glx: Let the resource system destroy pixmaps

  GLX pbuffers are implemented using a pixmap allocated by the server.
  With the change to DRI2 to track DRI2 drawables as resources, we need to make
  sure that every drawable we create a DRI2 drawable for has an XID.  By
  using the XID of the pbuffer, the resource system will automatically
  reclaim the hidden pixmap and the DRI2 drawable when the pbuffer is
  destroyed or the client exits.

  Signed-off-by: Kristian Høgsberg <krh>
  Signed-off-by: Keith Packard <keithp>
  (cherry picked from commit 22da7aa9d743deee198aaf6df5d370a446db9763)

  :040000 040000 47f59391028a3c792c3ea22a0eb65a65c9f414c4
   ac43336bcc8ee3545f1b673affc5d2121f9054c7 M      glx

The problem disappeared after reverting that particular commit.

Comment 13 Hugo Mildenberger 2010-06-21 08:01:28 UTC
https://bugs.freedesktop.org/show_bug.cgi?id=28181 is related.

Comment 14 Bug Zapper 2011-06-02 14:30:05 UTC
This message is a reminder that Fedora 13 is nearing its end of life.
Approximately 30 (thirty) days from now Fedora will stop maintaining
and issuing updates for Fedora 13.  It is Fedora's policy to close all
bug reports from releases that are no longer maintained.  At that time
this bug will be closed as WONTFIX if it remains open with a Fedora 
'version' of '13'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version' 
to a later Fedora version prior to Fedora 13's end of life.

Bug Reporter: Thank you for reporting this issue and we are sorry that 
we may not be able to fix it before Fedora 13 is end of life.  If you 
would still like to see this bug fixed and are able to reproduce it 
against a later version of Fedora please change the 'version' of this 
bug to the applicable version.  If you are unable to change the version, 
please add a comment here and someone will do it for you.

Although we aim to fix as many bugs as possible during every release's 
lifetime, sometimes those efforts are overtaken by events.  Often a 
more recent Fedora release includes newer upstream software that fixes 
bugs or makes them obsolete.

The process we are following is described here: 
http://fedoraproject.org/wiki/BugZappers/HouseKeeping

Comment 15 Bug Zapper 2011-06-27 16:05:27 UTC
Fedora 13 changed to end-of-life (EOL) status on 2011-06-25. Fedora 13 is 
no longer maintained, which means that it will not receive any further 
security or bug fix updates. As a result we are closing this bug.

If you can reproduce this bug against a currently maintained version of 
Fedora please feel free to reopen this bug against that version.

Thank you for reporting this bug and we are sorry it could not be fixed.

