Bug 726857 - [REDWOOD] Radeon GPU lockup
Summary: [REDWOOD] Radeon GPU lockup
Keywords:
Status: CLOSED WONTFIX
Alias: None
Product: Fedora
Classification: Fedora
Component: xorg-x11-drv-ati
Version: 15
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: ---
Assignee: Jérôme Glisse
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard: [cat:lockup]
Duplicates: 704905 730872 733047 733916
Depends On:
Blocks:
Reported: 2011-07-30 00:21 UTC by Rik van Riel
Modified: 2012-08-07 16:31 UTC (History)
CC List: 14 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2012-08-07 16:31:15 UTC
Type: ---
Embargoed:


Attachments
Xorg.0.log of the session that crashed (67.57 KB, text/plain)
2011-07-30 00:21 UTC, Rik van Riel

Description Rik van Riel 2011-07-30 00:21:46 UTC
Created attachment 515949
Xorg.0.log of the session that crashed

Description of problem:

Since I upgraded my kernel from 3.0-rc3 to 3.0.0-2, Xorg often gets stuck in an infinite loop. I run a Radeon 5570 card with three monitors attached.

Dmesg:

[19076.942946] radeon 0000:02:00.0: GPU lockup CP stall for more than 10000msec
[19076.942957] ------------[ cut here ]------------
[19076.942988] WARNING: at drivers/gpu/drm/radeon/radeon_fence.c:267 radeon_fence_wait+0x296/0x33d [radeon]()
[19076.942993] Hardware name: Precision WorkStation T3500  
[19076.942998] GPU lockup (waiting for 0x00784F77 last fence id 0x00784F6C)
[19076.943000] Modules linked in: bnep bluetooth rfkill fuse tun ip6table_filter ip6_tables ebtable_nat ebtables ipt_MASQUERADE iptable_nat nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_state nf_conntrack xt_CHECKSUM iptable_mangle netconsole configfs nfsd lockd nfs_acl auth_rpcgss sunrpc bridge stp llc max6650 coretemp snd_hda_codec_hdmi snd_hda_codec_analog snd_hda_intel snd_hda_codec snd_usb_audio snd_ens1371 gameport snd_ac97_codec snd_hwdep ac97_bus snd_usbmidi_lib virtio_net snd_seq snd_pcm snd_rawmidi snd_timer ppdev iTCO_wdt snd_seq_device pl2303 snd kvm_intel iTCO_vendor_support i7core_edac dell_wmi tg3 serio_raw sparse_keymap i2c_i801 edac_core snd_page_alloc parport_pc parport kvm dcdbas soundcore microcode wmi radeon ttm drm_kms_helper drm i2c_algo_bit i2c_core [last unloaded: scsi_wait_scan]
[19076.943040] Pid: 24386, comm: Xorg Tainted: G        W   3.0.0-2.fc16.x86_64 #1
[19076.943042] Call Trace:
[19076.943051]  [<ffffffff81054c9e>] warn_slowpath_common+0x83/0x9b
[19076.943054]  [<ffffffff81054d59>] warn_slowpath_fmt+0x46/0x48
[19076.943074]  [<ffffffffa00d2341>] ? evergreen_gpu_is_lockup+0xba/0xc2 [radeon]
[19076.943090]  [<ffffffffa0098dad>] radeon_fence_wait+0x296/0x33d [radeon]
[19076.943100]  [<ffffffff8107040e>] ? remove_wait_queue+0x3a/0x3a
[19076.943121]  [<ffffffffa00993a1>] radeon_sync_obj_wait+0x11/0x13 [radeon]
[19076.943141]  [<ffffffffa0060642>] ttm_bo_wait+0xbf/0x17a [ttm]
[19076.943150]  [<ffffffffa006106a>] ? ttm_bo_list_ref_sub+0x29/0x2b [ttm]
[19076.943169]  [<ffffffffa00a9c4f>] radeon_bo_wait+0x7b/0x9f [radeon]
[19076.943189]  [<ffffffffa00aa1b1>] radeon_gem_wait_idle_ioctl+0x3d/0x70 [radeon]
[19076.943198]  [<ffffffffa00157ff>] drm_ioctl+0x29e/0x37b [drm]
[19076.943217]  [<ffffffffa00aa174>] ? radeon_gem_busy_ioctl+0x86/0x86 [radeon]
[19076.943224]  [<ffffffff811dadb0>] ? inode_has_perm+0x32/0x34
[19076.943228]  [<ffffffff811dae59>] ? file_has_perm+0xa7/0xc9
[19076.943233]  [<ffffffff81134162>] do_vfs_ioctl+0x460/0x4a1
[19076.943238]  [<ffffffff810a0ec9>] ? audit_syscall_exit+0x12d/0x148
[19076.943242]  [<ffffffff811341f9>] sys_ioctl+0x56/0x79
[19076.943249]  [<ffffffff814a4bb5>] ? int_check_syscall_exit_work+0x34/0x3d
[19076.943253]  [<ffffffff814a4902>] system_call_fastpath+0x16/0x1b
[19076.943256] ---[ end trace 8c7dd67d01e6d0d4 ]---
[19076.944358] radeon 0000:02:00.0: GPU softreset 
[19076.944367] radeon 0000:02:00.0:   GRBM_STATUS=0xA0003828
[19076.944376] radeon 0000:02:00.0:   GRBM_STATUS_SE0=0x00000007
[19076.944383] radeon 0000:02:00.0:   GRBM_STATUS_SE1=0x00000007
[19076.944392] radeon 0000:02:00.0:   SRBM_STATUS=0x200000C0
[19076.944415] radeon 0000:02:00.0:   GRBM_SOFT_RESET=0x00007F6B
[19076.944526] radeon 0000:02:00.0:   GRBM_STATUS=0x00003828
[19076.944530] radeon 0000:02:00.0:   GRBM_STATUS_SE0=0x00000007
[19076.944534] radeon 0000:02:00.0:   GRBM_STATUS_SE1=0x00000007
[19076.944538] radeon 0000:02:00.0:   SRBM_STATUS=0x200000C0
[19076.945547] radeon 0000:02:00.0: GPU reset succeed
[19077.000498] radeon 0000:02:00.0: WB enabled
[19077.017088] [drm] ring test succeeded in 0 usecs
[19077.017101] [drm] ib test succeeded in 1 usecs
[19077.017463] [drm] force priority to high

Xorg.0.log is attached

Version-Release number of selected component (if applicable):

kernel-3.0.0-2.fc16.x86_64
xorg-x11-drv-ati-6.14.1-2.20110525gitfe5c42f51.fc15.x86_64
mesa-libGL-7.11-0.16.20110709.0.fc15.x86_64
mesa-dri-drivers-7.11-0.16.20110709.0.fc15.x86_64

Steps to Reproduce:
1. run the above combination of software with gnome3 on a Radeon 5570 with 3 monitors attached
2. watch X get stuck after a while (not sure what triggers it)
3. wait minutes without X becoming responsive again
4. kill X over ssh
5. file this BZ
  
Expected results:

Xorg continues to work.  It seems to recover from some GPU stalls just fine (like the one above?), while getting totally stuck at other times.

Additional info:

I just run a fairly standard gnome3 setup, with 3 monitors.  Not sure what triggers the crash...
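The fence IDs and the two register dumps in the dmesg excerpt above can be read mechanically. What follows is a minimal Python sketch, not part of the original report. It assumes the fence IDs are increasing sequence numbers and that the first set of *_STATUS values was logged before the soft reset and the second set after it. The names fence_gap and changed_bits are made up for illustration. The sketch reports only bit positions, because the meaning of each bit is defined in the radeon driver headers, which are not quoted here.

```python
import re

# Lines copied from the dmesg excerpt in this report.
DMESG = """\
[19076.942998] GPU lockup (waiting for 0x00784F77 last fence id 0x00784F6C)
[19076.944358] radeon 0000:02:00.0: GPU softreset
[19076.944367] radeon 0000:02:00.0:   GRBM_STATUS=0xA0003828
[19076.944376] radeon 0000:02:00.0:   GRBM_STATUS_SE0=0x00000007
[19076.944383] radeon 0000:02:00.0:   GRBM_STATUS_SE1=0x00000007
[19076.944392] radeon 0000:02:00.0:   SRBM_STATUS=0x200000C0
[19076.944415] radeon 0000:02:00.0:   GRBM_SOFT_RESET=0x00007F6B
[19076.944526] radeon 0000:02:00.0:   GRBM_STATUS=0x00003828
[19076.944530] radeon 0000:02:00.0:   GRBM_STATUS_SE0=0x00000007
[19076.944534] radeon 0000:02:00.0:   GRBM_STATUS_SE1=0x00000007
[19076.944538] radeon 0000:02:00.0:   SRBM_STATUS=0x200000C0
"""

FENCE_RE = re.compile(r"waiting for (0x[0-9A-Fa-f]+) last fence id (0x[0-9A-Fa-f]+)")
STATUS_RE = re.compile(r"(\w+_STATUS\w*)=(0x[0-9A-Fa-f]+)")


def fence_gap(log):
    """Return (fence waited for) - (last fence id), or None if no lockup line is found."""
    match = FENCE_RE.search(log)
    if match is None:
        return None
    waiting, last = (int(value, 16) for value in match.groups())
    return waiting - last


def changed_bits(log):
    """Map each *_STATUS register to the bit positions that differ between its first two dumps."""
    dumps = {}
    for name, value in STATUS_RE.findall(log):
        dumps.setdefault(name, []).append(int(value, 16))
    result = {}
    for name, values in dumps.items():
        if len(values) >= 2:
            diff = values[0] ^ values[1]
            result[name] = [bit for bit in range(32) if (diff >> bit) & 1]
    return result


print(fence_gap(DMESG))                     # 11
print(changed_bits(DMESG)["GRBM_STATUS"])   # [29, 31]
```

For this log, the fence being waited for is 11 IDs ahead of the last one recorded. Only GRBM_STATUS differs across the reset, in bits 29 and 31; the other status registers read the same in both dumps.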

Comment 1 Jeff Raber 2011-08-09 03:41:06 UTC
See: https://bugs.freedesktop.org/show_bug.cgi?id=39572

Same issue?



-- 
Fedora Bugzappers volunteer triage team
https://fedoraproject.org/wiki/BugZappers

Comment 2 Dave Jones 2011-08-30 20:57:28 UTC
*** Bug 734201 has been marked as a duplicate of this bug. ***

Comment 3 Dave Jones 2011-08-30 20:57:34 UTC
*** Bug 733916 has been marked as a duplicate of this bug. ***

Comment 4 Dave Jones 2011-08-30 20:57:41 UTC
*** Bug 733047 has been marked as a duplicate of this bug. ***

Comment 5 Dave Jones 2011-08-30 20:57:47 UTC
*** Bug 730872 has been marked as a duplicate of this bug. ***

Comment 6 Dave Jones 2011-08-30 20:59:22 UTC
*** Bug 704905 has been marked as a duplicate of this bug. ***

Comment 7 Jérôme Glisse 2011-08-31 17:23:04 UTC
Please don't mark bugs as duplicates; each lockup is most likely different from the others.

It's the sad truth about GPU lockups: you can't say they are duplicates unless fixing one also fixes the others.

Comment 8 Jérôme Glisse 2011-08-31 17:24:43 UTC
Oh, and the kernel message is pretty much uninformative; it will be the same for pretty much any lockup.

Comment 9 Fedora End Of Life 2012-08-07 16:31:18 UTC
This message is a notice that Fedora 15 is now at end of life. Fedora
has stopped maintaining and issuing updates for Fedora 15. It is
Fedora's policy to close all bug reports from releases that are no
longer maintained. At this time, all open bugs with a Fedora 'version'
of '15' have been closed as WONTFIX.

(Please note: Our normal process is to give advance warning of this
occurring, but we forgot to do that. A thousand apologies.)

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, feel free to reopen
this bug and simply change the 'version' to a later Fedora version.

Bug Reporter: Thank you for reporting this issue and we are sorry that
we were unable to fix it before Fedora 15 reached end of life. If you
would still like to see this bug fixed and are able to reproduce it
against a later version of Fedora, you are encouraged to click on
"Clone This Bug" (top right of this page) and open it against that
version of Fedora.

Although we aim to fix as many bugs as possible during every release's
lifetime, sometimes those efforts are overtaken by events. Often a
more recent Fedora release includes newer upstream software that fixes
bugs or makes them obsolete.

The process we are following is described here:
http://fedoraproject.org/wiki/BugZappers/HouseKeeping

