Bug 553279

Summary: KMS:RV630:HD2600 lockup worse with newer server.
Product: [Fedora] Fedora
Reporter: Leif Gruenwoldt <leifer>
Component: xorg-x11-drv-ati
Assignee: Jérôme Glisse <jglisse>
Status: CLOSED NEXTRELEASE
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: medium
Priority: low
Version: 12
CC: airlied, campbecg, cbuissar, jglisse, manfred, mcepl, milan.kerslager, mjw, xgl-maint
Target Milestone: ---
Keywords: Triaged
Target Release: ---
Hardware: All
OS: Linux
Whiteboard: card_R600/mM
Doc Type: Bug Fix
Last Closed: 2010-06-09 15:43:43 UTC
Attachments (description, flags):
xorg.0.log (flags: none)
xorg.0.log at time of xorg crash (flags: none)
/var/log/messages showing radeon stack trace (flags: none)
crashed again today. similar but slightly different /var/log/messages crash info (flags: none)
Avoid oops in case of GPU lockup (flags: none)

Description Leif Gruenwoldt 2010-01-07 15:10:45 UTC
Description of problem:

My desktop randomly locks up. When I shell in, I see that Xorg is using 100% CPU.


Version-Release number of selected component (if applicable):

xorg-x11-server-Xorg-1.7.1-7.fc12.x86_64
xorg-x11-drv-ati-6.13.0-0.11.20091119git437113124.fc12.x86_64
mesa-dri-drivers-experimental-7.6-0.13.fc12.x86_64

How reproducible:

Not very. It crashes randomly, possibly once a week. The most recent crash happened in the midst of a Software Update. Not sure
  

Actual results:

from /var/log/Xorg.0.log

[mi] EQ overflowing. The server is probably stuck in an infinite loop.

Backtrace:
0: /usr/bin/Xorg (xorg_backtrace+0x28) [0x49e8d8]
1: /usr/bin/Xorg (mieqEnqueue+0x1f4) [0x49e2a4]
2: /usr/bin/Xorg (xf86PostMotionEventP+0xce) [0x478f0e]
3: /usr/lib64/xorg/modules/input/evdev_drv.so (0x7f478b515000+0x50bf) [0x7f478b51a0bf]
4: /usr/bin/Xorg (0x400000+0x6be17) [0x46be17]
5: /usr/bin/Xorg (0x400000+0x116b13) [0x516b13]
6: /lib64/libpthread.so.0 (0x323ba00000+0xefa0) [0x323ba0efa0]
7: /lib64/libc.so.6 (ioctl+0x7) [0x323b2d61f7]
8: /usr/lib64/libdrm.so.2 (drmIoctl+0x23) [0x3f704033b3]
9: /usr/lib64/libdrm.so.2 (drmCommandWriteRead+0x1c) [0x3f704035fc]
10: /usr/lib64/libdrm_radeon.so.1 (0x7f478bb54000+0xff9) [0x7f478bb54ff9]
11: /usr/lib64/libdrm_radeon.so.1 (0x7f478bb54000+0x1045) [0x7f478bb55045]
12: /usr/lib64/xorg/modules/drivers/radeon_drv.so (0x7f478bd6c000+0xbcc86) [0x7f478be28c86]
13: /usr/lib64/xorg/modules/drivers/radeon_drv.so (0x7f478bd6c000+0xbccf3) [0x7f478be28cf3]
14: /usr/lib64/xorg/modules/drivers/radeon_drv.so (0x7f478bd6c000+0xb733b) [0x7f478be2333b]
15: /usr/lib64/xorg/modules/drivers/radeon_drv.so (0x7f478bd6c000+0xba614) [0x7f478be26614]
16: /usr/lib64/xorg/modules/libexa.so (0x7f478b71f000+0x9f10) [0x7f478b728f10]
17: /usr/lib64/xorg/modules/libexa.so (0x7f478b71f000+0xa01d) [0x7f478b72901d]
18: /usr/bin/Xorg (miCopyRegion+0x28d) [0x54575d]
19: /usr/bin/Xorg (miDoCopy+0x44a) [0x545c6a]
20: /usr/lib64/xorg/modules/libexa.so (0x7f478b71f000+0x8423) [0x7f478b727423]
21: /usr/bin/Xorg (0x400000+0xd3f18) [0x4d3f18]
22: /usr/bin/Xorg (0x400000+0xad985) [0x4ad985]
23: /usr/bin/Xorg (0x400000+0xaed35) [0x4aed35]
24: /usr/bin/Xorg (0x400000+0x2c69c) [0x42c69c]
25: /usr/bin/Xorg (0x400000+0x21cfa) [0x421cfa]
26: /lib64/libc.so.6 (__libc_start_main+0xfd) [0x323b21eb1d]
27: /usr/bin/Xorg (0x400000+0x218a9) [0x4218a9]

Additional Info:

mesa-dri-drivers-experimental is installed and compiz is enabled.

Comment 1 Leif Gruenwoldt 2010-01-07 15:31:38 UTC
Created attachment 382261 [details]
xorg.0.log

xorg log with backtrace

Comment 2 Leif Gruenwoldt 2010-01-13 22:49:23 UTC
It might be coincidental, but since upgrading today to xorg 1.7.4 I've experienced this issue much more frequently. Xorg has hung three times today.

Comment 3 Dave Airlie 2010-01-14 08:29:28 UTC
Does the new mesa in updates-testing help at all? (You also need the new libdrm, the newer -ati driver, and plymouth.)

Comment 4 Leif Gruenwoldt 2010-01-14 16:09:09 UTC
(In reply to comment #3)
> Does the new mesa in updates-testing help at all? (You also need the new
> libdrm, the newer -ati driver, and plymouth.)

I tried this 

$ sudo yum update libdrm xorg-x11-drv-ati plymouth --enablerepo=updates-testing

and it says

Error: Missing Dependency: libdrm >= 2.4.17 is needed by package plymouth-0.8.0-0.2009.29.09.19.1.fc12.x86_64 (updates-testing)

In koji I only see 2.4.17 built for f13.

Comment 5 Leif Gruenwoldt 2010-01-20 20:35:34 UTC
Did today's yum updates and, among other things, got:

libdrm-2.4.17-1.fc12.x86_64
plymouth-0.8.0-0.2009.29.09.19.1.fc12.x86_64
xorg-x11-drv-ati-6.13.0-0.20.20091221git4b05c47ac.fc12.x86_64
mesa-dri-drivers-7.7-2.fc12.x86_64

Xorg is crashing (no longer hanging with 100% cpu). 

Here's what I found in /var/log/messages



Jan 20 15:21:32 localhost kernel: [drm:radeon_ib_get] *ERROR* radeon: IB(6:0x0000000010161000:0)
Jan 20 15:21:32 localhost kernel: [drm:radeon_ib_get] *ERROR* radeon: GPU lockup detected, fail to get a IB
Jan 20 15:21:32 localhost kernel: [drm:r600_vb_ib_get] *ERROR* failed to get IB for vertex buffer
Jan 20 15:21:32 localhost kernel: ------------[ cut here ]------------
Jan 20 15:21:32 localhost kernel: WARNING: at drivers/gpu/drm/radeon/r600_blit_kms.c:550 r600_blit_prepare_copy+0x36/0x3e4 [radeon]() (Tainted: P          )
Jan 20 15:21:32 localhost kernel: Hardware name: HP Compaq 8510p 
Jan 20 15:21:32 localhost kernel: Modules linked in: fuse ipv6 cpufreq_ondemand acpi_cpufreq freq_table dm_multipath kvm_intel kvm uinput snd_hda_codec_atihdmi snd_hda_codec_analog sdhci_pci snd_hda_intel sdhci firewire_ohci ppdev parport_pc firewire_core hp_accel lib80211_crypt_tkip snd_hda_codec snd_hwdep parport mmc_core snd_seq snd_seq_device snd_pcm serio_raw tpm_infineon lis3lv02d joydev ricoh_mmc crc_itu_t pata_pcmcia input_polldev snd_timer iTCO_wdt snd soundcore wl(P) iTCO_vendor_support snd_page_alloc lib80211 e1000e wmi yenta_socket rsrc_nonstatic video output radeon ttm drm_kms_helper drm i2c_algo_bit i2c_core [last unloaded: microcode]
Jan 20 15:21:32 localhost kernel: Pid: 1374, comm: Xorg Tainted: P           2.6.31.9-174.fc12.x86_64 #1
Jan 20 15:21:32 localhost kernel: Call Trace:
Jan 20 15:21:32 localhost kernel: [<ffffffff81051710>] warn_slowpath_common+0x84/0x9c
Jan 20 15:21:32 localhost kernel: [<ffffffff8105173c>] warn_slowpath_null+0x14/0x16
Jan 20 15:21:32 localhost kernel: [<ffffffffa00a12b2>] r600_blit_prepare_copy+0x36/0x3e4 [radeon]
Jan 20 15:21:32 localhost kernel: [<ffffffffa009821c>] r600_copy_blit+0x2b/0x57 [radeon]
Jan 20 15:21:32 localhost kernel: [<ffffffffa00795f9>] radeon_move_blit+0x101/0x13f [radeon]
Jan 20 15:21:32 localhost kernel: [<ffffffffa007986f>] radeon_bo_move+0x238/0x25e [radeon]
Jan 20 15:21:32 localhost kernel: [<ffffffffa004c248>] ttm_bo_handle_move_mem+0x1cc/0x2bb [ttm]
Jan 20 15:21:32 localhost kernel: [<ffffffffa004dab6>] ttm_bo_move_buffer+0xb0/0xef [ttm]
Jan 20 15:21:32 localhost kernel: [<ffffffffa004db37>] ttm_buffer_object_validate+0x42/0xbf [ttm]
Jan 20 15:21:32 localhost kernel: [<ffffffffa007a22c>] radeon_object_list_validate+0xaf/0x152 [radeon]
Jan 20 15:21:32 localhost kernel: [<ffffffffa0086937>] radeon_cs_parser_relocs+0x19c/0x1fa [radeon]
Jan 20 15:21:32 localhost kernel: [<ffffffffa0086cb6>] ? radeon_cs_ioctl+0x0/0x19a [radeon]
Jan 20 15:21:32 localhost kernel: [<ffffffffa0086d79>] radeon_cs_ioctl+0xc3/0x19a [radeon]
Jan 20 15:21:32 localhost kernel: [<ffffffffa001521f>] drm_ioctl+0x237/0x2f4 [drm]
Jan 20 15:21:32 localhost kernel: [<ffffffff81108cc5>] vfs_ioctl+0x6f/0x87
Jan 20 15:21:32 localhost kernel: [<ffffffff8104907d>] ? finish_task_switch+0xc3/0xe6
Jan 20 15:21:32 localhost kernel: [<ffffffff811091d4>] do_vfs_ioctl+0x47b/0x4c1
Jan 20 15:21:32 localhost kernel: [<ffffffff81109270>] sys_ioctl+0x56/0x79
Jan 20 15:21:32 localhost kernel: [<ffffffff81011cf2>] system_call_fastpath+0x16/0x1b
Jan 20 15:21:32 localhost kernel: ---[ end trace 6ddccecdcb125329 ]---
Jan 20 15:21:32 localhost kernel: BUG: unable to handle kernel NULL pointer dereference at 0000000000000028
Jan 20 15:21:32 localhost kernel: IP: [<ffffffffa00a19c4>] r600_kms_blit_copy+0x68/0x518 [radeon]
Jan 20 15:21:32 localhost kernel: PGD 138c13067 PUD 138c12067 PMD 0 
Jan 20 15:21:32 localhost kernel: Oops: 0000 [#1] SMP 
Jan 20 15:21:32 localhost kernel: last sysfs file: /sys/devices/pci0000:00/0000:00:1f.2/host0/target0:0:0/0:0:0:0/block/sda/sda2/stat
Jan 20 15:21:32 localhost kernel: CPU 0 
Jan 20 15:21:32 localhost kernel: Modules linked in: fuse ipv6 cpufreq_ondemand acpi_cpufreq freq_table dm_multipath kvm_intel kvm uinput snd_hda_codec_atihdmi snd_hda_codec_analog sdhci_pci snd_hda_intel sdhci firewire_ohci ppdev parport_pc firewire_core hp_accel lib80211_crypt_tkip snd_hda_codec snd_hwdep parport mmc_core snd_seq snd_seq_device snd_pcm serio_raw tpm_infineon lis3lv02d joydev ricoh_mmc crc_itu_t pata_pcmcia input_polldev snd_timer iTCO_wdt snd soundcore wl(P) iTCO_vendor_support snd_page_alloc lib80211 e1000e wmi yenta_socket rsrc_nonstatic video output radeon ttm drm_kms_helper drm i2c_algo_bit i2c_core [last unloaded: microcode]
Jan 20 15:21:32 localhost kernel: Pid: 1374, comm: Xorg Tainted: P        W  2.6.31.9-174.fc12.x86_64 #1 HP Compaq 8510p 
Jan 20 15:21:32 localhost kernel: RIP: 0010:[<ffffffffa00a19c4>]  [<ffffffffa00a19c4>] r600_kms_blit_copy+0x68/0x518 [radeon]
Jan 20 15:21:32 localhost kernel: RSP: 0018:ffff8801374e3948  EFLAGS: 00010216
Jan 20 15:21:32 localhost kernel: RAX: 0000000000000000 RBX: ffff880138734000 RCX: ffffffffa00b66e7
Jan 20 15:21:32 localhost kernel: RDX: ffffffffa00aa930 RSI: ffffffffa00b66b1 RDI: 0000000000000001
Jan 20 15:21:32 localhost kernel: RBP: ffff8801374e39b8 R08: 0000000010252000 R09: 00000000d7429000
Jan 20 15:21:32 localhost kernel: R10: 0000000000000028 R11: 0000000000000010 R12: 00000000000000c0
Jan 20 15:21:32 localhost kernel: R13: 0000000000001000 R14: 00000000d7429000 R15: 0000000010252000
Jan 20 15:21:32 localhost kernel: FS:  00007f82cd7397c0(0000) GS:ffff880028022000(0000) knlGS:0000000000000000
Jan 20 15:21:32 localhost kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Jan 20 15:21:32 localhost kernel: CR2: 0000000000000028 CR3: 00000001391df000 CR4: 00000000000026f0
Jan 20 15:21:32 localhost kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Jan 20 15:21:32 localhost kernel: DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Jan 20 15:21:32 localhost kernel: Process Xorg (pid: 1374, threadinfo ffff8801374e2000, task ffff880138c88000)
Jan 20 15:21:32 localhost kernel: Stack:
Jan 20 15:21:32 localhost kernel: ffff880100001000 ffff880000000030 00007f8200000054 0000000000000004
Jan 20 15:21:32 localhost kernel: <0> ffff880100000028 0000000000000028 ffff880100000010 0000021000000020
Jan 20 15:21:32 localhost kernel: <0> ffff88013550ca80 ffff880138734000 0000000010252000 ffff880105e56a00
Jan 20 15:21:32 localhost kernel: Call Trace:
Jan 20 15:21:32 localhost kernel: [<ffffffffa009822e>] r600_copy_blit+0x3d/0x57 [radeon]
Jan 20 15:21:32 localhost kernel: [<ffffffffa00795f9>] radeon_move_blit+0x101/0x13f [radeon]
Jan 20 15:21:32 localhost kernel: [<ffffffffa007986f>] radeon_bo_move+0x238/0x25e [radeon]
Jan 20 15:21:32 localhost kernel: [<ffffffffa004c248>] ttm_bo_handle_move_mem+0x1cc/0x2bb [ttm]
Jan 20 15:21:32 localhost kernel: [<ffffffffa004dab6>] ttm_bo_move_buffer+0xb0/0xef [ttm]
Jan 20 15:21:32 localhost kernel: [<ffffffffa004db37>] ttm_buffer_object_validate+0x42/0xbf [ttm]
Jan 20 15:21:32 localhost kernel: [<ffffffffa007a22c>] radeon_object_list_validate+0xaf/0x152 [radeon]
Jan 20 15:21:32 localhost kernel: [<ffffffffa0086937>] radeon_cs_parser_relocs+0x19c/0x1fa [radeon]
Jan 20 15:21:32 localhost kernel: [<ffffffffa0086cb6>] ? radeon_cs_ioctl+0x0/0x19a [radeon]
Jan 20 15:21:32 localhost kernel: [<ffffffffa0086d79>] radeon_cs_ioctl+0xc3/0x19a [radeon]
Jan 20 15:21:32 localhost kernel: [<ffffffffa001521f>] drm_ioctl+0x237/0x2f4 [drm]
Jan 20 15:21:32 localhost kernel: [<ffffffff81108cc5>] vfs_ioctl+0x6f/0x87
Jan 20 15:21:32 localhost kernel: [<ffffffff8104907d>] ? finish_task_switch+0xc3/0xe6
Jan 20 15:21:32 localhost kernel: [<ffffffff811091d4>] do_vfs_ioctl+0x47b/0x4c1
Jan 20 15:21:32 localhost kernel: [<ffffffff81109270>] sys_ioctl+0x56/0x79
Jan 20 15:21:32 localhost kernel: [<ffffffff81011cf2>] system_call_fastpath+0x16/0x1b
Jan 20 15:21:32 localhost kernel: Code: a0 48 c7 c2 30 a9 0a a0 48 c7 c6 b1 66 0b a0 bf 01 00 00 00 e8 eb 71 f7 ff 44 8b a3 c0 0e 00 00 48 8b 83 c8 0e 00 00 49 c1 e4 02 <4c> 03 60 28 41 f6 c5 03 0f 85 22 02 00 00 4c 89 f0 4c 09 f8 a8 
Jan 20 15:21:32 localhost kernel: RIP  [<ffffffffa00a19c4>] r600_kms_blit_copy+0x68/0x518 [radeon]
Jan 20 15:21:32 localhost kernel: RSP <ffff8801374e3948>
Jan 20 15:21:32 localhost kernel: CR2: 0000000000000028
Jan 20 15:21:32 localhost kernel: ---[ end trace 6ddccecdcb12532a ]---
Jan 20 15:21:32 localhost kernel: [drm:drm_release] *ERROR* Device busy: 1
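
For anyone comparing their own logs against the trace above, the faulting function can be pulled out of a saved log with a short filter. This is only a minimal sketch: it assumes the oops was written in the same `IP: [<address>] function+offset/size` form shown here, and `oops_fault_sites` is a hypothetical helper name, not part of any package.

```shell
#!/bin/sh
# List the distinct faulting locations reported by kernel oopses in a log.
# Assumes lines of the form "... IP: [<addr>] func+0xOFF/0xSIZE [module]",
# as in the excerpt above. oops_fault_sites is a hypothetical helper name.
oops_fault_sites() {
    log_file="${1:-/var/log/messages}"
    # Keep only the "function+offset/size" token that follows "IP: [<addr>]".
    sed -n 's/.*IP: \[<[0-9a-f]*>\] \([^ ]*\).*/\1/p' "$log_file" | sort -u
}
```

Run against the log in this comment, `oops_fault_sites /var/log/messages` should report `r600_kms_blit_copy+0x68/0x518`.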

Comment 6 Leif Gruenwoldt 2010-01-20 20:36:49 UTC
Created attachment 385783 [details]
xorg.0.log at time of xorg crash

Comment 7 Leif Gruenwoldt 2010-01-20 20:37:48 UTC
Created attachment 385784 [details]
/var/log/messages showing radeon stack trace

Comment 8 Leif Gruenwoldt 2010-01-21 20:21:54 UTC
Created attachment 385997 [details]
crashed again today. similar but slightly different /var/log/messages crash info

Comment 9 Manfred Spraul 2010-01-22 07:41:41 UTC
I experience the same crash.
I've tried two kernels:

   kernel-2.6.31.9-174.fc12.x86_64
   kernel-2.6.31.12-174.2.3.fc12.x86_64

mesa-dri-drivers-experimental is not installed.
icewm as session manager.

# grep -B 2 -A 2 "IB.*vertex" /var/log/messages

Jan 21 22:50:05 cores kernel: [drm:radeon_ib_get] *ERROR* radeon: IB(2:0x0000000010121000:0)
Jan 21 22:50:05 cores kernel: [drm:radeon_ib_get] *ERROR* radeon: GPU lockup detected, fail to get a IB
Jan 21 22:50:05 cores kernel: [drm:r600_vb_ib_get] *ERROR* failed to get IB for vertex buffer
Jan 21 22:50:05 cores kernel: ------------[ cut here ]------------
Jan 21 22:50:05 cores kernel: WARNING: at drivers/gpu/drm/radeon/r600_blit_kms.c:550 r600_blit_prepare_copy+0x36/0x3e4 [radeon]() (Not tainted)
--
Jan 22 07:40:25 cores kernel: [drm:radeon_ib_get] *ERROR* radeon: IB(1:0x0000000010111000:0)
Jan 22 07:40:25 cores kernel: [drm:radeon_ib_get] *ERROR* radeon: GPU lockup detected, fail to get a IB
Jan 22 07:40:25 cores kernel: [drm:r600_vb_ib_get] *ERROR* failed to get IB for vertex buffer
Jan 22 07:40:25 cores kernel: ------------[ cut here ]------------
Jan 22 07:40:25 cores kernel: WARNING: at drivers/gpu/drm/radeon/r600_blit_kms.c:550 r600_blit_prepare_copy+0x36/0x3e4 [radeon]() (Not tainted)
--
Jan 22 08:10:41 cores kernel: [drm:radeon_ib_get] *ERROR* radeon: IB(0:0x0000000010101000:0)
Jan 22 08:10:41 cores kernel: [drm:radeon_ib_get] *ERROR* radeon: GPU lockup detected, fail to get a IB
Jan 22 08:10:41 cores kernel: [drm:r600_vb_ib_get] *ERROR* failed to get IB for vertex buffer
Jan 22 08:10:41 cores kernel: ------------[ cut here ]------------
Jan 22 08:10:41 cores kernel: WARNING: at drivers/gpu/drm/radeon/r600_blit_kms.c:550 r600_blit_prepare_copy+0x36/0x3e4 [radeon]() (Not tainted)
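
The grep above can be extended to show how often the lockup occurs. The following is a minimal sketch, assuming the syslog line layout seen in this report (month, day, time, host); `count_lockups_per_day` is a hypothetical helper name.

```shell
#!/bin/sh
# Count "GPU lockup detected" kernel messages per day in a syslog-style file.
# Assumes each line starts with "Mon DD HH:MM:SS host ...", as in the
# excerpts above. count_lockups_per_day is a hypothetical helper name.
count_lockups_per_day() {
    log_file="${1:-/var/log/messages}"
    grep 'GPU lockup detected' "$log_file" |
        awk '{ count[$1 " " $2]++ } END { for (day in count) print day, count[day] }' |
        sort
}
```

For the three events listed above, `count_lockups_per_day /var/log/messages` would print one line for Jan 21 with a count of 1 and one line for Jan 22 with a count of 2.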

Comment 10 Manfred Spraul 2010-01-22 08:49:30 UTC
Some updates:
- The most reliable method to provoke a crash appears to be scrolling in Firefox.

What does not help:
- booting with kernel-2.6.31.9 does NOT help
- switching to AccelMethod "XAA" does NOT help
- running with "NoAccel" in xorg.conf is not possible, X crashes (null pointer, exaGetPixmapDriverPrivate in the backtrace)
- uninstalling mesa-dri-drivers-experimental does NOT help.

What seems to help:
- booting with "nomodeset" appears to avoid the crash.

My next testing steps are:
- Downgrading xorg-x11-drv-ati-6.13.0-0.20.20091221
- Downgrading kernel-firmware

Here is the list of packages that were upgraded:
 kernel-2.6.31.12-174.2.3.fc12.x86_64

 kernel-firmware-2.6.31.12-174.2.3.fc12.noarch
 kernel-headers-2.6.31.12-174.2.3.fc12.x86_64

 libdrm-2.4.17-1.fc12.i686
 libdrm-2.4.17-1.fc12.x86_64
 libdrm-devel-2.4.17-1.fc12.x86_64

 mesa-dri-drivers-7.7-2.fc12.i686
 mesa-dri-drivers-7.7-2.fc12.x86_64
 mesa-dri-drivers-experimental-7.7-2.fc12.x86_64

 mesa-libGL-7.7-2.fc12.i686
 mesa-libGL-7.7-2.fc12.x86_64
 mesa-libGL-devel-7.7-2.fc12.x86_64
 mesa-libGLU-7.7-2.fc12.i686
 mesa-libGLU-7.7-2.fc12.x86_64
 mesa-libGLU-devel-7.7-2.fc12.x86_64

 xorg-x11-drv-ati-6.13.0-0.20.20091221git4b05c47ac.fc12.x86_64
 xorg-x11-proto-devel-7.4-35.fc12.noarch

Comment 11 Manfred Spraul 2010-01-22 12:34:40 UTC
No progress:

What does not help:
- booting with kernel-2.6.31.9 does NOT help
- switching to AccelMethod "XAA" does NOT help
- running with "NoAccel" in xorg.conf is not possible, X crashes (null pointer, exaGetPixmapDriverPrivate in the backtrace)
- uninstalling mesa-dri-drivers-experimental does NOT help.
- Downgrading kernel-firmware does NOT help - firmware files are identical
- Downgrading xorg-x11-drv-ati-6.13.0-0.20.20091221 does NOT help - X crashes

What seems to help:
- booting with "nomodeset" appears to avoid the crash.

I'm stuck; perhaps it was already unstable before the last yum upgrade and I didn't notice it.
Is there anything I could try to locate the problem?
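
When testing the "nomodeset" workaround, it helps to confirm that the running kernel was really booted with KMS turned off. A minimal sketch follows, assuming the boot parameters can be read from /proc/cmdline; `kms_disabled` is a hypothetical helper name, and checking `radeon.modeset=0` as an alternative spelling is an addition that was not tested in this report.

```shell
#!/bin/sh
# Report whether a kernel command line disables kernel modesetting (KMS).
# kms_disabled is a hypothetical helper; it reads /proc/cmdline by default.
# It succeeds (exit status 0) if "nomodeset" or "radeon.modeset=0" is present.
kms_disabled() {
    cmdline_file="${1:-/proc/cmdline}"
    # Put each boot parameter on its own line, then match whole parameters only.
    tr -s ' ' '\n' < "$cmdline_file" |
        grep -Eqx 'nomodeset|radeon\.modeset=0'
}
```

Usage: `kms_disabled && echo "KMS is disabled for this boot"`. Matching whole parameters avoids false positives from longer options that merely contain the same text.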

Comment 12 Jérôme Glisse 2010-01-22 14:26:23 UTC
Created attachment 386159 [details]
Avoid oops in case of GPU lockup

If you have the courage to build your own kernel, this patch applies on top of
http://git.kernel.org/?p=linux/kernel/git/airlied/drm-2.6.git;a=shortlog;h=refs/heads/drm-radeon-testing

and should fix the oops you are seeing. It might or might not fix the GPU lockup you are experiencing.

Comment 13 Mark Wielaard 2010-01-23 14:20:41 UTC
I am seeing similar issues with:
0f:00.0 VGA compatible controller: ATI Technologies Inc RV620 [ATI FireGL V3700]
kernel-2.6.31.12-174.2.3.fc12.x86_64
xorg-x11-server-Xorg-1.7.4-1.fc12.x86_64
xorg-x11-drv-ati-6.13.0-0.20.20091221git4b05c47ac.fc12.x86_64

Random lockups of desktop a few times a day.
Latest one (machine still reachable through ssh) had the following in dmesg:

$ dmesg | egrep -1 \(radeon\|drm\)
udev: starting version 145
[drm] Initialized drm 1.1.0 20060810
[drm] radeon defaulting to kernel modesetting.
[drm] radeon kernel modesetting enabled.
  alloc irq_desc for 24 on node 0
  alloc kstat_irqs on node 0
radeon 0000:0f:00.0: PCI INT A -> GSI 24 (level, low) -> IRQ 24
radeon 0000:0f:00.0: setting latency timer to 64
[drm] radeon: Initializing kernel modesetting.
[drm] register mmio base: 0xE0000000
[drm] register mmio size: 65536
ATOM BIOS: 113
[drm] Clocks initialized !
[drm] Detected VRAM RAM=256M, BAR=256M
[drm] RAM width 64bits DDR
[TTM] Zone  kernel: Available graphics memory: 3088144 kiB.
[TTM] Zone   dma32: Available graphics memory: 2097152 kiB.
[drm] radeon: 256M of VRAM memory ready
[drm] radeon: 512M of GTT memory ready.
[drm] Loading RV620 CP Microcode
platform radeon_cp.0: firmware: requesting radeon/RV620_pfp.bin
platform radeon_cp.0: firmware: requesting radeon/RV620_me.bin
[drm] GART: num cpu pages 131072, num gpu pages 131072
[drm] ring test succeeded in 1 usecs
[drm] radeon: ib pool ready.
[drm] ib test succeeded in 0 usecs
[drm] Radeon Display Connectors
[drm] Connector 0:
[drm]   DVI-I
[drm]   DDC: 0x7e60 0x7e60 0x7e64 0x7e64 0x7e68 0x7e68 0x7e6c 0x7e6c
[drm]   Encoders:
[drm]     DFP1: INTERNAL_UNIPHY
[drm]     CRT2: INTERNAL_KLDSCP_DAC2
[drm] Connector 1:
[drm]   DVI-I
[drm]   DDC: 0x7e40 0x7e40 0x7e44 0x7e44 0x7e48 0x7e48 0x7e4c 0x7e4c
[drm]   Encoders:
[drm]     CRT1: INTERNAL_KLDSCP_DAC1
[drm]     DFP2: INTERNAL_KLDSCP_LVTMA
usb 6-1: New USB device found, idVendor=03f0, idProduct=2c24
--
generic-usb 0003:03F0:2C24.0001: input,hidraw0: USB HID v1.10 Mouse [HP HP USB Laser Mouse] on usb-0000:00:1d.0-1/input0
[drm] fb mappable at 0xD0141000
[drm] vram apper at 0xD0000000
[drm] size 7680000
[drm] fb depth is 24
[drm]    pitch is 6400
executing set pll
executing set crtc timing
[drm] TMDS-9: set mode 1600x1200 1c
Console: switching to colour frame buffer device 200x75
usb 2-3.4: new full speed USB device using ehci_hcd and address 4
fb0: radeondrmfb frame buffer device
registered panic notifier
[drm] Initialized radeon 2.0.0 20080528 for 0000:0f:00.0 on minor 0
dracut: Starting plymouth daemon
--
SELinux: initialized (dev fuse, type fuse), uses genfs_contexts
[drm:radeon_ib_get] *ERROR* radeon: IB(13:0x00000000101D1000:0)
[drm:radeon_ib_get] *ERROR* radeon: GPU lockup detected, fail to get a IB
[drm:r600_vb_ib_get] *ERROR* failed to get IB for vertex buffer
------------[ cut here ]------------
WARNING: at drivers/gpu/drm/radeon/r600_blit_kms.c:550 r600_blit_prepare_copy+0x36/0x3e4 [radeon]() (Not tainted)
Hardware name: HP Z400 Workstation
Modules linked in: fuse ipt_MASQUERADE iptable_nat nf_nat nfsd lockd nfs_acl auth_rpcgss exportfs sunrpc cpufreq_ondemand acpi_cpufreq freq_table bridge stp llc xt_physdev ip6t_REJECT nf_conntrack_ipv6 ip6table_filter ip6_tables ipv6 dm_multipath kvm_intel kvm uinput snd_hda_codec_realtek snd_usb_audio snd_hda_intel snd_hda_codec snd_seq snd_usb_lib snd_pcm snd_rawmidi snd_seq_device snd_hwdep firewire_ohci firewire_core crc_itu_t snd_timer snd wmi soundcore snd_page_alloc iTCO_wdt iTCO_vendor_support serio_raw tg3 raid1 raid456 raid6_pq async_xor async_memcpy async_tx xor usb_storage radeon ttm drm_kms_helper drm i2c_algo_bit i2c_core [last unloaded: microcode]
Pid: 2406, comm: Xorg Not tainted 2.6.31.12-174.2.3.fc12.x86_64 #1
Call Trace:
 [<ffffffff81051710>] warn_slowpath_common+0x84/0x9c
 [<ffffffff8105173c>] warn_slowpath_null+0x14/0x16
 [<ffffffffa00a12aa>] r600_blit_prepare_copy+0x36/0x3e4 [radeon]
 [<ffffffffa0098214>] r600_copy_blit+0x2b/0x57 [radeon]
 [<ffffffffa00795f9>] radeon_move_blit+0x101/0x13f [radeon]
 [<ffffffffa007986f>] radeon_bo_move+0x238/0x25e [radeon]
 [<ffffffffa004c248>] ttm_bo_handle_move_mem+0x1cc/0x2bb [ttm]
 [<ffffffffa004dab6>] ttm_bo_move_buffer+0xb0/0xef [ttm]
 [<ffffffffa004db37>] ttm_buffer_object_validate+0x42/0xbf [ttm]
 [<ffffffffa007a22c>] radeon_object_list_validate+0xaf/0x152 [radeon]
 [<ffffffffa008692f>] radeon_cs_parser_relocs+0x19c/0x1fa [radeon]
 [<ffffffffa0086cae>] ? radeon_cs_ioctl+0x0/0x19a [radeon]
 [<ffffffffa0086d71>] radeon_cs_ioctl+0xc3/0x19a [radeon]
 [<ffffffffa001521f>] drm_ioctl+0x237/0x2f4 [drm]
 [<ffffffff81108dd1>] vfs_ioctl+0x6f/0x87
 [<ffffffff811092e0>] do_vfs_ioctl+0x47b/0x4c1
 [<ffffffff8110937c>] sys_ioctl+0x56/0x79
 [<ffffffff81011cf2>] system_call_fastpath+0x16/0x1b
---[ end trace 056b85a75e90e936 ]---
BUG: unable to handle kernel NULL pointer dereference at 0000000000000028
IP: [<ffffffffa00a19bc>] r600_kms_blit_copy+0x68/0x518 [radeon]
PGD 1931a1067 PUD 193135067 PMD 0 
Oops: 0000 [#1] SMP 
last sysfs file: /sys/devices/pci0000:00/0000:00:1c.5/0000:01:00.0/irq
CPU 0 
Modules linked in: fuse ipt_MASQUERADE iptable_nat nf_nat nfsd lockd nfs_acl auth_rpcgss exportfs sunrpc cpufreq_ondemand acpi_cpufreq freq_table bridge stp llc xt_physdev ip6t_REJECT nf_conntrack_ipv6 ip6table_filter ip6_tables ipv6 dm_multipath kvm_intel kvm uinput snd_hda_codec_realtek snd_usb_audio snd_hda_intel snd_hda_codec snd_seq snd_usb_lib snd_pcm snd_rawmidi snd_seq_device snd_hwdep firewire_ohci firewire_core crc_itu_t snd_timer snd wmi soundcore snd_page_alloc iTCO_wdt iTCO_vendor_support serio_raw tg3 raid1 raid456 raid6_pq async_xor async_memcpy async_tx xor usb_storage radeon ttm drm_kms_helper drm i2c_algo_bit i2c_core [last unloaded: microcode]
Pid: 2406, comm: Xorg Tainted: G        W  2.6.31.12-174.2.3.fc12.x86_64 #1 HP Z400 Workstation
RIP: 0010:[<ffffffffa00a19bc>]  [<ffffffffa00a19bc>] r600_kms_blit_copy+0x68/0x518 [radeon]
RSP: 0018:ffff8801a7d09948  EFLAGS: 00010216
RAX: 0000000000000000 RBX: ffff8801a5b08000 RCX: ffffffffa00b66d7
RDX: ffffffffa00aa920 RSI: ffffffffa00b66a1 RDI: 0000000000000001
RBP: ffff8801a7d099b8 R08: 00000000102a5000 R09: 00000000db5ce000
R10: 0000000000000028 R11: 0000000000000010 R12: 00000000000000c0
R13: 0000000000001000 R14: 00000000db5ce000 R15: 00000000102a5000
FS:  00007f2a352e77c0(0000) GS:ffff880028034000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000028 CR3: 000000019348d000 CR4: 00000000000026f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process Xorg (pid: 2406, threadinfo ffff8801a7d08000, task ffff880193695e00)
Stack:
 ffff880100001000 ffff880000000030 00007f2a00000054 0000000000000004
<0> ffff880100000030 0000000000000028 ffff880100000010 0000021000000020
<0> ffff8801a5a3f380 ffff8801a5b08000 00000000102a5000 ffff88013991d800
Call Trace:
 [<ffffffffa0098226>] r600_copy_blit+0x3d/0x57 [radeon]
 [<ffffffffa00795f9>] radeon_move_blit+0x101/0x13f [radeon]
 [<ffffffffa007986f>] radeon_bo_move+0x238/0x25e [radeon]
 [<ffffffffa004c248>] ttm_bo_handle_move_mem+0x1cc/0x2bb [ttm]
 [<ffffffffa004dab6>] ttm_bo_move_buffer+0xb0/0xef [ttm]
 [<ffffffffa004db37>] ttm_buffer_object_validate+0x42/0xbf [ttm]
 [<ffffffffa007a22c>] radeon_object_list_validate+0xaf/0x152 [radeon]
 [<ffffffffa008692f>] radeon_cs_parser_relocs+0x19c/0x1fa [radeon]
 [<ffffffffa0086cae>] ? radeon_cs_ioctl+0x0/0x19a [radeon]
 [<ffffffffa0086d71>] radeon_cs_ioctl+0xc3/0x19a [radeon]
 [<ffffffffa001521f>] drm_ioctl+0x237/0x2f4 [drm]
 [<ffffffff81108dd1>] vfs_ioctl+0x6f/0x87
 [<ffffffff811092e0>] do_vfs_ioctl+0x47b/0x4c1
 [<ffffffff8110937c>] sys_ioctl+0x56/0x79
 [<ffffffff81011cf2>] system_call_fastpath+0x16/0x1b
Code: a0 48 c7 c2 20 a9 0a a0 48 c7 c6 a1 66 0b a0 bf 01 00 00 00 e8 f3 71 f7 ff 44 8b a3 c0 0e 00 00 48 8b 83 c8 0e 00 00 49 c1 e4 02 <4c> 03 60 28 41 f6 c5 03 0f 85 22 02 00 00 4c 89 f0 4c 09 f8 a8 
RIP  [<ffffffffa00a19bc>] r600_kms_blit_copy+0x68/0x518 [radeon]
 RSP <ffff8801a7d09948>
CR2: 0000000000000028
---[ end trace 056b85a75e90e937 ]---
[drm:drm_release] *ERROR* Device busy: 1
executing set pll
executing set crtc timing
[drm] TMDS-9: set mode 1600x1200 1c

Comment 14 Chris Campbell 2010-01-23 22:05:19 UTC

-- 
Fedora Bugzappers volunteer triage team
https://fedoraproject.org/wiki/BugZappers

Comment 15 Milan Kerslager 2010-02-02 18:42:53 UTC
Seems to be similar to bug #557805.

Comment 16 Leif Gruenwoldt 2010-03-08 15:57:35 UTC
(In reply to comment #12)
> Created an attachment (id=386159) [details]
> Avoid oops in case of GPU lockup
> 
> If you have the courage to build your own kernel, this patch applies on top of
> http://git.kernel.org/?p=linux/kernel/git/airlied/drm-2.6.git;a=shortlog;h=refs/heads/drm-radeon-testing
> 
> and should fix the oops you are seeing. It might or might not fix the GPU
> lockup you are experiencing.

I haven't had a chance to try building this but I have found a workaround for the time being. I turn off my laptop screen and work off my external monitor and the lockup hasn't happened in weeks. 

I noticed a new kernel, 2.6.32, in today's F12 yum updates. Maybe it contains your patch?

Comment 17 Milan Kerslager 2010-03-09 07:06:29 UTC
I tried the latest updates and then the mesa updates from the testing repository, but it did not help:

kernel-2.6.32.9-67.fc12.x86_64
libdrm-2.4.17-1.fc12.i686
libdrm-2.4.17-1.fc12.x86_64
mesa-dri-drivers-7.7-4.fc12.i686
mesa-dri-drivers-7.7-4.fc12.x86_64
mesa-libGLU-7.7-3.fc12.i686
mesa-libGLU-7.7-3.fc12.x86_64
mesa-libGL-7.7-4.fc12.i686
mesa-libGL-7.7-4.fc12.x86_64
xorg-x11-drv-ati-6.13.0-0.21.20100219gite68d3a389.fc12.x86_64

Comment 18 Milan Kerslager 2010-03-09 08:19:53 UTC
I tried libdrm-2.4.18-1.fc14 from rawhide, but it did not help (ATI Radeon HD 2400 XT).

Comment 19 Leif Gruenwoldt 2010-03-09 18:58:22 UTC
FYI, I just performed a yum update on F12 and have new problems. They seem unrelated to this bug, so I have filed a new one, Bug #571874.

Comment 20 Leif Gruenwoldt 2010-06-09 14:20:03 UTC
(In reply to comment #16)
> I have found a workaround for
> the time being. I turn off my laptop screen and work off my external monitor

FYI,  I do not experience this issue running F13 with my external monitor ON.

Comment 21 Leif Gruenwoldt 2010-06-09 14:22:40 UTC
(In reply to comment #20)
> (In reply to comment #16)
> > I have found a workaround for
> > the time being. I turn off my laptop screen and work off my external monitor
> 
> FYI,  I do not experience this issue running F13 with my external monitor ON.    

Sorry I meant to say with both my laptop and external monitor ON, everything is OK on F13.

Comment 22 Matěj Cepl 2010-06-09 15:43:43 UTC
Thank you for letting us know.