Bug 562607 - KMS:RV410:X700 hard system freeze using 3D
Status: CLOSED WONTFIX
Alias: None
Product: Fedora
Classification: Fedora
Component: xorg-x11-drv-ati
Version: 13
Hardware: x86_64
OS: Linux
Priority: low
Severity: high
Target Milestone: ---
Assignee: Jérôme Glisse
QA Contact: Fedora Extras Quality Assurance
Reported: 2010-02-07 17:00 UTC by Tom Horsley
Modified: 2018-04-11 14:13 UTC

Doc Type: Bug Fix
Last Closed: 2011-06-27 14:54:25 UTC
Type: ---


Attachments
xorg log from running system (31.58 KB, text/plain)
2010-02-07 17:01 UTC, Tom Horsley
dmesg from running system (46.06 KB, text/plain)
2010-02-07 17:02 UTC, Tom Horsley
/var/log/messages from around the crash (293.26 KB, text/plain)
2010-02-08 16:47 UTC, Tom Horsley

Description Tom Horsley 2010-02-07 17:00:55 UTC
Description of problem:

Surveying the state of my ATI cards, I tried running neverputt (a 3D game)
on my ATI Technologies Inc RV410 [Radeon X700 Pro (PCIE)] system.

As soon as it drew the initial window of the game, the system was totally
frozen. No cursor movement, no disk activity, no nuthin.

Version-Release number of selected component (if applicable):
SDL-1.2.13-10.fc12.x86_64
mesa-dri-drivers-7.7-3.fc12.x86_64
mesa-libGL-7.7-3.fc12.x86_64
mesa-libGLU-7.7-3.fc12.x86_64
neverball-1.4.0-17.fc12.x86_64
xorg-x11-drv-ati-6.13.0-0.20.20091221git4b05c47ac.fc12.x86_64
xorg-x11-server-Xorg-1.7.4-1.fc12.x86_64
xorg-x11-server-common-1.7.4-1.fc12.x86_64
kernel-2.6.31.12-174.2.3.fc12.x86_64


How reproducible:
I only tried it once, but it sure looked like the sort of thing that
would happen every time.

Steps to Reproduce:
1. Run neverputt
2. System freezes
  
Actual results:
frozen system

Expected results:
play stoopid game

Additional info:
I'm using all the installed defaults, so there is no xorg.conf, and KMS
is active by default.

After rebooting the system I didn't find any information in any logs.
The crash was too hard and fast to get anything into a log file.

Smolt profile of system:
http://www.smolts.org/client/show/pub_0cd07516-0dde-4c15-a78f-eae00975ebe7

lspci -v video card info:
01:00.0 VGA compatible controller: ATI Technologies Inc RV410 [Radeon X700 Pro (PCIE)] (prog-if 00 [VGA controller])
	Subsystem: PC Partner Limited Device 0620
	Flags: bus master, fast devsel, latency 0, IRQ 28
	Memory at d0000000 (64-bit, prefetchable) [size=256M]
	Memory at feaf0000 (64-bit, non-prefetchable) [size=64K]
	I/O ports at b000 [size=256]
	Expansion ROM at feac0000 [disabled] [size=128K]
	Capabilities: [50] Power Management version 2
	Capabilities: [58] Express Endpoint, MSI 00
	Capabilities: [80] MSI: Enable+ Count=1/1 Maskable- 64bit+
	Capabilities: [100] Advanced Error Reporting
	Kernel driver in use: radeon
	Kernel modules: radeon

01:00.1 Display controller: ATI Technologies Inc RV410 [Radeon X700 Pro (PCIE)] (Secondary)
	Subsystem: PC Partner Limited Device 0621
	Flags: bus master, fast devsel, latency 0
	Memory at feae0000 (64-bit, non-prefetchable) [size=64K]
	Capabilities: [50] Power Management version 2
	Capabilities: [58] Express Endpoint, MSI 00

Comment 1 Tom Horsley 2010-02-07 17:01:50 UTC
Created attachment 389393
xorg log from running system

Comment 2 Tom Horsley 2010-02-07 17:02:23 UTC
Created attachment 389394
dmesg from running system

Comment 3 Tom Horsley 2010-02-07 17:14:11 UTC
After I submitted this, I tried running neverputt again, and it does
indeed crash the system every time (or at least twice in a row).

Comment 4 Tom Horsley 2010-02-07 17:16:49 UTC
But glxgears runs OK, no system crash, so not all 3d apps crash:

8173 frames in 5.0 seconds = 1634.523 FPS
8204 frames in 5.0 seconds = 1640.720 FPS
8238 frames in 5.0 seconds = 1647.524 FPS
8194 frames in 5.0 seconds = 1638.727 FPS

Comment 5 Matěj Cepl 2010-02-08 14:59:17 UTC
Tom, could you reboot (when the crash happens) to runlevel 3 and collect /var/log/Xorg.0.log from there? Are you able to switch to tty2 (Ctrl-Alt-F2)? Then you could get the output of dmesg there as well. Unfortunately, both logs you have attached are from a subsequent run of the system, and they look just fine.

Thank you for filing the report.

Comment 6 Tom Horsley 2010-02-08 15:22:51 UTC
The xorg log following the crash is identical to the one from a normal
run, nothing gets added to it at crash time (I presume because it
crashes so completely that it doesn't get a chance even if it wanted to).
No tty switch is possible, no mouse movement is possible, no ssh in
from another system is possible. The system is just utterly and completely
dead. Hardware reset button is the only thing it will respond to :-).

I suppose there is actually a chance that it is something like pulseaudio
that is responsible for the crash rather than the ATI driver, but I don't
know how to tell. Anyone know a good collection of 3D apps available
in the fedora repos I could try to see which ones crash and which ones don't?

Comment 7 Matěj Cepl 2010-02-08 16:13:05 UTC
Could you give us /var/log/messages from the crashed system as well, please?

Otherwise I will just pass the bug to the developers, but my hopes are not very high.

Comment 8 Tom Horsley 2010-02-08 16:47:45 UTC
Created attachment 389572
/var/log/messages from around the crash

Boot after first crash was around Feb 7 11:33. I certainly don't see anything
useful in the log from before that, but maybe someone else can :-).

Comment 9 Tom Horsley 2010-02-08 23:48:48 UTC
I tried some more experiments to see what works and what crashes, and found
that I can login as a gnome user, enable compiz, turn on rubber windows and
cube desktops, and it all seems to work. I can even run glxgears and drag
the glxgears window around in a rubbery fashion.

I then tried installing the rss-glx-xscreensavers and brought up the
xscreensaver config dialog. As soon as I clicked on "Lattice" to see
if it would show up in the little preview window in the config dialog,
my system froze up again, so I think that confirms it isn't pulseaudio.

Comment 10 Tom Horsley 2010-02-08 23:54:26 UTC
More data: mplayer -vo gl plays videos OK, as does mplayer -vo gl2, so no
crash using opengl video drivers in mplayer.

Comment 11 Matěj Cepl 2010-02-09 12:36:21 UTC
(In reply to comment #8)
> Created an attachment (id=389572) [details]
> /var/log/messages from around the crash
> 
> Boot after first crash was around Feb 7 11:33. I certainly don't see anything
> useful in the log from before that, but maybe someone else can :-).    

It would be much more useful if you just attached the complete /var/log/messages, please.

And for testing GL, you can try the examples from the mesa-demos package.

Thank you

Comment 12 Tom Horsley 2010-02-09 12:47:28 UTC
That is a complete /var/log/messages file, I was just making sure I got the
one that included the time of the crash rather than one the log rotation
cron jobs had maybe renamed and compressed (they hadn't - it was still
the active /var/log/messages).

Comment 13 Tom Horsley 2010-02-09 21:53:10 UTC
I installed the demos and just started running all the executable files
in lexical order: arbfplight and arbfslight segfaulted, arbocclude
seemed to run OK, then bounce crashed the system as soon as the initial
screen got drawn. After that, I lost interest in running any more
demos :-).
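
Since a hard freeze leaves nothing in the usual logs, one way to tell which demo was running is to write its name to disk before launching it. Below is a minimal sketch in Python, not something used in this report: the function names, the log file, and the timeout are all made up for illustration, and it assumes an fsync'd line actually reaches the disk before the machine locks up.

```python
import os
import subprocess


def list_executables(directory):
    """Return paths of executable regular files in `directory`, sorted by name."""
    paths = []
    for name in sorted(os.listdir(directory)):
        path = os.path.join(directory, name)
        if os.path.isfile(path) and os.access(path, os.X_OK):
            paths.append(path)
    return paths


def sync_line(log, text):
    """Append one line and push it to disk before returning."""
    log.write(text + "\n")
    log.flush()
    os.fsync(log.fileno())


def run_with_breadcrumbs(directory, log_path, timeout=30):
    """Run each executable in lexical order, logging its name first.

    After a freeze and reboot, the last "starting" line without a matching
    "finished" line names the program that was running.
    """
    results = {}
    with open(log_path, "a") as log:
        for path in list_executables(directory):
            sync_line(log, "starting %s" % path)
            try:
                proc = subprocess.run([path], timeout=timeout)
                # A negative code means the program died from a signal,
                # e.g. -11 for a segfault.
                status = "exit %d" % proc.returncode
            except subprocess.TimeoutExpired:
                status = "timeout"
            except OSError as exc:
                status = "could not start (%s)" % exc
            sync_line(log, "finished %s: %s" % (path, status))
            results[path] = status
    return results


# Example call (hypothetical paths, adjust to wherever the demos are installed):
#   run_with_breadcrumbs("/path/to/demos", "/var/tmp/demo-breadcrumbs.log")
```

The timeout is there because most of the demos run until their window is closed; with it, an unattended pass moves on to the next program by itself.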

Comment 14 Tom Horsley 2010-03-06 19:30:12 UTC
Just got latest updates which included:

kernel-2.6.32.9-67.fc12.x86_64
xorg-x11-drv-ati-6.13.0-0.21.20100219gite68d3a389.fc12.x86_64

This combo works much, much better than before. I can run neverputt
and the game plays smoothly, and the bounce mesa demo doesn't crash the
system, but it still isn't free of crashes:

I was skipping around, trying various mesa demo programs with interesting
sounding names, when I ran the "gltestperf" demo. It got all the way through
the tests with the first triangle pattern it draws, and started on the 2nd,
when my system froze up the same way it used to with just neverputt running.
Unfortunately, it didn't leave any walkback in the xorg log or anything,
just a sudden total system freeze.

So it looks like attempting to beat the system to death will still crash it,
but much more normal usage now works fine.

Comment 15 Tom Horsley 2010-03-16 01:17:17 UTC
I just installed fedora 13 alpha on a spare partition and tried mesa-demos
again, and again gltestperf locks up the system. This time I had the
gnome-terminal where I could see it, and the last lines it printed
before locking up were:

Benchmark: 2
ZSmooth Triangles
Current size: 480

at that point my system turned into an inert block displaying that frozen
screen content.

This was with:

mesa-demos-7.8-0.18.fc13.x86_64
xorg-x11-drv-ati-6.13.0-0.23.20100219gite68d3a389.fc13.x86_64
kernel-2.6.33-1.fc13.x86_64

kernel mode setting is on by default, no xorg.conf file is being used.

No little breadcrumbs were left in any log files that I could see; the
system froze up too suddenly to have a chance to write anything even if
it wanted to.

Comment 16 Tom Horsley 2010-03-17 22:08:22 UTC
Just tried again with new updates:

kernel-2.6.33-1.fc13.x86_64
xorg-x11-drv-ati-6.13.0-0.24.20100316git819b40153.fc13.x86_64

I actually went through all the mesa-demo programs one by one
(saving gltestperf to the last), and they all seemed to work,
or complain about some GL property that was missing (or in
one instance a file it couldn't find). No crashes on any of
the other demos, but when I ran gltestperf, it crashed the
same way as previous comment.

New for this time though, almost all the tests printed
an infinite number of errors to stderr saying something
like:

freeglut(progname) Unknown X event type: 96

Comment 17 Tom Horsley 2010-04-06 22:45:48 UTC
Just an update with the latest kernel and ati driver in f13:

kernel-2.6.33.1-24.fc13.x86_64 (from koji)
xorg-x11-drv-ati-6.13.0-1.fc13.x86_64

Same hard system freeze running gltestperf when it gets to benchmark 2.

Comment 18 Tom Horsley 2010-06-01 00:42:56 UTC
And after official release of fedora 13 and getting all updates,
same hard system crash at gltestperf benchmark 2.

kernel-2.6.33.4-95.fc13.x86_64
xorg-x11-drv-ati-6.13.0-1.fc13.x86_64

Also tested on the hardware described in bug 541387 (RV610) and I see
the same benchmark 2 system crash.

Less stressful 3D seems to work fine on both these systems. The other mesa
demos, neverputt, desktop effects, etc. all seem to work on both these
systems.

Comment 19 Tom Horsley 2010-08-31 21:08:51 UTC
Out of curiosity, I just tried gltestperf on fedora 14 alpha (with updates)
on the same hardware.

The system no longer freezes up solid, instead, shortly after benchmark 1
starts, the screen just goes black, but the system is apparently talking because
the combination of Ctrl-Alt-F2 and Ctrl-Alt-Del seems to have made it
shut down cleanly. I see the leftover /var/log/messages file has a near
infinite amount of this stuff in it at the end:

Aug 31 16:53:42 zooty kernel: radeon 0000:01:00.0: GPU lockup CP stall for more than 1000msec
Aug 31 16:53:42 zooty kernel: ------------[ cut here ]------------
Aug 31 16:53:42 zooty kernel: WARNING: at drivers/gpu/drm/radeon/radeon_fence.c:235 radeon_fence_wait+0x22e/0x2cd [radeon]()
Aug 31 16:53:42 zooty kernel: Hardware name: TP43D2-A7
Aug 31 16:53:42 zooty kernel: GPU lockup (waiting for 0x000018B2 last fence id 0x000018B1)
Aug 31 16:53:42 zooty kernel: Modules linked in: ebtable_nat ebtables sunrpc cpufreq_ondemand acpi_cpufreq freq_table mperf bridge stp llc ip6t_REJECT nf_conntrack_ipv6 ip6table_filter ip6_tables kvm_intel kvm uinput usblp snd_hda_codec_realtek snd_ca0106 snd_ac97_codec snd_usb_audio ac97_bus snd_hda_intel snd_hda_codec snd_seq snd_pcm snd_hwdep snd_usbmidi_lib snd_rawmidi snd_seq_device snd_timer snd iTCO_wdt shpchp e1000e uvcvideo lirc_imon(C) iTCO_vendor_support lirc_dev videodev v4l1_compat v4l2_compat_ioctl32 snd_page_alloc joydev i2c_i801 ppdev soundcore parport_pc parport serio_raw microcode ipv6 ata_generic pata_acpi usb_storage pata_jmicron radeon ttm drm_kms_helper drm i2c_algo_bit i2c_core [last unloaded: scsi_wait_scan]
Aug 31 16:53:42 zooty kernel: Pid: 1988, comm: gltestperf Tainted: G         C  2.6.35.4-12.fc14.x86_64 #1
Aug 31 16:53:42 zooty kernel: Call Trace:
Aug 31 16:53:42 zooty kernel: [<ffffffff810510ea>] warn_slowpath_common+0x85/0x9d
Aug 31 16:53:42 zooty kernel: [<ffffffff810511a5>] warn_slowpath_fmt+0x46/0x48
Aug 31 16:53:42 zooty kernel: [<ffffffffa009bb9c>] radeon_fence_wait+0x22e/0x2cd [radeon]
Aug 31 16:53:42 zooty kernel: [<ffffffff8106b472>] ? autoremove_wake_function+0x0/0x39
Aug 31 16:53:42 zooty kernel: [<ffffffffa009c3b5>] radeon_sync_obj_wait+0x11/0x13 [radeon]
Aug 31 16:53:42 zooty kernel: [<ffffffffa00618a9>] ttm_bo_wait+0xab/0x16b [ttm]
Aug 31 16:53:42 zooty kernel: [<ffffffffa00aac79>] radeon_bo_wait+0xb9/0xde [radeon]
Aug 31 16:53:42 zooty kernel: [<ffffffffa00ab23a>] radeon_gem_wait_idle_ioctl+0x40/0x77 [radeon]
Aug 31 16:53:42 zooty kernel: [<ffffffff810fa472>] ? might_fault+0x5c/0xac
Aug 31 16:53:42 zooty kernel: [<ffffffffa00193b0>] drm_ioctl+0x291/0x392 [drm]
Aug 31 16:53:42 zooty kernel: [<ffffffff8107e450>] ? lock_release+0x19a/0x1a6
Aug 31 16:53:42 zooty kernel: [<ffffffffa00ab1fa>] ? radeon_gem_wait_idle_ioctl+0x0/0x77 [radeon]
Aug 31 16:53:42 zooty kernel: [<ffffffff81100278>] ? remove_vma+0x7f/0x87
Aug 31 16:53:42 zooty kernel: [<ffffffff8111975a>] ? check_object+0x179/0x1b4
Aug 31 16:53:42 zooty kernel: [<ffffffff81100278>] ? remove_vma+0x7f/0x87
Aug 31 16:53:42 zooty kernel: [<ffffffff81136f7c>] vfs_ioctl+0x36/0xa7
Aug 31 16:53:42 zooty kernel: [<ffffffff811378f5>] do_vfs_ioctl+0x47c/0x4af
Aug 31 16:53:42 zooty kernel: [<ffffffff81100278>] ? remove_vma+0x7f/0x87
Aug 31 16:53:42 zooty kernel: [<ffffffff8112a4d2>] ? fcheck_files+0x7b/0xe0
Aug 31 16:53:42 zooty kernel: [<ffffffff8113797e>] sys_ioctl+0x56/0x7c
Aug 31 16:53:42 zooty kernel: [<ffffffff81009c72>] system_call_fastpath+0x16/0x1b
Aug 31 16:53:42 zooty kernel: ---[ end trace 2b04186ae4625054 ]---
Aug 31 16:53:42 zooty kernel: Failed to wait GUI idle while programming pipes. Bad things might happen.
Aug 31 16:53:42 zooty kernel: radeon 0000:01:00.0: (r300_asic_reset:415) RBBM_STATUS=0x84010140
Aug 31 16:53:42 zooty kernel: [drm:radeon_ib_schedule] *ERROR* radeon: couldn't schedule IB(4).
Aug 31 16:53:42 zooty kernel: [drm:radeon_cs_ioctl] *ERROR* Faild to schedule IB !
Aug 31 16:53:43 zooty kernel: radeon 0000:01:00.0: (r300_asic_reset:434) RBBM_STATUS=0x84010140
Aug 31 16:53:43 zooty kernel: [drm:radeon_ib_schedule] *ERROR* radeon: couldn't schedule IB(5).
Aug 31 16:53:43 zooty kernel: [drm:radeon_cs_ioctl] *ERROR* Faild to schedule IB !
Aug 31 16:53:43 zooty kernel: radeon 0000:01:00.0: (r300_asic_reset:446) RBBM_STATUS=0x84000140
Aug 31 16:53:43 zooty kernel: radeon 0000:01:00.0: failed to reset GPU
Aug 31 16:53:43 zooty kernel: radeon 0000:01:00.0: GPU reset failed
Aug 31 16:53:43 zooty kernel: [drm:radeon_ib_schedule] *ERROR* radeon: couldn't schedule IB(6).
Aug 31 16:53:43 zooty kernel: [drm:radeon_cs_ioctl] *ERROR* Faild to schedule IB !
Aug 31 16:53:43 zooty kernel: [drm:radeon_ib_schedule] *ERROR* radeon: couldn't schedule IB(7).
Aug 31 16:53:43 zooty kernel: [drm:radeon_cs_ioctl] *ERROR* Faild to schedule IB !
Aug 31 16:53:43 zooty kernel: [drm:radeon_ib_schedule] *ERROR* radeon: couldn't schedule IB(8).
Aug 31 16:53:43 zooty kernel: [drm:radeon_cs_ioctl] *ERROR* Faild to schedule IB !
Aug 31 16:53:43 zooty kernel: [drm:radeon_ib_schedule] *ERROR* radeon: couldn't schedule IB(9).
Aug 31 16:53:43 zooty kernel: [drm:radeon_cs_ioctl] *ERROR* Faild to schedule IB !
Aug 31 16:53:43 zooty kernel: [drm:radeon_ib_schedule] *ERROR* radeon: couldn't schedule IB(10).
Aug 31 16:53:43 zooty kernel: [drm:radeon_cs_ioctl] *ERROR* Faild to schedule IB !
Aug 31 16:53:43 zooty kernel: [drm:radeon_ib_schedule] *ERROR* radeon: couldn't schedule IB(11).
Aug 31 16:53:43 zooty kernel: [drm:radeon_cs_ioctl] *ERROR* Faild to schedule IB !
Aug 31 16:53:43 zooty kernel: [drm:radeon_ib_schedule] *ERROR* radeon: couldn't schedule IB(12).
Aug 31 16:53:43 zooty kernel: [drm:radeon_cs_ioctl] *ERROR* Faild to schedule IB !
Aug 31 16:53:43 zooty kernel: [drm:radeon_ib_schedule] *ERROR* radeon: couldn't schedule IB(13).
Aug 31 16:53:43 zooty kernel: [drm:radeon_cs_ioctl] *ERROR* Faild to schedule IB !
Aug 31 16:53:43 zooty kernel: [drm:radeon_ib_schedule] *ERROR* radeon: couldn't schedule IB(14).
Aug 31 16:53:43 zooty kernel: [drm:radeon_cs_ioctl] *ERROR* Faild to schedule IB !
Aug 31 16:53:43 zooty kernel: [drm:radeon_ib_schedule] *ERROR* radeon: couldn't schedule IB(15).
Aug 31 16:53:43 zooty kernel: [drm:radeon_cs_ioctl] *ERROR* Faild to schedule IB !
... those last errors go on pretty much forever
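
The repeated messages above are easier to scan once collapsed into counts. The following Python sketch is only an illustration: its regular expressions and the `summarize` helper are assumptions written against the excerpt in this comment, not a general parser for radeon kernel messages. The sample lines are copied from the log above.

```python
import re
from collections import Counter

# Patterns written against the excerpt above; an assumption, not a spec.
IB_RE = re.compile(r"couldn't schedule IB\((\d+)\)")
RBBM_RE = re.compile(r"RBBM_STATUS=(0x[0-9A-Fa-f]+)")


def summarize(lines):
    """Collapse repeated radeon errors from syslog-style lines."""
    counts = Counter()
    ib_ids = []
    rbbm_status = []  # distinct values, in order of first appearance
    for line in lines:
        match = IB_RE.search(line)
        if match:
            counts["ib_schedule_failed"] += 1
            ib_ids.append(int(match.group(1)))
        match = RBBM_RE.search(line)
        if match and match.group(1) not in rbbm_status:
            rbbm_status.append(match.group(1))
        if "GPU lockup CP stall" in line:
            counts["gpu_lockup"] += 1
        if "GPU reset failed" in line:
            counts["gpu_reset_failed"] += 1
    return {
        "counts": dict(counts),
        "ib_range": (min(ib_ids), max(ib_ids)) if ib_ids else None,
        "rbbm_status": rbbm_status,
    }


SAMPLE = [
    "Aug 31 16:53:42 zooty kernel: radeon 0000:01:00.0: GPU lockup CP stall for more than 1000msec",
    "Aug 31 16:53:42 zooty kernel: radeon 0000:01:00.0: (r300_asic_reset:415) RBBM_STATUS=0x84010140",
    "Aug 31 16:53:42 zooty kernel: [drm:radeon_ib_schedule] *ERROR* radeon: couldn't schedule IB(4).",
    "Aug 31 16:53:43 zooty kernel: radeon 0000:01:00.0: (r300_asic_reset:434) RBBM_STATUS=0x84010140",
    "Aug 31 16:53:43 zooty kernel: [drm:radeon_ib_schedule] *ERROR* radeon: couldn't schedule IB(5).",
    "Aug 31 16:53:43 zooty kernel: radeon 0000:01:00.0: (r300_asic_reset:446) RBBM_STATUS=0x84000140",
    "Aug 31 16:53:43 zooty kernel: radeon 0000:01:00.0: GPU reset failed",
    "Aug 31 16:53:43 zooty kernel: [drm:radeon_ib_schedule] *ERROR* radeon: couldn't schedule IB(6).",
]

print(summarize(SAMPLE))
```

To run it over a whole file, pass the open file object instead of `SAMPLE`, since `summarize` accepts any iterable of lines. In the excerpt, the two RBBM_STATUS values differ only in bit 16 (0x00010000).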

Various possibly relevant rpms on the system:

kernel-2.6.35.4-12.fc14.x86_64
libdrm-2.4.21-3.fc14.x86_64
mesa-libGL-7.9-0.6.fc14.x86_64
mesa-libGLU-devel-7.9-0.6.fc14.x86_64
mesa-demos-7.9-0.6.fc14.x86_64
mesa-dri-drivers-7.9-0.6.fc14.x86_64
mesa-libGL-devel-7.9-0.6.fc14.x86_64
mesa-libGLU-7.9-0.6.fc14.x86_64
xorg-x11-drv-ati-6.13.1-0.3.20100705git37b348059.fc14.x86_64
xorg-x11-server-common-1.9.0-4.fc14.x86_64
xorg-x11-server-utils-7.4-19.fc14.x86_64
xorg-x11-server-Xorg-1.9.0-4.fc14.x86_64

Comment 20 Bug Zapper 2010-11-03 22:48:59 UTC
This message is a reminder that Fedora 12 is nearing its end of life.
Approximately 30 (thirty) days from now Fedora will stop maintaining
and issuing updates for Fedora 12.  It is Fedora's policy to close all
bug reports from releases that are no longer maintained.  At that time
this bug will be closed as WONTFIX if it remains open with a Fedora 
'version' of '12'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version' 
to a later Fedora version prior to Fedora 12's end of life.

Bug Reporter: Thank you for reporting this issue and we are sorry that 
we may not be able to fix it before Fedora 12 is end of life.  If you 
would still like to see this bug fixed and are able to reproduce it 
against a later version of Fedora please change the 'version' of this 
bug to the applicable version.  If you are unable to change the version, 
please add a comment here and someone will do it for you.

Although we aim to fix as many bugs as possible during every release's 
lifetime, sometimes those efforts are overtaken by events.  Often a 
more recent Fedora release includes newer upstream software that fixes 
bugs or makes them obsolete.

The process we are following is described here: 
http://fedoraproject.org/wiki/BugZappers/HouseKeeping

Comment 21 Tom Horsley 2010-11-03 23:56:43 UTC
Still freezes on f13, and gives comment #19 results as shown on f14.

Comment 22 Bug Zapper 2011-06-02 16:39:35 UTC
This message is a reminder that Fedora 13 is nearing its end of life.
Approximately 30 (thirty) days from now Fedora will stop maintaining
and issuing updates for Fedora 13.  It is Fedora's policy to close all
bug reports from releases that are no longer maintained.  At that time
this bug will be closed as WONTFIX if it remains open with a Fedora 
'version' of '13'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version' 
to a later Fedora version prior to Fedora 13's end of life.

Bug Reporter: Thank you for reporting this issue and we are sorry that 
we may not be able to fix it before Fedora 13 is end of life.  If you 
would still like to see this bug fixed and are able to reproduce it 
against a later version of Fedora please change the 'version' of this 
bug to the applicable version.  If you are unable to change the version, 
please add a comment here and someone will do it for you.

Although we aim to fix as many bugs as possible during every release's 
lifetime, sometimes those efforts are overtaken by events.  Often a 
more recent Fedora release includes newer upstream software that fixes 
bugs or makes them obsolete.

The process we are following is described here: 
http://fedoraproject.org/wiki/BugZappers/HouseKeeping

Comment 23 Bug Zapper 2011-06-27 14:54:25 UTC
Fedora 13 changed to end-of-life (EOL) status on 2011-06-25. Fedora 13 is 
no longer maintained, which means that it will not receive any further 
security or bug fix updates. As a result we are closing this bug.

If you can reproduce this bug against a currently maintained version of 
Fedora please feel free to reopen this bug against that version.

Thank you for reporting this bug and we are sorry it could not be fixed.

