Bug 472505 - X server instability (nomodeset, RS690)
Status: CLOSED RAWHIDE
Product: Fedora
Classification: Fedora
Component: xorg-x11-drv-ati
Version: 11
Hardware: All Linux
Priority: medium Severity: medium
Assigned To: Dave Airlie
QA Contact: Fedora Extras Quality Assurance
Reported: 2008-11-21 04:39 EST by Jan Martinek
Modified: 2009-09-06 02:01 EDT (History)

Doc Type: Bug Fix
Last Closed: 2009-09-06 02:01:50 EDT


Attachments
Xorg.0.log (69.65 KB, text/plain)
2008-11-21 05:11 EST, Michal Schmidt
a better Xorg.0.log (includes a backtrace) (74.94 KB, text/plain)
2008-11-21 05:34 EST, Michal Schmidt
dmesg output taken during the hang (36.56 KB, text/plain)
2008-11-21 05:35 EST, Michal Schmidt
Description Jan Martinek 2008-11-21 04:39:37 EST
Description of problem:
From time to time the X server freezes and becomes unusable. Only the pointer responds, jerkily, to mouse movement. The server does not respond to the keyboard; even Ctrl+Alt+BackSpace does not work. Every time this happens, I must reboot (I use Alt+SysRq+B). It is possible to log in to the computer remotely, and top shows that Xorg eats 117% CPU (I have a dual-core CPU).

The freeze is more likely when I am using a browser, gimp, or audacious (here I get 100% reproducibility).

I boot with the nomodeset kernel parameter, so I don't use kernel mode setting. After removing the parameter the system is stable.

Version-Release number of selected component (if applicable):
kernel-2.6.27.5-117.fc10.x86_64 (with nomodeset)
xorg-x11-server-common-1.5.3-5.fc10.x86_64
xorg-x11-drv-ati-6.9.0-54.fc10.x86_64

GPU:
ATI Technologies Inc RS690 [Radeon X1200 Series]

Steps to Reproduce:
1. add "nomodeset" kernel parameter
2. After booting, run audacious
3. Try to display playlist - and be ready to reset the machine.
  
Actual results:
The X server hangs.
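
Since the hang depends on whether the kernel was booted with nomodeset, it can help to confirm the running kernel's command line before testing. A minimal sketch in Python, assuming a Linux system that exposes /proc/cmdline; the helper names are made up for illustration:

```python
from pathlib import Path


def has_kernel_param(cmdline: str, name: str) -> bool:
    """Return True if `name` appears as a bare token or as name=value."""
    for token in cmdline.split():
        if token == name or token.startswith(name + "="):
            return True
    return False


def running_kernel_has_nomodeset() -> bool:
    """Check the command line of the currently running kernel."""
    # /proc/cmdline holds the parameters the running kernel was booted with.
    return has_kernel_param(Path("/proc/cmdline").read_text(), "nomodeset")
```

Calling running_kernel_has_nomodeset() before launching audacious tells you which of the two configurations (with or without kernel mode setting) is actually being tested.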
Comment 1 Michal Schmidt 2008-11-21 05:08:45 EST
I can reproduce it.

In step 3 I'd add: If it does not freeze immediately, grab the playlist's corner and wiggle the mouse to make it resize and redraw repeatedly. It will hang within 2 seconds.

My hw is ATI Technologies Inc RS690M [Radeon X1200 Series]

I can ssh in, even though it reacts slowly. top shows X eating 100% CPU. strace shows an infinite loop of:

ioctl(11, 0xc0406429, 0x7fff0e99c3d0)   = -1 EBUSY (Device or resource busy)
--- SIGALRM (Alarm clock) @ 0 (0) ---
rt_sigreturn(0xe)                       = -1 EBUSY (Device or resource busy)
ioctl(11, 0xc0406429, 0x7fff0e99c3d0)   = -1 EBUSY (Device or resource busy)
--- SIGALRM (Alarm clock) @ 0 (0) ---
rt_sigreturn(0xe)                       = -1 EBUSY (Device or resource busy)
ioctl(11, 0xc0406429, 0x7fff0e99c3d0)   = -1 EBUSY (Device or resource busy)
--- SIGALRM (Alarm clock) @ 0 (0) ---
rt_sigreturn(0xe)                       = -1 EBUSY (Device or resource busy)
...

fd 11 is /dev/dri/card0
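
The request number in the strace loop can be split into its fields to see which ioctl keeps failing. A minimal sketch, assuming the generic Linux ioctl number layout used on x86_64 (8 bits command number, 8 bits type, 14 bits argument size, 2 bits direction); the function and type names are illustrative only:

```python
from typing import NamedTuple


class IoctlFields(NamedTuple):
    direction: str
    size: int
    type_char: str
    nr: int


# Direction bits as defined by the generic ioctl encoding.
_DIRECTIONS = {0: "none", 1: "write", 2: "read", 3: "read|write"}


def decode_ioctl(request: int) -> IoctlFields:
    """Split an ioctl request number into direction, size, type and nr."""
    nr = request & 0xFF                    # bits 0-7: command number
    type_char = chr((request >> 8) & 0xFF)  # bits 8-15: driver "type" byte
    size = (request >> 16) & 0x3FFF        # bits 16-29: argument size in bytes
    direction = _DIRECTIONS[(request >> 30) & 0x3]  # bits 30-31
    return IoctlFields(direction, size, type_char, nr)


print(decode_ioctl(0xC0406429))
# IoctlFields(direction='read|write', size=64, type_char='d', nr=41)
```

Type 'd' is the DRM ioctl family, and nr 41 (0x29) with a 64-byte read/write argument appears to correspond to DRM_IOCTL_DMA in libdrm's drm.h. That would be consistent with the drmDMA frame in the backtrace quoted in comment 31.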
Comment 2 Michal Schmidt 2008-11-21 05:11:47 EST
Created attachment 324287
Xorg.0.log

Here's my Xorg.0.log. I'm not going to add xorg.conf because I don't use any.
Comment 3 Michal Schmidt 2008-11-21 05:34:05 EST
Created attachment 324288
a better Xorg.0.log (includes a backtrace)

Still happens with kernel-2.6.27.5-120.fc10.x86_64 and xorg-x11-drv-ati-6.9.0-55.fc10.x86_64 from Koji.
Comment 4 Michal Schmidt 2008-11-21 05:35:39 EST
Created attachment 324289
dmesg output taken during the hang
Comment 5 Michal Schmidt 2008-11-21 05:43:48 EST
A possibly interesting observation: I can issue 'init 6' via ssh and the machine reboots eventually. It takes about 3 minutes of shutting down before it issues the reboot itself. Afterwards my BIOS seems to be confused by the state of the graphics card - it displays the boot logo for a few minutes, then clears the screen, but never continues. I have to do a full reset.
Comment 6 Michal Schmidt 2008-11-21 09:05:31 EST
Trying to find a workaround I added a minimal xorg.conf with:
Option "DRI" "off"
in the Device section for radeon. However, this affects performance very badly (gtkperf is 5 times slower, text rendering being hit the most).
Comment 7 Dave Airlie 2008-11-23 23:02:34 EST
I've just kicked off a new kernel build in koji.

kernel-2.6.27.5-123.fc10

it'll appear here when finished.

http://kojipkgs.fedoraproject.org/packages/kernel/2.6.27.5/123.fc10/

Can you install it and see if it helps?
Comment 8 Jan Martinek 2008-11-24 04:35:59 EST
(In reply to comment #7)

Hello, I installed the new kernel-2.6.27.5-123.fc10 but nothing changed. The hang is still reproducible using audacious's playlist when the nomodeset kernel parameter is set.
Comment 9 Dave Airlie 2008-11-25 01:45:02 EST
If you could try xorg-x11-drv-ati-6.9.0-57 on top of that kernel and see whether it helps, that would be great.

I'm still doing more investigation on my rs690 to try and narrow down what is happening.
Comment 10 Michal Schmidt 2008-11-25 04:25:24 EST
Still the same result with xorg-x11-drv-ati-6.9.0-57.
Comment 11 Jan Martinek 2008-11-25 07:28:07 EST
I confirm this. With xorg-x11-drv-ati-6.9.0-57 and kernel-2.6.27.5-123.fc10 the behaviour is still the same.
Comment 12 Dave Airlie 2008-11-25 21:13:03 EST
Okay, give -58 a whirl. Please report if it's unusably slow or anything like that.

I've put in a workaround while I wait for AMD to see if they can narrow down what is happening.
Comment 13 Bug Zapper 2008-11-26 00:43:04 EST
This bug appears to have been reported against 'rawhide' during the Fedora 10 development cycle.
Changing version to '10'.

More information and reason for this action is here:
http://fedoraproject.org/wiki/BugZappers/HouseKeeping
Comment 14 Michal Schmidt 2008-11-26 02:45:54 EST
This hang is still trivially reproducible with xorg-x11-drv-ati-6.9.0-58.
Comment 15 Michal Schmidt 2008-11-26 03:22:52 EST
For me it always hangs once the height of audacious's playlist window is more than 350px. Width has no influence.
Comment 16 Fedora Update System 2008-11-27 18:54:53 EST
xorg-x11-drv-ati-6.9.0-59.fc10 has been submitted as an update for Fedora 10.
http://admin.fedoraproject.org/updates/xorg-x11-drv-ati-6.9.0-59.fc10
Comment 17 Jan Martinek 2008-11-27 20:35:11 EST
I tried xorg-x11-drv-ati-6.9.0-59.fc10, but the bug is still present. There is certainly some noticeable change. Audacious's playlist does not hang the X server immediately but after an attempt to resize the window to make it tall (as Michal wrote).
Comment 18 Dave Airlie 2008-11-27 20:54:18 EST
Do you have a long playlist in there or what? I've been sitting here resizing my audacious like crazy to no avail.

Can I get a newer Xorg log file from a system running with modesetting?
Comment 19 Dave Airlie 2008-11-27 21:02:45 EST
Ah, it only happens with nomodeset; I should read the rest of the bug.

I'll see what I can fix in this, but really modesetting is the future, and fixing the past is less of a priority for me.
Comment 20 Oren Nachman 2008-12-02 23:01:53 EST
Unfortunately some of us (usually older) ATI owners seem to be stuck in the middle during the transition. I can't boot without "nomodeset" because then my dual monitor setup doesn't work - I can't boot *with* because then the machine locks up.

Catch 22 :(
Comment 21 Jan Martinek 2008-12-07 05:49:03 EST
I am now using

kernel-2.6.28-0.113.rc7.git5.fc11.x86_64
xorg-x11-drv-ati-6.9.0-61.fc10.x86_64
xorg-x11-server-common-1.5.3-5.fc10.x86_64

and this bug seems to be fixed. I cannot force the X server to hang anymore using audacious or other means. It has been running for several days without freezing.
So, thanks everyone :-)
Comment 22 Mads Kiilerich 2008-12-11 20:21:09 EST
Michal Schmidt, can you confirm that it also works for you?
Comment 23 Michal Schmidt 2008-12-12 07:18:24 EST
The X server does not hang anymore with kernel-2.6.28-0.113.rc7.git5.fc11.x86_64.
But notice that this kernel produces a flood of debug messages in dmesg, like:

[drm:drm_ioctl] pid=2491, cmd=0xc0406429, nr=0x29, dev 0xe200, auth=1
[drm:radeon_freelist_get] done_age = 126826
[drm:drm_ioctl] pid=2491, cmd=0xc010644d, nr=0x4d, dev 0xe200, auth=1
[drm:radeon_cp_indirect] idx=11 s=0 e=8 d=1
[drm:radeon_cp_dispatch_indirect] buf=11 s=0x0 e=0x8
[drm:drm_ioctl] pid=2491, cmd=0xc0406429, nr=0x29, dev 0xe200, auth=1
[drm:radeon_freelist_get] done_age = 126827
[drm:drm_ioctl] pid=2491, cmd=0xc010644d, nr=0x4d, dev 0xe200, auth=1
[drm:radeon_cp_indirect] idx=12 s=0 e=208 d=1
[drm:radeon_cp_dispatch_indirect] buf=12 s=0x0 e=0xd0
[drm:drm_ioctl] pid=2491, cmd=0xc0406429, nr=0x29, dev 0xe200, auth=1
[drm:radeon_freelist_get] done_age = 126828
[drm:drm_ioctl] pid=2491, cmd=0xc010644d, nr=0x4d, dev 0xe200, auth=1
[drm:radeon_cp_indirect] idx=13 s=0 e=1376 d=1
[drm:radeon_cp_dispatch_indirect] buf=13 s=0x0 e=0x560
[drm:drm_ioctl] pid=2491, cmd=0xc0406429, nr=0x29, dev 0xe200, auth=1
[drm:radeon_freelist_get] done_age = 126828
[drm:drm_ioctl] pid=2491, cmd=0xc010644d, nr=0x4d, dev 0xe200, auth=1
[drm:radeon_cp_indirect] idx=14 s=0 e=136 d=1
[drm:radeon_cp_dispatch_indirect] buf=14 s=0x0 e=0x88

It is possible that printing these messages changes the timing enough to prevent the hang.
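
To see how heavily each ioctl contributes to this flood, the drm_ioctl lines can be tallied per command number. A minimal sketch, assuming the dmesg line format shown above; the function name is made up for illustration:

```python
import re
from collections import Counter

# Matches lines such as:
# [drm:drm_ioctl] pid=2491, cmd=0xc0406429, nr=0x29, dev 0xe200, auth=1
_IOCTL_LINE = re.compile(
    r"\[drm:drm_ioctl\] pid=(\d+), cmd=(0x[0-9a-fA-F]+), nr=(0x[0-9a-fA-F]+)"
)


def count_drm_ioctls(dmesg_text: str) -> Counter:
    """Count drm_ioctl debug lines, keyed by the ioctl command number (nr)."""
    counts = Counter()
    for match in _IOCTL_LINE.finditer(dmesg_text):
        counts[int(match.group(3), 16)] += 1
    return counts
```

In the excerpt above, nr=0x29 (the same request that returns EBUSY in the strace loop from comment 1) alternates one-to-one with nr=0x4d, which is presumably the radeon indirect-buffer submission, given the radeon_cp_indirect lines that follow each one.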

The hang is back with kernel -114 (where I'm seeing no such debug messages) and later builds. The bug is not fixed. Additionally I'm getting corrupted pixmaps with these later kernels and nomodeset.

Anyway, I started using modesetting as soon as the major bugs with it on RS690 were fixed, so this bug is not too important for me at the moment.
Comment 24 Jan Martinek 2008-12-14 17:32:22 EST
Unfortunately that is right. The kernel 2.6.28-0.113.rc7.git5.fc11.x86_64 works fine (and I am using it now) but since -114 the bugs are back again.
I just don't have the messages in dmesg as listed above:
$ dmesg | grep drm
[drm] Initialized drm 1.1.0 20060810
[drm] Initialized radeon 1.29.0 20080528 on minor 0
[drm] Setting GART location based on new memory map
[drm] Loading RS690/RS740 Microcode
[drm] Num pipes: 1
[drm] writeback test succeeded in 1 usecs
Comment 25 Yanko Kaneti 2009-01-09 08:33:15 EST
Also happened to me yesterday with
kernel-2.6.28-3.fc11.i686 and xorg-x11-drv-ati-6.10.0-1.fc11.i386
Comment 26 Michal Schmidt 2009-01-30 10:45:48 EST
I got this X hang again today with
xorg-x11-drv-ati-6.10.0-1.fc10.x86_64
kernel-2.6.29-0.6.rc3.fc10.x86_64

This is the current Koji build of the kernel for F10. I should have expected the hangs would be back, since it has CONFIG_DRM_RADEON_KMS disabled (temporarily, according to the changelog).
Comment 27 Jan Martinek 2009-02-01 04:49:04 EST
I get the same results with kernel-2.6.29-0.66.rc3.fc11.x86_64. Here is another way to force the hang:

1) open any PDF in evince
2) zoom in until the horizontal scrollbar appears
3) move the horizontal scrollbar with the mouse
Comment 28 Orion Poplawski 2009-03-30 17:16:59 EDT
Very similar with Radeon XPRESS 200M 5955 (PCIE).  Cursor moves but that is it.  strace tight loop:

--- SIGALRM (Alarm clock) @ 0 (0) ---
rt_sigreturn(0xe)                       = 49014496

cpu usage is only a couple percent though.

kernel-2.6.29-16.fc11.x86_64
xorg-x11-drv-ati-6.12.0-2.fc11.x86_64
Comment 29 Michal Schmidt 2009-03-31 09:43:21 EDT
I upgraded my laptop to current Rawhide and tested both reproducers (audacious playlist resizing, evince horizontal scrolling) with nomodeset. I am not able to reproduce the hang with nomodeset anymore.
xorg-x11-drv-ati-6.12.0-2.fc11.x86_64
kernel-2.6.29-21.fc11.x86_64

[OTOH, I am now getting frequent hangs _with_ KMS. So the situation has reversed (with F10 it was stable enough only with KMS while nomodeset was hanging). I'll report it as a new bug.]
Comment 30 Hin-Tak Leung 2009-03-31 21:47:29 EDT
I can quite readily get the X server to stop responding (whereas ssh from outside still works) by running xpdf... so I have switched to acroread for my pdf needs...
Comment 31 Matěj Cepl 2009-04-01 04:12:56 EDT
Lovely!

Backtrace:
0: /usr/bin/Xorg(xorg_backtrace+0x26) [0x4e7a26]
1: /usr/bin/Xorg(mieqEnqueue+0x291) [0x4c8591]
2: /usr/bin/Xorg(xf86PostMotionEventP+0xc4) [0x491494]
3: /usr/bin/Xorg(xf86PostMotionEvent+0xa9) [0x491669]
4: /usr/lib64/xorg/modules/input//synaptics_drv.so [0x183b832]
5: /usr/lib64/xorg/modules/input//synaptics_drv.so [0x183dde2]
6: /usr/bin/Xorg [0x47a765]
7: /usr/bin/Xorg [0x46b307]
8: /lib64/libc.so.6 [0x3554432f60]
9: /usr/bin/Xorg [0x4eb4c0]
10: /lib64/libc.so.6 [0x3554432f60]
11: /lib64/libc.so.6(ioctl+0x7) [0x35544ddff7]
12: /usr/lib64/libdrm.so.2(drmDMA+0x7d) [0x356d603c0d]
13: /usr/lib64/xorg/modules/drivers//radeon_drv.so(RADEONCPGetBuffer+0x159) [0x7075e69]
14: /usr/lib64/xorg/modules/drivers//radeon_drv.so [0x70cbcb3]
15: /usr/lib64/xorg/modules//libexa.so [0x4c735f9]
16: /usr/lib64/xorg/modules//libexa.so [0x4c73c7f]
17: /usr/lib64/xorg/modules//libexa.so(exaDoMigration+0x68f) [0x4c7446f]
18: /usr/lib64/xorg/modules//libexa.so [0x4c75892]
19: /usr/lib64/xorg/modules//libexa.so(exaComposite+0x645) [0x4c762d5]
20: /usr/bin/Xorg [0x5291b8]
21: /usr/bin/Xorg [0x5183fa]
22: /usr/bin/Xorg(Dispatch+0x364) [0x4468d4]
23: /usr/bin/Xorg(main+0x45d) [0x42cd1d]
24: /lib64/libc.so.6(__libc_start_main+0xe6) [0x355441e546]
25: /usr/bin/Xorg [0x42c0f9]
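
For reading traces like this one, it can help to keep only the frames that carry a symbol name. A minimal sketch, assuming the "N: path(symbol+0xoffset) [0xaddress]" frame format shown above; the function name is made up for illustration:

```python
import re
from typing import List, Tuple

# One frame per line; the "(symbol+0xoffset)" part is optional.
_FRAME = re.compile(
    r"^\s*(\d+): (\S+?)(?:\((\w+)\+0x[0-9a-f]+\))? \[0x[0-9a-f]+\]\s*$"
)


def named_frames(backtrace: str) -> List[Tuple[int, str, str]]:
    """Return (frame index, module file name, symbol) for symbolized frames."""
    frames = []
    for line in backtrace.splitlines():
        match = _FRAME.match(line)
        if match and match.group(3):
            module = match.group(2).rsplit("/", 1)[-1]
            frames.append((int(match.group(1)), module, match.group(3)))
    return frames
```

Applied to the trace above, the named frames from 19 down to 11 read exaComposite, exaDoMigration, RADEONCPGetBuffer, drmDMA, ioctl. That matches the ioctl loop on /dev/dri/card0 seen with strace in comment 1.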
Comment 32 Michal Schmidt 2009-04-16 16:58:34 EDT
Scratch comment #29. I can reproduce the bug again in Rawhide by resizing Audacious's playlist height to 350px. Sometimes I get a spontaneous hang with the same symptoms while scrolling in Firefox. I could not reproduce it with evince or xpdf.

This is with:
kernel-2.6.29.1-85.fc11.x86_64
xorg-x11-drv-ati-6.12.2-4.fc11.x86_64

So right now I have unstable X without KMS because of this bug and I have unstable X with KMS because of bug 493068.
Comment 33 Hin-Tak Leung 2009-05-23 07:28:42 EDT
I haven't had a hang since I started doing my own patch:

https://bugzilla.redhat.com/show_bug.cgi?id=497427#c20

FWIW, effective TX_CLAMP patches were only in between -6 and -11 - then David Airlie went for something different in -12 onwards.

I am on a modified version of xorg-x11-drv-ati-6.12.2-15 ATM.
Comment 34 Michal Schmidt 2009-09-04 11:02:51 EDT
Still reproducible in F11 with:

kernel-2.6.30.5-43.fc11.x86_64
xorg-x11-drv-ati-6.12.2-18.fc11.x86_64
xorg-x11-server-Xorg-1.6.3-4.fc11.x86_64

Good news: I can't reproduce this bug in Rawhide:

kernel-2.6.31-0.199.rc8.git2.fc12.x86_64
xorg-x11-drv-ati-6.13.0-0.2.20090821gitb1b77a4d6.fc12.x86_64
xorg-x11-server-Xorg-1.6.99-45.20090903.fc12.x86_64

(Bad news: in Rawhide I have serious problems when using KMS - severe general slowness, and starting compiz kills the box. Different BZs.)
Comment 35 Vedran Miletić 2009-09-06 02:01:50 EDT
Closing per comment #34.
