Bug 517625 - Xorg stops responding and consumes 93%+ CPU time (radeon driver pcie_aspm issue)
Status: CLOSED CANTFIX
Product: Fedora
Classification: Fedora
Component: xorg-x11-drv-ati
Version: rawhide
Hardware: All Linux
Priority: low
Severity: high
Assigned To: Dave Airlie
QA Contact: Fedora Extras Quality Assurance
Whiteboard: card_R600
Duplicates: 524368
Blocks: eq-overflow F12Blocker/F12FinalBlocker
Reported: 2009-08-15 06:06 EDT by Quentin Armitage
Modified: 2009-11-01 21:17 EST (History)
Last Closed: 2009-11-01 17:18:21 EST


Attachments
Xorg.0.log from when problem occurred. (32.83 KB, text/plain)
2009-08-15 06:06 EDT, Quentin Armitage
xorg log from similar problem. (17.03 KB, text/plain)
2009-09-30 11:13 EDT, Matthew Miller
.config I use as requested by airlied (98.12 KB, text/plain)
2009-10-08 17:07 EDT, Kevin DeKorte
gdb bt -f output (51.50 KB, text/plain)
2009-10-12 09:38 EDT, Quentin Armitage
strace -p output (958 bytes, text/plain)
2009-10-12 09:39 EDT, Quentin Armitage
.config file used to build kernel 2.6.32-rc4 (98.22 KB, text/plain)
2009-10-12 10:51 EDT, Kevin DeKorte
diff .config /boot/config-2.6.32-0.24.rc4.git0.fc13.x86_64 (6.08 KB, text/plain)
2009-10-12 10:52 EDT, Kevin DeKorte

Description Quentin Armitage 2009-08-15 06:06:58 EDT
Created attachment 357537 [details]
Xorg.0.log from when problem occurred.

Description of problem:
Xorg stopped responding to keyboard input. Mouse movement continued, but no response to mouse clicks. Xorg was consuming all available CPU time.

Version-Release number of selected component (if applicable):
Don't know which of these are relevant, so listing all for completeness:
kernel-2.6.31-0.125.4.2.rc5.git2.fc12.i686
xorg-x11-drv-apm-1.2.2-1.fc12.i686
xorg-x11-drv-penmount-1.4.0-4.fc12.i686
xorg-x11-drv-sisusb-0.9.3-1.fc12.i686
xorg-x11-fonts-ISO8859-1-75dpi-7.2-9.fc12.noarch
xorg-x11-drv-tdfx-1.4.3-1.fc12.i686
xorg-x11-drv-aiptek-1.2.0-3.fc12.i686
xorg-x11-server-Xorg-1.6.99-33.20090807.fc12.i686
xorg-x11-drv-voodoo-1.2.3-1.fc12.i686
xorg-x11-drv-rendition-4.2.2-2.fc12.1.i686
xorg-x11-drv-mga-1.4.11-1.fc12.i686
xorg-x11-server-devel-1.6.99-33.20090807.fc12.i686
xorg-x11-drv-synaptics-1.1.99-5.20090728.fc12.i686
xorg-x11-drv-cirrus-1.3.2-1.fc12.i686
xorg-x11-drv-r128-6.8.1-1.fc12.i686
xorg-x11-fonts-misc-7.2-9.fc12.noarch
xorg-x11-drv-openchrome-0.2.903-14.fc12.i686
xorg-x11-drv-keyboard-1.3.99-3.20090715.fc12.1.i686
xorg-x11-twm-1.0.3-5.fc12.i686
xorg-x11-drv-vmware-10.16.7-1.fc12.i686
xorg-x11-xauth-1.0.2-7.fc12.i686
xorg-x11-proto-devel-7.4-27.fc12.noarch
xorg-x11-drv-nv-2.1.14-4.fc12.i686
xorg-x11-drv-v4l-0.2.0-3.fc12.1.i686
xorg-x11-drv-mach64-6.8.2-1.fc12.i686
xorg-x11-drv-fpit-1.3.0-4.fc12.i686
xorg-x11-xtrans-devel-1.2.2-4.fc12.noarch
xorg-x11-xdm-1.1.6-14.fc12.i686
xorg-x11-drv-vmmouse-12.6.4-3.fc12.1.i686
xorg-x11-fonts-Type1-7.2-9.fc12.noarch
xorg-x11-drv-siliconmotion-1.7.3-1.fc12.i686
xorg-x11-util-macros-1.2.2-2.fc12.noarch
xorg-x11-fonts-ISO8859-1-100dpi-7.2-9.fc12.noarch
xorg-x11-xkb-utils-7.4-5.fc12.i686
xorg-x11-drv-sis-0.10.2-1.fc12.i686
xorg-x11-drv-i128-1.3.3-1.fc12.i686
xorg-x11-drv-mouse-1.4.99-3.20090619.fc12.1.i686
xorg-x11-drv-elographics-1.2.3-4.fc12.i686
xorg-x11-server-common-1.6.99-33.20090807.fc12.i686
xorg-x11-fonts-75dpi-7.2-9.fc12.noarch
xorg-x11-drv-evdev-2.2.99-5.20090730.fc12.i686
xorg-x11-drv-ati-6.12.2-21.fc12.i686
xorg-x11-drv-void-1.2.0-3.fc12.1.i686
xorg-x11-font-utils-7.2-9.fc12.i686
xorg-x11-drv-geode-2.11.3-1.fc12.i686
xorg-x11-drv-i740-1.3.2-1.fc12.i686
xorg-x11-drv-mutouch-1.2.1-4.fc12.i686
xorg-x11-drv-intel-2.8.0-3.fc12.i686
xorg-x11-drv-vesa-2.2.1-1.fc12.i686
xorg-x11-drv-s3virge-1.10.4-1.fc12.i686
xorg-x11-drv-fbdev-0.4.1-1.fc12.i686
xorg-x11-utils-7.4-6.fc12.i686
xorg-x11-drivers-7.3-13.fc12.i686
xorg-x11-drv-savage-2.3.1-1.fc12.i686
xorg-x11-drv-trident-1.3.3-1.fc12.i686
xorg-x11-drv-hyperpen-1.3.0-3.fc12.i686
xorg-x11-drv-dummy-0.3.2-2.fc12.1.i686
xorg-x11-drv-ast-0.89.9-1.fc12.i686
xorg-x11-server-utils-7.4-11.fc12.i686
xorg-x11-xinit-1.0.9-12.fc12.i686
xorg-x11-drv-glint-1.2.4-1.fc12.i686
xorg-x11-apps-7.4-4.fc12.i686
xorg-x11-fonts-100dpi-7.2-9.fc12.noarch
xorg-x11-drv-neomagic-1.2.4-1.fc12.i686
xorg-x11-drv-acecad-1.3.0-3.fc12.i686
xorg-x11-drv-nouveau-0.0.15-2.20090805git712064e.fc12.i686

How reproducible:
It has only happened to me once - no idea what triggered it

Steps to Reproduce:
1.?
2.
3.
  
Actual results:
Keyboard and mouse click inputs stopped responding

Expected results:


Additional info:
killall -9 Xorg from an ssh session allowed me to log back in again.
Comment 1 Quentin Armitage 2009-08-15 06:11:10 EDT
Apologies, wrong kernel version given. Should be:
kernel-2.6.31-0.125.rc5.git2.fc12.i686
Comment 2 Matěj Cepl 2009-08-17 18:58:36 EDT
(In reply to comment #0)
> How reproducible:
> It has only happened to me once - no idea what triggered it

If you happen to reproduce it again, try to follow the steps described at http://wiki.x.org/wiki/Development/Documentation/ServerDebugging to collect a backtrace of Xorg.

We will review this issue again once you've had a chance to attach this information.

Thanks in advance.
Comment 3 Stefan Becker 2009-09-18 11:44:16 EDT
This happens to me regularly on my Lenovo T60 with Mobility Radeon X1300 [1002:7149] since I pre-upgraded from F11 to F12-rawhide. As far as I can see there is no specific action to trigger it. When it happens the mouse pointer movement becomes jerky and there are no X display updates anymore.

The system itself is not frozen, i.e. you can still initiate an ACPI powerdown with the power off key or log in with ssh and initiate a shutdown command.

Version-Release number of selected component (if applicable):
xorg-x11-drv-ati-6.13.0-0.4.20090908git651fe5a47.fc12.i686
libdrm-2.4.12-0.10.fc12.i686
mesa-libGL-7.6-0.11.fc12.i686
kernel-2.6.31-23.fc12.i686
xorg-x11-server-Xorg-1.6.99.901-2.fc12.i686

How reproducible:
always

Steps to Reproduce:
1. Login as normal user into X session
2. work normally, e.g. browse web pages

Additional info:

Smolt: http://www.smolts.org/client/show/pub_2f56c9e2-bad1-462e-b877-6491a917db79

Matej: I'll try to use your suggested debugging trial ASAP.

1. I'm using kernel parameter "nomodeset" as KMS breaks suspend
2. I have experienced the freeze after a fresh boot and after resume from hibernate
3. I have experienced the freeze with WLAN active or not active
4. I have experienced the freeze with cable Ethernet active or not active
5. Log in after the fact shows the X process uses up 100%, i.e. one CPU core
6. X process can only be killed with -9
7. Killing X does NOT restore the display, chvt hangs
8. uname -a: Linux XXX 2.6.31-23.fc12.i686 #1 SMP Wed Sep 16 16:09:25 EDT 2009 i686 i686 i386 GNU/Linux
9. There are no messages in /var/log/Xorg.0.log from the point in time when the freeze occurs

10. Don't know if this is related, but after a fresh boot I get these dmesg messages when X starts:

=======================================================
[ INFO: possible circular locking dependency detected ]
2.6.31-23.fc12.i686 #1
-------------------------------------------------------
Xorg/1143 is trying to acquire lock:
 (&dev->struct_mutex){+.+.+.}, at: [<f7ca4232>] drm_mmap+0x38/0x66 [drm]

but task is already holding lock:
 (&mm->mmap_sem){++++++}, at: [<c0407abd>] sys_mmap2+0x72/0xb9

which lock already depends on the new lock.


the existing dependency chain (in reverse order) is:

-> #4 (&mm->mmap_sem){++++++}:
       [<c04711a3>] __lock_acquire+0x9b3/0xb25
       [<c04713cc>] lock_acquire+0xb7/0xeb
       [<c04d7959>] might_fault+0x73/0xa4
       [<c06019fc>] copy_to_user+0x41/0x12b
       [<c05069bc>] filldir64+0xc8/0x10d
       [<c054986c>] sysfs_readdir+0x11a/0x161
       [<c0506c42>] vfs_readdir+0x7b/0xb8
       [<c0506cfa>] sys_getdents64+0x7b/0xcb
       [<c0403a50>] syscall_call+0x7/0xb
       [<ffffffff>] 0xffffffff

-> #3 (sysfs_mutex){+.+.+.}:
       [<c04711a3>] __lock_acquire+0x9b3/0xb25
       [<c04713cc>] lock_acquire+0xb7/0xeb
       [<c0825354>] __mutex_lock_common+0x43/0x32b
       [<c082572f>] mutex_lock_nested+0x41/0x5a
       [<c0549c36>] sysfs_addrm_start+0x34/0xb2
       [<c0548529>] sysfs_hash_and_remove+0x2d/0x71
       [<c054aab0>] sysfs_remove_link+0x29/0x3c
       [<c04f1911>] sysfs_slab_add+0x57/0x189
       [<c04f1a83>] sysfs_add_func+0x40/0x74
       [<c045831e>] worker_thread+0x194/0x275
       [<c045cef1>] kthread+0x7b/0x80
       [<c040463f>] kernel_thread_helper+0x7/0x10
       [<ffffffff>] 0xffffffff

-> #2 (slub_lock){+++++.}:
       [<c04711a3>] __lock_acquire+0x9b3/0xb25
       [<c04713cc>] lock_acquire+0xb7/0xeb
       [<c0825a6a>] down_read+0x45/0x93
       [<c0822fe0>] slab_cpuup_callback+0x50/0x13e
       [<c082920b>] notifier_call_chain+0x5d/0x95
       [<c0462192>] __raw_notifier_call_chain+0x23/0x39
       [<c0821f54>] _cpu_up+0x6d/0x121
       [<c082205f>] cpu_up+0x57/0x78
       [<c0a7b419>] kernel_init+0xba/0x211
       [<c040463f>] kernel_thread_helper+0x7/0x10
       [<ffffffff>] 0xffffffff

-> #1 (cpu_hotplug.lock){+.+.+.}:
       [<c04711a3>] __lock_acquire+0x9b3/0xb25
       [<c04713cc>] lock_acquire+0xb7/0xeb
       [<c0825354>] __mutex_lock_common+0x43/0x32b
       [<c082572f>] mutex_lock_nested+0x41/0x5a
       [<c0446425>] get_online_cpus+0x44/0x67
       [<c04134e1>] mtrr_del_page+0x40/0x131
       [<c0413616>] mtrr_del+0x44/0x5b
       [<f7c9bbc6>] drm_rmmap_locked+0xc4/0x17d [drm]
       [<f7ca2c56>] drm_master_destroy+0x61/0xe1 [drm]
       [<c05fbbb7>] kref_put+0x47/0x62
       [<f7ca2b61>] drm_master_put+0x25/0x40 [drm]
       [<f7c9f647>] drm_release+0x3f7/0x4d7 [drm]
       [<c04fa4e2>] __fput+0x101/0x1a9
       [<c04fa5b1>] fput+0x27/0x3a
       [<c04f6c9a>] filp_close+0x64/0x7f
       [<c04f6d31>] sys_close+0x7c/0xc2
       [<c0403a50>] syscall_call+0x7/0xb
       [<ffffffff>] 0xffffffff

-> #0 (&dev->struct_mutex){+.+.+.}:
       [<c04710aa>] __lock_acquire+0x8ba/0xb25
       [<c04713cc>] lock_acquire+0xb7/0xeb
       [<c0825354>] __mutex_lock_common+0x43/0x32b
       [<c082572f>] mutex_lock_nested+0x41/0x5a
       [<f7ca4232>] drm_mmap+0x38/0x66 [drm]
       [<c04df171>] mmap_region+0x264/0x3f9
       [<c04df58f>] do_mmap_pgoff+0x289/0x2ec
       [<c0407ad1>] sys_mmap2+0x86/0xb9
       [<c0403a50>] syscall_call+0x7/0xb
       [<ffffffff>] 0xffffffff

other info that might help us debug this:

1 lock held by Xorg/1143:
 #0:  (&mm->mmap_sem){++++++}, at: [<c0407abd>] sys_mmap2+0x72/0xb9

stack backtrace:
Pid: 1143, comm: Xorg Tainted: G        W  2.6.31-23.fc12.i686 #1
Call Trace:
 [<c0823ee6>] ? printk+0x22/0x3c
 [<c04704d8>] print_circular_bug_tail+0x68/0x84
 [<c04710aa>] __lock_acquire+0x8ba/0xb25
 [<c04713cc>] lock_acquire+0xb7/0xeb
 [<f7ca4232>] ? drm_mmap+0x38/0x66 [drm]
 [<f7ca4232>] ? drm_mmap+0x38/0x66 [drm]
 [<c0825354>] __mutex_lock_common+0x43/0x32b
 [<f7ca4232>] ? drm_mmap+0x38/0x66 [drm]
 [<f7ca4232>] ? drm_mmap+0x38/0x66 [drm]
 [<c04f091b>] ? kmem_cache_alloc+0xcb/0x159
 [<c046fe05>] ? trace_hardirqs_on_caller+0x122/0x155
 [<c082572f>] mutex_lock_nested+0x41/0x5a
 [<f7ca4232>] ? drm_mmap+0x38/0x66 [drm]
 [<f7ca4232>] drm_mmap+0x38/0x66 [drm]
 [<c04df171>] mmap_region+0x264/0x3f9
 [<c04df58f>] do_mmap_pgoff+0x289/0x2ec
 [<c0407ad1>] sys_mmap2+0x86/0xb9
 [<c0403a50>] syscall_call+0x7/0xb
[drm] Setting GART location based on new memory map
[drm] Loading R500 Microcode
platform radeon_cp.0: firmware: requesting radeon/R520_cp.bin
[drm] Num pipes: 1
[drm] writeback test succeeded in 1 usecs
Comment 4 Stefan Becker 2009-09-18 14:43:32 EDT
Duplicate of bug #517744 (or vice versa)?
Comment 5 Stefan Becker 2009-09-18 15:08:19 EDT
In my case the cause seems to be EXA in xorg-x11-drv-ati.

Quentin: you don't mention in your report what graphics card is in your system.

(gdb) bt
#0  0x00e6e416 in __kernel_vsyscall ()
#1  0x0059fd79 in ioctl () from /lib/libc.so.6
#2  0x0047188e in drmIoctl () from /usr/lib/libdrm.so.2
#3  0x00471cd3 in drmCommandNone () from /usr/lib/libdrm.so.2
#4  0x00bc71f4 in RADEONDownloadFromScreenCP (pSrc=<value optimized out>,
    x=<value optimized out>, y=<value optimized out>, w=<value optimized out>,
    h=<value optimized out>, dst=<value optimized out>,
    dst_pitch=<value optimized out>) at radeon_exa_funcs.c:669
#5  0x007cf6c3 in exaCopyDirty (migrate=<value optimized out>,
    pValidDst=<value optimized out>, pValidSrc=<value optimized out>,
    transfer=<value optimized out>, fallback_src=<value optimized out>,
    fallback_dst=<value optimized out>,
    fallback_srcpitch=<value optimized out>,
    fallback_dstpitch=<value optimized out>,
    fallback_index=<value optimized out>, sync=<value optimized out>)
    at exa_migration_classic.c:218
#6  0x007cfb12 in exaCopyDirtyToSys (migrate=0xbff7cb40)
    at exa_migration_classic.c:273
#7  0x007cfb88 in exaDoMoveOutPixmap (migrate=<value optimized out>)
    at exa_migration_classic.c:391
#8  0x007d054a in exaDoMigration_classic (pixmaps=<value optimized out>,
    npixmaps=<value optimized out>, can_accel=<value optimized out>)
    at exa_migration_classic.c:699
#9  0x007cd25b in exaDoMigration (pixmaps=<value optimized out>,
    npixmaps=<value optimized out>, can_accel=<value optimized out>)
    at exa.c:1205
#10 0x007ce7bd in exaPrepareAccessReg (pDrawable=<value optimized out>,
    index=<value optimized out>, pReg=<value optimized out>) at exa.c:392
#11 0x007ce811 in exaPrepareAccess (pDrawable=<value optimized out>,
    index=<value optimized out>) at exa.c:407
#12 0x007da726 in ExaCheckComposite (op=<value optimized out>,
    pSrc=<value optimized out>, pMask=<value optimized out>,
    pDst=<value optimized out>, xSrc=<value optimized out>,
    ySrc=<value optimized out>, xMask=<value optimized out>,
    yMask=<value optimized out>, xDst=<value optimized out>,
    yDst=<value optimized out>, width=<value optimized out>,
    height=<value optimized out>) at exa_unaccel.c:432
#13 0x007d805a in exaComposite (op=<value optimized out>,
    pSrc=<value optimized out>, pMask=<value optimized out>,
    pDst=<value optimized out>, xSrc=<value optimized out>,
    ySrc=<value optimized out>, xMask=<value optimized out>,
    yMask=<value optimized out>, xDst=<value optimized out>,
    yDst=<value optimized out>, width=<value optimized out>,
    height=<value optimized out>) at exa_render.c:1050
#14 0x0811b7d7 in damageComposite (op=<value optimized out>,
    pSrc=<value optimized out>, pMask=<value optimized out>,
    pDst=<value optimized out>, xSrc=<value optimized out>,
    ySrc=<value optimized out>, xMask=<value optimized out>,
    yMask=<value optimized out>, xDst=<value optimized out>,
    yDst=<value optimized out>, width=<value optimized out>,
    height=<value optimized out>) at damage.c:643
#15 0x0810ed60 in CompositePicture (op=<value optimized out>,
    pSrc=<value optimized out>, pMask=<value optimized out>,
    pDst=<value optimized out>, xSrc=<value optimized out>,
    ySrc=<value optimized out>, xMask=<value optimized out>,
    yMask=<value optimized out>, xDst=<value optimized out>,
    yDst=<value optimized out>, width=<value optimized out>,
    height=<value optimized out>) at picture.c:1721
#16 0x081b63a0 in miCompositeRects (op=<value optimized out>,
    pDst=<value optimized out>, color=<value optimized out>,
    nRect=<value optimized out>, rects=0xa181a98) at mirect.c:168
#17 0x0810eab4 in CompositeRects (op=<value optimized out>,
    pDst=<value optimized out>, color=<value optimized out>,
    nRect=<value optimized out>, rects=<value optimized out>) at picture.c:1745
#18 0x08115a1d in ProcRenderFillRectangles (client=<value optimized out>)
    at render.c:1456
#19 0x08111b44 in ProcRenderDispatch (client=<value optimized out>)
    at render.c:2041
#20 0x0806e137 in Dispatch () at dispatch.c:445
#21 0x08062885 in main (argc=<value optimized out>,
    argv=<value optimized out>, envp=<value optimized out>) at main.c:285

(gdb) bt full
#0  0x00e6e416 in __kernel_vsyscall ()
No symbol table info available.
#1  0x0059fd79 in ioctl () from /lib/libc.so.6
No symbol table info available.
#2  0x0047188e in drmIoctl () from /usr/lib/libdrm.so.2
No symbol table info available.
#3  0x00471cd3 in drmCommandNone () from /usr/lib/libdrm.so.2
No symbol table info available.
#4  0x00bc71f4 in RADEONDownloadFromScreenCP (pSrc=<value optimized out>,
    x=<value optimized out>, y=<value optimized out>, w=<value optimized out>,
    h=<value optimized out>, dst=<value optimized out>,
    dst_pitch=<value optimized out>) at radeon_exa_funcs.c:669
        i = 1168
        hpass = 1
        indirect = {idx = 136393216, start = 160913408, end = -1074279616,
          discard = -1074280168}
        __head = <value optimized out>
        __expected = <value optimized out>
        __count = <value optimized out>
        swap = Cannot access memory at address 0x0

Output from strace:

...
sigreturn()                             = ? (mask now [])
ioctl(11, 0x6444, 0)                    = -1 EBUSY (Device or resource busy)
--- SIGALRM (Alarm clock) @ 0 (0) ---
sigreturn()                             = ? (mask now [])
ioctl(11, 0x6444, 0)                    = -1 EBUSY (Device or resource busy)
--- SIGALRM (Alarm clock) @ 0 (0) ---
sigreturn()                             = ? (mask now [])
ioctl(11, 0x6444, 0)                    = -1 EBUSY (Device or resource busy)
--- SIGALRM (Alarm clock) @ 0 (0) ---
sigreturn()                             = ? (mask now [])
ioctl(11, 0x6444^C <unfinished ...>
Process 1143 detached
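
The request number in the strace output above can be decoded by hand. The following is only a minimal sketch: it assumes the generic Linux ioctl encoding as used on x86 (8-bit command number, 8-bit type, 14-bit size, 2-bit direction), and it assumes that driver-private DRM commands start at 0x40. Both values are recalled from the kernel headers and are not confirmed anywhere in this report; the function name is made up for illustration.

```python
# Field widths of the generic Linux ioctl encoding (x86 layout, assumed here).
NR_BITS, TYPE_BITS, SIZE_BITS = 8, 8, 14

# Assumed start of driver-private DRM command numbers.
DRM_COMMAND_BASE = 0x40

DIRECTIONS = {0: "none", 1: "write", 2: "read", 3: "read|write"}


def decode_ioctl(request):
    """Split an ioctl request number into its encoded fields."""
    nr = request & ((1 << NR_BITS) - 1)
    ioc_type = (request >> NR_BITS) & ((1 << TYPE_BITS) - 1)
    size = (request >> (NR_BITS + TYPE_BITS)) & ((1 << SIZE_BITS) - 1)
    direction = (request >> (NR_BITS + TYPE_BITS + SIZE_BITS)) & 0x3
    return {
        "type": chr(ioc_type),
        "nr": nr,
        "size": size,
        "dir": DIRECTIONS[direction],
    }


# 0x6444 is the request seen in the strace loop above.
fields = decode_ioctl(0x6444)
print(fields)
print("driver command index:", hex(fields["nr"] - DRM_COMMAND_BASE))
```

Under these assumptions 0x6444 decodes to type 'd' with no argument and driver command index 0x4, which is consistent with the drmCommandNone() call in frame #3 of the backtrace. If index 0x4 is the radeon "wait for CP idle" command (an assumption), X is repeatedly asking the kernel whether the GPU command processor is idle and getting EBUSY each time.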
Comment 6 Stefan Becker 2009-09-18 15:43:00 EDT
For now I've opened a new bug #524312 for xorg-x11-drv-ati
Comment 7 Matěj Cepl 2009-09-23 12:13:43 EDT
(In reply to comment #6)
> For now I've opened a new bug #524312 for xorg-x11-drv-ati  

Take a look at the Smolt profile:

  	ATI Technologies Inc M52 [ATI Mobility Radeon X1300] (wiki)

Closing your bug as a duplicate of this one.
Comment 8 Matěj Cepl 2009-09-23 12:14:38 EDT
*** Bug 524312 has been marked as a duplicate of this bug. ***
Comment 9 Stefan Becker 2009-09-23 14:49:14 EDT
Yeah I know, because this is my smolt profile.

But where is the smolt profile / graphics card info from Quentin's machine?
Comment 10 Stefan Becker 2009-09-23 14:54:25 EDT
dohhh... look at the X server log...

Quentin has a

NV: Found NVIDIA GeForce FX Go5650 at 01@00:00:0

with

(II) Loading sub module "xaa"

His is a totally different problem. Reopening bug #524312.
Comment 11 Matthew Miller 2009-09-30 11:02:00 EDT
This happens to me too. I put the backtrace in bug #436632, but here it is as well: 

#0  0x00007f25cf046717 in ioctl () at ../sysdeps/unix/syscall-template.S:82
#1  0x00007f25cc8cc203 in drmIoctl (fd=9, request=3221775460,
    arg=0x7fff168b66a0) at xf86drm.c:188
#2  0x00007f25cc8cc44c in drmCommandWriteRead (fd=<value optimized out>,
    drmCommandIndex=<value optimized out>, data=<value optimized out>,
    size=<value optimized out>) at xf86drm.c:2394
#3  0x00007f25cbfbdf59 in bo_wait (bo=0x2146a80) at radeon_bo_gem.c:206
#4  0x00007f25cbfbe035 in bo_map (bo=0x2146a80, write=<value optimized out>)
    at radeon_bo_gem.c:181
#5  0x00007f25cc27dea6 in _radeon_bo_map (line=1170, func=<value optimized out>,
    file=0x6363615f78783672 <Address 0x6363615f78783672 out of bounds>, write=1,
    bo=<value optimized out>) at /usr/include/drm/radeon_bo.h:151
#6  r600_vb_get (line=1170, func=<value optimized out>,
    file=0x6363615f78783672 <Address 0x6363615f78783672 out of bounds>, write=1,
    bo=<value optimized out>) at r6xx_accel.c:1170
#7  0x00007f25cc27df17 in r600_cp_start (pScrn=0x205dea0) at r6xx_accel.c:1204
#8  0x00007f25cc27ba56 in R600PrepareSolid (pPix=0x2520e80, alu=<value optimized out>,
    pm=4294967295, fg=<value optimized out>) at r600_exa.c:169
#9  0x00007f25cb581655 in exaFillRegionSolid (pDrawable=0x2520e80, pRegion=0x2522280,
    pixel=<value optimized out>, planemask=<value optimized out>, alu=<value optimized out>,
    clientClipType=0) at exa_accel.c:1003
#10 0x00007f25cb58226a in exaPolyFillRect (pDrawable=0x2520e80, pGC=0x20aab60, nrect=1,
    prect=0x250bf88) at exa_accel.c:800
#11 0x00000000004d21db in damagePolyFillRect (pDrawable=0x2520e80, pGC=0x20aab60, nRects=1,
    pRects=0x250bf88) at damage.c:1404
#12 0x0000000000563537 in miColorRects (pDst=0x2520b20, pClipPict=0x2520b20, color=0xc0086464,
    nRect=<value optimized out>, rects=0x250bf88, xoff=0, yoff=<value optimized out>)
    at mirect.c:84
#13 0x0000000000563613 in miCompositeRects (op=3 '\003', pDst=0x2520b20, color=0x250bf80,
    nRect=<value optimized out>, rects=0x250bf88) at mirect.c:116
#14 0x00000000004cbf04 in ProcRenderFillRectangles (client=0x2437b60) at render.c:1467
#15 0x000000000042c5dc in Dispatch () at dispatch.c:445
#16 0x0000000000421c6a in main (argc=<value optimized out>, argv=<value optimized out>,
    envp=<value optimized out>) at main.c:285



Although today, X is just starting and restarting infinitely (xorg-x11-drv-ati-6.13.0-0.5.20090929git7968e1fb8.fc12.x86_64)
Comment 12 Matthew Miller 2009-09-30 11:12:10 EDT
Oh, no, eventually it froze up from doing that, and I get the same:

(gdb) bt
#0  0x00007f9c768f4717 in ioctl () at ../sysdeps/unix/syscall-template.S:82
#1  0x00007f9c7417a203 in drmIoctl (fd=9, request=3221775460, 
    arg=0x7fff7f03a360) at xf86drm.c:188
#2  0x00007f9c7417a44c in drmCommandWriteRead (fd=<value optimized out>, 
    drmCommandIndex=<value optimized out>, data=<value optimized out>, 
    size=<value optimized out>) at xf86drm.c:2394
#3  0x00007f9c7386df59 in bo_wait (bo=0xc79300) at radeon_bo_gem.c:206
#4  0x00007f9c7386e035 in bo_map (bo=0xc79300, write=<value optimized out>)
    at radeon_bo_gem.c:181
#5  0x00007f9c73b2969d in _radeon_bo_map (line=2291, 
    func=<value optimized out>, file=0x1 <Address 0x1 out of bounds>, write=0, 
    bo=<value optimized out>) at /usr/include/drm/radeon_bo.h:151
#6  R600DownloadFromScreenCS (line=2291, func=<value optimized out>, 
    file=0x1 <Address 0x1 out of bounds>, write=0, bo=<value optimized out>)
    at r600_exa.c:2291
#7  0x00007f9c72e31100 in exaGetImage (pDrawable=0xc61a70, x=0, y=0, w=39, 
    h=26, format=<value optimized out>, planeMask=<value optimized out>, 
    d=<value optimized out>) at exa_accel.c:1283
#8  0x00000000005527c4 in miSpriteGetImage (pDrawable=0xc61a70, sx=0, sy=0, 
    w=39, h=26, format=<value optimized out>, planemask=<value optimized out>, 
    pdstLine=<value optimized out>) at misprite.c:425
#9  0x00000000004cbd19 in ProcRenderCreateCursor (client=0xc8e3d0)
    at render.c:1557
#10 0x000000000042c5dc in Dispatch () at dispatch.c:445
#11 0x0000000000421c6a in main (argc=<value optimized out>, 
    argv=<value optimized out>, envp=<value optimized out>) at main.c:285
Current language:  auto
The current source language is "auto; currently asm".
Comment 13 Matthew Miller 2009-09-30 11:13:31 EDT
Created attachment 363201 [details]
xorg log from similar problem.
Comment 14 Àlex Magaz Graça 2009-10-01 02:53:18 EDT
I was also affected by this bug, but some update between August 21st and
September 14th solved it. I'm attaching my Smolt profile in case it helps. Tell me
if you need anything else.
Comment 15 Àlex Magaz Graça 2009-10-01 02:55:07 EDT
smoltSendProfile --printOnly doesn't work, so here is the URL instead:

http://www.smolts.org/client/show/pub_e637e368-9fca-4582-a3ea-c876f3a024fe
Comment 16 Matthew Miller 2009-10-01 10:10:48 EDT
See also: bug #524368.

Still happens today with kernel-2.6.31.1-56.fc12.x86_64 and xorg-x11-drv-ati-6.13.0-0.6.20090929git7968e1fb8.fc12.x86_64.


(gdb) bt
#0  0x00007f9068b91717 in ioctl () at ../sysdeps/unix/syscall-template.S:82
#1  0x00007f9066417203 in drmIoctl (fd=9, request=3221775460, 
    arg=0x7fff846c2e90) at xf86drm.c:188
#2  0x00007f906641744c in drmCommandWriteRead (fd=<value optimized out>, 
    drmCommandIndex=<value optimized out>, data=<value optimized out>, 
    size=<value optimized out>) at xf86drm.c:2394
#3  0x00007f9065b0af59 in bo_wait (bo=0x16d5200) at radeon_bo_gem.c:206
#4  0x00007f9065b0b035 in bo_map (bo=0x16d5200, write=<value optimized out>)
    at radeon_bo_gem.c:181
#5  0x00007f9065dc9966 in _radeon_bo_map (line=1193, 
    func=<value optimized out>, 
    file=0x6363615f78783672 <Address 0x6363615f78783672 out of bounds>, 
    write=1, bo=<value optimized out>) at /usr/include/drm/radeon_bo.h:151
#6  r600_vb_get (line=1193, func=<value optimized out>, 
    file=0x6363615f78783672 <Address 0x6363615f78783672 out of bounds>, 
    write=1, bo=<value optimized out>) at r6xx_accel.c:1193
#7  0x00007f9065dc99d3 in r600_cp_start (pScrn=0x1603ea0) at r6xx_accel.c:1227
#8  0x00007f9065dc406b in R600DoPrepareCopy (pScrn=0x1603ea0, 
    src_pitch=<value optimized out>, src_width=<value optimized out>, 
    src_height=<value optimized out>, src_offset=<value optimized out>, 
    src_bo=<value optimized out>, src_bpp=<value optimized out>, 
    dst_pitch=<value optimized out>, dst_width=<value optimized out>, 
    dst_height=<value optimized out>, dst_offset=<value optimized out>, 
    dst_bo=<value optimized out>, dst_bpp=<value optimized out>, 
    rop=<value optimized out>, planemask=<value optimized out>)
    at r600_exa.c:459
#9  0x00007f9065dc50eb in R600Copy (pDst=<value optimized out>, srcX=1470, 
    srcY=<value optimized out>, dstX=1486, dstY=0, w=<value optimized out>, 
    h=<value optimized out>) at r600_exa.c:1084
#10 0x00007f90650ced21 in exaFillRegionTiled (pDrawable=<value optimized out>, 
    pRegion=<value optimized out>, pTile=<value optimized out>, 
    pPatOrg=<value optimized out>, planemask=<value optimized out>, 
    alu=<value optimized out>, clientClipType=<value optimized out>)
    at exa_accel.c:1192
#11 0x00007f90650cf308 in exaPolyFillRect (pDrawable=0x19dcd70, pGC=0x19b79c0, 
    nrect=1, prect=0x18cf06c) at exa_accel.c:804
#12 0x00000000004d220b in damagePolyFillRect (pDrawable=0x19dcd70, 
    pGC=0x19b79c0, nRects=1, pRects=0x18cf06c) at damage.c:1404
#13 0x000000000042a534 in ProcPolyFillRectangle (client=0x18a0e30)
    at dispatch.c:1945
#14 0x000000000042c5dc in Dispatch () at dispatch.c:445
#15 0x0000000000421c6a in main (argc=<value optimized out>, 
    argv=<value optimized out>, envp=<value optimized out>) at main.c:285
Current language:  auto
The current source language is "auto; currently asm".
Comment 17 Matthew Miller 2009-10-01 10:17:27 EDT
*** Bug 524368 has been marked as a duplicate of this bug. ***
Comment 18 Matthew Miller 2009-10-01 10:30:19 EDT
Note that putting radeon.modeset=0 on the kernel command line, as suggested in bug 524368, makes the problem disappear.
Comment 19 Kevin DeKorte 2009-10-01 15:01:30 EDT
Upgraded to 

xorg-x11-server-Xorg-1.6.99.903-2.fc12.x86_64
xorg-x11-server-devel-1.6.99.903-2.fc12.x86_64
xorg-x11-server-common-1.6.99.903-2.fc12.x86_64
xorg-x11-server-utils-7.4-11.fc12.x86_64
xorg-x11-server-Xnest-1.6.99.903-2.fc12.x86_64

as found in koji, that is supposed to contain EXA fixes, but I still get the lockups.
Comment 20 Matthew Miller 2009-10-01 15:08:19 EDT
Removing needinfo flag -- backtraces are provided.
Comment 21 Kevin DeKorte 2009-10-02 12:26:06 EDT
I believe that an upgraded kernel solves this issue. I was repeatedly having KMS issues with crashing but after following this process the crashing seems to have stopped...

Foundation RAWHIDE as of Oct 2 + xserver from Koji

xorg-x11-server-Xorg-1.6.99.903-2.fc12.x86_64
xorg-x11-server-devel-1.6.99.903-2.fc12.x86_64
xorg-x11-server-common-1.6.99.903-2.fc12.x86_64
xorg-x11-server-utils-7.4-11.fc12.x86_64
xorg-x11-server-Xnest-1.6.99.903-2.fc12.x86_64


And I upgraded my kernel from git using this process

git clone git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6.git linux-2.6

cd linux-2.6

git checkout -b radeon_kms origin/master

git remote add -t drm-next airlied_drm_remote git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6.git

git pull airlied_drm_remote drm-next

cp /boot/config-2.6.31.1-56.fc12.x86_64 .config

make menuconfig

make all
sudo make modules_install
sudo make install


Since then I have not had any lockups like I had before. Also, I tried just using the drm-next kernel and seemed to still have the lockup issues, but can try again if anyone needs me to.

Still have minor cursor issues (white line at the top of the cursor box, occasionally), but at least the machine doesn't lock.
Comment 22 Kevin DeKorte 2009-10-08 10:23:51 EDT
kernel-2.6.31.1-65.fc12.x86_64 from koji still seems to have lockup issues. gtkperf -c 500 -a can lock up the machine almost instantly.
Comment 23 Kevin DeKorte 2009-10-08 17:07:15 EDT
Created attachment 364182 [details]
.config I use as requested by airlied
Comment 24 Quentin Armitage 2009-10-12 09:38:55 EDT
Created attachment 364456 [details]
gdb bt -f output

Re comment #2: I had not had this problem since I originally opened this bug, but it has just occurred again. I attach the output of gdb backtrace and strace -p as requested.

I note that this bug now has component xorg-x11-drv-ati, but I am using xorg-x11-drv-nouveau.
Comment 25 Quentin Armitage 2009-10-12 09:39:36 EDT
Created attachment 364457 [details]
strace -p output
Comment 26 Kevin DeKorte 2009-10-12 10:51:00 EDT
I found another new kernel in koji

kernel-2.6.32-0.24.rc4.git0.fc13.x86_64.rpm

And retested with it. With this kernel I could lock up X in < 30 seconds, and the grid in gdm (on the right screen, I have two) had red in the lower part of the grid, looked like a color mapping issue.

I built my own 2.6.32-rc4 kernel using the attached config file and I am having no issues with that kernel.
Comment 27 Kevin DeKorte 2009-10-12 10:51:54 EDT
Created attachment 364471 [details]
.config file used to build kernel 2.6.32-rc4
Comment 28 Kevin DeKorte 2009-10-12 10:52:59 EDT
Created attachment 364473 [details]
diff .config /boot/config-2.6.32-0.24.rc4.git0.fc13.x86_64

diff of my kernel config vs the fedora kernel config
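
A plain diff of two kernel configs is sensitive to ordering and comments, so comparing option by option can be easier to read. The following is a minimal sketch of that idea, not the method used to produce the attachment. It assumes the usual .config syntax ("CONFIG_X=value" and "# CONFIG_X is not set"), treats an option missing from one file as "n" (an assumption), and uses made-up option names in the demo rather than anything from the attached files.

```python
import re

_SET = re.compile(r"^(CONFIG_\w+)=(.*)$")
_UNSET = re.compile(r"^# (CONFIG_\w+) is not set$")


def parse_kconfig(text):
    """Return {option: value}; options marked 'is not set' map to 'n'."""
    options = {}
    for line in text.splitlines():
        line = line.strip()
        match = _SET.match(line)
        if match:
            options[match.group(1)] = match.group(2)
            continue
        match = _UNSET.match(line)
        if match:
            options[match.group(1)] = "n"
    return options


def diff_kconfig(mine, distro):
    """Return {option: (mine_value, distro_value)} for options that differ."""
    a, b = parse_kconfig(mine), parse_kconfig(distro)
    return {
        key: (a.get(key, "n"), b.get(key, "n"))
        for key in sorted(set(a) | set(b))
        if a.get(key, "n") != b.get(key, "n")
    }


# Hypothetical fragments, NOT taken from the attached config files.
mine = "CONFIG_DRM_RADEON=m\n# CONFIG_EXAMPLE_A is not set\n"
distro = "CONFIG_DRM_RADEON=m\nCONFIG_EXAMPLE_A=y\n"
for option, (ours, theirs) in diff_kconfig(mine, distro).items():
    print(f"{option}: self-built={ours} distro={theirs}")
```

To compare real files, read both with open(...).read() and pass the two strings to diff_kconfig().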
Comment 29 mursusoft 2009-10-14 10:50:20 EDT
Tried the Fedora 12 snapshot 20091011 livecd with KDE, and as soon as I got to the desktop it froze and I had to reboot with the power button.

GFX card is Radeon HD 4850
Comment 30 Kevin DeKorte 2009-10-15 00:08:14 EDT
I worked with airlied on this and we found that adding 

pcie_aspm=off

To the kernel line in grub stopped the hangs in most situations. Can anyone else try this and see if it works for them? 

My system: 

Q6600 CPU 64bit
ATI rv635 chipset video card
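
To confirm that the option actually took effect after a reboot, the running kernel's command line can be inspected. This is a minimal sketch, assuming a Linux system where /proc/cmdline is readable. The helper names are made up for illustration, and the simple whitespace split ignores quoted values.

```python
def kernel_params(cmdline):
    """Parse a kernel command line into {name: value}; bare flags map to None."""
    params = {}
    for token in cmdline.split():
        name, sep, value = token.partition("=")
        params[name] = value if sep else None
    return params


def aspm_disabled(cmdline):
    """True if pcie_aspm=off appears on the given command line."""
    return kernel_params(cmdline).get("pcie_aspm") == "off"


if __name__ == "__main__":
    try:
        with open("/proc/cmdline") as f:
            running = f.read()
    except OSError:
        running = ""  # not on Linux, or /proc is unavailable
    print("pcie_aspm=off active:", aspm_disabled(running))
```

The same kernel_params() helper can be used to check for radeon.modeset=0, the other workaround mentioned in this report.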
Comment 31 Matthew Miller 2009-10-19 11:59:31 EDT
pcie_aspm=off seems to make the problem go away for me too. Radeon M76XT (Mobility Radeon HD 2600 XT, ChipID= 0x9583) in an iMac.
Comment 32 Rand All 2009-10-20 21:21:58 EDT
(In reply to comment #30)
> I worked with airlied on this and we found that adding 
> 
> pcie_aspm=off
> 
> To the kernel line in grub stopped the hangs in most situations. Can anyone
> else try this and see if it works for them? 
> 
> My system: 
> 
> Q6600 CPU 64bit
> ATI rv635 chipset video card  

I was having what looks like the same problem with my RadeonHD 3850.  Adding the above incantation to my grub.conf seems to fix it.

I have no idea what that switch does, but it works.
Comment 33 Ville-Pekka Vainio 2009-10-23 12:45:11 EDT
I might be experiencing the same problem with the system described at http://www.smolts.org/client/show/pub_5c61ca5d-e74f-4a22-a0d0-7320c317e39b - according to Smolt the video card is "ATI Technologies Inc Mobility Radeon HD 3450", even though this is a desktop computer.

I was directed to this bug by Adam's comment at https://bugzilla.redhat.com/show_bug.cgi?id=528593#c41

If I boot the system without pcie_aspm=off, it completely hangs after a minute or two of use; I can't even ssh into it. If I add pcie_aspm=off, it works, but 2D rendering seems to be a bit slower.
Comment 34 Matthew Miller 2009-10-23 12:55:22 EDT
Agree on the impression of slower 2D rendering.

I'm going to humbly suggest that this is a F12 blocker....
Comment 35 Kevin DeKorte 2009-10-23 14:05:43 EDT
DRI2 2D operations are slower than DRI1 2D operations; however, that does not imply this is a blocker. There is work to be done in optimizing this code, so it should get faster.
Comment 36 Matthew Miller 2009-10-23 14:15:59 EDT
The blocker is that the system entirely locks up with a very common and popular graphics card without a kernel command line workaround. Until that can at least be made automatic, it's definitely a blocker.
Comment 37 Kevin DeKorte 2009-10-23 14:48:07 EDT
Oh, I agree that discussions of this being a blocker on F12 should be done. Not due to the slower 2d performance, but due to the hang. 

Possible options discussed were:

1. Disabling pcie_aspm by default, but a kernel developer was not in favor of this.

2. Fixing the ATI kms driver to react properly to pcie_aspm events. 

3. Or when an ATI card is detected then disable pcie_aspm. 

R600 support is an advertised feature of F12, and it will cause a lot of people to try it out. If the machine hangs within a couple of minutes, people are really going to say that Fedora is junk. To make the problem more difficult, it appears that only a subset of the r6xx cards are affected, and whether the problem appears may depend on the motherboard and video card combination.
Comment 38 Rand All 2009-10-23 15:31:06 EDT
In regards to slow performance on these cards after the kernel option hack: after I got past the hanging problem with that magic grub command, I went ahead and installed the experimental new mesa drivers and found that my graphics performance overall got a lot better.  Do we know if the exp drivers are going to become non-experimental in time for F12 release?
Comment 39 Adam Williamson 2009-10-23 21:13:10 EDT
"I went ahead
and installed the experimental new mesa drivers and found that my graphics
performance overall got a lot better"

that doesn't make any sense unless you're using KDE with desktop effects enabled. Installing that package can _only_ affect 3D performance. No, the DRI for r600 will not be included by default for the release, it's not ready yet.

-- 
Fedora Bugzappers volunteer triage team
https://fedoraproject.org/wiki/BugZappers
Comment 40 Adam Williamson 2009-10-23 21:17:48 EDT
Thanks to Vedran Miletić for pointing out that this bug report is very confused. The original report is from an NVIDIA user.

Unfortunately this bug has built up so much momentum as being for the Radeon pcie_aspm issue that I don't think we'll be able to derail it. Quentin, I'm very sorry, but could you file a new bug for your problem? it's not right to make you do the work but in practical terms I can't see a better way to move forward :/

please let me know the URL once you've filed and I'll triage it appropriately.

-- 
Fedora Bugzappers volunteer triage team
https://fedoraproject.org/wiki/BugZappers
Comment 41 Kevin DeKorte 2009-10-23 21:26:59 EDT
Adam, 

I originally had a bug specifically for r600 lockup issues...

https://bugzilla.redhat.com/show_bug.cgi?id=524368

But that bug was merged into this one.
Comment 42 Matthew Miller 2009-10-23 21:43:44 EDT
Yeah, sorry about that -- I probably should have gone the other way around, but *I* didn't notice until too late. The bug got switched to ATI at around comment #3.
Comment 43 Quentin Armitage 2009-10-24 05:51:14 EDT
Re comment #40, no worries, I have created a new bug #530694 for NVIDIA problem.

Quentin
Comment 44 Jonathan Larmour 2009-10-25 23:22:52 EDT
Since this is now the place for Radeon X problems. I also have problems with a (new) Dell Studio laptop when *installing* the Fedora 12 x86/64 Beta. It crashes pretty reliably during installation unless I pass "nomodeset" on the kernel args for the installer.

Here's the relevant lspci -v which I've had to transcribe, not cut'n'paste:
01:00.0 VGA compatible controller: ATI Technologies Inc Mobility Radeon HD 3650 (prog-if 00 [VGA controller])
 Subsystem: Dell Device 02a0
 Flags: bus master, fast devsel, latency 0, IRQ 16
 Memory at d0000000 (32-bit, prefetchable) [size=256M]
 I/O ports at 2000 [size=256]
 Memory at fc000000 (32-bit, non-prefetchable) [size=64K]
 [virtual] Expansion ROM at fc020000 [disabled] [size=128K]
 Capabilities: [50] Power Management version 3
 Capabilities: [58] Express Legacy Endpoint, MSI 00
 Capabilities: [a0] MSI: Enable- Count=1/1 Maskable- 64bit+
 Capabilities: [100] Vendor Specific Information <?>
 Kernel driver in use: radeon

Transcribing /tmp/X.log is too hard :), but I did notice:
(EE) AIGLX error: dlopen of /usr/lib64/dri/r600_dri.so failed (/usr/lib64/dri/r600_dri.so: cannot open shared object file: No such file or directory)
(EE) AIGLX: Reverting to software rendering

which may be a factor. I can copy out anything else specific if you ask.

The lockup occurs just by accepting defaults until you get to the Timezone selection page. All I have to do is move the mouse over and over the graphical map and then it locks up in a few seconds. The mouse pointer can still move, but nothing else works, including ctrl-alt-f1, ctrl-alt-del, alt-sysrq-anything. Sometimes the screen goes completely blank instead (and locks up). Sometimes there's some dots at the very top left of the screen. Unfortunately the lock-up means I can't switch back to the console to look at X.log :-(. Repeating the exact same with "nomodeset" on the kernel cmdline and there's no lockup.

Unfortunately I cannot properly install FC12 Beta as I'm going to have to swap out this laptop due to an unrelated fault. But hopefully this may allow someone to reproduce the problem, without even having to get as far as installing FC12 beta. Needless to say, crashing during installation is not going to win new followers, and later yum updates won't help!

Incidentally I tried running the installer with the extra kernel arg radeon.modeset=0 as suggested in comment #18 (instead of nomodeset), but this resulted in /sbin/loader SEGVing on boot before anaconda has even had a chance to properly get going:

running install...
running /sbin/loader
loader received SIGSEGV! Backtrace:
/sbin/loader[0x40783e]
/lib64/libc.so.6(+0x335f0)[0x7f0cb59aa5f0]
/sbin/loader[0x40c2f9]
/sbin/loader[0x408819]
/lib64/libc.so.6(__libc_start_main+0xfd)[0x7f0cb5995b4d]
/sbin/loader[0x4058f9]
install exited abnormally [1/1]

Should I submit this as a separate bug? Is this meant to work?
Comment 45 António Lima 2009-10-26 13:10:54 EDT
I have the same lockup with Fedora 12 Beta. I used the live CD because this is a production machine and I can't install Fedora 12 yet. X hangs shortly after login and only the mouse continues to respond.

My graphics card is ATI Technologies Inc Mobility Radeon HD 3400 Series (RV 620).

I could not get an Xorg log since the keyboard was unresponsive. Please let me know if I can provide more info, but I really can't do a proper install on this machine.
Comment 46 Adam Williamson 2009-10-26 17:16:33 EDT
Jonathan: "Since this is now the place for Radeon X problems."

No, it isn't. It's the report for the problem where you get X hanging shortly after system boot unless you use the 'pcie_aspm=off' boot parameter. If that's the case for you, please say that specifically, and note what your hardware is. If the symptoms or workaround are different for you, please report a separate bug.

Antonio, same goes for you, and for anyone else looking at this report. If you think you have the same problem, *please* try the pcie_aspm=off workaround. If that works for you, then you have this bug, and may add a comment; please explicitly state in your comment that the pcie_aspm=off workaround works for you, to avoid confusion. If it does NOT work for you, you have a different problem; please do not comment on this bug.

-- 
Fedora Bugzappers volunteer triage team
https://fedoraproject.org/wiki/BugZappers
Comment 47 Dave Airlie 2009-10-26 17:24:43 EDT
Kevin can you test with the -97 kernel from koji?

I think I have the aspm turned off for r600/rv770.
Comment 48 Kevin DeKorte 2009-10-26 18:25:14 EDT
Dave,

Well, the -97 kernel seems to be working well so far without pcie_aspm=off, so that is good... not sure I understand completely what is going on here.

Command line: ro root=UUID=d8972dea-7186-4632-b087-93aa3b11ab9c rhgb quiet SYSFONT=latarcyrheb-sun16 LANG=en_US.UTF-8 KEYTABLE=us 



[kdekorte@quad ~]$ lspci   (chopped)
01:00.0 VGA compatible controller: ATI Technologies Inc Mobility Radeon HD 3600 Series
01:00.1 Audio device: ATI Technologies Inc RV635 Audio device [Radeon HD 3600 Series]
02:00.0 Ethernet controller: Attansic Technology Corp. L1 Gigabit Ethernet Adapter (rev b0)

[kdekorte@quad ~]$ dmesg | grep -i aspm
pci 0000:02:00.0: disabling ASPM on pre-1.1 PCIe device.  You can enable it with 'pcie_aspm=force'


looks like it disabled ASPM on the ethernet card rather than the VGA card, just not sure if the message is correct. But either way it hasn't locked up yet. And I ran my normal set of things that lock it up normally.
Comment 49 Jonathan Larmour 2009-10-26 21:15:11 EDT
@Adrian (comment #46): Apologies, pcie_aspm=off is not a workaround for me. Up to then, discussion was not specific to the use of that workaround so I was continuing in that vein.

I have created bug #531147 to track this second Radeon lock-up issue, which adding "nomodeset" to the kernel args does appear to work around.
Comment 50 Adam Williamson 2009-10-26 21:39:51 EDT
Adrian? that's a new one. I've had Andrew before, but not Adrian. =)

Depending on your hardware, your bug may also be https://bugzilla.redhat.com/show_bug.cgi?id=528593 . It's a bit chaotic at the moment to try and track exactly who has exactly what bug. Keep an eye on that one.

-- 
Fedora Bugzappers volunteer triage team
https://fedoraproject.org/wiki/BugZappers
Comment 51 Adam Williamson 2009-10-26 21:46:44 EDT
Can those who have this bug - i.e. those for whom pcie_aspm=off works around the problem - please test this kernel build:

http://koji.fedoraproject.org/koji/buildinfo?buildID=138225

and see how that works for them *without* that parameter? It ought to work.

(Kernel install procedure: use rpm -ivh for the kernel and -devel packages, and rpm -Uvh for the -headers and -firmware packages).
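
The install procedure above can be sketched as a small helper that picks the rpm flag per package file. This is an illustration only: rpm_flag_for is a hypothetical name, and the file name patterns assume the downloaded packages are named kernel-<version>, kernel-devel-<version>, kernel-headers-<version> and kernel-firmware-<version>. Installing the kernel with -i rather than -U should leave the currently installed kernel in place as a fallback.

```shell
# Sketch: choose the rpm flag for each downloaded kernel package file.
# rpm_flag_for is a hypothetical helper; the file name patterns are assumptions.
rpm_flag_for() {
    case "$1" in
        # -headers and -firmware replace the installed versions (upgrade).
        kernel-headers-*|kernel-firmware-*) printf '%s\n' "-Uvh" ;;
        # kernel and kernel-devel are installed alongside existing versions.
        kernel-devel-*|kernel-[0-9]*) printf '%s\n' "-ivh" ;;
        # Anything else is not covered by the procedure.
        *) return 1 ;;
    esac
}
```

It could then drive the install from the download directory, for example: for f in kernel-*.rpm; do rpm "$(rpm_flag_for "$f")" "$f"; done (run as root).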

-- 
Fedora Bugzappers volunteer triage team
https://fedoraproject.org/wiki/BugZappers
Comment 52 Ville-Pekka Vainio 2009-10-27 05:25:18 EDT
(In reply to comment #51)

Adam, it does work with my HD3450, which apparently is an RV620.
Comment 53 Matthew Miller 2009-10-27 10:49:24 EDT
Ditto: 2.6.31.5-97.fc12.x86_64 works for me.

Radeon M76XT (Mobility Radeon HD 2600 XT, ChipID= 0x9583) in an iMac.  


(Note: gdm may be broken in rawhide right now -- bug #531195. If you reboot and get no X at all, that could be it.)
Comment 54 Adam Williamson 2009-10-27 21:43:38 EDT
Looks like the patch is good, Dave - but we can't close the bug until it's tagged into F12 final.

-- 
Fedora Bugzappers volunteer triage team
https://fedoraproject.org/wiki/BugZappers
Comment 55 Adam Williamson 2009-10-30 17:06:22 EDT
Discussed at the blocker bug review today. This is probably the same as 531147 and 528593, I will confirm and re-triage later. Seems to be an issue affecting r600+ adapters on ICH9/ICH10 motherboards. Jerome and Dave are investigating and trying to reproduce. Stays as a blocker.

-- 
Fedora Bugzappers volunteer triage team
https://fedoraproject.org/wiki/BugZappers
Comment 56 Maarten 2009-11-01 07:35:55 EST
Consistent system freeze after a few seconds except for mouse movement. Using Live USB with desktop-i386-20091031.15.iso. pcie_aspm=off fixes the issue. In fact using fedora 12 right now after first try.  

lspci -nn
00:00.0 Host bridge [0600]: Intel Corporation 4 Series Chipset DRAM Controller [8086:2e20] (rev 03)
00:01.0 PCI bridge [0604]: Intel Corporation 4 Series Chipset PCI Express Root Port [8086:2e21] (rev 03)
00:1a.0 USB Controller [0c03]: Intel Corporation 82801JI (ICH10 Family) USB UHCI Controller #4 [8086:3a37]
00:1a.1 USB Controller [0c03]: Intel Corporation 82801JI (ICH10 Family) USB UHCI Controller #5 [8086:3a38]
00:1a.2 USB Controller [0c03]: Intel Corporation 82801JI (ICH10 Family) USB UHCI Controller #6 [8086:3a39]
00:1a.7 USB Controller [0c03]: Intel Corporation 82801JI (ICH10 Family) USB2 EHCI Controller #2 [8086:3a3c]
00:1b.0 Audio device [0403]: Intel Corporation 82801JI (ICH10 Family) HD Audio Controller [8086:3a3e]
00:1c.0 PCI bridge [0604]: Intel Corporation 82801JI (ICH10 Family) PCI Express Port 1 [8086:3a40]
00:1c.4 PCI bridge [0604]: Intel Corporation 82801JI (ICH10 Family) PCI Express Port 5 [8086:3a48]
00:1c.5 PCI bridge [0604]: Intel Corporation 82801JI (ICH10 Family) PCI Express Port 6 [8086:3a4a]
00:1d.0 USB Controller [0c03]: Intel Corporation 82801JI (ICH10 Family) USB UHCI Controller #1 [8086:3a34]
00:1d.1 USB Controller [0c03]: Intel Corporation 82801JI (ICH10 Family) USB UHCI Controller #2 [8086:3a35]
00:1d.2 USB Controller [0c03]: Intel Corporation 82801JI (ICH10 Family) USB UHCI Controller #3 [8086:3a36]
00:1d.7 USB Controller [0c03]: Intel Corporation 82801JI (ICH10 Family) USB2 EHCI Controller #1 [8086:3a3a]
00:1e.0 PCI bridge [0604]: Intel Corporation 82801 PCI Bridge [8086:244e] (rev 90)
00:1f.0 ISA bridge [0601]: Intel Corporation 82801JIB (ICH10) LPC Interface Controller [8086:3a18]
00:1f.2 IDE interface [0101]: Intel Corporation 82801JI (ICH10 Family) 4 port SATA IDE Controller [8086:3a20]
00:1f.3 SMBus [0c05]: Intel Corporation 82801JI (ICH10 Family) SMBus Controller [8086:3a30]
00:1f.5 IDE interface [0101]: Intel Corporation 82801JI (ICH10 Family) 2 port SATA IDE Controller [8086:3a26]
01:00.0 VGA compatible controller [0300]: ATI Technologies Inc Mobility Radeon HD 3450 [1002:95c5]
01:00.1 Audio device [0403]: ATI Technologies Inc RV620 Audio device [Radeon HD 34xx Series] [1002:aa28]
02:00.0 Ethernet controller [0200]: Realtek Semiconductor Co., Ltd. RTL8111/8168B PCI Express Gigabit Ethernet controller [10ec:8168] (rev 02)
03:00.0 IDE interface [0101]: Marvell Technology Group Ltd. 88SE6101 single-port PATA133 interface [11ab:6101] (rev b2)
Comment 57 Adam Williamson 2009-11-01 17:18:21 EST
So, I've done a general survey of all the Radeon-hanging-related bugs. Here's the results as they relate to this bug.

I'm closing this one as it's just too much of a mess. Here is information for each individual who has posted to this thread:

Quentin Armitage, as discussed above your problem is different from the others but got lost in the noise, apologies. We will follow up your issue at https://bugzilla.redhat.com/show_bug.cgi?id=530694 , as previously discussed.

Matthew Miller, your bug is likely the general r600+ / ICH8+ issue, that will henceforth be centralized at #528593. However, we'd need to confirm your motherboard chipset to be sure. If 'lspci' shows several components with ICH8, ICH9 or ICH10 in their name, then your issue is 528593. If it shows something different, please paste your lspci for me to look at and I'll advise you.

Alex Magaz Graça, you have an Intel graphics adapter on ICH7 chipset, your bug is different from all the others. Please file a new bug for your issue, against the xorg-x11-drv-intel component.

Kevin DeKorte, you have the classic r600+ICH9 combination, I believe (IDing the ICH9 based on some messages in your dmesg output). Your bug is almost certainly #528593. If you think you don't have an ICH9 chipset, please paste lspci and I'll advise further.

Need Real Name (mursusoft), we don't know your motherboard chipset, but you commented on 528593 that removing your Audigy sound card solved the problem. This suggests it's different from other people's bugs and is probably a resource conflict of some kind. If you would still like it investigated, please file it as a new bug against the kernel.

Rand All, you have an r600 graphics chipset, which suggests your problem may well be #528593, but we'd need to know your motherboard chipset details to be sure. If you see a lot of components with ICH8, ICH9 or ICH10 in the output of 'lspci', your bug is #528593. If you don't, please paste your lspci output and I will advise further.

Ville-Pekka Vainio, you have the classic r600+ICH9 combination. Your bug is #528593.

Jonathan Larmour, you have the classic r600+ICH9 combination. Your bug is #528593.

António Lima, you have an r600 graphics chipset, which suggests your problem may well be #528593, but we'd need to know your motherboard chipset details to be sure. If you see a lot of components with ICH8, ICH9 or ICH10 in the output of 'lspci', your bug is #528593. If you don't, please paste your lspci output and I will advise further.

Maarten, you have an r600+ICH10 combination. This is almost certainly the same as #528593, so please follow along there for now.

Thanks for your reports, everybody.
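
For those asked above to check their 'lspci' output for ICH8, ICH9 or ICH10 components, a rough sketch of that check follows. count_ich_devices is a hypothetical helper name, and it assumes the chipset generation appears in the device description the way it does in the lspci listings pasted in this report (for example "ICH10 Family").

```shell
# Sketch: count lspci lines that mention ICH8, ICH9 or ICH10.
# count_ich_devices is a hypothetical helper; it reads lspci output on stdin.
count_ich_devices() {
    # The trailing group keeps a longer number such as "ICH100" from matching.
    grep -c -E 'ICH(8|9|10)([^0-9]|$)'
}
```

Usage would be: lspci | count_ich_devices. A result of several lines suggests the r600+ / ICH8+ combination; a result of 0 means the lspci output should be pasted for a closer look.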

-- 
Fedora Bugzappers volunteer triage team
https://fedoraproject.org/wiki/BugZappers
Comment 58 Matthew Miller 2009-11-01 21:17:15 EST
Yes, ICH8. This is the 24" iMac before they switched to Nvidia. So bug #528593 it is.
