Bug 503926 - Frequent hangs of Xorg on ppc64
Status: CLOSED INSUFFICIENT_DATA
Product: Fedora
Classification: Fedora
Component: xorg-x11-drv-ati
Version: 11
Hardware: All Linux
Priority: low, Severity: urgent
Assigned To: Dave Airlie
QA Contact: Fedora Extras Quality Assurance
Reported: 2009-06-03 08:05 EDT by Jakub Jelinek
Modified: 2010-02-26 07:20 EST
Doc Type: Bug Fix
Last Closed: 2010-02-26 07:20:41 EST
Attachments:
Xorg.0.log.bz2 (9.39 KB, text/plain), 2009-06-03 08:05 EDT, Jakub Jelinek

Description Jakub Jelinek 2009-06-03 08:05:31 EDT
Created attachment 346378
Xorg.0.log.bz2

Xorg on ppc64 (Terrasoft Powerstation) often hangs, and the process becomes unkillable.
This was already happening in Fedora 10 and still happens with the latest rawhide.
sudo pstack 1840
#0  0x0f8c84c8 in ioctl () from /lib/libc.so.6
#1  0x0f2f1a28 in drmDMA () from /usr/lib/libdrm.so.2
#2  0x0f1e8ad8 in RADEONCPGetBuffer ()
#3  0x0f1e8f0c in RADEONCPFlushIndirect ()
#4  0x0f247124 in ?? () from /usr/lib/xorg/modules/drivers//radeon_drv.so
#5  0x0f115ed0 in ?? () from /usr/lib/xorg/modules//libexa.so
#6  0x0f117958 in ?? () from /usr/lib/xorg/modules//libexa.so
#7  0x10158d34 in ?? ()
#8  0x0f1196d4 in exaGlyphs () from /usr/lib/xorg/modules//libexa.so
#9  0x101585f8 in ?? ()
#10 0x10143740 in CompositeGlyphs ()
#11 0x101502f8 in ?? ()
#12 0x1014b0f8 in ?? ()
#13 0x10041450 in Dispatch ()
#14 0x10021c78 in main ()
and
sudo strace -p 1840
Process 1840 attached - interrupt to quit
--- SIGALRM (Alarm clock) @ 0 (0) ---
sigreturn()                             = ? (mask now [ABRT FPE KILL USR1 SEGV USR2 STKFLT CHLD CONT STOP TSTP TTIN IO])
ioctl(7, 0xc0286429, 0xff87e0b8)        = -1 EBUSY (Device or resource busy)
--- SIGALRM (Alarm clock) @ 0 (0) ---
sigreturn()                             = ? (mask now [ABRT FPE KILL USR1 SEGV USR2 STKFLT CHLD CONT STOP TSTP TTIN IO])
ioctl(7, 0xc0286429, 0xff87e0b8)        = -1 EBUSY (Device or resource busy)
--- SIGALRM (Alarm clock) @ 0 (0) ---
sigreturn()                             = ? (mask now [ABRT FPE KILL USR1 SEGV USR2 STKFLT CHLD CONT STOP TSTP TTIN IO])
ioctl(7, 0xc0286429, 0xff87e0b8)        = -1 EBUSY (Device or resource busy)
--- SIGALRM (Alarm clock) @ 0 (0) ---
sigreturn()                             = ? (mask now [ABRT FPE KILL USR1 SEGV USR2 STKFLT CHLD CONT STOP TSTP TTIN IO])
ioctl(7, 0xc0286429, 0xff87e0b8)        = -1 EBUSY (Device or resource busy)
--- SIGALRM (Alarm clock) @ 0 (0) ---
sigreturn()                             = ? (mask now [ABRT FPE KILL USR1 SEGV USR2 STKFLT CHLD CONT STOP TSTP TTIN IO])
ioctl(7, 0xc0286429, 0xff87e0b8)        = -1 EBUSY (Device or resource busy)
--- SIGALRM (Alarm clock) @ 0 (0) ---
sigreturn()                             = ? (mask now [ABRT FPE KILL USR1 SEGV USR2 STKFLT CHLD CONT STOP TSTP TTIN IO])
ioctl(7, 0xc0286429, 0xff87e0b8)        = -1 EBUSY (Device or resource busy)
--- SIGALRM (Alarm clock) @ 0 (0) ---
sigreturn()                             = ? (mask now [ABRT FPE KILL USR1 SEGV USR2 STKFLT CHLD CONT STOP TSTP TTIN IO])
ioctl(7, 0xc0286429, 0xff87e0b8)        = -1 EBUSY (Device or resource busy)
--- SIGALRM (Alarm clock) @ 0 (0) ---
sigreturn()                             = ? (mask now [ABRT FPE KILL USR1 SEGV USR2 STKFLT CHLD CONT STOP TSTP TTIN IO])
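
The request number in the strace output above can be decoded by hand. The following is a minimal sketch, not part of the original report: the helper name `decode_ioctl` and its parameters are made up for illustration, and the bit layout (8 number bits, 8 type bits, 13 size bits, 3 direction bits, with READ=2 and WRITE=4) is assumed to be the powerpc one. It shows that 0xc0286429 is a read/write ioctl of type 'd' (the DRM ioctl base), number 0x29, carrying a 40-byte argument. Number 0x29 should correspond to DRM_IOCTL_DMA in drm.h, which would be consistent with drmDMA () in the pstack backtrace.

```python
from typing import NamedTuple


class IoctlFields(NamedTuple):
    direction: str
    size: int
    type_char: str
    nr: int


def decode_ioctl(request: int, size_bits: int = 13, dir_bits: int = 3,
                 read_bit: int = 2, write_bit: int = 4) -> IoctlFields:
    """Split an ioctl request number into its fields.

    The defaults are an assumption: the powerpc layout
    (nr: bits 0-7, type: bits 8-15, size: 13 bits, direction: 3 bits).
    """
    nr = request & 0xFF
    type_code = (request >> 8) & 0xFF
    size = (request >> 16) & ((1 << size_bits) - 1)
    dir_code = (request >> (16 + size_bits)) & ((1 << dir_bits) - 1)

    names = []
    if dir_code & read_bit:
        names.append("read")
    if dir_code & write_bit:
        names.append("write")
    direction = "|".join(names) or "none"

    return IoctlFields(direction, size, chr(type_code), nr)


# Request number taken from the strace output.
fields = decode_ioctl(0xC0286429)
print(fields.direction, fields.size, fields.type_char, hex(fields.nr))
```

On architectures that use the generic layout (14 size bits, 2 direction bits, WRITE=1, READ=2), the same helper could be called with `size_bits=14, dir_bits=2, read_bit=2, write_bit=1`.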

Usually the hangs happen either when starting up firefox or when scrolling in a larger web page.  I haven't seen this happen when using gnome-terminals exclusively.  The mouse cursor still moves, but that's it: the keyboard doesn't work and mouse clicks have no effect.  Nothing interesting shows up in dmesg while it hangs.
After I killall -9 Xorg, the process stops eating all CPU and an oops appears:
Unable to handle kernel paging request for data at address 0x00000000
Faulting instruction address: 0xd000000000b296a0
Oops: Kernel access of bad area, sig: 11 [#1]
SMP NR_CPUS=128 NUMA Maple
Modules linked in: fuse radeon drm sunrpc ipt_MASQUERADE iptable_nat nf_nat nf_conntrack_ipv6 ip6t_REJECT ip6table_filter ip6_tables ipv6 dm_multipath uinput tg3 ata_generic ipr pata_amd shpchp sata_sil [last unloaded: scsi_wait_scan]
NIP: d000000000b296a0 LR: d000000000b2994c CTR: c000000000092cf4
REGS: c000000074c22fb0 TRAP: 0300   Not tainted  (2.6.29.3-140.fc11.ppc64)
MSR: 9000000000009032 <EE,ME,IR,DR>  CR: 24084442  XER: 20000000
DAR: 0000000000000000, DSISR: 0000000040000000
TASK = c000000074b39a80[1840] 'Xorg' THREAD: c000000074c20000 CPU: 0
GPR00: 0000000000600017 c000000074c23230 d000000000be2680 c0000000773e3000 
GPR04: 0000000000000000 0000000000030426 000000000000000a 00000000000c10bc 
GPR08: d000000001005000 0000000000000000 0000000000030426 c000000074babdc0 
GPR12: 0000000024000024 c000000000ea2400 00000000ff87f540 00000000ff87f2d0 
GPR16: 0000000000000004 0000000000000008 0000000000418004 00000000003c0000 
GPR20: 0000000008430000 c0000000770cc388 c000000074c23ea0 0000000000000000 
GPR24: c000000079a6d148 c000000079a6d160 c0000000773e3180 c000000079a6d000 
GPR28: 0000000000000000 c0000000773e3000 d000000000bdf4c8 c000000074c23230 
NIP [d000000000b296a0] .radeon_read_ring_rptr+0xe8/0x120 [radeon]
LR [d000000000b2994c] .radeon_get_ring_head+0x98/0x198 [radeon]
Call Trace:
[c000000074c23230] [0000000000000001] 0x1 (unreliable)
[c000000074c232c0] [d000000000b2994c] .radeon_get_ring_head+0x98/0x198 [radeon]
[c000000074c23350] [d000000000b29b08] .radeon_commit_ring+0xbc/0x264 [radeon]
[c000000074c233e0] [d000000000b367d4] .radeon_do_cp_idle+0x1cc/0x1f0 [radeon]
[c000000074c23470] [d000000000b3773c] .radeon_do_release+0xf0/0x354 [radeon]
[c000000074c23520] [d000000000b43c48] .radeon_driver_lastclose+0x6c/0x94 [radeon]
[c000000074c235c0] [d000000000a0316c] .drm_lastclose+0x8c/0x354 [drm]
[c000000074c23670] [d000000000a03a9c] .drm_release+0x634/0x6b8 [drm]
[c000000074c23720] [c00000000017def4] .__fput+0x174/0x270
[c000000074c237d0] [c00000000017e03c] .fput+0x4c/0x60
[c000000074c23860] [c0000000001798f4] .filp_close+0xc8/0xf4
[c000000074c23900] [c0000000000a7918] .put_files_struct+0xd0/0x16c
[c000000074c239c0] [c0000000000a7a24] .exit_files+0x70/0x8c
[c000000074c23a50] [c0000000000a9e7c] .do_exit+0x2a8/0x8c4
[c000000074c23b50] [c0000000000aa564] .do_group_exit+0xcc/0x100
[c000000074c23bf0] [c0000000000ba6a4] .get_signal_to_deliver+0x458/0x4e8
[c000000074c23ce0] [c000000000015120] .do_signal+0x7c/0x37c
[c000000074c23e30] [c000000000008c48] do_work+0x24/0x28
Instruction dump:
e97e8020 800b0000 2f800000 419e003c 3880ffff 7d234b78 78840020 4805a3b5 
e8410028 48000020 579c003a e92b0018 <7c09e02e> 78034602 5003c42e 5003421e 
---[ end trace 3747c1052a895b99 ]---
Fixing recursive fault but reboot is needed!
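
The faulting instruction word, marked <7c09e02e> in the instruction dump above, can be decoded to see which registers formed the bad address. This is a minimal sketch, not part of the original report: `decode_xform` is a hypothetical helper that assumes the standard PowerPC X-form field layout, and the register values are copied from the GPR dump. Primary opcode 31 with extended opcode 23 is lwzx, so the instruction reads as `lwzx r0, r9, r28`. With both r9 and r28 equal to zero, the effective address is 0, which matches DAR. That suggests radeon_read_ring_rptr indexed through a NULL base pointer, though this reading is an inference from the dump rather than something confirmed in the bug.

```python
def decode_xform(word: int) -> dict:
    """Decode the fields of a 32-bit PowerPC X-form instruction word."""
    return {
        "opcode": (word >> 26) & 0x3F,  # primary opcode, top 6 bits
        "rt": (word >> 21) & 0x1F,      # target register
        "ra": (word >> 16) & 0x1F,      # base register
        "rb": (word >> 11) & 0x1F,      # index register
        "xo": (word >> 1) & 0x3FF,      # extended opcode
    }


# Values copied from the GPR dump in the oops (GPR09 and GPR28).
gpr = {9: 0x0000000000000000, 28: 0x0000000000000000}

insn = decode_xform(0x7C09E02E)

# For indexed loads, RA == 0 means "use literal zero" instead of r0.
base = 0 if insn["ra"] == 0 else gpr[insn["ra"]]
effective_address = (base + gpr[insn["rb"]]) & 0xFFFFFFFFFFFFFFFF

print(insn)
print(hex(effective_address))
```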
rpm -q xorg-x11-drv-ati kernel; uname -a
xorg-x11-drv-ati-6.12.2-14.fc11.ppc
kernel-2.6.29.3-140.fc11.ppc64
kernel-2.6.29.4-162.fc11.ppc64
kernel-2.6.29.4-167.fc11.ppc64
Linux sickle.ms.mff.cuni.cz 2.6.29.3-140.fc11.ppc64 #1 SMP Tue May 12 10:33:41 EDT 2009 ppc64 ppc64 ppc64 GNU/Linux
Comment 1 Bug Zapper 2009-06-09 13:02:12 EDT
This bug appears to have been reported against 'rawhide' during the Fedora 11 development cycle.
Changing version to '11'.

More information and reason for this action is here:
http://fedoraproject.org/wiki/BugZappers/HouseKeeping
Comment 2 Matěj Cepl 2009-11-05 13:30:35 EST
Since this bugzilla report was filed, there have been several major updates in various components of the Xorg system, which may have resolved this issue. Users who have experienced this problem are encouraged to upgrade their system to the latest version of their packages. For packages from the updates-testing repository, you can use the command

yum upgrade --enablerepo='*-updates-testing'

Alternatively, you can also try to test whether this bug is reproducible with the upcoming Fedora 12 distribution by downloading LiveMedia of F12 Beta available at http://alt.fedoraproject.org/pub/alt/nightly-composes/ . By using that you get all the latest packages without need to install anything on your computer. For more information on using LiveMedia take a look at https://fedoraproject.org/wiki/FedoraLiveCD .

If you still experience this problem on an up-to-date system, please let us know in a comment on this bug; likewise, let us know if the upgraded system works for you.

If you won't be able to reply in one month, I will have to close this bug as INSUFFICIENT_DATA. Thank you.

[This is a bulk message for all open Fedora Rawhide Xorg-related bugs. I'm adding myself to the CC list for each bug, so I'll see any comments you make after this and do my best to make sure every issue gets proper attention.]
Comment 3 Matěj Cepl 2010-02-26 07:16:36 EST
Could you please reply to the previous question? If you won't reply in one month, I will have to close this bug as INSUFFICIENT_DATA. Thank you.

[Please note that this is a machine-generated comment sent to a large number of bugs. Due to some technical issues, it is possible that we have missed some of the responses. If that has happened, please just add a comment saying so, and we will see it. Thank you.]
Comment 4 Jakub Jelinek 2010-02-26 07:20:41 EST
Sorry, the box died hw-wise, so I have no reproducer anymore.
