Bug 145534

Summary: (radeon 9200) X hung
Product: [Fedora] Fedora
Reporter: Gene Czarcinski <gczarcinski>
Component: xorg-x11
Assignee: X/OpenGL Maintenance List <xgl-maint>
Status: CLOSED ERRATA
QA Contact: David Lawrence <dkl>
Severity: medium
Docs Contact:
Priority: medium
Version: 3
Keywords: Triaged
Target Milestone: ---
Target Release: ---
Hardware: All
OS: Linux
Whiteboard:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2005-04-11 02:51:26 UTC
Type: ---
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On:
Bug Blocks: 136451
Attachments:
Description: copied just before doing kill for hung X process
Flags: none

Description Gene Czarcinski 2005-01-19 14:52:56 UTC
Description of problem:
Dual AMD 2800+ processors with an ATI Radeon 9200 (RV280) card.

Most things work, but occasionally X seems to go into a hard loop ... the
mouse continues to work but nothing else in X does.  I can ssh into the
system from another system and kill X, which is then restarted by gdm,
and things work again.

lspci:

01:05.0 VGA compatible controller: ATI Technologies Inc RV280 [Radeon
9200] (rev 01)
01:05.1 Display controller: ATI Technologies Inc RV280 [Radeon 9200]
(Secondary) (rev 01)

Segment of /var/log/messages:

Jan 19 09:33:35 amber sshd(pam_unix)[17149]: session opened for user
root by root(uid=0)
Jan 19 09:34:12 amber gconfd (gc-5359): Received signal 15, shutting
down cleanly
Jan 19 09:34:12 amber gconfd (gc-5359): Exiting
Jan 19 09:34:12 amber gdm[4917]: gdm_slave_xioerror_handler: Fatal X
error - Restarting :0
Jan 19 09:34:12 amber gdm(pam_unix)[4917]: session closed for user gc
Jan 19 09:34:12 amber su(pam_unix)[5582]: session closed for user root
Jan 19 09:34:15 amber kernel: agpgart: Found an AGP 2.0 compliant
device at 0000:00:00.0.
Jan 19 09:34:15 amber kernel: agpgart: Putting AGP V2 device at
0000:00:00.0 into 1x mode
Jan 19 09:34:15 amber kernel: agpgart: Putting AGP V2 device at
0000:01:05.0 into 1x mode
Jan 19 09:34:15 amber kernel: [drm] Loading R200 Microcode
Jan 19 09:35:13 amber gdm(pam_unix)[17232]: session opened for user gc
by (uid=0)
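
For anyone trying to timestamp these hangs, the gdm "Fatal X error -
Restarting" entry in the excerpt above marks the moment the killed X
server was noticed and restarted.  A minimal Python sketch for pulling
those entries out of a syslog-style file follows.  The helper name
x_restarts and the sample text are illustrative assumptions; the sample
uses the unwrapped form of the log lines quoted above, and the pattern
only covers the exact message wording seen in this report.

```python
import re

# Matches lines such as:
#   Jan 19 09:34:12 amber gdm[4917]: gdm_slave_xioerror_handler: Fatal X error - Restarting :0
# Captures the timestamp and the display being restarted.
_RESTART = re.compile(
    r'^(\w{3}\s+\d+\s+\d\d:\d\d:\d\d)\s+\S+\s+gdm\[\d+\]:\s+'
    r'gdm_slave_xioerror_handler: Fatal X error - Restarting (\S+)'
)


def x_restarts(log_text):
    """Return a list of (timestamp, display) for each gdm X restart entry."""
    hits = []
    for line in log_text.splitlines():
        m = _RESTART.match(line)
        if m:
            hits.append((m.group(1), m.group(2)))
    return hits


# Sample built from the (unwrapped) lines in this report.
sample = (
    "Jan 19 09:34:12 amber gconfd (gc-5359): Exiting\n"
    "Jan 19 09:34:12 amber gdm[4917]: gdm_slave_xioerror_handler: "
    "Fatal X error - Restarting :0\n"
    "Jan 19 09:34:12 amber gdm(pam_unix)[4917]: session closed for user gc\n"
)
print(x_restarts(sample))
```

Run against the real /var/log/messages, this would give one entry per
forced restart, which can then be compared with kernel or DRM messages
logged shortly before each timestamp.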

I am attaching /var/log/Xorg.0.log

Version-Release number of selected component (if applicable):
FC3 with current maintenance ... kernel 2.6.10-1.741_FC3smp and
xorg-x11 6.8.1-12.FC3.21

Comment 1 Gene Czarcinski 2005-01-19 14:54:00 UTC
Created attachment 109967 [details]
copied just before doing kill for hung X process

Comment 2 Mike A. Harris 2005-02-01 00:05:04 UTC
Please upgrade to the latest FC3 kernel, and install the latest
rawhide xorg-x11 rpms, which are currently 6.8.2rc3.  We will be
releasing 6.8.2 as erratum for FC3 in the very near future, and
there have been a significant number of Radeon driver bug fixes
since the xorg-x11 release you are experiencing this problem with.

Please update the report to indicate whether the problem is still
reproducible using rawhide xorg-x11, and also which specific
version of xorg-x11 from rawhide you've tried, as it gets
updated frequently.

Thanks in advance.

Comment 3 Mike A. Harris 2005-02-01 00:13:06 UTC
Setting to "NEEDINFO", awaiting rawhide xorg-x11 testing results
from reporter.

Comment 4 Gene Czarcinski 2005-02-01 11:56:26 UTC
OK, I pulled xorg-x11-6.8.1.903-2 from rawhide and am rebuilding it on
FC3 now (just to be safe).  I am currently running the
2.6.10-1.741_FC3smp kernel but have pulled the 2.6.10-1.753_FC3 kernel
from FC3-Updates-Testing.

Suggestion: if you want us to test stuff, put it into Testing rather
than rawhide.

Comment 5 Gene Czarcinski 2005-02-01 11:57:02 UTC
putting back to needinfo

Comment 6 Gene Czarcinski 2005-02-04 09:11:19 UTC
After running 2 days, 19 hours with the 2.6.10-1.753_FC3smp kernel and
xorg-x11-6.8.1.903-2 pulled from rawhide and rebuilt on my system, X
hung again.  BTW, when X hangs, the mouse cursor can still move but
nothing else on the display updates, and I cannot switch to a VT ... I
must ssh in from another system to kill X.  I will be updating shortly
to the latest kernel.

Comment 8 Gene Czarcinski 2005-03-19 18:19:12 UTC
OK, I updated to xorg-x11-6.8.2-1.FC3.10test from FC3/Testing and this is even
worse ... I got three hangs in a matter of a few hours.

The problem with these hangs is that it is not clear what is causing them ...
nothing in /var/log/messages or /var/log/Xorg.0.log and the system is hard hung
requiring a power cycle or hardware reset (I cannot ssh in from another system).

BTW, I am running kernel-smp-2.6.10-1.766_FC3

Comment 9 Mike A. Harris 2005-03-22 13:15:48 UTC
Are you absolutely positive that it is X hanging the system?  If the
kernel or hardware hangs, X will be unusable as well.  In your initial
report, the system did not completely hang, X just had problems, but
after updating, you now have a totally frozen system if I understand
correctly.

Please attach your X server log and config file, and /var/log/messages
from just after a crash.

Also, please try disabling DRI by commenting out the Load "dri" line in
the X server config file, and see if that changes anything.  If disabling
DRI makes the problem go away, please re-enable DRI and try to stress
the system with OpenGL apps a bit and see if you can get the system
to hang more quickly and in a more reproducible manner.
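
For reference, disabling DRI this way means putting a "#" in front of
the Load "dri" line in the Module section of /etc/X11/xorg.conf.  A
minimal Python sketch of that edit, applied to the config text, is
below.  The helper name disable_dri and the sample Module section are
made up for illustration; a real config will list different modules,
and the file should be backed up before it is rewritten.

```python
import re

# Hypothetical helper, not part of any X.Org tooling.
# xorg.conf keywords are case-insensitive, hence re.IGNORECASE.
# Group 1 keeps the indentation, group 2 the Load "dri" statement.
_LOAD_DRI = re.compile(r'^(\s*)(Load\s+"dri")\s*$', re.IGNORECASE)


def disable_dri(conf_text):
    """Return conf_text with every active Load "dri" line commented out."""
    out = []
    for line in conf_text.splitlines():
        m = _LOAD_DRI.match(line)
        if m:
            out.append(m.group(1) + "#" + m.group(2))
        else:
            out.append(line)
    # Preserve the trailing newline if the input had one.
    trailing = "\n" if conf_text.endswith("\n") else ""
    return "\n".join(out) + trailing


# Illustrative Module section only; not taken from the reporter's config.
sample = (
    'Section "Module"\n'
    '    Load "dbe"\n'
    '    Load "dri"\n'
    '    Load "glx"\n'
    'EndSection\n'
)
print(disable_dri(sample))
```

Lines that are already commented out are left alone, so running the
helper twice gives the same result.  Re-enabling DRI is just removing
the "#" again and restarting X.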

Thanks in advance.


Comment 10 Mike A. Harris 2005-03-22 13:16:57 UTC
Setting status to "NEEDINFO", awaiting feedback and file attachments.

Comment 11 Gene Czarcinski 2005-03-22 15:12:36 UTC
OK, I will not be able to run tests until later today or early tomorrow.

The reason I suspect X is that it is the only thing changing ... update to
6.8.2-1.FC3.10test and it hangs ... back off to 6.8.1-12.FC3.12 and it does not
hang (as much anyway).

Comment 12 Gene Czarcinski 2005-03-25 15:51:11 UTC
OK, I have updated to 6.8.2-1.FC3.10test and edited /etc/X11/xorg.conf to
comment out loading dri.  I have not run OK for a couple of days.

I have not uncommented and am loading "dri" and waiting to see if the problem
occurs.

Comment 13 Gene Czarcinski 2005-03-25 15:52:12 UTC
"not run OK" -> "now run OK"

Comment 14 Mike A. Harris 2005-04-11 02:51:26 UTC
Ok, we've released 6.8.2 as official errata for FC3 now, and you've indicated
this seems to work now in the last few comments, so I'm closing as
fixed in "ERRATA".

If the problem recurs, please file a bug report in X.Org bugzilla with
complete details at http://bugs.freedesktop.org in the "xorg" component,
so they can fix it for 6.8.3.

Thanks.