Bug 662312 - [RV280] With 2.6.37+ system crashes after a little while unless agpmode=-1 or only 1 active cpu
Summary: [RV280] With 2.6.37+ system crashes after a little while unless agpmode=-1 or...
Keywords:
Status: CLOSED EOL
Alias: None
Product: Fedora
Classification: Fedora
Component: xorg-x11-drv-ati
Version: 23
Hardware: Unspecified
OS: Unspecified
Priority: low
Severity: medium
Target Milestone: ---
Assignee: Jérôme Glisse
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2010-12-11 13:56 UTC by Bruno Wolff III
Modified: 2018-04-11 17:03 UTC

Fixed In Version:
Clone Of:
Environment:
Last Closed: 2016-12-20 12:05:02 UTC
Type: ---
Embargoed:


Attachments
Xorg.0.log from a 2.6.36 boot (30.83 KB, text/plain)
2010-12-11 13:56 UTC, Bruno Wolff III
dmesg (96.95 KB, text/plain)
2010-12-21 01:20 UTC, Bruno Wolff III
/var/log/messages (145.19 KB, text/plain)
2010-12-21 01:21 UTC, Bruno Wolff III
Xorg.0.log (30.83 KB, text/plain)
2010-12-21 01:21 UTC, Bruno Wolff III
xorg.conf (910 bytes, text/plain)
2010-12-21 01:22 UTC, Bruno Wolff III

Description Bruno Wolff III 2010-12-11 13:56:36 UTC
Created attachment 468148 [details]
Xorg.0.log from a 2.6.36 boot

Description of problem:
When I boot using 2.6.37 kernels (from Kyle's repo) I see crashes, typically within 10 minutes, if I am using the console for some light graphics-related work (clicking on a link in firefox/minefield or playing colossus).
If I don't use the console the system will stay up longer. 2.6.36 kernels will occasionally crash on this system too, but those crashes are much rarer and happen when I am not using the console.
This happens with kernel-PAE-2.6.37-0.rc5.git2.1.fc15.i686 (I installed this from Kyle's repo before he did a rawhide build with the same version) as well as with earlier 2.6.37 kernels in Kyle's repo.

Comment 1 Bruno Wolff III 2010-12-11 14:21:11 UTC
I forgot to include the version of the radeon driver:
xorg-x11-drv-ati-6.13.2-0.3.20101201gite142e55c5.fc15.i686

Comment 2 Bruno Wolff III 2010-12-15 16:37:59 UTC
I tried 2.6.37-0.rc5.git5.1.fc15.i686.PAE and got an X lockup. The cursor still followed the mouse, but ctrl-alt-f2 didn't switch to a VT. I was able to ssh into the machine and things seemed normal.

Comment 3 Bruno Wolff III 2010-12-20 06:32:52 UTC
I tried out kernel-PAE-2.6.37-0.rc6.git5.1.fc15.i686 and playing colossus for a short bit triggered a system crash.

Comment 4 Matěj Cepl 2010-12-21 00:09:52 UTC
Thanks for the bug report.  We have reviewed the information you have provided above, and there is some additional information we require that will be helpful in our diagnosis of this issue.

Please add drm.debug=0x04 to the kernel command line, restart the computer, and attach

* your X server config file (/etc/X11/xorg.conf, if available),
* X server log file (/var/log/Xorg.*.log)
* output of the dmesg command, and
* system log (/var/log/messages)

to the bug report as individual uncompressed file attachments using the bugzilla file attachment link above.

We will review this issue again once you've had a chance to attach this information.

Thanks in advance.
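
Before collecting the logs requested above, it can help to confirm that the running kernel actually booted with the debug option. The following is a minimal sketch, not part of the original request; the helper names `parse_cmdline` and `drm_debug_enabled` are hypothetical, and it assumes the command line is read from /proc/cmdline.

```python
import shlex


def parse_cmdline(text):
    """Split a kernel command line into a {name: value} dict.

    Bare flags such as "quiet" map to None.
    """
    params = {}
    for token in shlex.split(text):
        name, sep, value = token.partition("=")
        params[name] = value if sep else None
    return params


def drm_debug_enabled(text, mask=0x04):
    """Return True if drm.debug is present and has every bit of `mask` set."""
    value = parse_cmdline(text).get("drm.debug")
    if value is None:
        return False
    try:
        # Base 0 accepts both "4" and "0x04".
        return (int(value, 0) & mask) == mask
    except ValueError:
        return False


# Typical use on the affected machine (hypothetical):
#   with open("/proc/cmdline") as f:
#       print(drm_debug_enabled(f.read()))
```

If this returns False after a reboot, the option most likely did not reach the boot loader entry that was actually booted.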

Comment 5 Bruno Wolff III 2010-12-21 01:20:15 UTC
Created attachment 469883 [details]
dmesg

Comment 6 Bruno Wolff III 2010-12-21 01:21:04 UTC
Created attachment 469884 [details]
/var/log/messages

Comment 7 Bruno Wolff III 2010-12-21 01:21:52 UTC
Created attachment 469885 [details]
Xorg.0.log

Comment 8 Bruno Wolff III 2010-12-21 01:22:39 UTC
Created attachment 469886 [details]
xorg.conf

Comment 9 Matěj Cepl 2010-12-21 14:59:31 UTC
(In reply to comment #8)
> Created attachment 469886 [details]
> xorg.conf

Why do you have xorg.conf at all? I don't see anything particularly special there. What happens if you move it out of the way?

Thank you

Comment 10 Bruno Wolff III 2010-12-21 16:14:53 UTC
Because the monitor doesn't do EDID (at least not properly) and 1280x1024 isn't considered a safe resolution for unknown monitors. So to use the native resolution of the monitor I need to use an xorg.conf file. If I don't use it I get some squashed 4:3 setup at a low resolution.

Comment 11 Matěj Cepl 2010-12-22 15:41:03 UTC
That was the last straw I was grasping at (you could probably achieve the same result with control-center/Display, but I don't expect much difference in the result) ... I don't see anything wrong with the attached logs.

Passing to the developers.

Comment 12 Bruno Wolff III 2011-01-04 06:43:25 UTC
I retested this with kernel-PAE-2.6.37-0.rc8.git3.1.fc15.i686 and saw the same issue. (In this case the mouse pointer moved on the screen, but the icon was the wait icon, which normally spins but wasn't spinning.)

Comment 13 Bruno Wolff III 2011-01-21 14:38:45 UTC
I am still seeing this with kernel-PAE-2.6.38-0.rc1.git1.1.fc15.i686 (from http://koji.fedoraproject.org/koji/taskinfo?taskID=2734067).

Comment 14 Bruno Wolff III 2011-02-21 19:58:25 UTC
I am still seeing this with kernel-PAE-2.6.38-0.rc5.git5.1.fc15.i686.

Comment 15 Bruno Wolff III 2011-03-05 15:41:41 UTC
I am still getting crashes with kernel-PAE-2.6.38-0.rc7.git2.3.fc15.i686 and xorg-x11-drv-ati-6.14.0-2.20110204gita27b5dbd9.fc15.i686. I continue to run kernel-PAE-2.6.36.2-12.rc1.fc15.i686 as a workaround.

Comment 16 Dave Airlie 2011-03-15 01:58:16 UTC
Does booting with radeon.agpmode=1 help at all?

If not, what about radeon.agpmode=-1?

Comment 17 Bruno Wolff III 2011-03-15 08:02:39 UTC
radeon.agpmode=1 didn't help; the system crashed pretty quickly after logging into a graphical desktop.

radeon.agpmode=-1 does seem to have helped. I normally had been crashing by this point. It is a bit early to be absolutely sure, but it looks good.

This is with kernel-PAE-2.6.38-1.fc15.i686 and xorg-x11-drv-ati-6.14.0-6.20110315git4d3504970.fc15.i686.

Thanks for the suggestion. I really wanted to get off 2.6.36 so I could look into some USB 1.0 issues.
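
To double-check which agpmode the loaded radeon module ended up with, a small sketch like the one below may be useful. It assumes the parameter is exposed read-only at /sys/module/radeon/parameters/agpmode, and it relies on the driver's own parameter description, "AGP Mode (-1 == PCI)". The function names are hypothetical, and the wording of the descriptions is mine, not the driver's.

```python
from pathlib import Path


def parse_agpmode(text):
    """Parse the contents of the agpmode parameter file into an int, or None."""
    try:
        return int(text.strip())
    except ValueError:
        return None


def read_agpmode(param_path="/sys/module/radeon/parameters/agpmode"):
    """Return the agpmode value, or None if the file is missing or unreadable."""
    try:
        return parse_agpmode(Path(param_path).read_text())
    except OSError:
        return None


def describe_agpmode(mode):
    """Give a short human-readable description of an agpmode value."""
    if mode is None:
        return "unknown (radeon not loaded or parameter not exposed)"
    if mode == -1:
        return "AGP disabled, card driven as PCI"
    if mode > 0:
        return f"AGP {mode}x requested"
    return "no explicit AGP rate requested"
```

On a system booted with radeon.agpmode=-1, `describe_agpmode(read_agpmode())` would be expected to report the PCI fallback.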

Comment 18 Bruno Wolff III 2011-04-30 16:12:12 UTC
I played with this a bit more this morning using 2.6.38.4-20.
The system doesn't actually crash. If I have established an inbound ssh session before the bug is triggered, I still have terminal access. I was unable to start a new ssh session after the bug was triggered. I couldn't get alt-sysrq-c to trigger a crash from the console, but I was able to trigger one from the ssh session. I don't know if the traceback would be very useful. I do have a 2GB core dump. (The dump probably has luks keys in it, but I am thinking of changing the partition layout, which would obsolete those keys, so I can provide the dump file if it would help.) If there are other things worth looking at while it's in the partially hung state, I can do that without too much trouble.
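
For reference, the usual way to issue a SysRq command from a remote shell is to write the command letter to /proc/sysrq-trigger as root. The comment above does not say which method was used, so treat this as an assumption. The sketch below uses a hypothetical helper, `trigger_sysrq`, and defaults to a dry run because writing "c" for real crashes the machine immediately.

```python
def trigger_sysrq(command, trigger_path="/proc/sysrq-trigger", dry_run=True):
    """Send a single-character SysRq command, or describe it when dry_run is set.

    With dry_run=False this needs root, and "c" will crash the kernel at once
    (producing a dump only if kdump is configured).
    """
    if len(command) != 1 or not command.isalnum():
        raise ValueError("SysRq command must be a single letter or digit")
    if dry_run:
        return f"would write {command!r} to {trigger_path}"
    with open(trigger_path, "w") as f:
        f.write(command)
    return f"wrote {command!r} to {trigger_path}"
```

Calling `trigger_sysrq("c")` only reports what would happen; `trigger_sysrq("c", dry_run=False)` performs the write.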

Comment 19 Bruno Wolff III 2011-10-21 23:55:20 UTC
I booted without radeon.agpmode=-1 and this problem affects 3.1.0-0.rc10.git0.1.fc16.i686.PAE (the rest of the system is rawhide, but debug kernels are running very slow, so I am using f16 kernels right now).

Comment 20 Bruno Wolff III 2012-01-23 08:30:57 UTC
This is still happening with kernel-PAE-3.3.0-0.rc1.git0.3.fc17.i686.
One new thing I noticed is that when my machine came up with just one CPU, I didn't get a crash. But after I rebooted and had 2 functioning CPUs, I had a crash fairly soon.
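
When comparing one-CPU and two-CPU boots, it is worth recording how many CPUs the kernel actually brought online. The sketch below is a rough aid, assuming the list is read from /sys/devices/system/cpu/online; the function names are hypothetical. The single-CPU case could presumably also be forced deliberately with the maxcpus=1 boot parameter, though that was not tested in this report.

```python
def parse_cpu_list(text):
    """Expand a kernel CPU list such as "0-1,4" into [0, 1, 4]."""
    cpus = []
    for part in text.strip().split(","):
        if not part:
            continue
        first, sep, last = part.partition("-")
        # A bare number like "3" is treated as the range 3-3.
        cpus.extend(range(int(first), int(last if sep else first) + 1))
    return cpus


def online_cpus(path="/sys/devices/system/cpu/online"):
    """Return the list of CPU numbers the kernel reports as online."""
    with open(path) as f:
        return parse_cpu_list(f.read())
```

A result with a single entry corresponds to the configuration in which no crash was observed.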

Comment 21 Fedora End Of Life 2013-01-16 22:03:24 UTC
This message is a reminder that Fedora 16 is nearing its end of life.
Approximately 4 (four) weeks from now Fedora will stop maintaining
and issuing updates for Fedora 16. It is Fedora's policy to close all
bug reports from releases that are no longer maintained. At that time
this bug will be closed as WONTFIX if it remains open with a Fedora 
'version' of '16'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version' 
to a later Fedora version prior to Fedora 16's end of life.

Bug Reporter: Thank you for reporting this issue and we are sorry that 
we may not be able to fix it before Fedora 16 is end of life. If you 
would still like to see this bug fixed and are able to reproduce it 
against a later version of Fedora, you are encouraged to click on 
"Clone This Bug" and open it against that version of Fedora.

Although we aim to fix as many bugs as possible during every release's 
lifetime, sometimes those efforts are overtaken by events. Often a 
more recent Fedora release includes newer upstream software that fixes 
bugs or makes them obsolete.

The process we are following is described here: 
http://fedoraproject.org/wiki/BugZappers/HouseKeeping

Comment 22 Fedora End Of Life 2013-07-04 05:28:52 UTC
This message is a reminder that Fedora 17 is nearing its end of life.
Approximately 4 (four) weeks from now Fedora will stop maintaining
and issuing updates for Fedora 17. It is Fedora's policy to close all
bug reports from releases that are no longer maintained. At that time
this bug will be closed as WONTFIX if it remains open with a Fedora 
'version' of '17'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version' 
to a later Fedora version prior to Fedora 17's end of life.

Bug Reporter: Thank you for reporting this issue and we are sorry that 
we may not be able to fix it before Fedora 17 is end of life. If you 
would still like to see this bug fixed and are able to reproduce it 
against a later version of Fedora, you are encouraged to change the 
'version' to a later Fedora version prior to Fedora 17's end of life.

Although we aim to fix as many bugs as possible during every release's 
lifetime, sometimes those efforts are overtaken by events. Often a 
more recent Fedora release includes newer upstream software that fixes 
bugs or makes them obsolete.

Comment 23 Fedora End Of Life 2013-08-01 16:36:52 UTC
Fedora 17 changed to end-of-life (EOL) status on 2013-07-30. Fedora 17 is 
no longer maintained, which means that it will not receive any further 
security or bug fix updates. As a result we are closing this bug.

If you can reproduce this bug against a currently maintained version of 
Fedora please feel free to reopen this bug against that version.

Thank you for reporting this bug and we are sorry it could not be fixed.

Comment 24 Bruno Wolff III 2013-09-21 07:29:41 UTC
This is still happening with kernel 3.12.0-0.rc1.git3.2.fc21.i686+PAE

Comment 25 Jan Kurik 2015-07-15 15:17:52 UTC
This bug appears to have been reported against 'rawhide' during the Fedora 23 development cycle.
Changing version to '23'.

(As we did not run this process for some time, it could also affect pre-Fedora 23 development
cycle bugs. We are very sorry. It will help us with cleanup during Fedora 23 End Of Life. Thank you.)

More information and reason for this action is here:
https://fedoraproject.org/wiki/BugZappers/HouseKeeping/Fedora23

Comment 26 Bruno Wolff III 2016-05-17 12:33:43 UTC
So the machine with this card has died and I am not very likely to reuse the card in another machine. It is also possible this was related to a hardware bug in the CPU where a cache-line-related deadlock could happen on systems with more than one CPU. The CPU predates updatable microcode for AMD and hence never got a fix.

Comment 27 Fedora End Of Life 2016-11-24 10:28:54 UTC
This message is a reminder that Fedora 23 is nearing its end of life.
Approximately 4 (four) weeks from now Fedora will stop maintaining
and issuing updates for Fedora 23. It is Fedora's policy to close all
bug reports from releases that are no longer maintained. At that time
this bug will be closed as EOL if it remains open with a Fedora  'version'
of '23'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version' 
to a later Fedora version.

Thank you for reporting this issue and we are sorry that we were not 
able to fix it before Fedora 23 is end of life. If you would still like 
to see this bug fixed and are able to reproduce it against a later version 
of Fedora, you are encouraged to change the 'version' to a later Fedora 
version before this bug is closed as described in the policy above.

Although we aim to fix as many bugs as possible during every release's 
lifetime, sometimes those efforts are overtaken by events. Often a 
more recent Fedora release includes newer upstream software that fixes 
bugs or makes them obsolete.

Comment 28 Fedora End Of Life 2016-12-20 12:05:02 UTC
Fedora 23 changed to end-of-life (EOL) status on 2016-12-20. Fedora 23 is
no longer maintained, which means that it will not receive any further
security or bug fix updates. As a result we are closing this bug.

If you can reproduce this bug against a currently maintained version of
Fedora please feel free to reopen this bug against that version. If you
are unable to reopen this bug, please file a new report against the
current release. If you experience problems, please add a comment to this
bug.

Thank you for reporting this bug and we are sorry it could not be fixed.

