Bug 490291 - kernel-2.6.29-0.237.rc7.git4.fc11 recursive fault in radeon_read_ring_rptr
Status: CLOSED RAWHIDE
Product: Fedora
Classification: Fedora
Component: kernel
Version: 11
Hardware: All
OS: Linux
Priority: low
Severity: medium
Assigned To: Dave Airlie
QA Contact: Fedora Extras Quality Assurance
Reported: 2009-03-14 17:20 EDT by Michal Jaegermann
Modified: 2009-10-15 10:16 EDT

Doc Type: Bug Fix
Last Closed: 2009-10-15 10:16:49 EDT

Attachments
dmesg from 2.6.29-0.237.rc7.git4.fc11 terminated by protection fault (33.36 KB, text/plain)
2009-03-14 17:20 EDT, Michal Jaegermann
traces from X server restart with the current configuration (5.75 KB, text/plain)
2009-04-08 17:02 EDT, Michal Jaegermann

Description Michal Jaegermann 2009-03-14 17:20:41 EDT
Created attachment 335228
dmesg from 2.6.29-0.237.rc7.git4.fc11 terminated by protection fault

Description of problem:

I got "general protection fault: 0000 [#1] SMP" with:

RIP: 0010:[<ffffffffa0048083>]  [<ffffffffa0048083>] radeon_read_ring_rptr+0x3b/0x50 [radeon]

Full dmesg output is attached.  It ends with "Fixing recursive fault but reboot is needed!" so I rebooted.

This happened on an X server restart.  After a reboot the same operation
did not fault the first time, but this time I got:

X:2822 freeing invalid memtype e0102000-e0112000
X:2822 freeing invalid memtype e0112000-e0122000
X:2822 freeing invalid memtype e0122000-e0132000
X:2822 freeing invalid memtype e0132000-e0142000
X:2822 freeing invalid memtype e0142000-e0152000
X:2822 freeing invalid memtype e0152000-e0162000
X:2822 freeing invalid memtype e0162000-e0172000
X:2822 freeing invalid memtype e0172000-e0182000
X:2822 freeing invalid memtype e0182000-e0192000
X:2822 freeing invalid memtype e0192000-e01a2000
X:2822 freeing invalid memtype e01a2000-e01b2000
X:2822 freeing invalid memtype e01b2000-e01c2000
X:2822 freeing invalid memtype e01c2000-e01d2000
X:2822 freeing invalid memtype e01d2000-e01e2000
X:2822 freeing invalid memtype e01e2000-e01f2000
X:2822 freeing invalid memtype e01f2000-e0202000
X:2822 freeing invalid memtype e0202000-e0212000
X:2822 freeing invalid memtype e0212000-e0222000
X:2822 freeing invalid memtype e0222000-e0232000
X:2822 freeing invalid memtype e0232000-e0242000
X:2822 freeing invalid memtype e0242000-e0252000
X:2822 freeing invalid memtype e0252000-e0262000
X:2822 freeing invalid memtype e0262000-e0272000
X:2822 freeing invalid memtype e0272000-e0282000
X:2822 freeing invalid memtype e0282000-e0292000
X:2822 freeing invalid memtype e0292000-e02a2000
X:2822 freeing invalid memtype e02a2000-e02b2000
X:2822 freeing invalid memtype e02b2000-e02c2000
X:2822 freeing invalid memtype e02c2000-e02d2000
X:2822 freeing invalid memtype e02d2000-e02e2000
X:2822 freeing invalid memtype e02e2000-e02f2000
X:2822 freeing invalid memtype e02f2000-e0302000

Restarting the server again produced the same pile of "freeing invalid memtype" messages and the same general protection fault as above.

In both cases KMS was _not_ used.
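
The 32 "freeing invalid memtype" lines quoted above appear to form one contiguous run of 64 KiB ranges from e0102000 to e0302000 (2 MiB in total). A minimal Python sketch for checking that pattern in a saved dmesg excerpt follows. The helper names `parse_memtype_lines` and `summarize` are hypothetical, and the line format is assumed to be exactly the one shown in this report.

```python
import re

# Matches lines such as "X:2822 freeing invalid memtype e0102000-e0112000".
# The format is taken from the excerpt in this report; other kernels may differ.
LINE_RE = re.compile(
    r"^(?P<comm>\S+):(?P<pid>\d+) freeing invalid memtype "
    r"(?P<start>[0-9a-f]+)-(?P<end>[0-9a-f]+)$"
)


def parse_memtype_lines(text):
    """Return (start, end) integer pairs for every matching line in text."""
    ranges = []
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if match:
            start = int(match.group("start"), 16)
            end = int(match.group("end"), 16)
            ranges.append((start, end))
    return ranges


def summarize(ranges):
    """Report the count, distinct range sizes, contiguity, and overall span."""
    sizes = sorted({end - start for start, end in ranges})
    # Contiguous means each range ends exactly where the next one begins.
    contiguous = all(a[1] == b[0] for a, b in zip(ranges, ranges[1:]))
    span = (ranges[0][0], ranges[-1][1]) if ranges else None
    return {
        "count": len(ranges),
        "sizes": sizes,
        "contiguous": contiguous,
        "span": span,
    }


# First three lines of the excerpt above, used as sample input.
SAMPLE = """\
X:2822 freeing invalid memtype e0102000-e0112000
X:2822 freeing invalid memtype e0112000-e0122000
X:2822 freeing invalid memtype e0122000-e0132000
"""

if __name__ == "__main__":
    print(summarize(parse_memtype_lines(SAMPLE)))
```

Feeding the full attached dmesg through `parse_memtype_lines` would show whether every restart produces the same set of ranges or whether the addresses shift between runs.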


Version-Release number of selected component (if applicable):
2.6.29-0.237.rc7.git4.fc11.x86_64

How reproducible:
Apparently not every time but it is possible to repeat.

Additional info:
ATI Technologies Inc R300 AD [Radeon 9500 Pro] graphics card.
Comment 1 Michal Jaegermann 2009-03-14 17:33:39 EDT
With 'nomodeset' dropped from the kernel command line, I have so far been unable to reproduce that fault. On the other hand, on a reboot the monitor becomes totally unusable (no sync) until it is power cycled.
Comment 2 Michal Jaegermann 2009-03-31 16:50:52 EDT
I retried with kernel 2.6.29-21.fc11.x86_64, xorg-x11-server-Xorg-1.6.0-16.fc11,
and xorg-x11-drv-ati-6.12.0-2.fc11.  The results are still as described in the
report: a bunch of "X:2779 freeing invalid memtype ..." messages followed
by a recursive fault in radeon_read_ring_rptr on a server restart, or an unusable monitor after a reboot if "nomodeset" was not used.

The X server is truly dead after such an experiment and nothing can be done from the local console.  Even from a remote login the X process becomes an unkillable zombie.
Comment 3 Michal Jaegermann 2009-04-08 17:02:11 EDT
Created attachment 338800
traces from X server restart with the current configuration

I tried the same experiment (and a few times in the meantime) with kernel-2.6.29.1-54.fc11.x86_64, xorg-x11-server-Xorg-1.6.0-17.fc11 and xorg-x11-drv-ati-6.12.1-9.fc11.  This time I do not see that recursive fault any longer.  OTOH a series of "freeing invalid memtype" followed by a long list of "calling reserve_ram_pages_type" is still there.  Is that expected?

A relevant fragment of dmesg output is attached.
Comment 4 Bug Zapper 2009-06-09 08:14:13 EDT
This bug appears to have been reported against 'rawhide' during the Fedora 11 development cycle.
Changing version to '11'.

More information and reason for this action is here:
http://fedoraproject.org/wiki/BugZappers/HouseKeeping
Comment 5 Jérôme Glisse 2009-10-14 06:51:48 EDT
Can you test with the Fedora 12 live CD and report whether it works?
Comment 6 Michal Jaegermann 2009-10-14 14:34:07 EDT
> Can you test with fedora 12 livecd ...

Not really.  On the other hand, I have not seen that problem for a long time, and whatever faults are in the current rawhide, this is not one of them.  A dmesg for the current rawhide kernel on this hardware can be found, for example, here:
https://bugzilla.redhat.com/attachment.cgi?id=364786
Comment 7 Jérôme Glisse 2009-10-15 10:16:49 EDT
OK, so closing this bug. Reopen it if you experience a similar problem with Fedora 12.
