Bug 177773
Summary: | radeon driver hangs machine hard | ||||||
---|---|---|---|---|---|---|---|
Product: | [Fedora] Fedora | Reporter: | Erwin Rol <redhatbugs> | ||||
Component: | xorg-x11-drv-ati | Assignee: | X/OpenGL Maintenance List <xgl-maint> | ||||
Status: | CLOSED DUPLICATE | QA Contact: | |||||
Severity: | medium | Docs Contact: | |||||
Priority: | medium | ||||||
Version: | rawhide | CC: | linux | ||||
Hardware: | x86_64 | ||||||
OS: | Linux | ||||||
Fixed In Version: | Doc Type: | Bug Fix | |||||
Last Closed: | 2006-03-06 22:39:20 UTC | Type: | --- | ||||
Bug Blocks: | 150222 | ||||||
Attachments: |
|
Description
Erwin Rol
2006-01-13 20:50:23 UTC
This may be related to changes to radeon DRM in the kernel. I had similar symptoms that went away around the time of:

commit 281ab031a8c9e5b593142eb4ec59a87faae8676a
Author: Benjamin Herrenschmidt <benh.org>
Date: Fri Dec 16 16:52:22 2005 +1100

    [PATCH] radeon drm: fix agp aperture map offset

    This finally fixes the radeon memory mapping bug that was incorrectly fixed by the previous patch. This time, we use the actual vram size as the size to calculate how far to move the AGP aperture from the framebuffer in card's memory space.

    If there are still issues with this patch, they are due to bugs in the X driver that I'm working on fixing too.

    Signed-off-by: Benjamin Herrenschmidt <benh.org>
    Cc: Mark M. Hoffman <mhoffman>
    Cc: Paul Mackerras <paulus>
    Signed-off-by: Linus Torvalds <torvalds>

and then re-appeared with

commit 392c14beaca2ee85a98d0c6b453501be67423a20
Author: Linus Torvalds <torvalds.org>
Date: Thu Dec 29 13:01:54 2005 -0800

    Revert radeon AGP aperture offset changes

    This reverts the series of commits

        67dbb4ea33731415fe09c62149a34f472719ac1d
        281ab031a8c9e5b593142eb4ec59a87faae8676a
        47807ce381acc34a7ffee2b42e35e96c0f322e52

    that changed the GART VM start offset.

    It fixed some machines, but seems to continually interact badly with some X versions.

(it's possible that my problems had another cause and the timing correlation was just coincidence). Disabling DRI stops the hangs for me.

I don't think DRI is supported on my radeon, xdriinfo returns the following;

Xlib: extension "XFree86-DRI" missing on display ":0.0".
Screen 0: not direct rendering capable.

what i did notice is that the hangs almost always happen when i am away from the machine. I only had one hang when i was actually using the machine. Maybe some power management causes the hang ?

I too have a hard machine lockup using the "ati" driver with DRI enabled. The machine will lock up hard, usually between 2 minutes and 15 minutes after X is started.
It has locked up in the middle of my typing so I don't think it is a power management problem. There are no messages in any log I can find. Disabling DRI makes the problem go away, but as a previous poster noted, the display is very slow now. This behavior was seen on a clean install of FC5T2 with 'yum update' to 1/17. So xorg-x11-drv-ati is at version 6.5.7.2-1. The chip is reported as "ATI Technologies Inc M10 NT [FireGL Mobility T2] rev 128, Mem@ 0xe0000000/27, 0xc0100000/16, I/O @ 0x3000/0"

(In reply to comment #2)
> I don't think DRI is supported on my radeon, xdriinfo returns the following;

Correct, DRI is not supported on R300 or newer Radeon chipsets in Fedora development. There is an experimental driver, but it is not built, and not ready for mainstream usage currently.

> Xlib: extension "XFree86-DRI" missing on display ":0.0".
> Screen 0: not direct rendering capable.

That is to be expected for this hardware.

> what i did notice is that the hangs almost always happen when i am away from the
> machine. I only had one hang when i was actually using the machine. Maybe some
> powermanagement causes the hang ?

It is possible that the screensaver is kicking in while you're away, and that this is triggering a bug in the driver. A lot of the acceleration codepaths in the drivers are not used by normal software nowadays, and many of the accel primitives seem to only get used by various screen savers and other special-purpose applications, such as CAD software.

Try disabling 2D acceleration by using the following in the xorg.conf device section:

Option "noaccel"

That will be horribly slow, but it is useful for diagnosis of the problem. Indicate if this helps at all or not, you may need to use it this way for a few hours or even a day perhaps.
If it does seem to resolve the issue, then we can conclude that faulty acceleration is the likely cause, in which case the next step is to try the various "XaaNo" options from the xorg.conf manpage one at a time, or in combinations, to try to narrow down the problematic acceleration.

Using the above diagnostic tests, what results are you able to obtain?

ping

Sorry for the slow reply, i wanted to test it on another machine first, but I'm having problems installing the latest rawhide. The "noaccel" option seems to make it work, and yes that is as fast as my first 512k trident card :-) It could not have been the screen saver, because that wasn't turned on at all, and i also turned off all power savings in the bios.

I am now running only on my Xpress 200G (RS480) and that seems to work without a hang with the normal rawhide ati driver. I will try two other radeons, a PCIe one in this x86_64 machine, and a PCI one in a machine i am trying to get rawhide to install on.

Also i had some success with the patches from Benjamin from the xorg list, but the last one caused problems with VT switching. Just wanted to let you know i didn't forget about the bug, i've just been too busy with work to do more testing.

Created attachment 124401
DRI enabled and all XaaNo* options mentioned in xorg.conf
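
The one-at-a-time "XaaNo" testing suggested above can be sketched as a small Python helper that prints candidate Device sections to paste into xorg.conf. This is only an illustrative sketch: the identifier "Videocard0" is an assumed placeholder, the two option names are examples that should be checked against the xorg.conf manpage for the installed server, and the helper functions themselves are hypothetical.

```python
from itertools import combinations


def device_section(identifier, driver, options):
    """Render an xorg.conf Device section with the given boolean options."""
    lines = [
        'Section "Device"',
        f'    Identifier "{identifier}"',
        f'    Driver     "{driver}"',
    ]
    lines += [f'    Option     "{name}"' for name in options]
    lines.append("EndSection")
    return "\n".join(lines)


def diagnostic_trials(candidates, max_group=1):
    """Yield option groups to test: single options first, then larger groups."""
    for size in range(1, max_group + 1):
        for group in combinations(candidates, size):
            yield list(group)


# Example option names only; take the full list from the xorg.conf manpage.
candidates = ["XaaNoOffscreenPixmaps", "XaaNoPixmapCache"]

# "Videocard0" is an assumed identifier; use the one from your own xorg.conf.
for trial in diagnostic_trials(candidates, max_group=2):
    print(device_section("Videocard0", "ati", trial))
    print()
```

Each printed section corresponds to one test run: restart X with that section in place, use the machine for a few hours, and note whether the hang still occurs before moving on to the next group.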
I have this exact problem on an IBM T42p thinkpad with an ATI FireGL Mobility T2 (lspci shows: 01:00.0 VGA compatible controller: ATI Technologies Inc M10 NT [FireGL Mobility T2] (rev 80))

So this is what I have found while playing around with xorg.conf:

* If I have "Load dri" the machine will lock hard as soon as X is started; forget screen savers, I usually can't get past the gdm screen.
* If I comment out "Load dri" the machine is stable and the moving of windows is possible with no tearing
* If I have "Load dri" with "Option noaccel" in the device section the system is stable, but window response is very slow; moving a window results in lots of tearing
* If I have "Load dri" with every XaaNo* option, the system still locks hard (I also accidentally found out that "Load dri" commented out with every XaaNo* results in very slow performance :)

I've attached a copy of my Xorg log from when I had dri enabled and every XaaNo option on, in case that is of any help.

Oh I forgot to mention that I have updated to rawhide as of 8:00am 2/8 and I noticed that mesa is mentioning that support for accelerated ATI drivers is enabled. I thought it odd that my log claims no acceleration is allowed for 9500/9700; it says that the mesa driver doesn't have support. Perhaps the message is outdated.

I have a FireMV 2400 PCI (Quad head) card that directly hangs the machine when used with the normal Rawhide x drivers. I have made a patched version of the driver that makes it possible to use the card. I haven't tried DRI, which seems to need another patch. Maybe someone could check whether the patched version works on other hardware too. The patched source RPM can be found here; http://www.erwinrol.com/downloads/software/xorg-x11-drv-ati-6.5.7.3-2.ER.1.src.rpm

Since the last RPM from rawhide has a higher version you might need to --force it.
(In reply to comment #9)
> Oh I forgot to mention that I have updated to rawhide as of 8:00am 2/8 and I
> noticed that mesa is mentioning that support for accelerated ATI drivers is
> enabled. I thought it odd that when I read my log it claims that no
> accelleration is allowed for 9500/9700 it says that the mesa driver doesn't have
> support. Perhaps the message is outdated.

Please read the mesa package changelog entry for 6.4.2-2, which references the r300 driver request bug, which also contains more information.

Benjamin Herrenschmidt checked his radeon mmap patch into xorg CVS. I have been running that patch for some weeks and things have been stable since. Would it be possible to create a new ati-drv rpm from that new CVS version?

Just to add a "me too"; with an up-to-date rawhide install on my Thinkpad Z60m with "ATI Radeon Mobility X600 (M24) 3150 (PCIE)", X locks the system hard on start. Commenting out the 'Load "dri"' line or adding the 'Option "noaccel"' line to the device section allows it to work.

(In reply to comment #13)
> Just to add a "me too"; with an up-to-date rawhide install on my Thinkpad Z60m
> with "ATI Radeon Mobility X600 (M24) 3150 (PCIE)", X locks the system hard on
> start. Commenting out the 'Load "dri"' line or adding the 'Option "noaccel"'
> line to the device section allows it to work.

Could you try the srpm i posted in comment #10 ? Still without the DRI, but accel might work with it.

I rebuilt the latest rawhide xorg-x11-drv-ati with the two patches. My system still locks on X start with DRI enabled in the config.

Adam Jackson is currently planning on doing an Xorg server 1.0.2 update release from the stable branch of CVS (not HEAD). If Ben's patch is considered stable enough for inclusion into the stable branch of Xorg server CVS, it may become part of the 1.0.2 release, or a release subsequent to that.
Once a stable upstream Xorg server update has been released by X.Org which includes Ben's patch, we will consider including it in Fedora development.

Perhaps a patch is needed for system-config-display that will disable DRI for R300+ cards for now. Otherwise, a bunch of systems will be left unusable after install.

(In reply to comment #17)
> Perhaps a patch is needed for system-config-display that will disable DRI for
> R300+ cards for now. Otherwise, a bunch of systems will be left unusable after
> install.

If the config tool does it, then two problems are created:

1) When the driver is considered stable and reliable, aka "fixed", the config tool will still disable it until the config tool is updated. Creates more work for everyone, and more frustration for the end user.

2) When the driver is fixed/stable and the user upgrades to the new driver, their old configuration will still continue to needlessly disable the particular feature.

Due to these types of problems, since around Red Hat Linux 7.2 or 7.3 we started patching the video drivers and/or Xserver directly to change any defaults as needed. This is the preferred method for any changes from upstream defaults, as the changes are then self-contained within the X server or driver that has the problem to begin with, and can be removed at the same time the problem is resolved, causing all systems to be in sync with the fixes at the same time, and not requiring the user to reconfigure or perform any other manual changes.

I might update radeon over the weekend to account for this.

None of this is really relevant to this particular bug report however, as the r300 DRI driver wasn't even enabled in rawhide until long after this report was filed. There is another bug tracking r300 DRI inclusion that you might want to CC yourself on, however I don't have the bug ID handy. HTH

It turns out that Option "nodri" does not seem to resolve DRI related hangs on R300, from other reports we're getting.
This means that commenting out 'Load "dri"' seems to be the only way right now to have a stable 2D-only setup if the r300 dri driver is supplied.

We may decide to disable r300 dri between now and FC5 rather than risk any further instability.

It isn't clear that any of this is related to the _original_ reporter's bug report here though.

The others who have CC'd on the bug report and added comments seem to be experiencing bug #177773 instead of this one, however it isn't 100% clear.

Ben H is working on other fixes related to the patch referred to above, I'm told, which seems to indicate the patches are in a state of flux. I'm leery of including them until they've had adequate testing in the wild, so we'll leave this one for a few more days to see if the patch situation changes upstream.

In comment #20, I meant bug #182196 in the 4th paragraph.

After re-reviewing all comments in this bug, I believe it is a straight dupe of bug #182196.

*** This bug has been marked as a duplicate of 182196 ***
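
For anyone applying the 'Load "dri"' workaround described in this report to several machines, the edit can be sketched in Python as below. This is a hedged illustration, not a supported tool: the function name and the sample Module section are made up for the example, and it only rewrites text in memory, so reading and writing the real xorg.conf (and keeping a backup) is left to the caller.

```python
import re

# Matches an active 'Load "dri"' line, capturing its leading indentation.
LOAD_DRI = re.compile(r'^(\s*)Load\s+"dri"\s*$', re.IGNORECASE)


def disable_dri(conf_text):
    """Return conf_text with any active 'Load "dri"' line commented out."""
    out = []
    for line in conf_text.splitlines():
        match = LOAD_DRI.match(line)
        if match:
            # Keep the indentation and prefix the directive with '#'.
            out.append(match.group(1) + "#" + line.lstrip())
        else:
            out.append(line)
    return "\n".join(out)


# Hypothetical Module section, used only to demonstrate the function.
sample = '''Section "Module"
    Load "dbe"
    Load "dri"
    Load "glx"
EndSection'''

print(disable_dri(sample))
```

Lines that are already commented out are left alone, so running the function twice gives the same result as running it once.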