Bug 170008

Summary: X Hangs on Radeon 7000 after RHEL4U2 Update
Product: Red Hat Enterprise Linux 4
Reporter: Thomas J. Baker <tjb>
Component: xorg-x11
Assignee: X/OpenGL Maintenance List <xgl-maint>
Status: CLOSED ERRATA
QA Contact:
Severity: medium
Docs Contact:
Priority: medium
Version: 4.3
CC: combslm, dely.l.sy, heinlein, hgarcia, idr, itsupport, link, milan.kerslager, mmatsuya, rkhadgar, sfolkwil, shelly, tao, t.h.amundsen, tkincaid, wwlinuxengineering, xgl-maint
Target Milestone: ---
Keywords: Reopened
Target Release: ---
Hardware: i686
OS: Linux
Whiteboard:
Fixed In Version: RHBA-2006-0072
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2007-04-25 15:11:38 UTC
Type: ---
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On:
Bug Blocks: 168430

Description Thomas J. Baker 2005-10-06 13:42:59 UTC
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.7.12) Gecko/20050922 Fedora/1.0.7-1.1.fc4 Firefox/1.0.7

Description of problem:
After updating to RHEL4U2 and rebooting, X hangs with no display while pegging one CPU. I've tried system-config-display --reconfig, but the config file it creates is essentially the same as before.

The system is a Dell PowerEdge 2800 with onboard ATI video:

10:0d.0 VGA compatible controller: ATI Technologies Inc Radeon RV100 QY [Radeon 7000/VE] (prog-if 00 [VGA])
        Subsystem: Dell: Unknown device 016e
        Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop+ ParErr- Stepping+ SERR+ FastB2B-
        Status: Cap+ 66Mhz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR-
        Latency: 32 (2000ns min), Cache Line Size 10
        Interrupt: pin A routed to IRQ 185
        Region 0: Memory at d0000000 (32-bit, prefetchable) [size=128M]
        Region 1: I/O ports at 9c00 [size=256]
        Region 2: Memory at df2f0000 (32-bit, non-prefetchable) [size=64K]
        Capabilities: [50] Power Management version 2
                Flags: PMEClk- DSI- D1+ D2+ AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
                Status: D0 PME-Enable- DSel=0 DScale=0 PME-


Version-Release number of selected component (if applicable):
xorg-x11-6.8.2-1.EL.13.20

How reproducible:
Always

Steps to Reproduce:
1. Update from RHEL4U1 to RHEL4U2
2. reboot

Actual Results:  X hangs at startup.

Additional info:

I've switched to the vesa driver for now and that at least gives me 800x600 video.

Comment 1 Laramie Combs 2005-10-07 19:03:13 UTC
I have also experienced this problem on a Dell PowerEdge 1855.  After applying
Update 2 to RHEL4, I booted into a frozen graphical login.  I have not tried
the vesa driver yet, but will post the results once I do, along with the
video card information.

Comment 2 Laramie Combs 2005-10-07 19:36:18 UTC
I just tried switching my video driver to be the vesa driver instead of the
radeon driver, and I am able to boot into the graphical login.

Here is the information on the video card:

06:0d.0 VGA compatible controller: ATI Technologies Inc Radeon RV100 QY [Radeon
7000/VE]

I am anxious to get this bug resolved, since we have 10 of these identical
machines, which I assume will break the next time they reboot.

Comment 3 Milan Kerslager 2005-10-11 17:05:37 UTC
I had to comment out 'Load "dri"' for my ATI card after U2 as well (in
/etc/X11/xorg.conf). I have this card (on an ASUS K8V-X motherboard):

ATI Technologies Inc RV280 [Radeon 9200 PRO] (rev 01)

Comment 4 Thomas J. Baker 2005-10-11 17:47:25 UTC
Disabling DRI and running the radeon driver works better than using the vesa
driver. I get 1024x768 again.

Comment 5 Mike A. Harris 2005-10-12 00:14:32 UTC
Thanks for the report.

The driver should be disabling DRI by default on Radeon 7000 hardware,
as it is known to be unstable currently.  I've reviewed the driver code,
and there is a slight flaw in the logic that is supposed to detect
Radeon 7000 and disable DRI.

The recommended short-term workaround for this problem is to comment out
the following line in xorg.conf, and restart the X server:

    Load "dri"
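
For anyone who needs to apply this workaround on several machines, the edit can be sketched in Python. This is only an illustration, not part of any Red Hat tooling: the function name comment_out_load_dri is hypothetical, and it assumes the line is written as Load "dri" (any spacing, any letter case), as in the xorg.conf excerpts in this report.

```python
import re

# Matches an active (not yet commented) Load "dri" line.
# Group 1 is the leading indentation, group 2 is the statement itself.
_LOAD_DRI = re.compile(r'^(\s*)(Load\s+"dri"\s*)$', re.IGNORECASE)


def comment_out_load_dri(conf_text: str) -> str:
    """Return conf_text with every active Load "dri" line commented out.

    Hypothetical helper for illustration; lines that are already
    commented are left untouched, so the function is idempotent.
    """
    out = []
    for line in conf_text.splitlines():
        m = _LOAD_DRI.match(line)
        out.append(f"{m.group(1)}#{m.group(2)}" if m else line)
    # Preserve a trailing newline if the input had one.
    return "\n".join(out) + ("\n" if conf_text.endswith("\n") else "")
```

In practice one would read /etc/X11/xorg.conf, keep a backup copy, write the returned text back, and then restart the X server as described above.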

I have added this issue to our engineering queue, so the issue should be
addressed in a future RHEL update.

Thanks again for reporting the problem to us.



Comment 7 Aleksandar Milivojevic 2005-10-14 05:41:40 UTC
Actually, it used to be unstable only on some versions of the Radeon 7000, not
all of them.  At least that was the case with the original RHEL4 and U1.  My
Radeon 7000 (the PCI version, not AGP) has been rock solid so far with DRI
enabled (I had to enable it manually; it was disabled by default during
installation).  My other computer (still at U1 level; I have not upgraded it
to U2 yet) is running a nice 3D screen saver on a Radeon 7000 as I type this.
It makes use of DRI: I remember the same screen saver being much slower until
I enabled DRI.

If it stops working after the update to U2 (as it seems to have for some
people), something additional got broken between U1 and U2.

Comment 8 Aleksandar Milivojevic 2005-10-14 16:02:52 UTC
Addendum.

I also have one Radeon 7000 AGP in one of my development PCs at work.

First, I must say that this particular PC at work is a CentOS box, but it
should be built from the same source as RHEL, unless the CentOS folks disabled
it after the original U2 was released and the problem was discovered.  Feel
free to ignore this entire comment on that basis if you like.

lspci lists the card as:

01:00.0 VGA compatible controller: ATI Technologies Inc Radeon RV100 QY [Radeon
7000/VE]

I compared the Xorg.0.log files generated before and after the update to U2.

Before the update to U2, DRI worked just fine and the radeon kernel driver was
loaded and used by the X server (and it was stable for me, just like the
Radeon 7000 PCI I have at home).

# grep radeon Xorg.0.log.backup
(II) LoadModule: "radeon"
(II) Loading /usr/X11R6/lib/modules/drivers/radeon_drv.o
(II) Module radeon: vendor="X.Org Foundation"
(II) Loading sub module "radeon"
(II) LoadModule: "radeon"
(II) Reloading /usr/X11R6/lib/modules/drivers/radeon_drv.o
(II) RADEON(0): [drm] loaded kernel module for "radeon" driver
(II) RADEON(0): [drm] created "radeon" driver at busid "pci:0000:01:00.0"

Also, there's a corresponding line in the /var/log/messages file:

kernel: [drm] Initialized radeon 1.11.0 20020828 on minor 0:

After the update to U2, the X server no longer loads the radeon kernel (DRM)
driver; the [drm] lines are gone from the log.  This is with DRI enabled in
/etc/X11/xorg.conf.

# grep radeon Xorg.0.log
(II) LoadModule: "radeon"
(II) Loading /usr/X11R6/lib/modules/drivers/radeon_drv.o
(II) Module radeon: vendor="X.Org Foundation"
(II) Loading sub module "radeon"
(II) LoadModule: "radeon"
(II) Reloading /usr/X11R6/lib/modules/drivers/radeon_drv.o

Looking at the /var/log/messages file also confirms that the kernel driver was
not loaded.

It seems I no longer have a "try and see if it works for me" option.
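
The before/after comparison above comes down to whether the X server log contains the "[drm] loaded kernel module" line. A minimal Python sketch of that check follows; the function name drm_initialized is hypothetical, and the marker string is taken from the log excerpt quoted in this comment rather than from any driver documentation.

```python
def drm_initialized(xorg_log_text: str) -> bool:
    """Return True if the log shows the DRM kernel module being loaded.

    Hypothetical helper: it only looks for the marker text seen in the
    pre-U2 log excerpt in this report.
    """
    marker = "[drm] loaded kernel module"
    return any(marker in line for line in xorg_log_text.splitlines())


# Excerpts copied from the two logs quoted in this comment.
before_u2 = (
    '(II) LoadModule: "radeon"\n'
    '(II) RADEON(0): [drm] loaded kernel module for "radeon" driver\n'
)
after_u2 = (
    '(II) LoadModule: "radeon"\n'
    "(II) Reloading /usr/X11R6/lib/modules/drivers/radeon_drv.o\n"
)

print(drm_initialized(before_u2), drm_initialized(after_u2))
```

To run it against real files, pass in the contents of Xorg.0.log.backup and Xorg.0.log instead of the inline excerpts.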

Comment 9 Mark Arrasmith 2005-10-17 17:55:26 UTC
(In reply to comment #5) 
> The driver should be disabling DRI by default on Radeon 7000 hardware, 
> as it is known to be unstable currently.  I've reviewed the driver code, 
> and there is a slight flaw in the logic that is supposed to detect 
> Radeon 7000 and disable DRI. 
 
I have a similar lockup problem with the Radeon 9000 AGP.  X loads and 3D
acceleration works just fine, but when I try to exit X, DRI takes down the
kernel.

However, if I comment out Load "dri" in xorg.conf, the problem goes away.
 
- mark 
 

Comment 10 Shelly Fangman 2005-10-18 19:13:02 UTC
I'm also seeing problems with a Dell 2850 using the Radeon 7000 after RHEL4U2,
although mine are a bit different.

I'm seeing this error when starting X:
Radeon(0) Failed to set up write-combining range (0xc8000000, 0x1000000)

The /var/log/messages file has this:
Kernel: mtrr: type mismatch for c8000000, 1000000 old: write-back new:
write-combining.

Moving from radeon to vesa in the /etc/X11/xorg.conf file doesn't help, nor does
commenting out dri.

Shelly

Comment 13 Mike A. Harris 2005-10-21 22:23:19 UTC
(In reply to comment #9)
> (In reply to comment #5) 
> > The driver should be disabling DRI by default on Radeon 7000 hardware, 
> > as it is known to be unstable currently.  I've reviewed the driver code, 
> > and there is a slight flaw in the logic that is supposed to detect 
> > Radeon 7000 and disable DRI. 
>  
> I have a similar locking problem for the Radeon 9000 AGP.  X windows will load 
> and 3D acceleration works just fine.  But when I try to close out X then dri 
> will take out the kernel. 

Similar problem perhaps, but probably unrelated.  There are other DRI-related
problems on particular card/motherboard/configuration combinations that go
away with DRI disabled, but they are not all the same problem.

If you have a Radeon 9000 problem, your best bet is to file a bug report in
the X.Org bugzilla, or to escalate an official Red Hat Enterprise Linux support
request at http://www.redhat.com/apps/support or by calling 1-888-REDHAT1.

This bug is tracking a known Radeon 7000 specific problem.


Comment 14 Mike A. Harris 2005-10-21 22:26:12 UTC
(In reply to comment #10)
> I'm also seeing problems with a Dell 2850 using the Radeon 7000 after RHEL4U2,
> although mine are a bit different.
> 
> I'm seeing this error when starting X:
> Radeon(0) Failed to set up write-combining range (0xc8000000, 0x1000000)
> 
> The /var/log/messages file has this:
> Kernel: mtrr: type mismatch for c8000000, 1000000 old: write-back new:
> write-combining.

MTRR warnings are unrelated to your problem and are not errors.  Your problem
does not sound like it is related to the issue initially reported here.
My recommendation is to file a request with Red Hat at
http://www.redhat.com/apps/support for official support for the issue.


Comment 15 Don 2005-10-26 21:36:37 UTC
I've got an IA64 HP rpx2600 server with the same issue as described here:
X hangs at startup after the RHEL4U2 update. This server has an ATI Radeon
7000 in it and did not have this problem before.

I just want to make sure that any corrections are also made for the IA64
architecture, so that it is not missed by this fix.

Comment 16 Mike A. Harris 2005-10-26 22:21:46 UTC
*** Bug 171700 has been marked as a duplicate of this bug. ***

Comment 22 Mike A. Harris 2005-11-29 00:55:48 UTC
xorg-x11-6.8.2-ati-radeon-7000-disable-dri.patch updated to fix the logic
inversion problem introduced in previous update.  This will be part of the
RHEL4 U3 update.

Setting status to "MODIFIED", please test and provide feedback.  Set status
to "ASSIGNED" if problem is still present after testing.

Comment 23 Mike A. Harris 2005-11-29 10:51:45 UTC
xorg-x11-6.8.2-1.EL.13.21 is now available for download and testing via
ftp at the following URL:  ftp://people.redhat.com/mharris/testing/4E

Please test this release as soon as possible, and provide feedback on the
results of testing in a status update.  If the problem is not resolved
after installing all of the rpm packages, and rebooting the system,
please set the bug back to "ASSIGNED" state.

Comment 25 Milan Kerslager 2005-11-29 11:38:55 UTC
As for my hardware (see comment #3), the hangs persist even though xorg has
been updated to your EL.13.21 version. I'm not able to reach the login screen
from gdm when DRI is enabled in the config file.

Comment 26 Mike A. Harris 2005-11-29 13:18:43 UTC
Please attach your config file and log file from the EL.13.21 build.

Comment 28 Mike A. Harris 2005-11-29 14:05:49 UTC
(In reply to comment #25)
> As for my hardware (see comment #3), the hangs persist even though xorg has
> been updated to your EL.13.21 version. I'm not able to reach the login screen
> from gdm when DRI is enabled in the config file.

Just reread your comment, and thought I'd clarify something in case there is
confusion...

DRI is broken for Radeon 7000.  Some people are able to get it to work on
some specific Radeon 7000 cards on some systems, but there is a very large
number of systems on which it is known not to work.  For that reason, we are
intentionally disabling DRI by default on Radeon 7000.  However, the patch
that went into the last update had a glitch in it which caused DRI to still
be enabled, and to hang systems with a Radeon 7000 present.

The "fix" for that problem, was to fix the patch to properly disable DRI
on Radeon 7000 by default.  Your claim in comment #25 is that you get
a hang when dri is "enabled in the config file".  That is totally expected,
as we are disabling it intentionally to avoid just that issue.  Please
/disable/ DRI in the config file, and re-test.

- Remove any lines from the config file that say:  Option "dri"
- Restart the X server, and DRI should be disabled, but only on Radeon 7000

If that is what happens, then this bug is now fixed.

If DRI is still enabled even without being forced on, then we have a
problem still.  One customer has confirmed the patch included in the
build you've tested, does properly disable DRI, so if you're seeing
something else, we'd like to determine why.

Thanks in advance.

Comment 30 Thomas J. Baker 2005-11-29 15:28:48 UTC
Mike, comment #28 seems contrary to #5. I thought that DRI should be disabled by
the driver for Radeon 7000 no matter what the config file says. With the current
driver (RHEL4U2), if I disable DRI in the config file everything is fine but I
didn't need to do this with the RHEL4U1 driver.

Comment 31 Paul Heinlein 2005-12-01 04:12:30 UTC
Regarding comment #23, the binary rpms xorg-*-6.8.2-1.EL.13.21 seem to work just
fine for me:

01:07.0 VGA compatible controller: ATI Technologies Inc Radeon RV100 QY [Radeon
7000/VE]

Don't know if it matters, but my Radeon sits on a plain PCI bus; the system
architecture is i386 (dual P-III).

Thank you, Mike!

Comment 33 Mike A. Harris 2005-12-01 06:35:24 UTC
(In reply to comment #30)
> Mike, comment #28 seems contrary to #5. I thought that DRI should be disabled by
> the driver for Radeon 7000 no matter what the config file says. With the current
> driver (RHEL4U2), if I disable DRI in the config file everything is fine but I
> didn't need to do this with the RHEL4U1 driver.

Ok, just to clarify things further, let's first clearly define the desired
behaviour of the patch:

1) If the config file does not contain:  Option "dri" or any variant, we
   want DRI to be disabled on Radeon 7000.  That is the desired "default".

2) If the user specifies Option "dri" in the config file, we want
   that to force DRI to be enabled for any radeon chip, including 7000.

3) If the user specifies Option "nodri" in the config file, we want
   that to force DRI to be disabled for any radeon chip, including 7000.

The current patch seems to handle all 3 cases right now I believe.  I'd
really like it if multiple users can test all 3 scenarios on a Radeon 7000
card, and on at least one other Radeon card that is not Radeon 7000, such
as a 9000 or something.

If we get all 3 scenarios tested on one of each card, with the desired
results, then I think we've licked this one.
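
The three cases above can be written down as a small truth function, which may help when recording test results. This is a sketch of the desired behaviour only, not the driver's actual code: the name dri_enabled and its parameters are hypothetical, and the default of "enabled" for chips other than the Radeon 7000 is an assumption taken from the testing instructions later in this report.

```python
from typing import Optional


def dri_enabled(is_radeon_7000: bool, dri_option: Optional[bool]) -> bool:
    """Desired DRI state for a given chip and xorg.conf setting.

    dri_option is None when the config file has no DRI option (case 1),
    True for Option "dri" (case 2), and False for Option "nodri" (case 3).
    """
    if dri_option is not None:
        # Cases 2 and 3: an explicit user setting always wins.
        return dri_option
    # Case 1: default is off on Radeon 7000, on elsewhere (assumed).
    return not is_radeon_7000


# Print the full matrix for a Radeon 7000 and for another Radeon chip.
for chip_is_7000 in (True, False):
    for option in (None, True, False):
        print(chip_is_7000, option, dri_enabled(chip_is_7000, option))
```

Comparing this matrix against what Xorg.0.log reports for each combination covers all three scenarios on both kinds of card.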

Thanks in advance everyone for testing.




(In reply to comment #31)
> Regarding comment #23, the binary rpms xorg-*-6.8.2-1.EL.13.21 seem to work just
> fine for me:
> 
> 01:07.0 VGA compatible controller: ATI Technologies Inc Radeon RV100 QY [Radeon
> 7000/VE]
> 
> Don't know if it matters, but my Radeon sits on a plain PCI bus; the system
> architecture is i386 (dual P-III).
> 
> Thank you, Mike!

No prob.  Can you test what happens when you specify Option "dri" in the
device section and restart X, and also "nodri" option?  That'd cover the
3 cases for us for the 7000.  If someone else can confirm the 3 cases
work for other Radeon cards as expected, we're good to go.

Thanks again!

Comment 34 Paul Heinlein 2005-12-01 16:45:07 UTC
(In reply to comment #33)
> Ok, just to clarify things further, let's first clearly define the desired
> behaviour of the patch:
> 
> 1) If the config file does not contain:  Option "dri" or any variant, we
>    want DRI to be disabled on Radeon 7000.  That is the desired "default".

This is true for my PCI VE/7000.

> 2) If the user specifies Option "dri" in the config file, we want
>    that to force DRI to be enabled for any radeon chip, including 7000.

This appears untrue for my card. xorg.conf shows

  Section "Module"
    Load  "dbe"
    Load  "extmod"
    Load  "fbdevhw"
    Load  "glx"
    Load  "record"
    Load  "freetype"
    Load  "type1"
    Load  "dri"
  EndSection
  [....]
  Section "Device"
    Identifier  "Videocard0"
    Driver      "radeon"
    VendorName  "Videocard vendor"
    BoardName   "ATI Radeon 7000"
    Option      "DRI" "On"
  EndSection

But Xorg.0.log still shows

  (WW) RADEON(0): Direct Rendering is currently disabled on Radeon
       VE/7000 hardware due to instability.
  [....]
  (II) RADEON(0): Direct rendering disabled

> 3) If the user specifies Option "nodri" in the config file, we want
>    that to force DRI to be disabled for any radeon chip, including 7000.

This is true for me.

In other words, the new xserver exhibits the desired no-dri behavior, but I
can't force it to misbehave by trying to force DRI on.
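
Since this thread mixes up Load "dri" in the Module section with Option "dri" in the Device section, a small Python sketch that reports both settings from an xorg.conf may be useful when posting test results. It is an illustration under stated assumptions: the function name dri_settings is hypothetical, the parsing is deliberately simplistic (it treats everything after a "#" as a comment and ignores SubSection nesting), and it is not how the X server itself reads the file.

```python
import re
from typing import Dict


def dri_settings(conf_text: str) -> Dict[str, object]:
    """Report DRI-related settings found in xorg.conf text.

    Returns a dict with:
      module_loads_dri   True if an active Load "dri" is in a Module section
      device_dri_option  (name, value) of a dri/nodri Option in a Device
                         section, or None if there is none
    """
    section = None
    found = {"module_loads_dri": False, "device_dri_option": None}
    for raw in conf_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        start = re.match(r'Section\s+"([^"]+)"', line, re.IGNORECASE)
        if start:
            section = start.group(1).lower()
        elif line.lower() == "endsection":
            section = None
        elif section == "module":
            if re.match(r'Load\s+"dri"$', line, re.IGNORECASE):
                found["module_loads_dri"] = True
        elif section == "device":
            opt = re.match(
                r'Option\s+"((?:no)?dri)"(?:\s+"([^"]*)")?$', line, re.IGNORECASE
            )
            if opt:
                found["device_dri_option"] = (opt.group(1), opt.group(2))
    return found
```

For the configuration quoted above, this would report that the Module section loads dri and that the Device section carries the option "DRI" with the value "On".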

Comment 35 Milan Kerslager 2005-12-01 23:05:38 UTC
In my case, using Load "dri" in the Module section leads to a hang. I have no
Option "DRI" "On" in the Device section.

In any case, for the solution in comment #33 to hold, system-config-display
should make sure this line is commented out when it builds /etc/X11/xorg.conf
for this card (a Radeon 9200 PRO in my case), since the behaviour now differs
from what it was before U2.

I'll test it tomorrow.

Comment 37 David Juran 2005-12-05 10:58:25 UTC
Mike, in response to comment 32: the patch in comment 19 was submitted by a
customer (IT 81910) and tested by him (as I don't have the necessary hardware).
Anyway, I believe it will remedy the problem seen in comment 34.

/David

Comment 39 Imed Chihi 2005-12-13 11:12:06 UTC
Customer feedback:

On RHEL 4 U2, using system-config-display to generate an Xorg configuration
results in an X hang. This did not cause a problem on RHEL 4 U1 on the same
i386 hardware. Commenting out the "Load dri" statement avoids the hang.

Log file says:
(WW) RADEON(0): Direct Rendering is disabled by default on Radeon VE/7000
        hardware due to instability, but has been forced on with
        "Option "dri" in xorg.conf.  You may experience instability.

 -Imed

Comment 40 Milan Kerslager 2005-12-13 13:57:13 UTC
Following up on comment #35: I agree with comment #39 and hope
system-config-display will be fixed before U3.

The bug #175618 has been opened to track this issue.

Comment 43 Larry Troan 2005-12-15 01:16:24 UTC
*** Bug 175795 has been marked as a duplicate of this bug. ***

Comment 45 Tom Kincaid 2005-12-16 17:42:11 UTC
Please be sure to file bugs against the appropriate release, RHEL Enterprise
Linux Beta. Update 3.


Comment 61 Tom Kincaid 2006-01-09 17:34:24 UTC
Added to the Must fix list, per discussion at the RHEL meeting 01/05/2006

Comment 62 Mike A. Harris 2006-01-12 00:11:06 UTC
xorg-x11-6.8.2-1.EL.13.25 is now available for download via ftp
at the following URL:  ftp://people.redhat.com/mharris/testing/4E

Everyone:  Please download and test these rpms as soon as possible.
Please test on both Radeon 7000 hardware, as well as any other
Radeon hardware that is available, which will help to maximize the
test coverage on as much hardware as possible.

For the testing:

1) Upgrade to the new rpms, and ensure your X server is configured to use
   the 'radeon' driver, by running "system-config-display --reconfig".

2) Perform a complete system reboot to ensure that the video hardware
   is completely reinitialized to factory power-on hardware settings.

If you've followed the above, your X server should be configured by
the config tool for DRI "auto" operation, i.e. the driver will determine
internally whether or not to enable DRI on the given hardware.  Please
indicate the particular Radeon card you are testing on, and examine the
X server log to confirm whether or not DRI was enabled.  It should
automatically default to disabled on Radeon 7000 hardware (more specifically
on all ATI Radeon RV100 based chipsets, which includes the Radeon 7000,
Radeon VE, and other RV100 variants).  This should occur for all bus types,
including both PCI and AGP cards.

For other Radeon chipsets on which DRI is supported, DRI should default to
being enabled.  If you're testing on Radeon 7200/8xxx/9000/9200 or other
boards, please indicate whether DRI was autodetected and enabled or disabled
by the driver.

If you have a system in which DRI did work on your Radeon 7000/VE hardware
in the past, for which you would like to override the automatic default
choice of DRI==disabled, insert into the Device section of xorg.conf:

    Option "dri"

That should force DRI operation on all Radeon hardware that the DRI codebase
supports, including Radeon 7000.  If DRI operation of the hardware is stable,
you should be able to continue using the driver with DRI force-enabled in
this manner.  For hardware on which DRI operation is unstable, however, it is
recommended not to specify the 'dri' option at all, as described above, and
just let the driver default to stable operation.

Additionally, you can test the 'nodri' option if desired, and indicate
whether it properly forces DRI to be disabled.  It should force
DRI off on all Radeon hardware.
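
To make it easier to report results consistently, the log check described above can be sketched in Python. The function name dri_status is hypothetical. The "Direct rendering disabled" string appears in a log excerpt earlier in this report; the matching "Direct rendering enabled" string is an assumption about what the driver prints when DRI comes up, so adjust it if your log differs.

```python
def dri_status(xorg_log_text: str) -> str:
    """Classify an X server log as 'enabled', 'disabled', or 'unknown'.

    Only the final status line is matched; warning lines such as
    "Direct Rendering is currently disabled on Radeon VE/7000 hardware"
    do not count on their own.
    """
    for line in xorg_log_text.splitlines():
        lowered = line.lower()
        if "direct rendering enabled" in lowered:  # assumed message text
            return "enabled"
        if "direct rendering disabled" in lowered:
            return "disabled"
    return "unknown"
```

A typical use would be to read /var/log/Xorg.0.log after the reboot and include the returned status, together with the lspci line for the card, in the bug comment.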

Thanks in advance for your testing.

Setting status to "NEEDINFO_REPORTER"


Comment 69 Mike A. Harris 2006-01-13 10:54:44 UTC
Internal testing has shown the problem to be resolved with the latest patch
for all cases tested.  Additionally, there have been no regression reports
from people able to reproduce the original issue.

The issue appears to now be resolved.  Setting status to MODIFIED.

Comment 75 Red Hat Bugzilla 2006-03-07 18:18:39 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2006-0072.html


Comment 90 Alan Matsuoka 2007-04-25 15:00:26 UTC
This should be closed.

Comment 91 Alan Matsuoka 2007-04-25 15:11:38 UTC
oops. should be closed.