Description of problem:

I happen to have an ATI Technologies Inc R300 AD [Radeon 9500 Pro] card in my test machine. It works well and reliably as long as I put Option "AGPMode" "8" in the "Device" section of my /etc/X11/xorg.conf. With this option skipped, an attempt to start X immediately leads to a machine with a blank screen and a locked keyboard, and even the SysRq key stops working very quickly, even if initially it is possible to use it a bit "in blind". Checking the logs afterwards one can find there:

[drm] Initialized drm 1.0.1 20051102
ACPI: PCI Interrupt 0000:01:00.0[A] -> GSI 16 (level, low) -> IRQ 185
[drm] Initialized radeon 1.24.0 20060225 on minor 0
agpgart: Found an AGP 3.5 compliant device at 0000:00:00.0.
agpgart: Badness. Don't know which AGP mode to set. [bridge_agpstat:1f000a0a vga_agpstat:ff00021b fell back to:- bridge_agpstat:1f000208 vga_agpstat:ff00021b]
agpgart: Bridge couldn't do AGP x4.
agpgart: Putting AGP V3 device at 0000:00:00.0 into 0x mode
agpgart: Putting AGP V3 device at 0000:01:00.0 into 0x mode
[drm] Setting GART location based on new memory map
[drm] Loading R300 Microcode
[drm] writeback test failed

and we are ready to pull a plug. With "AGPMode" explicitly set the picture is different:

[drm] Initialized drm 1.0.1 20051102
ACPI: PCI Interrupt 0000:01:00.0[A] -> GSI 16 (level, low) -> IRQ 201
[drm] Initialized radeon 1.24.0 20060225 on minor 0
agpgart: Found an AGP 3.5 compliant device at 0000:00:00.0.
agpgart: Xorg tried to set rate=x12. Setting to AGP3 x8 mode.
agpgart: Putting AGP V3 device at 0000:00:00.0 into 8x mode
agpgart: Putting AGP V3 device at 0000:01:00.0 into 8x mode
[drm] Setting GART location based on new memory map
[drm] Loading R300 Microcode
[drm] writeback test succeeded in 1 usecs

and I have a picture, no complaints, no lockups or anything of that sort.
As far as I can tell from what I managed to collect with SysRq, with "AGPMode" not given we are sitting here:

Call Trace:
<ffffffff885529df>{:radeon:radeon_do_wait_for_idle+113}
<ffffffff8855304f>{:radeon:radeon_cp_idle+0}
<ffffffff88534d74>{:drm:drm_ioctl+371}
<ffffffff80245975>{do_ioctl+85}
<ffffffff8023260f>{vfs_ioctl+598}
<ffffffff8025047b>{sys_ioctl+89}
<ffffffff80261bc1>{tracesys+209}

There does not seem to be anything special in the list of blocking locks:

S startx: 2521 [ffff81001fe9d0c0, 125] (not blocked on mutex)
S xinit: 2537 [ffff810018580780, 123] (not blocked on mutex)
R X: 2538 [ffff810018588080, 124] (not blocked on mutex)

We are simply not going anywhere.

Version-Release number of selected component (if applicable):
xorg-x11-drv-ati-6.6.0-3, but it has really been like that "all the time".

How reproducible:
Always, unfortunately.

Additional info:
AFAIK I am not the only one with similar observations, although possibly the value for "AGPMode" is not always 8. I know that 4 did not work in my case. 12 is also for me a "killer value". It seems to be a different problem than bug #182196, although I am not sure.
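For anyone comparing their own logs against the two excerpts in this report, the telling lines are the AGP rate that agpgart programs and the result of the DRM writeback test. The following is a minimal Python sketch that pulls just those two facts out of a dmesg dump. The function name `summarize` and the regular expressions are illustrative and were written against the message wording quoted above; other kernel versions may word these messages differently.

```python
import re

# Patterns written against the messages quoted in this report (an assumption
# about wording; other kernels may differ).
MODE_RE = re.compile(r"agpgart: Putting AGP V3 device at (\S+) into (\d+)x mode")
WRITEBACK_RE = re.compile(r"\[drm\] writeback test (failed|succeeded)")


def summarize(log_text):
    """Return ({pci_device: agp_rate}, writeback_result_or_None)."""
    modes = {dev: int(rate) for dev, rate in MODE_RE.findall(log_text)}
    match = WRITEBACK_RE.search(log_text)
    return modes, (match.group(1) if match else None)


FAILING = """\
agpgart: Bridge couldn't do AGP x4.
agpgart: Putting AGP V3 device at 0000:00:00.0 into 0x mode
agpgart: Putting AGP V3 device at 0000:01:00.0 into 0x mode
[drm] writeback test failed
"""

WORKING = """\
agpgart: Putting AGP V3 device at 0000:00:00.0 into 8x mode
agpgart: Putting AGP V3 device at 0000:01:00.0 into 8x mode
[drm] writeback test succeeded in 1 usecs
"""

if __name__ == "__main__":
    print(summarize(FAILING))
    print(summarize(WORKING))
```

A rate of 0 on both the bridge and the card, together with a failed writeback test, is the signature of the hang described here.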
From the above log snippets, it appears this is an agpgart issue and not an X server issue. In general you should _not_ use the AgpMode setting in the X server config file, and instead set the AGP rate in your BIOS and the X server should "just work" with it.

> my case. 12 is also for me a "killer value".

Because it is invalid. Valid modes are 1/2/4/8. You can only use the modes the hardware was designed for, which is the overlap of what the video card can do, combined with what the motherboard can do. In some cases some modes might not work with certain hardware combinations due to hardware flaws in the motherboard chipset, video card, or both.

Reassigning this to the kernel for now, although it isn't clear if this is really a kernel bug or not. If the kernel folk think it is not a kernel bug, then you should probably file this directly in Xorg bugzilla.
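The "overlap" rule can be illustrated with the two agpstat values from the log in the original report. The sketch below is Python and rests on an assumption that is not stated anywhere in this thread: that the low bits of the AGP status register follow the AGP 3.0 layout as mirrored in the Linux agp.h header (bit 3 means AGP 3.0 signalling; in that mode bit 0 is x4 and bit 1 is x8; otherwise bits 0, 1 and 2 are x1, x2 and x4). The helper names are hypothetical. Treat it as a reading aid, not as what agpgart actually executes.

```python
# Assumed bit layout (AGP 3.0 status register, as in Linux agp.h); not taken
# from this bug report.
AGPSTAT_MODE_3_0 = 1 << 3

AGP3_RATES = {0x1: 4, 0x2: 8}          # bit -> rate in AGP 3.0 mode
AGP2_RATES = {0x1: 1, 0x2: 2, 0x4: 4}  # bit -> rate in AGP 2.0 mode


def supported_rates(agpstat):
    """Decode the set of transfer rates advertised in an agpstat value."""
    table = AGP3_RATES if agpstat & AGPSTAT_MODE_3_0 else AGP2_RATES
    return {rate for bit, rate in table.items() if agpstat & bit}


def common_rates(bridge_agpstat, card_agpstat):
    """Rates usable by both ends: the overlap described above."""
    return supported_rates(bridge_agpstat) & supported_rates(card_agpstat)


if __name__ == "__main__":
    bridge, card = 0x1F000A0A, 0xFF00021B  # values from the reported log
    print(sorted(supported_rates(bridge)), sorted(supported_rates(card)))
    print(sorted(common_rates(bridge, card)))
    # The "fell back to" bridge value from the same log line:
    print(sorted(common_rates(0x1F000208, card)))
```

Under that assumed layout, the bridge value advertises only x8 while the card advertises x4 and x8, and the "fell back to" bridge value advertises no rate at all. That would be consistent with the "Bridge couldn't do AGP x4" and "0x mode" messages, but nobody in this thread has confirmed that reading.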
(In reply to comment #1)
> From the above log snippets, it appears this is an agpgart issue and not
> an X server issue. In general you should _not_ use the AgpMode setting
> in the X server config file, and instead set the AGP rate in your BIOS
> and the X server should "just work" with it.
>

That's an interesting statement. I set the AGPMode to 8 and my card started working as well. At least until I tried 1.2244_FC6.

I'll check to see if the BIOS setting makes any difference.

Ian
(In reply to comment #2)
> (In reply to comment #1)
> > From the above log snippets, it appears this is an agpgart issue and not
> > an X server issue. In general you should _not_ use the AgpMode setting
> > in the X server config file, and instead set the AGP rate in your BIOS
> > and the X server should "just work" with it.
> >
>
> That's an interesting statement.

Yeah, to further clarify... What I mean by "in general" is that it is *supposed* to autodetect the AGP capabilities of the hardware and automatically set the highest AGP rate that the hardware can do, if it isn't blacklisted for some reason. When the hardware isn't broken, and the kernel and X are working right, this generally happens. When the hardware is broken or having a bad day, or if the kernel AGP drivers aren't up to scratch, then it doesn't always work that way. Dave Jones can probably provide more accurate, up-to-date info though.

> I set the AGPMode to 8 and my card started working as well. At least
> until I tried 1.2244_FC6.

That suggests two things to me:

1) The hardware is capable of AGP 8x, but the kernel is not setting that by default for whatever reason. Possibly a kernel bug, or just bad assumptions or something like that.

2) The hardware combination might not work at AGP 4x due to quirks, or perhaps the kernel AGP support has a glitch at 4x, and so 4x hangs.

To be clear though, this is just an educated hypothesis of what is happening, not a conclusive diagnosis.

> I'll check to see if the BIOS setting makes any difference.

Setting the AGP mode in the BIOS usually makes things work right; unfortunately, some systems do not have an AGP mode setting in the CMOS setup.

It is probably a good idea to attach /var/log/messages from a problematic startup, as well as from a working one, for comparison too.

HTH
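One way to do the suggested comparison of a problematic and a working startup is to keep only the agpgart and DRM lines from each log and diff them. The Python sketch below uses the standard difflib module. The keyword filter and the function names are illustrative choices, not something prescribed in this thread.

```python
import difflib

# Only lines mentioning these subsystems are compared (an illustrative filter).
KEYWORDS = ("agpgart:", "[drm]")


def relevant(lines):
    """Keep the AGP/DRM lines, stripped of surrounding whitespace."""
    return [ln.strip() for ln in lines if any(k in ln for k in KEYWORDS)]


def compare(failing_lines, working_lines):
    """Return only the added/removed lines between the two filtered logs."""
    diff = difflib.unified_diff(
        relevant(failing_lines), relevant(working_lines),
        fromfile="failing", tofile="working", lineterm="", n=0,
    )
    return [
        ln for ln in diff
        if ln.startswith(("+", "-")) and not ln.startswith(("+++", "---"))
    ]


if __name__ == "__main__":
    failing = [
        "ACPI: PCI Interrupt 0000:01:00.0[A] -> GSI 16 (level, low) -> IRQ 185",
        "agpgart: Putting AGP V3 device at 0000:01:00.0 into 0x mode",
        "[drm] Loading R300 Microcode",
        "[drm] writeback test failed",
    ]
    working = [
        "ACPI: PCI Interrupt 0000:01:00.0[A] -> GSI 16 (level, low) -> IRQ 201",
        "agpgart: Putting AGP V3 device at 0000:01:00.0 into 8x mode",
        "[drm] Loading R300 Microcode",
        "[drm] writeback test succeeded in 1 usecs",
    ]
    for line in compare(failing, working):
        print(line)
```

Filtering first keeps unrelated noise, such as the differing IRQ numbers, out of the comparison.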
> In general you should _not_ use the AgpMode setting
> in the X server config file, and instead set the AGP rate
> in your BIOS and the X server should "just work" with it.

That indeed would be nice, as long as you have such settings in the BIOS, which is very far from certain, and you know what the correct value is. Moreover, I happen to have such a BIOS knob, and it has said "8x" for a long time. I just checked to be sure. The catch is that it does not help at all.

As for 12 being invalid: I did that only out of sheer curiosity, to see what would happen, because I found "Xorg tried to set rate=x12" in the logs. Well, if it tried, then let it and see ... Still, I wonder why Xorg did that, or at least why something claims that it did, if it is known in advance that this is an invalid rate.

AFAICT there is really nothing on the subject in /var/log/messages beyond those two snippets I quoted in my original report.

BTW - in the past I tried to turn off drm. No changes.
I think I've fixed this in the current rawhide kernel.
> I think I've fixed this in the current rawhide kernel.

That somewhat depends on what "current" means. If you are thinking about 2.6.17-1.2630.fc6, which is now the latest available, then nothing has really changed: "agpgart: Badness....", and so on, is still there, and a display without an explicit "AGPMode" configuration is dead. OTOH if you are talking about something newer, then I will see what happens when it shows up on the servers.
If the fix mentioned in comment #5 was supposed to be present in 2.6.18-1.2679.fc6 then, I am afraid, I have bad news. When trying this kernel with the '"AGPMode" "8"' line missing, after an attempt to start X I see in dmesg (after logging in remotely):

[drm] Initialized drm 1.0.1 20051102
ACPI: PCI Interrupt 0000:01:00.0[A] -> GSI 16 (level, low) -> IRQ 201
[drm] Initialized radeon 1.25.0 20060524 on minor 0
agpgart: Found an AGP 3.5 compliant device at 0000:00:00.0.
agpgart: Badness. Don't know which AGP mode to set. [bridge_agpstat:1f000a0a vga_agpstat:ff00021b fell back to:- bridge_agpstat:1f000208 vga_agpstat:ff00021b]
agpgart: Bridge couldn't do AGP x4.
agpgart: Putting AGP V3 device at 0000:00:00.0 into 0x mode
agpgart: Putting AGP V3 device at 0000:01:00.0 into 0x mode
[drm] Setting GART location based on new memory map
[drm] Loading R300 Microcode
[drm] writeback test failed

Process X apparently loops, with the following line in 'top':

2524 root 24 -1 237m 5800 4008 R 99.7 1.2 2:25.32 X

and there is no apparent way to kill it short of a reboot. Black screen and no response from the keyboard, as reported previously.

With an explicit "AGPMode" set I now see this:

[drm] Initialized drm 1.0.1 20051102
ACPI: PCI Interrupt 0000:01:00.0[A] -> GSI 16 (level, low) -> IRQ 201
[drm] Initialized radeon 1.25.0 20060524 on minor 0
agpgart: Found an AGP 3.5 compliant device at 0000:00:00.0.
agpgart: Putting AGP V3 device at 0000:00:00.0 into 8x mode
agpgart: Putting AGP V3 device at 0000:01:00.0 into 8x mode
[drm] Setting GART location based on new memory map
[drm] Loading R300 Microcode
[drm] writeback test succeeded in 1 usecs

and X starts and works fine.

BTW - booting with 2.6.17-1.2647.fc6 does not change anything.
2679 was the last kernel that didn't have the fix :) The AGP 'Badness' messages you quote shouldn't be in the fixed kernel.
> 2679 was the last kernel that didn't have the fix :)

Hm, talk about "current" from comment #5. With 2.6.18-1.2699.fc6, and AGPMode not specified explicitly, I indeed got:

agpgart: Found an AGP 3.5 compliant device at 0000:00:00.0.
agpgart: Putting AGP V3 device at 0000:00:00.0 into 4x mode
agpgart: Putting AGP V3 device at 0000:01:00.0 into 4x mode
[drm] Setting GART location based on new memory map
[drm] Loading R300 Microcode
[drm] writeback test succeeded in 1 usecs

Kind of ironic in the face of the previous claim "agpgart: Bridge couldn't do AGP x4". OTOH the whole setup still works fine in 8x mode if asked.
Hmm, those messages should only be displayed if you explicitly asked for x4. The behaviour should be.. if you ask for nothing, it'll try to do x8, and if the hardware supports it, you'll get it. If the hardware doesn't support it, you'll fall back to x4 mode with a warning (which you don't get). You seem to have silently fallen back to x4 mode, which shouldn't be possible. I'll look at the code some more.
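The intended behaviour described in this comment can be written down as a small decision function. The Python sketch below only restates that description; it is not the kernel's agpgart code, and the function name, the warning text and the choice to raise an error for an unsupported explicit request are all hypothetical.

```python
def pick_agp3_rate(requested, bridge_rates, card_rates):
    """Return (rate, warning) following the policy described in the comment.

    requested    -- 4, 8, or None when the user asked for nothing
    bridge_rates -- rates the bridge supports, e.g. {4, 8}
    card_rates   -- rates the video card supports
    """
    common = set(bridge_rates) & set(card_rates)
    wanted = 8 if requested is None else requested  # nothing asked: try x8

    if wanted in common:
        return wanted, None
    if wanted == 8 and 4 in common:
        # x8 is not available on both ends: fall back to x4 and say so.
        return 4, "x8 not supported by both bridge and card, using x4"
    # Hypothetical handling: the comment does not say what should happen here.
    raise ValueError(f"AGP x{wanted} is not supported by both bridge and card")


if __name__ == "__main__":
    print(pick_agp3_rate(None, {4, 8}, {4, 8}))
    print(pick_agp3_rate(None, {4}, {4, 8}))
```

Under this policy a result of x4 should always come with a warning unless x4 was requested explicitly, which is why a silent x4 looked wrong at first.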
ah, actually this makes sense. X is being smart and realising that you didn't specify '8', it's falling back to the only other thing it can do in this situation - x4. So this sounds like this is all fixed up ?
> ah, actually this makes sense. X is being smart ....

Frankly, I am not entirely sure what X behaviour is really expected here. Why is x4 the default here and not x8, for example? In any case, it works without a need to guess how to fix what in practice looks like a crashed machine (even if it did not entirely crash :-).
X plays it safe and goes with the more conservative setting because I believe that there are some cards that can do x8, but in certain boards, they exhibit problems. So if you wanted x8 on such a system, you'd need to specify it explicitly.