Description of problem:
The default BIOS on the Tyan 2885 motherboard does not set up the MTRRs properly, resulting in 1x AGP performance on high-end graphics cards (instead of 8x) when the system has 8 GB+ of memory. Tyan has rewritten the BIOS with AMD's help (version 2885101k) and nVidia has rewritten the graphics driver (version 10-5331). This BIOS uses a 4 GB MTRR for memory from 4 GB to 8 GB, and the mtrr.c in the 2.4.21-6EL kernel does not handle this correctly. A corrected mtrr.c, BIOS, and nVidia driver have been sent to Jim Paradis. With this combination of updates, the system will boot and run with 8x AGP performance. However, it will hang when loading any window manager more complicated than the 'failsafe' graphics manager.

Version-Release number of selected component (if applicable):
Tyan BIOS 2885101k
nVidia driver 10-5331

How reproducible:

Steps to Reproduce:
1. Start with a Tyan 2885 k8W system with 2x Opteron 24x processors and 4x1GB memory sticks.
2. Update the BIOS to 2885101k.
3. Install RHEL3.
4. Apply AGP detection patch. Replace /usr/src/linux-2.4/arch/x86_64/kernel/mtrr.c with revised mtrr.c.
5. Recompile the kernel.
6. Shut down the machine and add 4x1GB of memory, so that all slots are populated and the machine has 8 GB of physical RAM.
7. Reboot with the new kernel into runlevel 3.
8. Configure the nvidia 10-5331 driver. Configure the X server to run with AGP support.
9. Go to runlevel 5. Observe that X comes up.
10. Set session type to "failsafe" and log in.
11. Run glxgears. Observe ~4500 fps on default settings.
12. Log out and log back in with "default" (kde/gnome) session.

Actual results:
System hangs while restoring kde/gnome session.

Expected results:
kde/gnome session should restore and run. Running glxgears should provide 4500 fps on default settings.

Additional info:
Many large geosurvey firms want to purchase Opteron workstations with 8x AGP support, 8 GB of memory, and RHEL3.
Please do not ignore this report just because of nVidia's stupid proprietary drivers. Thank you. All necessary .c files, XF86Config files, nvidia driver libraries, and BIOS images have been sent to Jim Paradis (jparadis).
Does this happen with the nv driver too?
Actually, does this problem occur with all video hardware using any drivers? I've had various reports with that Tyan motherboard, which I've let sit for a while, as I know they were hardware related problems. I don't believe all of the problems reported were Nvidia related strictly. If you can confirm if this problem is wider in scope, that would be helpful. Also, if you could attach the files (preferably as unified diffs) to the bug report for review (mtrr.c), that would be appreciated also. Thanks in advance.
mtrr.c is part of the kernel, not XFree86... Reassigning to kernel.
Whoops, I changed the status to MODIFIED by accident, changing back.
Created attachment 97037: Patch to fix the AMD64 kernel's handling of 4 GB MTRR buffers
I haven't tested this with anything but the nVidia drivers. ATI doesn't provide accelerated graphics support for AMD64 yet, and I don't have any experience in getting the Open Source AGP drivers to work on Red Hat. I have confirmed that the problem occurs on both KDE and GNOME. I will try setting up a Radeon 7500 on this system and see if the problem still occurs.
Thanks Mark. The fix looks sane to me, and it would affect all video hardware running in X, not just Nvidia. However, it might have only triggered visible problems on certain setups to date.
Hmm, just noticed this. Shouldn't the following:

+ newsize = (u64) (mask_hi | ~0xff) << 32 | (mask_lo & ~0x800);

be changed to:

+ newsize = (u64) ((mask_hi | ~0xff) << 32 | (mask_lo & ~0x800));
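The two expressions differ only in where the cast applies, and in C a cast binds more tightly than a shift. A minimal sketch of the difference follows, assuming mask_hi and mask_lo are 32-bit unsigned values and int is 32 bits (the thread does not state the kernel's actual types). The function names are hypothetical, and the second function uses two 16-bit shifts as a well-defined stand-in for a 32-bit shift of a 32-bit value, which is undefined in C.

```c
#include <stdint.h>

/* Assumption: mask_hi and mask_lo are 32-bit, as in this sketch only. */

/* First form: the cast applies to (mask_hi | ~0xff) alone, so the shift
 * is carried out in 64 bits and the high word is preserved. */
static uint64_t mask_cast_first(uint32_t mask_hi, uint32_t mask_lo)
{
    return (uint64_t) (mask_hi | ~0xff) << 32 | (mask_lo & ~0x800);
}

/* Second form: the shift is carried out in 32 bits before the cast.
 * "<< 32" on a 32-bit operand is undefined behavior, so two 16-bit
 * shifts are used here to show the well-defined equivalent: every bit
 * of the high word is shifted out and lost. */
static uint64_t mask_cast_last(uint32_t mask_hi, uint32_t mask_lo)
{
    return (uint64_t) ((((mask_hi | ~0xffu) << 16) << 16) | (mask_lo & ~0x800u));
}
```

With the illustrative inputs mask_hi = 0xff and mask_lo = 0x800, mask_cast_first yields 0xffffffff00000000, while mask_cast_last yields 0 because the high word never reaches the upper 32 bits.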
Well, the first version doesn't load into KDE, but gets excellent (4800+ fps on glxgears) AGP performance under the failsafe wm. The second version loads into KDE, but can't load the AGP v3 drivers and only gives adequate AGP performance (2600+ fps). Under more compute-intensive tests, the second version has about 1/10th the performance of the first version.

I did some more regression tests, and the 2885101i BIOS (available on the Tyan website) loads KDE and provides excellent performance with the first version of the test. It has an MTRR map like this:

reg00: base=0xf0000000 (3840MB), size= 128MB: write-combining, count=1
reg01: base=0x00000000 (   0MB), size=2048MB: write-back, count=1
reg02: base=0x80000000 (2048MB), size=1024MB: write-back, count=1
reg03: base=0x100000000 (4096MB), size=4096MB: write-back, count=1
reg04: base=0xc0000000 (3072MB), size= 256MB: write-back, count=1
reg05: base=0xd0000000 (3328MB), size= 128MB: write-back, count=1
reg06: base=0xd8000000 (3456MB), size=  32MB: write-back, count=1

The 2885101k has an MTRR map like this:

reg00: base=0x00000000 (   0MB), size=2048MB: write-back, count=1
reg01: base=0x80000000 (2048MB), size=1024MB: write-back, count=1
reg02: base=0x100000000 (4096MB), size=4096MB: write-back, count=1
reg03: base=0xc0000000 (3072MB), size= 256MB: write-back, count=1
reg04: base=0xd0000000 (3328MB), size= 128MB: write-back, count=1
reg05: base=0xd8000000 (3456MB), size=  32MB: write-back, count=1
reg06: base=0xf0000000 (3840MB), size= 128MB: write-combining, count=1

I'm not sure what the difference here really is, or why one system locks up and the other doesn't.
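For comparing maps like the two above without relying on register order, each line can be parsed into its fields. The following is a minimal sketch, assuming the line layout shown in this comment. The struct and function names are hypothetical and not part of any kernel interface.

```c
#include <stdio.h>

/* Hypothetical record for one line of an MTRR map as printed above. */
struct mtrr_entry {
    int reg;                  /* register index, e.g. 3 for "reg03" */
    unsigned long long base;  /* base address in bytes */
    unsigned int size_mb;     /* region size in megabytes */
    char type[32];            /* e.g. "write-back", "write-combining" */
    int count;                /* usage count */
};

/* Parse one map line; returns 0 on success, -1 if the line does not
 * match the assumed layout. */
static int parse_mtrr_line(const char *line, struct mtrr_entry *e)
{
    int n = sscanf(line,
                   "reg%d: base=0x%llx (%*dMB), size=%uMB: %31[^,], count=%d",
                   &e->reg, &e->base, &e->size_mb, e->type, &e->count);
    return n == 5 ? 0 : -1;
}
```

Parsing both maps this way and sorting the entries by base address would show that they describe the same seven regions, with only the position of the write-combining entry differing (reg00 in 2885101i, reg06 in 2885101k).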
Richard Brunner pointed out that my temporary mtrr patch was wrong, and we worked out an improved version, which I have attached. It solves the issue with the 2885101k BIOS.
Created attachment 97059: Improved patch to fix MTRR handling
MTRR patch submitted for U2
AMD Validation reported a problem when running RHEL3 U2 Beta1 for AMD64 on an Asus platform with 8 GB of memory. Applying the patch fixed the problem. Shouldn't the patch have been part of RHEL3 U2 Beta1?
The patch just missed inclusion in Beta1, but I have verified that it is in the codebase for the GA release.
For those of us who don't want to become career linux developers, could someone summarize what needs to be done to coax the Tyan S2885 with dual processors to actually run RHEL WS3 reliably ?? Is the best solution, for now, to just ditch AGP cards altogether and go to an ancient PCI card for reasonable, reliable performance? THANKS MUCH...
How do I indicate that this has been fixed in U2 and AMD has verified the fix?