Bug 52037

Summary: modprobe agpgart locks machine (resources aren't being allocated)
Product: [Retired] Red Hat Linux
Reporter: Chris Runge <crunge>
Component: kernel
Assignee: Arjan van de Ven <arjanv>
Status: CLOSED NOTABUG
QA Contact: Brock Organ <borgan>
Severity: high
Priority: high    
Version: 7.3   
Target Milestone: ---   
Target Release: ---   
Hardware: i386   
OS: Linux   
Doc Type: Bug Fix
Last Closed: 2001-08-27 14:36:41 UTC
Attachments:
Description        Flags
dmesg output       none
lspci -vvv output  none
lspci -xxx output  none

Description Chris Runge 2001-08-19 14:59:58 UTC
Description of Problem:

On Roswell 2 (7.1.94) with kernel 2.4.7-2, issuing the command
"modprobe agpgart" locks the machine solid; a reboot is necessary.

Version-Release number of selected component (if applicable):

kernel 2.4.7-2

How Reproducible:

Every time. It happens with the SMP kernel, the non-SMP kernel, the
noapic option, and after changing almost every BIOS option.

Steps to Reproduce:
1. modprobe agpgart

Actual Results:

The machine locks solid; a hard reset is required.

Expected Results:

The agpgart module should either load or report an error.

Additional Information:

I didn't have this problem in 7.1. The problem is probably caused by the
new ServerWorks AGP support that has found its way into newer kernels (post
7.1)--I have a ServerWorks III HE-SL chipset in my SuperMicro 370DE6
motherboard. Before ServerWorks AGP support was available, the kernel would
print an error message when I tried to load the agpgart module.

When the support was introduced in the generic kernel source, I worked with
Jeff Hartmann, the agpgart author, to try to isolate the problem. At that
time, loading the module made a series of error messages scroll by on the
screen, but the machine was still essentially hard-locked. After a while he
concluded that the problem probably had to do with resources not being
allocated correctly. The output of lspci -vvv shows this:

00:00.0 Host bridge: ServerWorks CNB20HE Host Bridge (rev 23)
        Control: I/O- Mem- BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B-
        Status: Cap- 66Mhz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR-
        Region 0: Memory at <ignored> (32-bit, prefetchable) [disabled] [size=128M]
        Region 1: Memory at f22ff000 (32-bit, non-prefetchable) [disabled] [size=4K]

Here the Region 0 of 00:00.0 isn't allocated. dmesg also points this out:

PCI: Cannot allocate resource region 0 of device 00:00.0 
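
For anyone triaging a similar report, a saved dmesg capture can be scanned for this message. The following is only a minimal sketch: the function name find_alloc_failures and the sample string are illustrative, and the pattern assumes the exact wording of the message quoted above, which other kernel versions may not use.

```python
import re

# Matches the allocation-failure message exactly as quoted in this report.
# Assumption: other kernel versions may word this message differently.
ALLOC_FAIL = re.compile(
    r"PCI: Cannot allocate resource region (\d+) of device ([0-9a-fA-F:.]+)"
)


def find_alloc_failures(dmesg_text):
    """Return (device, region) pairs for every allocation-failure line."""
    return [(m.group(2), int(m.group(1))) for m in ALLOC_FAIL.finditer(dmesg_text)]


# Sample input: the single line quoted in this report.
sample = "PCI: Cannot allocate resource region 0 of device 00:00.0 \n"
print(find_alloc_failures(sample))
```

Run against the full dmesg attachment, this would list every device and region the kernel failed to place, not just the host bridge.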

By playing around with various BIOS settings I can get it so that Region 0
is allocated, but Region 1 then goes unallocated. At Jeff's suggestion I
removed all PCI devices from the system--he thought maybe a soundcard or
other device was the problem--this still didn't fix it.
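
Since changing BIOS settings moves the failure from one region to another, it can help to compare saved lspci -vvv captures mechanically. The sketch below is an illustration, not a tool used in this report: the name unallocated_regions is hypothetical, and it assumes that lspci prints a placeholder in angle brackets (such as <ignored>) in place of the address of a memory region that has none, and that each "Region N:" entry sits on one line.

```python
import re

# A device header starts at column 0, e.g. "00:00.0 Host bridge: ..."
DEVICE = re.compile(r"^([0-9a-fA-F]{2}:[0-9a-fA-F]{2}\.[0-7])\s")
# A memory region line, e.g. "Region 0: Memory at <ignored> (32-bit, ...)"
REGION = re.compile(r"Region (\d+): Memory at (\S+)")


def unallocated_regions(lspci_text):
    """Return (device, region) pairs whose address is a <...> placeholder."""
    found = []
    device = None
    for line in lspci_text.splitlines():
        header = DEVICE.match(line)
        if header:
            device = header.group(1)
            continue
        region = REGION.search(line)
        # Assumption: a placeholder in angle brackets means no address was assigned.
        if region and region.group(2).startswith("<"):
            found.append((device, int(region.group(1))))
    return found


# Sample input: the host bridge entry from this report, with wrapped lines joined.
sample = """\
00:00.0 Host bridge: ServerWorks CNB20HE Host Bridge (rev 23)
        Region 0: Memory at <ignored> (32-bit, prefetchable) [disabled] [size=128M]
        Region 1: Memory at f22ff000 (32-bit, non-prefetchable) [disabled] [size=4K]
"""
print(unallocated_regions(sample))
```

Comparing the result from one capture per BIOS configuration would show whether the unallocated region moves from Region 0 to Region 1, as described above.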

Booting with the uniprocessor kernel or with the noapic option doesn't fix
the problem either.

I think that fixing the resource allocation problem will also fix the
agpgart lockup.

I will attach all relevant information that Jeff had asked for...

Comment 1 Chris Runge 2001-08-19 15:01:15 UTC
Created attachment 28414 [details]
dmesg output

Comment 2 Chris Runge 2001-08-19 15:01:49 UTC
Created attachment 28415 [details]
lspci -vvv output

Comment 3 Chris Runge 2001-08-19 15:02:16 UTC
Created attachment 28416 [details]
lspci -xxx output

Comment 4 Chris Runge 2001-08-19 15:03:24 UTC
Please note...

I am using a VisionTek GeForce 3 card at the current time; however, I am not
using the proprietary drivers from Nvidia. I also have a G450 from Matrox; it
also has the problem.

Comment 5 Glen Foster 2001-08-20 19:24:38 UTC
This defect is considered SHOULD-FIX for Fairfax.

Comment 6 Chris Runge 2001-08-27 14:36:36 UTC
This is no longer an issue...
SuperMicro recently released a new BIOS (1.1C) for the 370DE6 motherboard. A
combination of applying that, removing all cards except the Nvidia card, and
playing with the BIOS settings some more seemed to fix it. I think it is the
BIOS update that fixed the problem, since I had tried the other things
before. Once agpgart loaded, I was able to reinstall my other PCI cards, and
things worked fine (and continue to).

Comment 7 Arjan van de Ven 2001-08-27 14:42:16 UTC
BIOS bug -> NOTOURBUG :)
but that resolution doesn't exist, so I'll close as NOTABUG