Bug 252215

Summary: HP dl585g2 / AMD 8132 MMCONFIG blacklist
Product: Red Hat Enterprise Linux 5
Component: kernel
Version: 5.0
Hardware: x86_64
OS: Linux
Status: CLOSED NEXTRELEASE
Severity: high
Priority: medium
Reporter: Tony Camuso <tcamuso>
Assignee: Tony Camuso <tcamuso>
CC: dzickus, jturner
Keywords: OtherQA
Doc Type: Bug Fix
Last Closed: 2007-12-19 02:42:45 UTC
Bug Depends On: 248186

Description Tony Camuso 2007-08-14 19:17:33 UTC
+++ This bug was initially created as a clone of Bug #248186 +++

Description of problem:

The blacklist patch that addresses this problem, as well as the bus
discovery patches, have been implemented in RHEL 4, for both i386 and
x86_64.

HOWEVER, these patches have only been implemented for x86_64 in RHEL5.

AMD-8132 hypertransport-to-PCI bridge does not support MMCONFIG cycles, 
so PCI configuration must use the legacy PortIO CF8/CFC mechanism. This 
chip also does not support MSI, and it is recommended that devices below
it have MSI disabled.  

All x86 systems using this chip must boot with the 'pci=nommconf' kernel
parameter, or the PCI devices behind this bridge will not be configured.
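
To make the difference between the two access methods concrete, here is a minimal sketch of how a configuration register is addressed under each one. It uses the standard bit layouts (legacy mechanism #1 through ports CF8/CFC, and one 4 KB window per function for MMCONFIG). The function names are hypothetical, and the MMCONFIG base of 0x80000000 in the usage note is taken from the DL585 G2 boot log quoted in comment 7, not from any specification.

```python
def type1_config_address(bus, device, function, register):
    """Value written to port 0xCF8 to select a register (legacy mechanism #1)."""
    if not (0 <= bus < 256 and 0 <= device < 32 and 0 <= function < 8):
        raise ValueError("bus/device/function out of range")
    if not 0 <= register < 256:
        raise ValueError("legacy access reaches only the first 256 bytes")
    # Bit 31 enables the access; the low two register bits are dropped
    # because the data port 0xCFC is read as an aligned dword.
    return (0x80000000 | (bus << 16) | (device << 11)
            | (function << 8) | (register & 0xFC))


def mmconfig_address(base, bus, device, function, register):
    """Physical address of a register inside the memory-mapped window."""
    if not (0 <= bus < 256 and 0 <= device < 32 and 0 <= function < 8):
        raise ValueError("bus/device/function out of range")
    if not 0 <= register < 4096:
        raise ValueError("MMCONFIG exposes 4096 bytes per function")
    # 1 MB per bus, 32 KB per device, 4 KB per function.
    return base + ((bus << 20) | (device << 15) | (function << 12) | register)
```

For example, device 1 on bus 0x40 is selected by writing type1_config_address(0x40, 1, 0, 0) to port CF8, while the MMCONFIG equivalent is a plain memory read at mmconfig_address(0x80000000, 0x40, 1, 0, 0). The latter is the kind of cycle the 8132 does not handle.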

Version-Release number of selected component (if applicable): 2.6.18-34

How reproducible:

Right now, we don't see the problem because the dl585g2 BIOS does not map
MMCONFIG into E820, so MMCONFIG is turned off anyway. However, as soon as 
the dl585g2 BIOS is fixed, we can expect problems. I have not encountered 
the problems yet, but Bhavana submitted patches to obviate the problem for
both i386 and x86_64 on RHEL4 but ONLY for x86_64 on RHEL5.

Steps to Reproduce:
1. boot
  
Actual results:
n/a


Expected results:
See AMD 8132 errata at 
http://www.amd.com/us-en/assets/content_type/white_papers_and_tech_docs/30801.pdf



Additional info:

-- Additional comment from tcamuso on 2007-07-18 11:05 EST --
(In reply to comment #0)

This problem needs to be addressed ASAP for RHEL5-U1. 

This is a very easy fix: apply the same patch to the i386 MMCONFIG code that
was made to the x86_64 MMCONFIG code, so that the code remains consistent
between the two architectures and with what was done in RHEL4.

HP can continue to code the BIOS (wrongly) to indicate to RHEL that MMCONFIG
is not supported on the DL585g2, which is why the 32-bit DL585g2 currently boots
without the fix. However, this is neither consistent nor the correct long-term
fix, since it leaves the BIOS internally inconsistent.


-- Additional comment from tcamuso on 2007-07-20 16:43 EST --
Created an attachment (id=159692)
Patch to blacklist 32-bit 585 in mmconfig.c 

This patch makes the RHEL5.1 32-bit x86 code consistent with RHEL5.1 x86_64
code and with RHEL4 32 and 64 bit x86 code. 

Patch to blacklist HP DL585 G2 to use legacy (CF8/CFC) PCI config accesses
rather than MMCONFIG cycles.


-- Additional comment from pm-rhel on 2007-07-31 10:15 EST --
This request was evaluated by Red Hat Product Management for inclusion in a Red
Hat Enterprise Linux maintenance release.  Product Management has requested
further review of this request by Red Hat Engineering, for potential
inclusion in a Red Hat Enterprise Linux Update release for currently deployed
products.  This request is not yet committed for inclusion in an Update
release.

-- Additional comment from jturner on 2007-08-01 19:48 EST --
Need testing results with the latest 5.1 code + patch in addition to other
testing which needs to occur.

-- Additional comment from tcamuso on 2007-08-14 13:33 EST --
The business case for the submitted patch is that the HP dl585g2 is a popular
system among Red Hat's larger customers.

Systems currently blacklisted are:
  PLATFORM                  RHEL4  RHEL5
  ---------------------     -----  -----
  HP xw9300 Workstation       X      X
  HP xw9400 Workstation       X      X
  HP ProLiant DL585 G2        X      X <- 32b with the submitted patch
  HP Compaq dc5700 Microtower        X <- 32b with the submitted patch
  Intel DG965MQ               X
  Intel D26928                X

As it is, the current blacklist does not cover all the systems affected, so a
Release Note must accompany the RHEL 4/5 update that lists affected platforms
that are not blacklisted. 

In RHELs 4.7 and 5.2, we can avoid the platform blacklist and release notes for
platforms not included therein by blacklisting the chipsets. 

It's just too late to do this in RHELs 4.6 and 5.1.


This bug is being cloned for 5.2

The correct fix is to blacklist the chipsets rather than the platforms. 
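
As an illustration of what blacklisting by chipset rather than by platform could look like, here is a minimal sketch. It is not the submitted patch. The PciDevice type, the table name, and the function are hypothetical, and the vendor/device pair 0x1022:0x7458 is assumed to identify the AMD-8132 bridge and should be checked against pci_ids.h. The choice to flag the bus the chip sits on as well as its secondary..subordinate range is also an assumption.

```python
from typing import Iterable, NamedTuple, Optional, Set


class PciDevice(NamedTuple):
    bus: int
    vendor: int
    device: int
    secondary: Optional[int] = None    # first bus behind a bridge, if any
    subordinate: Optional[int] = None  # last bus behind a bridge, if any


# Hypothetical table of chips that cannot handle MMCONFIG cycles.
# 0x1022:0x7458 is assumed to be the AMD-8132; verify before relying on it.
NO_MMCONFIG_CHIPS = {(0x1022, 0x7458)}


def buses_needing_type1(devices: Iterable[PciDevice]) -> Set[int]:
    """Return the bus numbers that should fall back to CF8/CFC access."""
    fallback: Set[int] = set()
    for dev in devices:
        if (dev.vendor, dev.device) not in NO_MMCONFIG_CHIPS:
            continue
        # Flag the bus the chip sits on and every bus behind it.
        fallback.add(dev.bus)
        if dev.secondary is not None and dev.subordinate is not None:
            fallback.update(range(dev.secondary, dev.subordinate + 1))
    return fallback
```

With a table like this, a new platform built around the same chip would be covered without adding a DMI entry for it, which is the point of moving away from the per-platform list.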

History for this bug follows. 

-- Additional comment from dzickus on 2007-08-14 14:56 EST --

Tony please work with Don D. in making sure that a correct and accurate release
note is attached to RHEL 5.1.  Considering the temporary solution for 5.1
requires a little extra effort on the customer's part, I want to make sure the
details of your email are covered in here.

Thanks,
Don

Comment 1 Harald Hoyer 2007-08-15 07:37:37 UTC
what can pciutils do, in this context?

Comment 2 Tony Camuso 2007-08-15 14:52:15 UTC
(In reply to comment #1)
> what can pciutils do, in this context?

Sorry. This BZ should be attributed to kernel.

Comment 3 Tony Camuso 2007-08-15 14:54:03 UTC
(In reply to comment #1)
> what can pciutils do, in this context?

Sorry. This BZ should be attributed to kernel.

Comment 4 RHEL Program Management 2007-11-05 15:05:39 UTC
This request was evaluated by Red Hat Product Management for inclusion in a Red
Hat Enterprise Linux maintenance release.  Product Management has requested
further review of this request by Red Hat Engineering, for potential
inclusion in a Red Hat Enterprise Linux Update release for currently deployed
products.  This request is not yet committed for inclusion in an Update
release.

Comment 5 Tony Camuso 2007-11-08 13:51:54 UTC
Patch submitted as RFC.

Patches are required for three code streams:

1. upstream
2. RHEL4.7
3. RHEL5.2



Comment 7 Tony Camuso 2007-12-19 02:42:45 UTC
We have taken the 585g2 out of the blacklist in the new patch.

The patch is being submitted upstream, and works just fine in RHEL5.2.

Appended below is an excerpt from Naga Chumbalkar's test with the patch. 

Please close this BZ.

---------------------------------------------------------------------

Here's the output on a DL585 G2 (8132, with MCFG defined in the BIOS). I
commented out the e820 check, and turned your pr_debug to pr_info:

...
...
ACPI: bus type pci registered
naga: mmconfig-shared.c: commented out the e820 check
PCI: Using MMCONFIG at 80000000 - 8fffffff
PCI: No mmconfig possible on device 00:18
PCI: No mmconfig possible on device 00:19
PCI: Buses that can't use MMCONFIG will use type 1 PCI conf access.
ACPI: EC: Look up EC in DSDT
ACPI: SSDT 7FE58000, 04F0 (r2 HP     PNOWSSDT        2 HP          1)
ACPI: Interpreter enabled
ACPI: (supports S0 S5)
ACPI: Using IOAPIC for interrupt routing
ACPI: PCI Root Bridge [PCI0] (0000:00)
PCI: If a device isn't working, try "pci=nommconf". If that helps, please post a report.
PCI: Checking bus 0000:00 for MMCONFIG compliance.
PCI: Checking bus 0000:01 for MMCONFIG compliance.
PCI: Transparent bridge - 0000:00:09.0
PCI: Checking bus 0000:08 for MMCONFIG compliance.
PCI: Checking bus 0000:05 for MMCONFIG compliance.
PCI: Checking bus 0000:02 for MMCONFIG compliance.
ACPI: PCI Interrupt Routing Table [\_SB_.PCI0._PRT]
ACPI: PCI Interrupt Routing Table [\_SB_.PCI0.IP2P._PRT]
ACPI: PCI Interrupt Routing Table [\_SB_.PCI0.CPE0._PRT]
ACPI: PCI Interrupt Routing Table [\_SB_.PCI0.CPE1._PRT]
ACPI: PCI Interrupt Routing Table [\_SB_.PCI0.CPE2._PRT]
ACPI: PCI Root Bridge [PCI1] (0000:40)
PCI: Checking bus 0000:40 for MMCONFIG compliance.
PCI: Bus 0000:40 and its descendents cannot use MMCONFIG PCI Configuration access.
ACPI: PCI Interrupt Routing Table [\_SB_.PCI1.BRGA._PRT]
ACPI: PCI Interrupt Routing Table [\_SB_.PCI1.BRGB._PRT]
ACPI: PCI Interrupt Routing Table [\_SB_.PCI1.IPE0._PRT]
ACPI: PCI Interrupt Routing Table [\_SB_.PCI1.IPE1._PRT]
ACPI: PCI Interrupt Routing Table [\_SB_.PCI1.IPE2._PRT]
ACPI: PCI Interrupt Routing Table [\_SB_.PCI1.IPE3._PRT]
ACPI: PCI Interrupt Link [LNKW] (IRQs 16) *0, disabled.
ACPI: PCI Interrupt Link [LNKX] (IRQs 17) *0, disabled.
ACPI: PCI Interrupt Link [LNKY] (IRQs 18) *0, disabled.
ACPI: PCI Interrupt Link [LNKZ] (IRQs 19) *0, disabled.
ACPI: PCI Interrupt Link [LNU0] (IRQs 22) *5
ACPI: PCI Interrupt Link [LNU2] (IRQs 23) *10
ACPI: PCI Interrupt Link [LNKA] (IRQs 54) *0, disabled.
ACPI: PCI Interrupt Link [LNKB] (IRQs 55) *0, disabled.
ACPI: PCI Interrupt Link [LNKC] (IRQs 56) *0, disabled.
ACPI: PCI Interrupt Link [LNKD] (IRQs 57) *0, disabled.
...

The patch works as expected. 8132 is on bus 0x40. The NICs behind the 8132 are
functional using Port IO.


- naga -
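
For anyone repeating this test on other hardware, a small script can pull the relevant lines out of a boot log. The sketch below assumes the message wording shown in the excerpt above (including the "descendents" spelling), which may differ in other kernel builds. The function name and the SAMPLE_LOG variable are hypothetical.

```python
import re

# Patterns follow the message text in the excerpt above; other kernels
# may word these lines differently.
_CHECKED = re.compile(
    r"PCI: Checking bus ([0-9a-fA-F]{4}):([0-9a-fA-F]{2}) for MMCONFIG compliance")
_REJECTED = re.compile(
    r"PCI: Bus ([0-9a-fA-F]{4}):([0-9a-fA-F]{2}) and its descendents "
    r"cannot use MMCONFIG")

SAMPLE_LOG = """\
PCI: Checking bus 0000:00 for MMCONFIG compliance.
PCI: Checking bus 0000:01 for MMCONFIG compliance.
ACPI: PCI Root Bridge [PCI1] (0000:40)
PCI: Checking bus 0000:40 for MMCONFIG compliance.
PCI: Bus 0000:40 and its descendents cannot use MMCONFIG PCI Configuration access.
"""


def summarize_mmconfig(log_text):
    """Return (domain, bus) pairs that were checked and those that were rejected."""
    def to_pairs(matches):
        return [(int(dom, 16), int(bus, 16)) for dom, bus in matches]
    return {
        "checked": to_pairs(_CHECKED.findall(log_text)),
        "type1_only": to_pairs(_REJECTED.findall(log_text)),
    }
```

Running summarize_mmconfig on the output of dmesg gives a quick view of which root buses fell back to Port IO. The patterns match whole messages on a single line, so logs wrapped at 80 columns need their lines rejoined first.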