Escalated to Bugzilla from IssueTracker
Description of problem: The problem is that pci_setup_bridge() clears the Prefetchable Memory Base and Limit Upper 32 Bits registers, i.e. it potentially moves the address range defined by these and the Prefetchable Memory Base and Limit registers. The function needs to preserve the upper 32 bits of the 64-bit start/end. The fix needed in RHEL5's 2.6.18-92.1.22.el5 kernel (the most recent I could find) is (a subset of the) linux-2.6 git commit c40a22e0ce5eb400f27449e59e43d021bee58b8d from the end of 2007. Please let us know if more information is needed. Thanks This event sent from IssueTracker by fleitner [Support Engineering Group] issue 258146
http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=c40a22e0ce5eb400f27449e59e43d021bee58b8d;hp=f07234b66af1d1a204b9ddabdbdb312e8f1fb35e
As-is, pci_setup_bridge() always clears the upper 32 bits of the prefetchable-memory-behind-bridge base and limit registers. This much should be obvious from looking at the code. A subset of the patch that we referenced fixes this by taking into account the upper 32 bits in the associated PCI resource object, i.e. the values written to PCI_PREF_BASE_UPPER32 and PCI_PREF_LIMIT_UPPER32 are derived from the same region that the value written to PCI_PREF_MEMORY_BASE is derived from. In order to test the code, Red Hat will either need to find a system with an SBIOS willing to map a prefetchable 64-bit memory BAR (such as our BAR1) above 4GB and verify that the upper half of the address range isn't lost, or you'll need to hack your own kernel to move a suitable BAR above 4GB before that same kernel runs pci_setup_bridge().
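To illustrate the point above, here is a minimal user-space sketch of how a 64-bit prefetchable window could be split into the three values the bridge needs. It is not the referenced patch: the struct and function names (pref_window_regs, encode_pref_window) are hypothetical, and the masks simply follow the standard PCI-to-PCI bridge register layout, where address bits 31:20 live in bits 15:4 of the 16-bit base and limit registers.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical container for the values written to the bridge. */
struct pref_window_regs {
    uint32_t base_limit;    /* low 16 bits: base register, high 16 bits: limit register */
    uint32_t base_upper32;  /* value for the Base Upper 32 Bits register */
    uint32_t limit_upper32; /* value for the Limit Upper 32 Bits register */
};

/* Split an inclusive 64-bit window [start, end] into register values. */
struct pref_window_regs encode_pref_window(uint64_t start, uint64_t end)
{
    struct pref_window_regs r;

    assert(start <= end);

    /* Address bits 31:20 of start go into bits 15:4 of the base register. */
    r.base_limit = (uint32_t)((start >> 16) & 0xfff0);
    /* Address bits 31:20 of end go into bits 15:4 of the limit register,
     * which occupies the upper half of the same 32-bit write. */
    r.base_limit |= (uint32_t)(end & 0xfff00000);

    /* The step the bug skips: keep the upper halves instead of writing 0. */
    r.base_upper32 = (uint32_t)(start >> 32);
    r.limit_upper32 = (uint32_t)(end >> 32);

    return r;
}
```

For a window starting at 4GB (0x100000000 through 0x10fffffff), this yields 1 for both upper-32 values. Writing 0 instead would move the window down to 0x00000000 through 0x0fffffff, which is the failure described in this report.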
IIRC, SGI Altix boxes map all PCI mem above 4G. The real problem then is finding a card that has a prefetch memory value defined ... but I can always hack around that ;) P.
Created attachment 333911 [details] RHEL5 fix for this issue
This request was evaluated by Red Hat Product Management for inclusion in a Red Hat Enterprise Linux maintenance release. Product Management has requested further review of this request by Red Hat Engineering, for potential inclusion in a Red Hat Enterprise Linux Update release for currently deployed products. This request is not yet committed for inclusion in an Update release.
Created attachment 334029 [details] RHEL5 fix for this issue. bmaly suggested further changes to the patch.
in kernel-2.6.18-141.el5 You can download this test kernel from http://people.redhat.com/dzickus/el5 Please do NOT transition this bugzilla state to VERIFIED until our QE team has sent specific instructions indicating when to do so. However feel free to provide a comment indicating that this fix has been verified.
~~ Attention - RHEL 5.4 Beta Released! ~~ RHEL 5.4 Beta has been released! There should be a fix present in the Beta release that addresses this particular request. Please test and report back results here, at your earliest convenience. RHEL 5.4 General Availability release is just around the corner! If you encounter any issues while testing Beta, please describe the issues you have encountered and set the bug into NEED_INFO. If you encounter new issues, please clone this bug to open a new issue and request it be reviewed for inclusion in RHEL 5.4 or a later update, if it is not of urgent severity. Please do not flip the bug status to VERIFIED. Only post your verification results, and if available, update Verified field with the appropriate value. Questions can be posted to this bug or your customer or partner representative.
Event posted on 08-17-2009 04:39pm EDT by jkachuck Hello, Per call today NVIDIA has tested it and it looks good. I have requested NVIDIA to update this issue with this information as well. Thank You Joe Kachuck This event sent from IssueTracker by jkachuck issue 258146
An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you. http://rhn.redhat.com/errata/RHSA-2009-1243.html
Hello, NVIDIA engineering verified with the two test kernels on a system on which a BAR is mapped above 4GB by the SBIOS. On both kernels, BAR1 of each card tested was placed above 4G, and the prefetchable memory window of the PCI bridge just above these cards was above 4G accordingly. Full functionality of the test cards was verified. Thanks Garrison Wu
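For anyone repeating this kind of check by hand, the following is a rough sketch of how raw bridge register values could be decoded back into a 64-bit window. The function names are hypothetical, the register values in the example are made up for illustration, and the decoding assumes the standard PCI-to-PCI bridge layout (low 4 bits of base and limit are capability bits, and the low 20 bits of the limit address read as all ones).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Rebuild the window start from the 16-bit base and its upper 32 bits. */
uint64_t pref_window_start(uint16_t base, uint32_t base_upper32)
{
    return ((uint64_t)base_upper32 << 32) |
           ((uint64_t)(base & 0xfff0) << 16);
}

/* Rebuild the inclusive window end; the low 20 bits are implied ones. */
uint64_t pref_window_end(uint16_t limit, uint32_t limit_upper32)
{
    return ((uint64_t)limit_upper32 << 32) |
           ((uint64_t)(limit & 0xfff0) << 16) | 0xfffffULL;
}

/* True if the whole window sits at or above the 4GB boundary. */
bool pref_window_above_4g(uint16_t base, uint32_t base_upper32)
{
    return pref_window_start(base, base_upper32) >= 0x100000000ULL;
}

/* Illustrative values only, not taken from the tested systems. */
void pref_window_example(void)
{
    /* Upper halves preserved: window is 0x2c0000000..0x2cfffffff. */
    assert(pref_window_start(0xc001, 2) == 0x2c0000000ULL);
    assert(pref_window_end(0xcff1, 2) == 0x2cfffffffULL);
    assert(pref_window_above_4g(0xc001, 2));

    /* Upper halves cleared: the same window drops below 4GB. */
    assert(!pref_window_above_4g(0xc001, 0));
}
```

The raw values can be read from the bridge's configuration space (for example with lspci -vv, which prints the decoded prefetchable window directly).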