OS: Fairfax re1108
Modifications to the DMA mapping routines made by Red Hat in the cpqarray driver may cause the kernel to panic after patching media based drivers. The driver provided by Compaq uses the pci_map_single() & pci_unmap_single() kernel calls to map copy buffers. Red Hat modified the driver code to use pci_map_page() & pci_unmap_page() to map scatter-gather entries to copy buffers. The reason for the changes is not documented by Red Hat. Changes were made to both cpqarray and cciss. The problem may not manifest itself in the cciss driver since the Smart Array 5300 is a 64-bit DMA capable controller. The cpqarray driver may not exhibit the failure with >2.5GB memory.
"...may cause the kernel to panic after patching media based drivers" So when you run patch to change "media based drivers" (whatever that means, could you please explain?), the kernel panics? Somehow, I doubt that was what you meant, yet I can't find another meaning for it. The change you seem to be objecting to is the highmem IO patch that you wanted us to include. Can you please confirm or deny? Would you like us to remove highmem IO capability for compaq hardware, thus damaging Oracle performance on Compaq hardware? You do seem to say that cpqarray hardware is fundamentally incapable of 64-bit DMA. Can you please confirm that?
Changing media based drivers means using the patch command to install a patch file that upgrades the drivers included on your distribution media after installation. This requires the kernel & loadable modules to be rebuilt. We do this either for functional changes or to support new storage controllers. If the system has 4GB of memory or more and is booting from the Integrated Array controller, the kernel will panic or hang during boot. The cciss driver as provided by Compaq supports 64-bit DMA. We see no improvement in performance from the changes made by Red Hat, but have not tested Oracle. Can you provide performance data from both modified & unmodified drivers? The Integrated Array Controller on the DL590/64 is not capable of 64-bit DMA.
The driver will not get passed an address that is outside the DMA mask it registers with the PCI layer... the block device layer will/should take care of bounce-copying into DMA range. As for the performance data, that's a question for Compaq, I assume, as Compaq has run those tests...
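The mask-and-bounce behaviour described in the comment above can be sketched in user-space C. This is only a model of the decision, not kernel code: every `sim_` name, the `SIM_DMA_MASK_*` constants, and the idea of passing in a pre-chosen bounce address are assumptions made for illustration. It assumes the mask has the usual form 2^n - 1.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical user-space model; none of these names are kernel APIs. */
#define SIM_DMA_MASK_32BIT 0xFFFFFFFFULL          /* controller limited to 32-bit DMA */
#define SIM_DMA_MASK_64BIT 0xFFFFFFFFFFFFFFFFULL  /* controller capable of 64-bit DMA */

/* True when every byte of [addr, addr + len) is addressable under the mask. */
static bool sim_dma_reachable(uint64_t addr, uint64_t len, uint64_t mask)
{
    if (len == 0)
        return true;

    uint64_t last = addr + (len - 1);   /* address of the final byte */
    if (last < addr)                    /* range wrapped around 2^64 */
        return false;

    /* Any bit set above the mask means the device cannot reach it. */
    return (last & ~mask) == 0;
}

/*
 * Models the layer above the driver: if the buffer is outside the
 * registered mask, hand the driver a bounce buffer (assumed to sit
 * inside the mask) instead of the original address.
 */
static uint64_t sim_address_for_driver(uint64_t addr, uint64_t len,
                                       uint64_t mask, uint64_t bounce_addr,
                                       bool *bounced)
{
    if (sim_dma_reachable(addr, len, mask)) {
        *bounced = false;
        return addr;
    }
    *bounced = true;
    return bounce_addr;
}
```

Under this model a buffer at the 4GB boundary is bounced for a 32-bit mask and passed through unchanged for a 64-bit mask, which is the behaviour the comment says the driver can rely on.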
Let me make sure I understand this. You say that if you patch the driver we ship, the PATCHED driver may panic or hang. Sounds to me like the bug is in the patch, not the driver we ship. Please confirm my understanding.
First, older Smart Array hardware can ONLY do 32 bit DMA. There is NO WAY to pass a 64 bit address to the firmware on these controllers... The new controllers (cciss) CAN do 64 bit DMA. And the driver was already set up to do this... We have been doing 64 bit DMA under IA64 since version 2.4.6 of the driver came out... The questions I have are:
1) Why do you feel it is a MUST for the driver to do a pci_map_page() instead of the pci_map_single() it was already doing... Under IA64 pci_map_single gives us 64 bit DMA with the cciss driver.
2) I have seen several versions of the patch that hang on IA32 machines with more than 4G of memory. Have you tested these changes on IA32?
3) Who did these changes, and why weren't we informed that they needed to be done? We are trying to support these drivers, and it is very difficult if they are being changed and we are not even told when and why.
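As background to the pci_map_single() versus pci_map_page() question raised above: pci_map_single() takes a kernel virtual address, while pci_map_page() takes a page plus an offset, so only the latter can describe a highmem page that has no permanent kernel mapping. The user-space sketch below models that interface difference only. All `sim_` names are hypothetical, and the 896MB lowmem boundary and 0xC0000000 direct-map base are assumptions chosen to resemble a default IA32 layout.

```c
#include <stdint.h>

/* Hypothetical user-space model; none of these names are kernel APIs. */
#define SIM_PAGE_SHIFT  12
#define SIM_PAGE_OFFSET 0xC0000000ULL   /* assumed base of the direct map */
/* Assumed lowmem boundary: 896MB expressed in 4KB page frames. */
#define SIM_LOWMEM_PFNS ((uint64_t)896 << (20 - SIM_PAGE_SHIFT))
#define SIM_BAD_ADDR    UINT64_MAX

struct sim_page {
    uint64_t pfn;   /* page frame number */
};

/* Kernel virtual address of a page, or 0 if it is unmapped highmem. */
static uint64_t sim_page_address(const struct sim_page *page)
{
    if (page->pfn >= SIM_LOWMEM_PFNS)
        return 0;
    return SIM_PAGE_OFFSET + (page->pfn << SIM_PAGE_SHIFT);
}

/* Modeled on the pci_map_single() interface: needs a virtual address. */
static uint64_t sim_map_single(uint64_t vaddr)
{
    if (vaddr < SIM_PAGE_OFFSET)
        return SIM_BAD_ADDR;            /* no direct-map address to translate */
    return vaddr - SIM_PAGE_OFFSET;
}

/* Modeled on the pci_map_page() interface: works from the page itself. */
static uint64_t sim_map_page(const struct sim_page *page, uint64_t offset)
{
    return (page->pfn << SIM_PAGE_SHIFT) + offset;
}
```

For a lowmem page both routes yield the same bus address in this model. For a page above the lowmem boundary only the page-based route produces one, and the result can lie above 4GB, which is where the DMA mask of the controller becomes relevant.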
The modification you are griping about is a part of the highmem i/o patch that Compaq (as well as others) *demanded* that we include. Compaq was notified that we were including the patch, and Compaq and Red Hat both tested the patch on x86 hardware. The patch was widely discussed in many contexts. Compaq was provided with early access to the source code, and requested to test on more Compaq hardware than Compaq has provided to Red Hat.
You wrote:
>First, older Smart Array hardware can ONLY do 32 bit DMA. There is NO WAY to
>pass a 64 bit address to the firmware on these controllers...
Arjan van de Ven had already written:
>The driver will not get passed an address that is outside the DMA mask it
>registers with the PCI layer... the block device layer will/should take care of
>bounce-copying into DMA range.
That means that your objection was answered before you raised it.
To the best of my knowledge, this bugzilla entry is being used to track a bug in a Compaq patch to a Compaq driver, a bug that is not present in the product Red Hat ships. I requested confirmation on this point, and while you did not honor my request, to the best of my ability to read between the lines in your response, I am correct in that understanding.
I am therefore closing this bugzilla report as NOTABUG; I don't mean to imply by doing so that there is no bug in your patch, only that this report contains no information on any bug in our product. You may continue to provide further information in bugzilla relative to the bug in your patch, including asking for help fixing the bug in your patch, without re-opening the bug. If you have concrete information pointing to a bug in the driver as we shipped it, you may also post that. If we agree that your information points to a bug in the driver that we ship, we will re-open this bug report.