Red Hat Bugzilla – Bug 57247
Redhat modifications may cause panics after patching cpqarray
Last modified: 2005-10-31 17:00:50 EST
OS: Fairfax re1108
Modifications to the DMA mapping routines made by Redhat in the cpqarray
driver may cause the kernel to panic after patching media based drivers.
The driver provided by Compaq uses the pci_map_single() & pci_unmap_single()
kernel DMA routines to map copy buffers. Redhat modified the driver code to
use pci_map_page() & pci_unmap_page() to map scatter-gather entries to copy
buffers. The reason for the changes is not documented by Redhat.
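For context on the two interfaces named above: pci_map_single() takes a kernel virtual address, while pci_map_page() takes a page plus an offset, so it can describe memory that has no permanent kernel mapping. The following is a toy user-space model in C, not kernel code. The toy_* names, the page size and the lowmem limit are all hypothetical and chosen only for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical constants for the toy model; real kernels differ. */
#define TOY_PAGE_SIZE    4096u
#define TOY_LOWMEM_PAGES 4u           /* pages with a permanent kernel mapping */
#define TOY_LOWMEM_VBASE 0xC0000000u  /* where lowmem is mapped in this model */

/* Stand-in for a page descriptor: identified by its page frame number. */
struct toy_page {
    uint64_t pfn;
};

/* Kernel virtual address of a page, or 0 if the page is "highmem"
 * and therefore has no permanent mapping in this model. */
static uint32_t toy_page_address(const struct toy_page *pg)
{
    if (pg->pfn >= TOY_LOWMEM_PAGES)
        return 0;
    return TOY_LOWMEM_VBASE + (uint32_t)pg->pfn * TOY_PAGE_SIZE;
}

/* Address-based mapping: needs a valid kernel virtual address,
 * so it can only ever describe lowmem. */
static uint64_t toy_map_single(uint32_t virt)
{
    return (uint64_t)(virt - TOY_LOWMEM_VBASE);
}

/* Page-based mapping: works for any page, mapped or not. */
static uint64_t toy_map_page(const struct toy_page *pg, uint32_t offset)
{
    return pg->pfn * TOY_PAGE_SIZE + offset;
}
```

In this model a lowmem page yields the same bus address through either function, while a page above the lowmem limit can only be described through toy_map_page(), because toy_page_address() has nothing to return for it.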
Changes were made to both cpqarray and cciss. The problem may not manifest
itself in the cciss driver since the Smart Array 5300 is a 64-bit DMA capable
controller. The cpqarray driver may not exhibit the failure with >2.5GB
"...may cause the kernel to panic after patching media based drivers"
So when you run patch to change "media based drivers" (whatever that
means, could you please explain?), the kernel panics? Somehow, I
doubt that was what you meant, yet I can't find another meaning for
The change you seem to be objecting to is the highmem IO patch that
you wanted us to include. Can you please confirm or deny? Would
you like us to remove highmem IO capability for compaq hardware,
thus damaging Oracle performance on Compaq hardware?
You do seem to say that cpqarray hardware is fundamentally incapable
of 64-bit DMA. Can you please confirm that?
Changing media based drivers means using the patch command to install a patch
file to upgrade the drivers that are included on your distribution media after
installation. This requires that the kernel & loadable modules be rebuilt. We do
this for either functional changes or to support new storage controllers.
If the system has 4GB of memory or more and is booting from the Integrated Array
controller, the kernel will panic or hang during boot.
The cciss driver as provided by Compaq supports 64-bit DMA.
We see no improvement in performance based upon the changes made by Redhat, but
have not tested Oracle. Can you provide performance data from both modified &
The Integrated Array Controller on the DL590/64 is not capable of 64-bit DMA.
The driver will not get passed an address that is outside the DMA mask it
registers with the PCI layer... the block device layer will/should take care of
bounce-copying into DMA range.
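The rule described above can be sketched as a small user-space model in C. It is only an illustration of the decision, assuming the device registers a mask giving the highest address it can reach. The function names and mask macros are hypothetical and are not taken from the kernel or from either driver.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical masks: the highest physical address a device can reach. */
#define DMA_MASK_32BIT 0x00000000FFFFFFFFULL
#define DMA_MASK_64BIT 0xFFFFFFFFFFFFFFFFULL

/* True if every byte of [phys, phys + len) lies at or below the mask. */
static bool dma_addr_reachable(uint64_t phys, uint64_t len, uint64_t dma_mask)
{
    if (len == 0)
        return true;

    uint64_t last = phys + (len - 1);
    if (last < phys)            /* address arithmetic wrapped around */
        return false;

    return last <= dma_mask;
}

/* True if an upper layer would have to copy the data through a buffer
 * inside the device's DMA range before handing it to the driver. */
static bool needs_bounce(uint64_t phys, uint64_t len, uint64_t dma_mask)
{
    return !dma_addr_reachable(phys, len, dma_mask);
}
```

Under these assumptions, a 4096-byte buffer at physical address 0x100000000 (the 4GB boundary) needs a bounce copy for a device with the 32-bit mask and none for a device with the 64-bit mask. The driver itself never sees the out-of-range address in either case.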
As for the performance data; that's a question to Compaq I assume as Compaq has
run those tests...
Let me make sure I understand this. You say that if you patch the driver
we ship, the PATCHED driver may panic or hang. Sounds to me like the bug
is in the patch, not the driver we ship. Please confirm my understanding.
First, older Smart Array hardware can ONLY do 32 bit DMA. There is NO WAY to
pass a 64 bit address to the firmware on these controllers...
The new controllers (cciss) CAN do 64 bit dma. And the driver was already set
up to do this... We have been doing 64bit dma under IA64 since version 2.4.6
of the driver came out...
The questions I have are:
1) Why do you feel it is a MUST for the driver to do a pci_map_page() instead
of the pci_map_single() it was already doing... Under IA64 pci_map_single gives
us 64bit DMA with the cciss driver.
2) I have seen several versions of the patch that hang on IA32 machines with
more than 4G of memory. Have you tested these changes on IA32?
3) Who did these changes, and why weren't we informed that they needed to be
done? We are trying to support these drivers, and it is very difficult if they
are being changed and we are not even told when and why.
The modification you are griping about is a part of the highmem i/o
patch that Compaq (as well as others) *demanded* that we include.
Compaq was notified that we were including the patch, and Compaq and
Red Hat both tested the patch on x86 hardware. The patch was widely
discussed in many contexts. Compaq was provided with early access to
the source code, and requested to test on more Compaq hardware than
Compaq has provided to Red Hat.
>First, older Smart Array hardware can ONLY do 32 bit DMA. There is NO WAY to
>pass a 64 bit address to the firmware on these controllers...
Arjan van de Ven had already written:
>The driver will not get passed an address that is outside the DMA mask it
>registers with the PCI layer... the block device layer will/should take care of
>bounce-copying into DMA range.
That means that your objection was answered before you raised it.
To the best of my knowledge, this bugzilla entry is being used to track
a bug in a Compaq patch to a Compaq driver, a bug that is not present in
the product Red Hat ships. I requested confirmation on this point, and
while you did not honor my request, to the best of my ability to read
between the lines in your response, I am correct in that understanding.
I am therefore closing this bugzilla report as NOTABUG; I don't mean to
imply by doing so that there is no bug in your patch, only that this
report contains no information on any bug in our product. You may
continue to provide further information in bugzilla relative to the bug
in your patch, including asking for help fixing the bug in your patch,
without re-opening the bug.
If you have concrete information pointing to a bug in the driver as we
shipped it, you may also post that. If we agree that your information
points to a bug in the driver that we ship, we will re-open this bug