This bug continues the work from bug 219216, which implements a short-term printk limit workaround in RHEL 5.1. This bug is for the longer-term fix slated for RHEL 5.2.
A recap:

Bug #219216 Comment #62 From Andrew Vasquez
"...a SCSI LLD (low-level driver) is simply a transparent consumer of SG entries prepared and mapped by the upper-layers. qla2xxx doesn't manipulate sizes nor counts of SG entries. Again, I'm not entirely clear a LLD can 'do' something about this, if a request's SG list can't be mapped by the upper-layers, the I/O is simply flagged for retry."

Bug #219216 Comment #63 From Rik van Riel
"The qla2xxx driver seems to intentionally fill up the swiotlb (with requests that don't fit in a page, so they need to be bounce buffered under Xen)..."

And, when this was discussed upstream: http://lkml.org/lkml/2007/6/2/82

------quote-----
Subject: Re: [PATCH] quiet down swiotlb warnings

On Sat, Jun 02, 2007 at 06:21:46PM +0300, Muli Ben-Yehuda wrote:
> On Fri, Jun 01, 2007 at 10:26:01PM +0200, Andi Kleen wrote:
>
> > Normally swiotlb doesn't even try to bounce when dma mask is <=
> > end_pfn so something must be very wrong in your kernel. It
> > definitely isn't a mainline kernel. If this happens in Xen then Xen
> > just needs fixing -- it should not try to bounce when the normal
> > kernel wouldn't.
>
> Xen needs to bounce when the requested buffer is not contiguous in
> machine memory (and indeed uses swiotlb for that).

Then it should just restrict the sg list merging at the block layer
to never merge into anything larger than a page. Then this cannot
happen or only very rarely.
------end quote-----

So, if I understand correctly, the fix needs to be in the Xen kernel.
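
To make the "larger than a page" point concrete, here is a minimal userspace sketch of the check being discussed. It is an illustration only, not the Xen kernel code: the helper name `crosses_page_boundary` and the `MODEL_PAGE_*` constants are hypothetical, and a 4 KiB page size is assumed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: 4 KiB pages. */
#define MODEL_PAGE_SHIFT 12
#define MODEL_PAGE_SIZE  ((uint64_t)1 << MODEL_PAGE_SHIFT)

/*
 * Hypothetical helper: returns 1 if the byte range [addr, addr + len)
 * touches more than one page, 0 otherwise. Under Xen, pages that are
 * adjacent in the guest's view need not be adjacent in machine memory,
 * so a range like this may have to be bounce buffered.
 */
static int crosses_page_boundary(uint64_t addr, size_t len)
{
    assert(len > 0);
    uint64_t first_page = addr >> MODEL_PAGE_SHIFT;
    uint64_t last_page  = (addr + len - 1) >> MODEL_PAGE_SHIFT;
    return first_page != last_page;
}
```

Any SG entry longer than one page crosses a boundary no matter how it is aligned, which is why capping merged segments at page size, as suggested in the quoted reply, would avoid most of these bounces.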
After confirming with QLogic: they have no patch, since they believe the problem is not in the QLogic driver.
PCI-DMA: Out of SW-IOMMU space for 4608 bytes at device 0000:02:0b.0

and

PCI-DMA: Out of SW-IOMMU space for 65536 bytes at device 0000:08:03.1

Those are all sector-aligned, large IOs. Looks like we're passing SG lists through the swiotlb.

Now, Xen's swiotlb_map_sg simply does not use the swiotlb unless this test succeeds:

    if (swiotlb_force || address_needs_mapping(hwdev, dev_addr)) {

So: is "swiotlb=force" being used on the kernel options line in the case which is breaking?

If not, then we need to work out why the SG list mapping is entering swiotlb. "address_needs_mapping" should not be returning true for a 64-bit-addressing-capable adapter, so we'll probably need to run the reproducer on an instrumented kernel to go any further.
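
For anyone following along, the decision described above can be modelled in userspace roughly as follows. This is a sketch under assumptions, not the RHEL or Xen source: `struct model_device`, `model_address_needs_mapping` and `model_uses_swiotlb` are hypothetical names, and the mask comparison is the usual "does the address have bits outside the device's DMA mask" test.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the device; only its DMA mask matters here. */
struct model_device {
    uint64_t dma_mask;
};

/* Returns 1 if bus_addr has bits set that the device cannot address. */
static int model_address_needs_mapping(const struct model_device *dev,
                                       uint64_t bus_addr)
{
    assert(dev != NULL);
    return (bus_addr & ~dev->dma_mask) != 0;
}

/*
 * Models the quoted test: bounce through the swiotlb if it is forced on
 * the command line, or if the device cannot reach the address directly.
 */
static int model_uses_swiotlb(int force, const struct model_device *dev,
                              uint64_t bus_addr)
{
    return force || model_address_needs_mapping(dev, bus_addr);
}
```

With a full 64-bit mask the second condition can never be true, so in this model the only ways into the swiotlb are the force flag or a mask narrower than the address being mapped. That is why the DMA mask the driver ends up with matters here.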
We've already cycled through this in some of the earlier bugzilla comments. Comment #32 (https://bugzilla.redhat.com/show_bug.cgi?id=219216#c32) has details on what dma_get_required_mask() is returning while run with this kernel (it's a 32-bit mask). Comment #44 (https://bugzilla.redhat.com/show_bug.cgi?id=219216#c44) is Rik's note on how dma_get_required_mask() is implemented incorrectly in a Xen kernel. I believe EMC was able to easily reproduce this in their labs.
Comment #44 was:

> A related problem: dma_get_required_mask() is wrong if the Xen kernel is
> booted on a large system with a dom0 smaller than the maximum machine size.

But there's no information I can see indicating that that is the case in this particular instance. And we are _still_ missing any kernel boot logs indicating early boot configuration: the only logs posted have been after significant uptime, and contained little except for storage error messages.

Complete boot logs really are going to be helpful here, along with the kernel and hypervisor options being used. If "dmesg" no longer has them, you may need to reboot to obtain them.
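
As a rough illustration of the scenario in comment #44, the sketch below derives a "required" mask from the highest page frame a kernel believes it has. It is a simplified model, not the kernel's dma_get_required_mask(): the function name `model_required_mask`, the 4 KiB page size, and the memory sizes mentioned afterwards are all assumptions made for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: 4 KiB pages. */
#define REQ_PAGE_SHIFT 12
#define REQ_PAGE_SIZE  ((uint64_t)1 << REQ_PAGE_SHIFT)

/*
 * Hypothetical model: smallest mask of the form 2^n - 1 that covers every
 * byte address below max_pfn page frames.
 */
static uint64_t model_required_mask(uint64_t max_pfn)
{
    assert(max_pfn > 0);
    /* Highest byte address within the last page frame. */
    uint64_t mask = ((max_pfn - 1) << REQ_PAGE_SHIFT) | (REQ_PAGE_SIZE - 1);
    /* Smear the top set bit downwards so all lower bits are set. */
    mask |= mask >> 1;
    mask |= mask >> 2;
    mask |= mask >> 4;
    mask |= mask >> 8;
    mask |= mask >> 16;
    mask |= mask >> 32;
    return mask;
}
```

In this model a dom0 that sees 2 GiB (0x80000 page frames) gets a mask that fits in 32 bits, while a 16 GiB machine (0x400000 page frames) needs a 34-bit mask. If the mask is computed from dom0's own size but its pages are backed by machine frames above 4 GiB, a driver that trusts the result could select a mask that is too narrow.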
Added pan_haifeng to the cc: list. Not sure he can read this BZ...
Wayne/Pan - have you been able to reproduce this issue? I think your group is the only one that can nail this down. If not, I say we close this...
The issue cannot be reproduced now. We agree to close this bug and will reopen it if we hit the problem again.