Bug 219216 - [EMC/QLogic 5.1 bug] qla2xxx driver running IO on DM-MPIO devices cause "kernel: PCI-DMA: Out of SW-IOMMU space "
Status: CLOSED ERRATA
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: kernel-xen
Version: 5.0
Hardware: All Linux
Priority: urgent
Severity: urgent
Assigned To: Rik van Riel
Keywords: OtherQA
Duplicates: 219219
Depends On:
Blocks: 216989 217104 227613 252029
Reported: 2006-12-11 17:35 EST by Pan Haifeng
Modified: 2009-06-19 06:33 EDT
Fixed In Version: RHBA-2007-0959
Doc Type: Bug Fix
Last Closed: 2007-11-07 14:16:42 EST
Attachments:
lspci log (40.75 KB, text/plain), 2006-12-12 16:51 EST, Pan Haifeng
dmesg log after reboot (122.09 KB, application/octet-stream), 2006-12-18 17:33 EST, Pan Haifeng
syslog after reboot (1.32 MB, text/plain), 2006-12-18 17:35 EST, Pan Haifeng
dmesg output loading ZEN kernel (23.41 KB, text/plain), 2006-12-19 16:31 EST, Andrew Vasquez
quiet down the kernel (1.27 KB, text/x-patch), 2007-08-13 15:20 EDT, Rik van Riel

Description Pan Haifeng 2006-12-11 17:35:45 EST
Description of problem:
A RHEL5 x86_64 machine has 32 Clariion LUNs assigned to it, configured as
DM-MPIO devices. After running IO for about 6 minutes, syslog keeps
logging "kernel: PCI-DMA: Out of SW-IOMMU space" messages.

[root@l82bi220 current]# uname -a
Linux l82bi220.lss.emc.com 2.6.18-1.2747.el5xen #1 SMP Thu Nov 9 18:52:11 EST 
2006 x86_64 x86_64 x86_64 GNU/Linux

Comment 1 Rik van Riel 2006-12-11 17:46:14 EST
*** Bug 219219 has been marked as a duplicate of this bug. ***
Comment 2 Andrew Vasquez 2006-12-12 14:53:06 EST
Given that the messages file wasn't attached to the bugzilla, 
I take it the entries in question are something like:

   PCI-DMA: Out of SW-IOMMU space for 4608 bytes at device 0000:02:0b.0
   PCI-DMA: Out of SW-IOMMU space for 4608 bytes at device 0000:02:0b.0
   PCI-DMA: Out of SW-IOMMU space for 4608 bytes at device 0000:02:0b.0
   PCI-DMA: Out of SW-IOMMU space for 4608 bytes at device 0000:02:0b.0

If so, then as the message implies, the kernel has run out of
IOMMU entries for the scatter-gather lists associated with a given command.
qla2xxx will detect this via a failure during the mapping call
(pci_map_sg() or pci_map_single()) and will fail out accordingly with
the proper unmap() call and a SCSI_MLQUEUE_HOST_BUSY status returned
during queuecommand().

qla2xxx performs no internal command queuing -- if internal driver
resources are available (request-q entries), the command is immediately
submitted to the RISC.  So, all IOMMU entries mapped by the driver are
in use and will be freed upon command completion.
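
The back-off path described above can be sketched in userspace. This is only an illustration, not the qla2xxx code: struct mock_iommu, mock_map(), mock_unmap() and mock_queuecommand() are hypothetical stand-ins for the mapping pool, pci_map_sg()/pci_unmap_sg() and the driver's queuecommand entry point. The constant is redefined locally with the value the kernel headers use for SCSI_MLQUEUE_HOST_BUSY.

```c
#include <assert.h>
#include <stddef.h>

/* Local copy of the kernel's SCSI_MLQUEUE_HOST_BUSY value, redefined here
 * only so the sketch builds outside the kernel tree. */
#define SCSI_MLQUEUE_HOST_BUSY 0x1055

/* Hypothetical stand-in for the IOMMU / bounce-buffer pool. */
struct mock_iommu {
    size_t free_bytes;
};

/* Hypothetical stand-in for pci_map_sg(): 1 on success, 0 when out of space. */
int mock_map(struct mock_iommu *pool, size_t len)
{
    assert(pool != NULL);
    if (len > pool->free_bytes)
        return 0;
    pool->free_bytes -= len;
    return 1;
}

/* Hypothetical stand-in for the unmap done at command completion. */
void mock_unmap(struct mock_iommu *pool, size_t len)
{
    assert(pool != NULL);
    pool->free_bytes += len;
}

/* Sketch of the submit path: 0 means the command went to the hardware,
 * SCSI_MLQUEUE_HOST_BUSY asks the midlayer to retry later. */
int mock_queuecommand(struct mock_iommu *pool, size_t len)
{
    if (!mock_map(pool, len))
        return SCSI_MLQUEUE_HOST_BUSY;
    return 0;
}
```

With a pool of 65536 bytes, one 65536-byte command is accepted, the next one is bounced back as busy, and it goes through again once mock_unmap() returns the space. Nothing is lost; the I/O is only delayed.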

Are you seeing the same thing with a non-Xen kernel?
Comment 3 Pan Haifeng 2006-12-12 15:07:21 EST
You are right; there are more log entries like the following:

Dec 11 10:06:13 l82bi220 kernel: EXT3-fs: mounted filesystem with ordered data 
mode.
Dec 11 10:12:38 l82bi220 kernel: PCI-DMA: Out of SW-IOMMU space for 65536 bytes 
at device 0000:08:03.0
Dec 11 10:12:38 l82bi220 kernel: PCI-DMA: Out of SW-IOMMU space for 65536 bytes 
at device 0000:08:03.0
Dec 11 10:12:38 l82bi220 kernel: PCI-DMA: Out of SW-IOMMU space for 65536 bytes 
at device 0000:08:03.1
Dec 11 10:12:38 l82bi220 kernel: PCI-DMA: Out of SW-IOMMU space for 65536 bytes 
at device 0000:08:03.1
Dec 11 10:12:38 l82bi220 kernel: PCI-DMA: Out of SW-IOMMU space for 65536 bytes 
at device 0000:08:03.0

I did not see the same on a non-Xen kernel.
Comment 4 RHEL Product and Program Management 2006-12-12 15:20:34 EST
This request was evaluated by Red Hat Product Management for inclusion in a Red
Hat Enterprise Linux major release.  Product Management has requested further
review of this request by Red Hat Engineering, for potential inclusion in a Red
Hat Enterprise Linux Major release.  This request is not yet committed for
inclusion.
Comment 6 Andrew Vasquez 2006-12-12 16:44:01 EST
And I take it the PCI device at 0000:08:03.0/1 is the QLogic HBA?
Could you attach the output of 'lspci -vvv'?
Comment 7 Pan Haifeng 2006-12-12 16:51:37 EST
Created attachment 143455 [details]
lspci log
Comment 8 Rob Kenna 2006-12-13 09:33:27 EST
This needs attention.  Is this in the host or a guest?  How much physical memory
and how much assigned to the guest (if that's the case)?
Comment 9 Pan Haifeng 2006-12-13 09:38:00 EST
It is in a host OS not a guest OS. 
[root@l82bi220 ~]# cat /proc/meminfo
MemTotal:      4031596 kB
MemFree:       3239452 kB
Buffers:        250120 kB
Cached:         368280 kB
SwapCached:          0 kB
Active:         409028 kB
Inactive:       265288 kB
HighTotal:           0 kB
HighFree:            0 kB
LowTotal:      4031596 kB
LowFree:       3239452 kB
SwapTotal:     2031608 kB
SwapFree:      2031608 kB
Dirty:             224 kB
Writeback:          96 kB
AnonPages:       55780 kB
Mapped:          22628 kB
Slab:            79008 kB
PageTables:       6356 kB
NFS_Unstable:        0 kB
Bounce:              0 kB
CommitLimit:   4047404 kB
Committed_AS:   140676 kB
VmallocTotal: 34359738367 kB
VmallocUsed:      4852 kB
VmallocChunk: 34359732927 kB
HugePages_Total:     0
HugePages_Free:      0
HugePages_Rsvd:      0
Hugepagesize:     2048 kB
Comment 10 Rik van Riel 2006-12-18 15:38:15 EST
Looking at the code in detail, I'm no longer sure why exactly this is a blocker.
The qla2xxx driver simply throws IO at the swiotlb as fast as it can, and backs
off when the swiotlb is full.

I can think of two things that would affect system performance negatively:
1) the number of printks from lib/swiotlb.c flying past :)  (needs rate limiting?)
2) the fact that we're using the swiotlb at all, surely qla2xxx is capable of
addressing memory >4GB ?
Comment 11 Andrew Vasquez 2006-12-18 15:49:01 EST
Re: comment #10:

> 2) the fact that we're using the swiotlb at all, surely qla2xxx is capable 
>    of addressing memory >4GB ?

Yes, qla2xxx can DMA above 4GB.  The driver has employed the following logic
for some time:


static void
qla2x00_config_dma_addressing(scsi_qla_host_t *ha)
{       
        /* Assume a 32bit DMA mask. */
        ha->flags.enable_64bit_addressing = 0;
        
        if (!dma_set_mask(&ha->pdev->dev, DMA_64BIT_MASK)) {
                /* Any upper-dword bits set? */
                if (MSD(dma_get_required_mask(&ha->pdev->dev)) &&
                    !pci_set_consistent_dma_mask(ha->pdev, DMA_64BIT_MASK)) {
                        /* Ok, a 64bit DMA mask is applicable. */
                        ha->flags.enable_64bit_addressing = 1;
                        ha->isp_ops.calc_req_entries = qla2x00_calc_iocbs_64;
                        ha->isp_ops.build_iocbs = qla2x00_build_scsi_iocbs_64;
                        return; 
                }
        }

        dma_set_mask(&ha->pdev->dev, DMA_32BIT_MASK);
        pci_set_consistent_dma_mask(ha->pdev, DMA_32BIT_MASK);
}


According to the system's meminfo output:

   [root@l82bi220 ~]# cat /proc/meminfo
   MemTotal:      4031596 kB
   MemFree:       3239452 kB
   Buffers:        250120 kB

there is less than 4GB on the system.

Are you suggesting that dma_get_required_mask() is returning
no bits in the upper 32-bit dword on the Xen kernel, thus
causing the driver to fall back to a 32-bit DMA configuration?
Comment 12 Rik van Riel 2006-12-18 16:01:01 EST
Andrew, could you please attach the /var/log/dmesg from booting up the Xen
kernel on this system, so I can search it for any interesting messages?

The swiotlb setup and other parts of the kernel should have something to say
about what's going on here...
Comment 13 Pan Haifeng 2006-12-18 17:33:31 EST
Created attachment 143956 [details]
dmesg log after reboot
Comment 14 Pan Haifeng 2006-12-18 17:35:40 EST
Created attachment 143957 [details]
syslog after reboot
Comment 15 Andrew Vasquez 2006-12-18 17:53:33 EST
The logs contain no kernel initialization messages, instead
there are a slew of I/O failures due to some broken SCSI storage:

  ...
  sd 5:0:1:31: Device not ready: <6>: Current: sense key: Not Ready
    Additional sense: Logical unit not ready, manual intervention required
  end_request: I/O error, dev sddr, sector 0
  sd 5:0:1:31: Device not ready: <6>: Current: sense key: Not Ready
    Additional sense: Logical unit not ready, manual intervention required
  end_request: I/O error, dev sddr, sector 0
  sd 5:0:1:31: Device not ready: <6>: Current: sense key: Not Ready


We'll need the early boot messages (dmesg -s 100000 might help, if
the data was retrieved via 'dmesg' command).
Comment 16 Pan Haifeng 2006-12-18 17:57:48 EST
dmesg -s 100000 shows the same information. I cannot get the required
information using dmesg.
Comment 17 Andrew Vasquez 2006-12-18 18:07:31 EST
Since we are only interested in the kernel boot-messages, could you
disconnect the broken storage (all storage) and reboot the machine
so that the circular-buffer does not wrap.  'dmesg -s 100000' should
then at least be able to capture the relevant data.
Comment 19 Andrew Vasquez 2006-12-19 16:31:18 EST
Created attachment 144043 [details]
dmesg output loading ZEN kernel

From: sprah_alex@emc.com

Here is the log that Mike and I retrieved from Haifeng's system.
We have removed the FC cables from the HBA and rebooted the system.

-Alex
Comment 20 Andrew Vasquez 2006-12-19 16:39:54 EST
In taking a closer look at the driver messages as well, I can
verify that given the logic mentioned in comment #11, only a 32
bit mask is being set:

   qla2xxx 0000:08:03.0: 
    QLogic Fibre Channel HBA Driver: 8.01.07-k1
     QLogic QLA2462 - PCI-X 2.0 to 4Gb FC, Dual Channel 
     ISP2422: PCI-X Mode 1 (133 MHz) @ 0000:08:03.0 hdma-, host#=5, fw=4.00.23 [IP] 

Basically:

   'hdma-' equates to a 32bit DMA mask being set.
   'hdma+' equates to a 64bit DMA mask being set.

So either:

1) dma_set_mask(&ha->pdev->dev, DMA_64BIT_MASK) is failing
2) or NO upper-dword bits are set in dma_get_required_mask()
3) or pci_set_consistent_dma_mask(ha->pdev, DMA_64BIT_MASK) is failing
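
The three possibilities above can be made concrete with a small userspace model of the decision in qla2x00_config_dma_addressing() from comment #11. It is a sketch under assumptions: struct dma_probe and choose_dma_width() are hypothetical names, and the three fields stand in for the results of the real kernel calls (0 meaning success, as in the kernel).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum dma_width { DMA_WIDTH_32 = 32, DMA_WIDTH_64 = 64 };

/* Hypothetical record of what the three kernel calls returned. */
struct dma_probe {
    int set_mask_64_rc;        /* dma_set_mask(..., DMA_64BIT_MASK) */
    uint64_t required_mask;    /* dma_get_required_mask() */
    int set_consistent_64_rc;  /* pci_set_consistent_dma_mask(..., DMA_64BIT_MASK) */
};

/* 64-bit addressing is chosen only if all three conditions hold;
 * any single failure falls back to the 32-bit mask ("hdma-"). */
enum dma_width choose_dma_width(const struct dma_probe *p)
{
    assert(p != NULL);
    if (p->set_mask_64_rc == 0 &&
        (uint32_t)(p->required_mask >> 32) != 0 && /* any upper-dword bits? */
        p->set_consistent_64_rc == 0)
        return DMA_WIDTH_64;
    return DMA_WIDTH_32;
}
```

Feeding in a required mask with no bits above bit 31 yields the 32-bit result even when both mask-setting calls succeed, which is case 2 in the list.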
Comment 25 Andrius Benokraitis 2007-01-05 14:19:52 EST
QLogic/EMC: we are at the point where this won't make RHEL5 unless a really
low-risk patch is proposed... highly probable this will be deferred to 5.1.
Comment 26 Andrew Vasquez 2007-01-05 17:07:14 EST
Rik,

Given comment #20 and the kernel boot logs in comment #19, is there
anything that qla2xxx is doing wrong in setting its dma-mask?

I have a feeling dma_get_required_mask() is returning a mask that
has no upper-dword bits set.  Would a Xen kernel act in such a way
even if the machine has less than 4GB of memory (this one I believe
has 2GB)?
Comment 27 Rik van Riel 2007-01-05 17:21:12 EST
Not that I know.  Did you add any printks to the kernel on your test system to
figure out exactly what is happening?
Comment 28 Andrius Benokraitis 2007-01-09 10:53:36 EST
I think we are at the point where this needs to be deferred to 5.1... 
Comment 29 Wayne Berthiaume 2007-01-09 17:32:47 EST
Hi Andrius.

   We're able to easily reproduce this. As a result of today's conversation 
with QLogic and comment #20, QLogic will provide us with an instrumented driver 
that will, hopefully, answer all the questions and get to the bottom of this 
one. 

   My fear is this problem is in a very popular configuration that we would not 
be able to support with this issue at release time if we can't fix it soon. I 
am mindful we are running out of runway on this as well. 

Regards,
Wayne.
Comment 30 Andrius Benokraitis 2007-01-10 15:10:21 EST
Andrew @ QLogic: Any ideas in regard to Comment #27 from Rik?

Wayne, thanks for the update... Given that more work is needed on this, I think
we have run out of time in 5.0 on this one. Tom, your thoughts?
Comment 31 Andrius Benokraitis 2007-01-10 16:03:57 EST
Officially out of runway for 5.0. Deferred to 5.1.
Comment 32 Andrew Vasquez 2007-01-10 16:38:26 EST
I've sent EMC a debug driver which displays the 'required-mask' and
return codes for the dma-calls.  In testing locally with a snapshot6
Xen kernel on an HP x86_64 machine with 2GB, I get the following:

        QLogic Fibre Channel HBA Driver
        PCI: Enabling device 0000:1f:00.0 (0140 -> 0143)
        ACPI: PCI Interrupt 0000:1f:00.0[A] -> GSI 16 (level, low) -> IRQ 16
        qla2xxx 0000:1f:00.0: Found an ISP2432, irq 16, iobase 0xffffc20000020000
        *** qla2x00_config_dma_addressing: required_mask set to 000000007fffffff.
        *** qla2x00_config_dma_addressing: required_mask has no high-dword bits set.
        *** qla2x00_config_dma_addressing: set consistent 64bit mask returned 0.
        *** qla2x00_config_dma_addressing: defaulting to 32bit mask/consistent-mask.
        qla2xxx 0000:1f:00.0: Configuring PCI space...

Which tells me that a 32bit DMA mask is being set for dma_set_mask()
and pci_set_consistent_dma_mask() since dma_get_required_mask() is
returning 7fffffff, with no upper-dword bits set...

So again, I'm still a bit confused about Rik's initial comments on
'double-buffering', given qla2xxx doesn't set anything less than a 32bit
DMA mask...

I've asked EMC to retry their failure machine with snapshot6 as the
original results were apparently logged with snapshot2.

Comment 33 Andrew Vasquez 2007-01-10 16:42:51 EST
Clarification: not 'double buffering', but according to the
'required' DMA mask, a 32bit mask is sufficient...  So why
should the driver set a 64bit mask?
Comment 35 Rik van Riel 2007-06-01 11:17:32 EDT
Andrew, the value of 0x7fffffff is consistent with the way 
dma_get_required_mask() works.

        u32 low_totalram = ((max_pfn - 1) << PAGE_SHIFT);
        u32 high_totalram = ((max_pfn - 1) >> (32 - PAGE_SHIFT));

2GB of memory is 512k pages, which results in a low_totalram value of just under
2G (0x7ffff000) and a high_totalram value of 0.  After that we take this branch:

        if (!high_totalram) {
                /* convert to mask just covering totalram */
                low_totalram = (1 << (fls(low_totalram) - 1)); 
                low_totalram += low_totalram - 1;
                mask = low_totalram;

This ends up setting low_totalram (and mask) to one less than 2GB, to be
precise 0x7fffffff.

Are you saying that dma_get_required_mask() is doing the wrong thing?
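
The arithmetic in this comment can be checked with a userspace copy of the calculation. This is a sketch, with assumptions: required_mask() takes max_pfn as a parameter instead of reading the kernel global, fls32() is a local stand-in for the kernel's fls(), PAGE_SHIFT is fixed at 12, and the high-memory branch is not quoted in this thread (it is written from memory of the mainline function, so treat it as an assumption).

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SHIFT 12 /* 4 KiB pages */

/* Stand-in for the kernel's fls(): 1-based index of the highest set bit,
 * 0 when no bit is set. */
int fls32(uint32_t x)
{
    int pos = 0;
    while (x) {
        pos++;
        x >>= 1;
    }
    return pos;
}

/* Userspace copy of the mask calculation, with max_pfn passed in. */
uint64_t required_mask(uint64_t max_pfn)
{
    uint32_t low, high;

    assert(max_pfn > 1);
    low = (uint32_t)((max_pfn - 1) << PAGE_SHIFT);
    high = (uint32_t)((max_pfn - 1) >> (32 - PAGE_SHIFT));

    if (!high) {
        /* convert to mask just covering totalram */
        low = (uint32_t)1 << (fls32(low) - 1);
        low += low - 1;
        return low;
    }
    /* Assumed from memory: same rounding applied to the upper dword. */
    high = (uint32_t)1 << (fls32(high) - 1);
    high += high - 1;
    return ((uint64_t)high << 32) + 0xffffffffu;
}
```

For max_pfn = 0x80000 (2GB) the result is 0x7fffffff, matching the debug output in comment #32. Upper-dword bits only appear once the kernel sees more than 4GB of memory.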
Comment 36 Rik van Riel 2007-06-01 13:08:58 EDT
Andrew, since the qlogic driver seems to rely on the swiotlb running out of 
space to rate limit itself, would it be enough to simply put the printk from 
swiotlb_full() under a printk_ratelimit() or even disable it?
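
To illustrate what rate limiting the message would buy, here is a minimal userspace sketch. It is a simple fixed-window limiter, not the kernel's printk_ratelimit() implementation; struct ratelimit and ratelimit_ok() are hypothetical names, and the caller passes the current time in seconds so the behaviour is deterministic.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical fixed-window limiter: at most `burst` messages are let
 * through in each window of `interval` seconds. */
struct ratelimit {
    long interval;     /* window length in seconds */
    int burst;         /* messages allowed per window */
    long window_start; /* start time of the current window */
    int printed;       /* messages let through in this window */
    int suppressed;    /* messages dropped so far */
};

/* Returns 1 if the caller may emit its message at time `now`, else 0. */
int ratelimit_ok(struct ratelimit *rl, long now)
{
    assert(rl != NULL && rl->interval > 0 && rl->burst > 0);
    if (now - rl->window_start >= rl->interval) {
        /* A new window begins: reset the per-window budget. */
        rl->window_start = now;
        rl->printed = 0;
    }
    if (rl->printed < rl->burst) {
        rl->printed++;
        return 1;
    }
    rl->suppressed++;
    return 0;
}
```

With interval 5 and burst 10, a flood of 100 identical messages in the same second lets 10 through and drops 90, so the log still shows that the swiotlb filled up without being swamped.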
Comment 37 Andrew Vasquez 2007-06-01 13:16:55 EDT
Re: comment #36:  That sounds reasonable to me.
Comment 38 Rik van Riel 2007-06-01 13:58:43 EDT
Patch proposed upstream: http://lkml.org/lkml/2007/6/1/207

Depending on feedback either this patch or a slightly changed one will be 
submitted for RHEL 5.1.
Comment 40 Tom Coughlan 2007-06-01 16:47:54 EDT
Rik,

Why not use printk_ratelimit ?

devel_ack just the same, since a fix for this BZ is targeted for 5.1.

Tom
Comment 41 Rik van Riel 2007-06-01 16:54:47 EDT
Fair enough Tom,

I'll submit a patch with just a printk_ratelimit() for inclusion in the RHEL 
5.1 kernel.
Comment 42 RHEL Product and Program Management 2007-06-01 17:01:47 EDT
This request was evaluated by Red Hat Kernel Team for inclusion in a Red
Hat Enterprise Linux maintenance release, and has moved to bugzilla 
status POST.
Comment 43 Rik van Riel 2007-06-07 16:24:55 EDT
Argh.  Turns out Xen has this jewel in lib/Makefile:

swiotlb-$(CONFIG_XEN) := ../arch/i386/kernel/swiotlb.o

That means x86-64 Xen is actually using arch/i386/kernel/swiotlb.o, and we 
most likely are running into the qla2xxx driver calling swiotlb_map_single() 
with a to-DMA area that straddles a page boundary.

If there is an easy way to disable spanning page boundaries with a non-SG 
request in the qla2xxx driver, we will not have to bounce buffer the IO 
requests at all.  Is there a way to achieve this?
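
The page-boundary condition can be written down as a small check. This is only a sketch, assuming 4 KiB pages; straddles_page() is a hypothetical helper, not a function from the swiotlb code. The reason it matters under Xen is that pages which are adjacent from the dom0 kernel's point of view need not be adjacent in machine memory.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_SIZE_BYTES 4096u /* assumed page size */

/* Returns 1 if the buffer [addr, addr + len) touches more than one page. */
int straddles_page(uint64_t addr, size_t len)
{
    uint64_t first_page, last_page;

    assert(len > 0);
    first_page = addr / PAGE_SIZE_BYTES;
    last_page = (addr + len - 1) / PAGE_SIZE_BYTES;
    return first_page != last_page;
}
```

A 4608-byte or 65536-byte request, the sizes seen in the logs, is larger than one page and therefore always crosses a boundary, whatever its starting offset.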
Comment 44 Rik van Riel 2007-06-07 16:59:24 EDT
A related problem: dma_get_required_mask() is wrong if the Xen kernel is 
booted on a large system with a dom0 smaller than the maximum machine size.

For example, think of a 16GB system which is booted with dom0_mem=2G.  The 
dom0 kernel will think that it only has 2GB and will set a 32 bit DMA mask, 
even though the system has way more memory than that.

Oops.
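
A rough sketch of why that matters: with a 32 bit mask, any buffer whose machine address lies above 4GB has to be bounced. needs_bounce() below is a hypothetical helper for illustration, and the example addresses assume that dom0's pages may be backed by machine frames anywhere in the 16GB.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Returns 1 if the last byte of the buffer is not addressable with
 * `dma_mask`, i.e. the device cannot reach it directly. */
int needs_bounce(uint64_t machine_addr, size_t len, uint64_t dma_mask)
{
    assert(len > 0);
    return ((machine_addr + len - 1) & ~dma_mask) != 0;
}
```

A page at machine address 12GB needs bouncing under the mask 0xffffffff but not under a full 64 bit mask, while a page at 1GB is reachable either way.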
Comment 45 Andrius Benokraitis 2007-06-15 11:25:32 EDT
Rik - are you waiting on QLogic for comments on this?
Comment 49 Andrew Vasquez 2007-06-20 17:02:28 EDT
According to Rik, the requests coming down from the block-layer are
bordering a page-boundary, and the DMA mapping in turn is proceeding down
a path which requires (incorrectly) the use of a bounce-buffer to
manage the exchange.  These semantics are all above the low-level
driver (qla2xxx); the driver in this case simply registers its
supported DMA mask and relies on the upper-layers to efficiently manage
the DMA pools.  There's nothing more qla2xxx can do to address DMA
mappings.
Comment 51 Don Dutile 2007-07-13 16:38:33 EDT
Could someone @Qlogic try the following patch out to see if it corrects the
problem? (sorry, don't have hw to test with).


http://lists.xensource.com/archives/html/xen-changelog/2007-07/msg00093.html
Comment 52 Chip Coldwell 2007-07-17 11:15:32 EDT
(In reply to comment #9)
> It is in a host OS not a guest OS. 
> [root@l82bi220 ~]# cat /proc/meminfo
> MemTotal:      4031596 kB

> LowTotal:      4031596 kB

Does it strike anybody else as odd that we are using bounce buffers when all of
memory is low memory?

Chip

Comment 53 Chip Coldwell 2007-07-17 11:33:59 EDT
(In reply to comment #43)
> Argh.  Turns out Xen has this jewel in lib/Makefile:
> 
> swiotlb-$(CONFIG_XEN) := ../arch/i386/kernel/swiotlb.o
> 
> That means x86-64 Xen is actually using arch/i386/kernel/swiotlb.o, and we 
> most likely are running into the qla2xxx driver calling swiotlb_map_single() 
> with a to-DMA area that straddles a page boundary.

It can't be swiotlb_map_single, because that function will panic after emitting
the message (it calls swiotlb_full(hwdev, size, dir, 1), and that last argument
set to 1 means it will panic).  It must be swiotlb_map_sg that gets called.

> If there is an easy way to disable spanning page boundaries with a non-SG 
> request in the qla2xxx driver, we will not have to bounce buffer the IO 
> requests at all.  Is there a way to achieve this?

I think this is wrong.  It is definitely an SG request that is generating the
messages.

Chip

Comment 57 Andrius Benokraitis 2007-07-25 21:28:09 EDT
Note to EMC (Wayne/Haifeng): can you please test the patch in Comment #51 ASAP?
Does this solve the issue?
Comment 58 Pan Haifeng 2007-08-09 15:01:11 EDT
I did the following to apply the patch, compile the kernel, and make a new
initrd image. From observation, the patch did not fix the issue.

# wget kernel-2.6.18-8.el5.src.rpm
# rpm -ivh kernel-2.6.18-8.el5.src.rpm
# cd /usr/src/redhat/SPECS
# rpmbuild -bp kernel-2.6.spec <--- this should apply the Xen patches
# cd ../BUILD/kernel-2.6.18/linux-2.6.18
# patch -p1 < /extra/xenPatch.diff
# make mrproper
# cp configs/kernel-2.6.18-i686-xen.config .config
# make
# make modules_install && make install 
# mv /boot/initrd-2.6.18-8.el5xen.img /boot/initrd-2.6.18-8.el5xen.img.bak
# mkinitrd -v /boot/initrd-2.6.18-8.el5xen.img 2.6.18-8.el5xen
# vi /boot/grub/grub.conf
# shutdown -r now

After the server booted up, I ran IO on the patched kernel and still got the error:
Aug  9 14:45:45 l82bi220 last message repeated 3 times
Aug  9 14:45:45 l82bi220 kernel: PCI-DMA: Out of SW-IOMMU space for 65536 bytes 
at device 0000:08:03.1
Aug  9 14:45:45 l82bi220 kernel: PCI-DMA: Out of SW-IOMMU space for 65536 bytes 
at device 0000:08:03.0
Aug  9 14:45:45 l82bi220 kernel: PCI-DMA: Out of SW-IOMMU space for 65536 bytes 
at device 0000:08:03.0
Aug  9 14:45:45 l82bi220 kernel: PCI-DMA: Out of SW-IOMMU space for 65536 bytes 
at device 0000:08:03.1
Aug  9 14:45:45 l82bi220 kernel: PCI-DMA: Out of SW-IOMMU space for 65536 bytes 
at device 0000:08:03.1
Aug  9 14:45:45 l82bi220 kernel: PCI-DMA: Out of SW-IOMMU space for 65536 bytes 
at device 0000:08:03.0
Aug  9 14:45:45 l82bi220 last message repeated 2 times
Aug  9 14:45:58 l82bi220 kernel: PCI-DMA: Out of SW-IOMMU space for 16384 bytes 
at device 0000:08:03.0



Comment 59 Rik van Riel 2007-08-13 11:25:58 EDT
OK, so the qla2xxx driver stuffs the SW-IOMMU full of sg requests with elements
larger than page size :(

I am not sure what we can do here, except maybe quiet down the printk...
Comment 60 Andrius Benokraitis 2007-08-13 12:10:33 EDT
Marcus/Andrew - thoughts here?
Comment 61 Marcus Barrow 2007-08-13 13:16:38 EDT
I am going to meet with Haifeng to learn how he reproduces this, and try to do
the same at Red Hat.

It seems to me our handling in this area is standard.

If we can reproduce this at Red Hat, it will be easier for
everyone to look at.
Comment 62 Andrew Vasquez 2007-08-13 13:23:17 EDT
Just for clarification here (not necessarily trying to beat this
to death), but a SCSI LLD (low-level driver) is simply a transparent
consumer of SG entries prepared and mapped by the upper-layers.  qla2xxx
manipulates neither the sizes nor the counts of SG entries.  Again, I'm not
entirely clear that an LLD can 'do' something about this: if a request's SG
list can't be mapped by the upper-layers, the I/O is simply flagged for
retry.
Comment 63 Rik van Riel 2007-08-13 15:20:58 EDT
Created attachment 161206 [details]
quiet down the kernel

The qla2xxx driver seems to intentionally fill up the swiotlb (with requests
that don't fit in a page, so they need to be bounce buffered under Xen). 
Unless the system has another driver that panics when the swiotlb is full,
there should be no bad side effects.

This trivial patch quiets down the kernel.  If something panics, we will still
have enough error messages to figure out what went wrong.  This patch should
not introduce any regressions and has been proposed for inclusion in RHEL 5.1.
Comment 65 Andrius Benokraitis 2007-08-14 11:31:43 EDT
This bug is for creating a workaround for the printk floods in RHEL 5.1, and bug
252029 has been created which is for a longer-term solution slated for RHEL 5.2.
Comment 66 Don Zickus 2007-08-15 15:07:05 EDT
in 2.6.18-40.el5
You can download this test kernel from http://people.redhat.com/dzickus/el5
Comment 68 Pan Haifeng 2007-08-20 14:19:11 EDT
Downloaded and installed kernel-2.6.18-40.el5xen. Ran IO on the new test kernel
for 4 hours with no error messages.

[root@l82bi220 current]# uname -a
Linux l82bi220.lss.emc.com 2.6.18-40.el5xen #1 SMP Tue Aug 14 18:12:49 EDT 2007 
x86_64 x86_64 x86_64 GNU/Linux
Comment 70 errata-xmlrpc 2007-11-07 14:16:42 EST
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2007-0959.html
