Bug 690313 - cxgb4i: iommu_alloc is failing (RHEL 6.1)
Summary: cxgb4i: iommu_alloc is failing (RHEL 6.1)
Keywords:
Status: CLOSED DUPLICATE of bug 647006
Alias: None
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: kernel
Version: 6.1
Hardware: ppc64
OS: All
Priority: urgent
Severity: urgent
Target Milestone: rc
Target Release: 6.1
Assignee: Steve Best
QA Contact: Red Hat Kernel QE team
URL:
Whiteboard:
Depends On:
Blocks: 687961
 
Reported: 2011-03-23 20:51 UTC by IBM Bug Proxy
Modified: 2011-04-07 14:27 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2011-04-07 14:27:47 UTC
Target Upstream Version:




Links
System | ID | Private | Priority | Status | Summary | Last Updated
IBM Linux Technology Center | 70722 | 0 | None | None | None | Never

Description IBM Bug Proxy 2011-03-23 20:51:36 UTC
=Comment: #0=================================================
Breno Henrique Leitao <leitao.ibm.com> - 
Under dbench traffic, cxgb4i crashes due to an iommu_alloc() failure.
There appears to be a memory leak.
The following error is logged repeatedly:

iommu_alloc failed, tbl c00000007ec0b700 vaddr c000000042d0c000 npages 2
iommu_alloc failed, tbl c00000007ec0b700 vaddr c000000042d0c000 npages 2
iommu_alloc failed, tbl c00000007ec0b700 vaddr c000000042d0c000 npages 2
iommu_alloc failed, tbl c00000007ec0b700 vaddr c000000042d0c000 npages 2
iommu_alloc failed, tbl c00000007ec0b700 vaddr c000000042d0c000 npages 2
iommu_alloc failed, tbl c00000007ec0b700 vaddr c000000042d0c000 npages 2

 
Contact Information = brenohl.com 
 
---uname output---
Red hat 6.1 kernel (alpha 1)
 
Machine Type = pSeries 
 
---Debugger---
A debugger is not configured
 
---Kernel - IO Component Data--- 
Stack trace output:
 no
 
Oops output:
 no
 
System Dump Info:
  The system is not configured to capture a system dump.
 
*Additional Instructions for brenohl.com: 
-Attach sysctl -a output to the bug.
=Comment: #2=================================================
Timothy P. Noonan <tpnoonan.com> - 
Hi Red Hat, once the RHBZ is opened, can you let Chelsio see the RHBZ as well? Thanks
=Comment: #4=================================================
JASON J. SPIETH <spieth.com> - 
1.Server architecture(s) (please list all affected) (x86/POWER6/Z/etc.):
Power

2.Server type (9117-MMA/HS20/s390/etc.): 
n/a

3.General component (desktop/kernel/base OS/dev tools/etc.):
kernel

4.Other components involved (ixgbe/java/emulex/etc.):
no

5.Does the server have the latest GA firmware? 
yes

6.Has the problem been shown to occur on more than one system? 
yes

7.Is a tested patch available? 
no

If yes to the above, has it been approved upstream? 
n/a

8.What is the latest official Red Hat build on which this bug has been
seen?
RHEL 6.1 Alpha

Comment 2 IBM Bug Proxy 2011-03-24 15:40:41 UTC
------- Comment From leitao.ibm.com 2011-03-24 11:35 EDT-------
I also tested on an upstream kernel, and the problem happens there as well.
I also tested using open-iscsi (iscsi_tcp), and a similar problem happens. Thus, it could be a cxgb4 NIC issue rather than a cxgb4i iSCSI issue.

Comment 3 IBM Bug Proxy 2011-03-29 22:05:08 UTC
------- Comment From tpnoonan.com 2011-03-29 17:51 EDT-------
please consider as exception

Comment 4 Neil Horman 2011-03-29 23:33:39 UTC
This is likely due to missing support for CONFIG_NEED_DMA_MAP_STATE, which chelsio has provided a workaround for.  I'll be updating cxgb4 as soon as testing is done.  There are test kernel links available in bz 647006 comment 69
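
As a rough illustration of why a leak of DMA mappings would show up as repeated iommu_alloc failures, here is a toy model in Python. It is not the cxgb4 code and not the kernel's IOMMU allocator: the names `ToyIommuTable` and `run_transfers` are hypothetical, and the sketch assumes the failure mode is simply mappings that are created but never released.

```python
class ToyIommuTable:
    """Fixed-size pool of DMA mapping slots (hypothetical model, not the kernel's table)."""

    def __init__(self, total_pages):
        self.total_pages = total_pages
        self.used_pages = 0

    def alloc(self, npages):
        """Reserve npages slots; return a handle, or None when the table is full."""
        if self.used_pages + npages > self.total_pages:
            return None  # analogous to the "iommu_alloc failed" message
        self.used_pages += npages
        return npages  # in this toy model the handle is just the mapping size

    def free(self, handle):
        """Release the slots reserved by an earlier alloc."""
        self.used_pages -= handle


def run_transfers(table, count, npages, unmap):
    """Map one buffer per transfer; return how many mappings failed."""
    failures = 0
    for _ in range(count):
        handle = table.alloc(npages)
        if handle is None:
            failures += 1
            continue
        if unmap:
            table.free(handle)  # skipping this step leaks the slots
    return failures
```

With an 8-page table and 2-page mappings, `run_transfers(ToyIommuTable(8), 10, 2, unmap=False)` lets four transfers succeed and fails the remaining six, while the same run with `unmap=True` has no failures.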

Comment 5 IBM Bug Proxy 2011-04-01 19:31:06 UTC
------- Comment From leitao.ibm.com 2011-04-01 15:28 EDT-------
I just tried the latest kernel (-124.test) and the problem is still reproducible.

Comment 6 Neil Horman 2011-04-01 20:02:45 UTC
Hm, this is still occurring upstream as well, I take it? (just saw what you said in comment 2).  Which upstream kernel version are you using?

Comment 7 kxie 2011-04-01 23:48:39 UTC
FYI, we got the -124 kernel source bits from IBM, tested them on a P7 system, and are not seeing this issue. So far, ~200G of data has been transferred.

I did see the error when the cxgb4 patch regarding CONFIG_NEED_DMA_MAP_STATE was not included.

Comment 8 IBM Bug Proxy 2011-04-04 12:11:18 UTC
------- Comment From leitao.ibm.com 2011-04-04 08:06 EDT-------
> Hm, this is still occurring upstream as well, I take it? (just saw what you said
> in comment 2).  Which upstream kernel version are you using?
Well, I tested it on 2.6.37 and I saw the problem.

Comment 9 IBM Bug Proxy 2011-04-06 17:41:52 UTC
------- Comment From tpnoonan.com 2011-04-06 13:31 EDT-------
Hi Chelsio/Breno, it's unclear from the last few comments whether there is a fix. Is there? Thanks

Comment 10 IBM Bug Proxy 2011-04-06 20:23:50 UTC
------- Comment From leitao.ibm.com 2011-04-06 16:13 EDT-------
> Hi Chelsio/Breno, it's unclear from the last few comments whether there is a fix.
> Is there? Thanks

Hi,

Yes, we finally see this problem disappear in the last (-128) kernel.

Thanks
Breno

Comment 11 John Jarvis 2011-04-07 14:13:02 UTC
So this is fixed with the latest snapshot, correct?  Can we close this bug?

Comment 12 Steve Best 2011-04-07 14:27:47 UTC
Neil's changes done in bz 647006 have fixed this.

*** This bug has been marked as a duplicate of bug 647006 ***

