Bug 276111 - [RHEL5 RT][OPENIB] IPoIB crashes ib_mthca module on MT25204 HCAs
Status: CLOSED DUPLICATE of bug 251934
Product: Red Hat Enterprise MRG
Classification: Red Hat
Component: realtime-kernel
Hardware: All
OS: All
Priority: medium
Severity: low
Assigned To: Doug Ledford
Reported: 2007-09-04 09:24 EDT by Gurhan Ozen
Modified: 2008-02-27 14:56 EST
Doc Type: Bug Fix
Last Closed: 2007-10-08 10:35:30 EDT

Attachments: None
Description Gurhan Ozen 2007-09-04 09:24:56 EDT
Description of problem:
  While testing IPoIB with the iperf program, one of the HCAs became unusable
and ib_mthca crashed, although it logged very little about the failure.

  After the iperf run, ibstat hung and then gave this error:

ibstat: ibpanic: [4235] main: stat of IB device 'mthca0' failed: (Device or
resource busy) 
  An attempt to reload ib_mthca to remedy the problem failed as well: rmmod
ib_mthca hung and the kernel logged this error:

  kernel: ib_mthca 0000:0c:00.0: HW2SW_MPT failed (-16)
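
  The two errors above appear to report the same condition. Assuming the -16
in the HW2SW_MPT message is a negated Linux errno value, as is conventional for
kernel return codes, it corresponds to EBUSY, whose message text on Linux is the
"Device or resource busy" string printed by ibstat. A minimal Python sketch of
that lookup follows; decode_kernel_status is a hypothetical helper name, not
part of any InfiniBand tool, and the regular expression assumes only the
"failed (-N)" shape seen in this log line.

```python
import errno
import os
import re


def decode_kernel_status(log_line):
    """Return (errno_name, message) for a 'failed (-N)' log line, or None.

    Assumption: the number in parentheses is a negated errno value.
    """
    match = re.search(r"failed \((-?\d+)\)", log_line)
    if match is None:
        return None
    code = abs(int(match.group(1)))
    # errno.errorcode maps numeric codes to symbolic names such as 'EBUSY'.
    name = errno.errorcode.get(code, "UNKNOWN")
    # os.strerror returns the platform's message text for the code.
    return name, os.strerror(code)


line = "kernel: ib_mthca 0000:0c:00.0: HW2SW_MPT failed (-16)"
print(decode_kernel_status(line))
```

  The message text returned by os.strerror is platform dependent, so only the
symbolic name should be relied on when comparing logs from different systems.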

Version-Release number of selected component (if applicable):
[root@dell-pe1950-02 ~]# uname -a
Linux dell-pe1950-02.rhts.boston.redhat.com 2.6.21-37.el5rt #1 SMP PREEMPT RT
Thu Aug 30 16:05:41 EDT 2007 x86_64 x86_64 x86_64 GNU/Linux
[root@dell-pe1950-02 ~]# modinfo ib_mthca
version:        0.08
license:        Dual BSD/GPL
description:    Mellanox InfiniBand HCA low-level driver
author:         Roland Dreier
srcversion:     DBCBEAE0F96BE105E037FE4
alias:          pci:v00001867d00005E8Csv*sd*bc*sc*i*
alias:          pci:v000015B3d00005E8Csv*sd*bc*sc*i*
alias:          pci:v00001867d00006274sv*sd*bc*sc*i*
alias:          pci:v000015B3d00006274sv*sd*bc*sc*i*
alias:          pci:v00001867d00006282sv*sd*bc*sc*i*
alias:          pci:v000015B3d00006282sv*sd*bc*sc*i*
alias:          pci:v00001867d00006278sv*sd*bc*sc*i*
alias:          pci:v000015B3d00006278sv*sd*bc*sc*i*
alias:          pci:v00001867d00005A44sv*sd*bc*sc*i*
alias:          pci:v000015B3d00005A44sv*sd*bc*sc*i*
depends:        ib_mad,ib_core
vermagic:       2.6.21-37.el5rt SMP preempt mod_unload 
parm:           catas_reset_disable:disable reset on catastrophic event if
nonzero (int)
parm:           qos_support:Enable QoS support if > 0 (int)
parm:           fw_cmd_doorbell:post FW commands through doorbell page if
nonzero (and supported by FW) (int)
parm:           debug_level:Enable debug tracing if > 0 (int)
parm:           msi_x:attempt to use MSI-X if nonzero (int)
parm:           msi:attempt to use MSI if nonzero (int)
parm:           tune_pci:increase PCI burst from the default set by BIOS if
nonzero (int)
parm:           num_qp:maximum number of QPs per HCA (int)
parm:           rdb_per_qp:number of RDB buffers per QP (int)
parm:           num_cq:maximum number of CQs per HCA (int)
parm:           num_mcg:maximum number of multicast groups per HCA (int)
parm:           num_mpt:maximum number of memory protection table entries per
HCA (int)
parm:           num_mtt:maximum number of memory translation table segments per
HCA (int)
parm:           num_udav:maximum number of UD address vectors per HCA (int)
parm:           fmr_reserved_mtts:number of memory translation table segments
reserved for FMR (int)

How reproducible:

Steps to Reproduce:
1. Run a network performance program (iperf in this case) over the ib0
   interface.

Additional info:
  So far this has only happened with MT25204 cards. MT25208 cards don't seem to
have this issue. I have yet to test with QLogic cards and will post those
results as well.
Comment 1 Gurhan Ozen 2007-10-08 10:35:30 EDT
Marking this as a duplicate of BZ #251934. If there are separate release notes
for the RT kernel, then the release note on #251934 should be included for RT
as well.

*** This bug has been marked as a duplicate of 251934 ***
