Bug 279861 - OpenMPI issues when both nodes are Qlogic PCIe cards
Summary: OpenMPI issues when both nodes are Qlogic PCIe cards
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: openmpi
Version: 5.1
Hardware: All
OS: Linux
Priority: high
Severity: high
Target Milestone: ---
Assignee: Doug Ledford
QA Contact:
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2007-09-05 23:54 UTC by Gurhan Ozen
Modified: 2013-11-04 01:33 UTC
CC List: 1 user

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2007-09-21 17:53:48 UTC
Target Upstream Version:
Embargoed:



Description Gurhan Ozen 2007-09-05 23:54:29 UTC
Description of problem:
When running the mpitests-IMB_MPI1 test suite from the mpitests package over 2
nodes, where both nodes have InfiniPath_QLE7140 HCAs, the program crashes with
the following backtrace:
[dell-pe1950-03.rhts.boston.redhat.com][0,1,0][btl_openib_endpoint.c:213:mca_btl_openib_endpoint_post_send]
error posting send request errno says Invalid argument

[0,1,0][btl_openib_component.c:1332:btl_openib_component_progress] from
dell-pe1950-03.rhts.boston.redhat.com to: dell-pe1950-02.rhts.boston.redhat.com
error polling LP CQ with status RETRY EXCEEDED ERROR status number 12 for wr_id
173782656 opcode 1
--------------------------------------------------------------------------
The InfiniBand retry count between two MPI processes has been
exceeded.  "Retry count" is defined in the InfiniBand spec 1.2
(section 12.7.38):

    The total number of times that the sender wishes the receiver to
    retry timeout, packet sequence, etc. errors before posting a
    completion error.

This error typically means that there is something awry within the
InfiniBand fabric itself.  You should note the hosts on which this
error has occurred; it has been observed that rebooting or removing a
particular host from the job can sometimes resolve this issue.  

Two MCA parameters can be used to control Open MPI's behavior with
respect to the retry count:

* btl_openib_ib_retry_count - The number of times the sender will
  attempt to retry (defaulted to 7, the maximum value).

* btl_openib_ib_timeout - The local ACK timeout parameter (defaulted
  to 10).  The actual timeout value used is calculated as:

     4.096 microseconds * (2^btl_openib_ib_timeout)

  See the InfiniBand spec 1.2 (section 12.7.34) for more details.
--------------------------------------------------------------------------
mpirun noticed that job rank 1 with PID 4844 on node
dell-pe1950-02.rhts.boston.redhat.com exited on signal 15 (Terminated). 
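
For reference, the two MCA parameters described in the message above can be
raised on the mpirun command line. The sketch below is illustrative only; the
host names, process count, and benchmark path are assumptions rather than
values taken from this report:

# Illustrative only: raise the local ACK timeout to 4.096 us * 2^14 (~67 ms)
# and keep the maximum retry count of 7; adjust hosts and paths to your setup.
mpirun --mca btl_openib_ib_timeout 14 \
       --mca btl_openib_ib_retry_count 7 \
       -np 2 -H node1,node2 ./IMB-MPI1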


Version-Release number of selected component (if applicable):
# rpm -qa | egrep "openib|openmpi|mpitests"
openmpi-devel-1.2.3-4.el5
openib-srptools-0.0.6-5.el5
openib-mstflint-1.2-5.el5
openmpi-1.2.3-4.el5
mpitests-debuginfo-2.0-2
openib-1.2-5.el5
openib-diags-1.2.7-5.el5
openib-perftest-1.2-5.el5
openib-debuginfo-1.2-5.el5
openmpi-libs-1.2.3-4.el5
mpitests-2.0-2
openib-tvflash-0.9.2-5.el5
openmpi-debuginfo-1.2.3-4.el5
# modinfo ib_ipath
filename:      
/lib/modules/2.6.18-43.el5/kernel/drivers/infiniband/hw/ipath/ib_ipath.ko
description:    QLogic InfiniPath driver
author:         QLogic <support>
license:        GPL
srcversion:     60096FEC902AEF4EEFCAD65
alias:          pci:v00001FC1d00000010sv*sd*bc*sc*i*
alias:          pci:v00001FC1d0000000Dsv*sd*bc*sc*i*
depends:        ib_core
vermagic:       2.6.18-43.el5 SMP mod_unload gcc-4.1
parm:           qp_table_size:QP table size (uint)
parm:           lkey_table_size:LKEY table size in bits (2^n, 1 <= n <= 23) (uint)
parm:           max_pds:Maximum number of protection domains to support (uint)
parm:           max_ahs:Maximum number of address handles to support (uint)
parm:           max_cqes:Maximum number of completion queue entries to support
(uint)
parm:           max_cqs:Maximum number of completion queues to support (uint)
parm:           max_qp_wrs:Maximum number of QP WRs to support (uint)
parm:           max_qps:Maximum number of QPs to support (uint)
parm:           max_sges:Maximum number of SGEs to support (uint)
parm:           max_mcast_grps:Maximum number of multicast groups to support (uint)
parm:           max_mcast_qp_attached:Maximum number of attached QPs to support
(uint)
parm:           max_srqs:Maximum number of SRQs to support (uint)
parm:           max_srq_sges:Maximum number of SRQ SGEs to support (uint)
parm:           max_srq_wrs:Maximum number of SRQ WRs support (uint)
parm:           disable_sma:uint
parm:           ib_ipath_disable_sma:Disable the SMA
parm:           cfgports:Set max number of ports to use (ushort)
parm:           kpiobufs:Set number of PIO buffers for driver
parm:           debug:mask for debug prints (uint)
module_sig:    
883f35046cb61a956bc506ef7b1fb11243e309e23f9e3a61fda5c94159875195a459bc1f34c40a09ee073e5d729c3b0816f6bf372e92c1b46efefe



How reproducible:
Every time

Steps to Reproduce:
1. Have 2 nodes with QLogic PCIe cards (I used InfiniPath_QLE7140).
2. Build and install mpitests.
3. Run mpitests-IMB_MPI1 (an illustrative invocation is sketched below).
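
A hedged sketch of step 3, assuming the IMB-MPI1 benchmark binary built by
mpitests is available on both nodes (the host names and binary path here are
placeholders, not values from this report):

# Illustrative invocation only; substitute your own hosts and binary path.
mpirun -np 2 -H node1,node2 ./IMB-MPI1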
  
Actual results:


Expected results:


Additional info:

Comment 1 Doug Ledford 2007-09-21 17:53:48 UTC
It turns out that this problem, specifically the segfault, wasn't related to
having two ipath cards. Rather, one of the ipath cards was running at a scant
2% of the overall speed of the IB fabric, which caused excessive retries that
closed the connection; the mpitest program wasn't built to handle connections
going away unexpectedly and segfaulted as a result.  As such, I'm closing this
as NOTABUG.  The problem went away when the test network was updated to get
rid of the slow connection that was disrupting the IB fabric.
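
As a hedged aside (not part of the original comment): an underperforming port
like the one described above can usually be spotted by comparing link state
and rate across the nodes with the InfiniBand diagnostics listed earlier in
this report, for example:

# Illustrative check only: look for an active port whose reported Rate is
# far below the rest of the fabric.
ibstat
ibv_devinfo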

