Bug 279861 - OpenMPI issues when both nodes are Qlogic PCIe cards
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: openmpi
Hardware: All   OS: Linux
Priority: high   Severity: high
Assigned To: Doug Ledford
Reported: 2007-09-05 19:54 EDT by Gurhan Ozen
Modified: 2013-11-03 20:33 EST
CC: 1 user

Doc Type: Bug Fix
Last Closed: 2007-09-21 13:53:48 EDT

Attachments: None
Description Gurhan Ozen 2007-09-05 19:54:29 EDT
Description of problem:
When running the mpitests-IMB_MPI1 test suite from the mpitests package over 2
nodes, where both nodes use InfiniPath_QLE7140 HCAs, the program crashes with
the following errors:

error posting send request errno says Invalid argument

[0,1,0][btl_openib_component.c:1332:btl_openib_component_progress] from
dell-pe1950-03.rhts.boston.redhat.com to: dell-pe1950-02.rhts.boston.redhat.com
error polling LP CQ with status RETRY EXCEEDED ERROR status number 12 for wr_id
173782656 opcode 1
The InfiniBand retry count between two MPI processes has been
exceeded.  "Retry count" is defined in the InfiniBand spec 1.2
(section 12.7.38):

    The total number of times that the sender wishes the receiver to
    retry timeout, packet sequence, etc. errors before posting a
    completion error.

This error typically means that there is something awry within the
InfiniBand fabric itself.  You should note the hosts on which this
error has occurred; it has been observed that rebooting or removing a
particular host from the job can sometimes resolve this issue.  

Two MCA parameters can be used to control Open MPI's behavior with
respect to the retry count:

* btl_openib_ib_retry_count - The number of times the sender will
  attempt to retry (defaulted to 7, the maximum value).

* btl_openib_ib_timeout - The local ACK timeout parameter (defaulted
  to 10).  The actual timeout value used is calculated as:

     4.096 microseconds * (2^btl_openib_ib_timeout)

  See the InfiniBand spec 1.2 (section 12.7.34) for more details.
mpirun noticed that job rank 1 with PID 4844 on node
dell-pe1950-02.rhts.boston.redhat.com exited on signal 15 (Terminated). 
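
For reference, the two MCA parameters described above can be set on the mpirun
command line. With the default btl_openib_ib_timeout of 10 the ACK timeout works
out to 4.096 us * 2^10, roughly 4.2 ms; a value of 20 gives about 4.3 s. The
invocation below is only a sketch (the ./hosts hostfile and the timeout value 20
are placeholders for illustration, not a recommendation from this bug):

$ mpirun -np 2 --hostfile ./hosts \
      --mca btl_openib_ib_retry_count 7 \
      --mca btl_openib_ib_timeout 20 \
      mpitests-IMB_MPI1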

Version-Release number of selected component (if applicable):
# rpm -qa | egrep "openib|openmpi|mpitests"
# modinfo ib_ipath
description:    QLogic InfiniPath driver
author:         QLogic <support@pathscale.com>
license:        GPL
srcversion:     60096FEC902AEF4EEFCAD65
alias:          pci:v00001FC1d00000010sv*sd*bc*sc*i*
alias:          pci:v00001FC1d0000000Dsv*sd*bc*sc*i*
depends:        ib_core
vermagic:       2.6.18-43.el5 SMP mod_unload gcc-4.1
parm:           qp_table_size:QP table size (uint)
parm:           lkey_table_size:LKEY table size in bits (2^n, 1 <= n <= 23) (uint)
parm:           max_pds:Maximum number of protection domains to support (uint)
parm:           max_ahs:Maximum number of address handles to support (uint)
parm:           max_cqes:Maximum number of completion queue entries to support
parm:           max_cqs:Maximum number of completion queues to support (uint)
parm:           max_qp_wrs:Maximum number of QP WRs to support (uint)
parm:           max_qps:Maximum number of QPs to support (uint)
parm:           max_sges:Maximum number of SGEs to support (uint)
parm:           max_mcast_grps:Maximum number of multicast groups to support (uint)
parm:           max_mcast_qp_attached:Maximum number of attached QPs to support
parm:           max_srqs:Maximum number of SRQs to support (uint)
parm:           max_srq_sges:Maximum number of SRQ SGEs to support (uint)
parm:           max_srq_wrs:Maximum number of SRQ WRs support (uint)
parm:           disable_sma:uint
parm:           ib_ipath_disable_sma:Disable the SMA
parm:           cfgports:Set max number of ports to use (ushort)
parm:           kpiobufs:Set number of PIO buffers for driver
parm:           debug:mask for debug prints (uint)

How reproducible:

Steps to Reproduce:
1. Have 2 nodes with Qlogic PCIe cards (I used InfiniPath_QLE7140)
2. Build and install the mpitests package.
3. Run mpitests-IMB_MPI1 across both nodes (see the command sketch below).
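
A rough sketch of the reproduction, assuming a hypothetical hostfile ./hosts
that lists the two nodes (the exact path of the mpitests-IMB_MPI1 binary
depends on how mpitests was built and installed):

$ cat ./hosts
dell-pe1950-02.rhts.boston.redhat.com slots=1
dell-pe1950-03.rhts.boston.redhat.com slots=1
$ mpirun -np 2 --hostfile ./hosts mpitests-IMB_MPI1
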
Actual results:

Expected results:

Additional info:
Comment 1 Doug Ledford 2007-09-21 13:53:48 EDT
It turns out that this problem, specifically the segfault, wasn't related to
having two ipath cards. Rather, one of the ipath cards was running at a scant
2% of the overall speed of the IB fabric and was causing excessive retries that
closed the connection; the mpitest program wasn't built to deal with
connections going away unexpectedly and segfaulted as a result. As such, I'm
closing this as NOTABUG. The problem went away when the test network was
updated to get rid of the slow connection that was disrupting the IB fabric.
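
For anyone debugging a similar retry storm, the active link rate on each node
can be compared with ibstat from infiniband-diags; a port reporting a much
lower Rate than its peers is the likely culprit. The output below is trimmed
and illustrative only, not captured on these machines:

# ibstat
CA 'ipath0'
        Port 1:
                State: Active
                Physical state: LinkUp
                Rate: 10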
