Bug 480696 - RDMA latencytest and perftest fail with QLogic IB
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: kernel
Hardware: x86_64 Linux
Priority: medium, Severity: medium
Assigned To: Don Zickus
QA Contact: Red Hat Kernel QE team
Reported: 2009-01-19 17:25 EST by Steve Reichard
Modified: 2009-09-02 04:44 EDT (History)
Doc Type: Bug Fix
Last Closed: 2009-09-02 04:44:41 EDT

Description Steve Reichard 2009-01-19 17:25:01 EST
Description of problem:

In my configuration, both latencytest and perftest fail.  

There are no issues with IPoIB/TCP.
The qperf RDMA tests also run without issues.

Hardware: a DL580 (Intel, 16 cores, 64 GB memory) and a DL585 (AMD, 8 cores, 72 GB),
each with a QLogic QLE7240, wired directly to each other.

Version-Release number of selected component (if applicable):

kernel: 2.6.18-128.el5  (RHEL 5.3 RC-2)

How reproducible:

Easily reproduced in current config

Steps to Reproduce:
  Start broker
[root@renoir:/var/log]#  /usr/sbin/qpidd  --auth no --mgmt-enable no

  Start Test
[root@degas:~]# latencytest --protocol rdma --size 32 --rate 1000 -b renoir-ib
[root@degas:~]# perftest --protocol rdma  -b renoir-ib -s --count 100

Actual results:

# latencytest --protocol rdma --size 32 --rate 1000 -b renoir-ib
2009-jan-19 17:02:23 warning Connection closed
Error in receiver: Connection closed
Latency(ms): min=0.083, max=0.23, avg=0.0920769
Latency(ms): min=1.79769e+308, max=0, avg=nan
Latency(ms): min=1.79769e+308, max=0, avg=nan
Latency(ms): min=1.79769e+308, max=0, avg=nan
Latency(ms): min=1.79769e+308, max=0, avg=nan

# perftest   --protocol rdma -b renoir-ib -s --count 100
2009-jan-19 17:17:13 warning Connection closed
PublishThread exception: Connection closed
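
A note on reading the latencytest intervals above: min=1.79769e+308 is the largest finite double, and avg=nan is what 0/0 yields in floating point, so those lines look like intervals in which no samples were recorded after the connection closed. The sketch below shows how such a report line can arise. The LatencyStats class is hypothetical and is not taken from the latencytest source; it only assumes the common pattern of seeding the minimum with the largest double and the maximum with zero.

```python
import sys


class LatencyStats:
    """Hypothetical per-interval accumulator (an assumption, not latencytest code)."""

    def __init__(self):
        self.minimum = sys.float_info.max  # largest finite double, used as a sentinel
        self.maximum = 0.0
        self.total = 0.0
        self.count = 0

    def add(self, sample_ms):
        self.minimum = min(self.minimum, sample_ms)
        self.maximum = max(self.maximum, sample_ms)
        self.total += sample_ms
        self.count += 1

    def report(self):
        # In C++, dividing a double 0.0 by a zero count gives NaN.
        # Python would raise ZeroDivisionError, so the NaN is emulated here.
        avg = self.total / self.count if self.count else float("nan")
        return "Latency(ms): min=%g, max=%g, avg=%g" % (
            self.minimum, self.maximum, avg)


if __name__ == "__main__":
    empty = LatencyStats()
    print(empty.report())  # an interval with no samples

    stats = LatencyStats()
    for sample in (0.083, 0.23):
        stats.add(sample)
    print(stats.report())  # an interval with samples
```

If this reading is right, only the first interval carried real measurements, and the remaining four are side effects of the closed connection rather than separate failures.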

Expected results:
Similar to the TCP results below, but with lower latency or higher throughput.

# /usr/bin/latencytest  --rate 1000 --size 32 -b renoir-ib
Latency(ms): min=0.521, max=37.385, avg=19.287
Latency(ms): min=0.518, max=37.359, avg=19.1428
Latency(ms): min=0.52, max=37.36, avg=19.1129
Latency(ms): min=0.518, max=37.367, avg=19.2241
Latency(ms): min=0.522, max=37.364, avg=19.145

# perftest  -s --count 100 -b renoir-ib 
19805.9	2289.27	4674.1	4.56455
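
When comparing runs, the latencytest lines in this report can be checked mechanically. The following is a minimal sketch that parses the "Latency(ms)" lines and separates intervals with a usable average from those reporting nan. The function names are hypothetical, and the regular expression is based only on the output format shown in this report.

```python
import math
import re

# Matches lines such as: Latency(ms): min=0.521, max=37.385, avg=19.287
LATENCY_LINE = re.compile(
    r"Latency\(ms\): min=([^,\s]+), max=([^,\s]+), avg=([^,\s]+)")


def parse_latency(text):
    """Return a list of (min, max, avg) float tuples found in text."""
    return [tuple(float(value) for value in match.groups())
            for match in LATENCY_LINE.finditer(text)]


def valid_intervals(rows):
    """Keep only intervals whose average is a real number."""
    return [row for row in rows if not math.isnan(row[2])]


RDMA_OUTPUT = """\
Latency(ms): min=0.083, max=0.23, avg=0.0920769
Latency(ms): min=1.79769e+308, max=0, avg=nan
Latency(ms): min=1.79769e+308, max=0, avg=nan
Latency(ms): min=1.79769e+308, max=0, avg=nan
Latency(ms): min=1.79769e+308, max=0, avg=nan
"""

TCP_OUTPUT = """\
Latency(ms): min=0.521, max=37.385, avg=19.287
Latency(ms): min=0.518, max=37.359, avg=19.1428
Latency(ms): min=0.52, max=37.36, avg=19.1129
Latency(ms): min=0.518, max=37.367, avg=19.2241
Latency(ms): min=0.522, max=37.364, avg=19.145
"""

if __name__ == "__main__":
    for name, output in (("rdma", RDMA_OUTPUT), ("tcp", TCP_OUTPUT)):
        rows = parse_latency(output)
        good = valid_intervals(rows)
        print("%s: %d intervals, %d with valid averages"
              % (name, len(rows), len(good)))
```

The sample strings are copied from the actual and expected results in this report; no other data is assumed.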

Additional info:
Comment 1 Steve Reichard 2009-01-19 17:27:15 EST
It has been asked:

Are these boards (QLogic QLE7240) supported?

Are they supported with MRG?

Also, I've only tried the driver we ship. Should I download the driver etc. from QLogic?
Comment 2 Doug Ledford 2009-04-22 19:09:03 EDT
The ipath driver was updated as part of the OFED 1.4.1 kernel update I submitted on rhkernel-list. If anything will solve this issue, that update likely will. However, I can't reproduce the problem because I don't have the right hardware: I have ipath cards, including four QLE7240s, but no machines that can run a QLE7240. Please let me know if this problem persists once an updated RHEL 5 kernel that includes the OFED 1.4.1 patch is available.
Comment 4 Don Zickus 2009-05-06 13:15:55 EDT
in kernel-2.6.18-144.el5
You can download this test kernel from http://people.redhat.com/dzickus/el5

Please do NOT transition this bugzilla state to VERIFIED until our QE team
has sent specific instructions indicating when to do so.  However feel free
to provide a comment indicating that this fix has been verified.
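
For anyone verifying, a quick way to tell whether a machine is running a build at or beyond the one named in this comment is to compare the build number in the kernel release string. This is a rough sketch only: the helper names are hypothetical, and it assumes release strings shaped like 2.6.18-128.el5 (as reported in this bug). It does not account for fixes backported to earlier update streams.

```python
import re

# Build number named in this comment (kernel-2.6.18-144.el5).
FIXED_BUILD = 144


def el5_build(release):
    """Return the build number from a string like '2.6.18-128.el5', else None."""
    match = re.match(r"^(?:kernel-)?2\.6\.18-(\d+)(?:\.\d+)*\.el5", release)
    return int(match.group(1)) if match else None


def has_fix(release, fixed_build=FIXED_BUILD):
    """Heuristic: True if the release string is at or beyond fixed_build."""
    build = el5_build(release)
    return build is not None and build >= fixed_build


if __name__ == "__main__":
    import platform

    running = platform.release()  # same string that `uname -r` prints
    print("%s includes the test build or later: %s"
          % (running, has_fix(running)))
```

On the kernel from the original report, has_fix("2.6.18-128.el5") returns False, so that system would still need the test kernel.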
Comment 7 errata-xmlrpc 2009-09-02 04:44:41 EDT
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

