Bug 480696 - RDMA latencytest and perftest fail with QLogic IB
Status: CLOSED ERRATA
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: kernel
Version: 5.3
Hardware: x86_64 Linux
Priority: medium
Severity: medium
Assigned To: Don Zickus
QA Contact: Red Hat Kernel QE team
Reported: 2009-01-19 17:25 EST by Steve Reichard
Modified: 2009-09-02 04:44 EDT
Doc Type: Bug Fix
Last Closed: 2009-09-02 04:44:41 EDT

Description Steve Reichard 2009-01-19 17:25:01 EST
Description of problem:

In my configuration, both latencytest and perftest fail when run with --protocol rdma.

There are no issues over IPoIB/TCP.
The qperf RDMA tests also run without issues.

Configuration:
  A DL580 (Intel, 16 cores, 64 GB memory) and a DL585 (AMD, 8 cores, 72 GB memory),
  each with a QLogic QLE7240, wired directly to each other



Version-Release number of selected component (if applicable):

kernel: 2.6.18-128.el5  (RHEL 5.3 RC-2)
openib-1.3.2-0.20080728.0355.3.el5
libipathverbs


How reproducible:

Easily reproduced in current config

Steps to Reproduce:
1. Start the broker:
[root@renoir:/var/log]#  /usr/sbin/qpidd  --auth no --mgmt-enable no

2. Start the test:
[root@degas:~]# latencytest --protocol rdma --size 32 --rate 1000 -b renoir-ib
or
[root@degas:~]# perftest --protocol rdma  -b renoir-ib -s --count 100
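
For an A/B comparison of the RDMA and TCP runs, the latencytest command lines in this report can be generated from one place. This is only a minimal sketch: the helper name `latencytest_cmd` is hypothetical, and it uses only the flags that appear in this report. Passing `protocol=None` omits `--protocol`, matching the TCP runs shown under "Expected results".

```python
import shlex


def latencytest_cmd(broker, protocol=None, size=32, rate=1000):
    """Build a latencytest argument list using only flags seen in this report.

    protocol=None leaves out --protocol, as in the TCP runs in this report.
    """
    cmd = ["latencytest"]
    if protocol is not None:
        cmd += ["--protocol", protocol]
    cmd += ["--size", str(size), "--rate", str(rate), "-b", broker]
    return cmd


if __name__ == "__main__":
    # Print the RDMA and the TCP variant for the broker host used above.
    for proto in ("rdma", None):
        print(shlex.join(latencytest_cmd("renoir-ib", proto)))
```

The returned list can be passed directly to `subprocess.run` on the client host.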

  
Actual results:

# latencytest --protocol rdma --size 32 --rate 1000 -b renoir-ib
2009-jan-19 17:02:23 warning Connection closed
Error in receiver: Connection closed
Latency(ms): min=0.083, max=0.23, avg=0.0920769
Latency(ms): min=1.79769e+308, max=0, avg=nan
Latency(ms): min=1.79769e+308, max=0, avg=nan
Latency(ms): min=1.79769e+308, max=0, avg=nan
Latency(ms): min=1.79769e+308, max=0, avg=nan



# perftest   --protocol rdma -b renoir-ib -s --count 100
2009-jan-19 17:17:13 warning Connection closed
PublishThread exception: Connection closed
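
In the latencytest output above, 1.79769e+308 is the largest finite double printed to six significant digits. A line with that minimum, max=0 and avg=nan most likely means no samples arrived in that reporting interval, although that reading is an interpretation and is not confirmed in this report. The sketch below flags such lines when scanning a captured log; the function names are hypothetical.

```python
import math
import re

# Matches the summary lines printed by latencytest in this report.
LINE_RE = re.compile(
    r"Latency\(ms\): min=([^,\s]+), max=([^,\s]+), avg=([^,\s]+)"
)


def parse_latency_line(line):
    """Return (min, max, avg) as floats, or None for non-latency lines."""
    m = LINE_RE.search(line)
    if m is None:
        return None
    return tuple(float(x) for x in m.groups())


def is_empty_interval(stats):
    """True when the statistics look like untouched initial values.

    Assumption: an interval without samples shows avg=nan, or a minimum
    larger than the maximum (min still at its huge starting value).
    """
    lo, hi, avg = stats
    return math.isnan(avg) or lo > hi


if __name__ == "__main__":
    log = [
        "2009-jan-19 17:02:23 warning Connection closed",
        "Latency(ms): min=0.083, max=0.23, avg=0.0920769",
        "Latency(ms): min=1.79769e+308, max=0, avg=nan",
    ]
    for line in log:
        stats = parse_latency_line(line)
        if stats is not None:
            print(line, "-> empty" if is_empty_interval(stats) else "-> ok")
```

The check compares min against max instead of against the exact maximum double, because the printed value is rounded and no longer equals it.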



Expected results:
Results similar to TCP, but with lower latency or higher throughput. The same tests over TCP give:

# /usr/bin/latencytest  --rate 1000 --size 32 -b renoir-ib
Latency(ms): min=0.521, max=37.385, avg=19.287
Latency(ms): min=0.518, max=37.359, avg=19.1428
Latency(ms): min=0.52, max=37.36, avg=19.1129
Latency(ms): min=0.518, max=37.367, avg=19.2241
Latency(ms): min=0.522, max=37.364, avg=19.145

# perftest  -s --count 100 -b renoir-ib 
19805.9	2289.27	4674.1	4.56455



Additional info:
Comment 1 Steve Reichard 2009-01-19 17:27:15 EST
It has been asked:

Are these boards (QLogic QLE7240) supported?

Are they supported with MRG?


So far I have only tried the driver we ship. Should I download the driver etc. from QLogic instead?
 
spr
Comment 2 Doug Ledford 2009-04-22 19:09:03 EDT
The ipath driver was updated as part of the OFED 1.4.1 kernel update I submitted on rhkernel-list. If anything is going to solve this issue, that update is likely to be it. However, I can't reproduce the problem because I don't have the right hardware: I have ipath cards, and four QLE7240s sitting around, but no machines that can run the QLE7240. Please let me know if this problem persists once an updated RHEL 5 kernel that includes the OFED 1.4.1 patch is available.
Comment 4 Don Zickus 2009-05-06 13:15:55 EDT
The patch is included in kernel-2.6.18-144.el5.
You can download this test kernel from http://people.redhat.com/dzickus/el5

Please do NOT transition this bugzilla state to VERIFIED until our QE team
has sent specific instructions indicating when to do so.  However feel free
to provide a comment indicating that this fix has been verified.
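
Before retesting, it helps to confirm that the host is actually running the test kernel or a later build. The following is a minimal sketch, assuming the `uname -r` string on RHEL 5 has the form seen in this report (for example 2.6.18-128.el5); the function names are hypothetical, and the check deliberately rejects anything that is not an el5 kernel.

```python
import re

# Build named in the comment above: kernel-2.6.18-144.el5
TEST_KERNEL = (2, 6, 18, 144)


def kernel_build(release):
    """Parse '2.6.18-128.el5' into (2, 6, 18, 128)."""
    m = re.match(r"(\d+)\.(\d+)\.(\d+)-(\d+)", release)
    if m is None or "el5" not in release:
        raise ValueError("not a RHEL 5 kernel release string: %r" % release)
    return tuple(int(part) for part in m.groups())


def has_test_kernel_or_later(release):
    """True if the release is at or after the 2.6.18-144.el5 build."""
    return kernel_build(release) >= TEST_KERNEL


if __name__ == "__main__":
    import platform

    # platform.release() returns the same string as `uname -r`.
    try:
        print(has_test_kernel_or_later(platform.release()))
    except ValueError as exc:
        print(exc)
```

The kernel originally reported in this bug, 2.6.18-128.el5, predates the test build, so the check returns False for it.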
Comment 7 errata-xmlrpc 2009-09-02 04:44:41 EDT
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2009-1243.html