Bug 480696 - RDMA latencytest and perftest fail with QLogic IB
Summary: RDMA latencytest and perftest fail with QLogic IB
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: kernel
Version: 5.3
Hardware: x86_64
OS: Linux
Priority: medium
Severity: medium
Target Milestone: ---
Target Release: ---
Assignee: Don Zickus
QA Contact: Red Hat Kernel QE team
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2009-01-19 22:25 UTC by Steve Reichard
Modified: 2009-09-02 08:44 UTC

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2009-09-02 08:44:41 UTC
Target Upstream Version:
Embargoed:




Links
System: Red Hat Product Errata
ID: RHSA-2009:1243
Private: 0
Priority: normal
Status: SHIPPED_LIVE
Summary: Important: Red Hat Enterprise Linux 5.4 kernel security and bug fix update
Last Updated: 2009-09-01 08:53:34 UTC

Description Steve Reichard 2009-01-19 22:25:01 UTC
Description of problem:

In my configuration, both latencytest and perftest fail when run with --protocol rdma.

There are no issues with IPoIB/TCP.
The qperf RDMA tests also show no issues.

Configuration:
  A DL580 (Intel, 16 cores, 64 GB memory) and a DL585 (AMD, 8 cores, 72 GB memory),
  each with a QLogic QLE7240, wired directly to each other



Version-Release number of selected component (if applicable):

kernel: 2.6.18-128.el5  (RHEL 5.3 RC-2)
openib-1.3.2-0.20080728.0355.3.el5
libipathverbs


How reproducible:

Easily reproduced in current config

Steps to Reproduce:
1. 
  Start broker
[root@renoir:/var/log]#  /usr/sbin/qpidd  --auth no --mgmt-enable no

2.
  Start Test
[root@degas:~]# latencytest --protocol rdma --size 32 --rate 1000 -b renoir-ib
or
[root@degas:~]# perftest --protocol rdma  -b renoir-ib -s --count 100

  
Actual results:

# latencytest --protocol rdma --size 32 --rate 1000 -b renoir-ib
2009-jan-19 17:02:23 warning Connection closed
Error in receiver: Connection closed
Latency(ms): min=0.083, max=0.23, avg=0.0920769
Latency(ms): min=1.79769e+308, max=0, avg=nan
Latency(ms): min=1.79769e+308, max=0, avg=nan
Latency(ms): min=1.79769e+308, max=0, avg=nan
Latency(ms): min=1.79769e+308, max=0, avg=nan



# perftest   --protocol rdma -b renoir-ib -s --count 100
2009-jan-19 17:17:13 warning Connection closed
PublishThread exception: Connection closed
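
The repeated "min=1.79769e+308, max=0, avg=nan" lines in the latencytest output above look like what a statistics accumulator prints when it received no samples in an interval (the minimum still sits near the largest finite double, and the average is 0/0). That reading is an inference from the output, not something confirmed in this report. A minimal Python sketch, assuming the output format shown above, that counts how many reporting intervals had samples and how many were empty; the function names are made up for illustration:

```python
import math
import re

# Matches lines such as "Latency(ms): min=0.083, max=0.23, avg=0.0920769".
LATENCY_RE = re.compile(
    r"Latency\(ms\): min=(?P<min>[^,\s]+), max=(?P<max>[^,\s]+), avg=(?P<avg>[^,\s]+)"
)


def parse_latency_line(line):
    """Return (min, max, avg) as floats, or None if the line is not a latency report."""
    match = LATENCY_RE.search(line)
    if match is None:
        return None
    return tuple(float(match.group(name)) for name in ("min", "max", "avg"))


def is_empty_interval(stats):
    """Heuristic (an assumption): a NaN average means no samples were recorded."""
    _, _, avg = stats
    return math.isnan(avg)


def summarize(output):
    """Return (intervals_with_samples, empty_intervals) for latencytest output."""
    with_samples = 0
    empty = 0
    for line in output.splitlines():
        stats = parse_latency_line(line)
        if stats is None:
            continue  # warnings and error lines are skipped
        if is_empty_interval(stats):
            empty += 1
        else:
            with_samples += 1
    return with_samples, empty
```

Feeding the failing run above through summarize() separates the single interval that carried traffic from the ones reported after the connection closed.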



Expected results:
Similar to the TCP results below, but with lower latency or higher throughput:

# /usr/bin/latencytest  --rate 1000 --size 32 -b renoir-ib
Latency(ms): min=0.521, max=37.385, avg=19.287
Latency(ms): min=0.518, max=37.359, avg=19.1428
Latency(ms): min=0.52, max=37.36, avg=19.1129
Latency(ms): min=0.518, max=37.367, avg=19.2241
Latency(ms): min=0.522, max=37.364, avg=19.145

# perftest  -s --count 100 -b renoir-ib 
19805.9	2289.27	4674.1	4.56455



Additional info:

Comment 1 Steve Reichard 2009-01-19 22:27:15 UTC
It has been asked:

Are these boards (QLogic QLE7240) supported?

Are they supported with MRG?


So far I have only tried the driver we ship. Should I download the driver, etc., from QLogic?
 
spr

Comment 2 Doug Ledford 2009-04-22 23:09:03 UTC
The ipath driver was updated as part of the OFED 1.4.1 kernel update I submitted on rhkernel-list.  If anything will solve this issue, that update is the most likely candidate.  However, I can't reproduce the problem because I don't have the right hardware: I have ipath cards, including 4 QLE7240s sitting around, but no machines that can run the QLE7240.  Please let me know if this problem persists once an updated RHEL 5 kernel that includes the OFED 1.4.1 patch is available.

Comment 4 Don Zickus 2009-05-06 17:15:55 UTC
The patch is included in kernel-2.6.18-144.el5.
You can download this test kernel from http://people.redhat.com/dzickus/el5

Please do NOT transition this bugzilla state to VERIFIED until our QE team
has sent specific instructions indicating when to do so.  However feel free
to provide a comment indicating that this fix has been verified.
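
Before re-running the RDMA tests, it may help to confirm that the running kernel is at least the build named above. A minimal Python sketch, assuming release strings of the form shown in this report (for example "2.6.18-128.el5"); it compares only the version and the first build number, so it is not a full RPM version comparison, and the function names are made up for illustration:

```python
import re

# Taken from "kernel-2.6.18-144.el5" in the comment above.
FIXED_BUILD = (2, 6, 18, 144)


def parse_el5_release(release):
    """Parse a string like '2.6.18-128.el5' into the tuple (2, 6, 18, 128)."""
    match = re.match(r"(\d+)\.(\d+)\.(\d+)-(\d+)", release)
    if match is None:
        raise ValueError(f"unrecognized kernel release: {release!r}")
    return tuple(int(part) for part in match.groups())


def has_fix(release):
    """True if the release is at or after the build that carries the patch."""
    return parse_el5_release(release) >= FIXED_BUILD
```

On a test machine the input would typically come from platform.release() or the output of uname -r.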

Comment 7 errata-xmlrpc 2009-09-02 08:44:41 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2009-1243.html

