Bug 463800 - [LTC 6.0 FEAT] 201588:Enable RDS in OFED InfiniBand kernel and include RDS user tools
Status: CLOSED CURRENTRELEASE
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: rds-tools
Version: 6.0
Hardware: All All
Priority: high, Severity: high
Target Milestone: alpha
Target Release: 6.0
Assigned To: Doug Ledford
QA Contact: Red Hat Kernel QE team
Keywords: FutureFeature
Depends On:
Blocks: 356741 554559 597328
 
Reported: 2008-09-24 14:50 EDT by IBM Bug Proxy
Modified: 2011-12-06 10:51 EST

See Also:
Fixed In Version:
Doc Type: Enhancement
Doc Text:
* Running the rds-ping command may fail, returning the error: bind() failed, errno: 99 (Cannot assign requested address). Note also that this error may occur even with LOAD_RDS=yes set in /etc/rdma/rdma.conf. To work around this issue, load the rds-tcp module.
* Running the command rds-stress on a client may result in the following error when attempting to connect to the server:
  connecting to <server IP address>:4000: No route to host
  connect(<server IP address>) failed#
Story Points: ---
Clone Of:
Clones: 597328
Environment:
Last Closed: 2010-11-11 11:22:21 EST
Type: ---
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


Attachments: None
Description IBM Bug Proxy 2008-09-24 14:50:47 EDT
=Comment: #0=================================================
Emily J. Ratliff <emilyr@us.ibm.com> - 2008-09-24 13:51 EDT
1. Feature Overview:
Feature Id:	[201588]
a. Name of Feature:	Enable RDS in OFED InfiniBand kernel and include RDS user tools
b. Feature Description
RDS is a reliable datagram mode socket implementation for Oracle RAC over InfiniBand. This feature
is part of the OFED-1.3 release. This feature includes the kernel RDS component

2. Feature Details:
Sponsor:	LTC
Architectures:
x86
ppc64

Arch Specificity: Purely Common Code
Affects Kernel Modules: Yes
Delivery Mechanism: Direct from community
Category:	Networking
Request Type:	Package - Feature from Upstream
d. Upstream Acceptance:	In Progress
Sponsor Priority	
IBM Confidential:	no
Code Contribution:	3rd party code
g. Component Version Target:	OFED

3. Business Case
IBM Oracle RAC middleware enablement is scheduled for April 2009. This enablement depends on this
feature being in the distro from OFED-1.3.

4. Primary contact at Red Hat: 
John Jarvis
jjarvis@redhat.com

5. Primary contacts at Partner:
Project Management Contact:
Mike Wortman, wortman@us.ibm.com, 512-838-8582

Technical contact(s):
Shirley Ma, xma@us.ibm.com

IBM Manager:
Larry Kessler, lkessler@us.ibm.com
Comment 1 John Jarvis 2010-01-22 09:55:19 EST
IBM, just to clarify: are the RDS user tools to which you refer the rds-tools/1.5/1.el6/x86_64/rds-tools-1.5-1.el6.x86_64.rpm package currently in the nightly RHEL 6.0 builds? If so, this is already included.
Comment 2 IBM Bug Proxy 2010-05-10 16:21:15 EDT
------- Comment From halves@linux.vnet.ibm.com 2010-05-10 16:10 EDT-------
Verified on RHEL6 Snap2:

I found rds-tools on RHEL6 Snap2, but the rds-info command failed.

[root@cluster-6 iperf-2.0.4]# rds-info
rds-generic-tool: Unable to create socket: Address family not supported by protocol

I tried to find an rds kernel module, but could not find one.

I looked inside the kernel source and found an rds directory containing Kconfig and Makefile files.

[root@cluster-5 net]# cd /usr/src/kernels/2.6.32-23.el6.ppc64/net/rds
[root@cluster-5 rds]# ls
Kconfig  Makefile
[root@cluster-5 rds]#

Regards,

Higor
Comment 3 IBM Bug Proxy 2010-05-13 15:50:50 EDT
------- Comment From halves@linux.vnet.ibm.com 2010-05-13 15:45 EDT-------
Hello,

I cannot find the rds kernel modules on RHEL6 Snap3.

Regards,

Higor
Comment 4 IBM Bug Proxy 2010-05-27 17:10:41 EDT
------- Comment From halves@linux.vnet.ibm.com 2010-05-27 17:09 EDT-------
Hello,

I did not find the rds driver on RHEL6 Snap5-early (RHEL6.0-20100523.0); I found only the rds-tools package.

Regards,

Higor
Comment 5 Doug Ledford 2010-05-27 17:28:59 EDT
You must be on powerpc.  I just looked in the configs and it's enabled for other arches, but not powerpc.  I'll post about it internally to get it enabled on ppc as well.
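Whether RDS is enabled for a given architecture can be read from that kernel's build configuration. The following is a minimal sketch, not the procedure used in this report. It assumes the configuration is available as a plain-text file (for example /boot/config-$(uname -r), an assumed path that does not appear in this report) and uses the option names CONFIG_RDS, CONFIG_RDS_RDMA and CONFIG_RDS_TCP from net/rds/Kconfig. The function name rds_config_status is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: report how the RDS options are set in a kernel
# config file. Usage: rds_config_status /path/to/config
rds_config_status() {
    config_file=$1
    for opt in CONFIG_RDS CONFIG_RDS_RDMA CONFIG_RDS_TCP; do
        # Enabled options appear as "OPT=y" or "OPT=m". Disabled options
        # appear as "# OPT is not set" or are absent, so nothing matches.
        value=$(sed -n "s/^${opt}=\(.*\)$/\1/p" "$config_file" | head -n 1)
        echo "${opt}=${value:-unset}"
    done
}
```

Running the function against the config files of two architectures and comparing the output should show whether an option is set to m or y on one and left unset on the other.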
Comment 6 IBM Bug Proxy 2010-05-27 17:41:08 EDT
------- Comment From pradeep@us.ibm.com 2010-05-27 17:37 EDT-------
(In reply to comment #11)
> You must be on powerpc.  I just looked in the configs and it's enabled for
> other arches, but not powerpc.  I'll post about it internally to get it enabled
> on ppc as well.

Thanks Doug. Yes, we are using ppc64.
Comment 10 IBM Bug Proxy 2010-07-14 15:20:49 EDT
------- Comment From halves@linux.vnet.ibm.com 2010-07-14 15:18 EDT-------
Hello Team,

I found rds drivers on RHEL6 Snap7.

[root@cluster-11 rds]# uname -a
Linux cluster-11.ltc.austin.ibm.com 2.6.32-44.el6.ppc64 #1 SMP Wed Jul 7 16:39:25 EDT 2010 ppc64 ppc64 ppc64 GNU/Linux
[root@cluster-11 rds]# pwd
/lib/modules/2.6.32-44.el6.ppc64/kernel/net/rds
[root@cluster-11 rds]# ls
rds.ko  rds_rdma.ko  rds_tcp.ko

Regards,

Higor
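The manual check above (looking under /lib/modules/<release>/kernel/net/rds) can be wrapped in a small function. This is only a sketch: the function name list_rds_modules and its two optional arguments are hypothetical, and it assumes the module files follow the layout shown in the session above.

```shell
#!/bin/sh
# Hypothetical helper: list the RDS module files installed for a kernel.
# Usage: list_rds_modules [kernel-release] [modules-root]
list_rds_modules() {
    release=${1:-$(uname -r)}
    root=${2:-/lib/modules}
    dir="$root/$release/kernel/net/rds"
    if [ ! -d "$dir" ]; then
        echo "no RDS module directory: $dir" >&2
        return 1
    fi
    # Match plain and compressed module files (rds.ko, rds.ko.xz, ...).
    find "$dir" -name '*.ko*' -exec basename {} \; | LC_ALL=C sort
}
```

A non-zero return status means the directory itself is missing for that kernel. An empty listing means the directory exists but contains no module files.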
Comment 11 IBM Bug Proxy 2010-08-09 18:00:58 EDT
------- Comment From pradeep@us.ibm.com 2010-08-09 17:53 EDT-------
Even with the latest snapshot (2.6.32-54.el6.ppc64) we still see errors like the following:
[root@cluster-11 ~]# rds-info
rds-generic-tool: Unable get statistics: Protocol not available
rds-generic-tool: Unable get statistics: Protocol not available

Counters:
CounterName            Value
conn_reset                0
recv_drop_bad_checksum                0
recv_drop_old_seq                0
recv_drop_no_sock                0
recv_drop_dead_sock                0
recv_deliver_raced                0
recv_delivered                0
recv_queued                0
recv_immediate_retry                0
recv_delayed_retry                0
recv_ack_required                0
recv_rdma_bytes                0
recv_ping                0
send_queue_empty                0
send_queue_full                0
send_sem_contention                0
send_sem_queue_raced                0
send_immediate_retry                0
send_delayed_retry                0
send_drop_acked                0
send_ack_required                0
send_queued                0
send_rdma                0
send_rdma_bytes                0
send_pong                0
page_remainder_hit                0
page_remainder_miss                0
copy_to_user                0
copy_from_user                0
cong_update_queued                0
cong_update_received                0
cong_send_error                0
cong_send_blocked                0

RDS Sockets:
BoundAddr BPort        ConnAddr CPort     SndBuf     RcvBuf    Inode
0.0.0.0     0         0.0.0.0     0      62464      62464   224602

RDS Connections:
LocalAddr      RemoteAddr           NextTX           NextRX Flg

Receive Message Queue:
LocalAddr LPort      RemoteAddr RPort              Seq      Bytes

Send Message Queue:
LocalAddr LPort      RemoteAddr RPort              Seq      Bytes

Retransmit Message Queue:
LocalAddr LPort      RemoteAddr RPort              Seq      Bytes
[root@cluster-11 ~]# rds-ping 10.1.1.120
bind() failed, errno: 99 (Cannot assign requested address)
[root@cluster-11 ~]# lsmod | grep rds
rds                   110490  0
[root@cluster-11 ~]# rds-ping -I 10.1.1.110 10.1.1.120
bind() failed, errno: 99 (Cannot assign requested address)
[root@cluster-11 ~]#

I happened to see some old OFED documentation that stated that RDS was not supported on ppc64. Is that still true with RHEL 6?
Comment 12 IBM Bug Proxy 2010-08-09 19:00:41 EDT
------- Comment From pradeep@us.ibm.com 2010-08-09 18:54 EDT-------
I loaded rds-tcp, and then "rds-ping -c 3 10.1.1.120" no longer seemed to error out. Is this expected behavior? LOAD_RDS=yes was set in /etc/rdma/rdma.conf.
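Before retrying rds-ping, it can help to see which RDS modules are actually loaded, since the earlier lsmod output listed only rds. The sketch below reads lsmod-style text on standard input and prints the modules that are missing. The function name missing_rds_modules is hypothetical, and the module names rds, rds_tcp and rds_rdma are taken from the .ko files listed earlier in this report (lsmod shows them with underscores).

```shell
#!/bin/sh
# Hypothetical helper: read `lsmod`-style text on stdin and print each
# RDS module that is NOT loaded, one per line.
missing_rds_modules() {
    # The first column of lsmod output is the module name.
    loaded=$(awk '{ print $1 }')
    for mod in rds rds_tcp rds_rdma; do
        # grep -x requires the whole line to match, so "rds" does not
        # match "rds_tcp".
        if ! printf '%s\n' "$loaded" | grep -qx "$mod"; then
            echo "$mod"
        fi
    done
}
```

Typical use would be "lsmod | missing_rds_modules". A missing transport can then be loaded as root with modprobe (for example "modprobe rds_tcp"). This only makes the module state visible; it does not explain why LOAD_RDS=yes did not load a transport module.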
Comment 13 IBM Bug Proxy 2010-08-09 19:40:31 EDT
------- Comment From pradeep@us.ibm.com 2010-08-09 19:31 EDT-------
Even though a server is running on the remote end, rds-stress fails as follows:

rds-stress -s 10.1.1.110 -p 4000
connecting to 10.1.1.110:4000: No route to host
connect(10.1.1.110) failed[root@cluster-12 ~]#
Comment 14 Andy Grover 2010-08-09 21:05:28 EDT
Doug,

Any chance you can upgrade rds-tools to latest ver, 2.0.4? It'd be preferable to have the latest version installed if there are bugs being reported.

It can be found here:

http://oss.oracle.com/projects/rds/dist/files/sources/rds-tools-2.0.4.tar.gz

Along with the usual bugfixes, it also splits the rds headers & manpages out of rds-tools into a separate rds-devel rpm.
Comment 15 Ben 2010-08-27 16:28:41 EDT
Waiting on IBM for text of release note documenting shortcomings in this feature (see connected IT for details)
Comment 17 Ben 2010-08-30 13:55:41 EDT
    Technical note added. If any revisions are required, please edit the "Technical Notes" field
    accordingly. All revisions will be proofread by the Engineering Content Services team.
    
    New Contents:
rds-ping may fail with the following error : "bind() failed, errno: 99 (Cannot assign requested address)". This error may occur even with LOAD_RDS=yes set in /etc/rdma/rdma.conf.

The work around to this problem is to load the rds-tcp module. After this, rds-ping works as expected.

rds-stress on the client encounters the following error when it attempts to connect to the server :

connecting to <server IP address>:4000: No route to host
connect(<server IP address>) failed#

At this point there is no known work around for this problem.
Comment 18 Ben 2010-08-30 16:07:28 EDT
    Technical note updated.
    
    Diffed Contents:
@@ -1,10 +1,10 @@
-rds-ping may fail with the following error : "bind() failed, errno: 99 (Cannot assign requested address)". This error may occur even with LOAD_RDS=yes set in /etc/rdma/rdma.conf.
+Known issues with RDS user tools.
+1) 'rds-ping' may fail with the following error : "bind() failed, errno: 99 (Cannot assign requested address)". This error may occur even with 'LOAD_RDS=yes' set in '/etc/rdma/rdma.conf'.
+The work around to this problem is to load the 'rds-tcp' module. After this, 'rds-ping' works as expected.
 
-The work around to this problem is to load the rds-tcp module. After this, rds-ping works as expected.
-
-rds-stress on the client encounters the following error when it attempts to connect to the server :
-
+2) 'rds-stress' on the client encounters the following error when it attempts to connect to the server :
+-----------
 connecting to <server IP address>:4000: No route to host
 connect(<server IP address>) failed#
-
+-----------
 At this point there is no known work around for this problem.
Comment 20 Ryan Lerch 2010-09-02 18:56:02 EDT
    Technical note updated.
    
    Diffed Contents:
@@ -1,10 +1,10 @@
-Known issues with RDS user tools.
-1) 'rds-ping' may fail with the following error : "bind() failed, errno: 99 (Cannot assign requested address)". This error may occur even with 'LOAD_RDS=yes' set in '/etc/rdma/rdma.conf'.
-The work around to this problem is to load the 'rds-tcp' module. After this, 'rds-ping' works as expected.
+* Running the rds-ping command may fail, returing the error:
 
-2) 'rds-stress' on the client encounters the following error when it attempts to connect to the server :
------------
+bind() failed, errno: 99 (Cannot assign requested address). 
+							
+Note, also that this error may occur even with LOAD_RDS=yes set in /etc/rdma/rdma.conf. To work around this issue, load the rds-tcp module.
+
+* Running the command rds-stress on a client may result in the following error attempting to connect to the server:
+
 connecting to <server IP address>:4000: No route to host
-connect(<server IP address>) failed#
+connect(<server IP address>) failed#------------
-At this point there is no known work around for this problem.
Comment 21 IBM Bug Proxy 2010-09-28 12:21:13 EDT
------- Comment From tpnoonan@us.ibm.com 2010-09-28 12:17 EDT-------
This feature is not verifiable on RHEL 6.0 due to defects; a RHEL 6 tech note has been requested and a defect has been opened for RHEL 6.1.
Comment 22 Ben 2010-10-08 13:31:53 EDT
    Technical note updated.
    
    Diffed Contents:
@@ -1,4 +1,4 @@
-* Running the rds-ping command may fail, returing the error:
+* Running the rds-ping command may fail, returning the error:
 
 bind() failed, errno: 99 (Cannot assign requested address).
Comment 23 releng-rhel@redhat.com 2010-11-11 11:22:21 EST
Red Hat Enterprise Linux 6.0 is now available and should resolve
the problem described in this bug report. This report is therefore being closed
with a resolution of CURRENTRELEASE. You may reopen this bug report if the
solution does not work for you.
