Red Hat Bugzilla – Bug 463800
[LTC 6.0 FEAT] 201588:Enable RDS in OFED InfiniBand kernel and include RDS user tools
Last modified: 2018-10-27 06:37:59 EDT
=Comment: #0=================================================
Emily J. Ratliff <emilyr@us.ibm.com> - 2008-09-24 13:51 EDT

1. Feature Overview:
Feature Id: [201588]
a. Name of Feature: Enable RDS in OFED InfiniBand kernel and include RDS user tools
b. Feature Description
RDS is a reliable datagram socket implementation for Oracle RAC over InfiniBand. This feature is part of the OFED-1.3 release. It includes the kernel RDS component.

2. Feature Details:
Sponsor: LTC
Architectures: x86 ppc64
Arch Specificity: Purely Common Code
Affects Kernel Modules: Yes
Delivery Mechanism: Direct from community
Category: Networking
Request Type: Package - Feature from Upstream
d. Upstream Acceptance: In Progress
Sponsor Priority
IBM Confidential: no
Code Contribution: 3rd party code
g. Component Version Target: OFED

3. Business Case
IBM Oracle RAC middleware enablement is scheduled in April.09. This enablement depends on this feature being in the distro from OFED-1.3.

4. Primary contact at Red Hat:
John Jarvis jjarvis@redhat.com

5. Primary contacts at Partner:
Project Management Contact: Mike Wortman, wortman@us.ibm.com, 512-838-8582
Technical contact(s): Shirley Ma, xma@us.ibm.com
IBM Manager: Larry Kessler, lkessler@us.ibm.com
IBM, just to clarify: are the RDS user tools you refer to the rds-tools/1.5/1.el6/x86_64/rds-tools-1.5-1.el6.x86_64.rpm package currently in the nightly RHEL 6.0 builds? If so, this is already included.
------- Comment From halves@linux.vnet.ibm.com 2010-05-10 16:10 EDT-------
Verified on RHEL6 Snap2:
I found rds-tools on RHEL6 Snap2, but the rds-info command failed.

[root@cluster-6 iperf-2.0.4]# rds-info
rds-generic-tool: Unable to create socket: Address family not supported by protocol

I tried to find an rds kernel module, but could not find one. I looked inside the kernel source and found an rds directory with only Kconfig and Makefile files inside.

[root@cluster-5 net]# cd /usr/src/kernels/2.6.32-23.el6.ppc64/net/rds
[root@cluster-5 rds]# ls
Kconfig  Makefile
[root@cluster-5 rds]#

Regards,
Higor
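The "Address family not supported by protocol" message above is the text for EAFNOSUPPORT, which is what socket creation returns when the kernel has no RDS support available. A minimal sketch of the same check in Python follows. The function name rds_socket_status is hypothetical, and the fallback value 21 for AF_RDS is an assumption taken from the Linux headers for Python builds that do not export socket.AF_RDS. On Linux, creating the socket may also cause the kernel to try to autoload the rds module.

```python
import errno
import socket

# Assumption: AF_RDS is 21 on Linux. Python only exports socket.AF_RDS
# when it was built against headers that define it.
AF_RDS = getattr(socket, "AF_RDS", 21)


def rds_socket_status():
    """Try to create an RDS socket and describe the outcome as a string."""
    try:
        # RDS sockets are created as SOCK_SEQPACKET with protocol 0.
        sock = socket.socket(AF_RDS, socket.SOCK_SEQPACKET, 0)
    except OSError as exc:
        if exc.errno == errno.EAFNOSUPPORT:
            # Same condition that rds-info reported in the comment above.
            return "unsupported: address family not supported (no RDS in kernel?)"
        name = errno.errorcode.get(exc.errno, str(exc.errno))
        return "error: " + name
    sock.close()
    return "available"


if __name__ == "__main__":
    print(rds_socket_status())
```

A result of "available" only shows that the core rds module is present. It says nothing about whether a transport such as rds_rdma or rds_tcp is loaded.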
------- Comment From halves@linux.vnet.ibm.com 2010-05-13 15:45 EDT-------
Hello,
I cannot find the rds kernel modules on RHEL6 Snap3.
Regards,
Higor
------- Comment From halves@linux.vnet.ibm.com 2010-05-27 17:09 EDT-------
Hello,
I did not find the rds driver on RHEL6 Snap5-early (RHEL6.0-20100523.0); I found only the rds-tools package.
Regards,
Higor
You must be on powerpc. I just looked in the configs and it's enabled for other arches, but not powerpc. I'll post about it internally to get it enabled on ppc as well.
------- Comment From pradeep@us.ibm.com 2010-05-27 17:37 EDT-------
(In reply to comment #11)
> You must be on powerpc. I just looked in the configs and it's enabled for
> other arches, but not powerpc. I'll post about it internally to get it enabled
> on ppc as well.

Thanks Doug. Yes, we are using ppc64.
------- Comment From halves@linux.vnet.ibm.com 2010-07-14 15:18 EDT-------
Hello Team,
I found the rds drivers on RHEL6 Snap7.

[root@cluster-11 rds]# uname -a
Linux cluster-11.ltc.austin.ibm.com 2.6.32-44.el6.ppc64 #1 SMP Wed Jul 7 16:39:25 EDT 2010 ppc64 ppc64 ppc64 GNU/Linux
[root@cluster-11 rds]# pwd
/lib/modules/2.6.32-44.el6.ppc64/kernel/net/rds
[root@cluster-11 rds]# ls
rds.ko  rds_rdma.ko  rds_tcp.ko

Regards,
Higor
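The manual check above (looking under /lib/modules/<kernel release>/kernel/net/rds) can be sketched as a small Python helper, which is convenient when comparing several snapshots. The function name find_rds_modules and its parameters are hypothetical. The directory layout is taken from the listing in the comment above and may differ on other distributions.

```python
import os


def find_rds_modules(modules_root="/lib/modules", release=None):
    """List RDS module files installed for a kernel release.

    Returns a sorted list of file names, or an empty list when the
    net/rds directory does not exist for that release.
    """
    if release is None:
        # Default to the running kernel, like `uname -r`.
        release = os.uname().release
    rds_dir = os.path.join(modules_root, release, "kernel", "net", "rds")
    if not os.path.isdir(rds_dir):
        return []
    # Keep only module files such as rds.ko; ignore anything else.
    return sorted(
        name for name in os.listdir(rds_dir)
        if name.startswith("rds") and ".ko" in name
    )


if __name__ == "__main__":
    modules = find_rds_modules()
    print(modules if modules else "no RDS modules found for this kernel")
```

An empty result would correspond to the situation reported for the earlier snapshots, where only the rds-tools package was present.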
------- Comment From pradeep@us.ibm.com 2010-08-09 17:53 EDT-------
Even with the latest snapshot (2.6.32-54.el6.ppc64) we still see errors like the following:

[root@cluster-11 ~]# rds-info
rds-generic-tool: Unable get statistics: Protocol not available
rds-generic-tool: Unable get statistics: Protocol not available

Counters:
CounterName               Value
conn_reset                0
recv_drop_bad_checksum    0
recv_drop_old_seq         0
recv_drop_no_sock         0
recv_drop_dead_sock       0
recv_deliver_raced        0
recv_delivered            0
recv_queued               0
recv_immediate_retry      0
recv_delayed_retry        0
recv_ack_required         0
recv_rdma_bytes           0
recv_ping                 0
send_queue_empty          0
send_queue_full           0
send_sem_contention       0
send_sem_queue_raced      0
send_immediate_retry      0
send_delayed_retry        0
send_drop_acked           0
send_ack_required         0
send_queued               0
send_rdma                 0
send_rdma_bytes           0
send_pong                 0
page_remainder_hit        0
page_remainder_miss       0
copy_to_user              0
copy_from_user            0
cong_update_queued        0
cong_update_received      0
cong_send_error           0
cong_send_blocked         0

RDS Sockets:
BoundAddr  BPort  ConnAddr  CPort  SndBuf  RcvBuf  Inode
0.0.0.0    0      0.0.0.0   0      62464   62464   224602

RDS Connections:
LocalAddr  RemoteAddr  NextTX  NextRX  Flg

Receive Message Queue:
LocalAddr  LPort  RemoteAddr  RPort  Seq  Bytes

Send Message Queue:
LocalAddr  LPort  RemoteAddr  RPort  Seq  Bytes

Retransmit Message Queue:
LocalAddr  LPort  RemoteAddr  RPort  Seq  Bytes

[root@cluster-11 ~]# rds-ping 10.1.1.120
bind() failed, errno: 99 (Cannot assign requested address)
[root@cluster-11 ~]# lsmod | grep rds
rds                   110490  0
[root@cluster-11 ~]# rds-ping -I 10.1.1.110 10.1.1.120
bind() failed, errno: 99 (Cannot assign requested address)
[root@cluster-11 ~]#

I happened to see some old OFED documentation stating that RDS was not supported on ppc64. Is that still true with RHEL6?
------- Comment From pradeep@us.ibm.com 2010-08-09 18:54 EDT-------
I loaded rds-tcp, and then "rds-ping -c 3 10.1.1.120" seemed not to error out. Is this expected behavior? LOAD_RDS=yes was set in /etc/rdma/rdma.conf.
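In the earlier lsmod output only the core rds module was listed, and the comment above reports that rds-ping stopped failing once rds-tcp was loaded. A small sketch for spotting that state from the contents of /proc/modules is given below. The function names are hypothetical, and the transport module names rds_rdma and rds_tcp are taken from the module listing earlier in this report. Whether a missing transport is the actual cause of the bind() failure is an assumption based on that observation, not something confirmed in this report.

```python
def loaded_rds_modules(proc_modules_text):
    """Return the sorted names of loaded RDS modules.

    The input is text in the /proc/modules or lsmod layout, where the
    module name is the first whitespace-separated field on each line.
    """
    names = []
    for line in proc_modules_text.splitlines():
        fields = line.split()
        if fields and (fields[0] == "rds" or fields[0].startswith("rds_")):
            names.append(fields[0])
    return sorted(names)


def missing_transports(loaded, transports=("rds_rdma", "rds_tcp")):
    """Return the transport modules that are not in the loaded list."""
    return [name for name in transports if name not in loaded]


if __name__ == "__main__":
    # Sample line copied from the lsmod output earlier in this report.
    sample = "rds                   110490  0\n"
    loaded = loaded_rds_modules(sample)
    print("loaded:", loaded)
    print("missing transports:", missing_transports(loaded))
```

On a live system the input text could be read from /proc/modules instead of the sample string.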
------- Comment From pradeep@us.ibm.com 2010-08-09 19:31 EDT-------
Even though a server is running on the remote end, rds-stress fails as follows:

rds-stress -s 10.1.1.110 -p 4000
connecting to 10.1.1.110:4000: No route to host
connect(10.1.1.110) failed
[root@cluster-12 ~]#
Doug, any chance you can upgrade rds-tools to the latest version, 2.0.4? It would be preferable to have the latest version installed if bugs are being reported. It can be found here: http://oss.oracle.com/projects/rds/dist/files/sources/rds-tools-2.0.4.tar.gz Along with the usual bugfixes, it also splits the rds headers and manpages out of rds-tools into a separate rds-devel rpm.
Waiting on IBM for text of release note documenting shortcomings in this feature (see connected IT for details)
Technical note added. If any revisions are required, please edit the "Technical Notes" field accordingly. All revisions will be proofread by the Engineering Content Services team.

New Contents:
rds-ping may fail with the following error : "bind() failed, errno: 99 (Cannot assign requested address)". This error may occur even with LOAD_RDS=yes set in /etc/rdma/rdma.conf.

The work around to this problem is to load the rds-tcp module. After this, rds-ping works as expected.

rds-stress on the client encounters the following error when it attempts to connect to the server :

connecting to <server IP address>:4000: No route to host
connect(<server IP address>) failed#

At this point there is no known work around for this problem.
Technical note updated. If any revisions are required, please edit the "Technical Notes" field accordingly. All revisions will be proofread by the Engineering Content Services team.

Diffed Contents:
@@ -1,10 +1,10 @@
-rds-ping may fail with the following error : "bind() failed, errno: 99 (Cannot assign requested address)". This error may occur even with LOAD_RDS=yes set in /etc/rdma/rdma.conf.
+Known issues with RDS user tools.
+1) 'rds-ping' may fail with the following error : "bind() failed, errno: 99 (Cannot assign requested address)". This error may occur even with 'LOAD_RDS=yes' set in '/etc/rdma/rdma.conf'.
+The work around to this problem is to load the 'rds-tcp' module. After this, 'rds-ping' works as expected.
 
-The work around to this problem is to load the rds-tcp module. After this, rds-ping works as expected.
-
-rds-stress on the client encounters the following error when it attempts to connect to the server :
-
+2) 'rds-stress' on the client encounters the following error when it attempts to connect to the server :
+-----------
 connecting to <server IP address>:4000: No route to host
 connect(<server IP address>) failed#
-
+-----------
 At this point there is no known work around for this problem.
Technical note updated. If any revisions are required, please edit the "Technical Notes" field accordingly. All revisions will be proofread by the Engineering Content Services team.

Diffed Contents:
@@ -1,10 +1,10 @@
-Known issues with RDS user tools.
-1) 'rds-ping' may fail with the following error : "bind() failed, errno: 99 (Cannot assign requested address)". This error may occur even with 'LOAD_RDS=yes' set in '/etc/rdma/rdma.conf'.
-The work around to this problem is to load the 'rds-tcp' module. After this, 'rds-ping' works as expected.
+* Running the rds-ping command may fail, returing the error:
 
-2) 'rds-stress' on the client encounters the following error when it attempts to connect to the server :
------------
+bind() failed, errno: 99 (Cannot assign requested address).
+
+Note, also that this error may occur even with LOAD_RDS=yes set in /etc/rdma/rdma.conf. To work around this issue, load the rds-tcp module.
+
+* Running the command rds-stress on a client may result in the following error attempting to connect to the server:
+
 connecting to <server IP address>:4000: No route to host
-connect(<server IP address>) failed#
+connect(<server IP address>) failed#
------------
-At this point there is no known work around for this problem.
------- Comment From tpnoonan@us.ibm.com 2010-09-28 12:17 EDT-------
This feature is not verifiable on RHEL 6.0 due to defects. A RHEL 6 tech note has been requested, and a defect has been opened for RHEL 6.1.
Technical note updated. If any revisions are required, please edit the "Technical Notes" field accordingly. All revisions will be proofread by the Engineering Content Services team.

Diffed Contents:
@@ -1,4 +1,4 @@
-* Running the rds-ping command may fail, returing the error:
+* Running the rds-ping command may fail, returning the error:
 
 bind() failed, errno: 99 (Cannot assign requested address).
Red Hat Enterprise Linux 6.0 is now available and should resolve the problem described in this bug report. This report is therefore being closed with a resolution of CURRENTRELEASE. You may reopen this bug report if the solution does not work for you.