Bug 765311 (GLUSTER-3579) - new feature request: rdma bonding in glusterfs ib-verbs
Summary: new feature request: rdma bonding in glusterfs ib-verbs
Keywords:
Status: CLOSED NOTABUG
Alias: GLUSTER-3579
Product: GlusterFS
Classification: Community
Component: rdma
Version: 3.2.3
Hardware: x86_64
OS: Linux
Priority: low
Severity: low
Target Milestone: ---
Assignee: Anand Avati
QA Contact:
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2011-09-17 08:40 UTC by Oliver Deppert
Modified: 2015-12-01 16:45 UTC
CC: 5 users

Fixed In Version:
Clone Of:
Environment:
Last Closed: 2014-07-25 20:35:55 UTC
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Embargoed:



Description Oliver Deppert 2011-09-17 08:40:37 UTC
Hi,

I'm using a three-node cluster infrastructure. Every node has a two-port InfiniBand HCA, and all ports are connected to a single unmanaged IB switch. I have bonded the two ports at the TCP layer, both to double the throughput and to provide failover at that layer.

As far as I know, Open MPI, which also uses ib-verbs, is able to use both ports of the HCA in a way similar to TCP bonding. It automatically recognizes that the two ports are on the same IB fabric and uses them together in the RDMA stack, so MPI throughput is doubled natively as soon as two ports are available.

A nice enhancement for glusterfs would be an option that allows the user to use an RDMA-bonded two-port HCA. At the moment the user can choose between physical port one or two of the HCA, but there is no way to use both ports simultaneously. It would be nice if glusterfs (like Open MPI) recognized all active ports, tested whether they are on the same IB fabric, and then used both ports with a choice of load-balancing (round robin, etc.) or failover methods.

Because glusterfs also uses ib-verbs (the transport is now called rdma), I thought it should be possible to use two ports bonded in the RDMA stack.
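
Just to illustrate the port-detection part of the request: a minimal sketch (not GlusterFS code, only standard libibverbs calls such as ibv_get_device_list, ibv_query_port, and ibv_query_gid) that lists every ACTIVE port and its subnet prefix could look like this. Two ports reporting the same subnet prefix would be candidates for the "same fabric" bonding described above.

/*
 * Sketch: enumerate HCAs and list active ports with their subnet prefix.
 * Error handling is intentionally minimal.
 */
#include <stdio.h>
#include <infiniband/verbs.h>

int main(void)
{
    int num_devices = 0;
    struct ibv_device **devs = ibv_get_device_list(&num_devices);
    if (!devs)
        return 1;

    for (int i = 0; i < num_devices; i++) {
        struct ibv_context *ctx = ibv_open_device(devs[i]);
        struct ibv_device_attr dev_attr;

        if (!ctx)
            continue;
        if (ibv_query_device(ctx, &dev_attr)) {
            ibv_close_device(ctx);
            continue;
        }

        /* Port numbers are 1-based in the verbs API. */
        for (uint8_t port = 1; port <= dev_attr.phys_port_cnt; port++) {
            struct ibv_port_attr port_attr;
            union ibv_gid gid;

            if (ibv_query_port(ctx, port, &port_attr))
                continue;
            if (port_attr.state != IBV_PORT_ACTIVE)
                continue;

            /* GID index 0 carries the subnet prefix assigned by the subnet
             * manager (shown raw, i.e. in network byte order). */
            if (ibv_query_gid(ctx, port, 0, &gid))
                continue;

            printf("%s port %u: ACTIVE, subnet prefix 0x%016llx\n",
                   ibv_get_device_name(devs[i]), port,
                   (unsigned long long)gid.global.subnet_prefix);
        }
        ibv_close_device(ctx);
    }
    ibv_free_device_list(devs);
    return 0;
}

Something along these lines could let the transport discover bondable port pairs automatically instead of making the user pick port one or two by hand.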

regards,
Oliver

Comment 1 Amar Tumballi 2011-09-28 04:01:10 UTC
Hi Oliver,

We can work on this feature only after the 3.3.x release cycle.

Comment 2 Amar Tumballi 2012-02-27 10:36:06 UTC
This is not a priority for the immediate future (before the 3.3.0 GA release). We will bump the priority up once we take up RDMA-related tasks.

Comment 4 Andrei Mikhailovsky 2013-02-05 14:57:18 UTC
Has this feature been added in 3.3.0?

Comment 5 Ben England 2014-04-25 13:09:20 UTC
Why do you need bonding? If you have 40-Gbps (QDR) or 56-Gbps IB infrastructure, surely you don't need it. Also, there are other ways to employ multiple network ports besides bonding: if you are not using native glusterfs, you could use one port for non-Gluster traffic and one port for Gluster traffic on a separate network.

I heard Gluster is now integrated with librdmacm; does librdmacm support bonding? For that matter, does IPoIB support bonding?

If you have 10-Gbps Ethernet ports, you can use "balance-alb" NIC bonding with jumbo frames (MTU=9000), and this will get you up to 1.5 GB/s of network throughput for a single Gluster client (I know, it should be closer to 2 GB/s, but it's a lot better than with one port). This gives a considerable performance gain without using RDMA at all.

Comment 6 Ben England 2014-07-25 20:35:55 UTC
Since there have been no further comments and the original post is 3 years old, I am closing this bug.
