Bug 765311 - (GLUSTER-3579) new feature request: rdma bonding in glusterfs ib-verbs
Product: GlusterFS
Classification: Community
Component: rdma
Hardware: x86_64 Linux
Priority: low, Severity: low
Assigned To: Anand Avati
Keywords: FutureFeature
Reported: 2011-09-17 04:40 EDT by Oliver Deppert
Modified: 2015-12-01 11:45 EST
Doc Type: Enhancement
Last Closed: 2014-07-25 16:35:55 EDT

Description Oliver Deppert 2011-09-17 04:40:37 EDT

I'm using a three-node cluster. Every node has a two-port InfiniBand HCA, and all ports are connected to a single unmanaged IB switch. I've managed to bond the two ports at the TCP layer, both to double the throughput and for failover.

As far as I know, OpenMPI, which also uses ib-verbs, can use the two ports of the HCA in much the same way as TCP bonding. It automatically recognizes that the two ports are in the same IB fabric and bonds them in the rdma stack, so MPI throughput doubles natively as soon as two ports are available.

A nice enhancement for glusterfs would be an option that lets the user choose an rdma-bonded two-port HCA. At the moment the user can select either physical port one or port two of the HCA, but there is no way to use both ports simultaneously. It would be nice if glusterfs (like OpenMPI) recognized all active ports, tested whether they are in the same IB fabric, and then used both ports under a selectable load-balancing (round robin, etc.) or failover policy.

Because glusterfs also uses ib-verbs (now called the rdma transport), I thought it might be possible to bond two ports in the rdma stack.
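
To make the requested behaviour concrete, here is a minimal sketch of the port-selection logic in Python. It is not glusterfs code and does not call any verbs API: the `Port` record and the function names are hypothetical, and "same IB fabric" is approximated by comparing a subnet prefix, which is an assumption made for illustration.

```python
from dataclasses import dataclass
from itertools import cycle
from typing import Iterator, List


@dataclass(frozen=True)
class Port:
    """Hypothetical description of one HCA port (not a glusterfs type)."""
    device: str         # HCA device name, e.g. "mlx4_0"
    number: int         # physical port number on the HCA (1 or 2)
    active: bool        # True if the link is up and usable
    subnet_prefix: int  # assumed to be equal for ports in the same fabric


def usable_ports(ports: List[Port]) -> List[Port]:
    """Keep active ports that share a fabric with the first active port."""
    active = [p for p in ports if p.active]
    if not active:
        return []
    prefix = active[0].subnet_prefix
    return [p for p in active if p.subnet_prefix == prefix]


def round_robin(ports: List[Port]) -> Iterator[Port]:
    """Load balancing: alternate endlessly between all usable ports."""
    candidates = usable_ports(ports)
    if not candidates:
        raise RuntimeError("no active port available")
    return cycle(candidates)


def failover(ports: List[Port]) -> Port:
    """Failover: use the lowest-numbered usable port."""
    candidates = usable_ports(ports)
    if not candidates:
        raise RuntimeError("no active port available")
    return min(candidates, key=lambda p: p.number)
```

With both ports active, `round_robin` alternates between port 1 and port 2. If port 1 goes down, `failover` returns port 2. A real implementation would also have to re-evaluate the port list when link state changes, which this sketch leaves out.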

Comment 1 Amar Tumballi 2011-09-28 00:01:10 EDT
Hi Oliver,

We can work on this feature only after the 3.3.x release cycle.
Comment 2 Amar Tumballi 2012-02-27 05:36:06 EST
This is the priority for the immediate future (before the 3.3.0 GA release). Will bump the priority up once we take up RDMA-related tasks.
Comment 4 Andrei Mikhailovsky 2013-02-05 09:57:18 EST
Has this feature been added in 3.3.0?
Comment 5 Ben England 2014-04-25 09:09:20 EDT
Why do you need bonding? If you have 40-Gbps (QDR) or 56-Gbps IB infrastructure, surely you don't need it. Also, there are other ways to employ multiple network ports besides bonding: if you are not using native glusterfs, you could use one port for non-Gluster traffic and one port for Gluster traffic on a separate network.

I heard Gluster is now integrated with librdmacm. Does librdmacm support bonding? For that matter, does IPoIB support bonding?

If you have 10-Gbps Ethernet ports, you can use "balance-alb" NIC bonding with jumbo frames (MTU=9000), and this will get you up to 1.5 GB/s network throughput for a single Gluster client (I know, it should be 2 GB/s, but it's a lot better than with 1 port). This gets you a considerable performance gain without using RDMA at all.
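
For anyone trying the Ethernet bonding suggestion, a small sketch like the following could confirm that an existing bond matches it. It reads the Linux bonding driver's sysfs files (`bonding/mode`, `bonding/slaves`, and `mtu` under `/sys/class/net/<bond>`). The function names and the bond name `bond0` are hypothetical, and the `sysfs_root` parameter exists only so the check can be pointed at a different directory.

```python
from pathlib import Path


def bond_summary(bond: str, sysfs_root: str = "/sys/class/net") -> dict:
    """Read mode, member interfaces, and MTU of a Linux bond from sysfs."""
    base = Path(sysfs_root) / bond
    # The mode file holds the mode name followed by its number,
    # e.g. "balance-alb 6"; only the name is kept here.
    mode = (base / "bonding" / "mode").read_text().split()[0]
    # The slaves file is a space-separated list of member interfaces.
    slaves = (base / "bonding" / "slaves").read_text().split()
    mtu = int((base / "mtu").read_text())
    return {"mode": mode, "slaves": slaves, "mtu": mtu}


def matches_suggestion(summary: dict) -> bool:
    """True if the bond is balance-alb, MTU 9000, with at least two ports."""
    return (
        summary["mode"] == "balance-alb"
        and summary["mtu"] == 9000
        and len(summary["slaves"]) >= 2
    )
```

Usage would be along the lines of `matches_suggestion(bond_summary("bond0"))`. This only checks the local bond settings. Jumbo frames also need to be enabled on the switch and on the peer hosts, which this sketch does not verify.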
Comment 6 Ben England 2014-07-25 16:35:55 EDT
Since there have been no further comments and the original post is 3 years old, I am closing this.
