Bug 1593079

Summary: IO performance is slow
Product: [Community] GlusterFS
Reporter: Thang <thangvubk>
Component: rdma
Assignee: bugs <bugs>
Status: CLOSED WONTFIX
Severity: urgent
Priority: medium
Version: mainline
CC: atumball, bugs, thangvubk
Flags: thangvubk: needinfo-
Hardware: x86_64
OS: Linux
Type: Bug
Last Closed: 2019-06-17 11:10:54 UTC

Description Thang 2018-06-20 03:03:59 UTC
Description of problem:
I'm trying to deploy a bare-metal cluster using GlusterFS over RDMA, but the IO performance is not as fast as expected.
The GlusterFS cluster has 3 nodes, each using HDDs in a RAID 5 configuration.
The network is 56G InfiniBand.

Version-Release number of selected component (if applicable):


How reproducible:
I used gluster-kubernetes and followed the tutorial on the gluster-kubernetes GitHub repository.
I have also reported an issue at "https://github.com/gluster/gluster-kubernetes/issues/480"

Actual results:
IO speed is 14.2 MB/s

Expected results:
About 200 MB/s

Additional info:
First, I tested the network; the bandwidth is about 36 Gbps, as shown below.
"iperf -c 192.168.0.101 -P 16
------------------------------------------------------------
Client connecting to 192.168.0.101, TCP port 5001
TCP window size: 85.0 KByte (default)
------------------------------------------------------------
[ 18] local 192.168.0.102 port 46752 connected with 192.168.0.101 port 5001
[  3] local 192.168.0.102 port 46722 connected with 192.168.0.101 port 5001
[  4] local 192.168.0.102 port 46724 connected with 192.168.0.101 port 5001
[  5] local 192.168.0.102 port 46726 connected with 192.168.0.101 port 5001
[  7] local 192.168.0.102 port 46730 connected with 192.168.0.101 port 5001
[  6] local 192.168.0.102 port 46728 connected with 192.168.0.101 port 5001
[  8] local 192.168.0.102 port 46732 connected with 192.168.0.101 port 5001
[  9] local 192.168.0.102 port 46734 connected with 192.168.0.101 port 5001
[ 10] local 192.168.0.102 port 46736 connected with 192.168.0.101 port 5001
[ 11] local 192.168.0.102 port 46738 connected with 192.168.0.101 port 5001
[ 12] local 192.168.0.102 port 46740 connected with 192.168.0.101 port 5001
[ 13] local 192.168.0.102 port 46741 connected with 192.168.0.101 port 5001
[ 14] local 192.168.0.102 port 46744 connected with 192.168.0.101 port 5001
[ 16] local 192.168.0.102 port 46748 connected with 192.168.0.101 port 5001
[ 15] local 192.168.0.102 port 46746 connected with 192.168.0.101 port 5001
[ 17] local 192.168.0.102 port 46750 connected with 192.168.0.101 port 5001
[ ID] Interval       Transfer     Bandwidth
[ 18]  0.0-10.0 sec  1.65 GBytes  1.42 Gbits/sec
[  3]  0.0-10.0 sec  2.65 GBytes  2.28 Gbits/sec
[  4]  0.0-10.0 sec  2.87 GBytes  2.47 Gbits/sec
[  5]  0.0-10.0 sec  3.08 GBytes  2.65 Gbits/sec
[  7]  0.0-10.0 sec  2.34 GBytes  2.01 Gbits/sec
[  6]  0.0-10.0 sec  2.22 GBytes  1.90 Gbits/sec
[  8]  0.0-10.0 sec  2.76 GBytes  2.37 Gbits/sec
[  9]  0.0-10.0 sec  3.38 GBytes  2.90 Gbits/sec
[ 10]  0.0-10.0 sec  2.73 GBytes  2.35 Gbits/sec
[ 11]  0.0-10.0 sec  3.00 GBytes  2.58 Gbits/sec
[ 12]  0.0-10.0 sec  2.84 GBytes  2.44 Gbits/sec
[ 14]  0.0-10.0 sec  3.08 GBytes  2.64 Gbits/sec
[ 16]  0.0-10.0 sec  2.84 GBytes  2.44 Gbits/sec
[ 15]  0.0-10.0 sec  2.57 GBytes  2.21 Gbits/sec
[ 13]  0.0-10.0 sec  2.16 GBytes  1.85 Gbits/sec
[ 17]  0.0-10.0 sec  1.75 GBytes  1.51 Gbits/sec
[SUM]  0.0-10.0 sec  41.9 GBytes  36.0 Gbits/sec"
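
Note that iperf over IPoIB only measures TCP throughput, not the raw RDMA path that the volume's rdma transport uses. As a sanity check (a suggested step, not something from the original report), the raw RDMA bandwidth between two nodes could be measured with the perftest tools, assuming they are installed on both hosts:

# On the server node (192.168.0.101), start the listener
ib_write_bw

# On the client node (192.168.0.102), run the RDMA write bandwidth test against it
ib_write_bw 192.168.0.101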

Next, I tested the raw local disk speed; it is about 220 MB/s.
"dd if=/dev/zero of=/test  bs=1M count=1024 oflag=sync
1024+0 records in
1024+0 records out
1073741824 bytes (1.1 GB, 1.0 GiB) copied, 4.88277 s, 220 MB/s"

But when I test the Gluster volume speed, it is very slow (14.2 MB/s).
"dd if=/dev/zero of=/mnt/gluster_vol/test  bs=1M count=100  oflag=sync
100+0 records in
100+0 records out
104857600 bytes (105 MB, 100 MiB) copied, 7.37125 s, 14.2 MB/s"
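
Note that oflag=sync makes dd wait for each 1 MiB block to be committed before writing the next one, so on a dispersed volume every block pays a full network and erasure-coding round trip; this is closer to a synchronous-write latency test than a throughput test. A rough throughput-oriented comparison (a hypothetical variant, not run in this report) would flush only once at the end:

# Write 100 MiB and flush once at the end instead of after every block
dd if=/dev/zero of=/mnt/gluster_vol/test bs=1M count=100 conv=fdatasync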

My volume configuration is as follows.
"gluster volume info vol_3d582863ad2c090a625dd11d548af391

Volume Name: vol_3d582863ad2c090a625dd11d548af391
Type: Disperse
Volume ID: e6538acf-ddff-4b8e-963f-7c42a84e38ac
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x (2 + 1) = 3
Transport-type: rdma
Bricks:
Brick1: 192.168.0.103:/var/lib/heketi/mounts/vg_0a0e479ba2cd16ec734f9937c3c0acb0/brick_54b322354a68d728dce1306dce1e1fc2/brick
Brick2: 192.168.0.101:/var/lib/heketi/mounts/vg_79369cf5dcc8f19f668b99cfa8fc1496/brick_e6a96af3ce804e4615b16f983e4fcc8a/brick
Brick3: 192.168.0.102:/var/lib/heketi/mounts/vg_4d261c2ab4baaf6ba936ddb7c5ee792d/brick_a90fff90e44b4e4c7236c6792f0b109e/brick
Options Reconfigured:
config.transport: rdma
transport.address-family: inet
nfs.disable: on"
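
One check worth doing with transport-type rdma (a suggested step, not part of the original report) is to confirm that the bricks actually registered an RDMA port; gluster volume status prints both a TCP Port and an RDMA Port column for each brick:

# Each brick should show a non-zero RDMA Port and be Online
gluster volume status vol_3d582863ad2c090a625dd11d548af391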

Comment 2 Niels de Vos 2018-06-28 07:56:01 UTC
This most likely is a bug intended for the Gluster Community project, and not the Red Hat Gluster Storage product. I am moving this to the right location now. In case you are using the Red Hat Gluster Storage product, open a support case at https://access.redhat.com/support/cases/new and mention this bug report.

Please let us know if you started a discussion on the Gluster users mailing list as suggested in the GitHub issue. We would also need to know which version of Gluster you are using (you can change it in this bug).

Comment 3 Shyamsundar 2018-10-23 14:54:59 UTC
Release 3.12 has been EOLed and this bug was still found to be in the NEW state, hence the version is being moved to mainline so the bug can be triaged and appropriate action taken.

Comment 4 Amar Tumballi 2019-06-17 11:10:54 UTC
Thanks for the report, but we are not able to look into the RDMA code actively,
and are seriously considering dropping it from active support.

More on this @
https://lists.gluster.org/pipermail/gluster-devel/2018-July/054990.html


> ‘RDMA’ transport support:
> 
> Gluster started supporting RDMA while ib-verbs was still new, and very high-end infra at that time was using InfiniBand. Engineers worked
> with Mellanox and got the technology into GlusterFS for better data migration and data copy. Since current-day kernels achieve very good speed with
> the IPoIB module itself, and there is no more bandwidth for experts in this area to maintain the feature, we recommend migrating over to a TCP (IP
> based) network for your volume.
> 
> If you are successfully using the RDMA transport, do get in touch with us so we can prioritize the migration plan for your volume. The plan is to work on this
> after the release, so by version 6.0 we will have cleaner transport code, which only needs to support one type.
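
For anyone following the recommendation above, a minimal sketch of switching the volume transport from RDMA to TCP, assuming the volume can be taken offline briefly (the volume name is the one from this report):

# Stop the volume, change its transport to tcp, then start it again
gluster volume stop vol_3d582863ad2c090a625dd11d548af391
gluster volume set vol_3d582863ad2c090a625dd11d548af391 config.transport tcp
gluster volume start vol_3d582863ad2c090a625dd11d548af391

Clients would then need to remount the volume over TCP.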