Bug 1134839

Summary: [Tracker] RDMA support in glusterfs
Product: [Red Hat Storage] Red Hat Gluster Storage
Reporter: Mohammed Rafi KC <rkavunga>
Component: rdma
Assignee: Mohammed Rafi KC <rkavunga>
Status: CLOSED CURRENTRELEASE
QA Contact: storage-qa-internal <storage-qa-internal>
Severity: medium
Priority: unspecified
Version: rhgs-3.0
CC: aavati, anoopcs, bturner, jthottan, lmohanty, nlevinki, rcyriac, rtalur, rwheeler, vagarwal, vbhat
Hardware: x86_64
OS: Linux
Doc Type: Bug Fix
Clones: 1164079
Last Closed: 2015-01-22 09:54:59 UTC
Type: Bug
Bug Depends On: 831205, 839810, 852370, 1131502, 1154018, 1154617    
Bug Blocks: 1164079, 1166140, 1166515    

Description Mohammed Rafi KC 2014-08-28 10:19:52 UTC
Description of problem:

While running the sanity tests for the rdma component, the following tests were successful:

1. areequal
2. dbench
3. dd
4. fs_mark
5. fsx
6. gluster_build
7. locks
8. ltp
9. multiplefiles
10. openssl
11. posix compliance
12. postmark
13. readlarge
14. rpc
15. syscallbench
16. tiobench
17. compile-kernel
18. bonnie++
19. ffsb
20. fileop


The following test failed:

1. iozone


Version-Release number of selected component (if applicable):

rhs 3.0

How reproducible:


Steps to Reproduce:

1. Create a 4x2 distributed-replicate volume with transport type rdma.
2. Start the volume and mount it using a FUSE mount.
3. Run the test scripts, passing the mount path as the argument.
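
Steps 1 and 2 can be sketched as the commands below. This is only an illustration: the volume name, host names, brick paths, and mount point are hypothetical, and the script only builds and prints the command lines without running them.

```python
import shlex


def rdma_volume_commands(volname, bricks, replica, server, mount_point):
    """Build the command lines for steps 1 and 2 (nothing is executed)."""
    if len(bricks) % replica != 0:
        raise ValueError("brick count must be a multiple of the replica count")
    create = ["gluster", "volume", "create", volname,
              "replica", str(replica), "transport", "rdma", *bricks]
    start = ["gluster", "volume", "start", volname]
    mount = ["mount", "-t", "glusterfs", "-o", "transport=rdma",
             f"{server}:/{volname}", mount_point]
    return [shlex.join(cmd) for cmd in (create, start, mount)]


# Hypothetical hosts and brick paths: 8 bricks with replica 2 give a 4x2 layout.
bricks = [f"server{i}:/bricks/b{i}" for i in range(1, 9)]
for line in rdma_volume_commands("rdmavol", bricks, 2, "server1", "/mnt/rdmavol"):
    print(line)
```

The sanity scripts themselves are not shown in this report, so step 3 is left out.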

Actual results:

All tests are passing except iozone.

Expected results:

All tests should pass.

Additional info:

Error details about iozone.

iozone fails with an I/O error during fwrite; the error comes from the function wb_fd_err (fd_t *fd, xlator_t *this, int32_t *op_errno).

Comment 2 Mohammed Rafi KC 2014-09-04 05:36:44 UTC
The iozone test fails on a FUSE mount in the write-behind translator, probably because of an error in a previous write. With the write-behind translator turned off, iozone completes successfully.

RAFI KC
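
For reference, a minimal sketch of how the write-behind translator is usually toggled, assuming the standard volume option performance.write-behind. The volume name is hypothetical, and the helper only builds the command line; it does not run it.

```python
def write_behind_command(volname, enabled):
    """Return the gluster CLI line that turns write-behind on or off."""
    value = "on" if enabled else "off"
    return f"gluster volume set {volname} performance.write-behind {value}"


# Hypothetical volume name; prints the command used before re-running iozone.
print(write_behind_command("rdmavol", enabled=False))
```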

Comment 3 Jiffin 2014-09-04 06:19:04 UTC
Gluster NFS mount on rdma volume

Mounting over NFS is possible only from some hosts in the peer list.
From the others, the mount fails with the error:
mount.nfs: mounting <host_ip>:/<volname> failed, reason given by server: No such file or directory


How reproducible:
Mostly

Steps to Reproduce:
1. Create an rdma volume
2. Start the volume
3. Mount using nfs type
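
To see which peers are affected, one mount attempt per host can be generated as sketched below. The host names, volume name, and mount point are hypothetical. The vers=3 option is an assumption based on Gluster NFS serving NFSv3. The helper only builds the command lines.

```python
def nfs_mount_commands(hosts, volname, mount_point):
    """Build one NFSv3 mount command per host in the peer list."""
    return [
        f"mount -t nfs -o vers=3 {host}:/{volname} {mount_point}"
        for host in hosts
    ]


# Hypothetical peer list; try each line in turn and note which hosts fail.
for line in nfs_mount_commands(["server1", "server2"], "rdmavol", "/mnt/nfs"):
    print(line)
```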

Sanity tests on NFS mount

While running the sanity tests for the rdma component, the following tests were successful:

1. areequal
2. dbench
3. dd
4. fs_mark
5. fsx
6. gluster_build
7. locks
8. ltp
9. multiplefiles
10. openssl
11. posix compliance
12. postmark
13. readlarge
14. rpc
15. syscallbench
16. tiobench
17. compile-kernel
18. bonnie++
19. ffsb



The following test failed:

1. iozone

Steps to Reproduce:

1. Create a 4x2 distributed-replicate volume with transport type rdma.
2. Start the volume and mount it using an NFS mount.
3. Run the test scripts, passing the mount path as the argument.

Actual results:

All tests are passing except iozone.

Expected results:

Should pass all tests

Comment 4 Mohammed Rafi KC 2014-09-08 06:47:48 UTC
We ran the perf tests on a 2x2 distributed-replicated volume with the profile setting on and the write-behind translator turned off.

The averaged numbers (average of three rounds) for rdma are:

Testname                                  rdma-numbers
emptyfiles_create                           656.54
emptyfiles_delete                           632.69
smallfiles_create                           940.50
smallfiles_rewrite                          638.63
smallfiles_read                             331.56
smallfiles_reread                           257.87
smallfiles_delete                           673.56
largefile_create                            36.66
largefile_rewrite                           30.59
largefile_read                              2.61
largefile_reread                            0.42
largefile_delete                            0.48
directory_crawl_create                      531.35
directory_crawl                             26.89
directory_recrawl                           20.22
metadata_modify                             242.87
directory_crawl_delete                      403.72

Comment 5 Mohammed Rafi KC 2014-09-08 10:08:23 UTC
Comparison of the above perf tests between tcp and rdma on a 2x2 distributed-replicated volume, mounted using a FUSE mount.

The averaged numbers (average of three rounds) for tcp and rdma are:

Testname                    tcp-numbers              rdma-numbers
emptyfiles_create            1884.06                   656.54
emptyfiles_delete            1400.21                   632.69
smallfiles_create            2222.42                   940.50
smallfiles_rewrite           594.99                    638.63
smallfiles_read              370.57                    331.56
smallfiles_reread            294.23                    257.87
smallfiles_delete            1195.33                   673.56
largefile_create             29.13                     36.66
largefile_rewrite            35.97                     30.59
largefile_read               9.48                      2.61
largefile_reread             0.41                      0.42
largefile_delete             0.44                      0.48
directory_crawl_create       1808.93                   531.35
directory_crawl              31.04                     26.89
directory_recrawl            23.09                     20.22
metadata_modify              268.68                    242.87
directory_crawl_delete       637.32                    403.72
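
A small sketch for reading the table above: it computes the rdma/tcp time ratio per test, with the values copied from the table. It assumes the numbers are completion times, so a ratio below 1 means rdma finished faster.

```python
# (test name, tcp time, rdma time), copied from the table above
ROWS = [
    ("emptyfiles_create", 1884.06, 656.54),
    ("emptyfiles_delete", 1400.21, 632.69),
    ("smallfiles_create", 2222.42, 940.50),
    ("smallfiles_rewrite", 594.99, 638.63),
    ("smallfiles_read", 370.57, 331.56),
    ("smallfiles_reread", 294.23, 257.87),
    ("smallfiles_delete", 1195.33, 673.56),
    ("largefile_create", 29.13, 36.66),
    ("largefile_rewrite", 35.97, 30.59),
    ("largefile_read", 9.48, 2.61),
    ("largefile_reread", 0.41, 0.42),
    ("largefile_delete", 0.44, 0.48),
    ("directory_crawl_create", 1808.93, 531.35),
    ("directory_crawl", 31.04, 26.89),
    ("directory_recrawl", 23.09, 20.22),
    ("metadata_modify", 268.68, 242.87),
    ("directory_crawl_delete", 637.32, 403.72),
]


def rdma_to_tcp_ratio(rows):
    """Map each test name to rdma_time / tcp_time."""
    return {name: rdma / tcp for name, tcp, rdma in rows}


# Print the tests from the largest rdma advantage to the smallest.
for name, ratio in sorted(rdma_to_tcp_ratio(ROWS).items(), key=lambda kv: kv[1]):
    print(f"{name:<24} {ratio:.2f}")
```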

Comment 7 Mohammed Rafi KC 2014-09-08 11:01:51 UTC
Yes, the numbers are the time taken to complete each test case.
We are still running the performance tests; once they complete, we can draw conclusions about the performance of rdma.

Comment 8 M S Vishwanath Bhat 2014-09-09 09:28:19 UTC
Yes. The numbers are in seconds.

Comment 9 Mohammed Rafi KC 2014-09-11 12:01:53 UTC
Comparison of perf tests with transport type TCP only and RDMA only on a 2x2 distributed-replicated volume with the write-behind translator turned off, mounted using an NFS mount.



Testname                RDMA-Time                   TCP-Time
emptyfiles_create       558.85                      418.02
emptyfiles_delete       785.17                      528.03
smallfiles_create       5216.05                     5753.08
smallfiles_rewrite      3858.95                     4582.13
smallfiles_read         305.07                      210.73
smallfiles_reread       253.45                      185.91
smallfiles_delete       547.75                      511.17
largefile_create        34.16                       32.54
largefile_rewrite       28.29                       27.38
largefile_read          3.73                        4.83
largefile_reread        0.42                        0.39
largefile_delete        0.49                        0.48
directory_crawl_create  428.01                      418.57
directory_crawl         115.31                      93.91
directory_recrawl       54.86                       59.96
metadata_modify         160.51                      172.83
directory_crawl_delete  367.00                      407.75

Comment 10 Jiffin 2014-09-12 05:07:02 UTC
Iozone (large file performance) test:

Setup
-----
The test was performed in an environment with two servers and two clients.

Volume used: 2x2 with the write-behind translator off

Iozone ran with eight threads.


I.) Gluster fuse mount with rdma
================================

TESTS : (avg of two tests)
---------------------------

1) Sequential write, performed for 1GB and 8GB
i  - 1GB - 30.5 MB/s
ii - 8GB - 42.67 MB/s


2) Sequential read, performed for 8GB - 112.65 MB/s

3) Random read/write for 1GB files
i  - read  - 18.494 MB/s
ii - write - 4.5 MB/s

II.) Gluster fuse mount with tcp
================================

1.) Sequential Write for 8GB - 42.18 MB/s

2.) Sequential Read for 8GB - 101. MB/s

Comment 11 Jiffin 2014-09-15 05:31:46 UTC
Small file performance for fuse mount
======================================

Environment: 2 servers and 2 clients (2x2 volume)

Eight threads run on each client, and each thread handles 1000 files.


Operation              RDAM_PLACEHOLDER
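
The two columns in the table above are related: dividing the transfer rate by the throughput gives the data moved per file. The sketch below does that for the rows that report both values. It assumes 1 MB = 1024 KB; the resulting per-file size is an inference from the numbers, not something stated in this comment.

```python
# (operation, transport, files/sec, MB/sec), copied from the table above
ROWS = [
    ("create", "RDMA", 208.1206165, 13.00735),
    ("append", "RDMA", 619.333253, 38.708328),
    ("read", "RDMA", 2263.879154, 141.4924475),
    ("create", "TCP", 277.2856905, 17.3303555),
    ("append", "TCP", 448.1761315, 28.0110085),
    ("read", "TCP", 1254.1275705, 78.382973),
]


def implied_file_size_kb(files_per_sec, mb_per_sec):
    """Data moved per file in KB, assuming 1 MB = 1024 KB."""
    return mb_per_sec / files_per_sec * 1024


for op, transport, files_per_sec, mb_per_sec in ROWS:
    size = implied_file_size_kb(files_per_sec, mb_per_sec)
    print(f"{op:<8}{transport:<6}{size:.2f} KB per file")
```

Every row works out to roughly 64 KB per file, which suggests the two columns are consistent with each other.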

Comment 12 Jiffin 2014-09-16 05:55:10 UTC
Small file performance for nfs mount
======================================

Environment: 2 servers and 2 clients (2x2 volume is used)

Eight threads run on each client, and each thread handles 1000 files.

Result is the average of two tests.

Operation              RDMA                |              TCP
-------------------------------------------------------------------------------
           Throughput      Transfer rate   |  Throughput      Transfer rate
           (files/sec)     (MB/sec)        |  (files/sec)     (MB/sec)

create     37.9576065      2.3723505       |  42.7179325      2.669871
append     110.363493      6.897718        |  106.2460645     6.640379
read       2335.1621405    145.9476335     |  1251.13685      78.1960535
delete     852.314923      ----            |  778.2418585     ----
mkdir      81.677207       ----            |  110.36599       ----
rmdir      135.483443      ----            |  91.1340405      ----

Comment 13 Mohammed Rafi KC 2014-10-10 05:34:28 UTC
Regression test results for rdma

Failed tests:
1. open-behind
2. all tests with uss
3. a couple of tests in the bugs folder, listed in a file (Failed Tests In regression(BUGS ONLY)) shared in this Google Drive folder:
https://drive.google.com/a/redhat.com/folderview?id=0B_iUWhUNczSVZjVkd2p2Vk1kZlE&usp=sharing

Comment 14 Anoop C S 2014-11-12 06:59:52 UTC
Sanity testing results of RDMA support on RHSS 3.0.2 [RHEL 6.6] with a downstream glusterfs build including the following patches:

http://review.gluster.org/#/c/8479/
http://review.gluster.org/#/c/8850/
http://review.gluster.org/#/c/8762/
http://review.gluster.org/#/c/8925/
http://review.gluster.org/#/c/8934/
http://review.gluster.org/#/c/9003/
http://review.gluster.org/#/c/9005/
http://review.gluster.org/#/c/8498/
http://review.gluster.org/#/c/8975/
http://review.gluster.org/#/c/9011/

Note:- All tests were run on a 2x2 RDMA only volume with default options.

The following tests were successful:

1.  areequal
2.  dbench
3.  dd
4.  fs_mark
5.  fsx
6.  gluster_build
7.  locks
8.  ltp
9.  multiplefiles
10. openssl
11. posix compliance
12. postmark
13. readlarge
14. rpc
15. syscallbench
16. tiobench
17. compile-kernel
18. bonnie++
19. ffsb

The following test failed:

1. iozone

Note:- iozone ran successfully with the write-behind translator off.

Comment 15 Anoop C S 2014-11-12 11:17:55 UTC
Comparison of perf-test results for RDMA support on RHSS 3.0.2 [RHEL 6.6], using a downstream glusterfs build with the patches listed in comment 14, against its TCP counterparts.

Note:- All tests were run on a fuse mount of 2x2 volume.

FUSE                    RDMA        pure tcp   IPoIB tcp
                        ---------------------------------
Test                    Time (sec)  Time (sec) Time (sec)
emptyfiles_create       312.62      326.37     279.62
emptyfiles_delete       583.65      578.22     590.02
smallfiles_create       924.18      796.08     854.9
smallfiles_rewrite      583.46      540.96     461.96
smallfiles_read         301.16      322.8      272.6
smallfiles_reread       232.5       250.68     218.32
smallfiles_delete       589.58      650.2      658.5
largefile_create        24.41       27.81      19.36
largefile_rewrite       31.67       39.22      31.53
largefile_read          2.42        9.31       2.79
largefile_reread        0.4         0.41       0.41
largefile_delete        0.51        0.5        0.5
directory_crawl_create  427.86      453.12     414.75
directory_crawl         36.46       28.56      36.88
directory_recrawl       20.48       19.89      19.43
metadata_modify         230.76      243.75     202.41
directory_crawl_delete  367.91      345.53     339.88
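
One way to summarize a three-way table like the one above is to pick the fastest transport per test and count the wins. The sketch below does this for a few rows copied from the table; the choice of rows is arbitrary, and ties go to the first transport listed.

```python
from collections import Counter

TRANSPORTS = ("RDMA", "pure tcp", "IPoIB tcp")

# (test, RDMA sec, pure tcp sec, IPoIB tcp sec), a few rows from the table above
ROWS = [
    ("emptyfiles_create", 312.62, 326.37, 279.62),
    ("smallfiles_create", 924.18, 796.08, 854.9),
    ("largefile_read", 2.42, 9.31, 2.79),
    ("directory_crawl", 36.46, 28.56, 36.88),
]


def fastest(times):
    """Return the transport with the lowest time; ties go to the first listed."""
    return min(zip(TRANSPORTS, times), key=lambda pair: pair[1])[0]


def tally(rows):
    """Count how many tests each transport wins."""
    return Counter(fastest(row[1:]) for row in rows)


for name, *times in ROWS:
    print(f"{name:<20} fastest: {fastest(times)}")
print(dict(tally(ROWS)))
```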

Comment 17 Anoop C S 2014-11-14 08:38:57 UTC
Comparison of perf-test results for RDMA support on RHSS 3.0.2 [RHEL 6.6], using a downstream glusterfs build with the patches listed in comment 14, against its TCP counterparts.

Note:- All tests were run on a nfs mount of 2x2 volume.

NFS                       RDMA    |  pure tcp  |   IPoIB
                        ----------------------------------
Test                    Time (sec)| Time (sec) | Time (sec)
emptyfiles_create       223.48       1806.53      241.24
emptyfiles_delete       448.27       601.53       454.98
smallfiles_create       5068.87      5840.57      5048.36
smallfiles_rewrite      4037.22      4081.6       4229.32
smallfiles_read         235.71       248.07       214.21
smallfiles_reread       182.51       203.54       188.27
smallfiles_delete       474.05       641.03       477.99
largefile_create        25.5         29.8         33.03
largefile_rewrite       34.03        28.06        26.1
largefile_read          2.72         9.15         2.39
largefile_reread        0.41         0.41         0.41
largefile_delete        0.48         0.49         0.48
directory_crawl_create  415.41       695.46       418
directory_crawl         98.55        64.78        75.02
directory_recrawl       114.59       80.61        95.05
metadata_modify         144.55       200.15       157.11
directory_crawl_delete  388.39       442.42       378.99

Comment 18 Vivek Agarwal 2015-01-22 09:54:59 UTC
RDMA support reached GA in 3.0.3, hence closing the tracker bug.