Bug 839810
| Field | Value |
| --- | --- |
| Summary | RDMA high cpu usage and poor performance |
| Product | [Community] GlusterFS |
| Reporter | Bryan Whitehead <bwhitehead> |
| Component | ib-verbs |
| Assignee | Raghavendra G <rgowdapp> |
| Status | CLOSED DEFERRED |
| QA Contact | |
| Severity | unspecified |
| Docs Contact | |
| Priority | medium |
| Version | 3.3.0 |
| CC | andrei, anoopcs, bugs, bwhitehead, gluster-bugs, jdarcy, jthottan, pjameson, rkavunga, rwheeler |
| Target Milestone | --- |
| Keywords | Triaged |
| Target Release | --- |
| Hardware | x86_64 |
| OS | Linux |
| Whiteboard | |
| Fixed In Version | |
| Doc Type | Bug Fix |
| Doc Text | |
| Story Points | --- |
| Clone Of | |
| | 852370 (view as bug list) |
| Environment | |
| Last Closed | 2014-12-14 19:40:28 UTC |
| Type | Bug |
| Regression | --- |
| Mount Type | --- |
| Documentation | --- |
| CRM | |
| Verified Versions | |
| Category | --- |
| oVirt Team | --- |
| RHEL 7.3 requirements from Atomic Host | |
| Cloudforms Team | --- |
| Target Upstream Version | |
| Embargoed | |
| Bug Depends On | |
| Bug Blocks | 852370, 858479, 952693, 962431, 1134839, 1164079, 1166140, 1166515 |
| Attachments | output of gluster volume profile testrdma info (attachment 597910) |
I have a very similar setup, but I do not experience the performance issues you've described. Same IB cards: Mellanox Technologies MT26428 [ConnectX VPI PCIe 2.0 5GT/s - IB QDR / 10GigE] (rev a0), installed in an Ubuntu 12.04 file server. I am not using the drivers provided by Mellanox; instead I am using the Ubuntu infiniband PPA and the original kernel modules. I've run some tests, and each connected client can easily do about 700-800 MB/s with the iozone benchmark and around 1 GB/s with 4 concurrent dd threads from /dev/zero. The server load stays reasonably low, although I don't remember the exact figures.

I was just wondering if any progress had been made on this in the past few months. We were testing with ConnectX-3 cards (Mellanox Technologies MT27500 Family [ConnectX-3]) and were also getting much poorer performance with RDMA. We were mostly testing with dd on a VM that uses libgfapi (since the NFS server and FUSE clients seemed to be bottlenecks themselves), and got the following speeds on a replicated volume:

Local:
[root@node2 glusterfs]# time dd if=/dev/zero of=/mnt/raid/local_test.img bs=1M count=200000 oflag=direct conv=fdatasync
200000+0 records in
200000+0 records out
209715200000 bytes (210 GB) copied, 377.337 s, 556 MB/s

TCP/IP, 10Gbit Ethernet:
livecd ~ # time dd if=/dev/zero of=/dev/vdb bs=1M count=200000 oflag=direct conv=fdatasync
200000+0 records in
200000+0 records out
209715200000 bytes (210 GB) copied, 405.462 s, 517 MB/s

InfiniBand (RDMA volume type):
livecd ~ # time dd if=/dev/zero of=/dev/vdb bs=1M count=200000 oflag=direct conv=fdatasync
200000+0 records in
200000+0 records out
209715200000 bytes (210 GB) copied, 664.39 s, 316 MB/s

IPoIB over InfiniBand (TCP volume type):
livecd ~ # dd if=/dev/zero of=/dev/vdb bs=1M count=200000 oflag=direct conv=fdatasync
200000+0 records in
200000+0 records out
209715200000 bytes (210 GB) copied, 408.181 s, 514 MB/s

We also tried SRP to a RAM disk to make sure the transport itself wasn't the problem (using SRPT/SCST on the server and the SRP module on the client):

[root@node1 ~]# dd if=/dev/zero of=/dev/sde bs=1M count=13836 oflag=direct conv=fdatasync
13836+0 records in
13836+0 records out
14508097536 bytes (15 GB) copied, 8.91368 s, 1.6 GB/s

The version that this bug has been reported against no longer receives updates from the Gluster Community. Please verify whether this report is still valid against a current (3.4, 3.5 or 3.6) release and update the version, or close this bug. If there has been no update before 9 December 2014, this bug will be closed automatically.
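For anyone trying to reproduce the RDMA vs. IPoIB comparison above, here is a minimal sketch of how the two volume variants might be created. The hostnames (node1, node2) and brick paths are assumptions, not taken from the commenter's setup; only the transport keyword differs between the two volumes.

```
# Hypothetical hosts and brick paths; adjust to the environment.

# Replicated volume carried over native RDMA (verbs):
gluster volume create vol-rdma replica 2 transport rdma \
    node1:/mnt/raid/brick-rdma node2:/mnt/raid/brick-rdma
gluster volume start vol-rdma

# Same layout carried over TCP; with the peers reachable via their IPoIB
# addresses this corresponds to the "IPoIB (TCP volume type)" case above.
gluster volume create vol-tcp replica 2 transport tcp \
    node1:/mnt/raid/brick-tcp node2:/mnt/raid/brick-tcp
gluster volume start vol-tcp
```

In the dd runs above, oflag=direct bypasses the client page cache and conv=fdatasync forces the data to be flushed before dd reports a result, so the numbers reflect transport and brick throughput rather than caching.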
Created attachment 597910 [details]
output of gluster volume profile testrdma info

Description of problem:
Create a volume using the rdma transport. I'm using Mellanox Technologies MT26428 IB QDR cards. Using native verbs I can barely get 1/3 of the speed of the underlying disks. If I use IPoIB, I get the full speed of the underlying disks (and very little CPU usage from glusterfsd).

Version-Release number of selected component (if applicable):
3.3.0 (the same problem also occurs on 3.2.5)

How reproducible:
Always

Steps to Reproduce:
1. Set up an rdma-only volume (see the command sketch below).
2. Mount it to a directory (FUSE).
3. dd if=/dev/zero bs=1M of=/path/to/mount/test.out

Actual results:
Poor performance and high CPU usage.

Expected results:
Extremely fast performance with nominal CPU usage.

Additional info:
I will attach the IPoIB performance output once I have reconfigured IB. (I turned it off to make sure IPoIB was not affecting the native verbs results.)
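For reference, below is a minimal sketch of the reproduction steps and of how the attached "gluster volume profile testrdma info" output can be gathered. The hostnames (server1, server2), brick paths, mount point, and dd count are assumptions, not taken from the reporter's setup; the transport=rdma mount option is used here to ask the FUSE client for the RDMA transport.

```
# Hypothetical hosts, brick paths, and mount point; adjust to the environment.
gluster volume create testrdma transport rdma \
    server1:/bricks/testrdma server2:/bricks/testrdma
gluster volume start testrdma

# Mount through the FUSE client, requesting the rdma transport.
mkdir -p /mnt/testrdma
mount -t glusterfs -o transport=rdma server1:/testrdma /mnt/testrdma

# Sequential write that shows the low throughput and high glusterfsd CPU usage
# (a count is added here so the test is bounded; the original command has none).
dd if=/dev/zero bs=1M count=10000 of=/mnt/testrdma/test.out

# Gather the per-brick statistics that were attached to this bug.
gluster volume profile testrdma start
gluster volume profile testrdma info
```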