Bug 1487495

Summary: client-io-threads option not working for replicated volumes
Product: Red Hat Gluster Storage Reporter: Manoj Pillai <mpillai>
Component: replicate Assignee: Ravishankar N <ravishankar>
Status: CLOSED ERRATA QA Contact: Vijay Avuthu <vavuthu>
Severity: high Docs Contact:
Priority: high    
Version: rhgs-3.3 CC: amukherj, ravishankar, rcyriac, rhinduja, rhs-bugs, sheggodu, storage-qa-internal
Target Milestone: --- Keywords: Performance
Target Release: RHGS 3.4.0   
Hardware: x86_64   
OS: Linux   
Whiteboard: rebase
Fixed In Version: glusterfs-3.12.2-1 Doc Type: If docs needed, set a value
Doc Text:
Previously, a bug prevented client-io-threads from being loaded in a replicate volume, even though `gluster volume set <volname> client-io-threads on` returned success. With this fix, the option takes effect as expected.
Story Points: ---
Clone Of:
: 1498570 1598416 Environment:
Last Closed: 2018-09-04 06:35:11 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Bug Depends On:    
Bug Blocks: 1498570, 1499158, 1503134    

Description Manoj Pillai 2017-09-01 05:44:34 UTC
Description of problem:
The option "performance.client-io-threads on" is needed in some scenarios where the fuse thread becomes a bottleneck. However, this option appears to be disabled for replicated volumes: setting it to on via the "gluster v set ..." command has no effect.

It is fine to set the option to off by default for replicated volumes. But we need the option of turning it on in scenarios where we see that the fuse thread is bottlenecked.

Version-Release number of selected component (if applicable):



How reproducible:

Additional info:
gluster v info perfvol

Volume Name: perfvol
Type: Replicate
Volume ID: 7773ecd2-6a6b-4220-bf79-42bd279d6476
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x 3 = 3
Transport-type: tcp
Brick1: smerf02-10ge:/mnt/rhs_brick1
Brick2: smerf03-10ge:/mnt/rhs_brick1
Brick3: smerf04-10ge:/mnt/rhs_brick1
Options Reconfigured:
performance.strict-o-direct: on
network.remote-dio: disable
performance.io-cache: off
performance.write-behind: off
performance.client-io-threads: on
performance.readdir-ahead: off
performance.read-ahead: off
transport.address-family: inet
nfs.disable: on
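
The reconfigured options can be pulled out of the `gluster v info` text with a few lines of Python, which helps confirm that the CLI reports the option as on even when it has no effect on the client. This is a minimal sketch that assumes the output layout shown above; `parse_reconfigured_options` is a hypothetical helper, not part of any gluster tooling.

```python
def parse_reconfigured_options(info_text):
    """Return the key/value pairs listed under 'Options Reconfigured:'."""
    options = {}
    in_options = False
    for line in info_text.splitlines():
        line = line.strip()
        if line == "Options Reconfigured:":
            in_options = True
            continue
        if in_options:
            if not line:
                break  # a blank line ends the options block
            key, sep, value = line.partition(":")
            if sep:
                options[key.strip()] = value.strip()
    return options


# Excerpt of the `gluster v info perfvol` output from this report.
SAMPLE_INFO = """\
Volume Name: perfvol
Type: Replicate
Options Reconfigured:
performance.strict-o-direct: on
performance.client-io-threads: on
nfs.disable: on
"""

opts = parse_reconfigured_options(SAMPLE_INFO)
print(opts.get("performance.client-io-threads"))
```

Because the reported value alone does not show whether the option took effect, it is best paired with an inspection of the client volfile.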

Output of top command (with -H option) during a random read test on above volume:
 3737 root      20   0  607540  24080   3792 R 99.2  0.0   7:31.29 glusterfs
 3734 root      20   0  607540  24080   3792 S 42.2  0.0   4:04.87 glusterfs
 3733 root      20   0  607540  24080   3792 S 42.0  0.0   4:06.15 glusterfs
 4166 root      20   0  504132   1016    368 D  1.7  0.0   0:00.28 fio

# pstack 3737 [the thread above with 99+% cpu usage]
Thread 1 (process 3737):
#0  0x00007f02d1597190 in readv () from /lib64/libc.so.6
#1  0x00007f02ca26e75a in fuse_thread_proc () from /usr/lib64/glusterfs/3.8.4/xlator/mount/fuse.so
#2  0x00007f02d1cd3e25 in start_thread () from /lib64/libpthread.so.0
#3  0x00007f02d15a034d in clone () from /lib64/libc.so.6
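
The busiest thread can also be picked out of the `top -H` lines mechanically. The Python sketch below assumes top's default column order (PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND); `hottest_thread` is a hypothetical helper introduced only for illustration.

```python
def hottest_thread(top_lines):
    """Return (tid, cpu_percent, command) for the busiest thread, or None."""
    rows = []
    for line in top_lines:
        fields = line.split()
        if len(fields) < 12:
            continue  # skip blank or truncated lines
        try:
            # Assumed columns: PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
            rows.append((int(fields[0]), float(fields[8]), fields[11]))
        except ValueError:
            continue  # skip the header row or any non-numeric line
    return max(rows, key=lambda row: row[1], default=None)


# The four thread lines from the top output in this report.
SAMPLE_TOP = [
    " 3737 root      20   0  607540  24080   3792 R 99.2  0.0   7:31.29 glusterfs",
    " 3734 root      20   0  607540  24080   3792 S 42.2  0.0   4:04.87 glusterfs",
    " 3733 root      20   0  607540  24080   3792 S 42.0  0.0   4:06.15 glusterfs",
    " 4166 root      20   0  504132   1016    368 D  1.7  0.0   0:00.28 fio",
]

print(hottest_thread(SAMPLE_TOP))
```

The thread ID it returns is the one to pass to pstack, as was done above for thread 3737.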

Comment 6 Manoj Pillai 2017-09-03 19:21:19 UTC
Some data on the potential performance benefit from turning client-io-threads on ...

fio random io test, with 16 threads, direct=1, block size=8.
1x3 volume with bricks on nvme. 10GbE network. fuse-mounted on client.

no client-io-threads:
read : io=8192.0MB, bw=113406K/s, iops=14175
write: io=8192.0MB, bw=65170K/s, iops=8146
[The fuse thread is at 99+% CPU utilization in the above tests.]

client-io-threads turned on by manually editing volfiles [Thanks to Krutika for showing how to do that]:
read : io=8192.0MB, bw=211604K/s, iops=26450
write: io=8192.0MB, bw=152025K/s, iops=19003

Read iops improves by 1.86x
Write iops improves by 2.33x
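
The improvement factors can be rechecked from the iops figures quoted above. A minimal Python sketch; the variable and function names are illustrative only. Rounded to two decimals, the read ratio comes out at about 1.87x (the 1.86x above appears to be truncated rather than rounded), and the write ratio at 2.33x.

```python
# iops figures from the fio runs quoted in this comment.
baseline_iops = {"read": 14175, "write": 8146}   # no client-io-threads
with_iot_iops = {"read": 26450, "write": 19003}  # client-io-threads on


def speedup(before, after):
    """Ratio of the new throughput to the baseline throughput."""
    return after / before


for op in ("read", "write"):
    print(f"{op}: {speedup(baseline_iops[op], with_iot_iops[op]):.2f}x")
```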

I understand that in perf QE testing we have seen performance degradations from turning client-io-threads on. No disputing that, but turning it on UNDER THE RIGHT CIRCUMSTANCES, i.e. when you are hitting a fuse thread bottleneck, can give huge benefits. We should be able to turn it on via the CLI in these scenarios. But based on perf QE results, we can keep it off by default.

Comment 7 Ravishankar N 2017-10-04 16:26:07 UTC
Upstream patch: https://review.gluster.org/18430

Comment 8 Manoj Pillai 2017-10-09 05:39:04 UTC
Ravi, my understanding is that the number of client-io-threads is controlled by the option performance.io-thread-count? And that this option also controls the number of io-threads at the brick. 

Can you confirm the above and that currently there is no way to control the number of client-io-threads independently of the number of io-threads at the brick?

Comment 9 Ravishankar N 2017-10-09 06:18:13 UTC
That is correct.  https://github.com/gluster/glusterfs/commit/09232fd6855f288c47b5396dcd4d4245a154576f seems to have introduced support for loading io-threads xlator on the client side but has not provided any specific way to modify the tunables exclusively for the brick or client side. So yes, changing any xlator option in iot xlator gets reflected in both the brick and client volfiles.

Comment 10 Manoj Pillai 2017-10-09 08:16:52 UTC
Thanks! I opened an upstream bz for this along with some data in support of the request: https://bugzilla.redhat.com/show_bug.cgi?id=1499649.

Comment 14 Vijay Avuthu 2018-04-30 06:45:07 UTC

Build Used: glusterfs-3.12.2-8.el7rhgs.x86_64

The following scenarios were verified:

Scenario 1:

1) create 1 * 3 replicate volume and start it
2) check for performance.client-io-threads and it should be off
3) check in trusted-<volname>.tcp-fuse.vol file whether io-threads are loaded or not ( it should not be loaded )
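
The volfile check in the last step can be scripted. The Python sketch below assumes the usual volfile layout, in which each translator block carries a `type` line and the io-threads translator has type `performance/io-threads`. The two sample volfiles and the helper `loads_io_threads` are hypothetical and heavily abbreviated; a real client volfile contains many more translators.

```python
def loads_io_threads(volfile_text):
    """True if any translator in the volfile has type performance/io-threads."""
    for line in volfile_text.splitlines():
        fields = line.split()
        if fields[:2] == ["type", "performance/io-threads"]:
            return True
    return False


# Hypothetical, abbreviated client volfiles used only for illustration.
VOLFILE_WITHOUT_IOT = """\
volume testvol-replicate-0
    type cluster/replicate
    subvolumes testvol-client-0 testvol-client-1 testvol-client-2
end-volume
"""

VOLFILE_WITH_IOT = VOLFILE_WITHOUT_IOT + """\
volume testvol-io-threads
    type performance/io-threads
    subvolumes testvol-replicate-0
end-volume
"""

print(loads_io_threads(VOLFILE_WITHOUT_IOT))
print(loads_io_threads(VOLFILE_WITH_IOT))
```

In practice the text would be read from the trusted-<volname>.tcp-fuse.vol file on a server node instead of from an inline string.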

Scenario 2:

1) create distribute volume ( 1 * 1 ) and start
2) check for performance.client-io-threads and it should be on
3) io-threads should be loaded in trusted-<volname>.tcp-fuse.vol
4) add bricks to convert to replicate volume ( 1 * 2 )
5) check for performance.client-io-threads and it should be off
6) io-threads shouldn't be loaded in trusted-<volname>.tcp-fuse.vol
7) remove bricks so that volume type will be converted to distribute
8) check for performance.client-io-threads and it should be on again
9) io-threads should be loaded in trusted-<volname>.tcp-fuse.vol

Scenario 3:

1) create 1 * 3 replicate volume and start it
2) set performance.client-io-threads to on
3) check for performance.client-io-threads and it should be on
4) io-threads should be loaded in trusted-<volname>.tcp-fuse.vol
5) add bricks so that volume converts to 2 * 3 
6) check for performance.client-io-threads and it should be on
7) io-threads should be loaded in trusted-<volname>.tcp-fuse.vol
8) remove bricks so that 2 * 3 converts to 1 * 3
9) check for performance.client-io-threads and it should be on
10) io-threads should be loaded in trusted-<volname>.tcp-fuse.vol
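
The behaviour verified in the three scenarios can be summarised as a small expectation model. This Python sketch reflects only what the scenarios above state (off by default for replicate, on by default for distribute, and an explicit setting surviving add-brick and remove-brick). It is not glusterd's actual logic, and `expected_client_io_threads` is a hypothetical name.

```python
def expected_client_io_threads(volume_type, explicit_setting=None):
    """Expected performance.client-io-threads value per the verified scenarios."""
    if explicit_setting is not None:
        return explicit_setting  # Scenario 3: an explicit value is preserved
    defaults = {"replicate": "off", "distribute": "on"}  # Scenarios 1 and 2
    if volume_type not in defaults:
        raise ValueError(f"volume type not covered by the scenarios: {volume_type}")
    return defaults[volume_type]


# Scenario 2: distribute -> replicate -> distribute, option never set by hand.
for vol_type in ("distribute", "replicate", "distribute"):
    print(vol_type, expected_client_io_threads(vol_type))

# Scenario 3: option explicitly set to on, then bricks added and removed.
print(expected_client_io_threads("replicate", explicit_setting="on"))
```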

Changing status to Verified.

Comment 17 errata-xmlrpc 2018-09-04 06:35:11 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.