1487495 – client-io-threads option not working for replicated volumes

Summary: client-io-threads option not working for replicated volumes

Keywords:
Status:	CLOSED ERRATA
Alias:	None
Product:	Red Hat Gluster Storage
Classification:	Red Hat Storage
Component:	replicate
Sub Component:
Version:	rhgs-3.3
Hardware:	x86_64
OS:	Linux
Priority:	high
Severity:	high
Target Milestone:	---
Target Release:	RHGS 3.4.0
Assignee:	Ravishankar N
QA Contact:	Vijay Avuthu
Docs Contact:
URL:
Whiteboard:	rebase
Depends On:
Blocks:	1498570 1499158 1503134

Reported:	2017-09-01 05:44 UTC by Manoj Pillai
Modified:	2018-09-04 06:37 UTC
CC List:	7 users
Fixed In Version:	glusterfs-3.12.2-1
Doc Type:	If docs needed, set a value
Doc Text:	Previously, there was a bug in the code due to which client-io-threads was not loaded in a replicate volume, even though `gluster volume set <volname> client-io-threads on` returned success. With this fix, it works correctly.
Clone Of:
Clones:	1498570 1598416
Environment:
Last Closed:	2018-09-04 06:35:11 UTC
Embargoed:
Dependent Products:

Links
System	ID	Private	Priority	Status	Summary	Last Updated
Red Hat Product Errata	RHSA-2018:2607	0	None	None	None	2018-09-04 06:37:32 UTC

Description Manoj Pillai 2017-09-01 05:44:34 UTC

Description of problem:
The option "performance.client-io-threads on" is needed in some scenarios where the fuse thread becomes a bottleneck. But this option appears to be disabled for replicated volumes, i.e. turning it on via the "gluster v set ..." command has no effect.

It is fine to set the option to off by default for replicated volumes. But we need the option of turning it on in scenarios where we see that the fuse thread is bottlenecked.

Version-Release number of selected component (if applicable):

glusterfs-libs-3.8.4-43.el7rhgs.x86_64
glusterfs-3.8.4-43.el7rhgs.x86_64
glusterfs-client-xlators-3.8.4-43.el7rhgs.x86_64
glusterfs-fuse-3.8.4-43.el7rhgs.x86_64

kernel-3.10.0-693.el7.x86_64


How reproducible:
always


Additional info:
<quote>
gluster v info perfvol

Volume Name: perfvol
Type: Replicate
Volume ID: 7773ecd2-6a6b-4220-bf79-42bd279d6476
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x 3 = 3
Transport-type: tcp
Bricks:
Brick1: smerf02-10ge:/mnt/rhs_brick1
Brick2: smerf03-10ge:/mnt/rhs_brick1
Brick3: smerf04-10ge:/mnt/rhs_brick1
Options Reconfigured:
performance.strict-o-direct: on
network.remote-dio: disable
performance.io-cache: off
performance.write-behind: off
performance.client-io-threads: on
performance.readdir-ahead: off
performance.read-ahead: off
transport.address-family: inet
nfs.disable: on
</quote>

Output of top command (with -H option) during a random read test on above volume:
<quote>
  PID USER      PR  NI    VIRT    RES    SHR S %CPU %MEM     TIME+ COMMAND
 3737 root      20   0  607540  24080   3792 R 99.2  0.0   7:31.29 glusterfs
 3734 root      20   0  607540  24080   3792 S 42.2  0.0   4:04.87 glusterfs
 3733 root      20   0  607540  24080   3792 S 42.0  0.0   4:06.15 glusterfs
 4166 root      20   0  504132   1016    368 D  1.7  0.0   0:00.28 fio
</quote>
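
A minimal sketch of how the saturated thread can be picked out of `top -H` output like the sample above. The function name `busiest_threads` and the 90% threshold are illustrative assumptions, and the parser assumes the default 12-column top layout shown in this report.

```python
def busiest_threads(top_output, threshold=90.0):
    """Return (pid, cpu_percent, command) for rows at or above threshold.

    Assumes the default top column order:
    PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
    """
    hot = []
    for line in top_output.splitlines():
        fields = line.split()
        # Skip the header and any line that is not a full data row.
        if len(fields) < 12 or not fields[0].isdigit():
            continue
        cpu = float(fields[8])  # %CPU is the ninth column
        if cpu >= threshold:
            hot.append((int(fields[0]), cpu, fields[11]))
    return hot


# Sample rows copied from the top output quoted in this report.
SAMPLE = """\
  PID USER      PR  NI    VIRT    RES    SHR S %CPU %MEM     TIME+ COMMAND
 3737 root      20   0  607540  24080   3792 R 99.2  0.0   7:31.29 glusterfs
 3734 root      20   0  607540  24080   3792 S 42.2  0.0   4:04.87 glusterfs
 3733 root      20   0  607540  24080   3792 S 42.0  0.0   4:06.15 glusterfs
 4166 root      20   0  504132   1016    368 D  1.7  0.0   0:00.28 fio
"""

for pid, cpu, command in busiest_threads(SAMPLE):
    print(pid, cpu, command)
```

With `top -H`, the PID column holds the thread ID, so the value returned here is what gets passed to pstack.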

# pstack 3737 [the thread above with 99+% cpu usage]
<quote>
Thread 1 (process 3737):
#0  0x00007f02d1597190 in readv () from /lib64/libc.so.6
#1  0x00007f02ca26e75a in fuse_thread_proc () from /usr/lib64/glusterfs/3.8.4/xlator/mount/fuse.so
#2  0x00007f02d1cd3e25 in start_thread () from /lib64/libpthread.so.0
#3  0x00007f02d15a034d in clone () from /lib64/libc.so.6
</quote>

Comment 6 Manoj Pillai 2017-09-03 19:21:19 UTC

Some data on the potential performance benefit from turning client-io-threads on ...

fio random io test, with 16 threads, direct=1, block size=8.
1x3 volume with bricks on nvme. 10GbE network. fuse-mounted on client.

no client-io-threads:
read : io=8192.0MB, bw=113406K/s, iops=14175
write: io=8192.0MB, bw=65170K/s, iops=8146
[The fuse thread is at 99+% CPU utilization in the above tests.]

client-io-threads turned on by manually editing volfiles [Thanks to Krutika for showing how to do that]:
read : io=8192.0MB, bw=211604K/s, iops=26450
write: io=8192.0MB, bw=152025K/s, iops=19003

Read iops improves by 1.86x
Write iops improves by 2.33x
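
For reference, the improvement factors can be recomputed from the iops figures reported in this comment. This is only a small arithmetic sketch; the dictionary and function names are illustrative. The read ratio comes out at roughly 1.866, which is reported above truncated to 1.86x.

```python
# iops figures copied from the fio results in this comment.
baseline = {"read": 14175, "write": 8146}
with_client_io_threads = {"read": 26450, "write": 19003}


def speedup(before, after):
    """Ratio of the new iops figure to the baseline iops figure."""
    return after / before


for op in ("read", "write"):
    ratio = speedup(baseline[op], with_client_io_threads[op])
    print(f"{op} iops ratio: {ratio:.3f}")
```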

I understand that in perf QE testing we have seen performance degradations from turning client-io-threads on. No disputing that, but turning it on UNDER THE RIGHT CIRCUMSTANCES, i.e. when you are hitting a fuse thread bottleneck, can give huge benefits. We should be able to turn it on via the CLI in these scenarios. But based on perf QE results, we can keep it off by default.

Comment 7 Ravishankar N 2017-10-04 16:26:07 UTC

Upstream patch: https://review.gluster.org/18430

Comment 8 Manoj Pillai 2017-10-09 05:39:04 UTC

Ravi, my understanding is that the number of client-io-threads is controlled by the option performance.io-thread-count? And that this option also controls the number of io-threads at the brick. 

Can you confirm the above, and that there is currently no way to control the number of client-io-threads independently of the number of io-threads at the brick?

Comment 9 Ravishankar N 2017-10-09 06:18:13 UTC

That is correct.  https://github.com/gluster/glusterfs/commit/09232fd6855f288c47b5396dcd4d4245a154576f seems to have introduced support for loading io-threads xlator on the client side but has not provided any specific way to modify the tunables exclusively for the brick or client side. So yes, changing any xlator option in iot xlator gets reflected in both the brick and client volfiles.

Comment 10 Manoj Pillai 2017-10-09 08:16:52 UTC

Thanks! I opened an upstream bz for this along with some data in support of the request: https://bugzilla.redhat.com/show_bug.cgi?id=1499649.

Comment 14 Vijay Avuthu 2018-04-30 06:45:07 UTC

Update:
=========

Build Used: glusterfs-3.12.2-8.el7rhgs.x86_64

The following scenarios were verified:

Scenario 1:

1) create 1 * 3 replicate volume and start it
2) check for performance.client-io-threads and it should be off
3) check in the trusted-<volname>.tcp-fuse.vol file whether io-threads is loaded (it should not be loaded)


Scenario 2:

1) create distribute volume ( 1 * 1 ) and start
2) check for performance.client-io-threads and it should be on
3) io-threads should be loaded in trusted-<volname>.tcp-fuse.vol
4) add bricks to convert to replicate volume ( 1 * 2 )
5) check for performance.client-io-threads and it should be off
6) io-threads shouldn't be loaded in trusted-<volname>.tcp-fuse.vol
7) remove bricks so that volume type will be converted to distribute
8) check for performance.client-io-threads and it should be on again
9) io-threads should be loaded in trusted-<volname>.tcp-fuse.vol

Scenario 3:

1) create 1 * 3 replicate volume and start it
2) set performance.client-io-threads to on
3) check for performance.client-io-threads and it should be on
4) io-threads should be loaded in trusted-<volname>.tcp-fuse.vol
5) add bricks so that volume converts to 2 * 3 
6) check for performance.client-io-threads and it should be on
7) io-threads should be loaded in trusted-<volname>.tcp-fuse.vol
8) remove bricks so that 2 * 3 converts to 1 * 3
9) check for performance.client-io-threads and it should be on
10) io-threads should be loaded in trusted-<volname>.tcp-fuse.vol
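
The "io-threads should be loaded" checks in the scenarios above could be automated along these lines. This is a hedged sketch, not the procedure QE actually used: it assumes a loaded io-threads xlator appears in the client volfile as a `type performance/io-threads` line, and the helper name and the two sample fragments are made up for illustration.

```python
import re

# Matches a volfile line such as "    type performance/io-threads".
IO_THREADS_TYPE = re.compile(r"^\s*type\s+performance/io-threads\s*$", re.MULTILINE)


def io_threads_loaded(volfile_text):
    """Return True if any xlator in the volfile text is performance/io-threads."""
    return IO_THREADS_TYPE.search(volfile_text) is not None


# Illustrative fragments only, not copied from a real volume.
WITHOUT_IOT = """\
volume perfvol-replicate-0
    type cluster/replicate
    subvolumes perfvol-client-0 perfvol-client-1 perfvol-client-2
end-volume
"""

WITH_IOT = WITHOUT_IOT + """\

volume perfvol-io-threads
    type performance/io-threads
    subvolumes perfvol-replicate-0
end-volume
"""

print(io_threads_loaded(WITHOUT_IOT))
print(io_threads_loaded(WITH_IOT))
```

To run it against a real volume, read the contents of trusted-<volname>.tcp-fuse.vol into a string and pass that to `io_threads_loaded`; where that file lives depends on the glusterd working directory, which this report does not state.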


Changing status to Verified.

Comment 17 errata-xmlrpc 2018-09-04 06:35:11 UTC

Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2018:2607
