1510724 – [RFE] Enable parallel-readdir by default for all gluster volumes

Keywords:
Status:	CLOSED WONTFIX
Alias:	None
Product:	Red Hat Gluster Storage
Classification:	Red Hat Storage
Component:	readdir-ahead
Sub Component:
Version:	rhgs-3.4
Hardware:	Unspecified
OS:	Unspecified
Priority:	high
Severity:	medium
Target Milestone:	---
Target Release:	---
Assignee:	Csaba Henk
QA Contact:	Prasanth
Docs Contact:
URL:
Whiteboard:
Depends On:	1631406
Blocks:	1660534 1662830

Reported:	2017-11-08 04:55 UTC by Poornima G
Modified:	2021-02-25 17:24 UTC
CC List:	8 users
Fixed In Version:
Doc Type:	If docs needed, set a value
Doc Text:
Clone Of:
Clones:	1662830
Environment:
Last Closed:	2021-02-25 17:24:37 UTC
Embargoed:
Dependent Products:

Description Poornima G 2017-11-08 04:55:33 UTC

Description of problem:
Currently, the parallel-readdir feature is optional. We have seen that it greatly improves readdir performance in large clusters and, to some extent, in small clusters, so it would be good to enable it by default.

Comment 19 Raghavendra G 2019-01-02 06:23:06 UTC

For some performance data, see:
1. https://events.static.linuxfound.org/sites/events/files/slides/Gluster_DirPerf_Vault2017_0.pdf
2. https://www.spinics.net/lists/gluster-users/msg34956.html
3. https://bugzilla.redhat.com/show_bug.cgi?id=1628807#c35

Comment 20 Raghavendra G 2019-01-02 06:42:42 UTC

Also see:
1. https://lists.gluster.org/pipermail/gluster-devel/2018-September/055419.html
2. https://lists.gnu.org/archive/html/gluster-devel/2013-09/msg00034.html

From a mail to gluster-devel titled "serialized readdir(p) across subvols and effect on performance"

<snip>
All,

As many of us are aware, readdir(p)s are serialized across DHT subvols. An intuitive first reaction to this algorithm is that readdir(p) is going to be slow.

However, this is only partly true. Reading the contents of a directory is normally split into multiple readdir(p) calls, and most of the time a single readdir(p) request is served from a single subvolume (or two subvolumes in the worst case), so a single readdir(p) is not serialized across all subvolumes. This holds when the directory is large enough that the dentries and inode data on each subvol exceed a typical readdir(p) buffer size (128K when readdir-ahead is enabled, 4KB on fuse when readdir-ahead is disabled).

Having said that, there are definitely cases where a single readdir(p) request can be serialized on many subvolumes. The best example is a readdir(p) request on an empty directory. Other relevant examples are directories that don't have enough dentries on each DHT subvolume to fill a single readdir(p) buffer. This is where performance.parallel-readdir helps. Note that this is also why making the cache-size of each readdir-ahead (loaded as a parent of each DHT subvolume) much bigger than a single readdir(p) buffer size won't improve performance in proportion to cache-size when performance.parallel-readdir is enabled.

Though this is not a new observation [1] (I stumbled upon [1] after realizing the above independently while working on performance.parallel-readdir), I felt this is a common misconception (I ran into a similar argument recently while trying to explain the DHT architecture to someone new to Glusterfs) and hence thought of writing a mail to clarify it.


[1] https://lists.gnu.org/archive/html/gluster-devel/2013-09/msg00034.html

regards,
Raghavendra

</snip>
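
The argument in the quoted mail can be illustrated with a toy model. This is only a sketch, not Gluster code: it assumes every directory entry occupies a fixed number of bytes (the 256-byte figure is made up), uses the 128K buffer size mentioned in the mail, and the function name `serialized_readdir` is hypothetical. It counts how many subvolumes each readdir(p) request has to visit when subvolumes are read one after another.

```python
def serialized_readdir(entries_per_subvol, entry_size=256, buf_size=128 * 1024):
    """Toy model: return, for each readdir(p) request, how many subvolumes it visited.

    entries_per_subvol: number of entries the directory holds on each DHT subvolume.
    entry_size:         assumed fixed size of one entry in bytes (illustrative only).
    buf_size:           size of one readdir(p) reply buffer in bytes.
    """
    capacity = buf_size // entry_size  # entries that fit into one reply
    if capacity < 1:
        raise ValueError("buffer must hold at least one entry")

    remaining = list(entries_per_subvol)
    subvol = 0
    visits = []
    while subvol < len(remaining):
        room = capacity
        touched = 0
        # Fill one reply buffer, walking subvolumes strictly in order.
        while room > 0 and subvol < len(remaining):
            touched += 1
            take = min(room, remaining[subvol])
            remaining[subvol] -= take
            room -= take
            if remaining[subvol] == 0:
                subvol += 1  # this subvolume is exhausted, continue on the next
        visits.append(touched)
    return visits


print(serialized_readdir([0, 0, 0, 0]))                # empty directory
print(serialized_readdir([10, 10, 10, 10]))            # small directory
print(max(serialized_readdir([2000, 2000, 2000, 2000])))  # large directory
```

Under these assumptions, the empty and the small directory are each answered by a single request that walks all four subvolumes in sequence, while no request on the large directory touches more than two. The first two cases are the ones where fetching from the subvolumes in parallel can help.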

Comment 36 Sahina Bose 2019-12-06 07:58:02 UTC

All referenced patches have been merged. Is there anything else left to be fixed on this bug, apart from enabling the option by default, Raghavendra?

(Assigning to Susant to follow up)

Comment 43 Csaba Henk 2021-02-25 17:24:37 UTC

So, recap (partly repeating private comment):

This issue is tracked upstream on https://github.com/gluster/glusterfs/issues/1884. Inferred from that conversation:

- We *cannot* merge the fix until the following issues are handled:
  - https://bugzilla.redhat.com/1631406 / https://github.com/gluster/glusterfs/issues/1416, "Dependencies of performance.parallel-readdir should be automatically turned on"
  - https://github.com/gluster/glusterfs/issues/1472, "Readdir-ahead leads to inconsistent ls results" (see https://github.com/gluster/glusterfs/issues/1884#issuecomment-738611183)
- It turned out there is no consensus on whether the fix *should* be merged: see https://github.com/gluster/glusterfs/issues/1884#issuecomment-738808182

Moreover, one of the referenced dependencies, GH issue #1472, is not being actively worked on.

Therefore we don't expect this fix to be merged in the foreseeable future (and it's uncertain whether it will ever be merged).
For that reason I am closing this bug as WONTFIX. If circumstances allow us to plan for the merge, we can reopen it. (As this does not deliver new functionality, just adjusts default behavior, and we don't know of active customer demand for this behavior, we'll follow up upstream in this regard.)
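
Since the default is not changing, the option still has to be turned on per volume. The following is a minimal sketch of what that could look like, assuming the standard `gluster volume set` CLI and assuming that `performance.readdir-ahead` is the dependency meant in the first issue above. The helper name `parallel_readdir_commands` and the volume name `myvol` are hypothetical, and the code only builds the command lines; it does not run them.

```python
def parallel_readdir_commands(volname):
    """Build (but do not run) the CLI calls that enable parallel-readdir on one volume."""
    options = [
        # Assumption: readdir-ahead is the dependency that must be on first.
        ("performance.readdir-ahead", "on"),
        ("performance.parallel-readdir", "on"),
    ]
    return [["gluster", "volume", "set", volname, key, value]
            for key, value in options]


for cmd in parallel_readdir_commands("myvol"):
    print(" ".join(cmd))
```

Keeping the dependency first in the list mirrors the ordering concern raised in bug 1631406, where the dependencies are not switched on automatically.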