Bug 1622281 - [Ganesha] du -sh crawl is extremely slow when coupled with Recursive ls and finds
Status: CLOSED CURRENTRELEASE
Product: Red Hat Gluster Storage
Classification: Red Hat Storage
Component: nfs-ganesha
Version: rhgs-3.4
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: ---
Target Release: ---
Assignee: Kaleb KEITHLEY
QA Contact: Manisha Saini
Reported: 2018-08-25 09:48 UTC by Manisha Saini
Modified: 2020-03-10 12:16 UTC
CC: 10 users

Fixed In Version: nfs-ganesha-2.7.3-8
Last Closed: 2020-03-10 12:16:44 UTC



Description Manisha Saini 2018-08-25 09:48:33 UTC
Description of problem:

Hit this issue while verifying BZ 1415608:
https://bugzilla.redhat.com/show_bug.cgi?id=1415608

Note: the above BZ was fixed with the readdir chunking and readdirplus code enabled.

Since readdir chunking has been disabled in the latest build, the use case was tested, in order to verify the bug, with the readdir chunking code disabled and readdirplus enabled, on nfs-ganesha-gluster-2.5.5-10.el7rhgs.x86_64.

Mounted the EC volume on 4 clients via v3/v4 (same VIP). Created a huge dataset of around 134 GB on the mount point. The dataset consists of a mix of large, small, and empty directory sets (details below in the steps).

Once the dataset was created on the mount point, ran find, du -sh, and ll -R on each mount.

Observation :

No hangs were observed with ll -R and find (fixed in BZ 1415608), but du -sh ran for around 2.5 hours.

Version-Release number of selected component (if applicable):

# rpm -qa | grep ganesha
nfs-ganesha-gluster-2.5.5-10.el7rhgs.x86_64
nfs-ganesha-debuginfo-2.5.5-10.el7rhgs.x86_64
nfs-ganesha-2.5.5-10.el7rhgs.x86_64
glusterfs-ganesha-3.12.2-16.el7rhgs.x86_64


How reproducible:
2/2


Steps to Reproduce:
1. Create a 6-node ganesha cluster.
2. Create a 6 x (4 + 2) Distributed-Disperse volume. Enable ganesha on the volume.
3. Mount the volume on 4 clients via v3/v4 using the same VIP.
4. Create a huge dataset consisting of small, large, and empty directories.

Detailed:

The large and small directory sets each contained an approximately equal number of files, about 1.1 million, averaging 8 KB per file. The small directory set had 12.5k directories with at most 100 files per directory, and the large directory set comprised 50 directories with approximately 20k files per directory. The empty directory set consisted of 12.5k directories.

5. Once the dataset is created, trigger recursive find, du -sh, and ll -R from 3 clients (see the command sketch after this list).
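
A minimal shell sketch of steps 1-5, assuming the 6-node ganesha cluster already exists; the volume name (testvol), server and brick names, the example VIP, and the dataset generator are illustrative, not taken from this report:

# Steps 1-2: create the 6 x (4 + 2) distributed-disperse volume (36 bricks,
# each disperse set spanning all six servers) and export it via ganesha.
bricks=""
for b in 1 2 3 4 5 6; do
  for s in 1 2 3 4 5 6; do
    bricks="$bricks server$s:/bricks/brick$b/testvol"
  done
done
gluster volume create testvol disperse-data 4 redundancy 2 $bricks
gluster volume start testvol
gluster volume set testvol ganesha.enable on

# Step 3: mount on the 4 clients via the same VIP (two v3, two v4).
VIP=192.0.2.10                                        # example address
mount -t nfs -o vers=3   $VIP:/testvol /mnt/testvol   # clients 1 and 2
mount -t nfs -o vers=4.0 $VIP:/testvol /mnt/testvol   # clients 3 and 4

# Step 4: generate the dataset on one mount point (directory and file
# counts as described above; file contents and sizes illustrative).
cd /mnt/testvol
mkdir -p small large empty
for d in $(seq 1 12500); do                 # small set: 12.5k dirs,
  mkdir small/dir$d                         # up to 100 x ~8 KB files each
  for f in $(seq 1 100); do
    dd if=/dev/urandom of=small/dir$d/f$f bs=8k count=1 status=none
  done
done
for d in $(seq 1 50); do                    # large set: 50 dirs,
  mkdir large/dir$d                         # ~20k x ~8 KB files each
  for f in $(seq 1 20000); do
    dd if=/dev/urandom of=large/dir$d/f$f bs=8k count=1 status=none
  done
done
for d in $(seq 1 12500); do mkdir empty/dir$d; done   # 12.5k empty dirs

# Step 5: trigger the crawls from 3 clients ("ll" is the usual ls -l alias).
find /mnt/testvol > /dev/null
du -sh /mnt/testvol
ls -lR /mnt/testvol > /dev/null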

Actual results:

ll -R and find ran without any hangs.

To crawl the 134 GB dataset, du took around 2.5 hours.
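
For a retest comparison, the crawl time can be captured with time(1); the mount path is illustrative:

time du -sh /mnt/testvol    # ~2.5 hours observed here for the 134 GB dataset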


Expected results:

du -sh should not take this long. This should be improved.


Additional info:

This needs to be retested with the readdir chunking code enabled, in order to check whether readdir chunking improves the time taken by du to crawl the 134 GB dataset.
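
A hedged sketch of the retest toggle, assuming readdir chunking is controlled by the mdcache Dir_Chunk parameter (0 disables chunking, a non-zero chunk size enables it); the block name (CACHEINODE in ganesha 2.5, MDCACHE in later releases) and the default value should be confirmed against the build under test:

# /etc/ganesha/ganesha.conf (block and parameter names assumed, see above)
MDCACHE {
    Dir_Chunk = 128;    # re-enable readdir chunking; 0 disables it
}

followed by a restart of nfs-ganesha on each node of the cluster for the change to take effect.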

Comment 8 Kaleb KEITHLEY 2020-02-13 14:31:28 UTC
Fixed in the current build; pending QE verification.

