Bug 958324

Summary: Take advantage of readdir-plus efficiencies for handling container GET responses
Product: [Community] GlusterFS Reporter: Peter Portante <pportant>
Component: object-storage Assignee: Thiago da Silva <thiago>
Status: CLOSED EOL QA Contact:
Severity: medium Docs Contact:
Priority: unspecified    
Version: pre-release CC: bugs, gluster-bugs, ppai, pportant, riek
Target Milestone: ---   
Target Release: ---   
Hardware: x86_64   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2015-10-22 15:40:20 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
Embargoed:

Description Peter Portante 2013-04-30 22:43:38 UTC
Take advantage of readdir-plus efficiencies for handling container GET responses.

Currently the gluster-swift integration relies on the Python Standard Library implementation of os.walk() to list all the files in all the subdirectories of a container. We could probably improve efficiency by taking advantage of the readdir-plus functionality in GlusterFS (or so we think).
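
For illustration only, the os.walk()-based listing described above might look roughly like the sketch below. This is not the actual gluster-swift code: the function name list_container_objects and the choice to return sorted, "/"-separated relative paths are assumptions made for the example.

```python
import os


def list_container_objects(container_path):
    """Return sorted object names, relative to the container root.

    Hypothetical helper: the name and return format are assumptions,
    not taken from gluster-swift.
    """
    objects = []
    for dirpath, _dirnames, filenames in os.walk(container_path):
        # Directory path relative to the container root ("." for the root).
        rel_dir = os.path.relpath(dirpath, container_path)
        for name in filenames:
            if rel_dir == os.curdir:
                objects.append(name)
            else:
                # Object names use "/" regardless of the platform separator.
                objects.append(os.path.join(rel_dir, name).replace(os.sep, "/"))
    return sorted(objects)
```

Every directory in the container is visited once, so the cost of a container GET grows with the number of directories as well as the number of files.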

Comment 2 Prashanth Pai 2013-12-05 09:05:15 UTC
readdir-plus is slower because it fetches additional information (all we need for GET is filenames). Moreover, afaict, GlusterFS's readdir-plus is not exposed externally to applications.

os.walk is slow. There is an alternative to it here: https://github.com/benhoyt/scandir

A very crude benchmark I tried:

6 x 2 Distributed-Replicate glusterfs volume:
---------------------------------------------

On first run:
[root@pp ~]# python ./walk_bench.py
os.walk() =  78.3573181629 s
scandir.walk() =  59.4629478455 s

On second run:
[root@pp ~]# python ./walk_bench.py
os.walk() =  61.2966690063 s
scandir.walk() =  56.0211482048 s

On plain ext4 ("/"):
--------------------

On first run:
[root@pp ~]# python ./walk_bench.py 
os.walk() =  14.4488649368 s
scandir.walk() =  4.77531003952 s

On second run:
[root@pp ~]# python ./walk_bench.py 
os.walk() =  5.18742895126 s
scandir.walk() =  4.57708311081 s
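
The walk_bench.py script itself is not attached to this bug. A minimal sketch of a comparable timing harness, assuming it simply walks a tree with each implementation and reports the elapsed wall-clock time, could look like this. The helper names time_walk, run_benchmark and main are hypothetical; scandir.walk() is provided by the third-party scandir package linked above and is used only if it is installed.

```python
import os
import time


def time_walk(walk_fn, top):
    """Walk `top` with walk_fn; return (elapsed_seconds, file_count)."""
    start = time.time()
    count = 0
    for _dirpath, _dirnames, filenames in walk_fn(top):
        count += len(filenames)
    return time.time() - start, count


def run_benchmark(top, walkers):
    """Time each walker in `walkers` (label -> walk function) on `top`."""
    results = {}
    for label, walk_fn in walkers.items():
        results[label] = time_walk(walk_fn, top)
    return results


def main(top):
    walkers = {"os.walk()": os.walk}
    try:
        import scandir  # third-party package; optional
        walkers["scandir.walk()"] = scandir.walk
    except ImportError:
        pass  # only os.walk() is timed when scandir is not installed
    for label, (elapsed, count) in run_benchmark(top, walkers).items():
        print("%s = %s s (%d files)" % (label, elapsed, count))

# Example: main("/path/to/tree")
```

As the first-run versus second-run numbers above suggest, caching has a large effect, so each walker should be timed more than once. Note also that on Python 3.5 and later os.walk() is itself implemented on top of os.scandir(), so the gap measured there would likely be much smaller.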

Comment 3 Kaleb KEITHLEY 2015-10-22 15:40:20 UTC
The "pre-release" version is ambiguous and is about to be removed as a choice.

If you believe this is still a bug, please change the status back to NEW and choose the appropriate, applicable version for it.