Take advantage of readdir-plus efficiencies for handling container GET responses. Currently the gluster-swift integration relies on the Python Standard Library implementation of os.walk() to return the names of all the files in all the subdirectories of a container. We could probably make this more efficient by taking advantage of the readdir-plus functionality in GlusterFS (or so we think).
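For reference, here is a minimal sketch of the kind of os.walk()-based listing described above. The function name list_container_objects and the idea that a container maps to one directory are assumptions made for illustration; this is not the actual gluster-swift code.

```python
import os


def list_container_objects(container_path):
    """Return object names under a container directory, sorted.

    Hypothetical helper: each regular file found below container_path is
    reported by its path relative to the container directory.
    """
    names = []
    for dirpath, _dirnames, filenames in os.walk(container_path):
        for name in filenames:
            full_path = os.path.join(dirpath, name)
            names.append(os.path.relpath(full_path, container_path))
    return sorted(names)
```

Only the names are collected here; no per-file stat information is needed to build the listing.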
readdir-plus is slower because it tries to fetch additional information (all we need for a GET is the filenames). Moreover, as far as I can tell, GlusterFS's readdir-plus is not exposed to applications.

os.walk() is slow. There is an alternative to it here: https://github.com/benhoyt/scandir

Some very crude benchmarks I tried:

6 x 2 Distributed-Replicate glusterfs volume:
---------------------------------------------
On first run:
[root@pp ~]# python ./walk_bench.py
os.walk() = 78.3573181629 s
scandir.walk() = 59.4629478455 s

On second run:
[root@pp ~]# python ./walk_bench.py
os.walk() = 61.2966690063 s
scandir.walk() = 56.0211482048 s

On plain ext4 "/":
---------------------
On first run:
[root@pp ~]# python ./walk_bench.py
os.walk() = 14.4488649368 s
scandir.walk() = 4.77531003952 s

On second run:
[root@pp ~]# python ./walk_bench.py
os.walk() = 5.18742895126 s
scandir.walk() = 4.57708311081 s
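The walk_bench.py script itself was not attached. A rough sketch of what such a comparison could look like follows; the helper names time_walk and run_bench are hypothetical, and the scandir import assumes the third-party package from the link above is installed (the sketch falls back to timing os.walk() alone if it is not).

```python
import os
import sys
import time

try:
    # Third-party package from https://github.com/benhoyt/scandir
    from scandir import walk as scandir_walk
except ImportError:
    scandir_walk = None  # benchmark os.walk() only


def time_walk(walk_fn, top):
    """Walk the tree under top once; return (elapsed_seconds, file_count)."""
    start = time.time()
    count = 0
    for _dirpath, _dirnames, filenames in walk_fn(top):
        count += len(filenames)
    return time.time() - start, count


def run_bench(top):
    """Time each available walker over the same tree."""
    results = {"os.walk()": time_walk(os.walk, top)}
    if scandir_walk is not None:
        results["scandir.walk()"] = time_walk(scandir_walk, top)
    return results


if __name__ == "__main__":
    top = sys.argv[1] if len(sys.argv) > 1 else "."
    for name, (elapsed, count) in run_bench(top).items():
        print("%s = %s s (%d files)" % (name, elapsed, count))
```

Two caveats when reading numbers from a script like this: the second walker runs against caches warmed by the first, so the order of the runs matters, and on Python 3.5 and later os.walk() is itself built on os.scandir(), so the gap shown above applies to older interpreters.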
pre-release version is ambiguous and about to be removed as a choice. If you believe this is still a bug, please change the status back to NEW and choose the appropriate, applicable version for it.