Description of problem: os.walk() in Python walks the entire path given to it. It internally does a stat to determine if a file is a file or directory. This additional stat is not required to determine of a file is a file/directory. An alternative implementation called "scandir.walk()" exists which is at least 2-3 times faster. This is because "scandir.walk()" reads the d_type member of dirent structure returned by readdir(). GlusterFS posix xlator does properly populate the d_type member. Hence it can be accessed/consumed by applications. https://github.com/benhoyt/scandir Version-Release number of selected component (if applicable): GlusterFS master branch How reproducible: Run the benchmark script on glusterfs mount point vs on a xfs mountpoint. https://github.com/benhoyt/scandir/blob/master/benchmark.py Actual results: On XFS: # python benchmark.py Using fast C version of scandir Comparing against builtin version of os.walk() Priming the system's cache... Benchmarking walks on benchtree, repeat 1/3... Benchmarking walks on benchtree, repeat 2/3... Benchmarking walks on benchtree, repeat 3/3... os.walk took 0.035s, scandir.walk took 0.019s -- 1.9x as fast On GlusterFS: # python benchmark.py Using fast C version of scandir Comparing against builtin version of os.walk() Priming the system's cache... Benchmarking walks on benchtree, repeat 1/3... Benchmarking walks on benchtree, repeat 2/3... Benchmarking walks on benchtree, repeat 3/3... os.walk took 0.845s, scandir.walk took 0.864s -- 1.0x as fast Expected results: scandir.walk() to be faster than os.walk() as it only does readdir() without doing stat() on each file. TODO: Retry with all performance xlators disabled.
We need to verify if the 'struct dirent'->d_type is retrieved correctly over the fuse filesystem. In case it is not, this would be a bug in fuse.
I did check that (using following script) on latest master branch code. It does fill that. #!/usr/bin/env python # Return status indicates if d_type returned import ctypes import sys (DT_UNKNOWN, DT_DIR,) = (0, 4,) class dirent(ctypes.Structure): _fields_ = [ ("d_ino", ctypes.c_long), ("d_off", ctypes.c_long), ("d_reclen", ctypes.c_ushort), ("d_type", ctypes.c_ubyte), ("d_name", ctypes.c_char*256)] direntp = ctypes.POINTER(dirent) libc = ctypes.cdll.LoadLibrary("libc.so.6") libc.readdir.restype = direntp dirp = libc.opendir(".") if dirp: ep = libc.readdir(dirp) else: sys.exit(1) print ep.contents.d_type
I was wrong. The above script failed to detect it because d_type is always set correctly for "." and ".." entries. GlusterFS correctly propagates d_type from posix xlator up the stack till FUSE. It turns out that XFS does't fill correct d_type until recently (Linux>=3.15 and xfsprogs>=3.2.0). If one formats his/her filesystem with XFS's newer version 5 on-disk format, d_type is rightly set. Example: mkfs.xfs -m crc=1 /srv/disk1 However, GlusterFS can support filling the right d_type in readdirp() responses even if XFS doesn't using the pre-fetched stat information.
REVIEW: http://review.gluster.org/14095 (posix: Set correct d_type for readdirp() calls) posted (#1) for review on master by Prashanth Pai (ppai)
Created a nested fs tree of depth = 4 on glusterfs mountpoint. In the below example: ls command from coreutils is capable of avoiding additional lstat() if it finds d_type to be set correctly. BEFORE http://review.gluster.org/14095: root# strace -fc -e getdents,lstat ls -fR /mnt/gluster-object/gsmetadata >> /dev/null % time seconds usecs/call calls errors syscall ------ ----------- ----------- --------- --------- ---------------- 55.95 0.672307 30 22226 getdents 44.05 0.529388 24 22224 lstat ------ ----------- ----------- --------- --------- ---------------- 100.00 1.201695 44450 total AFTER http://review.gluster.org/14095: root# strace -fc -e getdents,lstat ls -fR /mnt/gluster-object/gsmetadata >> /dev/null % time seconds usecs/call calls errors syscall ------ ----------- ----------- --------- --------- ---------------- 100.00 0.595680 27 22226 getdents ------ ----------- ----------- --------- --------- ---------------- 100.00 0.595680 22226 total
REVIEW: http://review.gluster.org/14095 (posix: Set correct d_type for readdirp() calls) posted (#2) for review on master by Prashanth Pai (ppai)
COMMIT: http://review.gluster.org/14095 committed in master by Jeff Darcy (jdarcy) ------ commit 77def44d497d090ef3f393b6d9403c1a29dcf993 Author: Prashanth Pai <ppai> Date: Wed Apr 27 13:37:07 2016 +0530 posix: Set correct d_type for readdirp() calls dirent.d_type can contain the type of the directory entry. The 'd_type' struct member in dirent is present in Linux and many BSD flavours. However, filling d_type with correct value requires support from the underlying filesystem. If not, d_type is set to DT_UNKNOWN. XFS added support for d_type as part of their newer version 5 on-disk format. However, this requires Linux >= 3.15, xfsprogs >= 3.2.0 and the bricks to be formatted using the new format. This patch enables posix xlator to set d_type to the right value even when the underlying filesystem does not support it. d_type can be set using information previously fetched by stat() on the dir entry. This will aid FUSE applications to leverage d_type to avoid the expense of calling lstat() if further actions depend on the type of the file. Refer `man 3 readdir` and `man 2 getdents` BUG: 1175711 Change-Id: Ic5a262fe4c64122726b4fae2d1bea375c559ca04 Signed-off-by: Prashanth Pai <ppai> Reviewed-on: http://review.gluster.org/14095 Smoke: Gluster Build System <jenkins.com> NetBSD-regression: NetBSD Build System <jenkins.org> CentOS-regression: Gluster Build System <jenkins.com> Reviewed-by: Jeff Darcy <jdarcy>