Description of problem:
I recently tried to enable performance.md-cache-statfs for some testing, but every time I subject the FUSE mount to directory scans, the client segfaults.

Version-Release number of selected component (if applicable):
4.1.5-ubuntu1~xenial1 from the PPA for the client
4.1.5-ubuntu1~bionic1 from the PPA for the server

I was also able to reproduce this with a manual build of the client from the git master branch.

How reproducible:
Consistently, with the steps below, although the time it takes to trigger is variable (e.g. it might happen in the middle of the 1st scan, or the 8th). I have not encountered any segfaults except when performance.md-cache-statfs is enabled.

Steps to Reproduce:
1. Enable performance.md-cache-statfs on a volume:
   `gluster volume set tank performance.md-cache-statfs on`
2. On the client, run the following command to put a little stress on the cache (there are about 8k files in various directories in /mnt/tank):
   `for i in $(seq 1 10); do find /mnt/tank >/dev/null; done`

Actual results:
The client segfaults with the following info logged:
```
pending frames:
frame : type(1) op(STAT)
frame : type(0) op(0)
patchset: git://git.gluster.org/glusterfs.git
signal received: 11
time of crash:
2018-09-24 21:02:40
configuration details:
argp 1
backtrace 1
dlfcn 1
libpthread 1
llistxattr 1
setfsid 1
spinlock 1
epoll.h 1
xattr.h 1
st_atim.tv_nsec 1
package-string: glusterfs 4.1.5
/usr/lib/x86_64-linux-gnu/libglusterfs.so.0(+0x2038a)[0x7fc54fb7538a]
/usr/lib/x86_64-linux-gnu/libglusterfs.so.0(gf_print_trace+0x2e7)[0x7fc54fb7f0d7]
/lib/x86_64-linux-gnu/libc.so.6(+0x354b0)[0x7fc54ef694b0]
/usr/lib/x86_64-linux-gnu/libglusterfs.so.0(mem_put+0x3e)[0x7fc54fb9e8ee]
/usr/lib/x86_64-linux-gnu/glusterfs/4.1.5/xlator/mount/fuse.so(+0x146aa)[0x7fc54d6106aa]
/usr/lib/x86_64-linux-gnu/glusterfs/4.1.5/xlator/debug/io-stats.so(+0x19071)[0x7fc548348071]
/usr/lib/x86_64-linux-gnu/libglusterfs.so.0(default_statfs_cbk+0x13c)[0x7fc54fbf8c2c]
/usr/lib/x86_64-linux-gnu/glusterfs/4.1.5/xlator/performance/md-cache.so(+0x1471e)[0x7fc54878371e]
/usr/lib/x86_64-linux-gnu/libglusterfs.so.0(default_statfs_resume+0x1e5)[0x7fc54fc160e5]
/usr/lib/x86_64-linux-gnu/libglusterfs.so.0(call_resume+0x75)[0x7fc54fb9a635]
/usr/lib/x86_64-linux-gnu/glusterfs/4.1.5/xlator/performance/io-threads.so(+0x5588)[0x7fc548565588]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x76ba)[0x7fc54f3056ba]
/lib/x86_64-linux-gnu/libc.so.6(clone+0x6d)[0x7fc54f03b41d]
```
After the segfault, there is a cluster of `Transport endpoint is not connected` errors while the find commands continue running.

Expected results:
The command succeeds without error.

Additional info:
GDB stack trace, if that helps:
```
Thread 8 "glusteriotwr0" received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7ffff7e3d700 (LWP 13880)]
0x00007ffff7b1d8ee in mem_put (ptr=0x7fffe43c2130) at mem-pool.c:870
870     mem-pool.c: No such file or directory.
(gdb) backtrace
#0  0x00007ffff7b1d8ee in mem_put (ptr=0x7fffe43c2130) at mem-pool.c:870
#1  0x00007ffff558f6aa in FRAME_DESTROY (frame=0x7fffe4415438) at ../../../../libglusterfs/src/stack.h:178
#2  STACK_DESTROY (stack=0x7fffe00079b8) at ../../../../libglusterfs/src/stack.h:198
#3  fuse_statfs_cbk (frame=<optimized out>, cookie=<optimized out>, this=<optimized out>, op_ret=<optimized out>, op_errno=0, buf=<optimized out>, xdata=0x0) at fuse-bridge.c:3253
#4  0x00007ffff02c7071 in ?? () from /usr/lib/x86_64-linux-gnu/glusterfs/4.1.5/xlator/debug/io-stats.so
#5  0x00007ffff7b77c2c in default_statfs_cbk (frame=0x7fffe0008518, cookie=<optimized out>, this=<optimized out>, op_ret=0, op_errno=0, buf=0x7fffec030d40, xdata=0x0) at defaults.c:1607
#6  0x00007ffff070271e in mdc_statfs (frame=frame@entry=0x7fffe4415438, this=<optimized out>, loc=loc@entry=0x7fffe0009488, xdata=xdata@entry=0x0) at md-cache.c:1084
#7  0x00007ffff7b950e5 in default_statfs_resume (frame=0x7fffe0008518, this=0x7fffec017920, loc=0x7fffe0009488, xdata=0x0) at defaults.c:2273
#8  0x00007ffff7b19635 in call_resume (stub=0x7fffe0009438) at call-stub.c:2689
#9  0x00007ffff04e4588 in iot_worker (data=0x7fffec02d5c0) at io-threads.c:231
#10 0x00007ffff72846ba in start_thread (arg=0x7ffff7e3d700) at pthread_create.c:333
#11 0x00007ffff6fba41d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109
```

Volume info:
```
Volume Name: tank
Type: Distribute
Volume ID: f801b0c4-c1c4-4d28-9ff0-3a2ba2eb1919
Status: Started
Snapshot Count: 0
Number of Bricks: 1
Transport-type: tcp
Bricks:
Brick1: g1:/data/gluster/tank/brick-89f393fe/brick
Options Reconfigured:
performance.md-cache-statfs: on
nfs.disable: on
transport.address-family: inet
```
where /data/gluster/tank/brick-89f393fe is a ZFS mount.
REVIEW: https://review.gluster.org/22009 (performance/md-cache: Fix a crash when statfs caching is enabled) posted (#1) for review on master by Vijay Bellur
REVIEW: https://review.gluster.org/22009 (performance/md-cache: Fix a crash when statfs caching is enabled) posted (#3) for review on master by Raghavendra G
This bug is being closed because a release that should address the reported issue has been made available. If the problem is still not fixed with glusterfs-6.0, please open a new bug report. glusterfs-6.0 has been announced on the Gluster mailing lists [1]; packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution.

[1] https://lists.gluster.org/pipermail/announce/2019-March/000120.html
[2] https://www.gluster.org/pipermail/gluster-users/