Description of problem:
I recently tried to enable performance.md-cache-statfs for some testing, but every time I subject the FUSE mount to directory scans, the client segfaults.

Version-Release number of selected component (if applicable):
4.1.5-ubuntu1~xenial1 from the PPA for the client
4.1.5-ubuntu1~bionic1 from the PPA for the server

I was also able to reproduce this with a manual build of the client from the git master branch.

How reproducible:
Consistently, with the steps below, although the time it takes to trigger is variable (e.g. it might happen in the middle of the 1st scan, or the 8th). I have not encountered any segfaults except when performance.md-cache-statfs is enabled.

Steps to Reproduce:
1. Enable performance.md-cache-statfs on a volume:
   `gluster volume set tank performance.md-cache-statfs on`
2. On the client, run the following command to put a little stress on the cache (there are about 8k files in various directories in /mnt/tank):
   `for i in $(seq 1 10); do find /mnt/tank >/dev/null; done`

Actual results:
The client segfaults with the following info logged:
```
pending frames:
frame : type(1) op(STAT)
frame : type(0) op(0)
patchset: git://git.gluster.org/glusterfs.git
signal received: 11
time of crash:
2018-09-24 21:02:40
configuration details:
argp 1
backtrace 1
dlfcn 1
libpthread 1
llistxattr 1
setfsid 1
spinlock 1
epoll.h 1
xattr.h 1
st_atim.tv_nsec 1
package-string: glusterfs 4.1.5
/usr/lib/x86_64-linux-gnu/libglusterfs.so.0(+0x2038a)[0x7fc54fb7538a]
/usr/lib/x86_64-linux-gnu/libglusterfs.so.0(gf_print_trace+0x2e7)[0x7fc54fb7f0d7]
/lib/x86_64-linux-gnu/libc.so.6(+0x354b0)[0x7fc54ef694b0]
/usr/lib/x86_64-linux-gnu/libglusterfs.so.0(mem_put+0x3e)[0x7fc54fb9e8ee]
/usr/lib/x86_64-linux-gnu/glusterfs/4.1.5/xlator/mount/fuse.so(+0x146aa)[0x7fc54d6106aa]
/usr/lib/x86_64-linux-gnu/glusterfs/4.1.5/xlator/debug/io-stats.so(+0x19071)[0x7fc548348071]
/usr/lib/x86_64-linux-gnu/libglusterfs.so.0(default_statfs_cbk+0x13c)[0x7fc54fbf8c2c]
/usr/lib/x86_64-linux-gnu/glusterfs/4.1.5/xlator/performance/md-cache.so(+0x1471e)[0x7fc54878371e]
/usr/lib/x86_64-linux-gnu/libglusterfs.so.0(default_statfs_resume+0x1e5)[0x7fc54fc160e5]
/usr/lib/x86_64-linux-gnu/libglusterfs.so.0(call_resume+0x75)[0x7fc54fb9a635]
/usr/lib/x86_64-linux-gnu/glusterfs/4.1.5/xlator/performance/io-threads.so(+0x5588)[0x7fc548565588]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x76ba)[0x7fc54f3056ba]
/lib/x86_64-linux-gnu/libc.so.6(clone+0x6d)[0x7fc54f03b41d]
```
After the segfault, there is a cluster of `Transport endpoint is not connected` errors while the find commands continue running.

Expected results:
The command succeeds without error.

Additional info:
GDB stack trace, if that helps:
```
Thread 8 "glusteriotwr0" received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7ffff7e3d700 (LWP 13880)]
0x00007ffff7b1d8ee in mem_put (ptr=0x7fffe43c2130) at mem-pool.c:870
870     mem-pool.c: No such file or directory.
(gdb) backtrace
#0  0x00007ffff7b1d8ee in mem_put (ptr=0x7fffe43c2130) at mem-pool.c:870
#1  0x00007ffff558f6aa in FRAME_DESTROY (frame=0x7fffe4415438) at ../../../../libglusterfs/src/stack.h:178
#2  STACK_DESTROY (stack=0x7fffe00079b8) at ../../../../libglusterfs/src/stack.h:198
#3  fuse_statfs_cbk (frame=<optimized out>, cookie=<optimized out>, this=<optimized out>, op_ret=<optimized out>, op_errno=0, buf=<optimized out>, xdata=0x0) at fuse-bridge.c:3253
#4  0x00007ffff02c7071 in ?? () from /usr/lib/x86_64-linux-gnu/glusterfs/4.1.5/xlator/debug/io-stats.so
#5  0x00007ffff7b77c2c in default_statfs_cbk (frame=0x7fffe0008518, cookie=<optimized out>, this=<optimized out>, op_ret=0, op_errno=0, buf=0x7fffec030d40, xdata=0x0) at defaults.c:1607
#6  0x00007ffff070271e in mdc_statfs (frame=frame@entry=0x7fffe4415438, this=<optimized out>, loc=loc@entry=0x7fffe0009488, xdata=xdata@entry=0x0) at md-cache.c:1084
#7  0x00007ffff7b950e5 in default_statfs_resume (frame=0x7fffe0008518, this=0x7fffec017920, loc=0x7fffe0009488, xdata=0x0) at defaults.c:2273
#8  0x00007ffff7b19635 in call_resume (stub=0x7fffe0009438) at call-stub.c:2689
#9  0x00007ffff04e4588 in iot_worker (data=0x7fffec02d5c0) at io-threads.c:231
#10 0x00007ffff72846ba in start_thread (arg=0x7ffff7e3d700) at pthread_create.c:333
#11 0x00007ffff6fba41d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109
```

Volume info:
```
Volume Name: tank
Type: Distribute
Volume ID: f801b0c4-c1c4-4d28-9ff0-3a2ba2eb1919
Status: Started
Snapshot Count: 0
Number of Bricks: 1
Transport-type: tcp
Bricks:
Brick1: g1:/data/gluster/tank/brick-89f393fe/brick
Options Reconfigured:
performance.md-cache-statfs: on
nfs.disable: on
transport.address-family: inet
```
where /data/gluster/tank/brick-89f393fe is a ZFS mount.
REVIEW: https://review.gluster.org/22009 (performance/md-cache: Fix a crash when statfs caching is enabled) posted (#1) for review on master by Vijay Bellur
REVIEW: https://review.gluster.org/22009 (performance/md-cache: Fix a crash when statfs caching is enabled) posted (#3) for review on master by Raghavendra G
This bug is being closed because a release that should address the reported issue has been made available. If the problem is still not fixed with glusterfs-6.0, please open a new bug report. glusterfs-6.0 has been announced on the Gluster mailing lists [1]; packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution.

[1] https://lists.gluster.org/pipermail/announce/2019-March/000120.html
[2] https://www.gluster.org/pipermail/gluster-users/