Description of problem:
glusterfs client version 7.1-ubuntu1~bionic1 (taken from http://ppa.launchpad.net/gluster/glusterfs-7/ubuntu) crashes. The filesystem is mounted from a 3-node cluster using the glusterfs protocol.

How reproducible:
Happens rarely and unpredictably.

Additional info:
[2020-01-12 12:45:49.588961] I [MSGID: 109066] [dht-rename.c:1951:dht_rename] 0-nafsha-dht: renaming /gfs-atl/components/reports/0.37.2-69151/reports/vendor/myclabs/deep-copy/src/DeepCopy/Matcher/.PropertyNameMatcher.php.pBd9iW (c82f3a76-a099-426f-ad5d-084c6b961d5f) (hash=nafsha-replicate-0/cache=nafsha-replicate-0) => /gfs-atl/components/reports/0.37.2-69151/reports/vendor/myclabs/deep-copy/src/DeepCopy/Matcher/PropertyNameMatcher.php ((null)) (hash=nafsha-replicate-0/cache=<nul>)
pending frames:
frame : type(1) op(SETATTR)
frame : type(1) op(SETATTR)
frame : type(1) op(LOOKUP)
frame : type(1) op(FLUSH)
frame : type(0) op(0)
frame : type(1) op(OPEN)
patchset: git://git.gluster.org/glusterfs.git
signal received: 6
time of crash: 2020-01-12 12:45:49
configuration details:
argp 1
backtrace 1
dlfcn 1
libpthread 1
llistxattr 1
setfsid 1
spinlock 1
epoll.h 1
xattr.h 1
st_atim.tv_nsec 1
package-string: glusterfs 7.1
/usr/lib/x86_64-linux-gnu/libglusterfs.so.0(+0x2292b)[0x7fceb2e3092b]
/usr/lib/x86_64-linux-gnu/libglusterfs.so.0(gf_print_trace+0x306)[0x7fceb2e3afc6]
/lib/x86_64-linux-gnu/libc.so.6(+0x3ef20)[0x7fceb21e0f20]
/lib/x86_64-linux-gnu/libc.so.6(gsignal+0xc7)[0x7fceb21e0e97]
/lib/x86_64-linux-gnu/libc.so.6(abort+0x141)[0x7fceb21e2801]
/lib/x86_64-linux-gnu/libc.so.6(+0x895e5)[0x7fceb222b5e5]
/lib/x86_64-linux-gnu/libc.so.6(+0x8992a)[0x7fceb222b92a]
/lib/x86_64-linux-gnu/libpthread.so.0(pthread_cond_timedwait+0x427)[0x7fceb25a1077]
/usr/lib/x86_64-linux-gnu/libglusterfs.so.0(+0x312cf)[0x7fceb2e3f2cf]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x76db)[0x7fceb259a6db]
/lib/x86_64-linux-gnu/libc.so.6(clone+0x3f)[0x7fceb22c388f]
The core dump (281 MB compressed) can be downloaded here: https://t12.catchmedia.com/tmp/glusterfs71-client.core.gz
Hi,
Can you share the "thread apply all bt full" output from the core image? Otherwise I would have to set up an environment to access it.
Thanks,
Mohit Agrawal
Created attachment 1651777
gdb output of "thread apply all bt full" on the core

See the attached gdb output of "thread apply all bt full".
Kindly install the glusterfs debug-symbols package and then share the stack dump again. I cannot see the glusterfs stack frames in the attached bt output.
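For reference, the requested backtrace can be produced non-interactively with gdb in batch mode. The sketch below only builds the command line; the binary path /usr/sbin/glusterfs and the core file name are assumptions for illustration and should be replaced with the actual paths on the affected client. The glusterfs debug symbols need to be installed first, otherwise the glusterfs frames will not be resolved.

```python
import shlex


def build_gdb_backtrace_command(binary_path, core_path):
    """Return the argv list for a gdb batch run that dumps every thread's stack."""
    return [
        "gdb",
        "-batch",                            # run the given commands, then exit
        "-ex", "thread apply all bt full",   # the command requested in this bug
        binary_path,                         # executable that produced the core
        core_path,                           # uncompressed core file
    ]


if __name__ == "__main__":
    # Both paths are hypothetical examples, not taken from the reporter's system.
    cmd = build_gdb_backtrace_command("/usr/sbin/glusterfs", "glusterfs71-client.core")
    # Print a copy-pasteable shell command; redirect its output to a file to attach it.
    print(shlex.join(cmd))
```

The returned list can also be passed directly to subprocess.run if the output should be captured from a script.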
Created attachment 1651797
gdb output of "thread apply all bt full" on the core - with debug symbols

Attached is the gdb output of "thread apply all bt full" on the core, this time with debug symbols.
Hi,
According to the stack dump, this does not appear to be a gluster bug: the process aborts in the futex code after printing the error message "The futex facility returned an unexpected error code.". The issue looks similar to a gcc bug that is already fixed in the latest release: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=935750
Thanks,
Mohit Agrawal
I believe this is an upstream bug that was accidentally filed against downstream, so I am moving it to the appropriate stream. If I am wrong, feel free to correct it.
I think it is imperative to fix glusterfs so that it works around the bug in earlier versions of gcc. gcc-9 is only available in Ubuntu 19.04 and higher (https://packages.ubuntu.com/search?keywords=gcc-9), whereas Ubuntu 18.04 is still subject to the problem. Servers (where glusterfs is usually used) tend to be upgraded slowly, so we are talking about a year, if not more, of users being unable to use glusterfs on Ubuntu. Without a workaround (or possibly a backport of the fix to earlier gcc versions), many users will be unable to use the glusterfs client.
*** Bug 1775927 has been marked as a duplicate of this bug. ***
Based on the stack trace, I would say this bug is the same as bug #1785208, which is fixed in version 7.2. Can you check whether the crash still occurs after upgrading to 7.2?
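A quick way to tell whether a given client package already includes that release is to compare the leading numeric part of its version string against 7.2. This is a minimal sketch, assuming version strings shaped like the one in the original report (7.1-ubuntu1~bionic1). The helper names are made up for illustration, and the suffix after the numeric part is simply ignored.

```python
import re

# Release named in this bug as containing the fix.
FIXED_IN = (7, 2)


def parse_version(text):
    """Extract the leading dotted numeric version, e.g. '7.1-ubuntu1~bionic1' -> (7, 1)."""
    match = re.match(r"(\d+(?:\.\d+)*)", text.strip())
    if match is None:
        raise ValueError(f"no numeric version found in {text!r}")
    return tuple(int(part) for part in match.group(1).split("."))


def has_fix(version_text, fixed_in=FIXED_IN):
    """Return True if the version is at least the release that carries the fix."""
    return parse_version(version_text) >= fixed_in


if __name__ == "__main__":
    for candidate in ("7.1-ubuntu1~bionic1", "7.2", "7.4"):
        print(candidate, has_fix(candidate))
```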
Do we need to upgrade/restart the servers, or is upgrading only the clients enough? This is important for us because it is a production environment.
The recommended upgrade procedure is described here: https://docs.gluster.org/en/latest/Upgrade-Guide/upgrade_to_7/ You should upgrade the servers first, then the clients. For minor revision updates the order probably does not matter, but as far as I know this has not been tested.
This bug has been moved to https://github.com/gluster/glusterfs/issues/873 and will be tracked there from now on. Visit the GitHub issue URL for further details.
Upgraded glusterfs to version 7.4 (using the PPA at https://launchpad.net/~gluster) and have seen no problems for 12 days. From my perspective the bug is fixed.