Bug 1790211

Summary: glusterfs-client crashed - signal received: 6
Product: [Community] GlusterFS
Component: fuse
Version: 7
Status: CLOSED UPSTREAM
Severity: high
Priority: unspecified
Hardware: Unspecified
OS: Unspecified
Reporter: Arie Skliarouk <skliarie+redhat-bugzilla>
Assignee: bugs <bugs>
CC: bugs, jahernan, lesser.evil, nchilaka, pasik, rhs-bugs, vbellur
Keywords: ZStream
Target Milestone: ---
Target Release: ---
Type: Bug
Last Closed: 2020-03-12 12:20:00 UTC

Attachments:
gdb output of thread apply all bt full on the core (flags: none)
gdb output of thread apply all bt full on the core - with debug symbols (flags: none)

Description Arie Skliarouk 2020-01-12 16:13:12 UTC
Description of problem:
glusterfs client version 7.1-ubuntu1~bionic1 (taken from http://ppa.launchpad.net/gluster/glusterfs-7/ubuntu)
crashes.

The filesystem is mounted from a 3-node cluster using the glusterfs protocol.

How reproducible:
Happens rarely and unpredictably.

Additional info:
[2020-01-12 12:45:49.588961] I [MSGID: 109066] [dht-rename.c:1951:dht_rename] 0-nafsha-dht: renaming /gfs-atl/components/reports/0.37.2-69151/reports/vendor/myclabs/deep-copy/src/DeepCopy/Matcher/.PropertyNameMatcher.php.pBd9iW (c82f3a76-a099-426f-ad5d-084c6b961d5f) (hash=nafsha-replicate-0/cache=nafsha-replicate-0) => /gfs-atl/components/reports/0.37.2-69151/reports/vendor/myclabs/deep-copy/src/DeepCopy/Matcher/PropertyNameMatcher.php ((null)) (hash=nafsha-replicate-0/cache=<nul>)
pending frames:
frame : type(1) op(SETATTR)
frame : type(1) op(SETATTR)
frame : type(1) op(LOOKUP)
frame : type(1) op(FLUSH)
frame : type(0) op(0)
frame : type(1) op(OPEN)
patchset: git://git.gluster.org/glusterfs.git
signal received: 6
time of crash: 
2020-01-12 12:45:49
configuration details:
argp 1
backtrace 1
dlfcn 1
libpthread 1
llistxattr 1
setfsid 1
spinlock 1
epoll.h 1
xattr.h 1
st_atim.tv_nsec 1
package-string: glusterfs 7.1
/usr/lib/x86_64-linux-gnu/libglusterfs.so.0(+0x2292b)[0x7fceb2e3092b]
/usr/lib/x86_64-linux-gnu/libglusterfs.so.0(gf_print_trace+0x306)[0x7fceb2e3afc6]
/lib/x86_64-linux-gnu/libc.so.6(+0x3ef20)[0x7fceb21e0f20]
/lib/x86_64-linux-gnu/libc.so.6(gsignal+0xc7)[0x7fceb21e0e97]
/lib/x86_64-linux-gnu/libc.so.6(abort+0x141)[0x7fceb21e2801]
/lib/x86_64-linux-gnu/libc.so.6(+0x895e5)[0x7fceb222b5e5]
/lib/x86_64-linux-gnu/libc.so.6(+0x8992a)[0x7fceb222b92a]
/lib/x86_64-linux-gnu/libpthread.so.0(pthread_cond_timedwait+0x427)[0x7fceb25a1077]
/usr/lib/x86_64-linux-gnu/libglusterfs.so.0(+0x312cf)[0x7fceb2e3f2cf]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x76db)[0x7fceb259a6db]
/lib/x86_64-linux-gnu/libc.so.6(clone+0x3f)[0x7fceb22c388f]
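Editor's note: the frames above follow the `module(symbol+0xoffset)[0xaddress]` layout that glibc's backtrace_symbols() produces. The following is a minimal Python sketch for pulling the module, symbol, and offset out of such lines when triaging a log. The names `FRAME_RE`, `parse_frame`, and `named_frames` are hypothetical helpers introduced for illustration and are not part of GlusterFS.

```python
import re

# Assumed layout, as seen in the crash log:
#   /path/lib.so(symbol+0xOFF)[0xADDR]   or   /path/lib.so(+0xOFF)[0xADDR]
FRAME_RE = re.compile(
    r"^(?P<module>\S+?)\((?P<symbol>[^+)]*)\+(?P<offset>0x[0-9a-fA-F]+)\)"
    r"\[(?P<address>0x[0-9a-fA-F]+)\]$"
)


def parse_frame(line):
    """Return a dict describing one backtrace frame, or None if the line
    does not match the assumed layout."""
    match = FRAME_RE.match(line.strip())
    if match is None:
        return None
    return {
        "module": match.group("module"),
        # An empty symbol means the frame could not be resolved to a name.
        "symbol": match.group("symbol") or None,
        "offset": int(match.group("offset"), 16),
        "address": int(match.group("address"), 16),
    }


def named_frames(lines):
    """Return the symbol names of all frames that were resolved, in order."""
    frames = (parse_frame(line) for line in lines)
    return [f["symbol"] for f in frames if f and f["symbol"]]
```

Applied to the trace above, `named_frames` keeps only the resolved frames (for example `gsignal`, `abort`, and `pthread_cond_timedwait`). The unresolved `(+0x...)` frames need debug symbols to be mapped to function names.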

Comment 1 Arie Skliarouk 2020-01-12 17:50:43 UTC
Coredump (281MB compressed) could be downloaded here:
https://t12.catchmedia.com/tmp/glusterfs71-client.core.gz

Comment 2 Mohit Agrawal 2020-01-13 04:01:53 UTC
Hi,

Can you share the "thread apply all bt full" output from the core image?
Otherwise I would have to set up the environment to access it myself.

Thanks,
Mohit Agrawal

Comment 3 Arie Skliarouk 2020-01-13 08:48:10 UTC
Created attachment 1651777
gdb output of thread apply all bt full on the core

See attached gdb output of
thread apply all bt full
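
Editor's note: one way to capture this output non-interactively is to run gdb in batch mode. This is a minimal Python sketch, assuming gdb is installed. The function names, the binary path `/usr/sbin/glusterfs`, and the core file name used in the usage note are assumptions for illustration; they are not taken from this report.

```python
import subprocess


def gdb_backtrace_command(binary, core):
    """Build the gdb argv that prints a full backtrace of every thread.

    -batch makes gdb exit after running the commands; -ex supplies one
    command to execute after the executable and core file are loaded.
    """
    return ["gdb", "-batch", "-ex", "thread apply all bt full", binary, core]


def collect_backtrace(binary, core, out_path):
    """Run gdb and write its combined stdout/stderr to out_path."""
    with open(out_path, "w") as out:
        subprocess.run(
            gdb_backtrace_command(binary, core),
            stdout=out,
            stderr=subprocess.STDOUT,
            check=True,  # raise CalledProcessError if gdb exits non-zero
        )
```

For example, `collect_backtrace("/usr/sbin/glusterfs", "glusterfs71-client.core", "bt-full.txt")` would write the trace to `bt-full.txt`. Without the matching debug symbol packages installed, the GlusterFS frames are likely to appear unresolved.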

Comment 4 Mohit Agrawal 2020-01-13 09:30:38 UTC
Kindly install the glusterfs debug version RPM and then share the stack dump.
I am not able to see the stack of glusterfs in the attached bt output.

Comment 5 Arie Skliarouk 2020-01-13 09:42:50 UTC
Created attachment 1651797
gdb output of thread apply all bt full on the core - with debug symbols

Attached
gdb output of thread apply all bt full on the core - with debug symbols

Comment 6 Mohit Agrawal 2020-01-13 09:58:46 UTC
Hi,

According to the stack dump, this does not appear to be a gluster bug: the process aborts in the futex code after printing the error message "The futex facility returned an unexpected error code.".

The issue appears similar to a gcc bug that is already fixed in the latest release:
https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=935750

Thanks,
Mohit Agrawal

Comment 7 Nag Pavan Chilakam 2020-01-21 13:42:10 UTC
I believe this is an upstream bug that was accidentally filed against downstream.
Hence I am changing it to the appropriate stream.
If I am wrong, feel free to correct it.

Comment 8 Arie Skliarouk 2020-02-06 09:11:54 UTC
I think it is imperative to fix glusterfs so that it works around the bug in earlier versions of gcc.
gcc-9 is only available in Ubuntu 19.04 and later:
https://packages.ubuntu.com/search?keywords=gcc-9

Ubuntu 18.04 is still subject to the problem. Servers (where glusterfs is usually used) tend to be upgraded slowly, so we are talking about a year, if not more, of users being unable to use glusterfs on Ubuntu.

Without a workaround (or possibly a backport of the fix to an earlier gcc version), many users will be unable to use glusterfs-client.

Comment 9 Xavi Hernandez 2020-02-11 14:16:58 UTC
*** Bug 1775927 has been marked as a duplicate of this bug. ***

Comment 10 Xavi Hernandez 2020-02-11 14:25:54 UTC
Based on the stack trace, I would say this bug is the same as bug #1785208.

It's fixed in version 7.2. Can you check whether it still crashes after you upgrade to 7.2?
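
Editor's note: a small Python sketch for checking whether a client already runs a release at or above 7.2, the version named in this comment as containing the fix. It parses strings in the style of the `package-string: glusterfs 7.1` line from the crash log. `FIXED_IN`, `parse_version`, and `has_fix` are hypothetical names, and the threshold rests only on this comment, not on an independent check of the fix.

```python
import re

# Assumption: the fix landed in 7.2, as stated in this comment.
FIXED_IN = (7, 2)


def parse_version(package_string):
    """Extract (major, minor) from text such as 'glusterfs 7.1' or
    '7.1-ubuntu1~bionic1'."""
    match = re.search(r"(\d+)\.(\d+)", package_string)
    if match is None:
        raise ValueError(f"no version found in {package_string!r}")
    return int(match.group(1)), int(match.group(2))


def has_fix(package_string, fixed_in=FIXED_IN):
    """True if the parsed version is at or above the release assumed to
    contain the fix (tuples compare element by element)."""
    return parse_version(package_string) >= fixed_in
```

With these definitions, `has_fix("glusterfs 7.1")` is false for the client that crashed, while a 7.2 or later package string passes the check.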

Comment 11 Arie Skliarouk 2020-02-12 09:07:52 UTC
Do we need to upgrade/restart the servers, or is upgrading only the clients enough? This matters to us because it is a production environment.

Comment 12 Xavi Hernandez 2020-02-12 12:46:01 UTC
The recommended upgrade procedure is described here: https://docs.gluster.org/en/latest/Upgrade-Guide/upgrade_to_7/

First you should upgrade the servers, then the clients. For minor revision updates the order is probably irrelevant, but as far as I know this has not been tested.

Comment 13 Worker Ant 2020-03-12 12:20:00 UTC
This bug is moved to https://github.com/gluster/glusterfs/issues/873, and will be tracked there from now on. Visit GitHub issues URL for further details

Comment 14 Arie Skliarouk 2020-04-19 12:25:07 UTC
Upgraded glusterfs to version 7.4 (using the PPA https://launchpad.net/~gluster). No problems for 12 days so far; from my perspective the bug is fixed.