Description of problem:
Consider a setup in which different gluster volumes are shared through Samba. Crashes are seen in racy scenarios where the same client connects to and disconnects from those different shares. A backtrace similar to the one below is observed in the core dump:

(gdb) bt
#0  0x00007f4a94e28625 in raise (sig=<value optimized out>) at ../nptl/sysdeps/unix/sysv/linux/raise.c:64
#1  0x00007f4a94e29e05 in abort () at abort.c:92
#2  0x00007f4a96793f21 in dump_core () at ../source3/lib/dumpcore.c:336
#3  0x00007f4a9677e1a0 in smb_panic_s3 (why=<value optimized out>) at ../source3/lib/util.c:808
#4  0x00007f4a97efcac1 in smb_panic (why=0x7f4a97f0bcf5 "eturned status %d\n") at ../lib/util/fault.c:159
#5  0x00007f4a97efcb82 in fault_report (sig=11) at ../lib/util/fault.c:77
#6  sig_fault (sig=11) at ../lib/util/fault.c:88
#7  <signal handler called>
#8  0x00007f4a7e3c636e in _gf_msg_internal (domain=0x7f4a7e43d3fc "logrotate", file=<value optimized out>, function=0x7f4a7e43daa0 "_gf_log", line=<value optimized out>, level=GF_LOG_ERROR, errnum=<value optimized out>, trace=0, msgid=101012, fmt=0x7f4a7e43d406 "failed to open logfile") at logging.c:1867
#9  _gf_msg (domain=0x7f4a7e43d3fc "logrotate", file=<value optimized out>, function=0x7f4a7e43daa0 "_gf_log", line=<value optimized out>, level=GF_LOG_ERROR, errnum=<value optimized out>, trace=0, msgid=101012, fmt=0x7f4a7e43d406 "failed to open logfile") at logging.c:2064
#10 0x00007f4a7e3c5e38 in _gf_log (domain=0x7f4a7e43d452 "logging-infra", file=<value optimized out>, function=0x7f4a7e43dae0 "gf_log_flush_timeout_cbk", line=1815, level=GF_LOG_DEBUG, fmt=0x7f4a7e43d9b8 "Log timer timed out. About to flush outstanding messages if present") at logging.c:2163
#11 0x00007f4a7e3c8b32 in gf_log_flush_timeout_cbk (data=0x7f4a992b5800) at logging.c:1814
#12 0x00007f4a7e3e66e3 in gf_timer_proc (ctx=0x7f4a992b5800) at timer.c:193
#13 0x00007f4a98121a51 in start_thread (arg=0x7f4a7b8a2700) at pthread_create.c:301
#14 0x00007f4a94ede93d in ?? () from /lib64/libc.so.6
#15 0x0000000000000000 in ?? ()

Version-Release number of selected component (if applicable):
Red Hat Gluster Storage Server 3.1

How reproducible:
Very hard.

Steps to Reproduce:
Will update soon with a good reproducer.

Actual results:
smbd crashed and restarted.

Expected results:
No crashes are seen.

Additional info:
Please see https://bugzilla.redhat.com/show_bug.cgi?id=1315201#c18 and https://bugzilla.redhat.com/show_bug.cgi?id=1315201#c20 for a detailed RCA of this issue.
Downstream patch: https://code.engineering.redhat.com/gerrit/#/c/70407/
Upstream patches:
http://review.gluster.org/#/c/13784/ <-- master branch
http://review.gluster.org/#/c/13803/ <-- release-3.7 branch
Transcoding/encoding tests over video file formats were performed, along with rigorous tests that ran heavy I/O while simultaneously connecting to and disconnecting from the mounted share multiple times on a Windows client. No crashes were seen during these runs.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2016:1240
*** Bug 1214174 has been marked as a duplicate of this bug. ***