Description of problem:
SMB crashed when I was running cleanup on a smallfile test.

Version-Release number of selected component (if applicable):
glusterfs-3.8.4-12.el7rhgs.x86_64

How reproducible:
Intermittent.

Steps to Reproduce:
1. Run smallfile create
2. Run smallfile cleanup

Actual results:
Occasional crashes reported.

Expected results:
Normal operation.

Additional info:
Core was generated by `/usr/sbin/smbd'.
Program terminated with signal 6, Aborted.
#0  0x00007f14e38661d7 in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:56
56        return INLINE_SYSCALL (tgkill, 3, pid, selftid, sig);
(gdb) bt
#0  0x00007f14e38661d7 in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:56
#1  0x00007f14e38678c8 in __GI_abort () at abort.c:90
#2  0x00007f14e51c6c2b in dump_core () at ../source3/lib/dumpcore.c:322
#3  0x00007f14e51b9fe7 in smb_panic_s3 (why=<optimized out>) at ../source3/lib/util.c:814
#4  0x00007f14e76ac59f in smb_panic (why=why@entry=0x7f14e7d48fa0 "reinit_after_fork() failed") at ../lib/util/fault.c:166
#5  0x00007f14e7d4837c in smbd_accept_connection (ev=0x7f14e8ea66f0, fde=<optimized out>, flags=<optimized out>, private_data=<optimized out>) at ../source3/smbd/server.c:759
#6  0x00007f14e51cf3dc in run_events_poll (ev=0x7f14e8ea66f0, pollrtn=<optimized out>, pfds=0x7f14e8ebca90, num_pfds=7) at ../source3/lib/events.c:257
#7  0x00007f14e51cf630 in s3_event_loop_once (ev=0x7f14e8ea66f0, location=<optimized out>) at ../source3/lib/events.c:326
#8  0x00007f14e3bf640d in _tevent_loop_once (ev=ev@entry=0x7f14e8ea66f0, location=location@entry=0x7f14e7d4b776 "../source3/smbd/server.c:1127") at ../tevent.c:533
#9  0x00007f14e3bf65ab in tevent_common_loop_wait (ev=0x7f14e8ea66f0, location=0x7f14e7d4b776 "../source3/smbd/server.c:1127") at ../tevent.c:637
#10 0x00007f14e7d43ad4 in smbd_parent_loop (parent=<optimized out>, ev_ctx=0x7f14e8ea66f0) at ../source3/smbd/server.c:1127
#11 main (argc=<optimized out>, argv=<optimized out>) at ../source3/smbd/server.c:1780
Hi Ben,

I think Vivek already hit this issue some time back while running regression tests and reported it as BZ 1400957 (see the bt in that bug's description). The last comment there gives the reason the bug was closed. To confirm whether this is the same case, we need the ctdb logs from the Samba server where smbd crashed.
After talking with Anoop, we think this was caused by:
https://bugzilla.redhat.com/show_bug.cgi?id=1400957#c9

I am closing this as NOTABUG.