Bug 1572075

Summary: glusterfsd crashing because of RHGS WA?
Product: [Red Hat Storage] Red Hat Gluster Storage
Reporter: Daniel Horák <dahorak>
Component: core
Assignee: hari gowtham <hgowtham>
Status: CLOSED ERRATA
QA Contact: Rajesh Madaka <rmadaka>
Severity: unspecified
Docs Contact:
Priority: unspecified
Version: rhgs-3.4
CC: amukherj, dahorak, mbukatov, nthomas, rhinduja, rhs-bugs, rmadaka, sankarshan, sheggodu, storage-qa-internal, vbellur
Target Milestone: ---
Target Release: RHGS 3.4.0
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version: glusterfs-3.12.2-10
Doc Type: No Doc Update
Doc Text:
Story Points: ---
Clone Of:
: 1575864
Environment:
Last Closed: 2018-09-04 06:47:18 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 1503137    
Attachments:
Description Flags
Full backtrace none

Comment 3 Daniel Horák 2018-04-26 10:03:23 UTC
# gdb /usr/sbin/glusterfsd /core.7102 
  GNU gdb (GDB) Red Hat Enterprise Linux 7.6.1-110.el7
  Copyright (C) 2013 Free Software Foundation, Inc.
  License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
  This is free software: you are free to change and redistribute it.
  There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
  and "show warranty" for details.
  This GDB was configured as "x86_64-redhat-linux-gnu".
  For bug reporting instructions, please see:
  <http://www.gnu.org/software/gdb/bugs/>...
  Reading symbols from /usr/sbin/glusterfsd...Reading symbols from /usr/lib/debug/usr/sbin/glusterfsd.debug...done.
  done.
  [New LWP 7107]
  [New LWP 7108]
  [New LWP 7125]
  [New LWP 7122]
  [New LWP 7112]
  [New LWP 7113]
  [New LWP 7121]
  [New LWP 7116]
  [New LWP 7111]
  [New LWP 7110]
  [New LWP 7109]
  [New LWP 7135]
  [New LWP 7134]
  [New LWP 7128]
  [New LWP 7127]
  [New LWP 7126]
  [New LWP 14296]
  [New LWP 7137]
  [New LWP 7114]
  [New LWP 9952]
  [New LWP 7115]
  [New LWP 14298]
  [New LWP 7106]
  [New LWP 7117]
  [New LWP 7120]
  [New LWP 7123]
  [New LWP 7124]
  [New LWP 7129]
  [New LWP 14196]
  [New LWP 7136]
  [New LWP 7105]
  [New LWP 7104]
  [New LWP 7103]
  [New LWP 7102]
  [New LWP 14299]
  [New LWP 14297]
  [Thread debugging using libthread_db enabled]
  Using host libthread_db library "/lib64/libthread_db.so.1".
  Core was generated by `/usr/sbin/glusterfsd -s gl1.example.com --volfile'.
  Program terminated with signal 11, Segmentation fault.
  #0  0x00007f3eb44b7580 in server_priv_to_dict (this=<optimized out>, dict=0x7f3e80017460, brickname=0x7f3e80004d50 "/mnt/brick_alpha_distrep_1/1")
      at server.c:248
  248	                        if (!strcmp (brickname,
  (gdb) t a a bt
  
  Thread 36 (Thread 0x7f3e71ffb700 (LWP 14297)):
  #0  0x00007f3ec39a04fd in nanosleep () at ../sysdeps/unix/syscall-template.S:81
  #1  0x00007f3ec39a0394 in __sleep (seconds=0, seconds@entry=30) at ../sysdeps/unix/sysv/linux/sleep.c:137
  #2  0x00007f3eb79d474d in posix_health_check_thread_proc (data=0x7f3eb0008860) at posix-helpers.c:1897
  #3  0x00007f3ec4110dd5 in start_thread (arg=0x7f3e71ffb700) at pthread_create.c:308
  #4  0x00007f3ec39d9b3d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:113
  
  Thread 35 (Thread 0x7f3e70ff9700 (LWP 14299)):
  #0  0x00007f3ec39a04fd in nanosleep () at ../sysdeps/unix/syscall-template.S:81
  #1  0x00007f3ec39a0394 in __sleep (seconds=0, seconds@entry=30) at ../sysdeps/unix/sysv/linux/sleep.c:137
  #2  0x00007f3eb79d474d in posix_health_check_thread_proc (data=0x7f3e84000d90) at posix-helpers.c:1897
  #3  0x00007f3ec4110dd5 in start_thread (arg=0x7f3e70ff9700) at pthread_create.c:308
  #4  0x00007f3ec39d9b3d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:113
  
  Thread 34 (Thread 0x7f3ec5797780 (LWP 7102)):
  #0  0x00007f3ec4111f47 in pthread_join (threadid=139907355805440, thread_return=thread_return@entry=0x0) at pthread_join.c:92
  #1  0x00007f3ec5310478 in event_dispatch_epoll (event_pool=0x562157fb6a30) at event-epoll.c:746
  #2  0x0000562156dc02a7 in main (argc=19, argv=<optimized out>) at glusterfsd.c:2550
  
  Thread 33 (Thread 0x7f3ebc8b1700 (LWP 7103)):
  #0  0x00007f3ec4117eed in nanosleep () at ../sysdeps/unix/syscall-template.S:81
  #1  0x00007f3ec52c0986 in gf_timer_proc (data=0x562157fbecf0) at timer.c:165
  #2  0x00007f3ec4110dd5 in start_thread (arg=0x7f3ebc8b1700) at pthread_create.c:308
  #3  0x00007f3ec39d9b3d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:113
  
  Thread 32 (Thread 0x7f3ebc0b0700 (LWP 7104)):
  #0  0x00007f3ec4118411 in do_sigwait (sig=0x7f3ebc0afe1c, set=<optimized out>) at ../sysdeps/unix/sysv/linux/sigwait.c:61
  #1  __sigwait (set=set@entry=0x7f3ebc0afe20, sig=sig@entry=0x7f3ebc0afe1c) at ../sysdeps/unix/sysv/linux/sigwait.c:99
  #2  0x0000562156dc358b in glusterfs_sigwaiter (arg=<optimized out>) at glusterfsd.c:2137
  #3  0x00007f3ec4110dd5 in start_thread (arg=0x7f3ebc0b0700) at pthread_create.c:308
  #4  0x00007f3ec39d9b3d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:113
  
  Thread 31 (Thread 0x7f3ebb8af700 (LWP 7105)):
  #0  0x00007f3ec39a04fd in nanosleep () at ../sysdeps/unix/syscall-template.S:81
  #1  0x00007f3ec39a0394 in __sleep (seconds=0, seconds@entry=30) at ../sysdeps/unix/sysv/linux/sleep.c:137
  #2  0x00007f3ec52db1cd in pool_sweeper (arg=<optimized out>) at mem-pool.c:481
  #3  0x00007f3ec4110dd5 in start_thread (arg=0x7f3ebb8af700) at pthread_create.c:308
  #4  0x00007f3ec39d9b3d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:113
  
  Thread 30 (Thread 0x7f3e737fe700 (LWP 7136)):
  #0  0x00007f3ec39da113 in epoll_wait () at ../sysdeps/unix/syscall-template.S:81
  #1  0x00007f3ec530fd12 in event_dispatch_epoll_worker (data=0x7f3e84076d80) at event-epoll.c:649
  #2  0x00007f3ec4110dd5 in start_thread (arg=0x7f3e737fe700) at pthread_create.c:308
  #3  0x00007f3ec39d9b3d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:113
  
  Thread 29 (Thread 0x7f3eac0f3700 (LWP 14196)):
  #0  pthread_cond_timedwait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:238
  #1  0x00007f3eb55b1d3c in iot_worker (data=0x7f3eb005f160) at io-threads.c:193
  #2  0x00007f3ec4110dd5 in start_thread (arg=0x7f3eac0f3700) at pthread_create.c:308
  #3  0x00007f3ec39d9b3d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:113
  
  Thread 28 (Thread 0x7f3e7a7fc700 (LWP 7129)):
  #0  0x00007f3ec39d0c03 in select () at ../sysdeps/unix/syscall-template.S:81
  #1  0x00007f3eb688c92a in changelog_ev_dispatch (data=0x7f3e84057338) at changelog-ev-handle.c:350
  #2  0x00007f3ec4110dd5 in start_thread (arg=0x7f3e7a7fc700) at pthread_create.c:308
  #3  0x00007f3ec39d9b3d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:113
  
  Thread 27 (Thread 0x7f3e9d2f8700 (LWP 7124)):
  #0  pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
  #1  0x00007f3eb644a303 in br_stub_signth (arg=<optimized out>) at bit-rot-stub.c:867
  #2  0x00007f3ec4110dd5 in start_thread (arg=0x7f3e9d2f8700) at pthread_create.c:308
  #3  0x00007f3ec39d9b3d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:113
  
  Thread 26 (Thread 0x7f3eb40a7700 (LWP 7123)):
  #0  pthread_cond_timedwait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:238
  #1  0x00007f3eb55b1d3c in iot_worker (data=0x7f3e84039e60) at io-threads.c:193
  #2  0x00007f3ec4110dd5 in start_thread (arg=0x7f3eb40a7700) at pthread_create.c:308
  #3  0x00007f3ec39d9b3d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:113
  
  Thread 25 (Thread 0x7f3e9eefc700 (LWP 7120)):
  #0  pthread_cond_timedwait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:238
  #1  0x00007f3eb79d0b55 in janitor_get_next_fd (this=0x7f3eb0008860) at posix-helpers.c:1419
  #2  posix_janitor_thread_proc (data=0x7f3eb0008860) at posix-helpers.c:1467
  #3  0x00007f3ec4110dd5 in start_thread (arg=0x7f3e9eefc700) at pthread_create.c:308
  #4  0x00007f3ec39d9b3d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:113
  
  Thread 24 (Thread 0x7f3eac8f4700 (LWP 7117)):
  #0  0x00007f3ec39d0c03 in select () at ../sysdeps/unix/syscall-template.S:81
  #1  0x00007f3eb688c92a in changelog_ev_dispatch (data=0x7f3eb007c638) at changelog-ev-handle.c:350
  #2  0x00007f3ec4110dd5 in start_thread (arg=0x7f3eac8f4700) at pthread_create.c:308
  #3  0x00007f3ec39d9b3d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:113
  
  Thread 23 (Thread 0x7f3ebb0ae700 (LWP 7106)):
  #0  pthread_cond_timedwait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:238
  #1  0x00007f3ec52ee018 in syncenv_task (proc=proc@entry=0x562157fbf510) at syncop.c:603
  #2  0x00007f3ec52eeee0 in syncenv_processor (thdata=0x562157fbf510) at syncop.c:695
  #3  0x00007f3ec4110dd5 in start_thread (arg=0x7f3ebb0ae700) at pthread_create.c:308
  #4  0x00007f3ec39d9b3d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:113
  
  Thread 22 (Thread 0x7f3e717fa700 (LWP 14298)):
  #0  0x00007f3ec39a04fd in nanosleep () at ../sysdeps/unix/syscall-template.S:81
  #1  0x00007f3ec39a0394 in __sleep (seconds=0, seconds@entry=5) at ../sysdeps/unix/sysv/linux/sleep.c:137
  #2  0x00007f3eb79d4f41 in posix_disk_space_check_thread_proc (data=0x7f3e84000d90) at posix-helpers.c:2083
  #3  0x00007f3ec4110dd5 in start_thread (arg=0x7f3e717fa700) at pthread_create.c:308
  #4  0x00007f3ec39d9b3d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:113
  
  Thread 21 (Thread 0x7f3ead8f6700 (LWP 7115)):
  #0  0x00007f3ec39d0c03 in select () at ../sysdeps/unix/syscall-template.S:81
  #1  0x00007f3eb688c92a in changelog_ev_dispatch (data=0x7f3eb007c638) at changelog-ev-handle.c:350
  #2  0x00007f3ec4110dd5 in start_thread (arg=0x7f3ead8f6700) at pthread_create.c:308
  #3  0x00007f3ec39d9b3d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:113
  
  Thread 20 (Thread 0x7f3eac0b2700 (LWP 9952)):
  #0  pthread_cond_timedwait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:238
  #1  0x00007f3eb55b1d3c in iot_worker (data=0x7f3e84039e60) at io-threads.c:193
  #2  0x00007f3ec4110dd5 in start_thread (arg=0x7f3eac0b2700) at pthread_create.c:308
  #3  0x00007f3ec39d9b3d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:113
  
  Thread 19 (Thread 0x7f3eae0f7700 (LWP 7114)):
  #0  pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
  #1  0x00007f3eb688c6e3 in changelog_ev_connector (data=0x7f3eb007c638) at changelog-ev-handle.c:205
  #2  0x00007f3ec4110dd5 in start_thread (arg=0x7f3eae0f7700) at pthread_create.c:308
  #3  0x00007f3ec39d9b3d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:113
  
  Thread 18 (Thread 0x7f3e72ffd700 (LWP 7137)):
  #0  pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
  #1  0x00007f3ec50735ed in rpcsvc_request_handler (arg=0x7f3eb003fdf0) at rpcsvc.c:1883
  #2  0x00007f3ec4110dd5 in start_thread (arg=0x7f3e72ffd700) at pthread_create.c:308
  #3  0x00007f3ec39d9b3d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:113
  
  Thread 17 (Thread 0x7f3e727fc700 (LWP 14296)):
  #0  0x00007f3ec39a04fd in nanosleep () at ../sysdeps/unix/syscall-template.S:81
  #1  0x00007f3ec39a0394 in __sleep (seconds=0, seconds@entry=5) at ../sysdeps/unix/sysv/linux/sleep.c:137
  #2  0x00007f3eb79d4f41 in posix_disk_space_check_thread_proc (data=0x7f3eb0008860) at posix-helpers.c:2083
  #3  0x00007f3ec4110dd5 in start_thread (arg=0x7f3e727fc700) at pthread_create.c:308
  #4  0x00007f3ec39d9b3d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:113
  
  Thread 16 (Thread 0x7f3e7bfff700 (LWP 7126)):
  #0  pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
  #1  0x00007f3eb688c6e3 in changelog_ev_connector (data=0x7f3e84057338) at changelog-ev-handle.c:205
  #2  0x00007f3ec4110dd5 in start_thread (arg=0x7f3e7bfff700) at pthread_create.c:308
  #3  0x00007f3ec39d9b3d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:113
  
  Thread 15 (Thread 0x7f3e7b7fe700 (LWP 7127)):
  #0  0x00007f3ec39d0c03 in select () at ../sysdeps/unix/syscall-template.S:81
  #1  0x00007f3eb688c92a in changelog_ev_dispatch (data=0x7f3e84057338) at changelog-ev-handle.c:350
  #2  0x00007f3ec4110dd5 in start_thread (arg=0x7f3e7b7fe700) at pthread_create.c:308
  #3  0x00007f3ec39d9b3d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:113
  
  Thread 14 (Thread 0x7f3e7affd700 (LWP 7128)):
  #0  0x00007f3ec39d0c03 in select () at ../sysdeps/unix/syscall-template.S:81
  #1  0x00007f3eb688c92a in changelog_ev_dispatch (data=0x7f3e84057338) at changelog-ev-handle.c:350
  #2  0x00007f3ec4110dd5 in start_thread (arg=0x7f3e7affd700) at pthread_create.c:308
  #3  0x00007f3ec39d9b3d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:113
  
  Thread 13 (Thread 0x7f3e78ff9700 (LWP 7134)):
  #0  pthread_cond_timedwait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:238
  #1  0x00007f3eb79d0b55 in janitor_get_next_fd (this=0x7f3e84000d90) at posix-helpers.c:1419
  #2  posix_janitor_thread_proc (data=0x7f3e84000d90) at posix-helpers.c:1467
  #3  0x00007f3ec4110dd5 in start_thread (arg=0x7f3e78ff9700) at pthread_create.c:308
  #4  0x00007f3ec39d9b3d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:113
  
  Thread 12 (Thread 0x7f3e73fff700 (LWP 7135)):
  #0  pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
  #1  0x00007f3eb79d514b in posix_fsyncer_pick (this=this@entry=0x7f3e84000d90, head=head@entry=0x7f3e73ffee80) at posix-helpers.c:2151
  #2  0x00007f3eb79d53d5 in posix_fsyncer (d=0x7f3e84000d90) at posix-helpers.c:2247
  #3  0x00007f3ec4110dd5 in start_thread (arg=0x7f3e73fff700) at pthread_create.c:308
  #4  0x00007f3ec39d9b3d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:113
  
  Thread 11 (Thread 0x7f3eaffff700 (LWP 7109)):
  #0  pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
  #1  0x00007f3ec50735ed in rpcsvc_request_handler (arg=0x7f3eb003fdf0) at rpcsvc.c:1883
  #2  0x00007f3ec4110dd5 in start_thread (arg=0x7f3eaffff700) at pthread_create.c:308
  #3  0x00007f3ec39d9b3d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:113
  
  Thread 10 (Thread 0x7f3eaeefe700 (LWP 7110)):
  #0  pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
  #1  0x00007f3eb4d6e065 in index_worker (data=<optimized out>) at index.c:217
  #2  0x00007f3ec4110dd5 in start_thread (arg=0x7f3eaeefe700) at pthread_create.c:308
  #3  0x00007f3ec39d9b3d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:113
  
  Thread 9 (Thread 0x7f3ec5635700 (LWP 7111)):
  #0  pthread_cond_timedwait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:238
  #1  0x00007f3eb55b1d3c in iot_worker (data=0x7f3eb005f160) at io-threads.c:193
  #2  0x00007f3ec4110dd5 in start_thread (arg=0x7f3ec5635700) at pthread_create.c:308
  #3  0x00007f3ec39d9b3d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:113
  
  Thread 8 (Thread 0x7f3ead0f5700 (LWP 7116)):
  #0  0x00007f3ec39d0c03 in select () at ../sysdeps/unix/syscall-template.S:81
  #1  0x00007f3eb688c92a in changelog_ev_dispatch (data=0x7f3eb007c638) at changelog-ev-handle.c:350
  #2  0x00007f3ec4110dd5 in start_thread (arg=0x7f3ead0f5700) at pthread_create.c:308
  #3  0x00007f3ec39d9b3d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:113
  
  Thread 7 (Thread 0x7f3e9e6fb700 (LWP 7121)):
  #0  pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
  #1  0x00007f3eb79d514b in posix_fsyncer_pick (this=this@entry=0x7f3eb0008860, head=head@entry=0x7f3e9e6fae80) at posix-helpers.c:2151
  #2  0x00007f3eb79d53d5 in posix_fsyncer (d=0x7f3eb0008860) at posix-helpers.c:2247
  #3  0x00007f3ec4110dd5 in start_thread (arg=0x7f3e9e6fb700) at pthread_create.c:308
  #4  0x00007f3ec39d9b3d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:113
  
  Thread 6 (Thread 0x7f3eae5fc700 (LWP 7113)):
  #0  pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
  #1  0x00007f3eb6448d2b in br_stub_worker (data=<optimized out>) at bit-rot-stub-helpers.c:375
  #2  0x00007f3ec4110dd5 in start_thread (arg=0x7f3eae5fc700) at pthread_create.c:308
  #3  0x00007f3ec39d9b3d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:113
  
  Thread 5 (Thread 0x7f3eaedfd700 (LWP 7112)):
  #0  pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
  #1  0x00007f3eb644a303 in br_stub_signth (arg=<optimized out>) at bit-rot-stub.c:867
  #2  0x00007f3ec4110dd5 in start_thread (arg=0x7f3eaedfd700) at pthread_create.c:308
  #3  0x00007f3ec39d9b3d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:113
  
  Thread 4 (Thread 0x7f3e9d3f9700 (LWP 7122)):
  #0  pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
  #1  0x00007f3eb4d6e065 in index_worker (data=<optimized out>) at index.c:217
  #2  0x00007f3ec4110dd5 in start_thread (arg=0x7f3e9d3f9700) at pthread_create.c:308
  #3  0x00007f3ec39d9b3d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:113
  
  Thread 3 (Thread 0x7f3e9caf7700 (LWP 7125)):
  #0  pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
  #1  0x00007f3eb6448d2b in br_stub_worker (data=<optimized out>) at bit-rot-stub-helpers.c:375
  #2  0x00007f3ec4110dd5 in start_thread (arg=0x7f3e9caf7700) at pthread_create.c:308
  #3  0x00007f3ec39d9b3d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:113
  
  Thread 2 (Thread 0x7f3eb83e8700 (LWP 7108)):
  #0  0x00007f3ec39da113 in epoll_wait () at ../sysdeps/unix/syscall-template.S:81
  #1  0x00007f3ec530fd12 in event_dispatch_epoll_worker (data=0x562157ffe1b0) at event-epoll.c:649
  #2  0x00007f3ec4110dd5 in start_thread (arg=0x7f3eb83e8700) at pthread_create.c:308
  #3  0x00007f3ec39d9b3d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:113
  
  Thread 1 (Thread 0x7f3eba8ad700 (LWP 7107)):
  #0  0x00007f3eb44b7580 in server_priv_to_dict (this=<optimized out>, dict=0x7f3e80017460, brickname=0x7f3e80004d50 "/mnt/brick_alpha_distrep_1/1")
      at server.c:248
  #1  0x0000562156dc4ee2 in glusterfs_handle_brick_status (req=0x7f3e640018d0) at glusterfsd-mgmt.c:1178
  #2  0x00007f3ec52ebad0 in synctask_wrap () at syncop.c:375
  #3  0x00007f3ec3922fc0 in ?? () from /lib64/libc.so.6
  #4  0x0000000000000000 in ?? ()
  (gdb)

Comment 4 Daniel Horák 2018-04-26 10:04:19 UTC
Created attachment 1427120 [details]
Full backtrace

Comment 9 hari gowtham 2018-05-04 09:16:14 UTC
Looking at the available data, the crash happened in server_priv_to_dict, where the value of xprt->xl_private is empty, which means the client is not linked to the xprt.

From the crash, we can see that it happened when "gluster get-state detail" was issued. I tried the same, but I was not able to crash it: the xprt has the xl_private value filled every time and things work fine.

I can see that the same command (get-state) was executed a number of times without crashing.

Looking further, I suspected it could be a race, so I checked the commands that were executed around the same time.
These were the gluster profile commands, gluster pool list and gluster get-state volumeoptions. They are executed in a random order for each crash.

I tried executing these commands from a number of machines, a number of times, and I still couldn't crash the brick.

To debug, I tried attaching gdb to the server; by then the xprt itself is null and it skips the whole check without crashing.
I am not sure why xprt was null even when we received the server xlator as "this" here.
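
The failure mode described above (a transport whose xl_private is still empty when the brick name is compared) can be illustrated with a minimal sketch. This is not the GlusterFS code and not the fix shipped in glusterfs-3.12.2-10, which this report does not show. The types xprt_sketch and client_sketch and the function count_clients_for_brick are hypothetical stand-ins that only model the idea of skipping a transport with no linked client before calling strcmp.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified stand-ins for the real GlusterFS structures. */
struct client_sketch {
    const char *brick_name;            /* brick this client is bound to */
};

struct xprt_sketch {
    struct client_sketch *xl_private;  /* NULL until a client is linked */
    struct xprt_sketch   *next;        /* next transport in the list    */
};

/*
 * Count the transports whose linked client is bound to `brickname`.
 * A transport with no client linked yet (xl_private == NULL) is skipped
 * instead of being dereferenced, which is the case that is assumed to
 * have led to the segmentation fault in the backtrace.
 */
int count_clients_for_brick(const struct xprt_sketch *head,
                            const char *brickname)
{
    int count = 0;

    assert(brickname != NULL);

    for (const struct xprt_sketch *x = head; x != NULL; x = x->next) {
        const struct client_sketch *c = x->xl_private;

        if (c == NULL || c->brick_name == NULL)
            continue;   /* client not linked yet: nothing to compare */

        if (strcmp(brickname, c->brick_name) == 0)
            count++;
    }

    return count;
}
```

In the real server the transport list is presumably also changed concurrently by connect and disconnect handling, so a null check alone would not remove a race; the sketch only shows the guard, not any locking.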

Comment 16 Rajesh Madaka 2018-05-23 07:22:58 UTC
Followed the steps mentioned in the description above.

After importing Gluster into Web-Admin, I didn't find any crashes of glusterd and no bricks went offline.

Verified with the version below:

glusterfs-3.12.2-11

Comment 18 errata-xmlrpc 2018-09-04 06:47:18 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2018:2607