Bug 1622554

Summary: glusterd coredumps while installing or upgrading
Product: [Red Hat Storage] Red Hat Gluster Storage
Component: glusterd
Version: rhgs-3.4
Status: CLOSED DUPLICATE
Severity: high
Priority: unspecified
Hardware: x86_64
OS: Linux
Reporter: SATHEESARAN <sasundar>
Assignee: Sanju <srakonde>
QA Contact: Bala Konda Reddy M <bmekala>
CC: rhs-bugs, sankarshan, sasundar, storage-qa-internal, vbellur
Type: Bug
Last Closed: 2018-08-28 12:16:16 UTC
Attachments:
- console log from the RHGS Server
- glusterd log file from the RHGS Server
- coredump file

Description SATHEESARAN 2018-08-27 12:55:47 UTC
Description of problem:
-----------------------
While installing glusterfs-server in RHGS 3.4.0, glusterd dumps core.
The same behaviour is also seen while upgrading from RHGS 3.3.1 to RHGS 3.4.0.

Version-Release number of selected component (if applicable):
-------------------------------------------------------------
RHGS 3.4.0 nightly (glusterfs-3.12.2-17.el7rhgs)

How reproducible:
-----------------
5/7

Steps to Reproduce:
-------------------
1. Install glusterfs-server on RHEL 7.5 Server

(or)

2. Upgrade from RHGS 3.3.1 to RHGS 3.4.0


Actual results:
---------------
While the installation is in progress, a coredump message from glusterd is observed on the console

Expected results:
-----------------
glusterd should not dump core while installing or upgrading

Comment 1 SATHEESARAN 2018-08-27 12:56:34 UTC
Here is the snip from the console log:
---------------------------------------

<snip> 

 Installing : glusterfs-server-3.12.2-17.el7rhgs.x86_64                                                                                                                           26/26 
Created symlink from /etc/systemd/system/multi-user.target.wants/glusterd.service to /usr/lib/systemd/system/glusterd.service.
/var/tmp/rpm-tmp.s0uJnO: line 26: 27062 Segmentation fault      (core dumped) glusterd --xlator-option *.upgrade=on -N
  Verifying  : userspace-rcu-0.7.9-2.el7rhgs.x86_64                                                                                                                                 1/26 
  Verifying  : libini_config-1.3.1-29.el7.x86_64         

</snip>

Comment 2 Atin Mukherjee 2018-08-27 13:13:50 UTC
Please dump the backtrace of the core.

Comment 3 SATHEESARAN 2018-08-27 13:45:33 UTC
Created attachment 1478959 [details]
console log from the RHGS Server

Comment 4 SATHEESARAN 2018-08-27 13:46:50 UTC
Created attachment 1478960 [details]
glusterd log file from the RHGS Server

Comment 5 SATHEESARAN 2018-08-27 13:48:57 UTC
Created attachment 1478961 [details]
coredump file

Comment 6 SATHEESARAN 2018-08-27 13:50:10 UTC
(In reply to Atin Mukherjee from comment #2)
> Please dump the backtrace of the core.

I have attached the core file to this bug.

Comment 7 Sanju 2018-08-28 12:16:16 UTC
Output of 't a a bt' (thread apply all backtrace):

(gdb) t a a bt
 
Thread 8 (Thread 0x7f07870c8700 (LWP 27069)):
#0  0x00007f0797009b0f in _gf_msg (domain=domain@entry=0x7f07970b2742 "epoll",
    file=file@entry=0x7f07970b2734 "event-epoll.c",
    function=function@entry=0x7f07970b2aa0 <__FUNCTION__.11140> "event_dispatch_epoll_worker",
    line=line@entry=613, level=level@entry=GF_LOG_INFO, errnum=errnum@entry=0, trace=trace@entry=0,
    msgid=msgid@entry=101190, fmt=fmt@entry=0x7f07970b27a1 "Started thread with index %d")
    at logging.c:2039
#1  0x00007f0797066332 in event_dispatch_epoll_worker (data=0x5613ad22a3b0) at event-epoll.c:612
#2  0x00007f0795e67dd5 in start_thread (arg=0x7f07870c8700) at pthread_create.c:308
#3  0x00007f0795730b3d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:113
 
Thread 7 (Thread 0x7f07878c9700 (LWP 27068)):
#0  pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
#1  0x00007f078bb5137b in hooks_worker (args=<optimized out>) at glusterd-hooks.c:529
#2  0x00007f0795e67dd5 in start_thread (arg=0x7f07878c9700) at pthread_create.c:308
#3  0x00007f0795730b3d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:113
 
Thread 6 (Thread 0x7f078c604700 (LWP 27067)):
#0  pthread_cond_timedwait@@GLIBC_2.3.2 ()
    at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:238
#1  0x00007f0797043e68 in syncenv_task (proc=proc@entry=0x5613ad2114a0) at syncop.c:603
#2  0x00007f0797044d30 in syncenv_processor (thdata=0x5613ad2114a0) at syncop.c:695
#3  0x00007f0795e67dd5 in start_thread (arg=0x7f078c604700) at pthread_create.c:308
#4  0x00007f0795730b3d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:113
 
Thread 5 (Thread 0x7f07974f0780 (LWP 27062)):
#0  0x00007f0795e68f47 in pthread_join (threadid=139670307243776,
    thread_return=thread_return@entry=0x0) at pthread_join.c:92
#1  0x00007f0797066af8 in event_dispatch_epoll (event_pool=0x5613ad209210) at event-epoll.c:746
#2  0x00005613ac91a247 in main (argc=4, argv=<optimized out>) at glusterfsd.c:2550
 
Thread 4 (Thread 0x7f078de07700 (LWP 27064)):
#0  selinux_release () at libdm-common.c:1018
#1  0x00007f078a2ad6dc in dm_lib_exit () at ioctl/libdm-iface.c:2129
#2  0x00007f07972ed18a in _dl_fini () at dl-fini.c:253
#3  0x00007f079566bb69 in __run_exit_handlers (status=status@entry=15,
    listp=0x7f07959f86c8 <__exit_funcs>, run_list_atexit=run_list_atexit@entry=true) at exit.c:77
#4  0x00007f079566bbb7 in __GI_exit (status=status@entry=15) at exit.c:99
#5  0x00005613ac91d47f in cleanup_and_exit (signum=15) at glusterfsd.c:1423
#6  0x00005613ac91d575 in glusterfs_sigwaiter (arg=<optimized out>) at glusterfsd.c:2145
#7  0x00007f0795e67dd5 in start_thread (arg=0x7f078de07700) at pthread_create.c:308
#8  0x00007f0795730b3d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:113
 
Thread 3 (Thread 0x7f078d606700 (LWP 27065)):
#0  0x00007f07956f74fd in nanosleep () at ../sysdeps/unix/syscall-template.S:81
#1  0x00007f07956f7394 in __sleep (seconds=0, seconds@entry=30)
    at ../sysdeps/unix/sysv/linux/sleep.c:137
#2  0x00007f079703120d in pool_sweeper (arg=<optimized out>) at mem-pool.c:481
#3  0x00007f0795e67dd5 in start_thread (arg=0x7f078d606700) at pthread_create.c:308
#4  0x00007f0795730b3d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:113
 
---Type <return> to continue, or q <return> to quit---
Thread 2 (Thread 0x7f078e608700 (LWP 27063)):
#0  0x00007f0795e6eeed in nanosleep () at ../sysdeps/unix/syscall-template.S:81
#1  0x00007f07970169d6 in gf_timer_proc (data=0x5613ad2108c0) at timer.c:174
#2  0x00007f0795e67dd5 in start_thread (arg=0x7f078e608700) at pthread_create.c:308
#3  0x00007f0795730b3d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:113
 
Thread 1 (Thread 0x7f078ce05700 (LWP 27066)):
#0  arena_alloc (arena=<optimized out>) at urcu-bp.c:355
#1  add_thread () at urcu-bp.c:385
#2  rcu_bp_register () at urcu-bp.c:467
#3  0x00007f078b4ff0ce in _rcu_read_lock_bp () at urcu/static/urcu-bp.h:196
#4  rcu_read_lock_bp () at urcu-bp.c:271
#5  0x00007f078bb9aefb in glusterd_get_quorum_cluster_counts (this=this@entry=0x5613ad219890,
    active_count=active_count@entry=0x7f0787ac9fd0, quorum_count=quorum_count@entry=0x7f0787ac9fd4)
    at glusterd-server-quorum.c:227
#6  0x00007f078bac0b19 in glusterd_restart_bricks (opaque=<optimized out>) at glusterd-utils.c:6345
#7  0x00007f078bad5283 in glusterd_spawn_daemons (opaque=<optimized out>) at glusterd-utils.c:3549
#8  0x00007f0797041890 in synctask_wrap () at syncop.c:375
#9  0x00007f0795679fc0 in ?? () from /lib64/libc.so.6
#10 0x0000000000000000 in ?? ()
(gdb)

Thread 4, in its cleanup-and-exit path, tore down the urcu resources while thread 1 was still entering rcu_read_lock_bp(), which led to the segmentation fault.
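
To make the race easier to see, here is a minimal, self-contained sketch of the same class of problem: process exit running library teardown while another thread is still taking an urcu-bp read-side lock. This is an illustration only, not glusterd code; the thread names and the 1 ms sleep are hypothetical, and whether it actually crashes on a given userspace-rcu build depends on what the exit-time destructors tear down.

/*
 * Sketch of the race class described above, NOT the actual glusterd code:
 * one thread calls exit() (running atexit handlers and shared-library
 * destructors, as cleanup_and_exit() does via exit(15)), while another
 * thread is still entering an urcu-bp read-side critical section.
 *
 * Build (assuming userspace-rcu is installed):
 *   gcc race-sketch.c -o race-sketch -lurcu-bp -lpthread
 */
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <pthread.h>
#include <urcu-bp.h>        /* liburcu "bulletproof" flavour: lazy thread registration */

static void *reader_thread(void *arg)
{
    (void)arg;
    for (;;) {
        /* The first rcu_read_lock() in this thread triggers
         * rcu_bp_register(), the same path seen in thread 1 of the
         * backtrace (glusterd_get_quorum_cluster_counts -> rcu_read_lock_bp). */
        rcu_read_lock();
        /* ... read RCU-protected data here ... */
        rcu_read_unlock();
    }
    return NULL;
}

static void *exiting_thread(void *arg)
{
    (void)arg;
    usleep(1000);
    /* exit() runs atexit handlers and shared-library destructors, which can
     * tear down state the still-running reader depends on (thread 4 in the
     * backtrace is inside __run_exit_handlers/_dl_fini at this point). */
    exit(15);
    return NULL;
}

int main(void)
{
    pthread_t r, e;

    pthread_create(&r, NULL, reader_thread, NULL);
    pthread_create(&e, NULL, exiting_thread, NULL);

    pthread_join(r, NULL);   /* never reached: exit() wins the race */
    pthread_join(e, NULL);
    return 0;
}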

*** This bug has been marked as a duplicate of bug 1238067 ***