Bug 810109

Summary: glustershd process crashed
Product: [Community] GlusterFS Reporter: Shwetha Panduranga <shwetha.h.panduranga>
Component: replicate Assignee: Pranith Kumar K <pkarampu>
Status: CLOSED CURRENTRELEASE QA Contact:
Severity: high Docs Contact:
Priority: unspecified    
Version: mainline CC: gluster-bugs, vbellur
Target Milestone: ---   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: glusterfs-3.4.0 Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2013-07-24 17:41:17 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 817967    
Attachments:
Description Flags
information of threads from back trace none

Description Shwetha Panduranga 2012-04-05 07:58:07 UTC
Created attachment 575311
information of threads from back trace

Description of problem:
warning: Corrupted shared library list
Core was generated by `/usr/local/sbin/glusterfs -s localhost --volfile-id gluster/glustershd -p /etc/'.
Program terminated with signal 11, Segmentation fault.
#0  0x00000032f16093e7 in do_lookup_x () from /lib64/ld-linux-x86-64.so.2
Missing separate debuginfos, use: debuginfo-install glibc-2.12-1.47.el6.x86_64
(gdb) bt full
#0  0x00000032f16093e7 in do_lookup_x () from /lib64/ld-linux-x86-64.so.2
No symbol table info available.
#1  0x00000032f1609e1a in _dl_lookup_symbol_x () from /lib64/ld-linux-x86-64.so.2
No symbol table info available.
#2  0x00000032f160df40 in _dl_fixup () from /lib64/ld-linux-x86-64.so.2
No symbol table info available.
#3  0x00000032f1614625 in _dl_runtime_resolve () from /lib64/ld-linux-x86-64.so.2
No symbol table info available.
#4  0x00007f84ac1c1ba7 in gf_print_trace (signum=11) at common-utils.c:365
        tm = 0x0
        msg = '\000' <repeats 1023 times>
        timestr = '\000' <repeats 255 times>
        utime = 0
        ret = 0
        fd = 4
#5  <signal handler called>
No symbol table info available.
#6  0x00000032f16093e7 in do_lookup_x () from /lib64/ld-linux-x86-64.so.2
No symbol table info available.
#7  0x00000032f1609e1a in _dl_lookup_symbol_x () from /lib64/ld-linux-x86-64.so.2
No symbol table info available.
#8  0x00000032f160df40 in _dl_fixup () from /lib64/ld-linux-x86-64.so.2
No symbol table info available.
#9  0x00000032f1614625 in _dl_runtime_resolve () from /lib64/ld-linux-x86-64.so.2
No symbol table info available.
#10 0x00000000004072f7 in glusterfs_pidfile_cleanup (ctx=0x1b1b010) at glusterfsd.c:1294
        cmd_args = 0x1b1b010
        __FUNCTION__ = "glusterfs_pidfile_cleanup"
#11 0x0000000000405e1c in cleanup_and_exit (signum=15) at glusterfsd.c:813
        ctx = 0x1b1b010
        trav = 0x0
        __FUNCTION__ = "cleanup_and_exit"
#12 0x0000000000407708 in glusterfs_sigwaiter (arg=0x7fff4de7a710) at glusterfsd.c:1382
        set = {__val = {18947, 0 <repeats 15 times>}}
        ret = 0
        sig = 15
#13 0x00000032f22077f1 in start_thread () from /lib64/libpthread.so.0
No symbol table info available.
#14 0x00000032f1ee570d in clone () from /lib64/libc.so.6
No symbol table info available.

Version-Release number of selected component (if applicable):
3.3.0qa33

Steps to Reproduce:
1. Create a distribute-replicate volume (2x3) and start the volume.
2. Create FUSE and NFS mounts.
3. Run gfsc1.sh from the FUSE mount.
4. Run nfsc1.sh from the NFS mount.
5. Add bricks to the volume (add-brick).
6. Start rebalance.
7. Query rebalance status.
8. Stop rebalance.
9. Bring down 2 bricks from each replica set, so that only one brick is online in each replica set.
10. Bring the bricks back online.
11. Start rebalance with force.
12. Query rebalance status.
13. Stop rebalance.

Repeat steps 9 to 13 three to four times.

14. Stop the volume (the volume could not be stopped).
15. killall glusterfs; killall glusterfsd ; killall glusterd (caused the crash)
  
Actual results:
glustershd process crashed

Additional info:
Refer to bug 810106. Both crashes happened at the same time. The attached file contains per-thread information from the backtrace.

Comment 1 Anand Avati 2012-04-13 17:09:36 UTC
CHANGE: http://review.gluster.com/3146 (libglusterfs: Syncop procs should not exceed SYNCENV_PROC_MAX) merged in master by Anand Avati (avati)
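
The change title suggests that the fix keeps the number of syncop worker procs within SYNCENV_PROC_MAX. The following is a minimal sketch of that kind of bound, not the merged patch: every name except SYNCENV_PROC_MAX (worker_slot, worker_pool, clamp_worker_count, pool_scale) is hypothetical, and the value 16 is assumed for illustration only.

```c
#include <assert.h>

/* Assumed value for illustration; check the real header for the actual limit. */
#define SYNCENV_PROC_MAX 16

/* Hypothetical stand-in for one worker entry in a fixed-size table. */
struct worker_slot {
    int in_use; /* non-zero once a worker has been assigned to this slot */
};

/* Hypothetical pool holding at most SYNCENV_PROC_MAX workers. */
struct worker_pool {
    struct worker_slot slots[SYNCENV_PROC_MAX];
    int active; /* number of slots currently in use */
};

/* Clamp a requested worker count to the range [0, SYNCENV_PROC_MAX]. */
static int clamp_worker_count(int requested)
{
    if (requested < 0)
        return 0;
    if (requested > SYNCENV_PROC_MAX)
        return SYNCENV_PROC_MAX;
    return requested;
}

/*
 * Grow the pool towards `wanted` workers without ever indexing past the
 * end of the slot table. Returns the number of slots newly marked in use.
 */
static int pool_scale(struct worker_pool *pool, int wanted)
{
    int target = clamp_worker_count(wanted);
    int started = 0;

    for (int i = 0; i < SYNCENV_PROC_MAX && pool->active < target; i++) {
        if (pool->slots[i].in_use)
            continue; /* slot already taken, try the next one */
        pool->slots[i].in_use = 1;
        pool->active++;
        started++;
    }

    /* Invariant: the active count never exceeds the table size. */
    assert(pool->active <= SYNCENV_PROC_MAX);
    return started;
}
```

In this sketch, a request for more workers than the table can hold is reduced to the table size, so a caller asking for 1000 workers on an empty pool would get SYNCENV_PROC_MAX of them and no write beyond the array.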

Comment 2 Shwetha Panduranga 2012-05-12 14:02:42 UTC
Bug is fixed. Verified on 3.3.0qa41.