Bug 826032

Summary: glusterfsd crashed while performing "kill -HUP" on glusterfsd process in a loop
Product: [Community] GlusterFS
Reporter: Shwetha Panduranga <shwetha.h.panduranga>
Component: core
Assignee: Amar Tumballi <amarts>
Status: CLOSED CURRENTRELEASE
QA Contact: Shwetha Panduranga <shwetha.h.panduranga>
Severity: high
Priority: unspecified
Version: 3.3-beta
CC: gluster-bugs, h-sugimoto, ujjwala, vraman
Target Milestone: ---   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: glusterfs-3.4.0
Doc Type: Bug Fix
Last Closed: 2013-07-24 17:46:55 UTC
Type: Bug
Attachments:
Backtrace of core (flags: none)

Description Shwetha Panduranga 2012-05-29 12:25:29 UTC
Description of problem:
-----------------------

(gdb) bt 
#0  0x000000363867a1bc in free () from /lib64/libc.so.6
#1  0x00000036386737a9 in _IO_free_backup_area_internal () from /lib64/libc.so.6
#2  0x000000363867170a in _IO_new_file_overflow () from /lib64/libc.so.6
#3  0x00000036386709bd in _IO_new_file_xsputn () from /lib64/libc.so.6
#4  0x0000003638647516 in vfprintf () from /lib64/libc.so.6
#5  0x000000363864e898 in fprintf () from /lib64/libc.so.6
#6  0x00007f323a3ba11a in _gf_log (domain=0x40f225 "glusterfsd", file=0x40f1dc "glusterfsd.c", function=0x40feca "reincarnate", line=874, level=GF_LOG_INFO, 
    fmt=0x40f6a0 "Fetching the volume file from server...") at logging.c:579
#7  0x0000000000406327 in reincarnate (signum=1) at glusterfsd.c:873
#8  0x0000000000407b81 in glusterfs_sigwaiter (arg=0x7fff9134daf0) at glusterfsd.c:1409
#9  0x0000003638a077f1 in start_thread () from /lib64/libpthread.so.0
#10 0x00000036386e570d in clone () from /lib64/libc.so.6
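
Frames #7 and #8 above show SIGHUP being handled synchronously in a dedicated signal-waiting thread rather than in an asynchronous signal handler. The following is a minimal sketch of that general pattern, not the GlusterFS source: the names sigwaiter, run_sighup_demo, state_lock, and hup_handled are hypothetical, and the placeholder fprintf only stands in for whatever work a real daemon does on SIGHUP. It does not reproduce the crash; it only illustrates how repeated SIGHUPs reach one thread through sigwait().

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <signal.h>
#include <stdio.h>

/* Hypothetical shared state, protected by state_lock. */
static pthread_mutex_t state_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t state_cond = PTHREAD_COND_INITIALIZER;
static int hup_handled = 0;

/* Dedicated thread: waits for blocked signals and handles them in
 * ordinary thread context, so stdio and locks are safe to use here. */
static void *sigwaiter(void *arg)
{
    sigset_t *set = arg;
    int sig;

    for (;;) {
        if (sigwait(set, &sig) != 0)
            continue;
        if (sig == SIGTERM)
            break;
        if (sig == SIGHUP) {
            pthread_mutex_lock(&state_lock);
            hup_handled++;
            /* Placeholder for the real reload work. */
            fprintf(stderr, "SIGHUP %d handled\n", hup_handled);
            pthread_cond_broadcast(&state_cond);
            pthread_mutex_unlock(&state_lock);
        }
    }
    return NULL;
}

/* Sends `rounds` SIGHUPs to the waiter thread, one at a time, and
 * returns how many were handled. Each signal is acknowledged before
 * the next is sent because pending standard signals are not queued. */
int run_sighup_demo(int rounds)
{
    sigset_t set;
    pthread_t tid;

    assert(rounds >= 0);
    hup_handled = 0;

    sigemptyset(&set);
    sigaddset(&set, SIGHUP);
    sigaddset(&set, SIGTERM);
    /* Block before creating the thread so it inherits the mask. */
    pthread_sigmask(SIG_BLOCK, &set, NULL);

    if (pthread_create(&tid, NULL, sigwaiter, &set) != 0)
        return -1;

    for (int i = 1; i <= rounds; i++) {
        pthread_kill(tid, SIGHUP);
        pthread_mutex_lock(&state_lock);
        while (hup_handled < i)
            pthread_cond_wait(&state_cond, &state_lock);
        pthread_mutex_unlock(&state_lock);
    }

    pthread_kill(tid, SIGTERM);
    pthread_join(tid, NULL);
    return hup_handled;
}
```

Calling run_sighup_demo(3) from a main() built with -pthread should handle three signals in order. The sketch waits for each signal to be acknowledged before sending the next; the kill loop in the steps below does not, so in that scenario signals can arrive while earlier ones are still being processed.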



Version-Release number of selected component (if applicable):
--------------------------------------------------------------
3.3.0qa43

How reproducible:
-----------------
Intermittent; the crash does not occur on every run.

Steps to Reproduce:
--------------------
1. Create a plain distribute volume.
2. Start the volume.
3. Execute "while true; do kill -HUP <glusterfsd_pid>; done"
  
Actual results:
----------------
After some time, glusterfsd crashed. 

Additional info:
-----------------
Crash found while verifying Bug 799882

Comment 1 Shwetha Panduranga 2012-05-29 12:30:09 UTC
Created attachment 587396
Backtrace of core

Comment 2 Amar Tumballi 2012-05-30 05:21:12 UTC
I suspect it is due to the creation of multiple threads in every distribute init(). This is fixed in qa44. Can you confirm?

Comment 3 Amar Tumballi 2012-05-30 08:23:31 UTC
I tested it myself. The crash appears to come from a nested lock or something similar. The client-side issue with the distribute sync env thread is not the cause here. Debugging is still in progress.

Comment 4 Amar Tumballi 2012-05-31 04:54:32 UTC
http://review.gluster.com/3493 fixes the issue. The fix is currently only upstream; it will be backported to the release-3.3 branch if needed.

Comment 5 Amar Tumballi 2012-06-01 06:53:44 UTC
The bug fix is only in upstream, not in release-3.3. Hence moving it out of ON_QA and setting it to MODIFIED (standard practice at Red Hat).