Bug 826032 - glusterfsd crashed while performing "kill -HUP" on glusterfsd process in a loop
Product: GlusterFS
Classification: Community
Component: core
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Assigned To: Amar Tumballi
Shwetha Panduranga
Reported: 2012-05-29 08:25 EDT by Shwetha Panduranga
Modified: 2013-12-18 19:08 EST
Fixed In Version: glusterfs-3.4.0
Doc Type: Bug Fix
Last Closed: 2013-07-24 13:46:55 EDT
Type: Bug

Attachments
Backtrace of core (13.91 KB, application/octet-stream)
2012-05-29 08:30 EDT, Shwetha Panduranga

Description Shwetha Panduranga 2012-05-29 08:25:29 EDT
Description of problem:

(gdb) bt 
#0  0x000000363867a1bc in free () from /lib64/libc.so.6
#1  0x00000036386737a9 in _IO_free_backup_area_internal () from /lib64/libc.so.6
#2  0x000000363867170a in _IO_new_file_overflow () from /lib64/libc.so.6
#3  0x00000036386709bd in _IO_new_file_xsputn () from /lib64/libc.so.6
#4  0x0000003638647516 in vfprintf () from /lib64/libc.so.6
#5  0x000000363864e898 in fprintf () from /lib64/libc.so.6
#6  0x00007f323a3ba11a in _gf_log (domain=0x40f225 "glusterfsd", file=0x40f1dc "glusterfsd.c", function=0x40feca "reincarnate", line=874, level=GF_LOG_INFO, 
    fmt=0x40f6a0 "Fetching the volume file from server...") at logging.c:579
#7  0x0000000000406327 in reincarnate (signum=1) at glusterfsd.c:873
#8  0x0000000000407b81 in glusterfs_sigwaiter (arg=0x7fff9134daf0) at glusterfsd.c:1409
#9  0x0000003638a077f1 in start_thread () from /lib64/libpthread.so.0
#10 0x00000036386e570d in clone () from /lib64/libc.so.6

Version-Release number of selected component (if applicable):

How reproducible:
Not consistently reproducible.

Steps to Reproduce:
1. Create a plain distribute volume.
2. Start the volume.
3. Execute "while true; do kill -HUP <glusterfsd_pid>; done".

Actual results:
After some time, glusterfsd crashed. 

Additional info:
Crash found while verifying Bug 799882
Comment 1 Shwetha Panduranga 2012-05-29 08:30:09 EDT
Created attachment 587396
Backtrace of core
Comment 2 Amar Tumballi 2012-05-30 01:21:12 EDT
I suspect this is caused by multiple threads being created in every distribute init(). That is fixed in qa44. Can you confirm?
Comment 3 Amar Tumballi 2012-05-30 04:23:31 EDT
I tested it myself. The crash appears to come from a nested lock or something similar. The client-side issue with the distribute sync env thread is not the cause here. Debugging is still in progress.
Comment 4 Amar Tumballi 2012-05-31 00:54:32 EDT
http://review.gluster.com/3493 fixes the issue. The fix is currently in upstream only and will be backported to the release-3.3 branch if needed.
Comment 5 Amar Tumballi 2012-06-01 02:53:44 EDT
The bug fix is only in upstream, not in release-3.3. Hence moving this out of ON_QA and setting it to MODIFIED (standard practice at Red Hat).
