Bug 765368 (GLUSTER-3636)

Summary: [glusterfs-3.3.0qa1]: glustershd crashed in synctask_wrap
Product: [Community] GlusterFS Reporter: Raghavendra Bhat <rabhat>
Component: replicate Assignee: Raghavendra Bhat <rabhat>
Status: CLOSED CURRENTRELEASE QA Contact:
Severity: high Docs Contact:
Priority: low    
Version: pre-release CC: gluster-bugs, saurabh
Target Milestone: ---   
Target Release: ---   
Hardware: x86_64   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: glusterfs-3.2.5qa4 Category: ---

Description Raghavendra Bhat 2011-09-27 04:39:30 UTC
glustershd crashed in synctask_wrap immediately after starting. This is the backtrace from the core that was generated.

Core was generated by `/usr/local/sbin/glusterfs -s localhost --volfile-id gluster/glustershd -p /etc/'.
Program terminated with signal 11, Segmentation fault.
#0  0x00002b5b90158562 in synctask_wrap (task=0xffffffffb4001db0) at ../../../libglusterfs/src/syncop.c:104
104             ret = task->syncfn (task->opaque);
(gdb) bt
#0  0x00002b5b90158562 in synctask_wrap (task=0xffffffffb4001db0) at ../../../libglusterfs/src/syncop.c:104
#1  0x0000003eb5e419c0 in ?? () from /lib64/libc.so.6
#2  0x0000000000000000 in ?? ()
(gdb) info thr
  4 Thread 28542  0x0000003eb5ed48a8 in epoll_wait () from /lib64/libc.so.6
  3 Thread 28543  0x0000003eb660e838 in do_sigwait () from /lib64/libpthread.so.0
  2 Thread 28545  0x0000003eb5e9a541 in nanosleep () from /lib64/libc.so.6
* 1 Thread 28544  0x00002b5b90158562 in synctask_wrap (task=0xffffffffb4001db0) at ../../../libglusterfs/src/syncop.c:104
(gdb) p *task
Cannot access memory at address 0xffffffffb4001db0
(gdb) 



23: volume glustershd
 24:     type debug/io-stats
 25:     subvolumes mirror-replicate-0
 26: end-volume

+------------------------------------------------------------------------------+
[2011-09-26 07:33:03.841984] I [rpc-clnt.c:1591:rpc_clnt_reconfig] 0-mirror-client-1: changing port to 24009 (from 0)
[2011-09-26 07:33:03.842114] I [rpc-clnt.c:1591:rpc_clnt_reconfig] 0-mirror-client-0: changing port to 24009 (from 0)
[2011-09-26 07:33:04.750868] W [socket.c:419:__socket_keepalive] 0-socket: failed to set keep idle on socket 9
[2011-09-26 07:33:04.750954] W [socket.c:1874:socket_server_event_handler] 0-socket.glusterfsd: Failed to set keep-alive: Operation not supported
[2011-09-26 07:33:04.751024] W [socket.c:419:__socket_keepalive] 0-socket: failed to set keep idle on socket 11
[2011-09-26 07:33:04.751053] W [socket.c:1874:socket_server_event_handler] 0-socket.glusterfsd: Failed to set keep-alive: Operation not supported
[2011-09-26 07:33:04.751117] W [socket.c:419:__socket_keepalive] 0-socket: failed to set keep idle on socket 12
[2011-09-26 07:33:04.751145] W [socket.c:1874:socket_server_event_handler] 0-socket.glusterfsd: Failed to set keep-alive: Operation not supported
[2011-09-26 07:33:06.754964] W [socket.c:419:__socket_keepalive] 0-socket: failed to set keep idle on socket 13
[2011-09-26 07:33:06.755058] W [socket.c:1874:socket_server_event_handler] 0-socket.glusterfsd: Failed to set keep-alive: Operation not supported
[2011-09-26 07:33:06.755110] W [socket.c:419:__socket_keepalive] 0-socket: failed to set keep idle on socket 14
[2011-09-26 07:33:06.755127] W [socket.c:1874:socket_server_event_handler] 0-socket.glusterfsd: Failed to set keep-alive: Operation not supported
[2011-09-26 07:33:06.755160] W [socket.c:419:__socket_keepalive] 0-socket: failed to set keep idle on socket 15
[2011-09-26 07:33:06.755188] W [socket.c:1874:socket_server_event_handler] 0-socket.glusterfsd: Failed to set keep-alive: Operation not supported
[2011-09-26 07:33:06.755225] W [socket.c:419:__socket_keepalive] 0-socket: failed to set keep idle on socket 16
[2011-09-26 07:33:06.755246] W [socket.c:1874:socket_server_event_handler] 0-socket.glusterfsd: Failed to set keep-alive: Operation not supported
[2011-09-26 07:33:07.764398] I [client-handshake.c:1085:select_server_supported_programs] 0-mirror-client-1: Using Program GlusterFS 3.3.0qa11, Num (1298437), Version (310)
[2011-09-26 07:33:07.764660] I [client-handshake.c:917:client_setvolume_cbk] 0-mirror-client-1: Connected to 10.1.11.74:24009, attached to remote volume '/export/mirror'.
[2011-09-26 07:33:07.764691] I [afr-common.c:3455:afr_notify] 0-mirror-replicate-0: Subvolume 'mirror-client-1' came back up; going online.
[2011-09-26 07:33:07.766134] I [client-handshake.c:1085:select_server_supported_programs] 0-mirror-client-0: Using Program GlusterFS 3.3.0qa11, Num (1298437), Version (310)
[2011-09-26 07:33:07.766385] I [client-handshake.c:917:client_setvolume_cbk] 0-mirror-client-0: Connected to 10.1.11.73:24009, attached to remote volume '/export/mirror'.
[2011-09-26 07:33:07.766407] I [afr-common.c:3459:afr_notify] 0-mirror-replicate-0: subvol 0 came up, start crawl
[2011-09-26 07:33:07.766428] I [afr-common.c:3554:afr_notify] 0-mirror-replicate-0: All subvolumes came up, start crawl
[2011-09-26 07:33:07.766453] I [afr-self-heald.c:435:afr_proactive_self_heal] 0-mirror-replicate-0: starting crawl for -1
pending frames:

patchset: git://git.gluster.com/glusterfs.git
signal received: 11
time of crash: 2011-09-26 07:33:07
configuration details:
argp 1
backtrace 1
dlfcn 1
fdatasync 1
libpthread 1
llistxattr 1
setfsid 1
spinlock 1
epoll.h 1
xattr.h 1
st_atim.tv_nsec 1
package-string: glusterfs 3.3.0qa11
/lib64/libc.so.6[0x3eb5e302d0]
/usr/local/lib/libglusterfs.so.0(synctask_wrap+0x10)[0x2b5b90158562]
/lib64/libc.so.6[0x3eb5e419c0]
---------

Comment 1 Anand Avati 2011-10-02 05:31:21 UTC
CHANGE: http://review.gluster.com/547 (Across glibc implementations, interpretation of argc/argv passed) merged in master by Vijay Bellur (vijay)

Comment 2 Vijay Bellur 2011-10-10 07:41:31 UTC
*** Bug 3619 has been marked as a duplicate of this bug. ***

Comment 3 Anand Avati 2011-10-10 08:03:58 UTC
CHANGE: http://review.gluster.com/571 (Across glibc implementations, interpretation of argc/argv passed) merged in release-3.2 by Vijay Bellur (vijay)

Comment 4 Anand Avati 2011-10-10 08:04:18 UTC
CHANGE: http://review.gluster.com/572 (Across glibc implementations, interpretation of argc/argv passed) merged in release-3.1 by Vijay Bellur (vijay)

Comment 5 Raghavendra Bhat 2011-10-31 08:38:08 UTC
It's fixed now. The crash was due to a gcc bug, because of which a wrong pointer was passed to synctask_wrap. With the fix, the crash is no longer seen. Checked with glusterfs-3.2.5qa4.