Bug 762166 (GLUSTER-434)

Summary: Crash with 3.0.0pre2 on client01 with "metarates" parallel MPI metadata benchmark
Product: [Community] GlusterFS
Reporter: Harshavardhana <fharshav>
Component: distribute
Assignee: Anand Avati <aavati>
Status: CLOSED CURRENTRELEASE
Severity: high
Priority: low
Version: mainline
CC: chrisw, cww, gluster-bugs, pavan, vijay
Target Milestone: ---
Target Release: ---
Hardware: All
OS: Linux
Doc Type: Bug Fix

Description Harshavardhana 2009-12-04 07:32:45 UTC
Backtrace Output

----------
Reading symbols from /opt/availmedia/glusterfs/3.0.0/lib/glusterfs/3.0.0pre2/xlator/mount/fuse.so...done.
Loaded symbols for /opt/availmedia/glusterfs/3.0.0/lib/glusterfs/3.0.0pre2/xlator/mount/fuse.so
Reading symbols from /opt/availmedia/glusterfs/3.0.0/lib/glusterfs/3.0.0pre2/transport/socket.so...done.
Loaded symbols for /opt/availmedia/glusterfs/3.0.0/lib/glusterfs/3.0.0pre2/transport/socket.so
Reading symbols from /lib64/libnss_files.so.2...done.
Loaded symbols for /lib64/libnss_files.so.2
Core was generated by `/opt/availmedia/glusterfs/3.0.0/sbin/glusterfs --log-level=NORMAL --volfile-ser'.
Program terminated with signal 11, Segmentation fault.
#0  0x00002ad9c6b08679 in client_start_ping (data=<value optimized out>) at client-protocol.c:445
445                     if ((conn->saved_frames->count == 0) ||
(gdb) bt
#0  0x00002ad9c6b08679 in client_start_ping (data=<value optimized out>) at client-protocol.c:445
#1  0x00002ad9c604ddbe in gf_timer_proc (ctx=0x14b89010) at timer.c:172
#2  0x0000003384c06307 in start_thread () from /lib64/libpthread.so.0
#3  0x00000033844d1ded in clone () from /lib64/libc.so.6
------------

Comment 1 Harshavardhana 2009-12-04 07:37:49 UTC
(gdb) p (client_connection_t)trans->xl_private
$9 = {lock = {__data = {__lock = 347731184, __count = 0, __owner = 0, __nusers = 0, __kind = 0, __spins = 0, 
      __list = {__prev = 0x0, __next = 0x0}}, __size = "\360\364\271\024", '\0' <repeats 35 times>, 
    __align = 347731184}, callid = 0, saved_frames = 0x2, frame_timeout = 347664160, ping_started = 0, 
  ping_timeout = 347737792, transport_activity = 0, reconnect = 0x0, connected = -32 '\340', 
  max_block_size = 46912496580208, timer = 0xa065a8c012270002, ping_timer = 0x0}
(gdb)

Comment 2 Harshavardhana 2009-12-04 07:44:43 UTC
443             pthread_mutex_lock (&conn->lock);
444             {
445                     if ((conn->saved_frames->count == 0) ||
446                         !conn->connected) {
447                             /* using goto looked ugly here,
448                              * hence getting out this way */
449                             if (conn->ping_timer)

conn->ping_timer is 0x0 (NULL)
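
For illustration, here is a minimal sketch of the kind of guard that avoids the fault at line 445. It assumes the crash comes from conn or conn->saved_frames being NULL or otherwise invalid when the timer callback runs. The types and the function name (conn_sketch, saved_frames_sketch, should_send_ping) are hypothetical, simplified stand-ins, not the real client-protocol.c definitions, and this is not the content of the actual patch.

```c
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-ins for the real GlusterFS structures. */
struct saved_frames_sketch {
        int64_t count;                          /* outstanding requests */
};

struct conn_sketch {
        pthread_mutex_t             lock;
        struct saved_frames_sketch *saved_frames;
        char                        connected;
        void                       *ping_timer;
};

/*
 * Returns 1 if a ping should be sent, 0 if the ping timer should be dropped.
 * Unlike the excerpt above, saved_frames is checked for NULL before its
 * count field is read, so a missing mapping cannot cause a segfault here.
 */
int
should_send_ping (struct conn_sketch *conn)
{
        int send = 0;

        if (conn == NULL)
                return 0;

        pthread_mutex_lock (&conn->lock);
        {
                if (conn->saved_frames != NULL &&
                    conn->saved_frames->count != 0 &&
                    conn->connected) {
                        send = 1;
                } else {
                        /* nothing in flight or not connected: drop the timer */
                        conn->ping_timer = NULL;
                }
        }
        pthread_mutex_unlock (&conn->lock);

        return send;
}
```

A NULL check only covers the case where the pointer was never set or was cleared. If the connection object has already been freed when the timer fires, the timer would also have to be cancelled before the object is destroyed.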

Comment 3 Harshavardhana 2009-12-04 10:27:21 UTC
frame : type(1) op(SETATTR)
frame : type(1) op(TRUNCATE)
frame : type(1) op(SETATTR)
frame : type(1) op(TRUNCATE)

patchset: 2.0.1-841-gaa53bb5
signal received: 11
frame : type(1) op(TRUNCATE)
time of crash: 2009-12-04 01:58:12
configuration details:
argp 1
backtrace 1
db.h 1
dlfcn 1
fdatasync 1
libpthread 1
llistxattr 1
setfsid 1
spinlock 1
epoll.h 1
xattr.h 1
st_atim.tv_nsec 1
package-string: glusterfs 3.0.0pre2
frame : type(1) op(SETATTR)
frame : type(1) op(TRUNCATE)
frame : type(1) op(SETATTR)
frame : type(1) op(TRUNCATE)
frame : type(1) op(SETATTR)
frame : type(1) op(TRUNCATE)
frame : type(1) op(SETATTR)
frame : type(1) op(TRUNCATE)
---------
frame : type(1) op(SETATTR)

Comment 4 Harshavardhana 2009-12-05 17:38:37 UTC
Patch sent for review: http://patches.gluster.com/patch/2542/

Comment 5 Anand Avati 2010-02-22 06:31:03 UTC
PATCH: http://patches.gluster.com/patch/2790 in master (protocol/client: better pointer check on saved_frames mapping in ping timer)

Comment 6 Anand Avati 2010-02-22 06:31:08 UTC
PATCH: http://patches.gluster.com/patch/2790 in release-3.0 (protocol/client: better pointer check on saved_frames mapping in ping timer)