Bug 762163 (GLUSTER-431)

Summary: segfault in timer thread :O
Product: [Community] GlusterFS
Reporter: Amar Tumballi <amarts>
Component: core
Assignee: Anand Avati <aavati>
Status: CLOSED CURRENTRELEASE
Severity: low
Priority: low
Version: mainline
CC: chrisw, fharshav, gluster-bugs, pavan, rabhat, vraman
Hardware: x86_64
OS: Linux
Doc Type: Bug Fix
Regression: RTNR

Description Harshavardhana 2009-12-03 22:42:58 UTC
A core file is present at /share/tickets/431/core.5122.

Comment 1 Amar Tumballi 2009-12-04 01:07:41 UTC
This happened when we were testing glusterfs on our local cluster.
----------

Program terminated with signal 11, Segmentation fault.
#0  client_start_ping (data=<value optimized out>) at client-protocol.c:490
490             dummy_frame->local = trans;
(gdb) bt
#0  client_start_ping (data=<value optimized out>) at client-protocol.c:490
#1  0x00002b4f2178faee in gf_timer_proc (ctx=0x19503010) at timer.c:172
#2  0x0000003384c06307 in start_thread () from /lib64/libpthread.so.0
#3  0x00000033844d1ded in clone () from /lib64/libc.so.6
(gdb) p *dummy_frame
Cannot access memory at address 0x0
(gdb) l
485
486             hdrlen = gf_hdr_len (req, 0);
487             hdr    = gf_hdr_new (req, 0);
488
489             dummy_frame = create_frame (this, this->ctx->pool);
490             dummy_frame->local = trans;
491
492             ret = protocol_client_xfer (dummy_frame, this, trans,
493                                         GF_OP_TYPE_MOP_REQUEST, GF_MOP_PING,
494                                         hdr, hdrlen, NULL, 0, NULL);
(gdb) 

------------------

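It's entirely possible that memory was exhausted and calloc returned NULL. That would leave dummy_frame NULL (gdb shows it at address 0x0) when it is dereferenced at client-protocol.c:490.

We just need more NULL checks wherever memory is allocated.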
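
A minimal sketch of the kind of check meant here. This is not the actual patch: sketch_frame, sketch_create_frame and sketch_start_ping are hypothetical stand-ins for the real GlusterFS frame type, create_frame and client_start_ping, and the simulate_oom flag exists only to exercise the failure path.

```c
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-in for the real frame type; only the field
 * touched at client-protocol.c:490 is modelled. */
struct sketch_frame {
        void *local;
};

/* Hypothetical stand-in for create_frame(). Returns NULL when the
 * allocation fails; simulate_oom forces that path for testing. */
static struct sketch_frame *
sketch_create_frame (int simulate_oom)
{
        if (simulate_oom)
                return NULL;

        return calloc (1, sizeof (struct sketch_frame));
}

/* Hypothetical stand-in for client_start_ping(). The frame is checked
 * before it is dereferenced, so an allocation failure skips this ping
 * instead of crashing the timer thread. */
static int
sketch_start_ping (void *trans, int simulate_oom)
{
        struct sketch_frame *dummy_frame = NULL;

        dummy_frame = sketch_create_frame (simulate_oom);
        if (dummy_frame == NULL) {
                fprintf (stderr, "ping: frame allocation failed\n");
                return -ENOMEM;
        }

        dummy_frame->local = trans;

        /* The real code would hand the frame to the transport here;
         * the sketch just releases it. */
        free (dummy_frame);

        return 0;
}
```

Calling sketch_start_ping (trans, 0) takes the normal path and returns 0. Calling it with simulate_oom set to 1 returns -ENOMEM without touching the NULL frame. The other allocation in the listing above (gf_hdr_new) would presumably need the same treatment.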

Comment 2 Anand Avati 2010-02-22 05:39:56 UTC
PATCH: http://patches.gluster.com/patch/2789 in master (protocol/client: add memory allocation checks)

Comment 3 Anand Avati 2010-02-22 05:40:00 UTC
PATCH: http://patches.gluster.com/patch/2789 in release-3.0 (protocol/client: add memory allocation checks)