Bug 763150 (GLUSTER-1418)

Summary: Crash in saved_frames_put
Product: [Community] GlusterFS
Reporter: Anush Shetty <anush>
Component: protocol
Assignee: Amar Tumballi <amarts>
Status: CLOSED WONTFIX
Severity: high
Priority: low
Version: mainline
CC: gluster-bugs, vraman
Hardware: All
OS: Linux
Doc Type: Bug Fix

Description Anush Shetty 2010-08-23 09:14:29 UTC
While starting the glusterfs client against 2 server processes on the latest mainline and 2 others on 3.0.5, the client crashed:

patchset: git://git.sv.gnu.org/gluster.git
signal received: 11
time of crash: 2010-08-23 14:35:21
configuration details:
argp 1
backtrace 1
dlfcn 1
fdatasync 1
libpthread 1
llistxattr 1
setfsid 1
spinlock 1
epoll.h 1
xattr.h 1
st_atim.tv_nsec 1
package-string: glusterfs 3.1.0git
/lib/libc.so.6[0x7f8a54b09530]
/usr/local/lib/glusterfs/3.1.0git/xlator/legacy/protocol/client.so(saved_frames_put+0x92)[0x7f8a52f6bc82]
/usr/local/lib/glusterfs/3.1.0git/xlator/legacy/protocol/client.so(save_frame+0x21)[0x7f8a52f559f1]
/usr/local/lib/glusterfs/3.1.0git/xlator/legacy/protocol/client.so(protocol_client_xfer+0x1a6)[0x7f8a52f55bf6]
/usr/local/lib/glusterfs/3.1.0git/xlator/legacy/protocol/client.so(protocol_client_handshake+0x3fd)[0x7f8a52f5f52d]
/usr/local/lib/glusterfs/3.1.0git/xlator/legacy/protocol/client.so(notify+0x11a)[0x7f8a52f5f66a]
/usr/local/lib/libglusterfs.so.0(xlator_notify+0x46)[0x7f8a556a3616]
/usr/local/lib/glusterfs/3.1.0git/transport/socket.so[0x7f8a51c81998]
/usr/local/lib/libglusterfs.so.0[0x7f8a556bd684]
glusterfs(main+0x237)[0x404e67]
/lib/libc.so.6(__libc_start_main+0xfd)[0x7f8a54af4abd]
glusterfs[0x403079]

backtrace:

(gdb) bt full
#0  list_add_tail (frames=0x1a20ec8, frame=0x7f8a53e532e0, op=0, type=2 '\002', callid=2) at ../../../../../libglusterfs/src/list.h:52
No locals.
#1  saved_frames_put (frames=0x1a20ec8, frame=0x7f8a53e532e0, op=0, type=2 '\002', callid=2) at saved-frames.c:95
        head_frame = 0x1a20f08
#2  0x00007f8a52f559f1 in save_frame (trans=0x1a20988, frame=0x4c7239d1, op=1207966184, type=0 '\000', callid=104) at client-protocol.c:343
        conn = 0x1a20e28
        timeout = {tv_sec = 140232089744096, tv_usec = 27397672}
#3  0x00007f8a52f55bf6 in protocol_client_xfer (frame=0x7f8a53e532e0, this=<value optimized out>, trans=0x1a20988, type=0, op=0, hdr=0x7f8a48001d38, 
    hdrlen=<value optimized out>, vector=0x0, count=0, iobref=0x0) at client-protocol.c:639
        conf = 0x1a208e8
        conn = 0x1a20e28
        ret = <value optimized out>
        rsphdr = {callid = 0, type = 0, op = 0, size = 0, {req = {pid = 0, uid = 0, gid = 0, ngrps = 0, groups = {0 <repeats 16 times>}, lk_owner = 0}, 
            rsp = {op_ret = 0, op_errno = 0}}}
#4  0x00007f8a52f5f52d in protocol_client_handshake (this=0x1a09a78, trans=0x1a20988) at client-protocol.c:6297
        hdr = 0x7f8a48001d38
        options = <value optimized out>
        ret = <value optimized out>
        dict_len = 255
        process_uuid_xl = 0x7f8a48001bf8 "pitta-22593-2010/08/23-14:35:21:781141-client2"
        __FUNCTION__ = "protocol_client_handshake"
#5  0x00007f8a52f5f66a in notify (this=0x1a09a78, event=<value optimized out>, data=0x1a20988) at client-protocol.c:6563
        handshake = 0x0
        ret = -2
        was_not_down = <value optimized out>
        conn = <value optimized out>
        conf = 0x1a208e8
        parent = <value optimized out>
        __FUNCTION__ = "notify"
#6  0x00007f8a556a3616 in xlator_notify (xl=0x1a09a78, event=5, data=0x1a20988) at xlator.c:916
        old_THIS = 0x1a08488
        ret = <value optimized out>
#7  0x00007f8a51c81998 in socket_connect_finish (fd=<value optimized out>, idx=<value optimized out>, data=0x1a20988, poll_in=0, poll_out=4, 
    poll_err=<value optimized out>) at socket.c:841
        ret = <value optimized out>
        priv = <value optimized out>
        event = 5
#8  socket_event_handler (fd=<value optimized out>, idx=<value optimized out>, data=0x1a20988, poll_in=0, poll_out=4, poll_err=<value optimized out>)
    at socket.c:865
        priv = <value optimized out>
        ret = <value optimized out>
#9  0x00007f8a556bd684 in event_dispatch_epoll_handler (event_pool=0x1a00c58) at event.c:812
        data = <value optimized out>
        idx = <value optimized out>
        event_data = 0x7f8a48000d94
        handler = 0x54
#10 event_dispatch_epoll (event_pool=0x1a00c58) at event.c:876
        events = 0x7f8a48000d78
        i = 2
        ret = 6
        __FUNCTION__ = "event_dispatch_epoll"
#11 0x0000000000404e67 in main (argc=6, argv=0x7fffb74ded58) at glusterfsd.c:1318
        ctx = 0x1a00010
        ret = 0

Comment 1 Anand Avati 2010-08-25 04:41:47 UTC
PATCH: http://patches.gluster.com/patch/4279 in master (legacy/protocol/client: fix namespace collisions.)

Comment 2 Anush Shetty 2010-08-25 08:05:08 UTC
The client still crashes with the above patch applied:

patchset: git://git.sv.gnu.org/gluster.git
signal received: 11
time of crash: 2010-08-25 16:33:29
configuration details:
argp 1
backtrace 1
dlfcn 1
fdatasync 1
libpthread 1
llistxattr 1
setfsid 1
spinlock 1
epoll.h 1
xattr.h 1
st_atim.tv_nsec 1
package-string: glusterfs 3.1.0git
/lib/libc.so.6[0x7f1b8b829530]
/usr/local/lib/glusterfs/3.1.0git/xlator/protocol/client.so(client_start_ping+0x2c)[0x7f1b89eb707c]
/usr/local/lib/glusterfs/3.1.0git/xlator/legacy/protocol/client.so(protocol_client_xfer+0x290)[0x7f1b89c75cf0]
/usr/local/lib/glusterfs/3.1.0git/xlator/legacy/protocol/client.so(protocol_client_handshake+0x3fd)[0x7f1b89c7f53d]
/usr/local/lib/glusterfs/3.1.0git/xlator/legacy/protocol/client.so(notify+0x11a)[0x7f1b89c7f67a]
/usr/local/lib/libglusterfs.so.0(xlator_notify+0x46)[0x7f1b8c3c3676]
/usr/local/lib/glusterfs/3.1.0git/transport/socket.so[0x7f1b889a1998]
/usr/local/lib/libglusterfs.so.0[0x7f1b8c3dd7e4]
glusterfs(main+0x237)[0x404e67]
/lib/libc.so.6(__libc_start_main+0xfd)[0x7f1b8b814abd]
glusterfs[0x403079]
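
One detail in the trace above is easy to miss by eye: the top frame resolves into xlator/protocol/client.so, while its caller is in xlator/legacy/protocol/client.so. The following is a minimal sketch, not part of glusterfs, that parses frame lines of the form `module(symbol+offset)[address]` and reports library names that appear under more than one path. The names `parse_frames` and `duplicated_module_names` are hypothetical helpers invented for this illustration, and the sample is abbreviated from the log above.

```python
import os
import re
from collections import defaultdict
from typing import NamedTuple, Optional

# Matches "module(symbol+0xoff)[0xaddr]" and the symbol-less "module[0xaddr]".
FRAME_RE = re.compile(
    r"^(?P<module>[^(\[\s]+)"
    r"(?:\((?P<symbol>[^+)]*)(?:\+0x[0-9a-fA-F]+)?\))?"
    r"\s*\[(?P<address>0x[0-9a-fA-F]+)\]$"
)


class Frame(NamedTuple):
    module: str
    symbol: Optional[str]
    address: int


def parse_frames(text):
    """Return a Frame for every line that looks like a backtrace frame."""
    frames = []
    for line in text.splitlines():
        m = FRAME_RE.match(line.strip())
        if m:
            frames.append(
                Frame(m.group("module"),
                      m.group("symbol") or None,
                      int(m.group("address"), 16))
            )
    return frames


def duplicated_module_names(frames):
    """Map each file name seen under more than one path to its sorted paths."""
    paths = defaultdict(set)
    for frame in frames:
        paths[os.path.basename(frame.module)].add(frame.module)
    return {name: sorted(p) for name, p in paths.items() if len(p) > 1}


# Abbreviated from the crash log in this comment.
SAMPLE = """\
/lib/libc.so.6[0x7f1b8b829530]
/usr/local/lib/glusterfs/3.1.0git/xlator/protocol/client.so(client_start_ping+0x2c)[0x7f1b89eb707c]
/usr/local/lib/glusterfs/3.1.0git/xlator/legacy/protocol/client.so(protocol_client_xfer+0x290)[0x7f1b89c75cf0]
glusterfs(main+0x237)[0x404e67]
"""

if __name__ == "__main__":
    for name, found in duplicated_module_names(parse_frames(SAMPLE)).items():
        print(name, "is loaded from:", *found, sep="\n  ")
```

For this sample the script flags client.so as coming from both the legacy and the non-legacy protocol directories. That fits the namespace-collision theme of the patch in Comment 1, but the report does not state a root cause, so treat the output as a pointer for investigation rather than a diagnosis.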

Comment 3 Amar Tumballi 2010-08-29 08:38:07 UTC
Because we are going ahead with the 'gfid' changes in 3.1 itself, 3.0 and 3.1 will not be compatible with each other. The only test that needs to be performed is that nothing crashes when the glusterfs versions on the client and the server differ.

Marking this bug as 'wontfix', as we are also going to remove 'legacy/protocol' from the build.