Bug 762556 (GLUSTER-824)

Summary: Crash in afr rename transaction
Product: [Community] GlusterFS
Reporter: Vijay Bellur <vijay>
Component: replicate
Assignee: Vikas Gorur <vikas>
Status: CLOSED CURRENTRELEASE
Severity: medium
Priority: low
Version: mainline
CC: amarts, gluster-bugs, pavan
Hardware: All   
OS: Linux   
Doc Type: Bug Fix
Regression: RTP Mount Type: ---

Description Vijay Bellur 2010-04-15 14:09:12 UTC
The following core dump was seen during an afr rename transaction:


warning: core file may not match specified executable file.
Program terminated with signal 11, Segmentation fault.
[New process 6818]
[New process 6827]
[New process 6821]
#0  0x00002b984335cf0f in afr_unlock (frame=0x2aaaac093ae0, this=0x117d7280) at afr-transaction.c:543
543     afr-transaction.c: No such file or directory.
        in afr-transaction.c
(gdb) bt
#0  0x00002b984335cf0f in afr_unlock (frame=0x2aaaac093ae0, this=0x117d7280) at afr-transaction.c:543
#1  0x00002b984335d37d in afr_lock_lower_cbk (frame=0x2aaaac093ae0, cookie=<value optimized out>, this=0x117d7280, op_ret=-1, op_errno=2)
    at afr-transaction.c:1052
#2  0x00002b984312d0d7 in client_entrylk_cbk (frame=0x11836580, hdr=0x11848f00, hdrlen=<value optimized out>, iobuf=<value optimized out>)
    at client-protocol.c:5376
#3  0x00002b98431297ca in protocol_client_pollin (this=0x117d6ec0, trans=0x117dc0e0) at client-protocol.c:6347
#4  0x00002b9843130392 in notify (this=0x3c305519c0, event=2, data=0x117dc0e0) at client-protocol.c:6390
#5  0x00002aaaaaaafd33 in socket_event_handler (fd=<value optimized out>, idx=1, data=0x117dc0e0, poll_in=1, poll_out=0, poll_err=0)
    at socket.c:814
#6  0x0000003df6c26c85 in event_dispatch_epoll (event_pool=0x117d1820) at event.c:804
#7  0x0000000000403d26 in main ()
(gdb) thread apply all bt full

Thread 3 (process 6821):
#0  0x0000003c30299761 in nanosleep () from /lib64/libc.so.6
No symbol table info available.
#1  0x0000003c302ccda4 in usleep () from /lib64/libc.so.6
No symbol table info available.
#2  0x0000003df6c1a454 in gf_timer_proc (ctx=0x117d0010) at timer.c:177
        now = 1270728347029173
        now_tv = {tv_sec = 1270728347, tv_usec = 29173}
        event = (gf_timer_t *) 0x2aaab0032bc0
        reg = (gf_timer_registry_t *) 0x117d8ed0
        __FUNCTION__ = "gf_timer_proc"
#3  0x0000003c30e06367 in start_thread () from /lib64/libpthread.so.0
No symbol table info available.
#4  0x0000003c302d30ad in clone () from /lib64/libc.so.6
No symbol table info available.

Thread 2 (process 6827):
#0  0x0000003c30e0d2cb in read () from /lib64/libpthread.so.0
No symbol table info available.
#1  0x00002b984379d5da in fuse_kern_chan_receive (chp=<value optimized out>, buf=0x2b984305a000 "0", size=135168) at fuse_kern_chan.c:28
        ch = (struct fuse_chan *) 0x117dc850
        err = 293410712
        res = 48
        se = (struct fuse_session *) 0x117dcc10
        __PRETTY_FUNCTION__ = "fuse_kern_chan_receive"
#2  0x00002b98437a0e90 in fuse_chan_receive (ch=0x117dc850, buf=0x2b984305a000 "0", size=135168) at fuse_session.c:191
        res = <value optimized out>
#3  0x00002b984357d01f in fuse_thread_proc (data=<value optimized out>) at fuse-bridge.c:2509
        mount_point = <value optimized out>
        this = (xlator_t *) 0x117d1dc0
        priv = (fuse_private_t *) 0x117dc9f0
        res = 48
        iobuf = (struct iobuf *) 0x117d1798
        chan_size = 135168
        __FUNCTION__ = "fuse_thread_proc"
#4  0x0000003c30e06367 in start_thread () from /lib64/libpthread.so.0
No symbol table info available.
#5  0x0000003c302d30ad in clone () from /lib64/libc.so.6
No symbol table info available.

Thread 1 (process 6818):
#0  0x00002b984335cf0f in afr_unlock (frame=0x2aaaac093ae0, this=0x117d7280) at afr-transaction.c:543
        flock = {l_type = 2, l_whence = 1485, l_start = 0, l_len = 0, l_pid = 0}
        call_count = 0
        local = (afr_local_t *) 0x2aaaac006fa0
        priv = (afr_private_t *) 0x117dc590
        lower = (loc_t *) 0x2aaaac008490
        higher = (loc_t *) 0x117df3f0
        lower_name = 0x0
        higher_name = 0x0
#1  0x00002b984335d37d in afr_lock_lower_cbk (frame=0x2aaaac093ae0, cookie=<value optimized out>, this=0x117d7280, op_ret=-1, op_errno=2)
    at afr-transaction.c:1052
        local = (afr_local_t *) 0x2aaaac006fa0
        child_index = 2
        lower = <value optimized out>
        higher = <value optimized out>
        higher_name = <value optimized out>
        __FUNCTION__ = "afr_lock_lower_cbk"
#2  0x00002b984312d0d7 in client_entrylk_cbk (frame=0x11836580, hdr=0x11848f00, hdrlen=<value optimized out>, iobuf=<value optimized out>)
    at client-protocol.c:5376
        fn = (ret_fn_t) 0xeeeeeeee
        _parent = (call_frame_t *) 0x3c305519c0
        op_errno = 293789104
#3  0x00002b98431297ca in protocol_client_pollin (this=0x117d6ec0, trans=0x117dc0e0) at client-protocol.c:6347
        conf = (client_conf_t *) 0x117db600
        ret = 0
        iobuf = (struct iobuf *) 0x0
        hdr = 0x11848f00 ""
        hdrlen = 32
#4  0x00002b9843130392 in notify (this=0x3c305519c0, event=2, data=0x117dc0e0) at client-protocol.c:6390
        ret = <value optimized out>
        child_down = <value optimized out>
        was_not_down = <value optimized out>
        trans = (transport_t *) 0x1182ddb0
        conn = <value optimized out>
        conf = (client_conf_t *) 0x117db600
        parent = <value optimized out>
        __FUNCTION__ = "notify"
#5  0x00002aaaaaaafd33 in socket_event_handler (fd=<value optimized out>, idx=1, data=0x117dc0e0, poll_in=1, poll_out=0, poll_err=0)
    at socket.c:814
        this = (transport_t *) 0x3c305519c0
        priv = (socket_private_t *) 0x117dc400
        ret = -286331154
#6  0x0000003df6c26c85 in event_dispatch_epoll (event_pool=0x117d1820) at event.c:804
        events = (struct epoll_event *) 0x117ddf10
        i = 2
        ret = 3
        __FUNCTION__ = "event_dispatch_epoll"
#7  0x0000000000403d26 in main ()
No symbol table info available.
(gdb)

Comment 1 Anand Avati 2010-04-20 05:51:09 UTC
PATCH: http://patches.gluster.com/patch/3147 in master (cluster/afr: Check for call_count in ENTRY_RENAME_TRANSACTION)

Comment 2 Anand Avati 2010-04-20 05:51:18 UTC
PATCH: http://patches.gluster.com/patch/3148 in release-2.0 (cluster/afr: Check for call_count in ENTRY_RENAME_TRANSACTION.)

Comment 3 Anand Avati 2010-04-20 05:51:25 UTC
PATCH: http://patches.gluster.com/patch/3146 in release-3.0 (cluster/afr: Check for call_count in ENTRY_RENAME_TRANSACTION.)
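
Note on the patch titles: the backtrace shows afr_unlock entered with call_count = 0 after the lower entry lock failed (op_ret=-1, op_errno=2, which is ENOENT on Linux). The sketch below illustrates, in general terms, what a "check for call_count" guard in an unlock path can look like. It is not the contents of the patches linked above; every type and function name in it (txn_sketch, locked_count_sketch, txn_unlock_sketch) is hypothetical and greatly simplified compared to the real afr structures.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for a transaction's lock state.
 * This is NOT the GlusterFS afr_local_t layout. */
struct txn_sketch {
    int child_count;                   /* number of replica children */
    const unsigned char *locked_nodes; /* locked_nodes[i] != 0 if child i holds a lock */
    int done_called;                   /* how many times the transaction was completed */
    int unlocks_sent;                  /* how many unlock requests were issued */
};

/* Count the children on which a lock is currently held. */
static int locked_count_sketch(const struct txn_sketch *t)
{
    int i, count = 0;

    for (i = 0; i < t->child_count; i++) {
        if (t->locked_nodes[i])
            count++;
    }
    return count;
}

/* Release held locks. If nothing is locked (call_count == 0), finish the
 * transaction immediately instead of walking per-lock state that was never
 * set up. Returns the number of unlock requests issued. */
static int txn_unlock_sketch(struct txn_sketch *t)
{
    int i, call_count;

    assert(t != NULL && t->locked_nodes != NULL);

    call_count = locked_count_sketch(t);
    if (call_count == 0) {
        /* Assumed guard: no locks were taken, so there is nothing to unlock. */
        t->done_called++;
        return 0;
    }

    for (i = 0; i < t->child_count; i++) {
        if (t->locked_nodes[i])
            t->unlocks_sent++; /* stand-in for sending an unlock to child i */
    }
    return call_count;
}
```

With no locked children, txn_unlock_sketch completes the transaction and issues no unlock requests; with some children locked, it issues one request per locked child and leaves completion to the (omitted) callbacks.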