Description of problem:

I have two servers and 16 Sandisk IF disks, 8 disks zoned to each server. The volume is set up as an 8 x 2:

[root@rhs-srv-09 ~]# gluster v info

Volume Name: testvol
Type: Distributed-Replicate
Volume ID: 6a703fe5-f294-407d-8926-2a3999bfa369
Status: Started
Number of Bricks: 8 x 2 = 16
Transport-type: tcp
Bricks:
Brick1: rhs-srv-09-priv.ceph-dev.lab.eng.rdu2.redhat.com:/brick1/gfsbrick
Brick2: rhs-srv-10-priv.ceph-dev.lab.eng.rdu2.redhat.com:/brick1/gfsbrick
Brick3: rhs-srv-09-priv.ceph-dev.lab.eng.rdu2.redhat.com:/brick2/gfsbrick
Brick4: rhs-srv-10-priv.ceph-dev.lab.eng.rdu2.redhat.com:/brick2/gfsbrick
Brick5: rhs-srv-09-priv.ceph-dev.lab.eng.rdu2.redhat.com:/brick3/gfsbrick
Brick6: rhs-srv-10-priv.ceph-dev.lab.eng.rdu2.redhat.com:/brick3/gfsbrick
Brick7: rhs-srv-09-priv.ceph-dev.lab.eng.rdu2.redhat.com:/brick4/gfsbrick
Brick8: rhs-srv-10-priv.ceph-dev.lab.eng.rdu2.redhat.com:/brick4/gfsbrick
Brick9: rhs-srv-09-priv.ceph-dev.lab.eng.rdu2.redhat.com:/brick5/gfsbrick
Brick10: rhs-srv-10-priv.ceph-dev.lab.eng.rdu2.redhat.com:/brick5/gfsbrick
Brick11: rhs-srv-09-priv.ceph-dev.lab.eng.rdu2.redhat.com:/brick6/gfsbrick
Brick12: rhs-srv-10-priv.ceph-dev.lab.eng.rdu2.redhat.com:/brick6/gfsbrick
Brick13: rhs-srv-09-priv.ceph-dev.lab.eng.rdu2.redhat.com:/brick7/gfsbrick
Brick14: rhs-srv-10-priv.ceph-dev.lab.eng.rdu2.redhat.com:/brick7/gfsbrick
Brick15: rhs-srv-09-priv.ceph-dev.lab.eng.rdu2.redhat.com:/brick8/gfsbrick
Brick16: rhs-srv-10-priv.ceph-dev.lab.eng.rdu2.redhat.com:/brick8/gfsbrick

I also have 4 clients accessing the volume and running performance tests. During the write perf tests, when I write with a record size of 64k or smaller, I see a crash on one or more clients.

Version-Release number of selected component (if applicable):
glusterfs-3.7.9-10.el7rhgs.x86_64

How reproducible:
The smaller the record size, the more reproducible it is.

Steps to Reproduce:
1. Do sequential / random writes with 4k record size.

Actual results:
Client side crash.

Expected results:
Normal operation.

Additional info:
Adding crash info below.
Aug 27 16:46:16 rhs-cli-10 gluster-mount[3458]: pending frames:
Aug 27 16:46:16 rhs-cli-10 gluster-mount[3458]: frame : type(1) op(WRITE)
Aug 27 16:46:16 rhs-cli-10 gluster-mount[3458]: frame : type(1) op(WRITE)
Aug 27 16:46:16 rhs-cli-10 gluster-mount[3458]: frame : type(1) op(WRITE)
Aug 27 16:46:16 rhs-cli-10 gluster-mount[3458]: frame : type(0) op(0)
Aug 27 16:46:16 rhs-cli-10 gluster-mount[3458]: frame : type(0) op(0)
Aug 27 16:46:16 rhs-cli-10 gluster-mount[3458]: frame : type(0) op(0)
Aug 27 16:46:16 rhs-cli-10 gluster-mount[3458]: frame : type(0) op(0)
Aug 27 16:46:16 rhs-cli-10 gluster-mount[3458]: frame : type(0) op(0)
Aug 27 16:46:16 rhs-cli-10 gluster-mount[3458]: frame : type(0) op(0)
Aug 27 16:46:16 rhs-cli-10 gluster-mount[3458]: frame : type(0) op(0)
Aug 27 16:46:16 rhs-cli-10 gluster-mount[3458]: frame : type(0) op(0)
Aug 27 16:46:16 rhs-cli-10 gluster-mount[3458]: frame : type(0) op(0)
Aug 27 16:46:16 rhs-cli-10 gluster-mount[3458]: frame : type(0) op(0)
Aug 27 16:46:16 rhs-cli-10 gluster-mount[3458]: frame : type(0) op(0)
Aug 27 16:46:16 rhs-cli-10 gluster-mount[3458]: frame : type(0) op(0)
Aug 27 16:46:16 rhs-cli-10 gluster-mount[3458]: frame : type(0) op(0)
Aug 27 16:46:16 rhs-cli-10 gluster-mount[3458]: frame : type(0) op(0)
Aug 27 16:46:16 rhs-cli-10 gluster-mount[3458]: frame : type(0) op(0)
Aug 27 16:46:16 rhs-cli-10 gluster-mount[3458]: frame : type(0) op(0)
Aug 27 16:46:16 rhs-cli-10 gluster-mount[3458]: frame : type(0) op(0)
Aug 27 16:46:16 rhs-cli-10 gluster-mount[3458]: frame : type(0) op(0)
Aug 27 16:46:16 rhs-cli-10 gluster-mount[3458]: frame : type(0) op(0)
Aug 27 16:46:16 rhs-cli-10 gluster-mount[3458]: frame : type(0) op(0)
Aug 27 16:46:16 rhs-cli-10 gluster-mount[3458]: frame : type(0) op(0)
Aug 27 16:46:16 rhs-cli-10 gluster-mount[3458]: frame : type(0) op(0)
Aug 27 16:46:16 rhs-cli-10 gluster-mount[3458]: frame : type(0) op(0)
Aug 27 16:46:16 rhs-cli-10 gluster-mount[3458]: frame : type(0) op(0)
Aug 27 16:46:16 rhs-cli-10 gluster-mount[3458]: frame : type(0) op(0)
Aug 27 16:46:16 rhs-cli-10 gluster-mount[3458]: frame : type(0) op(0)
Aug 27 16:46:16 rhs-cli-10 gluster-mount[3458]: frame : type(0) op(0)
Aug 27 16:46:16 rhs-cli-10 gluster-mount[3458]: frame : type(0) op(0)
Aug 27 16:46:16 rhs-cli-10 gluster-mount[3458]: frame : type(0) op(0)
Aug 27 16:46:16 rhs-cli-10 gluster-mount[3458]: frame : type(0) op(0)
Aug 27 16:46:16 rhs-cli-10 gluster-mount[3458]: frame : type(0) op(0)
Aug 27 16:46:16 rhs-cli-10 gluster-mount[3458]: frame : type(0) op(0)
Aug 27 16:46:16 rhs-cli-10 gluster-mount[3458]: frame : type(0) op(0)
Aug 27 16:46:16 rhs-cli-10 gluster-mount[3458]: frame : type(0) op(0)
Aug 27 16:46:16 rhs-cli-10 gluster-mount[3458]: frame : type(0) op(0)
Aug 27 16:46:16 rhs-cli-10 gluster-mount[3458]: frame : type(0) op(0)
Aug 27 16:46:16 rhs-cli-10 gluster-mount[3458]: frame : type(0) op(0)
Aug 27 16:46:16 rhs-cli-10 gluster-mount[3458]: patchset: git://git.gluster.com/glusterfs.git
Aug 27 16:46:16 rhs-cli-10 gluster-mount[3458]: signal received: 6
Aug 27 16:46:16 rhs-cli-10 gluster-mount[3458]: time of crash:
Aug 27 16:46:16 rhs-cli-10 gluster-mount[3458]: 2016-08-27 20:46:16
Aug 27 16:46:16 rhs-cli-10 gluster-mount[3458]: configuration details:
Aug 27 16:46:16 rhs-cli-10 gluster-mount[3458]: argp 1
Aug 27 16:46:16 rhs-cli-10 gluster-mount[3458]: backtrace 1
Aug 27 16:46:16 rhs-cli-10 gluster-mount[3458]: dlfcn 1
Aug 27 16:46:16 rhs-cli-10 gluster-mount[3458]: libpthread 1
Aug 27 16:46:16 rhs-cli-10 gluster-mount[3458]: llistxattr 1
Aug 27 16:46:16 rhs-cli-10 gluster-mount[3458]: setfsid 1
Aug 27 16:46:16 rhs-cli-10 gluster-mount[3458]: spinlock 1
Aug 27 16:46:16 rhs-cli-10 gluster-mount[3458]: epoll.h 1
Aug 27 16:46:16 rhs-cli-10 gluster-mount[3458]: xattr.h 1
Aug 27 16:46:16 rhs-cli-10 gluster-mount[3458]: st_atim.tv_nsec 1
Aug 27 16:46:16 rhs-cli-10 gluster-mount[3458]: package-string: glusterfs 3.7.9
Aug 27 16:46:16 rhs-cli-10 gluster-mount[3458]: ---------
(gdb) bt
#0  0x00007f06fbfd15f7 in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:56
#1  0x00007f06fbfd2ce8 in __GI_abort () at abort.c:90
#2  0x00007f06fc011317 in __libc_message (do_abort=do_abort@entry=2, fmt=fmt@entry=0x7f06fc11a988 "*** Error in `%s': %s: 0x%s ***\n") at ../sysdeps/unix/sysv/linux/libc_fatal.c:196
#3  0x00007f06fc018fe1 in malloc_printerr (ar_ptr=0x7f06d0000020, ptr=<optimized out>, str=0x7f06fc1180a1 "invalid fastbin entry (free)", action=3) at malloc.c:5013
#4  _int_free (av=0x7f06d0000020, p=<optimized out>, have_lock=0) at malloc.c:3835
#5  0x00007f06eb8ce812 in dht_local_wipe (this=0x7f06ec0204a0, local=0x7f06ea51798c) at dht-helper.c:627
#6  0x00007f06eb916566 in dht_writev_cbk (frame=0x7f06fb4009f0, cookie=<optimized out>, this=<optimized out>, op_ret=16384, op_errno=0, prebuf=0x7f06e9e90b90, postbuf=0x7f06e9e90c00, xdata=0x7f06fdbbf460) at dht-inode-write.c:111
#7  0x00007f06ebb66836 in afr_writev_unwind (frame=0x7f06fb3fbeb0, this=<optimized out>) at afr-inode-write.c:252
#8  0x00007f06ebb66b09 in afr_writev_wind_cbk (frame=0x7f06fb3f95b4, cookie=0x1, this=0x7f06ec01f6a0, op_ret=<optimized out>, op_errno=0, prebuf=0x7f06f087a940, postbuf=0x7f06f087a9b0, xdata=0x7f06fdbbdbc4) at afr-inode-write.c:377
#9  0x00007f06ebddc9b9 in client3_3_writev_cbk (req=<optimized out>, iov=<optimized out>, count=<optimized out>, myframe=0x7f06fb4047c0) at client-rpc-fops.c:912
#10 0x00007f06fd6b1990 in rpc_clnt_handle_reply (clnt=clnt@entry=0x7f06ec165600, pollin=pollin@entry=0x7f06ec3d1290) at rpc-clnt.c:764
#11 0x00007f06fd6b1c4f in rpc_clnt_notify (trans=<optimized out>, mydata=0x7f06ec165630, event=<optimized out>, data=0x7f06ec3d1290) at rpc-clnt.c:905
#12 0x00007f06fd6ad793 in rpc_transport_notify (this=this@entry=0x7f06ec175300, event=event@entry=RPC_TRANSPORT_MSG_RECEIVED, data=data@entry=0x7f06ec3d1290) at rpc-transport.c:546
#13 0x00007f06f23489a4 in socket_event_poll_in (this=this@entry=0x7f06ec175300) at socket.c:2353
#14 0x00007f06f234b5e4 in socket_event_handler (fd=fd@entry=10, idx=idx@entry=1, data=0x7f06ec175300, poll_in=1, poll_out=0, poll_err=0) at socket.c:2466
#15 0x00007f06fd951c4a in event_dispatch_epoll_handler (event=0x7f06f087ae80, event_pool=0x7f06fedaa5d0) at event-epoll.c:575
#16 event_dispatch_epoll_worker (data=0x7f06fee00840) at event-epoll.c:678
#17 0x00007f06fc74bdc5 in start_thread (arg=0x7f06f087b700) at pthread_create.c:308
#18 0x00007f06fc0921cd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:113
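For anyone trying to reproduce this without the original perf harness, the write pattern from the steps to reproduce (sequential and random writes with a 4k record size) can be approximated with a small script. This is only a sketch: the report does not say which benchmark tool was used, and the function names, file sizes, and the seed below are illustrative assumptions, not part of the original test setup.

```python
import os
import random

RECORD_SIZE = 4 * 1024  # 4k record size, as in the steps to reproduce


def sequential_write(path, total_bytes, record_size=RECORD_SIZE):
    """Write total_bytes to path in record_size chunks; return bytes written."""
    record = b"\0" * record_size
    written = 0
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        while written < total_bytes:
            # The last chunk may be shorter than a full record.
            chunk = record[: min(record_size, total_bytes - written)]
            written += os.write(fd, chunk)
        os.fsync(fd)
    finally:
        os.close(fd)
    return written


def random_write(path, file_size, count, record_size=RECORD_SIZE, seed=0):
    """Issue count record-sized writes at random aligned offsets; return count issued."""
    rng = random.Random(seed)  # fixed seed (assumption) so runs are repeatable
    record = b"\0" * record_size
    slots = file_size // record_size
    issued = 0
    fd = os.open(path, os.O_WRONLY | os.O_CREAT, 0o644)
    try:
        os.ftruncate(fd, file_size)  # make sure every offset is inside the file
        for _ in range(count):
            offset = rng.randrange(slots) * record_size
            os.pwrite(fd, record, offset)
            issued += 1
        os.fsync(fd)
    finally:
        os.close(fd)
    return issued
```

Point the path at a file on the FUSE mount of the volume (the mount point is not given in this report, so it has to be filled in locally) and run it from several clients at once to mimic the 4-client load.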
Closed as DUP. *** This bug has been marked as a duplicate of bug 1305406 ***
This was fixed in glibc-2.17-106.el7_2.6 as bug 1313308.
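To check whether a client already carries the fixed glibc build, the installed version-release string can be compared against the one named above. The sketch below is a rough heuristic that compares only the numeric runs of the two strings; it is not RPM's own version comparison algorithm, and the helper names are made up for illustration.

```python
import re

FIXED_GLIBC = "2.17-106.el7_2.6"  # build named in this report


def numeric_key(version_release):
    """Return the integer runs of a version-release string as a tuple."""
    return tuple(int(n) for n in re.findall(r"\d+", version_release))


def has_fix(installed, fixed=FIXED_GLIBC):
    """Heuristic: True if installed sorts at or after fixed by numeric runs only."""
    return numeric_key(installed) >= numeric_key(fixed)
```

The installed string can be obtained on the client with `rpm -q --queryformat '%{VERSION}-%{RELEASE}\n' glibc` and passed to `has_fix()`.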