Bug 1240284 - Disperse volume: NFS crashed
Summary: Disperse volume: NFS crashed
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: GlusterFS
Classification: Community
Component: disperse
Version: mainline
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Assignee: bugs@gluster.org
QA Contact:
URL:
Whiteboard:
Depends On:
Blocks: 1223636 1240245 1243648
 
Reported: 2015-07-06 12:49 UTC by Pranith Kumar K
Modified: 2016-06-16 13:20 UTC

Fixed In Version: glusterfs-3.8rc2
Clone Of: 1240245
: 1243648 (view as bug list)
Environment:
Last Closed: 2016-06-16 13:20:35 UTC
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Embargoed:


Attachments
nfs log file (2.72 MB, application/x-gzip)
2015-07-07 02:27 UTC, Pranith Kumar K

Comment 1 Pranith Kumar K 2015-07-06 12:50:42 UTC
Description of problem:
=======================

NFS crashed while compiling the kernel and running append operations.

NFS log:
========
[2015-07-06 07:45:23.832283] A [MSGID: 0] [mem-pool.c:120:__gf_calloc] : no memory available for size (34917844374315) [call stack follows]
/usr/lib64/libglusterfs.so.0(_gf_msg_backtrace_nomem+0xb6)[0x7f07d7bed826]
/usr/lib64/libglusterfs.so.0(_gf_msg_nomem+0x370)[0x7f07d7bee530]
/usr/lib64/libglusterfs.so.0(__gf_calloc+0x139)[0x7f07d7c24959]
/usr/lib64/libglusterfs.so.0(iobuf_get_from_stdalloc+0x6e)[0x7f07d7c2633e]
/usr/lib64/libglusterfs.so.0(iobuf_get2+0xa3)[0x7f07d7c278e3]
[2015-07-06 07:45:23.842655] E [ec-common.c:259:ec_check_complete] (--> 0-: Assertion failed: fop->resume == NULL
[2015-07-06 07:45:23.842749] E [ec-common.c:259:ec_check_complete] (--> 0-: Assertion failed: fop->resume == NULL
[2015-07-06 07:45:23.842813] W [MSGID: 112018] [nfs3.c:2454:nfs3svc_create_setattr_cbk] 0-nfs: 66e8714c: <gfid:13a56d81-6d50-447c-9efc-619fa8140be4>/linux-4.0.1/arch/ia64/include/uapi/asm/setup.h => -1 (Input/output error) [Input/output error]
[2015-07-06 07:45:23.842860] W [MSGID: 112199] [nfs3-helpers.c:3496:nfs3_log_newfh_res] 0-nfs-nfsv3: XID: 66e8714c, CREATE: NFS: 5(I/O error), POSIX: 5(Input/output error), FH: exportid e852b17e-c253-42a6-ba84-ff16b6faa2c5, gfid 3982d8b1-f4a0-4010-9866-965c55813d76, mountid d8655fd7-0000-0000-0000-000000000000
[2015-07-06 07:45:23.842770] W [MSGID: 122040] [ec-common.c:895:ec_prepare_update_cbk] 0-vol3-disperse-0: Failed to get size and version [Input/output error]
[2015-07-06 07:45:23.843047] E [ec-generic.c:1362:ec_xattrop_cbk] (--> 0-vol3-disperse-0: invalid argument: frame->local [Invalid argument]
[2015-07-06 07:45:23.843619] W [MSGID: 122056] [ec-combine.c:888:ec_combine_check] 0-vol3-disperse-0: Mismatching xdata in answers of 'INODELK'
pending frames:
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
pending frames:
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
patchset: git://git.gluster.com/glusterfs.git
signal received: 11
time of crash: 
2015-07-06 07:45:23
configuration details:
argp 1
backtrace 1
dlfcn 1
libpthread 1
llistxattr 1
setfsid 1
spinlock 1
epoll.h 1
xattr.h 1
st_atim.tv_nsec 1
package-string: glusterfs 3.7.1
/usr/lib64/libglusterfs.so.0(_gf_msg_backtrace_nomem+0xb6)[0x7f07d7bed826]
/usr/lib64/libglusterfs.so.0(gf_print_trace+0x33f)[0x7f07d7c0d3ef]
/lib64/libc.so.6(+0x3f55e326a0)[0x7f07d658c6a0]
/usr/lib64/libglusterfs.so.0(dict_destroy+0x30)[0x7f07d7be7c40]
/usr/lib64/glusterfs/3.7.1/xlator/protocol/client.so(client3_3_xattrop_cbk+0x1cd)[0x7f07ca4d193d]
/usr/lib64/libgfrpc.so.0(rpc_clnt_handle_reply+0xa5)[0x7f07d79bc445]
/usr/lib64/libgfrpc.so.0(rpc_clnt_notify+0x142)[0x7f07d79bd8f2]
/usr/lib64/libgfrpc.so.0(rpc_transport_notify+0x28)[0x7f07d79b8ad8]
/usr/lib64/glusterfs/3.7.1/rpc-transport/socket.so(+0xa255)[0x7f07cc71e255]
/usr/lib64/glusterfs/3.7.1/rpc-transport/socket.so(+0xbe4d)[0x7f07cc71fe4d]
/usr/lib64/libglusterfs.so.0(+0x89970)[0x7f07d7c51970]
/lib64/libpthread.so.0(+0x3f56207a51)[0x7f07d6cd8a51]
/lib64/libc.so.6(clone+0x6d)[0x7f07d664296d]
---------
[root@interstellar ~]# 



Backtrace:
=========
(gdb) bt
#0  dict_destroy (this=0x7f07d51b4c1c) at dict.c:564
#1  0x00007f07ca4d193d in client3_3_xattrop_cbk (req=<value optimized out>, 
    iov=<value optimized out>, count=<value optimized out>, myframe=0x7f07d57dc218)
    at client-rpc-fops.c:1859
#2  0x00007f07d79bc445 in rpc_clnt_handle_reply (clnt=0x7f07c4a643c0, 
    pollin=0x7f07b22da480) at rpc-clnt.c:766
#3  0x00007f07d79bd8f2 in rpc_clnt_notify (trans=<value optimized out>, 
    mydata=0x7f07c4a643f0, event=<value optimized out>, data=<value optimized out>)
    at rpc-clnt.c:894
#4  0x00007f07d79b8ad8 in rpc_transport_notify (this=<value optimized out>, 
    event=<value optimized out>, data=<value optimized out>) at rpc-transport.c:543
#5  0x00007f07cc71e255 in socket_event_poll_in (this=0x7f07c4a74030)
    at socket.c:2290
#6  0x00007f07cc71fe4d in socket_event_handler (fd=<value optimized out>, 
    idx=<value optimized out>, data=0x7f07c4a74030, poll_in=1, poll_out=0, 
    poll_err=0) at socket.c:2403
#7  0x00007f07d7c51970 in event_dispatch_epoll_handler (data=0x7f07c4100000)
    at event-epoll.c:575
#8  event_dispatch_epoll_worker (data=0x7f07c4100000) at event-epoll.c:678
#9  0x00007f07d6cd8a51 in start_thread () from /lib64/libpthread.so.0
#10 0x00007f07d664296d in clone () from /lib64/libc.so.6
(gdb)

Comment 2 Pranith Kumar K 2015-07-07 02:27:28 UTC
Created attachment 1049029
nfs log file

This contains the log file of the NFS mount. The entries to look at are those from 6 July.

Comment 3 Pranith Kumar K 2015-07-07 02:54:47 UTC
This is the content of the ec_t structure, as printed in gdb:
(gdb) p *$5
$6 = {xl = 0x7f07c401af90, healers = 0, heal_waiters = 0, nodes = 6, bits_for_nodes = 3, fragments = 4, redundancy = 2, fragment_size = 512, 
  stripe_size = 2048, up = 1, idx = 5, xl_up_count = 6, xl_up = 63, xl_notify_count = 6, xl_notify = 63, node_mask = 63, xl_list = 0x7f07c48f1420, 
  lock = 1, timer = 0x0, shutdown = _gf_false, background_heals = 0, heal_wait_qlen = 0, pending_fops = {next = 0x7f07b4bdf154, 
    prev = 0x7f07b4bf1618}, heal_waiting = {next = 0x7f07c48f0f80, prev = 0x7f07c48f0f80}, healing = {next = 0x7f07c48f0f90, prev = 0x7f07c48f0f90}, 
  fop_pool = 0x7f07c48f1060, cbk_pool = 0x7f07c48f11a0, lock_pool = 0x7f07c48f12e0, shd = {iamshd = _gf_false, enabled = _gf_true, timeout = 0, 
    index_healers = 0x0, full_healers = 0x0}, vol_uuid = '\000' <repeats 36 times>, leaf_to_subvolid = 0x7f07d51a5834}

Comment 4 Anand Avati 2015-07-07 09:15:13 UTC
REVIEW: http://review.gluster.org/11558 (cluster/ec: Fix use after free bug) posted (#1) for review on master by Pranith Kumar Karampuri (pkarampu)

Comment 5 Anand Avati 2015-07-07 11:41:57 UTC
REVIEW: http://review.gluster.org/11558 (cluster/ec: Fix use after free bug) posted (#2) for review on master by Pranith Kumar Karampuri (pkarampu)

Comment 6 Nagaprasad Sathyanarayana 2015-10-25 14:51:47 UTC
A fix for this BZ is already present in a GlusterFS release. The clone of this BZ has been fixed in a GlusterFS release and closed, so this mainline BZ is being closed as well.

Comment 7 Niels de Vos 2016-06-16 13:20:35 UTC
This bug is being closed because a release that should address the reported issue is now available. If the problem is still not fixed in glusterfs-3.8.0, please open a new bug report.

glusterfs-3.8.0 has been announced on the Gluster mailing lists [1]. Packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution.

[1] http://blog.gluster.org/2016/06/glusterfs-3-8-released/
[2] http://thread.gmane.org/gmane.comp.file-systems.gluster.user

