GlusterFS client crashed when run with error-gen(ENOMEM). It was on 3.0.1rc2. This is the backtrace of the core. It was observed on my local machine.

GNU gdb 6.8
Copyright (C) 2008 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-slackware-linux"...
warning: Can't read pathname for load map: Input/output error.
Reading symbols from /opt/glusterfs/3.0.1rc2/lib/libglusterfs.so.0...done.
Loaded symbols for /opt/glusterfs/3.0.1rc2/lib/libglusterfs.so.0
Reading symbols from /lib64/libdl.so.2...done.
Loaded symbols for /lib64/libdl.so.2
Reading symbols from /lib64/libpthread.so.0...done.
Loaded symbols for /lib64/libpthread.so.0
Reading symbols from /lib64/libc.so.6...done.
Loaded symbols for /lib64/libc.so.6
Reading symbols from /lib64/ld-linux-x86-64.so.2...done.
Loaded symbols for /lib64/ld-linux-x86-64.so.2
Reading symbols from /opt/glusterfs/3.0.1rc2/lib/glusterfs/3.0.1rc2/xlator/protocol/client.so...done.
Loaded symbols for /opt/glusterfs/3.0.1rc2/lib/glusterfs/3.0.1rc2/xlator/protocol/client.so
Reading symbols from /opt/glusterfs/3.0.1rc2/lib/glusterfs/3.0.1rc2/xlator/cluster/stripe.so...done.
Loaded symbols for /opt/glusterfs/3.0.1rc2/lib/glusterfs/3.0.1rc2/xlator/cluster/stripe.so
Reading symbols from /opt/glusterfs/3.0.1rc2/lib/glusterfs/3.0.1rc2/xlator/mount/fuse.so...done.
Loaded symbols for /opt/glusterfs/3.0.1rc2/lib/glusterfs/3.0.1rc2/xlator/mount/fuse.so
Reading symbols from /opt/glusterfs/3.0.1rc2/lib/glusterfs/3.0.1rc2/transport/socket.so...done.
Loaded symbols for /opt/glusterfs/3.0.1rc2/lib/glusterfs/3.0.1rc2/transport/socket.so
Reading symbols from /lib64/libnss_files.so.2...done.
Loaded symbols for /lib64/libnss_files.so.2
Reading symbols from /usr/lib64/libgcc_s.so.1...done.
Loaded symbols for /usr/lib64/libgcc_s.so.1
Core was generated by `/opt/glusterfs/3.0.1rc2/sbin/glusterfs -f str_clnt.vol /d/glusterfs/mnt/client1'.
Program terminated with signal 11, Segmentation fault.
[New process 4122]
[New process 4123]
[New process 4128]
#0  0x00007f930a46bca7 in inode_link (inode=0x0, parent=0x612a70, name=0x6190b1 "run13347", stbuf=0x619168) at inode.c:723
723     table = inode->table;
(gdb) bt
#0  0x00007f930a46bca7 in inode_link (inode=0x0, parent=0x612a70, name=0x6190b1 "run13347", stbuf=0x619168) at inode.c:723
#1  0x00007f9308e59fa9 in fuse_entry_cbk (frame=0x619e48, cookie=0x619a60, this=0x60b5a0, op_ret=0, op_errno=12, inode=0x0, buf=0x619168) at fuse-bridge.c:480
#2  0x00007f9308e5a70d in fuse_lookup_cbk (frame=0x619e48, cookie=0x619a60, this=0x60b5a0, op_ret=0, op_errno=12, inode=0x0, stat=0x619168, dict=0x0, postparent=0x6193a8) at fuse-bridge.c:565
#3  0x00007f930907426b in stripe_lookup_cbk (frame=0x619a60, cookie=0x618570, this=0x6115d0, op_ret=-1, op_errno=12, inode=0x618670, buf=0x7fff2df5bc50, dict=0x0, postparent=0x7fff2df5bbc0) at stripe.c:675
#4  0x00007f93092a4bbb in client_lookup_cbk (frame=0x618570, hdr=0x619ac0, hdrlen=272, iobuf=0x0) at client-protocol.c:4954
#5  0x00007f93092a9402 in protocol_client_interpret (this=0x60f860, trans=0x615d60, hdr_p=0x619ac0 "", hdrlen=272, iobuf=0x0) at client-protocol.c:6508
#6  0x00007f93092aa0c8 in protocol_client_pollin (this=0x60f860, trans=0x615d60) at client-protocol.c:6806
#7  0x00007f93092aa73c in notify (this=0x60f860, event=2, data=0x615d60) at client-protocol.c:6925
#8  0x00007f930a45b1f1 in xlator_notify (xl=0x60f860, event=2, data=0x615d60) at xlator.c:923
#9  0x00007f930844cfb3 in socket_event_poll_in (this=0x615d60) at socket.c:729
#10 0x00007f930844d2ad in socket_event_handler (fd=9, idx=1, data=0x615d60, poll_in=1, poll_out=0, poll_err=0) at socket.c:829
#11 0x00007f930a47fd0b in event_dispatch_epoll_handler (event_pool=0x60a350, events=0x617550, i=0) at event.c:804
#12 0x00007f930a47feda in event_dispatch_epoll (event_pool=0x60a350) at event.c:867
#13 0x00007f930a4801eb in event_dispatch (event_pool=0x60a350) at event.c:975
#14 0x000000000040626f in main (argc=6, argv=0x7fff2df5c928) at glusterfsd.c:1388
(gdb) p inode
$1 = (inode_t *) 0x0
(gdb) f 4
#4  0x00007f93092a4bbb in client_lookup_cbk (frame=0x618570, hdr=0x619ac0, hdrlen=272, iobuf=0x0) at client-protocol.c:4954
4954    STACK_UNWIND (frame, op_ret, op_errno, inode, &stbuf, xattr,
(gdb) f 3
#3  0x00007f930907426b in stripe_lookup_cbk (frame=0x619a60, cookie=0x618570, this=0x6115d0, op_ret=-1, op_errno=12, inode=0x618670, buf=0x7fff2df5bc50, dict=0x0, postparent=0x7fff2df5bbc0) at stripe.c:675
675     STACK_UNWIND (frame, local->op_ret, local->op_errno,
(gdb) l
670     local->postparent.st_size = local->postparent_size;
671     }
672
673     loc_wipe (&local->loc);
674
675     STACK_UNWIND (frame, local->op_ret, local->op_errno,
676     local->inode, &local->stbuf, local->dict,
677     &local->postparent);
678
679     if (tmp_inode)
(gdb) p local->inode
$2 = (inode_t *) 0x0
(gdb) q
http://patches.gluster.com/patch/2767/ fixes this and has been accepted to the repo.
The bug is still there in 3.0.3rc1. It can be reproduced by just running the system_light script; it fails in the LTP tests. This is the backtrace of the core generated.

Reading symbols from /usr/lib64/libgcc_s.so.1...done.
Loaded symbols for /usr/lib64/libgcc_s.so.1
Core was generated by `/opt/glusterfs/3.0.3rc1/sbin/glusterfs -f str_clnt.vol /mnt/hd/ -l /tmp/str_cln'.
Program terminated with signal 11, Segmentation fault.
[New process 7982]
[New process 7983]
[New process 7988]
#0  0x00007fa3a1f05ef7 in inode_link (inode=0x0, parent=0x612a10, name=0x618f61 "run7999", stbuf=0x619388) at inode.c:734
734     table = inode->table;
(gdb) bt
#0  0x00007fa3a1f05ef7 in inode_link (inode=0x0, parent=0x612a10, name=0x618f61 "run7999", stbuf=0x619388) at inode.c:734
#1  0x00007fa3a08f2fa9 in fuse_entry_cbk (frame=0x6191e8, cookie=0x619250, this=0x60b550, op_ret=0, op_errno=17, inode=0x0, buf=0x619388) at fuse-bridge.c:480
#2  0x00007fa3a08f34c2 in fuse_newentry_cbk (frame=0x6191e8, cookie=0x619250, this=0x60b550, op_ret=0, op_errno=17, inode=0x0, buf=0x619388, preparent=0x619538, postparent=0x6195c8) at fuse-bridge.c:537
#3  0x00007fa3a0b0c455 in stripe_stack_unwind_inode_cbk (frame=0x619250, cookie=0x6198f0, this=0x611580, op_ret=-1, op_errno=17, inode=0x618da0, buf=0x7fff401d2a60, preparent=0x7fff401d29d0, postparent=0x7fff401d2940) at stripe.c:507
#4  0x00007fa3a0d3ddc9 in client_mkdir_cbk (frame=0x6198f0, hdr=0x619c40, hdrlen=348, iobuf=0x0) at client-protocol.c:4706
#5  0x00007fa3a0d43447 in protocol_client_interpret (this=0x610a60, trans=0x6144f0, hdr_p=0x619c40 "", hdrlen=348, iobuf=0x0) at client-protocol.c:6529
#6  0x00007fa3a0d4410d in protocol_client_pollin (this=0x610a60, trans=0x6144f0) at client-protocol.c:6827
#7  0x00007fa3a0d44781 in notify (this=0x610a60, event=2, data=0x6144f0) at client-protocol.c:6946
#8  0x00007fa3a1ef524c in xlator_notify (xl=0x610a60, event=2, data=0x6144f0) at xlator.c:924
#9  0x00007fa39fee608f in socket_event_poll_in (this=0x6144f0) at socket.c:731
#10 0x00007fa39fee6389 in socket_event_handler (fd=13, idx=5, data=0x6144f0, poll_in=1, poll_out=0, poll_err=0) at socket.c:831
#11 0x00007fa3a1f19f5b in event_dispatch_epoll_handler (event_pool=0x60a320, events=0x6174e0, i=0) at event.c:804
#12 0x00007fa3a1f1a12a in event_dispatch_epoll (event_pool=0x60a320) at event.c:867
#13 0x00007fa3a1f1a43b in event_dispatch (event_pool=0x60a320) at event.c:975
#14 0x000000000040631f in main (argc=6, argv=0x7fff401d3728) at glusterfsd.c:1413
(gdb)

In fuse_entry_cbk and fuse_newentry_cbk, op_ret is 0 even though op_errno is set and inode is NULL. Hence reopening.
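The inconsistency in the backtraces (op_ret=0 arriving together with a nonzero op_errno and inode=0x0) can be illustrated with a small defensive check. This is only a sketch under stated assumptions, not the code from the accepted patches: `struct demo_inode` and `demo_check_entry_reply` are hypothetical names that do not exist in GlusterFS, and the fallback errno values (EIO, EINVAL) are my own choice.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for inode_t; not the real GlusterFS definition. */
struct demo_inode {
        int table_id;
};

/*
 * Hypothetical helper: decide whether a lookup/mkdir style reply is safe
 * to pass on to a function that dereferences the inode.
 * Returns 0 when the reply is usable, or a positive errno value otherwise.
 */
static int
demo_check_entry_reply (int op_ret, int op_errno,
                        const struct demo_inode *inode)
{
        /* An explicit failure: report the errno we were given, if any. */
        if (op_ret < 0)
                return op_errno ? op_errno : EIO;

        /* "Success" without an inode is the case seen in the cores above. */
        if (inode == NULL)
                return op_errno ? op_errno : EINVAL;

        return 0;
}
```

With the values from the first core, `demo_check_entry_reply (0, ENOMEM, NULL)` returns ENOMEM instead of letting the NULL inode reach a dereference like the one at inode.c:723. A check of this kind only masks the symptom; the caller that produced op_ret=0 still needs to be corrected.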
Recent patches sent regarding cleaning local->failed should fix this issue.
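For readers unfamiliar with the pattern, here is a rough sketch of how a translator that fans a call out to several children might track a failure flag so that the final op_ret stays consistent with op_errno and the inode. It is an illustration only and is not taken from the patches mentioned: `struct demo_local`, `demo_local_init` and `demo_collect_reply` are hypothetical, the field name `failed` merely echoes the comment above, and the sketch assumes that a failure from any child fails the whole call.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical per-call state; field names are illustrative only. */
struct demo_local {
        int   call_count;   /* replies still outstanding */
        int   failed;       /* set when any child reply failed */
        int   op_ret;
        int   op_errno;
        void *inode;        /* taken only from a successful reply */
};

/* Reset every field so nothing stale leaks into the final result. */
static void
demo_local_init (struct demo_local *local, int children)
{
        local->call_count = children;
        local->failed     = 0;
        local->op_ret     = -1;
        local->op_errno   = 0;
        local->inode      = NULL;
}

/*
 * Record one child reply. Returns 1 once the last reply has arrived and
 * op_ret/op_errno/inode hold the final, mutually consistent result.
 */
static int
demo_collect_reply (struct demo_local *local, int op_ret, int op_errno,
                    void *inode)
{
        if (op_ret < 0) {
                local->failed   = 1;
                local->op_errno = op_errno;
        } else if (local->inode == NULL) {
                local->inode = inode;
        }

        if (--local->call_count > 0)
                return 0;

        if (local->failed || local->inode == NULL) {
                /* Never report success together with a NULL inode. */
                local->op_ret = -1;
                if (local->op_errno == 0)
                        local->op_errno = EIO;
                local->inode = NULL;
        } else {
                local->op_ret = 0;
        }
        return 1;
}
```

In this sketch, one successful reply followed by one ENOMEM reply ends with op_ret=-1, op_errno=ENOMEM and a NULL inode, so the parent callback sees a plain failure rather than the op_ret=0 / inode=0x0 combination in the cores.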
The patchset below should fix this bug.
http://git.gluster.com/?p=glusterfs.git;a=commit;h=a35b3f0c302d920bcb4c282677b14e2eba789ec9
http://git.gluster.com/?p=glusterfs.git;a=commit;h=536e5a2208d162801367f8a4189a29ca7fd8f1a9