Description of problem:
Geo-replication does not sync any files, even though the status says OK. A getfattr on the auxiliary mount (client-pid = -1) hangs, while the same getfattr on a normal mount works fine. The gsyncd process also goes into the 'D' (uninterruptible sleep) state.

Version-Release number of selected component (if applicable): master [c950d3f0e104fc7b78e493ad7ca0005a600b00f9]

How reproducible: Consistently

Steps to Reproduce:
1. Start a geo-rep session between the master and the slave.
2. Put some data on the master mount point.
3. Check the geo-rep status and whether the data has been synced to the slave.

Actual results: Data is not synced to the slave.

Expected results: Data should sync.

Additional info:
I am unable to reproduce this on my setup; syncing works perfectly fine and there are no hangs in {get,list}xattr. However, I did witness the issue when it was originally hit: a quick check of the kernel stack trace for the master process showed it stuck in sys_listxattr() for a long time, and only for the client-pid = -1 mount; regular mounts functioned fine. [This was reproduced in a VM setup, which I no longer have access to.]
Closing this as it was probably a setup issue. I followed up with the reporter of this bug and learned that geo-replication now works perfectly fine in that setup.
This issue came up again in one of the setups. This time it was possible to get a backtrace of the client process:

(gdb) bt
#0  0x00000033b94c1dc5 in internal_fnmatch (pattern=<optimized out>, string=string@entry=0x1a24710 "security.selinux", string_end=0x1a24720 "", no_leading_period=no_leading_period@entry=4, flags=flags@entry=4, ends=ends@entry=0x0, alloca_used=alloca_used@entry=0) at fnmatch_loop.c:183
#1  0x00000033b94c306e in __fnmatch (pattern=0x1a24710 "security.selinux", string=0x0, flags=4) at fnmatch.c:449
#2  0x00007f11b353ca3c in fuse_filter_xattr (key=0x1a24710 "security.selinux") at fuse-bridge.c:3015
#3  fuse_filter_xattr (key=0x1a24710 "security.selinux") at fuse-bridge.c:3009
#4  0x00007f11b51534e2 in dict_keys_join (value=value@entry=0x0, size=size@entry=0, dict=dict@entry=0x7f11b3942a90, filter_fn=filter_fn@entry=0x7f11b353ca00 <fuse_filter_xattr>) at dict.c:1183
#5  0x00007f11b35432ab in fuse_xattr_cbk (frame=0x7f11b3d7f148, cookie=<optimized out>, this=0x1937b30, op_ret=0, op_errno=0, dict=0x7f11b3942a90, xdata=0x0) at fuse-bridge.c:3064
#6  0x00007f11ab5cb43b in io_stats_getxattr_cbk (frame=0x7f11b3f8b02c, cookie=<optimized out>, this=<optimized out>, op_ret=0, op_errno=0, dict=0x7f11b3942a90, xdata=0x0) at io-stats.c:1640
#7  0x00007f11ab7dcc06 in mdc_getxattr_cbk (frame=frame@entry=0x7f11b3f8b0d8, cookie=<optimized out>, this=<optimized out>, op_ret=0, op_errno=op_errno@entry=0, xattr=<optimized out>, xdata=xdata@entry=0x0) at md-cache.c:1658
#8  0x00007f11b0618eb2 in dht_getxattr_cbk (frame=0x7f11b3f8b184, cookie=<optimized out>, this=<optimized out>, op_ret=<optimized out>, op_errno=0, xattr=<optimized out>, xdata=0x0) at dht-common.c:2041
#9  0x00007f11b0860363 in afr_getxattr_cbk (frame=0x7f11b3f8b230, cookie=<optimized out>, this=<optimized out>, op_ret=0, op_errno=0, dict=<optimized out>, xdata=0x0) at afr-inode-read.c:618
#10 0x00007f11b0acce1f in client3_3_getxattr_cbk (req=<optimized out>, iov=<optimized out>, count=<optimized out>, myframe=0x7f11b3f8b2dc) at client-rpc-fops.c:1115
#11 0x00007f11b4f36714 in rpc_clnt_handle_reply (clnt=clnt@entry=0x19ae0a0, pollin=0x1954d50) at rpc-clnt.c:771
#12 0x00007f11b4f36a7d in rpc_clnt_notify (trans=<optimized out>, mydata=0x19ae0d0, event=<optimized out>, data=<optimized out>) at rpc-clnt.c:890
#13 0x00007f11b4f332f3 in rpc_transport_notify (this=this@entry=0x19bdad0, event=event@entry=RPC_TRANSPORT_MSG_RECEIVED, data=<optimized out>) at rpc-transport.c:495
#14 0x00007f11b1b25564 in socket_event_poll_in (this=this@entry=0x19bdad0) at socket.c:2118
#15 0x00007f11b1b25cdc in socket_event_handler (fd=<optimized out>, idx=<optimized out>, data=0x19bdad0, poll_in=1, poll_out=0, poll_err=0) at socket.c:2230
#16 0x00007f11b5198d7a in event_dispatch_epoll_handler (i=<optimized out>, events=0x19536e0, event_pool=0x1936ea0) at event-epoll.c:384
#17 event_dispatch_epoll (event_pool=0x1936ea0) at event-epoll.c:445
#18 0x0000000000404926 in main (argc=5, argv=0x7fff14bf2dc8) at glusterfsd.c:1902

It looks like the process is stuck in fnmatch() for a long time.
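For context, frames #2-#4 show fuse_filter_xattr() being called from dict_keys_join() while the listxattr reply is being assembled, and fnmatch() keeps being run against the same key, "security.selinux". A minimal sketch of that kind of key filter is below; the gating on client-pid = -1 and the exact match pattern are assumptions inferred from this backtrace and from the observation that only the gsyncd auxiliary mount hangs, not verbatim fuse-bridge.c source.

/* Sketch (not the actual glusterfs code) of an xattr-key filter of the
 * kind seen in frames #2-#4: dict_keys_join() consults it for every key
 * while building the flat "key1\0key2\0..." listxattr reply buffer. */
#include <fnmatch.h>

static int
filter_xattr_sketch (char *key, int client_pid)
{
        /* Assumption: SELinux xattrs are filtered out only for the gsyncd
         * auxiliary mount (client-pid = -1), which would explain why the
         * regular mounts never enter this path and only the
         * geo-replication mount hung. */
        if (client_pid == -1 &&
            fnmatch ("security.selinux", key, FNM_PERIOD) == 0)
                return 1;   /* ask dict_keys_join() to skip this key */

        return 0;
}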
REVIEW: http://review.gluster.org/4723 (libglusterfs/dict: fix infinite loop in dict_keys_join()) posted (#1) for review on master by Vijaykumar Koppad (vijaykumar.koppad)
REVIEW: http://review.gluster.org/4723 (libglusterfs/dict: fix infinite loop in dict_keys_join()) posted (#2) for review on master by Vijaykumar Koppad (vijaykumar.koppad)
REVIEW: http://review.gluster.org/4728 (libglusterfs/dict: fix infinite loop in dict_keys_join()) posted (#1) for review on release-3.4 by Vijaykumar Koppad (vkoppad)
COMMIT: http://review.gluster.org/4723 committed in master by Anand Avati (avati)
------
commit 1f7dadccd45863ebea8f60339f297ac551e89899
Author: Vijaykumar koppad <vijaykumar.koppad>
Date:   Tue Mar 26 17:42:32 2013 +0530

    libglusterfs/dict: fix infinite loop in dict_keys_join()

    - missing "pairs = next" caused infinite loop

    Change-Id: I9171be5bec051de6095e135d616534ab49cd4797
    BUG: 905871
    Signed-off-by: Vijaykumar Koppad <vijaykumar.koppad>
    Reviewed-on: http://review.gluster.org/4723
    Reviewed-by: Venky Shankar <vshankar>
    Tested-by: Gluster Build System <jenkins.com>
    Reviewed-by: Anand Avati <avati>
COMMIT: http://review.gluster.org/4728 committed in release-3.4 by Anand Avati (avati)
------
commit 1f7dadccd45863ebea8f60339f297ac551e89899
Author: Vijaykumar koppad <vijaykumar.koppad>
Date:   Tue Mar 26 17:42:32 2013 +0530

    libglusterfs/dict: fix infinite loop in dict_keys_join()

    - missing "pairs = next" caused infinite loop

    Change-Id: I9171be5bec051de6095e135d616534ab49cd4797
    BUG: 905871
    Signed-off-by: Vijaykumar Koppad <vijaykumar.koppad>
    Reviewed-on: http://review.gluster.org/4723
    Reviewed-by: Venky Shankar <vshankar>
    Tested-by: Gluster Build System <jenkins.com>
    Reviewed-by: Anand Avati <avati>
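The commit message above names the root cause: the list-walking loop in dict_keys_join() did not advance to the next pair when filter_fn() filtered a key out, so the same key ("security.selinux" in the backtrace) was fed to fnmatch() forever. The sketch below is a simplified reconstruction of that loop shape based on the commit message, not a copy of dict.c; the one-line advance marked in the skip branch is the line the fix adds.

/* Simplified reconstruction (not verbatim dict.c) of the dict_keys_join()
 * loop described by the commit message above. */
#include <string.h>

typedef struct data_pair {
        char             *key;
        struct data_pair *next;
} data_pair_t;

static int
keys_join_sketch (char *value, int size, data_pair_t *members_list,
                  int (*filter_fn) (char *key))
{
        data_pair_t *pairs = members_list;
        data_pair_t *next  = NULL;
        int          len   = 0;

        while (pairs) {
                next = pairs->next;

                if (filter_fn && filter_fn (pairs->key)) {
                        /* BUG: without the advance below, a filtered key
                         * (e.g. "security.selinux" on the gsyncd mount)
                         * keeps the loop on the same pair forever,
                         * re-running fnmatch() on the same key - exactly
                         * the backtrace above. This is the missing
                         * "pairs = next" the commit adds. */
                        pairs = next;
                        continue;
                }

                /* append "key\0" to the flat listxattr reply buffer */
                if (value && (size > len))
                        strncpy (value + len, pairs->key, size - len);
                len += strlen (pairs->key) + 1;

                pairs = next;
        }

        return len;
}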