While running a MapReduce application, the mount point hung. I have pasted the client log below.

[2011-08-11 00:14:17.495328] I [client-handshake.c:916:client_setvolume_cbk] 0-hosdu-client-4: Connected to 10.1.11.33:24009, attached to remote volume '/data/brick'.
[2011-08-11 00:14:17.495365] I [client-handshake.c:916:client_setvolume_cbk] 0-hosdu-client-6: Connected to 10.1.11.35:24009, attached to remote volume '/data/brick'.
[2011-08-11 00:14:17.495381] I [afr-common.c:3291:afr_notify] 0-hosdu-replicate-3: Subvolume 'hosdu-client-6' came back up; going online.
[2011-08-11 00:14:17.495404] I [client-handshake.c:916:client_setvolume_cbk] 0-hosdu-client-7: Connected to 10.1.11.36:24009, attached to remote volume '/data/brick'.
[2011-08-11 00:14:17.499702] I [client-handshake.c:916:client_setvolume_cbk] 0-hosdu-client-1: Connected to 10.1.11.22:24009, attached to remote volume '/data/brick'.
[2011-08-11 00:14:17.505562] I [fuse-bridge.c:3336:fuse_graph_setup] 0-fuse: switched to graph 0
[2011-08-11 00:14:17.505757] I [fuse-bridge.c:2924:fuse_init] 0-glusterfs-fuse: FUSE inited with protocol versions: glusterfs 7.13 kernel 7.13
[2011-08-11 00:14:17.507336] I [afr-common.c:1632:afr_set_root_inode_on_first_lookup] 0-hosdu-replicate-0: added root inode
[2011-08-11 00:14:17.507443] I [afr-common.c:1632:afr_set_root_inode_on_first_lookup] 0-hosdu-replicate-1: added root inode
[2011-08-11 00:14:17.510505] I [afr-common.c:1632:afr_set_root_inode_on_first_lookup] 0-hosdu-replicate-2: added root inode
[2011-08-11 00:14:17.510565] I [afr-common.c:1632:afr_set_root_inode_on_first_lookup] 0-hosdu-replicate-3: added root inode
[2011-08-11 06:59:13.127159] W [socket.c:205:__socket_rwv] 0-hosdu-client-4: readv failed (Connection timed out)
[2011-08-11 06:59:13.141706] W [socket.c:1511:__socket_proto_state_machine] 0-hosdu-client-4: reading from socket failed. Error (Connection timed out), peer (10.1.11.33:24009)
[2011-08-11 06:59:13.141758] I [client.c:1883:client_rpc_notify] 0-hosdu-client-4: disconnected
[2011-08-11 06:59:35.277040] E [socket.c:1714:socket_connect_finish] 0-hosdu-client-4: connection to 10.1.11.33:24009 failed (No route to host)
[2011-08-11 07:39:31.608399] I [client3_1-fops.c:547:client3_1_rmdir_cbk] 0-hosdu-client-5: remote operation failed: Directory not empty
[2011-08-11 07:39:31.615043] I [client3_1-fops.c:289:client3_1_mkdir_cbk] 0-hosdu-client-6: remote operation failed: File exists
[2011-08-11 07:39:31.616316] I [client3_1-fops.c:289:client3_1_mkdir_cbk] 0-hosdu-client-5: remote operation failed: File exists
[2011-08-11 07:39:31.616516] E [afr-common.c:379:afr_inode_set_read_ctx] (-->/usr/local/lib/libgfrpc.so.0(rpc_clnt_handle_reply+0xa5) [0x7fc6ffefc6a5] (-->/usr/local/lib/glusterfs/3git/xlator/protocol/client.so(client3_1_mkdir_cbk+0x8f0) [0x7fc6fa890400] (-->/usr/local/lib/glusterfs/3git/xlator/cluster/replicate.so(afr_mkdir_wind_cbk+0x11c) [0x7fc6fa61aa5c]))) 0-: Assertion failed: read_child >= 0
[2011-08-11 07:39:31.628356] I [client3_1-fops.c:289:client3_1_mkdir_cbk] 0-hosdu-client-7: remote operation failed: File exists
[2011-08-11 07:39:31.628407] E [afr-common.c:379:afr_inode_set_read_ctx] (-->/usr/local/lib/libgfrpc.so.0(rpc_clnt_handle_reply+0xa5) [0x7fc6ffefc6a5] (-->/usr/local/lib/glusterfs/3git/xlator/protocol/client.so(client3_1_mkdir_cbk+0x8f0) [0x7fc6fa890400] (-->/usr/local/lib/glusterfs/3git/xlator/cluster/replicate.so(afr_mkdir_wind_cbk+0x11c) [0x7fc6fa61aa5c]))) 0-: Assertion failed: read_child >= 0
[2011-08-11 07:44:06.463144] I [afr-common.c:1135:afr_launch_self_heal] 0-hosdu-replicate-3: background entry self-heal triggered. path: /tmp/hadoop-root/mapred/system
[2011-08-11 07:44:06.465936] I [afr-self-heal-common.c:1992:afr_self_heal_completion_cbk] 0-hosdu-replicate-3: background entry self-heal completed on /tmp/hadoop-root/mapred/system
[2011-08-11 07:44:06.652971] W [client3_1-fops.c:4350:client3_1_getxattr] 0-hosdu-client-4: failed to send the fop: Transport endpoint is not connected
[2011-08-11 07:44:06.655843] W [client3_1-fops.c:4350:client3_1_getxattr] 0-hosdu-client-4: failed to send the fop: Transport endpoint is not connected
[2011-08-11 07:44:28.230041] I [afr-common.c:1135:afr_launch_self_heal] 0-hosdu-replicate-3: background entry self-heal triggered. path: /tmp/hadoop-root/mapred/system
[2011-08-11 07:44:28.232255] I [afr-self-heal-common.c:1992:afr_self_heal_completion_cbk] 0-hosdu-replicate-3: background entry self-heal completed on /tmp/hadoop-root/mapred/system
[2011-08-11 07:44:28.259581] I [client3_1-fops.c:547:client3_1_rmdir_cbk] 0-hosdu-client-5: remote operation failed: Directory not empty
[2011-08-11 07:44:28.264018] I [client3_1-fops.c:289:client3_1_mkdir_cbk] 0-hosdu-client-6: remote operation failed: File exists
[2011-08-11 07:44:28.264072] I [client3_1-fops.c:289:client3_1_mkdir_cbk] 0-hosdu-client-7: remote operation failed: File exists
[2011-08-11 07:44:28.264151] E [afr-common.c:379:afr_inode_set_read_ctx] (-->/usr/local/lib/libgfrpc.so.0(rpc_clnt_handle_reply+0xa5) [0x7fc6ffefc6a5] (-->/usr/local/lib/glusterfs/3git/xlator/protocol/client.so(client3_1_mkdir_cbk+0x8f0) [0x7fc6fa890400] (-->/usr/local/lib/glusterfs/3git/xlator/cluster/replicate.so(afr_mkdir_wind_cbk+0x11c) [0x7fc6fa61aa5c]))) 0-: Assertion failed: read_child >= 0
[2011-08-11 07:44:28.266214] I [client3_1-fops.c:289:client3_1_mkdir_cbk] 0-hosdu-client-5: remote operation failed: File exists
[2011-08-11 07:44:28.266263] E [afr-common.c:379:afr_inode_set_read_ctx] (-->/usr/local/lib/libgfrpc.so.0(rpc_clnt_handle_reply+0xa5) [0x7fc6ffefc6a5] (-->/usr/local/lib/glusterfs/3git/xlator/protocol/client.so(client3_1_mkdir_cbk+0x8f0) [0x7fc6fa890400] (-->/usr/local/lib/glusterfs/3git/xlator/cluster/replicate.so(afr_mkdir_wind_cbk+0x11c) [0x7fc6fa61aa5c]))) 0-: Assertion failed: read_child >= 0
[2011-08-11 07:44:29.85593] W [client3_1-fops.c:4350:client3_1_getxattr] 0-hosdu-client-4: failed to send the fop: Transport endpoint is not connected

pending frames:
frame : type(1) op(GETXATTR)
frame : type(1) op(GETXATTR)
patchset: git://git.gluster.com/glusterfs.git
signal received: 6
After the above incident I killed the client process and remounted it. Now it's not allowing me to delete a directory: rm -rf on an empty directory says "directory not empty". The following errors were seen in the client log.

[2011-08-11 08:42:40.499847] I [afr-common.c:1135:afr_launch_self_heal] 0-hosdu-replicate-3: background entry self-heal triggered. path: /PiEstimator_TMP_3_141592654/out/_temporary
[2011-08-11 08:42:40.502585] I [afr-self-heal-common.c:1992:afr_self_heal_completion_cbk] 0-hosdu-replicate-3: background entry self-heal completed on /PiEstimator_TMP_3_141592654/out/_temporary
[2011-08-11 08:42:40.544005] I [client3_1-fops.c:547:client3_1_rmdir_cbk] 0-hosdu-client-5: remote operation failed: Directory not empty
[2011-08-11 08:42:40.549590] I [client3_1-fops.c:289:client3_1_mkdir_cbk] 0-hosdu-client-6: remote operation failed: File exists
[2011-08-11 08:42:40.549889] I [client3_1-fops.c:289:client3_1_mkdir_cbk] 0-hosdu-client-7: remote operation failed: File exists
[2011-08-11 08:42:40.550441] E [afr-common.c:379:afr_inode_set_read_ctx] (-->/usr/local/lib/libgfrpc.so.0(rpc_clnt_handle_reply+0xa5) [0x7f34bf5476a5] (-->/usr/local/lib/glusterfs/3git/xlator/protocol/client.so(client3_1_mkdir_cbk+0x8f0) [0x7f34b9edb400] (-->/usr/local/lib/glusterfs/3git/xlator/cluster/replicate.so(afr_mkdir_wind_cbk+0x11c) [0x7f34b9c65a5c]))) 0-: Assertion failed: read_child >= 0
[2011-08-11 08:42:40.554188] I [client3_1-fops.c:289:client3_1_mkdir_cbk] 0-hosdu-client-5: remote operation failed: File exists
[2011-08-11 08:42:40.554481] E [afr-common.c:379:afr_inode_set_read_ctx] (-->/usr/local/lib/libgfrpc.so.0(rpc_clnt_handle_reply+0xa5) [0x7f34bf5476a5] (-->/usr/local/lib/glusterfs/3git/xlator/protocol/client.so(client3_1_mkdir_cbk+0x8f0) [0x7f34b9edb400] (-->/usr/local/lib/glusterfs/3git/xlator/cluster/replicate.so(afr_mkdir_wind_cbk+0x11c) [0x7f34b9c65a5c]))) 0-: Assertion failed: read_child >= 0
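As a side note for anyone reproducing this: when rm -rf reports "Directory not empty" on a directory that looks empty through the mount, one common cause is a stale entry left on the brick of one replica that the others lack. A minimal sketch for checking this directly on the brick servers follows; the brick path and the PiEstimator directory are assumptions taken from the log above, not verified paths.

```shell
# Sketch: count entries directly under a brick-side copy of the
# "empty" directory. A non-zero count on any replica's brick would
# explain the rmdir failure seen through the mount.
count_brick_entries() {
    dir="$1"
    # Include dotfiles; suppress errors if the path is absent on a brick.
    find "$dir" -mindepth 1 -maxdepth 1 2>/dev/null | wc -l
}

# Hypothetical brick-side path, assembled from '/data/brick' and the
# path in the self-heal log line; run on each replica server:
# count_brick_entries /data/brick/PiEstimator_TMP_3_141592654/out/_temporary
#
# The AFR changelog xattrs on the directory can also be dumped with:
# getfattr -d -m trusted.afr -e hex /data/brick/PiEstimator_TMP_3_141592654/out/_temporary
```

If the counts differ across bricks, the directory is in the entry split-brain territory that the self-heal messages above are wrestling with.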
(In reply to comment #1)
> After the above incident I killed the client process and remounted it. Now
> it's not allowing me to delete a directory. rm -rf on an empty directory
> says "directory not empty". The following errors were seen in the client
> log.
> [...]

Johnny already sent a patch for this long ago; I think it has not been accepted yet:

http://review.gluster.com/#change,104

The patch above needs to be merged.

Pranith
*** This bug has been marked as a duplicate of bug 3251 ***