Description of problem:
When replace-brick is started with ongoing I/O on the fuse mount point, the I/O exited with an ENOTCONN.

Version-Release number of selected component (if applicable):
Upstream

How reproducible:
Consistently

Steps to Reproduce:
1. while true; do dbench -s 10 -t 10 -D /mnt/gluster/; done
2. gluster volume replace-brick test2 shortwing:/falcon/d1 shortwing:/falcon/d2 start

Actual results:
Running for 10 seconds with load '/usr/share/dbench/client.txt' and minimum warmup 2 secs
1 of 10 processes prepared for launch 0 sec
[3] open ./clients/client1 failed for handle 16385 (Transport endpoint is not connected)
10 of 10 processes prepared for launch 0 sec
releasing clients
[3] open ./clients/client4 failed for handle 16385 (Transport endpoint is not connected)
(4) ERROR: handle 16385 was not found
Child failed with status 1
dbench version 4.00 - Copyright Andrew Tridgell 1999-2004

Expected results:
I/O should continue without exiting.

Additional info:
Client log:
[2012-04-03 15:10:10.165967] D [socket.c:193:__socket_rwv] 0-test2-client-0: EOF from peer 127.0.1.1:24009
[2012-04-03 15:10:10.166048] W [socket.c:1521:__socket_proto_state_machine] 0-test2-client-0: reading from socket failed. Error (Transport endpoint is not connected), peer (127.0.1.1:24009)
[2012-04-03 15:10:10.166077] D [socket.c:1807:socket_event_handler] 0-transport: disconnecting now
[2012-04-03 15:10:10.166424] E [rpc-clnt.c:382:saved_frames_unwind] (-->/usr/local/lib/libgfrpc.so.0(rpc_clnt_notify+0x123) [0x7fd6b866949f] (-->/usr/local/lib/libgfrpc.so.0(rpc_clnt_connection_cleanup+0x155) [0x7fd6b8668a14] (-->/usr/local/lib/libgfrpc.so.0(saved_frames_destroy+0x1f) [0x7fd6b86684a2]))) 0-test2-client-0: forced unwinding frame type(GlusterFS 3.1) op(WRITE(13)) called at 2012-04-03 15:10:09.729963 (xid=0x1643x)
[2012-04-03 15:10:10.166462] W [client3_1-fops.c:822:client3_1_writev_cbk] 0-test2-client-0: remote operation failed: Transport endpoint is not connected
[2012-04-03 15:10:10.166536] E [rpc-clnt.c:382:saved_frames_unwind] (-->/usr/local/lib/libgfrpc.so.0(rpc_clnt_notify+0x123) [0x7fd6b866949f] (-->/usr/local/lib/libgfrpc.so.0(rpc_clnt_connection_cleanup+0x155) [0x7fd6b8668a14] (-->/usr/local/lib/libgfrpc.so.0(saved_frames_destroy+0x1f) [0x7fd6b86684a2]))) 0-test2-client-0: forced unwinding frame type(GlusterFS 3.1) op(WRITE(13)) called at 2012-04-03 15:10:09.730065 (xid=0x1644x)
[2012-04-03 15:10:10.166583] W [client3_1-fops.c:822:client3_1_writev_cbk] 0-test2-client-0: remote operation failed: Transport endpoint is not connected
[2012-04-03 15:10:10.166638] D [name.c:158:client_fill_address_family] 0-test2-client-0: address-family not specified, guessing it to be inet/inet6
[2012-04-03 15:10:10.166972] D [common-utils.c:161:gf_resolve_ip6] 0-resolver: returning ip-127.0.1.1 (port-24007) for hostname: shortwing and port: 24007
[2012-04-03 15:10:10.167045] I [socket.c:2314:socket_submit_request] 0-test2-client-0: not connected (priv->connected = 0)
[2012-04-03 15:10:10.167067] W [rpc-clnt.c:1507:rpc_clnt_submit] 0-test2-client-0: failed to submit rpc-request (XID: 0x1673x Program: GlusterFS 3.1, ProgVers: 330, Proc: 15) to rpc-transport (test2-client-0)
[2012-04-03 15:10:10.167093] W [client3_1-fops.c:882:client3_1_flush_cbk] 0-test2-client-0: remote operation failed: Transport endpoint is not connected
[2012-04-03 15:10:10.167115] D [client.c:243:client_submit_request] 0-test2-client-0: rpc_clnt_submit failed
[2012-04-03 15:10:10.167146] W [fuse-bridge.c:949:fuse_err_cbk] 0-glusterfs-fuse: 2800: FLUSH() ERR => -1 (Transport endpoint is not connected)
[2012-04-03 15:10:10.167247] E [rpc-clnt.c:382:saved_frames_unwind] (-->/usr/local/lib/libgfrpc.so.0(rpc_clnt_notify+0x123) [0x7fd6b866949f] (-->/usr/local/lib/libgfrpc.so.0(rpc_clnt_connection_cleanup+0x155) [0x7fd6b8668a14] (-->/usr/local/lib/libgfrpc.so.0(saved_frames_destroy+0x1f) [0x7fd6b86684a2]))) 0-test2-client-0: forced unwinding frame type(GlusterFS 3.1) op(WRITE(13)) called at 2012-04-03 15:10:09.746359 (xid=0x1651x)
[2012-04-03 15:10:10.167275] W [client3_1-fops.c:822:client3_1_writev_cbk] 0-test2-client-0: remote operation failed: Transport endpoint is not connected
[2012-04-03 15:10:10.167290] D [client3_1-fops.c:2767:client_fdctx_destroy] 0-test2-client-0: sending release on fd
[2012-04-03 15:10:10.167318] W [rpc-clnt.c:1507:rpc_clnt_submit] 0-test2-client-0: failed to submit rpc-request (XID: 0x1674x Program: GlusterFS 3.1, ProgVers: 330, Proc: 13) to rpc-transport (test2-client-0)
[2012-04-03 15:10:10.167349] W [client3_1-fops.c:822:client3_1_writev_cbk] 0-test2-client-0: remote operation failed: Transport endpoint is not connected
[2012-04-03 15:10:10.167358] W [rpc-clnt.c:1507:rpc_clnt_submit] 0-test2-client-0: failed to submit rpc-request (XID: 0x1675x Program: GlusterFS 3.1, ProgVers: 330, Proc: 41) to rpc-transport (test2-client-0)
[2012-04-03 15:10:10.167399] D [client.c:243:client_submit_request] 0-test2-client-0: rpc_clnt_submit failed
[2012-04-03 15:10:10.167378] D [client3_1-fops.c:104:client_submit_vec_request] 0-test2-client-0: rpc_clnt_submit failed
[2012-04-03 15:10:10.167446] W [client3_1-fops.c:4000:client3_1_writev] 0-test2-client-0: failed to send the fop
[2012-04-03 15:10:10.167478] W [rpc-clnt.c:1507:rpc_clnt_submit] 0-test2-client-0: failed to submit rpc-request (XID: 0x1676x Program: GlusterFS 3.1, ProgVers: 330, Proc: 13) to rpc-transport (test2-client-0)
[2012-04-03 15:10:10.167502] W [client3_1-fops.c:822:client3_1_writev_cbk] 0-test2-client-0: remote operation failed: Transport endpoint is not connected
[2012-04-03 15:10:10.167537] W [rpc-clnt.c:1507:rpc_clnt_submit] 0-test2-client-0: failed to submit rpc-request (XID: 0x1677x Program: GlusterFS 3.1, ProgVers: 330, Proc: 15) to rpc-transport (test2-client-0)
[2012-04-03 15:10:10.167562] W [client3_1-fops.c:882:client3_1_flush_cbk] 0-test2-client-0: remote operation failed: Transport endpoint is not connected
[2012-04-03 15:10:10.167583] D [client.c:243:client_submit_request] 0-test2-client-0: rpc_clnt_submit failed
[2012-04-03 15:10:10.167594] W [rpc-clnt.c:1507:rpc_clnt_submit] 0-test2-client-0: failed to submit rpc-request (XID: 0x1678x Program: GlusterFS 3.1, ProgVers: 330, Proc: 27) to rpc-transport (test2-client-0)
[2012-04-03 15:10:10.167609] W [fuse-bridge.c:949:fuse_err_cbk] 0-glusterfs-fuse: 2903: FLUSH() ERR => -1 (Transport endpoint is not connected)
[2012-04-03 15:10:10.167628] W [client3_1-fops.c:2607:client3_1_lookup_cbk] 0-test2-client-0: remote operation failed: Transport endpoint is not connected. Path: /clients/client1
[2012-04-03 15:10:10.167657] D [client3_1-fops.c:104:client_submit_vec_request] 0-test2-client-0: rpc_clnt_submit failed
[2012-04-03 15:10:10.167735] W [client3_1-fops.c:4000:client3_1_writev] 0-test2-client-0: failed to send the fop
[2012-04-03 15:10:10.167753] W [rpc-clnt.c:1507:rpc_clnt_submit] 0-test2-client-0: failed to submit rpc-request (XID: 0x1679x Program: GlusterFS 3.1, ProgVers: 330, Proc: 27) to rpc-transport (test2-client-0)
[2012-04-03 15:10:10.167800] W [client3_1-fops.c:2607:client3_1_lookup_cbk] 0-test2-client-0: remote operation failed: Transport endpoint is not connected. Path: /clients/client1
[2012-04-03 15:10:10.167837] W [fuse-bridge.c:272:fuse_entry_cbk] 0-glusterfs-fuse: 2942: LOOKUP() /clients/client1 => -1 (Transport endpoint is not connected)
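For triaging a client log like the one above, it may help to see which functions report ENOTCONN and when the first failure occurs. The following is a minimal sketch, not part of GlusterFS. The names ENTRY_RE, parse_entries and summarize_enotconn are hypothetical, and the entry layout ("[timestamp] LEVEL [file:line:function] message") is assumed from the excerpts in this report rather than taken from any format specification.

```python
import re
from collections import Counter

# Header of one log entry, as it appears in this report (an assumption,
# not an official format): [timestamp] LEVEL [file.c:line:function]
ENTRY_RE = re.compile(
    r"\[(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+)\]\s+"
    r"(?P<level>[A-Z])\s+"
    r"\[(?P<file>[^:\]]+):(?P<line>\d+):(?P<func>[^\]]+)\]"
)

# strerror text for ENOTCONN as it shows up in the log messages.
ENOTCONN_TEXT = "Transport endpoint is not connected"


def parse_entries(text):
    """Split raw log text into entries, even if it was pasted as one long line."""
    matches = list(ENTRY_RE.finditer(text))
    entries = []
    for i, m in enumerate(matches):
        # An entry's message runs until the next entry header (or end of text).
        end = matches[i + 1].start() if i + 1 < len(matches) else len(text)
        entry = m.groupdict()
        # Collapse whitespace so wrapped messages become a single line.
        entry["message"] = " ".join(text[m.end():end].split())
        entries.append(entry)
    return entries


def summarize_enotconn(text):
    """Return (Counter of (level, function) for ENOTCONN entries, first such entry)."""
    hits = [e for e in parse_entries(text) if ENOTCONN_TEXT in e["message"]]
    counts = Counter((e["level"], e["func"]) for e in hits)
    first = hits[0] if hits else None
    return counts, first
```

Passing the contents of a client log file to summarize_enotconn gives a per-function count of ENOTCONN messages, and the timestamp of the first one can be compared against the time replace-brick was started.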
This is a known issue. At present, a graph switch cannot be done seamlessly, at least in cases like replace-brick where the volume has no translator such as replicate to provide high availability on the client side. The code that cleans up sockets cannot guarantee what the graph switch itself cannot do seamlessly. Closing this bug for now; it can be reopened when a requirement for such functionality arises.