Bug 809409 - [0e4c74861f762d4af7b7d8ffce5384920a6aa335] I/O exits with ENOTCONN when replace-brick is started
Status: CLOSED WONTFIX
Product: GlusterFS
Classification: Community
Component: transport
Version: mainline
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Assigned To: Raghavendra G
Depends On:
Blocks:
 
Reported: 2012-04-03 05:53 EDT by Anush Shetty
Modified: 2012-04-04 12:23 EDT

See Also:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2012-04-04 12:23:45 EDT
Type: Bug
Regression: ---
Mount Type: fuse
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


Attachments: None
Description Anush Shetty 2012-04-03 05:53:21 EDT
Description of problem: When replace-brick is started while I/O is in progress on the FUSE mount point, the I/O exits with ENOTCONN.


Version-Release number of selected component (if applicable): Upstream


How reproducible: Consistently


Steps to Reproduce:
1. while true; do dbench -s 10 -t 10 -D /mnt/gluster/; done
2. gluster volume replace-brick test2 shortwing:/falcon/d1 shortwing:/falcon/d2 start
  
Actual results:
Running for 10 seconds with load '/usr/share/dbench/client.txt' and minimum warmup 2 secs
1 of 10 processes prepared for launch   0 sec
[3] open ./clients/client1 failed for handle 16385 (Transport endpoint is not connected)
10 of 10 processes prepared for launch   0 sec
releasing clients
[3] open ./clients/client4 failed for handle 16385 (Transport endpoint is not connected)
(4) ERROR: handle 16385 was not found
Child failed with status 1
dbench version 4.00 - Copyright Andrew Tridgell 1999-2004



Expected results:

I/O should continue without exiting.

Additional info:

Client log:
[2012-04-03 15:10:10.165967] D [socket.c:193:__socket_rwv] 0-test2-client-0: EOF from peer 127.0.1.1:24009
[2012-04-03 15:10:10.166048] W [socket.c:1521:__socket_proto_state_machine] 0-test2-client-0: reading from socket failed. Error (Transport endpoint is not connected), peer (127.0.1.1:24009)
[2012-04-03 15:10:10.166077] D [socket.c:1807:socket_event_handler] 0-transport: disconnecting now
[2012-04-03 15:10:10.166424] E [rpc-clnt.c:382:saved_frames_unwind] (-->/usr/local/lib/libgfrpc.so.0(rpc_clnt_notify+0x123) [0x7fd6b866949f] (-->/usr/local/lib/libgfrpc.so.0(rpc_clnt_connection_cleanup+0x155) [0x7fd6b8668a14] (-->/usr/local/lib/libgfrpc.so.0(saved_frames_destroy+0x1f) [0x7fd6b86684a2]))) 0-test2-client-0: forced unwinding frame type(GlusterFS 3.1) op(WRITE(13)) called at 2012-04-03 15:10:09.729963 (xid=0x1643x)
[2012-04-03 15:10:10.166462] W [client3_1-fops.c:822:client3_1_writev_cbk] 0-test2-client-0: remote operation failed: Transport endpoint is not connected
[2012-04-03 15:10:10.166536] E [rpc-clnt.c:382:saved_frames_unwind] (-->/usr/local/lib/libgfrpc.so.0(rpc_clnt_notify+0x123) [0x7fd6b866949f] (-->/usr/local/lib/libgfrpc.so.0(rpc_clnt_connection_cleanup+0x155) [0x7fd6b8668a14] (-->/usr/local/lib/libgfrpc.so.0(saved_frames_destroy+0x1f) [0x7fd6b86684a2]))) 0-test2-client-0: forced unwinding frame type(GlusterFS 3.1) op(WRITE(13)) called at 2012-04-03 15:10:09.730065 (xid=0x1644x)
[2012-04-03 15:10:10.166583] W [client3_1-fops.c:822:client3_1_writev_cbk] 0-test2-client-0: remote operation failed: Transport endpoint is not connected
[2012-04-03 15:10:10.166638] D [name.c:158:client_fill_address_family] 0-test2-client-0: address-family not specified, guessing it to be inet/inet6
[2012-04-03 15:10:10.166972] D [common-utils.c:161:gf_resolve_ip6] 0-resolver: returning ip-127.0.1.1 (port-24007) for hostname: shortwing and port: 24007
[2012-04-03 15:10:10.167045] I [socket.c:2314:socket_submit_request] 0-test2-client-0: not connected (priv->connected = 0)
[2012-04-03 15:10:10.167067] W [rpc-clnt.c:1507:rpc_clnt_submit] 0-test2-client-0: failed to submit rpc-request (XID: 0x1673x Program: GlusterFS 3.1, ProgVers: 330, Proc: 15) to rpc-transport (test2-client-0)
[2012-04-03 15:10:10.167093] W [client3_1-fops.c:882:client3_1_flush_cbk] 0-test2-client-0: remote operation failed: Transport endpoint is not connected
[2012-04-03 15:10:10.167115] D [client.c:243:client_submit_request] 0-test2-client-0: rpc_clnt_submit failed
[2012-04-03 15:10:10.167146] W [fuse-bridge.c:949:fuse_err_cbk] 0-glusterfs-fuse: 2800: FLUSH() ERR => -1 (Transport endpoint is not connected)
[2012-04-03 15:10:10.167247] E [rpc-clnt.c:382:saved_frames_unwind] (-->/usr/local/lib/libgfrpc.so.0(rpc_clnt_notify+0x123) [0x7fd6b866949f] (-->/usr/local/lib/libgfrpc.so.0(rpc_clnt_connection_cleanup+0x155) [0x7fd6b8668a14] (-->/usr/local/lib/libgfrpc.so.0(saved_frames_destroy+0x1f) [0x7fd6b86684a2]))) 0-test2-client-0: forced unwinding frame type(GlusterFS 3.1) op(WRITE(13)) called at 2012-04-03 15:10:09.746359 (xid=0x1651x)
[2012-04-03 15:10:10.167275] W [client3_1-fops.c:822:client3_1_writev_cbk] 0-test2-client-0: remote operation failed: Transport endpoint is not connected
[2012-04-03 15:10:10.167290] D [client3_1-fops.c:2767:client_fdctx_destroy] 0-test2-client-0: sending release on fd
[2012-04-03 15:10:10.167318] W [rpc-clnt.c:1507:rpc_clnt_submit] 0-test2-client-0: failed to submit rpc-request (XID: 0x1674x Program: GlusterFS 3.1, ProgVers: 330, Proc: 13) to rpc-transport (test2-client-0)
[2012-04-03 15:10:10.167349] W [client3_1-fops.c:822:client3_1_writev_cbk] 0-test2-client-0: remote operation failed: Transport endpoint is not connected
[2012-04-03 15:10:10.167358] W [rpc-clnt.c:1507:rpc_clnt_submit] 0-test2-client-0: failed to submit rpc-request (XID: 0x1675x Program: GlusterFS 3.1, ProgVers: 330, Proc: 41) to rpc-transport (test2-client-0)
[2012-04-03 15:10:10.167399] D [client.c:243:client_submit_request] 0-test2-client-0: rpc_clnt_submit failed
[2012-04-03 15:10:10.167378] D [client3_1-fops.c:104:client_submit_vec_request] 0-test2-client-0: rpc_clnt_submit failed
[2012-04-03 15:10:10.167446] W [client3_1-fops.c:4000:client3_1_writev] 0-test2-client-0: failed to send the fop
[2012-04-03 15:10:10.167478] W [rpc-clnt.c:1507:rpc_clnt_submit] 0-test2-client-0: failed to submit rpc-request (XID: 0x1676x Program: GlusterFS 3.1, ProgVers: 330, Proc: 13) to rpc-transport (test2-client-0)
[2012-04-03 15:10:10.167502] W [client3_1-fops.c:822:client3_1_writev_cbk] 0-test2-client-0: remote operation failed: Transport endpoint is not connected
[2012-04-03 15:10:10.167537] W [rpc-clnt.c:1507:rpc_clnt_submit] 0-test2-client-0: failed to submit rpc-request (XID: 0x1677x Program: GlusterFS 3.1, ProgVers: 330, Proc: 15) to rpc-transport (test2-client-0)
[2012-04-03 15:10:10.167562] W [client3_1-fops.c:882:client3_1_flush_cbk] 0-test2-client-0: remote operation failed: Transport endpoint is not connected
[2012-04-03 15:10:10.167583] D [client.c:243:client_submit_request] 0-test2-client-0: rpc_clnt_submit failed
[2012-04-03 15:10:10.167594] W [rpc-clnt.c:1507:rpc_clnt_submit] 0-test2-client-0: failed to submit rpc-request (XID: 0x1678x Program: GlusterFS 3.1, ProgVers: 330, Proc: 27) to rpc-transport (test2-client-0)
[2012-04-03 15:10:10.167609] W [fuse-bridge.c:949:fuse_err_cbk] 0-glusterfs-fuse: 2903: FLUSH() ERR => -1 (Transport endpoint is not connected)
[2012-04-03 15:10:10.167628] W [client3_1-fops.c:2607:client3_1_lookup_cbk] 0-test2-client-0: remote operation failed: Transport endpoint is not connected. Path: /clients/client1
[2012-04-03 15:10:10.167657] D [client3_1-fops.c:104:client_submit_vec_request] 0-test2-client-0: rpc_clnt_submit failed
[2012-04-03 15:10:10.167735] W [client3_1-fops.c:4000:client3_1_writev] 0-test2-client-0: failed to send the fop
[2012-04-03 15:10:10.167753] W [rpc-clnt.c:1507:rpc_clnt_submit] 0-test2-client-0: failed to submit rpc-request (XID: 0x1679x Program: GlusterFS 3.1, ProgVers: 330, Proc: 27) to rpc-transport (test2-client-0)
[2012-04-03 15:10:10.167800] W [client3_1-fops.c:2607:client3_1_lookup_cbk] 0-test2-client-0: remote operation failed: Transport endpoint is not connected. Path: /clients/client1
[2012-04-03 15:10:10.167837] W [fuse-bridge.c:272:fuse_entry_cbk] 0-glusterfs-fuse: 2942: LOOKUP() /clients/client1 => -1 (Transport endpoint is not connected)
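
For triage, a log like the one above can be summarized by counting which functions reported "Transport endpoint is not connected". The following is a minimal sketch, not part of the original report: the regular expression assumes the `[timestamp] LEVEL [file:line:function] message` layout seen in this log, and the names `LOG_LINE` and `summarize` are hypothetical. The sample lines are copied from the client log.

```python
import re
from collections import Counter

# Assumed layout, inferred from the log above:
#   [timestamp] LEVEL [source-file:line:function] rest-of-message
LOG_LINE = re.compile(
    r"^\[(?P<ts>[^\]]+)\] (?P<level>[A-Z]) "
    r"\[(?P<src>[^:\]]+):(?P<lineno>\d+):(?P<func>[^\]]+)\] (?P<rest>.*)$"
)
ENOTCONN_TEXT = "Transport endpoint is not connected"


def summarize(lines):
    """Count ENOTCONN messages per reporting function (hypothetical helper)."""
    by_func = Counter()
    for line in lines:
        match = LOG_LINE.match(line.strip())
        if match and ENOTCONN_TEXT in match.group("rest"):
            by_func[match.group("func")] += 1
    return by_func


# Four lines taken from the client log in this report.
sample = [
    "[2012-04-03 15:10:10.167146] W [fuse-bridge.c:949:fuse_err_cbk] 0-glusterfs-fuse: 2800: FLUSH() ERR => -1 (Transport endpoint is not connected)",
    "[2012-04-03 15:10:10.167275] W [client3_1-fops.c:822:client3_1_writev_cbk] 0-test2-client-0: remote operation failed: Transport endpoint is not connected",
    "[2012-04-03 15:10:10.167290] D [client3_1-fops.c:2767:client_fdctx_destroy] 0-test2-client-0: sending release on fd",
    "[2012-04-03 15:10:10.167349] W [client3_1-fops.c:822:client3_1_writev_cbk] 0-test2-client-0: remote operation failed: Transport endpoint is not connected",
]

counts = summarize(sample)
for func, count in counts.most_common():
    print(func, count)
```

The same function can be fed a whole log file, for example `summarize(open(path))`, where `path` is the location of the client log on the machine in question.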
Comment 1 Raghavendra G 2012-04-04 12:23:45 EDT
This is a known issue. As of now, a graph switch cannot be done seamlessly, at least for cases like replace-brick where no translator such as replicate provides high availability on the client side. What cannot be done seamlessly during a graph switch cannot be guaranteed by the socket cleanup code either. Hence closing this bug for now; it can be reopened when the requirement for such functionality arises.
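
Since the client does not hide the disconnect in this configuration, an application that must keep running across a replace-brick would have to tolerate ENOTCONN itself. The following is a minimal sketch of one way to do that, not a GlusterFS feature or a fix proposed in this bug: `retry_on_enotconn` and `flaky_open` are hypothetical names, and whether a retry actually succeeds depends on the client reconnecting, which this report does not confirm.

```python
import errno
import time


def retry_on_enotconn(op, attempts=5, delay=0.5):
    """Call op(); retry only when it fails with ENOTCONN (hypothetical helper)."""
    for attempt in range(1, attempts + 1):
        try:
            return op()
        except OSError as exc:
            # Re-raise anything that is not ENOTCONN, or the final failure.
            if exc.errno != errno.ENOTCONN or attempt == attempts:
                raise
            time.sleep(delay)


# Stand-in for a file operation on the mount: fails twice, then succeeds.
calls = {"n": 0}


def flaky_open():
    calls["n"] += 1
    if calls["n"] < 3:
        raise OSError(errno.ENOTCONN, "Transport endpoint is not connected")
    return "opened"


result = retry_on_enotconn(flaky_open, attempts=5, delay=0)
print(result, calls["n"])
```

Note that file handles opened before the disconnect may not be usable afterwards, so a real caller would likely need to reopen files inside `op` rather than reuse an existing descriptor.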
