Bug 764823 (GLUSTER-3091)

Summary: rebalance fails with "transport endpoint not connected" in 3.2.1 rdma set-up
Product: [Community] GlusterFS
Reporter: M S Vishwanath Bhat <vbhat>
Component: rdma
Assignee: Raghavendra G <raghavendra>
Status: CLOSED CURRENTRELEASE
Severity: medium
Priority: medium
Version: 3.2.1
CC: gluster-bugs, mzywusko
Hardware: x86_64
OS: Linux
Doc Type: Bug Fix

Description M S Vishwanath Bhat 2011-06-27 13:25:37 UTC
I created a 2-node distribute volume with the rdma transport type, then added one more brick to the volume. I mounted the volume and pushed some data into it. When I then run 'rebalance', it reports "rebalance failed". The same thing happens with the rebalance 'fix-layout' and 'migrate-data' options.
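
For reference, the reproduction steps can be written out as the sequence of gluster CLI invocations they correspond to. This is only a sketch: the volume name, host names and brick paths are hypothetical placeholders (the report does not give them), the helper name reproduction_commands is made up for illustration, and the point at which the volume is started relative to add-brick is an assumption. The function only builds the command lists; it does not run anything.

```python
def reproduction_commands(volume, bricks, extra_brick):
    """Return the gluster CLI invocations for the reported scenario, in order.

    All arguments are placeholders chosen by the caller; nothing is executed.
    """
    return [
        # Distribute volume over rdma with the initial bricks.
        ["gluster", "volume", "create", volume, "transport", "rdma", *bricks],
        # Assumption: the volume is started before the extra brick is added.
        ["gluster", "volume", "start", volume],
        ["gluster", "volume", "add-brick", volume, extra_brick],
        # At this point the volume is mounted on a client and data is
        # written to it (not shown; mount details are not in the report).
        ["gluster", "volume", "rebalance", volume, "start"],
        ["gluster", "volume", "rebalance", volume, "fix-layout", "start"],
        ["gluster", "volume", "rebalance", volume, "migrate-data", "start"],
    ]


if __name__ == "__main__":
    # Hypothetical names, for illustration only.
    for cmd in reproduction_commands(
        "testvol",
        ["server1:/export/brick1", "server2:/export/brick2"],
        "server1:/export/brick3",
    ):
        print(" ".join(cmd))
```

In the reported set-up, each of the three rebalance invocations ended with "rebalance failed".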

The glusterd log says "Transport endpoint is not connected":

[2011-06-27 12:49:17.60034] W [socket.c:1494:__socket_proto_state_machine] 0-socket.management: reading from socket failed. Error (Transport endpoint is not connected), peer (127.0.0.1:1013)
[2011-06-27 12:49:17.60055] D [socket.c:1768:socket_event_handler] 0-transport: disconnecting now
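
When scanning a larger glusterd log for this failure, it can help to split each entry into its fields. The following is a minimal sketch that assumes every entry has the shape seen in the excerpt above: a bracketed timestamp, a one-letter level, a bracketed file:line:function triple, a domain followed by a colon, and a free-form message. The field names and the helper parse_log are illustrative, not part of GlusterFS. The parser also tolerates an entry that has been wrapped onto a continuation line.

```python
import re

# Layout inferred from the two entries quoted in this report.
LOG_ENTRY = re.compile(
    r"^\[(?P<timestamp>[^\]]+)\]\s+"
    r"(?P<level>[A-Z])\s+"
    r"\[(?P<file>[^:\]]+):(?P<line>\d+):(?P<function>[^\]]+)\]\s+"
    r"(?P<domain>\S+):\s+"
    r"(?P<message>.*)$"
)

SAMPLE = (
    "[2011-06-27 12:49:17.60034] W [socket.c:1494:__socket_proto_state_machine] "
    "0-socket.management: reading from socket\n"
    " failed. Error (Transport endpoint is not connected), peer (127.0.0.1:1013)\n"
    "[2011-06-27 12:49:17.60055] D [socket.c:1768:socket_event_handler] "
    "0-transport: disconnecting now\n"
)


def parse_log(text):
    """Return one dict per log entry; unparseable entries are skipped."""
    records = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith("[") or not records:
            records.append(line)
        else:
            # Continuation of a wrapped entry: glue it to the previous one.
            records[-1] += " " + line

    entries = []
    for record in records:
        match = LOG_ENTRY.match(record)
        if match:
            entry = match.groupdict()
            entry["line"] = int(entry["line"])
            entries.append(entry)
    return entries


if __name__ == "__main__":
    for entry in parse_log(SAMPLE):
        if "Transport endpoint is not connected" in entry["message"]:
            print(entry["timestamp"], entry["domain"], entry["message"])
```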

Comment 1 Anand Avati 2011-07-01 05:26:08 UTC
PATCH: http://patches.gluster.com/patch/7697 in master (mount/fuse: wait till CHILD_UP event is recieved to do the first lookup.)

Comment 2 Anand Avati 2011-07-01 05:26:15 UTC
PATCH: http://patches.gluster.com/patch/7698 in master (rpc-transport/rdma: call ibv_fork_init to make rdma work with fork.)

Comment 3 Anand Avati 2011-07-01 05:26:26 UTC
PATCH: http://patches.gluster.com/patch/7706 in master (mnt/fuse: Do a pthread_cond_broadcast for both CHILD_UP and CHILD_DOWN events.)
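
Going only by their titles, the two mount/fuse patches (7697 and 7706) describe a common synchronization pattern: the first lookup waits on a condition variable, and the notification path wakes all waiters for both CHILD_UP and CHILD_DOWN so that the waiter cannot block forever when the first event is a CHILD_DOWN. The sketch below illustrates that pattern with Python's threading.Condition, whose notify_all plays the role of pthread_cond_broadcast. It is an analogy under those assumptions, not the actual fuse code; the class MountState and its attribute names are hypothetical.

```python
import threading


class MountState:
    """Hypothetical stand-in for the state shared by notifier and waiter."""

    def __init__(self):
        self._cond = threading.Condition()
        self.event_received = False  # set by the first CHILD_UP or CHILD_DOWN
        self.child_up = False        # True only if the latest event was CHILD_UP

    def notify(self, event):
        """Called by the notification path for each event."""
        if event not in ("CHILD_UP", "CHILD_DOWN"):
            return
        with self._cond:
            self.event_received = True
            self.child_up = event == "CHILD_UP"
            # Wake every waiter for both events, not only for CHILD_UP.
            self._cond.notify_all()

    def wait_for_first_event(self, timeout=None):
        """Block until an event arrives; return True if the child is up."""
        with self._cond:
            self._cond.wait_for(lambda: self.event_received, timeout=timeout)
            return self.child_up


if __name__ == "__main__":
    state = MountState()
    # Deliver CHILD_DOWN shortly after the waiter starts waiting.
    threading.Timer(0.1, state.notify, args=("CHILD_DOWN",)).start()
    if state.wait_for_first_event(timeout=5):
        print("child is up, first lookup can proceed")
    else:
        print("child is down, first lookup would fail instead of hanging")
```

If notify woke waiters only on CHILD_UP, the waiter in this usage example would sit until the timeout expired.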

Comment 4 Anand Avati 2011-07-01 05:26:46 UTC
PATCH: http://patches.gluster.com/patch/7702 in release-3.1 (rpc-transport/rdma: call ibv_fork_init to make rdma work with fork.)

Comment 5 Anand Avati 2011-07-01 05:26:52 UTC
PATCH: http://patches.gluster.com/patch/7703 in release-3.1 (mount/fuse: wait till CHILD_UP event is recieved to do the first lookup.)

Comment 6 Anand Avati 2011-07-01 05:26:58 UTC
PATCH: http://patches.gluster.com/patch/7704 in release-3.1 (mnt/fuse: Do a pthread_cond_broadcast for both CHILD_UP and CHILD_DOWN events.)

Comment 7 Anand Avati 2011-07-01 05:27:17 UTC
PATCH: http://patches.gluster.com/patch/7695 in release-3.2 (rpc-transport/rdma: call ibv_fork_init to make rdma work with fork.)

Comment 8 Anand Avati 2011-07-01 05:27:23 UTC
PATCH: http://patches.gluster.com/patch/7696 in release-3.2 (mount/fuse: wait till CHILD_UP event is recieved to do the first lookup.)

Comment 9 Anand Avati 2011-07-01 05:27:30 UTC
PATCH: http://patches.gluster.com/patch/7707 in release-3.2 (mnt/fuse: Do a pthread_cond_broadcast for both CHILD_UP and CHILD_DOWN events.)

Comment 10 Vijay Bellur 2011-07-07 10:04:01 UTC
PATCH: http://patches.gluster.com/patch/7772 in release-3.2 (client-handshake: skip CHILD_DOWN notifications when client is querying port using different volume names in the presence of rdma.)

Comment 11 Anand Avati 2011-07-12 04:07:06 UTC
PATCH: http://patches.gluster.com/patch/7774 in master (client-handshake: skip CHILD_DOWN notifications when client is querying port using different volume names in the presence of rdma.)

Comment 12 Anand Avati 2011-07-12 04:07:14 UTC
PATCH: http://patches.gluster.com/patch/7773 in release-3.1 (client-handshake: skip CHILD_DOWN notifications when client is querying port using different volume names in the presence of rdma.)

Comment 13 M S Vishwanath Bhat 2011-07-14 07:04:49 UTC
This issue is now fixed in 3.2.2qa8. Checked with rebalance start and also with fix-layout and migrate-data. Both options are working fine.

Comment 14 M S Vishwanath Bhat 2011-08-05 04:27:24 UTC
Tested on 3.1.6qa3 and it works fine. Rebalance succeeds, and 'fix-layout' and 'migrate-data' succeed as well.