Description of problem:
Suppose replace-brick is started on a volume. After the replace-brick has finished (the replace-brick status command reports that migration is complete), kill all the gluster processes (glusterfsd, glusterfs, glusterd), or reboot the machine. Then start the gluster processes again by starting glusterd. If a replace-brick abort or replace-brick commit command is now issued, it blocks for 2 minutes and then returns without any output.

Version-Release number of selected component (if applicable):

How reproducible:
Always

Steps to Reproduce:
1. Start replace-brick on a running volume and wait until the migration is complete.
2. Kill all the gluster processes (glusterfsd, glusterfs, glusterd), or reboot the machine.
3. Restart the gluster processes and issue replace-brick commit/abort.

Actual results:
The replace-brick abort/commit commands block for 2 minutes and then return without any output.

Expected results:
Replace-brick commands should work properly.

Additional info:
5: option transport.socket.keepalive-time 10
6: option transport.socket.keepalive-interval 2
7: option transport.socket.read-fail-log off
8: end-volume
+------------------------------------------------------------------------------+
[2012-05-03 14:19:20.284691] I [socket.c:1807:socket_event_handler] 0-transport: disconnecting now
[2012-05-03 14:19:20.284774] I [socket.c:1807:socket_event_handler] 0-transport: disconnecting now
[2012-05-03 14:19:20.284816] I [socket.c:1807:socket_event_handler] 0-transport: disconnecting now
[2012-05-03 14:19:20.311447] I [glusterd-handler.c:860:glusterd_handle_cli_get_volume] 0-glusterd: Received get vol req
[2012-05-03 14:19:20.313057] I [glusterd-pmap.c:238:pmap_registry_bind] 0-pmap: adding brick /mnt/sda6/export4 on port 24017
[2012-05-03 14:19:20.313672] I [glusterd-handler.c:860:glusterd_handle_cli_get_volume] 0-glusterd: Received get vol req
[2012-05-03 14:19:20.317597] I [glusterd-handler.c:860:glusterd_handle_cli_get_volume] 0-glusterd: Received get vol req
[2012-05-03 14:19:20.318349] I [glusterd-handler.c:860:glusterd_handle_cli_get_volume] 0-glusterd: Received get vol req
[2012-05-03 14:19:20.336120] I [glusterd-handshake.c:255:server_event_notify] 0-: recieved defrag status updated
[2012-05-03 14:19:20.341376] W [socket.c:1521:__socket_proto_state_machine] 0-management: reading from socket failed. Error (Transport endpoint is not connected), peer (/etc/glusterd/vols/mirror/rebalance/c759c363-3988-4a75-aa41-6eee929e825c.sock)
[2012-05-03 14:19:20.374055] I [mem-pool.c:585:mem_pool_destroy] 0-management: size=2236 max=0 total=0
[2012-05-03 14:19:20.374084] I [mem-pool.c:585:mem_pool_destroy] 0-management: size=124 max=0 total=0
[2012-05-03 14:19:20.818158] I [glusterd-pmap.c:238:pmap_registry_bind] 0-pmap: adding brick /mnt/sda10/export3 on port 24015
[2012-05-03 14:21:18.258088] I [glusterd-replace-brick.c:98:glusterd_handle_replace_brick] 0-glusterd: Received replace brick req
[2012-05-03 14:21:18.258156] I [glusterd-replace-brick.c:147:glusterd_handle_replace_brick] 0-glusterd: Received replace brick status request
[2012-05-03 14:21:18.283347] I [glusterd-utils.c:283:glusterd_lock] 0-glusterd: Cluster lock held by c759c363-3988-4a75-aa41-6eee929e825c
[2012-05-03 14:21:18.283477] I [glusterd-handler.c:458:glusterd_op_txn_begin] 0-management: Acquired local lock
[2012-05-03 14:21:18.283844] I [glusterd-utils.c:855:glusterd_volume_brickinfo_get_by_brick] 0-: brick: hyperspace:/mnt/sda6/export4
[2012-05-03 14:21:18.285059] I [glusterd-utils.c:812:glusterd_volume_brickinfo_get] 0-management: Found brick
[2012-05-03 14:21:18.286375] I [glusterd-op-sm.c:2039:glusterd_op_ac_send_stage_op] 0-glusterd: Sent op req to 0 peers
[2012-05-03 14:21:18.286474] I [glusterd-utils.c:855:glusterd_volume_brickinfo_get_by_brick] 0-: brick: hyperspace:/mnt/sda6/export4
[2012-05-03 14:21:18.286743] I [glusterd-utils.c:812:glusterd_volume_brickinfo_get] 0-management: Found brick
[2012-05-03 14:21:18.286989] I [glusterd-replace-brick.c:1229:rb_update_srcbrick_port] 0-: adding src-brick port no
[2012-05-03 14:21:18.287061] I [glusterd-replace-brick.c:1286:rb_update_dstbrick_port] 0-: adding dst-brick port no
[2012-05-03 14:21:18.453493] I [glusterd-op-sm.c:2358:glusterd_op_ac_send_commit_op] 0-management: Sent op req to 0 peers
[2012-05-03 14:21:18.453577] I [glusterd-op-sm.c:2627:glusterd_op_txn_complete] 0-glusterd: Cleared local lock
[2012-05-03 14:21:20.754563] I [glusterd-replace-brick.c:98:glusterd_handle_replace_brick] 0-glusterd: Received replace brick req
[2012-05-03 14:21:20.754614] I [glusterd-replace-brick.c:147:glusterd_handle_replace_brick] 0-glusterd: Received replace brick abort request
[2012-05-03 14:21:20.754647] I [glusterd-utils.c:283:glusterd_lock] 0-glusterd: Cluster lock held by c759c363-3988-4a75-aa41-6eee929e825c
[2012-05-03 14:21:20.754673] I [glusterd-handler.c:458:glusterd_op_txn_begin] 0-management: Acquired local lock
[2012-05-03 14:21:20.754748] I [glusterd-utils.c:855:glusterd_volume_brickinfo_get_by_brick] 0-: brick: hyperspace:/mnt/sda6/export4
[2012-05-03 14:21:20.755016] I [glusterd-utils.c:812:glusterd_volume_brickinfo_get] 0-management: Found brick
[2012-05-03 14:21:20.755711] I [glusterd-op-sm.c:2039:glusterd_op_ac_send_stage_op] 0-glusterd: Sent op req to 0 peers
[2012-05-03 14:21:20.755754] I [glusterd-utils.c:855:glusterd_volume_brickinfo_get_by_brick] 0-: brick: hyperspace:/mnt/sda6/export4
[2012-05-03 14:21:20.755962] I [glusterd-utils.c:812:glusterd_volume_brickinfo_get] 0-management: Found brick
[2012-05-03 14:21:20.756195] I [glusterd-replace-brick.c:1229:rb_update_srcbrick_port] 0-: adding src-brick port no
[2012-05-03 14:21:20.756260] I [glusterd-replace-brick.c:1286:rb_update_dstbrick_port] 0-: adding dst-brick port no
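A note on reading the log: the status request at 14:21:18 runs from "Acquired local lock" to "Cleared local lock", while the abort request at 14:21:20 acquires the lock and the excerpt then ends without a matching "Cleared local lock" entry. That would be consistent with the command hanging, although the excerpt may simply stop too early to show it. The following is a minimal sketch, not part of glusterd, for spotting such unfinished transactions in a glusterd log. It assumes one log entry per line in the format shown above; the function name `unfinished_transactions` and the `SAMPLE` list are hypothetical, and the sample entries are copied from the log in this report.

```python
import re

# Timestamp, level, source location and message of one glusterd log entry,
# in the format seen in this report (an assumption for other versions).
ENTRY = re.compile(
    r"\[(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+)\] "
    r"(?P<level>[A-Z]) \[(?P<src>[^\]]+)\] (?P<msg>.*)"
)


def unfinished_transactions(lines):
    """Return (lock timestamp, request) pairs that never reach 'Cleared local lock'."""
    pending = None        # lock acquisition still waiting for its release
    last_request = None   # most recent "Received replace brick ... request" message
    unfinished = []
    for line in lines:
        match = ENTRY.match(line.strip())
        if not match:
            continue  # skip volfile dump lines, separators, and so on
        msg = match.group("msg")
        if "Received replace brick" in msg and "request" in msg:
            last_request = msg.split(": ", 1)[-1]
        elif "Acquired local lock" in msg:
            if pending is not None:
                unfinished.append(pending)
            pending = (match.group("ts"), last_request)
        elif "Cleared local lock" in msg:
            pending = None
    if pending is not None:
        unfinished.append(pending)
    return unfinished


# Entries copied from the log in this report.
SAMPLE = [
    "[2012-05-03 14:21:18.258156] I [glusterd-replace-brick.c:147:glusterd_handle_replace_brick] 0-glusterd: Received replace brick status request",
    "[2012-05-03 14:21:18.283477] I [glusterd-handler.c:458:glusterd_op_txn_begin] 0-management: Acquired local lock",
    "[2012-05-03 14:21:18.453577] I [glusterd-op-sm.c:2627:glusterd_op_txn_complete] 0-glusterd: Cleared local lock",
    "[2012-05-03 14:21:20.754614] I [glusterd-replace-brick.c:147:glusterd_handle_replace_brick] 0-glusterd: Received replace brick abort request",
    "[2012-05-03 14:21:20.754673] I [glusterd-handler.c:458:glusterd_op_txn_begin] 0-management: Acquired local lock",
]

if __name__ == "__main__":
    for ts, request in unfinished_transactions(SAMPLE):
        print(f"{ts}: no 'Cleared local lock' after: {request}")
```

On the sample entries this flags only the abort request; the status request is not reported because its lock is cleared.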
patch sent @ http://review.gluster.com/3264
*** This bug has been marked as a duplicate of bug 816915 ***