Bug 818519 - [170a3a411c88f6ce1662c55440a372f512e901d1]: replace-brick commands (abort/commit) fail if all the gluster processes are killed and restarted
Keywords:
Status: CLOSED DUPLICATE of bug 816915
Alias: None
Product: GlusterFS
Classification: Community
Component: glusterd
Version: mainline
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: unspecified
Target Milestone: ---
Assignee: krishnan parthasarathi
QA Contact:
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2012-05-03 09:05 UTC by Raghavendra Bhat
Modified: 2015-11-03 23:04 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2012-07-11 07:11:18 UTC
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Embargoed:



Description Raghavendra Bhat 2012-05-03 09:05:57 UTC
Description of problem:

Suppose a replace-brick operation is started on a volume. After it finishes (the replace-brick status command reports that migration is complete), kill all the gluster processes (glusterfsd, glusterfs, and glusterd), or reboot the machine.

Now start the gluster processes again by starting glusterd.

If a replace-brick abort or replace-brick commit command is then issued, it blocks for 2 minutes and then returns without printing any output.

Version-Release number of selected component (if applicable):


How reproducible:

Always

Steps to Reproduce:
1. Start replace-brick on a running volume and wait until the migration is complete.
2. Kill all the gluster processes (glusterfsd, glusterfs, glusterd), or reboot the machine.
3. Restart the gluster processes and issue replace-brick commit or abort.
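
The steps above can be sketched as a small Python helper that builds the command lines in order. This is only an illustration: the function names, the volume and brick arguments, and the 180-second timeout are assumptions, not taken from the report. The `gluster volume replace-brick VOLNAME SRC DST start|status|commit|abort` form is the CLI syntax of this release series.

```python
import subprocess


def repro_commands(volume, src_brick, dst_brick):
    """Return the argv lists for the reproduction steps, in order.

    volume, src_brick and dst_brick are placeholders supplied by the caller.
    """
    rb = ["gluster", "volume", "replace-brick", volume, src_brick, dst_brick]
    return [
        rb + ["start"],
        rb + ["status"],  # repeat until it reports that migration is complete
        ["killall", "glusterfsd", "glusterfs", "glusterd"],
        ["glusterd"],     # restart the management daemon
        rb + ["abort"],   # or "commit"; this is the call reported to block
    ]


def run_step(argv, timeout=180):
    """Run one step. The timeout (an assumed value, longer than the
    2 minutes in the report) turns a longer hang into TimeoutExpired."""
    return subprocess.run(argv, capture_output=True, text=True, timeout=timeout)
```

Running the steps needs root on a test machine with a started volume; an empty `stdout` from the final step after roughly two minutes would match the behaviour described here.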
  
Actual results:

The replace-brick abort/commit commands block for 2 minutes and then return without any output.

Expected results:

Replace-brick commands should work properly.

Additional info:


  5:     option transport.socket.keepalive-time 10
  6:     option transport.socket.keepalive-interval 2
  7:     option transport.socket.read-fail-log off
  8: end-volume

+------------------------------------------------------------------------------+
[2012-05-03 14:19:20.284691] I [socket.c:1807:socket_event_handler] 0-transport: disconnecting now
[2012-05-03 14:19:20.284774] I [socket.c:1807:socket_event_handler] 0-transport: disconnecting now
[2012-05-03 14:19:20.284816] I [socket.c:1807:socket_event_handler] 0-transport: disconnecting now
[2012-05-03 14:19:20.311447] I [glusterd-handler.c:860:glusterd_handle_cli_get_volume] 0-glusterd: Received get vol req
[2012-05-03 14:19:20.313057] I [glusterd-pmap.c:238:pmap_registry_bind] 0-pmap: adding brick /mnt/sda6/export4 on port 24017
[2012-05-03 14:19:20.313672] I [glusterd-handler.c:860:glusterd_handle_cli_get_volume] 0-glusterd: Received get vol req
[2012-05-03 14:19:20.317597] I [glusterd-handler.c:860:glusterd_handle_cli_get_volume] 0-glusterd: Received get vol req
[2012-05-03 14:19:20.318349] I [glusterd-handler.c:860:glusterd_handle_cli_get_volume] 0-glusterd: Received get vol req
[2012-05-03 14:19:20.336120] I [glusterd-handshake.c:255:server_event_notify] 0-: recieved defrag status updated
[2012-05-03 14:19:20.341376] W [socket.c:1521:__socket_proto_state_machine] 0-management: reading from socket failed. Error (Transport endpoint is not connected), peer (/etc/glusterd/vols/mirror/rebalance/c759c363-3988-4a75-aa41-6eee929e825c.sock)
[2012-05-03 14:19:20.374055] I [mem-pool.c:585:mem_pool_destroy] 0-management: size=2236 max=0 total=0
[2012-05-03 14:19:20.374084] I [mem-pool.c:585:mem_pool_destroy] 0-management: size=124 max=0 total=0
[2012-05-03 14:19:20.818158] I [glusterd-pmap.c:238:pmap_registry_bind] 0-pmap: adding brick /mnt/sda10/export3 on port 24015
[2012-05-03 14:21:18.258088] I [glusterd-replace-brick.c:98:glusterd_handle_replace_brick] 0-glusterd: Received replace brick req
[2012-05-03 14:21:18.258156] I [glusterd-replace-brick.c:147:glusterd_handle_replace_brick] 0-glusterd: Received replace brick status request
[2012-05-03 14:21:18.283347] I [glusterd-utils.c:283:glusterd_lock] 0-glusterd: Cluster lock held by c759c363-3988-4a75-aa41-6eee929e825c
[2012-05-03 14:21:18.283477] I [glusterd-handler.c:458:glusterd_op_txn_begin] 0-management: Acquired local lock
[2012-05-03 14:21:18.283844] I [glusterd-utils.c:855:glusterd_volume_brickinfo_get_by_brick] 0-: brick: hyperspace:/mnt/sda6/export4
[2012-05-03 14:21:18.285059] I [glusterd-utils.c:812:glusterd_volume_brickinfo_get] 0-management: Found brick
[2012-05-03 14:21:18.286375] I [glusterd-op-sm.c:2039:glusterd_op_ac_send_stage_op] 0-glusterd: Sent op req to 0 peers
[2012-05-03 14:21:18.286474] I [glusterd-utils.c:855:glusterd_volume_brickinfo_get_by_brick] 0-: brick: hyperspace:/mnt/sda6/export4
[2012-05-03 14:21:18.286743] I [glusterd-utils.c:812:glusterd_volume_brickinfo_get] 0-management: Found brick
[2012-05-03 14:21:18.286989] I [glusterd-replace-brick.c:1229:rb_update_srcbrick_port] 0-: adding src-brick port no
[2012-05-03 14:21:18.287061] I [glusterd-replace-brick.c:1286:rb_update_dstbrick_port] 0-: adding dst-brick port no
[2012-05-03 14:21:18.453493] I [glusterd-op-sm.c:2358:glusterd_op_ac_send_commit_op] 0-management: Sent op req to 0 peers
[2012-05-03 14:21:18.453577] I [glusterd-op-sm.c:2627:glusterd_op_txn_complete] 0-glusterd: Cleared local lock
[2012-05-03 14:21:20.754563] I [glusterd-replace-brick.c:98:glusterd_handle_replace_brick] 0-glusterd: Received replace brick req
[2012-05-03 14:21:20.754614] I [glusterd-replace-brick.c:147:glusterd_handle_replace_brick] 0-glusterd: Received replace brick abort request
[2012-05-03 14:21:20.754647] I [glusterd-utils.c:283:glusterd_lock] 0-glusterd: Cluster lock held by c759c363-3988-4a75-aa41-6eee929e825c
[2012-05-03 14:21:20.754673] I [glusterd-handler.c:458:glusterd_op_txn_begin] 0-management: Acquired local lock
[2012-05-03 14:21:20.754748] I [glusterd-utils.c:855:glusterd_volume_brickinfo_get_by_brick] 0-: brick: hyperspace:/mnt/sda6/export4
[2012-05-03 14:21:20.755016] I [glusterd-utils.c:812:glusterd_volume_brickinfo_get] 0-management: Found brick
[2012-05-03 14:21:20.755711] I [glusterd-op-sm.c:2039:glusterd_op_ac_send_stage_op] 0-glusterd: Sent op req to 0 peers
[2012-05-03 14:21:20.755754] I [glusterd-utils.c:855:glusterd_volume_brickinfo_get_by_brick] 0-: brick: hyperspace:/mnt/sda6/export4
[2012-05-03 14:21:20.755962] I [glusterd-utils.c:812:glusterd_volume_brickinfo_get] 0-management: Found brick
[2012-05-03 14:21:20.756195] I [glusterd-replace-brick.c:1229:rb_update_srcbrick_port] 0-: adding src-brick port no
[2012-05-03 14:21:20.756260] I [glusterd-replace-brick.c:1286:rb_update_dstbrick_port] 0-: adding dst-brick port no
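
In the log above, the status request at 14:21:18 logs "Acquired local lock" and later "Cleared local lock", while the abort request at 14:21:20 logs only the acquire before the excerpt ends. A rough Python sketch for spotting such unmatched acquires in a glusterd log follows. The function name is hypothetical, and because the excerpt is cut off, a missing "Cleared local lock" line is a hint rather than proof that the transaction never completed.

```python
import re

# Timestamp at the start of a glusterd log line, e.g. [2012-05-03 14:21:20.754673]
TIMESTAMP = re.compile(r"^\[([^\]]+)\]")


def unreleased_locks(lines):
    """Return timestamps of 'Acquired local lock' lines that are not
    followed by a 'Cleared local lock' line before the next acquire
    or the end of the input."""
    stuck = []
    pending = None  # timestamp of the most recent unmatched acquire
    for line in lines:
        match = TIMESTAMP.match(line)
        if not match:
            continue  # skip volfile dump lines, separators, blank lines
        if "Acquired local lock" in line:
            if pending is not None:
                stuck.append(pending)
            pending = match.group(1)
        elif "Cleared local lock" in line:
            pending = None
    if pending is not None:
        stuck.append(pending)
    return stuck
```

Called on the lines of this log, it would flag the acquire logged for the abort request and not the one for the status request.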

Comment 1 Amar Tumballi 2012-07-11 06:22:21 UTC
patch sent @ http://review.gluster.com/3264

Comment 2 krishnan parthasarathi 2012-07-11 07:11:18 UTC

*** This bug has been marked as a duplicate of bug 816915 ***

