Bug 818519 - [170a3a411c88f6ce1662c55440a372f512e901d1]: replace-brick commands (abort/commit) fail if all the gluster processes are killed and restarted
Status: CLOSED DUPLICATE of bug 816915
Product: GlusterFS
Classification: Community
Component: glusterd
Version: mainline
Hardware: Unspecified Unspecified
Priority: medium, Severity: unspecified
Assigned To: krishnan parthasarathi
Depends On:
Blocks:
Reported: 2012-05-03 05:05 EDT by Raghavendra Bhat
Modified: 2015-11-03 18:04 EST

See Also:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2012-07-11 03:11:18 EDT
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


Description Raghavendra Bhat 2012-05-03 05:05:57 EDT
Description of problem:

Start a replace-brick operation on a volume. After it finishes (the replace-brick status command reports that migration is complete), kill all the gluster processes (glusterfsd, glusterfs and glusterd), or reboot the machine.

Now start the gluster processes again by starting glusterd.

If a replace-brick abort or replace-brick commit command is now issued, it blocks for 2 minutes and then returns without printing any output.

Version-Release number of selected component (if applicable):


How reproducible:

Always

Steps to Reproduce:
1. Start replace-brick on a running volume and wait until the migration is complete.
2. Kill all the gluster processes (glusterfsd, glusterfs, glusterd), for example with killall, or reboot the machine.
3. Restart the gluster processes and issue replace-brick commit or abort.
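
The steps above can be sketched as a small Python helper that builds the command sequence. This is only an illustration, not the reporter's actual script: the volume name and brick paths are hypothetical placeholders, the replace-brick syntax assumed is the GlusterFS 3.x form (gluster volume replace-brick VOLNAME SRC-BRICK DST-BRICK start|status|abort|commit), and by default the commands are printed rather than executed.

```python
import subprocess


def repro_commands(volume, src_brick, dst_brick, final_op="abort"):
    """Return the argv lists for the reproduction steps, in order."""
    if final_op not in ("abort", "commit"):
        raise ValueError("final_op must be 'abort' or 'commit'")
    # Common prefix of every replace-brick invocation (assumed 3.x CLI syntax).
    rb = ["gluster", "volume", "replace-brick", volume, src_brick, dst_brick]
    return [
        rb + ["start"],
        rb + ["status"],  # repeat by hand until migration is reported complete
        ["killall", "glusterfsd", "glusterfs", "glusterd"],
        ["glusterd"],     # restart the management daemon
        rb + [final_op],  # the step that blocks for 2 minutes in this bug
    ]


def run(commands, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for argv in commands:
        print(" ".join(argv))
        if not dry_run:
            subprocess.run(argv, check=False)


if __name__ == "__main__":
    # Hypothetical volume and brick names, for illustration only.
    run(repro_commands("testvol", "host1:/bricks/old", "host1:/bricks/new"))
```

Running it with dry_run=False on a test machine would need root privileges and a volume whose replace-brick migration has already completed before the kill step.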
  
Actual results:

The replace-brick abort/commit commands block for 2 minutes and then return without any output.

Expected results:

The replace-brick abort and commit commands should work properly after the restart.

Additional info:


  5:     option transport.socket.keepalive-time 10
  6:     option transport.socket.keepalive-interval 2
  7:     option transport.socket.read-fail-log off
  8: end-volume

+------------------------------------------------------------------------------+
[2012-05-03 14:19:20.284691] I [socket.c:1807:socket_event_handler] 0-transport: disconnecting now
[2012-05-03 14:19:20.284774] I [socket.c:1807:socket_event_handler] 0-transport: disconnecting now
[2012-05-03 14:19:20.284816] I [socket.c:1807:socket_event_handler] 0-transport: disconnecting now
[2012-05-03 14:19:20.311447] I [glusterd-handler.c:860:glusterd_handle_cli_get_volume] 0-glusterd: Received get vol req
[2012-05-03 14:19:20.313057] I [glusterd-pmap.c:238:pmap_registry_bind] 0-pmap: adding brick /mnt/sda6/export4 on port 24017
[2012-05-03 14:19:20.313672] I [glusterd-handler.c:860:glusterd_handle_cli_get_volume] 0-glusterd: Received get vol req
[2012-05-03 14:19:20.317597] I [glusterd-handler.c:860:glusterd_handle_cli_get_volume] 0-glusterd: Received get vol req
[2012-05-03 14:19:20.318349] I [glusterd-handler.c:860:glusterd_handle_cli_get_volume] 0-glusterd: Received get vol req
[2012-05-03 14:19:20.336120] I [glusterd-handshake.c:255:server_event_notify] 0-: recieved defrag status updated
[2012-05-03 14:19:20.341376] W [socket.c:1521:__socket_proto_state_machine] 0-management: reading from socket failed. Error (Transport endpoint is not connected), peer (/etc/glusterd/vols/mirror/rebalance/c759c363-3988-4a75-aa41-6eee929e825c.sock)
[2012-05-03 14:19:20.374055] I [mem-pool.c:585:mem_pool_destroy] 0-management: size=2236 max=0 total=0
[2012-05-03 14:19:20.374084] I [mem-pool.c:585:mem_pool_destroy] 0-management: size=124 max=0 total=0
[2012-05-03 14:19:20.818158] I [glusterd-pmap.c:238:pmap_registry_bind] 0-pmap: adding brick /mnt/sda10/export3 on port 24015
[2012-05-03 14:21:18.258088] I [glusterd-replace-brick.c:98:glusterd_handle_replace_brick] 0-glusterd: Received replace brick req
[2012-05-03 14:21:18.258156] I [glusterd-replace-brick.c:147:glusterd_handle_replace_brick] 0-glusterd: Received replace brick status request
[2012-05-03 14:21:18.283347] I [glusterd-utils.c:283:glusterd_lock] 0-glusterd: Cluster lock held by c759c363-3988-4a75-aa41-6eee929e825c
[2012-05-03 14:21:18.283477] I [glusterd-handler.c:458:glusterd_op_txn_begin] 0-management: Acquired local lock
[2012-05-03 14:21:18.283844] I [glusterd-utils.c:855:glusterd_volume_brickinfo_get_by_brick] 0-: brick: hyperspace:/mnt/sda6/export4
[2012-05-03 14:21:18.285059] I [glusterd-utils.c:812:glusterd_volume_brickinfo_get] 0-management: Found brick
[2012-05-03 14:21:18.286375] I [glusterd-op-sm.c:2039:glusterd_op_ac_send_stage_op] 0-glusterd: Sent op req to 0 peers
[2012-05-03 14:21:18.286474] I [glusterd-utils.c:855:glusterd_volume_brickinfo_get_by_brick] 0-: brick: hyperspace:/mnt/sda6/export4
[2012-05-03 14:21:18.286743] I [glusterd-utils.c:812:glusterd_volume_brickinfo_get] 0-management: Found brick
[2012-05-03 14:21:18.286989] I [glusterd-replace-brick.c:1229:rb_update_srcbrick_port] 0-: adding src-brick port no
[2012-05-03 14:21:18.287061] I [glusterd-replace-brick.c:1286:rb_update_dstbrick_port] 0-: adding dst-brick port no
[2012-05-03 14:21:18.453493] I [glusterd-op-sm.c:2358:glusterd_op_ac_send_commit_op] 0-management: Sent op req to 0 peers
[2012-05-03 14:21:18.453577] I [glusterd-op-sm.c:2627:glusterd_op_txn_complete] 0-glusterd: Cleared local lock
[2012-05-03 14:21:20.754563] I [glusterd-replace-brick.c:98:glusterd_handle_replace_brick] 0-glusterd: Received replace brick req
[2012-05-03 14:21:20.754614] I [glusterd-replace-brick.c:147:glusterd_handle_replace_brick] 0-glusterd: Received replace brick abort request
[2012-05-03 14:21:20.754647] I [glusterd-utils.c:283:glusterd_lock] 0-glusterd: Cluster lock held by c759c363-3988-4a75-aa41-6eee929e825c
[2012-05-03 14:21:20.754673] I [glusterd-handler.c:458:glusterd_op_txn_begin] 0-management: Acquired local lock
[2012-05-03 14:21:20.754748] I [glusterd-utils.c:855:glusterd_volume_brickinfo_get_by_brick] 0-: brick: hyperspace:/mnt/sda6/export4
[2012-05-03 14:21:20.755016] I [glusterd-utils.c:812:glusterd_volume_brickinfo_get] 0-management: Found brick
[2012-05-03 14:21:20.755711] I [glusterd-op-sm.c:2039:glusterd_op_ac_send_stage_op] 0-glusterd: Sent op req to 0 peers
[2012-05-03 14:21:20.755754] I [glusterd-utils.c:855:glusterd_volume_brickinfo_get_by_brick] 0-: brick: hyperspace:/mnt/sda6/export4
[2012-05-03 14:21:20.755962] I [glusterd-utils.c:812:glusterd_volume_brickinfo_get] 0-management: Found brick
[2012-05-03 14:21:20.756195] I [glusterd-replace-brick.c:1229:rb_update_srcbrick_port] 0-: adding src-brick port no
[2012-05-03 14:21:20.756260] I [glusterd-replace-brick.c:1286:rb_update_dstbrick_port] 0-: adding dst-brick port no
(END)
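
In the log above, the status request acquires the local lock at 14:21:18.283477 and clears it at 14:21:18.453577, while the abort request acquires it at 14:21:20.754673 and no "Cleared local lock" line follows in the excerpt. A minimal Python sketch for spotting this pattern in a glusterd log is shown below. It assumes the line layout seen in this excerpt, and treating an unmatched "Acquired local lock" as an unfinished transaction is an interpretation rather than something the log states (the excerpt may simply be cut short).

```python
import re

# Matches lines such as:
# [2012-05-03 14:21:18.283477] I [glusterd-handler.c:458:glusterd_op_txn_begin] 0-management: Acquired local lock
LINE_RE = re.compile(r"^\[(?P<ts>[^\]]+)\] \w \[(?P<src>[^\]]+)\] (?P<msg>.*)$")


def unfinished_transactions(lines):
    """Return timestamps of 'Acquired local lock' lines with no later 'Cleared local lock'."""
    pending = None
    unfinished = []
    for line in lines:
        match = LINE_RE.match(line.strip())
        if not match:
            continue  # skip volfile dumps, separators, pager markers
        msg = match.group("msg")
        if "Acquired local lock" in msg:
            if pending is not None:
                # A new acquire arrived before the previous one was cleared.
                unfinished.append(pending)
            pending = match.group("ts")
        elif "Cleared local lock" in msg:
            pending = None
    if pending is not None:
        unfinished.append(pending)
    return unfinished


sample = [
    "[2012-05-03 14:21:18.283477] I [glusterd-handler.c:458:glusterd_op_txn_begin] 0-management: Acquired local lock",
    "[2012-05-03 14:21:18.453577] I [glusterd-op-sm.c:2627:glusterd_op_txn_complete] 0-glusterd: Cleared local lock",
    "[2012-05-03 14:21:20.754673] I [glusterd-handler.c:458:glusterd_op_txn_begin] 0-management: Acquired local lock",
]
print(unfinished_transactions(sample))  # ['2012-05-03 14:21:20.754673']
```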
Comment 1 Amar Tumballi 2012-07-11 02:22:21 EDT
patch sent @ http://review.gluster.com/3264
Comment 2 krishnan parthasarathi 2012-07-11 03:11:18 EDT

*** This bug has been marked as a duplicate of bug 816915 ***
