Bug 1162479 - replace-brick doesn't work fine.
Summary: replace-brick doesn't work fine.
Keywords:
Status: CLOSED EOL
Alias: None
Product: GlusterFS
Classification: Community
Component: cli
Version: 3.5.2
Hardware: x86_64
OS: Linux
high
urgent
Target Milestone: ---
Assignee: bugs@gluster.org
QA Contact:
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2014-11-11 07:20 UTC by wangqy
Modified: 2016-06-17 15:58 UTC

Fixed In Version:
Clone Of:
Environment:
Last Closed: 2016-06-17 15:58:07 UTC
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Embargoed:



Links
System ID Private Priority Status Summary Last Updated
CentOS 1 0 None None None Never

Description wangqy 2014-11-11 07:20:05 UTC
Description of problem:
I built a replica 2 volume and ran a replace-brick test, but it did not work.

Version-Release number of selected component (if applicable):
I prepared 3 virtual machines with CentOS 6.5 and gluster 3.4.2.
I also ran the same test with gluster 3.5.2 and got the same result.

How reproducible:
I ran this test several times and always got the same result.

Steps to Reproduce:
1. Build a replica volume, such as:

  Volume Name: testrep
  Type: Replicate
  Volume ID: 172e5c5c-fb94-4f8b-9ed4-b764b9c5d6cd
  Status: Started
  Number of Bricks: 1 x 2 = 2
  Transport-type: tcp
  Bricks:
  Brick1: data-node3:/brick/testrep
  Brick2: data-node4:/brick/testrep

2. Mount it on a client and write some data to it.

3. Run the replace-brick operation.
  Start the replace and trace its log with a command like this:
    # gluster volume replace-brick testrep data-node4:/brick/testrep data-node5:/brick/testrep --log-level=TRACE --log-file=/tmp/replacelog start
  
  The command takes a long time to return, and then gluster crashed.

  After that, I tried to get the volume info. That also takes a long time and reports no volumes, and the volume status command returns nothing:
    # gluster volume status
    (nothing is returned)
    # gluster volume info
    No volumes present
  
The replace-brick operation cannot continue.

Additional info:
Here is what I could find in the trace log:

[2014-11-11 07:19:01.558436] T [rpc-clnt.c:424:rpc_clnt_reconnect] 0-glusterfs: attempting reconnect
[2014-11-11 07:19:01.558543] T [socket.c:2675:socket_connect] (-->/lib64/libpthread.so.0(+0x79d1) [0x7fa5640539d1] (-->/usr/lib64/libglusterfs.so.0(gf_timer_proc+0xc8) [0x7fa565ccd368] (-->/usr/lib64/libgfrpc.so.0(rpc_clnt_reconnect+0x116) [0x7fa5654182e6]))) 0-glusterfs: connect () called on transport already connected
[2014-11-11 07:19:01.558553] T [rpc-clnt.c:424:rpc_clnt_reconnect] 0-glusterfs: attempting reconnect
[2014-11-11 07:19:01.558574] T [socket.c:2683:socket_connect] 0-glusterfs: connecting 0x1417480, state=0 gen=0 sock=-1
[2014-11-11 07:19:01.558595] W [dict.c:1055:data_to_str] (-->/usr/lib64/glusterfs/3.5.2/rpc-transport/socket.so(+0x68ec) [0x7fa5629ab8ec] (-->/usr/lib64/glusterfs/3.5.2/rpc-transport/socket.so(socket_client_get_remote_sockaddr+0xad) [0x7fa5629affcd] (-->/usr/lib64/glusterfs/3.5.2/rpc-transport/socket.so(client_fill_address_family+0x200) [0x7fa5629afe80]))) 0-dict: data is NULL
[2014-11-11 07:19:01.558609] W [dict.c:1055:data_to_str] (-->/usr/lib64/glusterfs/3.5.2/rpc-transport/socket.so(+0x68ec) [0x7fa5629ab8ec] (-->/usr/lib64/glusterfs/3.5.2/rpc-transport/socket.so(socket_client_get_remote_sockaddr+0xad) [0x7fa5629affcd] (-->/usr/lib64/glusterfs/3.5.2/rpc-transport/socket.so(client_fill_address_family+0x20b) [0x7fa5629afe8b]))) 0-dict: data is NULL
[2014-11-11 07:19:01.558615] E [name.c:147:client_fill_address_family] 0-glusterfs: transport.address-family not specified. Could not guess default value from (remote-host:(null) or transport.unix.connect-path:(null)) options

It retries every 3 seconds.
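
One way to check the retry interval is to pull the timestamps of the rpc_clnt_reconnect lines out of the trace log and look at the gaps between them. The following is a minimal sketch, assuming the "[timestamp] LEVEL [file:line:function]" layout seen in the excerpt above; the function names are illustrative and are not part of gluster. The sample holds three lines copied from the excerpt, which only covers two attempts logged within the same millisecond, so a longer log would be needed to confirm the 3-second figure.

```python
import re
from datetime import datetime

# Layout assumed from the excerpt: "[timestamp] LEVEL [file:line:function] ..."
LINE_RE = re.compile(
    r"^\[(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{6})\] "
    r"(?P<level>[A-Z]) "
    r"\[(?P<file>[^:\]]+):(?P<line>\d+):(?P<func>[^\]]+)\]"
)


def reconnect_times(lines):
    """Return the timestamp of every line logged by rpc_clnt_reconnect."""
    times = []
    for text in lines:
        m = LINE_RE.match(text)
        if m and m.group("func") == "rpc_clnt_reconnect":
            times.append(datetime.strptime(m.group("ts"), "%Y-%m-%d %H:%M:%S.%f"))
    return times


def gaps_in_seconds(times):
    """Return the time difference between each pair of consecutive timestamps."""
    return [(b - a).total_seconds() for a, b in zip(times, times[1:])]


# Three lines copied from the trace log excerpt (hypothetical variable name).
sample = [
    "[2014-11-11 07:19:01.558436] T [rpc-clnt.c:424:rpc_clnt_reconnect] 0-glusterfs: attempting reconnect",
    "[2014-11-11 07:19:01.558553] T [rpc-clnt.c:424:rpc_clnt_reconnect] 0-glusterfs: attempting reconnect",
    "[2014-11-11 07:19:01.558574] T [socket.c:2683:socket_connect] 0-glusterfs: connecting 0x1417480, state=0 gen=0 sock=-1",
]

if __name__ == "__main__":
    print(gaps_in_seconds(reconnect_times(sample)))
```

To run it against a full log, pass the open file object instead of the sample list, for example reconnect_times(open("/tmp/replacelog")).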

Finally it exits with 110:

[2014-11-11 07:19:04.527694] D [cli-cmd.c:381:cli_cmd_submit] 0-cli: Returning 110
[2014-11-11 07:19:04.527754] D [cli-rpc-ops.c:3531:gf_cli_replace_brick] 0-cli: Returning 110
[2014-11-11 07:19:04.527768] D [cli-cmd-volume.c:1578:cli_cmd_volume_replace_brick_cbk] 0-cli: frame->local is not NULL (0x7fa5540009c0)
[2014-11-11 07:19:04.527794] I [input.c:36:cli_batch] 0-: Exiting with: 110

I wonder if something is wrong with my environment.
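
On Linux, 110 is the errno value for ETIMEDOUT, which would be consistent with a CLI request that timed out. Whether the gluster CLI actually reuses errno values as its exit status is an assumption here, not something the log confirms. The sketch below hard-codes a few Linux errno values so the lookup gives the same answer on any platform; the table and function names are illustrative.

```python
# A few Linux errno values, hard-coded so the lookup does not depend on
# the platform the script runs on (other systems number these differently).
LINUX_ERRNO = {
    107: ("ENOTCONN", "Transport endpoint is not connected"),
    110: ("ETIMEDOUT", "Connection timed out"),
    111: ("ECONNREFUSED", "Connection refused"),
}


def describe_exit(code):
    """Map an exit status to a Linux errno name, assuming it is an errno value."""
    if code in LINUX_ERRNO:
        name, message = LINUX_ERRNO[code]
        return f"{name}: {message}"
    return f"unknown ({code})"


if __name__ == "__main__":
    print(describe_exit(110))
```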

Comment 1 Atin Mukherjee 2014-11-13 10:46:35 UTC
If glusterd has crashed, CLI requests will time out until you restart the service. What we need to look at here is why glusterd crashed. Can you share the sequence of glusterd logs from the replace-brick transaction? That would help in root-causing the problem.

Comment 2 wangqy 2014-11-14 05:25:13 UTC
(In reply to Atin Mukherjee from comment #1)
> If glusterd has crashed, CLI requests will time out until you restart the
> service. What we need to look at here is why glusterd crashed. Can you
> share the sequence of glusterd logs from the replace-brick transaction?
> That would help in root-causing the problem.

I tried it again and captured the glusterd log by adding --log-level=DEBUG to /etc/init.d/glusterd. Here is the glusterd debug log:

[2014-11-14 03:33:59.116911] I [glusterd-replace-brick.c:98:__glusterd_handle_replace_brick] 0-management: Received replace brick req
[2014-11-14 03:33:59.116939] D [glusterd-replace-brick.c:140:__glusterd_handle_replace_brick] 0-management: src brick=data-node3:/brick/testrep
[2014-11-14 03:33:59.116946] D [glusterd-replace-brick.c:151:__glusterd_handle_replace_brick] 0-management: dst brick=data-node5:/brick/testrep
[2014-11-14 03:33:59.116952] I [glusterd-replace-brick.c:153:__glusterd_handle_replace_brick] 0-management: Received replace brick start request
[2014-11-14 03:33:59.116962] D [glusterd-utils.c:161:glusterd_lock] 0-management: Cluster lock held by 0f05057d-4ff1-4443-b35d-7db852b493fb
[2014-11-14 03:33:59.116969] D [glusterd-handler.c:617:glusterd_op_txn_begin] 0-management: Acquired lock on localhost
[2014-11-14 03:33:59.116976] D [glusterd-op-sm.c:5640:glusterd_op_sm_inject_event] 0-management: Enqueue event: 'GD_OP_EVENT_START_LOCK'
[2014-11-14 03:33:59.116982] D [glusterd-handler.c:635:glusterd_op_txn_begin] 0-management: Returning 0
[2014-11-14 03:33:59.116989] D [glusterd-op-sm.c:5717:glusterd_op_sm] 0-management: Dequeued event of type: 'GD_OP_EVENT_START_LOCK'
[2014-11-14 03:33:59.117052] D [glusterd-rpc-ops.c:1214:glusterd_cluster_lock] 0-management: Returning 0
[2014-11-14 03:33:59.117072] D [glusterd-rpc-ops.c:1214:glusterd_cluster_lock] 0-management: Returning 0
[2014-11-14 03:33:59.117079] D [glusterd-op-sm.c:2412:glusterd_op_ac_send_lock] 0-management: Returning with 0
[2014-11-14 03:33:59.117086] D [glusterd-utils.c:6111:glusterd_sm_tr_log_transition_add] 0-management: Transitioning from 'Default' to 'Lock sent' due to event 'GD_OP_EVENT_START_LOCK'
[2014-11-14 03:33:59.117092] D [glusterd-utils.c:6113:glusterd_sm_tr_log_transition_add] 0-management: returning 0
[2014-11-14 03:33:59.118719] D [glusterd-rpc-ops.c:602:__glusterd_cluster_lock_cbk] 0-management: Received lock ACC from uuid: dcaa270d-d46f-43c2-b9a2-ad60197fd912
[2014-11-14 03:33:59.118767] D [glusterd-utils.c:5439:glusterd_friend_find_by_uuid] 0-management: Friend found... state: Peer in Cluster
[2014-11-14 03:33:59.118776] D [glusterd-op-sm.c:5640:glusterd_op_sm_inject_event] 0-management: Enqueue event: 'GD_OP_EVENT_RCVD_ACC'
[2014-11-14 03:33:59.118784] D [glusterd-op-sm.c:5717:glusterd_op_sm] 0-management: Dequeued event of type: 'GD_OP_EVENT_RCVD_ACC'
[2014-11-14 03:33:59.118791] D [glusterd-utils.c:6111:glusterd_sm_tr_log_transition_add] 0-management: Transitioning from 'Lock sent' to 'Lock sent' due to event 'GD_OP_EVENT_RCVD_ACC'
[2014-11-14 03:33:59.118812] D [glusterd-utils.c:6113:glusterd_sm_tr_log_transition_add] 0-management: returning 0
[2014-11-14 03:33:59.118891] D [glusterd-rpc-ops.c:602:__glusterd_cluster_lock_cbk] 0-management: Received lock ACC from uuid: bac58735-7d09-4d5c-8187-eedbdbdf30a7
[2014-11-14 03:33:59.118900] D [glusterd-utils.c:5439:glusterd_friend_find_by_uuid] 0-management: Friend found... state: Peer in Cluster
[2014-11-14 03:33:59.118907] D [glusterd-op-sm.c:5640:glusterd_op_sm_inject_event] 0-management: Enqueue event: 'GD_OP_EVENT_RCVD_ACC'
[2014-11-14 03:33:59.118913] D [glusterd-op-sm.c:5717:glusterd_op_sm] 0-management: Dequeued event of type: 'GD_OP_EVENT_RCVD_ACC'
[2014-11-14 03:33:59.118918] D [glusterd-op-sm.c:5640:glusterd_op_sm_inject_event] 0-management: Enqueue event: 'GD_OP_EVENT_ALL_ACC'
[2014-11-14 03:33:59.118924] D [glusterd-op-sm.c:2572:glusterd_op_ac_rcvd_lock_acc] 0-management: Returning 0
[2014-11-14 03:33:59.118930] D [glusterd-utils.c:6111:glusterd_sm_tr_log_transition_add] 0-management: Transitioning from 'Lock sent' to 'Lock sent' due to event 'GD_OP_EVENT_RCVD_ACC'
[2014-11-14 03:33:59.118935] D [glusterd-utils.c:6113:glusterd_sm_tr_log_transition_add] 0-management: returning 0
[2014-11-14 03:33:59.118941] D [glusterd-op-sm.c:5717:glusterd_op_sm] 0-management: Dequeued event of type: 'GD_OP_EVENT_ALL_ACC'
[2014-11-14 03:33:59.118953] D [glusterd-utils.c:1156:glusterd_volinfo_find] 0-management: Volume testrep found
[2014-11-14 03:33:59.118960] D [glusterd-utils.c:1163:glusterd_volinfo_find] 0-management: Returning 0
[2014-11-14 03:33:59.118978] D [glusterd-utils.c:1156:glusterd_volinfo_find] 0-management: Volume testrep found
[2014-11-14 03:33:59.118984] D [glusterd-utils.c:1163:glusterd_volinfo_find] 0-management: Returning 0
[2014-11-14 03:33:59.118993] D [glusterd-replace-brick.c:238:glusterd_op_stage_replace_brick] 0-management: src brick=data-node3:/brick/testrep
[2014-11-14 03:33:59.119012] D [glusterd-replace-brick.c:247:glusterd_op_stage_replace_brick] 0-management: dst brick=data-node5:/brick/testrep
[2014-11-14 03:33:59.119021] D [glusterd-utils.c:1156:glusterd_volinfo_find] 0-management: Volume testrep found
[2014-11-14 03:33:59.119027] D [glusterd-utils.c:1163:glusterd_volinfo_find] 0-management: Returning 0
[2014-11-14 03:33:59.119042] D [glusterd-utils.c:669:glusterd_brickinfo_new] 0-management: Returning 0
[2014-11-14 03:33:59.119062] D [glusterd-utils.c:743:glusterd_brickinfo_new_from_brick] 0-management: Returning 0
[2014-11-14 03:33:59.119071] D [glusterd-utils.c:518:glusterd_volinfo_new] 0-management: Returning 0
[2014-11-14 03:33:59.119097] D [glusterd-utils.c:602:glusterd_volume_brickinfos_delete] 0-management: Returning 0
[2014-11-14 03:33:59.119109] D [store.c:434:gf_store_handle_destroy] 0-: Returning 0
[2014-11-14 03:33:59.119115] D [glusterd-utils.c:643:glusterd_volinfo_delete] 0-management: Returning 0
[2014-11-14 03:33:59.119123] D [glusterd-utils.c:669:glusterd_brickinfo_new] 0-management: Returning 0
[2014-11-14 03:33:59.119129] D [glusterd-utils.c:743:glusterd_brickinfo_new_from_brick] 0-management: Returning 0
[2014-11-14 03:33:59.119136] D [glusterd-utils.c:518:glusterd_volinfo_new] 0-management: Returning 0
[2014-11-14 03:33:59.119143] D [glusterd-utils.c:602:glusterd_volume_brickinfos_delete] 0-management: Returning 0
[2014-11-14 03:33:59.119149] D [store.c:434:gf_store_handle_destroy] 0-: Returning 0
[2014-11-14 03:33:59.119154] D [glusterd-utils.c:643:glusterd_volinfo_delete] 0-management: Returning 0
[2014-11-14 03:33:59.119163] D [glusterd-utils.c:5726:glusterd_is_rb_started] 0-: is_rb_started:status=0
[2014-11-14 03:33:59.119187] I [glusterd-utils.c:8850:glusterd_generate_and_set_task_id] 0-management: Generated task-id ee10c393-3ece-4928-9563-642a4bc637fc for key replace-brick-id
[2014-11-14 03:33:59.119197] D [glusterd-utils.c:669:glusterd_brickinfo_new] 0-management: Returning 0
[2014-11-14 03:33:59.119203] D [glusterd-utils.c:743:glusterd_brickinfo_new_from_brick] 0-management: Returning 0
[2014-11-14 03:33:59.119214] D [glusterd-utils.c:5483:glusterd_friend_find_by_hostname] 0-management: Friend data-node3 found.. state: 3
[2014-11-14 03:33:59.119220] D [glusterd-utils.c:5567:glusterd_hostname_to_uuid] 0-management: returning 0
[2014-11-14 03:33:59.119226] D [glusterd-utils.c:1026:glusterd_volume_brickinfo_get] 0-management: Found brick data-node3:/brick/testrep in volume testrep
[2014-11-14 03:33:59.119232] D [glusterd-utils.c:1035:glusterd_volume_brickinfo_get] 0-management: Returning 0
[2014-11-14 03:33:59.119237] D [glusterd-utils.c:1059:glusterd_volume_brickinfo_get_by_brick] 0-: Returning 0
[2014-11-14 03:33:59.119342] D [common-utils.c:2930:gf_is_local_addr] 0-management: 192.168.1.153
[2014-11-14 03:33:59.119419] D [common-utils.c:2930:gf_is_local_addr] 0-management: 192.168.1.153
[2014-11-14 03:33:59.119455] D [common-utils.c:2930:gf_is_local_addr] 0-management: 192.168.1.153
[2014-11-14 03:33:59.119498] D [common-utils.c:2946:gf_is_local_addr] 0-management: data-node3 is not local
[2014-11-14 03:33:59.119510] D [glusterd-utils.c:669:glusterd_brickinfo_new] 0-management: Returning 0
[2014-11-14 03:33:59.119517] D [glusterd-utils.c:743:glusterd_brickinfo_new_from_brick] 0-management: Returning 0
[2014-11-14 03:33:59.119524] D [glusterd-utils.c:5483:glusterd_friend_find_by_hostname] 0-management: Friend data-node5 found.. state: 3
[2014-11-14 03:33:59.119530] D [glusterd-utils.c:5567:glusterd_hostname_to_uuid] 0-management: returning 0
[2014-11-14 03:33:59.119535] D [glusterd-utils.c:685:glusterd_resolve_brick] 0-management: Returning 0
[2014-11-14 03:33:59.119541] D [glusterd-utils.c:5439:glusterd_friend_find_by_uuid] 0-management: Friend found... state: Peer in Cluster
[2014-11-14 03:33:59.119547] D [glusterd-utils.c:5718:glusterd_new_brick_validate] 0-management: returning 0
[2014-11-14 03:33:59.119553] D [glusterd-utils.c:5726:glusterd_is_rb_started] 0-: is_rb_started:status=0
[2014-11-14 03:33:59.119558] D [glusterd-utils.c:5735:glusterd_is_rb_paused] 0-: is_rb_paused:status=0
[2014-11-14 03:33:59.119564] D [glusterd-utils.c:5726:glusterd_is_rb_started] 0-: is_rb_started:status=0
[2014-11-14 03:33:59.119569] D [glusterd-utils.c:5735:glusterd_is_rb_paused] 0-: is_rb_paused:status=0
[2014-11-14 03:33:59.119605] D [common-utils.c:2930:gf_is_local_addr] 0-management: 192.168.1.155
[2014-11-14 03:33:59.119641] D [common-utils.c:2930:gf_is_local_addr] 0-management: 192.168.1.155
[2014-11-14 03:33:59.119673] D [common-utils.c:2930:gf_is_local_addr] 0-management: 192.168.1.155
[2014-11-14 03:33:59.119704] D [common-utils.c:2946:gf_is_local_addr] 0-management: data-node5 is not local
[2014-11-14 03:33:59.119739] D [common-utils.c:2930:gf_is_local_addr] 0-management: 192.168.1.155
[2014-11-14 03:33:59.119774] D [common-utils.c:2930:gf_is_local_addr] 0-management: 192.168.1.155
[2014-11-14 03:33:59.119807] D [common-utils.c:2930:gf_is_local_addr] 0-management: 192.168.1.155
[2014-11-14 03:33:59.119838] D [common-utils.c:2946:gf_is_local_addr] 0-management: data-node5 is not local
[2014-11-14 03:33:59.119846] D [glusterd-utils.c:5483:glusterd_friend_find_by_hostname] 0-management: Friend data-node5 found.. state: 3
[2014-11-14 03:33:59.119915] D [common-utils.c:2930:gf_is_local_addr] 0-management: 192.168.1.155
[2014-11-14 03:33:59.119946] D [common-utils.c:2930:gf_is_local_addr] 0-management: 192.168.1.155
[2014-11-14 03:33:59.119978] D [common-utils.c:2946:gf_is_local_addr] 0-management: data-node5 is not local
[2014-11-14 03:33:59.119985] D [glusterd-replace-brick.c:573:glusterd_op_stage_replace_brick] 0-management: Returning 0
[2014-11-14 03:33:59.119991] D [glusterd-op-sm.c:4151:glusterd_op_stage_validate] 0-management: OP = 10. Returning 0
[2014-11-14 03:33:59.120040] D [glusterd-rpc-ops.c:1306:glusterd_stage_op] 0-management: Returning 0
[2014-11-14 03:33:59.120061] D [glusterd-rpc-ops.c:1306:glusterd_stage_op] 0-management: Returning 0
[2014-11-14 03:33:59.120069] D [glusterd-op-sm.c:3008:glusterd_op_ac_send_stage_op] 0-management: Sent stage op request for 'Volume Replace brick' to 2 peers
[2014-11-14 03:33:59.120078] D [glusterd-op-sm.c:3013:glusterd_op_ac_send_stage_op] 0-management: Returning with 0
[2014-11-14 03:33:59.120085] D [glusterd-utils.c:6111:glusterd_sm_tr_log_transition_add] 0-management: Transitioning from 'Lock sent' to 'Stage op sent' due to event 'GD_OP_EVENT_ALL_ACC'
[2014-11-14 03:33:59.120091] D [glusterd-utils.c:6113:glusterd_sm_tr_log_transition_add] 0-management: returning 0
[2014-11-14 03:33:59.121626] D [glusterd-rpc-ops.c:773:__glusterd_stage_op_cbk] 0-management: Received stage ACC from uuid: dcaa270d-d46f-43c2-b9a2-ad60197fd912
[2014-11-14 03:33:59.121678] D [glusterd-utils.c:5439:glusterd_friend_find_by_uuid] 0-management: Friend found... state: Peer in Cluster
[2014-11-14 03:33:59.121692] D [glusterd-utils.c:7542:glusterd_rb_use_rsp_dict] 0-: src-brick-port=49152 found
[2014-11-14 03:33:59.121704] D [glusterd-op-sm.c:5640:glusterd_op_sm_inject_event] 0-management: Enqueue event: 'GD_OP_EVENT_RCVD_ACC'
[2014-11-14 03:33:59.121715] D [glusterd-op-sm.c:5717:glusterd_op_sm] 0-management: Dequeued event of type: 'GD_OP_EVENT_RCVD_ACC'
[2014-11-14 03:33:59.121722] D [glusterd-op-sm.c:3581:glusterd_op_ac_rcvd_stage_op_acc] 0-management: Returning 0
[2014-11-14 03:33:59.121729] D [glusterd-utils.c:6111:glusterd_sm_tr_log_transition_add] 0-management: Transitioning from 'Stage op sent' to 'Stage op sent' due to event 'GD_OP_EVENT_RCVD_ACC'
[2014-11-14 03:33:59.121736] D [glusterd-utils.c:6113:glusterd_sm_tr_log_transition_add] 0-management: returning 0
[2014-11-14 03:33:59.134657] D [glusterd-rpc-ops.c:773:__glusterd_stage_op_cbk] 0-management: Received stage ACC from uuid: bac58735-7d09-4d5c-8187-eedbdbdf30a7
[2014-11-14 03:33:59.134692] D [glusterd-utils.c:5439:glusterd_friend_find_by_uuid] 0-management: Friend found... state: Peer in Cluster
[2014-11-14 03:33:59.134703] D [glusterd-utils.c:7548:glusterd_rb_use_rsp_dict] 0-: dst-brick-port=49153 found
[2014-11-14 03:33:59.134713] D [glusterd-op-sm.c:5640:glusterd_op_sm_inject_event] 0-management: Enqueue event: 'GD_OP_EVENT_RCVD_ACC'
[2014-11-14 03:33:59.134720] D [glusterd-op-sm.c:5717:glusterd_op_sm] 0-management: Dequeued event of type: 'GD_OP_EVENT_RCVD_ACC'
[2014-11-14 03:33:59.134726] D [glusterd-op-sm.c:5640:glusterd_op_sm_inject_event] 0-management: Enqueue event: 'GD_OP_EVENT_STAGE_ACC'
[2014-11-14 03:33:59.134731] D [glusterd-op-sm.c:3581:glusterd_op_ac_rcvd_stage_op_acc] 0-management: Returning 0
[2014-11-14 03:33:59.134738] D [glusterd-utils.c:6111:glusterd_sm_tr_log_transition_add] 0-management: Transitioning from 'Stage op sent' to 'Stage op sent' due to event 'GD_OP_EVENT_RCVD_ACC'
[2014-11-14 03:33:59.134743] D [glusterd-utils.c:6113:glusterd_sm_tr_log_transition_add] 0-management: returning 0
[2014-11-14 03:33:59.134749] D [glusterd-op-sm.c:5717:glusterd_op_sm] 0-management: Dequeued event of type: 'GD_OP_EVENT_STAGE_ACC'
[2014-11-14 03:33:59.134758] D [glusterd-utils.c:1156:glusterd_volinfo_find] 0-management: Volume testrep found
[2014-11-14 03:33:59.134764] D [glusterd-utils.c:1163:glusterd_volinfo_find] 0-management: Returning 0
[2014-11-14 03:33:59.134777] D [glusterd-op-sm.c:5307:glusterd_op_bricks_select] 0-management: Returning 0
[2014-11-14 03:33:59.134783] D [glusterd-rpc-ops.c:1585:glusterd_brick_op] 0-management: Sent brick op req for operation 'Volume Replace brick' to 0 bricks
[2014-11-14 03:33:59.134789] D [glusterd-rpc-ops.c:1593:glusterd_brick_op] 0-management: Returning 0
[2014-11-14 03:33:59.134795] D [glusterd-op-sm.c:5640:glusterd_op_sm_inject_event] 0-management: Enqueue event: 'GD_OP_EVENT_ALL_ACK'
[2014-11-14 03:33:59.134800] D [glusterd-op-sm.c:5200:glusterd_op_ac_send_brick_op] 0-management: Returning with 0
[2014-11-14 03:33:59.134806] D [glusterd-utils.c:6111:glusterd_sm_tr_log_transition_add] 0-management: Transitioning from 'Stage op sent' to 'Brick op sent' due to event 'GD_OP_EVENT_STAGE_ACC'
[2014-11-14 03:33:59.134811] D [glusterd-utils.c:6113:glusterd_sm_tr_log_transition_add] 0-management: returning 0
[2014-11-14 03:33:59.134829] D [glusterd-op-sm.c:5717:glusterd_op_sm] 0-management: Dequeued event of type: 'GD_OP_EVENT_ALL_ACK'
[2014-11-14 03:33:59.134836] D [glusterd-utils.c:1156:glusterd_volinfo_find] 0-management: Volume testrep found
[2014-11-14 03:33:59.134841] D [glusterd-utils.c:1163:glusterd_volinfo_find] 0-management: Returning 0
[2014-11-14 03:33:59.134858] D [glusterd-replace-brick.c:1571:glusterd_op_replace_brick] 0-management: src brick=data-node3:/brick/testrep
[2014-11-14 03:33:59.134865] D [glusterd-replace-brick.c:1579:glusterd_op_replace_brick] 0-management: dst brick=data-node5:/brick/testrep
[2014-11-14 03:33:59.134871] D [glusterd-utils.c:1156:glusterd_volinfo_find] 0-management: Volume testrep found
[2014-11-14 03:33:59.134876] D [glusterd-utils.c:1163:glusterd_volinfo_find] 0-management: Returning 0
[2014-11-14 03:33:59.134893] D [glusterd-utils.c:743:glusterd_brickinfo_new_from_brick] 0-management: Returning 0
[2014-11-14 03:33:59.134900] D [glusterd-utils.c:5483:glusterd_friend_find_by_hostname] 0-management: Friend data-node3 found.. state: 3
[2014-11-14 03:33:59.134906] D [glusterd-utils.c:5567:glusterd_hostname_to_uuid] 0-management: returning 0
[2014-11-14 03:33:59.134912] D [glusterd-utils.c:1026:glusterd_volume_brickinfo_get] 0-management: Found brick data-node3:/brick/testrep in volume testrep
[2014-11-14 03:33:59.134917] D [glusterd-utils.c:1035:glusterd_volume_brickinfo_get] 0-management: Returning 0
[2014-11-14 03:33:59.134923] D [glusterd-utils.c:1059:glusterd_volume_brickinfo_get_by_brick] 0-: Returning 0
[2014-11-14 03:33:59.134930] D [glusterd-utils.c:5483:glusterd_friend_find_by_hostname] 0-management: Friend data-node5 found.. state: 3
[2014-11-14 03:33:59.134936] D [glusterd-utils.c:5567:glusterd_hostname_to_uuid] 0-management: returning 0
[2014-11-14 03:33:59.134941] D [glusterd-utils.c:685:glusterd_resolve_brick] 0-management: Returning 0
[2014-11-14 03:33:59.135057] D [common-utils.c:2930:gf_is_local_addr] 0-management: 192.168.1.153
[2014-11-14 03:33:59.135112] D [common-utils.c:2930:gf_is_local_addr] 0-management: 192.168.1.153
[2014-11-14 03:33:59.135145] D [common-utils.c:2930:gf_is_local_addr] 0-management: 192.168.1.153
[2014-11-14 03:33:59.135176] D [common-utils.c:2946:gf_is_local_addr] 0-management: data-node3 is not local
[2014-11-14 03:33:59.135214] D [common-utils.c:2930:gf_is_local_addr] 0-management: 192.168.1.155
[2014-11-14 03:33:59.135248] D [common-utils.c:2930:gf_is_local_addr] 0-management: 192.168.1.155
[2014-11-14 03:33:59.135280] D [common-utils.c:2930:gf_is_local_addr] 0-management: 192.168.1.155
[2014-11-14 03:33:59.135311] D [common-utils.c:2946:gf_is_local_addr] 0-management: data-node5 is not local
[2014-11-14 03:33:59.135349] D [common-utils.c:2930:gf_is_local_addr] 0-management: 192.168.1.155
[2014-11-14 03:33:59.135409] D [common-utils.c:2930:gf_is_local_addr] 0-management: 192.168.1.155
[2014-11-14 03:33:59.135442] D [common-utils.c:2930:gf_is_local_addr] 0-management: 192.168.1.155
[2014-11-14 03:33:59.135485] D [common-utils.c:2946:gf_is_local_addr] 0-management: data-node5 is not local
[2014-11-14 03:33:59.135531] D [common-utils.c:2930:gf_is_local_addr] 0-management: 192.168.1.153
[2014-11-14 03:33:59.135566] D [common-utils.c:2930:gf_is_local_addr] 0-management: 192.168.1.153
[2014-11-14 03:33:59.135598] D [common-utils.c:2930:gf_is_local_addr] 0-management: 192.168.1.153
[2014-11-14 03:33:59.135628] D [common-utils.c:2946:gf_is_local_addr] 0-management: data-node3 is not local
[2014-11-14 03:33:59.135664] D [common-utils.c:2930:gf_is_local_addr] 0-management: 192.168.1.155
[2014-11-14 03:33:59.135698] D [common-utils.c:2930:gf_is_local_addr] 0-management: 192.168.1.155
[2014-11-14 03:33:59.135729] D [common-utils.c:2930:gf_is_local_addr] 0-management: 192.168.1.155
[2014-11-14 03:33:59.135759] D [common-utils.c:2946:gf_is_local_addr] 0-management: data-node5 is not local
[2014-11-14 03:33:59.135770] D [glusterd-utils.c:5746:glusterd_set_rb_status] 0-: setting status from 0 to 1
[2014-11-14 03:33:59.135783] D [glusterd-store.c:629:glusterd_store_create_volume_dir] 0-management: Returning with 0
[2014-11-14 03:33:59.164271] D [store.c:348:gf_store_save_value] 0-: returning: 0
[2014-11-14 03:33:59.189402] D [store.c:348:gf_store_save_value] 0-: returning: 0
[2014-11-14 03:33:59.214469] D [store.c:348:gf_store_save_value] 0-: returning: 0
[2014-11-14 03:33:59.239646] D [store.c:348:gf_store_save_value] 0-: returning: 0
[2014-11-14 03:33:59.264680] D [store.c:348:gf_store_save_value] 0-: returning: 0
[2014-11-14 03:33:59.289899] D [store.c:348:gf_store_save_value] 0-: returning: 0
[2014-11-14 03:33:59.315100] D [store.c:348:gf_store_save_value] 0-: returning: 0
[2014-11-14 03:33:59.340133] D [store.c:348:gf_store_save_value] 0-: returning: 0
[2014-11-14 03:33:59.365339] D [store.c:348:gf_store_save_value] 0-: returning: 0
[2014-11-14 03:33:59.382124] D [store.c:348:gf_store_save_value] 0-: returning: 0
[2014-11-14 03:33:59.398932] D [store.c:348:gf_store_save_value] 0-: returning: 0
[2014-11-14 03:33:59.415818] D [store.c:348:gf_store_save_value] 0-: returning: 0
[2014-11-14 03:33:59.432559] D [store.c:348:gf_store_save_value] 0-: returning: 0
[2014-11-14 03:33:59.432600] D [glusterd-store.c:653:glusterd_store_volinfo_write] 0-management: Returning 0
[2014-11-14 03:33:59.449350] D [store.c:348:gf_store_save_value] 0-: returning: 0
[2014-11-14 03:33:59.466321] D [store.c:348:gf_store_save_value] 0-: returning: 0
[2014-11-14 03:33:59.483142] D [store.c:348:gf_store_save_value] 0-: returning: 0
[2014-11-14 03:33:59.508273] D [store.c:348:gf_store_save_value] 0-: returning: 0
[2014-11-14 03:33:59.533441] D [store.c:348:gf_store_save_value] 0-: returning: 0
[2014-11-14 03:33:59.550232] D [store.c:348:gf_store_save_value] 0-: returning: 0
[2014-11-14 03:33:59.550275] D [glusterd-store.c:250:glusterd_store_brickinfo_write] 0-management: Returning 0
[2014-11-14 03:33:59.550290] D [glusterd-store.c:276:glusterd_store_perform_brick_store] 0-management: Returning 0
[2014-11-14 03:33:59.550296] D [glusterd-store.c:306:glusterd_store_brickinfo] 0-management: Returning with 0
[2014-11-14 03:33:59.567061] D [store.c:348:gf_store_save_value] 0-: returning: 0
[2014-11-14 03:33:59.584035] D [store.c:348:gf_store_save_value] 0-: returning: 0
[2014-11-14 03:33:59.600833] D [store.c:348:gf_store_save_value] 0-: returning: 0
[2014-11-14 03:33:59.617505] D [store.c:348:gf_store_save_value] 0-: returning: 0
[2014-11-14 03:33:59.634455] D [store.c:348:gf_store_save_value] 0-: returning: 0
[2014-11-14 03:33:59.651231] D [store.c:348:gf_store_save_value] 0-: returning: 0
[2014-11-14 03:33:59.651272] D [glusterd-store.c:250:glusterd_store_brickinfo_write] 0-management: Returning 0
[2014-11-14 03:33:59.651286] D [glusterd-store.c:276:glusterd_store_perform_brick_store] 0-management: Returning 0
[2014-11-14 03:33:59.651293] D [glusterd-store.c:306:glusterd_store_brickinfo] 0-management: Returning with 0
[2014-11-14 03:33:59.651298] D [glusterd-store.c:793:glusterd_store_brickinfos] 0-management: Returning 0
[2014-11-14 03:33:59.651305] D [glusterd-store.c:993:glusterd_store_perform_volume_store] 0-management: Returning 0
[2014-11-14 03:33:59.835933] D [store.c:348:gf_store_save_value] 0-: returning: 0
[2014-11-14 03:33:59.919508] D [store.c:348:gf_store_save_value] 0-: returning: 0
[2014-11-14 03:33:59.945196] D [store.c:348:gf_store_save_value] 0-: returning: 0
[2014-11-14 03:33:59.970424] D [store.c:348:gf_store_save_value] 0-: returning: 0
[2014-11-14 03:33:59.995582] D [store.c:348:gf_store_save_value] 0-: returning: 0
[2014-11-14 03:33:59.995624] D [glusterd-store.c:852:glusterd_store_rbstate_write] 0-management: Returning 0
[2014-11-14 03:34:00.012546] D [glusterd-store.c:882:glusterd_store_perform_rbstate_store] 0-management: Returning 0
[2014-11-14 03:34:00.037861] D [store.c:348:gf_store_save_value] 0-: returning: 0
[2014-11-14 03:34:00.062935] D [store.c:348:gf_store_save_value] 0-: returning: 0
[2014-11-14 03:34:00.088125] D [store.c:348:gf_store_save_value] 0-: returning: 0
[2014-11-14 03:34:00.088176] D [glusterd-store.c:933:glusterd_store_node_state_write] 0-management: Returning 0
[2014-11-14 03:34:00.105106] D [glusterd-store.c:963:glusterd_store_perform_node_state_store] 0-management: Returning 0
[2014-11-14 03:34:00.105354] D [glusterd-utils.c:1784:glusterd_volume_compute_cksum] 0-management: Returning with 0
[2014-11-14 03:34:00.105364] D [glusterd-store.c:1143:glusterd_store_volinfo] 0-management: Returning 0
[2014-11-14 03:34:00.105394] D [glusterd-op-sm.c:4266:glusterd_op_commit_perform] 0-management: Returning 0
[2014-11-14 03:34:00.105488] D [glusterd-rpc-ops.c:1362:glusterd_commit_op] 0-management: Returning 0
[2014-11-14 03:34:00.105512] D [glusterd-rpc-ops.c:1362:glusterd_commit_op] 0-management: Returning 0
[2014-11-14 03:34:00.105519] D [glusterd-op-sm.c:3538:glusterd_op_ac_send_commit_op] 0-management: Sent commit op req for 'Volume Replace brick' to 2 peers
[2014-11-14 03:34:00.105527] D [glusterd-op-sm.c:3559:glusterd_op_ac_send_commit_op] 0-management: Returning with 0
[2014-11-14 03:34:00.105534] D [glusterd-utils.c:6111:glusterd_sm_tr_log_transition_add] 0-management: Transitioning from 'Brick op sent' to 'Commit op sent' due to event 'GD_OP_EVENT_ALL_ACK'
[2014-11-14 03:34:00.105540] D [glusterd-utils.c:6113:glusterd_sm_tr_log_transition_add] 0-management: returning 0
[2014-11-14 03:34:00.109741] D [socket.c:492:__socket_rwv] 0-socket.management: EOF on socket
[2014-11-14 03:34:00.109758] D [socket.c:2238:socket_event_handler] 0-transport: disconnecting now
[2014-11-14 03:34:00.111882] D [socket.c:492:__socket_rwv] 0-socket.management: EOF on socket
[2014-11-14 03:34:00.111930] D [socket.c:2238:socket_event_handler] 0-transport: disconnecting now
[2014-11-14 03:34:01.089999] D [socket.c:492:__socket_rwv] 0-socket.management: EOF on socket
[2014-11-14 03:34:01.090040] D [socket.c:2238:socket_event_handler] 0-transport: disconnecting now
[2014-11-14 03:34:01.348813] D [glusterd-rpc-ops.c:901:__glusterd_commit_op_cbk] 0-management: Received commit ACC from uuid: bac58735-7d09-4d5c-8187-eedbdbdf30a7
[2014-11-14 03:34:01.348858] D [glusterd-utils.c:5439:glusterd_friend_find_by_uuid] 0-management: Friend found... state: Peer in Cluster
[2014-11-14 03:34:01.348870] D [glusterd-utils.c:7548:glusterd_rb_use_rsp_dict] 0-: dst-brick-port=49153 found
[2014-11-14 03:34:01.348881] D [glusterd-op-sm.c:5640:glusterd_op_sm_inject_event] 0-management: Enqueue event: 'GD_OP_EVENT_RCVD_ACC'
[2014-11-14 03:34:01.348888] D [glusterd-op-sm.c:5717:glusterd_op_sm] 0-management: Dequeued event of type: 'GD_OP_EVENT_RCVD_ACC'
[2014-11-14 03:34:01.348896] D [glusterd-utils.c:6111:glusterd_sm_tr_log_transition_add] 0-management: Transitioning from 'Commit op sent' to 'Commit op sent' due to event 'GD_OP_EVENT_RCVD_ACC'
[2014-11-14 03:34:01.348902] D [glusterd-utils.c:6113:glusterd_sm_tr_log_transition_add] 0-management: returning 0
[2014-11-14 03:34:01.657424] D [socket.c:492:__socket_rwv] 0-socket.management: EOF on socket
[2014-11-14 03:34:01.657459] D [socket.c:2238:socket_event_handler] 0-transport: disconnecting now
[2014-11-14 03:34:02.124285] D [socket.c:492:__socket_rwv] 0-socket.management: EOF on socket
[2014-11-14 03:34:02.124322] D [socket.c:2238:socket_event_handler] 0-transport: disconnecting now
[2014-11-14 03:34:02.664488] D [socket.c:492:__socket_rwv] 0-socket.management: EOF on socket
[2014-11-14 03:34:02.664527] D [socket.c:2238:socket_event_handler] 0-transport: disconnecting now
[2014-11-14 03:34:02.676091] D [socket.c:492:__socket_rwv] 0-socket.management: EOF on socket
[2014-11-14 03:34:02.676137] D [socket.c:2238:socket_event_handler] 0-transport: disconnecting now
[2014-11-14 03:34:04.132191] D [socket.c:492:__socket_rwv] 0-socket.management: EOF on socket
[2014-11-14 03:34:04.132229] D [socket.c:2238:socket_event_handler] 0-transport: disconnecting now
[2014-11-14 03:34:04.183220] D [glusterd-rpc-ops.c:901:__glusterd_commit_op_cbk] 0-management: Received commit ACC from uuid: dcaa270d-d46f-43c2-b9a2-ad60197fd912
[2014-11-14 03:34:04.183317] D [glusterd-utils.c:5439:glusterd_friend_find_by_uuid] 0-management: Friend found... state: Peer in Cluster
[2014-11-14 03:34:04.183331] D [glusterd-utils.c:7542:glusterd_rb_use_rsp_dict] 0-: src-brick-port=49152 found
[2014-11-14 03:34:04.183342] D [glusterd-op-sm.c:5640:glusterd_op_sm_inject_event] 0-management: Enqueue event: 'GD_OP_EVENT_RCVD_ACC'
[2014-11-14 03:34:04.183349] D [glusterd-op-sm.c:5717:glusterd_op_sm] 0-management: Dequeued event of type: 'GD_OP_EVENT_RCVD_ACC'
[2014-11-14 03:34:04.183367] D [glusterd-utils.c:6111:glusterd_sm_tr_log_transition_add] 0-management: Transitioning from 'Commit op sent' to 'Commit op sent' due to event 'GD_OP_EVENT_RCVD_ACC'
[2014-11-14 03:34:04.183409] D [glusterd-utils.c:6113:glusterd_sm_tr_log_transition_add] 0-management: returning 0
[2014-11-14 03:34:04.267906] D [socket.c:492:__socket_rwv] 0-socket.management: EOF on socket
[2014-11-14 03:34:04.267953] D [socket.c:2238:socket_event_handler] 0-transport: disconnecting now
[2014-11-14 03:34:04.686770] D [socket.c:492:__socket_rwv] 0-socket.management: EOF on socket
[2014-11-14 03:34:04.686809] D [socket.c:2238:socket_event_handler] 0-transport: disconnecting now
[2014-11-14 03:34:05.139493] D [socket.c:492:__socket_rwv] 0-socket.management: EOF on socket
[2014-11-14 03:34:05.139538] D [socket.c:2238:socket_event_handler] 0-transport: disconnecting now
[2014-11-14 03:34:05.693804] D [socket.c:492:__socket_rwv] 0-socket.management: EOF on socket
[2014-11-14 03:34:05.693843] D [socket.c:2238:socket_event_handler] 0-transport: disconnecting now
[2014-11-14 03:34:05.709951] D [socket.c:492:__socket_rwv] 0-socket.management: EOF on socket
[2014-11-14 03:34:05.709994] D [socket.c:2238:socket_event_handler] 0-transport: disconnecting now
[2014-11-14 03:34:07.147269] D [socket.c:492:__socket_rwv] 0-socket.management: EOF on socket
[2014-11-14 03:34:07.147307] D [socket.c:2238:socket_event_handler] 0-transport: disconnecting now
[2014-11-14 03:34:07.761150] D [socket.c:492:__socket_rwv] 0-socket.management: EOF on socket
[2014-11-14 03:34:07.761189] D [socket.c:2238:socket_event_handler] 0-transport: disconnecting now
[2014-11-14 03:34:08.157998] D [socket.c:492:__socket_rwv] 0-socket.management: EOF on socket
[2014-11-14 03:34:08.158039] D [socket.c:2238:socket_event_handler] 0-transport: disconnecting now
[2014-11-14 03:34:08.773222] D [socket.c:492:__socket_rwv] 0-socket.management: EOF on socket
[2014-11-14 03:34:08.773265] D [socket.c:2238:socket_event_handler] 0-transport: disconnecting now
[2014-11-14 03:34:08.785388] D [socket.c:492:__socket_rwv] 0-socket.management: EOF on socket
[2014-11-14 03:34:08.785435] D [socket.c:2238:socket_event_handler] 0-transport: disconnecting now
[2014-11-14 03:34:09.464325] D [glusterd-replace-brick.c:1910:glusterd_do_replace_brick] 0-: Cancelling timer thread
[2014-11-14 03:34:09.464392] D [glusterd-replace-brick.c:1914:glusterd_do_replace_brick] 0-: Replace brick operation detected
[2014-11-14 03:34:09.464419] D [glusterd-replace-brick.c:1929:glusterd_do_replace_brick] 0-: src brick=data-node3:/brick/testrep
[2014-11-14 03:34:09.464428] D [glusterd-replace-brick.c:1938:glusterd_do_replace_brick] 0-: dst brick=data-node5:/brick/testrep
[2014-11-14 03:34:09.464435] D [glusterd-utils.c:1156:glusterd_volinfo_find] 0-management: Volume testrep found
[2014-11-14 03:34:09.464441] D [glusterd-utils.c:1163:glusterd_volinfo_find] 0-management: Returning 0
[2014-11-14 03:34:09.464467] D [glusterd-utils.c:669:glusterd_brickinfo_new] 0-management: Returning 0
[2014-11-14 03:34:09.464476] D [glusterd-utils.c:743:glusterd_brickinfo_new_from_brick] 0-management: Returning 0
[2014-11-14 03:34:09.464484] D [glusterd-utils.c:5483:glusterd_friend_find_by_hostname] 0-management: Friend data-node3 found.. state: 3
[2014-11-14 03:34:09.464491] D [glusterd-utils.c:5567:glusterd_hostname_to_uuid] 0-management: returning 0
[2014-11-14 03:34:09.464498] D [glusterd-utils.c:1026:glusterd_volume_brickinfo_get] 0-management: Found brick data-node3:/brick/testrep in volume testrep
[2014-11-14 03:34:09.464516] D [glusterd-utils.c:1035:glusterd_volume_brickinfo_get] 0-management: Returning 0
[2014-11-14 03:34:09.464523] D [glusterd-utils.c:1059:glusterd_volume_brickinfo_get_by_brick] 0-: Returning 0
[2014-11-14 03:34:09.464529] D [glusterd-utils.c:5483:glusterd_friend_find_by_hostname] 0-management: Friend data-node5 found.. state: 3
[2014-11-14 03:34:09.464536] D [glusterd-utils.c:5567:glusterd_hostname_to_uuid] 0-management: returning 0
[2014-11-14 03:34:09.464541] D [glusterd-utils.c:685:glusterd_resolve_brick] 0-management: Returning 0
[2014-11-14 03:34:09.464549] D [glusterd-replace-brick.c:872:rb_generate_client_volfile] 0-management: Creating volfile
[2014-11-14 03:34:09.506554] D [run.c:190:runner_log] 0-management: Successfully started  glusterfs: /usr/sbin/glusterfs -f/var/lib/glusterd/vols/testrep/rb_client.vol /var/run/gluster/testrep-rb_mount
[2014-11-14 03:34:09.539425] D [run.c:190:runner_log] 0-management: Successfully unmounted  maintenance client: /bin/umount -l /var/run/gluster/testrep-rb_mount
[2014-11-14 03:34:09.545275] D [glusterd-op-sm.c:5640:glusterd_op_sm_inject_event] 0-management: Enqueue event: 'GD_OP_EVENT_COMMIT_ACC'
[2014-11-14 03:34:09.545301] D [glusterd-op-sm.c:5717:glusterd_op_sm] 0-management: Dequeued event of type: 'GD_OP_EVENT_COMMIT_ACC'
[2014-11-14 03:34:09.545436] D [glusterd-rpc-ops.c:1248:glusterd_cluster_unlock] 0-management: Returning 0
[2014-11-14 03:34:09.545540] D [glusterd-rpc-ops.c:1248:glusterd_cluster_unlock] 0-management: Returning 0
[2014-11-14 03:34:09.545552] D [glusterd-op-sm.c:2465:glusterd_op_ac_send_unlock] 0-management: Returning with 0
[2014-11-14 03:34:09.545560] D [glusterd-utils.c:6111:glusterd_sm_tr_log_transition_add] 0-management: Transitioning from 'Commit op sent' to 'Unlock sent' due to event 'GD_OP_EVENT_COMMIT_ACC'
[2014-11-14 03:34:09.545566] D [glusterd-utils.c:6113:glusterd_sm_tr_log_transition_add] 0-management: returning 0
[2014-11-14 03:44:15.653040] E [rpc-clnt.c:208:call_bail] 0-management: bailing out frame type(glusterd mgmt) op(--(2)) xid = 0x1a sent = 2014-11-14 03:34:09.545536. timeout = 600 for 192.168.1.155:24007


After that, I checked the glusterd status, and it appears to be running.



BTW, I tried the same test with glusterfs 3.6.1. There replace-brick works well, except that I need to commit with "force"; otherwise the commit fails.
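
The final E line of the log above (call_bail) carries both the time the RPC frame was sent and the timeout that applied to it. As a minimal sketch, not part of the original report, the snippet below pulls those fields out and computes how long glusterd waited on the peer before giving up. The function name parse_bail and the regular expression are illustrative assumptions derived only from the single line shown in this log; they are not a documented glusterd log format.

```python
import re
from datetime import datetime

# Timestamp layout as it appears in the log above (assumed stable).
TS = r"\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+"

# Pattern for the call_bail line; built from the one example in this report.
BAIL_RE = re.compile(
    r"^\[(?P<logged>" + TS + r")\] E \[rpc-clnt\.c:\d+:call_bail\].*?"
    r"sent = (?P<sent>" + TS + r")\. timeout = (?P<timeout>\d+) "
    r"for (?P<peer>\S+)$"
)

# The line copied verbatim from the log above.
SAMPLE = (
    "[2014-11-14 03:44:15.653040] E [rpc-clnt.c:208:call_bail] 0-management: "
    "bailing out frame type(glusterd mgmt) op(--(2)) xid = 0x1a "
    "sent = 2014-11-14 03:34:09.545536. timeout = 600 for 192.168.1.155:24007"
)


def parse_bail(line):
    """Return peer, timeout and seconds waited for a call_bail line, or None."""
    m = BAIL_RE.match(line.strip())
    if m is None:
        return None
    fmt = "%Y-%m-%d %H:%M:%S.%f"
    logged = datetime.strptime(m["logged"], fmt)
    sent = datetime.strptime(m["sent"], fmt)
    return {
        "peer": m["peer"],
        "timeout": int(m["timeout"]),
        # Wall-clock seconds between sending the frame and bailing out.
        "waited": (logged - sent).total_seconds(),
    }


if __name__ == "__main__":
    print(parse_bail(SAMPLE))
```

For the line in this report, the frame sent to 192.168.1.155:24007 was abandoned about 606 seconds after it was sent, just past the 600-second timeout. The sent time (03:34:09.545536) appears to coincide with the glusterd_cluster_unlock entries in the log, which would suggest that the peer never answered the unlock request; that reading is an inference from the timestamps, not something the log states directly.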

Comment 3 Niels de Vos 2016-06-17 15:58:07 UTC
This bug is getting closed because version 3.5 is marked End-Of-Life. There will be no further updates to this version. Please open a new bug against a version that still receives bugfixes if you are still facing this issue in a more current release.

