Bug 763816 (GLUSTER-2084) - [3.1.1qa5] : replace-brick fails to migrate data when migration from same hostname
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: GLUSTER-2084
Product: GlusterFS
Classification: Community
Component: glusterd
Version: mainline
Hardware: All
OS: Linux
Priority: low
Severity: low
Target Milestone: ---
Assignee: Vijay Bellur
QA Contact:
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2010-11-11 01:16 UTC by Harshavardhana
Modified: 2015-03-23 01:03 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed:
Regression: RTP
Mount Type: ---
Documentation: DNR
CRM:
Verified Versions:
Embargoed:



Description Harshavardhana 2010-11-11 01:16:15 UTC
[root@compel1 ~]# gluster volume info

Volume Name: dist
Type: Distribute
Status: Started
Number of Bricks: 2
Transport-type: tcp
Bricks:
Brick1: compel1:/export1
Brick2: compel4:/export1


[root@compel4 ~]# gluster volume replace-brick dist compel4:/export1 compel4:/export2 start
replace-brick started successfully

"export2" is a different block device on the same node.

Even 10 minutes after starting the operation, no files have been migrated:

[root@compel4 ~]# gluster volume replace-brick dist compel4:/export1 compel4:/export2 status
Number of files migrated = 0       Current file=  
[root@compel4 ~]#

[root@compel4 ~]# find /export1/ | wc -l
2543

[root@compel4 ~]# find /export2/ | wc -l
1
[root@compel4 ~]# ls /export2/ 
[root@compel4 ~]#

[root@compel4 ~]# gluster volume replace-brick dist compel4:/export1 compel4:/export2 abort
replace-brick aborted successfully
[root@compel4 ~]# 

Nothing notable in the log files.

Comment 1 Raghavendra Bhat 2010-11-11 01:49:21 UTC
[2010-11-11 10:08:01.959440] D [glusterd-op-sm.c:5337:glusterd_op_set_cli_op] : Returning 0
[2010-11-11 10:08:01.959641] I [glusterd-handler.c:1225:glusterd_handle_replace_brick] glusterd: Received replace brick req
[2010-11-11 10:08:01.959694] D [glusterd-handler.c:1258:glusterd_handle_replace_brick] : src brick=bigbang:/d/glusterfs/export/export1
[2010-11-11 10:08:01.959733] D [glusterd-handler.c:1268:glusterd_handle_replace_brick] : dst brick=bigbang:/e/glusterfs/export/export1
[2010-11-11 10:08:01.959819] I [glusterd-utils.c:232:glusterd_lock] glusterd: Cluster lock held by 92782297-9b10-4a6d-8aca-a866b764bdf1
[2010-11-11 10:08:01.959858] I [glusterd-handler.c:2788:glusterd_op_txn_begin] glusterd: Acquired local lock
[2010-11-11 10:08:01.959896] D [glusterd-op-sm.c:5194:glusterd_op_sm_inject_event] glusterd: Enqueuing event: 'GD_OP_EVENT_START_LOCK'
[2010-11-11 10:08:01.959933] D [glusterd-handler.c:2792:glusterd_op_txn_begin] glusterd: Returning 0
[2010-11-11 10:08:01.960000] D [glusterd-op-sm.c:5242:glusterd_op_sm] : Dequeued event of type: 'GD_OP_EVENT_START_LOCK'
[2010-11-11 10:08:01.960038] I [glusterd3_1-mops.c:1105:glusterd3_1_cluster_lock] glusterd: Sent lock req to 0 peers
[2010-11-11 10:08:01.960073] D [glusterd3_1-mops.c:1108:glusterd3_1_cluster_lock] glusterd: Returning 0
[2010-11-11 10:08:01.960109] D [glusterd-op-sm.c:5194:glusterd_op_sm_inject_event] glusterd: Enqueuing event: 'GD_OP_EVENT_ALL_ACC'
[2010-11-11 10:08:01.960144] D [glusterd-op-sm.c:221:glusterd_op_sm_inject_all_acc] : Returning 0
[2010-11-11 10:08:01.960179] D [glusterd-op-sm.c:4023:glusterd_op_ac_send_lock] : Returning with 0
[2010-11-11 10:08:01.960216] D [glusterd-utils.c:2605:glusterd_sm_tr_log_transition_add] glusterd: Transitioning from 'Default' to 'Lock sent' due to event 'GD_OP_EVENT_START_LOCK'
[2010-11-11 10:08:01.960254] D [glusterd-utils.c:2607:glusterd_sm_tr_log_transition_add] : returning 0
[2010-11-11 10:08:01.960290] D [glusterd-op-sm.c:5242:glusterd_op_sm] : Dequeued event of type: 'GD_OP_EVENT_ALL_ACC'
[2010-11-11 10:08:01.960347] D [glusterd-op-sm.c:914:glusterd_op_stage_replace_brick] : src brick=bigbang:/d/glusterfs/export/export1
[2010-11-11 10:08:01.960389] D [glusterd-op-sm.c:924:glusterd_op_stage_replace_brick] : dst brick=bigbang:/e/glusterfs/export/export1
[2010-11-11 10:08:01.960483] D [glusterd-utils.c:795:glusterd_volinfo_find] : Volume vol found
[2010-11-11 10:08:01.960522] D [glusterd-utils.c:803:glusterd_volinfo_find] : Returning 0
[2010-11-11 10:08:01.960557] D [glusterd-utils.c:2358:glusterd_is_rb_started] : is_rb_started:status=0
[2010-11-11 10:08:01.960583] I [glusterd-utils.c:726:glusterd_volume_brickinfo_get_by_brick] : brick: bigbang:/d/glusterfs/export/export1
[2010-11-11 10:08:01.960771] D [glusterd-utils.c:199:glusterd_is_local_addr] glusterd: bigbang is local
[2010-11-11 10:08:01.960789] D [glusterd-utils.c:2188:glusterd_hostname_to_uuid] : returning 0
[2010-11-11 10:08:01.960803] I [glusterd-utils.c:697:glusterd_volume_brickinfo_get] : Found brick
[2010-11-11 10:08:01.960816] D [glusterd-utils.c:708:glusterd_volume_brickinfo_get] : Returning 0
[2010-11-11 10:08:01.960829] D [glusterd-utils.c:755:glusterd_volume_brickinfo_get_by_brick] : Returning 0
[2010-11-11 10:08:01.960880] D [glusterd-utils.c:199:glusterd_is_local_addr] glusterd: bigbang is local
[2010-11-11 10:08:01.960896] D [glusterd-op-sm.c:1026:glusterd_op_stage_replace_brick] : I AM THE SOURCE HOST
[2010-11-11 10:08:01.960937] E [dict.c:308:dict_set] dict: @this=(nil) @value=0x21ee110, key=src-brick-port
[2010-11-11 10:08:01.960953] D [glusterd-op-sm.c:1033:glusterd_op_stage_replace_brick] : Could not set src-brick-port=24010
[2010-11-11 10:08:01.961093] D [glusterd-utils.c:199:glusterd_is_local_addr] glusterd: bigbang is local
[2010-11-11 10:08:01.961110] D [glusterd-utils.c:2188:glusterd_hostname_to_uuid] : returning 0
[2010-11-11 10:08:01.961124] D [glusterd-utils.c:708:glusterd_volume_brickinfo_get] : Returning -1
[2010-11-11 10:08:01.961147] D [glusterd-utils.c:610:glusterd_brickinfo_new] : Returning 0
[2010-11-11 10:08:01.961163] D [glusterd-utils.c:667:glusterd_brickinfo_from_brick] : Returning 0
[2010-11-11 10:08:01.961217] D [glusterd-utils.c:199:glusterd_is_local_addr] glusterd: bigbang is local
[2010-11-11 10:08:01.961262] D [glusterd-utils.c:2436:glusterd_brick_create_path] : returning 0
[2010-11-11 10:08:01.961278] D [glusterd-op-sm.c:1112:glusterd_op_stage_replace_brick] : Returning 0
[2010-11-11 10:08:01.961291] D [glusterd-op-sm.c:4941:glusterd_op_stage_validate] : Returning 0


I have observed this too. I think the cause is that glusterd_op_stage_replace_brick sets the port number in the response dict (rsp_dict), but that dict has somehow become NULL:

[2010-11-11 10:08:01.960896] D [glusterd-op-sm.c:1026:glusterd_op_stage_replace_brick] : I AM THE SOURCE HOST
[2010-11-11 10:08:01.960937] E [dict.c:308:dict_set] dict: @this=(nil) @value=0x21ee110, key=src-brick-port
[2010-11-11 10:08:01.960953] D [glusterd-op-sm.c:1033:glusterd_op_stage_replace_brick] : Could not set src-brick-port=24010

Comment 2 Raghavendra Bhat 2010-11-11 03:40:40 UTC
[2010-11-11 12:08:03.848210] D [glusterd-utils.c:2607:glusterd_sm_tr_log_transition_add] : returning 0
[2010-11-11 12:08:03.848224] D [glusterd-op-sm.c:5244:glusterd_op_sm] : Dequeued event of type: 'GD_OP_EVENT_ALL_ACC'
[2010-11-11 12:08:03.848241] M [glusterd3_1-mops.c:1220:glusterd3_1_stage_op] : IT IS COMING HERE, and calling glusterd_op_stage_validate
[2010-11-11 12:08:03.848261] D [glusterd-op-sm.c:914:glusterd_op_stage_replace_brick] : src brick=bigbang:/d/glusterfs/export/export1
[2010-11-11 12:08:03.848277] D [glusterd-op-sm.c:924:glusterd_op_stage_replace_brick] : dst brick=bigbang:/e/glusterfs/export/export1
[2010-11-11 12:08:03.848291] D [glusterd-utils.c:795:glusterd_volinfo_find] : Volume vol found
[2010-11-11 12:08:03.848306] D [glusterd-utils.c:803:glusterd_volinfo_find] : Returning 0
[2010-11-11 12:08:03.848320] I [glusterd-utils.c:726:glusterd_volume_brickinfo_get_by_brick] : brick: bigbang:/d/glusterfs/export/export1


The log above shows that glusterd_op_stage_validate is called from glusterd3_1_stage_op, which passes NULL in place of rsp_dict.
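
To illustrate the failure mode described in the two comments above, here is a minimal, self-contained sketch. It does not use the GlusterFS sources: struct toy_dict, toy_dict_set_int, toy_dict_get_int and toy_stage_replace_brick are hypothetical stand-ins invented for this example, and only the key name (src-brick-port) and the port value (24010) are taken from the log. It assumes the staging step should check the response dict before setting the port and report failure to its caller, instead of only logging at debug level and continuing.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for a key/value dictionary; NOT the GlusterFS dict_t. */
#define TOY_DICT_MAX 8

struct toy_dict {
    const char *keys[TOY_DICT_MAX];
    int         values[TOY_DICT_MAX];
    size_t      count;
};

/* Stores key=value. Fails with -1 when the dict itself is NULL, which is
 * the situation the "@this=(nil)" error line in the log reports. */
int toy_dict_set_int(struct toy_dict *this, const char *key, int value)
{
    if (this == NULL || key == NULL) {
        fprintf(stderr, "toy_dict_set_int: @this=%p key=%s\n",
                (void *)this, key ? key : "(null)");
        return -1;
    }
    if (this->count >= TOY_DICT_MAX)
        return -1;
    this->keys[this->count] = key;
    this->values[this->count] = value;
    this->count++;
    return 0;
}

/* Looks up key. Returns 0 and writes *out on success, -1 if not found. */
int toy_dict_get_int(const struct toy_dict *this, const char *key, int *out)
{
    assert(out != NULL); /* programming error, not a runtime condition */
    if (this == NULL || key == NULL)
        return -1;
    for (size_t i = 0; i < this->count; i++) {
        if (strcmp(this->keys[i], key) == 0) {
            *out = this->values[i];
            return 0;
        }
    }
    return -1;
}

/* Hypothetical staging step: records the source brick port in the
 * response dict. Returns -1 when the caller passed no response dict,
 * so the missing port cannot go unnoticed. */
int toy_stage_replace_brick(struct toy_dict *rsp_dict, int src_port)
{
    if (rsp_dict == NULL)
        return -1;
    return toy_dict_set_int(rsp_dict, "src-brick-port", src_port);
}
```

With a caller that passes NULL (as glusterd3_1_stage_op does in the log), toy_stage_replace_brick returns -1 and the port is never recorded. With a zero-initialised struct toy_dict, the same call returns 0 and the port can be read back with toy_dict_get_int. Whether the actual fix guards the callee, allocates the dict in the caller, or both is not stated in this report.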

Comment 3 Anand Avati 2010-11-14 15:26:47 UTC
PATCH: http://patches.gluster.com/patch/5690 in master (cluster/pump: Reset saved path upon pump completion)

Comment 4 Anand Avati 2010-11-14 15:26:51 UTC
PATCH: http://patches.gluster.com/patch/5691 in master (mgmt/glusterd: fixes for uninterrupted replace-brick with nfs)

Comment 5 Anand Avati 2010-11-14 15:26:59 UTC
PATCH: http://patches.gluster.com/patch/5688 in master (check for dict also while setting the port for source brick while doing replace brick)

Comment 6 Anand Avati 2010-11-18 10:55:58 UTC
PATCH: http://patches.gluster.com/patch/5744 in master (mgmt/glusterd: Avoid creating multiple destination brickinfo during replace-brick)

Comment 7 Anand Avati 2010-11-25 05:34:02 UTC
PATCH: http://patches.gluster.com/patch/5778 in master (mgmt/glusterd: Temporary fix for a crash seen in replace-brick)

Comment 8 Raghavendra Bhat 2011-02-21 03:24:38 UTC
Checked with the git head (26cedae57d5b7cb8d50ed077ce29c92e30d6e260). Migration where the source and the destination are on the same machine worked fine.

