Bug 763816 (GLUSTER-2084)
| Field | Value | Field | Value |
|---|---|---|---|
| Summary | [3.1.1qa5] : replace-brick fails to migrate data when migration from same hostname | | |
| Product | [Community] GlusterFS | Reporter | Harshavardhana <fharshav> |
| Component | glusterd | Assignee | Vijay Bellur <vbellur> |
| Status | CLOSED CURRENTRELEASE | QA Contact | |
| Severity | low | Docs Contact | |
| Priority | low | | |
| Version | mainline | CC | cww, gluster-bugs, rabhat, vijay |
| Target Milestone | --- | | |
| Target Release | --- | | |
| Hardware | All | | |
| OS | Linux | | |
| Fixed In Version | | Doc Type | Bug Fix |
| Doc Text | | Story Points | --- |
| Last Closed | | Type | --- |
| Regression | RTP | Mount Type | --- |
| Documentation | DNR | CRM | |
| Verified Versions | | Category | --- |
| oVirt Team | --- | RHEL 7.3 requirements from Atomic Host | |
| Cloudforms Team | --- | Target Upstream Version | |
Description
Harshavardhana
2010-11-11 01:16:15 UTC
[2010-11-11 10:08:01.959440] D [glusterd-op-sm.c:5337:glusterd_op_set_cli_op] : Returning 0
[2010-11-11 10:08:01.959641] I [glusterd-handler.c:1225:glusterd_handle_replace_brick] glusterd: Received replace brick req
[2010-11-11 10:08:01.959694] D [glusterd-handler.c:1258:glusterd_handle_replace_brick] : src brick=bigbang:/d/glusterfs/export/export1
[2010-11-11 10:08:01.959733] D [glusterd-handler.c:1268:glusterd_handle_replace_brick] : dst brick=bigbang:/e/glusterfs/export/export1
[2010-11-11 10:08:01.959819] I [glusterd-utils.c:232:glusterd_lock] glusterd: Cluster lock held by 92782297-9b10-4a6d-8aca-a866b764bdf1
[2010-11-11 10:08:01.959858] I [glusterd-handler.c:2788:glusterd_op_txn_begin] glusterd: Acquired local lock
[2010-11-11 10:08:01.959896] D [glusterd-op-sm.c:5194:glusterd_op_sm_inject_event] glusterd: Enqueuing event: 'GD_OP_EVENT_START_LOCK'
[2010-11-11 10:08:01.959933] D [glusterd-handler.c:2792:glusterd_op_txn_begin] glusterd: Returning 0
[2010-11-11 10:08:01.960000] D [glusterd-op-sm.c:5242:glusterd_op_sm] : Dequeued event of type: 'GD_OP_EVENT_START_LOCK'
[2010-11-11 10:08:01.960038] I [glusterd3_1-mops.c:1105:glusterd3_1_cluster_lock] glusterd: Sent lock req to 0 peers
[2010-11-11 10:08:01.960073] D [glusterd3_1-mops.c:1108:glusterd3_1_cluster_lock] glusterd: Returning 0
[2010-11-11 10:08:01.960109] D [glusterd-op-sm.c:5194:glusterd_op_sm_inject_event] glusterd: Enqueuing event: 'GD_OP_EVENT_ALL_ACC'
[2010-11-11 10:08:01.960144] D [glusterd-op-sm.c:221:glusterd_op_sm_inject_all_acc] : Returning 0
[2010-11-11 10:08:01.960179] D [glusterd-op-sm.c:4023:glusterd_op_ac_send_lock] : Returning with 0
[2010-11-11 10:08:01.960216] D [glusterd-utils.c:2605:glusterd_sm_tr_log_transition_add] glusterd: Transitioning from 'Default' to 'Lock sent' due to event 'GD_OP_EVENT_START_LOCK'
[2010-11-11 10:08:01.960254] D [glusterd-utils.c:2607:glusterd_sm_tr_log_transition_add] : returning 0
[2010-11-11 10:08:01.960290] D [glusterd-op-sm.c:5242:glusterd_op_sm] : Dequeued event of type: 'GD_OP_EVENT_ALL_ACC'
[2010-11-11 10:08:01.960347] D [glusterd-op-sm.c:914:glusterd_op_stage_replace_brick] : src brick=bigbang:/d/glusterfs/export/export1
[2010-11-11 10:08:01.960389] D [glusterd-op-sm.c:924:glusterd_op_stage_replace_brick] : dst brick=bigbang:/e/glusterfs/export/export1
[2010-11-11 10:08:01.960483] D [glusterd-utils.c:795:glusterd_volinfo_find] : Volume vol found
[2010-11-11 10:08:01.960522] D [glusterd-utils.c:803:glusterd_volinfo_find] : Returning 0
[2010-11-11 10:08:01.960557] D [glusterd-utils.c:2358:glusterd_is_rb_started] : is_rb_started:status=0
[2010-11-11 10:08:01.960583] I [glusterd-utils.c:726:glusterd_volume_brickinfo_get_by_brick] : brick: bigbang:/d/glusterfs/export/export1
[2010-11-11 10:08:01.960771] D [glusterd-utils.c:199:glusterd_is_local_addr] glusterd: bigbang is local
[2010-11-11 10:08:01.960789] D [glusterd-utils.c:2188:glusterd_hostname_to_uuid] : returning 0
[2010-11-11 10:08:01.960803] I [glusterd-utils.c:697:glusterd_volume_brickinfo_get] : Found brick
[2010-11-11 10:08:01.960816] D [glusterd-utils.c:708:glusterd_volume_brickinfo_get] : Returning 0
[2010-11-11 10:08:01.960829] D [glusterd-utils.c:755:glusterd_volume_brickinfo_get_by_brick] : Returning 0
[2010-11-11 10:08:01.960880] D [glusterd-utils.c:199:glusterd_is_local_addr] glusterd: bigbang is local
[2010-11-11 10:08:01.960896] D [glusterd-op-sm.c:1026:glusterd_op_stage_replace_brick] : I AM THE SOURCE HOST
[2010-11-11 10:08:01.960937] E [dict.c:308:dict_set] dict: @this=(nil) @value=0x21ee110, key=src-brick-port
[2010-11-11 10:08:01.960953] D [glusterd-op-sm.c:1033:glusterd_op_stage_replace_brick] : Could not set src-brick-port=24010
[2010-11-11 10:08:01.961093] D [glusterd-utils.c:199:glusterd_is_local_addr] glusterd: bigbang is local
[2010-11-11 10:08:01.961110] D [glusterd-utils.c:2188:glusterd_hostname_to_uuid] : returning 0
[2010-11-11 10:08:01.961124] D [glusterd-utils.c:708:glusterd_volume_brickinfo_get] : Returning -1
[2010-11-11 10:08:01.961147] D [glusterd-utils.c:610:glusterd_brickinfo_new] : Returning 0
[2010-11-11 10:08:01.961163] D [glusterd-utils.c:667:glusterd_brickinfo_from_brick] : Returning 0
[2010-11-11 10:08:01.961217] D [glusterd-utils.c:199:glusterd_is_local_addr] glusterd: bigbang is local
[2010-11-11 10:08:01.961262] D [glusterd-utils.c:2436:glusterd_brick_create_path] : returning 0
[2010-11-11 10:08:01.961278] D [glusterd-op-sm.c:1112:glusterd_op_stage_replace_brick] : Returning 0
[2010-11-11 10:08:01.961291] D [glusterd-op-sm.c:4941:glusterd_op_stage_validate] : Returning 0

I have observed it too. I think it happens because glusterd_op_stage_replace_brick sets the source brick port number in the response dict (rsp_dict), and that dict has somehow become NULL:

[2010-11-11 10:08:01.960896] D [glusterd-op-sm.c:1026:glusterd_op_stage_replace_brick] : I AM THE SOURCE HOST
[2010-11-11 10:08:01.960937] E [dict.c:308:dict_set] dict: @this=(nil) @value=0x21ee110, key=src-brick-port
[2010-11-11 10:08:01.960953] D [glusterd-op-sm.c:1033:glusterd_op_stage_replace_brick] : Could not set src-brick-port=24010

[2010-11-11 12:08:03.848210] D [glusterd-utils.c:2607:glusterd_sm_tr_log_transition_add] : returning 0
[2010-11-11 12:08:03.848224] D [glusterd-op-sm.c:5244:glusterd_op_sm] : Dequeued event of type: 'GD_OP_EVENT_ALL_ACC'
[2010-11-11 12:08:03.848241] M [glusterd3_1-mops.c:1220:glusterd3_1_stage_op] : IT IS COMING HERE, and calling glusterd_op_stage_validate
[2010-11-11 12:08:03.848261] D [glusterd-op-sm.c:914:glusterd_op_stage_replace_brick] : src brick=bigbang:/d/glusterfs/export/export1
[2010-11-11 12:08:03.848277] D [glusterd-op-sm.c:924:glusterd_op_stage_replace_brick] : dst brick=bigbang:/e/glusterfs/export/export1
[2010-11-11 12:08:03.848291] D [glusterd-utils.c:795:glusterd_volinfo_find] : Volume vol found
[2010-11-11 12:08:03.848306] D [glusterd-utils.c:803:glusterd_volinfo_find] : Returning 0
[2010-11-11 12:08:03.848320] I [glusterd-utils.c:726:glusterd_volume_brickinfo_get_by_brick] : brick: bigbang:/d/glusterfs/export/export1

The log above shows that glusterd_op_stage_validate is called from glusterd3_1_stage_op, which passes NULL in place of rsp_dict.

PATCH: http://patches.gluster.com/patch/5690 in master (cluster/pump: Reset saved path upon pump completion)
PATCH: http://patches.gluster.com/patch/5691 in master (mgmt/glusterd: fixes for uninterrupted replace-brick with nfs)
PATCH: http://patches.gluster.com/patch/5688 in master (check for dict also while setting the port for source brick while doing replace brick)
PATCH: http://patches.gluster.com/patch/5744 in master (mgmt/glusterd: Avoid creating multiple destination brickinfo during replace-brick)
PATCH: http://patches.gluster.com/patch/5778 in master (mgmt/glusterd: Temporary fix for a crash seen in replace-brick)

Checked with the git head (26cedae57d5b7cb8d50ed077ce29c92e30d6e260). Migration where the source and the destination are on the same machine worked fine.
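
The dict_set failure discussed in this report is easy to miss among the debug output, so a small script can help pick it out. The following is a minimal sketch in Python, assuming the entry layout seen in the logs quoted here ([timestamp] LEVEL [file:line:function] message); that layout is inferred from this report only, and the names parse_entries and null_dict_errors are hypothetical helpers, not part of GlusterFS.

```python
import re

# Timestamp as it appears in the quoted glusterd log entries.
_TS = r"\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+"

# One entry: [timestamp] LEVEL [file:line:function] message
# The message runs lazily up to the next timestamp or the end of the text,
# so entries that were pasted on a single line are still separated.
ENTRY_RE = re.compile(
    r"\[(?P<ts>" + _TS + r")\]\s+"
    r"(?P<level>[A-Z])\s+"
    r"\[(?P<file>[^:\]]+):(?P<line>\d+):(?P<func>[^\]]+)\]\s*"
    r"(?P<msg>.*?)(?=\s*\[" + _TS + r"\]|\Z)",
    re.DOTALL,
)


def parse_entries(text):
    """Split raw log text into a list of dicts, one per log entry."""
    entries = []
    for m in ENTRY_RE.finditer(text):
        entries.append({
            "ts": m.group("ts"),
            "level": m.group("level"),
            "file": m.group("file"),
            "line": int(m.group("line")),
            "func": m.group("func"),
            "msg": m.group("msg").strip(),
        })
    return entries


def null_dict_errors(entries):
    """Return error entries where dict_set was called on a nil dict."""
    return [
        e for e in entries
        if e["level"] == "E"
        and e["func"] == "dict_set"
        and "@this=(nil)" in e["msg"]
    ]


# Three entries copied from this report, joined on one line as in the paste.
SAMPLE = (
    "[2010-11-11 10:08:01.960896] D [glusterd-op-sm.c:1026:glusterd_op_stage_replace_brick] : I AM THE SOURCE HOST "
    "[2010-11-11 10:08:01.960937] E [dict.c:308:dict_set] dict: @this=(nil) @value=0x21ee110, key=src-brick-port "
    "[2010-11-11 10:08:01.960953] D [glusterd-op-sm.c:1033:glusterd_op_stage_replace_brick] : Could not set src-brick-port=24010"
)

if __name__ == "__main__":
    for entry in null_dict_errors(parse_entries(SAMPLE)):
        print(entry["ts"], entry["file"], entry["line"], entry["msg"])
```

Run against the sample, the script reports only the dict.c:308 entry, which is the call that received a nil dict for key src-brick-port.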