Bug 875412 - cannot replace brick in distributed replicated volume
Summary: cannot replace brick in distributed replicated volume
Keywords:
Status: CLOSED DUPLICATE of bug 877522
Alias: None
Product: GlusterFS
Classification: Community
Component: glusterd
Version: mainline
Hardware: x86_64
OS: Linux
Priority: medium
Severity: unspecified
Target Milestone: ---
Assignee: krishnan parthasarathi
QA Contact:
URL:
Whiteboard:
Depends On:
Blocks: 878872
 
Reported: 2012-11-11 03:30 UTC by ricor.bz
Modified: 2015-11-03 23:05 UTC
CC List: 2 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Cloned To: 878872
Environment:
Last Closed: 2012-12-24 10:45:55 UTC
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Embargoed:


Description ricor.bz 2012-11-11 03:30:13 UTC
Description of problem:
Unable to replace a brick in an online 4-node Gluster volume.
Four CentOS nodes serve a distributed-replicated volume; the attempt is to replace one node's brick with a brick on a newly probed Ubuntu node using the replace-brick command.

Version-Release number of selected component (if applicable):
gluster 3.3.1 CentOS 6.3
gluster 3.3.1 Ubuntu Server 12.10

How reproducible:
occurs on all nodes

Steps to Reproduce:
1. gluster peer probe ubus01.node
2. gluster volume replace-brick gvol_1 ceno1.node:/exports/lv01 ubus01.node:/exports/lv01 start 
  
Actual results:
/exports/lv01 or a prefix of it is already part of a volume

Expected results:
data should be migrated from node ceno1.node to node ubus01.node

Additional info:
This is the first time the node is being added to the volume.
All bricks are ext4 with mount options noatime,user_xattrs.
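
For context (not part of the original report): in GlusterFS 3.3 this message is produced by glusterd_is_path_in_use(), which walks the proposed brick path and its parent directories looking for GlusterFS extended attributes such as trusted.glusterfs.volume-id and trusted.gfid. A quick way to check whether such an attribute is actually present on the new brick host is a getfattr walk along the lines of the sketch below; the paths are taken from the report, and the loop itself is only an illustration.

# Diagnostic sketch (assumed commands, run on ubus01.node): dump any extended
# attributes on the proposed brick path and each of its parent directories.
for d in /exports/lv01 /exports /; do
    echo "== $d"
    sudo getfattr -d -m . -e hex "$d" 2>/dev/null
done

If any of these directories carries trusted.glusterfs.volume-id or trusted.gfid, glusterd will refuse the path; if none of them does, the rejection is coming from the staging check itself.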

Comment 1 krishnan parthasarathi 2012-11-12 06:38:48 UTC
Ricor,
Could you paste the output of "gluster peer status" from all the nodes?
Please attach the glusterd log files from ubus01.node and ceno1.node.

Comment 2 ricor.bz 2012-11-13 14:44:44 UTC
Okay.

I had to replace one of the CentOS nodes (treating it as a FAILED server, because glusterd refused to start up healthily after a restart).

That process completed rather smoothly, so there is now one working Ubuntu node in the volume.

*******************
gluster volume info
*******************
Volume Name: disrep-vol
Type: Distributed-Replicate
Volume ID: 2dfe4f23-8d10-4a88-85e2-97d3e72c13c4
Status: Started
Number of Bricks: 2 x 2 = 4
Transport-type: tcp
Bricks:
Brick1: ceno4.node:/exports/vol_01
Brick2: ubus02.node:/exports/vol_01
Brick3: ceno2.node:/exports/vol_01
Brick4: ceno1.node:/exports/vol_01
Options Reconfigured:
performance.cache-size: 64MB
auth.allow: 172.16.100.2,172.16.100.3,172.16.100.20,172.16.100.21,172.16.100.22,172.16.100.23,172.16.100.24,172.16.100.25
nfs.addr-namelookup: off
nfs.rpc-auth-allow: 172.16.100.2,172.16.100.3,172.16.100.20,172.16.100.21,172.16.100.22,172.16.100.23,172.16.100.24
nfs.disable: off
nfs.ports-insecure: on


***********************
gluster peer status
***********************
**ceno1.node
Number of Peers: 4

Hostname: 172.16.100.21
Uuid: c5c4c4fc-a22d-4f11-a42f-0d1f4ef7af70
State: Peer in Cluster (Connected)

Hostname: ceno2.node
Uuid: fa7005c3-a929-4110-b1be-ccd206000a67
State: Peer in Cluster (Connected)

Hostname: ceno4.node
Uuid: 6afbf15a-4294-44a0-b351-5c45b8142513
State: Peer in Cluster (Connected)

Hostname: ubus01.node
Uuid: 2547029f-15c1-4349-ba61-7bd9e226110d
State: Peer in Cluster (Connected)


**ceno4.node
Number of Peers: 4

Hostname: ceno1.node
Uuid: 85a362dd-79eb-48cf-80bd-f675617ad01e
State: Peer in Cluster (Connected)

Hostname: ceno2.node
Uuid: fa7005c3-a929-4110-b1be-ccd206000a67
State: Peer in Cluster (Connected)

Hostname: 172.16.100.21
Uuid: c5c4c4fc-a22d-4f11-a42f-0d1f4ef7af70
State: Peer in Cluster (Connected)

Hostname: ubus01.node
Uuid: 2547029f-15c1-4349-ba61-7bd9e226110d
State: Peer in Cluster (Connected)


**ubus02.node
Number of Peers: 4

Hostname: ceno1.node
Uuid: 85a362dd-79eb-48cf-80bd-f675617ad01e
State: Peer in Cluster (Connected)

Hostname: ceno4.node
Uuid: 6afbf15a-4294-44a0-b351-5c45b8142513
State: Peer in Cluster (Connected)

Hostname: ceno2.node
Uuid: fa7005c3-a929-4110-b1be-ccd206000a67
State: Peer in Cluster (Connected)

Hostname: ubus01.node
Uuid: 2547029f-15c1-4349-ba61-7bd9e226110d
State: Peer in Cluster (Connected)


**ubus01.node
Number of Peers: 4

Hostname: ceno2.node
Uuid: fa7005c3-a929-4110-b1be-ccd206000a67
State: Peer in Cluster (Connected)

Hostname: ceno1.node
Uuid: 85a362dd-79eb-48cf-80bd-f675617ad01e
State: Peer in Cluster (Connected)

Hostname: ceno4.node
Uuid: 6afbf15a-4294-44a0-b351-5c45b8142513
State: Peer in Cluster (Connected)

Hostname: 172.16.100.21
Uuid: c5c4c4fc-a22d-4f11-a42f-0d1f4ef7af70
State: Peer in Cluster (Connected)


**ceno2.node
Number of Peers: 4

Hostname: ceno1.node
Uuid: 85a362dd-79eb-48cf-80bd-f675617ad01e
State: Peer in Cluster (Connected)

Hostname: ceno4.node
Uuid: 6afbf15a-4294-44a0-b351-5c45b8142513
State: Peer in Cluster (Connected)

Hostname: 172.16.100.21
Uuid: c5c4c4fc-a22d-4f11-a42f-0d1f4ef7af70
State: Peer in Cluster (Connected)

Hostname: ubus01.node
Uuid: 2547029f-15c1-4349-ba61-7bd9e226110d
State: Peer in Cluster (Connected)


***********************
gluster volume replace-brick disrep-vol ceno1.node:/exports/vol_01 ubus01.node:/exports/vol_01 start
***********************

Contents of /var/log/glusterfs/etc-glusterfs-glusterd.vol.log:

******ubus01.node
[2012-11-12 16:21:31.013000] I [glusterd-handler.c:502:glusterd_handle_cluster_lock] 0-glusterd: Received LOCK from uuid: fa7005c3-a929-4110-b1be-ccd206000a67
[2012-11-12 16:21:31.013103] I [glusterd-utils.c:285:glusterd_lock] 0-glusterd: Cluster lock held by fa7005c3-a929-4110-b1be-ccd206000a67
[2012-11-12 16:21:31.013172] I [glusterd-handler.c:1322:glusterd_op_lock_send_resp] 0-glusterd: Responded, ret: 0
[2012-11-12 16:21:31.013956] I [glusterd-handler.c:547:glusterd_req_ctx_create] 0-glusterd: Received op from uuid: fa7005c3-a929-4110-b1be-ccd206000a67
[2012-11-12 16:21:31.014092] I [glusterd-utils.c:857:glusterd_volume_brickinfo_get_by_brick] 0-: brick: ceno1.node:/exports/vol_01
[2012-11-12 16:21:31.014168] I [glusterd-utils.c:814:glusterd_volume_brickinfo_get] 0-management: Found brick
[2012-11-12 16:21:31.014870] E [glusterd-utils.c:4490:glusterd_is_path_in_use] 0-management: /exports/vol_01 or a prefix of it is already part of a volume
[2012-11-12 16:21:31.014927] E [glusterd-op-sm.c:2716:glusterd_op_ac_stage_op] 0-: Validate failed: -1
[2012-11-12 16:21:31.015010] I [glusterd-handler.c:1423:glusterd_op_stage_send_resp] 0-glusterd: Responded to stage, ret: 0
[2012-11-12 16:21:31.015233] I [glusterd-handler.c:1366:glusterd_handle_cluster_unlock] 0-glusterd: Received UNLOCK from uuid: fa7005c3-a929-4110-b1be-ccd206000a67
[2012-11-12 16:21:31.015322] I [glusterd-handler.c:1342:glusterd_op_unlock_send_resp] 0-glusterd: Responded to unlock, ret: 0


******ceno1.node
[2012-11-12 16:21:32.217035] I [glusterd-handler.c:502:glusterd_handle_cluster_lock] 0-glusterd: Received LOCK from uuid: fa7005c3-a929-4110-b1be-ccd206000a67
[2012-11-12 16:21:32.217165] I [glusterd-utils.c:285:glusterd_lock] 0-glusterd: Cluster lock held by fa7005c3-a929-4110-b1be-ccd206000a67
[2012-11-12 16:21:32.217232] I [glusterd-handler.c:1322:glusterd_op_lock_send_resp] 0-glusterd: Responded, ret: 0
[2012-11-12 16:21:32.217973] I [glusterd-handler.c:547:glusterd_req_ctx_create] 0-glusterd: Received op from uuid: fa7005c3-a929-4110-b1be-ccd206000a67
[2012-11-12 16:21:32.218084] I [glusterd-utils.c:857:glusterd_volume_brickinfo_get_by_brick] 0-: brick: ceno1.node:/exports/vol_01
[2012-11-12 16:21:32.218478] I [glusterd-utils.c:814:glusterd_volume_brickinfo_get] 0-management: Found brick
[2012-11-12 16:21:32.218981] I [glusterd-handler.c:1423:glusterd_op_stage_send_resp] 0-glusterd: Responded to stage, ret: 0
[2012-11-12 16:21:32.219282] I [glusterd-handler.c:1366:glusterd_handle_cluster_unlock] 0-glusterd: Received UNLOCK from uuid: fa7005c3-a929-4110-b1be-ccd206000a67
[2012-11-12 16:21:32.219336] I [glusterd-handler.c:1342:glusterd_op_unlock_send_resp] 0-glusterd: Responded to unlock, ret: 0



****ubus01.node
sudo ls -la /exports/vol_01/
total 24
drwxr-xr-x 3 root root  4096 Dec 31  2006 .
drwxr-xr-x 3 root root  4096 Nov 12 16:05 ..
drwx------ 2 root root 16384 Dec 31  2006 lost+found
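
Not part of the original exchange, but worth noting: the listing above shows a freshly formatted brick directory containing only lost+found, yet the staging check on ubus01.node still rejects it. The commonly documented workaround for this message only applies when stale GlusterFS attributes really are present on the brick path; the attribute names below are the standard ones, and whether they exist here is an assumption to be verified first (for example with the getfattr walk sketched under the original description).

# Workaround sketch for a genuinely stale brick directory only; verify with
# getfattr first. If no such attributes exist, this will not help and the
# rejection points at the staging check itself.
sudo setfattr -x trusted.glusterfs.volume-id /exports/vol_01
sudo setfattr -x trusted.gfid /exports/vol_01
sudo rm -rf /exports/vol_01/.glusterfs   # only if such a leftover directory is present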

Comment 3 krishnan parthasarathi 2012-12-24 10:45:55 UTC

*** This bug has been marked as a duplicate of bug 877522 ***

Comment 4 Niels de Vos 2015-05-26 12:33:16 UTC
This bug has been CLOSED, and there has been no response to the requested NEEDINFO for more than 4 weeks. The NEEDINFO flag is now being cleared to keep our Bugzilla housekeeping in order.

