Bug 966207

Summary: replace-brick commit force refuses to work when it cannot resolve source brick
Product: [Community] GlusterFS
Reporter: hans
Component: glusterd
Assignee: bugs <bugs>
Status: CLOSED EOL
Severity: urgent
Priority: unspecified
Version: pre-release
CC: bugs, earl, gluster-bugs
Target Milestone: ---
Target Release: ---
Hardware: x86_64
OS: Linux
Doc Type: Bug Fix
Clones: 1275884
Last Closed: 2015-10-22 15:40:20 UTC
Type: Bug

Description hans 2013-05-22 19:14:22 UTC
Description of problem:

After a peer probe from the cluster to a new server, a replace-brick run on either the new server or any existing server in the cluster fails if the new server cannot resolve the old server's hostname (DNS). replace-brick commit force bails out with:
brick: oldserver:/brick/a does not exist in volume: vol01
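
A rough way to check for this condition on a given node is to list the brick hosts of the volume and see which ones the local resolver cannot look up. This is only a sketch: the helper names brick_hosts and check_brick_hosts are made up for illustration, and it assumes a Linux node where getent is available and where volume info prints bricks as "BrickN: host:/path".

```shell
# Hypothetical helper: print the unique host part of every
# "BrickN: host:/path" line read from stdin.
brick_hosts() {
    sed -n 's/^Brick[0-9][0-9]*: *\([^:]*\):.*/\1/p' | sort -u
}

# Hypothetical helper: read hostnames from stdin and report which of them
# the local resolver (DNS, /etc/hosts, ...) cannot look up.
check_brick_hosts() {
    while read -r host; do
        if getent hosts "$host" >/dev/null 2>&1; then
            echo "ok: $host"
        else
            echo "UNRESOLVABLE: $host"
        fi
    done
}

# Usage, on each node that has the gluster CLI:
#   gluster volume info vol01 | brick_hosts | check_brick_hosts
```

Any host reported as UNRESOLVABLE on a node is a candidate for the failure described here.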

Version-Release number of selected component (if applicable):

3.3git-v3.3.2qa2-1-g45a9d1e

Actual results:

brick: oldserver:/brick/a does not exist in volume: vol01

Expected results:

replace-brick commit force should work. The old server is irrelevant (and down in my case).

Comment 1 Earl Ruby 2015-10-21 23:12:21 UTC
Still happening on GlusterFS 3.7. I created a 3 node cluster (gfs1, gfs2, gfs3), 3 bricks each, then killed gfs2. I added a 4th host (gfs4) and used "peer probe" to add it to the cluster. I can't detach gfs2 because it has bricks in the cluster, and I can't replace gfs2 bricks with gfs4 bricks because it claims that the gfs2 bricks don't exist. Check it out:

root@gfs1:~# gluster volume info
Volume Name: volume1
Type: Distributed-Replicate
Volume ID: 168a6859-91a5-451e-9c32-ab531b79df16
Status: Started
Number of Bricks: 3 x 3 = 9
Transport-type: tcp
Bricks:
Brick1: gfs1:/data/brick1/files
Brick2: gfs2:/data/brick1/files
Brick3: gfs3:/data/brick1/files
Brick4: gfs1:/data/brick2/files
Brick5: gfs2:/data/brick2/files
Brick6: gfs3:/data/brick2/files
Brick7: gfs1:/data/brick3/files
Brick8: gfs2:/data/brick3/files
Brick9: gfs3:/data/brick3/files
Options Reconfigured:
nfs.disable: off
features.quota-deem-statfs: on
features.inode-quota: on
features.quota: on
performance.readdir-ahead: on

root@gfs1:~# gluster volume replace-brick volume1 gfs2:/data/brick1/files gfs4:/data/brick1/files commit force
volume replace-brick: failed: brick: gfs2:/data/brick1/files does not exist in volume: volume1

I worked around the problem by adding gfs2 to /etc/hosts on gfs1, gfs3, and gfs4. Once I did that I could replace the bricks.
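
The workaround can be sketched roughly as follows. The helper name add_host_entry and the address 192.0.2.12 are placeholders, not values from this report; use whatever address gfs2 had (since the host is down, it only needs to resolve).

```shell
# Hypothetical helper: append "IP NAME" to a hosts file unless NAME is
# already listed as a hostname or alias on a non-comment line.
add_host_entry() {
    hosts_file=$1
    ip=$2
    name=$3
    if ! awk -v n="$name" '
            !/^[[:space:]]*#/ { for (i = 2; i <= NF; i++) if ($i == n) found = 1 }
            END { exit !found }
        ' "$hosts_file" 2>/dev/null; then
        printf '%s\t%s\n' "$ip" "$name" >> "$hosts_file"
    fi
}

# Usage, as root on gfs1, gfs3 and gfs4 (192.0.2.12 is a placeholder address):
#   add_host_entry /etc/hosts 192.0.2.12 gfs2
# Then retry the command that failed above:
#   gluster volume replace-brick volume1 gfs2:/data/brick1/files gfs4:/data/brick1/files commit force
```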

Comment 2 Kaleb KEITHLEY 2015-10-22 15:40:20 UTC
The "pre-release" version is ambiguous and is about to be removed as a choice.

If you believe this is still a bug, please change the status back to NEW and choose the appropriate, applicable version for it.

Comment 3 Earl Ruby 2015-10-27 23:21:19 UTC
Kaleb:

You might have the ability to reopen this bug or set the version number, but I do not.

I did state that the problem is still present with GlusterFS 3.7 in my comment above.

Comment 4 Earl Ruby 2015-10-28 04:35:12 UTC
Well, I can't reopen this bug, but I can clone it, so I did. The cloned copy can be found at https://bugzilla.redhat.com/show_bug.cgi?id=1275884