Bug 1214912 - Failure to recover disperse volume after add-brick failure
Summary: Failure to recover disperse volume after add-brick failure
Keywords:
Status: CLOSED WONTFIX
Alias: None
Product: GlusterFS
Classification: Community
Component: glusterd
Version: 3.6.2
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: medium
Target Milestone: ---
Assignee: bugs@gluster.org
QA Contact:
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2015-04-23 20:21 UTC by vnosov
Modified: 2016-08-01 04:43 UTC
CC: 2 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2016-08-01 04:42:34 UTC
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Embargoed:



Description vnosov 2015-04-23 20:21:36 UTC
Description of problem:

A non-existent brick path was used in an "add-brick" command, and the command failed. When the command was repeated with the correct brick path, it failed again with a message that another brick from the command "is already part of a volume". All subsequent "remove-brick" calls failed as well. The only option left seems to be deleting the volume, which is not acceptable if the volume holds data.


Version-Release number of selected component (if applicable):
GlusterFS 3.6.2


How reproducible:


Steps to Reproduce:
1. Have disperse volume:

[root@SC92 log]# gluster volume info dv3

Volume Name: dv3
Type: Disperse
Volume ID: 9547a2c0-1136-4fc9-915f-47d016a30484
Status: Started
Number of Bricks: 1 x (2 + 1) = 3
Transport-type: tcp
Bricks:
Brick1: 10.10.60.182:/exports/182-ts3/dv3
Brick2: 10.10.60.90:/exports/90-ts3/dv3
Brick3: 10.10.60.92:/exports/92-ts3/dv3
Options Reconfigured:
snap-activate-on-create: enable

2. Issue the "add-brick" command on node SC92, using an invalid path for the brick on node SC90:

[root@SC92 log]# gluster volume add-brick dv3 10.10.60.182:/exports/182-ts4/dv3 10.10.60.90:/exports/90-ts42/dv3 10.10.60.92:/exports/92-ts4/dv3
volume add-brick: failed: Staging failed on 10.10.60.90. Error: Failed to create brick directory for brick 10.10.60.90:/exports/90-ts42/dv3. Reason : No such file or directory



3. Issue the "add-brick" command again on node SC92, this time with the correct brick path on node SC90; it now fails on a different brick:

[root@SC92 log]# gluster volume add-brick dv3 10.10.60.182:/exports/182-ts4/dv3 10.10.60.90:/exports/90-ts4/dv3 10.10.60.92:/exports/92-ts4/dv3
volume add-brick: failed: /exports/92-ts4/dv3 is already part of a volume

4. Check that the volume itself is unchanged:

[root@SC92 log]# gluster volume info dv3

Volume Name: dv3
Type: Disperse
Volume ID: 9547a2c0-1136-4fc9-915f-47d016a30484
Status: Started
Number of Bricks: 1 x (2 + 1) = 3
Transport-type: tcp
Bricks:
Brick1: 10.10.60.182:/exports/182-ts3/dv3
Brick2: 10.10.60.90:/exports/90-ts3/dv3
Brick3: 10.10.60.92:/exports/92-ts3/dv3
Options Reconfigured:
snap-activate-on-create: enable



Actual results:

The repeated "add-brick" command fails with "is already part of a volume", all "remove-brick" calls fail as well, and the volume can no longer be expanded.

Expected results:

A failed "add-brick" should leave the new bricks clean, so that a corrected "add-brick" command succeeds.

Additional info:
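
The "is already part of a volume" rejection comes from extended attributes that glusterd stamps on the brick root while staging the expansion. One way to confirm this on the rejected brick (a diagnostic sketch; this check was not run in the original report):

[root@SC92 log]# getfattr -d -m . -e hex /exports/92-ts4/dv3

If the brick was stamped by the failed attempt in step 2, the dump should include a trusted.glusterfs.volume-id attribute even though the brick never actually joined the volume.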

Comment 1 vnosov 2015-04-24 16:09:11 UTC
The problem is with the bricks that were used for the expansion. After the "add-brick" command fails, some extended attributes are left behind on the expansion bricks. These attributes prevent the bricks from being used in a later "add-brick" command. The volume itself is OK.
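
For completeness, the commonly documented manual recovery for bricks left in this state is to strip the leftover attributes and the internal .glusterfs directory before retrying the expansion (a hedged sketch, not verified on 3.6.2; run only on expansion bricks that were never successfully added):

[root@SC92 log]# setfattr -x trusted.glusterfs.volume-id /exports/92-ts4/dv3
[root@SC92 log]# setfattr -x trusted.gfid /exports/92-ts4/dv3
[root@SC92 log]# rm -rf /exports/92-ts4/dv3/.glusterfs

setfattr may report "No such attribute" for entries that were never set, which is harmless. The same cleanup would apply to /exports/182-ts4/dv3 on 10.10.60.182 and /exports/90-ts4/dv3 on 10.10.60.90, after which "add-brick" can be retried.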

Comment 2 Pranith Kumar K 2015-05-09 17:58:18 UTC
Assigning to glusterd based on comment 1.

Comment 3 Atin Mukherjee 2016-08-01 04:42:34 UTC
This is not a security bug, and we are not going to fix it in 3.6.x; see
http://www.gluster.org/pipermail/gluster-users/2016-July/027682.html

Comment 4 Atin Mukherjee 2016-08-01 04:43:54 UTC
If the issue persists in the latest releases, please feel free to clone this bug against them.

