Red Hat Bugzilla – Bug 1279319
[GlusterD]: Volume start fails post add-brick on a volume which is not started
Last modified: 2017-03-25 10:24:29 EDT
Description of problem:
Volume start fails with "Commit failed" when a brick is added to a volume in the stopped state.
Steps to Reproduce:
1. Have a one-node cluster
2. Create a volume of type Distribute (1*1) // ** DON'T START THE VOLUME **
3. Add a new brick
4. Start the volume now // it will fail
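The steps above can be reproduced roughly as follows with the gluster CLI. The volume name, hostname, and brick paths (`testvol`, `server1`, `/bricks/...`) are placeholders, not taken from the report:

```
# 1. One-node cluster: no peer probe needed.

# 2. Create a plain distribute volume with a single brick; do NOT start it.
gluster volume create testvol server1:/bricks/brick1

# 3. Add a second brick while the volume is still in the Created state.
gluster volume add-brick testvol server1:/bricks/brick2

# 4. Attempt to start the volume; on an affected build this fails:
gluster volume start testvol
# volume start: testvol: failed: Commit failed on localhost.
```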
Volume start is failing with "Commit failed"
Volume start should succeed without any issue.
Root cause analysis (RCA):
The add-brick code path introduced a regression when brick(s) are added to a volume which is not started. Although add-brick reports success, it does not generate the volfiles. As a result, when the brick processes are spawned during volume start, __server_getspec in glusterd fails because the brick volfile does not exist.
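One way to confirm the missing volfile on an affected node is to inspect the glusterd working directory, assuming the default location /var/lib/glusterd (the volume and brick names below are placeholders):

```
# glusterd stores per-brick volfiles under its working directory.
# After the faulty add-brick, list the volume's directory:
ls /var/lib/glusterd/vols/testvol/

# On an affected build, the volfile for the newly added brick
# (named like testvol.server1.bricks-brick2.vol) is absent, so
# __server_getspec has nothing to serve when the brick process
# requests its volfile at volume start.
```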
Additional info to reproduce the issue:
Update the node from a 3.1.1 build to 3.1.2, then follow the remaining steps specified in the Description section.
The add-brick implementation was changed to the v3 framework in upstream glusterfs-3.7.6. If the cluster is running a version equal to or less than GLUSTERFS_3_7_5, we fall back to the older implementation. This bug is in the fallback code, which completes the commit phase without creating the volfiles for the newly added brick.
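Whether a cluster takes the fallback path is governed by the cluster-wide op-version, which can be inspected from the CLI. As a sketch (the numeric value 30705 for GLUSTERFS_3_7_5 follows the usual glusterd op-version encoding and should be double-checked against the running build):

```
# Show the cluster-wide operating version; values at or below the
# op-version corresponding to GLUSTERFS_3_7_5 (30705) select the
# older add-brick implementation containing this bug.
gluster volume get all cluster.op-version
```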
The fix is already part of rhgs-3.1.2 as per comment 6; moving it to ON_QA.
Verified this bug using the build glusterfs-3.8.4-5.el7rhgs.x86_64.
The fix works as expected; the reported issue no longer exists.
Moving to VERIFIED state.