+++ This bug was initially created as a clone of Bug #1344407 +++

Description of problem:
If a volume is deleted while one of the glusterd instances in the cluster is down, then once that glusterd comes back up it re-syncs the same volume to all of the nodes. Users will be annoyed to see the deleted volume reappear in the namespace.

Version-Release number of selected component (if applicable):
mainline

How reproducible:

Steps to Reproduce:
1.
2.
3.

Actual results:

Expected results:

Additional info:

--- Additional comment from Vijay Bellur on 2016-06-09 11:38:08 EDT ---

REVIEW: http://review.gluster.org/14681 (glusterd: fail volume delete if one of the node is down) posted (#2) for review on master by Atin Mukherjee (amukherj)

--- Additional comment from Vijay Bellur on 2016-06-10 03:31:02 EDT ---

COMMIT: http://review.gluster.org/14681 committed in master by Kaushal M (kaushal)
------
commit 5016cc548d4368b1c180459d6fa8ae012bb21d6e
Author: Atin Mukherjee <amukherj>
Date:   Thu Jun 9 18:22:43 2016 +0530

    glusterd: fail volume delete if one of the node is down

    Deleting a volume on a cluster where one of the nodes is down is buggy,
    since once that node comes back up the same volume is re-synced across
    the cluster. Until the soft-delete feature tracked in
    http://review.gluster.org/12963 is brought in, this serves as a
    safeguard to block the volume deletion.

    Change-Id: I9c13869c4a7e7a947f88842c6dc6f231c0eeda6c
    BUG: 1344407
    Signed-off-by: Atin Mukherjee <amukherj>
    Reviewed-on: http://review.gluster.org/14681
    Smoke: Gluster Build System <jenkins.com>
    CentOS-regression: Gluster Build System <jenkins.com>
    Reviewed-by: Kaushal M <kaushal>
    NetBSD-regression: NetBSD Build System <jenkins.org>
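For illustration, here is a minimal, self-contained C sketch of the kind of guard the patch describes: before honouring a volume delete, the management daemon walks its peer list and rejects the operation if any peer's glusterd is not connected, so the volume cannot be re-synced back when the offline node returns. All names (peer_t, cluster_t, can_delete_volume) and the error text are hypothetical and are not taken from the actual glusterd sources.

/*
 * Illustrative sketch only -- not the actual glusterd patch from
 * http://review.gluster.org/14681. It models the safeguard described there:
 * reject a volume delete if any peer's glusterd instance is down.
 * All type and function names here are hypothetical.
 */
#include <stdbool.h>
#include <stdio.h>

typedef struct {
    const char *hostname;
    bool        connected;   /* false if the glusterd instance is down */
} peer_t;

typedef struct {
    peer_t *peers;
    size_t  npeers;
} cluster_t;

/* Return false (and fill errbuf) if any peer is not reachable. */
static bool
can_delete_volume(const cluster_t *cluster, char *errbuf, size_t errlen)
{
    for (size_t i = 0; i < cluster->npeers; i++) {
        if (!cluster->peers[i].connected) {
            snprintf(errbuf, errlen,
                     "peer %s is down; volume delete is not allowed",
                     cluster->peers[i].hostname);
            return false;
        }
    }
    return true;
}

int
main(void)
{
    peer_t peers[] = {
        { "node1", true  },
        { "node2", false },  /* glusterd down on this node */
    };
    cluster_t cluster = { peers, 2 };
    char err[256];

    if (!can_delete_volume(&cluster, err, sizeof(err)))
        fprintf(stderr, "volume delete failed: %s\n", err);
    else
        printf("volume delete allowed\n");

    return 0;
}

Rejecting the delete up front is simpler than trying to reconcile the stale volume later, which is why the commit treats this check as a stop-gap until the soft-delete feature tracked in http://review.gluster.org/12963 lands.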
Downstream patch https://code.engineering.redhat.com/gerrit/76322 posted for review.
Laura, this needs your attention and hence I am raising a need_info now :) ~Atin
Laura, saying that the node is unavailable would not be technically correct, since that could mean the node is down or under maintenance. The issue is specifically about one or more glusterd instances being down. Can you please reword? ~Atin
LGTM :)
Verified this bug using the build "glusterfs-3.7.9-10".

The fix is working as expected: the volume cannot be deleted while peer nodes are down, it can be deleted once the offline nodes come back up, and new volumes can be created afterwards.

Test cases verified for this fix:
=====================================
1. Stop and delete the volume when one of the nodes is down - Pass
2. Delete the volume after starting up the node that was shut down - Pass
3. Stop and delete the volume when nodes are down - Pass
4. Bring up one of the two offline nodes and delete the volume - Pass
5. Bring up all of the offline nodes and delete the volume - Pass
6. Delete the volume when one of the peer nodes that is not hosting volume bricks is offline - Pass
7. Stop the volume while all nodes are online, take one node offline, and delete the volume - Pass
8. Stop the volume while one peer node is down, probe a new node, and delete the volume - Pass
9. Create a volume (without starting it), take one node down, and delete the volume - Pass
10. With multiple volumes present, take one peer node down and delete the volumes - Pass
11. Delete the volume(s) once the offline node(s) come back up - Pass
12. Delete the volume after powering off one of the peer nodes - Pass
13. With one node down, create a volume and try to delete it - Pass
14. Create a volume, take one node down, and create a new volume using the online nodes' bricks - Pass

With all of the above details, moving to the verified state.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2016:1240