+++ This bug was initially created as a clone of Bug #1605077 +++

Description of problem:
In a cluster of n nodes, if a node goes down during a volume delete operation, then when the node comes back online it still has the information about the deleted volume. The node treats this volume as a freshly created volume and displays its name when the volume list command is run. None of the remaining nodes in the cluster have any information about this volume.

Version-Release number of selected component (if applicable):
mainline

How reproducible:
Always

Steps to Reproduce:
1.
2.
3.

Actual results:

Expected results:
When the disconnected node is back online, the deleted volume's info should be removed from the node. The volume list command should not display the name of the deleted volume.

Additional info:

--- Additional comment from Worker Ant on 2018-07-31 03:29:32 EDT ---

REVIEW: https://review.gluster.org/20592 (glusterd: ignore importing volume which is undergoing a delete operation) posted (#1) for review on master by Atin Mukherjee

--- Additional comment from Worker Ant on 2018-08-16 08:37:20 EDT ---

COMMIT: https://review.gluster.org/20592 committed in master by "Atin Mukherjee" <amukherj> with a commit message- glusterd: ignore importing volume which is undergoing a delete operation

Problem explanation:
Assume a 3 node cluster. If N1 originates a delete operation and, while N1's commit phase completes, the glusterd service of either N2 or N3 gets disconnected from N1 (before completing the commit phase), N1 will end up importing the volume, which is still in flight for a delete on the other nodes, as a fresh volume, resulting in an incorrect configuration state.

Fix:
Mark a volume as stage deleted once a volume delete operation passes its staging phase, and reset this flag during the unlock phase. During this intermediate phase, if the same volume gets imported to other peers, it shouldn't be considered to be recreated.
An automated .t is quite tough to implement with the current infra.

Test Case:
1. Keep creating and deleting volumes in a loop on a 3 node cluster
2. Simulate n/w failure between the peers (ifdown followed by ifup)
3. Check if output of 'gluster v list | wc -l' is same across all 3 nodes during 1 & 2.

Change-Id: Ifdd5dc39699120258d7fdd42fe2deb9de25c6246
Fixes: bz#1605077
Signed-off-by: Atin Mukherjee <amukherj>
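
The behaviour described in the commit message above can be modelled roughly as follows. This is only a minimal Python sketch of the idea, not glusterd's C code: the names Volume, Node, stage_deleted, import_from_peer, honor_flag and simulate are all hypothetical, and the model covers only the import decision on the originating node, not cleanup of the stale volume on the node that was disconnected.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Volume:
    name: str
    # Hypothetical flag: set once a delete passes staging, reset on unlock.
    stage_deleted: bool = False


@dataclass
class Node:
    name: str
    volumes: Dict[str, Volume] = field(default_factory=dict)

    def stage_delete(self, vol_name: str) -> None:
        # Staging phase passed: mark the volume as undergoing a delete.
        self.volumes[vol_name].stage_deleted = True

    def commit_delete(self, vol_name: str) -> None:
        # Commit phase: the volume is removed from this node.
        self.volumes.pop(vol_name, None)

    def unlock(self, vol_name: str) -> None:
        # Unlock phase: reset the flag if the volume still exists here.
        vol = self.volumes.get(vol_name)
        if vol is not None:
            vol.stage_deleted = False

    def import_from_peer(self, peer: "Node", honor_flag: bool = True) -> None:
        # Import volumes that the peer knows about and this node does not.
        for vol in peer.volumes.values():
            if vol.name in self.volumes:
                continue
            if honor_flag and vol.stage_deleted:
                # The delete is still in flight on the peer: do not treat
                # the volume as freshly created.
                continue
            self.volumes[vol.name] = Volume(vol.name)


def simulate(honor_flag: bool) -> List[str]:
    # Three nodes that all start with the same volume.
    n1, n2, n3 = (Node(n, {"vol0": Volume("vol0")}) for n in ("N1", "N2", "N3"))

    # N1 originates the delete; staging succeeds on every node.
    for node in (n1, n2, n3):
        node.stage_delete("vol0")

    # N2 is disconnected before its commit phase, so it misses both the
    # commit and the unlock. N1 and N3 complete them.
    for node in (n1, n3):
        node.commit_delete("vol0")
        node.unlock("vol0")

    # N2 reconnects and N1 compares volume data with it.
    n1.import_from_peer(n2, honor_flag=honor_flag)
    return sorted(n1.volumes)


if __name__ == "__main__":
    print("without flag check:", simulate(honor_flag=False))
    print("with flag check:   ", simulate(honor_flag=True))
```

With honor_flag=False the model reproduces the reported symptom: N1 re-imports vol0 from N2 as if it were a new volume. With honor_flag=True the import is skipped because N2 still carries the stage-deleted mark.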
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2018:3432