Bug 1618221 - If a node disconnects during volume delete, it treats the deleted volume as a freshly created volume when it is back online
Summary: If a node disconnects during volume delete, it treats the deleted volume as a freshly created volume when it is back online
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Gluster Storage
Classification: Red Hat Storage
Component: glusterd
Version: rhgs-3.4
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: urgent
Target Milestone: ---
Target Release: RHGS 3.4.z Batch Update 1
Assignee: Atin Mukherjee
QA Contact: Bala Konda Reddy M
URL:
Whiteboard: ocs-dependency-issue
Depends On: 1605077
Blocks: 1565940 1582402 1589070 1631248
 
Reported: 2018-08-16 12:42 UTC by Atin Mukherjee
Modified: 2022-07-09 10:10 UTC
CC: 14 users

Fixed In Version: glusterfs-3.12.2-24
Doc Type: Bug Fix
Doc Text:
Clone Of: 1605077
Clones: 1631248
Environment:
Last Closed: 2018-10-31 08:46:14 UTC
Embargoed:




Links
Red Hat Product Errata RHSA-2018:3432 (last updated 2018-10-31 08:47:58 UTC)

Description Atin Mukherjee 2018-08-16 12:42:11 UTC
+++ This bug was initially created as a clone of Bug #1605077 +++

Description of problem:
In a cluster of n nodes, if a node goes down during a volume delete operation, it still holds the deleted volume's information when it comes back online. The node treats this volume as a freshly created volume and displays its name in the output of the volume list command. None of the remaining nodes in the cluster have any information about this volume.
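
For illustration, a hypothetical transcript of the symptom (node names n1/n2 and the volume name testvol are examples, not taken from this report). Suppose testvol was deleted while n2 was offline, and n2 has since rejoined the cluster:

[root@n2 ~]# gluster volume list
testvol

[root@n1 ~]# gluster volume list
No volumes present in cluster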

Version-Release number of selected component (if applicable):
mainline

How reproducible:
Always

Steps to Reproduce:
1. Create a volume on a cluster of n nodes.
2. Bring one node down, then delete the volume from one of the remaining nodes.
3. Bring the node back online and run the volume list command on it.

Actual results:
The node that was down still carries the deleted volume's info and displays its name in the volume list output; the remaining nodes do not.

Expected results:
When the disconnected node comes back online, the deleted volume's info should be removed from that node, and the volume list command should not display the deleted volume's name.

Additional info:

--- Additional comment from Worker Ant on 2018-07-31 03:29:32 EDT ---

REVIEW: https://review.gluster.org/20592 (glusterd: ignore importing volume which is undergoing a delete operation) posted (#1) for review on master by Atin Mukherjee

--- Additional comment from Worker Ant on 2018-08-16 08:37:20 EDT ---

COMMIT: https://review.gluster.org/20592 committed in master by "Atin Mukherjee" <amukherj> with a commit message- glusterd: ignore importing volume which is undergoing a delete operation

Problem explanation:

In a 3-node cluster, suppose N1 originates a delete operation and, while N1's
commit phase completes, the glusterd service of N2 or N3 gets disconnected
from N1 before finishing its own commit phase. N1 will then end up importing
the volume, which is still in-flight for a delete on the other nodes, as a
fresh volume, resulting in an incorrect configuration state.

Fix:

Mark a volume as stage deleted once a volume delete operation passes its
staging phase, and reset this flag during the unlock phase. Now, if the same
volume gets imported to other peers during this intermediate phase, it
shouldn't be considered to be recreated.

An automated .t test is quite tough to implement with the current infra.

Test Case:

1. Keep creating and deleting volumes in a loop on a 3-node cluster.
2. Simulate a network failure between the peers (ifdown followed by ifup).
3. Check that the output of 'gluster v list | wc -l' is the same across all 3
nodes during steps 1 and 2 (a rough script sketch follows the commit message).

Change-Id: Ifdd5dc39699120258d7fdd42fe2deb9de25c6246
Fixes: bz#1605077
Signed-off-by: Atin Mukherjee <amukherj>
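
A rough shell sketch of the test-case loop above (the hostnames n1/n2/n3, the brick paths, the interface name, and the iteration count are assumptions for illustration, not part of the patch):

#!/bin/bash
# Create and delete volumes in a loop on one node while the peer
# network is flapped, then compare volume counts across the nodes.
for i in $(seq 1 100); do
    gluster volume create "vol$i" replica 3 \
        n1:/bricks/b$i n2:/bricks/b$i n3:/bricks/b$i force
    gluster --mode=script volume delete "vol$i"
done &

# Simulate a network failure between peers while the loop runs
# (volume creates may fail while the peer is down; that is expected).
ssh n2 'ifdown eth0; sleep 10; ifup eth0'
wait

# All three nodes should report the same count.
for h in n1 n2 n3; do
    printf '%s: ' "$h"; ssh "$h" 'gluster volume list | wc -l'
done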

Comment 16 errata-xmlrpc 2018-10-31 08:46:14 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2018:3432
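
To confirm that a node already carries the fixed build (glusterfs-3.12.2-24, per the Fixed In Version field above), a check along these lines should do:

rpm -q glusterfs
# expect glusterfs-3.12.2-24 or later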

