Bug 1566822 - [Remove-brick] Many files were not migrated from the decommissioned bricks; commit results in data loss
Summary: [Remove-brick] Many files were not migrated from the decommissioned bricks; commit results in data loss
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: GlusterFS
Classification: Community
Component: distribute
Version: 4.0
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ---
Assignee: bugs@gluster.org
QA Contact:
URL:
Whiteboard:
Depends On: 1553677 1564198
Blocks: 1566820
 
Reported: 2018-04-13 04:14 UTC by Nithya Balachandran
Modified: 2018-05-07 15:15 UTC (History)
CC: 4 users

Fixed In Version: glusterfs-4.0.2
Doc Type: If docs needed, set a value
Doc Text:
Clone Of: 1564198
Environment:
Last Closed: 2018-05-07 15:15:28 UTC
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:



Description Nithya Balachandran 2018-04-13 04:14:29 UTC
+++ This bug was initially created as a clone of Bug #1564198 +++

+++ This bug was initially created as a clone of Bug #1553677 +++

Description of problem:
=======================
Many files were not migrated from the decommissioned bricks; commit results in data loss.

Version-Release number of selected component (if applicable):
3.12.2-5.el7rhgs.x86_64

How reproducible:
Reporting at first occurrence

Steps to Reproduce:
===================
1) Create a replica-3 volume with brick multiplexing enabled and start it.
2) FUSE mount it on multiple clients.
3) From client-1: run a script to create directories and files continuously.
 From client-2: start a Linux kernel untar.
 From client-3: while true; do find; done
 From client-4: while true; do ls -lRt; done
4) While step 3 is in progress, kill the server-1 brick process using kill -9 <pid>.
Because brick multiplexing is enabled, killing a single brick process with kill -9 takes down all the bricks on that node.
5) Add 3 new bricks to the volume and, after a few seconds, start removing the old bricks.
6) Wait for remove-brick to complete.

Actual results:
===============
Many files were not migrated from the decommissioned bricks; commit results in data loss.

Expected results:
=================
Remove-brick operation should migrate all the files from the decommissioned brick.

RCA:

The logs from the previous failed runs indicate two problems:

1. At least one process could not read directories because the first_up_subvol was not in that process's list of local_subvols.
2. Since a brick was down, some files were not migrated if their gfid hashed to that brick's node-uuid.
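The second problem can be sketched in isolation: the node-uuid recorded for a downed brick is saved as NULL, so files whose gfid hashes to that slot end up with no node responsible for migrating them. Below is a minimal, self-contained sketch of the fallback the fix introduces (pick the first non-null entry in the set); the names and types here are hypothetical stand-ins, not the actual DHT rebalance code:

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define UUID_LEN 16
typedef unsigned char node_uuid_t[UUID_LEN];

/* A uuid saved as all zeroes models a brick that was down when
 * rebalance collected the node-uuids for this replica set. */
static bool uuid_is_null(const node_uuid_t u)
{
    static const node_uuid_t zero = {0};
    return memcmp(u, zero, UUID_LEN) == 0;
}

/* Pick the index of the node responsible for migrating a file.
 * The file's gfid hashes to slot `hashed`; if that slot holds a
 * NULL uuid, fall back to the first non-null entry in the set so
 * the file is still migrated by some node. Returns -1 only if
 * every entry is NULL (the whole set was down). */
static int pick_migrator(const node_uuid_t *uuids, int count, int hashed)
{
    if (!uuid_is_null(uuids[hashed]))
        return hashed;
    for (int i = 0; i < count; i++) {
        if (!uuid_is_null(uuids[i]))
            return i;
    }
    return -1;
}
```

Without the fallback, the function would simply return `hashed`, and any file hashing to a NULL slot would be skipped; that skipped set is exactly the data lost on commit.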

--- Additional comment from Worker Ant on 2018-04-05 12:17:33 EDT ---

REVIEW: https://review.gluster.org/19827 (cluster/dht: Wind open to all subvols) posted (#1) for review on master by N Balachandran

--- Additional comment from Worker Ant on 2018-04-06 06:46:22 EDT ---

REVIEW: https://review.gluster.org/19831 (cluster/dht: Handle file migrations when brick down) posted (#1) for review on master by N Balachandran

--- Additional comment from Worker Ant on 2018-04-11 09:19:03 EDT ---

COMMIT: https://review.gluster.org/19827 committed in master by "Shyamsundar Ranganathan" <srangana@redhat.com> with a commit message- cluster/dht: Wind open to all subvols

dht_opendir should wind the open to all subvols
whether or not local->subvols is set. This is
because dht_readdirp winds the calls to all subvols.

Change-Id: I67a96b06dad14a08967c3721301e88555aa01017
updates: bz#1564198
Signed-off-by: N Balachandran <nbalacha@redhat.com>
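The point of this patch, that opendir must be wound to every subvolume because dht_readdirp later reads from all of them, can be illustrated with a toy model; the struct and function names below are hypothetical stand-ins for the xlator machinery, not GlusterFS APIs:

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model: a DHT-like layer over `nsubvols` child subvolumes.
 * (In the real bug, winding open only to local_subvols left some
 * subvols without an open directory fd.) */
struct toy_dht {
    int  nsubvols;
    bool opened[8];   /* which subvols have the directory open */
};

/* Wind the open to ALL subvols, whether or not a local subset is
 * set: the later readdirp reads from every subvolume, so each one
 * needs the directory opened. */
static void toy_opendir(struct toy_dht *dht)
{
    for (int i = 0; i < dht->nsubvols; i++)
        dht->opened[i] = true;
}

/* A readdirp against a subvol with no open fd would fail; this
 * check models the invariant the patch restores. */
static bool toy_readdirp_ok(const struct toy_dht *dht)
{
    for (int i = 0; i < dht->nsubvols; i++)
        if (!dht->opened[i])
            return false;
    return true;
}
```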

--- Additional comment from Worker Ant on 2018-04-12 22:27:57 EDT ---

COMMIT: https://review.gluster.org/19831 committed in master by "Raghavendra G" <rgowdapp@redhat.com> with a commit message- cluster/dht: Handle file migrations when brick down

The decision as to which node would migrate a file
was based on the gfid of the file. Files were divided
among the nodes for the replica/disperse set. However,
if a brick was down when rebalance started, the nodeuuids
would be saved as NULL and a set of files would not be migrated.

Now, if the nodeuuid is NULL, the first non-null entry in
the set is the node responsible for migrating the file.

Change-Id: I72554c107792c7d534e0f25640654b6f8417d373
fixes: bz#1564198
Signed-off-by: N Balachandran <nbalacha@redhat.com>

Comment 1 Nithya Balachandran 2018-04-13 04:48:20 UTC
Patches:

https://review.gluster.org/19864
https://review.gluster.org/19865

Comment 2 Worker Ant 2018-04-13 13:15:29 UTC
REVISION POSTED: https://review.gluster.org/19864 (cluster/dht: Wind open to all subvols) posted (#2) for review on release-4.0 by Shyamsundar Ranganathan

Comment 3 Worker Ant 2018-04-13 13:15:34 UTC
REVIEW: https://review.gluster.org/19864 (cluster/dht: Wind open to all subvols) posted (#2) for review on release-4.0 by Shyamsundar Ranganathan

Comment 4 Worker Ant 2018-04-13 13:50:47 UTC
REVISION POSTED: https://review.gluster.org/19865 (cluster/dht: Handle file migrations when brick down) posted (#2) for review on release-4.0 by Shyamsundar Ranganathan

Comment 5 Worker Ant 2018-04-13 13:50:53 UTC
REVIEW: https://review.gluster.org/19865 (cluster/dht: Handle file migrations when brick down) posted (#2) for review on release-4.0 by Shyamsundar Ranganathan

Comment 6 Worker Ant 2018-04-18 13:22:13 UTC
COMMIT: https://review.gluster.org/19864 committed in release-4.0 by "N Balachandran" <nbalacha@redhat.com> with a commit message- cluster/dht: Wind open to all subvols

dht_opendir should wind the open to all subvols
whether or not local->subvols is set. This is
because dht_readdirp winds the calls to all subvols.

(cherry picked from commit c4251edec654b4e0127577e004923d9729bc323d)

Change-Id: I67a96b06dad14a08967c3721301e88555aa01017
updates: bz#1566822
Signed-off-by: N Balachandran <nbalacha@redhat.com>

Comment 7 Worker Ant 2018-04-18 13:22:33 UTC
COMMIT: https://review.gluster.org/19865 committed in release-4.0 by "Shyamsundar Ranganathan" <srangana@redhat.com> with a commit message- cluster/dht: Handle file migrations when brick down

The decision as to which node would migrate a file
was based on the gfid of the file. Files were divided
among the nodes for the replica/disperse set. However,
if a brick was down when rebalance started, the nodeuuids
would be saved as NULL and a set of files would not be migrated.

Now, if the nodeuuid is NULL, the first non-null entry in
the set is the node responsible for migrating the file.

Change-Id: I72554c107792c7d534e0f25640654b6f8417d373
fixes: bz#1566822
Signed-off-by: N Balachandran <nbalacha@redhat.com>

(cherry picked from commit 1f0765242a689980265c472646c64473a92d94c0)

Change-Id: I3072ca1f2975eb7ad3c38798e65d60d2312fd057

Comment 8 Shyamsundar 2018-05-07 15:15:28 UTC
This bug is getting closed because a release has been made available that should address the reported issue. In case the problem is still not fixed with glusterfs-4.0.2, please open a new bug report.

glusterfs-4.0.2 has been announced on the Gluster mailing lists [1]; packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution.

[1] http://lists.gluster.org/pipermail/announce/2018-April/000097.html
[2] https://www.gluster.org/pipermail/gluster-users/

