Bug 1566820 - [Remove-brick] Many files were not migrated from the decommissioned bricks; commit results in data loss
Summary: [Remove-brick] Many files were not migrated from the decommissioned bricks; commit results in data loss
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: GlusterFS
Classification: Community
Component: distribute
Version: 3.12
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ---
Assignee: bugs@gluster.org
QA Contact:
URL:
Whiteboard:
Depends On: 1553677 1564198 1566822
Blocks:
 
Reported: 2018-04-13 04:13 UTC by Nithya Balachandran
Modified: 2018-05-07 15:10 UTC
CC List: 4 users

Fixed In Version: glusterfs-3.12.9
Clone Of: 1564198
Environment:
Last Closed: 2018-05-07 15:10:10 UTC
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Embargoed:



Description Nithya Balachandran 2018-04-13 04:13:51 UTC
+++ This bug was initially created as a clone of Bug #1564198 +++

+++ This bug was initially created as a clone of Bug #1553677 +++

Description of problem:
=======================
Many files were not migrated from the decommissioned bricks; commit results in data loss.

Version-Release number of selected component (if applicable):
3.12.2-5.el7rhgs.x86_64

How reproducible:
Reported at first occurrence

Steps to Reproduce:
===================
1) Create an x3 volume with brick-mux enabled and start it.
2) FUSE-mount it on multiple clients.
3) From client-1: run a script that creates folders and files continuously.
   From client-2: start a Linux kernel untar.
   From client-3: while true; do find; done
   From client-4: while true; do ls -lRt; done
4) While step 3 is in progress, kill the server-1 brick process using kill -9 <pid>.
   Since brick-mux is enabled, killing a single brick with kill -9 takes down all the bricks on that node.
5) Add 3 bricks to the volume and, after a few seconds, start removing the old bricks.
6) Wait for remove-brick to complete.

Actual results:
===============
Many files were not migrated from the decommissioned bricks; commit results in data loss.

Expected results:
=================
The remove-brick operation should migrate all files from the decommissioned bricks.

RCA:

The logs from the previous failed runs indicate 2 problems:

1. At least one process could not read directories because the first_up_subvol was not in the list of local_subvols for that process.
2. Since a brick was down, files whose gfids hashed to that brick's node-uuid were not migrated.

--- Additional comment from Worker Ant on 2018-04-05 12:17:33 EDT ---

REVIEW: https://review.gluster.org/19827 (cluster/dht: Wind open to all subvols) posted (#1) for review on master by N Balachandran

--- Additional comment from Worker Ant on 2018-04-06 06:46:22 EDT ---

REVIEW: https://review.gluster.org/19831 (cluster/dht: Handle file migrations when brick down) posted (#1) for review on master by N Balachandran

--- Additional comment from Worker Ant on 2018-04-11 09:19:03 EDT ---

COMMIT: https://review.gluster.org/19827 committed in master by "Shyamsundar Ranganathan" <srangana> with a commit message- cluster/dht: Wind open to all subvols

dht_opendir should wind the open to all subvols
whether or not local->subvols is set. This is
because dht_readdirp winds the calls to all subvols.

Change-Id: I67a96b06dad14a08967c3721301e88555aa01017
updates: bz#1564198
Signed-off-by: N Balachandran <nbalacha>
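
[Editorial note: the following is a minimal, self-contained model of the fan-out behaviour described in the commit message above. It is not the GlusterFS xlator code; the type and function names (subvol_t, opendir_local_only, opendir_all, readdirp_all) are illustrative. The point it demonstrates: dht_readdirp asks every subvolume for entries, so if opendir is wound only to the subvolumes in local->subvols, any subvolume outside that list (e.g. the first_up_subvol from the RCA) has no open fd and its entries silently go missing.]

/* Build with: cc -std=c99 -o opendir_model opendir_model.c */
#include <stdbool.h>
#include <stdio.h>

#define SUBVOL_CNT 3

typedef struct {
    const char *name;
    bool        is_local;   /* member of local->subvols for this process */
    bool        fd_opened;  /* did the opendir reach this subvolume?     */
} subvol_t;

/* Buggy behaviour: only the "local" subvolumes get the opendir. */
static void opendir_local_only(subvol_t *subvols, int cnt)
{
    for (int i = 0; i < cnt; i++)
        subvols[i].fd_opened = subvols[i].is_local;
}

/* Fixed behaviour: wind the open to all subvolumes unconditionally. */
static void opendir_all(subvol_t *subvols, int cnt)
{
    for (int i = 0; i < cnt; i++)
        subvols[i].fd_opened = true;
}

/* readdirp-style fan-out: every subvolume is asked for entries, so a
 * subvolume without an open fd contributes nothing and the listing
 * silently loses entries. Returns the number of unreadable subvolumes. */
static int readdirp_all(const subvol_t *subvols, int cnt)
{
    int missing = 0;
    for (int i = 0; i < cnt; i++) {
        if (!subvols[i].fd_opened) {
            printf("readdirp on %s: no fd, entries skipped\n", subvols[i].name);
            missing++;
        }
    }
    return missing;
}

int main(void)
{
    /* subvol-2 plays the role of first_up_subvol but is not local. */
    subvol_t subvols[SUBVOL_CNT] = {
        { "subvol-0", true,  false },
        { "subvol-1", true,  false },
        { "subvol-2", false, false },
    };

    opendir_local_only(subvols, SUBVOL_CNT);
    int missing = readdirp_all(subvols, SUBVOL_CNT);
    printf("before fix: %d subvol(s) unreadable\n", missing);

    opendir_all(subvols, SUBVOL_CNT);
    missing = readdirp_all(subvols, SUBVOL_CNT);
    printf("after fix:  %d subvol(s) unreadable\n", missing);
    return 0;
}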

--- Additional comment from Worker Ant on 2018-04-12 22:27:57 EDT ---

COMMIT: https://review.gluster.org/19831 committed in master by "Raghavendra G" <rgowdapp> with a commit message- cluster/dht: Handle file migrations when brick down

The decision as to which node would migrate a file
was based on the gfid of the file. Files were divided
among the nodes for the replica/disperse set. However,
if a brick was down when rebalance started, the nodeuuids
would be saved as NULL and a set of files would not be migrated.

Now, if the nodeuuid is NULL, the first non-null entry in
the set is the node responsible for migrating the file.

Change-Id: I72554c107792c7d534e0f25640654b6f8417d373
fixes: bz#1564198
Signed-off-by: N Balachandran <nbalacha>
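
[Editorial note: the following is a minimal, self-contained sketch of the selection logic described in the commit message above, not the actual dht rebalance code; the hash and uuid handling are simplified and the names (pick_node_old, pick_node_new) are illustrative. Each file hashes by gfid to one slot in its replica/disperse set's node-uuid list, and that node migrates the file. If a brick was down when rebalance started, its slot is NULL; before the fix, files hashing to that slot were never migrated, while the fix falls back to the first non-NULL node-uuid in the set.]

/* Build with: cc -std=c99 -o nodeuuid_model nodeuuid_model.c */
#include <stdio.h>

#define SET_SIZE 3  /* bricks per replica/disperse set */

/* Old behaviour: may return NULL if the hashed brick was down. */
static const char *pick_node_old(const char *node_uuids[SET_SIZE],
                                 unsigned gfid_hash)
{
    return node_uuids[gfid_hash % SET_SIZE];
}

/* Fixed behaviour: fall back to the first non-NULL node-uuid. */
static const char *pick_node_new(const char *node_uuids[SET_SIZE],
                                 unsigned gfid_hash)
{
    const char *node = node_uuids[gfid_hash % SET_SIZE];
    if (node != NULL)
        return node;
    for (int i = 0; i < SET_SIZE; i++) {
        if (node_uuids[i] != NULL)
            return node_uuids[i];  /* first non-NULL entry owns the file */
    }
    return NULL;                   /* all bricks down: nothing can migrate */
}

int main(void)
{
    /* The second brick's node-uuid was saved as NULL because that brick
     * was down when rebalance started. */
    const char *node_uuids[SET_SIZE] = { "uuid-node-1", NULL, "uuid-node-3" };

    for (unsigned gfid_hash = 0; gfid_hash < 6; gfid_hash++) {
        const char *old_node = pick_node_old(node_uuids, gfid_hash);
        const char *new_node = pick_node_new(node_uuids, gfid_hash);
        printf("gfid-hash %u: old=%s new=%s\n", gfid_hash,
               old_node ? old_node : "(none, file skipped)",
               new_node ? new_node : "(none)");
    }
    return 0;
}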

Comment 1 Nithya Balachandran 2018-04-13 04:32:34 UTC
Patches:

https://review.gluster.org/#/c/19862/
https://review.gluster.org/#/c/19863/

Comment 2 Worker Ant 2018-04-13 14:45:17 UTC
REVISION POSTED: https://review.gluster.org/19862 (cluster/dht: Wind open to all subvols) posted (#2) for review on release-3.12 by N Balachandran

Comment 3 Worker Ant 2018-04-13 14:45:23 UTC
REVIEW: https://review.gluster.org/19862 (cluster/dht: Wind open to all subvols) posted (#2) for review on release-3.12 by N Balachandran

Comment 4 Worker Ant 2018-04-13 14:46:18 UTC
REVISION POSTED: https://review.gluster.org/19863 (cluster/dht: Handle file migrations when brick down) posted (#2) for review on release-3.12 by N Balachandran

Comment 5 Worker Ant 2018-04-13 14:46:24 UTC
REVIEW: https://review.gluster.org/19863 (cluster/dht: Handle file migrations when brick down) posted (#2) for review on release-3.12 by N Balachandran

Comment 6 Worker Ant 2018-04-18 13:24:22 UTC
COMMIT: https://review.gluster.org/19862 committed in release-3.12 by "Shyamsundar Ranganathan" <srangana> with a commit message- cluster/dht: Wind open to all subvols

dht_opendir should wind the open to all subvols
whether or not local->subvols is set. This is
because dht_readdirp winds the calls to all subvols.

Change-Id: I67a96b06dad14a08967c3721301e88555aa01017
updates: bz#1566820
Signed-off-by: N Balachandran <nbalacha>
(cherry picked from commit c4251edec654b4e0127577e004923d9729bc323d)

Comment 7 Worker Ant 2018-04-18 13:24:52 UTC
COMMIT: https://review.gluster.org/19863 committed in release-3.12 by "Shyamsundar Ranganathan" <srangana> with a commit message- cluster/dht: Handle file migrations when brick down

The decision as to which node would migrate a file
was based on the gfid of the file. Files were divided
among the nodes for the replica/disperse set. However,
if a brick was down when rebalance started, the nodeuuids
would be saved as NULL and a set of files would not be migrated.

Now, if the nodeuuid is NULL, the first non-null entry in
the set is the node responsible for migrating the file.

Change-Id: I72554c107792c7d534e0f25640654b6f8417d373
fixes: bz#1566820
Signed-off-by: N Balachandran <nbalacha>

(cherry picked from commit 1f0765242a689980265c472646c64473a92d94c0)

Change-Id: Id1a6e847b0191b6a40707bea789a2a35ea3d9f68

Comment 8 Shyamsundar 2018-05-07 15:10:10 UTC
This bug is being closed because a release that should address the reported issue has been made available. If the problem is still not fixed with glusterfs-3.12.9, please open a new bug report.

glusterfs-3.12.9 has been announced on the Gluster mailing lists [1]; packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and on the update infrastructure for your distribution.

[1] http://lists.gluster.org/pipermail/announce/2018-April/000096.html
[2] https://www.gluster.org/pipermail/gluster-users/

