Bug 1063830
Summary: | remove-brick/add-brick: remove-brick or add-brick can lead to data loss if there are pending self-heals on any of the subvolumes. | |
---|---|---|---
Product: | [Red Hat Storage] Red Hat Gluster Storage | Reporter: | spandura
Component: | replicate | Assignee: | Ravishankar N <ravishankar>
Status: | CLOSED EOL | QA Contact: | storage-qa-internal <storage-qa-internal>
Severity: | high | Docs Contact: |
Priority: | unspecified | |
Version: | 2.1 | CC: | asriram, nlevinki, ravishankar, rhs-bugs, storage-qa-internal, vbellur
Target Milestone: | --- | |
Target Release: | --- | |
Hardware: | Unspecified | |
OS: | Unspecified | |
Whiteboard: | | |
Fixed In Version: | | Doc Type: | Bug Fix
Doc Text: | Performing add-brick or remove-brick operations on a volume that has replica pairs while there are pending self-heals can cause data loss. Workaround: Ensure that all bricks of the volume are up and that there are no pending self-heals before running either operation. Pending heal information can be viewed with `gluster volume heal <volname> info` (see the command-line sketch after this table). | |
Story Points: | --- | |
Clone Of: | | Environment: |
Last Closed: | 2015-12-03 17:16:45 UTC | Type: | Bug
Regression: | --- | Mount Type: | ---
Documentation: | --- | CRM: |
Verified Versions: | | Category: | ---
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: |
Cloudforms Team: | --- | Target Upstream Version: |
Embargoed: | | |
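
As an illustration of the workaround in the Doc Text field, here is a minimal command-line sketch. The volume name `repvol` is a placeholder, not taken from the report; only the two `gluster` commands themselves come from the documented workaround and standard Gluster CLI usage.

```sh
# Placeholder volume name; substitute the real volume.
VOL=repvol

# Confirm every brick process is up: the "Online" column of `volume status`
# should show Y for each brick before any add-brick/remove-brick operation.
gluster volume status $VOL

# Confirm there are no pending self-heals: each brick reported by `heal info`
# should show "Number of entries: 0".
gluster volume heal $VOL info
```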
Description
spandura
2014-02-11 13:45:05 UTC
Case 2:
=======
Consider a 3 x 2 distribute-replicate volume. If any brick is offline when a distribute subvolume is removed, there will be pending self-heals on the offline brick for the files/directories migrated from the removed bricks. After a successful remove-brick operation, even though the graph has changed and brick5 and brick6 now get the client-2 and client-3 AFR change-log attributes, opendir still refers to the previous, stale client-4 and client-5 change-log attributes of brick5 and brick6 and self-heals all the data.

Steps to Reproduce:
===================
1. Create a 3 x 2 distribute-replicate volume. Start the volume.
2. Create a FUSE mount.
3. Bring brick5 offline.
4. Create 10 files from the mount point.
5. Bring brick5 back online.
6. Wait for self-heal to complete.
7. Bring brick6 offline.
8. Remove distribute sub-volume-1 from the volume.
9. Wait for the migration to complete, then commit the remove-brick operation.
10. Bring brick6 back online.
11. Perform "ls -l" from the mount point. Self-heal happens from brick5 to brick6 and the stale change-logs are still referred to.

(A rough shell sketch of these steps appears at the end of this report.)

Case 1 will be fixed with the persistent changelog implementation planned for Denali, but it needs to be documented as a known issue for Corbett; hence adding to BZ 1035040. The doc text needs to be set.

Case 2 happens if, before self-heal, afr_opendir happens for the first time on an inode, which triggers a conservative merge. This is the expected behaviour by design.

Please review the edited doc text and sign off.

Updated as suggested. Please sign off.

Looks good to me.

Thank you for submitting this issue for consideration in Red Hat Gluster Storage. The release you requested us to review is now End of Life. Please see https://access.redhat.com/support/policy/updates/rhs/ If you can reproduce this bug against a currently maintained version of Red Hat Gluster Storage, please feel free to file a new report against the current release.
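
For readers trying to reproduce this, the following is a rough shell sketch of the steps above. It is a sketch under stated assumptions, not the reporter's exact procedure: the host name `server1`, brick paths `/bricks/b1` through `/bricks/b6`, volume name `repvol`, mount point `/mnt/repvol`, and the `BRICK5_PID`/`BRICK6_PID` placeholders are all hypothetical; only the Gluster commands themselves are standard CLI.

```sh
#!/bin/bash
# Hypothetical names throughout: server1, /bricks/b1..b6, repvol, /mnt/repvol.
VOL=repvol
MNT=/mnt/repvol

# 1. Create and start a 3 x 2 distribute-replicate volume (three replica pairs,
#    grouped in the order given: b1+b2, b3+b4, b5+b6). "force" is needed here
#    only because all bricks of this test setup live on one host.
gluster volume create $VOL replica 2 \
    server1:/bricks/b1 server1:/bricks/b2 \
    server1:/bricks/b3 server1:/bricks/b4 \
    server1:/bricks/b5 server1:/bricks/b6 force
gluster volume start $VOL

# 2. Create a FUSE mount.
mkdir -p $MNT
mount -t glusterfs server1:/$VOL $MNT

# 3. Bring brick5 offline by stopping its glusterfsd process.
#    BRICK5_PID is a placeholder; brick PIDs are listed by `gluster volume status`.
kill -TERM $BRICK5_PID

# 4. Create 10 files from the mount point.
for i in $(seq 1 10); do dd if=/dev/urandom of=$MNT/file$i bs=1M count=1; done

# 5.-6. Bring brick5 back online and wait until heal info shows no pending entries.
gluster volume start $VOL force
gluster volume heal $VOL info

# 7. Bring brick6 offline (BRICK6_PID is again a placeholder).
kill -TERM $BRICK6_PID

# 8.-9. Remove the first distribute subvolume (brick1 + brick2): start the
#       migration, poll its status, then commit once it completes.
gluster volume remove-brick $VOL server1:/bricks/b1 server1:/bricks/b2 start
gluster volume remove-brick $VOL server1:/bricks/b1 server1:/bricks/b2 status
gluster volume remove-brick $VOL server1:/bricks/b1 server1:/bricks/b2 commit

# 10. Bring brick6 back online.
gluster volume start $VOL force

# 11. Trigger lookup/opendir from the mount point; per the report, self-heal
#     then runs from brick5 to brick6 using the stale change-log attributes.
ls -l $MNT
```

Note that the workaround in the Doc Text would stop this sequence at step 7: with brick6 down, `gluster volume heal $VOL info` is not clean, so the remove-brick in step 8 should not be issued on a production volume.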