Bug 1319592

Summary:	DHT-rebalance: rebalance status shows failed when replica pair bricks are brought down in distrep volume while re-name of files going on
Product:	[Red Hat Storage] Red Hat Gluster Storage	Reporter:	Nithya Balachandran <nbalacha>
Component:	distribute	Assignee:	Nithya Balachandran <nbalacha>
Status:	CLOSED ERRATA	QA Contact:	krishnaram Karthick <kramdoss>
Severity:	high	Docs Contact:
Priority:	high
Version:	rhgs-3.1	CC:	amukherj, annair, asriram, asrivast, bmohanra, byarlaga, nbalacha, rcyriac, rgowdapp, rhinduja, rhs-bugs, rmekala, sankarshan, sashinde, shmohan, smohan, spalai
Target Milestone:	---	Keywords:	ZStream
Target Release:	RHGS 3.1.3
Hardware:	x86_64
OS:	Linux
Whiteboard:
Fixed In Version:	glusterfs-3.7.9-1	Doc Type:	Bug Fix
Doc Text:		Story Points:	---
Clone Of:	1237059	Environment:
Last Closed:	2016-06-23 05:04:13 UTC	Type:	Bug
Regression:	---	Mount Type:	---
Documentation:	---	CRM:
Verified Versions:		Category:	---
oVirt Team:	---	RHEL 7.3 requirements from Atomic Host:
Cloudforms Team:	---	Target Upstream Version:
Embargoed:
Bug Depends On:
Bug Blocks:	1311817

Comment 7 krishnaram Karthick 2016-04-19 08:27:57 UTC

Verified the fix on build glusterfs-server-3.7.9-1.el7rhgs.x86_64.

Steps followed to verify:

Test1:
1) Created a 4 x 2 distributed-replicate volume (bricks brick-1 through brick-8)
2) Created a directory and, under it, 10k files
3) Added 4 more bricks
4) Started the rebalance process
5) Killed brick-1

The rebalance process halted on the replica pair of brick-1 and brick-2; rebalance on the other bricks ran to completion. The rebalance status showed no inconsistency. This is expected behavior, because rebalance of all files under a directory fails when readdirp fails on that directory. To validate this, Test2 was performed.

Test2:

1) Created a 4 x 2 distributed-replicate volume (bricks brick-1 through brick-8)
2) Created 100 directories, dir-{1..100}
3) Created 1k files under each directory
4) Added 4 more bricks
5) Started the rebalance process
6) Killed brick-1
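
For anyone repeating Test2, the volume-level steps (1, 4 and 5) can be sketched roughly as below. This is only an illustration: the volume name, hostnames and brick paths are hypothetical and were not part of the original run, and the script only builds the gluster CLI command lines, it does not execute them. Creating the directories and files on a client mount and killing the brick process are left out.

```python
# Sketch only: VOLUME, HOSTS and the /bricks/... paths are hypothetical
# placeholders, not values taken from the original test setup.
VOLUME = "testvol"
HOSTS = ["server1", "server2"]


def brick_list(start, count):
    """Return brick specs brick-<start> .. brick-<start+count-1>.

    Hosts alternate, so each consecutive pair of bricks (one replica set
    with 'replica 2') sits on two different servers.
    """
    return [
        f"{HOSTS[(n - 1) % len(HOSTS)]}:/bricks/brick-{n}"
        for n in range(start, start + count)
    ]


def test2_commands():
    """Build (but do not run) the gluster CLI calls for the volume steps."""
    return [
        # Step 1: 4 x 2 distributed-replicate volume, brick-1 .. brick-8
        ["gluster", "volume", "create", VOLUME, "replica", "2", *brick_list(1, 8)],
        ["gluster", "volume", "start", VOLUME],
        # Step 4: add 4 more bricks (two more replica pairs)
        ["gluster", "volume", "add-brick", VOLUME, *brick_list(9, 4)],
        # Step 5: start rebalance, then poll its status
        ["gluster", "volume", "rebalance", VOLUME, "start"],
        ["gluster", "volume", "rebalance", VOLUME, "status"],
    ]


if __name__ == "__main__":
    for cmd in test2_commands():
        print(" ".join(cmd))
```

The rebalance status command would be run on each node after brick-1 is killed, to compare the reported state across nodes.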

The rebalance process continued on all replica pairs. When readdirp failed on one directory, rebalance moved on to the subsequent directories. This is as expected, and the rebalance status was consistent across all the nodes.
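
The difference between the two tests can be pictured with a toy model. The snippet below is not glusterfs code; the function name and data layout are made up for illustration. It only assumes what is stated above: a readdirp failure on a directory means none of that directory's files are processed, while the crawl carries on with the remaining directories.

```python
def rebalance(directories, failing_dirs):
    """Toy model of a per-directory crawl (hypothetical, not the DHT code).

    directories:  dict mapping a directory name to its list of file names
    failing_dirs: set of directory names whose listing is assumed to fail
    Returns (migrated file paths, skipped directory names).
    """
    migrated, skipped = [], []
    for name, files in directories.items():
        if name in failing_dirs:
            # Stands in for readdirp failing: every file in this directory
            # is skipped, but the loop continues with the next directory.
            skipped.append(name)
            continue
        migrated.extend(f"{name}/{f}" for f in files)
    return migrated, skipped


if __name__ == "__main__":
    # Test1 shape: a single directory, so one failure stops all progress.
    print(rebalance({"dir": ["f1", "f2"]}, {"dir"}))
    # Test2 shape: many directories, so one failure only skips that one.
    many = {f"dir-{i}": ["f1", "f2"] for i in range(1, 4)}
    print(rebalance(many, {"dir-2"}))
```

With a single directory (Test1), one failure leaves nothing for that crawl to do; with many directories (Test2), the same failure costs only one directory.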

Hence, marking this bug as verified.

Comment 9 errata-xmlrpc 2016-06-23 05:04:13 UTC

Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2016:1240