Bug 1597116

Summary:	afr: don't update readables if inode refresh failed on all children
Product:	[Community] GlusterFS
Component:	replicate
Version:	4.1
Status:	CLOSED CURRENTRELEASE
Severity:	unspecified
Priority:	unspecified
Hardware:	Unspecified
OS:	Unspecified
Reporter:	Ravishankar N <ravishankar>
Assignee:	Ravishankar N <ravishankar>
CC:	bugs
Keywords:	Triaged
Fixed In Version:	glusterfs-4.1.2
Clone Of:	1584483
Last Closed:	2018-07-30 18:57:21 UTC
Type:	Bug
Bug Depends On:	1584483, 1599247

Description Ravishankar N 2018-07-02 05:55:05 UTC

+++ This bug was initially created as a clone of Bug #1584483 +++

Description of problem:

This bug tracks whether BZ 1329505 can be fixed properly, since the stopgap fix for it is being reverted at https://review.gluster.org/#/c/20028.

Problem: If inode refresh fails on all children of afr with ENOENT (say, because the file was migrated by dht), afr resets the readables to zero. Any in-flight txn that later arrives on the inode then fails with EIO because the inode has no readable children.

Fix: Don't update readables when inode refresh fails on *all* children of afr. That way, any in-flight txn will either perform its own inode refresh if needed and fail with the right errno, or use the old value of readables and continue.
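
The rule in the fix can be illustrated with a minimal, self-contained sketch. This is not the actual afr code: the struct and function names (sketch_inode_ctx, sketch_refresh_reply, sketch_update_readables) and the fixed child limit are hypothetical, chosen only to show the "leave readables alone when every child failed" behaviour.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_MAX_CHILDREN 16 /* hypothetical limit, for the sketch only */

/* Hypothetical per-inode state: one "readable" flag per replica child. */
struct sketch_inode_ctx {
    bool readable[SKETCH_MAX_CHILDREN];
};

/* Hypothetical per-child result of an inode refresh. */
struct sketch_refresh_reply {
    int op_ret;   /* 0 on success, -1 on failure */
    int op_errno; /* errno reported by the child when op_ret is -1 */
    bool is_good; /* child holds a good copy, as judged by the refresh */
};

/*
 * Update the readables from the refresh replies.
 * Returns 0 if the readables were updated, or a negative errno if the
 * refresh failed on all children. In the failure case the old readables
 * are left untouched, so an in-flight txn can still use them.
 */
int sketch_update_readables(struct sketch_inode_ctx *ctx,
                            const struct sketch_refresh_reply *replies,
                            size_t child_count)
{
    assert(child_count > 0 && child_count <= SKETCH_MAX_CHILDREN);

    bool any_success = false;
    int last_errno = EIO;

    for (size_t i = 0; i < child_count; i++) {
        if (replies[i].op_ret == 0)
            any_success = true;
        else
            last_errno = replies[i].op_errno;
    }

    /* Refresh failed everywhere (e.g. ENOENT): do not reset readables. */
    if (!any_success)
        return -last_errno;

    for (size_t i = 0; i < child_count; i++)
        ctx->readable[i] = (replies[i].op_ret == 0) && replies[i].is_good;

    return 0;
}
```

With three children that all return ENOENT, the sketch returns -ENOENT and the previous readable flags survive; as soon as at least one child answers successfully, the flags are recomputed from the replies.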

--- Additional comment from Worker Ant on 2018-05-30 23:17:37 EDT ---

REVIEW: https://review.gluster.org/20029 (afr: don't update readables if inode refresh failed on all children) posted (#2) for review on master by Ravishankar N

--- Additional comment from Worker Ant on 2018-06-18 11:30:06 EDT ---

COMMIT: https://review.gluster.org/20029 committed in master by "Ravishankar N" <ravishankar> with a commit message- afr: don't update readables if inode refresh failed on all children

Problem:
If inode refresh failed on all children of afr due to ENOENT (say file
migrated by dht), it resets the readables to zero. Any inflight txn which
then later comes on the inode fails with EIO because no readable
children present for the inode.

Fix:
Don't update readables when inode refresh fails on *all* children of
afr. In that way any inflight txns will either proceed with its own inode
refresh if needed and fail it with the right errno or use the old value
of readables and continue with the txn.

Also, add quorum checks to the beginning of afr_transaction(). Otherwise, we
seem to be winding the lock and checking for quorum only in the pre-op phase.

Note: This should ideally fix BZ 1329505 since the stop gap fix for
it has been reverted at https://review.gluster.org/#/c/20028.

Change-Id: Ia638c092d8d12dc27afb3cdad133394845061319
updates: bz#1584483
Signed-off-by: Ravishankar N <ravishankar>
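
The quorum change described in the commit message above can be pictured with a rough sketch. Everything here is an assumption made for illustration: the names (sketch_txn, sketch_has_quorum, sketch_transaction_begin), the simple "more than half of the children are up" rule, and the use of ENOTCONN as the error. The real afr quorum logic is configurable and is not reproduced here.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical transaction state: only tracks whether the lock was wound. */
struct sketch_txn {
    bool lock_wound;
};

/* Simplified quorum rule (assumption): strictly more than half are up. */
static bool sketch_has_quorum(const bool *child_up, size_t child_count)
{
    size_t up = 0;

    for (size_t i = 0; i < child_count; i++) {
        if (child_up[i])
            up++;
    }
    return up * 2 > child_count;
}

/*
 * Begin a transaction. The quorum check runs first, so a transaction that
 * cannot meet quorum fails before any lock is wound, instead of being
 * rejected only later in the pre-op phase.
 * Returns 0 on success or a negative errno (ENOTCONN is a placeholder).
 */
int sketch_transaction_begin(struct sketch_txn *txn, const bool *child_up,
                             size_t child_count)
{
    assert(child_count > 0);

    if (!sketch_has_quorum(child_up, child_count))
        return -ENOTCONN;

    txn->lock_wound = true; /* stands in for winding the lock */
    return 0;
}
```

In a three-way replica with only one child up, the sketch fails immediately and never marks the lock as wound; with two children up it proceeds to the locking step.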

Comment 1 Worker Ant 2018-07-02 05:56:59 UTC

REVIEW: https://review.gluster.org/20430 (afr: don't update readables if inode refresh failed on all children) posted (#1) for review on release-4.1 by Ravishankar N

Comment 2 Worker Ant 2018-07-02 17:25:11 UTC

COMMIT: https://review.gluster.org/20430 committed in release-4.1 by "Shyamsundar Ranganathan" <srangana> with a commit message- afr: don't update readables if inode refresh failed on all children

Problem:
If inode refresh failed on all children of afr due to ENOENT (say file
migrated by dht), it resets the readables to zero. Any inflight txn which
then later comes on the inode fails with EIO because no readable
children present for the inode.

Fix:
Don't update readables when inode refresh fails on *all* children of
afr. In that way any inflight txns will either proceed with its own inode
refresh if needed and fail it with the right errno or use the old value
of readables and continue with the txn.

Also, add quorum checks to the beginning of afr_transaction(). Otherwise, we
seem to be winding the lock and checking for quorum only in the pre-op phase.

Note: This should ideally fix BZ 1329505 since the stop gap fix for
it has been reverted at https://review.gluster.org/#/c/20028.

Change-Id: Ia638c092d8d12dc27afb3cdad133394845061319
updates: bz#1597116
Signed-off-by: Ravishankar N <ravishankar>
(cherry picked from commit 0f13eed0c1fa74cefed486538b02e0c8a8708456)

Comment 3 Shyamsundar 2018-07-30 18:57:21 UTC

This bug is being closed because a release that should address the reported issue has been made available. If the problem is still not fixed with glusterfs-4.1.2, please open a new bug report.

glusterfs-4.1.2 has been announced on the Gluster mailing lists [1]; packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution.

[1] https://lists.gluster.org/pipermail/announce/2018-July/000106.html
[2] https://www.gluster.org/pipermail/gluster-users/