Bug 1139996

Summary:	DHT: NFS process crashed on a node in a cluster when another storage node in the cluster went offline
Product:	[Community] GlusterFS
Component:	distribute
Version:	3.4.5
Status:	CLOSED CURRENTRELEASE
Severity:	high
Priority:	unspecified
Reporter:	Raghavendra G <rgowdapp>
Assignee:	bugs <bugs>
CC:	gluster-bugs, kkeithle, nsathyan, rhs-bugs, sgowda, spandura, vagarwal, vbellur
Hardware:	Unspecified
OS:	Unspecified
Type:	Bug
Doc Type:	Bug Fix
Clone Of:	983431
Last Closed:	2015-04-13 07:09:01 UTC
Bug Depends On:	982181, 983431
Bug Blocks:	1125245

Comment 1 Anand Avati 2014-09-10 08:38:14 UTC

REVIEW: http://review.gluster.org/8677 (cluster/dht: Prevent dht_access from going into a loop.) posted (#1) for review on release-3.4 by Raghavendra G (rgowdapp)

Comment 2 Anand Avati 2014-10-09 05:40:23 UTC

REVIEW: http://review.gluster.org/8677 (cluster/dht: Prevent dht_access from going into a loop.) posted (#2) for review on release-3.4 by Raghavendra G (rgowdapp)

Comment 3 Anand Avati 2014-10-09 07:34:46 UTC

REVIEW: http://review.gluster.org/8677 (cluster/dht: Prevent dht_access from going into a loop.) posted (#3) for review on release-3.4 by Raghavendra G (rgowdapp)

Comment 4 Anand Avati 2014-10-20 14:54:52 UTC

COMMIT: http://review.gluster.org/8677 committed in release-3.4 by Kaleb KEITHLEY (kkeithle) 
------
commit 91175b38c9264676d75a275c16add45f7c64f4c1
Author: shishir gowda <sgowda>
Date:   Thu Jul 11 13:44:51 2013 +0530

    cluster/dht: Prevent dht_access from going into a loop.
    
    If access fails with ENOTCONN, do not wind to the same subvol.
    We wind to first-up-subvol if access fails with ENOTCONN.
    In a few cases, if dht has only 1 subvolume and access fails with
    ENOTCONN, we go into an infinite loop of winding to the same subvol.
    
    The fix is to check whether we previously wound to the same subvol,
    and fail if first-up-subvol is the same.
    
    Change-Id: Ib5d3ce7d33e8ea09147905a7df1ed280874fa549
    BUG: 1139996
    Signed-off-by: shishir gowda <sgowda>
    Reviewed-on: http://review.gluster.org/5319
    Tested-by: Gluster Build System <jenkins.com>
    Reviewed-by: Anand Avati <avati>
    Reviewed-on: http://review.gluster.org/8677
    Reviewed-by: N Balachandran <nbalacha>
    Reviewed-by: Kaleb KEITHLEY <kkeithle>
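
For readers without the patch at hand, the guard described in the commit message can be sketched roughly as below. This is a simplified, standalone model and not the GlusterFS code: `struct subvol`, `subvol_access`, `first_up_subvol`, and `access_with_retry` are hypothetical names invented for illustration, and the `marked_up`/`connected` split is an assumption used to model a subvolume that is still considered up while calls to it fail with ENOTCONN.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for a DHT subvolume; not a GlusterFS type. */
struct subvol {
    const char *name;
    int marked_up;  /* what the parent believes; may be stale (assumption) */
    int connected;  /* whether a call to this subvolume would succeed */
};

/* Simulated access call: fails with -ENOTCONN when the subvolume is down. */
static int subvol_access(const struct subvol *sv)
{
    return sv->connected ? 0 : -ENOTCONN;
}

/* Return the first subvolume marked up, or NULL if none is. */
static const struct subvol *first_up_subvol(const struct subvol *svs, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (svs[i].marked_up)
            return &svs[i];
    }
    return NULL;
}

/*
 * Try the cached subvolume first. On ENOTCONN, retry on the first-up
 * subvolume, but only if it differs from the one that just failed.
 * Without that comparison, a single-subvolume setup would keep retrying
 * the same subvolume forever.
 */
static int access_with_retry(const struct subvol *svs, size_t n,
                             const struct subvol *cached)
{
    assert(cached != NULL);

    int ret = subvol_access(cached);
    if (ret != -ENOTCONN)
        return ret;

    const struct subvol *next = first_up_subvol(svs, n);
    if (next == NULL || next == cached)
        return -ENOTCONN;  /* nowhere new to go: fail instead of looping */

    return subvol_access(next);
}
```

In this model, a single subvolume that is marked up but disconnected makes `access_with_retry` return `-ENOTCONN` once rather than retrying indefinitely, which is the behaviour the commit message describes.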