Bug 983431

Summary:	DHT: NFS process crashed on a node in a cluster when another storage node in the cluster went offline
Product:	[Community] GlusterFS	Reporter:	shishir gowda <sgowda>
Component:	distribute	Assignee:	Nagaprasad Sathyanarayana <nsathyan>
Status:	CLOSED CURRENTRELEASE	QA Contact:
Severity:	high	Docs Contact:
Priority:	unspecified
Version:	mainline	CC:	gluster-bugs, nsathyan, rhs-bugs, smohan, spandura, spradhan, vbellur
Target Milestone:	---
Target Release:	---
Hardware:	Unspecified
OS:	Unspecified
Whiteboard:
Fixed In Version:	glusterfs-3.5.0	Doc Type:	Bug Fix
Doc Text:		Story Points:	---
Clone Of:	982181
Clones:	1139996		Environment:
Last Closed:	2014-04-17 11:43:31 UTC	Type:	Bug
Bug Depends On:	982181
Bug Blocks:	1139996

Comment 1 Anand Avati 2013-07-11 08:29:46 UTC

REVIEW: http://review.gluster.org/5319 (cluster/dht: Prevent dht_access from going into a loop.) posted (#1) for review on master by Shishir Gowda (sgowda)

Comment 2 Anand Avati 2013-07-12 06:10:51 UTC

REVIEW: http://review.gluster.org/5319 (cluster/dht: Prevent dht_access from going into a loop.) posted (#2) for review on master by Shishir Gowda (sgowda)

Comment 3 Anand Avati 2013-07-12 09:18:34 UTC

REVIEW: http://review.gluster.org/5319 (cluster/dht: Prevent dht_access from going into a loop.) posted (#3) for review on master by Shishir Gowda (sgowda)

Comment 4 Anand Avati 2013-07-16 06:57:14 UTC

COMMIT: http://review.gluster.org/5319 committed in master by Anand Avati (avati) 
------
commit 3e1d8e1689c47d8b83343a403e7d09c018472155
Author: shishir gowda <sgowda>
Date:   Thu Jul 11 13:44:51 2013 +0530

    cluster/dht: Prevent dht_access from going into a loop.
    
    If access fails with ENOTCONN, do not wind to same subvol.
    We wind to first-up-subvol if access fails with ENOTCONN.
    In a few cases, if dht has only 1 subvolume and access fails with
    ENOTCONN, we go into an infinite loop of winding to the same subvol.
    
    The fix is to check whether we previously wound to the same subvol, and
    fail if first-up-subvol is the same.
    
    Change-Id: Ib5d3ce7d33e8ea09147905a7df1ed280874fa549
    BUG: 983431
    Signed-off-by: shishir gowda <sgowda>
    Reviewed-on: http://review.gluster.org/5319
    Tested-by: Gluster Build System <jenkins.com>
    Reviewed-by: Anand Avati <avati>
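
The check described in the commit message can be sketched roughly as follows. This is a simplified, standalone model and not the actual patch: `struct subvol`, `first_up_subvol`, `subvol_access` and `access_with_retry` are hypothetical names introduced here for illustration, and the real code operates on xlator objects inside the dht_access callback. The sketch assumes that DHT's view of which children are up can be stale, which is how a subvolume can be picked as first-up-subvol and still return ENOTCONN.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for a DHT child; not the GlusterFS xlator type. */
struct subvol {
    int marked_up;  /* DHT's (possibly stale) view of the child's state */
    int reachable;  /* whether a call to the child actually succeeds */
    int calls;      /* number of access attempts made on this child */
};

/* Return the first child that DHT believes is up, or NULL if none. */
static struct subvol *first_up_subvol(struct subvol *subvols, size_t n)
{
    for (size_t i = 0; i < n; i++)
        if (subvols[i].marked_up)
            return &subvols[i];
    return NULL;
}

/* Simulated access call: fails with ENOTCONN when the child is unreachable. */
static int subvol_access(struct subvol *sv)
{
    sv->calls++;
    return sv->reachable ? 0 : -ENOTCONN;
}

/*
 * Try access on the cached child; on ENOTCONN, retry on the first-up child.
 * If the first-up child is the one that just failed, give up instead of
 * winding to it again (without this check a single-child volume loops).
 */
int access_with_retry(struct subvol *subvols, size_t n, struct subvol *cached)
{
    assert(cached != NULL);

    struct subvol *prev = cached;
    int ret = subvol_access(prev);

    while (ret == -ENOTCONN) {
        struct subvol *next = first_up_subvol(subvols, n);
        if (next == NULL || next == prev)
            return -ENOTCONN;  /* same subvol as before: fail, do not loop */
        prev = next;
        ret = subvol_access(prev);
    }
    return ret;
}
```

With a single child that is marked up but unreachable, `access_with_retry` makes one attempt and returns `-ENOTCONN`; with a second, reachable child it fails over to that child once and succeeds.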

Comment 7 Niels de Vos 2014-04-17 11:43:31 UTC

This bug is getting closed because a release has been made available that should address the reported issue. In case the problem is still not fixed with glusterfs-3.5.0, please reopen this bug report.

glusterfs-3.5.0 has been announced on the Gluster Developers mailing list [1]; packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution.

[1] http://thread.gmane.org/gmane.comp.file-systems.gluster.devel/6137
[2] http://thread.gmane.org/gmane.comp.file-systems.gluster.user