Bug 983431

Summary: DHT: NFS process crashed on a node in a cluster when another storage node in the cluster went offline
Product: [Community] GlusterFS
Reporter: shishir gowda <sgowda>
Component: distribute
Assignee: Nagaprasad Sathyanarayana <nsathyan>
Status: CLOSED CURRENTRELEASE
Severity: high
Priority: unspecified
Version: mainline
CC: gluster-bugs, nsathyan, rhs-bugs, smohan, spandura, spradhan, vbellur
Target Milestone: ---
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Fixed In Version: glusterfs-3.5.0
Doc Type: Bug Fix
Clone Of: 982181
Last Closed: 2014-04-17 11:43:31 UTC
Type: Bug
Bug Depends On: 982181
Bug Blocks: 1139996

Comment 1 Anand Avati 2013-07-11 08:29:46 UTC
REVIEW: http://review.gluster.org/5319 (cluster/dht: Prevent dht_access from going into a loop.) posted (#1) for review on master by Shishir Gowda (sgowda)

Comment 2 Anand Avati 2013-07-12 06:10:51 UTC
REVIEW: http://review.gluster.org/5319 (cluster/dht: Prevent dht_access from going into a loop.) posted (#2) for review on master by Shishir Gowda (sgowda)

Comment 3 Anand Avati 2013-07-12 09:18:34 UTC
REVIEW: http://review.gluster.org/5319 (cluster/dht: Prevent dht_access from going into a loop.) posted (#3) for review on master by Shishir Gowda (sgowda)

Comment 4 Anand Avati 2013-07-16 06:57:14 UTC
COMMIT: http://review.gluster.org/5319 committed in master by Anand Avati (avati) 
------
commit 3e1d8e1689c47d8b83343a403e7d09c018472155
Author: shishir gowda <sgowda>
Date:   Thu Jul 11 13:44:51 2013 +0530

    cluster/dht: Prevent dht_access from going into a loop.
    
    If access fails with ENOTCONN, do not wind to same subvol.
    We wind to first-up-subvol if access fails with ENOTCONN.
    In a few cases, if dht has only 1 subvolume, and access fails with
    ENOTCONN, we go into an infinite loop of winding to the same subvol.
    
    The fix is to check if we previously wound to same subvol, and
    fail if first-up-subvol is same.
    
    Change-Id: Ib5d3ce7d33e8ea09147905a7df1ed280874fa549
    BUG: 983431
    Signed-off-by: shishir gowda <sgowda>
    Reviewed-on: http://review.gluster.org/5319
    Tested-by: Gluster Build System <jenkins.com>
    Reviewed-by: Anand Avati <avati>
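
The guard described in the commit message can be sketched as follows. This is a minimal, standalone illustration and not the actual GlusterFS patch: `subvol_t`, `first_up_subvol`, `subvol_access` and `access_with_fallback` are hypothetical names, and the split between `marked_up` (what dht believes) and `connected` (what the call actually sees) is an assumption used to model an access that fails with ENOTCONN on a subvolume dht still treats as up.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for a DHT subvolume; not the GlusterFS xlator_t. */
typedef struct {
    const char *name;
    int marked_up;   /* dht's view: non-zero if the subvolume is considered up */
    int connected;   /* actual state: non-zero if a call would reach the brick */
} subvol_t;

/* Return the first subvolume that dht considers up, or NULL if there is none. */
static subvol_t *first_up_subvol(subvol_t *subvols, size_t count)
{
    for (size_t i = 0; i < count; i++) {
        if (subvols[i].marked_up)
            return &subvols[i];
    }
    return NULL;
}

/* Simulated access call: 0 on success, ENOTCONN if the subvolume is unreachable. */
static int subvol_access(const subvol_t *sv)
{
    return sv->connected ? 0 : ENOTCONN;
}

/*
 * Try access on `target`. On ENOTCONN, retry once on the first up
 * subvolume, but only if it differs from the one that just failed.
 * Without that comparison, a volume with a single subvolume would keep
 * selecting the same subvolume again and again.
 * Returns 0 on success or an errno value on failure.
 */
static int access_with_fallback(subvol_t *subvols, size_t count, subvol_t *target)
{
    int err = subvol_access(target);
    if (err != ENOTCONN)
        return err;

    subvol_t *next = first_up_subvol(subvols, count);
    if (next == NULL || next == target)
        return ENOTCONN;   /* nothing different to retry on: fail instead of looping */

    return subvol_access(next);
}
```

With a single subvolume that dht still marks as up but that is not reachable, `access_with_fallback` returns ENOTCONN instead of retrying on the same subvolume.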

Comment 7 Niels de Vos 2014-04-17 11:43:31 UTC
This bug is getting closed because a release has been made available that should address the reported issue. In case the problem is still not fixed with glusterfs-3.5.0, please reopen this bug report.

glusterfs-3.5.0 has been announced on the Gluster Developers mailing list [1]; packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution.

[1] http://thread.gmane.org/gmane.comp.file-systems.gluster.devel/6137
[2] http://thread.gmane.org/gmane.comp.file-systems.gluster.user