+++ This bug was initially created as a clone of Bug #1336381 +++

Description of problem:
Reported by Sakshi Bansal sabansal

Parallel rmdir from multiple clients results in the application receiving "Transport end point not connected" messages even though there were no network disconnects.

Steps to Reproduce:
1. Create 1x2 replica, fuse mount it from 2 clients.
2. Run the script from both clients
-------------------------
#!/bin/bash

dir=$(dirname $(readlink -f $0))
echo 'Script in '$dir
while :
do
        mkdir -p foo$1/bar/gee
        mkdir -p foo$1/bar/gne
        mkdir -p foo$1/lna/gme
        rm -rf foo$1
done
-------------------------

--- Additional comment from Vijay Bellur on 2016-05-16 06:19:36 EDT ---

REVIEW: http://review.gluster.org/14358 (cluster/afr: Return correct op_errno in pre-op) posted (#1) for review on master by Ravishankar N (ravishankar)

--- Additional comment from Vijay Bellur on 2016-05-18 05:27:03 EDT ---

REVIEW: http://review.gluster.org/14358 (cluster/afr: Check for required number of entrylks) posted (#2) for review on master by Ravishankar N (ravishankar)

--- Additional comment from Vijay Bellur on 2016-05-20 07:28:18 EDT ---

REVIEW: http://review.gluster.org/14358 (cluster/afr: Check for required number of entrylks) posted (#3) for review on master by Ravishankar N (ravishankar)

--- Additional comment from Vijay Bellur on 2016-05-20 11:19:42 EDT ---

REVIEW: http://review.gluster.org/14358 (cluster/afr: Check for required number of entrylks) posted (#4) for review on master by Ravishankar N (ravishankar)
REVIEW: http://review.gluster.org/14461 (cluster/afr: Check for required number of entrylks) posted (#1) for review on release-3.8 by Ravishankar N (ravishankar)
REVIEW: http://review.gluster.org/14461 (cluster/afr: Check for required number of entrylks) posted (#2) for review on release-3.8 by Ravishankar N (ravishankar)
REVIEW: http://review.gluster.org/14461 (cluster/afr: Check for required number of entrylks) posted (#3) for review on release-3.8 by Ravishankar N (ravishankar)
COMMIT: http://review.gluster.org/14461 committed in release-3.8 by Niels de Vos (ndevos)
------
commit 0c295ad2fddccea39d7fc5b402c2cd197f0825ca
Author: Ravishankar N <ravishankar>
Date:   Wed May 18 14:37:46 2016 +0530

    cluster/afr: Check for required number of entrylks

    Backport of http://review.gluster.org/#/c/14358/

    Problem:
    Parallel rmdir operations on the same directory result in ENOTCONN
    messages even though there was no network disconnect.

    In the blocking entry lock during rmdir, AFR takes 2 sets of locks on
    all its children: one (parentdir, name of dir to be deleted), the
    other (full lock on the dir being deleted). We proceed to the pre-op
    stage even if only a single lock (but not all the needed locks) was
    obtained, only to fail it with ENOTCONN because
    afr_locked_nodes_get() returns zero nodes in afr_changelog_pre_op().

    Fix:
    After we get replies for all blocking lock requests, if we don't have
    the minimum number of locks to carry out the FOP, unlock and fail the
    FOP. The op_errno will be that of the last failed reply we got, i.e.
    whatever is set in afr_lock_cbk().

    Change-Id: Ibef25e65b468ebb5ea6ae1f5121a5f1201072293
    BUG: 1338051
    Signed-off-by: Ravishankar N <ravishankar>
    Reviewed-on: http://review.gluster.org/14461
    Smoke: Gluster Build System <jenkins.com>
    NetBSD-regression: NetBSD Build System <jenkins.org>
    Reviewed-by: Pranith Kumar Karampuri <pkarampu>
    CentOS-regression: Gluster Build System <jenkins.com>
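
For readers who want the decision described in the commit message spelled out, here is a rough sketch in Python. It is not the AFR code, which is C and is not reproduced in this report. The names LockReply and decide_after_blocking_locks are hypothetical, and so is the min_children threshold: the commit only says "the minimum number of locks", so the value 1 is an assumption. The ENOTCONN default for the case where no reply failed is also an assumption made for illustration.

```python
import errno
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class LockReply:
    """Hypothetical model of one child's reply to a blocking entry-lock request."""
    child: int         # index of the replica child (brick)
    lock_set: int      # 0 = (parentdir, name), 1 = full lock on the directory
    op_ret: int        # 0 on success, -1 on failure
    op_errno: int = 0  # errno reported by the child when op_ret is -1


def decide_after_blocking_locks(
    replies: List[LockReply],
    lock_sets: int = 2,
    min_children: int = 1,  # assumption: the real threshold is not stated in the report
) -> Tuple[str, int]:
    """Decide, once all replies are in, whether the FOP may go to pre-op."""
    held = [0] * lock_sets
    # Assumption: fall back to ENOTCONN if no reply carried an error.
    last_errno = errno.ENOTCONN
    for reply in replies:
        if reply.op_ret == 0:
            held[reply.lock_set] += 1
        else:
            # Keep the errno of the most recent failed reply.
            last_errno = reply.op_errno
    if all(count >= min_children for count in held):
        return ("proceed", 0)
    # Some lock set is missing: release what was taken and fail the FOP here,
    # instead of continuing to pre-op and failing there with ENOTCONN.
    return ("unlock_and_fail", last_errno)
```

With two children, a client that wins the (parentdir, name) lock on one child but loses the full directory lock on both would get "unlock_and_fail" with the errno of the last failed reply, rather than reaching pre-op with only part of the locks it needs.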
This bug is getting closed because a release has been made available that should address the reported issue. In case the problem is still not fixed with glusterfs-3.8.0, please open a new bug report.

glusterfs-3.8.0 has been announced on the Gluster mailing lists [1]; packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution.

[1] http://blog.gluster.org/2016/06/glusterfs-3-8-released/
[2] http://thread.gmane.org/gmane.comp.file-systems.gluster.user