+++ This bug was initially created as a clone of Bug #1136159 +++

Description of problem:
Executing renames/readdirp/cat in a loop can lead to opens failing with ENOENT.

Version-Release number of selected component (if applicable):

How reproducible:
Very

Steps to Reproduce:
1. Created a plain replicate volume, disabled all performance xlators.
gluster volume set $1 performance.quick-read off
gluster volume set $1 performance.io-cache off
gluster volume set $1 performance.write-behind off
gluster volume set $1 performance.stat-prefetch off
gluster volume set $1 performance.read-ahead off
2. Mounted the volume on 2 mounts using -o direct-io-mode=yes
3. On one mount execute ls -lR
4. On the other mount execute:
echo abc > abc-ln
while true; do
    ln abc-ln abc; mv abc-ln abc
    echo 3 > /proc/sys/vm/drop_caches; cat abc
    ln abc abc-ln; mv abc abc-ln
    echo 3 > /proc/sys/vm/drop_caches; cat abc-ln
done

Actual results:
Brick logs print 'Not able to open file, No such file or directory' quite a few times even though the file is always present.

Expected results:
Opens of files should not fail.

Additional info:

--- Additional comment from Anand Avati on 2014-09-02 00:29:40 EDT ---

REVIEW: http://review.gluster.org/8575 (storage/posix: Prefer gfid links for inode-handle) posted (#1) for review on master by Pranith Kumar Karampuri (pkarampu)

--- Additional comment from Anand Avati on 2014-09-02 04:14:19 EDT ---

REVIEW: http://review.gluster.org/8575 (storage/posix: Prefer gfid links for inode-handle) posted (#2) for review on master by Pranith Kumar Karampuri (pkarampu)

--- Additional comment from Anand Avati on 2014-09-02 08:10:23 EDT ---

COMMIT: http://review.gluster.org/8575 committed in master by Vijay Bellur (vbellur)
------
commit 2c0a694b8d910c530899077c1d242ad1ea250965
Author: Pranith Kumar K <pkarampu>
Date:   Tue Sep 2 09:40:44 2014 +0530

    storage/posix: Prefer gfid links for inode-handle

    Problem:
    File path could change by other entry operations in-flight, so if
    renames are in progress at the time of other operations like open, it
    may lead to failures. We observed that this issue can also happen while
    renames and readdirps/lookups are in progress because the dentry-table
    sometimes goes stale.

    Fix:
    Prefer gfid-handles over paths for files. For directory handles,
    preferring gfid-handles hits performance issues because it needs to
    resolve paths traversing up the symlinks.
    Tests which check whether files are opened should check on the gfid
    path after this change, so a couple of tests were changed to reflect
    that.

    Note:
    This patch doesn't fix the issue for directories. I think a complete
    fix is to come up with an entry operation serialization xlator. Until
    then let's live with this.

    Change-Id: I10bda1083036d013f3a12588db7a71039d9da6c3
    BUG: 1136159
    Signed-off-by: Pranith Kumar K <pkarampu>
    Reviewed-on: http://review.gluster.org/8575
    Tested-by: Gluster Build System <jenkins.com>
    Reviewed-by: Vijay Bellur <vbellur>
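Note on why a gfid handle avoids the ENOENT: the following is only a minimal Python sketch of the idea, not the posix xlator code (which is C). It assumes the usual brick layout, where each file has an extra hard link at .glusterfs/<first two hex chars>/<next two hex chars>/<gfid>. The helper names gfid_handle_path and demo are hypothetical, and a random UUID stands in for a real gfid. Because the handle is a second name for the same inode, renaming the named path does not affect opens through the handle.

```python
import os
import tempfile
import uuid


def gfid_handle_path(brick_root, gfid):
    """Build the hard-link path for a gfid under the brick's .glusterfs dir.

    Assumed layout: <brick>/.glusterfs/<hex 0-1>/<hex 2-3>/<full gfid>.
    """
    g = str(gfid)
    return os.path.join(brick_root, ".glusterfs", g[0:2], g[2:4], g)


def demo():
    """Show that a gfid-style hard link survives a rename of the named path."""
    with tempfile.TemporaryDirectory() as brick:
        gfid = uuid.uuid4()  # stand-in for the gfid assigned to the file
        named = os.path.join(brick, "abc-ln")
        with open(named, "w") as f:
            f.write("abc\n")

        # Create the handle: a second name (hard link) for the same inode.
        handle = gfid_handle_path(brick, gfid)
        os.makedirs(os.path.dirname(handle))
        os.link(named, handle)

        # A rename that races with an open makes the old named path vanish.
        os.rename(named, os.path.join(brick, "abc"))
        named_path_gone = not os.path.exists(named)

        # The handle still resolves, so an open through it succeeds.
        with open(handle) as f:
            data = f.read()
        return named_path_gone, data
```

Calling demo() returns whether the old named path disappeared, together with the data read through the handle after the rename.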
REVIEW: http://review.gluster.org/8594 (storage/posix: Prefer gfid links for inode-handle) posted (#1) for review on release-3.6 by Pranith Kumar Karampuri (pkarampu)
COMMIT: http://review.gluster.org/8594 committed in release-3.6 by Vijay Bellur (vbellur)
------
commit 444ffda19e2052b5fc78f7dc020de161ebee8563
Author: Pranith Kumar K <pkarampu>
Date:   Tue Sep 2 09:40:44 2014 +0530

    storage/posix: Prefer gfid links for inode-handle

    Backport of http://review.gluster.org/8575

    Problem:
    File path could change by other entry operations in-flight, so if
    renames are in progress at the time of other operations like open, it
    may lead to failures. We observed that this issue can also happen while
    renames and readdirps/lookups are in progress because the dentry-table
    sometimes goes stale.

    Fix:
    Prefer gfid-handles over paths for files. For directory handles,
    preferring gfid-handles hits performance issues because it needs to
    resolve paths traversing up the symlinks.
    Tests which check whether files are opened should check on the gfid
    path after this change, so a couple of tests were changed to reflect
    that.

    Note:
    This patch doesn't fix the issue for directories. I think a complete
    fix is to come up with an entry operation serialization xlator. Until
    then let's live with this.

    BUG: 1136821
    Change-Id: If93e46d542a4e96a81a0639b5210330f7dbe8be0
    Signed-off-by: Pranith Kumar K <pkarampu>
    Reviewed-on: http://review.gluster.org/8594
    Reviewed-by: Vijay Bellur <vbellur>
    Tested-by: Gluster Build System <jenkins.com>
A beta release for GlusterFS 3.6.0 has been released [1]. Please verify whether the release solves this bug report for you. If the glusterfs-3.6.0beta1 release does not resolve this issue, leave a comment in this bug and move the status to ASSIGNED. If this release fixes the problem for you, leave a note and change the status to VERIFIED.

Packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update (possibly an "updates-testing" repository) infrastructure for your distribution.

[1] http://supercolony.gluster.org/pipermail/gluster-users/2014-September/018836.html
[2] http://supercolony.gluster.org/pipermail/gluster-users/
REVIEW: http://review.gluster.org/9030 (cluster/afr: Perform post-op in entry selfheal inside locks) posted (#1) for review on release-3.6 by Krutika Dhananjay (kdhananj)
COMMIT: http://review.gluster.org/9030 committed in release-3.6 by Vijay Bellur (vbellur)
------
commit e80bfe76b89ae3f40b3258a4ac388f18a0b53034
Author: Krutika Dhananjay <kdhananj>
Date:   Fri Oct 31 12:51:15 2014 +0530

    cluster/afr: Perform post-op in entry selfheal inside locks

    Backport of: http://review.gluster.org/#/c/9020

    Take entrylks in xlator domain before doing post-op (undo-pending) in
    entry self-heal. This is to prevent a parallel name self-heal on
    an entry under @fd->inode from reading pending xattrs while it is
    being modified by SHD after entry sh below, given that
    name self-heal takes locks ONLY in xlator domain and is free to read
    pending changelog in the absence of the following locking.

    Change-Id: I0bc92978efc0741d6e3f2439540d008e31472313
    BUG: 1136821
    Signed-off-by: Krutika Dhananjay <kdhananj>
    Reviewed-on: http://review.gluster.org/9030
    Reviewed-by: Pranith Kumar Karampuri <pkarampu>
    Tested-by: Pranith Kumar Karampuri <pkarampu>
    Reviewed-by: Vijay Bellur <vbellur>
    Tested-by: Vijay Bellur <vbellur>
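Note on the locking in the commit above: the following is a rough Python analogy, not AFR code. It assumes a single in-process lock can stand in for an entrylk in the xlator domain; the names entry_lock, pending, post_op_undo_pending and name_heal_read are all hypothetical. The point it illustrates is that when the post-op and the name self-heal take the same lock, the reader sees the pending counters either fully set or fully cleared, never half-updated.

```python
import threading
import time

# Hypothetical stand-in for an entrylk taken in the xlator domain.
entry_lock = threading.Lock()

# Hypothetical pending changelog counters, one per brick.
pending = {"brick-0": 1, "brick-1": 1}


def post_op_undo_pending():
    """Entry self-heal post-op: clear the pending counters under the lock."""
    with entry_lock:
        for brick in pending:
            pending[brick] = 0
            time.sleep(0.01)  # widen the window a racing reader could hit


def name_heal_read():
    """Name self-heal: read the pending counters under the same lock."""
    with entry_lock:
        return dict(pending)


def run():
    """Run the post-op and a reader concurrently; return what the reader saw."""
    healer = threading.Thread(target=post_op_undo_pending)
    healer.start()
    snapshot = name_heal_read()
    healer.join()
    return snapshot
```

Without the lock in post_op_undo_pending, the reader could observe one counter cleared and the other still set, which is the kind of inconsistent read the commit message describes.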
This bug is getting closed because a release has been made available that should address the reported issue. In case the problem is still not fixed with glusterfs-3.6.1, please reopen this bug report.

glusterfs-3.6.1 has been announced [1]; packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution.

[1] http://supercolony.gluster.org/pipermail/gluster-users/2014-November/019410.html
[2] http://supercolony.gluster.org/mailman/listinfo/gluster-users