+++ This bug was initially created as a clone of Bug #1379720 +++

Description of problem:

Rotated logs are not accessible from a replicated gluster node via NFS mounts. We have a gluster distributed-replicate volume spanning two sites. After a shared log is rotated by a server on site A, many (not all!) nodes accessing the replicated log from a node at site B get a "cannot access .log: Stale file handle" error.

RCA from Soumya and Pranith:

The theory we have come up with is as follows. The reason the NFS server is not doing a fresh lookup after it received ESTALE could be that cs->lookuptype was set to GF_NFS3_FRESH instead of GF_NFS3_REVALIDATE.

The nfs_lookup() fop starts by setting lookuptype to GF_NFS3_REVALIDATE. But as per the code changes done in the patch mentioned in comment#52, before doing STACK_WIND on the child xlator lookup, if we find a cached inode for that file/entry name in the inode table but with inode_ctx not set, we reset lookuptype to GF_NFS3_FRESH. This may have led to the nfs xlator not sending a fresh lookup on receiving ESTALE.

There could be various reasons for inode_ctx not being set. Underlying xlators could have done inode_link, which we are ruling out for now as we do not see inode_link done with a file entry name (except for tiered volumes). The other possibility is that we do not set inode_ctx in the readdirp_cbk path. Maybe:

* client1 has done readdirp on the directory 'jmsdomain6' and got the file handle of the file 'gcjrockit_jms06admin00a.log' as part of the readdirp response. That means we have an inode entry for this file but with no inode_ctx set.
* Meanwhile client2 has deleted and re-created this log file (probably as part of logrotate).
* Now client1 does a lookup on the earlier file handle it received, resulting in ESTALE.
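The theory above can be sketched as a small toy model. This is not GlusterFS code (the real code is C): the classes Brick, CachedInode and NfsServer, the convert_to_fresh switch and the file name app.log are all hypothetical and only mirror the behaviour the RCA describes, namely that ESTALE is retried with a by-name lookup only while lookuptype is still GF_NFS3_REVALIDATE.

```python
import itertools
from dataclasses import dataclass

REVALIDATE = "GF_NFS3_REVALIDATE"
FRESH = "GF_NFS3_FRESH"


class Stale(Exception):
    """Stands in for ESTALE in this model."""


class Brick:
    """Hypothetical backend: maps entry name -> gfid."""

    def __init__(self):
        self._ids = itertools.count(1)
        self.names = {}

    def create(self, name):
        self.names[name] = next(self._ids)
        return self.names[name]

    def unlink(self, name):
        del self.names[name]

    def lookup_by_gfid(self, gfid):
        # A gfid that no longer backs any entry is stale.
        if gfid not in self.names.values():
            raise Stale(gfid)
        return gfid

    def lookup_by_name(self, name):
        return self.names[name]


@dataclass
class CachedInode:
    gfid: int
    ctx_set: bool


class NfsServer:
    """Hypothetical stand-in for the nfs xlator's lookup path."""

    def __init__(self, brick, convert_to_fresh):
        self.brick = brick
        # Assumption: True models the behaviour suspected in the RCA.
        self.convert_to_fresh = convert_to_fresh
        self.itable = {}

    def readdirp(self):
        # Entries get linked into the inode table, but no inode_ctx is set.
        for name, gfid in self.brick.names.items():
            self.itable[name] = CachedInode(gfid, ctx_set=False)

    def lookup(self, name):
        lookuptype = REVALIDATE
        cached = self.itable.get(name)
        if cached is None:
            gfid = self.brick.lookup_by_name(name)
            self.itable[name] = CachedInode(gfid, ctx_set=True)
            return gfid
        if not cached.ctx_set and self.convert_to_fresh:
            lookuptype = FRESH
        try:
            return self.brick.lookup_by_gfid(cached.gfid)
        except Stale:
            if lookuptype != REVALIDATE:
                raise  # no retry: the client sees ESTALE
            # Revalidate failed, so retry as a fresh lookup by name.
            gfid = self.brick.lookup_by_name(name)
            self.itable[name] = CachedInode(gfid, ctx_set=True)
            return gfid


def rotate_then_lookup(convert_to_fresh):
    brick = Brick()
    brick.create("app.log")
    server = NfsServer(brick, convert_to_fresh)
    server.readdirp()        # client1 lists the directory
    brick.unlink("app.log")  # client2 rotates the log ...
    brick.create("app.log")  # ... and re-creates it with a new gfid
    try:
        server.lookup("app.log")
        return "ok"
    except Stale:
        return "ESTALE"
```

Under these assumptions, rotate_then_lookup(True) ends in ESTALE, while rotate_then_lookup(False) recovers through the by-name retry. The model says nothing about how inode_ctx ends up unset in a real deployment.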
--- Additional comment from Niels de Vos on 2016-09-29 09:06:45 EDT ---

Patch posted: http://review.gluster.org/15580

--- Additional comment from Worker Ant on 2016-10-12 03:57:17 EDT ---

REVIEW: http://review.gluster.org/15580 (nfs: revalidate lookup converted to fresh lookup) posted (#2) for review on master by mohammed rafi kc (rkavunga)

--- Additional comment from Worker Ant on 2016-10-12 06:54:36 EDT ---

REVIEW: http://review.gluster.org/15580 (nfs: revalidate lookup converted to fresh lookup) posted (#3) for review on master by mohammed rafi kc (rkavunga)

--- Additional comment from Worker Ant on 2016-10-12 06:56:47 EDT ---

REVIEW: http://review.gluster.org/15580 (nfs: revalidate lookup converted to fresh lookup) posted (#4) for review on master by mohammed rafi kc (rkavunga)

--- Additional comment from Worker Ant on 2016-10-13 08:02:54 EDT ---

REVIEW: http://review.gluster.org/15580 (nfs: revalidate lookup converted to fresh lookup) posted (#5) for review on master by mohammed rafi kc (rkavunga)

--- Additional comment from Worker Ant on 2016-11-10 17:19:13 EST ---

COMMIT: http://review.gluster.org/15580 committed in master by Kaleb KEITHLEY (kkeithle)
------
commit ba7a737b1260bbafe22097bea08814035c8b655d
Author: Mohammed Rafi KC <rkavunga>
Date:   Tue Sep 27 19:01:48 2016 +0530

    nfs: revalidate lookup converted to fresh lookup

    when an inode ctx is missing for a linked inode
    the revalidate lookups are converted to fresh.
    This could result in sending ESTALE when the gfid
    are recreated

    We are not able to reproduce the issue with normal
    setup, most part of RCA was done with code reading.

    Possible scenario in which this bug can reproduce,

    Delete a file and recreate a new file with same name,
    at the same time from another client process try to
    list/or access the file. In this case the second client
    may throw an ESTALE error for such files

    Thanks to Soumya and Pranith for doing the complete RCA

    Change-Id: I73992a65844b09a169cefaaedc0dcfb129d66ea1
    BUG: 1379720
    Signed-off-by: Mohammed Rafi KC <rkavunga>
    Reviewed-on: http://review.gluster.org/15580
    NetBSD-regression: NetBSD Build System <jenkins.org>
    CentOS-regression: Gluster Build System <jenkins.org>
    Smoke: Gluster Build System <jenkins.org>
    Reviewed-by: soumya k <skoduri>
    Reviewed-by: Kaleb KEITHLEY <kkeithle>
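The scenario from the commit message (delete and re-create a file while another client lists or accesses it) could be exercised with a sketch like the one below. It is not a confirmed reproducer: the commit message itself says the issue was not reproduced on a normal setup. The file name shared.log, the helper names and the round count are made up. On a real setup rotate() and probe() would run on two different NFS clients of the same volume; here they run as two threads against one directory only to keep the example self-contained.

```python
import errno
import os
import threading


def rotate(path, rounds):
    """Delete and re-create the file, roughly what logrotate would do."""
    for i in range(rounds):
        try:
            os.unlink(path)
        except FileNotFoundError:
            pass
        with open(path, "w") as fh:
            fh.write(f"round {i}\n")


def probe(path, rounds):
    """List the directory and stat the file, counting the outcomes."""
    counts = {"ok": 0, "missing": 0, "stale": 0}
    for _ in range(rounds):
        try:
            os.listdir(os.path.dirname(path))
            os.stat(path)
            counts["ok"] += 1
        except FileNotFoundError:
            # Hit the window between unlink and re-create.
            counts["missing"] += 1
        except OSError as exc:
            if exc.errno != errno.ESTALE:
                raise
            counts["stale"] += 1
    return counts


def run(directory, rounds=200):
    """Run rotate() in a thread while probing from the calling thread."""
    path = os.path.join(directory, "shared.log")
    writer = threading.Thread(target=rotate, args=(path, rounds))
    writer.start()
    counts = probe(path, rounds)
    writer.join()
    return counts
```

Calling run() with a directory on the NFS mount returns the three counters. A non-zero "stale" count would correspond to the "Stale file handle" errors described in this bug, while "missing" only reflects the normal gap between delete and re-create.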
REVIEW: http://review.gluster.org/15839 (nfs: revalidate lookup converted to fresh lookup) posted (#1) for review on release-3.9 by mohammed rafi kc (rkavunga)
COMMIT: https://review.gluster.org/15839 committed in release-3.9 by Niels de Vos (ndevos)
------
commit cce1e4c2b96bac9c496565546045a6cebec52afe
Author: Mohammed Rafi KC <rkavunga>
Date:   Tue Sep 27 19:01:48 2016 +0530

    nfs: revalidate lookup converted to fresh lookup

    Backport of http://review.gluster.org/15580

    when an inode ctx is missing for a linked inode
    the revalidate lookups are converted to fresh.
    This could result in sending ESTALE when the gfid
    are recreated

    We are not able to reproduce the issue with normal
    setup, most part of RCA was done with code reading.

    Possible scenario in which this bug can reproduce,

    Delete a file and recreate a new file with same name,
    at the same time from another client process try to
    list/or access the file. In this case the second client
    may throw an ESTALE error for such files

    Thanks to Soumya and Pranith for doing the complete RCA

    >Change-Id: I73992a65844b09a169cefaaedc0dcfb129d66ea1
    >BUG: 1379720
    >Signed-off-by: Mohammed Rafi KC <rkavunga>
    >Reviewed-on: http://review.gluster.org/15580
    >NetBSD-regression: NetBSD Build System <jenkins.org>
    >CentOS-regression: Gluster Build System <jenkins.org>
    >Smoke: Gluster Build System <jenkins.org>
    >Reviewed-by: soumya k <skoduri>
    >Reviewed-by: Kaleb KEITHLEY <kkeithle>

    Signed-off-by: Mohammed Rafi KC <rkavunga>
    Change-Id: I44c3770fb07e84183f8bc6eceb533efbc67fb67f
    BUG: 1394634
    Signed-off-by: Mohammed Rafi KC <rkavunga>
    Reviewed-on: https://review.gluster.org/15839
    Smoke: Gluster Build System <jenkins.org>
    CentOS-regression: Gluster Build System <jenkins.org>
    Reviewed-by: Niels de Vos <ndevos>
    NetBSD-regression: NetBSD Build System <jenkins.org>
This bug is getting closed because GlusterFS-3.9 has reached its end-of-life [1].

Note: This bug is being closed using a script. No verification has been performed to check if it still exists on newer releases of GlusterFS. If this bug still exists in newer GlusterFS releases, please open a new bug against the newer release.

[1]: https://www.gluster.org/community/release-schedule/