+++ This bug was initially created as a clone of Bug #1226717 +++

Description of problem:
The auth-cache feature contains a function called auth_cache_purge(). This function replaces auth_cache->cache_dict with a new dictionary that should contain fresh cache entries. The replacement is triggered by _mnt3_auth_param_refresh_thread().

There is no locking of the actual auth_cache_entry structures, so auth_cache_purge() can free these entries while other threads are still using them.

The problem is very rarely noticed, because the auth_cache_entry structures are used only very briefly, so the chance of corruption is very small. Our regression tests seem to have hit this issue only once or twice in the last few months.

Version-Release number of selected component (if applicable):
3.7 and mainline

How reproducible:
Extremely difficult.

Additional info:
http://thread.gmane.org/gmane.comp.file-systems.gluster.devel/11052/focus=11109
REVIEW: http://review.gluster.org/11645 (nfs: add a gf_lock_t for the auth_cache->cache_dict) posted (#1) for review on release-3.7 by Niels de Vos (ndevos)
REVIEW: http://review.gluster.org/11646 (nfs: refcount each auth_cache_entry and related data_t) posted (#1) for review on release-3.7 by Niels de Vos (ndevos)
REVIEW: http://review.gluster.org/11647 (refcount: correct the documentation) posted (#1) for review on release-3.7 by Niels de Vos (ndevos)
COMMIT: http://review.gluster.org/11647 committed in release-3.7 by Krishnan Parthasarathi (kparthas)
------
commit dd66dd9d6c249282711d56678bdfe22c2a8d0975
Author: Niels de Vos <ndevos>
Date:   Mon Jul 13 12:16:33 2015 +0200

    refcount: correct the documentation

    The only check that _gf_ref_get() needs is "== 0" for detecting a
    failure. The actual return value is not guaranteed to be the number of
    active references (they can change in other threads anyway).

    Cherry picked from commit c7f309116d8fa62f6b9fd6ff2902e8ce4bfa192d:
    > BUG: 1163543
    > Change-Id: I8801601eab37046f5a5ee0bce5a62606115ca151
    > Signed-off-by: Niels de Vos <ndevos>
    > Reviewed-on: http://review.gluster.org/11328
    > Tested-by: NetBSD Build System <jenkins.org>
    > Tested-by: Gluster Build System <jenkins.com>
    > Reviewed-by: Kaleb KEITHLEY <kkeithle>

    Change-Id: I8801601eab37046f5a5ee0bce5a62606115ca151
    BUG: 1242515
    Signed-off-by: Niels de Vos <ndevos>
    Reviewed-on: http://review.gluster.org/11647
    Tested-by: NetBSD Build System <jenkins.org>
    Tested-by: Gluster Build System <jenkins.com>
    Reviewed-by: Krishnan Parthasarathi <kparthas>
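The "== 0" rule from the commit message above can be illustrated with a small sketch. This is not the GlusterFS refcount code: the `sketch_ref` type and the `sketch_ref_*` functions are hypothetical names, and the use of C11 atomics is an assumption made only to keep the example self-contained. It shows why a caller should only compare the result of a "get" against 0 and never treat it as an exact count.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical stand-in for a refcounted object, not the GlusterFS type. */
struct sketch_ref {
    atomic_uint cnt;
};

/* A new object starts with one reference, owned by its creator. */
static void sketch_ref_init(struct sketch_ref *r)
{
    atomic_init(&r->cnt, 1);
}

/*
 * Take a reference. Returns 0 if the object was already released
 * (count reached 0), non-zero otherwise. The non-zero value is only a
 * snapshot: other threads may change the count right after we return.
 */
static unsigned int sketch_ref_get(struct sketch_ref *r)
{
    unsigned int cur = atomic_load(&r->cnt);

    while (cur != 0) {
        /* On failure 'cur' is reloaded with the current count. */
        if (atomic_compare_exchange_weak(&r->cnt, &cur, cur + 1))
            return cur + 1;
    }
    return 0;
}

/* Drop a reference. Returns 1 if this call released the last one. */
static int sketch_ref_put(struct sketch_ref *r)
{
    return atomic_fetch_sub(&r->cnt, 1) == 1;
}
```

A caller would write `if (sketch_ref_get(&obj) == 0) { /* object is gone, do not use it */ }` and make no other assumption about the returned number.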
COMMIT: http://review.gluster.org/11645 committed in release-3.7 by Niels de Vos (ndevos)
------
commit 3d6dacd69ca439e338ad59bfab53ce6c72b028d0
Author: Niels de Vos <ndevos>
Date:   Mon Jul 13 12:14:53 2015 +0200

    nfs: add a gf_lock_t for the auth_cache->cache_dict

    This is the 1st step towards implementing reference counters for the
    auth_cache_entry structure. Access to the structures should always be
    done atomically, but this can not be guaranteed by a dict.

    Cherry picked from commit 67f7562b5cc9e42774d1dc569471f86f61eef040:
    > Change-Id: Ic165221d72f11832177976c989823d861cf12f01
    > BUG: 1226717
    > Signed-off-by: Niels de Vos <ndevos>
    > Reviewed-on: http://review.gluster.org/11021
    > Tested-by: NetBSD Build System <jenkins.org>
    > Tested-by: Gluster Build System <jenkins.com>
    > Reviewed-by: jiffin tony Thottan <jthottan>

    Change-Id: Ic165221d72f11832177976c989823d861cf12f01
    BUG: 1242515
    Signed-off-by: Niels de Vos <ndevos>
    Reviewed-on: http://review.gluster.org/11645
    Tested-by: NetBSD Build System <jenkins.org>
    Tested-by: Gluster Build System <jenkins.com>
    Reviewed-by: jiffin tony Thottan <jthottan>
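The idea of guarding the container with its own lock can be sketched as follows. This is a minimal illustration under stated assumptions, not the code from the patch: `sketch_cache`, its fixed-size `int` table and all `sketch_cache_*` functions are hypothetical, and a `pthread_mutex_t` stands in for the `gf_lock_t`. Readers copy the value out while holding the lock, so a purge that swaps the table can never free memory a reader is still looking at.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

#define SKETCH_SLOTS 8

/* Hypothetical stand-in for auth_cache; the real one wraps a dict. */
struct sketch_cache {
    pthread_mutex_t lock;   /* plays the role of the gf_lock_t */
    int            *table;  /* plays the role of cache_dict */
};

static int sketch_cache_init(struct sketch_cache *c)
{
    c->table = calloc(SKETCH_SLOTS, sizeof(*c->table));
    if (c->table == NULL)
        return -1;
    if (pthread_mutex_init(&c->lock, NULL) != 0) {
        free(c->table);
        c->table = NULL;
        return -1;
    }
    return 0;
}

static void sketch_cache_set(struct sketch_cache *c, unsigned int slot, int value)
{
    pthread_mutex_lock(&c->lock);
    c->table[slot % SKETCH_SLOTS] = value;
    pthread_mutex_unlock(&c->lock);
}

/* The value is copied while the lock is held; no pointer escapes. */
static int sketch_cache_get(struct sketch_cache *c, unsigned int slot)
{
    int value;

    pthread_mutex_lock(&c->lock);
    value = c->table[slot % SKETCH_SLOTS];
    pthread_mutex_unlock(&c->lock);
    return value;
}

/* Swap in a fresh table under the lock, then free the old one. */
static int sketch_cache_purge(struct sketch_cache *c)
{
    int *fresh = calloc(SKETCH_SLOTS, sizeof(*fresh));
    int *old;

    if (fresh == NULL)
        return -1;
    pthread_mutex_lock(&c->lock);
    old = c->table;
    c->table = fresh;
    pthread_mutex_unlock(&c->lock);
    free(old);  /* safe: readers only touch the table under the lock */
    return 0;
}
```

This only works because readers never keep a pointer into the table after unlocking. As soon as callers hold on to an entry, the lock alone is not enough, which matches the commit message calling it a first step towards reference counting.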
COMMIT: http://review.gluster.org/11646 committed in release-3.7 by Niels de Vos (ndevos)
------
commit 85a7ad784e92f4b0bedb44f7e64bf4e9adfae5ce
Author: Niels de Vos <ndevos>
Date:   Mon Jul 13 12:16:04 2015 +0200

    nfs: refcount each auth_cache_entry and related data_t

    This makes sure that each auth_cache_entry structure is only free'd
    when there are no references to it anymore. When it is free'd, the
    associated data_t from the auth_cache->cache_dict gets unref'd too.

    Upon calling auth_cache_purge(), the auth_cache->cache_dict will free
    each auth_cache_entry in a secure way.

    Cherry picked from commit 7b51bd636fc5e5e1ae48a4e7cba48d0d20878d15:
    > Change-Id: If097cc11838e43599040f5414f82b30fc0fd40c6
    > BUG: 1226717
    > Signed-off-by: Niels de Vos <ndevos>
    > Reviewed-on: http://review.gluster.org/11023
    > Reviewed-by: Xavier Hernandez <xhernandez>
    > Tested-by: Gluster Build System <jenkins.com>
    > Tested-by: NetBSD Build System <jenkins.org>

    Change-Id: If097cc11838e43599040f5414f82b30fc0fd40c6
    BUG: 1242515
    Signed-off-by: Niels de Vos <ndevos>
    Reviewed-on: http://review.gluster.org/11646
    Tested-by: NetBSD Build System <jenkins.org>
    Tested-by: Gluster Build System <jenkins.com>
    Reviewed-by: Xavier Hernandez <xhernandez>
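The lifetime rule described in this commit (an entry is free'd only when nobody references it, even if the cache was purged in the meantime) can be sketched like this. The code is an illustration, not the patch itself: `sketch_entry`, the single-slot `sketch_cache`, the `sketch_freed` counter and every function name are hypothetical, and C11 atomics plus a `pthread_mutex_t` are assumptions standing in for the GlusterFS refcount and lock primitives.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdlib.h>

/* Counts released entries; exists only so the sketch can be checked. */
static atomic_int sketch_freed;

/* Hypothetical stand-in for auth_cache_entry. */
struct sketch_entry {
    atomic_uint refs;
    int         value;
};

/* Hypothetical single-slot cache; the real cache_dict holds many keys. */
struct sketch_cache {
    pthread_mutex_t      lock;
    struct sketch_entry *slot;
};

/* Drop one reference; the last holder frees the entry. */
static void sketch_entry_put(struct sketch_entry *e)
{
    if (atomic_fetch_sub(&e->refs, 1) == 1) {
        free(e);
        atomic_fetch_add(&sketch_freed, 1);
    }
}

/* Detach whatever is in the slot and drop the cache's own reference. */
static void sketch_cache_purge(struct sketch_cache *c)
{
    struct sketch_entry *old;

    pthread_mutex_lock(&c->lock);
    old = c->slot;
    c->slot = NULL;
    pthread_mutex_unlock(&c->lock);
    if (old != NULL)
        sketch_entry_put(old);
}

/* Store a new entry; the cache keeps one reference to it. */
static int sketch_cache_store(struct sketch_cache *c, int value)
{
    struct sketch_entry *e = malloc(sizeof(*e));
    struct sketch_entry *old;

    if (e == NULL)
        return -1;
    atomic_init(&e->refs, 1);
    e->value = value;
    pthread_mutex_lock(&c->lock);
    old = c->slot;
    c->slot = e;
    pthread_mutex_unlock(&c->lock);
    if (old != NULL)
        sketch_entry_put(old);
    return 0;
}

/*
 * Look up the entry and take a reference while the lock is held, so a
 * concurrent purge cannot free it first. The caller must call
 * sketch_entry_put() when done.
 */
static struct sketch_entry *sketch_cache_lookup(struct sketch_cache *c)
{
    struct sketch_entry *e;

    pthread_mutex_lock(&c->lock);
    e = c->slot;
    if (e != NULL)
        atomic_fetch_add(&e->refs, 1);
    pthread_mutex_unlock(&c->lock);
    return e;
}
```

In this sketch a thread that obtained an entry from `sketch_cache_lookup()` can keep reading `e->value` after `sketch_cache_purge()` has run; the memory is released only by that thread's final `sketch_entry_put(e)`.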
This bug is getting closed because a release has been made available that should address the reported issue. In case the problem is still not fixed with glusterfs-3.7.3, please open a new bug report.

glusterfs-3.7.3 has been announced on the Gluster mailing lists [1]; packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution.

[1] http://thread.gmane.org/gmane.comp.file-systems.gluster.devel/12078
[2] http://thread.gmane.org/gmane.comp.file-systems.gluster.user