+++ This bug was initially created as a clone of Bug #1295107 +++ Description of problem: This bug is to track the changes required to fix mem leaks related to gfapi applications. --- Additional comment from Vijay Bellur on 2016-01-04 09:22:51 EST --- REVIEW: http://review.gluster.org/13125 (inode: Retire the inodes from the lru list in inode_table_destroy) posted (#2) for review on master by soumya k (skoduri) --- Additional comment from Vijay Bellur on 2016-01-05 00:51:37 EST --- REVIEW: http://review.gluster.org/13096 (gfapi: Fix inode reference/nlookup counts) posted (#3) for review on master by soumya k (skoduri) --- Additional comment from Vijay Bellur on 2016-01-11 05:23:23 EST --- REVIEW: http://review.gluster.org/13096 (gfapi: Fix inode nlookup counts) posted (#4) for review on master by soumya k (skoduri) --- Additional comment from Vijay Bellur on 2016-01-13 01:13:14 EST --- REVIEW: http://review.gluster.org/13232 (upcall: free the xdr* allocations) posted (#1) for review on master by soumya k (skoduri) --- Additional comment from Vijay Bellur on 2016-01-14 06:39:27 EST --- COMMIT: http://review.gluster.org/13232 committed in master by Kaleb KEITHLEY (kkeithle) ------ commit 7d6e5dad15b7ecf2e8abce468aea9fd84b876052 Author: Soumya Koduri <skoduri> Date: Wed Jan 13 11:34:27 2016 +0530 upcall: free the xdr* allocations Free the xdr string allocations after decoding the upcall cache_invalidation request. Change-Id: I0ffc64f587d6c8566cba76cf08148f937a735926 BUG: 1295107 Signed-off-by: Soumya Koduri <skoduri> Reviewed-on: http://review.gluster.org/13232 Reviewed-by: Niels de Vos <ndevos> Tested-by: NetBSD Build System <jenkins.org> Reviewed-by: Kaleb KEITHLEY <kkeithle> --- Additional comment from Vijay Bellur on 2016-02-01 02:33:26 EST --- REVIEW: http://review.gluster.org/13125 (inode: Retire the inodes from the lru list in inode_table_destroy) posted (#3) for review on master by soumya k (skoduri) --- Additional comment from Vijay Bellur on 2016-02-08 06:03:48 EST --- REVIEW: http://review.gluster.org/13125 (inode: Retire the inodes from the lru list in inode_table_destroy) posted (#4) for review on master by soumya k (skoduri) --- Additional comment from Vijay Bellur on 2016-02-08 12:26:39 EST --- REVIEW: http://review.gluster.org/13125 (inode: Retire the inodes from the lru list in inode_table_destroy) posted (#5) for review on master by soumya k (skoduri) --- Additional comment from Vijay Bellur on 2016-02-08 12:26:42 EST --- REVIEW: http://review.gluster.org/13096 (gfapi: Use inode_forget in case of handle objects) posted (#5) for review on master by soumya k (skoduri) --- Additional comment from Vijay Bellur on 2016-02-16 08:25:11 EST --- REVIEW: http://review.gluster.org/13125 (inode: Retire the inodes from the lru list in inode_table_destroy) posted (#6) for review on master by soumya k (skoduri) --- Additional comment from Vijay Bellur on 2016-02-16 08:25:14 EST --- REVIEW: http://review.gluster.org/13096 (gfapi: Use inode_forget in case of handle objects) posted (#6) for review on master by soumya k (skoduri) --- Additional comment from Vijay Bellur on 2016-02-16 08:25:17 EST --- REVIEW: http://review.gluster.org/13456 (rpc: Fix for rpc_transport_t leak) posted (#1) for review on master by soumya k (skoduri) --- Additional comment from Vijay Bellur on 2016-02-20 07:47:36 EST --- REVIEW: http://review.gluster.org/13456 (rpc: Fix for rpc_transport_t leak) posted (#2) for review on master by soumya k (skoduri) --- Additional comment from Vijay Bellur on 2016-02-22 01:59:45 EST --- REVIEW: http://review.gluster.org/13456 (rpc: Fix for rpc_transport_t leak) posted (#3) for review on master by soumya k (skoduri) --- Additional comment from Vijay Bellur on 2016-02-23 21:51:17 EST --- COMMIT: http://review.gluster.org/13096 committed in master by Kaleb KEITHLEY (kkeithle) ------ commit db747eba347e2909f4fb4fc28a67adb8f188586a Author: Soumya Koduri <skoduri> Date: Mon Feb 8 22:33:56 2016 +0530 gfapi: Use inode_forget in case of handle objects Currently with gfapi, even if applications release their reference on paritular inode entries, those entries never get destroyed as they shall always have positive nlookup count unless inode_table exceeds lru limit. Since inodes and their contexts can consume considerable amount of memory, applications may end up consuming lot of memory and may even get OOM killed. To avoid that, have considered below approaches (for handle based access)- a) Do 'inode_lookup' only if the corresponding inode is created for the first time and forget it during close of the handle This shall not work as multiple handle objects can refer to same inode entry and inode_forget from second handle object onwards shall result in assert. b) Do 'inode_lookup' during a handle or fd creation But this approach shall affect the performance of the fops which operate on neither handle nor fd. c) current approach taken- Applications using glfs handle objects hold a reference on corresponding inode entries. Hence it is safe to forget those inodes and make nlookup count to '0' while trying to delete those handle objects. That way the last unref (i.e, from the last handle close) shall result in inode_destroy(). Change-Id: Id2c7ab178894a10c0030c143ba71e7813df8d18c Signed-off-by: Soumya Koduri <skoduri> BUG: 1295107 Reviewed-on: http://review.gluster.org/13096 Reviewed-by: jiffin tony Thottan <jthottan> NetBSD-regression: NetBSD Build System <jenkins.org> Reviewed-by: Niels de Vos <ndevos> CentOS-regression: Gluster Build System <jenkins.com> Smoke: Gluster Build System <jenkins.com> Reviewed-by: Kaleb KEITHLEY <kkeithle> --- Additional comment from Vijay Bellur on 2016-02-23 23:15:29 EST --- COMMIT: http://review.gluster.org/13456 committed in master by Jeff Darcy (jdarcy) ------ commit 8cf29a4207c162be8e2993ae36f49090851cbbfc Author: Soumya Koduri <skoduri> Date: Tue Feb 16 18:50:23 2016 +0530 rpc: Fix for rpc_transport_t leak The transport object needs to get unref'ed when the rpc clnt object is getting destroyed. But currently in rpc_clnt_disable() we set conn->trans to NULL before it gets unref'ed leading to transport object leak. This change is to fix it by setting conn-tran to NULL only when it is being unref'ed. Change-Id: I79ba34e28ae19eb616035f36bbed1c2f47875b94 BUG: 1295107 Signed-off-by: Soumya Koduri <skoduri> Reviewed-on: http://review.gluster.org/13456 Smoke: Gluster Build System <jenkins.com> NetBSD-regression: NetBSD Build System <jenkins.org> CentOS-regression: Gluster Build System <jenkins.com> Reviewed-by: Kaleb KEITHLEY <kkeithle> Reviewed-by: Jeff Darcy <jdarcy> --- Additional comment from Vijay Bellur on 2016-02-23 23:53:04 EST --- REVIEW: http://review.gluster.org/13125 (inode: Retire the inodes from the lru list in inode_table_destroy) posted (#7) for review on master by soumya k (skoduri)
REVIEW: http://review.gluster.org/13506 (gfapi: Use inode_forget in case of handle objects) posted (#1) for review on release-3.7 by soumya k (skoduri)
REVIEW: http://review.gluster.org/13507 (rpc: Fix for rpc_transport_t leak) posted (#1) for review on release-3.7 by soumya k (skoduri)
REVIEW: http://review.gluster.org/13506 (gfapi: Use inode_forget in case of handle objects) posted (#2) for review on release-3.7 by soumya k (skoduri)
REVIEW: http://review.gluster.org/13507 (rpc: Fix for rpc_transport_t leak) posted (#2) for review on release-3.7 by soumya k (skoduri)
REVIEW: http://review.gluster.org/13527 (inode: Retire the inodes from the lru list in inode_table_destroy) posted (#1) for review on release-3.7 by soumya k (skoduri)
COMMIT: http://review.gluster.org/13506 committed in release-3.7 by Kaleb KEITHLEY (kkeithle) ------ commit 6f12a5767f6912f6913cd9dfa4c6a5484a766000 Author: Soumya Koduri <skoduri> Date: Mon Feb 8 22:33:56 2016 +0530 gfapi: Use inode_forget in case of handle objects Currently with gfapi, even if applications release their reference on paritular inode entries, those entries never get destroyed as they shall always have positive nlookup count unless inode_table exceeds lru limit. Since inodes and their contexts can consume considerable amount of memory, applications may end up consuming lot of memory and may even get OOM killed. To avoid that, have considered below approaches (for handle based access)- a) Do 'inode_lookup' only if the corresponding inode is created for the first time and forget it during close of the handle This shall not work as multiple handle objects can refer to same inode entry and inode_forget from second handle object onwards shall result in assert. b) Do 'inode_lookup' during a handle or fd creation But this approach shall affect the performance of the fops which operate on neither handle nor fd. c) current approach taken- Applications using glfs handle objects hold a reference on corresponding inode entries. Hence it is safe to forget those inodes and make nlookup count to '0' while trying to delete those handle objects. That way the last unref (i.e, from the last handle close) shall result in inode_destroy(). This is backport of the below patch - http://review.gluster.org/13096 Change-Id: Id2c7ab178894a10c0030c143ba71e7813df8d18c Signed-off-by: Soumya Koduri <skoduri> BUG: 1311441 Reviewed-on: http://review.gluster.org/13096 Reviewed-by: jiffin tony Thottan <jthottan> Reviewed-by: Niels de Vos <ndevos> Reviewed-by: Kaleb KEITHLEY <kkeithle> Reviewed-on: http://review.gluster.org/13506 Smoke: Gluster Build System <jenkins.com> NetBSD-regression: NetBSD Build System <jenkins.org> CentOS-regression: Gluster Build System <jenkins.com>
REVIEW: http://review.gluster.org/13527 (inode: Retire the inodes from the lru list in inode_table_destroy) posted (#2) for review on release-3.7 by soumya k (skoduri)
REVIEW: http://review.gluster.org/13507 (rpc: Fix for rpc_transport_t leak) posted (#3) for review on release-3.7 by soumya k (skoduri)
COMMIT: http://review.gluster.org/13527 committed in release-3.7 by Kaleb KEITHLEY (kkeithle) ------ commit 29adf166aa5f15202c5fe49369ad4f11df799c5b Author: Soumya Koduri <skoduri> Date: Thu Dec 31 13:53:54 2015 +0530 inode: Retire the inodes from the lru list in inode_table_destroy Inodes from the lru list are not moved to purge list unless they are retired. Also process the lru list first to unset their parent as we need to unset their dentry entries (the ones which may not be unset during '__inode_passivate' as they were hashed) which in turn shall unref their parent inodes which could be in active list. These parent inodes when unref'ed may well again fall into lru list and if we are at the end of traversing the list, we may miss to delete/retire that entry. Hence traverse the lru list till it gets empty. This is backport of the below patch - http://review.gluster.org/13125 Change-Id: Ib7666e235e9b9644144a7c7933afb5e407e506ca BUG: 1311441 Signed-off-by: Soumya Koduri <skoduri> Reviewed-on: http://review.gluster.org/13125 Reviewed-by: Kaleb KEITHLEY <kkeithle> Reviewed-by: Raghavendra G <rgowdapp> Reviewed-on: http://review.gluster.org/13527 Smoke: Gluster Build System <jenkins.com> NetBSD-regression: NetBSD Build System <jenkins.org> CentOS-regression: Gluster Build System <jenkins.com>
REVIEW: http://review.gluster.org/13507 (rpc: Fix for rpc_transport_t leak) posted (#4) for review on release-3.7 by soumya k (skoduri)
REVIEW: http://review.gluster.org/13584 (rpc: Fix regression cause by #13507) posted (#1) for review on release-3.7 by soumya k (skoduri)
REVIEW: http://review.gluster.org/13507 (rpc: Fix for rpc_transport_t leak) posted (#5) for review on release-3.7 by soumya k (skoduri)
REVIEW: http://review.gluster.org/13824 (rpc: Connect back only if rpc is not disabled) posted (#1) for review on release-3.7 by soumya k (skoduri)
COMMIT: http://review.gluster.org/13507 committed in release-3.7 by Jeff Darcy (jdarcy) ------ commit 10d7b9d45a90d345f6ce4ec0e7192e0226602d80 Author: Soumya Koduri <skoduri> Date: Tue Feb 16 18:50:23 2016 +0530 rpc: Fix for rpc_transport_t leak The transport object needs to get unref'ed when the rpc clnt object is getting destroyed. But currently in rpc_clnt_disable() we set conn->trans to NULL before it gets unref'ed leading to transport object leak. This change is to fix it by setting conn-tran to NULL only when it is being unref'ed. This is backport of the below patch - http://review.gluster.org/13456 Change-Id: I79ba34e28ae19eb616035f36bbed1c2f47875b94 BUG: 1311441 Signed-off-by: Soumya Koduri <skoduri> Reviewed-on: http://review.gluster.org/13456 Reviewed-by: Kaleb KEITHLEY <kkeithle> Reviewed-by: Jeff Darcy <jdarcy> Reviewed-on: http://review.gluster.org/13507 Smoke: Gluster Build System <jenkins.com> NetBSD-regression: NetBSD Build System <jenkins.org> CentOS-regression: Gluster Build System <jenkins.com>
REVIEW: http://review.gluster.org/13824 (rpc: Connect back only if rpc is not disabled) posted (#2) for review on release-3.7 by soumya k (skoduri)
COMMIT: http://review.gluster.org/13824 committed in release-3.7 by Kaushal M (kaushal) ------ commit ac6899c7eaae9983f00645109a569e75f1d0a72a Author: Kotresh HR <khiremat> Date: Wed Mar 2 11:40:24 2016 +0530 rpc: Connect back only if rpc is not disabled This is a backport of the below fix - http://review.gluster.org/13592 As discussed over http://thread.gmane.org/gmane.comp.file-systems.gluster.devel/14284, patch #13456 caused a regression where in if there are any pending rpc invocations, we end up accessing freed object. This patch fixes it by allowing reconnect during rpc submit only if rpc is not disabled. This is to fix regression caused by below patch - http://review.gluster.org/#/c/13507/ Change-Id: I4ef4dd52bd42368bb89129f98bc973e46c6a39f4 BUG: 1311441 Signed-off-by: Kotresh HR <khiremat> Reviewed-on: http://review.gluster.org/13592 Reviewed-on: http://review.gluster.org/13824 Tested-by: soumya k <skoduri> Smoke: Gluster Build System <jenkins.com> CentOS-regression: Gluster Build System <jenkins.com> NetBSD-regression: NetBSD Build System <jenkins.org> Reviewed-by: Jeff Darcy <jdarcy>
This bug is getting closed because a release has been made available that should address the reported issue. In case the problem is still not fixed with glusterfs-3.7.10, please open a new bug report. glusterfs-3.7.10 has been announced on the Gluster mailinglists [1], packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailinglist [2] and the update infrastructure for your distribution. [1] https://www.gluster.org/pipermail/gluster-users/2016-April/026164.html [2] http://thread.gmane.org/gmane.comp.file-systems.gluster.user
This bug is getting closed because a release has been made available that should address the reported issue. In case the problem is still not fixed with glusterfs-3.7.9, please open a new bug report. glusterfs-3.7.9 has been announced on the Gluster mailinglists [1], packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailinglist [2] and the update infrastructure for your distribution. [1] https://www.gluster.org/pipermail/gluster-users/2016-March/025922.html [2] http://thread.gmane.org/gmane.comp.file-systems.gluster.user