+++ This bug was initially created as a clone of Bug #1142052 +++

Description of problem:
Rebalance has been running for about 7 days on a 2-node, 2-brick, 52TB distributed volume. Memory usage has grown slowly over time until it consumes all available physical memory. The OOM killer stopped the last rebalance, and after restarting all Gluster processes I am attempting it again. "sync; echo 3 > /proc/sys/vm/drop_caches" does nothing to lower memory consumption.

Version-Release number of selected component (if applicable):
[root@hgluster01 ~]# glusterfs --version
glusterfs 3.5.2 built on Jul 31 2014 18:47:52

How reproducible:
So far, I cannot complete a rebalance.

Steps to Reproduce:
1. Start a rebalance: gluster volume rebalance export_volume start

Actual results:
High memory consumption by the glusterfs rebalance process, which eventually gets OOM'd.

Expected results:
Rebalance does not consume all available memory and completes the rebalance and fix-layout.

Additional info:

[root@hgluster01 ~]# gluster volume status export_volume detail
Status of volume: export_volume
------------------------------------------------------------------------------
Brick            : Brick hgluster01:/gluster_data
Port             : 49152
Online           : Y
Pid              : 2438
File System      : xfs
Device           : /dev/mapper/vg_data-lv_data
Mount Options    : rw,noatime,nodiratime,logbufs=8,logbsize=256k,inode64,nobarrier
Inode Size       : 512
Disk Space Free  : 12.3TB
Total Disk Space : 27.3TB
Inode Count      : 2929685696
Free Inodes      : 2839872616
------------------------------------------------------------------------------
Brick            : Brick hgluster02:/gluster_data
Port             : 49152
Online           : Y
Pid              : 2467
File System      : xfs
Device           : /dev/mapper/vg_data-lv_data
Mount Options    : rw,noatime,nodiratime,logbufs=8,logbsize=256k,inode64,nobarrier
Inode Size       : 512
Disk Space Free  : 12.4TB
Total Disk Space : 27.3TB
Inode Count      : 2929685696
Free Inodes      : 2839847441

[root@hgluster01 ~]# gluster volume info

Volume Name: export_volume
Type: Distribute
Volume ID: c74cc970-31e2-4924-a244-4c70d958dadb
Status: Started
Number of Bricks: 2
Transport-type: tcp
Bricks:
Brick1: hgluster01:/gluster_data
Brick2: hgluster02:/gluster_data
Options Reconfigured:
performance.stat-prefetch: on
performance.write-behind: on
performance.flush-behind: on
features.quota-deem-statfs: on
performance.quick-read: on
performance.client-io-threads: on
performance.read-ahead: on
performance.io-thread-count: 24
features.quota: on
cluster.eager-lock: on
nfs.disable: on
auth.allow: 192.168.10.*,10.0.10.*,10.8.0.*
server.allow-insecure: on
performance.write-behind-window-size: 4MB
network.ping-timeout: 60
features.quota-timeout: 10
performance.io-cache: off

[root@hgluster01 ~]# cat /proc/meminfo
MemTotal: 32844100 kB
MemFree: 2148772 kB
Buffers: 14184 kB
Cached: 35600 kB
SwapCached: 204288 kB
Active: 24682388 kB
Inactive: 3315448 kB
Active(anon): 24660896 kB
Inactive(anon): 3289292 kB
Active(file): 21492 kB
Inactive(file): 26156 kB
Unevictable: 12728 kB
Mlocked: 4552 kB
SwapTotal: 16490488 kB
SwapFree: 15077012 kB
Dirty: 32 kB
Writeback: 0 kB
AnonPages: 27761596 kB
Mapped: 9168 kB
Shmem: 4 kB
Slab: 544552 kB
SReclaimable: 273636 kB
SUnreclaim: 270916 kB
KernelStack: 4800 kB
PageTables: 60592 kB
NFS_Unstable: 0 kB
Bounce: 0 kB
WritebackTmp: 0 kB
CommitLimit: 32912536 kB
Committed_AS: 29529576 kB
VmallocTotal: 34359738367 kB
VmallocUsed: 345236 kB
VmallocChunk: 34340927412 kB
HardwareCorrupted: 0 kB
AnonHugePages: 17307648 kB
HugePages_Total: 0
HugePages_Free: 0
HugePages_Rsvd: 0
HugePages_Surp: 0
Hugepagesize: 2048 kB
DirectMap4k: 4728 kB
DirectMap2M: 2058240 kB
DirectMap1G: 31457280 kB

[root@hgluster01 ~]# pmap -x 2627
2627: /usr/sbin/glusterfs -s localhost --volfile-id export_volume --xlator-option *dht.use-readdirp=yes --xlator-option *dht.lookup-unhashed=yes --xlator-option *dht.assert-no-child-down=yes --xlator-option *replicate*.data-self-heal=off --xlator-option *replicate*.metadata-self-heal=off --xlator-option *replicate*.entry-self-heal=off --xlator-option *replicate*.readdir-failover=off --xlator-option *dht.readdir-optimize=on --xlator-option *dht.rebalance-cmd=1 --xlator-option *dht.node-uuid=875dbae1-82bd-485f-98e
Address Kbytes RSS Dirty Mode Mapping
0000000000400000 64 20 0 r-x-- glusterfsd
0000000000610000 8 8 8 rw--- glusterfsd
0000000001a15000 132 40 40 rw--- [ anon ]
0000000001a36000 28914064 27616220 27611240 rw--- [ anon ]
00007f0290000000 132 16 16 rw--- [ anon ]
00007f0290021000 65404 0 0 ----- [ anon ]
00007f0297a1e000 6024 1184 1184 rw--- [ anon ]
00007f0298000000 132 0 0 rw--- [ anon ]
00007f0298021000 65404 0 0 ----- [ anon ]
00007f029c000000 132 28 28 rw--- [ anon ]
00007f029c021000 65404 0 0 ----- [ anon ]
00007f02a0000000 132 8 8 rw--- [ anon ]
00007f02a0021000 65404 0 0 ----- [ anon ]
00007f02a4092000 4964 16 16 rw--- [ anon ]
00007f02a456b000 4 0 0 ----- [ anon ]
00007f02a456c000 11844 2056 2056 rw--- [ anon ]
00007f02a50fd000 4 0 0 ----- [ anon ]
00007f02a50fe000 1024 8 8 rw--- [ anon ]
00007f02a51fe000 96 12 0 r-x-- io-stats.so
00007f02a5216000 2048 0 0 ----- io-stats.so
00007f02a5416000 8 0 0 rw--- io-stats.so
00007f02a5418000 96 16 0 r-x-- io-threads.so
00007f02a5430000 2044 0 0 ----- io-threads.so
00007f02a562f000 12 4 4 rw--- io-threads.so
00007f02a5632000 52 0 0 r-x-- md-cache.so
00007f02a563f000 2044 0 0 ----- md-cache.so
00007f02a583e000 8 0 0 rw--- md-cache.so
00007f02a5840000 28 0 0 r-x-- open-behind.so
00007f02a5847000 2048 0 0 ----- open-behind.so
00007f02a5a47000 4 0 0 rw--- open-behind.so
00007f02a5a48000 28 0 0 r-x-- quick-read.so
00007f02a5a4f000 2044 0 0 ----- quick-read.so
00007f02a5c4e000 8 0 0 rw--- quick-read.so
00007f02a5c50000 44 0 0 r-x-- read-ahead.so
00007f02a5c5b000 2044 0 0 ----- read-ahead.so
00007f02a5e5a000 8 0 0 rw--- read-ahead.so
00007f02a5e5c000 48 0 0 r-x-- write-behind.so
00007f02a5e68000 2048 0 0 ----- write-behind.so
00007f02a6068000 8 0 0 rw--- write-behind.so
00007f02a606a000 300 136 0 r-x-- dht.so
00007f02a60b5000 2048 0 0 ----- dht.so
00007f02a62b5000 16 8 8 rw--- dht.so
00007f02a62b9000 240 108 0 r-x-- client.so
00007f02a62f5000 2048 0 0 ----- client.so
00007f02a64f5000 16 8 8 rw--- client.so
00007f02a64f9000 4 0 0 ----- [ anon ]
00007f02a64fa000 10240 8 8 rw--- [ anon ]
00007f02a6efa000 48 12 0 r-x-- libnss_files-2.12.so
00007f02a6f06000 2048 0 0 ----- libnss_files-2.12.so
00007f02a7106000 4 0 0 r---- libnss_files-2.12.so
00007f02a7107000 4 0 0 rw--- libnss_files-2.12.so
00007f02a7108000 116 0 0 r-x-- libselinux.so.1
00007f02a7125000 2044 0 0 ----- libselinux.so.1
00007f02a7324000 4 0 0 r---- libselinux.so.1
00007f02a7325000 4 0 0 rw--- libselinux.so.1
00007f02a7326000 4 0 0 rw--- [ anon ]
00007f02a7327000 88 0 0 r-x-- libresolv-2.12.so
00007f02a733d000 2048 0 0 ----- libresolv-2.12.so
00007f02a753d000 4 0 0 r---- libresolv-2.12.so
00007f02a753e000 4 0 0 rw--- libresolv-2.12.so
00007f02a753f000 8 0 0 rw--- [ anon ]
00007f02a7541000 8 0 0 r-x-- libkeyutils.so.1.3
00007f02a7543000 2044 0 0 ----- libkeyutils.so.1.3
00007f02a7742000 4 0 0 r---- libkeyutils.so.1.3
00007f02a7743000 4 0 0 rw--- libkeyutils.so.1.3
00007f02a7744000 40 0 0 r-x-- libkrb5support.so.0.1
00007f02a774e000 2044 0 0 ----- libkrb5support.so.0.1
00007f02a794d000 4 0 0 r---- libkrb5support.so.0.1
00007f02a794e000 4 0 0 rw--- libkrb5support.so.0.1
00007f02a794f000 164 0 0 r-x-- libk5crypto.so.3.1
00007f02a7978000 2048 0 0 ----- libk5crypto.so.3.1
00007f02a7b78000 4 0 0 r---- libk5crypto.so.3.1
00007f02a7b79000 4 0 0 rw--- libk5crypto.so.3.1
00007f02a7b7a000 4 0 0 rw--- [ anon ]
00007f02a7b7b000 12 0 0 r-x-- libcom_err.so.2.1
00007f02a7b7e000 2044 0 0 ----- libcom_err.so.2.1
00007f02a7d7d000 4 0 0 r---- libcom_err.so.2.1
00007f02a7d7e000 4 0 0 rw--- libcom_err.so.2.1
00007f02a7d7f000 876 0 0 r-x-- libkrb5.so.3.3
00007f02a7e5a000 2044 0 0 ----- libkrb5.so.3.3
00007f02a8059000 40 4 4 r---- libkrb5.so.3.3
00007f02a8063000 8 0 0 rw--- libkrb5.so.3.3
00007f02a8065000 260 0 0 r-x-- libgssapi_krb5.so.2.2
00007f02a80a6000 2048 0 0 ----- libgssapi_krb5.so.2.2
00007f02a82a6000 4 0 0 r---- libgssapi_krb5.so.2.2
00007f02a82a7000 8 4 4 rw--- libgssapi_krb5.so.2.2
00007f02a82a9000 388 4 0 r-x-- libssl.so.1.0.1e
00007f02a830a000 2048 0 0 ----- libssl.so.1.0.1e
00007f02a850a000 16 0 0 r---- libssl.so.1.0.1e
00007f02a850e000 28 0 0 rw--- libssl.so.1.0.1e
00007f02a8515000 60 44 0 r-x-- socket.so
00007f02a8524000 2048 0 0 ----- socket.so
00007f02a8724000 16 8 8 rw--- socket.so
00007f02a8728000 4 0 0 ----- [ anon ]
00007f02a8729000 10240 8 8 rw--- [ anon ]
00007f02a9129000 4 0 0 ----- [ anon ]
00007f02a912a000 10240 8 8 rw--- [ anon ]
00007f02a9b2a000 4 0 0 ----- [ anon ]
00007f02a9b2b000 10240 0 0 rw--- [ anon ]
00007f02aa52b000 20052 6296 6296 rw--- [ anon ]
00007f02ab8c0000 84 0 0 r-x-- libz.so.1.2.3
00007f02ab8d5000 2044 0 0 ----- libz.so.1.2.3
00007f02abad4000 4 0 0 r---- libz.so.1.2.3
00007f02abad5000 4 0 0 rw--- libz.so.1.2.3
00007f02abad6000 1576 756 0 r-x-- libc-2.12.so
00007f02abc60000 2048 0 0 ----- libc-2.12.so
00007f02abe60000 16 16 16 r---- libc-2.12.so
00007f02abe64000 4 4 4 rw--- libc-2.12.so
00007f02abe65000 20 16 16 rw--- [ anon ]
00007f02abe6a000 1748 0 0 r-x-- libcrypto.so.1.0.1e
00007f02ac01f000 2048 0 0 ----- libcrypto.so.1.0.1e
00007f02ac21f000 108 0 0 r---- libcrypto.so.1.0.1e
00007f02ac23a000 48 0 0 rw--- libcrypto.so.1.0.1e
00007f02ac246000 16 0 0 rw--- [ anon ]
00007f02ac24a000 92 72 0 r-x-- libpthread-2.12.so
00007f02ac261000 2048 0 0 ----- libpthread-2.12.so
00007f02ac461000 4 4 4 r---- libpthread-2.12.so
00007f02ac462000 4 4 4 rw--- libpthread-2.12.so
00007f02ac463000 16 4 4 rw--- [ anon ]
00007f02ac467000 28 4 0 r-x-- librt-2.12.so
00007f02ac46e000 2044 0 0 ----- librt-2.12.so
00007f02ac66d000 4 4 4 r---- librt-2.12.so
00007f02ac66e000 4 0 0 rw--- librt-2.12.so
00007f02ac66f000 1396 0 0 r-x-- libpython2.6.so.1.0
00007f02ac7cc000 2044 0 0 ----- libpython2.6.so.1.0
00007f02ac9cb000 240 0 0 rw--- libpython2.6.so.1.0
00007f02aca07000 56 0 0 rw--- [ anon ]
00007f02aca15000 524 0 0 r-x-- libm-2.12.so
00007f02aca98000 2044 0 0 ----- libm-2.12.so
00007f02acc97000 4 0 0 r---- libm-2.12.so
00007f02acc98000 4 0 0 rw--- libm-2.12.so
00007f02acc99000 8 0 0 r-x-- libutil-2.12.so
00007f02acc9b000 2044 0 0 ----- libutil-2.12.so
00007f02ace9a000 4 0 0 r---- libutil-2.12.so
00007f02ace9b000 4 0 0 rw--- libutil-2.12.so
00007f02ace9c000 8 0 0 r-x-- libdl-2.12.so
00007f02ace9e000 2048 0 0 ----- libdl-2.12.so
00007f02ad09e000 4 0 0 r---- libdl-2.12.so
00007f02ad09f000 4 0 0 rw--- libdl-2.12.so
00007f02ad0a0000 88 24 0 r-x-- libgfxdr.so.0.0.0
00007f02ad0b6000 2044 0 0 ----- libgfxdr.so.0.0.0
00007f02ad2b5000 4 4 4 rw--- libgfxdr.so.0.0.0
00007f02ad2b6000 96 64 0 r-x-- libgfrpc.so.0.0.0
00007f02ad2ce000 2048 0 0 ----- libgfrpc.so.0.0.0
00007f02ad4ce000 4 4 4 rw--- libgfrpc.so.0.0.0
00007f02ad4cf000 532 176 0 r-x-- libglusterfs.so.0.0.0
00007f02ad554000 2048 0 0 ----- libglusterfs.so.0.0.0
00007f02ad754000 8 8 8 rw--- libglusterfs.so.0.0.0
00007f02ad756000 16 12 12 rw--- [ anon ]
00007f02ad75a000 128 96 0 r-x-- ld-2.12.so
00007f02ad7ac000 1824 24 24 rw--- [ anon ]
00007f02ad977000 4 0 0 rw--- [ anon ]
00007f02ad978000 4 0 0 rw--- [ anon ]
00007f02ad979000 4 4 4 r---- ld-2.12.so
00007f02ad97a000 4 0 0 rw--- ld-2.12.so
00007f02ad97b000 4 0 0 rw--- [ anon ]
00007fff4c597000 124 96 96 rw--- [ stack ]
00007fff4c5ff000 4 4 0 r-x-- [ anon ]
ffffffffff600000 4 0 0 r-x-- [ anon ]
---------------- ------ ------ ------
total kB 29338940 27627692 27621164

--- Additional comment from Raghavendra G on 2014-09-18 02:53:33 EDT ---

Hi Ryan,

If the rebalance is still running, can you please get a statedump of the rebalance process on all the nodes that are part of the volume? The following steps have to be repeated on each of those nodes.

1. Get the pid of the rebalance process:
[root@unused glusterfs]# ps ax | grep -i rebalance | grep glusterfs | cut -d" " -f 1
16537

2. Trigger a statedump of the rebalance process:
[root@unused glusterfs]# kill -SIGUSR1 16537

3. The statedump can be found in /var/run/gluster/:
[root@unused glusterfs]# ls /var/run/gluster/*16537*
/var/run/gluster/glusterdump.16537.dump.1411022946

regards,
Raghavendra.

--- Additional comment from Anand Avati on 2014-09-18 05:40:52 EDT ---

REVIEW: http://review.gluster.org/8763 (cluster/dht: Fix dict_t leaks in rebalance process' execution path) posted (#1) for review on master by Krutika Dhananjay (kdhananj)

--- Additional comment from Krutika Dhananjay on 2014-09-18 06:39:40 EDT ---

Hi Ryan,

Thanks for the bug report. We have identified a few leaks in rebalance and are in the process of fixing them.

--- Additional comment from Ryan Clough on 2014-09-18 13:05:11 EDT ---

Unfortunately, the processes had been OOM'd before I got to the office this morning. The rebalance failed.

--- Additional comment from Anand Avati on 2014-09-19 03:26:41 EDT ---

REVIEW: http://review.gluster.org/8776 (cluster/dht: Fix dict_t leaks in rebalance process' execution path) posted (#1) for review on release-3.5 by Krutika Dhananjay (kdhananj)

--- Additional comment from Anand Avati on 2014-09-19 04:13:24 EDT ---

REVIEW: http://review.gluster.org/8763 (cluster/dht: Fix dict_t leaks in rebalance process' execution path) posted (#2) for review on master by Krutika Dhananjay (kdhananj)

--- Additional comment from Anand Avati on 2014-09-19 10:10:33 EDT ---

COMMIT: http://review.gluster.org/8763 committed in master by Vijay Bellur (vbellur)
------
commit 258e61adb5505124925c71d2a0d0375d086e32d4
Author: Krutika Dhananjay <kdhananj>
Date: Thu Sep 18 14:36:38 2014 +0530

    cluster/dht: Fix dict_t leaks in rebalance process' execution path

    Two dict_t objects are leaked for every file migrated in success
    codepath. It is the caller's responsibility to unref dict that it
    gets from calls to syncop_getxattr(); and rebalance performs two
    syncop_getxattr()s per file without freeing them.

    Also, syncop_getxattr() on GF_XATTR_LINKINFO_KEY doesn't seem to be
    using the response dict. Hence, NULL is now passed as opposed to
    @dict to syncop_getxattr().
    Change-Id: I5a4b5ab834df3633dea994f239bbdbc34cbe9259
    BUG: 1142052
    Signed-off-by: Krutika Dhananjay <kdhananj>
    Reviewed-on: http://review.gluster.org/8763
    Tested-by: Gluster Build System <jenkins.com>
    Reviewed-by: Shyamsundar Ranganathan <srangana>
    Reviewed-by: Vijay Bellur <vbellur>

--- Additional comment from Anand Avati on 2014-09-19 23:44:45 EDT ---

REVIEW: http://review.gluster.org/8784 (cluster/dht: Fix dict_t leaks in rebalance process' execution path) posted (#1) for review on release-3.5 by Krutika Dhananjay (kdhananj)
REVIEW: http://review.gluster.org/8785 (cluster/dht: Fix dict_t leaks in rebalance process' execution path) posted (#1) for review on release-3.6 by Krutika Dhananjay (kdhananj)
COMMIT: http://review.gluster.org/8785 committed in release-3.6 by Vijay Bellur (vbellur)
------
commit 999e9848d099f443a8bedbd1cd4678fe57dff11f
Author: Krutika Dhananjay <kdhananj>
Date: Thu Sep 18 14:36:38 2014 +0530

    cluster/dht: Fix dict_t leaks in rebalance process' execution path

    Backport of: http://review.gluster.org/8763

    Two dict_t objects are leaked for every file migrated in success
    codepath. It is the caller's responsibility to unref dict that it
    gets from calls to syncop_getxattr(); and rebalance performs two
    syncop_getxattr()s per file without freeing them.

    Also, syncop_getxattr() on GF_XATTR_LINKINFO_KEY doesn't seem to be
    using the response dict. Hence, NULL is now passed as opposed to
    @dict to syncop_getxattr().

    Change-Id: I48926389db965e006da151bf0ccb6bcaf3585199
    BUG: 1144640
    Signed-off-by: Krutika Dhananjay <kdhananj>
    Reviewed-on: http://review.gluster.org/8785
    Tested-by: Gluster Build System <jenkins.com>
    Reviewed-by: Vijay Bellur <vbellur>
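For readers not familiar with the GlusterFS dict_t API, the ownership rule behind this fix can be illustrated with a short sketch. This is a hypothetical, simplified example rather than the actual rebalance code or the patch itself: the function migrate_one_file(), the void* parameter types, and the xattr key below are invented for illustration, and the real syncop_getxattr() in libglusterfs takes xlator_t/loc_t arguments.

/* Hypothetical sketch of the dict_t ownership rule described in the
 * commit message above. Not the real rebalance code: migrate_one_file()
 * and the xattr key are invented, and the prototypes are simplified. */
#include <stddef.h>

typedef struct _dict dict_t;

/* Simplified stand-ins for the libglusterfs declarations. */
extern int  syncop_getxattr (void *subvol, void *loc, dict_t **dict,
                             const char *key);
extern void dict_unref (dict_t *dict);

static int
migrate_one_file (void *subvol, void *loc)
{
        dict_t *xattr = NULL;
        int     ret   = -1;

        /* syncop_getxattr() fills @xattr with a dict_t whose reference
         * is owned by the caller. */
        ret = syncop_getxattr (subvol, loc, &xattr, "trusted.example.key");
        if (ret < 0)
                goto out;

        /* ... consume the xattr value here ... */

        /* When the response dict is not needed at all (the
         * GF_XATTR_LINKINFO_KEY case in the commit message), passing
         * NULL avoids taking a reference in the first place. */
        ret = syncop_getxattr (subvol, loc, NULL, "trusted.example.key");

out:
        /* The caller must drop its reference on the success path as
         * well as on errors; omitting this unref (twice per migrated
         * file) is the kind of leak the patch fixes. */
        if (xattr)
                dict_unref (xattr);

        return ret;
}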
A beta release for GlusterFS 3.6.0 has been made available [1]. Please verify whether this release resolves the issue in this bug report. If the glusterfs-3.6.0beta1 release does not contain a fix for this issue, leave a comment on this bug and move the status to ASSIGNED. If this release fixes the problem for you, leave a note and change the status to VERIFIED.

Packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure (possibly an "updates-testing" repository) for your distribution.

[1] http://supercolony.gluster.org/pipermail/gluster-users/2014-September/018836.html
[2] http://supercolony.gluster.org/pipermail/gluster-users/
This bug is being closed because a release that should address the reported issue has been made available. If the problem is still not fixed with glusterfs-3.6.1, please reopen this bug report.

glusterfs-3.6.1 has been announced [1]; packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution.

[1] http://supercolony.gluster.org/pipermail/gluster-users/2014-November/019410.html
[2] http://supercolony.gluster.org/mailman/listinfo/gluster-users