Bug 1126831
| Field | Value |
|---|---|
| Summary: | Memory leak in GlusterFs client |
| Product: | [Community] GlusterFS |
| Component: | quick-read |
| Status: | CLOSED EOL |
| Severity: | medium |
| Priority: | high |
| Version: | 3.7.6 |
| Reporter: | Poornima G <pgurusid> |
| Assignee: | Soumya Koduri <skoduri> |
| CC: | bugs, kkeithle, krzysztof.pawlowski, kwade, markd, monotek23, pgurusid, primeroznl, rgowdapp, sama.bharat, wade.fitzpatrick |
| Keywords: | Triaged |
| Hardware: | Unspecified |
| OS: | Unspecified |
| Doc Type: | Bug Fix |
| Type: | Bug |
| Last Closed: | 2017-03-08 10:51:41 UTC |
From the statedump it is evident that the iobufs are leaking. Also, the hot count of pool-name=w-vol-io-cache:rbthash_entry_t is 10053, which implies the io-cache xlator could be the cause of the leak. From the logs it looks like the quick-read performance xlator is calling iobuf_free with NULL pointers, which implies quick-read could be leaking iobufs as well. As a temporary workaround, could you disable io-cache and/or quick-read and see if the leak still persists?

```
$ gluster volume set w-vol performance.io-cache off
$ gluster volume set w-vol performance.quick-read off
```

This may reduce performance to a certain extent.

Regards,
Poornima

---

I too was experiencing this issue on my bricks. I am using version 3.5.2, and after setting io-cache and quick-read to "off" as suggested, I am no longer seeing glusterfs gobbling memory. I first noticed it when I enabled quotas: during the quota crawl the glusterfs process would be OOM-killed by the kernel. Before that, my bricks would consume all available memory until swap was exhausted and the kernel OOM-killed the glusterfs process. There is a rebalance running right now and glusterfs is behaving. Here is some output of my current config. Let me know if I can provide anything else to help.

---

Raising the priority of this a bit, as it sounds like a memory leak we want to get rid of in the near future. :) We should also add Tamas Papp to the CC list, as he reported this upstream and so may be able to test potential fixes for us. (I don't seem to be able to at the moment. :()

---

I've been running a 4×2 dist/repl cluster with io-cache and quick-read set to off. In under 24 hours the RSZ and VSZ of the client-side glusterfs (FUSE bridge) have grown continuously and appear to be leaking memory. You can see the data at http://download.gluster.org/pub/gluster/glusterfs/dynamic-analysis/longevity/client.out

N.B. this is running in a new "longevity cluster" here in the lab inside Red Hat.

---

I think we have hit this bug too in gluster-3.7.5. There is nothing interesting in the client logs.
```
core@comet ~ $ top
top - 13:07:04 up 1 day, 19:03, 1 user, load average: 0.30, 0.23, 0.22
Tasks: 160 total, 2 running, 158 sleeping, 0 stopped, 0 zombie
%Cpu(s): 2.7 us, 0.7 sy, 0.0 ni, 95.8 id, 0.1 wa, 0.1 hi, 0.6 si, 0.0 st
KiB Mem:  32978184 total, 32746628 used,   231556 free,   105032 buffers
KiB Swap: 16777212 total,   220340 used, 16556872 free.  1002344 cached Mem

  PID USER  PR NI    VIRT    RES  SHR S  %CPU %MEM     TIME+ COMMAND
  742 root  20  0 29.179g 0.028t 3404 S  78.8 91.2 412:29.02 glusterfs
  706 root  20  0  810552 598904 4780 S  32.8  1.8  83:50.18 fleetd
  902 root  20  0  574400 402476 8552 R  26.3  1.2 193:03.93 ringcap
26527 root  20  0   18900   4104 1636 S  13.1  0.0   0:00.38 rsync
  755 root  20  0  869676  60408 6644 S   6.6  0.2   4:54.66 node
```

We have 93 identical client hosts, but only this one (which runs dropboxd on ext4 and rsyncs data to the gluster volume) has exhibited the memory leak. I have set quick-read and io-cache off on the volume, so we will see how it progresses.

```
# gluster volume info static

Volume Name: static
Type: Striped-Replicate
Volume ID: 3f9f810d-a988-4914-a5ca-5bd7b251a273
Status: Started
Number of Bricks: 1 x 2 x 2 = 4
Transport-type: tcp
Bricks:
Brick1: james:/data/gluster1/static/brick1
Brick2: cupid:/data/gluster1/static/brick2
Brick3: hilton:/data/gluster1/static/brick3
Brick4: present:/data/gluster1/static/brick4
Options Reconfigured:
performance.quick-read: off
performance.io-cache: off
features.scrub: Active
features.bitrot: on
performance.readdir-ahead: on
geo-replication.indexing: on
geo-replication.ignore-pid-check: on
changelog.changelog: on
changelog.rollover-time: 10
changelog.fsync-interval: 3
```

---

There are lots of memleak-related bugs going to be fixed in 3.7.9. Can you check the same after that?

---

(In reply to Poornima G from comment #1)
> From the logs it looks like the quick-read performance xlator is calling
> iobuf_free with NULL pointers, which implies quick-read could be leaking
> iobufs as well.

Patch [1] adds a check before the unref.
With that check in place, we should no longer see these messages. Also, the mere presence of these logs cannot prove that quick-read is leaking iobufs. The reasons are:

* quick-read prunes an inode even after data is cached, and a qr_inode_prune deletes the data from the qr_inode.
* qr_readv calls qr_readv_cached if it finds a qr_inode set in the inode ctx. However, this is not a guarantee that data is present, as a qr_inode_prune might have been called on this inode.
* qr_readv_cached used to (before [1]) unconditionally call iobuf_unref/iobref_unref, which resulted in the log messages. Note that no iobuf/iobref is allocated in this codepath, as there is no data in the qr_inode. Hence it is not a leak.

However, there is one leak of the "content" allocated. See the definition of qr_content_update:

```
void
qr_content_update (xlator_t *this, qr_inode_t *qr_inode, void *data,
                   struct iatt *buf)
{
        qr_private_t     *priv  = NULL;
        qr_inode_table_t *table = NULL;

        priv  = this->private;
        table = &priv->table;

        LOCK (&table->lock);
        {
                __qr_inode_prune (table, qr_inode);
                qr_inode->data = data;
                ...
```

As can be seen above, qr_inode->data is set unconditionally. So, if there is data already present, "data" will leak. However, qr_lookup won't set GF_CONTENT_KEY if a qr_inode is already set in the inode ctx, and hence there are not many cases where we find file content in xdata in qr_lookup_cbk (which calls this function). But if there are simultaneous parallel fresh lookups (qr_lookup happening on the same file), we'll have more than one lookup response containing data. This in turn can result in qr_content_update being invoked on a qr_inode which already has non-NULL data, and can result in a leak. Please note that this is a race that can be hit only with more than one parallel fresh lookup on a file, so this leak _MAY_ not amount to much memory.

[1] http://review.gluster.org/10206

---

This bug is being closed because GlusterFS-3.7 has reached its end-of-life. Note: this bug is being closed using a script.
No verification has been performed to check whether it still exists in newer releases of GlusterFS. If this bug still exists in a newer GlusterFS release, please reopen this bug against that release.

The needinfo request[s] on this closed bug have been removed, as they have been unresolved for 1000 days.
Created attachment 924189 [details]
Statedump of the process leaking memory

Description of problem:

A small cluster is exported to Windows clients via SMB through a local mount. The GlusterFS client process uses huge memory:

```
 2464 root 20 0 16.8g  11g 1292 S 0 71.3 33:09.14 /usr/sbin/glusterfs --direct-io-mode=disable --volfile-id=/w-vol --volfile-server=gl0 /W/Projects
 2212 root 20 0 1491m  25m  844 S 3  0.2 18:37.37 /usr/sbin/glusterfsd -s gl0 --volfile-id w-vol.gl0.mnt-brick1-export -p /var/lib/glusterd/vols/w-v
```

The mount process log (W-Projects.log) is full of these errors:

```
.../3.4.4/xlator/performance/quick-read.so(qr_readv+0x62) [0x7fcb6d5852e2]))) 0-iobuf: invalid argument: iobref
[2014-07-30 17:04:24.082004] E [iobuf.c:733:iobuf_unref] (-->/usr/lib/x86_64-linux-gnu/glusterfs/3.4.4/xlator/debug/io-stats.so(io_stats_readv+0x182) [0x7fcb6d161022] (-->/usr/lib/x86_64-linux-gnu/glusterfs/3.4.4/xlator/performance/md-cache.so(mdc_readv+0x16b) [0x7fcb6d377d0b] (-->/usr/lib/x86_64-linux-gnu/glusterfs/3.4.4/xlator/performance/quick-read.so(qr_readv+0x62) [0x7fcb6d5852e2]))) 0-iobuf: invalid argument: iobuf
[2014-07-30 17:04:24.082036] E [iobuf.c:828:iobref_unref] (-->/usr/lib/x86_64-linux-gnu/glusterfs/3.4.4/xlator/debug/io-stats.so(io_stats_readv+0x182) [0x7fcb6d161022] (-->/usr/lib/x86_64-linux-gnu/glusterfs/3.4.4/xlator/performance/md-cache.so(mdc_readv+0x16b) [0x7fcb6d377d0b] (-->/usr/lib/x86_64-linux-gnu/glusterfs/3.4.4/xlator/performance/quick-read.so(qr_readv+0x62) [0x7fcb6d5852e2]))) 0-iobuf: invalid argument: iobref
[2014-07-30 17:04:24.085388] E [iobuf.c:733:iobuf_unref] (-->/usr/lib/x86_64-linux-gnu/glusterfs/3.4.4/xlator/debug/io-stats.so(io_stats_readv+0x182) [0x7fcb6d161022] (-->/usr/lib/x86_64-linux-gnu/glusterfs/3.4.4/xlator/performance/md-cache.so(mdc_readv+0x16b) [0x7fcb6d377d0b] (-->/usr/lib/x86_64-linux-gnu/glusterfs/3.4.4/xlator/performance/quick-read.so(qr_readv+0x62) [0x7fcb6d5852e2]))) 0-iobuf: invalid argument: iobuf
[2014-07-30 17:04:24.085419] E [iobuf.c:828:iobref_unref] (-->/usr/lib/x86_64-linux-gnu/glusterfs/3.4.4/xlator/debug/io-stats.so(io_stats_readv+0x182) [0x7fcb6d161022] (-->/usr/lib/x86_64-linux-gnu/glusterfs/3.4.4/xlator/performance/md-cache.so(mdc_readv+0x16b) [0x7fcb6d377d0b] (-->/usr/lib/x86_64-linux-gnu/glusterfs/3.4.4/xlator/performance/quick-read.so(qr_readv+0x62) [0x7fcb6d5852e2]))) 0-iobuf: invalid argument: iobref
[2014-07-30 17:04:24.246769] E [iobuf.c:733:iobuf_unref] (-->/usr/lib/x86_64-linux-gnu/glusterfs/3.4.4/xlator/debug/io-stats.so(io_stats_readv+0x182) [0x7fcb6d161022] (-->/usr/lib/x86_64-linux-gnu/glusterfs/3.4.4/xlator/performance/md-cache.so(mdc_readv+0x16b) [0x7fcb6d377d0b] (-->/usr/lib/x86_64-linux-gnu/glusterfs/3.4.4/xlator/performance/quick-read.so(qr_readv+0x62) [0x7fcb6d5852e2]))) 0-iobuf: invalid argument: iobuf
```

rebalance.log:

```
[2014-07-06 22:28:56.460558] I [dht-common.c:2614:dht_setxattr] 0-w-vol-dht: fixing the layout of /_torolni/60_Elements/Prod/T06_APPL/T06_APPL_PACKSHOT/T06_APPL_PACKSHOT_010/co/cp
[2014-07-06 22:28:56.463787] I [dht-common.c:2614:dht_setxattr] 0-w-vol-dht: fixing the layout of /_torolni/60_Elements/Prod/T06_APPL/T06_APPL_PACKSHOT/T06_APPL_PACKSHOT_010/co/cp/T06_APPL_060-co_cp_v001
[2014-07-06 22:28:56.466934] W [dht-common.c:419:dht_lookup_dir_cbk] 0-w-vol-dht: /_torolni/60_Elements/Prod/T06_APPL/T06_APPL_PACKSHOT/T06_APPL_PACKSHOT_010/co/cp/T06_APPL_060-co_cp_v001/f_800x800: gfid different on w-vol-client-1
[2014-07-06 22:28:56.467127] I [dht-common.c:2614:dht_setxattr] 0-w-vol-dht: fixing the layout of /_torolni/60_Elements/Prod/T06_APPL/T06_APPL_PACKSHOT/T06_APPL_PACKSHOT_010/co/cp/T06_APPL_060-co_cp_v001/f_800x800
[2014-07-06 22:28:56.467553] W [client-rpc-fops.c:1044:client3_3_setxattr_cbk] 0-w-vol-client-1: remote operation failed: No such file or directory
[2014-07-06 22:28:56.468787] W [client-rpc-fops.c:1994:client3_3_setattr_cbk] 0-w-vol-client-1: remote operation failed: No such file or directory
[2014-07-06 22:28:56.468892] E [dht-selfheal.c:1015:dht_dir_attr_heal] 0-dht: Failed to set uid/gid on /_torolni/60_Elements/Prod/T06_APPL/T06_APPL_PACKSHOT/T06_APPL_PACKSHOT_010/co/cp/T06_APPL_060-co_cp_v001/f_800x800 on w-vol-client-1 subvol (No such file or directory)
[2014-07-06 22:28:56.468906] W [client-rpc-fops.c:2541:client3_3_opendir_cbk] 0-w-vol-client-1: remote operation failed: No such file or directory. Path: /_torolni/60_Elements/Prod/T06_APPL/T06_APPL_PACKSHOT/T06_APPL_PACKSHOT_010/co/cp/T06_APPL_060-co_cp_v001/f_800x800 (00000000-0000-0000-0000-000000000000)
[2014-07-06 22:28:56.470407] W [client-rpc-fops.c:2329:client3_3_readdirp_cbk] 0-w-vol-client-1: remote operation failed: No such file or directory
[2014-07-06 22:28:56.518470] I [dht-rebalance.c:1729:gf_defrag_status_get] 0-glusterfs: Rebalance is completed. Time taken is 1350.00 secs
[2014-07-06 22:28:56.518494] I [dht-rebalance.c:1732:gf_defrag_status_get] 0-glusterfs: Files migrated: 0, size: 0, lookups: 0, failures: 0, skipped: 0
[2014-07-06 22:28:56.527984] W [glusterfsd.c:1002:cleanup_and_exit] (-->/lib/x86_64-linux-gnu/libc.so.6(clone+0x6d) [0x7f71868543fd] (-->/lib/x86_64-linux-gnu/libpthread.so.0(+0x7e9a) [0x7f7186b27e9a] (-->/usr/sbin/glusterfs(glusterfs_sigwaiter+0xc5) [0x7f7187611e75]))) 0-: received signum (15), shutting down
```

Version-Release number of selected component (if applicable):
3.5.2, also seen in 3.4.4

How reproducible:
No steps to reproduce.

Steps to Reproduce:
1.
2.
3.

Actual results:
OOM kill

Expected results:
OOM kill is not expected

Additional info: