Bug 1501146
| Summary: | FUSE client Memory usage issue | | |
|---|---|---|---|
| Product: | [Community] GlusterFS | Reporter: | Josh Coyle <joshua.coyle> |
| Component: | fuse | Assignee: | bugs <bugs> |
| Status: | CLOSED EOL | QA Contact: | |
| Severity: | high | Docs Contact: | |
| Priority: | unspecified | | |
| Version: | 3.10 | CC: | aboubacar.toure, bugs, danny.lee, joshua.coyle, nbalacha, rgowdapp, t.bueter, yoann.laissus |
| Target Milestone: | --- | | |
| Target Release: | --- | | |
| Hardware: | Unspecified | | |
| OS: | Linux | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2018-06-20 18:24:25 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
Forgot to mention, this is on Ubuntu 16.04.2 and 16.04.3.
Additional info based on the guidelines from the Gluster docs:
GlusterFS Cluster Information:
Number of volumes: 1
Volume Names: gvAA01
Volume on which the particular issue is seen [ if applicable ]: gvAA01
Type of volumes: Distributed Replicated
Volume options if available:
Options Reconfigured:
cluster.data-self-heal: off
cluster.lookup-unhashed: auto
cluster.lookup-optimize: on
cluster.self-heal-daemon: enable
client.bind-insecure: on
server.allow-insecure: on
nfs.disable: off
transport.address-family: inet
cluster.favorite-child-policy: size
Output of gluster volume info
Volume Name: gvAA01
Type: Distributed-Replicate
Volume ID: ca4ece2c-13fe-414b-856c-2878196d6118
Status: Started
Snapshot Count: 0
Number of Bricks: 5 x (2 + 1) = 15
Transport-type: tcp
Bricks:
Brick1: PB-WA-AA-01-B:/brick1/gvAA01/brick
Brick2: PB-WA-AA-02-B:/brick1/gvAA01/brick
Brick3: PB-WA-AA-00-A:/arbiterAA01/gvAA01/brick1 (arbiter)
Brick4: PB-WA-AA-01-B:/brick2/gvAA01/brick
Brick5: PB-WA-AA-02-B:/brick2/gvAA01/brick
Brick6: PB-WA-AA-00-A:/arbiterAA01/gvAA01/brick2 (arbiter)
Brick7: PB-WA-AA-01-B:/brick3/gvAA01/brick
Brick8: PB-WA-AA-02-B:/brick3/gvAA01/brick
Brick9: PB-WA-AA-00-A:/arbiterAA01/gvAA01/brick3 (arbiter)
Brick10: PB-WA-AA-01-B:/brick4/gvAA01/brick
Brick11: PB-WA-AA-02-B:/brick4/gvAA01/brick
Brick12: PB-WA-AA-00-A:/arbiterAA01/gvAA01/brick4 (arbiter)
Brick13: PB-WA-AA-01-B:/brick5/gvAA01/brick
Brick14: PB-WA-AA-02-B:/brick5/gvAA01/brick
Brick15: PB-WA-AA-00-A:/arbiterAA01/gvAA01/brick5 (arbiter)
Options Reconfigured:
cluster.data-self-heal: off
cluster.lookup-unhashed: auto
cluster.lookup-optimize: on
cluster.self-heal-daemon: enable
client.bind-insecure: on
server.allow-insecure: on
nfs.disable: off
transport.address-family: inet
cluster.favorite-child-policy: size
Output of gluster volume status
root@PB-WA-AA-00-A:/# gluster volume status
Status of volume: gvAA01
Gluster process TCP Port RDMA Port Online Pid
------------------------------------------------------------------------------
Brick PB-WA-AA-01-B:/brick1/gvAA01/brick 49152 0 Y 10547
Brick PB-WA-AA-02-B:/brick1/gvAA01/brick 49152 0 Y 10380
Brick PB-WA-AA-00-A:/arbiterAA01/gvAA01/bri
ck1 49152 0 Y 16770
Brick PB-WA-AA-01-B:/brick2/gvAA01/brick 49153 0 Y 10554
Brick PB-WA-AA-02-B:/brick2/gvAA01/brick 49153 0 Y 10388
Brick PB-WA-AA-00-A:/arbiterAA01/gvAA01/bri
ck2 49153 0 Y 16789
Brick PB-WA-AA-01-B:/brick3/gvAA01/brick 49154 0 Y 10565
Brick PB-WA-AA-02-B:/brick3/gvAA01/brick 49154 0 Y 10396
Brick PB-WA-AA-00-A:/arbiterAA01/gvAA01/bri
ck3 49154 0 Y 20685
Brick PB-WA-AA-01-B:/brick4/gvAA01/brick 49155 0 Y 10571
Brick PB-WA-AA-02-B:/brick4/gvAA01/brick 49155 0 Y 10404
Brick PB-WA-AA-00-A:/arbiterAA01/gvAA01/bri
ck4 49155 0 Y 14312
Brick PB-WA-AA-01-B:/brick5/gvAA01/brick 49156 0 Y 990
Brick PB-WA-AA-02-B:/brick5/gvAA01/brick 49156 0 Y 14869
Brick PB-WA-AA-00-A:/arbiterAA01/gvAA01/bri
ck5 49156 0 Y 19462
NFS Server on localhost 2049 0 Y 2950
Self-heal Daemon on localhost N/A N/A Y 2959
NFS Server on PB-WA-AA-01-B 2049 0 Y 23815
Self-heal Daemon on PB-WA-AA-01-B N/A N/A Y 23824
NFS Server on PB-WA-AA-02-B 2049 0 Y 14889
Self-heal Daemon on PB-WA-AA-02-B N/A N/A Y 14898
Task Status of Volume gvAA01
------------------------------------------------------------------------------
Task : Rebalance
ID : 5930cdcd-bb76-4d32-aeca-c41aea8f832d
Status : in progress
Client Information
OS Type: Ubuntu Linux
Mount type: gluster FUSE client
OS Version: 16.04.3
Created attachment 1337560 [details]
Mount log file
The only large allocations I see in the statedump are:
[mount/fuse.fuse - usage-type gf_common_mt_circular_buffer_t memusage]
size=32768
num_allocs=1025
max_size=32768
max_num_allocs=1025
total_allocs=1197200
[mount/fuse.fuse - usage-type gf_common_mt_char memusage]
size=128063
num_allocs=1024
max_size=152481
max_num_allocs=1028
total_allocs=1268262
Do you have any more statedumps taken at intervals?

I did have statedumps being collected at regular intervals; however, these appear to have cleared themselves. At present, the issue appears to have ceased. We have also moved some workload from this machine to another, which may have resolved the issue. The new machine is now displaying the same behaviour: it gradually consumes additional memory without releasing it. I'll begin taking statedumps on this machine, although previous attempts at this have not been successful. Would you like me to raise a new bug report for the new machine, or add all the info to this bug report?

You can add them to this BZ. There is a known issue where the FUSE mount process doesn't release inodes, so as you process more files, the size of the inode table grows. However, I would like to rule out other memory leaks.

Created attachment 1340591 [details]
glusterfs process statedump
Created attachment 1340592 [details]
glusterfs process statedump
Created attachment 1340593 [details]
glusterfs process statedump
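Editorial note: one way to compare statedumps taken at intervals is to parse the memusage sections and rank them by growth. The sketch below assumes each section header sits on its own line, followed by key=value lines, as in the excerpt quoted earlier in this report. The helper names parse_memusage and growth are hypothetical, and the "later" snapshot uses invented numbers purely for illustration.

```python
import re

# Matches section headers such as
# "[mount/fuse.fuse - usage-type gf_common_mt_char memusage]".
SECTION_RE = re.compile(r"^\[(?P<name>.+ memusage)\]$")


def parse_memusage(dump_text):
    """Return {section name: {field: int}} for each memusage section."""
    sections = {}
    current = None
    for raw in dump_text.splitlines():
        line = raw.strip()
        match = SECTION_RE.match(line)
        if match:
            current = sections.setdefault(match.group("name"), {})
        elif line.startswith("["):
            current = None  # a non-memusage section: skip its fields
        elif current is not None and "=" in line:
            key, _, value = line.partition("=")
            if value.strip().isdigit():
                current[key.strip()] = int(value.strip())
    return sections


def growth(earlier, later, field="size"):
    """Return (section, delta) pairs, largest increase in `field` first."""
    deltas = [
        (name, stats.get(field, 0) - earlier.get(name, {}).get(field, 0))
        for name, stats in later.items()
    ]
    return sorted(deltas, key=lambda pair: pair[1], reverse=True)


# Values quoted in this report.
EARLIER = """\
[mount/fuse.fuse - usage-type gf_common_mt_circular_buffer_t memusage]
size=32768
num_allocs=1025
[mount/fuse.fuse - usage-type gf_common_mt_char memusage]
size=128063
num_allocs=1024
"""

# Hypothetical later snapshot; these numbers are invented for illustration.
LATER = """\
[mount/fuse.fuse - usage-type gf_common_mt_circular_buffer_t memusage]
size=32768
num_allocs=1025
[mount/fuse.fuse - usage-type gf_common_mt_char memusage]
size=256126
num_allocs=2048
"""

for name, delta in growth(parse_memusage(EARLIER), parse_memusage(LATER)):
    print(f"{delta:>10}  {name}")
```

Passing field="num_allocs" ranks sections by allocation count instead of bytes. A usage type whose count keeps rising across dumps would be a candidate for closer inspection.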
I've added 3 new statedumps for this one. I do have another one; however, it's 6GB in size, and I'm pretty certain it's not complete. It was filling my /run/ partition. I can truncate it, but would the most useful info be at the start or the end of the file?

We are experiencing the same problem. Our cluster is made up of 3 nodes. We created around 160K small files (4K each), then removed them. Our FUSE client is still using around half a GB (after a day).

I've got a couple of new statedumps for this one; however, they're too large to upload to the bug report (45MB). Do you guys have somewhere I can send these? Thanks.

Created attachment 1343753 [details]
statedump [Gluster Client - High Memory Usage]
Gluster Client - High Memory Usage
We created around 160K files using:
`./smallfile/smallfile_cli.py --top /usr/local/gfs/data/mirrored-data/test --threads 16 --file-size 16 --files 10000 --response-times Y`
After deleting them, used memory barely went down.
OS: Centos 7
Gluster Versions: 3.10.5, 3.12.1, 3.12.2
(In reply to Nithya Balachandran from comment #6)
> You can add them to this BZ.
>
> There is a known issue where the Fuse mount process doesn't release inodes
> so as you process more files, the size of the inode table grows. However, I
> would like to rule out other memory leaks.

The attached statedumps don't show a large number of inodes (either active or inactive) in the itable. The maximum count of inodes in the active and lru lists on the client is less than 50. Hence this is not a case of memory consumption due to the kernel not forgetting inodes.

I tried running the smallfile test on various types of EC2 servers (m4.large, m4.xlarge and m4.2xlarge). The total amount of memory on these servers is 8GB, 16GB, and 32GB, respectively. The amount of memory used after writing and reading 1 million files was ~1GB, ~2GB, and ~3GB, respectively. Then, I checked the statedump files for the m4.large and m4.2xlarge. There was one noticeably large difference. Under "xlator.mount.fuse.priv", the "iobuf" for the m4.large was approximately half of the m4.xlarge, which was taking about 2 times more memory for the fuse mount. I'm guessing there is some correlation between the amount of memory used and the total amount of memory on the box. Does anyone know if there is a way to place a limit on this "iobuf"? Josh Coyle's statedumps also show very large numbers for the "iobuf" value under the "xlator.mount.fuse.priv" section.

This bug is reported against a version of Gluster that is no longer maintained (or has been EOL'd). See https://www.gluster.org/release-schedule/ for the versions currently maintained. As a result, this bug is being closed. If the bug persists on a maintained version of Gluster or against the mainline Gluster repository, request that it be reopened and that the Version field be marked appropriately.
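Editorial note: the suspected correlation between installed RAM and FUSE client memory in the EC2 comment can be checked with simple arithmetic. The sketch below uses only the approximate figures quoted in that comment; the function name used_fraction is hypothetical.

```python
# (total RAM in GB, approximate FUSE client memory in GB) as quoted above.
observations = {
    "m4.large": (8, 1),
    "m4.xlarge": (16, 2),
    "m4.2xlarge": (32, 3),
}


def used_fraction(total_gb, used_gb):
    """Fraction of installed RAM held by the FUSE client."""
    return used_gb / total_gb


fractions = {name: used_fraction(*pair) for name, pair in observations.items()}
for name, fraction in fractions.items():
    print(f"{name:<11} {fraction:.1%}")
```

With these rounded inputs the fractions are 12.5%, 12.5%, and about 9.4%, so usage grows with installed RAM but not strictly in proportion. The inputs are approximations, so this is only a rough indication.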
Created attachment 1337554 [details]
Gluster State Dump

Description of problem:
The glusterfs processes on clients that use the FUSE mount consume as much system memory and swap as they can over time, eventually leading to the process being killed by the OOM killer and the mount dropping. This occurs after a large amount of data has been transferred over the mount point (large in both size and file count; I've not been able to rule out one or the other, as this machine does both regularly).

Version-Release number of selected component (if applicable):
glusterfs 3.10.3

How reproducible:
Highly consistently

Steps to Reproduce:
1. Mount gluster volume via FUSE client
2. Transfer a lot of data
3. Watch memory usage of the glusterfs process increase over time

Actual results:
Memory usage increases over time, eventually leading to the glusterfs process being killed by the OOM killer and the mount dropping.

Expected results:
For the glusterfs process to release the memory it is consuming, to avoid OOM issues.

Additional info:
The Gluster volume version is 3.10.3. I have one client on 3.10.3 and one client on 3.11.3; both experience the same issue. This only occurs on clients which pass a large amount of traffic consistently (100s of GB daily). These mounts also process a large number of concurrent connections (up to 50 at a time), which may be playing some part in the issue.
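Editorial note: the "watch memory usage" step can be made repeatable by sampling the resident set size of the glusterfs client process at fixed intervals. This is a minimal, Linux-only sketch that reads the VmRSS line of /proc/<pid>/status. The function names are hypothetical, and the process ID of the mount's glusterfs process has to be supplied by the caller.

```python
import re
import time


def parse_vmrss_kb(status_text):
    """Return the VmRSS value in kB from /proc/<pid>/status text, or None."""
    match = re.search(r"^VmRSS:\s+(\d+)\s+kB$", status_text, re.MULTILINE)
    return int(match.group(1)) if match else None


def read_rss_kb(pid):
    """Read the current resident set size of `pid` (Linux only)."""
    with open(f"/proc/{pid}/status") as handle:
        return parse_vmrss_kb(handle.read())


def sample_rss(pid, samples, interval_s, read=read_rss_kb, sleep=time.sleep):
    """Collect `samples` RSS readings, waiting `interval_s` seconds between them."""
    readings = []
    for index in range(samples):
        readings.append(read(pid))
        if index < samples - 1:
            sleep(interval_s)
    return readings
```

For example, sample_rss(pid, samples=12, interval_s=300) would record one hour of readings at five-minute intervals. A series that keeps rising while the workload is idle would match the behaviour described in this report.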