Created attachment 1347114 [details]
Reproduce script

Description of problem:
We have a production system with a directory containing 17 million files. Quota was enabled later. Accessing these files with 'du' or a rebalance causes the OOM killer to kick in. I've created a script that reproduces the issue; see below.

Version-Release number of selected component (if applicable):

How reproducible:
Always.

Steps to Reproduce:
The attached script shows the issue. It creates an in-memory distributed volume and fills it with files (the count is the first argument), then enables quota and runs 'du'.

Actual results:
OOM or high memory use, depending on how many files are created.

Expected results:
No exceptional memory use.

Additional info:
3.7 does not have the issue. 3.8, 3.10 and 3.12 do.
Forgot to mention that 50000-100000 files are enough to get the brick processes to use ~1GB of memory.
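A quick, generic way to watch the brick processes' resident memory while the reproducer runs (a Linux sketch, not part of the original report; falls back to the current shell when no glusterfsd is running):

```shell
# Sketch: print resident memory (kB) of a process by reading VmRSS
# from /proc/<pid>/status. The glusterfsd process name is from gluster;
# the fallback to $$ is just so the snippet runs anywhere.
rss_kb() {
    awk '/^VmRSS:/ {print $2}' "/proc/$1/status"
}

pid=$(pgrep -o glusterfsd 2>/dev/null || echo $$)
rss_kb "$pid"
```

Running this periodically during the 'du' makes the growth easy to chart.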
This is mostly due to the inode_ref leak that was fixed through https://bugzilla.redhat.com/show_bug.cgi?id=1497084

Please upgrade to 3.12.2 and check if that fixes the issue.
Sorry, I forgot to mention: the latest I've tested is 3.12.2 from the centos-gluster312-test repo, and it still has the bug. I guess it is related to the creation of the trusted.pgfid.XXXXX xattr after quota is enabled; a 'du' or rebalance starts that process. Once a file has the pgfid xattr, there is no leak.
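For reference, one could check on the brick side whether a given file already carries a pgfid xattr. This is a hedged sketch: the brick path is taken from the reproduce script and may differ on a real system, it needs root plus the attr package, and it deliberately skips when the prerequisites are missing:

```shell
# Sketch: dump any trusted.pgfid.* xattrs on a brick-side file.
# BRICK_FILE is an assumed path matching the reproducer's layout.
BRICK_FILE=/gluster/mem0/brick/many/f0
if command -v getfattr >/dev/null 2>&1 && [ -e "$BRICK_FILE" ]; then
    getfattr -m 'trusted.pgfid' -d -e hex "$BRICK_FILE"
else
    echo "skipped: getfattr or brick file not available"
fi
```

If the xattr is absent, the next quota-aware lookup ('du', rebalance) will create it, which is the code path suspected of leaking here.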
Thanks for confirming that, changing priority.
Any news about this one?
Hi Hans,

I'm having trouble accessing the script; can you paste it here? Or mention the exact steps the script performs so we have a reproducer. Also, if you can give us a state dump, it will help us debug the memory leak, if any.

Thanks,
Hari.
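For what it's worth, a state dump of the brick processes can be requested per volume with the gluster CLI. A sketch, assuming the volume name "mem" from the reproducer (dump files land under /var/run/gluster by default); the snippet skips cleanly when no gluster CLI is installed:

```shell
# Sketch: ask all brick processes of volume "mem" to write a statedump,
# then list the most recent files in the default dump directory.
VOL=mem
if command -v gluster >/dev/null 2>&1; then
    gluster volume statedump "$VOL"
    ls -t /var/run/gluster/ | head -5
else
    echo "gluster CLI not available on this host"
fi
```

The dump files include per-xlator memory accounting and inode table sizes, which is exactly what is needed to pinpoint a leak like this.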
#!/bin/bash
# Usage: fail <number of files to create>
n=$1
# Host to create bricks on
host=sciimg01

# cleanup
umount /mnt
yes|gluster vol stop mem
yes|gluster vol delete mem
umount /gluster/mem0
umount /gluster/mem1

# init
truncate -s 4g /dev/shm/b0
truncate -s 4g /dev/shm/b1
mkfs.xfs -f /dev/shm/b0
mkfs.xfs -f /dev/shm/b1
mkdir -p /gluster/mem0
mkdir -p /gluster/mem1
mount -o loop /dev/shm/b0 /gluster/mem0
mount -o loop /dev/shm/b1 /gluster/mem1
gluster vol create mem ${host}:/gluster/mem0/brick ${host}:/gluster/mem1/brick
gluster vol start mem
mount -t glusterfs ${host}:/mem /mnt
sleep 3

# create files
mkdir /mnt/many
i=0; while [ $i -lt $n ]; do touch /mnt/many/f$i; let i++; done

gluster vol quota mem enable
du -hs /mnt
# Or rebalance
# gluster vol rebalance mem start
You can run the script with about 100000 files to see the result.
Created attachment 1404822 [details] Dump brick 0 after leak on 3.12.6
Created attachment 1404823 [details] Dump brick 1 after leak on 3.12.6
Release 3.12 has been EOL'd and this bug was still found to be in the NEW state; hence moving the version to mainline to triage it and take appropriate action.
We are not focusing much on Quota as a feature right now, hence the reduction in priority.
This bug has been moved to https://github.com/gluster/glusterfs/issues/965 and will be tracked there from now on. Visit the GitHub issue URL for further details.