Description of problem:

There is a leak of inodes on the brick process.

[root@unused ~]# gluster volume info

Volume Name: ra
Type: Distribute
Volume ID: 258a8e92-678b-41db-ba8e-b273a360297d
Status: Started
Number of Bricks: 1
Transport-type: tcp
Bricks:
Brick1: booradley:/home/export-2/ra
Options Reconfigured:
diagnostics.brick-log-level: DEBUG
nfs.disable: on
performance.readdir-ahead: on
transport.address-family: inet

Script:

[root@unused mnt]# for i in {1..150}; do echo $i; cp -rf /etc . && rm -rf *; done

After completion of the script, I can see active inodes in the brick itable:

[root@unused ~]# grep ra.active /var/run/gluster/home-export-2-ra.19609.dump.1465647069
conn.0.bound_xl./home/export-2/ra.active_size=149

[root@unused ~]# grep ra.active /var/run/gluster/home-export-2-ra.19609.dump.1465647069 | wc -l
150

But the client fuse mount doesn't have any active inodes other than the root:

[root@unused ~]# grep active /var/run/gluster/glusterdump.20612.dump.1465629006 | grep itable
xlator.mount.fuse.itable.active_size=1
[xlator.mount.fuse.itable.active.1]

I've not done a detailed RCA, but my initial gut feeling is that there is one inode leak for every iteration of the loop. The leaked inode mostly corresponds to /mnt/etc.

Version-Release number of selected component (if applicable):
RHGS-3.1.3 git repo. Bug seen on upstream master too.

How reproducible:
Quite consistently

Steps to Reproduce:
1. Create and start a plain distribute volume and fuse-mount it.
2. Run the script above from the mount point.
3. Take statedumps of the brick and the client and compare the active itable sizes.

Actual results:
The brick itable retains one active inode per loop iteration, even though the client itable holds only the root.

Expected results:
Once all files are removed, the brick's active inode count should match the client's.
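For reference, a minimal sketch of how these statedumps can be generated (assuming the volume name ra and the default dump directory /var/run/gluster; <pid> is a placeholder for the fuse client's pid):

# gluster volume statedump ra
# kill -USR1 <pid>
# grep ra.active /var/run/gluster/home-export-2-ra.*.dump.* | wc -l

The first command dumps the brick's state; the glusterfs fuse client writes its own statedump (glusterdump.<pid>.dump.<timestamp>) on receiving SIGUSR1.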
Created attachment 1166901 [details] brick statedump
Created attachment 1166915 [details] client statedump
There is a bug which prevents dumping the itable of the first client connected. To view leaks in the itable for the test case mentioned in this bug, fix [1] is also required.

[1] http://review.gluster.org/14704
On further investigation, the leak seems to be in the io-threads xlator. If I switch off io-threads, the leak goes away.
(In reply to Raghavendra G from comment #7)
> On further investigation, the leak seems to be in io-threads xlator. If I
> switch off io-threads, the leak goes away.

io-threads might just be introducing a race condition that causes the leak in other components. There is no inode leak in the io-threads xlator itself.
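For anyone retesting this, a hedged sketch of how io-threads can be switched off: performance.client-io-threads is a standard volume option, while the brick-side io-threads xlator is part of the brick graph, so testing without it typically means removing it from the brick volfile (e.g. under /var/lib/glusterd/vols/ra/) and restarting the volume.

# gluster volume set ra performance.client-io-threads off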
Created attachment 1201131 [details] Inode leak statedump and fusedump
Hi,

I did some more tests with respect to the inode leak in bricks. I have attached the statedump and fusedump. Here is what I did: created a 2x2 volume and mounted it with the fuse-dump option enabled.

# glusterfs --volfile-server=<IP ADDR> --dump-fuse=/home/dump.fdump --volfile-id=/vol /mnt/mount
# cd /mnt/mount
# for i in {1..50}; do mkdir healthy$i; cd healthy$i; echo dsfdsafsadfsad >> healthy$i; cd ../; done
# find .
# rm -rf ./*
# gluster volume statedump vol
# for i in {1..50}; do mkdir healthy$i; cd healthy$i; done
# cd /mnt/mount
# find .
# rm -rf ./*
# gluster volume statedump vol

Brick 4 shows inode leaks when the directories are created recursively (nested, the second loop) on the mount, but not when they are created at the same level (the first loop).

The fuse dump was taken and parsed with https://github.com/csabahenk/parsefuse to make it human-readable. The fuse dumps are attached.
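A hedged usage sketch for parsefuse; the invocation is assumed, so check the project README for the exact flags:

# parsefuse /home/dump.fdump > /home/dump.txt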
With the above observation alone, the fuse dump is not enough to come to a conclusion. I tried recreating the leak but was unsuccessful. In many runs with different types of volumes, this is the output of the statedump for a plain distribute volume with two bricks:

brick1:
conn.0.bound_xl./data/gluster/bricks/b1.active_size=1

brick2:
conn.0.bound_xl./data/gluster/bricks/b2.active_size=1

Both have only the root inode active. The fuse dump:

xlator.mount.fuse.itable.active_size=1

Things are working fine on the current master, so I'm closing this bug as not reproducible. Feel free to reopen it if it is seen again.
The bug is valid in 3.3.1. It happens at least once in 5 tries.

BRICKS:

ag active_size /var/run/gluster/data-gluster-bricks-b1.742.dump.1505986755
503:conn.0.bound_xl./data/gluster/bricks/b1.active_size=3

ag active_size /var/run/gluster/data-gluster-bricks-b2.761.dump.1505986756
503:conn.0.bound_xl./data/gluster/bricks/b2.active_size=2

brick1:

[conn.0.bound_xl./data/gluster/bricks/b1.active.1]
gfid=00000000-0000-0000-0000-000000000005
nlookup=15
fd-count=0
ref=1
ia_type=2
[xlator.features.locks.v1-locks.inode]
path=/.trashcan
mandatory=0

[conn.0.bound_xl./data/gluster/bricks/b1.active.2]
gfid=65dbc91d-df91-4f74-bfc1-644e7bf3ccb6
nlookup=0
fd-count=0
ref=1
ia_type=2
[xlator.features.locks.v1-locks.inode]
path=<gfid:65dbc91d-df91-4f74-bfc1-644e7bf3ccb6>
mandatory=0

[conn.0.bound_xl./data/gluster/bricks/b1.active.3]
gfid=00000000-0000-0000-0000-000000000001
nlookup=16
fd-count=0
ref=1
ia_type=2
[xlator.features.locks.v1-locks.inode]
path=/
mandatory=0

brick2:

[conn.0.bound_xl./data/gluster/bricks/b2.active.1]
gfid=00000000-0000-0000-0000-000000000005
nlookup=14
fd-count=0
ref=1
ia_type=2
[xlator.features.locks.v1-locks.inode]
path=/.trashcan
mandatory=0

[conn.0.bound_xl./data/gluster/bricks/b2.active.2]
gfid=00000000-0000-0000-0000-000000000001
nlookup=16
fd-count=0
ref=1
ia_type=2
[xlator.features.locks.v1-locks.inode]
path=/
mandatory=0

client:

xlator.mount.fuse.itable.active_size=2
[xlator.mount.fuse.itable.active.1]
gfid=00000000-0000-0000-0000-000000000005
nlookup=1
fd-count=0
ref=1
ia_type=2
[xlator.cluster.dht.v1-dht.inode]
layout.cnt=2
layout.preset=0
layout.gen=3
[io-cache.inode]
inode.weight=1
path=/.trashcan
uuid=00000000-0000-0000-0000-000000000005

[xlator.mount.fuse.itable.active.2]
gfid=00000000-0000-0000-0000-000000000001
nlookup=0
fd-count=0
ref=1
ia_type=2
[xlator.cluster.dht.v1-dht.inode]
layout.cnt=2
layout.preset=0
layout.gen=3
[io-cache.inode]
inode.weight=1
path=/
uuid=00000000-0000-0000-0000-000000000001

As we can see, the client has 2 active inodes (/ and /.trashcan) while brick1 has 3: the extra inode (gfid 65dbc91d-df91-4f74-bfc1-644e7bf3ccb6, nlookup=0) has no counterpart on the client and appears to be the leak. Attaching the statedumps of the client and the server bricks.
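A hypothetical helper for spotting such extra inodes, assuming the statedump layout shown above (the dump filenames here are placeholders): it extracts the active gfids from a dump and lists those present on a brick but not on the client.

gfids() { grep -A1 '\.active\.' "$1" | sed -n 's/^gfid=//p' | sort -u; }
comm -23 <(gfids brick1.dump) <(gfids client.dump)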
Created attachment 1328920 [details] The attachment is a tar of the fuse client and the two bricks of the server
The bug is not reproducible in 3.12. The result from the last try:

➜ glusterfs git:(release-3.12) ✗ ag active_size /var/run/gluster/data-gluster-bricks-b1.31183.dump.1506340455
502:conn.0.bound_xl./data/gluster/bricks/b1.active_size=1

➜ glusterfs git:(release-3.12) ✗ ag active_size /var/run/gluster/data-gluster-bricks-b2.31203.dump.1506340456
502:conn.0.bound_xl./data/gluster/bricks/b2.active_size=1

➜ glusterfs git:(release-3.12) ✗ ag active_size /var/run/gluster/glusterdump.31425.dump.1506340470
432:xlator.mount.fuse.itable.active_size=1

The active_size is correct for both the bricks and the client.
We have noticed that the bug is not reproducible in the latest version of the product (RHGS-3.3.1+). If the bug is still relevant and can still be reproduced, feel free to reopen it.