+++ This bug was initially created as a clone of Bug #1329773 +++

Description of problem:
Olia found an inode leak in the review at https://github.com/gluster/glusterfs/commit/b8106d1127f034ffa88b5dd322c23a10e023b9b6

Version-Release number of selected component (if applicable):

How reproducible:

Steps to Reproduce:
1.
2.
3.

Actual results:

Expected results:

Additional info:

--- Additional comment from Vijay Bellur on 2016-04-22 20:11:53 EDT ---

REVIEW: http://review.gluster.org/14052 (cluster/afr: Fix inode-leak in data self-heal) posted (#1) for review on master by Pranith Kumar Karampuri (pkarampu)

--- Additional comment from Vijay Bellur on 2016-04-22 22:15:09 EDT ---

REVIEW: http://review.gluster.org/14052 (cluster/afr: Fix inode-leak in data self-heal) posted (#2) for review on master by Pranith Kumar Karampuri (pkarampu)

--- Additional comment from Vijay Bellur on 2016-04-23 21:55:32 EDT ---

REVIEW: http://review.gluster.org/14052 (cluster/afr: Fix inode-leak in data self-heal) posted (#3) for review on master by Pranith Kumar Karampuri (pkarampu)

--- Additional comment from Vijay Bellur on 2016-04-25 02:17:46 EDT ---

COMMIT: http://review.gluster.org/14052 committed in master by Pranith Kumar Karampuri (pkarampu)
------
commit 13e458cd70ac1943cf68d95a2c6517663626c64a
Author: Pranith Kumar K <pkarampu>
Date:   Sat Apr 23 05:30:08 2016 +0530

    cluster/afr: Fix inode-leak in data self-heal

    Thanks to Olia-Kremmyda for finding the bug on github review,
    https://github.com/gluster/glusterfs/commit/b8106d1127f034ffa88b5dd322c23a10e023b9b6

    Change-Id: Ib8640ed0c331a635971d5d12052f0959c24f76a2
    BUG: 1329773
    Signed-off-by: Pranith Kumar K <pkarampu>
    Reviewed-on: http://review.gluster.org/14052
    Smoke: Gluster Build System <jenkins.com>
    NetBSD-regression: NetBSD Build System <jenkins.org>
    CentOS-regression: Gluster Build System <jenkins.com>
    Reviewed-by: Ravishankar N <ravishankar>
    Reviewed-by: Krutika Dhananjay <kdhananj>
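For anyone re-verifying this, the leak can also be checked at the inode-table level rather than only via process memory. The sketch below is illustrative only and rests on assumptions not stated in this bug: the statedump directory (/var/run/gluster is assumed as the default), the statedump file naming (glusterdump.<pid>.dump.<timestamp>), and the itable counter names (active_size, lru_size) can vary across glusterfs builds.

# Illustrative check, not part of the fix or the QE run: compare the
# self-heal daemon's inode-table counters before and after a large heal.
SHD_PID=$(pgrep -f glustershd | head -1)    # self-heal daemon PID on this node
DUMPDIR=/var/run/gluster                    # default statedump directory (assumption)

kill -USR1 "$SHD_PID"                       # SIGUSR1 asks the process to write a statedump
sleep 2
BEFORE=$(ls -t "$DUMPDIR"/glusterdump."$SHD_PID".dump.* | head -1)

# ... trigger the heal and wait for it to finish ...

kill -USR1 "$SHD_PID"
sleep 2
AFTER=$(ls -t "$DUMPDIR"/glusterdump."$SHD_PID".dump.* | head -1)

# An inode count that only ever grows across repeated heals would point to a leak.
grep -E 'active_size|lru_size' "$BEFORE"
grep -E 'active_size|lru_size' "$AFTER"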
As there is no specific way of testing this functionally, talked with dev and did the testing below. Moving to verified with the available information.

Had a 6-node setup; created a 1x2 volume and FUSE mounted it.
Created some files and copied a Linux kernel tar.
Brought down brick-0.
From two different mounts did the following IOs:
  - scp'd video files from my laptop to the volume mount
  - untarred the kernel
In all there was about 40GB of data to heal.
Started the heal by bringing the brick back up with a force start.
While the heal was going on, issued heal info continuously (with a sleep of 10s).
Also, from one mount, used dd in a loop to create 400MB files for as long as the heal was happening (created about 42 files).
The heal took about 1 hour to complete.
I monitored the shd process memory consumption on both brick nodes and did not see any change in consumption. On average, the CPU consumption shown by the top command was 1.1%.
I noticed the shd process CPU usage on the source was between 50-90% during the healing.

In the 1x2 volume: ran heal info in a loop while the actual heal was going on, to test 1330881 - Inode leaks found in data-self-heal. There was about 40GB of data to be healed (one untarred Linux kernel plus folders containing many big video files); it took about 1 hour to heal the complete data:

##############################
[root@dhcp35-191 ~]# gluster v status olia
Status of volume: olia
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick 10.70.35.98:/rhs/brick4/olia          N/A       N/A        N       N/A
Brick 10.70.35.64:/rhs/brick4/olia          49170     0          Y       6547
NFS Server on localhost                     2049      0          Y       20318
Self-heal Daemon on localhost               N/A       N/A        Y       20326
NFS Server on 10.70.35.27                   2049      0          Y       5753
Self-heal Daemon on 10.70.35.27             N/A       N/A        Y       5761
NFS Server on 10.70.35.114                  2049      0          Y       5869
Self-heal Daemon on 10.70.35.114            N/A       N/A        Y       5877
NFS Server on 10.70.35.44                   2049      0          Y       32066
Self-heal Daemon on 10.70.35.44             N/A       N/A        Y       32074
NFS Server on 10.70.35.98                   2049      0          Y       4823
Self-heal Daemon on 10.70.35.98             N/A       N/A        Y       4832
NFS Server on 10.70.35.64                   2049      0          Y       6574
Self-heal Daemon on 10.70.35.64             N/A       N/A        Y       6583

Task Status of Volume olia
------------------------------------------------------------------------------
There are no active volume tasks

Volume Name: olia
Type: Replicate
Volume ID: 6ad242d2-b0cb-441d-97a7-8fa2db693e05
Status: Started
Number of Bricks: 1 x 2 = 2
Transport-type: tcp
Bricks:
Brick1: 10.70.35.98:/rhs/brick4/olia
Brick2: 10.70.35.64:/rhs/brick4/olia
Options Reconfigured:
diagnostics.count-fop-hits: on
diagnostics.latency-measurement: on
performance.readdir-ahead: on
(turned on profiling)
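For reference, a minimal sketch of the polling/IO loop used above, assuming the volume is named olia and FUSE mounted at /mnt/olia; the mount path, file names, and polling details are illustrative, not taken from the original run:

# Bring the downed brick back up; this also kicks off the self-heal.
gluster volume start olia force

# In the background, poll heal info every 10 seconds until no entries are pending.
(
    while gluster volume heal olia info | grep -q 'Number of entries: [1-9]'; do
        sleep 10
    done
) &
HEAL_POLL_PID=$!

# From one mount, keep creating 400MB files for as long as the heal is running.
i=0
while kill -0 "$HEAL_POLL_PID" 2>/dev/null; do
    dd if=/dev/zero of=/mnt/olia/ddfile.$i bs=1M count=400
    i=$((i + 1))
done

# From another terminal, watch the self-heal daemon's memory (RES) and CPU over time.
top -b -d 10 -p "$(pgrep -f glustershd | head -1)"

A RES value for glustershd that stays flat over the whole run matches the behaviour reported above.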
Created attachment 1154497 [details] qe validation logs
Created attachment 1154498 [details] qe validation logs#2
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2016:1240