Description of problem:
=======================
On a 2 x (4 + 2) Distributed-Disperse volume, I killed 4 bricks (in line with the redundancy count) and can see that healing has started, which is not expected as the data bricks are up and running.

Version-Release number of selected component (if applicable):
3.8.4-5.el7rhgs.x86_64

How reproducible:
=================
Always

Steps to Reproduce:
===================
1) Create a distributed disperse volume and start it.
2) Fuse mount the volume on a client.
3) Kill the bricks based on the redundancy count.
4) From the mount point, untar the Linux kernel package and wait until it completes.

Check gluster vol heal <volname> info; it shows that heal is getting triggered.

I am seeing high CPU utilization on the nodes, and we suspect that this issue is what makes the CPU utilization grow.

Volume layout: d x [k + n], where k is the data brick count and n is the redundancy count. Here it is 2 x (4+2).

Actual results:
===============
Even though all the data bricks are up, healing is getting started.

Expected results:
=================
Healing should not happen as all the data bricks are up and running.
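The d x [k + n] notation above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not Gluster code: the class and method names are made up for this example, and the [2, 2] split assumes the 4 killed bricks were 2 per subvolume, which the report implies but does not state explicitly.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DisperseLayout:
    """A d x (k + n) distributed-disperse layout (illustrative names only)."""
    subvolumes: int   # d: number of disperse subvolumes
    data: int         # k: data bricks per subvolume
    redundancy: int   # n: redundancy bricks per subvolume

    @property
    def bricks_per_subvolume(self) -> int:
        return self.data + self.redundancy

    @property
    def total_bricks(self) -> int:
        return self.subvolumes * self.bricks_per_subvolume

    def stays_available(self, down_per_subvolume: list[int]) -> bool:
        """True if every subvolume has lost at most n bricks."""
        if len(down_per_subvolume) != self.subvolumes:
            raise ValueError("need one down-brick count per subvolume")
        return all(0 <= down <= self.redundancy for down in down_per_subvolume)


# The 2 x (4+2) volume from this report.
layout = DisperseLayout(subvolumes=2, data=4, redundancy=2)
print(layout.total_bricks)              # 12 bricks in total
print(layout.stays_available([2, 2]))   # True: 2 bricks down per subvolume
print(layout.stays_available([3, 1]))   # False: one subvolume lost 3 > n bricks
```

With 2 bricks down in each subvolume the volume is still within its redundancy, which is the state in which the report expects no heal to be triggered.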
Downstream patch: https://code.engineering.redhat.com/gerrit/#/c/101286
QATP:
=====
tc#1) Test the scenario described in the Description, i.e. bring down the redundant number of bricks first and then do IO. With the fix, CPU consumption of shd should come down ====> PASS
CPU is mostly at <2% and sometimes peaks to 10% (but for hardly a second), so this is acceptable.

tc#2) Keep doing IOs and then bring down one redundant brick after another ---> CPU utilization should be reduced, but it is not as long as IOs are going on (try with linux untar) ====> hence FAIL ===> raising a new bz

The steps in this bz, i.e. tc#1, are passing. However, this fix has not considered all the cases; hence moving this bz to verified while raising a new bz for tc#2.

Raised BZ#1464336 - selfheal deamon cpu consumption not reducing when IOs are going on and all redundant bricks are brought down one after another (for tc#2)

Test version: 3.8.4-28
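The tc#1 observation (CPU mostly below 2%, with brief peaks to 10%) could be checked against sampled data roughly as follows. This is a minimal sketch: the function name and the sample values are hypothetical, and how the shd CPU samples are collected is not covered in this report. Only the 2% and 10% figures come from the notes above.

```python
def summarize_cpu(samples, baseline=2.0):
    """Return the peak and the fraction of samples below the baseline (percent CPU)."""
    if not samples:
        raise ValueError("need at least one CPU sample")
    below = sum(1 for value in samples if value < baseline)
    return {
        "peak": max(samples),
        "fraction_below_baseline": below / len(samples),
    }


# Hypothetical per-second CPU readings for the self-heal daemon, in percent.
samples = [0.7, 1.2, 1.9, 10.0, 1.1, 0.4, 1.5, 0.9]
summary = summarize_cpu(samples)
print(summary["peak"])                      # 10.0
print(summary["fraction_below_baseline"])   # 0.875
```

A run like tc#1 would show a high fraction below the baseline with a short-lived peak, whereas tc#2 would be expected to show sustained readings above it while IO continues.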
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2017:2774