Description of problem:
==============
After running self heal on ECVOL, glustershd ends up consuming 46 GB in two days' time and fills the entire root volume.

Version-Release number of selected component (if applicable):
==============
glusterfs-api-3.7.1-14

Steps to Reproduce:
============
1. Create a 3 x (4+2) disperse volume, create 100K files and untar the Linux kernel.
2. Run a script that brings down two of the bricks, keeps populating data for 30 min, then runs rebalance and self heal, waits for 30 min, and repeats this 100 times (a shell sketch of this loop follows the Additional info below).

Actual results:
===============
Self heal is stuck, reporting "remote operation failed. Path" even though the file exists on the given client; it keeps logging the same messages and ends up filling the root volume.

Expected results:
=============
Self heal should complete.

Additional info:
==================
[root@rhs-client39 glusterfs]# gluster vol status ECVOL4
Status of volume: ECVOL4
Gluster process                                                    TCP Port  RDMA Port  Online  Pid
----------------------------------------------------------------------------------------------------
Brick rhs-client39.lab.eng.blr.redhat.com:/rhs/brick1/ECVOL4       49181     0          Y       28779
Brick rhs-client9.lab.eng.blr.redhat.com:/rhs/brick1/ECVOL4        49189     0          Y       3131
Brick rhs-client39.lab.eng.blr.redhat.com:/rhs/brick2/ECVOL4       49182     0          Y       28787
Brick rhs-client9.lab.eng.blr.redhat.com:/rhs/brick2/ECVOL4        49190     0          Y       3039
Brick rhs-client39.lab.eng.blr.redhat.com:/rhs/brick3/ECVOL4       49183     0          Y       28795
Brick rhs-client9.lab.eng.blr.redhat.com:/rhs/brick3/ECVOL4        49191     0          Y       3111
Brick rhs-client39.lab.eng.blr.redhat.com:/rhs/brick4/ECVOL4       49184     0          Y       28803
Brick rhs-client9.lab.eng.blr.redhat.com:/rhs/brick4/ECVOL4        49192     0          Y       3103
Brick rhs-client39.lab.eng.blr.redhat.com:/rhs/brick5/ECVOL4       49185     0          Y       28811
Brick rhs-client9.lab.eng.blr.redhat.com:/rhs/brick5/ECVOL4        49193     0          Y       3088
Brick rhs-client39.lab.eng.blr.redhat.com:/rhs/brick6/ECVOL4       49186     0          Y       28819
Brick rhs-client9.lab.eng.blr.redhat.com:/rhs/brick6/ECVOL4        49194     0          Y       3130
Brick rhs-client39.lab.eng.blr.redhat.com:/rhs/brick4/ECVOL4_add1  49203     0          Y       6373
Brick rhs-client9.lab.eng.blr.redhat.com:/rhs/brick4/ECVOL4_add1   49211     0          Y       3188
Brick rhs-client39.lab.eng.blr.redhat.com:/rhs/brick5/ECVOL4_add1  49204     0          Y       6391
Brick rhs-client9.lab.eng.blr.redhat.com:/rhs/brick5/ECVOL4_add1   49212     0          Y       3194
Brick rhs-client39.lab.eng.blr.redhat.com:/rhs/brick6/ECVOL4_add1  49205     0          Y       6409
Brick rhs-client9.lab.eng.blr.redhat.com:/rhs/brick6/ECVOL4_add1   49213     0          Y       3199
NFS Server on localhost                                            2049      0          Y       19405
Self-heal Daemon on localhost                                      N/A       N/A        Y       19413
NFS Server on rhs-client9.lab.eng.blr.redhat.com                   N/A       N/A        N       N/A
Self-heal Daemon on rhs-client9.lab.eng.blr.redhat.com             N/A       N/A        Y       5009

Task Status of Volume ECVOL4
------------------------------------------------------------------------------
Task                 : Rebalance
ID                   : 85b7093c-5175-429d-892f-fbd39cf63876
Status               : in progress
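A minimal shell sketch of the reproduction loop described in the steps above, assumed to run on rhs-client39. The node names are taken from the vol status output, but the brick layout, mount point, file sizes and kernel tarball path are placeholders, not the exact test setup:

#!/bin/bash
# Rough sketch only: brick placement is simplified (hence "force"), and the
# mount point / tarball path are assumptions, not the original test values.

VOL=ECVOL
N1=rhs-client39.lab.eng.blr.redhat.com
N2=rhs-client9.lab.eng.blr.redhat.com
MNT=/mnt/$VOL

# 3 x (4+2) disperse volume = 18 bricks, 6 per disperse set, redundancy 2
gluster volume create $VOL disperse 6 redundancy 2 \
    $N1:/rhs/brick{1..9}/$VOL $N2:/rhs/brick{1..9}/$VOL force
gluster volume start $VOL
mkdir -p $MNT && mount -t glusterfs $N1:/$VOL $MNT

# Initial data set: 100K files plus a kernel untar
for i in $(seq 1 100000); do touch $MNT/file.$i; done
tar -xf /root/linux-kernel.tar -C $MNT    # tarball path is a placeholder

for iter in $(seq 1 100); do
    # Bring down two local bricks by killing their glusterfsd processes
    # (the brick path appears on the glusterfsd command line)
    pkill -f "brick1/$VOL"
    pkill -f "brick2/$VOL"

    # Keep populating data for 30 minutes
    timeout 30m bash -c "while true; do dd if=/dev/urandom of=$MNT/new.$iter.\$RANDOM bs=1M count=1 2>/dev/null; done"

    # Restart the downed bricks, then rebalance and trigger a full heal
    gluster volume start $VOL force
    gluster volume rebalance $VOL start
    gluster volume heal $VOL full

    sleep 1800    # wait another 30 minutes before the next iteration
done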
Logs are available @ /home/repo/sosreports/bug.1264804
Steps to reproduce:
1] Create a 4+2 EC volume, mount it on a client node [fuse] and bring down a brick.
2] Add 6 bricks [2 x (4 + 2)] and bring down a brick.
3] Create 20 files of 1M each and verify heal status.
4] Run rebalance and verify heal status.
5] Add 6 more bricks [3 x (4 + 2)] and run rebalance (see the CLI sketch below).

gluster v heal vol info is displaying correct information; however, it is not displaying info for files which were migrated to the new bricks. Attaching o/p.
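For reference, a hedged CLI sketch of steps 2-5 above; the host names (server1/server2), brick paths and mount point are placeholders, not the original setup:

# Assumes a 1 x (4+2) volume already exists and is mounted at $MNT.
VOL=ECVOL
MNT=/mnt/$VOL

# 2] Grow 1 x (4+2) -> 2 x (4+2): add one more 6-brick disperse set, then down a brick
gluster volume add-brick $VOL server1:/rhs/brick{7..9}/$VOL server2:/rhs/brick{7..9}/$VOL
pkill -f "brick7/$VOL"      # kill the glusterfsd of one brick to take it down

# 3] Create 20 files of 1M each and check heal status
for i in $(seq 1 20); do dd if=/dev/zero of=$MNT/file$i bs=1M count=1; done
gluster volume heal $VOL info

# 4] Run rebalance and re-check heal status
gluster volume rebalance $VOL start
gluster volume rebalance $VOL status
gluster volume heal $VOL info

# 5] Grow 2 x (4+2) -> 3 x (4+2) and rebalance again
gluster volume add-brick $VOL server1:/rhs/brick{10..12}/$VOL server2:/rhs/brick{10..12}/$VOL
gluster volume rebalance $VOL start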
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://rhn.redhat.com/errata/RHBA-2016-0193.html