Description of problem:
=========================
When we issue a heal info, we get the o/p in a continuous, stream-like fashion. Say we have a 4x2 volume; the o/p is streamed one brick after another.
However, when we issue a heal info split-brain, the o/p is not streamed as it is received but is dumped at the end for all the bricks together.

There are two problems associated with that:
1) This gives the perception that heal info split-brain is hung.
2) We simply consume memory without releasing it until the end of the command (as such no mem leak), but in an extreme case this can hypothetically cause an OOM kill (just think of a case where there are lakhs of split-brain entries in a multi-brick volume, and all of these are cached).

Note: I have reproduced it both with and without split-brain entries.

Version-Release number of selected component (if applicable):
==========
3.8.4-13

How reproducible:
==========
always

Steps to Reproduce:
1. create a 2x2 volume
2. create many split-brain files, say gfid split-brains on all bricks
3. now issue heal info split-brain

Actual results:
===========
all the o/p is dumped in one shot

Expected results:
===========
keep streaming just as in heal info instead of dumping everything at the end

Additional info:
(In reply to nchilaka from comment #0)
> How reproducible:
> ==========
> always
>
> Steps to Reproduce:
> 1.create a 2x2 volume
> 2.create many splibtrain files, say gfid splitbrains on all bricks
> 3.now issue heal info split-brain

Karthik, could you take a look at this please? As you know, in glfs-heal.c, all output to stdout is done using printf. I'm wondering if we need to call fflush or if there is some other bug.
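For illustration only, here is a minimal sketch of the fflush idea raised above. It is not taken from glfs-heal.c: the helper name print_brick_entries and its parameters are hypothetical, and whether missing flushes were the actual cause of this bug is not established in this report. The sketch only shows the general technique: stdio output goes through a buffer, and stdout is typically fully buffered when it is not attached to a terminal, so flushing after each brick makes that brick's entries visible without waiting for the command to finish.

```c
#include <stdio.h>

/*
 * Hypothetical helper (not from glfs-heal.c): print the header and the
 * entries for a single brick, then flush the stream so the output is
 * delivered immediately instead of sitting in the stdio buffer.
 * Returns 0 on success, -1 if writing or flushing fails.
 */
static int
print_brick_entries(FILE *out, const char *brick,
                    const char *const *entries, size_t count)
{
    size_t i;

    if (fprintf(out, "Brick %s\n", brick) < 0)
        return -1;

    for (i = 0; i < count; i++) {
        if (fprintf(out, "%s\n", entries[i]) < 0)
            return -1;
    }

    /* Flush once per brick so the o/p streams brick by brick. */
    return (fflush(out) == 0) ? 0 : -1;
}
```

A caller would invoke it once per brick, e.g. print_brick_entries(stdout, brick_name, entries, count), where brick_name, entries and count are placeholders for whatever the heal code has collected for that brick. Flushing per brick rather than per entry keeps the number of write calls low while still streaming.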
Sure Ravi. Will take a look.
Upstream patch: https://review.gluster.org/#/c/18570/
Update:
=========
Build Used : glusterfs-3.12.2-7.el7rhgs.x86_64

Scenario:
1) create 1 * 2 replicate volume
2) create 10K data split-brain files
3) issue heal info split-brain

With this patch, we can see split-brain files are streaming to console:

# gluster vol heal 12 info split-brain
Brick 10.70.35.61:/bricks/brick1/b0
/file_140
/file_1
/file_2
/file_3
/file_4
/file_5

Changing status to Verified.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2018:2607