Bug 1506104

Summary: gluster volume heal info split-brain needs to display output of each brick in a stream fashion instead of buffering and dumping at the end

Product: [Community] GlusterFS
Component: replicate
Version: mainline
Hardware: Unspecified
OS: Unspecified
Status: CLOSED CURRENTRELEASE
Severity: high
Priority: high
Reporter: Karthik U S <ksubrahm>
Assignee: Karthik U S <ksubrahm>
CC: bugs, nchilaka, ravishankar, rhs-bugs, storage-qa-internal
Keywords: ZStream
Target Milestone: ---
Target Release: ---
Fixed In Version: glusterfs-4.0.0
Doc Type: If docs needed, set a value
Clone Of: 1419438
Clones: 1514419 1514420 1514424 (view as bug list)
Last Closed: 2018-03-15 11:18:57 UTC
Type: Bug
Bug Blocks: 1419438, 1514419, 1514420, 1514424

Comment 1 Karthik U S 2017-10-25 07:05:55 UTC
Description of problem:
=========================
When we issue a heal info, we get the output in a continuous, stream-like fashion. Say we have a 4x2 volume; the output is streamed one brick after another.

However, when we issue heal info split-brain, the output is not streamed as it is received but is dumped at the end for all the bricks together.

There are two problems associated with that:
1) It gives the perception that heal info split-brain is hung.
2) We keep consuming memory without releasing it until the end of the command (not a memory leak as such), but in an extreme case this could hypothetically cause an OOM kill (think of a multi-brick volume with hundreds of thousands of split-brain entries, all of them cached in memory). A sketch of this buffering pattern follows the note below.

Note: I have reproduced this both with and without split-brain entries.
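
For illustration only, here is a standalone sketch of the buffered (pre-fix) pattern described above; the names and output format are made up and this is not the actual glfsheal/CLI code. Every entry from every brick is accumulated in memory and printed only once the whole crawl is done, so nothing reaches the terminal in the meantime and memory use grows with the number of entries.

```c
/* Standalone sketch of the buffered (pre-fix) pattern; names and output
 * format are illustrative, not the real gluster code. */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int
main (void)
{
        const int  nentries = 100000;   /* imagine lakhs of split-brain entries */
        size_t     cap      = 1024;
        size_t     len      = 0;
        char      *buf      = malloc (cap);
        char       line[64];

        if (!buf)
                return 1;
        buf[0] = '\0';

        for (int i = 0; i < nentries; i++) {
                /* each iteration stands in for one split-brain entry found
                 * while crawling a brick's pending-heal indices */
                int n = snprintf (line, sizeof (line),
                                  "<gfid:%08d> - Is in split-brain\n", i);
                while (len + n + 1 > cap) {
                        cap *= 2;
                        buf = realloc (buf, cap);
                        if (!buf)
                                return 1;
                }
                memcpy (buf + len, line, n + 1);   /* held in memory only */
                len += n;
        }

        fputs (buf, stdout);    /* the user sees output only at the very end */
        free (buf);
        return 0;
}
```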

Version-Release number of selected component (if applicable):
==========
3.8.4-13

How reproducible:
==========
always



Steps to Reproduce:
1. Create a 2x2 volume.
2. Create many split-brain files, say GFID split-brains, on all bricks.
3. Now issue heal info split-brain.

Comment 2 Worker Ant 2017-10-25 07:26:53 UTC
REVIEW: https://review.gluster.org/18570 (cluster/afr: Print heal info split-brain output in stream fashion) posted (#1) for review on master by Karthik U S (ksubrahm)

Comment 3 Worker Ant 2017-11-13 03:44:40 UTC
COMMIT: https://review.gluster.org/18570 committed in master by  

------------- cluster/afr: Print heal info split-brain output in stream fashion

Problem:
When we trigger the heal info split-brain command the o/p is not
streamed as it is received, but dumped at the end for all the bricks
together. This gives a perception that the command is hung.

Fix:
When we get a split-brain entry while crawling through the pending
heal entries, flush it immediately so that the output is printed
in a stream fashion and the CLI doesn't look hung.

Change-Id: I7547e86b83202d66616749b8b31d4d0dff0abf07
BUG: 1506104
Signed-off-by: karthik-us <ksubrahm>
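
The streamed approach described in the commit message above can be sketched roughly as follows; the struct and helper names are hypothetical and this is not the actual afr/glfsheal code. Each split-brain entry is printed and flushed as soon as the crawl finds it, so output appears brick by brick instead of in one dump at the end.

```c
/* Minimal sketch of the streamed approach; names are illustrative only. */
#include <stdio.h>

/* hypothetical: one split-brain entry discovered during the crawl */
struct sb_entry {
        const char *gfid;
        const char *path;
};

static void
print_split_brain_entry (const struct sb_entry *e)
{
        printf ("%s\n%s - Is in split-brain\n\n", e->path, e->gfid);
        fflush (stdout);   /* flush immediately: this is the essence of the fix */
}

int
main (void)
{
        struct sb_entry fake[] = {
                { "<gfid:aaaa-1111>", "/dir/file1" },
                { "<gfid:bbbb-2222>", "/dir/file2" },
        };
        size_t nfake = sizeof (fake) / sizeof (fake[0]);

        /* stand-in for the per-brick crawl over pending heal entries */
        for (size_t i = 0; i < nfake; i++)
                print_split_brain_entry (&fake[i]);

        return 0;
}
```

The explicit flush matters because stdout is typically fully buffered when it is not a terminal (for example when it is redirected or read over a pipe), so without it the entries would still sit in the stdio buffer until the process is about to exit.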

Comment 4 Worker Ant 2017-11-21 13:39:53 UTC
REVIEW: https://review.gluster.org/18832 (cluster/afr: Print heal info summary output in stream fashion) posted (#1) for review on master by Karthik U S

Comment 5 Karthik U S 2017-11-21 13:42:55 UTC
The heal info summary feature also had the same issue. A patch for that was sent as well under this bug, so moving it back to POST.

Comment 6 Worker Ant 2017-11-22 10:13:03 UTC
COMMIT: https://review.gluster.org/18832 committed in master by "Karthik U S" <ksubrahm> with a commit message- cluster/afr: Print heal info summary output in stream fashion

Problem:
The heal info summary was printing the output only at the end, after
the crawl for pending heal entries completed on all the bricks.

Fix:
Print the output immediately after the crawl on an individual brick
completes, so that it doesn't give the impression of the CLI being hung.

Change-Id: Ieaf5718736a7ee6837bac02bd30a95836e605dab
BUG: 1506104
Signed-off-by: karthik-us <ksubrahm>
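
Similarly, the summary fix can be sketched like this (hypothetical names and output format, not the actual code path): each brick's summary is printed as soon as that brick's crawl finishes, rather than after all bricks have been crawled.

```c
/* Minimal sketch of per-brick summary printing; names and output format
 * are illustrative only. */
#include <stdio.h>

struct brick_summary {
        const char *brick;
        int         pending;
        int         split_brain;
        int         healing;
};

static void
print_brick_summary (const struct brick_summary *s)
{
        printf ("Brick %s\n"
                "Number of entries: %d\n"
                "Number of entries in split-brain: %d\n"
                "Number of entries possibly healing: %d\n\n",
                s->brick, s->pending, s->split_brain, s->healing);
        fflush (stdout);   /* emit per brick instead of accumulating */
}

int
main (void)
{
        struct brick_summary bricks[] = {
                { "server1:/bricks/b0", 12, 3, 1 },
                { "server2:/bricks/b0", 12, 3, 0 },
        };
        size_t nbricks = sizeof (bricks) / sizeof (bricks[0]);

        for (size_t i = 0; i < nbricks; i++) {
                /* ...crawl of this brick's pending-heal indices would go here... */
                print_brick_summary (&bricks[i]);
        }

        return 0;
}
```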

Comment 7 Shyamsundar 2018-03-15 11:18:57 UTC
This bug is being closed because a release has been made available that should address the reported issue. If the problem is still not fixed with glusterfs-4.0.0, please open a new bug report.

glusterfs-4.0.0 has been announced on the Gluster mailing lists [1], and packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution.

[1] http://lists.gluster.org/pipermail/announce/2018-March/000092.html
[2] https://www.gluster.org/pipermail/gluster-users/