Bug 1555261

Summary: After a replace brick command, self-heal takes some time to start healing files on disperse volumes
Product: Red Hat Gluster Storage
Reporter: Xavi Hernandez <jahernan>
Component: disperse
Assignee: Xavi Hernandez <jahernan>
Status: CLOSED ERRATA
QA Contact: nchilaka <nchilaka>
Severity: unspecified
Docs Contact:
Priority: unspecified
Version: rhgs-3.4
CC: bugs, jahernan, rhinduja, rhs-bugs, sheggodu, srmukher, storage-qa-internal
Target Milestone: ---
Target Release: RHGS 3.4.0
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version: glusterfs-3.12.2-6
Doc Type: Bug Fix
Doc Text:
An issue prevented self-heal from progressing, which delayed the heal process after replacing a brick or bringing a failed brick back online. This fix keeps self-heal active until all pending files are completely healed.
Story Points: ---
Clone Of: 1547662
Environment:
Last Closed: 2018-09-04 06:44:14 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Bug Depends On: 1547662
Bug Blocks: 1503137

Description Xavi Hernandez 2018-03-14 10:51:28 UTC
+++ This bug was initially created as a clone of Bug #1547662 +++

Description of problem:

After a replace brick, self-heal takes some time to start reconstructing the files, and when it starts, sometimes it pauses for a while.

Version-Release number of selected component (if applicable): mainline


How reproducible:

always

Steps to Reproduce:
1. create a disperse volume
2. replace one brick
3. check new brick contents
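
The steps above can be sketched as the following gluster CLI invocations, built here in Python as argument lists and only printed, not executed. The volume name, host names, and brick paths are hypothetical, and the 2+1 disperse layout is just an example; the report does not say which configuration was used. Files also need to be written to the volume through a client mount before the brick is replaced, otherwise there is nothing to heal.

```python
# Hypothetical names for illustration; none of these come from the bug report.
VOLUME = "testvol"
BRICKS = ["server1:/bricks/b1", "server2:/bricks/b2", "server3:/bricks/b3"]
NEW_BRICK = "server3:/bricks/b3_new"


def reproduction_commands(volume, bricks, old_brick, new_brick, redundancy=1):
    """Return the gluster commands for the reproduction steps as argument lists."""
    # Step 1: create and start a disperse volume across all bricks.
    create = ["gluster", "volume", "create", volume,
              "disperse", str(len(bricks)),
              "redundancy", str(redundancy), *bricks]
    start = ["gluster", "volume", "start", volume]
    # Step 2: replace one brick with a new, empty one.
    replace = ["gluster", "volume", "replace-brick", volume,
               old_brick, new_brick, "commit", "force"]
    # Step 3: listing pending heals is one way to watch progress;
    # inspecting the new brick directory on its server is another.
    heal_info = ["gluster", "volume", "heal", volume, "info"]
    return [create, start, replace, heal_info]


commands = reproduction_commands(VOLUME, BRICKS, BRICKS[2], NEW_BRICK)
for cmd in commands:
    print(" ".join(cmd))
```

Each list can be passed to `subprocess.run` on a node of a test cluster; the sketch stops at printing so that it has no side effects.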

Actual results:

The new brick is not filled immediately. It can take some time before files start to appear on it, and the process may pause along the way.

Expected results:

Self-heal should trigger immediately after replacing the brick and not stop until all files are healed.
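
One way to check whether healing has finished is to sum the per-brick pending counts reported by `gluster volume heal <volname> info`. The sketch below assumes that output contains one `Number of entries: N` line per brick; the sample text is made up for illustration and is not output from the system in this report. Bricks that report a non-numeric count (for example when a brick is down) are skipped by this parser.

```python
import re


def pending_entries(heal_info_output):
    """Sum the 'Number of entries' values found in heal info output."""
    counts = re.findall(r"^Number of entries:\s*(\d+)\s*$",
                        heal_info_output, flags=re.MULTILINE)
    return sum(int(n) for n in counts)


# Illustrative sample only; brick names and paths are hypothetical.
SAMPLE = """Brick server1:/bricks/b1
/file1
Status: Connected
Number of entries: 1

Brick server2:/bricks/b2
/file1
Status: Connected
Number of entries: 1

Brick server3:/bricks/b3_new
Status: Connected
Number of entries: 0
"""

total = pending_entries(SAMPLE)
print(total)
```

Polling this value after the replace and checking that it keeps decreasing until it reaches zero is a rough way to observe the behaviour described under expected results.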

Additional info:

Comment 6 nchilaka 2018-05-17 13:58:29 UTC
Tested this on 3.12.2-9 and I see self-heal kicks off immediately (as opposed to 3.8.4-54.9 (3.3.1 async), where it started only after a bit of a delay).
Hence moving to verified.

Comment 10 errata-xmlrpc 2018-09-04 06:44:14 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2018:2607