Bug 1538358

Summary: EC users are hitting bugs WRT healing that look to be fixed in upstream.
Product: [Red Hat Storage] Red Hat Gluster Storage Reporter: Ben Turner <bturner>
Component: disperse Assignee: Ashish Pandey <aspandey>
Status: CLOSED NOTABUG QA Contact: Nag Pavan Chilakam <nchilaka>
Severity: urgent Docs Contact:
Priority: unspecified    
Version: rhgs-3.3 CC: aspandey, bturner, jahernan, rhs-bugs, storage-qa-internal
Target Milestone: ---   
Target Release: ---   
Hardware: All   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2018-01-31 15:54:13 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Ben Turner 2018-01-24 22:04:19 UTC
Description of problem:

We think we are hitting the following issues that have been fixed in upstream gluster but haven't made it to RHGS yet:

http://lists.gluster.org/pipermail/gluster-users/2017-August/032225.html
https://review.gluster.org/#/c/16468/

https://bugzilla.redhat.com/show_bug.cgi?id=1475789
https://review.gluster.org/#/c/16772/9/xlators/cluster/ec/src/ec-heal.c

https://bugzilla.redhat.com/show_bug.cgi?id=1427159

Several customers are hitting combinations of recently fixed EC bugs that appear to be resolved upstream. Some upstream EC fixes have not yet made it downstream, so I suggest backporting any upstream EC fix that could be relevant. I am fairly confident we are hitting some combination of these bugs in several different environments.

Version-Release number of selected component (if applicable):

RHGS 3.3.1

How reproducible:

Frequently.

Steps to Reproduce:
1.  Induce healing on an EC volume through disconnects.
2.  Stat files, write new data, append to files, and read them.
3.  EC healing does not complete, due to various issues.

Actual results:

Bricks are unable to fully heal.
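
One way to confirm that bricks have not fully healed is to look at the per-brick pending entry counts. The sketch below is illustrative only and is not part of the original report: it assumes the text comes from `gluster volume heal <VOLNAME> info`, that each brick section starts with a `Brick ` line and ends with a `Number of entries:` line, and the helper names `pending_heal_entries` and `heal_complete` are hypothetical.

```python
def pending_heal_entries(heal_info_text):
    """Map each brick to its pending heal entry count.

    Assumes heal-info style text: a "Brick <host>:<path>" line per brick,
    followed later by a "Number of entries: <n>" line. A non-numeric count
    (for example "-" on a disconnected brick) is recorded as None.
    """
    counts = {}
    brick = None
    for raw in heal_info_text.splitlines():
        line = raw.strip()
        if line.startswith("Brick "):
            brick = line[len("Brick "):]
            counts[brick] = None
        elif line.startswith("Number of entries:") and brick is not None:
            value = line.split(":", 1)[1].strip()
            counts[brick] = int(value) if value.isdigit() else None
    return counts


def heal_complete(counts):
    """True only if at least one brick was seen and every count is zero."""
    return bool(counts) and all(c == 0 for c in counts.values())
```

Polling this after step 2 and seeing nonzero (or unknown) counts that never drain would match the behavior described here.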

Expected results:

Healing completes and the volume returns to normal operation.


Additional info:

Comment 3 Ashish Pandey 2018-01-25 04:24:12 UTC
*** Bug 1535326 has been marked as a duplicate of this bug. ***