Description of problem:
We believe we are hitting the following issues, which have been fixed in upstream Gluster but have not yet made it into RHGS:

http://lists.gluster.org/pipermail/gluster-users/2017-August/032225.html
https://review.gluster.org/#/c/16468/
https://bugzilla.redhat.com/show_bug.cgi?id=1475789
https://review.gluster.org/#/c/16772/9/xlators/cluster/ec/src/ec-heal.c
https://bugzilla.redhat.com/show_bug.cgi?id=1427159

Several customers are hitting combinations of EC bugs that appear to have been fixed recently upstream. Some of those upstream EC fixes have not yet made it downstream, so I propose backporting any upstream EC fix that could be relevant. I am fairly confident we are hitting some combination of these bugs in several different environments.

Version-Release number of selected component (if applicable):
RHGS 3.3.1

How reproducible:
Frequently.

Steps to Reproduce:
1. Induce healing on an EC volume through disconnects.
2. Stat files, write new data, append, or read.
3. EC healing doesn't complete due to various issues.

Actual results:
Bricks are unable to fully heal.

Expected results:
Normal operation.

Additional info:
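One way to confirm step 3 (healing not completing) is to poll the output of `gluster volume heal <VOLNAME> info` and check whether the per-brick pending counts ever reach zero. The following is a minimal sketch, not something taken from the affected environments: the function names are hypothetical, and it assumes the usual per-brick layout of that command's output (a `Brick <host>:<path>` line followed later by a `Number of entries:` line).

```python
import re


def pending_heal_entries(heal_info_output):
    """Map each brick to its pending heal count.

    Assumes text in the per-brick layout printed by
    `gluster volume heal <VOLNAME> info`. A brick whose count is not a
    number (for example "-" when the brick is unreachable) maps to None.
    """
    counts = {}
    brick = None
    for raw in heal_info_output.splitlines():
        line = raw.strip()
        if line.startswith("Brick "):
            brick = line[len("Brick "):]
            continue
        m = re.match(r"Number of entries:\s*(\S+)", line)
        if m and brick is not None:
            value = m.group(1)
            counts[brick] = int(value) if value.isdigit() else None
            brick = None
    return counts


def heal_complete(heal_info_output):
    """True only if at least one brick was parsed and every brick reports 0."""
    counts = pending_heal_entries(heal_info_output)
    return bool(counts) and all(n == 0 for n in counts.values())
```

If `heal_complete` keeps returning False across repeated polls after the bricks have reconnected, that matches the behaviour described in "Actual results".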
*** Bug 1535326 has been marked as a duplicate of this bug. ***