+++ This bug was initially created as a clone of Bug #1334566 +++

Description of problem:
I noticed that nagios was showing heal required in the BAGL test environment, but when I checked on node gprfc085, self heal was 0. However, I ran the following

for node in gprfc085 gprfc086 gprfc087; do pssh -P -t 60 -H $node 'date; gluster vol heal engine info ; sleep 1'; done

and could see that on node 85, self heal was 0 but the other two nodes show shards listed.

Trying to understand why...I did note that for some reason cluster.data-self-heal/entry-self-heal and meta

To date, this issue is ONLY against the 'engine' volume, which is sharded volume and has the hosted_engine vm running on node '86

Version-Release number of selected component (if applicable):

How reproducible:
Each time.

Steps to Reproduce:
1. Run vol heal commands on each node at around the same time
2.
3.

Actual results:
1 node shows the volume is clean, the other 2 invariably report shards in the heal list.

Expected results:
I would expect all nodes to have the same view of heal state

Additional info:
output attached
glusterfs-3.7.9-3 build

--- Additional comment from Sahina Bose on 2016-05-11 10:57:22 EDT ---

Krutika, can you take a look?
REVIEW: http://review.gluster.org/14302 (cluster/afr: Handle non-zero source in heal-info decision) posted (#1) for review on master by Pranith Kumar Karampuri (pkarampu)
Moving this to post as patch was sent.
Hi, Just to understand basically : is this bug harmful to our data?
COMMIT: http://review.gluster.org/14302 committed in master by Pranith Kumar Karampuri (pkarampu)
------
commit 7dc5d73410f0e9f846c593887637001ca43bc4a0
Author: Pranith Kumar K <pkarampu>
Date:   Thu May 12 13:55:44 2016 +0530

    cluster/afr: Handle non-zero source in heal-info decision

    Problem:
    Spurious entries are reported in heal info when the mount is on
    second/third brick of the replica pair because local-child is given
    preference in selecting source. The code is supposed to suggest the
    file needs heal if the (source < 0) (failure code path), but instead
    it is written as if any non-zero value is considered failure.

    Fix:
    Treat +ve source as success case

    BUG: 1335429
    Change-Id: I1be7f9defef2ae03be7eec8d7d49bf34adeca82c
    Signed-off-by: Pranith Kumar K <pkarampu>
    Reviewed-on: http://review.gluster.org/14302
    Reviewed-by: Krutika Dhananjay <kdhananj>
    Reviewed-by: Anuradha Talur <atalur>
    Smoke: Gluster Build System <jenkins.com>
    NetBSD-regression: NetBSD Build System <jenkins.org>
    Reviewed-by: Ravishankar N <ravishankar>
    CentOS-regression: Gluster Build System <jenkins.com>
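The condition described in the commit message can be modeled in a few lines. This is a minimal Python sketch, not the AFR C code from the patch: the function names are hypothetical, and it assumes only what the commit message states, namely that `source` is the index of the brick chosen as heal source (0, 1, 2 in a three-way replica) and that a negative value marks the failure path.

```python
# Simplified model of the heal-info decision described in the commit message.
# Assumption: `source` is the index of the brick picked as heal source,
# or a negative value when source selection failed.
# The function names are illustrative and do not exist in GlusterFS.


def needs_heal_before_fix(source: int) -> bool:
    """Pre-fix behaviour as described: any non-zero source is treated as failure."""
    return source != 0


def needs_heal_after_fix(source: int) -> bool:
    """Post-fix behaviour as described: only a negative source is a failure."""
    return source < 0


if __name__ == "__main__":
    # Bricks 0, 1 and 2 as valid sources, plus -1 for the failure path.
    for source in (0, 1, 2, -1):
        print(
            f"source={source:>2} "
            f"before_fix={needs_heal_before_fix(source)} "
            f"after_fix={needs_heal_after_fix(source)}"
        )
```

Under this model, a valid source index of 1 or 2 is reported as needing heal before the fix and as clean after it, which is consistent with one node showing a clean volume while the other two listed shards.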
(In reply to Nicolas Ecarnot from comment #3)
> Hi,
>
> Just to understand basically : is this bug harmful to our data?

Not at all. It is just wrong reporting.
This bug is getting closed because a release has been made available that should address the reported issue. In case the problem is still not fixed with glusterfs-3.9.0, please open a new bug report.

glusterfs-3.9.0 has been announced on the Gluster mailinglists [1], packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailinglist [2] and the update infrastructure for your distribution.

[1] http://lists.gluster.org/pipermail/gluster-users/2016-November/029281.html
[2] https://www.gluster.org/pipermail/gluster-users/