+++ This bug was initially created as a clone of Bug #1438255 +++

Problem:
In afr-v2, self-blaming xattrs are not there by design. But if the FOP failed on a brick due to an error other than ENOTCONN (or even due to ENOTCONN, but we regained connection before postop was wound), we wind the post-op also on the failed brick, leading to setting self-blaming xattrs on that brick. This can lead to undesired results like healing of files in split-brain etc.

Fix:
If a fop failed on a brick on which pre-op was successful, do not perform post-op on it. This also produces the desired effect of not resetting the dirty xattr on the brick, which is how it should be because if the fop failed on a brick, there is no reason to clear the dirty bit which actually serves as an indication of the failure.

--- Additional comment from Worker Ant on 2017-04-02 09:12:51 EDT ---

REVIEW: https://review.gluster.org/16976 (afr: don't do a post-op on a brick if op failed) posted (#1) for review on master by Ravishankar N (ravishankar)

--- Additional comment from Worker Ant on 2017-04-05 00:49:26 EDT ---

REVIEW: https://review.gluster.org/16976 (afr: don't do a post-op on a brick if op failed) posted (#2) for review on master by Ravishankar N (ravishankar)

--- Additional comment from Worker Ant on 2017-04-10 07:37:37 EDT ---

REVIEW: https://review.gluster.org/16976 (afr: don't do a post-op on a brick if op failed) posted (#3) for review on master by Ravishankar N (ravishankar)

--- Additional comment from Worker Ant on 2017-04-12 12:57:36 EDT ---

REVIEW: https://review.gluster.org/16976 (afr: don't do a post-op on a brick if op failed) posted (#4) for review on master by Ravishankar N (ravishankar)

--- Additional comment from Worker Ant on 2017-04-14 06:38:17 EDT ---

REVIEW: https://review.gluster.org/16976 (afr: don't do a post-op on a brick if op failed) posted (#5) for review on master by Ravishankar N (ravishankar)

--- Additional comment from Worker Ant on 2017-04-17 01:57:00 EDT ---

REVIEW: https://review.gluster.org/16976 (afr: don't do a post-op on a brick if op failed) posted (#6) for review on master by Ravishankar N (ravishankar)

--- Additional comment from Worker Ant on 2017-04-18 22:29:33 EDT ---

COMMIT: https://review.gluster.org/16976 committed in master by Pranith Kumar Karampuri (pkarampu)
------
commit 10dad995c989e9d77c341135d7c48817baba966c
Author: Ravishankar N <ravishankar>
Date: Sun Apr 2 18:08:04 2017 +0530

    afr: don't do a post-op on a brick if op failed

    Problem:
    In afr-v2, self-blaming xattrs are not there by design. But if the FOP failed on a brick due to an error other than ENOTCONN (or even due to ENOTCONN, but we regained connection before postop was wound), we wind the post-op also on the failed brick, leading to setting self-blaming xattrs on that brick. This can lead to undesired results like healing of files in split-brain etc.

    Fix:
    If a fop failed on a brick on which pre-op was successful, do not perform post-op on it. This also produces the desired effect of not resetting the dirty xattr on the brick, which is how it should be because if the fop failed on a brick, there is no reason to clear the dirty bit which actually serves as an indication of the failure.

    Change-Id: I5f1caf4d1b39f36cf8093ccef940118638caa9c4
    BUG: 1438255
    Signed-off-by: Ravishankar N <ravishankar>
    Reviewed-on: https://review.gluster.org/16976
    Smoke: Gluster Build System <jenkins.org>
    NetBSD-regression: NetBSD Build System <jenkins.org>
    CentOS-regression: Gluster Build System <jenkins.org>
    Reviewed-by: Pranith Kumar Karampuri <pkarampu>
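
For readers who want the rule from the fix in concrete form, here is a minimal Python sketch. It is not the AFR code, which is written in C. The `Brick` class, its fields, and both functions are hypothetical names made up for this illustration, and the model only tracks which bricks receive the post-op and what happens to their dirty marker. It assumes the pre-op sets the dirty marker and the post-op clears it.

```python
import errno
from dataclasses import dataclass


@dataclass
class Brick:
    """Hypothetical per-brick transaction state (not a GlusterFS structure)."""
    name: str
    pre_op_done: bool = False    # pre-op succeeded on this brick
    fop_errno: int = 0           # 0 means the FOP succeeded on this brick
    dirty: bool = False          # assumed to be set by the pre-op
    post_op_wound: bool = False  # True once the post-op was sent to this brick


def bricks_for_post_op(bricks):
    """Select bricks for the post-op: pre-op succeeded AND the FOP succeeded.

    The errno value is deliberately not inspected. A brick that failed with
    ENOTCONN and reconnected later is skipped just like one that failed
    with EIO.
    """
    return [b for b in bricks if b.pre_op_done and b.fop_errno == 0]


def run_post_op(bricks):
    """Wind the post-op only on the selected bricks and clear their dirty marker.

    Bricks on which the FOP failed are left untouched, so their dirty marker
    stays set as an indication of the failure.
    """
    for b in bricks_for_post_op(bricks):
        b.post_op_wound = True
        b.dirty = False
    return bricks


if __name__ == "__main__":
    replicas = [
        Brick("brick-0", pre_op_done=True, fop_errno=0, dirty=True),
        Brick("brick-1", pre_op_done=True, fop_errno=errno.EIO, dirty=True),
        Brick("brick-2", pre_op_done=True, fop_errno=errno.ENOTCONN, dirty=True),
    ]
    for b in run_post_op(replicas):
        print(b.name, "post-op wound:", b.post_op_wound, "dirty:", b.dirty)
```

Because the selection is driven by the FOP result rather than by the connection state at post-op time, a failed brick is never asked to update its own xattrs, which is the self-blaming case described in the problem statement.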
REVIEW: https://review.gluster.org/17082 (afr: don't do a post-op on a brick if op failed) posted (#1) for review on release-3.8 by Ravishankar N (ravishankar)
COMMIT: https://review.gluster.org/17082 committed in release-3.8 by Niels de Vos (ndevos)
------
commit a6d313d12c98cf533c6bbb10f491dd2ec48ca89c
Author: Ravishankar N <ravishankar>
Date: Wed Apr 19 16:40:05 2017 +0530

    afr: don't do a post-op on a brick if op failed

    Problem:
    In afr-v2, self-blaming xattrs are not there by design. But if the FOP failed on a brick due to an error other than ENOTCONN (or even due to ENOTCONN, but we regained connection before postop was wound), we wind the post-op also on the failed brick, leading to setting self-blaming xattrs on that brick. This can lead to undesired results like healing of files in split-brain etc.

    Fix:
    If a fop failed on a brick on which pre-op was successful, do not perform post-op on it. This also produces the desired effect of not resetting the dirty xattr on the brick, which is how it should be because if the fop failed on a brick, there is no reason to clear the dirty bit which actually serves as an indication of the failure.

    > Reviewed-on: https://review.gluster.org/16976
    > Smoke: Gluster Build System <jenkins.org>
    > NetBSD-regression: NetBSD Build System <jenkins.org>
    > CentOS-regression: Gluster Build System <jenkins.org>
    > Reviewed-by: Pranith Kumar Karampuri <pkarampu>

    Change-Id: I5f1caf4d1b39f36cf8093ccef940118638caa9c4
    BUG: 1443319
    Signed-off-by: Ravishankar N <ravishankar>
    Reviewed-on: https://review.gluster.org/17082
    Smoke: Gluster Build System <jenkins.org>
    NetBSD-regression: NetBSD Build System <jenkins.org>
    CentOS-regression: Gluster Build System <jenkins.org>
    Reviewed-by: Pranith Kumar Karampuri <pkarampu>
This bug is being closed because a release has been made available that should address the reported issue. If the problem is still not fixed with glusterfs-3.8.12, please open a new bug report.

glusterfs-3.8.12 has been announced on the Gluster mailing lists [1]. Packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution.

[1] https://lists.gluster.org/pipermail/announce/2017-May/000072.html
[2] https://www.gluster.org/pipermail/gluster-users/