Bug 1363721
| Summary: | [HC]: After bringing down and up of the bricks VM's are getting paused | |||
|---|---|---|---|---|
| Product: | [Community] GlusterFS | Reporter: | Krutika Dhananjay <kdhananj> | |
| Component: | replicate | Assignee: | Krutika Dhananjay <kdhananj> | |
| Status: | CLOSED CURRENTRELEASE | QA Contact: | ||
| Severity: | high | Docs Contact: | ||
| Priority: | high | |||
| Version: | mainline | CC: | bugs, mzywusko, pkarampu, rhs-bugs, rmekala, sabose, sasundar, storage-qa-internal | |
| Target Milestone: | --- | Keywords: | Triaged | |
| Target Release: | --- | |||
| Hardware: | Unspecified | |||
| OS: | Unspecified | |||
| Whiteboard: | ||||
| Fixed In Version: | glusterfs-3.9.0 | Doc Type: | If docs needed, set a value | |
| Doc Text: | Story Points: | --- | ||
| Clone Of: | 1333406 | |||
| : | 1367270 1367272 | Environment: | ||
| Last Closed: | 2017-03-27 18:25:08 UTC | Type: | Bug | |
| Regression: | --- | Mount Type: | --- | |
| Documentation: | --- | CRM: | ||
| Verified Versions: | Category: | --- | ||
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | ||
| Cloudforms Team: | --- | Target Upstream Version: | ||
| Embargoed: | ||||
| Bug Depends On: | 1333406 | |||
| Bug Blocks: | 1367270, 1367272 | |||
|
Description
Krutika Dhananjay
2016-08-03 12:24:42 UTC
REVIEW: http://review.gluster.org/15080 (cluster/afr: Prevent split-brain when bricks are brought on and off in cyclic order) posted (#1) for review on master by Krutika Dhananjay (kdhananj)

REVIEW: http://review.gluster.org/15080 (cluster/afr: Prevent split-brain when bricks are brought off and on in cyclic order) posted (#2) for review on master by Krutika Dhananjay (kdhananj)

REVIEW: http://review.gluster.org/15080 (cluster/afr: Prevent split-brain when bricks are brought off and on in cyclic order) posted (#3) for review on master by Krutika Dhananjay (kdhananj)

REVIEW: http://review.gluster.org/15080 (cluster/afr: Prevent split-brain when bricks are brought off and on in cyclic order) posted (#4) for review on master by Krutika Dhananjay (kdhananj)

REVIEW: http://review.gluster.org/15080 (cluster/afr: Prevent split-brain when bricks are brought off and on in cyclic order) posted (#5) for review on master by Krutika Dhananjay (kdhananj)

REVIEW: http://review.gluster.org/15080 (cluster/afr: Prevent split-brain when bricks are brought off and on in cyclic order) posted (#6) for review on master by Krutika Dhananjay (kdhananj)

REVIEW: http://review.gluster.org/15145 (cluster/afr: Bug fixes in txn codepath) posted (#1) for review on master by Krutika Dhananjay (kdhananj)

REVIEW: http://review.gluster.org/15145 (cluster/afr: Bug fixes in txn codepath) posted (#2) for review on master by Krutika Dhananjay (kdhananj)

COMMIT: http://review.gluster.org/15145 committed in master by Pranith Kumar Karampuri (pkarampu)
------
commit 79b9ad3dfa146ef29ac99bf87d1c31f5a6fe1fef
Author: Krutika Dhananjay <kdhananj>
Date: Fri Aug 5 12:18:05 2016 +0530

cluster/afr: Bug fixes in txn codepath

AFR sets transaction.pre_op[] array even before actually doing the pre-op on-disk. Therefore, AFR must not only consider the pre_op[] array but also the failed_subvols[] information before setting the pre_op_done[] flag. This patch fixes that.
Change-Id: I78ccd39106bd4959441821355a82572659e3affb
BUG: 1363721
Signed-off-by: Krutika Dhananjay <kdhananj>
Reviewed-on: http://review.gluster.org/15145
Smoke: Gluster Build System <jenkins.org>
Reviewed-by: Ravishankar N <ravishankar>
Reviewed-by: Pranith Kumar Karampuri <pkarampu>
Reviewed-by: Anuradha Talur <atalur>
CentOS-regression: Gluster Build System <jenkins.org>
NetBSD-regression: NetBSD Build System <jenkins.org>

REVIEW: http://review.gluster.org/15080 (cluster/afr: Prevent split-brain when bricks are brought off and on in cyclic order) posted (#7) for review on master by Krutika Dhananjay (kdhananj)

REVIEW: http://review.gluster.org/15080 (cluster/afr: Prevent split-brain when bricks are brought off and on in cyclic order) posted (#8) for review on master by Krutika Dhananjay (kdhananj)

REVIEW: http://review.gluster.org/15080 (cluster/afr: Prevent split-brain when bricks are brought off and on in cyclic order) posted (#9) for review on master by Krutika Dhananjay (kdhananj)

REVIEW: http://review.gluster.org/15080 (cluster/afr: Prevent split-brain when bricks are brought off and on in cyclic order) posted (#10) for review on master by Krutika Dhananjay (kdhananj)

REVIEW: http://review.gluster.org/15080 (cluster/afr: Prevent split-brain when bricks are brought off and on in cyclic order) posted (#11) for review on master by Krutika Dhananjay (kdhananj)

REVIEW: http://review.gluster.org/15080 (cluster/afr: Prevent split-brain when bricks are brought off and on in cyclic order) posted (#12) for review on master by Pranith Kumar Karampuri (pkarampu)

REVIEW: http://review.gluster.org/15080 (cluster/afr: Prevent split-brain when bricks are brought off and on in cyclic order) posted (#13) for review on master by Pranith Kumar Karampuri (pkarampu)

REVIEW: http://review.gluster.org/15080 (cluster/afr: Prevent split-brain when bricks are brought off and on in cyclic order) posted (#14) for review on master by Pranith Kumar Karampuri (pkarampu)

REVIEW: http://review.gluster.org/15080 (cluster/afr: Prevent split-brain when bricks are brought off and on in cyclic order) posted (#15) for review on master by Pranith Kumar Karampuri (pkarampu)

COMMIT: http://review.gluster.org/15080 committed in master by Pranith Kumar Karampuri (pkarampu)
------
commit fcb5b70b1099d0379b40c81f35750df8bb9545a5
Author: Krutika Dhananjay <kdhananj>
Date: Thu Jul 28 21:29:59 2016 +0530

cluster/afr: Prevent split-brain when bricks are brought off and on in cyclic order

When the bricks are brought offline and then online in cyclic order while writes are in progress on a file, thanks to inode refresh in write txns, AFR will mostly fail the write attempt when the only good copy is offline. However, there is still a remote possibility that the file will run into split-brain if the brick that has the lone good copy goes offline *after* the inode refresh but *before* the write txn completes (I call it in-flight split-brain in the patch for ease of reference), requiring intervention from admin to resolve the split-brain before the IO can resume normally on the file.

To get around this, the patch does the following things:
i) retains the dirty xattrs on the file
ii) avoids marking the last of the good copies as bad (or accused) in case it is the one to go down during the course of a write.
iii) fails that particular write with the appropriate errno.

This way, we still have one good copy left despite the split-brain situation which when it is back online, will be chosen as source to do the heal.
Change-Id: I9ca634b026ac830b172bac076437cc3bf1ae7d8a
BUG: 1363721
Signed-off-by: Krutika Dhananjay <kdhananj>
Reviewed-on: http://review.gluster.org/15080
Tested-by: Pranith Kumar Karampuri <pkarampu>
Smoke: Gluster Build System <jenkins.org>
CentOS-regression: Gluster Build System <jenkins.org>
Reviewed-by: Ravishankar N <ravishankar>
Reviewed-by: Oleksandr Natalenko <oleksandr>
NetBSD-regression: NetBSD Build System <jenkins.org>
Reviewed-by: Pranith Kumar Karampuri <pkarampu>

This bug is getting closed because a release has been made available that should address the reported issue. In case the problem is still not fixed with glusterfs-3.9.0, please open a new bug report.

glusterfs-3.9.0 has been announced on the Gluster mailinglists [1], packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailinglist [2] and the update infrastructure for your distribution.

[1] http://lists.gluster.org/pipermail/gluster-users/2016-November/029281.html
[2] https://www.gluster.org/pipermail/gluster-users/ |
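For readers following the two commit messages in this report, the decisions they describe can be modelled in a few lines. This is only a rough sketch in Python, not AFR code (AFR is written in C). The names `pre_op_done`, `finish_write`, `WriteOutcome`, `readable`, `failed`, `keep_dirty` and `accused` are hypothetical, chosen for illustration. The commit message only says the write fails "with the appropriate errno", so `EIO` below is a placeholder assumption.

```python
import errno
from dataclasses import dataclass
from typing import List


def pre_op_done(pre_op: List[bool], failed_subvols: List[bool]) -> List[bool]:
    """Model of the first fix: a pre-op counts as done on a brick only if it
    was attempted there AND that brick has not been recorded as failed."""
    return [p and not f for p, f in zip(pre_op, failed_subvols)]


@dataclass
class WriteOutcome:
    op_errno: int       # 0 on success, an errno value on failure
    keep_dirty: bool    # True if the dirty marker should be retained
    accused: List[int]  # indices of bricks to be marked bad


def finish_write(readable: List[bool], failed: List[bool]) -> WriteOutcome:
    """Model of the second fix.

    readable[i]: brick i held a good copy at inode-refresh time.
    failed[i]:   the write failed on brick i during the transaction.
    """
    n = len(readable)
    # Good copies that also received this write.
    survivors = [i for i in range(n) if readable[i] and not failed[i]]
    if survivors:
        # Usual case: a good copy took the write, so accuse the failed bricks.
        return WriteOutcome(0, False, [i for i in range(n) if failed[i]])
    # In-flight split-brain: every brick that was good before the write
    # missed it. Do not accuse those bricks, keep the dirty marker, and
    # fail the write (EIO is a placeholder, see the note above).
    accused = [i for i in range(n) if failed[i] and not readable[i]]
    return WriteOutcome(errno.EIO, True, accused)


# Brick 0 is the lone good copy and goes down mid-write.
outcome = finish_write([True, False, False], [True, False, False])
print(outcome.keep_dirty, outcome.accused)
```

In this model the lone good copy (brick 0) is never added to `accused`, which mirrors the commit's stated goal: once that brick is back online it can still be chosen as the heal source.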