Bug 1579788
Summary: | Thin-arbiter: Have the state of volume in memory | |
---|---|---|---
Product: | [Community] GlusterFS | Reporter: | Karthik U S <ksubrahm>
Component: | replicate | Assignee: | Ravishankar N <ravishankar>
Status: | CLOSED CURRENTRELEASE | QA Contact: |
Severity: | high | Docs Contact: |
Priority: | high | |
Version: | mainline | CC: | aspandey, bugs, pasik, ravishankar
Target Milestone: | --- | Keywords: | Reopened, Triaged
Target Release: | --- | |
Hardware: | Unspecified | |
OS: | Unspecified | |
Whiteboard: | | |
Fixed In Version: | glusterfs-6.0 | Doc Type: | If docs needed, set a value
Doc Text: | | Story Points: | ---
Clone Of: | | |
: | 1648205 (view as bug list) | Environment: |
Last Closed: | 2019-03-25 16:30:27 UTC | Type: | Bug
Regression: | --- | Mount Type: | ---
Documentation: | --- | CRM: |
Verified Versions: | | Category: | ---
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: |
Cloudforms Team: | --- | Target Upstream Version: |
Embargoed: | | |
Bug Depends On: | | |
Bug Blocks: | 1648205 | |
Description
Karthik U S
2018-05-18 10:37:16 UTC
REVIEW: https://review.gluster.org/20095 (afr: thin-arbiter 2 domain locking and in-memory state) posted (#1) for review on master by Ravishankar N

REVIEW: https://review.gluster.org/20103 (cluster/afr: Use 2 domain locking in SHD for thin-arbiter) posted (#1) for review on master by Karthik U S

1. This looks more like a feature than a bug fix; the keywords don't reflect this.
2. Severity/Priority?
3. Do we have before/after numbers?

Initial bare-bones MVP-0 patches to get https://github.com/gluster/glusterfs/issues/352 working were sent against the github issue itself. Since then the issue has been closed and we are creating bugs to send fixes and the other MVP milestones (see the document referenced in the github issue). The feature is 'on the way' to becoming demo-worthy.

REVIEW: https://review.gluster.org/20748 (afr: common thin-arbiter functions) posted (#1) for review on master by Ravishankar N

COMMIT: https://review.gluster.org/20748 committed in master by "Ravishankar N" <ravishankar> with a commit message- afr: common thin-arbiter functions

...that can be used by the client and the self-heal daemon, namely:
afr_ta_post_op_lock()
afr_ta_post_op_unlock()

Note: These are not yet consumed. They will be used in the write txn changes patch, which will introduce 2 domain locking.

updates: bz#1579788
Change-Id: I636d50f8fde00736665060e8f9ee4510d5f38795
Signed-off-by: Ravishankar N <ravishankar>

REVIEW: https://review.gluster.org/20994 (afr: thin-arbiter read txn changes) posted (#1) for review on master by Ravishankar N

REVIEW: https://review.gluster.org/21054 (afr: thin-arbiter read txn changes) posted (#1) for review on master by Ravishankar N

COMMIT: https://review.gluster.org/20994 committed in master by "Ravishankar N" <ravishankar> with a commit message- afr: thin-arbiter read txn changes

If both data bricks are up, the read subvol will be chosen based on read_subvols. If only one data brick is up:
- First query the data brick that is up. If it blames the other brick, allow the reads.
- If it doesn't, query the TA to obtain the source of truth.

TODO: See if in-memory state can be maintained for read txns (BZ 1624358).

updates: bz#1579788
Change-Id: I61eec35592af3a1aaf9f90846d9a358b2e4b2fcc
Signed-off-by: Ravishankar N <ravishankar>

REVIEW: https://review.gluster.org/21120 (afr: thin-arbiter 2 domain locking and in-memory state) posted (#1) for review on master by Ravishankar N

COMMIT: https://review.gluster.org/20103 committed in master by "Ravishankar N" <ravishankar> with a commit message- cluster/afr: Use 2 domain locking in SHD for thin-arbiter

With this change, when SHD starts the index crawl it requests all the clients to release the AFR_TA_DOM_NOTIFY lock, so that clients know their in-memory state is no longer valid and any new operation needs to query the thin-arbiter if required.

When SHD completes healing all the files without any failure, it again takes the AFR_TA_DOM_NOTIFY lock and reads the xattrs on the TA to see whether any new failures happened in the meantime. If there are new failures marked on the TA, SHD starts the crawl immediately to heal those failures as well. If there are no new failures, SHD takes the AFR_TA_DOM_MODIFY lock and unsets the xattrs on the TA, so that both data bricks are considered good thereafter.

Change-Id: I037b89a0823648f314580ba0716d877bd5ddb1f1
fixes: bz#1579788
Signed-off-by: karthik-us <ksubrahm>

This bug is getting closed because a release has been made available that should address the reported issue. In case the problem is still not fixed with glusterfs-5.0, please open a new bug report.

glusterfs-5.0 has been announced on the Gluster mailing lists [1]; packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution.
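The read-subvolume selection from the "thin-arbiter read txn changes" commit message can be sketched roughly as follows. This is an illustrative Python model, not the actual glusterfs C code; the names `choose_read_subvol`, `blames`, and `ta_bad_brick` are hypothetical stand-ins for the AFR internals.

```python
def choose_read_subvol(bricks_up, read_subvols, blames, ta_bad_brick):
    """Illustrative model of thin-arbiter read-txn source selection.

    bricks_up:    set of reachable data bricks, e.g. {0, 1}
    read_subvols: bricks considered readable per AFR metadata
    blames:       blames[i] is True if brick i blames the other brick
    ta_bad_brick: brick the thin-arbiter marks bad (None if neither)
    Returns the brick to read from, or None if no source is usable.
    """
    if len(bricks_up) == 2:
        # Both data bricks up: pick from read_subvols as usual.
        return min(read_subvols)
    if len(bricks_up) == 1:
        up = next(iter(bricks_up))
        # First ask the brick that is up; if it blames the other
        # brick, it holds the good copy and reads are allowed.
        if blames[up]:
            return up
        # Otherwise the TA is the source of truth: allow reads only
        # if the TA does not mark the up brick as bad.
        if ta_bad_brick != up:
            return up
    return None  # no trustworthy source; fail the read
```

The key point the commit message makes is the ordering: the up brick is queried first, and the TA round-trip happens only when that brick does not blame its peer.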
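The SHD heal cycle described in the "Use 2 domain locking in SHD" commit message can be modelled as the loop below. This is a minimal synchronous sketch in Python, assuming a hypothetical `ta` handle with lock and xattr helpers; the real implementation is C inside cluster/afr and the locks are network locks, not method calls.

```python
def shd_heal_cycle(ta):
    """Illustrative model of the SHD thin-arbiter heal cycle."""
    while True:
        # Ask clients to drop AFR_TA_DOM_NOTIFY so they invalidate
        # their in-memory "bad brick" state.
        ta.request_notify_lock_release()
        ta.heal_all_files()  # index crawl + heal
        # Re-take NOTIFY and check for failures marked meanwhile.
        ta.take_lock("AFR_TA_DOM_NOTIFY")
        if ta.get_pending_xattrs():
            # New failures were recorded: crawl again immediately.
            ta.release_lock("AFR_TA_DOM_NOTIFY")
            continue
        # No new failures: clear the xattrs under MODIFY so both
        # data bricks are considered good thereafter.
        ta.take_lock("AFR_TA_DOM_MODIFY")
        ta.unset_pending_xattrs()
        ta.release_lock("AFR_TA_DOM_MODIFY")
        ta.release_lock("AFR_TA_DOM_NOTIFY")
        return


class FakeTA:
    """Toy thin-arbiter handle used only to exercise the loop above."""

    def __init__(self, pending_per_check):
        self._pending = pending_per_check  # xattr result per NOTIFY check
        self.log = []

    def request_notify_lock_release(self):
        self.log.append("release-req")

    def heal_all_files(self):
        self.log.append("heal")

    def take_lock(self, dom):
        self.log.append("lock:" + dom)

    def release_lock(self, dom):
        self.log.append("unlock:" + dom)

    def get_pending_xattrs(self):
        return self._pending.pop(0)

    def unset_pending_xattrs(self):
        self.log.append("unset")
```

With one round of new failures recorded during the first crawl (`FakeTA([True, False])`), the model heals twice and unsets the xattrs exactly once, matching the "start the crawl immediately" behaviour the commit describes.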
[1] https://lists.gluster.org/pipermail/announce/2018-October/000115.html
[2] https://www.gluster.org/pipermail/gluster-users/

https://review.gluster.org/#/c/glusterfs/+/20095/ is yet to be merged. Moving back to POST.

COMMIT: https://review.gluster.org/20095 committed in master by "Ravishankar N" <ravishankar> with a commit message- afr: thin-arbiter 2 domain locking and in-memory state

2 domain locking + xattrop for write-txn failures:
--------------------------------------------------
- A post-op wound on the TA takes an AFR_TA_DOM_NOTIFY range lock and a full AFR_TA_DOM_MODIFY lock, does the xattrop on the TA, releases the AFR_TA_DOM_MODIFY lock, and stores in memory which brick is bad.
- All further write txn failures are handled based on this in-memory value, without querying the TA.
- When SHD heals the files, it does so by requesting a full lock on the AFR_TA_DOM_NOTIFY domain. The client uses this as a cue (via upcall), releases its AFR_TA_DOM_NOTIFY range lock and invalidates its in-memory notion of which brick is bad. The next write txn failure is wound on the TA to update the in-memory state again.
- Any write txns that were incomplete when the AFR_TA_DOM_NOTIFY upcall release request arrives are completed before the lock is released.
- Any write txns received after the release request are kept in a ta_waitq.
- After the release is complete, the ta_waitq elements are spliced to a separate queue, which is then processed one by one.
- For fops that come in parallel while the in-memory bad brick is still unknown, only one is wound to the TA on the wire. The others are kept in a ta_onwireq, which is processed after we get the response from the TA.
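The client-side in-memory state and the "only one fop on the wire" rule from the commit message above can be sketched as a small state object. This is a hedged, synchronous Python model: the names `TAClientState`, `wind_to_ta`, and `notify_release` are hypothetical, the real ta_waitq/ta_onwireq handling is asynchronous C code in xlators/cluster/afr, and the queue-drain here is deliberately simplified.

```python
from collections import deque


class TAClientState:
    """Toy model of a client's thin-arbiter write-txn failure handling."""

    def __init__(self, wind_to_ta):
        self.bad_brick = None      # in-memory "which brick is bad"
        self.on_wire = False       # is a TA query already in flight?
        self.ta_onwireq = deque()  # fops parked behind the in-flight query
        self.wind_to_ta = wind_to_ta  # callable: queries the TA, returns bad brick

    def handle_write_failure(self, fop):
        if self.bad_brick is not None:
            # In-memory state is valid: no TA round-trip needed.
            return self.bad_brick
        if self.on_wire:
            # A query is already on the wire; park this fop (ta_onwireq).
            self.ta_onwireq.append(fop)
            return None
        # Only this one fop is wound to the TA on the wire.
        self.on_wire = True
        self.bad_brick = self.wind_to_ta(fop)
        self.on_wire = False
        # Drain the parked fops now that the state is known.
        while self.ta_onwireq:
            self.ta_onwireq.popleft()
        return self.bad_brick

    def notify_release(self):
        # SHD requested the AFR_TA_DOM_NOTIFY release (via upcall):
        # invalidate the in-memory notion of which brick is bad.
        self.bad_brick = None
```

After the first failure populates `bad_brick`, later failures are answered from memory; only an SHD-triggered `notify_release()` forces the next failure back onto the wire, which is the behaviour the bullet list describes.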
Change-Id: I32c7b61a61776663601ab0040e2f0767eca1fd64
updates: bz#1579788
Signed-off-by: Ravishankar N <ravishankar>
Signed-off-by: Ashish Pandey <aspandey>

REVIEW: https://review.gluster.org/21758 (afr: thin-arbiter 2 domain locking and in-memory state) posted (#2) for review on release-5 by Ravishankar N

REVISION POSTED: https://review.gluster.org/21758 (afr: thin-arbiter 2 domain locking and in-memory state) posted (#3) for review on release-5 by Ravishankar N

This bug is getting closed because a release has been made available that should address the reported issue. In case the problem is still not fixed with glusterfs-6.0, please open a new bug report.

glusterfs-6.0 has been announced on the Gluster mailing lists [1]; packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution.

[1] https://lists.gluster.org/pipermail/announce/2019-March/000120.html
[2] https://www.gluster.org/pipermail/gluster-users/