Bug 1579788 - Thin-arbiter: Have the state of volume in memory
Summary: Thin-arbiter: Have the state of volume in memory
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: GlusterFS
Classification: Community
Component: replicate
Version: mainline
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ---
Assignee: Ravishankar N
QA Contact:
URL:
Whiteboard:
Depends On:
Blocks: 1648205
 
Reported: 2018-05-18 10:37 UTC by Karthik U S
Modified: 2019-03-25 16:30 UTC (History)
4 users

Fixed In Version: glusterfs-6.0
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
: 1648205 (view as bug list)
Environment:
Last Closed: 2019-03-25 16:30:27 UTC
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:



Description Karthik U S 2018-05-18 10:37:16 UTC
Description of problem:
In the current thin-arbiter implementation we do not keep the state of the volume in memory. This forces us to send a request to the thin-arbiter node in every failure scenario, which slows down the transaction.

Keep the state of which brick is good and which is bad in all the clients so that we need not contact the thin-arbiter brick in every failure scenario. Contact the thin-arbiter only when we don't have the state in memory, i.e., if it is the first failure on the client or if it is a failure after SHD heals the files and resets the in-memory copy.
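The proposed caching behaviour can be sketched as a minimal toy model (the enum and function names below are hypothetical illustrations, not the actual AFR structures):

```c
#include <assert.h>

/* Hypothetical sketch of the proposed client-side cache; the real AFR
 * data structures and names differ. */
enum ta_state { TA_UNKNOWN, TA_BRICK0_BAD, TA_BRICK1_BAD };

/* Return 1 if the client must contact the thin-arbiter brick on a
 * failure, 0 if the cached in-memory state suffices. */
static int
ta_must_query (enum ta_state cached)
{
    /* Only the first failure, or the first one after SHD heals the
     * files and resets the cache to TA_UNKNOWN, needs a round-trip
     * to the thin-arbiter. */
    return cached == TA_UNKNOWN;
}
```

Every failure after the first is then answered from the cached state, avoiding the per-transaction round-trip to the thin-arbiter node.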

Version-Release number of selected component (if applicable):


How reproducible:


Steps to Reproduce:
1.
2.
3.

Actual results:


Expected results:


Additional info:

Comment 1 Worker Ant 2018-05-28 13:06:38 UTC
REVIEW: https://review.gluster.org/20095 (afr: thin-arbiter 2 domain locking and in-memory state) posted (#1) for review on master by Ravishankar N

Comment 2 Worker Ant 2018-05-30 12:09:30 UTC
REVIEW: https://review.gluster.org/20103 (cluster/afr: Use 2 domain locking in SHD for thin-arbiter) posted (#1) for review on master by Karthik U S

Comment 3 Yaniv Kaul 2018-07-11 14:26:23 UTC
1. This looks more like a feature than a bug fix? The keyword doesn't reflect this.
2. Severity/Priority?
3. Do we have numbers before/after?

Comment 4 Ravishankar N 2018-07-12 01:00:35 UTC
Initial bare-bones MVP-0 patches to get https://github.com/gluster/glusterfs/issues/352 working were sent against the github issue itself. Since then the issue has been closed and we are creating bugs to send fixes and other MVP milestones (see the document referenced in the github issue). The feature is 'on the way' to becoming demo-worthy.

Comment 5 Worker Ant 2018-08-16 12:07:57 UTC
REVIEW: https://review.gluster.org/20748 (afr: common thin-arbiter functions) posted (#1) for review on master by Ravishankar N

Comment 6 Worker Ant 2018-08-23 06:38:03 UTC
COMMIT: https://review.gluster.org/20748 committed in master by "Ravishankar N" <ravishankar@redhat.com> with a commit message- afr: common thin-arbiter functions

...that can be used by client and self-heal daemon, namely:

afr_ta_post_op_lock()
afr_ta_post_op_unlock()

Note: These are not yet consumed. They will be used in the write txn
changes patch which will introduce 2 domain locking.

updates: bz#1579788
Change-Id: I636d50f8fde00736665060e8f9ee4510d5f38795
Signed-off-by: Ravishankar N <ravishankar@redhat.com>
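A rough sketch of how such paired helpers are consumed (only the two function names come from the commit message; the stub bodies and the ta_post_op caller are illustrative stand-ins for the real network locking and xattrop calls):

```c
/* Stub bodies; the real helpers take AFR/xlator arguments and issue
 * network locks against the thin-arbiter brick. */
static int ta_locked;

static int afr_ta_post_op_lock (void)   { ta_locked = 1; return 0; }
static int afr_ta_post_op_unlock (void) { ta_locked = 0; return 0; }

/* Typical consumer: take the lock, perform the xattrop on the TA,
 * release the lock whether or not the xattrop path is reached. */
static int
ta_post_op (void)
{
    if (afr_ta_post_op_lock () != 0)
        return -1;
    /* ... xattrop on the thin-arbiter would happen here ... */
    return afr_ta_post_op_unlock ();
}
```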

Comment 7 Worker Ant 2018-08-25 14:34:03 UTC
REVIEW: https://review.gluster.org/20994 (afr: thin-arbiter read txn changes) posted (#1) for review on master by Ravishankar N

Comment 8 Worker Ant 2018-08-31 09:57:52 UTC
REVIEW: https://review.gluster.org/21054 (afr: thin-arbiter read txn changes) posted (#1) for review on master by Ravishankar N

Comment 9 Worker Ant 2018-09-05 08:28:54 UTC
COMMIT: https://review.gluster.org/20994 committed in master by "Ravishankar N" <ravishankar@redhat.com> with a commit message- afr: thin-arbiter read txn changes

If both data bricks are up, read subvol will be based on read_subvols.

If only one data brick is up:
- First query the data brick that is up. If it blames the other brick,
allow the reads.

- If it doesn't, query the TA to obtain the source of truth.

TODO: See if in-memory state can be maintained for read txns (BZ 1624358).

updates: bz#1579788
Change-Id: I61eec35592af3a1aaf9f90846d9a358b2e4b2fcc
Signed-off-by: Ravishankar N <ravishankar@redhat.com>
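The read-source selection in this commit message can be modelled as a small decision function (a hypothetical sketch; the real AFR code inspects pending xattrs and read_subvols rather than these toy flags):

```c
/* Toy model of the read-txn source selection described above. */
enum read_src { READ_FROM_UP_BRICK, READ_DENIED, QUERY_TA };

static enum read_src
choose_read_source (int brick0_up, int brick1_up,
                    int up_brick_blames_other)
{
    if (brick0_up && brick1_up)
        return READ_FROM_UP_BRICK;  /* normal read_subvols path */
    if (!brick0_up && !brick1_up)
        return READ_DENIED;         /* no data brick available */
    /* Exactly one data brick is up. */
    if (up_brick_blames_other)
        return READ_FROM_UP_BRICK;  /* it is provably the good copy */
    return QUERY_TA;                /* TA is the source of truth */
}
```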

Comment 10 Worker Ant 2018-09-07 12:21:23 UTC
REVIEW: https://review.gluster.org/21120 (afr: thin-arbiter 2 domain locking and in-memory state) posted (#1) for review on master by Ravishankar N

Comment 11 Worker Ant 2018-09-20 09:19:01 UTC
COMMIT: https://review.gluster.org/20103 committed in master by "Ravishankar N" <ravishankar@redhat.com> with a commit message- cluster/afr: Use 2 domain locking in SHD for thin-arbiter

With this change, when SHD starts the index crawl it requests
all the clients to release the AFR_TA_DOM_NOTIFY lock, so that
clients will know the in-memory state is no longer valid and
any new operations need to query the thin-arbiter if required.

When SHD completes healing all the files without any failure, it
will again take the AFR_TA_DOM_NOTIFY lock and get the xattrs on
the TA to see whether any new failures happened in the meantime.
If there are new failures marked on the TA, SHD will start the crawl
immediately to heal those failures as well. If there are no new
failures, then SHD will take the AFR_TA_DOM_MODIFY lock and unset
the xattrs on the TA, so that both data bricks will be considered
good thereafter.

Change-Id: I037b89a0823648f314580ba0716d877bd5ddb1f1
fixes: bz#1579788
Signed-off-by: karthik-us <ksubrahm@redhat.com>
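The SHD sequence from the commit message can be sketched as a toy state machine (the lock and xattr calls are stand-ins for the real network operations, and all names here are illustrative):

```c
#include <string.h>

static int  ta_xattrs_dirty;   /* new failures recorded on the TA */
static char last_action[32];

static void
shd_crawl (void)
{
    /* Ask clients to release AFR_TA_DOM_NOTIFY so they invalidate
     * their in-memory state, then heal the indexed files. */
    strcpy (last_action, "crawl");
}

static void
shd_finish_heal (void)
{
    /* Re-take AFR_TA_DOM_NOTIFY and inspect the xattrs on the TA. */
    if (ta_xattrs_dirty) {
        ta_xattrs_dirty = 0;   /* failures during the crawl: go again */
        shd_crawl ();
    } else {
        /* Take AFR_TA_DOM_MODIFY and unset the xattrs so both data
         * bricks are treated as good from here on. */
        strcpy (last_action, "unset-xattrs");
    }
}
```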

Comment 12 Shyamsundar 2018-10-23 15:09:12 UTC
This bug is getting closed because a release has been made available that should address the reported issue. In case the problem is still not fixed with glusterfs-5.0, please open a new bug report.

glusterfs-5.0 has been announced on the Gluster mailing lists [1]; packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution.

[1] https://lists.gluster.org/pipermail/announce/2018-October/000115.html
[2] https://www.gluster.org/pipermail/gluster-users/

Comment 13 Ravishankar N 2018-10-25 01:26:17 UTC
https://review.gluster.org/#/c/glusterfs/+/20095/ is yet to be merged. Moving back to POST

Comment 14 Worker Ant 2018-10-25 12:26:55 UTC
COMMIT: https://review.gluster.org/20095 committed in master by "Ravishankar N" <ravishankar@redhat.com> with a commit message- afr: thin-arbiter 2 domain locking and in-memory state

2 domain locking + xattrop for write-txn failures:
--------------------------------------------------
- A post-op wound on TA takes AFR_TA_DOM_NOTIFY range lock and
AFR_TA_DOM_MODIFY full lock, does xattrop on TA and releases
AFR_TA_DOM_MODIFY lock and stores in-memory which brick is bad.

- All further write txn failures are handled based on this in-memory
value without querying the TA.

- When shd heals the files, it does so by requesting full lock on
AFR_TA_DOM_NOTIFY domain. Client uses this as a cue (via upcall),
releases AFR_TA_DOM_NOTIFY range lock and invalidates its in-memory
notion of which brick is bad. The next write txn failure is wound on TA
to again update the in-memory state.

- Any write txns still incomplete when the AFR_TA_DOM_NOTIFY upcall
release request arrives are completed before the lock is released.

- Any write txns received after the release request are maintained in a ta_waitq.

- After the release is complete, the ta_waitq elements are spliced to a
separate queue which is then processed one by one.

- For fops that come in parallel when the in-memory bad brick is still
unknown, only one is wound to the TA on the wire. The others are
maintained in a ta_onwireq, which is processed after we get the
response from the TA.

Change-Id: I32c7b61a61776663601ab0040e2f0767eca1fd64
updates: bz#1579788
Signed-off-by: Ravishankar N <ravishankar@redhat.com>
Signed-off-by: Ashish Pandey <aspandey@redhat.com>
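The ta_onwireq behaviour described in this commit message can be sketched as a toy model (variable and function names here are illustrative; only ta_onwireq and the brick/lock names come from the commit message):

```c
/* While the bad brick is unknown, only the first failed txn queries
 * the TA on the wire; later failures are parked in ta_onwireq until
 * the response arrives and populates the in-memory state. */
enum bad_brick { BAD_UNKNOWN = -1, BAD_BRICK0 = 0, BAD_BRICK1 = 1 };

static enum bad_brick in_mem_bad = BAD_UNKNOWN;
static int onwire;        /* a txn is already querying the TA */
static int onwireq_len;   /* txns parked in ta_onwireq */

/* Returns 1 if this failed txn is wound to the TA, 0 if it is queued
 * or answered from the in-memory state. */
static int
handle_txn_failure (void)
{
    if (in_mem_bad != BAD_UNKNOWN)
        return 0;          /* answered locally, no TA round-trip */
    if (onwire) {
        onwireq_len++;     /* park behind the in-flight query */
        return 0;
    }
    onwire = 1;            /* first failure: wind to the TA */
    return 1;
}

static void
ta_response (enum bad_brick answer)
{
    in_mem_bad = answer;   /* cache the source of truth */
    onwire = 0;
    onwireq_len = 0;       /* queued txns drain using the cache */
}
```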

Comment 15 Worker Ant 2018-12-03 06:47:49 UTC
REVIEW: https://review.gluster.org/21758 (afr: thin-arbiter 2 domain locking and in-memory state) posted (#2) for review on release-5 by Ravishankar N

Comment 16 Worker Ant 2018-12-03 08:39:00 UTC
REVISION POSTED: https://review.gluster.org/21758 (afr: thin-arbiter 2 domain locking and in-memory state) posted (#3) for review on release-5 by Ravishankar N

Comment 17 Shyamsundar 2019-03-25 16:30:27 UTC
This bug is getting closed because a release has been made available that should address the reported issue. In case the problem is still not fixed with glusterfs-6.0, please open a new bug report.

glusterfs-6.0 has been announced on the Gluster mailing lists [1]; packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution.

[1] https://lists.gluster.org/pipermail/announce/2019-March/000120.html
[2] https://www.gluster.org/pipermail/gluster-users/

