Bug 1549606 - Eager lock should be present for both metadata and data transactions
Summary: Eager lock should be present for both metadata and data transactions
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: GlusterFS
Classification: Community
Component: replicate
Version: mainline
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: unspecified
Target Milestone: ---
Assignee: bugs@gluster.org
QA Contact:
URL:
Whiteboard:
Depends On: 1499644
Blocks:
 
Reported: 2018-02-27 13:36 UTC by Pranith Kumar K
Modified: 2018-06-20 18:01 UTC
CC List: 9 users

Fixed In Version: glusterfs-v4.1.0
Doc Type: If docs needed, set a value
Doc Text:
Clone Of: 1499644
Environment:
Last Closed: 2018-06-20 18:01:20 UTC
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Embargoed:



Comment 1 Pranith Kumar K 2018-02-27 13:38:15 UTC
    cluster/afr: Make AFR eager-locking similar to EC

    Problem:
    1) AFR's eager-lock only works for data transactions.
    2) When there are conflicting writes, a write to a conflicting region triggers
    an unlock of the eager-lock, leading to extra pre-ops and post-ops on the file.
    With eager-lock off, a random-write workload in AFR incurs extra fsyncs.

    Solution (modeled after EC):
    In EC, a conflicting write waits for the in-flight write to complete before it
    is wound. This utilizes the network and disk better, because no extra xattrops,
    FSYNCs, or inodelk/unlock calls are issued.

    I tried to model the solution on EC's locking, but it is not identical in AFR
    because backward compatibility had to be preserved.
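
For illustration, here is a minimal sketch of the kind of byte-range overlap check that decides whether an incoming write conflicts with one already covered by the eager lock. The type and function names (afr_region_t, regions_overlap) are hypothetical and do not correspond to the actual AFR source.

    /* region_overlap_sketch.c - hypothetical illustration only; not the
     * actual AFR code. Two writes conflict when their byte ranges overlap,
     * and a conflicting write must wait instead of forcing the eager lock
     * to be dropped and re-acquired (extra pre-op/post-op/fsync). */
    #include <stdbool.h>
    #include <stdio.h>
    #include <sys/types.h>

    typedef struct {
            off_t  offset;   /* start of the write */
            size_t length;   /* number of bytes written */
    } afr_region_t;

    static bool
    regions_overlap(const afr_region_t *a, const afr_region_t *b)
    {
            off_t a_end = a->offset + (off_t)a->length;
            off_t b_end = b->offset + (off_t)b->length;

            return a->offset < b_end && b->offset < a_end;
    }

    int
    main(void)
    {
            afr_region_t w1 = { .offset = 0,    .length = 4096 };
            afr_region_t w2 = { .offset = 8192, .length = 4096 };

            /* Disjoint regions: both writes can share the eager lock. */
            printf("conflict: %d\n", regions_overlap(&w1, &w2));
            return 0;
    }

Non-overlapping writes can then share the same eager lock, which is what avoids the extra xattrops, FSYNCs, and inodelk/unlock round trips described above.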

Comment 2 Worker Ant 2018-02-27 13:42:47 UTC
REVIEW: https://review.gluster.org/19503 (cluster/afr: Make AFR eager-locking similar to EC) posted (#8) for review on master by Pranith Kumar Karampuri

Comment 3 Worker Ant 2018-03-02 05:00:06 UTC
REVIEW: https://review.gluster.org/19655 (cluster/afr: Remove compound-fops usage in afr) posted (#1) for review on master by Pranith Kumar Karampuri

Comment 4 Worker Ant 2018-03-02 14:21:25 UTC
REVIEW: https://review.gluster.org/19661 (cluster/afr: Remove unused code paths) posted (#1) for review on master by Pranith Kumar Karampuri

Comment 5 Worker Ant 2018-03-06 08:49:58 UTC
COMMIT: https://review.gluster.org/19661 committed in master by "Ravishankar N" <ravishankar> with a commit message- cluster/afr: Remove unused code paths

Removed:
1) afr-v1 self-heal lock related code, which is not used anymore.
2) Data types in the transaction code that are no longer needed.
3) Lock tracing in AFR, which was never used because gluster's network tracing
does the job.
4) The __changelog_enabled, afr_lock_server_count, etc. functions: changelog is
always enabled and AFR is always used with locks, so they can be deleted.
5) The transaction.fop/done/resume variables: they always call the same
functions, so there is no need for them.

BUG: 1549606
Change-Id: I370c146fec2892d40e674d232a5d7256e003c7f1
Signed-off-by: Pranith Kumar K <pkarampu>

Comment 6 Worker Ant 2018-03-14 13:33:04 UTC
COMMIT: https://review.gluster.org/19503 committed in master by "Pranith Kumar Karampuri" <pkarampu> with a commit message- cluster/afr: Make AFR eager-locking similar to EC

Problem:
1) AFR's eager-lock only works for data transactions.
2) When there are conflicting writes, a write to a conflicting region triggers
an unlock of the eager-lock, leading to extra pre-ops and post-ops on the file.
With eager-lock off, a random-write workload in AFR incurs extra fsyncs.

Solution (modeled after EC):
In EC, a conflicting write waits for the in-flight write to complete before it
is wound. This utilizes the network and disk better, because no extra xattrops,
FSYNCs, or inodelk/unlock calls are issued. The fd-based counters were also
moved to inode-based counters.

I tried to model the solution on EC's locking, but it is not identical in AFR
because backward compatibility had to be preserved.

Lifecycle of lock:
==================
The first transaction is added to the inode->owners list and an inodelk is sent
on the wire. All subsequent transactions are put on the inode->waiters list
until the first transaction has fully completed its inodelk and [f]xattrop.
Once the [f]xattrop completes, each request on the inode->waiters list is
checked for conflicts with the existing locks on the inode->owners list; the
non-conflicting ones are added to inode->owners and resume their transactions.
When these transactions complete their fop phase, they are moved to the
inode->post_op list and the transactions that were paused because of conflicts
are resumed. Post-op and unlock are not issued on the wire until the
transaction is the last one on that inode. When the last transaction has to
perform its post-op, it can choose to sleep for the delayed-post-op-secs value.
If another transaction arrives during that time, it wakes the sleeping
transaction, takes over ownership of the lock, and the cycle continues. If
delayed-post-op-secs expires, the timer thread wakes the sleeping transaction,
which sets lock->release to true and starts the post-op followed by the unlock.
Any transactions that arrive during this window are put on the inode->frozen
list. Once the unlock completes, the frozen list is moved to the waiters list,
the first element of the waiters list is moved to the owners list, the lock is
attempted, and the cycle continues. This is the general idea. There is
additional logic at delay time, on the arrival of a new transaction, and in the
flush fop to wake existing sleeping transactions or to decide whether to delay
a transaction at all; that logic is subject to change with future enhancements.
(A simplified sketch of this state machine follows the commit footer below.)

Fixes: #418
BUG: 1549606
Change-Id: I88b570bbcf332a27c82d2767dfa82472f60055dc
Signed-off-by: Pranith Kumar K <pkarampu>
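
For readers following the lifecycle above, the sketch below models the admission decision for a newly arrived transaction as a small C state machine. It is a simplified illustration, not the actual implementation: all structure fields, helper functions, and names (eager_lock_t, eager_lock_add, and so on) are invented and do not match the real afr_lock_t/afr_local_t code.

    /* eager_lock_sketch.c - a simplified, hypothetical model of the per-inode
     * eager-lock lifecycle described in the commit message above. All names
     * are invented for illustration; they do not match the real AFR code. */
    #include <stdbool.h>
    #include <stddef.h>
    #include <stdio.h>

    struct txn {
            struct txn *next;
            int         id;
    };

    struct txn_list {
            struct txn *head;
    };

    typedef struct {
            struct txn_list owners;      /* hold the lock, fop phase in flight   */
            struct txn_list waiters;     /* admitted once lock + pre-op are done */
            struct txn_list frozen;      /* arrived while unlock was in progress */
            struct txn_list post_op;     /* fop done, pending post-op/unlock     */
            bool            acquired;    /* inodelk sent on the wire             */
            bool            pre_op_done; /* [f]xattrop (pre-op) completed        */
            bool            sleeping;    /* last txn delaying its post-op        */
            bool            release;     /* delay expired, winding down          */
    } eager_lock_t;

    static bool list_empty(struct txn_list *l) { return l->head == NULL; }

    static void list_add(struct txn_list *l, struct txn *t)
    {
            t->next = l->head;
            l->head = t;
    }

    /* Stubs standing in for the real network/disk operations. */
    static void wind_inodelk(struct txn *t)   { printf("txn %d: inodelk\n", t->id); }
    static void resume_fop(struct txn *t)     { printf("txn %d: fop\n", t->id); }
    static void wake_sleeper(eager_lock_t *l) { l->sleeping = false; }

    /* Assume a byte-range overlap test like the one sketched under comment 1. */
    static bool conflicts_with_owners(eager_lock_t *l, struct txn *t)
    {
            (void)l; (void)t;
            return false;   /* placeholder: compare the write regions here */
    }

    /* Where a newly arrived transaction goes, following the lifecycle above. */
    static void eager_lock_add(eager_lock_t *lock, struct txn *txn)
    {
            if (!lock->acquired && list_empty(&lock->owners)) {
                    /* First transaction: take ownership and send the inodelk. */
                    lock->acquired = true;
                    list_add(&lock->owners, txn);
                    wind_inodelk(txn);
            } else if (lock->release) {
                    /* Post-op/unlock already winding down: park in frozen;
                     * the list moves to waiters once the unlock completes. */
                    list_add(&lock->frozen, txn);
            } else if (!lock->pre_op_done) {
                    /* Lock or [f]xattrop still in flight: everything waits. */
                    list_add(&lock->waiters, txn);
            } else if (lock->sleeping) {
                    /* A finished txn is delaying its post-op: wake it and
                     * inherit the lock instead of paying unlock + relock. */
                    wake_sleeper(lock);
                    list_add(&lock->owners, txn);
                    resume_fop(txn);
            } else if (!conflicts_with_owners(lock, txn)) {
                    /* No overlapping region among current owners: run now. */
                    list_add(&lock->owners, txn);
                    resume_fop(txn);
            } else {
                    /* Conflicting region: wait for the current owners. */
                    list_add(&lock->waiters, txn);
            }
    }

    int main(void)
    {
            eager_lock_t lock = {0};
            struct txn a = { .id = 1 }, b = { .id = 2 };

            eager_lock_add(&lock, &a);   /* first txn: sends the inodelk       */
            lock.pre_op_done = true;     /* pretend lock + [f]xattrop finished */
            eager_lock_add(&lock, &b);   /* non-conflicting: shares the lock   */
            return 0;
    }

In the real patch the same decisions are driven by callbacks on lock, pre-op, fop, and post-op completion rather than flags checked at arrival time; the sketch only illustrates the list transitions described above.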

Comment 7 Shyamsundar 2018-06-20 18:01:20 UTC
This bug is being closed because a release that should address the reported issue has been made available. If the problem is still not fixed with glusterfs-v4.1.0, please open a new bug report.

glusterfs-v4.1.0 has been announced on the Gluster mailing lists [1]; packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution.

[1] http://lists.gluster.org/pipermail/announce/2018-June/000102.html
[2] https://www.gluster.org/pipermail/gluster-users/

