Bug 1170913 - [AFR-V2] - Eliminate inodelks taken by shd during metadata self-heal in self-heal domain
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: GlusterFS
Classification: Community
Component: replicate
Version: mainline
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Assignee: Krutika Dhananjay
QA Contact:
URL:
Whiteboard:
Depends On:
Blocks: 1171077
Reported: 2014-12-05 05:38 UTC by Krutika Dhananjay
Modified: 2015-05-14 17:45 UTC (History)

Fixed In Version: glusterfs-3.7.0
Clone Of:
: 1171077 (view as bug list)
Environment:
Last Closed: 2015-05-14 17:28:39 UTC
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Embargoed:


Description Krutika Dhananjay 2014-12-05 05:38:26 UTC
Description of problem:

In AFR-V2, self-heal daemons first acquire full locks in the self-heal domain to ensure that only one of them enters the critical section and heals a given file/directory.

Before AFR-V2, the same exclusion was achieved by holding full locks in the xlator domain (the domain in which normal modification FOPs take locks in the clients' I/O path) until the given file/directory was healed. The disadvantage of that approach was that clients in the normal I/O path had to wait until the file/dir was healed. In cases where the to-be-healed files are really large (think VM images), the clients would perceive a hung file system with nothing working.

Therefore, to eliminate this, self-heal domain locks were introduced exclusively for use by the self-heal daemon.

However, metadata self-heal is a relatively fast operation, irrespective of how big the file/directory is. Hence, it should be possible for metadata self-heal to skip locking in the sh-domain altogether and proceed directly to a blocking lock acquisition in the xlator domain, perform the metadata heal (a bunch of getxattrs, setxattrs, removexattrs and a setattr), and then quickly unlock the held locks.

This way,
a. No two healers can concurrently perform metadata healing of a file/dir;
b. The clients themselves cannot enter the critical section to modify metadata while self-heal is in progress;
c. The clients in the normal I/O path are not blocked for long when doing metadata FOPs.
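
The locking order proposed above can be illustrated with a small toy model. This is only a sketch in Python, not AFR code: the InodeLocks class, the XLATOR_DOMAIN label and the metadata_heal/client_setattr helpers are hypothetical names made up for illustration, and a plain threading.Lock stands in for a full inodelk on one file. It assumes that healers and clients contend on the same xlator-domain lock, and that the metadata heal holds it only for the short critical section.

```python
import threading


class InodeLocks:
    """Toy stand-in for per-domain full locks on ONE inode (not the GlusterFS API)."""

    def __init__(self):
        self._guard = threading.Lock()
        self._domains = {}

    def domain(self, name):
        # Return the single lock object associated with this lock domain.
        with self._guard:
            return self._domains.setdefault(name, threading.Lock())


# Hypothetical label for the domain in which client modification FOPs take locks.
XLATOR_DOMAIN = "xlator-domain"


def metadata_heal(locks, healer, log):
    # No self-heal-domain lock is taken: go straight to a blocking
    # acquisition in the xlator domain.
    with locks.domain(XLATOR_DOMAIN):
        log.append((healer, "begin"))
        # Placeholder for the short heal itself:
        # getxattrs, setxattrs, removexattrs and a setattr.
        log.append((healer, "end"))
    # Leaving the 'with' block is the quick unlock.


def client_setattr(locks, client, log):
    # A client metadata FOP takes the same xlator-domain lock, so it
    # waits only for the duration of a (short) metadata heal.
    with locks.domain(XLATOR_DOMAIN):
        log.append((client, "begin"))
        log.append((client, "end"))


def run_demo():
    locks, log = InodeLocks(), []
    threads = [
        threading.Thread(target=metadata_heal, args=(locks, "shd-1", log)),
        threading.Thread(target=metadata_heal, args=(locks, "shd-2", log)),
        threading.Thread(target=client_setattr, args=(locks, "client", log)),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return log


if __name__ == "__main__":
    for actor, event in run_demo():
        print(actor, event)
```

In this model the begin/end entries of any one actor are never interleaved with another actor's, which corresponds to points a and b; point c follows only to the extent that the work done under the lock stays short. The order in which the three actors get the lock is not fixed.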

How reproducible:
N/A

Comment 1 Anand Avati 2014-12-05 06:15:18 UTC
REVIEW: http://review.gluster.org/9240 (cluster/afr: Eliminate locking in sh domain in metadata self-heal) posted (#1) for review on master by Krutika Dhananjay (kdhananj)

Comment 2 Anand Avati 2014-12-05 09:51:19 UTC
COMMIT: http://review.gluster.org/9240 committed in master by Pranith Kumar Karampuri (pkarampu) 
------
commit 5fdffa7164ffc5a798246411d065259b36658bc3
Author: Krutika Dhananjay <kdhananj>
Date:   Fri Dec 5 11:16:07 2014 +0530

    cluster/afr: Eliminate locking in sh domain in metadata self-heal
    
    Change-Id: I9ef25a17c9a43ba06fac2ad3f7c18cb47de91537
    BUG: 1170913
    Signed-off-by: Krutika Dhananjay <kdhananj>
    Reviewed-on: http://review.gluster.org/9240
    Tested-by: Gluster Build System <jenkins.com>
    Reviewed-by: Pranith Kumar Karampuri <pkarampu>
    Tested-by: Pranith Kumar Karampuri <pkarampu>

Comment 3 Niels de Vos 2015-05-14 17:28:39 UTC
This bug is getting closed because a release has been made available that should address the reported issue. In case the problem is still not fixed with glusterfs-3.7.0, please open a new bug report.

glusterfs-3.7.0 has been announced on the Gluster mailing lists [1]; packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution.

[1] http://thread.gmane.org/gmane.comp.file-systems.gluster.devel/10939
[2] http://thread.gmane.org/gmane.comp.file-systems.gluster.user

