Bug 1294675 - Healing queue rarely empty
Summary: Healing queue rarely empty
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: GlusterFS
Classification: Community
Component: glusterd
Version: 3.7.6
Hardware: x86_64
OS: Linux
Priority: medium
Severity: medium
Target Milestone: ---
Assignee: Pranith Kumar K
QA Contact:
URL:
Whiteboard:
Depends On:
Blocks: 1322850
 
Reported: 2015-12-29 15:11 UTC by Nicolas Ecarnot
Modified: 2016-06-28 12:13 UTC (History)
3 users

Fixed In Version: glusterfs-3.7.12
Doc Type: Bug Fix
Doc Text:
Clone Of:
: 1322850 (view as bug list)
Environment:
Last Closed: 2016-06-28 12:13:22 UTC
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Embargoed:



Description Nicolas Ecarnot 2015-12-29 15:11:52 UTC
Description of problem:
From the command line of each host, and now through constant monitoring by our Nagios/Centreon setup, we see that our 3-node replica-3 Gluster storage volume is very frequently healing files, not to say constantly.

Version-Release number of selected component (if applicable):
Our setup: 3 CentOS 7.2 nodes with Gluster 3.7.6 in replica-3, used as storage+compute for an oVirt 3.5.6 DC.

How reproducible:
Install an oVirt setup on 3 nodes with GlusterFS as direct gluster storage.
We have only 3 VMs running on it, so approximately no more than 8 files (yes: only 8 files - the VM qemu image files).

Steps to Reproduce:
1. Just run it and watch: all is fine
2. Run "gluster volume heal some_vol info" on random nodes
3. Observe that more than zero files are being healed

Actual results:
More than zero files are getting healed

Expected results:
I expected the "Number of entries" of every node to appear in the graph as a flat zero line most of the time, except for the rare cases of a node reboot, after which healing is launched and takes some minutes (sometimes hours) but does its job.
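
For reference, a minimal sketch of the kind of per-volume check behind these graphs (illustrative only: the volume name, the threshold and the parsing of the "Number of entries:" lines are placeholders, not our exact plugin):

    #!/bin/bash
    # Illustrative heal-info check for one volume (placeholder name and threshold).
    VOL="some_vol"
    THRESHOLD=0
    # Sum the "Number of entries: N" values reported for each brick.
    ENTRIES=$(gluster volume heal "$VOL" info | awk '/^Number of entries:/ {sum += $4} END {print sum + 0}')
    if [ "$ENTRIES" -gt "$THRESHOLD" ]; then
        echo "WARNING: $ENTRIES entries pending heal on $VOL"
        exit 1
    fi
    echo "OK: healing queue empty on $VOL"
    exit 0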

Additional info:
At first, I found out that I had forgotten to bump up the cluster.op-version, but this has since been done, and everything has been rebooted and is back up.
But this DC is very lightly used, and I'm sure the Gluster clients (which are the Gluster nodes themselves) should read and write synchronously and properly, without creating any need for healing.

Please see :
https://www.mail-archive.com/gluster-users@gluster.org/msg22890.html

Comment 1 Pranith Kumar K 2016-01-11 09:45:59 UTC
hi Nicolas Ecarnot,
      Thanks for raising the bug. "gluster volume heal <volname> info" is designed to be run one at a time per volume. If multiple instances run in parallel, they may report "Possibly undergoing heal" messages, as the two try to take the same locks and fail.
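
For illustration only, a minimal sketch of one way to avoid overlapping runs on a node (the lock file path is a placeholder, and flock only serializes checks on the same host, not across nodes):

    #!/bin/bash
    # Illustrative wrapper: run at most one "heal info" per volume at a time on this host.
    VOL="$1"
    LOCK="/var/run/heal-info-${VOL}.lock"
    # -n: give up immediately if another check already holds the lock.
    exec flock -n "$LOCK" gluster volume heal "$VOL" info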

Pranith

Comment 2 Nicolas Ecarnot 2016-01-11 09:48:11 UTC
(In reply to Pranith Kumar K from comment #1)
> hi Nicolas Ecarnot,
>       Thanks for raising the bug. "gluster volume heal <volname> info" is
> designed to be run one at a time per volume. If multiple instances run in
> parallel, they may report "Possibly undergoing heal" messages, as the two
> try to take the same locks and fail.
> 
> Pranith

Thank you Pranith for your answer.

Do you advise us to set up our Nagios/Centreon to run only *ONE* check per volume?
If so, please don't close this bug; let us change the setup, wait one week, and I'll report the result here.

Tell me.

Comment 3 Pranith Kumar K 2016-01-18 10:56:32 UTC
hi Nicolas Ecarnot,
      Sorry for the delay. Sure, doing that will definitely help. There could still be one corner case of the self-heal daemon and heal info conflicting over the same locks. But I would like to hear more from you.

Pranith

Comment 4 Nicolas Ecarnot 2016-01-18 13:35:28 UTC
(In reply to Pranith Kumar K from comment #3)
> hi Nicolas Ecarnot,
>       Sorry for the delay. Sure, doing that will definitely help. There
> could still be one corner case of the self-heal daemon and heal info
> conflicting over the same locks. But I would like to hear more from you.
> 
> Pranith

On January 12, 2016, we modified our Nagios/Centreon setup to offset the checks of our 3 nodes' healing status.

Two weeks later, the graphs are showing a great decrease in healing cases, though not down to zero.
This sounds encouraging.

Having recently been told about sharding, that is the next feature to try, to see whether it could improve the healing situation (see the option sketch below).
I'll let you decide if this is enough to close this bug - my opinion is that I'm still surprised that the healing count is *not* constantly zero, but it's your call.
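
For the record, a minimal sketch of the sharding options we intend to try (option names as we understand them for 3.7; the block size is only an example value, and existing files are not resharded, only newly created ones):

    # Illustrative: enable sharding on the volume (example block size).
    gluster volume set some_vol features.shard on
    gluster volume set some_vol features.shard-block-size 64MB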

Comment 5 Vijay Bellur 2016-04-20 11:54:10 UTC
REVIEW: http://review.gluster.org/14039 (cluster/afr: Fix spurious entries in heal info) posted (#1) for review on release-3.7 by Pranith Kumar Karampuri (pkarampu)

Comment 6 Vijay Bellur 2016-04-20 16:46:59 UTC
COMMIT: http://review.gluster.org/14039 committed in release-3.7 by Pranith Kumar Karampuri (pkarampu) 
------
commit c55da44b49f0183948f464dae4a5e11d9ed63a24
Author: Pranith Kumar K <pkarampu>
Date:   Thu Mar 31 14:40:09 2016 +0530

    cluster/afr: Fix spurious entries in heal info
    
    Problem:
    Locking schemes in afr-v1 locked the directory/file completely during
    self-heal. Newer locking schemes don't require full directory/file locking.
    But afr-v2 still has compatibility code to work well with older clients,
    where entry self-heal takes a lock on a special 256-character name which
    can't be created on the fs. Similarly, for data self-heal there used to be
    a lock on (LLONG_MAX-2, 1). The old locking scheme requires heal info to
    take sh-domain locks before examining heal state. If it doesn't take
    sh-domain locks, then there is a possibility of heal info hanging till
    self-heal completes because of the compatibility locks. But the problem
    with heal info taking sh-domain locks is that if two heal-info processes,
    or shd and heal-info, try to inspect heal state in parallel using trylocks
    on the sh-domain, there is a possibility that both of them assume a heal
    is in progress. This was leading to spurious entries being shown in
    heal info.
    
    Fix:
    As long as the afr-v1 way of locking is present, we can't fix this problem
    with simple solutions. If we know that the cluster is running the newer
    locking schemes, we can give accurate information in heal info. So
    introduce a new option called 'locking-scheme' which, if set to
    'granular', will give correct information in heal info. In addition, the
    extra network hops for taking compatibility locks and sh-domain locks in
    heal info are no longer necessary, which improves performance.
    
     >BUG: 1322850
     >Change-Id: Ia563c5f096b5922009ff0ec1c42d969d55d827a3
     >Signed-off-by: Pranith Kumar K <pkarampu>
     >Reviewed-on: http://review.gluster.org/13873
     >Smoke: Gluster Build System <jenkins.com>
     >NetBSD-regression: NetBSD Build System <jenkins.org>
     >CentOS-regression: Gluster Build System <jenkins.com>
     >Reviewed-by: Ashish Pandey <aspandey>
     >Reviewed-by: Anuradha Talur <atalur>
     >Reviewed-by: Krutika Dhananjay <kdhananj>
     >(cherry picked from commit b6a0780d86e7c6afe7ae0d9a87e6fe5c62b4d792)
    
    Change-Id: If7eee18843b48bbeff4c1355c102aa572b2c155a
    BUG: 1294675
    Signed-off-by: Pranith Kumar K <pkarampu>
    Reviewed-on: http://review.gluster.org/14039
    Reviewed-by: Krutika Dhananjay <kdhananj>
    Smoke: Gluster Build System <jenkins.com>
    NetBSD-regression: NetBSD Build System <jenkins.org>
    CentOS-regression: Gluster Build System <jenkins.com>

Comment 7 Pranith Kumar K 2016-04-21 11:39:33 UTC
Nicolas Ecarnot,
Please use the command:
"gluster volume set <volname> locking-scheme granular" for things to work properly. This fix will be available in 3.7.12.

Pranith

Comment 8 Kaushal 2016-06-28 12:13:22 UTC
This bug is getting closed because a release has been made available that should address the reported issue. In case the problem is still not fixed with glusterfs-3.7.12, please open a new bug report.

glusterfs-3.7.12 has been announced on the Gluster mailing lists [1]; packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution.

[1] https://www.gluster.org/pipermail/gluster-devel/2016-June/049918.html
[2] http://thread.gmane.org/gmane.comp.file-systems.gluster.user

