Bug 1398888 - self-heal info command hangs after triggering self-heal
Summary: self-heal info command hangs after triggering self-heal
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: GlusterFS
Classification: Community
Component: replicate
Version: 3.9
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: high
Target Milestone: ---
Assignee: Krutika Dhananjay
QA Contact:
URL:
Whiteboard:
Depends On: 1396166 1398566
Blocks:
 
Reported: 2016-11-27 02:20 UTC by Krutika Dhananjay
Modified: 2017-03-08 10:19 UTC
CC List: 5 users

Fixed In Version: glusterfs-3.9.1
Doc Type: If docs needed, set a value
Doc Text:
Clone Of: 1398566
Environment:
RHV-RHGS HCI
Last Closed: 2017-03-08 10:19:47 UTC
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Embargoed:



Description Krutika Dhananjay 2016-11-27 02:20:55 UTC
+++ This bug was initially created as a clone of Bug #1398566 +++

+++ This bug was initially created as a clone of Bug #1396166 +++

Description of problem:
------------------------
After issuing 'gluster volume heal', 'gluster volume heal info' hangs when compound fops are enabled on the replica 3 volume

Version-Release number of selected component (if applicable):
--------------------------------------------------------------
RHEL 7.3
RHGS 3.2.0 interim build (glusterfs-3.8.4-5.el7rhgs)

How reproducible:
-----------------
Always

Steps to Reproduce:
-------------------
1. Create a replica 3 volume
2. Optimize the volume for the VM store use case
3. Enable compound fops on the volume
4. Create a VM and install an OS on it
5. While the OS installation is in progress, kill brick1 on server1
6. After the VM installation is complete, bring the brick back up
7. Trigger self-heal on the volume
8. Get the self-heal info
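
For reference, the steps above map roughly to the following CLI sequence. The volume, brick, and host names are the ones from this report; the virt option group and the cluster.use-compound-fops option name are assumptions based on the glusterfs 3.8/3.9 CLI and should be checked against the build under test.

# 1-2. Create the replica 3 volume and apply the VM-store (virt) tuning profile
gluster volume create volume1 replica 3 \
    server1:/gluster/brick1/b1 server2:/gluster/brick1/b1 server3:/gluster/brick1/b1
gluster volume set volume1 group virt        # assumption: virt group used for the VM-store tuning
gluster volume start volume1

# 3. Enable compound fops (assumed option name)
gluster volume set volume1 cluster.use-compound-fops on

# 5. While the OS installation is running, kill brick1 on server1
gluster volume status volume1                # note the PID listed for server1:/gluster/brick1/b1
kill <brick1-pid>                            # placeholder PID

# 6. Bring the brick back online once the installation completes
gluster volume start volume1 force

# 7-8. Trigger self-heal, then query heal info (the command that hangs)
gluster volume heal volume1
gluster volume heal volume1 info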

Actual results:
---------------
The 'self-heal info' command hangs

Expected results:
-----------------
'self-heal info' should provide the correct information about un-synced entries

Additional info:
----------------
When compound fops are disabled on the volume, this issue is not seen
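
A hedged sketch of that observation, again assuming the cluster.use-compound-fops option name: the option can be checked and turned back off, which matches the configuration in which the reporter does not see the hang.

gluster volume get volume1 cluster.use-compound-fops       # check the current value
gluster volume set volume1 cluster.use-compound-fops off   # keep compound fops disabled to avoid the hang
gluster volume reset volume1 cluster.use-compound-fops     # or restore the default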

--- Additional comment from SATHEESARAN on 2016-11-17 11:19:40 EST ---

1. Cluster info
---------------
There are 3 hosts in the cluster. All of them are VMs running the RHGS interim build on RHEL 7.3

[root@Server1 ~]# gluster peer status
Number of Peers: 2

Hostname: server2
Uuid: 209154aa-836f-47c1-8446-a5c5d15eb566
State: Peer in Cluster (Connected)

Hostname: server3
Uuid: e88a05e5-7772-4b31-9b7f-a1de1509adb7
State: Peer in Cluster (Connected)

2. gluster volume info
-----------------------
[root@server1 ~]# gluster volume info
 
Volume Name: volume1
Type: Replicate
Volume ID: aa01f3d2-4ba2-4747-893e-84058788f1dd
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x 3 = 3
Transport-type: tcp
Bricks:
Brick1: server1:/gluster/brick1/b1
Brick2: server2:/gluster/brick1/b1
Brick3: server3:/gluster/brick1/b1
Options Reconfigured:
cluster.granular-entry-heal: on
user.cifs: off
network.ping-timeout: 30
performance.strict-o-direct: on
cluster.shd-wait-qlength: 10000
cluster.shd-max-threads: 8
cluster.locking-scheme: granular
cluster.data-self-heal-algorithm: full
performance.low-prio-threads: 32
features.shard-block-size: 512MB
features.shard: on
storage.owner-gid: 107
storage.owner-uid: 107
cluster.server-quorum-type: server
cluster.quorum-type: auto
network.remote-dio: off
cluster.eager-lock: enable
performance.stat-prefetch: off
performance.io-cache: off
performance.read-ahead: off
performance.quick-read: off
transport.address-family: inet
performance.readdir-ahead: on
nfs.disable: on


--- Additional comment from Krutika Dhananjay on 2016-11-22 10:22:39 EST ---

You do have the brick statedump too, don't you? Could you please attach those as well?

-Krutika

--- Additional comment from SATHEESARAN on 2016-11-23 02:04:43 EST ---

(In reply to Krutika Dhananjay from comment #7)
> You do have the brick statedump too, don't you? Could you please attach
> those as well?
> 
> -Krutika

Hi Krutika,

I have mistakenly re-provisioned the third server in the cluster to simulate a failed-node scenario.

But I have the brick statedumps from server1 and server2. I will attach them
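
For anyone reproducing this, brick statedumps of the kind requested above can be generated with the statedump command; by default the dump files are written under /var/run/gluster on each brick host, unless server.statedump-path points elsewhere. The file-name pattern shown in the comment below the command is an assumption based on the usual hyphenated brick path.

gluster volume statedump volume1
ls -lt /var/run/gluster/           # expect files like gluster-brick1-b1.<pid>.dump.<timestamp>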

--- Additional comment from Worker Ant on 2016-11-25 05:36:40 EST ---

REVIEW: http://review.gluster.org/15929 (cluster/afr: Fix deadlock due to compound fops) posted (#1) for review on master by Krutika Dhananjay (kdhananj)

--- Additional comment from Worker Ant on 2016-11-26 06:10:35 EST ---

COMMIT: http://review.gluster.org/15929 committed in master by Pranith Kumar Karampuri (pkarampu) 
------
commit 2fe8ba52108e94268bc816ba79074a96c4538271
Author: Krutika Dhananjay <kdhananj>
Date:   Fri Nov 25 15:54:30 2016 +0530

    cluster/afr: Fix deadlock due to compound fops
    
    When an afr data transaction is eligible for using
    eager-lock, this information is represented in
    local->transaction.eager_lock_on. However, if non-blocking
    inodelk attempt (which is a full lock) fails, AFR falls back
    to blocking locks which are range locks. At this point,
    local->transaction.eager_lock[] per brick is reset but
    local->transaction.eager_lock_on is still true.
    When AFR decides to compound post-op and unlock, it is after
    confirming that the transaction did not use eager lock (well,
    except for a small bug where local->transaction.locks_acquired[]
    is not considered).
    
    But within afr_post_op_unlock_do(), afr again incorrectly sets
    the lock range to full-lock based on local->transaction.eager_lock_on
    value. This is a bug and can lead to deadlock since the locks acquired
    were range locks and a full unlock is being sent leading to unlock failure
    and thereby every other lock request (be it from SHD or other clients or
    glfsheal) getting blocked forever and the user perceives a hang.
    
    FIX:
    Unconditionally rely on the range locks in inodelk object for unlocking
    when using compounded post-op + unlock.
    
    Big thanks to Pranith for helping with the debugging.
    
    Change-Id: Idb4938f90397fb4bd90921f9ae6ea582042e5c67
    BUG: 1398566
    Signed-off-by: Krutika Dhananjay <kdhananj>
    Reviewed-on: http://review.gluster.org/15929
    Smoke: Gluster Build System <jenkins.org>
    NetBSD-regression: NetBSD Build System <jenkins.org>
    CentOS-regression: Gluster Build System <jenkins.org>
    Reviewed-by: Pranith Kumar Karampuri <pkarampu>
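
If the deadlock described in this commit is hit, it should be visible in a brick statedump as inodelk entries stuck in the blocked state under the locks translator section. A rough way to spot them is sketched below; the statedump file name and the exact section/field names are assumptions and vary slightly across glusterfs versions.

grep -n 'BLOCKED' /var/run/gluster/gluster-brick1-b1.*.dump.*                  # blocked inodelk/entrylk requests
grep -n 'xlator.features.locks' /var/run/gluster/gluster-brick1-b1.*.dump.*    # lock sections per inode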

Comment 1 Worker Ant 2016-11-27 02:23:56 UTC
REVIEW: http://review.gluster.org/15932 (cluster/afr: Fix deadlock due to compound fops) posted (#1) for review on release-3.9 by Krutika Dhananjay (kdhananj)

Comment 2 Worker Ant 2016-11-27 08:38:03 UTC
COMMIT: http://review.gluster.org/15932 committed in release-3.9 by Pranith Kumar Karampuri (pkarampu) 
------
commit 13725b3f30f90a11771602c546875eb70831ae5d
Author: Krutika Dhananjay <kdhananj>
Date:   Fri Nov 25 15:54:30 2016 +0530

    cluster/afr: Fix deadlock due to compound fops
    
            Backport of: http://review.gluster.org/15929
    
    When an afr data transaction is eligible for using
    eager-lock, this information is represented in
    local->transaction.eager_lock_on. However, if non-blocking
    inodelk attempt (which is a full lock) fails, AFR falls back
    to blocking locks which are range locks. At this point,
    local->transaction.eager_lock[] per brick is reset but
    local->transaction.eager_lock_on is still true.
    When AFR decides to compound post-op and unlock, it is after
    confirming that the transaction did not use eager lock (well,
    except for a small bug where local->transaction.locks_acquired[]
    is not considered).
    
    But within afr_post_op_unlock_do(), afr again incorrectly sets
    the lock range to full-lock based on local->transaction.eager_lock_on
    value. This is a bug and can lead to deadlock since the locks acquired
    were range locks and a full unlock is being sent leading to unlock failure
    and thereby every other lock request (be it from SHD or other clients or
    glfsheal) getting blocked forever and the user perceives a hang.
    
    FIX:
    Unconditionally rely on the range locks in inodelk object for unlocking
    when using compounded post-op + unlock.
    
    Big thanks to Pranith for helping with the debugging.
    
    Change-Id: I2edcc13ac00bc1ba2e3558891ba98d0cd410b47a
    BUG: 1398888
    Signed-off-by: Krutika Dhananjay <kdhananj>
    Reviewed-on: http://review.gluster.org/15932
    Smoke: Gluster Build System <jenkins.org>
    NetBSD-regression: NetBSD Build System <jenkins.org>
    CentOS-regression: Gluster Build System <jenkins.org>
    Reviewed-by: Pranith Kumar Karampuri <pkarampu>

Comment 3 Kaushal 2017-03-08 10:19:47 UTC
This bug is getting closed because a release has been made available that should address the reported issue. In case the problem is still not fixed with glusterfs-3.9.1, please open a new bug report.

glusterfs-3.9.1 has been announced on the Gluster mailing lists [1], and packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution.

[1] http://lists.gluster.org/pipermail/gluster-users/2017-January/029725.html
[2] https://www.gluster.org/pipermail/gluster-users/
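
To confirm whether an installation already carries the fix, checking the installed version against 3.9.1 is enough, for example:

glusterfs --version | head -n1     # should report glusterfs 3.9.1 or later
rpm -q glusterfs-server            # on RPM-based hosts such as the RHEL/RHGS servers in this report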

