Bug 1638026 - "gluster vol heal <vol name> info" is hung on Distributed-Replicated ( Arbiter )
Summary: "gluster vol heal <vol name> info" is hung on Distributed-Replicated ( Arbiter )
Alias: None
Product: Red Hat Gluster Storage
Classification: Red Hat Storage
Component: arbiter
Version: rhgs-3.4
Hardware: Unspecified
OS: Unspecified
Target Milestone: ---
Target Release: RHGS 3.4.z Async Update
Assignee: Ravishankar N
QA Contact: Vijay Avuthu
Depends On: 1636902 1637802 1637953 1637989 1638159
Reported: 2018-10-10 13:53 UTC by Sunil Kumar Acharya
Modified: 2019-05-24 10:14 UTC

Fixed In Version: glusterfs-3.12.2-18.2
Doc Type: Bug Fix
Doc Text:
Previously, a flaw in the self-heal code caused an inode lock to be taken twice on a file that needed healing but released only once. This left a stale lock on the brick, so any later operation that needed the lock (such as a heal or a write from the client) hung. With this update, inode locks are released correctly and no stale locks are left on the brick, so subsequent heals and client writes no longer hang.
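
The lock imbalance described in the Doc Text can be illustrated with a small toy model. This is only a sketch and is not GlusterFS code: the class `BrickInodeLocks` and the functions `buggy_heal` and `fixed_heal` are hypothetical names invented for illustration, and the real locking logic is more involved.

```python
from collections import defaultdict


class BrickInodeLocks:
    """Toy per-inode lock bookkeeping for one brick (hypothetical, not GlusterFS code)."""

    def __init__(self):
        # inode identifier -> number of locks currently outstanding
        self._held = defaultdict(int)

    def lock(self, inode):
        self._held[inode] += 1

    def unlock(self, inode):
        if self._held.get(inode, 0) == 0:
            raise RuntimeError(f"unlock without a matching lock on {inode!r}")
        self._held[inode] -= 1

    def is_blocked(self, inode):
        # In this model, a new request for the lock waits while any lock is outstanding.
        return self._held.get(inode, 0) > 0


def buggy_heal(locks, inode):
    """Takes the lock twice but releases it once, leaving a stale lock behind."""
    locks.lock(inode)
    locks.lock(inode)
    locks.unlock(inode)


def fixed_heal(locks, inode):
    """Every acquisition has a matching release, so nothing is left behind."""
    locks.lock(inode)
    locks.unlock(inode)
```

In this model, `is_blocked` stays true for the file after `buggy_heal` returns, which corresponds to later heals or client writes waiting indefinitely; after `fixed_heal` it is false.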
Clone Of: 1636902
Last Closed: 2018-10-23 09:11:02 UTC

System ID Private Priority Status Summary Last Updated
Red Hat Product Errata RHBA-2018:2970 0 None None None 2018-10-23 09:11:08 UTC

Comment 5 Vijay Avuthu 2018-10-17 12:23:28 UTC

Build Used: glusterfs-3.12.2-18.2.el7rhgs.x86_64

> Ran the automation case "test_entry_self_heal_heal_command" on a 2 x (2+1) volume and did not see any hangs for "gluster volume heal <volname> info".

> heal info is able to list the files without hanging.

[root@rhsauto049 ~]# gluster vol heal testvol_distributed-replicated info
Brick rhsauto049.lab.eng.blr.redhat.com:/bricks/brick0/testvol_distributed-replicated_brick0
Status: Connected
Number of entries: 0

Brick rhsauto029.lab.eng.blr.redhat.com:/bricks/brick0/testvol_distributed-replicated_brick1
Status: Connected
Number of entries: 0

Brick rhsauto034.lab.eng.blr.redhat.com:/bricks/brick0/testvol_distributed-replicated_brick2
Status: Connected
Number of entries: 0

Brick rhsauto039.lab.eng.blr.redhat.com:/bricks/brick0/testvol_distributed-replicated_brick3
Status: Connected
Number of entries: 9

Brick rhsauto040.lab.eng.blr.redhat.com:/bricks/brick0/testvol_distributed-replicated_brick4
Status: Connected
Number of entries: 9

Brick rhsauto041.lab.eng.blr.redhat.com:/bricks/brick0/testvol_distributed-replicated_brick5
Status: Connected
Number of entries: 1
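
For scripted checks, output shaped like the listing above could be reduced to a per-brick count. The following is a minimal sketch that assumes the plain-text layout shown in this comment ("Brick ..." followed by "Number of entries: N"); the function name `parse_heal_info` is hypothetical, and other output formats are not handled.

```python
def parse_heal_info(text):
    """Return {brick: entry_count} from heal-info text in the layout shown above.

    A non-numeric count is stored as None rather than guessed at.
    """
    counts = {}
    brick = None
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if line.startswith("Brick "):
            brick = line[len("Brick "):]
        elif line.startswith("Number of entries:") and brick is not None:
            value = line.split(":", 1)[1].strip()
            counts[brick] = int(value) if value.isdigit() else None
    return counts


def bricks_with_pending_heals(counts):
    """Return the bricks whose entry count is greater than zero."""
    return [brick for brick, count in counts.items() if count]
```

Applied to the output above, `bricks_with_pending_heals` would return the three bricks (brick3, brick4 and brick5) that still report entries.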

> Since this bug is about the hang, I am changing the status to Verified and will raise a new bug for the pending-heal issue.
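
A "no hang" check like the one in this comment could be scripted by running the command under a timeout. This is a sketch only: the helper name `run_with_timeout` and the 300-second limit in the usage comment are assumptions, not part of the automation case mentioned above.

```python
import subprocess


def run_with_timeout(argv, timeout_seconds):
    """Run argv and return (finished, stdout).

    finished is False if the command did not exit within timeout_seconds,
    which is how a hung CLI call would show up here.
    """
    try:
        result = subprocess.run(
            argv, capture_output=True, text=True, timeout=timeout_seconds
        )
    except subprocess.TimeoutExpired:
        return False, ""
    return True, result.stdout


# Hypothetical usage on a node where the gluster CLI is installed:
#   finished, out = run_with_timeout(
#       ["gluster", "volume", "heal", "testvol_distributed-replicated", "info"], 300
#   )
```

A False first element means only that the command exceeded the chosen limit; a slow but healthy heal info run on a large volume would look the same, so the limit needs to be chosen with that in mind.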

Comment 13 errata-xmlrpc 2018-10-23 09:11:02 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.