Bug 1647675

Summary: Rebalance could hang under certain circumstances
Product: [Red Hat Storage] Red Hat Gluster Storage
Reporter: Sunil Kumar Acharya <sheggodu>
Component: locks
Assignee: Susant Kumar Palai <spalai>
Status: CLOSED ERRATA
QA Contact: Prasad Desala <tdesala>
Severity: urgent
Docs Contact:
Priority: medium
Version: rhgs-3.4
CC: atumball, rcyriac, rhs-bugs, sanandpa, sankarshan, saraut, sheggodu, spalai
Target Milestone: ---
Keywords: Regression, ZStream
Target Release: RHGS 3.4.z Batch Update 2
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version: glusterfs-3.12.2-27
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2018-12-17 17:07:27 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Comment 3 Sunil Kumar Acharya 2018-11-08 06:14:51 UTC
Upstream Patch: https://review.gluster.org/#/c/glusterfs/+/21579/

Comment 5 Sunil Kumar Acharya 2018-11-08 06:30:04 UTC
*** Bug 1647315 has been marked as a duplicate of this bug. ***

Comment 11 Amar Tumballi 2018-11-08 08:42:20 UTC
> 2) The patch that is going in is to address a bug that we might hit sometime
> in future, when the (related) code is supported in downstream.. is that
> correct?

Also note that we may hit this bug if the user enables the option below.

> Option: cluster.lock-migration
> Default Value: off
> Description:  If enabled this feature will migrate the posix locks associated with a file during rebalance

Since this is a highly visible Coverity-reported bug, and the process hangs whenever this code path is hit, I recommend including the fix in BU2. Perhaps running Coverity on the downstream build and confirming that it no longer reports this defect would be good enough to verify the fix?

-----
*** CID 1396581:  Program hangs  (LOCK)
/xlators/features/locks/src/posix.c: 2952 in pl_metalk()
2946                 gf_msg(this->name, GF_LOG_WARNING, EINVAL, 0,
2947                        "More than one meta-lock can not be granted on"
2948                        "the inode");
2949                 ret = -1;
2950             }
2951         }
>>>     CID 1396581:  Program hangs  (LOCK)
>>>     "pthread_mutex_lock" locks "pl_inode->mutex" while it is locked.
2952         pthread_mutex_lock(&pl_inode->mutex);
2953     
2954         if (ret == -1) {
2955             goto out;
2956         }
2957
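
The defect class Coverity flags above (a second pthread_mutex_lock on a mutex the same thread already holds) can be reproduced in isolation. The following is a minimal sketch, not the GlusterFS code and not the upstream patch: struct demo_inode and all demo_* functions are hypothetical names, and it assumes the intended call at the flagged line was pthread_mutex_unlock. It uses a PTHREAD_MUTEX_ERRORCHECK mutex so that the relock returns EDEADLK instead of hanging, which is what a default mutex may do.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif

#include <assert.h>
#include <errno.h>
#include <pthread.h>

/* Hypothetical stand-in for the real inode context; only what the demo needs. */
struct demo_inode {
    pthread_mutex_t mutex;
    int metalk_held; /* 1 while a meta-lock is granted */
};

/* Initialise with an error-checking mutex so that a relock by the owning
 * thread reports EDEADLK rather than blocking. Returns 0 on success. */
int demo_inode_init(struct demo_inode *inode)
{
    pthread_mutexattr_t attr;
    int ret;

    inode->metalk_held = 0;
    ret = pthread_mutexattr_init(&attr);
    if (ret != 0)
        return ret;
    ret = pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ERRORCHECK);
    if (ret == 0)
        ret = pthread_mutex_init(&inode->mutex, &attr);
    pthread_mutexattr_destroy(&attr);
    return ret;
}

/* Flagged pattern: the critical section is closed with lock, not unlock.
 * Returns the result of the second lock call. */
int demo_relock_instead_of_unlock(struct demo_inode *inode)
{
    int second, rc;

    pthread_mutex_lock(&inode->mutex);
    /* ... critical section ... */
    second = pthread_mutex_lock(&inode->mutex); /* should have been unlock */

    /* The failed relock did not acquire anything, so one unlock suffices. */
    rc = pthread_mutex_unlock(&inode->mutex);
    assert(rc == 0);
    (void)rc;
    return second;
}

/* Assumed corrected pattern: grant at most one meta-lock, then unlock.
 * Returns 0 when granted, -1 when a meta-lock is already held. */
int demo_take_metalock(struct demo_inode *inode)
{
    int ret = 0;

    pthread_mutex_lock(&inode->mutex);
    if (inode->metalk_held)
        ret = -1;
    else
        inode->metalk_held = 1;
    pthread_mutex_unlock(&inode->mutex);

    return ret;
}
```

After demo_inode_init, calling demo_relock_instead_of_unlock on the same thread returns EDEADLK, while demo_take_metalock returns 0 on the first call and -1 on the second without leaving the mutex held.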

Comment 22 Prasad Desala 2018-11-22 05:31:28 UTC
Thank you, Susant, for the detailed explanation. Based on Comment 21, moving this BZ to Verified.

Comment 23 errata-xmlrpc 2018-12-17 17:07:27 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2018:3827