1647675 – Rebalance could hang under certain circumstances

Keywords:
Status:	CLOSED ERRATA
Alias:	None
Product:	Red Hat Gluster Storage
Classification:	Red Hat Storage
Component:	locks
Sub Component:
Version:	rhgs-3.4
Hardware:	Unspecified
OS:	Unspecified
Priority:	medium
Severity:	urgent
Target Milestone:	---
Target Release:	RHGS 3.4.z Batch Update 2
Assignee:	Susant Kumar Palai
QA Contact:	Prasad Desala
Docs Contact:
URL:
Whiteboard:
Duplicates (1):	1647315
Depends On:
Blocks:

Reported:	2018-11-08 06:13 UTC by Sunil Kumar Acharya
Modified:	2018-12-17 17:07 UTC
CC List:	8 users
Fixed In Version:	glusterfs-3.12.2-27
Doc Text:
Clone Of:
Environment:
Last Closed:	2018-12-17 17:07:27 UTC
Embargoed:
Dependent Products:

Links
System	ID	Private	Priority	Status	Summary	Last Updated
Red Hat Product Errata	RHBA-2018:3827	0	None	None	None	2018-12-17 17:07:35 UTC

Comment 3 Sunil Kumar Acharya 2018-11-08 06:14:51 UTC

Upstream Patch: https://review.gluster.org/#/c/glusterfs/+/21579/

Comment 5 Sunil Kumar Acharya 2018-11-08 06:30:04 UTC

*** Bug 1647315 has been marked as a duplicate of this bug. ***

Comment 11 Amar Tumballi 2018-11-08 08:42:20 UTC

> 2) The patch that is going in is to address a bug that we might hit sometime
> in future, when the (related) code is supported in downstream.. is that
> correct?

Also note that we may hit this bug if the user enables the option below.

> Option: cluster.lock-migration
> Default Value: off
> Description:  If enabled this feature will migrate the posix locks associated with a file during rebalance

Since this is a highly visible Coverity-reported bug, and the process hangs whenever this code path is hit, I recommend including the fix in BU2. Perhaps running Coverity on the downstream build and confirming that it no longer reports this defect is sufficient verification of the fix?

-----
** CID 1396581:  Program hangs  (LOCK)
/xlators/features/locks/src/posix.c: 2952 in pl_metalk()
2946                 gf_msg(this->name, GF_LOG_WARNING, EINVAL, 0,
2947                        "More than one meta-lock can not be granted on"
2948                        "the inode");
2949                 ret = -1;
2950             }
2951         }
>>>     CID 1396581:  Program hangs  (LOCK)
>>>     "pthread_mutex_lock" locks "pl_inode->mutex" while it is locked.
2952         pthread_mutex_lock(&pl_inode->mutex);
2953     
2954         if (ret == -1) {
2955             goto out;
2956         }
2957
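
For readers unfamiliar with this class of defect, here is a minimal sketch of what Coverity is flagging: a critical section that ends with a second pthread_mutex_lock() on a mutex the thread already holds. With a default mutex that second call typically blocks forever, which is the "hang". The sketch uses an error-checking mutex so the relock returns EDEADLK instead of blocking. All names here (demo_inode, relock_instead_of_unlock, lock_then_unlock) are hypothetical and are not taken from posix.c. The assumption that an unlock was intended at that point is mine, not something confirmed by the upstream patch.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for an inode context; only the mutex matters here. */
struct demo_inode {
    pthread_mutex_t mutex;
    int metalk_count;
};

/* Initialise the mutex as PTHREAD_MUTEX_ERRORCHECK so that relocking from
 * the owning thread returns EDEADLK instead of blocking forever. */
static int demo_inode_init(struct demo_inode *in)
{
    pthread_mutexattr_t attr;
    int rc;

    assert(in != NULL);
    rc = pthread_mutexattr_init(&attr);
    if (rc != 0)
        return rc;
    rc = pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ERRORCHECK);
    if (rc == 0)
        rc = pthread_mutex_init(&in->mutex, &attr);
    pthread_mutexattr_destroy(&attr);
    in->metalk_count = 0;
    return rc;
}

/* Shape of the reported defect: the critical section ends with a second
 * lock call. Returns the result of that second call. */
static int relock_instead_of_unlock(struct demo_inode *in)
{
    int rc;

    pthread_mutex_lock(&in->mutex);
    in->metalk_count++;
    rc = pthread_mutex_lock(&in->mutex); /* would hang on a default mutex */
    pthread_mutex_unlock(&in->mutex);    /* release the single hold we own */
    return rc;
}

/* Assumed intended shape: every lock is paired with exactly one unlock. */
static int lock_then_unlock(struct demo_inode *in)
{
    pthread_mutex_lock(&in->mutex);
    in->metalk_count++;
    return pthread_mutex_unlock(&in->mutex);
}
```

On an initialised demo_inode, relock_instead_of_unlock() reports EDEADLK while lock_then_unlock() returns 0. Building test binaries with error-checking mutexes is one way to make such a path fail loudly rather than hang, although it does not replace the static-analysis check discussed above.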

Comment 22 Prasad Desala 2018-11-22 05:31:28 UTC

Thank you, Susant, for the detailed explanation. Based on Comment 21, I am moving this BZ to Verified.

Comment 23 errata-xmlrpc 2018-12-17 17:07:27 UTC

Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2018:3827
