1721355 – Heal Info is hung when I/O is in progress on a gluster block volume

Summary: Heal Info is hung when I/O is in progress on a gluster block volume

Keywords:
Status:	CLOSED ERRATA
Alias:	None
Product:	Red Hat Gluster Storage
Classification:	Red Hat Storage
Component:	replicate
Sub Component:
Version:	rhgs-3.5
Hardware:	Unspecified
OS:	Unspecified
Priority:	high
Severity:	high
Target Milestone:	---
Target Release:	RHGS 3.5.z Batch Update 3
Assignee:	Ravishankar N
QA Contact:	Sayalee
Docs Contact:
URL:
Whiteboard:
Duplicates (5):	1483977 1643081 1643559 1763596 1812114
Depends On:
Blocks:	1696815 1703695 1787998 1812114

Reported:	2019-06-18 04:56 UTC by Ravishankar N
Modified:	2021-01-07 18:13 UTC
CC List:	16 users
Fixed In Version:	glusterfs-6.0-38
Doc Type:	Known Issue
Doc Text:	Previously, the ‘gluster volume heal $volname info’ command would hang when a client writing to a file already held the lock on it, because the command took blocking locks on files to determine and print whether they needed healing. With this update, the command lists the files that need healing without taking any locks.
Clone Of:
Environment:
Last Closed:	2020-12-17 04:50:16 UTC
Embargoed:
Dependent Products:

Links
System	ID	Private	Priority	Status	Summary	Last Updated
Red Hat Knowledge Base (Article)	5691191	0	None	None	None	2021-01-07 18:13:44 UTC
Red Hat Product Errata	RHBA-2020:5603	0	None	None	None	2020-12-17 04:50:47 UTC

Description Ravishankar N 2019-06-18 04:56:44 UTC

Description of problem:
I observed this issue while working on BZ 1707259.
When there are no self-heals pending, but a lot of I/O is happening on a replicate volume with the gluster-block profile enabled, heal-info hangs. The moment the I/O stopped, the command completed successfully. I'm guessing it has something to do with eager locking, but I need to RCA it.

Version-Release number of selected component (if applicable):
rhgs-3.5.0

How reproducible:
Always on my dev VMs.

Steps to Reproduce:
- Create a 1x3 replica volume (3 node setup)
- Apply the gluster-block profile on the volume. (gluster v set $volname group gluster-block)
- Mount a FUSE client on another node and run parallel 'dd' commands:
for i in {1..20}; do dd if=/dev/urandom of=FILE_$i bs=1024 count=102400 & done
- After 10-20 seconds while the I/O is going on, run heal-info command - It will be hung.
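
The steps above can be scripted so that the hang is detected automatically instead of by watching the terminal. The following is only a rough sketch, not part of the original report: the helper names start_writers and check_hang, the mount point /mnt/glusterfs, and the 60-second limit are all assumptions chosen for illustration. It relies on the GNU coreutils 'timeout' command, which exits with status 124 when it has to kill the command it was running.

```shell
#!/usr/bin/env bash
# Sketch only: start_writers and check_hang are hypothetical helper names.

# Start N background dd writers in directory $1 (expected to be a FUSE
# mount of the volume). $2 = number of writers, $3 = dd block count.
start_writers() {
    local dir=$1 n=${2:-20} count=${3:-102400} i
    for i in $(seq 1 "$n"); do
        dd if=/dev/urandom of="$dir/FILE_$i" bs=1024 count="$count" 2>/dev/null &
    done
}

# Run a command under a time limit (seconds). Prints "hung" if timeout
# had to kill it (exit status 124), otherwise reports the exit status.
check_hang() {
    local limit=$1 rc
    shift
    timeout "$limit" "$@" >/dev/null 2>&1
    rc=$?
    if [ "$rc" -eq 124 ]; then
        echo "hung"
    else
        echo "completed (exit $rc)"
    fi
}

# Usage (assumed mount point and limit; $volname is the test volume):
#   start_writers /mnt/glusterfs 20
#   sleep 15
#   check_hang 60 gluster volume heal "$volname" info
#   wait
```

On an affected build, the expectation from this report is that check_hang prints "hung" while the writers are active and reports completion once the I/O has stopped.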


Actual results:
heal info command is hung

Expected results:
It should not be hung.

Comment 13 Patric Uebele 2019-07-19 09:49:50 UTC

Ok to do it in a later BU

Comment 39 Pranith Kumar K 2019-12-03 08:33:27 UTC

*** Bug 1483977 has been marked as a duplicate of this bug. ***

Comment 40 Pranith Kumar K 2019-12-03 08:34:01 UTC

*** Bug 1643559 has been marked as a duplicate of this bug. ***

Comment 41 Pranith Kumar K 2019-12-03 08:34:48 UTC

*** Bug 1643081 has been marked as a duplicate of this bug. ***

Comment 42 Pranith Kumar K 2019-12-03 08:34:49 UTC

*** Bug 1763596 has been marked as a duplicate of this bug. ***

Comment 44 Karthik U S 2020-03-30 09:25:46 UTC

*** Bug 1812114 has been marked as a duplicate of this bug. ***

Comment 57 Arthy Loganathan 2020-11-18 11:28:31 UTC

As per comment 51, verified the AFR in-service upgrade scenarios of gluster, and it is working as expected.

Comment 64 errata-xmlrpc 2020-12-17 04:50:16 UTC

Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (glusterfs bug fix and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:5603
