1707259 – Volume heal for block hosting volume is pending for over 4 hours

Summary: Volume heal for block hosting volume is pending for over 4 hours

Keywords:
Status:	CLOSED WORKSFORME
Alias:	None
Product:	Red Hat Gluster Storage
Classification:	Red Hat Storage
Component:	replicate
Sub Component:
Version:	ocs-3.11
Hardware:	Unspecified
OS:	Unspecified
Priority:	unspecified
Severity:	high
Target Milestone:	---
Target Release:	---
Assignee:	Ravishankar N
QA Contact:	Nag Pavan Chilakam
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:	1672543

Reported:	2019-05-07 07:28 UTC by Rachael
Modified:	2020-09-11 14:05 UTC
CC List:	19 users
Fixed In Version:
Doc Type:	If docs needed, set a value
Doc Text:
Clone Of:	1672543
Environment:
Last Closed:	2020-09-11 14:05:06 UTC
Embargoed:
Dependent Products:

Comment 16 Yaniv Kaul 2019-11-25 10:10:11 UTC

What's the next step here?

Comment 17 Ravishankar N 2019-11-25 10:39:57 UTC

(In reply to Yaniv Kaul from comment #16)
> What's the next step here?

(Copying from https://bugzilla.redhat.com/show_bug.cgi?id=1721355#c9)
<snip>  we need to add better eager-lock debugging infra to AFR. Since https://review.gluster.org/19503, AFR maintains multiple queues for fops for performance gains and it is a bit difficult as of now to gain insight into when eager locks are acquired and released when there is a lot of I/O being pumped. </snip>

Also, we need to write some Python gdb helper scripts (my question on Stack Overflow (comment #13) has since received an answer) to isolate threads of interest in a multi-hundred-thread process. I'm currently working on the lock-less heal info bug. I can pick this up once that is complete.
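
As an illustration of the kind of helper described above, here is a minimal sketch that works offline on saved `thread apply all bt` output instead of using the in-process gdb Python API. It is not the script referred to in this comment. The function names `split_threads` and `threads_matching`, the sample backtrace, and the frame name `example_lock_wait` are all hypothetical and chosen only for demonstration; a real session would match on whichever AFR lock functions are of interest.

```python
import re
from typing import Dict, List

# Header line gdb prints before each thread's backtrace in the output of
# "thread apply all bt", e.g. 'Thread 12 (Thread 0x7f... (LWP 4321)):'
THREAD_HEADER = re.compile(r"^Thread (\d+) \(")


def split_threads(bt_text: str) -> Dict[int, List[str]]:
    """Group the frame lines of a saved backtrace dump by gdb thread number."""
    threads: Dict[int, List[str]] = {}
    current = None
    for line in bt_text.splitlines():
        match = THREAD_HEADER.match(line)
        if match:
            current = int(match.group(1))
            threads[current] = []
        elif current is not None and line.strip():
            threads[current].append(line.strip())
    return threads


def threads_matching(bt_text: str, pattern: str) -> List[int]:
    """Return the sorted thread numbers whose backtrace has a frame matching pattern."""
    regex = re.compile(pattern)
    return sorted(
        tid
        for tid, frames in split_threads(bt_text).items()
        if any(regex.search(frame) for frame in frames)
    )


# Hypothetical sample dump; the addresses and frame names are made up.
SAMPLE = """\
Thread 3 (Thread 0x7f0000000003 (LWP 103)):
#0  0x00007f00aaaa0001 in pthread_cond_wait () from /lib64/libpthread.so.0
#1  0x00007f00aaaa0002 in example_lock_wait () from example.so

Thread 2 (Thread 0x7f0000000002 (LWP 102)):
#0  0x00007f00aaaa0003 in epoll_wait () from /lib64/libc.so.6

Thread 1 (Thread 0x7f0000000001 (LWP 101)):
#0  0x00007f00aaaa0004 in example_lock_wait () from example.so
"""

if __name__ == "__main__":
    print(threads_matching(SAMPLE, r"example_lock_wait"))
```

Filtering a saved dump keeps the helper usable without a live process attached. The same filtering idea could be applied inside gdb with its Python API, which is not shown here.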
