Bug 1707259 - Volume heal for block hosting volume is pending for over 4 hours
Summary: Volume heal for block hosting volume is pending for over 4 hours
Keywords:
Status: CLOSED WORKSFORME
Alias: None
Product: Red Hat Gluster Storage
Classification: Red Hat Storage
Component: replicate
Version: ocs-3.11
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: ---
Target Release: ---
Assignee: Ravishankar N
QA Contact: Nag Pavan Chilakam
URL:
Whiteboard:
Depends On:
Blocks: 1672543
Reported: 2019-05-07 07:28 UTC by Rachael
Modified: 2020-09-11 14:05 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of: 1672543
Environment:
Last Closed: 2020-09-11 14:05:06 UTC
Embargoed:



Comment 16 Yaniv Kaul 2019-11-25 10:10:11 UTC
What's the next step here?

Comment 17 Ravishankar N 2019-11-25 10:39:57 UTC
(In reply to Yaniv Kaul from comment #16)
> What's the next step here?

(Copying from https://bugzilla.redhat.com/show_bug.cgi?id=1721355#c9)
<snip> We need to add better eager-lock debugging infra to AFR. Since https://review.gluster.org/19503, AFR maintains multiple queues for fops for performance gains, and it is currently difficult to gain insight into when eager locks are acquired and released while a lot of I/O is being pumped. </snip>

Also, we need to write some Python gdb helper scripts (my question on Stack Overflow (comment #13) has since received an answer) to isolate threads of interest in a multi-hundred-thread process. I'm currently working on the lock-less heal info bug and can pick this up once that is complete.
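
As a rough illustration of the kind of helper meant here (not the script from the Stack Overflow answer, and not anything that exists in the gluster tree), the sketch below filters the text output of gdb's `thread apply all bt` down to the threads whose backtrace contains a frame matching a pattern. The function names `split_threads`, `threads_matching`, and `threads_of_interest_live` are hypothetical, and the `afr_` pattern in the usage note is only an assumed way of spotting AFR frames.

```python
import re

# Header gdb prints before each thread's backtrace in `thread apply all bt`,
# for example: "Thread 12 (Thread 0x7f3a... (LWP 4321)):"
THREAD_HEADER = re.compile(r"^Thread (\d+) \(")


def split_threads(bt_text):
    """Map gdb thread number -> list of non-empty backtrace lines."""
    threads = {}
    current = None
    for line in bt_text.splitlines():
        match = THREAD_HEADER.match(line)
        if match:
            current = int(match.group(1))
            threads[current] = []
        elif current is not None and line.strip():
            threads[current].append(line)
    return threads


def threads_matching(bt_text, pattern):
    """Return sorted thread numbers with at least one frame matching pattern."""
    rx = re.compile(pattern)
    return sorted(
        num
        for num, frames in split_threads(bt_text).items()
        if any(rx.search(frame) for frame in frames)
    )


def threads_of_interest_live(pattern):
    """Hypothetical wrapper for use inside gdb's embedded Python only."""
    import gdb  # this module exists only inside a running gdb session

    text = gdb.execute("thread apply all bt", to_string=True)
    return threads_matching(text, pattern)
```

Outside gdb, `threads_matching(saved_output, r"afr_")` can be run on a backtrace dump saved to a file; inside gdb, after sourcing the script, `python print(threads_of_interest_live(r"afr_"))` would list the candidate thread numbers, which can then be inspected one by one with `thread N` and `bt`.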

