Bug 1774011 - Heal Info is hung when I/O is in progress on a gluster block volume
Summary: Heal Info is hung when I/O is in progress on a gluster block volume
Keywords:
Status: CLOSED NEXTRELEASE
Alias: None
Product: GlusterFS
Classification: Community
Component: replicate
Version: mainline
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Assignee: Ravishankar N
QA Contact:
URL:
Whiteboard:
Depends On:
Blocks: 1783858
Reported: 2019-11-19 12:38 UTC by Ravishankar N
Modified: 2019-12-16 05:37 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Clones: 1783858
Environment:
Last Closed: 2019-12-12 14:01:14 UTC
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:



Links
System ID Priority Status Summary Last Updated
Gluster.org Gerrit 23688 None Merged afr: make heal info lockless 2019-11-28 07:17:29 UTC
Gluster.org Gerrit 23766 None Open Revert "afr: make heal info lockless" 2019-11-28 15:29:19 UTC
Gluster.org Gerrit 23771 None Merged afr: make heal info lockless 2019-12-12 14:01:13 UTC

Description Ravishankar N 2019-11-19 12:38:15 UTC
This bug was initially created as a copy of Bug #1721355

I am copying this bug because: 
I'd like to send a patch against it.


Description of problem:
I observed this issue while working on BZ 1707259.
With no self-heals pending but heavy I/O in progress on a replicate volume with the gluster-block profile enabled, heal-info hung. As soon as the I/O stopped, the command completed successfully. I suspect it is related to eager locking, but I still need to do the RCA.

Version-Release number of selected component (if applicable):
rhgs-3.5.0

How reproducible:
Always on my dev VMs.

Steps to Reproduce:
- Create a 1x3 replica volume (3-node setup).
- Apply the gluster-block profile to the volume (gluster v set $volname group gluster-block).
- Mount a fuse client on another node and run parallel 'dd's:
for i in {1..20}; do dd if=/dev/urandom of=FILE_$i bs=1024 count=102400 & done
- After 10-20 seconds, while the I/O is still going on, run the heal-info command (gluster volume heal $volname info). It will hang.
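
The steps above could be scripted roughly as follows. This is only a sketch: VOLNAME, MOUNT and HEAL_TIMEOUT are names assumed here for illustration and are not part of the report, and wrapping heal info in timeout(1) is just one possible way to detect the hang without blocking the terminal.

```shell
#!/bin/sh
# Sketch only: VOLNAME, MOUNT and HEAL_TIMEOUT are assumed names, not from the report.
VOLNAME="${VOLNAME:-testvol}"        # hypothetical volume name
MOUNT="${MOUNT:-/mnt/glusterfs}"     # hypothetical FUSE mount point
HEAL_TIMEOUT="${HEAL_TIMEOUT:-60}"   # seconds to wait before treating heal info as hung

# Start 20 background writers of 100 MiB each on the FUSE mount.
start_io() {
    for i in $(seq 1 20); do
        dd if=/dev/urandom of="$MOUNT/FILE_$i" bs=1024 count=102400 2>/dev/null &
    done
}

# Run heal info under timeout(1); GNU timeout exits with 124 if it had to kill the command.
check_heal_info() {
    rc=0
    timeout "$HEAL_TIMEOUT" gluster volume heal "$VOLNAME" info || rc=$?
    if [ "$rc" -eq 124 ]; then
        echo "heal info did not finish within ${HEAL_TIMEOUT}s" >&2
    fi
    return "$rc"
}
```

start_io would be called on the client node, and check_heal_info 10-20 seconds later on a node where the gluster CLI is available. An exit status of 124 from check_heal_info would then correspond to the hang described here.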


Actual results:
The heal info command hangs.

Expected results:
The heal info command should complete without hanging, even while I/O is in progress.

Comment 1 Worker Ant 2019-11-19 12:53:35 UTC
REVIEW: https://review.gluster.org/23688 (afr: make heal info lockless) posted (#2) for review on master by Ravishankar N

Comment 2 Worker Ant 2019-11-28 07:17:30 UTC
REVIEW: https://review.gluster.org/23688 (afr: make heal info lockless) merged (#5) on master by Ravishankar N

Comment 3 Ravishankar N 2019-11-28 07:23:55 UTC
I merged the patch by mistake before it was reviewed. I have sent a revert at https://review.gluster.org/#/c/glusterfs/+/23766/

Comment 4 Worker Ant 2019-11-28 08:13:00 UTC
REVIEW: https://review.gluster.org/23766 (Revert "afr: make heal info lockless") posted (#2) for review on master by Ravishankar N

Comment 5 Worker Ant 2019-11-28 15:29:20 UTC
REVIEW: https://review.gluster.org/23766 (Revert "afr: make heal info lockless") merged (#2) on master by Ravishankar N

Comment 6 Worker Ant 2019-11-29 01:20:22 UTC
REVIEW: https://review.gluster.org/23771 (afr: make heal info lockless) posted (#1) for review on master by Ravishankar N

Comment 7 Worker Ant 2019-12-12 14:01:14 UTC
REVIEW: https://review.gluster.org/23771 (afr: make heal info lockless) merged (#5) on master by Ravishankar N

