Bug 1721355 - Heal Info is hung when I/O is in progress on a gluster block volume
Summary: Heal Info is hung when I/O is in progress on a gluster block volume
Keywords:
Status: POST
Alias: None
Product: Red Hat Gluster Storage
Classification: Red Hat
Component: replicate
Version: rhgs-3.5
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ---
Target Release: RHGS 3.5.z Batch Update 3
Assignee: Ravishankar N
QA Contact: nchilaka
URL:
Whiteboard:
Duplicates: 1483977 1643081 1643559 1763596 1812114
Depends On:
Blocks: 1787998 1696815 1703695 1812114
Reported: 2019-06-18 04:56 UTC by Ravishankar N
Modified: 2020-05-14 02:14 UTC (History)

Fixed In Version:
Doc Type: Known Issue
Doc Text:
Both writing to volumes and reading information for the 'gluster heal info' command require blocking locks. This means that when a volume is under heavy write load, running 'gluster heal info' is blocked by existing locks. To work around this issue, reduce the number of write operations from clients to assist in releasing locks and completing the 'gluster heal info' operation.
Clone Of:
Environment:
Last Closed:
Target Upstream Version:


Description Ravishankar N 2019-06-18 04:56:44 UTC
Description of problem:
I observed this issue while working on BZ 1707259.
When no self-heals are pending but a lot of I/O is in progress on a replicate volume with the gluster-block profile enabled, heal-info hangs. The moment the I/O stopped, the command completed successfully. I suspect it is related to eager locking, but I still need to RCA it.

Version-Release number of selected component (if applicable):
rhgs-3.5.0

How reproducible:
Always on my dev VMs.

Steps to Reproduce:
- Create a 1x3 replica volume (3 node setup)
- Apply the gluster-block profile on the volume. (gluster v set $volname group gluster-block)
- Mount a fuse client on another node and run parallel 'dd's:
for i in {1..20}; do dd if=/dev/urandom of=FILE_$i bs=1024 count=102400 & done
- After 10-20 seconds, while the I/O is still going on, run the heal-info command. It will hang.
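
When repeating the last step, it can help to run heal-info under a deadline so that a hang is reported instead of blocking the terminal. The following is only a minimal sketch: the helper name check_hang and the 60-second deadline are illustrative assumptions, not part of the original report, and it relies on the coreutils 'timeout' command, which exits with status 124 when the deadline expires.

```shell
#!/usr/bin/env bash
# Sketch: report whether a command finished or hit a deadline.
# check_hang is a hypothetical helper name used for illustration only.

check_hang() {
    local deadline=$1
    shift
    local rc=0
    # coreutils 'timeout' returns 124 if the command was still running
    # when the deadline (in seconds) expired.
    timeout "$deadline" "$@" >/dev/null 2>&1 || rc=$?
    if [ "$rc" -eq 124 ]; then
        echo "hung"
    elif [ "$rc" -eq 0 ]; then
        echo "completed"
    else
        echo "failed"
    fi
    return 0
}

# Example, to be run on a server node while the dd jobs are active
# (the deadline value is an assumption; $volname as in the steps above):
#   check_hang 60 gluster volume heal "$volname" info
```

A result of "hung" while the dd jobs are running, followed by "completed" once they finish, would match the behaviour described in this report.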


Actual results:
The heal info command hangs.

Expected results:
It should not be hung.

Comment 13 Patric Uebele 2019-07-19 09:49:50 UTC
Ok to do it in a later BU

Comment 39 Pranith Kumar K 2019-12-03 08:33:27 UTC
*** Bug 1483977 has been marked as a duplicate of this bug. ***

Comment 40 Pranith Kumar K 2019-12-03 08:34:01 UTC
*** Bug 1643559 has been marked as a duplicate of this bug. ***

Comment 41 Pranith Kumar K 2019-12-03 08:34:48 UTC
*** Bug 1643081 has been marked as a duplicate of this bug. ***

Comment 42 Pranith Kumar K 2019-12-03 08:34:49 UTC
*** Bug 1763596 has been marked as a duplicate of this bug. ***

Comment 44 Karthik U S 2020-03-30 09:25:46 UTC
*** Bug 1812114 has been marked as a duplicate of this bug. ***
