Bug 1379974 - heal info command hangs while healing of multiple files is in progress
Summary: heal info command hangs while healing of multiple files is in progress
Keywords:
Status: CLOSED WORKSFORME
Alias: None
Product: Red Hat Gluster Storage
Classification: Red Hat
Component: arbiter
Version: rhgs-3.2
Hardware: All
OS: Linux
Priority: unspecified
Severity: high
Target Milestone: ---
Target Release: ---
Assignee: Ravishankar N
QA Contact: Karan Sandha
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2016-09-28 09:50 UTC by Karan Sandha
Modified: 2016-11-15 08:39 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2016-11-14 10:27:07 UTC
Target Upstream Version:



Description Karan Sandha 2016-09-28 09:50:37 UTC
Description of problem:
gluster volume heal <vol-name> info hangs while healing of multiple files is in progress.

Version-Release number of selected component (if applicable):
[root@dhcp46-50 home]# gluster --version
glusterfs 3.8.4 built on Sep 20 2016 07:17:14
Repository revision: git://git.gluster.com/glusterfs.git
Copyright (c) 2006-2011 Gluster Inc. <http://www.gluster.com>
GlusterFS comes with ABSOLUTELY NO WARRANTY.
You may redistribute copies of GlusterFS under the terms of the GNU General Public License.

Logs & statedumps are placed under :/var/www/html/sosreports/<bug>

How reproducible:
Hit once

Steps to Reproduce:
1. Create a 4x(2+1) arbiter volume named testvol.
2. Remove bricks to reduce it to a 1x(2+1) arbiter volume.
3. Mount the volume on the client using FUSE.
4. Create files using:
for ((i=1; i<=1000000; i++))
do
    dd if=/dev/urandom of=file$i bs=6K count=1
done

5. Kill the brick process of B1, i.e. a data brick.
6. Wait for the script to complete.
7. Force start the volume to restart the brick process.
8. While healing is in progress, run the heal info command:
      gluster volume heal testvol info
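
For anyone re-running step 8, wrapping the command in a time limit makes a hang easier to detect and record. The following is only a sketch: it assumes GNU coreutils `timeout` is installed, and the helper name `check_for_hang` and the 120-second limit are hypothetical choices, not part of the original report.

```shell
#!/usr/bin/env bash
# Sketch: run a command under a time limit and report whether it hung.
# Assumes GNU coreutils `timeout`; check_for_hang is a hypothetical helper name.

check_for_hang() {
    local limit="$1"; shift
    # timeout exits with status 124 when the command exceeds the limit
    timeout "$limit" "$@"
    local rc=$?
    if [ "$rc" -eq 124 ]; then
        echo "HUNG: '$*' did not finish within ${limit}s"
        return 124
    fi
    return "$rc"
}

# Example (needs a running Gluster volume named testvol; limit is arbitrary):
# check_for_hang 120 gluster volume heal testvol info
```

A return status of 124 means the command was still running when the limit expired; any other status is the command's own exit status.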


Actual results:
The heal info command hangs and doesn't show any output.
The glustershd log shows healing in progress.

Expected results:

The heal info command should show the files being healed.

Additional info:

After debugging the issue from dev. the heal info command was showing in cyclic manner on 3 different servers.

Comment 2 Ravishankar N 2016-10-21 06:14:22 UTC
Not yet RCA'ed. But we need to fix it if it is indeed a bug. Providing acks and adding internal whiteboard for tracking purposes.

Comment 5 Ravishankar N 2016-11-07 11:33:07 UTC
> After debugging the issue from dev. the heal info command was showing in
> cyclic manner on 3 different servers.

Krutika, do you remember what was the deadlock you saw when you looked into this?

Comment 6 Krutika Dhananjay 2016-11-07 12:51:51 UTC
(In reply to Ravishankar N from comment #5)
> > After debugging the issue from dev. the heal info command was showing in
> > cyclic manner on 3 different servers.
> 
> Krutika, do you remember what was the deadlock you saw when you looked into
> this?

Unfortunately no.

Comment 7 Ravishankar N 2016-11-07 13:13:15 UTC
Karan, as discussed, please see if you can re-create the issue and attach the state dumps of the clients, bricks and glfsheal along with the sos reports. We need to see if the hang we are experiencing is the same as BZ 1386626.

Comment 8 Karan Sandha 2016-11-08 10:45:30 UTC
I am not able to reproduce the issue on the 3.8.4.3 gluster build. It seems the hang did not occur this time. I will try again to see whether I can hit the heal info hang. Until then, we can decide what to do with this bug.

Thanks & Regards
Karan Sandha

Comment 9 Pranith Kumar K 2016-11-08 16:30:58 UTC
Karan agreed to do one more run of this test case and let us know the result; based on that, the bug will be moved appropriately.

Comment 10 Karan Sandha 2016-11-09 09:11:32 UTC
I tried again but I am not able to hit the bug. We should probably close this bug; if I hit it again in the near future, I will reopen it.

Thanks & Regards
Karan Sandha

Comment 11 Pranith Kumar K 2016-11-09 09:17:19 UTC
Since there is not much that dev needs to do for this bug, moving it to ON_QA. Please move it to VERIFIED since you do not see this bug with the latest releases. If you see this bug again, please reopen it.

Comment 14 Karan Sandha 2016-11-14 18:09:03 UTC
Hence moving the bug to the verified state; will reopen if I hit this again.
-Karan Sandha

