Bug 1331164

Summary: [GSS] - High number of failed heal entries increasing
Product: [Red Hat Storage] Red Hat Gluster Storage Reporter: Pablo Caruana <pcaruana>
Component: replicate Assignee: Pranith Kumar K <pkarampu>
Status: CLOSED DUPLICATE QA Contact: storage-qa-internal <storage-qa-internal>
Severity: urgent Docs Contact:
Priority: medium    
Version: rhgs-3.1 CC: bkunal, mchangir, pkarampu, ravishankar, rhs-bugs
Target Milestone: --- Keywords: Triaged, ZStream
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2018-02-14 09:01:23 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 1472361    

Description Pablo Caruana 2016-04-27 20:03:06 UTC
Description of problem:

For several days, we have had a high number of failed heal entries on one of our Gluster nodes.
Environment: 2 nodes, with 3 Gluster volumes replicated across the 2 nodes.
gluster_volume_info:cluster.quorum-count: 1
gluster_volume_info:cluster.quorum-type: fixed

Version-Release number of selected component (if applicable):
 glusterfs-3.7.5-19.el7rhgs.x86_64
 glusterfs-api-3.7.5-19.el7rhgs.x86_64
 glusterfs-cli-3.7.5-19.el7rhgs.x86_64
 glusterfs-client-xlators-3.7.5-19.el7rhgs.x86_64
 glusterfs-fuse-3.7.5-19.el7rhgs.x86_64
 glusterfs-libs-3.7.5-19.el7rhgs.x86_64
 glusterfs-server-3.7.5-19.el7rhgs.x86_64
 pcp-pmda-gluster-3.10.6-2.el7.x86_64


Some files are being healed, but according to the heal statistics the number of files pending heal keeps increasing by thousands. For some reason those files are not being healed.

Comment 3 Pablo Caruana 2016-04-27 20:22:16 UTC
Extending the 1st comment/description:
All clients are using glusterfs.fuse with the acl mount option.
The number of unhealed files has been increasing by thousands every day for at least 5 days.
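
For anyone wanting to quantify the daily growth described above, the following is a minimal sketch (not taken from this report) that extracts per-brick pending-heal counts from text in the style printed by "gluster volume heal <VOLNAME> statistics heal-count". The function name parse_heal_count, the sample text, and the brick and volume names in it are hypothetical, and the "Brick ..." / "Number of entries: N" layout is an assumption about the CLI output that should be checked against the installed version.

```python
import re

def parse_heal_count(text):
    """Return {brick: pending_entries} from heal-count style output.

    Assumes each brick is reported as a "Brick <host>:<path>" line followed
    by a "Number of entries: <N>" line. Bricks whose count is not a plain
    integer (for example an unreachable brick) are skipped.
    """
    counts = {}
    brick = None
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("Brick "):
            brick = line[len("Brick "):]
        elif brick is not None:
            m = re.match(r"Number of entries:\s*(\d+)$", line)
            if m:
                counts[brick] = int(m.group(1))
                brick = None
    return counts

# Hypothetical sample text; the names and numbers are illustrative only.
SAMPLE = """\
Gathering count of entries to be healed on volume vol0 has been successful

Brick node1:/bricks/vol0
Number of entries: 1520

Brick node2:/bricks/vol0
Number of entries: 0
"""

if __name__ == "__main__":
    for brick, pending in parse_heal_count(SAMPLE).items():
        print(brick, pending)
```

Running this against output captured once a day and comparing the per-brick totals would show whether the backlog is growing on one node only or on both.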

Comment 20 Bipin Kunal 2017-08-30 09:12:44 UTC
Hello Ravi,

  From comment #17, it looks like we were not able to complete the RCA for this issue. Is there anything more that can be done with the existing data?

  Apart from this, I had some questions in comment #16. Could you please have a look and reply to them?

-Bipin Kunal

Comment 21 Bipin Kunal 2017-08-30 09:15:53 UTC
I also see that this bug was earlier marked as dependent on bug #1339765, but that dependency was later removed. Any reason for that?

-Bipin Kunal