Bug 1056276

Summary: Self-Heal Daemon is consuming excessive CPU
Product: [Community] GlusterFS Reporter: Michael Webb <webb>
Component: replicate Assignee: Pranith Kumar K <pkarampu>
Status: CLOSED DEFERRED QA Contact:
Severity: urgent Docs Contact:
Priority: unspecified    
Version: 3.3.0 CC: bugs, gluster-bugs
Target Milestone: --- Keywords: Triaged
Target Release: ---   
Hardware: x86_64   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2014-12-14 19:40:32 UTC Type: Bug

Description Michael Webb 2014-01-21 21:02:17 UTC
We’re currently seeing extremely high CPU utilization and a breakdown in Gluster communication between our two peers during self-heal operations using GlusterFS 3.3.0. The self-heal daemon is crawling the file system.

The results appear very similar to these two bugs:

http://dev.gluster.com/pipermail/glusterfs/2011-September/006149.html

https://bugzilla.redhat.com/show_bug.cgi?id=812515

Have these been addressed in 3.3.0?

Are there other issues that could be causing this behavior?

We performed routine maintenance on Saturday the 18th and have been seeing issues since the 20th. The maintenance involved:

1.)	Shutting down one server
2.)	Adding a physical drive array
3.)	Starting the server
4.)	Allowing it to replicate
5.)	Performing the same steps for the other server

The new drive arrays are not online, nor initialized via the OS.

We have 2 servers in this replica, and both are at 100% CPU utilization. There is 1 replica volume with 1 brick on each server. The output of gluster peer status shows both machines online, but touching a zero-byte file takes more than 60 seconds to complete. This normally takes much less than a second. We verified that network connectivity is up during this time using ping and ssh.
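
For anyone trying to reproduce the touch delay, the following is a minimal sketch of how the latency could be measured. It is not the exact procedure used above. The MOUNT_POINT environment variable and the probe file name are hypothetical; point MOUNT_POINT at the client mount of the replica volume. If it is unset, the script falls back to the local temp directory, which only shows the baseline for a healthy file system.

```python
import os
import tempfile
import time


def time_touch(directory, name="touch_probe.tmp"):
    """Create a zero-byte file in `directory`, remove it, and return
    how many seconds the create step took."""
    path = os.path.join(directory, name)  # probe file name is hypothetical
    start = time.monotonic()
    # Rough equivalent of `touch`: create the file if missing, update its times.
    with open(path, "a"):
        os.utime(path, None)
    elapsed = time.monotonic() - start
    os.remove(path)  # clean up so repeated runs start from the same state
    return elapsed


if __name__ == "__main__":
    # MOUNT_POINT is an assumption: set it to the GlusterFS client mount.
    mount_point = os.environ.get("MOUNT_POINT", tempfile.gettempdir())
    print(f"touch took {time_touch(mount_point):.3f} s")
```

Running this a few times while the self-heal daemon is active, and again once it is idle, would show whether the delay tracks the heal activity.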

Comment 1 Niels de Vos 2014-11-27 14:54:37 UTC
The version that this bug has been reported against does not get any updates from the Gluster Community anymore. Please verify whether this report is still valid against a current (3.4, 3.5 or 3.6) release and update the version, or close this bug.

If there has been no update before 9 December 2014, this bug will be closed automatically.