We're currently seeing extremely high CPU utilization and a breakdown in gluster communication between our two peers during self-heal operations using GlusterFS 3.3.0. The self-healing daemon is crawling the file system. The results appear very similar to these two bugs:

http://dev.gluster.com/pipermail/glusterfs/2011-September/006149.html
https://bugzilla.redhat.com/show_bug.cgi?id=812515

Have these been addressed in 3.3.0? Are there other issues that could be causing this behavior?

We performed routine maintenance on Saturday the 18th and have been seeing issues since the 20th. The maintenance involved:

1.) Shutting down one server
2.) Adding a physical drive array
3.) Starting the server
4.) Allowing it to replicate
5.) Performing the same steps for the other server

The new drive arrays are not online and have not been initialized via the OS.

We have 2 servers in this replica, and both are at 100% CPU utilization. There is 1 replica volume with 1 brick on each server. Gluster peer status shows both machines online, but touching a zero-byte file takes more than 60 seconds to complete. This normally takes much less than a second. We verified that network connectivity is up during this time using ping and ssh.
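For anyone trying to reproduce the slow-touch symptom, here is a minimal sketch of how the latency could be measured. It is not part of the original report: the helper name time_touch and the GLUSTER_MOUNT environment variable are hypothetical, and the script falls back to the system temp directory when no mount point is given.

```python
import os
import tempfile
import time


def time_touch(directory, name="touch_probe.tmp"):
    """Create a zero-byte file in `directory`, return the elapsed seconds, then remove it."""
    path = os.path.join(directory, name)
    start = time.monotonic()
    # Open in append mode and update the timestamps, roughly what `touch` does.
    with open(path, "a"):
        os.utime(path, None)
    elapsed = time.monotonic() - start
    os.remove(path)
    return elapsed


if __name__ == "__main__":
    # GLUSTER_MOUNT is an assumed variable name; point it at the client mount of the volume.
    mount = os.environ.get("GLUSTER_MOUNT", tempfile.gettempdir())
    print("touch in %s took %.3f seconds" % (mount, time_touch(mount)))
```

Running it a few times during a self-heal crawl and again when the volume is idle should show whether the delay tracks the heal activity.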
The version that this bug has been reported against does not get any updates from the Gluster Community anymore. Please verify whether this report is still valid against a current (3.4, 3.5 or 3.6) release and update the version, or close this bug. If there has been no update before 9 December 2014, this bug will be closed automatically.