Servers can become resource-starved during self-heal events, which degrades performance for clients. Add a configurable IOPS cap to the self-heal daemon (shd) to reduce that impact. Today, a large game developer was in IRC trying to track down a problem with his write performance. His hardware was more than adequate for his IOPS and throughput needs (HP Z420 with 8 SSDs in RAID 0 attached to an LSI RAID controller), but during a self-heal event, writes were noticeably slower. As part of his failure strategy, if a server fails, he replaces it with a new one and populates the new server via self-heal with 23 million files totalling 2.3 TB. It is during this event that he experiences slow writes. If we had a way to limit the resources used by shd, we should be able to prevent this type of problem.
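To make the request concrete, here is a minimal sketch of what a configurable IOPS cap could look like, written as a token bucket. This is only an illustration of the idea, not how shd or any GlusterFS component works; the names IopsLimiter, iops_cap and burst are hypothetical and do not correspond to any existing GlusterFS option.

```python
import time


class IopsLimiter:
    """Token bucket: allows at most `iops_cap` operations per second on average.

    Hypothetical illustration only; not part of GlusterFS.
    """

    def __init__(self, iops_cap, burst=None, clock=time.monotonic):
        if iops_cap <= 0:
            raise ValueError("iops_cap must be positive")
        self.rate = float(iops_cap)  # tokens added per second
        # Bucket size; defaults to one second's worth of operations.
        self.capacity = float(burst if burst is not None else iops_cap)
        self.tokens = self.capacity  # start with a full bucket
        self.clock = clock
        self.last = clock()

    def _refill(self):
        # Add tokens for the time elapsed since the last refill, up to capacity.
        now = self.clock()
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now

    def try_acquire(self):
        """Consume one token and return True if a heal operation may run now."""
        self._refill()
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False

    def wait_time(self):
        """Seconds until the next token is available (0.0 if one is ready)."""
        self._refill()
        if self.tokens >= 1.0:
            return 0.0
        return (1.0 - self.tokens) / self.rate
```

A heal loop would call try_acquire() before each I/O it issues and sleep for wait_time() when it returns False, leaving the remaining disk and network capacity to client writes.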
Nice timing on the bug, Joe. Ravi is working on a self-heal throttling feature to do this. Assigning the bug to him. Pranith
This bug is getting closed because GlusterFS-3.7 has reached its end-of-life. Note: This bug is being closed using a script. No verification has been performed to check if it still exists on newer releases of GlusterFS. If this bug still exists in newer GlusterFS releases, please reopen this bug against the newer release.
From glusterfs-4.0.0 onward, cgroups-based scripts are available to regulate the CPU and memory usage of any gluster daemon process. They were added by the patch https://review.gluster.org/#/c/glusterfs/+/18404/ and tracked in BZ #1496335. Hence closing this bug as CURRENTRELEASE.