Bug 1294410
Summary: | Friend update floods can render the cluster incapable of handling other commands | ||
---|---|---|---|
Product: | [Community] GlusterFS | Reporter: | Atin Mukherjee <amukherj> |
Component: | glusterd | Assignee: | bugs <bugs> |
Status: | CLOSED CURRENTRELEASE | QA Contact: | |
Severity: | unspecified | Docs Contact: | |
Priority: | unspecified | ||
Version: | 3.7.0 | CC: | amukherj, bugs, ggarg, kaushal, sasundar |
Target Milestone: | --- | ||
Target Release: | --- | ||
Hardware: | Unspecified | ||
OS: | Unspecified | ||
Whiteboard: | |||
Fixed In Version: | glusterfs-3.7.7 | Doc Type: | Bug Fix |
Doc Text: | Story Points: | --- | |
Clone Of: | 1292749 | Environment: | |
Last Closed: | 2016-04-19 07:52:20 UTC | Type: | Bug |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | Category: | --- | |
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: | |||
Bug Depends On: | 1292749 | ||
Bug Blocks: | 1291386 |
Description
Atin Mukherjee
2015-12-28 05:39:56 UTC
REVIEW: http://review.gluster.org/13095 (glusterd: reduce friend update flood) posted (#1) for review on release-3.7 by Gaurav Kumar Garg (ggarg) COMMIT: http://review.gluster.org/13095 committed in release-3.7 by Atin Mukherjee (amukherj) ------ commit c0cc93dfe6fc63caeae9448dc689adcf13ea3aae Author: Gaurav Kumar Garg <garg.gaurav52> Date: Mon Dec 28 11:46:54 2015 +0530 glusterd: reduce friend update flood This patch is backport of: http://review.gluster.org/#/c/12999/ When in a befriended state, glusterd would broadcast friend updates to all other peers whenver a ACC or LOCAL_ACC event occurred. When a downed glusterd came back up and established connections again, this lead to a flood of friend updates to happen on the order of N^2 (N is the number of peers in the cluster) In larger clusters this was problematic, and could lead to very long times for the cluster to settle down when a peer came back up. Multiple peers coming back up at the same time would compound the problem. Broadcasting of friend updates doesn't have much use in places other that during a peer probe. Instead of broadcasting friend updates on connection re-establishment, updates can just be exchanged between the peers involved in the connection. This patch changes the glusterd friend state-machine to send updates only to the required peer for ACC or LOCAL_ACC events when in befriended state. The number of updates sent now is in the order of N. For a 10 node cluster, the number of updates reduced by 5 times. When creating the 10 node cluster, the updates reduced from ~500 to ~150. When a glusterd restarted, the number of exchanges reduced from ~160 to ~35. >> BUG: 1292749 >> Change-Id: Ib6072090c7069b081d018cdaa3dc878819ab1d18 >> Signed-off-by: Kaushal M <kaushal> >> Reviewed-on: http://review.gluster.org/12999 >> Reviewed-by: Atin Mukherjee <amukherj> >> Tested-by: NetBSD Build System <jenkins.org> >> Tested-by: Gluster Build System <jenkins.com> Change-Id: I389de2cc224f0ed627d98ae062209dd4f93e3b19 BUG: 1294410 Signed-off-by: Gaurav Kumar Garg <ggarg> Signed-off-by: Kaushal M <kaushal> Reviewed-on: http://review.gluster.org/13095 Reviewed-by: Atin Mukherjee <amukherj> Tested-by: NetBSD Build System <jenkins.org> Tested-by: Gluster Build System <jenkins.com> This bug is getting closed because a release has been made available that should address the reported issue. In case the problem is still not fixed with glusterfs-3.7.7, please open a new bug report. glusterfs-3.7.7 has been announced on the Gluster mailinglists [1], packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailinglist [2] and the update infrastructure for your distribution. [1] https://www.gluster.org/pipermail/gluster-users/2016-February/025292.html [2] http://thread.gmane.org/gmane.comp.file-systems.gluster.user |