Bug 1506513
Summary: | stale brick processes getting created and volume status shows brick as down (pkill glusterfsd glusterfs, glusterd restart) | |||
---|---|---|---|---|
Product: | [Community] GlusterFS | Reporter: | Atin Mukherjee <amukherj> | |
Component: | glusterd | Assignee: | Atin Mukherjee <amukherj> | |
Status: | CLOSED CURRENTRELEASE | QA Contact: | ||
Severity: | urgent | Docs Contact: | ||
Priority: | urgent | |||
Version: | mainline | CC: | amukherj, bmekala, bugs, nchilaka, rhs-bugs, storage-qa-internal, vbellur | |
Target Milestone: | --- | |||
Target Release: | --- | |||
Hardware: | Unspecified | |||
OS: | Unspecified | |||
Whiteboard: | brick-multiplexing | |||
Fixed In Version: | glusterfs-3.13.0 | Doc Type: | If docs needed, set a value | |
Doc Text: | Story Points: | --- | ||
Clone Of: | 1505363 | |||
: | 1508283 | Environment: | ||
Last Closed: | 2017-12-08 17:45:00 UTC | Type: | Bug | |
Regression: | --- | Mount Type: | --- | |
Documentation: | --- | CRM: | ||
Verified Versions: | Category: | --- | ||
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | ||
Cloudforms Team: | --- | Target Upstream Version: | ||
Embargoed: | ||||
Bug Depends On: | ||||
Bug Blocks: | 1505363, 1508283, 1526368 |
Comment 1
Atin Mukherjee
2017-10-26 09:19:31 UTC
REVIEW: https://review.gluster.org/18577 (glusterd: fix brick restart parallelism) posted (#1) for review on master by Atin Mukherjee (amukherj)

REVIEW: https://review.gluster.org/18577 (glusterd: fix brick restart parallelism) posted (#2) for review on master by Atin Mukherjee (amukherj)

REVIEW: https://review.gluster.org/18577 (glusterd: fix brick restart parallelism) posted (#3) for review on master by Atin Mukherjee (amukherj)

COMMIT: https://review.gluster.org/18577 committed in master by -------------

glusterd: fix brick restart parallelism

glusterd's brick restart logic is not always sequential, as there are at least three different ways in which bricks are restarted:

1. through friend-sm and glusterd_spawn_daemons ()
2. through friend-sm and handling volume quorum action
3. through friend handshaking when there is a mismatch on quorum on friend import.

In a brick multiplexing setup, glusterd ended up trying to spawn the same brick process a couple of times, because two threads hit glusterd_brick_start () within a fraction of a millisecond of each other. glusterd had no basis for rejecting either of them, since the brick start criteria were met in both cases.

As a solution, it'd be better to control this madness with two different flags. The first is a boolean called start_triggered, which indicates that a brick start has been triggered and stays true until the brick dies or is killed. The second is a mutex lock, which ensures that for a particular brick we don't enter glusterd_brick_start () more than once at the same point in time.

Change-Id: I292f1e58d6971e111725e1baea1fe98b890b43e2
BUG: 1506513
Signed-off-by: Atin Mukherjee <amukherj>

This bug is getting closed because a release has been made available that should address the reported issue. In case the problem is still not fixed with glusterfs-3.13.0, please open a new bug report.
glusterfs-3.13.0 has been announced on the Gluster mailing lists [1]; packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution.

[1] http://lists.gluster.org/pipermail/announce/2017-December/000087.html
[2] https://www.gluster.org/pipermail/gluster-users/
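
For readers who want to see the shape of the fix, the start_triggered flag plus per-brick mutex described in the commit message can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code from the patch: brick_t, brick_init, brick_start, brick_mark_stopped, restart_path and race_restart_paths are hypothetical names, and the "spawn" is reduced to a counter so the effect of the guard is observable.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-in for glusterd's per-brick state (not the real struct). */
typedef struct {
    pthread_mutex_t restart_mutex; /* serializes entry into brick_start() */
    bool start_triggered;          /* true from the first start until the brick dies */
    int spawn_count;               /* counts "spawns"; stands in for a real process */
} brick_t;

void brick_init(brick_t *b)
{
    pthread_mutex_init(&b->restart_mutex, NULL);
    b->start_triggered = false;
    b->spawn_count = 0;
}

/* Returns true if this call spawned the brick, false if a start was
 * already triggered by another restart path. */
bool brick_start(brick_t *b)
{
    bool spawned = false;

    pthread_mutex_lock(&b->restart_mutex);
    if (!b->start_triggered) {
        b->start_triggered = true;
        b->spawn_count++; /* assumption: the real code would spawn the brick here */
        spawned = true;
    }
    pthread_mutex_unlock(&b->restart_mutex);
    return spawned;
}

/* Called when the brick dies or is killed, so a later restart is allowed. */
void brick_mark_stopped(brick_t *b)
{
    pthread_mutex_lock(&b->restart_mutex);
    b->start_triggered = false;
    pthread_mutex_unlock(&b->restart_mutex);
}

/* Thread body: one of several restart paths racing to start the same brick. */
void *restart_path(void *arg)
{
    brick_start((brick_t *)arg);
    return NULL;
}

/* Races up to 16 restart paths against one brick and returns how many
 * spawns actually happened. */
int race_restart_paths(int nthreads)
{
    enum { MAX_THREADS = 16 };
    pthread_t tids[MAX_THREADS];
    brick_t brick;
    int i, spawns;

    if (nthreads > MAX_THREADS)
        nthreads = MAX_THREADS;
    brick_init(&brick);
    for (i = 0; i < nthreads; i++) {
        int rc = pthread_create(&tids[i], NULL, restart_path, &brick);
        assert(rc == 0);
        (void)rc;
    }
    for (i = 0; i < nthreads; i++)
        pthread_join(tids[i], NULL);
    spawns = brick.spawn_count;
    pthread_mutex_destroy(&brick.restart_mutex);
    return spawns;
}
```

Calling race_restart_paths from a small main() shows the intent: however many restart paths race, only the first one to take the mutex finds start_triggered unset and spawns the brick. The flag is checked and set under the same lock; checking it outside the lock would reopen the window the commit message describes.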