Bug 1499509
| Field | Value |
|---|---|
| Summary | Brick Multiplexing: `gluster volume start force` fails with "Error : Request timed out" when there are multiple volumes |
| Product | [Community] GlusterFS |
| Component | glusterd |
| Status | CLOSED CURRENTRELEASE |
| Severity | medium |
| Priority | unspecified |
| Version | mainline |
| Hardware | Unspecified |
| OS | Unspecified |
| Whiteboard | brick-multiplexing |
| Fixed In Version | glusterfs-3.13.0 |
| Reporter | Samikshan Bairagya <sbairagy> |
| Assignee | Sanju <srakonde> |
| CC | amukherj, bmekala, bugs, nchilaka, rhs-bugs, sbairagy, srakonde, storage-qa-internal, vbellur |
| Clone Of | 1459895 |
| Clones | 1501154 |
| Type | Bug |
| Last Closed | 2017-12-08 17:43:00 UTC |
| Bug Blocks | 1459895, 1526373 |
Description (Samikshan Bairagya, 2017-10-08 07:52:40 UTC)
REVIEW: https://review.gluster.org/18444 (glusterd: disconnect event in brick multiplexing) posted (#4) for review on master by Sanju Rakonde (srakonde)

REVIEW: https://review.gluster.org/18444 (glusterd: Marking all the brick status as stopped when a process goes down in brick multiplexing) posted (#5) for review on master by Sanju Rakonde (srakonde)

REVIEW: https://review.gluster.org/18444 (glusterd: Marking all the brick status as stopped when a process goes down in brick multiplexing) posted (#6) for review on master by Sanju Rakonde (srakonde)

REVIEW: https://review.gluster.org/18444 (glusterd: Marking all the brick status as stopped when a process goes down in brick multiplexing) posted (#7) for review on master by Sanju Rakonde (srakonde)

REVIEW: https://review.gluster.org/18444 (glusterd: Marking all the brick status as stopped when a process goes down in brick multiplexing) posted (#8) for review on master by Sanju Rakonde (srakonde)

COMMIT: https://review.gluster.org/18444 committed in master by Atin Mukherjee (amukherj)

------

commit 9422446d72bc054962d72ace9912ecb885946d49
Author: Sanju Rakonde <srakonde>
Date: Sat Oct 7 03:33:40 2017 +0530

glusterd: Mark all brick statuses as stopped when a process goes down in brick multiplexing

In a brick-multiplexing environment, if a brick process goes down, for example because it is killed with SIGKILL, only the status of the brick for which the process originally came up changes to stopped; all the other brick statuses remain started. This happens because the process is killed abruptly, so its signal handler is never invoked and no further cleanup is triggered. A subsequent "gluster volume start force" then fails with "Request timed out": since every brickinfo->status is still in the started state, glusterd waits for one of the brick processes to come up, which will never happen because the process was killed.

To resolve this, on the disconnect event we check all brick processes to find the one that the disconnected brick belongs to. Once we find that process, we call glusterd_mark_bricks_stopped_by_proc(), passing the brick_proc_t object as an argument. From the glusterd_brick_proc_t we can get all the bricks attached to that process, but these are duplicated copies; to reach the original brickinfo we read the volinfo from each brick, since volinfo holds the original brickinfo entries. We then change brickinfo->status to stopped for all of those bricks.

Change-Id: Ifb9054b3ee081ef56b39b2903ae686984fe827e7
BUG: 1499509
Signed-off-by: Sanju Rakonde <srakonde>

REVIEW: https://review.gluster.org/18499 (glusterd: Marking all the brick status as stopped when a process goes down in brick multiplexing) posted (#2) for review on release-3.12 by Sanju Rakonde (srakonde)

This bug is being closed because a release that should address the reported issue has been made available. If the problem is still not fixed with glusterfs-3.13.0, please open a new bug report.

glusterfs-3.13.0 has been announced on the Gluster mailing lists [1]; packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution.

[1] http://lists.gluster.org/pipermail/announce/2017-December/000087.html
[2] https://www.gluster.org/pipermail/gluster-users/
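For readers who want to see the shape of the fix, here is a minimal C sketch of the disconnect-path marking logic described in the commit message above. The struct layouts and the find_volinfo() helper are simplified stand-ins invented for illustration; they are not glusterd's real types or APIs, and this is not the actual patch.

```c
/*
 * Simplified sketch of the fix from commit 9422446: when a multiplexed
 * brick process disconnects, mark every brick attached to that process
 * as stopped. All types and helpers here are illustrative stand-ins,
 * not glusterd's real definitions.
 */
#include <string.h>

#define MAX_BRICKS 64

typedef enum {
    GF_BRICK_STOPPED,
    GF_BRICK_STARTED,
} brick_status_t;

/* Stand-in for glusterd_brickinfo_t: one brick of one volume. */
typedef struct brickinfo {
    char           volname[64];
    char           path[256];
    brick_status_t status;
} brickinfo_t;

/* Stand-in for glusterd_volinfo_t: a volume and its original brickinfos. */
typedef struct volinfo {
    char         volname[64];
    brickinfo_t *bricks[MAX_BRICKS];
    int          brick_count;
} volinfo_t;

/* Stand-in for glusterd_brick_proc_t: one brick process and the
 * (duplicated) brickinfo entries attached to it via multiplexing. */
typedef struct brick_proc {
    int          pid;
    brickinfo_t *bricks[MAX_BRICKS];
    int          brick_count;
} brick_proc_t;

/* Hypothetical lookup: find the volume a brick belongs to. */
extern volinfo_t *find_volinfo(const char *volname);

/* Mirrors the role of glusterd_mark_bricks_stopped_by_proc() in the
 * actual patch: for every brick attached to the dead process, mark the
 * *original* brickinfo (the copy held in volinfo) as stopped. */
static void
mark_bricks_stopped_by_proc(brick_proc_t *proc)
{
    for (int i = 0; i < proc->brick_count; i++) {
        brickinfo_t *dup = proc->bricks[i];

        /* The entries in brick_proc are duplicates; resolve the
         * original brickinfo through the volume's volinfo. */
        volinfo_t *vol = find_volinfo(dup->volname);
        if (!vol)
            continue;

        for (int j = 0; j < vol->brick_count; j++) {
            brickinfo_t *orig = vol->bricks[j];
            if (strcmp(orig->path, dup->path) == 0)
                orig->status = GF_BRICK_STOPPED;
        }
    }
}
```

With every affected brickinfo->status reset to stopped, a later `gluster volume start <volname> force` can simply restart the bricks instead of waiting for a process that will never reconnect, which is what produced the "Request timed out" error.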