Bug 1511293 - In distribute volume after glusterd restart, brick goes offline
Summary: In distribute volume after glusterd restart, brick goes offline
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: GlusterFS
Classification: Community
Component: glusterd
Version: 3.13
Hardware: x86_64
OS: Linux
Target Milestone: ---
Assignee: Atin Mukherjee
QA Contact:
URL:
Whiteboard:
Depends On: 1509845 1511301
Blocks: 1509102
 
Reported: 2017-11-09 05:36 UTC by Atin Mukherjee
Modified: 2018-01-23 21:37 UTC (History)
CC List: 7 users

Fixed In Version: glusterfs-3.13.2
Doc Type: If docs needed, set a value
Doc Text:
Clone Of: 1509845
Environment:
Last Closed: 2018-01-23 21:37:19 UTC
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Embargoed:



Comment 1 Atin Mukherjee 2017-11-09 05:50:29 UTC
Description of problem:
After restarting glusterd on the same node, the brick goes offline.

Version-Release number of selected component (if applicable):
mainline

How reproducible:
3/3

Steps to Reproduce:
1. Create a distribute volume with 3 bricks, one on each node, and start it.
2. Stop glusterd on the other two nodes and check the volume status on the node where glusterd is still running.
3. Restart glusterd on that node and check the volume status again (see the command sketch below).
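
For reference, a minimal shell sketch of the steps above. The peer addresses other than 10.70.37.52, the brick paths, and the use of systemctl (assuming a systemd-based system) are illustrative assumptions rather than details taken from this report:

# step 1: on one node of the 3-node trusted storage pool, create and
# start a plain distribute volume (no replica count means distribute)
gluster volume create testvol \
    10.70.37.52:/bricks/brick0/testvol \
    10.70.37.53:/bricks/brick0/testvol \
    10.70.37.54:/bricks/brick0/testvol
gluster volume start testvol

# step 2: stop glusterd on the other two nodes (run on each of them)
systemctl stop glusterd

# on the node where glusterd is still running, check the volume status
gluster volume status testvol

# step 3: restart glusterd on that same node and check the status again
systemctl restart glusterd
gluster volume status testvol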

Actual results:
Before restarting glusterd:

Status of volume: testvol
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick 10.70.37.52:/bricks/brick0/testvol    49160     0          Y       17734
 
Task Status of Volume testvol
------------------------------------------------------------------------------
There are no active volume tasks

After restarting glusterd:

Status of volume: testvol
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick 10.70.37.52:/bricks/brick0/testvol    N/A       N/A        N       N/A  
 
Task Status of Volume testvol
------------------------------------------------------------------------------
There are no active volume tasks


Expected results:
The brick must be online after glusterd is restarted.

Additional info:
glusterd remains stopped on the other two nodes.
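
One way to tell a brick process that was never started apart from one that is merely not reporting to glusterd is to look for the glusterfsd process directly. A small sketch; the grep patterns are illustrative and assume the testvol setup shown above:

# if the brick is truly down, no glusterfsd process exists for this volume
pgrep -af 'glusterfsd.*testvol'

# alternatively, list all brick processes running on the node
ps aux | grep '[g]lusterfsd'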

--- Additional comment from Worker Ant on 2017-11-06 03:02:33 EST ---

REVIEW: https://review.gluster.org/18669 (glusterd: restart the brick if quorum status is NOT_APPLICABLE_QUORUM) posted (#1) for review on master by Atin Mukherjee

--- Additional comment from Worker Ant on 2017-11-09 00:11:06 EST ---

COMMIT: https://review.gluster.org/18669 committed in master with a commit message- glusterd: restart the brick if quorum status is NOT_APPLICABLE_QUORUM

If a volume does not have server quorum enabled and all the glusterd instances on the other peers in the trusted storage pool are down, restarting glusterd does not trigger the brick start, so the brick does not come up.

Change-Id: If1458e03b50a113f1653db553bb2350d11577539
BUG: 1509845
Signed-off-by: Atin Mukherjee <amukherj>
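
For context, a hedged sketch of how the condition this patch targets can be checked from the CLI; the volume name comes from the reproduction steps above, and systemctl assumes a systemd-based system:

# server quorum is enforced only when this option is set to "server";
# the bug applies to volumes where it is left disabled
gluster volume get testvol cluster.server-quorum-type

# with glusterd stopped on all other peers, restart glusterd locally
systemctl restart glusterd

# with this patch the brick start is still triggered, so the brick
# should report Online = Y
gluster volume status testvol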

Comment 2 Worker Ant 2017-11-09 05:52:00 UTC
REVIEW: https://review.gluster.org/18700 (glusterd: restart the brick if quorum status is NOT_APPLICABLE_QUORUM) posted (#1) for review on release-3.13 by Atin Mukherjee

Comment 3 Worker Ant 2017-11-14 15:33:16 UTC
COMMIT: https://review.gluster.org/18700 committed in release-3.13 by "Atin Mukherjee" <amukherj> with a commit message- glusterd: restart the brick if quorum status is NOT_APPLICABLE_QUORUM

If a volume does not have server quorum enabled and all the glusterd instances on the other peers in the trusted storage pool are down, restarting glusterd does not trigger the brick start, so the brick does not come up.

> mainline patch : https://review.gluster.org/#/c/18669/

Change-Id: If1458e03b50a113f1653db553bb2350d11577539
BUG: 1511293
Signed-off-by: Atin Mukherjee <amukherj>
(cherry picked from commit 635c1c3691a102aa658cf1219fa41ca30dd134ba)

Comment 4 Shyamsundar 2017-12-08 17:45:38 UTC
This bug is getting closed because a release has been made available that should address the reported issue. In case the problem is still not fixed with glusterfs-3.13.0, please open a new bug report.

glusterfs-3.13.0 has been announced on the Gluster mailing lists [1]; packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution.

[1] http://lists.gluster.org/pipermail/announce/2017-December/000087.html
[2] https://www.gluster.org/pipermail/gluster-users/

Comment 5 Worker Ant 2018-01-05 08:23:29 UTC
REVIEW: https://review.gluster.org/19147 (glusterd: connect to an existing brick process when quorum status is NOT_APPLICABLE_QUORUM) posted (#1) for review on release-3.13 by Atin Mukherjee

Comment 6 Worker Ant 2018-01-09 13:59:11 UTC
COMMIT: https://review.gluster.org/19147 committed in release-3.13 by "Atin Mukherjee" <amukherj> with a commit message- glusterd: connect to an existing brick process when quorum status is NOT_APPLICABLE_QUORUM

First of all, this patch reverts commit 635c1c3, as that change caused a regression where bricks did not come up in time when a node was rebooted. This patch instead fixes the problem by simply trying to connect to an existing running brick when quorum status is not applicable.

>mainline patch : https://review.gluster.org/#/c/19134/

Change-Id: I0efb5901832824b1c15dcac529bffac85173e097
BUG: 1511293
Signed-off-by: Atin Mukherjee <amukherj>
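
As a hedged illustration of the revised behaviour described above (commands assume the testvol setup from the reproduction steps and a systemd-based system):

# record the brick (glusterfsd) PID before restarting glusterd
gluster volume status testvol | grep '^Brick'
pgrep -af glusterfsd

systemctl restart glusterd

# the brick should remain Online with the same PID: glusterd reconnects
# to the already running glusterfsd instead of spawning a new one
gluster volume status testvol
pgrep -af glusterfsd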

Comment 7 Shyamsundar 2018-01-23 21:37:19 UTC
This bug is getting closed because a release has been made available that should address the reported issue. In case the problem is still not fixed with glusterfs-3.13.2, please open a new bug report.

glusterfs-3.13.2 has been announced on the Gluster mailing lists [1]; packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution.

[1] http://lists.gluster.org/pipermail/announce/2018-January/000089.html
[2] https://www.gluster.org/pipermail/gluster-users/

