Bug 1245045 - Data Loss: Remove brick commit passing when remove-brick process has not even started (due to killing glusterd)
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: GlusterFS
Classification: Community
Component: glusterd
Version: mainline
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: urgent
Target Milestone: ---
Assignee: Atin Mukherjee
QA Contact:
URL:
Whiteboard:
Depends On: 1236038
Blocks: 1256265
Reported: 2015-07-21 06:34 UTC by Atin Mukherjee
Modified: 2016-06-16 13:25 UTC
CC List: 8 users

Fixed In Version: glusterfs-3.8rc2
Doc Type: Bug Fix
Doc Text:
Clone Of: 1236038
Clones: 1256265
Environment:
Last Closed: 2016-06-16 13:25:33 UTC
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Embargoed:



Comment 1 Anand Avati 2015-07-21 08:43:11 UTC
REVIEW: http://review.gluster.org/11726 (glusterd: Don't allow remove brick start if glusterd is down of the host of the brick) posted (#1) for review on master by Atin Mukherjee (amukherj)

Comment 2 Anand Avati 2015-08-11 09:55:13 UTC
REVIEW: http://review.gluster.org/11726 (glusterd: Don't allow remove brick start if glusterd is down of the host of the brick) posted (#2) for review on master by Atin Mukherjee (amukherj)

Comment 3 Anand Avati 2015-08-12 04:08:31 UTC
REVIEW: http://review.gluster.org/11726 (glusterd: Don't allow remove brick commit if glusterd is down of the host of the brick) posted (#3) for review on master by Atin Mukherjee (amukherj)

Comment 4 Anand Avati 2015-08-14 09:12:03 UTC
REVIEW: http://review.gluster.org/11726 (glusterd: Don't allow remove brick commit if glusterd is down of the host of the brick) posted (#4) for review on master by Atin Mukherjee (amukherj)

Comment 5 Anand Avati 2015-08-19 04:10:08 UTC
REVIEW: http://review.gluster.org/11726 (glusterd: Don't allow remove brick commit if glusterd is down of the host of the brick) posted (#5) for review on master by Atin Mukherjee (amukherj)

Comment 6 Anand Avati 2015-08-19 05:45:05 UTC
REVIEW: http://review.gluster.org/11726 (glusterd: Don't allow remove brick commit if glusterd is down of the host of the brick) posted (#6) for review on master by Atin Mukherjee (amukherj)

Comment 7 Anand Avati 2015-08-20 08:29:26 UTC
REVIEW: http://review.gluster.org/11726 (glusterd: Don't allow remove brick commit if glusterd is down of the host of the brick) posted (#7) for review on master by Atin Mukherjee (amukherj)

Comment 8 Anand Avati 2015-08-24 07:18:49 UTC
REVIEW: http://review.gluster.org/11726 (glusterd: Don't allow remove brick start/commit if glusterd is down of the host of the brick) posted (#8) for review on master by Atin Mukherjee (amukherj)

Comment 9 Anand Avati 2015-08-24 08:30:29 UTC
REVIEW: http://review.gluster.org/11726 (glusterd: Don't allow remove brick start/commit if glusterd is down of the host of the brick) posted (#9) for review on master by Atin Mukherjee (amukherj)

Comment 10 Anand Avati 2015-08-25 09:49:36 UTC
REVIEW: http://review.gluster.org/11726 (glusterd: Don't allow remove brick start/commit if glusterd is down of the host of the brick) posted (#11) for review on master by Atin Mukherjee (amukherj)

Comment 11 Anand Avati 2015-08-26 07:06:01 UTC
COMMIT: http://review.gluster.org/11726 committed in master by Krishnan Parthasarathi (kparthas) 
------
commit c9d462dc8c1250c3f3f42ca149bb062fe690335b
Author: Atin Mukherjee <amukherj>
Date:   Tue Jul 21 09:57:43 2015 +0530

    glusterd: Don't allow remove brick start/commit if glusterd is down of the host of the brick
    
    The remove-brick stage blindly starts the remove-brick operation even if the
    glusterd instance of the node hosting the brick is down. Operationally this is
    incorrect, and it could result in an inconsistent rebalance status across
    the nodes: the originator of the command will always have the rebalance
    status set to 'DEFRAG_NOT_STARTED', whereas the glusterd instance on the other
    node, once it comes up, will trigger the rebalance and set the status to
    completed when the rebalance finishes.
    
    This patch fixes two things:
    1. Add a validation in remove brick to check whether all the peers hosting the
    bricks to be removed are up.
    
    2. Don't copy volinfo->rebal.dict from the stale volinfo during restore, as
    this might leave an inconsistent node_state.info file and cause the volume
    status command to fail.
    
    Change-Id: Ia4a76865c05037d49eec5e3bbfaf68c1567f1f81
    BUG: 1245045
    Signed-off-by: Atin Mukherjee <amukherj>
    Reviewed-on: http://review.gluster.org/11726
    Tested-by: NetBSD Build System <jenkins.org>
    Reviewed-by: N Balachandran <nbalacha>
    Reviewed-by: Krishnan Parthasarathi <kparthas>
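
The validation described in item 1 of the commit message can be sketched roughly as follows. This is a minimal illustration in Python, not the glusterd C code from the patch: the names `Peer`, `Brick`, and `validate_remove_brick`, the `connected` flag, and the wording of the error message are all assumptions made for the example. The actual change is the one at http://review.gluster.org/11726.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Peer:
    """Hypothetical view of a peer as seen by the originator node."""
    hostname: str
    connected: bool  # True if that node's glusterd is currently reachable


@dataclass(frozen=True)
class Brick:
    """Hypothetical brick descriptor: host plus export path."""
    hostname: str
    path: str


def validate_remove_brick(bricks, peers, local_hostname):
    """Return (ok, error_message) for a remove-brick start/commit request.

    The request is rejected if any brick to be removed lives on a host
    whose glusterd is not connected (or is not a known peer at all).
    """
    peers_by_host = {peer.hostname: peer for peer in peers}
    for brick in bricks:
        # The originator is running glusterd itself, so local bricks pass.
        if brick.hostname == local_hostname:
            continue
        peer = peers_by_host.get(brick.hostname)
        if peer is None or not peer.connected:
            # Illustrative message only; not taken from the patch.
            return False, (
                f"Host node of the brick {brick.hostname}:{brick.path} is down"
            )
    return True, ""


# Example: node3's glusterd has been killed, so the request is refused
# before any rebalance state is recorded on the originator.
peers = [Peer("node2", True), Peer("node3", False)]
ok, err = validate_remove_brick([Brick("node3", "/bricks/b1")], peers, "node1")
```

Rejecting the request at the staging step, before any node changes state, is what would keep the originator from recording 'DEFRAG_NOT_STARTED' while another node later runs the rebalance on its own.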

Comment 12 Nagaprasad Sathyanarayana 2015-10-25 14:46:15 UTC
The fix for this BZ is already present in a GlusterFS release. A clone of this BZ has been fixed in a GlusterFS release and closed, so this mainline BZ is being closed as well.

Comment 13 Niels de Vos 2016-06-16 13:25:33 UTC
This bug is getting closed because a release has been made available that should address the reported issue. In case the problem is still not fixed with glusterfs-3.8.0, please open a new bug report.

glusterfs-3.8.0 has been announced on the Gluster mailing lists [1]; packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution.

[1] http://blog.gluster.org/2016/06/glusterfs-3-8-released/
[2] http://thread.gmane.org/gmane.comp.file-systems.gluster.user

