Bug 1256265
| Summary: | Data Loss: Remove brick commit passing when remove-brick process has not even started (due to killing glusterd) | | |
|---|---|---|---|
| Product: | [Community] GlusterFS | Reporter: | Atin Mukherjee <amukherj> |
| Component: | glusterd | Assignee: | Atin Mukherjee <amukherj> |
| Status: | CLOSED CURRENTRELEASE | QA Contact: | |
| Severity: | urgent | Docs Contact: | |
| Priority: | unspecified | | |
| Version: | 3.7.3 | CC: | bugs, gluster-bugs, kaushal, nbalacha, nchilaka, nsathyan, rhs-bugs, sabansal, storage-qa-internal |
| Target Milestone: | --- | Keywords: | Triaged |
| Target Release: | --- | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | glusterfs-3.7.4 | Doc Type: | Bug Fix |
| Doc Text: | | Story Points: | --- |
| Clone Of: | 1245045 | Environment: | |
| Last Closed: | 2015-09-09 09:40:22 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
| Bug Depends On: | 1236038, 1245045 | | |
| Bug Blocks: | | | |
Comment 1
Anand Avati
2015-08-24 08:47:07 UTC
COMMIT: http://review.gluster.org/11996 committed in release-3.7 by Kaushal M (kaushal)

------

commit f51ffaeda4c87b682b7865c26befd75fe1c8cb25
Author: Atin Mukherjee <amukherj>
Date: Tue Jul 21 09:57:43 2015 +0530

    glusterd: Don't allow remove-brick start/commit if glusterd is down on the host of the brick

    Backport of http://review.gluster.org/#/c/11726/

    The remove-brick stage blindly starts the remove-brick operation even if
    the glusterd instance of the node hosting the brick is down. Operationally
    this is incorrect, and it could result in an inconsistent rebalance status
    across the nodes: the originator of the command will always report the
    rebalance status as 'DEFRAG_NOT_STARTED', whereas the glusterd instances on
    the other nodes, once they come back up, will trigger the rebalance and set
    the status to completed when it finishes.

    This patch fixes two things:
    1. Adds a validation in remove-brick to check whether all the peers
       hosting the bricks to be removed are up.
    2. Stops copying volinfo->rebal.dict from the stale volinfo during
       restore, as this could end up in an inconsistent node_state.info file,
       resulting in volume status command failure.

    Change-Id: Ia4a76865c05037d49eec5e3bbfaf68c1567f1f81
    BUG: 1256265
    Signed-off-by: Atin Mukherjee <amukherj>
    Reviewed-on: http://review.gluster.org/11726
    Tested-by: NetBSD Build System <jenkins.org>
    Reviewed-by: N Balachandran <nbalacha>
    Reviewed-by: Krishnan Parthasarathi <kparthas>
    Reviewed-on: http://review.gluster.org/11996
    Tested-by: Gluster Build System <jenkins.com>
    Reviewed-by: Kaushal M <kaushal>

This bug is being closed because a release that should address the reported issue has been made available. If the problem is still not fixed with glusterfs-3.7.4, please open a new bug report. glusterfs-3.7.4 has been announced on the Gluster mailing lists [1]; packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution.

[1] http://thread.gmane.org/gmane.comp.file-systems.gluster.devel/12496
[2] http://thread.gmane.org/gmane.comp.file-systems.gluster.user