1048765 – Remove-brick : 'gluster volume status' command fails when glusterd is killed before starting remove-brick and then brought back up.

Keywords:
Status:	CLOSED DEFERRED
Alias:	None
Product:	Red Hat Gluster Storage
Classification:	Red Hat Storage
Component:	glusterfs
Sub Component:
Version:	2.1
Hardware:	Unspecified
OS:	Unspecified
Priority:	unspecified
Severity:	high
Target Milestone:	---
Target Release:	---
Assignee:	Bug Updates Notification Mailing List
QA Contact:	storage-qa-internal@redhat.com
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:	1286146

Reported:	2014-01-06 10:22 UTC by Shruti Sampat
Modified:	2015-11-27 11:57 UTC
CC List:	7 users
Fixed In Version:
Doc Type:	Bug Fix
Doc Text:
Clone Of:
Clones:	1286146
Environment:
Last Closed:	2015-11-27 11:55:57 UTC
Embargoed:
Dependent Products:

Description Shruti Sampat 2014-01-06 10:22:50 UTC

Description of problem:
------------------------

glusterd was killed on a node, and remove-brick was then started on a volume that has bricks on the node where glusterd was down. After glusterd was brought back up, every subsequent volume status command for that volume failed with the message:

Commit failed on localhost. Please check the log file for more details.

From the logs:

[2014-01-06 15:54:51.430841] E [glusterd-op-sm.c:2021:_add_remove_bricks_to_dict] 0-management: Failed to get brick count
[2014-01-06 15:54:51.430914] E [glusterd-op-sm.c:2085:_add_task_to_dict] 0-management: Failed to add remove bricks to dict
[2014-01-06 15:54:51.430927] E [glusterd-op-sm.c:2170:glusterd_aggregate_task_status] 0-management: Failed to add task details to dict
[2014-01-06 15:54:51.430938] E [glusterd-op-sm.c:4037:glusterd_op_ac_commit_op] 0-management: Commit of operation 'Volume Status' failed: -22
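
The four log entries above appear to form a single failure chain, from the innermost function that could not get the brick count up to the commit of the 'Volume Status' operation. A minimal sketch for pulling that chain out of a glusterd log follows. It assumes the line layout seen in the excerpt, [timestamp] LEVEL [file:line:function] domain: message; the names LogEntry, LOG_RE and parse_log_line are hypothetical helpers, not part of glusterfs.

```python
import re
from typing import NamedTuple, Optional


class LogEntry(NamedTuple):
    timestamp: str
    level: str
    source_file: str
    line: int
    function: str
    domain: str
    message: str


# Layout assumed from the excerpt in this report:
# [timestamp] LEVEL [file:line:function] domain: message
LOG_RE = re.compile(
    r"^\[(?P<ts>[^\]]+)\]\s+(?P<level>[A-Z])\s+"
    r"\[(?P<file>[^:\]]+):(?P<line>\d+):(?P<func>[^\]]+)\]\s+"
    r"(?P<domain>\S+):\s+(?P<msg>.*)$"
)


def parse_log_line(text: str) -> Optional[LogEntry]:
    """Return the parsed entry, or None if the line does not match the layout."""
    m = LOG_RE.match(text.strip())
    if m is None:
        return None
    return LogEntry(m["ts"], m["level"], m["file"], int(m["line"]),
                    m["func"], m["domain"], m["msg"])


# The four lines quoted in this report.
SAMPLE = """
[2014-01-06 15:54:51.430841] E [glusterd-op-sm.c:2021:_add_remove_bricks_to_dict] 0-management: Failed to get brick count
[2014-01-06 15:54:51.430914] E [glusterd-op-sm.c:2085:_add_task_to_dict] 0-management: Failed to add remove bricks to dict
[2014-01-06 15:54:51.430927] E [glusterd-op-sm.c:2170:glusterd_aggregate_task_status] 0-management: Failed to add task details to dict
[2014-01-06 15:54:51.430938] E [glusterd-op-sm.c:4037:glusterd_op_ac_commit_op] 0-management: Commit of operation 'Volume Status' failed: -22
"""

entries = [e for e in map(parse_log_line, SAMPLE.splitlines()) if e is not None]

# Print the chain in log order: innermost failure first, failed commit last.
for e in entries:
    print(f"{e.function} ({e.source_file}:{e.line}): {e.message}")
```

Read in order, the first entry (_add_remove_bricks_to_dict) is the earliest point of failure recorded in the excerpt, and the later entries report the same error as it is passed up.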

Version-Release number of selected component (if applicable):
glusterfs 3.4.0.53rhs

How reproducible:
Observed once.

Steps to Reproduce:
1. Create a distributed-replicate volume (2x2, with one brick on each server in a 4-server cluster), start and mount it, and create data on the mount point.
2. Kill glusterd on node1 and node2 (these hold the bricks that form one replica pair).
3. Start remove-brick on the volume.
4. Start glusterd on node1 and node2.
5. Run 'gluster volume status' command for that volume on any of the nodes.
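
The steps above can be sketched as a dry-run command plan. This is only an illustration under several assumptions that are not in the report: the volume name, host names other than node1 and node2, brick path and mount point are hypothetical; the replica pair being removed is assumed to be the one on node1 and node2; glusterd is assumed to be killed with pkill and restarted with the service command; and the cluster-wide commands are assumed to be issued from node3. The function only builds and prints commands, it executes nothing.

```python
from typing import List, Sequence, Tuple


def repro_plan(
    volume: str = "testvol",                      # hypothetical volume name
    nodes: Sequence[str] = ("node1", "node2", "node3", "node4"),
    brick_path: str = "/bricks/b1",               # hypothetical brick path
    mount_point: str = "/mnt/testvol",            # hypothetical mount point
) -> List[Tuple[str, str]]:
    """Return (host, command) pairs for the reproduction steps; nothing is run."""
    bricks = [f"{n}:{brick_path}" for n in nodes]
    down = list(nodes[:2])       # nodes where glusterd is killed (one replica pair)
    admin = nodes[2]             # assumption: a node where glusterd stays up
    removed = " ".join(bricks[:2])  # assumption: the pair on the downed nodes

    plan: List[Tuple[str, str]] = [
        # Step 1: create a 2x2 distributed-replicate volume, start and mount it.
        (admin, f"gluster volume create {volume} replica 2 {' '.join(bricks)}"),
        (admin, f"gluster volume start {volume}"),
        (admin, f"mount -t glusterfs {admin}:/{volume} {mount_point}"),
    ]
    # Step 2: kill only the management daemon on the two nodes.
    plan += [(n, "pkill glusterd") for n in down]
    # Step 3: start remove-brick while glusterd is down on those nodes.
    plan.append((admin, f"gluster volume remove-brick {volume} {removed} start"))
    # Step 4: bring glusterd back up.
    plan += [(n, "service glusterd start") for n in down]
    # Step 5: query volume status; this is the command reported to fail.
    plan.append((admin, f"gluster volume status {volume}"))
    return plan


plan = repro_plan()
for host, command in plan:
    print(f"[{host}] {command}")
```

Data creation on the mount point is left out of the plan, since the report does not say what data was written.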

Actual results:
The volume status command fails with the "Commit failed on localhost" message described above.

Expected results:
The volume status command should not fail.

Additional info:

Comment 3 Susant Kumar Palai 2015-11-27 11:55:57 UTC

Cloning this to 3.1. To be fixed in future.
