Description of problem:
glusterd is killed on a node, and remove-brick is then started on a volume that has bricks on the node where glusterd was brought down. After glusterd is brought back up, subsequent 'gluster volume status' commands for that volume fail
with the message -
Commit failed on localhost. Please check the log file for more details.
From the logs -
[2014-01-06 15:54:51.430841] E [glusterd-op-sm.c:2021:_add_remove_bricks_to_dict] 0-management: Failed to get brick count
[2014-01-06 15:54:51.430914] E [glusterd-op-sm.c:2085:_add_task_to_dict] 0-management: Failed to add remove bricks to dict
[2014-01-06 15:54:51.430927] E [glusterd-op-sm.c:2170:glusterd_aggregate_task_status] 0-management: Failed to add task details to dict
[2014-01-06 15:54:51.430938] E [glusterd-op-sm.c:4037:glusterd_op_ac_commit_op] 0-management: Commit of operation 'Volume Status' failed: -22
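The log trail suggests the restarted glusterd has no record of the in-flight remove-brick task, so the brick-count lookup fails and the commit returns -22 (-EINVAL). Below is a minimal Python model of that failure path, purely illustrative (function names, dict keys, and messages mirror the logs but are assumptions, not glusterd source):

```python
# Illustrative model only -- NOT glusterd code. It mimics the failure
# path seen in the logs: the restarted daemon has no remove-brick task
# state, so building the status reply fails and commit returns -EINVAL.
import errno

def add_remove_bricks_to_dict(reply, task_state):
    # glusterd looks up the brick count for the in-flight remove-brick
    # task; on a freshly restarted node that state was never repopulated.
    count = task_state.get("count")
    if count is None:
        print("E [_add_remove_bricks_to_dict] Failed to get brick count")
        return -errno.EINVAL  # surfaces as the -22 in the log
    reply["task.count"] = count
    return 0

def commit_volume_status(task_state):
    reply = {}
    ret = add_remove_bricks_to_dict(reply, task_state)
    if ret != 0:
        print("E Commit of operation 'Volume Status' failed: %d" % ret)
    return ret

# Node that kept its task state: commit succeeds.
print(commit_volume_status({"count": 2}))   # 0
# Restarted node with empty task state: commit fails with -22.
print(commit_volume_status({}))             # -22
```

The point of the sketch is that the failure is a missing-state problem, not a malformed request: the same commit succeeds on nodes whose glusterd was never killed.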
Version-Release number of selected component (if applicable):
Steps to Reproduce:
1. Create a 2x2 distributed-replicate volume (one brick on each server in a 4-server cluster), start and mount it, and create data on the mount point.
2. Kill glusterd on node1 and node2 (these hold the bricks that form one replica pair).
3. Start remove-brick on the volume.
4. Start glusterd on node1 and node2.
5. Run 'gluster volume status' command for that volume on any of the nodes.
Actual results:
The volume status command fails with the message described above.

Expected results:
The volume status command should not fail.
Cloning this to 3.1. To be fixed in a future release.