When one of the servers in the cluster is brought down, `gluster volume status' throws a message like:

=====================
[root@rhs-client23 ~]# gluster volume status
operation failed
Failed to get names of volumes
=====================

for a long time. This is quite frustrating. At this point no other glusterd operations are happening from the command line.

Corresponding logs:

[2012-09-29 01:11:15.727802] I [glusterd-handler.c:1379:glusterd_handle_cluster_unlock] 0-glusterd: Received UNLOCK from uuid: 5b315725-90dd-41f9-abe8-827d27db8210
[2012-09-29 01:11:15.727843] I [glusterd-handler.c:1355:glusterd_op_unlock_send_resp] 0-glusterd: Responded to unlock, ret: 0
[2012-09-29 01:12:20.452403] I [glusterd-utils.c:285:glusterd_lock] 0-glusterd: Cluster lock held by 230ae9f2-310e-49a6-b9f6-440bb5962da3
[2012-09-29 01:12:20.452453] I [glusterd-handler.c:452:glusterd_op_txn_begin] 0-management: Acquired local lock
[2012-09-29 01:12:20.452885] I [glusterd-rpc-ops.c:548:glusterd3_1_cluster_lock_cbk] 0-glusterd: Received ACC from uuid: b7f33530-25c1-406c-8c76-2c5feabaf7b0
[2012-09-29 01:12:20.452932] I [glusterd-rpc-ops.c:548:glusterd3_1_cluster_lock_cbk] 0-glusterd: Received ACC from uuid: 772396e0-ccae-4b64-99f9-84f7e836d101
[2012-09-29 01:12:23.766040] W [socket.c:195:__socket_rwv] 0-socket.management: readv failed (Connection timed out)
[2012-09-29 01:13:18.496767] E [glusterd-utils.c:277:glusterd_lock] 0-glusterd: Unable to get lock for uuid: 230ae9f2-310e-49a6-b9f6-440bb5962da3, lock held by: 230ae9f2-310e-49a6-b9f6-440bb5962da3
[2012-09-29 01:13:18.496819] E [glusterd-handler.c:447:glusterd_op_txn_begin] 0-management: Unable to acquire local lock, ret: -1
[2012-09-29 01:13:21.313095] E [glusterd-utils.c:277:glusterd_lock] 0-glusterd: Unable to get lock for uuid: 230ae9f2-310e-49a6-b9f6-440bb5962da3, lock held by: 230ae9f2-310e-49a6-b9f6-440bb5962da3
[2012-09-29 01:13:21.313138] E [glusterd-handler.c:447:glusterd_op_txn_begin] 0-management: Unable to acquire local lock, ret: -1
[2012-09-29 01:13:57.014643] E [glusterd-utils.c:277:glusterd_lock] 0-glusterd: Unable to get lock for uuid: 230ae9f2-310e-49a6-b9f6-440bb5962da3, lock held by: 230ae9f2-310e-49a6-b9f6-440bb5962da3
[2012-09-29 01:13:57.014695] E [glusterd-handler.c:447:glusterd_op_txn_begin] 0-management: Unable to acquire local lock, ret: -1
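In the error lines above, the uuid requesting the lock and the uuid holding it are the same, which suggests the node is blocked on a lock it already holds itself. A minimal sketch for picking such lines out of a glusterd log, assuming the message format is exactly as shown in this report; the function name `find_self_held_lock_failures` and the sample list are hypothetical helpers for illustration, not part of gluster:

```python
import re

# Pattern for the glusterd_lock error line, based only on the log
# format quoted in this report (an assumption for other versions).
LOCK_FAIL = re.compile(
    r"^\[(?P<ts>[^\]]+)\].*"
    r"Unable to get lock for uuid: (?P<requester>[0-9a-f-]{36}), "
    r"lock held by: (?P<holder>[0-9a-f-]{36})"
)


def find_self_held_lock_failures(lines):
    """Return (timestamp, uuid) for each failure where requester == holder."""
    hits = []
    for line in lines:
        m = LOCK_FAIL.search(line)
        if m and m.group("requester") == m.group("holder"):
            hits.append((m.group("ts"), m.group("requester")))
    return hits


# Two lines copied from the log in this report.
sample = [
    "[2012-09-29 01:12:20.452453] I [glusterd-handler.c:452:glusterd_op_txn_begin] "
    "0-management: Acquired local lock",
    "[2012-09-29 01:13:18.496767] E [glusterd-utils.c:277:glusterd_lock] 0-glusterd: "
    "Unable to get lock for uuid: 230ae9f2-310e-49a6-b9f6-440bb5962da3, "
    "lock held by: 230ae9f2-310e-49a6-b9f6-440bb5962da3",
]

for ts, uuid in find_self_held_lock_failures(sample):
    print(ts, uuid)
```

To scan a real log, pass an open file object instead of `sample`; the log file location depends on the installation.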
Some more updates on this issue: In a 2x2 cluster, bring down one of the servers (bring down the interface or drop the packets). Run `gluster volume status'. It takes forever to complete; if you get impatient and interrupt it, `gluster volume status' never completes from then on.
You can simulate this with either `ifconfig <interface> down' or `iptables -A OUTPUT -d servername -j DROP'.
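For scripting the reproduction, the "get impatient and interrupt" step can be approximated by running the command under a timeout. A rough sketch, assuming that killing the CLI on timeout is close enough to a manual interrupt; `run_with_timeout` is a hypothetical helper and the 30-second value is arbitrary:

```python
import subprocess


def run_with_timeout(cmd, timeout_s):
    """Run cmd; return (returncode, output), or (None, "") if it timed out."""
    try:
        proc = subprocess.run(
            cmd, capture_output=True, text=True, timeout=timeout_s
        )
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child before raising, which stands in
        # for interrupting the command by hand.
        return None, ""
    return proc.returncode, proc.stdout + proc.stderr


# Intended use on a node of the affected cluster (not executed here):
# rc, out = run_with_timeout(["gluster", "volume", "status"], 30)
# rc is None  -> the command hung and was killed
# rc != 0     -> the command failed; inspect `out` for the error text
```

Calling it a second time after a timeout should show whether the later invocation hangs or fails as described in this report.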
*** This bug has been marked as a duplicate of bug 860568 ***