Bug 861539

Summary: volume status is unresponsive when one of the servers is down
Product: [Red Hat Storage] Red Hat Gluster Storage
Component: glusterd
Version: 2.0
Status: CLOSED DUPLICATE
Priority: high
Severity: unspecified
Reporter: Sachidananda Urs <sac>
Assignee: Kaushal <kaushal>
QA Contact: Sachidananda Urs <surs>
CC: amarts, rhs-bugs, vbellur
Hardware: Unspecified
OS: Unspecified
Type: Bug
Doc Type: Bug Fix
Last Closed: 2012-11-26 06:44:02 UTC

Description Sachidananda Urs 2012-09-29 05:14:24 UTC
When one of the servers in the cluster is brought down, `gluster volume status' throws a message like:

=====================

[root@rhs-client23 ~]# gluster volume status
operation failed
 
Failed to get names of volumes

=====================

for a long time. This is quite frustrating. At this point, no other glusterd operations can be performed from the command line.

corresponding logs:


[2012-09-29 01:11:15.727802] I [glusterd-handler.c:1379:glusterd_handle_cluster_unlock] 0-glusterd: Received UNLOCK from uuid: 5b315725-90dd-41f9-abe8-827d27db8210
[2012-09-29 01:11:15.727843] I [glusterd-handler.c:1355:glusterd_op_unlock_send_resp] 0-glusterd: Responded to unlock, ret: 0
[2012-09-29 01:12:20.452403] I [glusterd-utils.c:285:glusterd_lock] 0-glusterd: Cluster lock held by 230ae9f2-310e-49a6-b9f6-440bb5962da3
[2012-09-29 01:12:20.452453] I [glusterd-handler.c:452:glusterd_op_txn_begin] 0-management: Acquired local lock
[2012-09-29 01:12:20.452885] I [glusterd-rpc-ops.c:548:glusterd3_1_cluster_lock_cbk] 0-glusterd: Received ACC from uuid: b7f33530-25c1-406c-8c76-2c5feabaf7b0
[2012-09-29 01:12:20.452932] I [glusterd-rpc-ops.c:548:glusterd3_1_cluster_lock_cbk] 0-glusterd: Received ACC from uuid: 772396e0-ccae-4b64-99f9-84f7e836d101
[2012-09-29 01:12:23.766040] W [socket.c:195:__socket_rwv] 0-socket.management: readv failed (Connection timed out)
[2012-09-29 01:13:18.496767] E [glusterd-utils.c:277:glusterd_lock] 0-glusterd: Unable to get lock for uuid: 230ae9f2-310e-49a6-b9f6-440bb5962da3, lock held by: 230ae9f2-310e-49a6-b9f6-440bb5962da3
[2012-09-29 01:13:18.496819] E [glusterd-handler.c:447:glusterd_op_txn_begin] 0-management: Unable to acquire local lock, ret: -1
[2012-09-29 01:13:21.313095] E [glusterd-utils.c:277:glusterd_lock] 0-glusterd: Unable to get lock for uuid: 230ae9f2-310e-49a6-b9f6-440bb5962da3, lock held by: 230ae9f2-310e-49a6-b9f6-440bb5962da3
[2012-09-29 01:13:21.313138] E [glusterd-handler.c:447:glusterd_op_txn_begin] 0-management: Unable to acquire local lock, ret: -1
[2012-09-29 01:13:57.014643] E [glusterd-utils.c:277:glusterd_lock] 0-glusterd: Unable to get lock for uuid: 230ae9f2-310e-49a6-b9f6-440bb5962da3, lock held by: 230ae9f2-310e-49a6-b9f6-440bb5962da3
[2012-09-29 01:13:57.014695] E [glusterd-handler.c:447:glusterd_op_txn_begin] 0-management: Unable to acquire local lock, ret: -1

Comment 2 Sachidananda Urs 2012-09-29 06:04:01 UTC
Some more updates on this issue:

In a 2x2 cluster, bring down one of the servers (take its interface down or drop its packets).

Run `gluster volume status'. It takes forever to complete; if you get impatient and interrupt it, subsequent runs of `gluster volume status' never complete.
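
A minimal reproduction sketch of the above, assuming a 2x2 volume across four peers and a placeholder interface name (eth0):

=====================

# on the server being "failed": cut it off from the cluster
ifconfig eth0 down                # or drop its packets with iptables (see comment 3)

# on one of the remaining servers: query the volume
gluster volume status             # hangs for a long time
^C                                # interrupt out of impatience ...
gluster volume status             # ... after which it never completes

=====================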

Comment 3 Sachidananda Urs 2012-09-29 11:27:57 UTC
You can simulate this either by taking the interface down with ifconfig or with `iptables -A OUTPUT -d servername -j DROP'.
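
For reference, the two simulation methods spelled out as commands, with placeholder interface and host names (eth0, servername):

=====================

# option 1: on the server to be "failed", take its interface down
ifconfig eth0 down

# option 2: on another node, silently drop all traffic destined for that server
iptables -A OUTPUT -d servername -j DROP

# undo option 2 once done
iptables -D OUTPUT -d servername -j DROP

=====================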

Comment 4 Kaushal 2012-11-26 06:44:02 UTC

*** This bug has been marked as a duplicate of bug 860568 ***