Description of problem:
Suppose there is a cluster with a large number of files and directories. When the gluster volume status command is executed, a request is sent to the glusterfsd processes to collect the data. If glusterfsd does not respond within 30 minutes, the frame that sent the request is bailed out in glusterd, but no rejection is sent back to the source glusterd that originated the command. As a result, the cluster lock held by the source glusterd is never released, and other operations fail because they cannot acquire it.

Source glusterd logs:

[2012-07-05 17:52:44.211023] I [glusterd-op-sm.c:2039:glusterd_op_ac_send_stage_op] 0-glusterd: Sent op req to 3 peers
[2012-07-05 17:52:44.211237] I [glusterd-rpc-ops.c:880:glusterd3_1_stage_op_cbk] 0-glusterd: Received ACC from uuid: 92d50993-15a5-42af-92f3-a6ff7cfddd43
[2012-07-05 17:52:44.211284] I [glusterd-rpc-ops.c:880:glusterd3_1_stage_op_cbk] 0-glusterd: Received ACC from uuid: 0b17d7cf-c86a-4d82-929b-efb1ca2e331c
[2012-07-05 17:52:44.211314] I [glusterd-rpc-ops.c:880:glusterd3_1_stage_op_cbk] 0-glusterd: Received ACC from uuid: c6876b56-9729-4a98-8eea-fc9293cf92b0
[2012-07-05 17:52:44.216790] I [glusterd-op-sm.c:2384:glusterd_op_ac_send_commit_op] 0-management: Sent op req to 3 peers
[2012-07-05 17:52:44.223543] I [glusterd-rpc-ops.c:1316:glusterd3_1_commit_op_cbk] 0-glusterd: Received ACC from uuid: 0b17d7cf-c86a-4d82-929b-efb1ca2e331c
[2012-07-05 17:52:44.225221] I [glusterd-rpc-ops.c:1316:glusterd3_1_commit_op_cbk] 0-glusterd: Received ACC from uuid: c6876b56-9729-4a98-8eea-fc9293cf92b0
[2012-07-05 17:54:44.335825] I [glusterd-handler.c:2646:glusterd_handle_status_volume] 0-management: Received status volume req for volume new
[2012-07-05 17:54:44.335879] E [glusterd-utils.c:277:glusterd_lock] 0-glusterd: Unable to get lock for uuid: adfa231a-d8e0-4d6b-bc11-ad29b987ace4, lock held by: adfa231a-d8e0-4d6b-bc11-ad29b987ace4
[2012-07-05 17:54:44.335901] E [glusterd-handler.c:453:glusterd_op_txn_begin] 0-management: Unable to acquire local lock, ret: -1

Destination glusterd logs (call bail happened here):

[2012-07-05 18:11:47.816910] E [rpc-clnt.c:208:call_bail] 0-management: bailing out frame type(brick operations) op(--(4)) xid = 0x1x sent = 2012-07-05 17:41:46.943808. timeout = 1800
[2012-07-05 23:37:15.490400] I [glusterd-handler.c:497:glusterd_handle_cluster_lock] 0-glusterd: Received LOCK from uuid: adfa231a-d8e0-4d6b-bc11-ad29b987ace4
[2012-07-05 23:37:15.490438] I [glusterd-utils.c:285:glusterd_lock] 0-glusterd: Cluster lock held by adfa231a-d8e0-4d6b-bc11-ad29b987ace4
[2012-07-05 23:37:15.490472] I [glusterd-handler.c:1315:glusterd_op_lock_send_resp] 0-glusterd: Responded, ret: 0
[2012-07-05 23:37:15.490906] I [glusterd-handler.c:542:glusterd_req_ctx_create] 0-glusterd: Received op from uuid: adfa231a-d8e0-4d6b-bc11-ad29b987ace4
[2012-07-05 23:37:15.490969] I [glusterd-handler.c:1417:glusterd_op_stage_send_resp] 0-glusterd: Responded to stage, ret: 0
[2012-07-05 23:37:15.492447] I [glusterd-handler.c:542:glusterd_req_ctx_create] 0-glusterd: Received op from uuid: adfa231a-d8e0-4d6b-bc11-ad29b987ace4
[2012-07-05 23:42:50.244253] E [glusterd-utils.c:277:glusterd_lock] 0-glusterd: Unable to get lock for uuid: 92d50993-15a5-42af-92f3-a6ff7cfddd43, lock held by: adfa231a-d8e0-4d6b-bc11-ad29b987ace4
[2012-07-05 23:42:50.244303] E [glusterd-handler.c:453:glusterd_op_txn_begin] 0-management: Unable to acquire local lock, ret: -1
[2012-07-05 23:42:57.927386] I [glusterd-handler.c:2646:glusterd_handle_status_volume] 0-management: Received status volume req for volume new
[2012-07-05 23:42:57.927436] E [glusterd-utils.c:277:glusterd_lock] 0-glusterd: Unable to get lock for uuid: 92d50993-15a5-42af-92f3-a6ff7cfddd43, lock held by: adfa231a-d8e0-4d6b-bc11-ad29b987ace4
[2012-07-05 23:42:57.927457] E [glusterd-handler.c:453:glusterd_op_txn_begin] 0-management: Unable to acquire local lock, ret: -1

Version-Release number of selected component (if applicable):

How reproducible:

Steps to Reproduce:
1.
2.
3.

Actual results:

Expected results:

Additional info:
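For readers unfamiliar with call_bail, the following is a minimal, self-contained sketch of the frame-timeout idea described above. It is not glusterd source; the names (frame_t, sweep_frames, stage_op_cbk) and the structure are made up for illustration. A periodic sweep walks the outstanding frames and cancels any that have waited longer than the frame-timeout, invoking the frame's callback with a failure so the operation can unwind. The bug reported here is that, on the real code path, the bailed-out brick-op result was not propagated back to the source glusterd, so the cluster lock it held stayed held.

/* Illustrative sketch only; compile with: cc -std=c99 sweep.c */
#include <stdio.h>
#include <time.h>

#define FRAME_TIMEOUT 1800          /* old default: 30 minutes */

typedef struct {
    int    in_use;                  /* frame still waiting for a reply */
    time_t sent_at;                 /* when the brick-op request went out */
    void (*cbk)(int op_ret);        /* completion callback (reply or bail) */
} frame_t;

static void sweep_frames(frame_t *frames, int nframes, time_t now)
{
    for (int i = 0; i < nframes; i++) {
        if (!frames[i].in_use)
            continue;
        if (now - frames[i].sent_at < FRAME_TIMEOUT)
            continue;
        /* Bail out: report the call as failed so the op state machine can
         * unwind, answer the originating peer and release the cluster lock. */
        frames[i].in_use = 0;
        frames[i].cbk(-1);
    }
}

static void stage_op_cbk(int op_ret)
{
    printf("brick op returned %d -> unwind and release cluster lock\n", op_ret);
}

int main(void)
{
    frame_t frames[1] = {
        { .in_use = 1, .sent_at = time(NULL) - 2000, .cbk = stage_op_cbk },
    };
    sweep_frames(frames, 1, time(NULL));   /* frame is 2000 s old -> bailed */
    return 0;
}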
CHANGE: http://review.gluster.com/3803 (rpc: Reduce frame-timeout for glusterd connections) merged in master by Vijay Bellur (vbellur)
CHANGE: http://review.gluster.com/3812 (rpc: Reduce frame-timeout for glusterd connections) merged in release-3.3 by Vijay Bellur (vbellur)
These are some comments on this problem and why it was solved this way.

Gluster uses different timeouts to make sure that a hung operation, or an operation taking a long time to complete, doesn't block other operations forever. The default timeout used to be 30 minutes for glusterd and 2 minutes for the cli. This meant that the cli would stop waiting for glusterd to reply and exit after 2 minutes, whereas glusterd would keep waiting for 30 minutes before cancelling the operation.

Glusterd uses cluster-wide locks to coordinate with other glusterds and keep consistency. Most volume operations use these locks: they are acquired at the beginning of the operation and released once the operation finishes. This leads to problems with volume operations that take a long time to complete (like some of the 'volume status' operations). The cli would time out after 2 minutes, but glusterd would continue to hold the locks until the operation completed or the 30-minute timeout was reached (call_bail happens on timeout in glusterd). This left a potentially long window of about 28 minutes before any other operation could be initiated from the cli; most cli commands in this window fail because glusterd is unable to acquire the cluster lock.

This is what happened in the bug report above. The truncated logs don't give the full picture: going through the full logs, we found that normal operations resumed after 30 minutes, since a call_bail leads to the locks being released, and operations were only failing within this 30-minute window. The failure to obtain locks is caused by concurrent execution of commands from different peers, not by call_bail as stated in the bug title.

The fix is to reduce the glusterd frame-timeout, so that the window is shorter and there is less waiting time before volume operations can succeed again.
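To make the effect concrete, here is a tiny sketch of the timing-window arithmetic. The 2-minute cli timeout and the old 30-minute glusterd frame-timeout are the figures quoted above; the reduced frame-timeout value used below is only an assumed example for illustration, not necessarily the exact value the merged change uses.

/* Rough model of the lock-contention window; assumed values noted inline. */
#include <stdio.h>

int main(void)
{
    int cli_timeout       = 2 * 60;    /* cli gives up and exits             */
    int old_frame_timeout = 30 * 60;   /* glusterd holds the cluster lock
                                          until call_bail releases it        */
    int new_frame_timeout = 10 * 60;   /* assumed reduced value, for example */

    /* Other volume operations fail between the cli timing out and the
     * cluster lock being released by call_bail. */
    printf("old window: %d s\n", old_frame_timeout - cli_timeout);  /* 1680 s, ~28 min */
    printf("new window: %d s\n", new_frame_timeout - cli_timeout);  /*  480 s,  ~8 min */
    return 0;
}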