Description of problem: Created a 4-peer storage pool, then created a distributed-replicate volume using all 4 nodes. Next, set nfs.mem-factor to 20; it worked. Then killed glusterd, glusterfsd, and glusterfs on one of the peers and ran gluster peer status from another peer, which showed that peer as disconnected. Started glusterd again and the peer came back to the connected state. Now running gluster volume set commands fails.

Here are the glusterd logs.

On the machine where glusterd was killed:

[2012-03-05 06:35:40.817825] I [glusterd-utils.c:796:glusterd_volume_brickinfo_get] 0-management: Found brick
[2012-03-05 06:35:40.819475] I [glusterd-handler.c:1413:glusterd_op_stage_send_resp] 0-glusterd: Responded to stage, ret: 0
[2012-03-05 06:35:59.548733] E [glusterd-utils.c:259:glusterd_lock] 0-glusterd: Unable to get lock for uuid: 2cfd617e-b00d-4802-82c1-fa5e7bce230d, lock held by: b0e23374-0cf8-4bd2-8072-9c8467efc762

On the machine from which all the set operations are done (basically the management node):

[2012-03-05 06:35:41.364249] I [glusterd-op-sm.c:1728:glusterd_op_ac_send_stage_op] 0-glusterd: Sent op req to 3 peers
[2012-03-05 06:35:41.373872] I [glusterd-rpc-ops.c:870:glusterd3_1_stage_op_cbk] 0-glusterd: Received ACC from uuid: 99d424fb-8e79-45e6-9a7f-d41967dccc98
[2012-03-05 06:35:41.374057] I [glusterd-rpc-ops.c:870:glusterd3_1_stage_op_cbk] 0-glusterd: Received ACC from uuid: 2cfd617e-b00d-4802-82c1-fa5e7bce230d
[2012-03-05 06:39:50.885653] E [glusterd-utils.c:259:glusterd_lock] 0-glusterd: Unable to get lock for uuid: b0e23374-0cf8-4bd2-8072-9c8467efc762, lock held by: b0e23374-0cf8-4bd2-8072-9c8467efc762
[2012-03-05 06:39:50.885678] E [glusterd-handler.c:448:glusterd_op_txn_begin] 0-management: Unable to acquire local lock, ret: -1
[2012-03-05 06:35:59.548753] E [glusterd-handler.c:448:glusterd_op_txn_begin] 0-management: Unable to acquire local lock, ret: -1

Version-Release number of selected component (if applicable):

gluster volume info output:

Volume Name: testvol
Type: Distributed-Replicate
Volume ID: 9987eaa6-e538-4e80-a0ab-106381c2a4d4
Status: Started
Number of Bricks: 2 x 2 = 4
Transport-type: tcp
Bricks:
Brick1: 10.1.11.151:/export/brick
Brick2: 10.1.11.152:/export/brick
Brick3: 10.1.11.153:/export/brick
Brick4: 10.1.11.154:/export/brick
Options Reconfigured:
nfs.mem-factor: 20

gluster peer status output:

Number of Peers: 3

Hostname: 10.1.11.152
Uuid: 99d424fb-8e79-45e6-9a7f-d41967dccc98
State: Peer in Cluster (Connected)

Hostname: 10.1.11.153
Uuid: 2cfd617e-b00d-4802-82c1-fa5e7bce230d
State: Peer in Cluster (Connected)

Hostname: 10.1.11.154
Uuid: 6b9483ba-ef12-41a5-a8f5-48ab90eb3085
State: Peer in Cluster (Connected)
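The reproduction steps described above can be sketched as the following shell session. This is an assumption about the exact command sequence, reconstructed from the description and the volume info (hostnames and brick paths are taken from the outputs; which peer was killed and the service-management commands used are guesses):

```shell
# Sketch of the reproduction steps; exact commands are assumptions.
# Run from 10.1.11.151: build the 4-peer storage pool.
gluster peer probe 10.1.11.152
gluster peer probe 10.1.11.153
gluster peer probe 10.1.11.154

# Create and start a 2 x 2 distributed-replicate volume on all 4 nodes.
gluster volume create testvol replica 2 \
    10.1.11.151:/export/brick 10.1.11.152:/export/brick \
    10.1.11.153:/export/brick 10.1.11.154:/export/brick
gluster volume start testvol

# This set operation succeeds.
gluster volume set testvol nfs.mem-factor 20

# On one of the peers (e.g. 10.1.11.153, an assumption): kill all
# gluster daemons.
pkill glusterd; pkill glusterfsd; pkill glusterfs

# From another peer: the killed peer now shows as Disconnected.
gluster peer status

# Restart glusterd on the killed peer; it returns to Connected.
glusterd

# Subsequent set operations now fail with
# "Unable to acquire local lock, ret: -1".
gluster volume set testvol nfs.mem-factor 20
```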
This doesn't happen on master right now. Rahul, can you confirm?
Cannot reproduce this on the latest master. Closing as works for me.