Bug 799910

Summary: [7fec9b41d8e1befa8d58a76d98207debddd60b65]: volume set options not working after a glusterd stop/start.
Product: [Community] GlusterFS
Reporter: Rahul C S <rahulcs>
Component: glusterd
Assignee: Kaushal <kaushal>
Status: CLOSED WORKSFORME
QA Contact:
Severity: high
Docs Contact:
Priority: medium
Version: pre-release
CC: gluster-bugs, vinaraya
Target Milestone: ---
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2012-04-16 09:19:19 EDT
Type: ---
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---

Description Rahul C S 2012-03-05 06:56:45 EST
Description of problem:
Created a 4-peer storage pool, then created a distributed-replicate volume using all 4 nodes. Next, set nfs.mem-factor to 20. It worked.

Then killed glusterd, glusterfsd and glusterfs on one of the peers and ran gluster peer status from another peer. It showed the peer as disconnected.

Next, started glusterd again. The peer came back to the connected state. Now running gluster volume set commands fails. Here are the glusterd logs:

On the machine where glusterd was killed:
[2012-03-05 06:35:40.817825] I [glusterd-utils.c:796:glusterd_volume_brickinfo_get] 0-management: Found brick
[2012-03-05 06:35:40.819475] I [glusterd-handler.c:1413:glusterd_op_stage_send_resp] 0-glusterd: Responded to stage, ret: 0
[2012-03-05 06:35:59.548733] E [glusterd-utils.c:259:glusterd_lock] 0-glusterd: Unable to get lock for uuid: 2cfd617e-b00d-4802-82c1-fa5e7bce230d, lock held by: b0e23374-0cf8-4bd2-8072-9c8467efc762

On the machine from which all the set operations are done (basically the management node):
[2012-03-05 06:35:41.364249] I [glusterd-op-sm.c:1728:glusterd_op_ac_send_stage_op] 0-glusterd: Sent op req to 3 peers
[2012-03-05 06:35:41.373872] I [glusterd-rpc-ops.c:870:glusterd3_1_stage_op_cbk] 0-glusterd: Received ACC from uuid: 99d424fb-8e79-45e6-9a7f-d41967dccc98
[2012-03-05 06:35:41.374057] I [glusterd-rpc-ops.c:870:glusterd3_1_stage_op_cbk] 0-glusterd: Received ACC from uuid: 2cfd617e-b00d-4802-82c1-fa5e7bce230d
[2012-03-05 06:39:50.885653] E [glusterd-utils.c:259:glusterd_lock] 0-glusterd: Unable to get lock for uuid: b0e23374-0cf8-4bd2-8072-9c8467efc762, lock held by: b0e23374-0cf8-4bd2-8072-9c8467efc762
[2012-03-05 06:39:50.885678] E [glusterd-handler.c:448:glusterd_op_txn_begin] 0-management: Unable to acquire local lock, ret: -1
[2012-03-05 06:35:59.548753] E [glusterd-handler.c:448:glusterd_op_txn_begin] 0-management: Unable to acquire local lock, ret: -1
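Notably, in the last lock error the requesting uuid and the holding uuid are the same (b0e23374-...): the management node cannot re-acquire a lock it itself already holds, suggesting a transaction lock that was never released when the peer went down mid-transaction. A minimal Python model of that behavior (an assumption inferred from the log messages, not glusterd's actual implementation):

```python
class ClusterLock:
    """Toy model of a cluster-wide transaction lock tagged with the
    acquiring node's uuid, mirroring glusterd_lock's 0/-1 return codes."""

    def __init__(self):
        self.holder = None  # uuid of the node holding the lock, or None

    def acquire(self, uuid):
        # Acquisition fails if the lock is held -- even by the requester
        # itself, which is exactly what the log lines above show.
        if self.holder is None:
            self.holder = uuid
            return 0
        return -1

    def release(self, uuid):
        # Only the current holder may release the lock.
        if self.holder == uuid:
            self.holder = None
            return 0
        return -1


mgmt = "b0e23374-0cf8-4bd2-8072-9c8467efc762"  # management node uuid from the logs
lock = ClusterLock()

assert lock.acquire(mgmt) == 0   # a transaction begins and takes the lock
# ... a peer dies mid-transaction, so the unlock phase never runs ...
assert lock.acquire(mgmt) == -1  # next 'volume set': "lock held by" == own uuid
assert lock.release(mgmt) == 0   # clearing the stale lock unwedges things
assert lock.acquire(mgmt) == 0
```

If this model is accurate, every subsequent volume set from the management node would fail until the stale lock is cleared (e.g. by restarting glusterd there).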
Version-Release number of selected component (if applicable):


gluster volume info output:
Volume Name: testvol
Type: Distributed-Replicate
Volume ID: 9987eaa6-e538-4e80-a0ab-106381c2a4d4
Status: Started
Number of Bricks: 2 x 2 = 4
Transport-type: tcp
Bricks:
Brick1: 10.1.11.151:/export/brick
Brick2: 10.1.11.152:/export/brick
Brick3: 10.1.11.153:/export/brick
Brick4: 10.1.11.154:/export/brick
Options Reconfigured:
nfs.mem-factor: 20

gluster peer status output:
Number of Peers: 3

Hostname: 10.1.11.152
Uuid: 99d424fb-8e79-45e6-9a7f-d41967dccc98
State: Peer in Cluster (Connected)

Hostname: 10.1.11.153
Uuid: 2cfd617e-b00d-4802-82c1-fa5e7bce230d
State: Peer in Cluster (Connected)

Hostname: 10.1.11.154
Uuid: 6b9483ba-ef12-41a5-a8f5-48ab90eb3085
State: Peer in Cluster (Connected)
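The reproduction steps described above can be sketched as a command sequence (hostnames, volume name, and option value are taken from the outputs above; this assumes glusterd is already running on all four nodes and requires a live cluster to execute):

```
# on the management node (10.1.11.151)
gluster peer probe 10.1.11.152
gluster peer probe 10.1.11.153
gluster peer probe 10.1.11.154
gluster volume create testvol replica 2 \
    10.1.11.151:/export/brick 10.1.11.152:/export/brick \
    10.1.11.153:/export/brick 10.1.11.154:/export/brick
gluster volume start testvol
gluster volume set testvol nfs.mem-factor 20   # succeeds

# on one peer, e.g. 10.1.11.153: kill all gluster daemons
pkill glusterd; pkill glusterfsd; pkill glusterfs

# back on the management node
gluster peer status                            # peer shows Disconnected

# on the killed peer: restart the management daemon
glusterd

# back on the management node
gluster peer status                            # peer shows Connected again
gluster volume set testvol nfs.mem-factor 20   # now fails with the lock error
```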
Comment 1 Kaushal 2012-03-28 02:48:40 EDT
Doesn't happen on master right now. Rahul, can you confirm?
Comment 3 Kaushal 2012-04-16 09:19:19 EDT
Cannot reproduce this on latest master. Closing as works for me.