Description of problem:
If volume info is changed while one of the nodes in the storage pool is down, the changes are not updated on that node when its glusterd is restarted.

Version-Release number of selected component (if applicable):
mainline

How reproducible:
often

Steps to Reproduce:
1. gluster volume create <vol1> <brick1> <brick2>
2. gluster volume start <vol1>
3. Bring down brick2 and stop glusterd on node2.
4. gluster volume stop <vol1> from node1
5. gluster volume delete <vol1> from node1
6. gluster volume create <vol1> <brick1> from node1
7. gluster volume start <vol1> from node1
8. Restart glusterd on node2.
9. gluster volume stop <vol1> from node1

Actual results:
Step 9 fails with: "Volume dist is not in the started state"

Expected results:
The changed volume info should be updated on node2 when its glusterd restarts, so that step 9 succeeds.

Additional info:

Brick1 Output:
--------------
[root@APP-SERVER1 ~]# gluster volume info dist

Volume Name: dist
Type: Distribute
Volume ID: 19957843-a792-4aa9-9eb9-391c658a4c50
Status: Started
Number of Bricks: 1
Transport-type: tcp
Bricks:
Brick1: 192.168.2.35:/export1

[root@APP-SERVER1 ~]# gluster volume stop dist
Stopping volume will make its data inaccessible. Do you want to continue? (y/n) y
Volume dist is not in the started state

[root@APP-SERVER1 ~]# ps -ef | grep gluster
root 15681 1 0 15:46 ? 00:00:10 glusterd --xlator-option *.brick-with-valgrind=yes
root 16119 1 0 17:27 ? 00:00:01 valgrind --leak-check=full --trace-children=yes --log-file=/usr/local/var/log/glusterfs/bricks/valgrnd-dist-export1.log /usr/local/sbin/glusterfsd -s localhost --volfile-id dist.192.168.2.35.export1 -p /etc/glusterd/vols/dist/run/192.168.2.35-export1.pid -S /tmp/3afd5d59b69937e399360b0ffdc33f49.socket --brick-name /export1 -l /usr/local/var/log/glusterfs/bricks/export1.log --xlator-option *-posix.glusterd-uuid=86ac1931-86d1-4fbd-a4b0-1ba0c71c5fd1 --brick-port 24013 --xlator-option dist-server.listen-port=24013
root 16200 1 0 17:30 ? 00:00:02 valgrind --leak-check=full --trace-children=yes --log-file=/usr/local/var/log/glusterfs/valgrnd-nfs.log /usr/local/sbin/glusterfs -f /etc/glusterd/nfs/nfs-server.vol -p /etc/glusterd/nfs/run/nfs.pid -l /usr/local/var/log/glusterfs/nfs.log
root 16261 16236 0 17:33 pts/1 00:00:00 tail -f /usr/local/var/log/glusterfs/usr-local-etc-glusterfs-glusterd.vol.log

Brick2 Output:
--------------
[root@APP-SERVER2 ~]# gluster volume info dist

Volume Name: dist
Type: Distribute
Volume ID: 5cbb49f6-c6bf-42ca-aa39-8ac9cad1bc3a
Status: Stopped
Number of Bricks: 2
Transport-type: tcp
Bricks:
Brick1: 192.168.2.36:/export1
Brick2: 192.168.2.35:/export1
Options Reconfigured:
diagnostics.brick-log-level: DEBUG
diagnostics.client-log-level: DEBUG
cluster.data-self-heal-algorithm: full

Note the mismatch: node1 reports the re-created single-brick volume (Volume ID 19957843-...), while node2 still holds the deleted two-brick definition (Volume ID 5cbb49f6-...), with stale status and options.

Brick1 glusterd log:
--------------------
[2012-02-27 17:34:35.757930] I [glusterd-volume-ops.c:353:glusterd_handle_cli_stop_volume] 0-glusterd: Received stop vol reqfor volume dist
[2012-02-27 17:34:35.758044] I [glusterd-utils.c:267:glusterd_lock] 0-glusterd: Cluster lock held by 86ac1931-86d1-4fbd-a4b0-1ba0c71c5fd1
[2012-02-27 17:34:35.758074] I [glusterd-handler.c:453:glusterd_op_txn_begin] 0-management: Acquired local lock
[2012-02-27 17:34:35.758704] I [glusterd-rpc-ops.c:541:glusterd3_1_cluster_lock_cbk] 0-glusterd: Received ACC from uuid: 24728c9b-a248-40d0-9ec8-9941748cd2f4
[2012-02-27 17:34:35.758879] I [glusterd-op-sm.c:1725:glusterd_op_ac_send_stage_op] 0-glusterd: Sent op req to 1 peers
[2012-02-27 17:34:35.759370] I [glusterd-rpc-ops.c:870:glusterd3_1_stage_op_cbk] 0-glusterd: Received RJT from uuid: 24728c9b-a248-40d0-9ec8-9941748cd2f4
[2012-02-27 17:34:35.759757] I [glusterd-rpc-ops.c:600:glusterd3_1_cluster_unlock_cbk] 0-glusterd: Received ACC from uuid: 24728c9b-a248-40d0-9ec8-9941748cd2f4
[2012-02-27 17:34:35.759808] I [glusterd-op-sm.c:2107:glusterd_op_txn_complete] 0-glusterd: Cleared local lock

Brick2 glusterd log:
--------------------
[2012-02-27 17:34:35.758394] I [glusterd-handler.c:492:glusterd_handle_cluster_lock] 0-glusterd: Received LOCK from uuid: 86ac1931-86d1-4fbd-a4b0-1ba0c71c5fd1
[2012-02-27 17:34:35.758482] I [glusterd-utils.c:267:glusterd_lock] 0-glusterd: Cluster lock held by 86ac1931-86d1-4fbd-a4b0-1ba0c71c5fd1
[2012-02-27 17:34:35.758618] I [glusterd-handler.c:1311:glusterd_op_lock_send_resp] 0-glusterd: Responded, ret: 0
[2012-02-27 17:34:35.759059] I [glusterd-handler.c:537:glusterd_req_ctx_create] 0-glusterd: Received op from uuid: 86ac1931-86d1-4fbd-a4b0-1ba0c71c5fd1
[2012-02-27 17:34:35.759133] E [glusterd-volume-ops.c:884:glusterd_op_stage_stop_volume] 0-: Volume dist has not been started
[2012-02-27 17:34:35.759199] E [glusterd-op-sm.c:2170:glusterd_op_ac_stage_op] 0-: Validate failed: -1
[2012-02-27 17:34:35.759295] I [glusterd-handler.c:1413:glusterd_op_stage_send_resp] 0-glusterd: Responded to stage, ret: 0
[2012-02-27 17:34:35.759583] I [glusterd-handler.c:1355:glusterd_handle_cluster_unlock] 0-glusterd: Received UNLOCK from uuid: 86ac1931-86d1-4fbd-a4b0-1ba0c71c5fd1
[2012-02-27 17:34:35.759670] I [glusterd-handler.c:1331:glusterd_op_unlock_send_resp] 0-glusterd: Responded to unlock, ret: 0
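The mismatched Volume IDs above are the quickest symptom to check for. A minimal cross-check sketch, assuming passwordless root ssh to both nodes (192.168.2.35 is node1/APP-SERVER1, 192.168.2.36 is node2/APP-SERVER2):

for h in 192.168.2.35 192.168.2.36; do
    printf '%s: ' "$h"
    ssh "root@$h" "gluster volume info dist" | grep '^Volume ID'
done

On a healthy pool every host prints the same UUID; here 192.168.2.36 would print 5cbb49f6-c6bf-42ca-aa39-8ac9cad1bc3a against node1's 19957843-a792-4aa9-9eb9-391c658a4c50.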
CHANGE: http://review.gluster.com/3083 (glusterd: Added volume-id to 'op' dictionary) merged in master by Vijay Bellur (vijay)
Verified on "master". Works fine.
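For reference, a re-verification sketch of the reproduction steps on a fixed build; the hostnames, the /export1 brick paths, and starting/stopping glusterd by hand are taken from this report's setup and may differ elsewhere (answer 'y' at the stop/delete confirmation prompts):

# node1 (192.168.2.35):
gluster volume create dist 192.168.2.36:/export1 192.168.2.35:/export1
gluster volume start dist

# node2 (192.168.2.36): take the peer down, as in step 3
killall glusterd

# node1: change the volume while node2 is away (steps 4-7)
gluster volume stop dist
gluster volume delete dist
gluster volume create dist 192.168.2.35:/export1
gluster volume start dist

# node2: restart the daemon (step 8)
glusterd

# node1: used to fail with "Volume dist is not in the started state";
# with the fix the stop should succeed, and both nodes should then
# report the same single-brick definition and Volume ID
gluster volume stop dist
gluster volume info dist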