Description of problem:
=======================
Currently, when "gluster volume stop" is performed, it contacts the glusterd instances running on the participating nodes, which shut down their local glusterfsd brick processes, and the volume is marked offline. This design has a flaw in the following scenario:

Consider a 4-node cluster forming a 2*2 distributed-replicate volume with one brick on each node. The volume is mounted on a client (FUSE) and is accessible; reads and writes from the client succeed. Now stop glusterd on 2 of the nodes (one from each replica pair). Reads and writes from the client still succeed. Now stop the volume from a node whose glusterd is still online; the volume is marked stopped. However, reads and writes from the client continue to succeed, because the glusterfsd processes on the nodes where glusterd was brought down are still online, and "gluster volume stop" did not account for them.

So, from the user's point of view, the volume is stopped and glusterd is down on a few nodes, yet the FUSE mount can still be read from and written to.

"gluster volume stop" should handle this gracefully. One possible solution is to distinguish between "stop" and "stop force" so the user knows the client may still have access:

1. When "gluster volume stop" is issued, if glusterd is down on any node hosting a brick of the volume, fail the volume stop with a proper message.
2. When "gluster volume stop force" is issued, explicitly inform the user that the volume will be marked stopped but some glusterfsd processes on the down nodes may still be online and serving the mount, and ask "Do you wish to continue?".

The exact solution is debatable, but the user certainly needs to be told how the volume can remain accessible even though "gluster volume info" shows it as stopped.

Version-Release number of selected component (if applicable):
=============================================================
glusterfs-3.4.1.1.snap.feb17.2014git-1.el6.x86_64

Found with a snapshot build, but the issue exists in general.

How reproducible:
=================
1/1

Steps to Reproduce:
1. Create and start a 2*2 volume.
2. Mount the volume (FUSE).
3. Create some data from the client.
4. Bring down glusterd on 2 nodes (node 2 and node 4).
5. Stop the volume from node 1.
6. Check "gluster volume info"; the volume should show as stopped.
7. Access the volume from the client.
(A command transcript illustrating these steps follows the report.)

Actual results:
===============
The volume is still accessible: only the glusterfsd processes on node 1 and node 3 are down, while the glusterfsd processes on node 2 and node 4 remain online.

Expected results:
=================
The user needs to be informed how the volume can remain accessible even though "gluster volume info" shows it as stopped.
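For reference, here is a minimal command transcript of the reproduction, assuming hostnames node1-node4, a brick path of /bricks/brick1 on each node, a volume named testvol, and a client mount point of /mnt/testvol (all hypothetical names; adjust to the actual setup):

    # 1. Create and start the 2*2 volume (from node1); with this brick
    #    order the replica pairs are (node1,node2) and (node3,node4)
    gluster volume create testvol replica 2 \
        node1:/bricks/brick1 node2:/bricks/brick1 \
        node3:/bricks/brick1 node4:/bricks/brick1
    gluster volume start testvol

    # 2-3. Mount via FUSE on the client and create some data
    mount -t glusterfs node1:/testvol /mnt/testvol
    cp -r /etc /mnt/testvol/data

    # 4. On node2 and node4 (one from each replica pair), bring down
    #    glusterd; the glusterfsd brick processes keep running
    service glusterd stop

    # 5-6. Stop the volume from node1 and check its status
    gluster --mode=script volume stop testvol
    gluster volume info testvol     # Status: Stopped

    # 7. The FUSE mount is still readable and writable, served by the
    #    glusterfsd processes left running on node2 and node4
    ls /mnt/testvol/data
    touch /mnt/testvol/still-writable

--mode=script suppresses the interactive "Do you want to continue?" prompt that "gluster volume stop" would otherwise show. In this state, the leftover brick processes on node2 and node4 would have to be found (e.g. with pgrep -f glusterfsd) and killed by hand to actually take the volume offline.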
With GlusterD 2.0 this problem will go away, since the transaction will be based on a central store; the transaction itself will fail if the glusterd instance on any node hosting one of these bricks is down.
Based on comment 2, closing this bug.