+++ This bug was initially created as a clone of Bug #1306667 +++

Description of problem:
=======================
On a four-node cluster, I created a Distributed volume using one brick, enabled server quorum, and set the server quorum ratio to 90. I then stopped glusterd on one of the nodes so that server quorum was not met for that ratio, and started the volume. The volume started and its bricks came online.

Version-Release number of selected component (if applicable):
=============================================================
glusterfs-3.7.5-19.el7rhgs

How reproducible:
=================
Every time

Steps to Reproduce:
===================
1. Have a 4-node cluster (node-1..4)
2. Create a simple distributed volume using one brick
3. Enable server quorum
4. Set the server quorum ratio to 90
5. Stop glusterd on one of the nodes (e.g. node-4)
6. Try to start the volume now // it starts and the bricks come online

Actual results:
===============
Bricks are online when server quorum is not met

Expected results:
=================
Bricks should be offline when server quorum is not met

Additional info:

--- Additional comment from Red Hat Bugzilla Rules Engine on 2016-02-11 09:41:38 EST ---

This bug is automatically being proposed for the current z-stream release of Red Hat Gluster Storage 3 by setting the release flag 'rhgs-3.1.z' to '?'.

If this bug should be proposed for a different release, please manually change the proposed release flag.
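
The quorum condition in the description can be checked with a little arithmetic: with four nodes and a ratio of 90, stopping glusterd on one node leaves 75% of the servers active, which is below the threshold. The sketch below is not glusterd code. It assumes quorum is met when the number of active servers reaches the ceiling of total * ratio / 100, and `required_servers` and `quorum_met` are hypothetical names used only for illustration.

```python
def required_servers(total_servers: int, ratio_percent: int) -> int:
    """Smallest active-server count that satisfies the ratio.

    Assumption: the threshold is ceil(total_servers * ratio_percent / 100).
    Integer arithmetic is used to avoid floating-point rounding.
    """
    return (total_servers * ratio_percent + 99) // 100


def quorum_met(active_servers: int, total_servers: int, ratio_percent: int) -> bool:
    """Return True when enough servers are active under the assumed rule."""
    return active_servers >= required_servers(total_servers, ratio_percent)


if __name__ == "__main__":
    # Scenario from the report: 4 nodes, ratio 90, glusterd stopped on one node.
    print(required_servers(4, 90))  # servers needed for quorum
    print(quorum_met(3, 4, 90))     # quorum state with one node down
    print(quorum_met(4, 4, 90))     # quorum state with all nodes up
```

Under this assumption a ratio of 90 on a four-node cluster requires all four servers, so a single stopped glusterd is enough to lose quorum.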

--- Additional comment from Byreddy on 2016-02-11 09:43:57 EST ---

Console log for reference
==========================
[root@dhcp42-67 ~]# gluster peer status
Number of Peers: 3

Hostname: 10.70.43.107
Uuid: df468eed-713c-46b2-8136-81d9f7835c0a
State: Peer in Cluster (Connected)

Hostname: 10.70.42.185
Uuid: 6ec3558a-3b11-469d-b4d6-6f2e516a2706
State: Peer in Cluster (Connected)

Hostname: 10.70.42.62
Uuid: d520b270-dd3b-4cc7-a1e4-f7be5cf4677b
State: Peer in Cluster (Connected)

[root@dhcp42-67 ~]# gluster volume create Dis 10.70.42.67:/bricks/brick0/az0
volume create: Dis: success: please start the volume to access data

[root@dhcp42-67 ~]# gluster volume set Dis cluster.server-quorum-type server
volume set: success

[root@dhcp42-67 ~]# gluster volume info

Volume Name: Dis
Type: Distribute
Volume ID: 94443936-9265-4646-b666-64fafcb01e1d
Status: Created
Number of Bricks: 1
Transport-type: tcp
Bricks:
Brick1: 10.70.42.67:/bricks/brick0/az0
Options Reconfigured:
cluster.server-quorum-type: server
performance.readdir-ahead: on
cluster.server-quorum-ratio: 90

[root@dhcp42-67 ~]# gluster peer status
Number of Peers: 3

Hostname: 10.70.43.107
Uuid: df468eed-713c-46b2-8136-81d9f7835c0a
State: Peer in Cluster (Connected)

Hostname: 10.70.42.185
Uuid: 6ec3558a-3b11-469d-b4d6-6f2e516a2706
State: Peer in Cluster (Connected)

Hostname: 10.70.42.62
Uuid: d520b270-dd3b-4cc7-a1e4-f7be5cf4677b
State: Peer in Cluster (Disconnected)

[root@dhcp42-67 ~]# gluster volume start Dis
volume start: Dis: success

[root@dhcp42-67 ~]# gluster volume status
Status of volume: Dis
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick 10.70.42.67:/bricks/brick0/az0        49213     0          Y       14586
NFS Server on localhost                     2049      0          Y       14606
NFS Server on 10.70.43.107                  2049      0          Y       31819
NFS Server on 10.70.42.185                  2049      0          Y       14282

Task Status of Volume Dis
------------------------------------------------------------------------------
There are no active volume tasks

[root@dhcp42-67 ~]# gluster volume stop Dis
Stopping volume will make its data inaccessible. Do you want to continue? (y/n) y
volume stop: Dis: failed: Quorum not met. Volume operation not allowed.
[root@dhcp42-67 ~]#

--- Additional comment from Atin Mukherjee on 2016-02-11 10:59:00 EST ---

Gaurav,

Could you please check this issue?

~Atin

--- Additional comment from Atin Mukherjee on 2016-02-11 12:59:04 EST ---

This looks like a regression caused by http://review.gluster.org/12718

--- Additional comment from Byreddy on 2016-02-12 00:28:00 EST ---

Yes Atin, this is a regression. I just verified the scenario using the 3.1.1 build, and there it works as expected.

Below is the console log:

[root@dhcp42-67 ~]# gluster peer status
Number of Peers: 3

Hostname: 10.70.43.107
Uuid: 74da4065-4c5a-4e8d-ba69-8d1bf3ae3e8b
State: Peer in Cluster (Connected)

Hostname: 10.70.42.185
Uuid: 7cb8e3da-e56b-488f-a84c-afa74b2ddda0
State: Peer in Cluster (Connected)

Hostname: 10.70.42.62
Uuid: 32b464a0-1c93-46a3-ad76-42a150266d42
State: Peer in Cluster (Connected)

[root@dhcp42-67 ~]# gluster volume set all cluster.server-quorum-ratio 90
volume set: success

[root@dhcp42-67 ~]# gluster volume create Dis 10.70.42.67:/bricks/brick0/aj0
volume create: Dis: success: please start the volume to access data

[root@dhcp42-67 ~]# gluster volume set Dis cluster.server-quorum-type server
volume set: success

[root@dhcp42-67 ~]# gluster volume info

Volume Name: Dis
Type: Distribute
Volume ID: bfbc78ce-4303-4743-839d-ffe5dfca4863
Status: Created
Number of Bricks: 1
Transport-type: tcp
Bricks:
Brick1: 10.70.42.67:/bricks/brick0/aj0
Options Reconfigured:
cluster.server-quorum-type: server
performance.readdir-ahead: on
cluster.server-quorum-ratio: 90

[root@dhcp42-67 ~]# gluster peer status
Number of Peers: 3

Hostname: 10.70.43.107
Uuid: 74da4065-4c5a-4e8d-ba69-8d1bf3ae3e8b
State: Peer in Cluster (Connected)

Hostname: 10.70.42.185
Uuid: 7cb8e3da-e56b-488f-a84c-afa74b2ddda0
State: Peer in Cluster (Connected)

Hostname: 10.70.42.62
Uuid: 32b464a0-1c93-46a3-ad76-42a150266d42
State: Peer in Cluster (Disconnected)

[root@dhcp42-67 ~]# gluster volume start Dis
volume start: Dis: failed: Quorum not met. Volume operation not allowed.

[root@dhcp42-67 ~]# rpm -qa |grep gluster
glusterfs-client-xlators-3.7.1-16.el7rhgs.x86_64
glusterfs-rdma-3.7.1-16.el7rhgs.x86_64
glusterfs-libs-3.7.1-16.el7rhgs.x86_64
glusterfs-3.7.1-16.el7rhgs.x86_64
glusterfs-api-3.7.1-16.el7rhgs.x86_64
glusterfs-fuse-3.7.1-16.el7rhgs.x86_64
glusterfs-cli-3.7.1-16.el7rhgs.x86_64
nfs-ganesha-gluster-2.2.0-9.el7rhgs.x86_64
glusterfs-geo-replication-3.7.1-16.el7rhgs.x86_64
glusterfs-server-3.7.1-16.el7rhgs.x86_64
glusterfs-ganesha-3.7.1-16.el7rhgs.x86_64
[root@dhcp42-67 ~]#

--- Additional comment from Byreddy on 2016-02-12 00:29:11 EST ---

Marking this bug as a regression based on the above details.

--- Additional comment from RHEL Product and Program Management on 2016-02-12 00:32:29 EST ---

This bug report has Keywords: Regression or TestBlocker.

Since no regressions or test blockers are allowed between releases, it is also being identified as a blocker for this release.

Please resolve ASAP.

--- Additional comment from Atin Mukherjee on 2016-02-12 02:14:44 EST ---

Although it's a regression, and starting a volume when quorum is not met can increase the probability of data split brains, historically server side quorum doesn't guarantee that split brains can never happen. Also, enabling server side quorum is nowhere documented as a recommendation, so I doubt customers use it in production. Considering all these parameters and the stage we are in for the release, my vote would be to mark it as a known issue.

--- Additional comment from Laura Bailey on 2016-02-14 21:55:02 EST ---

REVIEW: http://review.gluster.org/13442 (glusterd: volume should not start when server quorum is not met) posted (#1) for review on master by Gaurav Kumar Garg (ggarg)
REVIEW: http://review.gluster.org/13442 (glusterd: volume should not start when server quorum is not met) posted (#2) for review on master by Gaurav Kumar Garg (ggarg)
REVIEW: http://review.gluster.org/13442 (glusterd: volume should not start when server quorum is not met) posted (#3) for review on master by Gaurav Kumar Garg (ggarg)
COMMIT: http://review.gluster.org/13442 committed in master by Atin Mukherjee (amukherj)
------
commit 62db11fa017004aa6cb1d91ec6b0117ac3e96a13
Author: Gaurav Kumar Garg <garg.gaurav52>
Date:   Mon Feb 15 10:48:18 2016 +0530

    glusterd: volume should not start when server quorum is not met

    Currently when server quorum is not met then upon executing
    # gluster volume start [force] command its starting the volume.

    With this patch if server side quorum is not met then it will
    prevent starting of the volume.

    Change-Id: I39734b2dcf8e90c3c68bf2762d8350aecc82cc38
    BUG: 1308402
    Signed-off-by: Gaurav Kumar Garg <ggarg>
    Reviewed-on: http://review.gluster.org/13442
    Smoke: Gluster Build System <jenkins.com>
    Reviewed-by: Atin Mukherjee <amukherj>
    CentOS-regression: Gluster Build System <jenkins.com>
    NetBSD-regression: NetBSD Build System <jenkins.org>
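
The behaviour described in the commit message can be modelled roughly as a pre-check that runs before the volume is started. This is a minimal sketch, not the actual patch: `start_volume`, `quorum_met`, and `QuorumNotMetError` are hypothetical names, the quorum rule (active * 100 >= total * ratio) is an assumption, and the sketch further assumes the check applies only when cluster.server-quorum-type is "server". The error and success strings are the ones shown in the console logs of this report.

```python
from typing import Optional


class QuorumNotMetError(Exception):
    """Raised when a volume operation is refused for lack of server quorum."""


def quorum_met(active: int, total: int, ratio_percent: int) -> bool:
    # Assumed rule: the share of active servers must reach the ratio.
    return active * 100 >= total * ratio_percent


def start_volume(name: str, quorum_type: Optional[str], active: int,
                 total: int, ratio_percent: int, force: bool = False) -> str:
    """Hypothetical model of 'gluster volume start <name> [force]'."""
    # 'force' is accepted but deliberately does not bypass the quorum check,
    # following the commit message, which names "volume start [force]".
    if quorum_type == "server" and not quorum_met(active, total, ratio_percent):
        raise QuorumNotMetError("Quorum not met. Volume operation not allowed.")
    return f"volume start: {name}: success"


if __name__ == "__main__":
    try:
        # Scenario from the report: one of four nodes is down, ratio is 90.
        print(start_volume("Dis", "server", active=3, total=4, ratio_percent=90))
    except QuorumNotMetError as err:
        print(f"volume start: Dis: failed: {err}")
```

Raising an exception instead of returning a status keeps the refusal explicit for callers; the real glusterd code path may be structured differently.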
This bug is getting closed because a release has been made available that should address the reported issue. In case the problem is still not fixed with glusterfs-3.8.0, please open a new bug report.

glusterfs-3.8.0 has been announced on the Gluster mailing lists [1]; packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution.

[1] http://blog.gluster.org/2016/06/glusterfs-3-8-released/
[2] http://thread.gmane.org/gmane.comp.file-systems.gluster.user