Bug 861560 - Quorum issues in case of network issues
Status: CLOSED WONTFIX
Product: Red Hat Gluster Storage
Classification: Red Hat
Component: glusterd
Version: 2.0
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: unspecified
Assigned To: Bug Updates Notification Mailing List
QA Contact: storage-qa-internal@redhat.com
Keywords: FutureFeature
Depends On:
Blocks: 1066140
 
Reported: 2012-09-29 03:25 EDT by Sachidananda Urs
Modified: 2015-03-23 03:40 EDT (History)
6 users

See Also:
Fixed In Version:
Doc Type: Enhancement
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed:
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


Attachments: None
Description Sachidananda Urs 2012-09-29 03:25:59 EDT
Description of problem:

In a 2x2 setup with a quorum percentage of 100%, if a network disruption occurs between just two machines, quorum is lost only between those two machines, and they end up killing glusterfsd on their respective nodes. The cluster nevertheless remains operational in a degraded state despite the quorum loss.

For example: Consider the setup


[root@rhs-client19 ~]# gluster volume info
 
Volume Name: quo
Type: Distributed-Replicate
Volume ID: 96852dd0-e8f6-48f8-94e2-ef80e8c70778
Status: Started
Number of Bricks: 2 x 2 = 4
Transport-type: tcp
Bricks:
Brick1: rhs-client19.lab.eng.blr.redhat.com:/home/A
Brick2: rhs-client20.lab.eng.blr.redhat.com:/home/B
Brick3: rhs-client21.lab.eng.blr.redhat.com:/home/C
Brick4: rhs-client23.lab.eng.blr.redhat.com:/home/D
Options Reconfigured:
cluster.server-quorum-type: server
cluster.server-quorum-ratio: 100
global-option-version: 35
[root@rhs-client19 ~]# 


[root@rhs-client19 ~]# gluster peer status
Number of Peers: 3

Hostname: rhs-client20.lab.eng.blr.redhat.com
Uuid: b7f33530-25c1-406c-8c76-2c5feabaf7b0
State: Peer in Cluster (Disconnected)

Hostname: rhs-client21.lab.eng.blr.redhat.com
Uuid: 5b315725-90dd-41f9-abe8-827d27db8210
State: Peer in Cluster (Connected)

Hostname: rhs-client23.lab.eng.blr.redhat.com
Uuid: 230ae9f2-310e-49a6-b9f6-440bb5962da3
State: Peer in Cluster (Connected)
[root@rhs-client19 ~]# 

===
[root@rhs-client20 ~]# gluster peer status
Number of Peers: 3

Hostname: rhs-client21.lab.eng.blr.redhat.com
Uuid: 5b315725-90dd-41f9-abe8-827d27db8210
State: Peer in Cluster (Connected)

Hostname: rhs-client23.lab.eng.blr.redhat.com
Uuid: 230ae9f2-310e-49a6-b9f6-440bb5962da3
State: Peer in Cluster (Connected)

Hostname: 10.70.36.43
Uuid: 772396e0-ccae-4b64-99f9-84f7e836d101
State: Peer in Cluster (Disconnected)
[root@rhs-client20 ~]# 

============
[root@rhs-client21 ~]# gluster peer status
Number of Peers: 3

Hostname: rhs-client23.lab.eng.blr.redhat.com
Uuid: 230ae9f2-310e-49a6-b9f6-440bb5962da3
State: Peer in Cluster (Connected)

Hostname: 10.70.36.43
Uuid: 772396e0-ccae-4b64-99f9-84f7e836d101
State: Peer in Cluster (Connected)

Hostname: rhs-client20.lab.eng.blr.redhat.com
Uuid: b7f33530-25c1-406c-8c76-2c5feabaf7b0
State: Peer in Cluster (Connected)
[root@rhs-client21 ~]# 

===============

[root@rhs-client23 ~]# gluster peer status
Number of Peers: 3

Hostname: rhs-client20.lab.eng.blr.redhat.com
Uuid: b7f33530-25c1-406c-8c76-2c5feabaf7b0
State: Peer in Cluster (Connected)

Hostname: rhs-client21.lab.eng.blr.redhat.com
Uuid: 5b315725-90dd-41f9-abe8-827d27db8210
State: Peer in Cluster (Connected)

Hostname: 10.70.36.43
Uuid: 772396e0-ccae-4b64-99f9-84f7e836d101
State: Peer in Cluster (Connected)
[root@rhs-client23 ~]# 

==================

Now, in the above scenario, machines 19 and 20 are disconnected from each other, and the quorum ratio is 100%, so the disconnect between 19 and 20 breaks quorum. But the brick processes are killed only on 19 and 20, since the other nodes do not see this disconnect.
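The asymmetry can be sketched as follows. This is a simplified model of the server-quorum check as described in this report, not actual glusterd code; the function name and the exact counting are assumptions. Each node evaluates quorum against its own peer view, so the same partition yields different answers on different nodes:

```python
# Simplified model of a per-node server-quorum check (hypothetical, for
# illustration): each node only knows its OWN view of peer connectivity.

def quorum_met(connected_peers, total_peers, ratio_percent):
    """Quorum is met when the fraction of reachable servers, counting
    this node itself, is at least the configured ratio."""
    active = connected_peers + 1   # this node counts itself as reachable
    total = total_peers + 1
    return 100.0 * active / total >= ratio_percent

# With cluster.server-quorum-ratio = 100 in the 2x2 setup above:
# rhs-client19 and rhs-client20 each see one peer as Disconnected,
# while rhs-client21 and rhs-client23 still see all peers as Connected.
print(quorum_met(2, 3, 100))   # view on nodes 19/20: 3 of 4 -> False
print(quorum_met(3, 3, 100))   # view on nodes 21/23: 4 of 4 -> True
```

Under this model, 19 and 20 compute quorum as lost and kill their bricks, while 21 and 23 compute quorum as met and keep serving, which matches the behaviour observed.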

I am not sure whether this is a bug or a limitation. In any case, the mount point remains active, with the other replica pair serving I/O.
Comment 2 Pranith Kumar K 2012-09-29 06:08:29 EDT
Sac,
  I discussed this bug with Vijay. We are in agreement that this situation cannot be handled with the current implementation.
Let's keep the bug open for now. We are not going to fix it for 2.0.z, though.

Pranith.
Comment 5 Vivek Agarwal 2015-03-23 03:40:05 EDT
The product version of Red Hat Storage on which this issue was reported has reached End Of Life (EOL) [1], hence this bug report is being closed. If the issue is still observed on a current version of Red Hat Storage, please file a new bug report on the current version.
[1] https://rhn.redhat.com/errata/RHSA-2014-0821.html
Comment 6 Vivek Agarwal 2015-03-23 03:40:35 EDT
(Duplicate of comment 5.)
