Bug 1112250 - [SNAPSHOT]: On attaching a new node to the cluster while snapshot create was in progress , one of the snapshots failed with "glusterd quorum not met"
Keywords:
Status: CLOSED DEFERRED
Alias: None
Product: Red Hat Gluster Storage
Classification: Red Hat Storage
Component: snapshot
Version: rhgs-3.0
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: ---
Target Release: ---
Assignee: Avra Sengupta
QA Contact:
URL:
Whiteboard: SNAPSHOT
Depends On:
Blocks: 1085278 1114403 1216951
Reported: 2014-06-23 12:24 UTC by senaik
Modified: 2016-09-17 12:52 UTC

Fixed In Version:
Doc Type: Known Issue
Doc Text:
Probing/detaching a new peer during any snapshot operation is not supported.
Clone Of:
: 1114403 (view as bug list)
Environment:
Last Closed: 2016-01-29 13:41:11 UTC
Embargoed:


Description senaik 2014-06-23 12:24:10 UTC
Description of problem:
======================
When a new node was attached to the cluster while a snapshot create was in progress, one of the snapshots failed with "glusterd quorum not met".

Version-Release number of selected component (if applicable):
===========================================================
 glusterfs 3.6.0.20 built on Jun 19 2014

How reproducible:
================
1/1


Steps to Reproduce:
===================
I got the following error message when attaching a new node to the cluster while a snapshot create was in progress:

snapshot create: success: Snap snap4 created successfully
snapshot create: failed: glusterds are not in quorum
Snapshot command failed
snapshot create: success: Snap snap6 created successfully

All glusterds were up and running on the nodes, but we still got the message that glusterd quorum was not met.

----------------Part of log---------------------

name:snapshot15.lab.eng.blr.redhat.com
[2014-06-23 06:03:31.887252] I [glusterd-handler.c:2522:__glusterd_handle_friend_update] 0-: Received uuid: 7e97d0f0-8ae9-40eb-b822-952cc5a8dc46, hostname:10.70.44.54
[2014-06-23 06:03:32.166226] W [glusterd-utils.c:12909:glusterd_snap_quorum_check_for_create] 0-management: glusterds are not in quorum
[2014-06-23 06:03:32.166352] W [glusterd-utils.c:13058:glusterd_snap_quorum_check] 0-management: Quorum checkfailed during snapshot create command
[2014-06-23 06:03:32.166374] W [glusterd-mgmt.c:1846:glusterd_mgmt_v3_initiate_snap_phases] 0-management: quorum check failed
[2014-06-23 06:03:32.166416] W [glusterd-snapshot.c:7012:glusterd_snapshot_postvalidate] 0-management: Snapshot create post-validation failed
[2014-06-23 06:03:32.166433] W [glusterd-mgmt.c:248:gd_mgmt_v3_post_validate_fn] 0-management: postvalidate operation failed
[2014-06-23 06:03:32.166451] E [glusterd-mgmt.c:1335:glusterd_mgmt_v3_post_validate] 0-management: Post Validation failed for operation Snapshot on local node
[2014-06-23 06:03:32.166467] E [glusterd-mgmt.c:1944:glusterd_mgmt_v3_initiate_snap_phases] 0-management: Post Validation Failed
[2014-06-23 06:03:33.972792] I [glusterd-handshake.c:1014:__glusterd_mgmt_hndsk_versions_ack] 0-management: using the op-version 30000

Actual results:
==============
snapshot create fails with "glusterd quorum not met" error message


Expected results:
=================
Snapshot create should not fail with the "glusterd quorum not met" error message when glusterd is up and running on all nodes.



Additional info:

Comment 3 Joseph Elwin Fernandes 2014-06-30 02:31:27 UTC
1) Couldn't reproduce the issue by issuing snapshot create and peer probe from the same host simultaneously.
2) But was able to reproduce the issue by issuing snapshot create and peer probe from different hosts simultaneously.
3) Cause: during any snapshot operation, the glusterd quorum is checked against the node's total peer list. This is not necessary; the glusterd quorum should be checked only against the list of nodes that were chosen for this operation.
 In glusterd_mgmt_v3_initiate_snap_phases(), as preparation before the 3 phases (pre-validate, commit and post-validate), a transaction list is prepared in this->private->xaction_peers. The peers in this list participate in the operation throughout the 3 phases. During an operation, the glusterd quorum should be checked only for these peers, because the quorum check is w.r.t. the current operation.

4) Fix: during a snapshot operation, glusterd quorum will be checked only for the transaction peers list.
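
The difference between the two checks described above can be sketched roughly as follows. This is a minimal illustration, not glusterd code: struct peer, its in_transaction flag (standing in for membership in this->private->xaction_peers), the quorum_met() helper and the percentage-based quorum rule are all assumptions made for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified peer record (not glusterd's real peer structure). */
struct peer {
    const char *name;
    bool connected;       /* peer is up and has completed its handshake */
    bool in_transaction;  /* peer was captured in the operation's peer list */
};

/*
 * Assumed quorum rule for this sketch: at least `min_percent` percent of
 * the considered peers must be connected. The real glusterd rule may differ.
 *
 * transaction_only == false: check against the total peer list.
 * transaction_only == true:  check only the peers chosen for the operation.
 */
bool quorum_met(const struct peer *peers, size_t n,
                bool transaction_only, unsigned min_percent)
{
    size_t considered = 0;
    size_t connected = 0;

    assert(peers != NULL || n == 0);
    assert(min_percent <= 100);

    for (size_t i = 0; i < n; i++) {
        if (transaction_only && !peers[i].in_transaction)
            continue;  /* skip peers that are not part of this operation */
        considered++;
        if (peers[i].connected)
            connected++;
    }

    /* An empty set of peers never satisfies quorum in this sketch. */
    return considered > 0 && connected * 100 >= considered * min_percent;
}
```

With illustrative numbers (four established, connected peers, one peer still being probed, and a 90 percent threshold), the check over the total peer list fails while the check restricted to the transaction peers passes, even though every established glusterd is up.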

Comment 4 Joseph Elwin Fernandes 2014-06-30 02:51:51 UTC
Fix submitted upstream:

REVIEW: http://review.gluster.org/8200 (glusterd/snapshot: fixing glusterd quorum during snap operation) posted (#1) for review on master by Joseph Fernandes (josferna)

Comment 6 Avra Sengupta 2015-03-30 09:55:37 UTC
Not targeting for 3.1

Comment 8 monti lawrence 2015-07-22 15:31:56 UTC
Doc text is edited. Please sign off to be included in Known Issues.

Comment 9 Avra Sengupta 2015-07-27 07:14:05 UTC
Doc text looks good. Verified.

Comment 10 Avra Sengupta 2015-07-28 05:49:57 UTC
Not targeting for 3.1.1

Comment 11 Avra Sengupta 2015-08-12 05:44:26 UTC
This bug is not fixed by the submitted patch; it requires design changes in glusterd. Hence moving this back to New.

Comment 13 Avra Sengupta 2016-01-29 13:41:11 UTC
The current Glusterd architecture does not support implementing this feature. This feature request is therefore deferred until Glusterd 2.0.

