Bug 1182554 - [SNAPSHOT]: In a n-way replica volume, snapshot should not be taken, even if one brick is down.
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Gluster Storage
Classification: Red Hat Storage
Component: snapshot
Version: rhgs-3.0
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: unspecified
Target Milestone: ---
Target Release: RHGS 3.0.4
Assignee: rjoseph
QA Contact: senaik
URL:
Whiteboard: SNAPSHOT
Depends On:
Blocks: 1182947 1184344 1186189
Reported: 2015-01-15 12:47 UTC by Avra Sengupta
Modified: 2016-09-17 13:04 UTC

Fixed In Version: glusterfs-3.6.0.45-1
Doc Type: Known Issue
Doc Text:
Currently, brick quorum support for snapshots is not available; therefore, snapshot create fails if even one brick is down. A snapshot can be taken only if all the bricks of the volume are up.
Clone Of:
Clones: 1184344
Environment:
Last Closed: 2015-03-26 06:35:26 UTC
Embargoed:




Links
System: Red Hat Product Errata
ID: RHBA-2015:0682
Private: 0
Priority: normal
Status: SHIPPED_LIVE
Summary: Red Hat Storage 3.0 enhancement and bug fix update #4
Last Updated: 2015-03-26 10:32:55 UTC

Description Avra Sengupta 2015-01-15 12:47:16 UTC
Description of problem:
In an n-way replica volume, snapshot create should fail if even one brick is down.


Actual results:
snapshot create checks for quorum and, if quorum is met, takes the snapshot even if a few bricks are down.


Expected results:
snapshot create should fail even if one brick is down.
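
The difference between the actual and expected behavior can be sketched in a few lines of Python. This is only an illustration, not the glusterd code: the function names are hypothetical, and "quorum" is modeled here as a strict majority of bricks being up in each replica set, which is an assumption and may not match the rule glusterd actually applies.

```python
from typing import List

# Each replica set is a list of booleans: True means the brick is up.
ReplicaSet = List[bool]


def majority_up(replica_set: ReplicaSet) -> bool:
    # Assumption: quorum is modeled as a strict majority of bricks up.
    # The real glusterd quorum rule may differ.
    return sum(replica_set) * 2 > len(replica_set)


def snapshot_allowed_quorum(replica_sets: List[ReplicaSet]) -> bool:
    # Behavior reported under "Actual results": a snapshot is taken
    # as long as every replica set meets quorum.
    return all(majority_up(rs) for rs in replica_sets)


def snapshot_allowed_strict(replica_sets: List[ReplicaSet]) -> bool:
    # Behavior requested under "Expected results": a snapshot is taken
    # only if every brick of the volume is up.
    return all(all(rs) for rs in replica_sets)


# A 3-way replica set with one brick down.
one_down = [[True, True, False]]
print(snapshot_allowed_quorum(one_down))  # quorum-based check
print(snapshot_allowed_strict(one_down))  # all-bricks-up check
```

With one brick down in a 3-way replica set, the majority-based check still passes while the strict check refuses the snapshot, which is the gap this bug asks to close.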

Additional info:

Comment 1 Avra Sengupta 2015-01-29 10:49:30 UTC
Fixed with https://code.engineering.redhat.com/gerrit/40933

Comment 2 senaik 2015-02-20 12:22:46 UTC
Version :
=========
glusterfs 3.6.0.45 built on Feb 12 2015 22:58:40

Verified on 6x2 and 6x3 distributed-replicate volumes: snap create fails both with and without the force option when a brick or node is down.

gluster v status 
Status of volume: vol_test
Gluster process						Port	Online	Pid
------------------------------------------------------------------------------
Brick snapshot13.lab.eng.blr.redhat.com:/rhs/brick3/b3	N/A	N	2564
Brick snapshot14.lab.eng.blr.redhat.com:/rhs/brick3/b3	49152	Y	14458
Brick snapshot15.lab.eng.blr.redhat.com:/rhs/brick3/b3	49152	Y	15846
Brick snapshot16.lab.eng.blr.redhat.com:/rhs/brick3/b3	49152	Y	15546
Brick snapshot13.lab.eng.blr.redhat.com:/rhs/brick4/b4	49153	Y	26568
Brick snapshot14.lab.eng.blr.redhat.com:/rhs/brick4/b4	49153	Y	14138
Brick snapshot15.lab.eng.blr.redhat.com:/rhs/brick4/b4	49153	Y	15858
Brick snapshot16.lab.eng.blr.redhat.com:/rhs/brick4/b4	49153	Y	15558
Brick snapshot13.lab.eng.blr.redhat.com:/rhs/brick5/b5	49154	Y	26580
Brick snapshot14.lab.eng.blr.redhat.com:/rhs/brick5/b5	49154	Y	6372
Brick snapshot15.lab.eng.blr.redhat.com:/rhs/brick5/b5	49154	Y	15870
Brick snapshot16.lab.eng.blr.redhat.com:/rhs/brick5/b5	49154	Y	15570
NFS Server on localhost					2049	Y	2577
Self-heal Daemon on localhost				N/A	Y	2586
NFS Server on snapshot14.lab.eng.blr.redhat.com		2049	Y	14471
Self-heal Daemon on snapshot14.lab.eng.blr.redhat.com	N/A	Y	14480
NFS Server on snapshot16.lab.eng.blr.redhat.com		2049	Y	23746
Self-heal Daemon on snapshot16.lab.eng.blr.redhat.com	N/A	Y	23756
NFS Server on snapshot15.lab.eng.blr.redhat.com		2049	Y	24055
Self-heal Daemon on snapshot15.lab.eng.blr.redhat.com	N/A	Y	24064
 
Task Status of Volume vol_test
------------------------------------------------------------------------------
There are no active volume tasks
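
For scripted checks before a snapshot, the offline bricks can be picked out of status output like the above. The following is a minimal sketch, assuming the whitespace-separated column layout shown in this comment (Online is the second-to-last column) and brick paths without spaces; `offline_bricks` is a hypothetical helper, not part of the gluster CLI.

```python
from typing import List


def offline_bricks(status_text: str) -> List[str]:
    """Return the host:/path of every brick whose Online column is 'N'.

    Assumes the layout shown in this comment: lines that start with
    'Brick', followed by host:/path, port, online flag, and pid.
    """
    down = []
    for line in status_text.splitlines():
        fields = line.split()
        # Only brick rows are considered; NFS and self-heal rows are skipped.
        if len(fields) >= 5 and fields[0] == "Brick" and fields[-2] == "N":
            down.append(fields[1])
    return down


sample = (
    "Brick snapshot13.lab.eng.blr.redhat.com:/rhs/brick3/b3\tN/A\tN\t2564\n"
    "Brick snapshot14.lab.eng.blr.redhat.com:/rhs/brick3/b3\t49152\tY\t14458\n"
    "NFS Server on localhost\t2049\tY\t2577\n"
)
print(offline_bricks(sample))
```

An empty result would mean that every brick row in the given text reports Online as Y; a non-empty result lists the bricks to start before retrying snapshot create.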


gluster snapshot create SN2 vol_test
snapshot create: failed: brick snapshot13.lab.eng.blr.redhat.com:/rhs/brick3/b3 is not started. Please start the stopped brick and then issue snapshot create command or use [force] option in snapshot create to override this behavior.
Snapshot command failed

[root@snapshot13 ~]# gluster snapshot create SN2 vol_test force
snapshot create: failed: quorum is not met
Snapshot command failed

Marking the bug as 'Verified'

Note:
If a brick on another node goes down, snap create fails with a "Pre-Validation" error instead of the error message shown above.

Also, when a node is down, snap create on an nx2 volume fails with "quorum not met", while snap create on an nx3 volume fails with "One or more bricks may be down". The error messages should be uniform across nx2 and nx3 volumes.
Both of these issues are tracked by bz 1085202.

Comment 4 errata-xmlrpc 2015-03-26 06:35:26 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2015-0682.html

