Description of problem:
In an n-way replica volume, snapshot create should fail even if one brick is down.

Version-Release number of selected component (if applicable):

How reproducible:

Steps to Reproduce:
1.
2.
3.

Actual results:
snapshot create checks for quorum and, if quorum is met, the snapshot is taken even if a few bricks are down.

Expected results:
snapshot create should fail even if one brick is down.

Additional info:
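The difference between the actual and expected results can be illustrated with a small model. This is only a sketch: the function names are hypothetical, and the "more than half of the bricks in each replica set" rule is a simplifying assumption standing in for the quorum check, not glusterd's actual quorum calculation.

```python
from typing import Dict, List

# Each replica set maps a brick name to True (online) or False (offline).
ReplicaSet = Dict[str, bool]


def quorum_only_check(replica_sets: List[ReplicaSet]) -> bool:
    """Model of the reported behaviour (assumed rule): allow the snapshot
    when more than half of the bricks in every replica set are online."""
    return all(sum(rs.values()) * 2 > len(rs) for rs in replica_sets)


def all_bricks_up_check(replica_sets: List[ReplicaSet]) -> bool:
    """Model of the expected behaviour: allow the snapshot only when
    every brick in every replica set is online."""
    return all(all(rs.values()) for rs in replica_sets)


# A 3-way replica set with one brick down (hypothetical brick names).
volume = [{"b1": True, "b2": True, "b3": False}]

print(quorum_only_check(volume))    # the quorum-only model allows the snapshot
print(all_bricks_up_check(volume))  # the strict model rejects it
```

Under these assumptions, the quorum-only model lets the snapshot proceed with one of three bricks down, which is the case this report asks to reject.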
Fixed with https://code.engineering.redhat.com/gerrit/40933
Version:
=========
glusterfs 3.6.0.45 built on Feb 12 2015 22:58:40

Verified on 6x2 and 6x3 dist-rep volumes: snap create fails with and without the force option when a brick/node is down.

gluster v status
Status of volume: vol_test
Gluster process                                          Port    Online  Pid
------------------------------------------------------------------------------
Brick snapshot13.lab.eng.blr.redhat.com:/rhs/brick3/b3   N/A     N       2564
Brick snapshot14.lab.eng.blr.redhat.com:/rhs/brick3/b3   49152   Y       14458
Brick snapshot15.lab.eng.blr.redhat.com:/rhs/brick3/b3   49152   Y       15846
Brick snapshot16.lab.eng.blr.redhat.com:/rhs/brick3/b3   49152   Y       15546
Brick snapshot13.lab.eng.blr.redhat.com:/rhs/brick4/b4   49153   Y       26568
Brick snapshot14.lab.eng.blr.redhat.com:/rhs/brick4/b4   49153   Y       14138
Brick snapshot15.lab.eng.blr.redhat.com:/rhs/brick4/b4   49153   Y       15858
Brick snapshot16.lab.eng.blr.redhat.com:/rhs/brick4/b4   49153   Y       15558
Brick snapshot13.lab.eng.blr.redhat.com:/rhs/brick5/b5   49154   Y       26580
Brick snapshot14.lab.eng.blr.redhat.com:/rhs/brick5/b5   49154   Y       6372
Brick snapshot15.lab.eng.blr.redhat.com:/rhs/brick5/b5   49154   Y       15870
Brick snapshot16.lab.eng.blr.redhat.com:/rhs/brick5/b5   49154   Y       15570
NFS Server on localhost                                  2049    Y       2577
Self-heal Daemon on localhost                            N/A     Y       2586
NFS Server on snapshot14.lab.eng.blr.redhat.com          2049    Y       14471
Self-heal Daemon on snapshot14.lab.eng.blr.redhat.com    N/A     Y       14480
NFS Server on snapshot16.lab.eng.blr.redhat.com          2049    Y       23746
Self-heal Daemon on snapshot16.lab.eng.blr.redhat.com    N/A     Y       23756
NFS Server on snapshot15.lab.eng.blr.redhat.com          2049    Y       24055
Self-heal Daemon on snapshot15.lab.eng.blr.redhat.com    N/A     Y       24064

Task Status of Volume vol_test
------------------------------------------------------------------------------
There are no active volume tasks

gluster snapshot create SN2 vol_test
snapshot create: failed: brick snapshot13.lab.eng.blr.redhat.com:/rhs/brick3/b3 is not started. Please start the stopped brick and then issue snapshot create command or use [force] option in snapshot create to override this behavior.
Snapshot command failed

[root@snapshot13 ~]# gluster snapshot create SN2 vol_test force
snapshot create: failed: quorum is not met
Snapshot command failed

Marking the bug as 'Verified'.

Note: If a brick on another node goes down, snap create fails with a "Pre-Validation" error instead of the error message shown above. Also, when a node is down, snap create on nx2 fails with "quorum not met" and snap create on nx3 fails with "One or more bricks may be down". The error messages should be uniform across nx2 and nx3 volumes. Both these issues are tracked by bz 1085202.
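For anyone repeating this verification, the offline bricks can be picked out of the status listing with a short script. The following is a minimal sketch, assuming the output has one row per line in the `Brick <host>:<path> <port> <Y|N> <pid>` layout shown in the listing above; other glusterfs releases may print different columns, and the function name `offline_bricks` is hypothetical.

```python
from typing import List


def offline_bricks(status_output: str) -> List[str]:
    """Return the host:path of every brick whose Online column is 'N'.

    Assumes rows of the form: Brick <host>:<path> <port> <Y|N> <pid>
    Rows for NFS servers, self-heal daemons and headers are ignored.
    """
    offline = []
    for line in status_output.splitlines():
        fields = line.split()
        if len(fields) == 5 and fields[0] == "Brick" and fields[3] == "N":
            offline.append(fields[1])
    return offline


# Three rows taken from the status listing in this comment.
sample = """\
Brick snapshot13.lab.eng.blr.redhat.com:/rhs/brick3/b3 N/A N 2564
Brick snapshot14.lab.eng.blr.redhat.com:/rhs/brick3/b3 49152 Y 14458
NFS Server on localhost 2049 Y 2577
"""

for brick in offline_bricks(sample):
    print(brick)
```

On the sample rows this reports only snapshot13.lab.eng.blr.redhat.com:/rhs/brick3/b3, the brick named in the snapshot create failure message.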
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://rhn.redhat.com/errata/RHBA-2015-0682.html