Description of problem:
Had an 8-node cluster. Created a 1*(8+4) volume as the cold tier and a 1*3 volume as the hot tier. Ran I/O and tried to take a snapshot of the same. The snapshot command failed with a 'quorum not met' error.

Version-Release number of selected component (if applicable):
glusterfs-3.7.5-15.el7rhgs.x86_64

How reproducible:
Hit it once

Steps to Reproduce:
1. Create a disperse volume 1*(8+4)
2. Start the volume
3. Attach a tier 1*3
4. Run a dd command on a fuse mount
5. Stop the volume and take a snapshot of the same.

Actual results:
'Snapshot create' fails with the error 'quorum not met'

[root@dhcp43-48 ~]# gluster snap create snap1 ozone
snapshot create: failed: quorum is not met
Snapshot command failed
[root@dhcp43-48 ~]#

Expected results:
'Snapshot create' command should be successful in creating a snapshot.

Additional info:

[root@dhcp43-48 ~]# gluster v list
o2
ozone
nash
[root@dhcp43-48 ~]#
[root@dhcp43-48 ~]#
[root@dhcp43-48 ~]# gluster snap create snap1 ozone
snapshot create: failed: quorum is not met
Snapshot command failed
[root@dhcp43-48 ~]#
[root@dhcp43-48 ~]#
[root@dhcp43-48 ~]# rpm -qa | grep gluster
glusterfs-libs-3.7.5-15.el7rhgs.x86_64
glusterfs-3.7.5-15.el7rhgs.x86_64
glusterfs-api-3.7.5-15.el7rhgs.x86_64
glusterfs-fuse-3.7.5-15.el7rhgs.x86_64
glusterfs-rdma-3.7.5-15.el7rhgs.x86_64
gluster-nagios-addons-0.2.5-1.el7rhgs.x86_64
glusterfs-geo-replication-3.7.5-15.el7rhgs.x86_64
vdsm-gluster-4.16.30-1.3.el7rhgs.noarch
gluster-nagios-common-0.2.3-1.el7rhgs.noarch
python-gluster-3.7.5-15.el7rhgs.noarch
glusterfs-client-xlators-3.7.5-15.el7rhgs.x86_64
glusterfs-cli-3.7.5-15.el7rhgs.x86_64
glusterfs-server-3.7.5-15.el7rhgs.x86_64
glusterfs-debuginfo-3.7.5-15.el7rhgs.x86_64
[root@dhcp43-48 ~]#
[root@dhcp43-48 ~]# gluster peer status
Number of Peers: 7

Hostname: dhcp43-174.lab.eng.blr.redhat.com
Uuid: a9308802-eb28-4c95-9aac-d1b1ef49a2a4
State: Peer in Cluster (Connected)

Hostname: 10.70.42.117
Uuid: d3c7d1be-1d63-4a2b-9513-e712a47380b6
State: Peer in Cluster (Connected)
Hostname: 10.70.42.17
Uuid: 28c4955c-9ced-4589-a7ce-ea21bf55a61a
State: Peer in Cluster (Connected)

Hostname: 10.70.42.133
Uuid: b4bb2936-453b-495f-b577-740e7f954cca
State: Peer in Cluster (Connected)

Hostname: 10.70.42.113
Uuid: 4fcd4342-579f-4249-8d84-31ede8b13cab
State: Peer in Cluster (Connected)

Hostname: 10.70.43.197
Uuid: 8b186eeb-e2d8-41d6-845d-7e54deffed11
State: Peer in Cluster (Connected)

Hostname: 10.70.42.116
Uuid: b4677c8a-6e93-4fba-87d4-f961eb29fa75
State: Peer in Cluster (Connected)
[root@dhcp43-48 ~]#
[root@dhcp43-48 ~]#
[root@dhcp43-48 ~]# gluster v info ozone

Volume Name: ozone
Type: Tier
Volume ID: f598725f-f370-43a7-b298-5ade94ed9873
Status: Started
Number of Bricks: 15
Transport-type: tcp
Hot Tier :
Hot Tier Type : Replicate
Number of Bricks: 1 x 3 = 3
Brick1: 10.70.42.133:/rhs/brick8/ozone
Brick2: 10.70.43.197:/rhs/brick1/ozone
Brick3: 10.70.43.174:/rhs/brick1/ozone
Cold Tier:
Cold Tier Type : Disperse
Number of Bricks: 1 x (8 + 4) = 12
Brick4: 10.70.43.48:/rhs/brick1/ozone
Brick5: 10.70.42.117:/rhs/brick1/ozone
Brick6: 10.70.42.17:/rhs/brick1/ozone
Brick7: 10.70.42.133:/rhs/brick1/ozone
Brick8: 10.70.43.48:/rhs/brick2/ozone
Brick9: 10.70.42.117:/rhs/brick2/ozone
Brick10: 10.70.42.17:/rhs/brick2/ozone
Brick11: 10.70.42.133:/rhs/brick2/ozone
Brick12: 10.70.42.116:/rhs/brick1/ozone
Brick13: 10.70.42.113:/rhs/brick1/ozone
Brick14: 10.70.42.116:/rhs/brick2/ozone
Brick15: 10.70.42.113:/rhs/brick2/ozone
Options Reconfigured:
performance.readdir-ahead: on
features.ctr-enabled: on
cluster.tier-mode: cache
[root@dhcp43-48 ~]#
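The failure above occurs even though all 15 bricks are online, which points at the quorum check itself rather than brick availability. Below is a hypothetical, simplified sketch (function names are mine, not glusterd's) of per-tier quorum rules under which this configuration passes: a replicate set needs a majority of its bricks up, while a disperse set only needs its data-brick count up. A check that does not apply the rule appropriate to each tier's type can reject a healthy tiered volume.

```python
# Hypothetical per-tier quorum rules -- an illustrative sketch, NOT the
# actual glusterd snapshot-quorum code.

def replica_quorum_met(bricks_up, replica_count):
    """Replica-style quorum: more than half of the set's bricks must be up."""
    return bricks_up > replica_count // 2

def disperse_quorum_met(bricks_up, disperse_count, redundancy):
    """Disperse-style quorum: at least (disperse - redundancy) bricks up."""
    return bricks_up >= disperse_count - redundancy

# The tiered volume from this report: hot tier 1 x 3 replicate,
# cold tier 1 x (8 + 4) disperse, all bricks online.
hot_up, cold_up = 3, 12

# Checking each tier with the rule matching its type succeeds:
assert replica_quorum_met(hot_up, replica_count=3)
assert disperse_quorum_met(cold_up, disperse_count=12, redundancy=4)
```

Under these per-tier rules the reported 1*(8+4) + 1*3 layout meets quorum with all bricks up, so 'snapshot create' should succeed.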
With inputs from Avra, raising it as a blocker.
Master URL : http://review.gluster.org/#/c/13260/
Release 3.7 URL : http://review.gluster.org/#/c/13261/
RHGS 3.1.2 URL : https://code.engineering.redhat.com/gerrit/#/c/65966/
Tested and verified this on the build glusterfs-3.7.5-17.el7rhgs.x86_64. Snapshot creation was successful on different volumes of the below mentioned configurations:

2*(8+4) as cold and 2*3 as hot
1*(8+4) as cold and 1*3 as hot
6*2 as cold and 4*3 as hot

Moving this bug to verified in 3.1.2. Pasted below are the detailed logs.

[root@dhcp42-58 ~]# gluster v list
ozone
testvol
[root@dhcp42-58 ~]#
[root@dhcp42-58 ~]#
[root@dhcp42-58 ~]# gluster v info ozone

Volume Name: ozone
Type: Distributed-Disperse
Volume ID: ebe8a210-f3f1-4a11-bcb6-15fafc7e2b39
Status: Started
Number of Bricks: 2 x (8 + 4) = 24
Transport-type: tcp
Bricks:
Brick1: 10.70.42.58:/bricks/brick0/ozone
Brick2: 10.70.42.8:/bricks/brick0/ozone
Brick3: 10.70.46.52:/bricks/brick0/ozone
Brick4: 10.70.47.152:/bricks/brick0/ozone
Brick5: 10.70.42.58:/bricks/brick1/ozone
Brick6: 10.70.42.8:/bricks/brick1/ozone
Brick7: 10.70.46.52:/bricks/brick1/ozone
Brick8: 10.70.47.152:/bricks/brick1/ozone
Brick9: 10.70.42.58:/bricks/brick2/ozone
Brick10: 10.70.42.8:/bricks/brick2/ozone
Brick11: 10.70.46.52:/bricks/brick2/ozone
Brick12: 10.70.47.152:/bricks/brick2/ozone
Brick13: 10.70.42.58:/bricks/brick3/ozone
Brick14: 10.70.42.8:/bricks/brick3/ozone
Brick15: 10.70.46.52:/bricks/brick3/ozone
Brick16: 10.70.47.152:/bricks/brick3/ozone
Brick17: 10.70.42.58:/bricks/brick4/ozone
Brick18: 10.70.42.8:/bricks/brick4/ozone
Brick19: 10.70.46.52:/bricks/brick4/ozone
Brick20: 10.70.47.152:/bricks/brick4/ozone
Brick21: 10.70.42.58:/bricks/brick5/ozone
Brick22: 10.70.42.8:/bricks/brick5/ozone
Brick23: 10.70.46.52:/bricks/brick5/ozone
Brick24: 10.70.47.152:/bricks/brick5/ozone
Options Reconfigured:
performance.readdir-ahead: on
[root@dhcp42-58 ~]#
[root@dhcp42-58 ~]#
[root@dhcp42-58 ~]# gluster v tier-attach
unrecognized word: tier-attach (position 1)
[root@dhcp42-58 ~]# gluster v ozone tier attach
unrecognized word: ozone (position 1)
[root@dhcp42-58 ~]# gluster v tier attach
Usage:
volume tier <VOLNAME> status
volume tier <VOLNAME> start [force]
volume tier <VOLNAME> attach [<replica COUNT>] <NEW-BRICK>...
volume tier <VOLNAME> detach <start|stop|status|commit|[force]>

[root@dhcp42-58 ~]# gluster v tier ozone attach start 10.70.42.58:/bricks/brick6/ozone 10.70.42.58:/bricks/brick6/ozone 10.70.42.58:/bricks/brick6/ozone 10.70.42.58:/bricks/brick6/ozone^Ct 10.70.42.58:/bricks/brick6/ozone 10.70.42.58:/bricks/brick6/ozone 10.70.42.58:/br
[root@dhcp42-58 ~]#
[root@dhcp42-58 ~]#
[root@dhcp42-58 ~]#
[root@dhcp42-58 ~]#
[root@dhcp42-58 ~]# gluster v tier ozone attach replica 3 start 10.70.42.58:/bricks/brick6/ozone 10.70.42.8:/bricks/brick6/ozone 10.70.46.52:/bricks/brick6/ozone 10.70.47.152:/bricks/brick6/ozone 10.70.42.58:/bricks/brick7/ozone 10.70.42.8:/bricks/brick7/ozone 10.70.46.52:/bricks/brick7/ozone 10.70.47.152:/bricks/brick7/ozone 10.70.42.58:/bricks/brick8/ozone 10.70.42.8:/bricks/brick8/ozone 10.70.46.52:/bricks/brick8/ozone 10.70.47.152:/bricks/brick8/ozone
Wrong brick type: start, use <HOSTNAME>:<export-dir-abs-path>

Usage:
volume tier <VOLNAME> status
volume tier <VOLNAME> start [force]
volume tier <VOLNAME> attach [<replica COUNT>] <NEW-BRICK>...
volume tier <VOLNAME> detach <start|stop|status|commit|[force]>

Tier command failed
[root@dhcp42-58 ~]# gluster v tier ozone attach replica 3 10.70.42.58:/bricks/brick6/ozone 10.70.42.8:/bricks/brick6/ozone 10.70.46.52:/bricks/brick6/ozone 10.70.47.152:/bricks/brick6/ozone 10.70.42.58:/bricks/brick7/ozone 10.70.42.8:/bricks/brick7/ozone 10.70.46.52:/bricks/brick7/ozone 10.70.47.152:/bricks/brick7/ozone 10.70.42.58:/bricks/brick8/ozone 10.70.42.8:/bricks/brick8/ozone 10.70.46.52:/bricks/brick8/ozone 10.70.47.152:/bricks/brick8/ozone
volume attach-tier: failed: Failed to create brick directory for brick 10.70.42.58:/bricks/brick8/ozone.
Reason : No such file or directory
Tier command failed
[root@dhcp42-58 ~]# gluster v tier ozone attach replica 3 10.70.42.58:/bricks/brick6/ozone 10.70.42.8:/bricks/brick6/ozone 10.70.46.52:/bricks/brick6/ozone 10.70.42.58:/bricks/brick7/ozone 10.70.42.8:/bricks/brick7/ozone 10.70.46.52:/bricks/brick7/ozone
volume attach-tier: failed: /bricks/brick6/ozone is already part of a volume
Tier command failed
[root@dhcp42-58 ~]# gluster v tier ozone attach replica 3 10.70.42.58:/bricks/brick6/ozone 10.70.42.8:/bricks/brick6/ozone 10.70.46.52:/bricks/brick6/ozone 10.70.42.58:/bricks/brick7/ozone 10.70.42.8:/bricks/brick7/ozone 10.70.46.52:/bricks/brick7/ozone force
volume attach-tier: success
Tiering Migration Functionality: ozone: success: Attach tier is successful on ozone. use tier status to check the status.
ID: 341feb6b-8eac-4dc4-ac44-6fc9249798dd
[root@dhcp42-58 ~]#
[root@dhcp42-58 ~]#
[root@dhcp42-58 ~]# gluster v info ozone

Volume Name: ozone
Type: Tier
Volume ID: ebe8a210-f3f1-4a11-bcb6-15fafc7e2b39
Status: Started
Number of Bricks: 30
Transport-type: tcp
Hot Tier :
Hot Tier Type : Distributed-Replicate
Number of Bricks: 2 x 3 = 6
Brick1: 10.70.46.52:/bricks/brick7/ozone
Brick2: 10.70.42.8:/bricks/brick7/ozone
Brick3: 10.70.42.58:/bricks/brick7/ozone
Brick4: 10.70.46.52:/bricks/brick6/ozone
Brick5: 10.70.42.8:/bricks/brick6/ozone
Brick6: 10.70.42.58:/bricks/brick6/ozone
Cold Tier:
Cold Tier Type : Distributed-Disperse
Number of Bricks: 2 x (8 + 4) = 24
Brick7: 10.70.42.58:/bricks/brick0/ozone
Brick8: 10.70.42.8:/bricks/brick0/ozone
Brick9: 10.70.46.52:/bricks/brick0/ozone
Brick10: 10.70.47.152:/bricks/brick0/ozone
Brick11: 10.70.42.58:/bricks/brick1/ozone
Brick12: 10.70.42.8:/bricks/brick1/ozone
Brick13: 10.70.46.52:/bricks/brick1/ozone
Brick14: 10.70.47.152:/bricks/brick1/ozone
Brick15: 10.70.42.58:/bricks/brick2/ozone
Brick16: 10.70.42.8:/bricks/brick2/ozone
Brick17: 10.70.46.52:/bricks/brick2/ozone
Brick18: 10.70.47.152:/bricks/brick2/ozone
Brick19: 10.70.42.58:/bricks/brick3/ozone
Brick20: 10.70.42.8:/bricks/brick3/ozone
Brick21: 10.70.46.52:/bricks/brick3/ozone
Brick22: 10.70.47.152:/bricks/brick3/ozone
Brick23: 10.70.42.58:/bricks/brick4/ozone
Brick24: 10.70.42.8:/bricks/brick4/ozone
Brick25: 10.70.46.52:/bricks/brick4/ozone
Brick26: 10.70.47.152:/bricks/brick4/ozone
Brick27: 10.70.42.58:/bricks/brick5/ozone
Brick28: 10.70.42.8:/bricks/brick5/ozone
Brick29: 10.70.46.52:/bricks/brick5/ozone
Brick30: 10.70.47.152:/bricks/brick5/ozone
Options Reconfigured:
cluster.tier-mode: cache
features.ctr-enabled: on
performance.readdir-ahead: on
[root@dhcp42-58 ~]#
[root@dhcp42-58 ~]#
[root@dhcp42-58 ~]#
[root@dhcp42-58 ~]#
[root@dhcp42-58 ~]#
[root@dhcp42-58 ~]# gluster snap list ozone
No snapshots present
[root@dhcp42-58 ~]# gluster snap create ozone
Invalid Syntax.
Usage: snapshot create <snapname> <volname> [no-timestamp] [description <description>] [force]
[root@dhcp42-58 ~]# gluster snap create snapo ozone
snapshot create: success: Snap snapo_GMT-2016.01.28-05.15.53 created successfully
[root@dhcp42-58 ~]#
[root@dhcp42-58 ~]#
[root@dhcp42-58 ~]#
[root@dhcp42-58 ~]#
[root@dhcp42-58 ~]# gluster snap list ozone
snapo_GMT-2016.01.28-05.15.53
[root@dhcp42-58 ~]#
[root@dhcp42-58 ~]#
[root@dhcp42-58 ~]# gluster snap create snapo2 ozone no-timestamp
snapshot create: success: Snap snapo2 created successfully
[root@dhcp42-58 ~]#
[root@dhcp42-58 ~]#
[root@dhcp42-58 ~]# rpm -qa | grep gluster
glusterfs-3.7.5-17.el7rhgs.x86_64
glusterfs-cli-3.7.5-17.el7rhgs.x86_64
glusterfs-geo-replication-3.7.5-17.el7rhgs.x86_64
glusterfs-client-xlators-3.7.5-17.el7rhgs.x86_64
glusterfs-server-3.7.5-17.el7rhgs.x86_64
glusterfs-rdma-3.7.5-17.el7rhgs.x86_64
glusterfs-api-3.7.5-17.el7rhgs.x86_64
glusterfs-devel-3.7.5-17.el7rhgs.x86_64
glusterfs-debuginfo-3.7.5-17.el7rhgs.x86_64
glusterfs-libs-3.7.5-17.el7rhgs.x86_64
glusterfs-fuse-3.7.5-17.el7rhgs.x86_64
glusterfs-api-devel-3.7.5-17.el7rhgs.x86_64
[root@dhcp42-58 ~]#
[root@dhcp42-58 ~]#
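As the attach attempts above show, the bricks passed to `volume tier <VOLNAME> attach replica <COUNT>` are taken in command-line order, with consecutive bricks forming each replica set. A minimal sketch of that grouping (the helper function is mine, for illustration only, not glusterd code):

```python
# Group a flat brick list into replica sets of the given size --
# consecutive bricks on the command line form one set each.
def group_into_replica_sets(bricks, replica_count):
    if len(bricks) % replica_count != 0:
        raise ValueError("brick count must be a multiple of the replica count")
    return [bricks[i:i + replica_count]
            for i in range(0, len(bricks), replica_count)]

# The successful ozone attach above passed 6 bricks with "replica 3",
# yielding the 2 x 3 Distributed-Replicate hot tier shown by gluster v info:
bricks = [
    "10.70.42.58:/bricks/brick6/ozone", "10.70.42.8:/bricks/brick6/ozone",
    "10.70.46.52:/bricks/brick6/ozone", "10.70.42.58:/bricks/brick7/ozone",
    "10.70.42.8:/bricks/brick7/ozone", "10.70.46.52:/bricks/brick7/ozone",
]
sets = group_into_replica_sets(bricks, 3)
assert len(sets) == 2 and all(len(s) == 3 for s in sets)
```

This also explains the earlier "Wrong brick type: start" error: trailing words such as `start` are parsed as brick paths, so the attach command takes only bricks after the replica count.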
[root@dhcp42-58 ~]# snapshot create ^C
[root@dhcp42-58 ~]# gluster v create nash replica 2 10.70.42.58:/bricks/brick0/nash 10.70.42.8:/bricks/brick0/nash 10.70.46.52:/bricks/brick0/nash 10.70.47.152:/bricks/brick0/nash 10.70.42.58:/bricks/brick1/nash 10.70.42.8:/bricks/brick1/nash 10.70.46.52:/bricks/brick1/nash 10.70.47.152:/bricks/brick1/nash 10.70.42.58:/bricks/brick2/nash 10.70.42.8:/bricks/brick2/nash 10.70.46.52:/bricks/brick2/nash 10.70.47.152:/bricks/brick2/nash
volume create: nash: success: please start the volume to access data
[root@dhcp42-58 ~]# gluster v info nash

Volume Name: nash
Type: Distributed-Replicate
Volume ID: 1381eb58-72df-488d-b4b0-92ce57037908
Status: Created
Number of Bricks: 6 x 2 = 12
Transport-type: tcp
Bricks:
Brick1: 10.70.42.58:/bricks/brick0/nash
Brick2: 10.70.42.8:/bricks/brick0/nash
Brick3: 10.70.46.52:/bricks/brick0/nash
Brick4: 10.70.47.152:/bricks/brick0/nash
Brick5: 10.70.42.58:/bricks/brick1/nash
Brick6: 10.70.42.8:/bricks/brick1/nash
Brick7: 10.70.46.52:/bricks/brick1/nash
Brick8: 10.70.47.152:/bricks/brick1/nash
Brick9: 10.70.42.58:/bricks/brick2/nash
Brick10: 10.70.42.8:/bricks/brick2/nash
Brick11: 10.70.46.52:/bricks/brick2/nash
Brick12: 10.70.47.152:/bricks/brick2/nash
Options Reconfigured:
performance.readdir-ahead: on
[root@dhcp42-58 ~]# gluster v tier ozone attach replica 3 10.70.42.58:/bricks/brick3/nash 10.70.42.8:/bricks/brick3/nash 10.70.46.52:/bricks/brick3/nash 10.70.47.152:/bricks/brick3/nash 10.70.42.58:/bricks/brick4/nash 10.70.42.8:/bricks/brick4/nash 10.70.46.52:/bricks/brick4/nash 10.70.47.152:/bricks/brick4/nash 10.70.42.58:/bricks/brick5/nash 10.70.42.8:/bricks/brick5/nash 10.70.46.52:/bricks/brick5/nash 10.70.47.152:/bricks/brick5/nash
volume attach-tier: failed: Volume ozone is already a tier.
Tier command failed
[root@dhcp42-58 ~]# gluster v tier nash attach replica 3 10.70.42.58:/bricks/brick3/nash 10.70.42.8:/bricks/brick3/nash 10.70.46.52:/bricks/brick3/nash 10.70.47.152:/bricks/brick3/nash 10.70.42.58:/bricks/brick4/nash 10.70.42.8:/bricks/brick4/nash 10.70.46.52:/bricks/brick4/nash 10.70.47.152:/bricks/brick4/nash 10.70.42.58:/bricks/brick5/nash 10.70.42.8:/bricks/brick5/nash 10.70.46.52:/bricks/brick5/nash 10.70.47.152:/bricks/brick5/nash
volume attach-tier: success
Tiering Migration Functionality: nash: failed: Volume nash needs to be started to perform rebalance
Failed to run tier start. Please execute tier start command explictly
Usage : gluster volume rebalance <volname> tier start
Tier command failed
[root@dhcp42-58 ~]# gluster v start nash
volume start: nash: success
[root@dhcp42-58 ~]#
[root@dhcp42-58 ~]#
[root@dhcp42-58 ~]# gluster v tier nash attach replica 3 10.70.42.58:/bricks/brick3/nash 10.70.42.8:/bricks/brick3/nash 10.70.46.52:/bricks/brick3/nash 10.70.47.152:/bricks/brick3/nash 10.70.42.58:/bricks/brick4/nash 10.70.42.8:/bricks/brick4/nash 10.70.46.52:/bricks/brick4/nash 10.70.47.152:/bricks/brick4/nash 10.70.42.58:/bricks/brick5/nash 10.70.42.8:/bricks/brick5/nash 10.70.46.52:/bricks/brick5/nash 10.70.47.152:/bricks/brick5/nash
volume attach-tier: failed: Volume nash is already a tier.
Tier command failed
[root@dhcp42-58 ~]#
[root@dhcp42-58 ~]#
[root@dhcp42-58 ~]# gluster v rebalance
Usage: volume rebalance <VOLNAME> {{fix-layout start} | {start [force]|stop|status}}
[root@dhcp42-58 ~]# gluster v rebalance nash tier start
Tiering Migration Functionality: nash: failed: A Tier daemon is already running on volume nash
[root@dhcp42-58 ~]#
[root@dhcp42-58 ~]#
[root@dhcp42-58 ~]# gluster v info nash

Volume Name: nash
Type: Tier
Volume ID: 1381eb58-72df-488d-b4b0-92ce57037908
Status: Started
Number of Bricks: 24
Transport-type: tcp
Hot Tier :
Hot Tier Type : Distributed-Replicate
Number of Bricks: 4 x 3 = 12
Brick1: 10.70.47.152:/bricks/brick5/nash
Brick2: 10.70.46.52:/bricks/brick5/nash
Brick3: 10.70.42.8:/bricks/brick5/nash
Brick4: 10.70.42.58:/bricks/brick5/nash
Brick5: 10.70.47.152:/bricks/brick4/nash
Brick6: 10.70.46.52:/bricks/brick4/nash
Brick7: 10.70.42.8:/bricks/brick4/nash
Brick8: 10.70.42.58:/bricks/brick4/nash
Brick9: 10.70.47.152:/bricks/brick3/nash
Brick10: 10.70.46.52:/bricks/brick3/nash
Brick11: 10.70.42.8:/bricks/brick3/nash
Brick12: 10.70.42.58:/bricks/brick3/nash
Cold Tier:
Cold Tier Type : Distributed-Replicate
Number of Bricks: 6 x 2 = 12
Brick13: 10.70.42.58:/bricks/brick0/nash
Brick14: 10.70.42.8:/bricks/brick0/nash
Brick15: 10.70.46.52:/bricks/brick0/nash
Brick16: 10.70.47.152:/bricks/brick0/nash
Brick17: 10.70.42.58:/bricks/brick1/nash
Brick18: 10.70.42.8:/bricks/brick1/nash
Brick19: 10.70.46.52:/bricks/brick1/nash
Brick20: 10.70.47.152:/bricks/brick1/nash
Brick21: 10.70.42.58:/bricks/brick2/nash
Brick22: 10.70.42.8:/bricks/brick2/nash
Brick23: 10.70.46.52:/bricks/brick2/nash
Brick24: 10.70.47.152:/bricks/brick2/nash
Options Reconfigured:
cluster.tier-mode: cache
features.ctr-enabled: on
performance.readdir-ahead: on
[root@dhcp42-58 ~]#
[root@dhcp42-58 ~]#
[root@dhcp42-58 ~]# gluster snap list nahs
Snapshot list : failed: Volume (nahs) does not exist
Snapshot command failed
[root@dhcp42-58 ~]# gluster snap list nash
No snapshots present
[root@dhcp42-58 ~]# gluster snap create snapn nash
snapshot create: success: Snap snapn_GMT-2016.01.28-05.27.02 created successfully
[root@dhcp42-58 ~]#
[root@dhcp42-58 ~]# gluster snap list nash
snapn_GMT-2016.01.28-05.27.02
[root@dhcp42-58 ~]#
[root@dhcp42-58 ~]#
[root@dhcp42-58 ~]#
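A quick sanity check on the verified nash layout above (plain arithmetic, no gluster involved; the counts come straight from the `gluster v info nash` output):

```python
# Brick counts for the verified tiered nash volume, per "gluster v info nash".
hot_sets, hot_replica = 4, 3      # 4 x 3 Distributed-Replicate hot tier
cold_sets, cold_replica = 6, 2    # 6 x 2 Distributed-Replicate cold tier

total_bricks = hot_sets * hot_replica + cold_sets * cold_replica
assert total_bricks == 24         # matches "Number of Bricks: 24"
```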
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2016-0193.html