Description of problem:
=======================
If a snapshot restore fails because quorum is not met, a subsequent restore attempted after quorum is regained also fails, this time in pre-validation. The impact is major: the failed restore also marks the volume for deletion and removes its entries from the volume information.

Version-Release number of selected component (if applicable):
=============================================================
glusterfs-3.6.0.16-1.el6rhs.x86_64

How reproducible:
=================
1/1

Steps to Reproduce:
===================
1. Create and start a 2x2 volume across 4 nodes
2. Create a snapshot of the volume
3. Kill glusterd on node2
4. Bring down node4 (poweroff)
5. Take the volume offline from node1 (gluster volume stop <volname>)
6. Restore the volume to the snapshot taken in step 2
7. The restore should fail because quorum is not met
8. Start glusterd on node2
9. Restore the volume to the snapshot taken in step 2

Actual results:
===============
The restore fails in pre-validation.

[root@snapshot13 ~]# gluster snapshot restore snap1
snapshot restore: failed: Pre-validation failed on localhost. Please check log file for details
Snapshot command failed
[root@snapshot13 ~]#

Expected results:
=================
The restore should not fail.

Additional info:
================
On 2 of the machines, the trash directory contains the volume information:

[root@snapshot13 ~]# ls /var/lib/glusterd/trash/
vols-vol0.deleted
[root@snapshot13 ~]#
Verified with build: glusterfs-3.6.0.17-1.el6rhs.x86_64

[root@snapshot13 ~]# gluster snapshot list vol0
snap1
[root@snapshot13 ~]# cat /var/lib/glusterd/snaps/missed_snaps_list
[root@snapshot13 ~]# gluster peer status
Number of Peers: 3

Hostname: snapshot14.lab.eng.blr.redhat.com
Uuid: 359bb151-a987-4dd1-a1e6-6fe2c3c30b9e
State: Peer in Cluster (Disconnected)

Hostname: snapshot15.lab.eng.blr.redhat.com
Uuid: 262f8999-3c5e-4ccf-8efc-47ccce690ff8
State: Peer in Cluster (Connected)

Hostname: snapshot16.lab.eng.blr.redhat.com
Uuid: 4afe2c38-2cb0-432a-81ec-18799eaea5cd
State: Peer in Cluster (Disconnected)
[root@snapshot13 ~]# gluster volume stop vol0
Stopping volume will make its data inaccessible. Do you want to continue? (y/n) y
volume stop: vol0: success
[root@snapshot13 ~]# ls /var/lib/glusterd/trash/
ls: cannot access /var/lib/glusterd/trash/: No such file or directory
[root@snapshot13 ~]# cat /var/lib/glusterd/snaps/missed_snaps_list
[root@snapshot13 ~]# gluster snapshot restore snap1
snapshot restore: failed: glusterds are not in quorum
Snapshot command failed
[root@snapshot13 ~]# ls /var/lib/glusterd/trash/
ls: cannot access /var/lib/glusterd/trash/: No such file or directory
[root@snapshot13 ~]# gluster peer status
Number of Peers: 3

Hostname: snapshot14.lab.eng.blr.redhat.com
Uuid: 359bb151-a987-4dd1-a1e6-6fe2c3c30b9e
State: Peer in Cluster (Connected)

Hostname: snapshot15.lab.eng.blr.redhat.com
Uuid: 262f8999-3c5e-4ccf-8efc-47ccce690ff8
State: Peer in Cluster (Connected)

Hostname: snapshot16.lab.eng.blr.redhat.com
Uuid: 4afe2c38-2cb0-432a-81ec-18799eaea5cd
State: Peer in Cluster (Disconnected)
[root@snapshot13 ~]# cat /var/lib/glusterd/snaps/missed_snaps_list
[root@snapshot13 ~]# gluster snapshot restore snap1
Snapshot restore: snap1: Snap restored successfully
[root@snapshot13 ~]# cat /var/lib/glusterd/snaps/missed_snaps_list
4afe2c38-2cb0-432a-81ec-18799eaea5cd:9e782c7e-54ac-4b88-8360-e74e072c8336=1eebeff0a2e34fe5b3ffe2460843a341:4:/var/run/gluster/snaps/1eebeff0a2e34fe5b3ffe2460843a341/brick4/b0:3:1
[root@snapshot13 ~]# ls /var/lib/glusterd/trash/
ls: cannot access /var/lib/glusterd/trash/: No such file or directory
[root@snapshot13 ~]#

[root@snapshot16 ~]# ls /var/lib/glusterd/snaps/
missed_snaps_list
[root@snapshot16 ~]# cat /var/lib/glusterd/snaps/missed_snaps_list
4afe2c38-2cb0-432a-81ec-18799eaea5cd:9e782c7e-54ac-4b88-8360-e74e072c8336=1eebeff0a2e34fe5b3ffe2460843a341:4:/var/run/gluster/snaps/1eebeff0a2e34fe5b3ffe2460843a341/brick4/b0:3:2
[root@snapshot16 ~]#

Moving the bug to verified state.
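The two `gluster peer status` outputs in the verification above can be checked for quorum before retrying a restore. The following is a minimal sketch, not glusterd's actual quorum logic: it assumes quorum means a strict majority of all nodes (peers plus the local node), which matches the transcript (2 of 4 nodes fails, 3 of 4 succeeds) but is not confirmed by this report. The function names `parse_peer_states` and `has_majority` are hypothetical helpers, not part of any gluster tooling.

```python
import re
from typing import List

# Trimmed "State:" lines taken from the two peer status outputs in this report.
BEFORE = """Number of Peers: 3
State: Peer in Cluster (Disconnected)
State: Peer in Cluster (Connected)
State: Peer in Cluster (Disconnected)
"""

AFTER = """Number of Peers: 3
State: Peer in Cluster (Connected)
State: Peer in Cluster (Connected)
State: Peer in Cluster (Disconnected)
"""

STATE_RE = re.compile(r"^\s*State:.*\((Connected|Disconnected)\)\s*$")


def parse_peer_states(peer_status_output: str) -> List[bool]:
    """Return one bool per peer: True if the peer is reported as Connected."""
    states = []
    for line in peer_status_output.splitlines():
        match = STATE_RE.match(line)
        if match:
            states.append(match.group(1) == "Connected")
    return states


def has_majority(peer_status_output: str) -> bool:
    """Assumed rule: more than half of all nodes (peers + local) are up."""
    states = parse_peer_states(peer_status_output)
    total_nodes = len(states) + 1   # peers plus the local node
    active_nodes = sum(states) + 1  # connected peers plus the local node
    return active_nodes * 2 > total_nodes


if __name__ == "__main__":
    print("before restart of glusterd on node2:", has_majority(BEFORE))
    print("after restart of glusterd on node2:", has_majority(AFTER))
```

In practice the input would come from running `gluster peer status` on the node that issues the restore; the strings here are hard-coded so the sketch runs without a cluster.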
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. http://rhn.redhat.com/errata/RHEA-2014-1278.html