Description of problem:
=======================
If glusterd on a node through which the volume has been mounted goes down while snapshots are being activated, the handshake does not happen when glusterd comes back up, and the snapshot is still shown as 'Stopped' on that node.

With the latest change, snapshots are deactivated by default and have to be activated explicitly before they can be used. As a result, the user can see different information on different mount points if the volume has been mounted through multiple servers.

Version-Release number of selected component (if applicable):
=============================================================
glusterfs 3.6.0.33

How reproducible:
=================
Always

Steps to Reproduce:
===================
(a rough scripted sketch of these steps is given under Additional info below)

1. Create a 2x2 distributed-replicate volume and start it.
2. FUSE and NFS mount the volume from 2 servers:

mount -t glusterfs 10.70.40.169:/vol3 /mnt/vol3_fuse
mount -t nfs -o vers=3,nolock 10.70.40.169:/vol3 /mnt/vol3_nfs/
mount -t glusterfs 10.70.40.170:/vol3 /mnt/vol3_fuse1
mount -t nfs -o vers=3,nolock 10.70.40.170:/vol3 /mnt/vol3_nfs1/

3. Enable USS.
4. Create data from all 4 mount points.
5. Take snapshot vol3-snap1 of the volume.
6. Check snapshot info on all the nodes - it shows 'Stopped':

gluster snapshot info vol3-snap1
Snapshot : vol3-snap1
Snap UUID : 9209b34d-ba86-41dc-a8a4-05aadfd67951
Created : 2014-11-19 16:03:47
Snap Volumes:

        Snap Volume Name : c8e9b7472fdd46039ea82684b178ed4a
        Origin Volume name : vol3
        Snaps taken for vol3 : 1
        Snaps available for vol3 : 255
        Status : Stopped

7. Stop glusterd on snapshot14 (10.70.40.170).
8. Activate the snapshot:

gluster snapshot activate vol3-snap1
Snapshot activate: vol3-snap1: Snap activated successfully

9. Start glusterd on snapshot14 (10.70.40.170).
10. Check snapshot info on all 4 nodes - it shows 'Started' on all nodes except snapshot14 (10.70.40.170).
11. cd to .snaps on each mount.

Volume mounted through 10.70.40.169:
====================================

fuse mount:
~~~~~~~~~~~
[root@dhcp-0-97 vol3_fuse]# cd .snaps
[root@dhcp-0-97 .snaps]# ll
total 0
drwxr-xr-x. 7 root root 158 Nov 19 16:03 vol3-snap1
[root@dhcp-0-97 .snaps]# cd vol3-snap1/
[root@dhcp-0-97 vol3-snap1]# ll
total 0
drwxr-xr-x. 2 root root 6 Nov 19 16:01 d1_fuse
drwxr-xr-x. 2 root root 6 Nov 19 16:03 d1_fuse1
drwxr-xr-x. 2 root root 6 Nov 19 16:01 d1_nfs
drwxr-xr-x. 2 root root 6 Nov 19 16:03 d1_nfs1
[root@dhcp-0-97 vol3-snap1]# pwd
/mnt/vol3_fuse/.snaps/vol3-snap1

nfs mount:
~~~~~~~~~~
[root@dhcp-0-97 vol3_nfs]# cd .snaps
[root@dhcp-0-97 .snaps]# ll
total 0
drwxr-xr-x. 7 root root 158 Nov 19 16:03 vol3-snap1
[root@dhcp-0-97 .snaps]# cd vol3-snap1/
[root@dhcp-0-97 vol3-snap1]# ll
total 0
drwxr-xr-x. 2 root root 12 Nov 19 16:01 d1_fuse
drwxr-xr-x. 2 root root 12 Nov 19 16:03 d1_fuse1
drwxr-xr-x. 2 root root 12 Nov 19 16:01 d1_nfs
drwxr-xr-x. 2 root root 12 Nov 19 16:03 d1_nfs1
[root@dhcp-0-97 vol3-snap1]# pwd
/mnt/vol3_nfs/.snaps/vol3-snap1

Volume mounted through 10.70.40.170 (where glusterd was down while snap activate was done):
===========================================================================================

fuse mount:
~~~~~~~~~~~
[root@dhcp-0-97 .snaps]# pwd
/mnt/vol3_fuse1/.snaps
[root@dhcp-0-97 .snaps]# ll
total 0

nfs mount:
~~~~~~~~~~
[root@dhcp-0-97 vol3_nfs1]# cd .snaps
[root@dhcp-0-97 .snaps]# ll
total 0
[root@dhcp-0-97 .snaps]# pwd
/mnt/vol3_nfs1/.snaps

Actual results:
===============
If glusterd on the node through which the volume has been mounted goes down while the snapshot is being activated, the handshake does not happen when glusterd comes back up, and the snapshot is still shown as 'Stopped' on that node. Accessing .snaps from the mount that goes through that node shows no snapshots.

Expected results:
=================
After glusterd comes back up on the node, the handshake should happen and the snapshot should be shown as 'Started' on all nodes; accessing .snaps from all mounts should show the same information.

Additional info:
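For reference, a rough shell sketch of the reproduction steps above. This is a sketch only, not a tested script: only 10.70.40.169 and 10.70.40.170 are named in this report, so the other two node IPs, the brick paths, and passwordless ssh access are assumptions, and the bricks are assumed to be on thinly provisioned LVs as snapshots require.

#!/bin/bash
# Steps 1 and 3: create and start a 2x2 dist-rep volume, enable USS
#                (run on any server node, e.g. 10.70.40.169)
gluster volume create vol3 replica 2 \
    10.70.40.169:/bricks/brick0/vol3 10.70.40.170:/bricks/brick0/vol3 \
    10.70.40.171:/bricks/brick0/vol3 10.70.40.172:/bricks/brick0/vol3
gluster volume start vol3
gluster volume set vol3 features.uss enable

# Step 2: FUSE and NFS mount the volume from 2 servers (run on the client)
mount -t glusterfs 10.70.40.169:/vol3 /mnt/vol3_fuse
mount -t nfs -o vers=3,nolock 10.70.40.169:/vol3 /mnt/vol3_nfs/
mount -t glusterfs 10.70.40.170:/vol3 /mnt/vol3_fuse1
mount -t nfs -o vers=3,nolock 10.70.40.170:/vol3 /mnt/vol3_nfs1/

# Step 4: create data from all 4 mount points (run on the client)
mkdir /mnt/vol3_fuse/d1_fuse /mnt/vol3_nfs/d1_nfs \
      /mnt/vol3_fuse1/d1_fuse1 /mnt/vol3_nfs1/d1_nfs1

# Steps 5-6: take a snapshot and confirm it is shown as Stopped
gluster snapshot create vol3-snap1 vol3
gluster snapshot info vol3-snap1

# Steps 7-9: stop glusterd on 10.70.40.170, activate the snapshot from
#            another node, then bring glusterd back up
ssh 10.70.40.170 'service glusterd stop'
gluster snapshot activate vol3-snap1
ssh 10.70.40.170 'service glusterd start'

# Steps 10-11: compare snapshot info on all nodes and list .snaps on every mount
gluster snapshot info vol3-snap1
ls /mnt/vol3_fuse/.snaps /mnt/vol3_nfs/.snaps /mnt/vol3_fuse1/.snaps /mnt/vol3_nfs1/.snaps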
Duplicate of https://bugzilla.redhat.com/show_bug.cgi?id=1122064.

upstream patch : http://review.gluster.org/#/c/9664/
Not seeing snapshots in the .snaps directory on the mount that goes through the node where glusterd was down when snap activate was done:

[root@dhcp46-157 fuse]# cd .snaps
[root@dhcp46-157 .snaps]# pwd
/mnt/fuse/.snaps
[root@dhcp46-157 .snaps]# ll
total 0

Able to see snaps from the other client:

[root@dhcp47-13 .snaps]# pwd
/mnt/fuse/.snaps
[root@dhcp47-13 .snaps]# ls
snap2

Snapshot info output from the node which was down when the snapshot was activated:

[root@rhs-arch-srv2 ~]# gluster snapshot info snap2
Snapshot : snap2
Snap UUID : 1421b902-bda3-4604-aa6b-9d2ef52832a1
Created : 2017-05-03 08:15:20
Snap Volumes:

        Snap Volume Name : 1cd3bcaeea0447418f0b1ad80c3ec3b6
        Origin Volume name : vol1
        Snaps taken for vol1 : 2
        Snaps available for vol1 : 254
        Status : Started

Able to reproduce this bug, hence marking it as FailedQA.
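A minimal sketch of the client-side comparison above, assuming passwordless ssh from dhcp46-157 to dhcp47-13 and the same /mnt/fuse mount point on both clients:

# Run on dhcp46-157: diff the snapshot names visible through each client's .snaps.
# Any output means the two mounts disagree about which snapshots exist.
diff <(ls /mnt/fuse/.snaps) <(ssh dhcp47-13 'ls /mnt/fuse/.snaps')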
upstream master patch : https://review.gluster.org/17178
downstream patch : https://code.engineering.redhat.com/gerrit/#/c/105517
[root@rhs-arch-srv2 core]# gluster snapshot info snap0
Snapshot : snap0
Snap UUID : 80f0b29d-b7da-419b-b595-8d216f1ffafc
Created : 2017-06-20 06:58:05
Snap Volumes:

        Snap Volume Name : 6ce57c9284d34f828a1927c9aaeb14db
        Origin Volume name : newvolume
        Snaps taken for newvolume : 1
        Snaps available for newvolume : 255
        Status : Stopped

[root@rhs-arch-srv2 core]# service glusterd stop

[root@rhs-arch-srv1 core]# gluster snapshot activate snap0
Snapshot activate: snap0: Snap activated successfully

[root@rhs-arch-srv2 core]# service glusterd start
Redirecting to /bin/systemctl start glusterd.service

[root@rhs-arch-srv2 core]# gluster snapshot info snap0
Snapshot : snap0
Snap UUID : 80f0b29d-b7da-419b-b595-8d216f1ffafc
Created : 2017-06-20 06:58:05
Snap Volumes:

        Snap Volume Name : 6ce57c9284d34f828a1927c9aaeb14db
        Origin Volume name : newvolume
        Snaps taken for newvolume : 1
        Snaps available for newvolume : 255
        Status : Started

Bug verified on build glusterfs-3.8.4-28.el7rhgs.x86_64.
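For completeness, a small sketch of the consistency check performed above. It assumes passwordless ssh to the server nodes; only rhs-arch-srv1 and rhs-arch-srv2 are named in this bug, so add the remaining peers as appropriate:

# Print the snapshot's Status as seen by each node after glusterd is restarted;
# every node should report "Started" once the handshake has completed.
for node in rhs-arch-srv1 rhs-arch-srv2; do
    echo -n "$node: "
    ssh "$node" gluster snapshot info snap0 | grep 'Status'
done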
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2017:2774