Bug 1238111

Summary: Detaching a peer from the cluster doesn't remove snap related info and peer probe initiated from that node fails
Product: [Red Hat Storage] Red Hat Gluster Storage Reporter: SATHEESARAN <sasundar>
Component: snapshot Assignee: Mohammed Rafi KC <rkavunga>
Status: CLOSED ERRATA QA Contact: SATHEESARAN <sasundar>
Severity: high Docs Contact:
Priority: unspecified    
Version: rhgs-3.1 CC: asengupt, ashah, asrivast, jbyers, nsathyan, rhs-bugs, rjoseph, rkavunga, storage-qa-internal
Target Milestone: --- Keywords: Triaged, ZStream
Target Release: RHGS 3.1.1   
Hardware: x86_64   
OS: Linux   
Whiteboard: SNAPSHOT
Fixed In Version: glusterfs-3.7.1-13 Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2015-10-05 07:17:08 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On: 1203185, 1255384    
Bug Blocks: 1251815    

Description SATHEESARAN 2015-07-01 08:49:53 UTC
Description of problem:
-----------------------
When a node is removed from the 'Trusted Storage Pool' ( gluster cluster ), /var/lib/glusterd/snaps on that node is not cleaned up.
As a result, a peer probe initiated from the detached node fails.

Version-Release number of selected component (if applicable):
-------------------------------------------------------------
RHGS 3.1 Nightly build ( glusterfs-3.7.1-6.el6rhs )

How reproducible:
-----------------
Always

Steps to Reproduce:
--------------------
1. Create a 'Trusted Storage Pool' of 3 nodes 
2. Create a volume with bricks from first 2 nodes and start it
3. Create a volume-level snapshot of the volume
4. Activate the snapshot
5. Enable USS ( user-serviceable-snapshot ) on the volume
6. Detach one of the nodes from the cluster
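
The steps above can be sketched as a script, run from the first node. This is only an illustration, not the exact commands used when filing this bug: the host names, volume name, snapshot name and brick path are placeholders, and which node gets detached is an assumption (here the one without bricks). Snapshot creation also needs bricks on thinly provisioned LVM, and depending on the build the snapshot name may get a timestamp suffix. The script only prints the commands unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Placeholders (assumptions, not taken from the bug report).
NODE1=${NODE1:-node1}
NODE2=${NODE2:-node2}
NODE3=${NODE3:-node3}
VOL=${VOL:-testvol}
SNAP=${SNAP:-snap1}
BRICK=${BRICK:-/bricks/b1}

# Print each command; execute it only when DRY_RUN=0.
run() {
    echo "+ $*"
    if [ "${DRY_RUN:-1}" = "0" ]; then
        "$@"
    fi
}

reproduce() {
    # Step 1: form a 3-node Trusted Storage Pool (run on NODE1).
    run gluster peer probe "$NODE2"
    run gluster peer probe "$NODE3"
    # Step 2: volume with bricks from the first two nodes, then start it.
    run gluster volume create "$VOL" "$NODE1:$BRICK" "$NODE2:$BRICK"
    run gluster volume start "$VOL"
    # Step 3: volume-level snapshot.
    run gluster snapshot create "$SNAP" "$VOL"
    # Step 4: activate the snapshot.
    run gluster snapshot activate "$SNAP"
    # Step 5: enable USS on the volume.
    run gluster volume set "$VOL" features.uss enable
    # Step 6: detach a node (assumed: the one hosting no bricks).
    run gluster peer detach "$NODE3"
}
```

Calling `reproduce` prints the command sequence; `DRY_RUN=0 reproduce` would execute it and should only be tried on a disposable test cluster.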

Actual results:
---------------
The snapshot information under '/var/lib/glusterd/snaps' is not removed

Expected results:
-----------------
When the node is removed from the 'Trusted Storage Pool', the snapshot related information available under '/var/lib/glusterd/snaps/' should be removed

Comment 4 Mohammed Rafi KC 2015-07-23 08:42:17 UTC
upstream patch : http://review.gluster.org/#/c/9930/

Comment 6 Avra Sengupta 2015-08-20 05:59:33 UTC
Patch upstream http://review.gluster.org/#/c/9930/

Comment 8 SATHEESARAN 2015-08-29 02:53:09 UTC
Tested with glusterfs-3.7.1-13.el7rhgs

1. Created a 2 node cluster
2. Created a distributed volume with bricks only on node1 and started the volume
3. Created a snapshot of the volume and activated it
4. Enabled USS on the volume
5. Now snap related information was available in node1 and node2 under '/var/lib/glusterd/snaps'. Also snapd was running on both the nodes
6. Detached the node2 from node1

Observed that '/var/lib/glusterd/snaps' is cleaned up on node2 ( which is detached from the cluster ) and that snapd is no longer running on that node

[root@ ~]# ls /var/lib/glusterd/snaps/snap1/
0cdf1e35bccb41618a32c3d41b20219d  geo-replication  info

[root@ ~]# ls /var/lib/glusterd/snaps/snap1/
ls: cannot access /var/lib/glusterd/snaps/snap1/: No such file or directory

[root@ ~]# ls /var/lib/glusterd/snaps/
missed_snaps_list
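
The same check can be scripted for repeated runs. A minimal sketch, assuming that every subdirectory under the snaps directory is per-snapshot state and that plain files such as missed_snaps_list are expected to remain after a detach; `leftover_snaps` is a hypothetical helper name, not part of gluster.

```shell
#!/bin/sh
# Print the name of every snapshot directory still present under the
# given glusterd snaps directory (default: /var/lib/glusterd/snaps).
leftover_snaps() {
    snaps_dir=${1:-/var/lib/glusterd/snaps}
    # Nothing to report if the directory itself is missing.
    [ -d "$snaps_dir" ] || return 0
    for entry in "$snaps_dir"/*; do
        # Only directories are reported; regular files such as
        # missed_snaps_list are skipped.
        if [ -d "$entry" ]; then
            basename "$entry"
        fi
    done
}
```

On the detached node, `leftover_snaps` printing nothing corresponds to the cleaned-up state shown above; any name it prints is a snapshot directory that was left behind.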

Marking this bug as VERIFIED

Comment 10 errata-xmlrpc 2015-10-05 07:17:08 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHSA-2015-1845.html