Bug 1100282 - [SNAPSHOT]: cleanup of stale snap volume doesn't happen after missed restore and gluster CLIs hang
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Gluster Storage
Classification: Red Hat Storage
Component: snapshot
Version: rhgs-3.0
Hardware: x86_64
OS: Linux
Priority: high
Severity: urgent
Target Milestone: ---
Target Release: RHGS 3.0.0
Assignee: Avra Sengupta
QA Contact: Rahul Hinduja
URL:
Whiteboard: SNAPSHOT
Depends On: 1108652 1109024
Blocks: 1067342 1100324
 
Reported: 2014-05-22 12:28 UTC by Rahul Hinduja
Modified: 2016-09-17 13:02 UTC
CC: 7 users

Fixed In Version: glusterfs-3.6.0.16-1.el6rhs
Doc Type: Bug Fix
Doc Text:
Clone Of:
: 1100324 (view as bug list)
Environment:
Last Closed: 2014-09-22 19:39:09 UTC
Embargoed:




Links
System: Red Hat Product Errata
ID: RHEA-2014:1278
Private: 0
Priority: normal
Status: SHIPPED_LIVE
Summary: Red Hat Storage Server 3.0 bug fix and enhancement update
Last Updated: 2014-09-22 23:26:55 UTC

Description Rahul Hinduja 2014-05-22 12:28:13 UTC
Description of problem:
=======================

When a restore happens as part of a missed restore, the snap volume entry under the snaps directory is not cleared even after the restore completes successfully. In addition, all gluster CLI commands time out.


Version-Release number of selected component (if applicable):
=============================================================
glusterfs-3.6.0.5-1.el6rhs.x86_64


How reproducible:
=================
1/1


Steps to Reproduce:
===================

1. Create and start a 2x2 volume spanning 4 nodes
2. Create a snapshot of the volume
3. Kill glusterd on node2
4. Bring down node4 (power off)
5. Stop the volume from node1 (gluster volume stop <volname>)
6. Restore the volume to the snapshot taken at step 2
7. Start glusterd on node2
8. Bring node4 back up (see the command sketch below)
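A minimal command-level sketch of these steps; the volume name (vol0), snapshot name (snap1), brick paths, and host names are hypothetical placeholders, and snapshot creation assumes the bricks sit on thinly provisioned LVM, as gluster snapshots require:

# On node1: create and start the 2x2 volume (step 1)
gluster volume create vol0 replica 2 node1:/bricks/b1 node2:/bricks/b1 node3:/bricks/b1 node4:/bricks/b1
gluster volume start vol0

# On node1: take the snapshot (step 2)
gluster snapshot create snap1 vol0

# On node2: stop glusterd (step 3); then power off node4 (step 4)
service glusterd stop

# On node1: stop the volume and restore it to the snapshot (steps 5 and 6)
gluster volume stop vol0
gluster snapshot restore snap1

# On node2: start glusterd again (step 7); then power node4 back on (step 8)
service glusterd start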

Actual results:
===============

1. The snap volume entry under /var/lib/glusterd/snaps/ on node2 and node4 is still present
2. Any gluster command hangs and eventually times out, since glusterd keeps trying to complete the handshake


Expected results:
=================
1. The snap volume entry under /var/lib/glusterd/snaps/ on node2 and node4 should be deleted
2. Any gluster command should succeed (see the verification sketch below)
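A short sketch of how these expectations can be checked once node2 and node4 are back, assuming the hypothetical vol0/snap1 names from the sketch above:

# On node2 and node4: the restored snap's directory should no longer be listed
ls /var/lib/glusterd/snaps/

# On any node: the CLI should respond promptly instead of hanging until it times out
gluster volume info vol0
gluster snapshot list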

Comment 4 Avra Sengupta 2014-05-27 11:56:35 UTC
Fix at https://code.engineering.redhat.com/gerrit/25736

Comment 5 Rahul Hinduja 2014-06-12 12:05:07 UTC
During verification, hit bz 1108652; marking it as a dependency for verification

Comment 6 senaik 2014-06-17 07:01:39 UTC
Version : glusterfs 3.6.0.17
=======

Repeated the steps as mentioned in 'Steps to Reproduce'; the missed snaps list shows the entries for the snap volume UUID ending in 3:2.

Marking the bug 'Verified'.


Node where glusterd was down :
============================
cat /var/lib/glusterd/snaps/missed_snaps_list |grep cbe565c107bf4a589420730504d1f9f8
ad1244a5-7c43-4812-808d-cae70399fecf:c8c90378-4ec4-41ff-83c9-4d731916c064=cbe565c107bf4a589420730504d1f9f8:2:/var/run/gluster/snaps/cbe565c107bf4a589420730504d1f9f8/brick2/b1:3:2
9f6160b0-a4db-47a5-ab8f-b7f0a328eadc:c8c90378-4ec4-41ff-83c9-4d731916c064=cbe565c107bf4a589420730504d1f9f8:4:/var/run/gluster/snaps/cbe565c107bf4a589420730504d1f9f8/brick4/b1:3:2


Node which was powered off and brought back up :
==============================================
cat /var/lib/glusterd/snaps/missed_snaps_list |grep cbe565c107bf4a589420730504d1f9f8
ad1244a5-7c43-4812-808d-cae70399fecf:c8c90378-4ec4-41ff-83c9-4d731916c064=cbe565c107bf4a589420730504d1f9f8:2:/var/run/gluster/snaps/cbe565c107bf4a589420730504d1f9f8/brick2/b1:3:2
9f6160b0-a4db-47a5-ab8f-b7f0a328eadc:c8c90378-4ec4-41ff-83c9-4d731916c064=cbe565c107bf4a589420730504d1f9f8:4:/var/run/gluster/snaps/cbe565c107bf4a589420730504d1f9f8/brick4/b1:3:2
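To make those entries easier to read, here is a small awk sketch that splits one line into labelled fields; it assumes the layout glusterd writes for missed-snap entries (node-uuid:snap-uuid=snap-volume-id:brick-number:brick-path:op:status), under which the trailing 3:2 would be read as op 3 (restore) with status 2 (done):

# Assumed field layout: node-uuid : snap-uuid = snap-vol-id : brick-num : brick-path : op : status
grep cbe565c107bf4a589420730504d1f9f8 /var/lib/glusterd/snaps/missed_snaps_list | \
  awk -F'[:=]' '{ printf "node-uuid=%s snap-uuid=%s snap-vol-id=%s brick-num=%s brick-path=%s op=%s status=%s\n", $1, $2, $3, $4, $5, $6, $7 }'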

Comment 8 errata-xmlrpc 2014-09-22 19:39:09 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHEA-2014-1278.html

