Description of problem:
Restoring a snapshot of a volume that has already been restored from an earlier snapshot fails. To be more precise: take two snapshots (snap1 and snap2) of volume "vol1" and restore snap1 (the restore succeeds); a subsequent attempt to restore snap2 fails in post-validation, and "gluster volume info" then goes into an infinite loop.

Version-Release number of selected component (if applicable):

How reproducible:
1/1

Steps to Reproduce:
1. Create a volume (vol1)
2. Take 2 snapshots of volume "vol1" (say, snap1 and snap2)
3. Stop the volume vol1
4. Restore snap1
5. Restore snap2

Actual results:
Restoring "snap2" fails with a post-validation error.

Expected results:
The restore should not fail.

Additional info:
[2014-06-27 03:28:26.271891] W [glusterd-snapshot.c:2201:glusterd_lvm_snapshot_remove] 0-management: Failed to rmdir: /var/run/gluster/snaps/775cf44b9ba7468b91ec8863f698d900/, err: Directory not empty. More than one glusterd running on this node.
[2014-06-27 03:28:26.271946] E [glusterd-snapshot.c:6766:glusterd_snapshot_restore_cleanup] 0-management: Failed to remove LVM backend
[2014-06-27 03:28:26.271966] E [glusterd-snapshot.c:6995:glusterd_snapshot_restore_postop] 0-management: Failed to perform snapshot restore cleanup for vol1 volume
[2014-06-27 03:28:26.271984] E [glusterd-snapshot.c:7083:glusterd_snapshot_postvalidate] 0-management: Failed to perform snapshot restore post-op
[2014-06-27 03:28:26.272000] W [glusterd-mgmt.c:248:gd_mgmt_v3_post_validate_fn] 0-management: postvalidate operation failed
[2014-06-27 03:28:26.272017] E [glusterd-mgmt.c:1335:glusterd_mgmt_v3_post_validate] 0-management: Post Validation failed for operation Snapshot on local node
[2014-06-27 03:28:26.272035] E [glusterd-mgmt.c:1944:glusterd_mgmt_v3_initiate_snap_phases] 0-management: Post Validation Failed
[2014-06-27 03:28:26.273648] I [socket.c:2246:socket_event_handler] 0-transport: disconnecting now
.......................................................................
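A rough CLI sketch of the reproduction steps above, assuming a hypothetical single-node setup with a thinly provisioned LVM brick at server1:/bricks/brick1 (the gluster commands are the standard snapshot CLI; the hostname, brick path, and volume layout are illustrative only):

gluster volume create vol1 server1:/bricks/brick1/vol1
gluster volume start vol1
gluster snapshot create snap1 vol1
gluster snapshot create snap2 vol1
gluster volume stop vol1
gluster snapshot restore snap1    # succeeds
gluster snapshot restore snap2    # fails in post-validation on this build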
Upstream fix at http://review.gluster.org/#/c/8192/
Fix at https://code.engineering.redhat.com/gerrit/28120
Version : glusterfs 3.6.0.22 built on Jun 23 2014
=======
A similar issue is seen on the glusterfs 3.6.0.22 build after the volume has already been restored once: a consecutive restore operation on the volume fails.

Steps to reproduce:
===================
Create a 2x2 distributed-replicate volume
Fuse/NFS mount the volume
Create 1000+ files (empty files) and 100+ directories
Stop IO
Create 2-3 snapshots of the volume

[root@snapshot13 ~]# gluster snapshot create ss1 vol0
snapshot create: success: Snap ss1 created successfully
[root@snapshot13 ~]# gluster snapshot create ss2 vol0
snapshot create: success: Snap ss2 created successfully
[root@snapshot13 ~]# gluster snapshot create ss3 vol0
snapshot create: success: Snap ss3 created successfully

Stop the volume

[root@snapshot13 ~]# gluster volume stop vol0
Stopping volume will make its data inaccessible. Do you want to continue? (y/n) y
volume stop: vol0: success

Restore the first snap --> the restore is successful

[root@snapshot13 ~]# gluster snapshot restore ss1
Snapshot restore: ss1: Snap restored successfully

Restore the volume to another snap. It fails as below:

[root@snapshot13 ~]# gluster snapshot restore ss2
Snapshot command failed

Additional Info :
===============
- The CLI reports "Snapshot command failed" because the operation crossed the 2-minute CLI window, but the snapshot is actually restored successfully some time later.
- During cleanup, glusterd does a recursive remove of the files/directories, which takes a long time. Until this cleanup completes, the user gets "Another Transaction is in progress" for any other operation on the volume, because the volume lock is still held (a sketch for following the cleanup and confirming the restore is given after the log excerpt below).

--------------------Part of the log-------------------
[2014-07-02 07:03:49.582498] D [glusterd-utils.c:12667:glusterd_recursive_rmdir] 0-management: Removed manhoos-fuse.vol
[2014-07-02 07:03:49.582575] D [glusterd-utils.c:12667:glusterd_recursive_rmdir] 0-management: Removed 10.70.35.240:-var-run-gluster-snaps-ac142178aafa40939e22f4fcb642b18a-brick1
[2014-07-02 07:03:49.582657] D [glusterd-utils.c:12667:glusterd_recursive_rmdir] 0-management: Removed 10.70.35.172:-var-run-gluster-snaps-ac142178aafa40939e22f4fcb642b18a-brick2
[2014-07-02 07:03:49.582723] D [glusterd-utils.c:12667:glusterd_recursive_rmdir] 0-management: Removed bricks
[2014-07-02 07:03:49.582773] D [glusterd-utils.c:12667:glusterd_recursive_rmdir] 0-management: Removed node_state.info
[2014-07-02 07:03:49.582823] D [glusterd-utils.c:12667:glusterd_recursive_rmdir] 0-management: Removed manhoos.10.70.35.172.var-run-gluster-snaps-ac142178aafa40939e22f4fcb642b18a-brick2.vol
[2014-07-02 07:03:49.582868] D [glusterd-utils.c:12667:glusterd_recursive_rmdir] 0-management: Removed rbstate
[2014-07-02 07:03:49.582913] D [glusterd-utils.c:12667:glusterd_recursive_rmdir] 0-management: Removed cksum
[2014-07-02 07:03:49.582965] D [glusterd-utils.c:12667:glusterd_recursive_rmdir] 0-management: Removed vols-manhoos.deleted
[2014-07-02 07:03:49.583409] D [glusterd-utils.c:12640:glusterd_recursive_rmdir] 0-management: Failed to open directory /var/lib/glusterd/trash/vols-manhoos.deleted. Reason : No such file or directory
[2014-07-02 07:13:16.079409] D [glusterd-utils.c:12667:glusterd_recursive_rmdir] 0-management: Removed xattrop
[2014-07-02 07:13:16.101493] D [glusterd-utils.c:12667:glusterd_recursive_rmdir] 0-management: Removed indices
[2014-07-02 07:13:16.125600] D [glusterd-utils.c:12667:glusterd_recursive_rmdir] 0-management: Removed htime
[2014-07-02 07:13:16.152486] D [glusterd-utils.c:12667:glusterd_recursive_rmdir] 0-management: Removed changelogs
[2014-07-02 07:13:16.187373] D [glusterd-utils.c:12667:glusterd_recursive_rmdir] 0-management: Removed 00000000-0000-0000-0000-000000000001
[2014-07-02 07:13:16.204389] D [glusterd-utils.c:12667:glusterd_recursive_rmdir] 0-management: Removed 0000e5cb-cf24-4883-b39a-284adef3afe8
[2014-07-02 07:13:16.223420] D [glusterd-utils.c:12667:glusterd_recursive_rmdir] 0-management: Removed 000098c4-c46c-42f1-b9c2-88a054e24093
[2014-07-02 07:13:16.251561] D [glusterd-utils.c:12667:glusterd_recursive_rmdir] 0-management: Removed 0000aee4-37a8-4cdf-a008-32127a29e474
[2014-07-02 07:13:16.251646] D [glusterd-utils.c:12667:glusterd_recursive_rmdir] 0-management: Removed 00
[2014-07-02 07:13:16.286690] D [glusterd-utils.c:12667:glusterd_recursive_rmdir] 0-management: Removed 00b23374-9fee-498d-b05d-b3046a625033
[2014-07-02 07:13:16.305662] D [glusterd-utils.c:12667:glusterd_recursive_rmdir] 0-management: Removed 00b246cc-dbc0-4bde-a8d5-17edbd60d360
[2014-07-02 07:13:16.305742] D [glusterd-utils.c:12667:glusterd_recursive_rmdir] 0-management: Removed b2
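Since the CLI gives up after its 2-minute window while the restore cleanup keeps running in the background, a rough way to follow the cleanup and confirm that the second restore actually completed (a sketch only: the glusterd log path is the usual default and may differ, and the retry loop is illustrative):

# Follow the cleanup progress; the recursive_rmdir messages above are debug-level,
# so this only shows output if debug logging is enabled:
tail -f /var/log/glusterfs/etc-glusterfs-glusterd.vol.log | grep recursive_rmdir

# Operations that need the volume lock fail with "Another transaction is in
# progress" until the cleanup finishes; retrying is enough, e.g.:
until gluster volume start vol0; do sleep 30; done

# Confirm the restore once the lock is released:
gluster snapshot list vol0
gluster volume info vol0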
Verified with build: glusterfs-3.6.0.24-1

Working as expected; multiple consecutive restores are successful. Moving the bug to the verified state.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. http://rhn.redhat.com/errata/RHEA-2014-1278.html