Description of problem:
Restoring a snapshot of a volume that has already been restored from an earlier snapshot fails. To be more precise: take two snapshots (snap1 and snap2) of volume "vol1" and restore snap1 (the restore succeeds); a subsequent attempt to restore snap2 fails in post-validation, and "gluster volume info" then goes into an infinite loop.

Version-Release number of selected component (if applicable):

How reproducible:
1/1

Steps to Reproduce:
1. Create a volume (vol1)
2. Take 2 snapshots of volume "vol1" (say, snap1 and snap2)
3. Stop the volume vol1
4. Restore snap1
5. Restore snap2

Actual results:
Restoring "snap2" fails with a post-validation error.

Expected results:
The restore should not fail.

Additional info:
[2014-06-27 03:28:26.271891] W [glusterd-snapshot.c:2201:glusterd_lvm_snapshot_remove] 0-management: Failed to rmdir: /var/run/gluster/snaps/775cf44b9ba7468b91ec8863f698d900/, err: Directory not empty. More than one glusterd running on this node.
[2014-06-27 03:28:26.271946] E [glusterd-snapshot.c:6766:glusterd_snapshot_restore_cleanup] 0-management: Failed to remove LVM backend
[2014-06-27 03:28:26.271966] E [glusterd-snapshot.c:6995:glusterd_snapshot_restore_postop] 0-management: Failed to perform snapshot restore cleanup for vol1 volume
[2014-06-27 03:28:26.271984] E [glusterd-snapshot.c:7083:glusterd_snapshot_postvalidate] 0-management: Failed to perform snapshot restore post-op
[2014-06-27 03:28:26.272000] W [glusterd-mgmt.c:248:gd_mgmt_v3_post_validate_fn] 0-management: postvalidate operation failed
[2014-06-27 03:28:26.272017] E [glusterd-mgmt.c:1335:glusterd_mgmt_v3_post_validate] 0-management: Post Validation failed for operation Snapshot on local node
[2014-06-27 03:28:26.272035] E [glusterd-mgmt.c:1944:glusterd_mgmt_v3_initiate_snap_phases] 0-management: Post Validation Failed
[2014-06-27 03:28:26.273648] I [socket.c:2246:socket_event_handler] 0-transport: disconnecting now
.......................................................................
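A rough CLI sketch of the reproduction steps above, assuming a hypothetical single-node setup with a thinly provisioned LVM brick at server1:/bricks/brick1 (the gluster commands are the standard snapshot CLI; the hostname, brick path, and volume layout are illustrative only):

gluster volume create vol1 server1:/bricks/brick1/vol1
gluster volume start vol1
gluster snapshot create snap1 vol1
gluster snapshot create snap2 vol1
gluster volume stop vol1
gluster snapshot restore snap1    # succeeds
gluster snapshot restore snap2    # fails in post-validation on this build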
Upstream fix at http://review.gluster.org/#/c/8192/
Fix at https://code.engineering.redhat.com/gerrit/28120
Version : glusterfs 3.6.0.22 built on Jun 23 2014
=======
A similar issue is seen on the glusterfs 3.6.0.22 build after the volume has already been restored once: a consecutive restore operation on the volume fails.

Steps to reproduce:
===================
Create a 2x2 distributed-replicate volume
Fuse/NFS mount the volume
Create 1000+ files (empty files) and 100+ directories
Stop IO
Create 2-3 snapshots of the volume

[root@snapshot13 ~]# gluster snapshot create ss1 vol0
snapshot create: success: Snap ss1 created successfully
[root@snapshot13 ~]# gluster snapshot create ss2 vol0
snapshot create: success: Snap ss2 created successfully
[root@snapshot13 ~]# gluster snapshot create ss3 vol0
snapshot create: success: Snap ss3 created successfully

Stop the volume

[root@snapshot13 ~]# gluster volume stop vol0
Stopping volume will make its data inaccessible. Do you want to continue? (y/n) y
volume stop: vol0: success

Restore the first snap --> the restore is successful

[root@snapshot13 ~]# gluster snapshot restore ss1
Snapshot restore: ss1: Snap restored successfully

Restore the volume to another snap. It fails as below:

[root@snapshot13 ~]# gluster snapshot restore ss2
Snapshot command failed

Additional Info :
===============
- The CLI reports "Snapshot command failed" because the operation crossed the 2-minute CLI window, but the snapshot is actually restored successfully some time later.
- During cleanup, glusterd does a recursive remove of the files/directories, which takes a long time. Until this cleanup completes, the user gets "Another Transaction is in progress" for any other operation on the volume, because the volume lock is still held (a sketch for following the cleanup and confirming the restore is given after the log excerpt below).

--------------------Part of the log-------------------
[2014-07-02 07:03:49.582498] D [glusterd-utils.c:12667:glusterd_recursive_rmdir] 0-management: Removed manhoos-fuse.vol
[2014-07-02 07:03:49.582575] D [glusterd-utils.c:12667:glusterd_recursive_rmdir] 0-management: Removed 10.70.35.240:-var-run-gluster-snaps-ac142178aafa40939e22f4fcb642b18a-brick1
[2014-07-02 07:03:49.582657] D [glusterd-utils.c:12667:glusterd_recursive_rmdir] 0-management: Removed 10.70.35.172:-var-run-gluster-snaps-ac142178aafa40939e22f4fcb642b18a-brick2
[2014-07-02 07:03:49.582723] D [glusterd-utils.c:12667:glusterd_recursive_rmdir] 0-management: Removed bricks
[2014-07-02 07:03:49.582773] D [glusterd-utils.c:12667:glusterd_recursive_rmdir] 0-management: Removed node_state.info
[2014-07-02 07:03:49.582823] D [glusterd-utils.c:12667:glusterd_recursive_rmdir] 0-management: Removed manhoos.10.70.35.172.var-run-gluster-snaps-ac142178aafa40939e22f4fcb642b18a-brick2.vol
[2014-07-02 07:03:49.582868] D [glusterd-utils.c:12667:glusterd_recursive_rmdir] 0-management: Removed rbstate
[2014-07-02 07:03:49.582913] D [glusterd-utils.c:12667:glusterd_recursive_rmdir] 0-management: Removed cksum
[2014-07-02 07:03:49.582965] D [glusterd-utils.c:12667:glusterd_recursive_rmdir] 0-management: Removed vols-manhoos.deleted
[2014-07-02 07:03:49.583409] D [glusterd-utils.c:12640:glusterd_recursive_rmdir] 0-management: Failed to open directory /var/lib/glusterd/trash/vols-manhoos.deleted. Reason : No such file or directory
[2014-07-02 07:13:16.079409] D [glusterd-utils.c:12667:glusterd_recursive_rmdir] 0-management: Removed xattrop
[2014-07-02 07:13:16.101493] D [glusterd-utils.c:12667:glusterd_recursive_rmdir] 0-management: Removed indices
[2014-07-02 07:13:16.125600] D [glusterd-utils.c:12667:glusterd_recursive_rmdir] 0-management: Removed htime
[2014-07-02 07:13:16.152486] D [glusterd-utils.c:12667:glusterd_recursive_rmdir] 0-management: Removed changelogs
[2014-07-02 07:13:16.187373] D [glusterd-utils.c:12667:glusterd_recursive_rmdir] 0-management: Removed 00000000-0000-0000-0000-000000000001
[2014-07-02 07:13:16.204389] D [glusterd-utils.c:12667:glusterd_recursive_rmdir] 0-management: Removed 0000e5cb-cf24-4883-b39a-284adef3afe8
[2014-07-02 07:13:16.223420] D [glusterd-utils.c:12667:glusterd_recursive_rmdir] 0-management: Removed 000098c4-c46c-42f1-b9c2-88a054e24093
[2014-07-02 07:13:16.251561] D [glusterd-utils.c:12667:glusterd_recursive_rmdir] 0-management: Removed 0000aee4-37a8-4cdf-a008-32127a29e474
[2014-07-02 07:13:16.251646] D [glusterd-utils.c:12667:glusterd_recursive_rmdir] 0-management: Removed 00
[2014-07-02 07:13:16.286690] D [glusterd-utils.c:12667:glusterd_recursive_rmdir] 0-management: Removed 00b23374-9fee-498d-b05d-b3046a625033
[2014-07-02 07:13:16.305662] D [glusterd-utils.c:12667:glusterd_recursive_rmdir] 0-management: Removed 00b246cc-dbc0-4bde-a8d5-17edbd60d360
[2014-07-02 07:13:16.305742] D [glusterd-utils.c:12667:glusterd_recursive_rmdir] 0-management: Removed b2
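Since the CLI gives up after its 2-minute window while the restore cleanup keeps running in the background, a rough way to follow the cleanup and confirm that the second restore actually completed (a sketch only: the glusterd log path is the usual default and may differ, and the retry loop is illustrative):

# Follow the cleanup progress; the recursive_rmdir messages above are debug-level,
# so this only shows output if debug logging is enabled:
tail -f /var/log/glusterfs/etc-glusterfs-glusterd.vol.log | grep recursive_rmdir

# Operations that need the volume lock fail with "Another transaction is in
# progress" until the cleanup finishes; retrying is enough, e.g.:
until gluster volume start vol0; do sleep 30; done

# Confirm the restore once the lock is released:
gluster snapshot list vol0
gluster volume info vol0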
Verified with build: glusterfs-3.6.0.24-1

Working as expected; multiple consecutive restores are successful. Moving the bug to the verified state.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. http://rhn.redhat.com/errata/RHEA-2014-1278.html