Bug 1404118 - Snapshot: After snapshot restore failure, snapshot goes into inconsistent state
Summary: Snapshot: After snapshot restore failure, snapshot goes into inconsistent state
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: GlusterFS
Classification: Community
Component: snapshot
Version: mainline
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: urgent
Target Milestone: ---
Assignee: Avra Sengupta
QA Contact:
URL:
Whiteboard:
Depends On: 1403672
Blocks: 1405909
 
Reported: 2016-12-13 06:23 UTC by Avra Sengupta
Modified: 2017-03-06 17:39 UTC
CC List: 8 users

Fixed In Version: glusterfs-3.10.0
Doc Type: If docs needed, set a value
Doc Text:
Clone Of: 1403672
Clones: 1405909
Environment:
Last Closed: 2017-03-06 17:39:10 UTC
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Embargoed:



Description Avra Sengupta 2016-12-13 06:23:19 UTC
+++ This bug was initially created as a clone of Bug #1403672 +++

Description of problem:

With reference to bug #1403169: after a snapshot restore failure, the snapshot goes into an inconsistent state. The snapshot cannot be activated, because the activate command reports that the snapshot is already activated, while the snapshot status command shows that some brick processes are down.


Version-Release number of selected component (if applicable):

glusterfs-3.8.4-7.el7rhgs.x86_64


How reproducible:

100%

Steps to Reproduce:
1. Create a 2x2 distributed-replicate volume
2. Enable cluster.enable-shared-storage
3. Enable nfs-ganesha
4. Create a snapshot
5. Disable nfs-ganesha
6. Bring down the gluster-shared-storage volume
7. Restore the snapshot; this command will fail
8. Check the snapshot status, or try taking a clone of the snapshot (see the command sketch below)
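
For reference, a minimal shell sketch of the steps above, assuming the stock gluster CLI with nfs-ganesha integration; hostnames, brick paths, and the volume/snapshot names are placeholders:

    # Step 1: 2x2 distributed-replicate volume (hosts/bricks are placeholders)
    gluster volume create vol1 replica 2 \
        host1:/bricks/b1 host2:/bricks/b2 host3:/bricks/b3 host4:/bricks/b4
    gluster volume start vol1

    # Steps 2-3: shared storage is a prerequisite for nfs-ganesha
    gluster volume set all cluster.enable-shared-storage enable
    gluster nfs-ganesha enable

    # Step 4: a snapshot taken now carries a nfs-ganesha export conf file
    gluster snapshot create snap1 vol1

    # Steps 5-6: disable ganesha and bring shared storage down
    gluster nfs-ganesha disable
    gluster volume set all cluster.enable-shared-storage disable

    # Step 7: restore requires the volume to be stopped; the restore fails
    gluster volume stop vol1
    gluster snapshot restore snap1

    # Step 8: observe the inconsistent state
    gluster snapshot status snap1
    gluster snapshot clone clone1 snap1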

Actual results:

The clone command fails.
The snapshot status command shows that some brick processes are down, while the activate command reports that the snapshot is already activated.
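
Paraphrased transcript of the failing commands (messages condensed from the glusterd logs below; snap1/clone1 are the names used in this report):

    gluster snapshot clone clone1 snap1
    # fails: One or more bricks are not running. Please run snapshot
    # status command to see brick status.

    gluster snapshot activate snap1
    # fails: Snapshot snap1 is already activated.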

Expected results:

The clone command should not fail.


Additional info:



[2016-12-12 06:51:25.513917] E [MSGID: 106122] [glusterd-snapshot.c:2389:glusterd_snapshot_clone_prevalidate] 0-management: Failed to pre validate
[2016-12-12 06:51:25.513948] E [MSGID: 106443] [glusterd-snapshot.c:2405:glusterd_snapshot_clone_prevalidate] 0-management: One or more bricks are not running. Please run snapshot status command to see brick status.
Please start the stopped brick and then issue snapshot clone command 
[2016-12-12 06:51:25.513960] W [MSGID: 106443] [glusterd-snapshot.c:8636:glusterd_snapshot_prevalidate] 0-management: Snapshot clone pre-validation failed
[2016-12-12 06:51:25.513969] W [MSGID: 106122] [glusterd-mgmt.c:167:gd_mgmt_v3_pre_validate_fn] 0-management: Snapshot Prevalidate Failed
[2016-12-12 06:51:25.513978] E [MSGID: 106122] [glusterd-mgmt.c:916:glusterd_mgmt_v3_pre_validate] 0-management: Pre Validation failed for operation Snapshot on local node
[2016-12-12 06:51:25.513987] E [MSGID: 106122] [glusterd-mgmt.c:2272:glusterd_mgmt_v3_initiate_snap_phases] 0-management: Pre Validation Failed
[2016-12-12 06:51:25.514003] E [MSGID: 106027] [glusterd-snapshot.c:8113:glusterd_snapshot_clone_postvalidate] 0-management: unable to find clone clone1 volinfo
[2016-12-12 06:51:25.514012] W [MSGID: 106444] [glusterd-snapshot.c:9136:glusterd_snapshot_postvalidate] 0-management: Snapshot create post-validation failed
[2016-12-12 06:51:25.514019] W [MSGID: 106121] [glusterd-mgmt.c:373:gd_mgmt_v3_post_validate_fn] 0-management: postvalidate operation failed
[2016-12-12 06:51:25.514027] E [MSGID: 106121] [glusterd-mgmt.c:1689:glusterd_mgmt_v3_post_validate] 0-management: Post Validation failed for operation Snapshot on local node
[2016-12-12 06:51:25.514035] E [MSGID: 106122] [glusterd-mgmt.c:2392:glusterd_mgmt_v3_initiate_snap_phases] 0-management: Post Validation Failed

===========================================================

[2016-12-12 07:02:29.274196] E [MSGID: 106116] [glusterd-mgmt.c:135:gd_mgmt_v3_collate_errors] 0-management: Pre Validation failed on 10.70.36.46. Snapshot snap1 is already activated.
[2016-12-12 07:02:29.274267] E [MSGID: 106116] [glusterd-mgmt.c:135:gd_mgmt_v3_collate_errors] 0-management: Pre Validation failed on 10.70.36.71. Snapshot snap1 is already activated.
[2016-12-12 07:02:29.274294] E [MSGID: 106116] [glusterd-mgmt.c:135:gd_mgmt_v3_collate_errors] 0-management: Pre Validation failed on 10.70.44.7. Snapshot snap1 is already activated.
[2016-12-12 07:02:29.274328] E [MSGID: 106122] [glusterd-mgmt.c:979:glusterd_mgmt_v3_pre_validate] 0-management: Pre Validation failed on peers
[2016-12-12 07:02:29.274390] E [MSGID: 106122] [glusterd-mgmt.c:2272:glusterd_mgmt_v3_initiate_snap_phases] 0-management: Pre Validation Failed
===========================================================


[root@rhs-client46 glusterfs]# gluster snapshot status snap1

Snap Name : snap1
Snap UUID : 09114c3e-9ac3-42d7-b8a2-d1c65e0782b8

	Brick Path        :   10.70.36.70:/run/gluster/snaps/d57afcb0ccd74e9cada4953a70831515/brick1/b1
	Volume Group      :   RHS_vg1
	Brick Running     :   No
	Brick PID         :   N/A
	Data Percentage   :   0.55
	LV Size           :   199.00g


	Brick Path        :   10.70.36.71:/run/gluster/snaps/d57afcb0ccd74e9cada4953a70831515/brick2/b2
	Volume Group      :   RHS_vg1
	Brick Running     :   Yes
	Brick PID         :   11850
	Data Percentage   :   0.57
	LV Size           :   199.00g


	Brick Path        :   10.70.36.46:/run/gluster/snaps/d57afcb0ccd74e9cada4953a70831515/brick3/b3
	Volume Group      :   RHS_vg1
	Brick Running     :   Yes
	Brick PID         :   28314
	Data Percentage   :   0.11
	LV Size           :   1.80t


	Brick Path        :   10.70.44.7:/run/gluster/snaps/d57afcb0ccd74e9cada4953a70831515/brick4/b4
	Volume Group      :   RHS_vg1
	Brick Running     :   Yes
	Brick PID         :   24756
	Data Percentage   :   0.16
	LV Size           :   926.85g


	Brick Path        :   10.70.36.70:/run/gluster/snaps/d57afcb0ccd74e9cada4953a70831515/brick5/b5
	Volume Group      :   RHS_vg2
	Brick Running     :   No
	Brick PID         :   N/A
	Data Percentage   :   0.55
	LV Size           :   199.00g


	Brick Path        :   10.70.36.71:/run/gluster/snaps/d57afcb0ccd74e9cada4953a70831515/brick6/b6
	Volume Group      :   RHS_vg2
	Brick Running     :   Yes
	Brick PID         :   11870
	Data Percentage   :   0.57
	LV Size           :   199.00g

--- Additional comment from Red Hat Bugzilla Rules Engine on 2016-12-12 02:06:40 EST ---

This bug is automatically being proposed for the current release of Red Hat Gluster Storage 3 under active development, by setting the release flag 'rhgs-3.2.0' to '?'.

If this bug should be proposed for a different release, please manually change the proposed release flag.

--- Additional comment from Red Hat Bugzilla Rules Engine on 2016-12-12 06:48:13 EST ---

Since this bug has been approved for the RHGS 3.2.0 release of Red Hat Gluster Storage 3, through release flag 'rhgs-3.2.0+', and through the Internal Whiteboard entry of '3.2.0', the Target Release is being automatically set to 'RHGS 3.2.0'

--- Additional comment from Rejy M Cyriac on 2016-12-12 08:13:52 EST ---

At the 'RHGS 3.2.0 - Pre-Devel-Freeze Bug Triage' meeting on 12 December, it was decided that this BZ is accepted for a fix in the RHGS 3.2.0 release.

Comment 1 Worker Ant 2016-12-13 07:23:59 UTC
REVIEW: http://review.gluster.org/16116 (snapshot: Fix restore rollback to reassign snap volume ids to bricks) posted (#1) for review on master by Avra Sengupta (asengupt)

Comment 2 Worker Ant 2016-12-14 09:34:54 UTC
REVIEW: http://review.gluster.org/16116 (snapshot: Fix restore rollback to reassign snap volume ids to bricks) posted (#2) for review on master by Avra Sengupta (asengupt)

Comment 3 Worker Ant 2016-12-14 09:41:57 UTC
REVIEW: http://review.gluster.org/16116 (snapshot: Fix restore rollback to reassign snap volume ids to bricks) posted (#3) for review on master by Avra Sengupta (asengupt)

Comment 4 Worker Ant 2016-12-15 10:18:47 UTC
REVIEW: http://review.gluster.org/16116 (snapshot: Fix restore rollback to reassign snap volume ids to bricks) posted (#4) for review on master by Avra Sengupta (asengupt)

Comment 5 Worker Ant 2016-12-16 09:25:54 UTC
REVIEW: http://review.gluster.org/16116 (snapshot: Fix restore rollback to reassign snap volume ids to bricks) posted (#5) for review on master by Avra Sengupta (asengupt)

Comment 6 Worker Ant 2016-12-17 07:07:34 UTC
COMMIT: http://review.gluster.org/16116 committed in master by Rajesh Joseph (rjoseph) 
------
commit d0528cf2408533b45383a796d419c49fa96e810b
Author: Avra Sengupta <asengupt>
Date:   Tue Dec 13 11:55:19 2016 +0530

    snapshot: Fix restore rollback to reassign snap volume ids to bricks
    
    Added further checks to ensure we do not go beyond prevalidate
    when trying to restore a snapshot which has a nfs-ganesha conf
    file, in a cluster when nfs-ganesha is not enabled
    
    The error message for the particular scenario is:
    "Snapshot(<snapname>) has a nfs-ganesha export conf
    file. cluster.enable-shared-storage and nfs-ganesha
    should be enabled before restoring this snapshot."
    
    Change-Id: I1b87e9907e0a5e162f26ef1ca89fe76e8da8610f
    BUG: 1404118
    Signed-off-by: Avra Sengupta <asengupt>
    Reviewed-on: http://review.gluster.org/16116
    Reviewed-by: Rajesh Joseph <rjoseph>
    Smoke: Gluster Build System <jenkins.org>
    NetBSD-regression: NetBSD Build System <jenkins.org>
    CentOS-regression: Gluster Build System <jenkins.org>
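
With this fix, restoring such a snapshot is rejected cleanly during prevalidate instead of leaving the snapshot half-restored. A sketch of the resulting workflow, assuming the names used in this report (the failure text is the message quoted in the commit above):

    # Restore without the prerequisites now fails in prevalidate:
    gluster snapshot restore snap1
    # snapshot restore: failed: Snapshot(snap1) has a nfs-ganesha export conf
    # file. cluster.enable-shared-storage and nfs-ganesha should be enabled
    # before restoring this snapshot.

    # Enable the prerequisites, then retry the restore:
    gluster volume set all cluster.enable-shared-storage enable
    gluster nfs-ganesha enable
    gluster snapshot restore snap1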

Comment 7 Shyamsundar 2017-03-06 17:39:10 UTC
This bug is getting closed because a release has been made available that should address the reported issue. If the problem is still not fixed with glusterfs-3.10.0, please open a new bug report.

glusterfs-3.10.0 has been announced on the Gluster mailing lists [1]; packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution.

[1] http://lists.gluster.org/pipermail/gluster-users/2017-February/030119.html
[2] https://www.gluster.org/pipermail/gluster-users/

