Bug 1341474 - [geo-rep]: Snapshot creation having geo-rep session is broken
Summary: [geo-rep]: Snapshot creation having geo-rep session is broken
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: GlusterFS
Classification: Community
Component: geo-replication
Version: mainline
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: urgent
Target Milestone: ---
Assignee: Kotresh HR
QA Contact:
URL:
Whiteboard:
Depends On: 1341316
Blocks: 1341478 1341944
Reported: 2016-06-01 07:20 UTC by Kotresh HR
Modified: 2017-03-08 09:24 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of: 1341316
Clones: 1341478
Environment:
Last Closed: 2017-03-08 09:24:36 UTC
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Embargoed:



Comment 1 Kotresh HR 2016-06-01 07:22:10 UTC
Description of problem:
=======================
Snapshot creation on a geo-rep setup is broken. Creating a snapshot of a volume that has a geo-rep session requires the session to be paused first. Currently, however, snapshot creation fails even when the session is paused, because glusterd is unable to find the geo-rep session files.

[root@dhcp37-162 po_10.70.37.116_shifu]# gluster volume geo-replication po 10.70.37.116::shifu status
 
MASTER NODE     MASTER VOL    MASTER BRICK      SLAVE USER    SLAVE                  SLAVE NODE    STATUS    CRAWL STATUS    LAST_SYNCED          
---------------------------------------------------------------------------------------------------------------------------------------
10.70.37.162    po            /rhs/brick1/b1    root          10.70.37.116::shifu    N/A           Paused    N/A             N/A                  
10.70.37.162    po            /rhs/brick2/b3    root          10.70.37.116::shifu    N/A           Paused    N/A             N/A                  
10.70.37.40     po            /rhs/brick1/b2    root          10.70.37.116::shifu    N/A           Paused    N/A             N/A                  
10.70.37.40     po            /rhs/brick2/b4    root          10.70.37.116::shifu    N/A           Paused    N/A             N/A                  
[root@dhcp37-162 po_10.70.37.116_shifu]# gluster snapshot create snapping po
snapshot create: failed: Commit failed on localhost. Please check log file for details.
Snapshot command failed
[root@dhcp37-162 po_10.70.37.116_shifu]# 

Log Snippet:
============

[2016-05-31 19:27:37.217512] E [MSGID: 106122] [glusterd-mgmt.c:879:glusterd_mgmt_v3_pre_validate] 0-management: Pre Validation failed for operation Snapshot on local node
[2016-05-31 19:27:37.217532] E [MSGID: 106122] [glusterd-mgmt.c:2224:glusterd_mgmt_v3_initiate_snap_phases] 0-management: Pre Validation Failed
[2016-05-31 19:27:55.330021] I [MSGID: 106327] [glusterd-geo-rep.c:4223:glusterd_read_status_file] 0-management: Using passed config template(/var/lib/glusterd/geo-replication/po_10.70.37.116_shifu/gsyncd.conf).
[2016-05-31 19:28:08.114302] E [MSGID: 106029] [glusterd-snapshot.c:539:glusterd_copy_geo_rep_session_files] 0-management: Session files not present in /var/lib/glusterd/geo-replication/po_10.70.37.116_shifu:92ef80bb-09f1-4d87-81ec-5f687522d866 [No such file or directory]
[2016-05-31 19:28:08.114401] E [MSGID: 106029] [glusterd-snapshot.c:747:glusterd_copy_geo_rep_files] 0-management: Failed to copy files related to session po_10.70.37.116_shifu:92ef80bb-09f1-4d87-81ec-5f687522d866
[2016-05-31 19:28:08.114435] E [MSGID: 106031] [glusterd-snapshot.c:5261:glusterd_do_snap_vol] 0-management: Failed to copy geo-rep config and status files for volume po
[2016-05-31 19:28:08.114491] E [MSGID: 106033] [glusterd-store.c:1715:glusterd_store_delete_volume] 0-management: Failed to rename volume directory for volume ddf9e75881d8425bac3a33d79f89361e [No such file or directory]
[2016-05-31 19:28:08.114522] W [MSGID: 106071] [glusterd-snapshot.c:3081:glusterd_snap_volume_remove] 0-management: Failed to remove volume ddf9e75881d8425bac3a33d79f89361e from store
[2016-05-31 19:28:08.114561] W [MSGID: 106030] [glusterd-snapshot.c:6797:glusterd_snapshot_create_commit] 0-management: taking the snapshot of the volume po failed
[2016-05-31 19:28:08.114824] E [MSGID: 106030] [glusterd-snapshot.c:8154:glusterd_snapshot] 0-management: Failed to create snapshot
[2016-05-31 19:28:08.114857] W [MSGID: 106123] [glusterd-mgmt.c:272:gd_mgmt_v3_commit_fn] 0-management: Snapshot Commit Failed
[2016-05-31 19:28:08.114878] E [MSGID: 106123] [glusterd-mgmt.c:1414:glusterd_mgmt_v3_commit] 0-management: Commit failed for operation Snapshot on local node
[2016-05-31 19:28:08.114897] E [MSGID: 106123] [glusterd-mgmt.c:2285:glusterd_mgmt_v3_initiate_snap_phases] 0-management: Commit Op Failed
[2016-05-31 19:28:08.251945] I [MSGID: 106057] [glusterd-snapshot.c:6240:glusterd_do_snap_cleanup] 0-management: Snapshot (snapping_GMT-2016.05.31-19.28.03) does not exist [Invalid argument]
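
To pick the relevant failures out of a log like the one above, a small parser can help. The following is only a sketch: it assumes every line has the same shape as the lines quoted here (timestamp, level, MSGID, file:line:function, domain, message), and the names `LOG_LINE`, `parse_line`, and `errors` are hypothetical helpers, not part of GlusterFS.

```python
import re

# Assumed line shape, taken from the snippet above:
# [timestamp] LEVEL [MSGID: n] [file:line:function] domain: message
LOG_LINE = re.compile(
    r"^\[(?P<ts>[^\]]+)\] (?P<level>[A-Z]) "
    r"\[MSGID: (?P<msgid>\d+)\] "
    r"\[(?P<file>[^:\]]+):(?P<line>\d+):(?P<func>[^\]]+)\] "
    r"(?P<domain>\S+): (?P<msg>.*)$"
)


def parse_line(line):
    """Return the fields of one log line as a dict, or None if it does not match."""
    match = LOG_LINE.match(line.strip())
    return match.groupdict() if match else None


def errors(lines):
    """Keep only the records logged at level E; unparseable lines are skipped."""
    records = (parse_line(line) for line in lines)
    return [rec for rec in records if rec and rec["level"] == "E"]


if __name__ == "__main__":
    import sys

    # Usage: python parse_glusterd_log.py < glusterd.log
    for rec in errors(sys.stdin):
        print(rec["ts"], rec["func"], rec["msg"])
```

Run against this snippet, the first commit-phase error it reports comes from `glusterd_copy_geo_rep_session_files`, whose message contains the session directory path that glusterd looked for.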



Version-Release number of selected component (if applicable):
=============================================================



How reproducible:
=================
Always

Steps to Reproduce:
===================
1. Create Master and Slave volume
2. Create geo-rep session between them
3. Pause the geo-rep session
4. Create snapshot of master
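
Steps 2 to 4 can be sketched as a list of gluster CLI invocations. This is a sketch under assumptions: it presumes the master and slave volumes from step 1 already exist, uses the volume, slave, and snapshot names from the transcript above, and adds a `start` step that is not listed in the report, on the assumption that a session must be running before it can be paused. The helper `repro_commands` is hypothetical, and the script only prints the commands rather than running them.

```python
import shlex


def repro_commands(master_vol, slave_host, slave_vol, snap_name):
    """Return the gluster CLI invocations for steps 2-4 as argv lists."""
    slave = f"{slave_host}::{slave_vol}"
    georep = ["gluster", "volume", "geo-replication", master_vol, slave]
    return [
        georep + ["create", "push-pem"],  # step 2: create the geo-rep session
        georep + ["start"],               # assumed step: start it so it can be paused
        georep + ["pause"],               # step 3: pause the session
        ["gluster", "snapshot", "create", snap_name, master_vol],  # step 4
    ]


if __name__ == "__main__":
    # Names taken from the transcript in this report.
    for argv in repro_commands("po", "10.70.37.116", "shifu", "snapping"):
        print(shlex.join(argv))
```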

Actual results:
===============
Snapshot creation fails

Expected results:
=================
Snapshot creation should succeed.

Comment 2 Kotresh HR 2016-06-01 07:34:57 UTC
Patch: http://review.gluster.org/#/c/14595/2

Comment 3 Vijay Bellur 2016-06-01 09:58:48 UTC
REVIEW: http://review.gluster.org/14595 (glusterd/snapshot: Fix snapshot creation with geo-rep) posted (#3) for review on master by Kotresh HR (khiremat)

Comment 4 Vijay Bellur 2016-06-02 05:50:02 UTC
COMMIT: http://review.gluster.org/14595 committed in master by Rajesh Joseph (rjoseph) 
------
commit 3ae22b61f9aa01f0a97f8f1b3ef75add74c02f7d
Author: Kotresh HR <khiremat>
Date:   Wed Jun 1 12:42:30 2016 +0530

    glusterd/snapshot: Fix snapshot creation with geo-rep
    
    The construction of path to geo-rep session directory
    is broken with the commit "http://review.gluster.org/13111"
    as it saves the slave volume uuid in 'gsync_slaves'
    dictionary. This patch fixes the same.
    
    Change-Id: Ic7fc3c37d368549feb44b3a08d60157ce61227c3
    Signed-off-by: Kotresh HR <khiremat>
    BUG: 1341474
    Reviewed-on: http://review.gluster.org/14595
    Smoke: Gluster Build System <jenkins.com>
    NetBSD-regression: NetBSD Build System <jenkins.org>
    CentOS-regression: Gluster Build System <jenkins.com>
    Reviewed-by: Rajesh Joseph <rjoseph>
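
The effect described in the commit message can be illustrated with a short sketch. It is not the actual patch, which is C code in glusterd. It only assumes what the log in Comment 1 shows: the session name used to build the directory path carried a trailing `:<slave volume uuid>`, while the directory on disk is named without it. The function `session_dir` and the constants are hypothetical.

```python
import posixpath
import re

# Base directory as it appears in the log lines of Comment 1.
GEOREP_DIR = "/var/lib/glusterd/geo-replication"

# Assumption: the extra field is a canonical lowercase UUID preceded by ':'.
UUID_SUFFIX = re.compile(
    r":[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$"
)


def session_dir(session):
    """Build the session directory path, ignoring a trailing ':<uuid>' field."""
    name = UUID_SUFFIX.sub("", session)
    return posixpath.join(GEOREP_DIR, name)


if __name__ == "__main__":
    broken = "po_10.70.37.116_shifu:92ef80bb-09f1-4d87-81ec-5f687522d866"
    print(posixpath.join(GEOREP_DIR, broken))  # path reported as missing in the log
    print(session_dir(broken))                 # path with the uuid field stripped
```

A session name without the uuid field passes through unchanged, so the same helper would cover sessions recorded before and after the uuid was added.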

Comment 5 Kotresh HR 2017-03-08 09:24:36 UTC
v3.10.0 contains the fix.

