Bug 1054705

Summary: [SNAPSHOT]: snap creation fails when the same name is used for different volumes
Product: [Red Hat Storage] Red Hat Gluster Storage
Component: snapshot
Version: rhgs-3.0
Status: CLOSED ERRATA
Severity: urgent
Priority: urgent
Reporter: Rahul Hinduja <rhinduja>
Assignee: Vijaikumar Mallikarjuna <vmallika>
QA Contact: Rahul Hinduja <rhinduja>
CC: nlevinki, nsathyan, rhs-bugs, rjoseph, sdharane, senaik, smohan, spandit, ssamanta, storage-qa-internal
Target Release: RHGS 3.0.0
Hardware: Unspecified
OS: Unspecified
Whiteboard: SNAPSHOT
Fixed In Version: glusterfs-3.6.0-3.0.el6rhs
Doc Type: Bug Fix
Type: Bug
Clones: 1091934 (view as bug list)
Bug Blocks: 1091926, 1091934
Last Closed: 2014-09-22 19:31:50 UTC

Description Rahul Hinduja 2014-01-17 10:14:19 UTC
Description of problem:
======================

Snapshot names are required to be unique only within a volume; across volumes the same name should be usable, since every snapshot also carries a unique ID. Currently, when a snapshot of volume1 is taken with the name S1, taking a snapshot of volume2 with the same name S1 fails.

For example:
============

[root@snapshot-01 ~]# gluster snapshot create vol0 -n p1
snapshot create: p1: snap created successfully
[root@snapshot-01 ~]# 
[root@snapshot-01 ~]# gluster snapshot create vol1 -n p1
snapshot create: failed: Commit failed on localhost. Please check log file for details.
Snapshot command failed
[root@snapshot-01 ~]# gluster snapshot create vol1 -n p2
snapshot create: p2: snap created successfully
[root@snapshot-01 ~]# 
[root@snapshot-01 ~]# gluster snapshot create vol0 -n p2
snapshot create: failed: Commit failed on localhost. Please check log file for details.
Snapshot command failed
[root@snapshot-01 ~]# 
[root@snapshot-01 ~]# gluster snapshot create vol0 -n p3
snapshot create: p3: snap created successfully
[root@snapshot-01 ~]# 
[root@snapshot-01 ~]# gluster snapshot create vol1 -n p3
snapshot create: failed: Commit failed on localhost. Please check log file for details.
Snapshot command failed
[root@snapshot-01 ~]# 


As the output above shows, snapshots named p1 and p3 succeed on vol0 but fail on vol1; similarly, the snapshot named p2 succeeds on vol1 but fails on vol0.

Version-Release number of selected component (if applicable):
=============================================================
glusterfs-3.4.1.snap.jan15.2014git-1.el6.x86_64


How reproducible:
=================
1/1


Steps to Reproduce:
===================
1. Create a snap of volume1 with name s1
2. Create a snap of volume2 with the same name s1


Actual results:
===============
snap creation on volume2 with name s1 fails.

Expected results:
=================
snap creation should succeed, since snapshot names only need to be unique within a volume's boundaries


Additional info:
================

log-snippet


[2014-01-17 02:56:15.238459] I [glusterd-snapshot.c:2791:glusterd_take_snapshot] 0-management: device: /dev/mapper/VolGroup0-thin_vol1
[2014-01-17 02:56:17.129759] I [glusterd-pmap.c:227:pmap_registry_bind] 0-pmap: adding brick /run/gluster/snaps/p2/dev-mapper-VolGroup0-p2-brick/b1 on port 49155
[2014-01-17 02:56:17.133058] I [rpc-clnt.c:965:rpc_clnt_connection_init] 0-management: setting frame-timeout to 600
[2014-01-17 02:56:17.133191] I [socket.c:3534:socket_init] 0-management: SSL support is NOT enabled
[2014-01-17 02:56:17.133219] I [socket.c:3549:socket_init] 0-management: using system polling thread
[2014-01-17 02:56:48.280771] E [glusterd-snapshot.c:2787:glusterd_take_snapshot] 0-management: taking snapshot of the brick (10.70.43.74:/brick0/b0) of device /dev/mapper/VolGroup0-thin_vol0 failed
[2014-01-17 02:56:48.280879] E [glusterd-snapshot.c:3084:glusterd_do_snap] 0-management: Failed to take snapshot of 10.70.43.74:/brick0/b0
[2014-01-17 02:56:48.280957] W [glusterd-snapshot.c:4031:glusterd_snapshot_create_commit] 0-management: taking the snapshot of the volume vol0 failed
[2014-01-17 02:56:48.280988] E [glusterd-snapshot.c:4408:glusterd_snapshot] 0-management: Failed to create snapshot
[2014-01-17 02:56:48.281011] E [glusterd-mgmt.c:964:glusterd_mgmt_v3_commit] 0-management: Commit failed for operation Snapshot on local node
[2014-01-17 02:56:48.281033] E [glusterd-mgmt.c:1581:glusterd_mgmt_v3_initiate_snap_phases] 0-: Commit Op Failed
[2014-01-17 02:56:48.288113] E [glusterd-mgmt.c:1601:glusterd_mgmt_v3_initiate_snap_phases] 0-: Brick Ops Failed
[2014-01-17 02:56:54.627925] I [glusterd-snapshot.c:2791:glusterd_take_snapshot] 0-management: device: /dev/mapper/VolGroup0-thin_vol0
[2014-01-17 02:56:55.002629] I [glusterd-pmap.c:227:pmap_registry_bind] 0-pmap: adding brick /run/gluster/snaps/p3/dev-mapper-VolGroup0-p3-brick/b0 on port 49156
[2014-01-17 02:56:55.006228] I [rpc-clnt.c:965:rpc_clnt_connection_init] 0-management: setting frame-timeout to 600
[2014-01-17 02:56:55.006343] I [socket.c:3534:socket_init] 0-management: SSL support is NOT enabled
[2014-01-17 02:56:55.006370] I [socket.c:3549:socket_init] 0-management: using system polling thread
[2014-01-17 02:57:08.510502] E [glusterd-snapshot.c:2787:glusterd_take_snapshot] 0-management: taking snapshot of the brick (10.70.43.74:/brick1/b1) of device /dev/mapper/VolGroup0-thin_vol1 failed
[2014-01-17 02:57:08.510624] E [glusterd-snapshot.c:3084:glusterd_do_snap] 0-management: Failed to take snapshot of 10.70.43.74:/brick1/b1
[2014-01-17 02:57:08.510688] W [glusterd-snapshot.c:4031:glusterd_snapshot_create_commit] 0-management: taking the snapshot of the volume vol1 failed
[2014-01-17 02:57:08.510718] E [glusterd-snapshot.c:4408:glusterd_snapshot] 0-management: Failed to create snapshot
[2014-01-17 02:57:08.510741] E [glusterd-mgmt.c:964:glusterd_mgmt_v3_commit] 0-management: Commit failed for operation Snapshot on local node
[2014-01-17 02:57:08.510766] E [glusterd-mgmt.c:1581:glusterd_mgmt_v3_initiate_snap_phases] 0-: Commit Op Failed
[2014-01-17 02:57:08.517541] E [glusterd-mgmt.c:1601:glusterd_mgmt_v3_initiate_snap_phases] 0-: Brick Ops Failed

Comment 1 Sachin Pandit 2014-01-17 10:35:23 UTC
Snap creation with the same snap name fails when bricks of those volumes reside in the same LVM Volume Group, because LVM requires logical volume names to be unique within a Volume Group.
This issue is being fixed by managing snapshots via snap UUIDs. The fix will most probably be included in the next build.
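For illustration, a minimal sketch of the underlying LVM constraint; the VG and thin-LV names follow the log snippet above, and the exact error text is an approximation:

# First snapshot LV named "p1" in VolGroup0 is created fine:
lvcreate -s -n p1 /dev/VolGroup0/thin_vol0
# A second snapshot LV with the same name in the same VG is rejected by
# LVM, with an error along the lines of:
#   Logical Volume "p1" already exists in volume group "VolGroup0"
lvcreate -s -n p1 /dev/VolGroup0/thin_vol1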

Comment 3 Vijaikumar Mallikarjuna 2014-02-07 09:21:35 UTC
A new mechanism, in which snapshots are managed internally using UUIDs (i.e., glusterd uses UUIDs instead of the user-given name for snap names and snap volumes), is being submitted for review:
http://review.gluster.org/6709
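A rough sketch of that scheme, assuming the UUID is simply generated per snapshot (the actual glusterd internals are in the patch above, not in this report):

# glusterd generates a UUID for the snapshot and derives the LV name from
# it, so the user-visible snap name never reaches LVM and cannot collide:
SNAP_UUID=$(uuidgen | tr -d '-')
lvcreate -s -n "${SNAP_UUID}" /dev/VolGroup0/thin_vol0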

Comment 4 Rahul Hinduja 2014-02-18 10:58:42 UTC
The issue is not fully fixed even with the UUID scheme once a volume has multiple bricks on the same node: snap creation will still fail, because the LVs belonging to the same VG would get the same UUID-derived name.

Comment 6 Vijaikumar Mallikarjuna 2014-02-18 11:11:45 UTC
I think for now we can document this as a known issue.

We use the volume UUID for naming all brick LVs. With this scheme, a gluster volume cannot have more than one brick carved from the same LVM volume group, because Linux LVM does not allow duplicate LV names within a volume group.
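One way to lift this restriction, consistent with the snapshot LV names visible later in the verification output of comment 14 (a uuid plus a per-brick index), would be to suffix each brick's snapshot LV name with its index; a sketch, with device names taken from that df output:

# Appending a per-brick index to the snap UUID keeps LV names unique even
# when two bricks of one gluster volume share a volume group:
SNAP_UUID=$(uuidgen | tr -d '-')
lvcreate -s -n "${SNAP_UUID}_0" /dev/VolGroup0/thin_vol1
lvcreate -s -n "${SNAP_UUID}_1" /dev/VolGroup0/thin_vol2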

Comment 8 Vijaikumar Mallikarjuna 2014-04-07 09:15:14 UTC
This issue is fixed as part of patch: http://review.gluster.org/#/c/7400/

Comment 9 Vijaikumar Mallikarjuna 2014-04-14 05:37:50 UTC
Patch posted to upstream master branch: http://review.gluster.org/#/c/7461/

Comment 10 Nagaprasad Sathyanarayana 2014-04-21 06:17:45 UTC
Marking snapshot BZs to RHS 3.0.

Comment 11 Vijaikumar Mallikarjuna 2014-04-28 12:34:38 UTC
Patch #7461 has multiple fixes.
Posted a separate patch to address this issue:
http://review.gluster.org/#/c/7581

Comment 12 Nagaprasad Sathyanarayana 2014-05-19 10:56:31 UTC
Setting flags required to add BZs to RHS 3.0 Errata

Comment 14 senaik 2014-05-21 06:56:41 UTC
Version : glusterfs-3.6.0.4-1.el6rhs.x86_64
=======
Created a volume with bricks from the same Volume Group; snap creation on the volume is successful.

 gluster v info vol0
 
Volume Name: vol0
Type: Distributed-Replicate
Volume ID: ba377b38-b3b9-426b-b6e9-6aa39fe2a9b2
Status: Started
Snap Volume: no
Number of Bricks: 2 x 2 = 4
Transport-type: tcp
Bricks:
Brick1: 10.70.44.54:/brick1/b1
Brick2: 10.70.44.55:/brick1/b1
Brick3: 10.70.44.54:/brick2/b2
Brick4: 10.70.44.55:/brick2/b2
Options Reconfigured:
features.barrier: disable
nfs.drc: off



 df
Filesystem           1K-blocks    Used Available Use% Mounted on
/dev/mapper/vg_snapshot01-lv_root
                      45937052 2196784  41406780   6% /
tmpfs                  4095740       0   4095740   0% /dev/shm
/dev/vda1               495844   34609    435635   8% /boot
/dev/mapper/VolGroup0-thin_vol0
                     104806400   33328 104773072   1% /brick0
/dev/mapper/VolGroup0-thin_vol1
                     104806400   33584 104772816   1% /brick1
/dev/mapper/VolGroup0-thin_vol2
                     104806400   33616 104772784   1% /brick2
/dev/mapper/VolGroup0-thin_vol3
                     104806400   33328 104773072   1% /brick3
/dev/mapper/VolGroup1-thin_vol4
                     104806400   33328 104773072   1% /brick4
/dev/mapper/VolGroup1-thin_vol5
                     104806400   33328 104773072   1% /brick5
/dev/mapper/VolGroup1-thin_vol6
                     104806400   33328 104773072   1% /brick6
/dev/mapper/VolGroup1-thin_vol7
                     104806400   33328 104773072   1% /brick7
/dev/mapper/VolGroup0-eb845753921b4b4497d2da4ffa7c7978_0
                     104806400   33584 104772816   1% /var/run/gluster/snaps/eb845753921b4b4497d2da4ffa7c7978/brick1
/dev/mapper/VolGroup0-eb845753921b4b4497d2da4ffa7c7978_1
                     104806400   33616 104772784   1% /var/run/gluster/snaps/eb845753921b4b4497d2da4ffa7c7978/brick3


Marking the bug as 'Verified'.

Comment 16 errata-xmlrpc 2014-09-22 19:31:50 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHEA-2014-1278.html