Bug 1408386

Summary: Snapshot: Subsequent clone creation command with the same name fails if the first clone command with that name failed
Product: [Red Hat Storage] Red Hat Gluster Storage Reporter: Anil Shah <ashah>
Component: snapshot    Assignee: Avra Sengupta <asengupt>
Status: CLOSED DUPLICATE QA Contact: Anil Shah <ashah>
Severity: high Docs Contact:
Priority: unspecified    
Version: rhgs-3.2    CC: rhs-bugs, storage-qa-internal
Target Milestone: ---   
Target Release: ---   
Hardware: x86_64   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2016-12-23 09:41:38 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Anil Shah 2016-12-23 08:41:20 UTC
Description of problem:

When creating a clone, if the clone command fails for any reason (in this case it failed because shared storage was down), a subsequent attempt to create a clone with the same name as the failed one also fails.

Version-Release number of selected component (if applicable):

glusterfs-3.8.4-9.el7rhgs.x86_64


How reproducible:

100%

Steps to Reproduce:
1. Create a 2x2 distributed-replicate volume
2. Create a snapshot and activate it
3. Create a clone from that snapshot, such that the clone command fails for some reason
4. Try creating a clone with the same name as the previous (failed) clone (see the CLI sketch below)
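
For reference, a possible CLI reproduction (host names and brick paths are placeholders, bricks are assumed to be on thinly provisioned LVM as snapshots require, and the way step 3 is made to fail, e.g. shared storage being down, is an assumption based on the description):

# gluster volume create testvol replica 2 host1:/bricks/b1 host2:/bricks/b2 host3:/bricks/b3 host4:/bricks/b4
# gluster volume start testvol
# gluster snapshot create snap1 testvol no-timestamp
# gluster snapshot activate snap1
# gluster snapshot clone clone1 snap1     <-- fails part way (e.g. shared storage down)
# gluster snapshot clone clone1 snap1     <-- retried with the same name, also fails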


Actual results:

The second clone creation command with the same name fails.


Expected results:

The second clone command should not fail.

Additional info:

Comment 2 Avra Sengupta 2016-12-23 08:56:47 UTC
RCA: Snapshot create and snapshot clone use a common infra, in which the backend snapshots are created using the new (snap/clone) volume's name. In the case of clone, the volume name is the clone name, and hence the specific backend snapshots are created with the clone name. So when a clone command fails at a stage where the backend clone has already been created, a subsequent clone command with the same name fails, because a backend snapshot with that name already exists; it was not cleaned up as part of the (first) clone creation attempt.

We do not see this issue with snapshot create, because there the snap volume's name is always a new UUID, irrespective of the snapshot's name. So even if it fails after creating the backend snapshots, a subsequent command with the same snap name will not fail, as the internal snap volume's name will differ from that of the first attempt (it is always a new UUID).
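
A minimal sketch of that naming difference (illustrative shell pseudocode, not actual glusterd code; the variable names are made up):

# snapshot create: the internal snap volume gets a fresh UUID on every attempt,
# so retrying a failed "snapshot create" with the same snap name does not
# collide with any leftover backend snapshot
snap_vol_name=$(uuidgen)

# snapshot clone: the new volume's name is the clone name itself, so a retry
# with the same clone name creates/looks up backend snapshots under the very
# same name and collides with the leftover from the failed first attempt
clone_vol_name="$requested_clone_name"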

This issue is not a regression but a day-one issue. The reason it lay dormant until now is that there was previously no plausible way for a clone to fail after this stage. After ganesha introduced a dependency on shared storage, we can now hit this point of failure and hence uncover the issue.

The fix has to be decided after weighing a couple of design constraints, namely whether we want to change how clone backends are maintained, or whether we just want to perform cleanup on failure.

Either way, we propose that this bug be deferred from 3.2.0, as it is not a crucial issue and the user always has the workaround of using a different name for the clone.

Comment 3 Avra Sengupta 2016-12-23 09:41:38 UTC
On further analysis, it was determined that the initial clone creation did not fail on account of nfs-ganesha, but because an older clone with the same name had previously been present and then deleted, which triggered the already-filed bug: https://bugzilla.redhat.com/show_bug.cgi?id=1309209

The RCA above is the same as the one mentioned in that bug. Marking this as a duplicate.

*** This bug has been marked as a duplicate of bug 1309209 ***