Description of problem:
When a clone command fails for any reason (in this case it failed because the shared storage was down), creating a new clone with the same name as the previously failed clone also fails.

Version-Release number of selected component (if applicable):
glusterfs-3.8.4-9.el7rhgs.x86_64

How reproducible:
100%

Steps to Reproduce:
1. Create a 2x2 distributed-replicate volume.
2. Create a snapshot and activate it.
3. Create a clone from that snapshot, and have it fail for some reason (e.g. shared storage down).
4. Try creating a clone with the same name as the previous, failed clone.

Actual results:
Clone creation fails.

Expected results:
The clone command should not fail.

Additional info:
RCA: Snapshot create and clone share a common infrastructure: both create backend snapshots named after the new (snap/clone) volume. In the clone case, the volume name is the clone name, so the backend snapshots are created with the clone name. If a clone command fails at a stage where the backend clone has already been created, a subsequent clone command with the same name will fail, because a backend snapshot with that name already exists; it was not cleaned up as part of the first (failed) clone creation.

We do not see this issue with snapshot create, because there the internal snap volume's name is always a fresh UUID, irrespective of the snapshot's name. Even if snapshot create fails after creating the backend snapshots, a subsequent command with the same snap name will not collide, since the internal snap volume's name differs from the first attempt (it is always a new UUID).

This issue is not a regression but a day-one issue. The reason it was dormant until now is that there was no plausible way for a clone to fail after this stage. After ganesha introduced a dependency on shared storage, we can now hit this point of failure, and hence uncover the issue.

The fix has to be decided by weighing a couple of design constraints: whether we want to change how clone backends are maintained, or whether we just want to perform cleanup on failure. Either way, we propose deferring this bug from 3.2.0, as it is not a crucial issue and the user always has the workaround of using a different name for the clone.
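The naming difference described above can be illustrated with a small sketch. This is plain Python, not glusterd code; the function names and the in-memory set standing in for the backend LVM snapshots are hypothetical, and the post-creation failure is simulated:

```python
import uuid

# Hypothetical stand-in for the backend snapshots that already exist on disk.
existing_backend_snaps = set()

def create_backend_snapshot(volname):
    # The shared infra names the backend snapshot after the new volume's name.
    if volname in existing_backend_snaps:
        raise RuntimeError("backend snapshot %s already exists" % volname)
    existing_backend_snaps.add(volname)

def clone(clone_name):
    # Clone path: the new volume's name IS the clone name.
    create_backend_snapshot(clone_name)
    # Simulate a failure after the backend is created, with no cleanup
    # (e.g. shared storage going down at this stage).
    raise RuntimeError("simulated failure after backend creation")

def snapshot_create(snap_name):
    # Snapshot path: the internal snap volume's name is always a fresh UUID,
    # regardless of snap_name, so retries never collide.
    create_backend_snapshot(uuid.uuid4().hex)
    raise RuntimeError("simulated failure after backend creation")

def try_twice(fn, name):
    """Run the command twice with the same name; return both error messages."""
    errors = []
    for _ in range(2):
        try:
            fn(name)
        except RuntimeError as e:
            errors.append(str(e))
    return errors

# Second clone attempt hits the name collision left behind by the first:
print(try_twice(clone, "c1")[1])
# Second snapshot attempt still reaches the simulated late failure, because
# its backend name is a new UUID each time:
print(try_twice(snapshot_create, "s1")[1])
```

The sketch shows why only the clone path is affected: the leftover backend named after the clone blocks the retry, while the snapshot path's UUID naming makes retries collision-free even without cleanup.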
On further analysis, it was determined that the initial clone creation did not fail on account of NFS-Ganesha; it failed because an older clone of the same name was previously present and deleted, which triggered the already filed bug: https://bugzilla.redhat.com/show_bug.cgi?id=1309209. The RCA above is the same as the one mentioned in that bug. Marking this as a duplicate.

*** This bug has been marked as a duplicate of bug 1309209 ***