Bug 788640

Summary: 3.1 - VDSM: Failed remove snapshot VolumeDoesNotExist:
Product: Red Hat Enterprise Linux 6 Reporter: Avi Tal <atal>
Component: vdsm Assignee: Eduardo Warszawski <ewarszaw>
Status: CLOSED DUPLICATE QA Contact: Yaniv Kaul <ykaul>
Severity: high Docs Contact:
Priority: unspecified    
Version: 6.2 CC: abaron, acathrow, bazulay, danken, iheim, ilvovsky, jkt, srevivo, ykaul
Target Milestone: rc Keywords: TestBlocker
Target Release: ---   
Hardware: x86_64   
OS: Linux   
Whiteboard: storage
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
: 795108 (view as bug list) Environment:
Last Closed: 2012-07-01 13:14:54 UTC Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 795108    
Attachments:
Description    Flags
vdsm log       none
engine log     none

Description Avi Tal 2012-02-08 17:07:35 UTC
Created attachment 560317 [details]
vdsm log

Description of problem:
Removing snapshot1 through the REST API fails on a VM that contains 8 disks and 2 snapshots.
The VDSM log shows VolumeDoesNotExist for some of the disks/images.

Steps to Reproduce (via REST API): 
1. create 8 disks (different permutations)
2. create snapshot1
3. create snapshot2
4. restore snapshot2
5. remove snapshot1

Comment 1 Avi Tal 2012-02-08 17:08:14 UTC
=============================================================================
before remove:

    |       |-- images
    |       |   |-- 323cc6d2-9ca7-4c2a-9f15-ce0dceffecfd
    |       |   |   |-- 3551ac7f-b8ae-475d-985f-42fdb8ece9ed
    |       |   |   |-- 3551ac7f-b8ae-475d-985f-42fdb8ece9ed.meta
    |       |   |   |-- a1bcc370-8aff-4505-b0b4-dda07a2a3a87
    |       |   |   `-- a1bcc370-8aff-4505-b0b4-dda07a2a3a87.meta
    |       |   |-- 5654d919-57cc-4191-8c95-cbbdaae86da0
    |       |   |   |-- 73130d19-315f-4b9a-a587-f2ca6bef832d
    |       |   |   |-- 73130d19-315f-4b9a-a587-f2ca6bef832d.meta
    |       |   |   |-- cc6d3169-f6c1-4883-a230-e907d5f529f3
    |       |   |   `-- cc6d3169-f6c1-4883-a230-e907d5f529f3.meta
    |       |   |-- 7256c667-67bd-4f7b-831c-09a34cdb5341
    |       |   |   |-- 63d25749-7028-4310-825c-21ad8550f5a3
    |       |   |   `-- 63d25749-7028-4310-825c-21ad8550f5a3.meta
    |       |   |-- 87b22375-6194-421f-8f29-7e6235b698f6
    |       |   |   |-- 40f0e01b-245f-4392-87df-323196d4ad87
    |       |   |   |-- 40f0e01b-245f-4392-87df-323196d4ad87.meta
    |       |   |   |-- 4b92660f-c57e-4095-8167-bdf7f4187dcb
    |       |   |   `-- 4b92660f-c57e-4095-8167-bdf7f4187dcb.meta
    |       |   |-- 8f539492-68df-4fb3-b670-1d5980d3e2d0
    |       |   |   |-- 0bc19895-2e10-4398-b940-25b2b76bc37f
    |       |   |   |-- 0bc19895-2e10-4398-b940-25b2b76bc37f.meta
    |       |   |   |-- 67f1e1a4-eaec-4f3e-9b16-943cd769e521
    |       |   |   `-- 67f1e1a4-eaec-4f3e-9b16-943cd769e521.meta
    |       |   |-- a2d85350-6bdd-4bd3-84fb-babfa699e9fa
    |       |   |   |-- c364e127-ea2f-4b22-86dd-25279142b037
    |       |   |   |-- c364e127-ea2f-4b22-86dd-25279142b037.meta
    |       |   |   |-- e9cc6664-fa20-4a84-9e10-6fbee025d691
    |       |   |   `-- e9cc6664-fa20-4a84-9e10-6fbee025d691.meta
    |       |   |-- ced369d3-ffc7-40f9-aa07-db6fb330b785
    |       |   |   |-- 14903ddf-2654-4a7c-a632-756b74d50f62
    |       |   |   |-- 14903ddf-2654-4a7c-a632-756b74d50f62.meta
    |       |   |   |-- 8e3778c5-a8f7-4e6b-838f-2ec6bfbb319f
    |       |   |   `-- 8e3778c5-a8f7-4e6b-838f-2ec6bfbb319f.meta
    |       |   |-- e0a55d78-ba97-4412-89e9-65ab4d90657e
    |       |   |   |-- 6531941f-453d-40d3-9ed1-b83416b4a24b
    |       |   |   `-- 6531941f-453d-40d3-9ed1-b83416b4a24b.meta
    |       |   `-- fa390530-536b-4435-906a-4b7fde81be2e
    |       |       |-- b7820e65-ac41-4d89-ac18-7c17edefbd37
    |       |       |-- b7820e65-ac41-4d89-ac18-7c17edefbd37.meta
    |       |       |-- f400ef8c-9f2e-4f6c-aa68-6933c9965e1b
    |       |       `-- f400ef8c-9f2e-4f6c-aa68-6933c9965e1b.meta



=======================================================================================
after remove:
    |       |-- images
    |       |   |-- 323cc6d2-9ca7-4c2a-9f15-ce0dceffecfd
    |       |   |   |-- 3551ac7f-b8ae-475d-985f-42fdb8ece9ed
    |       |   |   `-- 3551ac7f-b8ae-475d-985f-42fdb8ece9ed.meta
    |       |   |-- 5654d919-57cc-4191-8c95-cbbdaae86da0
    |       |   |   |-- 73130d19-315f-4b9a-a587-f2ca6bef832d
    |       |   |   |-- 73130d19-315f-4b9a-a587-f2ca6bef832d.meta
    |       |   |   |-- cc6d3169-f6c1-4883-a230-e907d5f529f3
    |       |   |   `-- cc6d3169-f6c1-4883-a230-e907d5f529f3.meta
    |       |   |-- 7256c667-67bd-4f7b-831c-09a34cdb5341
    |       |   |   |-- 63d25749-7028-4310-825c-21ad8550f5a3
    |       |   |   `-- 63d25749-7028-4310-825c-21ad8550f5a3.meta
    |       |   |-- 87b22375-6194-421f-8f29-7e6235b698f6
    |       |   |   |-- 40f0e01b-245f-4392-87df-323196d4ad87
    |       |   |   `-- 40f0e01b-245f-4392-87df-323196d4ad87.meta
    |       |   |-- 8f539492-68df-4fb3-b670-1d5980d3e2d0
    |       |   |   |-- 0bc19895-2e10-4398-b940-25b2b76bc37f
    |       |   |   `-- 0bc19895-2e10-4398-b940-25b2b76bc37f.meta
    |       |   |-- a2d85350-6bdd-4bd3-84fb-babfa699e9fa
    |       |   |   |-- e9cc6664-fa20-4a84-9e10-6fbee025d691
    |       |   |   `-- e9cc6664-fa20-4a84-9e10-6fbee025d691.meta
    |       |   |-- ced369d3-ffc7-40f9-aa07-db6fb330b785
    |       |   |   |-- 14903ddf-2654-4a7c-a632-756b74d50f62
    |       |   |   |-- 14903ddf-2654-4a7c-a632-756b74d50f62.meta
    |       |   |   |-- 8e3778c5-a8f7-4e6b-838f-2ec6bfbb319f
    |       |   |   `-- 8e3778c5-a8f7-4e6b-838f-2ec6bfbb319f.meta
    |       |   |-- e0a55d78-ba97-4412-89e9-65ab4d90657e
    |       |   |   |-- 6531941f-453d-40d3-9ed1-b83416b4a24b
    |       |   |   `-- 6531941f-453d-40d3-9ed1-b83416b4a24b.meta
    |       |   `-- fa390530-536b-4435-906a-4b7fde81be2e
    |       |       |-- b7820e65-ac41-4d89-ac18-7c17edefbd37
    |       |       |-- b7820e65-ac41-4d89-ac18-7c17edefbd37.meta
    |       |       |-- f400ef8c-9f2e-4f6c-aa68-6933c9965e1b
    |       |       `-- f400ef8c-9f2e-4f6c-aa68-6933c9965e1b.meta
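To make the difference between the two listings easier to see, here is a minimal Python sketch that parses `tree`-style output and reports which volumes disappeared from each image directory. The helper names (`parse_tree`, `removed_volumes`) are made up for illustration and are not part of VDSM. The sample data is two image directories copied from the listings above.

```python
import re

# Matches a tree entry whose name is a 36-character UUID, optionally with ".meta".
ENTRY = re.compile(r"(?:\|--|`--) ([0-9a-f-]{36})(\.meta)?\s*$")

BEFORE = """
    |       |   |-- a2d85350-6bdd-4bd3-84fb-babfa699e9fa
    |       |   |   |-- c364e127-ea2f-4b22-86dd-25279142b037
    |       |   |   |-- c364e127-ea2f-4b22-86dd-25279142b037.meta
    |       |   |   |-- e9cc6664-fa20-4a84-9e10-6fbee025d691
    |       |   |   `-- e9cc6664-fa20-4a84-9e10-6fbee025d691.meta
    |       |   |-- ced369d3-ffc7-40f9-aa07-db6fb330b785
    |       |   |   |-- 14903ddf-2654-4a7c-a632-756b74d50f62
    |       |   |   `-- 8e3778c5-a8f7-4e6b-838f-2ec6bfbb319f
"""

AFTER = """
    |       |   |-- a2d85350-6bdd-4bd3-84fb-babfa699e9fa
    |       |   |   |-- e9cc6664-fa20-4a84-9e10-6fbee025d691
    |       |   |   `-- e9cc6664-fa20-4a84-9e10-6fbee025d691.meta
    |       |   |-- ced369d3-ffc7-40f9-aa07-db6fb330b785
    |       |   |   |-- 14903ddf-2654-4a7c-a632-756b74d50f62
    |       |   |   `-- 8e3778c5-a8f7-4e6b-838f-2ec6bfbb319f
"""


def parse_tree(listing):
    """Map image UUID -> set of volume UUIDs found in a tree-style listing."""
    entries = []
    for line in listing.splitlines():
        match = ENTRY.search(line)
        if match and not match.group(2):  # ignore the .meta companions
            entries.append((match.start(), match.group(1)))
    if not entries:
        return {}
    # The shallowest UUID entries are image directories; deeper ones are volumes.
    image_col = min(col for col, _ in entries)
    images, current = {}, None
    for col, name in entries:
        if col == image_col:
            current = name
            images[current] = set()
        else:
            images[current].add(name)
    return images


def removed_volumes(before, after):
    """Return {image: volumes present in `before` but missing in `after`}."""
    old, new = parse_tree(before), parse_tree(after)
    diff = {}
    for image, volumes in old.items():
        gone = volumes - new.get(image, set())
        if gone:
            diff[image] = gone
    return diff


if __name__ == "__main__":
    for image, gone in removed_volumes(BEFORE, AFTER).items():
        print(image, "lost", sorted(gone))
```

Reading the two full listings by hand gives the same picture: images 323cc6d2, 87b22375, 8f539492 and a2d85350 each lost one volume, while 5654d919, ced369d3 and fa390530 still hold two volumes after the remove.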

Comment 2 Avi Tal 2012-02-08 17:09:33 UTC
c8195f27-4bf6-44b5-9cda-4083bb1dc6d2::ERROR::2012-02-08 18:41:24,715::image::1107::Storage.Image::(merge) Unexpected error
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/image.py", line 1103, in merge
    chain = self.getSubChain(sdDom, imgUUID, ancestor, successor)
  File "/usr/share/vdsm/storage/image.py", line 848, in getSubChain
    childs = volclass.getAllChildrenList(self.repoPath, sdDom.sdUUID, imgUUID, volUUID)
  File "/usr/share/vdsm/storage/fileVolume.py", line 409, in getAllChildrenList
    if sdDom.produceVolume(imgUUID, volid).getParent() == pvolUUID:
  File "/usr/share/vdsm/storage/fileSD.py", line 160, in produceVolume
    return fileVolume.FileVolume(repoPath, self.sdUUID, imgUUID, volUUID)
  File "/usr/share/vdsm/storage/fileVolume.py", line 64, in __init__
    volume.Volume.__init__(self, repoPath, sdUUID, imgUUID, volUUID)
  File "/usr/share/vdsm/storage/volume.py", line 120, in __init__
    self.validate()
  File "/usr/share/vdsm/storage/volume.py", line 127, in validate
    self.validateVolumePath()
  File "/usr/share/vdsm/storage/fileVolume.py", line 542, in validateVolumePath
    raise se.VolumeDoesNotExist(self.volUUID)
VolumeDoesNotExist: Volume does not exist: ('e9cc6664-fa20-4a84-9e10-6fbee025d691',)






c8195f27-4bf6-44b5-9cda-4083bb1dc6d2::ERROR::2012-02-08 18:41:24,720::task::855::TaskManager.Task::(_setError) Task=`c8195f27-4bf6-44b5-9cda-4083bb1dc6d2`::Unexpected error
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/task.py", line 863, in _run
    return fn(*args, **kargs)
  File "/usr/share/vdsm/storage/task.py", line 320, in run
    return self.cmd(*self.argslist, **self.argsdict)
  File "/usr/share/vdsm/storage/securable.py", line 80, in wrapper
    return f(*args, **kwargs)
  File "/usr/share/vdsm/storage/sp.py", line 1732, in mergeSnapshots
    image.Image(repoPath).merge(sdUUID, vmUUID, imgUUID, ancestor, successor, postZero)
  File "/usr/share/vdsm/storage/image.py", line 1103, in merge
    chain = self.getSubChain(sdDom, imgUUID, ancestor, successor)
  File "/usr/share/vdsm/storage/image.py", line 848, in getSubChain
    childs = volclass.getAllChildrenList(self.repoPath, sdDom.sdUUID, imgUUID, volUUID)
  File "/usr/share/vdsm/storage/fileVolume.py", line 409, in getAllChildrenList
    if sdDom.produceVolume(imgUUID, volid).getParent() == pvolUUID:
  File "/usr/share/vdsm/storage/fileSD.py", line 160, in produceVolume
    return fileVolume.FileVolume(repoPath, self.sdUUID, imgUUID, volUUID)
  File "/usr/share/vdsm/storage/fileVolume.py", line 64, in __init__
    volume.Volume.__init__(self, repoPath, sdUUID, imgUUID, volUUID)
  File "/usr/share/vdsm/storage/volume.py", line 120, in __init__
    self.validate()
  File "/usr/share/vdsm/storage/volume.py", line 127, in validate
    self.validateVolumePath()
  File "/usr/share/vdsm/storage/fileVolume.py", line 542, in validateVolumePath
    raise se.VolumeDoesNotExist(self.volUUID)
VolumeDoesNotExist: Volume does not exist: ('e9cc6664-fa20-4a84-9e10-6fbee025d691',)

Comment 3 Avi Tal 2012-02-08 17:10:45 UTC
Created attachment 560318 [details]
engine log

Comment 4 Avi Tal 2012-02-08 17:11:18 UTC
vdsm-4.9.3.3-0.fc16.x86_64

Comment 5 Igor Lvovsky 2012-02-09 12:55:35 UTC
It's a race between merges that we have on an NFS setup.
Assume that we run several merges concurrently.
When one of the merge processes renames a merged volume as part of the algorithm, a second merge process can try to produce that same volume as part of getAllChildrenList and fail, because the volume is no longer at the path it expects.
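The race can be illustrated with a small, deterministic simulation. This is only a sketch under simplifying assumptions, not VDSM code: the two "merges" are interleaved by hand in a single thread, the volume names are placeholders, and `list_volumes`, `validate_volume_path` and the local `VolumeDoesNotExist` class are hypothetical stand-ins for the real storage code seen in the traceback.

```python
import os
import tempfile


class VolumeDoesNotExist(Exception):
    """Stand-in for the VDSM storage exception of the same name."""


def list_volumes(img_dir):
    """Snapshot of the volume names currently in the directory (no .meta)."""
    return sorted(n for n in os.listdir(img_dir) if not n.endswith(".meta"))


def validate_volume_path(img_dir, vol_uuid):
    """Raise if the volume file is not where the caller expects it."""
    if not os.path.exists(os.path.join(img_dir, vol_uuid)):
        raise VolumeDoesNotExist(vol_uuid)


def simulate_race():
    """Interleave two merges by hand; return the volume that went missing."""
    with tempfile.TemporaryDirectory() as img_dir:
        for name in ("vol-parent", "vol-child"):
            open(os.path.join(img_dir, name), "w").close()

        # Merge B: takes a listing, as a children scan would.
        seen = list_volumes(img_dir)

        # Merge A: renames a volume as part of its own merge algorithm.
        os.rename(
            os.path.join(img_dir, "vol-child"),
            os.path.join(img_dir, "vol-child.tmp"),
        )

        # Merge B: validates every name from its now-stale listing.
        for vol in seen:
            try:
                validate_volume_path(img_dir, vol)
            except VolumeDoesNotExist as err:
                return str(err)
    return None


if __name__ == "__main__":
    print("missing volume:", simulate_race())
```

The point of the sketch is that the listing and the validation are two separate steps, so any rename that lands between them turns a volume that existed a moment ago into a VolumeDoesNotExist error.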

Comment 10 Eduardo Warszawski 2012-05-09 10:47:08 UTC
http://gerrit.ovirt.org/#change,3468

Comment 14 Eduardo Warszawski 2012-07-01 13:14:54 UTC
(Sitting with danken)
This bug is still on POST since it hasn't been backported yet. For some reason, bug 836562, about a problem in the initial implementation of the fix for this bug, was opened and acked for rhev-3.1. That initial implementation has never been in rhev-3.1.

To reduce confusion, let us close this bug as a dup and track the merge race condition in bug 836562.

*** This bug has been marked as a duplicate of bug 836562 ***