Bug 982004

Summary: [vdsm] vdsm fails to rollback tasks after a failure in create snapshot (with several disks)
Product: Red Hat Enterprise Virtualization Manager
Reporter: Elad <ebenahar>
Component: vdsm
Assignee: Sergey Gotliv <sgotliv>
Status: CLOSED ERRATA
QA Contact: Elad <ebenahar>
Severity: high
Docs Contact:
Priority: unspecified
Version: 3.3.0
CC: abaron, amureini, bazulay, fsimonce, iheim, jkt, lpeer, scohen, sgotliv, yeylon
Target Milestone: ---
Flags: amureini: Triaged+
Target Release: 3.3.0
Hardware: x86_64
OS: Unspecified
Whiteboard: storage
Fixed In Version: v4.13.0
Doc Type: Bug Fix
Doc Text:
After a failure in creating a snapshot of a virtual machine with several disks, VDSM could not roll back the unfinished tasks. This update adds code to handle this failure, and enables parent volume rollback.
Story Points: ---
Clone Of:
Environment:
Last Closed: 2014-01-21 16:27:19 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: Storage
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Attachments:
Description Flags
logs none

Description Elad 2013-07-07 17:06:19 UTC
Created attachment 770081
logs

Description of problem:
After a failure while creating a snapshot of a VM with several disks, vdsm fails to roll back the unfinished tasks:

Traceback (most recent call last):
  File "/usr/share/vdsm/storage/volume.py", line 348, in parentVolumeRollback
    pvol.teardown(sdUUID, pvolUUID)
  File "/usr/share/vdsm/storage/blockVolume.py", line 381, in teardown
    rmanager.releaseResource(lvmActivationNamespace, volUUID)
  File "/usr/share/vdsm/storage/resourceManager.py", line 630, in releaseResource
    "registered" % (namespace, name))
ValueError: Resource '283c1cc8-1d44-47f6-970d-5df9f4b4dedf_lvmActivationNS.a5bb5e48-57c8-4d5a-a077-0ffa7e478dbe' is not currently registered

Version-Release number of selected component (if applicable):
vdsm-4.11.0-69.gitd70e3d5.el6.x86_64

How reproducible:
100%

Steps to Reproduce (on a block pool):
1. Create a VM with 3 disks.
2. Create a snapshot of the VM.
3. While the snapshot creation tasks are running, restart the vdsm service.


Actual results:
After vdsm comes back up, it fails to roll back the unfinished create snapshot tasks.

Expected results:
vdsm should be able to roll back the unfinished tasks after a failure in create snapshot.

Additional info:
logs

Comment 1 Federico Simoncelli 2013-07-09 12:23:45 UTC
In parentVolumeRollback a pvol.prepare() call is missing:

    @classmethod
    def parentVolumeRollback(cls, taskObj, sdUUID, pimgUUID, pvolUUID):
        cls.log.info("parentVolumeRollback: sdUUID=%s pimgUUID=%s"
                     " pvolUUID=%s" % (sdUUID, pimgUUID, pvolUUID))
        try:
            if pvolUUID != BLANK_UUID and pimgUUID != BLANK_UUID:
                pvol = sdCache.produce(sdUUID).produceVolume(pimgUUID,
                                                             pvolUUID)
                if not pvol.isShared() and not pvol.recheckIfLeaf():
                    pvol.setLeaf()
                pvol.teardown(sdUUID, pvolUUID)
        except Exception:
            cls.log.error("Unexpected error", exc_info=True)
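
A toy model of why the teardown in the traceback above fails when no prepare() came first. This is only a sketch under assumptions: the ResourceManager and Volume classes below are hypothetical stand-ins, not vdsm's real classes or signatures, and it assumes that prepare() registers the resource that teardown() later releases, and that the in-memory registry is empty after a service restart.

```python
class ResourceManager:
    """Hypothetical in-memory registry (not vdsm's resourceManager)."""

    def __init__(self):
        self._registered = set()

    def acquire(self, namespace, name):
        self._registered.add((namespace, name))

    def release(self, namespace, name):
        key = (namespace, name)
        if key not in self._registered:
            # Mirrors the shape of the error message in the traceback.
            raise ValueError("Resource '%s.%s' is not currently registered"
                             % (namespace, name))
        self._registered.remove(key)


class Volume:
    """Hypothetical volume whose prepare/teardown must be paired."""

    def __init__(self, rmanager, namespace, vol_uuid):
        self._rm = rmanager
        self._ns = namespace
        self._uuid = vol_uuid

    def prepare(self):
        self._rm.acquire(self._ns, self._uuid)

    def teardown(self):
        self._rm.release(self._ns, self._uuid)


def rollback(vol, prepare_first):
    """Run a rollback; return None on success or the ValueError raised."""
    try:
        if prepare_first:
            vol.prepare()
        vol.teardown()
    except ValueError as exc:
        return exc
    return None


# A fresh registry models the state after the service restart.
rm = ResourceManager()
vol = Volume(rm, "sd_lvmActivationNS", "vol-1")

error_without_prepare = rollback(vol, prepare_first=False)
error_with_prepare = rollback(vol, prepare_first=True)

print("without prepare:", error_without_prepare)
print("with prepare:", error_with_prepare)
```

In this model the rollback succeeds only when prepare() is called before teardown(), which matches the observation that the prepare() call is missing.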

Comment 2 Elad 2013-10-15 15:50:40 UTC
VDSM performs the rollback after the failure, but the deleteImage task gets stuck.
Marking this bug as dependent on bug 1019394.

Comment 3 Elad 2013-10-16 10:12:48 UTC
Marking as verified; vdsm now performs the rollback.

Checked on RHEVM3.3 - is18.1

Comment 4 Charlie 2013-11-28 00:30:10 UTC
This bug is currently attached to errata RHBA-2013:15291. If this change is not to be documented in the text for this errata please either remove it from the errata, set the requires_doc_text flag to minus (-), or leave a "Doc Text" value of "--no tech note required" if you do not have permission to alter the flag.

Otherwise to aid in the development of relevant and accurate release documentation, please fill out the "Doc Text" field above with these four (4) pieces of information:

* Cause: What actions or circumstances cause this bug to present.
* Consequence: What happens when the bug presents.
* Fix: What was done to fix the bug.
* Result: What now happens when the actions or circumstances above occur. (NB: this is not the same as 'the bug doesn't present anymore')

Once filled out, please set the "Doc Type" field to the appropriate value for the type of change made and submit your edits to the bug.

For further details on the Cause, Consequence, Fix, Result format please refer to:

https://bugzilla.redhat.com/page.cgi?id=fields.html#cf_release_notes 

Thanks in advance.

Comment 5 errata-xmlrpc 2014-01-21 16:27:19 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHBA-2014-0040.html