Bug 1074311

Summary: [engine-backend] [external-provider] failure to import a glance image (as a template) leaves image in LOCKED state
Product: Red Hat Enterprise Virtualization Manager
Reporter: Elad <ebenahar>
Component: ovirt-engine
Assignee: Daniel Erez <derez>
Status: CLOSED CURRENTRELEASE
QA Contact: Elad <ebenahar>
Severity: high
Docs Contact:
Priority: unspecified
Version: 3.4.0
CC: amureini, derez, ebenahar, fsimonce, gklein, iheim, lpeer, rbalakri, Rhev-m-bugs, scohen, tnisan, yeylon
Target Milestone: ---
Flags: amureini: Triaged+
Target Release: 3.5.0
Hardware: x86_64
OS: Unspecified
Whiteboard: storage
Fixed In Version: ovirt-engine-3.5.0_alpha1.1
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed:
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: Storage
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On:
Bug Blocks: 1142923, 1156165
Attachments:
Description Flags
logs from engine and vdsm and screenshot none

Description Elad 2014-03-09 16:10:20 UTC
Created attachment 872424 [details]
logs from engine and vdsm and screenshot

Description of problem:
When the engine crashes during DownloadImage while importing a glance image (with 'Import as template' = true), the engine neither rolls back the action nor orders vdsm to delete the leftover image. Instead, after vdsm successfully finishes the downloadImage task, the disk remains 'LOCKED' in the DB.

Version-Release number of selected component (if applicable):
rhevm-3.4.0-0.3.master.el6ev.noarch
vdsm-4.14.2-0.2.el6ev.x86_64

How reproducible:
Need engine to crash during DownloadImage

Steps to Reproduce:
On a shared DC with storage domains attached and integrated glance repository with images:
1. Import an image from glance repository with 'import as template' = true
2. Restart ovirt-engine service during DownloadImage


Actual results:
DownloadImage starts on the engine and the task is sent to vdsm. Right after that, I restarted the engine.

2014-03-09 17:10:51,965 INFO  [org.ovirt.engine.core.vdsbroker.irsbroker.DownloadImageVDSCommand] (org.ovirt.thread.pool-4-thread-48) START, DownloadImageVDSCommand( storagePoolId = d3d5c88e-075a-47cf-afc2-549964721c55, ignoreFailoverLimit = false, storageDomainId = 746e7ff7-dc76-4e15-b006-0b6ef42e8317, imageGroupId = dce62458-3790-4703-a873-985ecfaa8ea0, imageId = 00000000-0000-0000-0000-000000000000), log id: 3c3a384c
2014-03-09 17:10:51,965 INFO  [org.ovirt.engine.core.vdsbroker.irsbroker.DownloadImageVDSCommand] (org.ovirt.thread.pool-4-thread-48) -- executeIrsBrokerCommand: calling 'downloadImage'
2014-03-09 17:10:51,966 INFO  [org.ovirt.engine.core.vdsbroker.irsbroker.DownloadImageVDSCommand] (org.ovirt.thread.pool-4-thread-48) -- downloadImage parameters:
                dstSpUUID=d3d5c88e-075a-47cf-afc2-549964721c55
                dstSdUUID=746e7ff7-dc76-4e15-b006-0b6ef42e8317
                dstImageGUID=dce62458-3790-4703-a873-985ecfaa8ea0
                dstVolUUID=00000000-0000-0000-0000-000000000000
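
For anyone comparing these parameters against the DB and storage, here is a minimal Python sketch that turns the indented `key=value` block into a dictionary. The helper name `parse_parameters` is hypothetical and not part of engine or vdsm; the values are copied from the log above.

```python
def parse_parameters(block: str) -> dict:
    """Parse indented 'key=value' lines (as logged above) into a dict."""
    params = {}
    for line in block.splitlines():
        key, sep, value = line.strip().partition("=")
        if sep and key:  # skip blank lines and lines without '='
            params[key] = value
    return params


# Parameter block copied from the engine log in this report.
LOG_BLOCK = """
                dstSpUUID=d3d5c88e-075a-47cf-afc2-549964721c55
                dstSdUUID=746e7ff7-dc76-4e15-b006-0b6ef42e8317
                dstImageGUID=dce62458-3790-4703-a873-985ecfaa8ea0
                dstVolUUID=00000000-0000-0000-0000-000000000000
"""

params = parse_parameters(LOG_BLOCK)
print(params["dstImageGUID"])
```

The dstImageGUID value is the same ID that shows up as image_group_id in the DB and as the image directory on storage.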


When the engine recovers from the restart, the task is not reverted. The disk remains stuck in 'LOCKED' state and is unattached (although it was imported as a template):


 imagestatus |            image_group_id            |        vm_names
-------------+--------------------------------------+------------------------
           2 | dce62458-3790-4703-a873-985ecfaa8ea0
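
To make the row above easier to read, here is a small Python sketch that maps the numeric imagestatus to a name and picks out locked image groups. Only the code 2 = LOCKED is taken from this report; the code 1 = OK is an assumption, and both helper names are hypothetical.

```python
# Hypothetical lookup table; only 2 = LOCKED comes from this report.
IMAGE_STATUS_NAMES = {
    1: "OK",      # assumption, not confirmed by this report
    2: "LOCKED",  # matches the row shown above
}


def describe_image_status(code: int) -> str:
    """Return a readable name for an imagestatus code."""
    return IMAGE_STATUS_NAMES.get(code, f"UNKNOWN({code})")


def find_locked(rows):
    """rows: iterable of (imagestatus, image_group_id) tuples."""
    return [gid for status, gid in rows
            if describe_image_status(status) == "LOCKED"]


# The single row from the query result above.
rows = [(2, "dce62458-3790-4703-a873-985ecfaa8ea0")]
print(find_locked(rows))
```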





Image exists on storage:

[root@green-vdsc images]# tree  dce62458-3790-4703-a873-985ecfaa8ea0
dce62458-3790-4703-a873-985ecfaa8ea0
|-- 8bed2ae7-a2f8-4036-b02b-b3c6d855ca22
|-- 8bed2ae7-a2f8-4036-b02b-b3c6d855ca22.lease
`-- 8bed2ae7-a2f8-4036-b02b-b3c6d855ca22.meta


Not sure if it's related to the fact that the image was imported as template.

Expected results:
If importing an image from glance fails due to an engine crash, the engine should revert the DownloadImage task and delete the leftovers in storage.

Additional info:
logs from engine and vdsm and screenshot

Comment 1 Oved Ourfali 2014-03-12 11:11:16 UTC
Does it also happen when importing not as a template?
It should, iiuc, as this part of the flow is identical.

Comment 2 Elad 2014-03-12 16:19:35 UTC
I checked the scenario also with 'import as template = false', disk is removed from the system, so it doesn't reproduce in that case.

Is it possible that the disk remains LOCKED because it was supposed to be wrapped with the template configuration and it didn't happen?

Comment 3 Oved Ourfali 2014-03-12 19:44:48 UTC
(In reply to Elad from comment #2)
> I checked the scenario also with 'import as template = false', disk is
> removed from the system, so it doesn't reproduce in that case.
> 
> Is it possible that the disk remains LOCKED because it was supposed to be
> wrapped with the template configuration and it didn't happen?

Looking at the code, both scenarios should behave the same.
Tried to reproduce it, and saw that once the engine was restarted, looking at the tasks you can see that it is still running (the tasks count was 0 for some reason, but expanding it showed the import task).
Is it possible that you reported it as locked but it is still running?
Also, when you check the host running the downloadImage command, do you see that the download is still working? (ps -ef | grep -i curl-img-wrap).
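
If it helps to script that check, the following is a rough Python sketch of the same filter as `ps -ef | grep -i curl-img-wrap`, applied to captured `ps` output. The function name and the sample lines are made up for illustration; they are not real output from this environment.

```python
from typing import List


def find_download_processes(ps_output: str,
                            needle: str = "curl-img-wrap") -> List[str]:
    """Return ps lines mentioning `needle` (case-insensitive),
    skipping the grep process itself."""
    needle = needle.lower()
    return [line for line in ps_output.splitlines()
            if needle in line.lower() and "grep" not in line.lower()]


# Synthetic sample lines, for illustration only.
sample = "\n".join([
    "vdsm  1234  1  0 17:10 ?  00:00:02 curl-img-wrap (sample line)",
    "root  5678  1  0 17:11 ?  00:00:00 grep -i curl-img-wrap",
    "root  9012  1  0 17:00 ?  00:00:00 sshd",
])
still_downloading = bool(find_download_processes(sample))
print(still_downloading)
```

A non-empty result would suggest the download is still in progress on the host, so the disk being locked at that moment would be expected.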

Adding Federico to the CC list as well, as he might have some insight about that.

Comment 4 Oved Ourfali 2014-03-12 20:30:04 UTC
Also verified that in my environment it happens both when importing the image as a disk and when importing as a template.

Allon - it seems like neither endSuccessfully nor endWithFailure is being called in such a case, although the task finishes successfully. Can it be related to the S.E.A.T. mechanism?

Comment 5 Elad 2014-03-13 09:06:25 UTC
Did some more tests and indeed, the disk remains LOCKED when importing the image as a disk, just as it does when importing it as a template. It seems that restarting the engine doesn't affect the task on vdsm (createVolume). The task keeps running until the import ends.

Comment 6 Allon Mureinik 2014-03-13 11:22:10 UTC
The task /should/ continue if the engine restarts. This is by design.
What should not happen is the disk remaining locked in case of a failure.

Daniel, can you please take a look and help out with the SEAT mechanism?

Comment 7 Oved Ourfali 2014-03-13 11:30:52 UTC
(In reply to Allon Mureinik from comment #6)
> The task /should/ continue if the engine restarts. This is by design.
> What should not happen is the disk remaining locked in case of a failure.
> 
> Daniel, can you please take a look and help out with the SEAT mechanism?

There is no failure. The task finishes correctly, but neither endSuccessfully nor endWithFailure is being called.
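
To spell out the behavior being asked for, here is a toy Python model of what should happen once the engine is back up. It is not the engine's code: `TaskState`, `Disk`, and `recover_after_restart` are hypothetical names, and the two end functions only mimic the intended effect of endSuccessfully and endWithFailure (unlock the disk, or clean it up).

```python
from dataclasses import dataclass
from enum import Enum


class TaskState(Enum):
    RUNNING = "running"
    FINISHED_OK = "finished_ok"
    FINISHED_FAILED = "finished_failed"


@dataclass
class Disk:
    image_group_id: str
    status: str = "LOCKED"  # disk is locked while the import runs
    removed: bool = False


def end_successfully(disk: Disk) -> None:
    """Stand-in for endSuccessfully: release the lock."""
    disk.status = "OK"


def end_with_failure(disk: Disk) -> None:
    """Stand-in for endWithFailure: drop the leftover disk."""
    disk.removed = True


def recover_after_restart(disk: Disk, task_state: TaskState) -> str:
    """Decide what to do with a disk whose task outlived an engine restart."""
    if task_state is TaskState.RUNNING:
        return "still running"  # keep polling; the lock is legitimate
    if task_state is TaskState.FINISHED_OK:
        end_successfully(disk)
        return "ended successfully"
    end_with_failure(disk)
    return "ended with failure"


disk = Disk("dce62458-3790-4703-a873-985ecfaa8ea0")
print(recover_after_restart(disk, TaskState.FINISHED_OK), disk.status)
```

In the reported case the task reaches the finished state on vdsm, but neither end path runs, so the disk never leaves LOCKED.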

Comment 8 Tal Nisan 2014-04-06 14:03:43 UTC
Since there is an easy workaround for this corner-case scenario, this can be pushed to 3.5. If the issue reproduces, the unlocker.sh script can unlock the template entity after the engine restart.

Comment 9 Elad 2014-05-27 10:27:14 UTC
Tested the scenario described in comment #0. Disk stuck in 'LOCKED' state.



Bug is not fixed, re-opening.

Comment 10 Elad 2014-05-27 14:04:01 UTC
Ignore the previous comment. After a failure in the engine during the downloadImage task in vdsm, the disk is moved from LOCKED to OK.

Verified using ovirt-engine-3.5.0-0.0.master.20140519181229.gitc6324d4.el6.noarch
Against RHEV3.4 av9.2 where the issue is reproduced.

There is another issue with importing an image from glance. Disk gets stuck in LOCKED state in case of an engine failure during createVolume phase, as reported here:   https://bugzilla.redhat.com/show_bug.cgi?id=1101541

Comment 11 Allon Mureinik 2015-02-16 19:12:51 UTC
RHEV-M 3.5.0 has been released, closing this bug.
