Bug 1498580 - Snapshot preview failure leaves jobs running and image locked
Summary: Snapshot preview failure leaves jobs running and image locked
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: ovirt-engine
Classification: oVirt
Component: Backend.Core
Version: 4.2.0
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: ovirt-4.1.7
Target Release: 4.1.7.4
Assignee: Ravi Nori
QA Contact: Michael Burman
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2017-10-04 16:37 UTC by Ravi Nori
Modified: 2017-11-13 12:25 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2017-11-13 12:25:45 UTC
oVirt Team: Infra
Embargoed:
rule-engine: ovirt-4.1+
rule-engine: blocker+


Attachments
engine.log (804.79 KB, application/x-gzip)
2017-10-04 16:37 UTC, Ravi Nori
screenshots (73.11 KB, application/zip)
2017-10-19 07:13 UTC, Mor


Links
System ID Private Priority Status Summary Last Updated
oVirt gerrit 82558 0 master MERGED engine : Snapshot preview failure leaves jobs running and image locked 2017-10-17 14:11:08 UTC
oVirt gerrit 82898 0 ovirt-engine-4.1 MERGED engine : Snapshot preview failure leaves jobs running and image locked 2017-10-18 07:17:15 UTC

Description Ravi Nori 2017-10-04 16:37:14 UTC
Created attachment 1334377 [details]
engine.log

The job never finishes and stays stuck in the engine DB.
The image is locked forever and can't be released.
The Preview VM Snapshot operation fails and the engine is aware of it, but the job keeps running forever. There is no timeout.

<jobs>
<job href="/ovirt-engine/api/jobs/19f3e594-a84c-406e-bd0f-5bf681179fc1" id="19f3e594-a84c-406e-bd0f-5bf681179fc1">
<actions>
<link href="/ovirt-engine/api/jobs/19f3e594-a84c-406e-bd0f-5bf681179fc1/clear" rel="clear"/>
<link href="/ovirt-engine/api/jobs/19f3e594-a84c-406e-bd0f-5bf681179fc1/end" rel="end"/>
</actions>
<description>Preview VM Snapshot snap4 of VM VM6</description>
<link href="/ovirt-engine/api/jobs/19f3e594-a84c-406e-bd0f-5bf681179fc1/steps" rel="steps"/>
<auto_cleared>true</auto_cleared>
<external>false</external>
<last_updated>2017-09-27T10:52:18.730+03:00</last_updated>
<start_time>2017-09-27T10:52:16.205+03:00</start_time>
<status>started</status>
<owner href="/ovirt-engine/api/users/586c19dc-00b9-00fa-0364-00000000012f" id="586c19dc-00b9-00fa-0364-00000000012f"/>
</job>
</jobs>
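
A job listing like the one above can be scanned for entries that are still "started" long after their last update. The following is a minimal Python sketch, not part of the original report. It uses only the standard library; the function name find_stale_jobs and the one-hour threshold are assumptions made for illustration, and the XML is a trimmed copy of the listing with attribute spacing restored.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timedelta, timezone

# Trimmed copy of the job listing from the report (attribute spacing restored).
JOBS_XML = """
<jobs>
<job href="/ovirt-engine/api/jobs/19f3e594-a84c-406e-bd0f-5bf681179fc1" id="19f3e594-a84c-406e-bd0f-5bf681179fc1">
<description>Preview VM Snapshot snap4 of VM VM6</description>
<last_updated>2017-09-27T10:52:18.730+03:00</last_updated>
<start_time>2017-09-27T10:52:16.205+03:00</start_time>
<status>started</status>
</job>
</jobs>
"""


def find_stale_jobs(xml_text, now, max_age):
    """Return (id, description, age) for 'started' jobs idle longer than max_age.

    Hypothetical helper: the name and the staleness rule are assumptions.
    """
    stale = []
    for job in ET.fromstring(xml_text.strip()).iter("job"):
        if job.findtext("status") != "started":
            continue
        # Timestamps carry a UTC offset, so the result is timezone-aware.
        last_updated = datetime.fromisoformat(job.findtext("last_updated"))
        age = now - last_updated
        if age > max_age:
            stale.append((job.get("id"), job.findtext("description"), age))
    return stale


if __name__ == "__main__":
    # Evaluate at the time the bug was reported (2017-10-04 16:37 UTC).
    now = datetime(2017, 10, 4, 16, 37, tzinfo=timezone.utc)
    for job_id, description, age in find_stale_jobs(JOBS_XML, now, timedelta(hours=1)):
        print(f"{job_id}: {description!r} idle for {age}")
```

The sketch only reads the listing. It does not call the clear or end links shown in the XML.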

- Snapshot-Preview snap4 for VM VM6 was initiated by admin@internal-authz.
- No MAC addresses left in the MAC Address Pool.
- Some MAC addresses had to be reallocated, but operation failed because of insufficient amount of free MACs.
- Failed to complete Snapshot-Preview snap4 for VM VM6.

- When a snapshot preview needs to allocate a new MAC for the VM and no MACs are left, the operation must fail. It does fail, and the engine is aware of the failure, but the job runs forever and stays stuck in the Finalize state.
Worst of all, the image stays locked forever!
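
The invariant described above is that a failed preview must still end the job and unlock the image. It can be sketched as follows. This is a minimal Python illustration of the general cleanup-on-failure pattern, not the ovirt-engine code and not the merged patch; Job, Image, run_with_cleanup, and preview_without_free_macs are hypothetical names invented for the example.

```python
class Image:
    """Toy stand-in for a disk image that can be locked."""

    def __init__(self):
        self.locked = False


class Job:
    """Toy stand-in for an engine job with a status field."""

    def __init__(self, description):
        self.description = description
        self.status = "started"


def run_with_cleanup(job, image, action):
    """Run action under the image lock; always unlock and close the job."""
    image.locked = True
    try:
        action()
        job.status = "finished"
    except Exception:
        # The failure is recorded on the job instead of leaving it 'started'.
        job.status = "failed"
    finally:
        # Runs on success and on failure, so the lock cannot leak.
        image.locked = False
    return job.status


def preview_without_free_macs():
    """Hypothetical action that fails the way the report describes."""
    raise RuntimeError("No MAC addresses left in the MAC Address Pool.")


if __name__ == "__main__":
    job = Job("Preview VM Snapshot snap4 of VM VM6")
    image = Image()
    print(run_with_cleanup(job, image, preview_without_free_macs), image.locked)
```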

- Steps to reproduce -
1) Master 4.2
2) 2 VMs in cluster
3) MAC pool range with only one MAC address in cluster
4) Start VM1 with 1 vNIC with MAC 'z'
5) Create snapshot from VM1
6) Unplug the vNIC from VM1, give VM1 MAC 'w' (not from the pool), and assign MAC address 'z' (the original MAC) to VM2
7) Try to preview VM1 from snapshot
Expected result - should fail
Actual result - Engine failed the operation -
- Snapshot-Preview snap4 for VM VM6 was initiated by admin@internal-authz.
- No MAC addresses left in the MAC Address Pool.
- Some MAC addresses had to be reallocated, but operation failed because of insufficient amount of free MACs.
- Failed to complete Snapshot-Preview snap4 for VM VM6.
But the job is stuck and the image stays locked!
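
The MAC bookkeeping behind steps 3 to 7 can be modelled with a toy pool to show why the preview has no address to use. This is a sketch under stated assumptions, not engine code: MacPool, mac_for_preview, and reproduce are hypothetical names, and the "reuse the snapshot's MAC if free, otherwise allocate a new one" rule is inferred from the event message about reallocation rather than taken from the source.

```python
class NoFreeMacError(Exception):
    pass


class MacPool:
    """Toy pool: a fixed range of MACs plus the set currently in use."""

    def __init__(self, mac_range):
        self.range = set(mac_range)
        self.in_use = set()

    def use(self, mac):
        """Mark a specific MAC (from the range or custom) as in use."""
        self.in_use.add(mac)

    def release(self, mac):
        self.in_use.discard(mac)

    def allocate(self):
        """Hand out any free MAC from the range, or raise if none is left."""
        free = sorted(self.range - self.in_use)
        if not free:
            raise NoFreeMacError("No MAC addresses left in the MAC Address Pool.")
        self.in_use.add(free[0])
        return free[0]


def mac_for_preview(pool, snapshot_mac):
    """Assumed rule: reuse the snapshot's MAC if unused, else allocate anew."""
    if snapshot_mac not in pool.in_use:
        pool.use(snapshot_mac)
        return snapshot_mac
    return pool.allocate()


def reproduce():
    pool = MacPool({"z"})        # step 3: range with a single MAC
    vm1_mac = pool.allocate()    # step 4: VM1 starts with 'z'
    snapshot_mac = vm1_mac       # step 5: the snapshot records 'z'
    pool.release(vm1_mac)        # step 6: VM1 gives up 'z' ...
    pool.use("w")                # ... takes custom MAC 'w' (outside the range)
    pool.use("z")                # ... and VM2 takes 'z'
    try:
        return mac_for_preview(pool, snapshot_mac)  # step 7
    except NoFreeMacError as err:
        return f"preview failed: {err}"


if __name__ == "__main__":
    print(reproduce())
```

In this model the snapshot's MAC 'z' is held by VM2 and the range has no other address, so the allocation in step 7 raises. That corresponds to the expected failure; the bug is in what happens to the job and the image lock afterwards.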

Comment 1 Red Hat Bugzilla Rules Engine 2017-10-05 13:40:03 UTC
This bug report has Keywords: Regression or TestBlocker.
Since no regressions or test blockers are allowed between releases, it is also being identified as a blocker for this release. Please resolve ASAP.

Comment 2 Mor 2017-10-18 15:36:12 UTC
I'm also experiencing this problem on Red Hat Virtualization Manager Version: 4.1.7.2-0.1.el7

Comment 3 Mor 2017-10-19 07:12:45 UTC
This is also relevant for 4.2.0-0.0.master.20171013142622.git15e767c.el7.centos

Comment 4 Mor 2017-10-19 07:13:36 UTC
Created attachment 1340563 [details]
screenshots

Comment 5 Martin Perina 2017-10-19 07:33:04 UTC
(In reply to Mor from comment #3)
> This is also relevant for
> 4.2.0-0.0.master.20171013142622.git15e767c.el7.centos

The fix was merged to master on Oct 17th, so it should be included in nightly build 4.2.0-0.0.master.20171018...

For 4.1.7 this fix is included in the 4.1.7.4 build.

Comment 6 Michael Burman 2017-10-25 06:27:57 UTC
I think that BZ 1506092 should be a blocker to this bug. It can't be tested properly until fixed.

Comment 7 Michael Burman 2017-10-25 06:54:42 UTC
(In reply to Michael Burman from comment #6)
> I think that BZ 1506092 should be a blocker to this bug. It can't be tested
> properly until fixed.

BZ 1506092 should be a blocker for this bug on 4.2, as 4.1.7.4 is not affected.

Comment 8 Michael Burman 2017-10-26 08:46:16 UTC
As BZ 1506092 affected only 4.2 and not 4.1.7, this report can be verified without any blockers or issues.

Comment 9 Michael Burman 2017-10-26 10:49:29 UTC
Verified on - 4.1.7.4-0.1.el7

