Bug 1257529

Summary: job monitoring doesn't work as expected
Product: Red Hat Enterprise Virtualization Manager Reporter: Raz Tamir <ratamir>
Component: ovirt-engine Assignee: Moti Asayag <masayag>
Status: CLOSED CURRENTRELEASE QA Contact: Raz Tamir <ratamir>
Severity: high Docs Contact:
Priority: high    
Version: 3.6.0 CC: gklein, lsurette, oourfali, pstehlik, rbalakri, Rhev-m-bugs, srevivo, ykaul
Target Milestone: ovirt-3.6.0-rc3 Keywords: Automation, Regression
Target Release: 3.6.0   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2016-04-20 01:39:41 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: Infra RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Attachments:
Description Flags
engine log none

Description Raz Tamir 2015-08-27 09:36:47 UTC
Created attachment 1067616
engine log

Description of problem:
In many cases the job monitoring "resets" jobs that have already finished and moves them back to status STARTED. This behavior is visible in the UI.
In the DB everything seems to be fine (all jobs are finished).

* The most commonly affected job is moving a disk to another SD


Version-Release number of selected component (if applicable):
rhevm-3.6.0-0.12.master.el6.noarch

How reproducible:
90%

Steps to Reproduce:
1. Create a VM with a disk
2. Move the disk to a second SD
3.

Actual results:


Expected results:


Additional info:
In the logs - around 11:57:42

Comment 1 Raz Tamir 2015-08-27 10:30:44 UTC
Another affected job is removing a snapshot

Comment 2 Oved Ourfali 2015-08-30 05:01:37 UTC
Moti, I guess it is the same issue we're working on.

Comment 3 Moti Asayag 2015-08-30 05:53:30 UTC
(In reply to Oved Ourfali from comment #2)
> Moti, I guess it is the same issue we're working on.

Indeed, this was solved by [1] and is another aspect of Bug 1248090

[1] https://gerrit.ovirt.org/#/c/45008/

Comment 5 Raz Tamir 2015-10-08 13:44:44 UTC
Still occurs.
The job 'Removing Snapshot .* of VM .*' is still stuck

Comment 6 Moti Asayag 2015-10-18 13:06:08 UTC
(In reply to ratamir from comment #1)
> Another affected job is removing a snapshot

This scenario reproduces only when removing a snapshot of a running VM (with or without saving memory).

It seems that the Command Coordinator somehow nullifies the job ID associated with the action, which causes the command's context to fail to close the job.
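
To illustrate the suspected failure mode, here is a minimal sketch in Python. It is not oVirt engine code: every name in it (Job, CommandContext, JobRepository, end_command, simulate) is hypothetical, and it only assumes that a command closes its job by looking it up through a job ID carried in the command's context. If that ID is lost on the way, the job is never closed and stays in STARTED.

```python
import uuid
from dataclasses import dataclass
from enum import Enum
from typing import Dict, Optional


class JobStatus(Enum):
    STARTED = "STARTED"
    FINISHED = "FINISHED"


@dataclass
class Job:
    job_id: uuid.UUID
    description: str
    status: JobStatus = JobStatus.STARTED


@dataclass
class CommandContext:
    # Hypothetical stand-in for the context a command carries around.
    job_id: Optional[uuid.UUID]


class JobRepository:
    """Hypothetical in-memory job store."""

    def __init__(self) -> None:
        self._jobs: Dict[uuid.UUID, Job] = {}

    def start(self, description: str) -> Job:
        job = Job(uuid.uuid4(), description)
        self._jobs[job.job_id] = job
        return job

    def get(self, job_id: uuid.UUID) -> Optional[Job]:
        return self._jobs.get(job_id)


def end_command(ctx: CommandContext, repo: JobRepository) -> bool:
    """Close the job referenced by the context; return True if one was closed."""
    if ctx.job_id is None:
        # Nothing to look up: the job silently stays in its current status.
        return False
    job = repo.get(ctx.job_id)
    if job is None:
        return False
    job.status = JobStatus.FINISHED
    return True


def simulate(drop_job_id: bool) -> JobStatus:
    repo = JobRepository()
    job = repo.start("Removing Snapshot")
    # Assumption: the context is rebuilt when the asynchronous command resumes,
    # and the job ID may be dropped during that hand-off.
    ctx = CommandContext(job_id=None if drop_job_id else job.job_id)
    end_command(ctx, repo)
    return job.status
```

With the ID preserved, simulate(False) leaves the job FINISHED; with the ID dropped, simulate(True) leaves it STARTED, which matches the stuck 'Removing Snapshot' job seen in the UI.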

Comment 7 Raz Tamir 2015-11-10 12:52:05 UTC
Verified on rhevm-3.6.0.3-0.1.el6.noarch (3.6.0-18)

The remove snapshot task was marked as FINISHED after it completed