Bug 1492473

Summary: Editing VM properties task hangs forever. The only way out is remove job_id from postgres and engine restart
Product: [oVirt] ovirt-engine
Reporter: Polina <pagranat>
Component: General
Assignee: Eyal Shenitzky <eshenitz>
Status: CLOSED CURRENTRELEASE
QA Contact: Polina <pagranat>
Severity: high
Docs Contact:
Priority: unspecified
Version: 4.2.0
CC: amureini, bugs, lveyde, pagranat, tjelinek
Target Milestone: ovirt-4.1.9
Flags: rule-engine: ovirt-4.1+
Target Release: ---
Hardware: x86_64
OS: Linux
Whiteboard:
Fixed In Version: ovirt-engine-4.1.9
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
: 1516907
Environment:
Last Closed: 2018-01-24 10:41:10 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: Storage
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On:
Bug Blocks: 1516907
Attachments:
Description Flags
sometimes the scenario causes OperationCanceled error attached none
two files engine.log and vdsm.log attached none

Description Polina 2017-09-17 15:33:34 UTC
Description of problem:
Setting High Availability properties causes a task in the Tasks window to hang.

Version-Release number of selected component (if applicable):


How reproducible:


Steps to Reproduce:
1. Create a VM and run it. Open the 'High Availability' tab, check the High Availability box, and choose nfs_2 from the drop-down list. Clicking OK brings up a message that the VM must be restarted.
2. Power off the VM. Open the 'High Availability' tab again, change nfs_2 to nfs_1, and click OK.
3. Run the VM.

Actual results:
In the Tasks tab (upper right corner), the 'Editing VM properties' task stays in the 'Finalizing' state and never ends. The only way out is to remove the job_id from the job table in postgres and then restart the engine.

Expected results:
The VM is started with the new High Availability settings.

Additional info:
Sometimes you have to repeat the scenario (open the High Availability tab, run the VM) twice to reproduce.

Comment 1 Polina 2017-09-17 15:38:21 UTC
Created attachment 1326994 [details]
sometimes the scenario causes  OperationCanceled error attached

Comment 2 Tomas Jelinek 2017-09-18 06:39:37 UTC
Please always attach engine and vdsm logs to bug reports.

Comment 3 Polina 2017-10-02 14:11:20 UTC
Created attachment 1333252 [details]
two files engine.log and vdsm.log attached

Comment 4 Tomas Jelinek 2017-10-06 11:01:57 UTC
There are simpler steps to reproduce:
- create VM
- edit VM and set it as HA with lease on SD1
- save
- (quickly) edit again and change the lease to SD2

The problem looks like a regression from https://gerrit.ovirt.org/#/c/72120/
Before that patch, the VM lease commands were synchronous, so the VM was locked while these commands ran.

Since that patch, the lease commands are asynchronous and take a while to run, but the VM is for some reason not locked. Because the VM is not locked, it is possible to run another edit (even of the lease) on it while the first one is still in flight. This race prevents the jobs from finishing properly.
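
To illustrate the kind of locking that appears to be missing, here is a minimal Python sketch. It is not ovirt-engine code, and it is not the fix that eventually went in: `VmLockRegistry`, `edit_lease`, and the `lease_op_done` event are hypothetical names made up for this example. The idea is only that the VM lock is held for the whole duration of the asynchronous lease operation, so a second edit arriving in the meantime is refused instead of racing with the first.

```python
import threading
from typing import Dict, Optional


class VmLockRegistry:
    """Hypothetical per-VM exclusive lock table (an assumption, not the engine API)."""

    def __init__(self) -> None:
        self._guard = threading.Lock()
        self._locked = set()

    def try_acquire(self, vm_id: str) -> bool:
        # Atomically check and mark the VM as locked.
        with self._guard:
            if vm_id in self._locked:
                return False
            self._locked.add(vm_id)
            return True

    def release(self, vm_id: str) -> None:
        with self._guard:
            self._locked.discard(vm_id)


def edit_lease(
    locks: VmLockRegistry,
    vm_id: str,
    storage_domain: str,
    leases: Dict[str, str],
    lease_op_done: threading.Event,
) -> Optional[threading.Thread]:
    """Start an async lease change; return None if the VM is already locked."""
    if not locks.try_acquire(vm_id):
        return None  # another edit is still in flight, so refuse this one

    def worker() -> None:
        try:
            lease_op_done.wait()  # stands in for the slow lease operation
            leases[vm_id] = storage_domain
        finally:
            locks.release(vm_id)  # unlock only once the async work has ended

    thread = threading.Thread(target=worker)
    thread.start()
    return thread


if __name__ == "__main__":
    locks = VmLockRegistry()
    leases: Dict[str, str] = {}
    gate = threading.Event()
    first = edit_lease(locks, "vm1", "SD1", leases, gate)
    second = edit_lease(locks, "vm1", "SD2", leases, gate)  # quick second edit
    gate.set()
    first.join()
    print("second edit accepted:", second is not None)
    print("lease ends up on:", leases["vm1"])
```

In this sketch the quick second edit is rejected while the first lease operation is pending, and a later edit succeeds once the lock has been released.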

Moving to storage for further investigation.

Comment 5 Polina 2017-11-21 08:49:10 UTC
The negative effect of the bug is even worse: after this, the leases in the HA tab disappear, and it is impossible to choose an HA lease.

Comment 6 Polina 2018-01-22 11:32:53 UTC
Verified in environment compute-ge-he-2.qa.lab.tlv.redhat.com,
version ovirt-engine-4.1.9-0.2.el7.noarch.

Swapping leases directly is no longer allowed: the existing lease must be removed first, and then a new one can be chosen. No hanging tasks.

Comment 7 Sandro Bonazzola 2018-01-24 10:41:10 UTC
This bugzilla is included in the oVirt 4.1.9 release, published on Jan 24th 2018.

Since the problem described in this bug report should be resolved in the oVirt 4.1.9 release, published on Jan 24th 2018, it has been closed with a resolution of CURRENT RELEASE.

If the solution does not work for you, please open a new bug report.