Bug 1492473 - Editing VM properties task hangs forever. The only way out is remove job_id from postgres and engine restart
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: ovirt-engine
Classification: oVirt
Component: General
Version: 4.2.0
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: high
Target Milestone: ovirt-4.1.9
Target Release: ---
Assignee: Eyal Shenitzky
QA Contact: Polina
URL:
Whiteboard:
Depends On:
Blocks: 1516907
Reported: 2017-09-17 15:33 UTC by Polina
Modified: 2018-01-24 10:41 UTC (History)

Fixed In Version: ovirt-engine-4.1.9
Clone Of:
Clones: 1516907
Environment:
Last Closed: 2018-01-24 10:41:10 UTC
oVirt Team: Storage
Embargoed:
rule-engine: ovirt-4.1+


Attachments
sometimes the scenario causes OperationCanceled error attached (169.90 KB, image/png)
2017-09-17 15:38 UTC, Polina
two files engine.log and vdsm.log attached (1.25 MB, application/zip)
2017-10-02 14:11 UTC, Polina


Links
System | ID | Private | Priority | Status | Summary | Last Updated
oVirt gerrit | 84893 | 0 | master | MERGED | core: Change lock scope to 'Command' in UpdateVMCommand | 2021-02-19 09:41:13 UTC
oVirt gerrit | 85626 | 0 | ovirt-engine-4.1 | MERGED | core: Change lock scope to 'Command' in UpdateVMCommand | 2021-02-19 09:41:13 UTC

Description Polina 2017-09-17 15:33:34 UTC
Description of problem:
Setting High Availability properties causes the Tasks window to hang.

Version-Release number of selected component (if applicable):


How reproducible:


Steps to Reproduce:
1. Create a VM and run it. Open the 'High Availability' tab, check the High Availability box, and choose nfs_2 from the drop-down list. Clicking OK brings up a message that the VM must be restarted.
2. Power off the VM. Open the 'High Availability' tab again and change nfs_2 to nfs_1. Click OK.
3. Run the VM.

Actual results:
In the Tasks tab (upper right corner), the 'Editing VM properties' task stays in the 'Finalizing' state and never ends. The only way out is to remove the job_id from the job table in postgres and then restart the engine.

Expected results:
The VM is started with the new High Availability settings.

Additional info:
Sometimes you have to repeat the scenario (open the High Availability tab, run the VM) twice to reproduce.
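
For illustration only, the cleanup half of the workaround under "Actual results" can be sketched as below. This is a minimal sketch, not the engine's real schema: only the table name `job` and the column `job_id` come from this report. The `description` and `status` columns, the sample values, and the helper name `remove_stuck_job` are assumptions, and an in-memory SQLite database stands in for the engine's postgres database. The engine restart still has to follow.

```python
import sqlite3


def remove_stuck_job(conn, job_id):
    """Delete one row from the job table by job_id; return the number of rows removed."""
    cur = conn.execute("DELETE FROM job WHERE job_id = ?", (job_id,))
    conn.commit()
    return cur.rowcount


# Stand-in database: everything except the names "job" and "job_id" is hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE job (job_id TEXT PRIMARY KEY, description TEXT, status TEXT)")
conn.execute("INSERT INTO job VALUES ('job-1', 'Editing VM properties', 'hypothetical-status')")

removed = remove_stuck_job(conn, "job-1")
remaining = conn.execute("SELECT COUNT(*) FROM job").fetchone()[0]
```

Deleting by the exact job_id, rather than by description or state, keeps the cleanup limited to the one hanging task.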

Comment 1 Polina 2017-09-17 15:38:21 UTC
Created attachment 1326994 [details]
sometimes the scenario causes  OperationCanceled error attached

Comment 2 Tomas Jelinek 2017-09-18 06:39:37 UTC
Please always attach engine and vdsm logs to bug reports.

Comment 3 Polina 2017-10-02 14:11:20 UTC
Created attachment 1333252 [details]
two files engine.log and vdsm.log attached

Comment 4 Tomas Jelinek 2017-10-06 11:01:57 UTC
There are simpler steps to reproduce:
- create VM
- edit VM and set it as HA with lease on SD1
- save
- (quickly) edit again and change the lease to SD2

The problem looks like a regression from https://gerrit.ovirt.org/#/c/72120/
Before that patch, the VM lease commands were sync, so the VM was locked while these commands were running.

Since that patch, the lease commands are async and take a while to run, but for some reason the VM is not locked. Since the VM is not locked, it is possible to run another edit (even of the lease) on it. This race causes the jobs not to finish properly.
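
The race described above can be sketched as a small simulation. This is a toy model under stated assumptions, not ovirt-engine code: the classes `Vm` and `UpdateVm`, the enum `LockScope`, and the helper `second_edit_accepted` are hypothetical names, and the assumption is that the lock used to be released as soon as the synchronous part of the edit returned, while a 'Command' scope (as in the linked gerrit patches) keeps it until the async lease work completes.

```python
from enum import Enum


class LockScope(Enum):
    EXECUTION = "Execution"  # assumed: released when the synchronous part returns
    COMMAND = "Command"      # held until the async lease work has finished


class Vm:
    def __init__(self):
        self.locked_by = None
        self.lease = None


class UpdateVm:
    """Toy edit command: execute() only schedules the lease change, finish() applies it."""

    def __init__(self, vm, new_lease, scope):
        self.vm = vm
        self.new_lease = new_lease
        self.scope = scope

    def execute(self):
        if self.vm.locked_by is not None:
            return False  # VM is locked by another edit, so this edit is rejected
        self.vm.locked_by = self
        if self.scope is LockScope.EXECUTION:
            self.vm.locked_by = None  # lock dropped while the async work is still pending
        return True

    def finish(self):
        # Async completion: apply the lease and release the lock if still held.
        self.vm.lease = self.new_lease
        if self.vm.locked_by is self:
            self.vm.locked_by = None


def second_edit_accepted(scope):
    """Start an edit to SD1, then try an edit to SD2 before the first one finishes."""
    vm = Vm()
    first = UpdateVm(vm, "SD1", scope)
    first.execute()
    second = UpdateVm(vm, "SD2", scope)
    return second.execute()
```

In this model `second_edit_accepted(LockScope.EXECUTION)` returns `True` (the overlapping edit gets through, which is the race), while `second_edit_accepted(LockScope.COMMAND)` returns `False` because the first edit still holds the lock.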

Moving to storage for further investigation.

Comment 5 Polina 2017-11-21 08:49:10 UTC
The negative effect of the bug is even worse: afterwards the leases in the HA tab disappear, and it is impossible to choose an HA lease.

Comment 6 Polina 2018-01-22 11:32:53 UTC
Verified in environment compute-ge-he-2.qa.lab.tlv.redhat.com,
version ovirt-engine-4.1.9-0.2.el7.noarch.

Swapping leases is no longer allowed: you must remove the lease and then choose a new one. No hanging tasks.

Comment 7 Sandro Bonazzola 2018-01-24 10:41:10 UTC
This bugzilla is included in oVirt 4.1.9 release, published on Jan 24th 2018.

Since the problem described in this bug report should be resolved in oVirt 4.1.9 release, published on Jan 24th 2018, it has been closed with a resolution of CURRENT RELEASE.

If the solution does not work for you, please open a new bug report.

