Bug 1584885 - VM remains migrating forever with no Host (actually doesn't exist) after StopVmCommand fails to DestroyVDS
Summary: VM remains migrating forever with no Host (actually doesn't exist) after Stop...
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Virtualization Manager
Classification: Red Hat
Component: ovirt-engine
Version: 4.2.3
Hardware: x86_64
OS: Linux
Priority: high
Severity: urgent
Target Milestone: ovirt-4.2.4
Target Release: ---
Assignee: Arik
QA Contact: meital avital
URL:
Whiteboard:
Depends On: 1558709
Blocks:
Reported: 2018-05-31 21:08 UTC by Koutuk Shukla
Modified: 2022-07-09 09:56 UTC (History)

Fixed In Version: ovirt-engine-4.2.4
Doc Type: If docs needed, set a value
Doc Text:
Clone Of: 1558709
Environment:
Last Closed: 2018-06-27 10:02:42 UTC
oVirt Team: Virt
Target Upstream Version:
Embargoed:
lsvaty: testing_plan_complete-


Links
System ID Private Priority Status Summary Last Updated
Red Hat Knowledge Base (Solution) 3465821 0 None None None 2018-07-25 12:48:20 UTC
Red Hat Product Errata RHSA-2018:2071 0 None None None 2018-06-27 10:03:32 UTC
oVirt gerrit 90938 0 None MERGED core: cleanup in destroy-vm parameters 2020-12-23 22:22:55 UTC
oVirt gerrit 90939 0 None MERGED core: do not remove vm async running command on power-off 2020-12-23 22:23:29 UTC
oVirt gerrit 90940 0 None MERGED core: remove async vm running command only on successful destroy 2020-12-23 22:22:55 UTC
oVirt gerrit 90941 0 None MERGED core: omit misleading log during vm migration 2020-12-23 22:22:57 UTC
oVirt gerrit 90942 0 None MERGED core: fix power-off migrating vm 2020-12-23 22:22:57 UTC
oVirt gerrit 91059 0 None MERGED core: ignore noVM when monitoring destroys migrating vm 2020-12-23 22:22:58 UTC
oVirt gerrit 91199 0 None MERGED core: cleanup in destroy-vm parameters 2020-12-23 22:22:58 UTC
oVirt gerrit 91200 0 None MERGED core: do not remove vm async running command on power-off 2020-12-23 22:22:56 UTC
oVirt gerrit 91201 0 None MERGED core: remove async vm running command only on successful destroy 2020-12-23 22:22:56 UTC
oVirt gerrit 91202 0 None MERGED core: omit misleading log during vm migration 2020-12-23 22:22:58 UTC
oVirt gerrit 91203 0 None MERGED core: fix power-off migrating vm 2020-12-23 22:22:56 UTC
oVirt gerrit 91204 0 None MERGED core: ignore noVM when monitoring destroys migrating vm 2020-12-23 22:22:58 UTC

Description Koutuk Shukla 2018-05-31 21:08:38 UTC
+++ This bug was initially created as a clone of Bug #1558709 +++

Description of problem: VM remains migrating forever with no Host (actually doesn't exist) after StopVmCommand fails to DestroyVDS


Version-Release number of selected component (if applicable): rhv-release-4.2.3.8-0.1.el7


How reproducible: 30%


Steps to Reproduce:
1. Create an affinity group under the given cluster with:
{'name': 'affinity_enforcement_01', 'cluster_name': 'golden_env_mixed_1', 
'vms_rule': {'enabled': False}, 
'hosts_rule': {'positive': True, 'enforcing': True}, 
'hosts': ['host_mixed_1'], 
'vms': ['golden_env_mixed_virtio_1_0']}
2. Start the VM with RunOnce on host host_mixed_2.
3. Wait for balancing to migrate the VM to host_mixed_1.
4. Stop the VM.
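
The scenario in the steps above can be sketched in Python to show why balancing is expected to migrate the VM in step 3. The dictionary mirrors the parameters from step 1; `violates_host_affinity` is a hypothetical helper written for this illustration and is not part of oVirt or of the automation framework.

```python
# Affinity group parameters copied from step 1.
affinity_group = {
    "name": "affinity_enforcement_01",
    "cluster_name": "golden_env_mixed_1",
    "vms_rule": {"enabled": False},
    "hosts_rule": {"positive": True, "enforcing": True},
    "hosts": ["host_mixed_1"],
    "vms": ["golden_env_mixed_virtio_1_0"],
}


def violates_host_affinity(group, vm_name, current_host):
    """Hypothetical helper: True if the VM's placement breaks an enforcing host rule."""
    rule = group["hosts_rule"]
    if vm_name not in group["vms"] or not rule.get("enforcing"):
        return False
    on_group_host = current_host in group["hosts"]
    # Positive rule: the VM must run on a group host.
    # Negative rule: the VM must not run on a group host.
    return not on_group_host if rule["positive"] else on_group_host


vm = "golden_env_mixed_virtio_1_0"
# Step 2 places the VM on host_mixed_2, which breaks the rule,
# so balancing (step 3) should move it to host_mixed_1.
print(violates_host_affinity(affinity_group, vm, "host_mixed_2"))  # True
print(violates_host_affinity(affinity_group, vm, "host_mixed_1"))  # False
```

The stop in step 4 is then issued while, or shortly after, that balancing migration runs, which is the window in which the failure below was seen.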

Actual result: Sometimes this scenario ends with a failure:
               'org.ovirt.engine.core.bll.StopVmCommand' failed: EngineException: org.ovirt.engine.core.vdsbroker.vdsbroker.VDSErrorException: VDSGenericException: VDSErrorException: Failed to DestroyVDS, error = Virtual machine does not exist: {'vmId': '8114111b-dc24-4321-ab50-97f9b6fa1f6b'}, code = 1 (Failed with error noVM and code 1)
               ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (default task-24) [vms_syncAction_0af46fa7-91b9-450c] EVENT_ID: USER_FAILED_STOP_VM(56), Failed to power off VM golden_env_mixed_virtio_1_0 (Host: host_mixed_2, User: admin@internal-authz).   
               StopVmCommand] (default task-1) [vms_syncAction_2d443c2e-323e-4f67] Strange, according to the status 'MigratingTo' virtual machine '8114111b-dc24-4321-ab50-97f9b6fa1f6b' should be running in a host but it isn't.

Expected: The VM is stopped with no errors.
Logs and a screenshot from the UI are attached.

Additional info: logs attached
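
When scanning engine logs for this failure, the error name and code can be pulled out of the message text. The following is a minimal sketch, assuming the `Failed with error <name> and code <n>` suffix seen in the log excerpt above is stable; the regular expression and `parse_vds_error` are illustrative only and are not an oVirt API.

```python
import re

# Matches the trailing "(Failed with error noVM and code 1)" part of the message.
_VDS_ERROR = re.compile(r"Failed with error (?P<name>\w+) and code (?P<code>\d+)")


def parse_vds_error(line):
    """Return (error_name, code) from a log line, or None if the suffix is absent."""
    match = _VDS_ERROR.search(line)
    if match is None:
        return None
    return match.group("name"), int(match.group("code"))


# Message text taken from the failure reported in this bug.
line = (
    "VDSGenericException: VDSErrorException: Failed to DestroyVDS, "
    "error = Virtual machine does not exist: "
    "{'vmId': '8114111b-dc24-4321-ab50-97f9b6fa1f6b'}, "
    "code = 1 (Failed with error noVM and code 1)"
)
print(parse_vds_error(line))  # ('noVM', 1)
```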

--- Additional comment from Michal Skrivanek on 2018-03-21 02:28:06 EDT ---

This should be reproducible with any migration

--- Additional comment from Arik on 2018-05-08 08:51:09 EDT ---

Increasing the severity, as the user has no way of recovering from this state (VM in MigratingTo state while run_on_vds=NULL) without manually changing the database.
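
A minimal sketch of how the stuck state described above could be spotted, assuming VM records have already been exported as dictionaries. The keys `vm_name`, `status` and `run_on_vds` and the function `find_stuck_vms` are hypothetical; the real engine database represents these values differently and is not queried here.

```python
def find_stuck_vms(rows):
    """Return names of VMs reported as MigratingTo with no host assigned."""
    return [
        row["vm_name"]
        for row in rows
        if row["status"] == "MigratingTo" and row["run_on_vds"] is None
    ]


# Illustrative records only; the host value is a placeholder.
rows = [
    {"vm_name": "golden_env_mixed_virtio_1_0", "status": "MigratingTo", "run_on_vds": None},
    {"vm_name": "healthy_vm", "status": "Up", "run_on_vds": "host_mixed_1"},
]
print(find_stuck_vms(rows))  # ['golden_env_mixed_virtio_1_0']
```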

--- Additional comment from Polina on 2018-05-31 11:12:07 EDT ---

The bug is verified in version rhvm-4.2.4-0.1.el7.noarch.
Verified with the steps from the description and also with the automation tests (TestEnforcementUnderHostAffinity01/02 classes in art/tests/rhevmtests/compute/sla/scheduler_tests/affinity_host_to_vm/affinity_host_to_vm_test.py).

Comment 3 Sandro Bonazzola 2018-06-01 08:32:36 UTC
The bug is verified in version rhvm-4.2.4-0.1.el7.noarch

Comment 6 errata-xmlrpc 2018-06-27 10:02:42 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2018:2071

Comment 7 Franta Kust 2019-05-16 13:04:10 UTC
BZ<2>Jira Resync

Comment 8 Daniel Gur 2019-08-28 13:12:07 UTC
sync2jira

Comment 9 Daniel Gur 2019-08-28 13:16:19 UTC
sync2jira

