Created attachment 1051397 [details]
logs

Description of problem:
VM appears as external although it was removed by the engine.

Version-Release number of selected component (if applicable):
ovirt 3.6.0-03
oVirt Engine Version: 3.6.0-0.0.master.20150627185750.git6f063c1.el6

How reproducible:
Happened once in an automation test:
https://rhev-jenkins.rhev-ci-vms.eng.rdu2.redhat.com/job/3.6-GE-Tier1/1/consoleFull

Steps to Reproduce:
1. Create a VM from Glance
2. Start the VM
3. Stop and remove the VM

Actual results:
The VM is removed from the engine but then reappears as an external VM. In addition, the external VM is stuck in migration. See the log excerpts below.

Expected results:
The VM is removed from the engine.

Additional info:

Engine log:

VM name: clone_vm_from_snapshot_vm_nfs
VM id: b695dc60-4440-4964-ba82-b71d1fad4524

VM created:
2015-07-12 10:33:06,121 INFO [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (default task-2) [24da2226] Correlation ID: vms_create_a7ef3750-3e28-45c1, Job ID: e1907544-c05c-4c69-9d6c-84ac10782c0b, Call Stack: null, Custom Event ID: -1, Message: VM clone_vm_from_snapshot_vm_nfs was created by admin@internal.

VM up and running:
2015-07-12 10:33:34,488 INFO [org.ovirt.engine.core.vdsbroker.VmAnalyzer] (ForkJoinPool-1-worker-338) [] VM 'b695dc60-4440-4964-ba82-b71d1fad4524'(clone_vm_from_snapshot_vm_nfs) moved from 'PoweringUp' --> 'Up'
2015-07-12 10:33:34,499 INFO [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (ForkJoinPool-1-worker-338) [] Correlation ID: vms_syncAction_2342818c-5172-4e41, Job ID: bbf5a938-6ae3-432c-b550-b5337ca41b82, Call Stack: null, Custom Event ID: -1, Message: VM clone_vm_from_snapshot_vm_nfs started on Host host_mixed_1

Engine internal error:
2015-07-12 10:33:44,544 INFO [org.ovirt.engine.core.vdsbroker.VmAnalyzer] (DefaultQuartzScheduler_Worker-89) [] VM 'b695dc60-4440-4964-ba82-b71d1fad4524(clone_vm_from_snapshot_vm_nfs) is running in db and not running in VDS 'host_mixed_4'

VM removed:
2015-07-12 10:33:47,195 INFO [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (default task-27) [] Correlation ID: vms_delete_0adfcfb6-f0f7-4b7e, Job ID: 9b99faad-4908-4bc3-9a08-b672380cf135, Call Stack: null, Custom Event ID: -1, Message: VM clone_vm_from_snapshot_vm_nfs was successfully removed.

Failed to monitor VM:
2015-07-12 10:33:47,962 ERROR [org.ovirt.engine.core.vdsbroker.jsonrpc.EventVMStatsRefresher] (ForkJoinPool-1-worker-339) [] failed to fetch VM 'b695dc60-4440-4964-ba82-b71d1fad4524' from db. Status remain unchanged

VM appears as external on the engine:
2015-07-12 10:33:48,908 INFO [org.ovirt.engine.core.vdsbroker.VmsMonitoring] (DefaultQuartzScheduler_Worker-77) [] Importing VM 'clone_vm_from_snapshot_vm_nfs' as 'external-clone_vm_from_snapshot_vm_nfs', as it is running on the on Host, but does not exist in the engine.
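The sequence above boils down to a check-then-act race: the DB row is deleted at 10:33:47,195, but the host report processed at 10:33:48,908 still lists the VM, so the lookup misses and the engine re-imports its own just-removed VM. A minimal sketch of the decision involved, with illustrative names rather than the real VmsMonitoring/VmAnalyzer code, assuming monitoring checks each host-reported VM against the engine database:

// Illustrative sketch only -- not the actual oVirt engine code.
import java.util.UUID;

public class VmsMonitoringSketch {

    /** Stand-in for the engine DB lookup of a VM row. */
    interface VmDao {
        boolean exists(UUID vmId);
    }

    private final VmDao vmDao;

    public VmsMonitoringSketch(VmDao vmDao) {
        this.vmDao = vmDao;
    }

    /**
     * Invoked for every VM the host (VDSM) reports as running.
     * Race window: RemoveVmCommand deletes the DB row while a host
     * report built before the destroy completed is still in flight,
     * so the lookup misses and the VM looks "unknown".
     */
    public void onVmReported(UUID vmId, String vmName) {
        if (!vmDao.exists(vmId)) {
            // "running on the Host, but does not exist in the engine"
            importAsExternal(vmId, "external-" + vmName);
        }
    }

    private void importAsExternal(UUID vmId, String externalName) {
        // The real engine persists a new VM entity flagged as external here.
        System.out.println("Importing " + vmId + " as " + externalName);
    }
}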
vdsm.log:

Thread-503446::DEBUG::2015-07-12 10:33:43,747::__init__::496::jsonrpc.JsonRpcServer::(_serveRequest) Calling 'VM.destroy' in bridge with {u'vmID': u'b695dc60-4440-4964-ba82-b71d1fad4524'}
Thread-503446::INFO::2015-07-12 10:33:43,784::API::336::vds::(destroy) vmContainerLock acquired by vm b695dc60-4440-4964-ba82-b71d1fad4524

Migration:

2015-07-12 10:58:42,617 INFO [org.ovirt.engine.core.bll.InternalMigrateVmCommand] (ForkJoinPool-1-worker-371) [] Lock freed to object 'EngineLock:{exclusiveLocks='[b695dc60-4440-4964-ba82-b71d1fad4524=<VM, ACTION_TYPE_FAILED_VM_IS_BEING_MIGRATED$VmName external-clone_vm_from_snapshot_vm_nfs>]', sharedLocks='null'}'
2015-07-12 10:58:42,625 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.DestroyVDSCommand] (ForkJoinPool-1-worker-371) [] START, DestroyVDSCommand(HostName = host_mixed_3, DestroyVmVDSCommandParameters:{runAsync='true', hostId='5325ea4d-72ac-46a5-9437-27a627cd6d79', vmId='b695dc60-4440-4964-ba82-b71d1fad4524', force='true', secondsToWait='0', gracefully='false', reason=''}), log id: 1ad3ebc9
2015-07-12 10:58:42,646 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.DestroyVDSCommand] (ForkJoinPool-1-worker-371) [] FINISH, DestroyVDSCommand, log id: 1ad3ebc9
2015-07-12 10:58:42,650 INFO [org.ovirt.engine.core.vdsbroker.DestroyVmVDSCommand] (ForkJoinPool-1-worker-371) [] FINISH, DestroyVmVDSCommand, return: MigratingFrom, log id: 1c6a15f2
2015-07-12 10:58:42,650 INFO [org.ovirt.engine.core.vdsbroker.VmAnalyzer] (ForkJoinPool-1-worker-371) [] Stopped migrating VM: 'external-clone_vm_from_snapshot_vm_nfs' on VDS: '5325ea4d-72ac-46a5-9437-27a627cd6d79'
2015-07-12 10:58:42,657 INFO [org.ovirt.engine.core.bll.ProcessDownVmCommand] (org.ovirt.thread.pool-8-thread-31) [55c4f73f] Running command: ProcessDownVmCommand internal: true.
2015-07-12 10:58:42,666 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (ForkJoinPool-1-worker-371) [] Correlation ID: null, Call Stack: null, Custom Event ID: -1, Message: Migration failed: Unable to get volume size for domain 6fb7c2dc-0e4b-40b7-a767-b1ff34bb6732 volume a5a07f39-ebf8-4dc1-b1de-6178113e15d2 (VM: external-clone_vm_from_snapshot_vm_nfs, Source: host_mixed_1, Destination: <UNKNOWN>).
2015-07-12 10:58:42,672 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.DestroyVDSCommand] (ForkJoinPool-1-worker-371) [] START, DestroyVDSCommand(HostName = host_mixed_1, DestroyVmVDSCommandParameters:{runAsync='true', hostId='98da3522-d6ab-47e7-8ec2-b5d4aad97203', vmId='b695dc60-4440-4964-ba82-b71d1fad4524', force='false', secondsToWait='0', gracefully='false', reason=''}), log id: b7a213e
2015-07-12 10:58:43,103 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.DestroyVDSCommand] (ForkJoinPool-1-worker-371) [] FINISH, DestroyVDSCommand, log id: b7a213e
2015-07-12 10:58:43,105 INFO [org.ovirt.engine.core.vdsbroker.VmAnalyzer] (ForkJoinPool-1-worker-371) [] VM 'b695dc60-4440-4964-ba82-b71d1fad4524(external-clone_vm_from_snapshot_vm_nfs) is running in db and not running in VDS 'host_mixed_1'

Test link:
https://rhev-jenkins.rhev-ci-vms.eng.rdu2.redhat.com/job/3.6-GE-Tier1/1/consoleFull

Test info:

10:33:07 2015-07-12 10:33:10,395 - MainThread - disks - INFO - Property 'disk->alias' has correct value: clone_vm_from_snapshot_vm_nfs_Disk_glance
10:33:07 2015-07-12 10:33:10,601 - MainThread - vms - INFO - Using Correlation-Id: vms_syncAction_2342818c-5172-4e41
10:33:08 2015-07-12 10:33:11,257 - MainThread - vms - INFO - Waiting for query `name=clone_vm_from_snapshot_vm_nfs and status=up or name=clone_vm_from_snapshot_vm_nfs and status=up` and event_id None up to 600 seconds,sampling every 10 second.
10:33:39 2015-07-12 10:33:42,592 - MainThread - art.rhevm_api.tests_lib.low_level.vms - INFO - Got ip 10.35.118.3
10:33:40 2015-07-12 10:33:43,575 - MainThread - vms - INFO - Using Correlation-Id: vms_syncAction_dcafaa49-8bed-4c9f
10:33:41 2015-07-12 10:33:43,819 - MainThread - api_utils - ERROR - Failed to syncAction element NOT as expected:
10:33:41 Status: 400
10:33:41 Reason: Bad Request
10:33:41 Detail: [Internal Engine Error]
10:33:41
10:33:41 2015-07-12 10:33:43,820 - MainThread - vms - ERROR - Response code is not valid, expected is: [200, 202], actual is: 400
10:33:41 2015-07-12 10:33:43,821 - MainThread - plmanagement.unittest_loader - ERROR - TEST GROUP setUp ERROR: : rhevmtests.storage.storage_clone_vm_from_snapshot
10:33:41 Traceback (most recent call last):
10:33:41   File "/var/lib/jenkins/workspace/3.6-GE-Tier1/automation/ART/art/test_handler/plmanagement/plugins/unittest_test_runner_plugin.py", line 279, in __iter__
10:33:41     self.context.setUp()
10:33:41   File "/usr/lib/python2.6/site-packages/nose/suite.py", line 291, in setUp
10:33:41     self.setupContext(ancestor)
10:33:41   File "/usr/lib/python2.6/site-packages/nose/suite.py", line 314, in setupContext
10:33:41     try_run(context, names)
10:33:41   File "/usr/lib/python2.6/site-packages/nose/util.py", line 469, in try_run
10:33:41     return func()
10:33:41   File "/var/lib/jenkins/workspace/3.6-GE-Tier1/automation/ART/art/tests/rhevmtests/storage/storage_clone_vm_from_snapshot/__init__.py", line 45, in setup_module
10:33:41     assert stopVm(True, vm=vm_name)
10:33:41 AssertionError
10:33:41 2015-07-12 10:33:43,848 - MainThread - plmanagement.unittest_loader - INFO - TEST GROUP tearDown: rhevmtests.storage.storage_clone_vm_from_snapshot
10:33:41 2015-07-12 10:33:43,849 - MainThread - art.rhevm_api.tests_lib.low_level.vms - INFO - Removing vms ['clone_vm_from_snapshot_vm_nfs']
10:33:41 2015-07-12 10:33:44,050 - MainThread - art.rhevm_api.tests_lib.low_level.vms - INFO - Stops vms: ['clone_vm_from_snapshot_vm_nfs']
10:33:42 2015-07-12 10:33:44,989 - MainThread - art.rhevm_api.tests_lib.low_level.vms - INFO - Got ip 10.35.118.3
10:33:43 2015-07-12 10:33:45,813 - MainThread - vms - INFO - Using Correlation-Id: vms_syncAction_5ecd886c-deb4-423c
10:33:43 2015-07-12 10:33:46,085 - MainThread - vms - INFO - vm status is 'down'
10:33:43 2015-07-12 10:33:46,383 - VM removing - vms - INFO - Using Correlation-Id: vms_delete_0adfcfb6-f0f7-4b7e
10:33:44 2015-07-12 10:33:47,200 - MainThread - art.rhevm_api.tests_lib.low_level.vms - INFO - set([<Thread(VM removing, stopped 140375650621184)>])
10:33:44 2015-07-12 10:33:47,201 - MainThread - art.rhevm_api.tests_lib.low_level.vms - INFO - VM 'clone_vm_from_snapshot_vm_nfs' deleted asynchronously.
10:33:44 2015-07-12 10:33:47,330 - MainThread - test_utils - INFO - All ['clone_vm_from_snapshot_vm_nfs'] are gone.
Will try to fix event sending and handling (https://gerrit.ovirt.org/#/q/topic:vdsm_driven_events5 and the corresponding engine-side change https://gerrit.ovirt.org/#/c/43450/, plus a few others) and then try to reproduce.
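For the record, here is one way such a race can be closed on the engine side. This is a hedged sketch under my own assumptions, NOT the code under review in the gerrit links above, and all names in it are hypothetical: make the import decision removal-aware, so that an event or poll arriving just after RemoveVmCommand does not look like an unknown external VM.

// Hedged illustration of the idea; all names are hypothetical.
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.TimeUnit;

public class RemovalAwareImportGate {

    /** VM ids the engine removed recently, mapped to removal time (nanos). */
    private final ConcurrentHashMap<UUID, Long> recentlyRemoved =
            new ConcurrentHashMap<>();

    /** Stale host reports should die out well within this window. */
    private static final long GRACE_NANOS = TimeUnit.MINUTES.toNanos(5);

    /** The removal flow records the id before destroying the VM on the host. */
    public void markRemoved(UUID vmId) {
        recentlyRemoved.put(vmId, System.nanoTime());
    }

    /** Monitoring asks this before importing an unknown reported VM as external. */
    public boolean mayImportAsExternal(UUID vmId) {
        Long removedAt = recentlyRemoved.get(vmId);
        if (removedAt == null) {
            return true; // never seen by this gate: genuinely external
        }
        if (System.nanoTime() - removedAt > GRACE_NANOS) {
            recentlyRemoved.remove(vmId); // entry expired; treat as external again
            return true;
        }
        return false; // just removed by the engine: stale report, skip the import
    }
}

With event-driven monitoring the same gate would sit in the event handler, which is consistent with the "event sending and handling" direction mentioned above.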
RHEVM version: 3.6.0-0.12.master.el6
VDSM: vdsm-4.17.3-1.el7ev (RHEL 7.2)

Verified manually and in automation.