Bug 965172 - "Internal Engine Error" after ongoing live migration times out.
Summary: "Internal Engine Error" after ongoing live migration times out.
Keywords:
Status: CLOSED DUPLICATE of bug 970645
Alias: None
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: libvirt
Version: 6.4
Hardware: All
OS: Linux
Priority: medium
Severity: medium
Target Milestone: rc
Target Release: ---
Assignee: Martin Kletzander
QA Contact: Virtualization Bugs
URL:
Whiteboard: virt
Depends On:
Blocks:
 
Reported: 2013-05-20 15:23 UTC by Julio Entrena Perez
Modified: 2018-12-05 16:00 UTC
CC List: 19 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2013-08-15 14:05:18 UTC
Target Upstream Version:
Embargoed:


Attachments: (none)


Links:
System: Red Hat Knowledge Base (Solution)
ID: 377203
Private: 0
Priority: None
Status: None
Summary: None
Last Updated: Never

Description Julio Entrena Perez 2013-05-20 15:23:58 UTC
Description of problem:
"Internal Engine Error" is reported by destination host after an ongoing live migration times out ('migration_timeout' reached).
Webadmin portal presents user with unhelpful error message which doesn't provide any hint of what did go wrong.

Version-Release number of selected component (if applicable):
vdsm-4.10.2-1.13.el6ev
rhevm-3.1.0-53.el6ev

How reproducible:
Unknown; it has happened once so far.

Steps to Reproduce:
1. Start a live migration that won't complete within the default 'migration_timeout' of 300 seconds (see the sketch after these steps for checking the configured value).
2. Wait until those 5 minutes have elapsed.
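
For reference, a minimal sketch of checking the timeout configured on a host before reproducing. The /etc/vdsm/vdsm.conf path and the [vars] 'migration_timeout' key are assumptions here, not verified against this exact vdsm build; 300 seconds is the default quoted above.

# Hypothetical helper: report the live-migration timeout configured on a host.
# The [vars]/migration_timeout key and the /etc/vdsm/vdsm.conf path are
# assumptions; if the key is absent, fall back to the 300-second default
# quoted in this report.
from ConfigParser import ConfigParser  # Python 2.6, matching the hosts above

DEFAULT_MIGRATION_TIMEOUT = 300  # seconds

def configured_migration_timeout(conf_path='/etc/vdsm/vdsm.conf'):
    parser = ConfigParser()
    parser.read(conf_path)  # a missing file is silently ignored
    if parser.has_section('vars') and parser.has_option('vars', 'migration_timeout'):
        return parser.getint('vars', 'migration_timeout')
    return DEFAULT_MIGRATION_TIMEOUT

if __name__ == '__main__':
    print 'migration_timeout = %d seconds' % configured_migration_timeout()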

Actual results:
Migration failed due to Error: Internal Engine Error (VM: vm_hostname, Source Host: host_hostname).


Expected results:
A more descriptive error message, e.g. "Migration failed to complete within default timeout of $migration_timeout seconds".
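
A minimal sketch, with hypothetical names, of the kind of message being asked for; this is not the vdsm or RHEV-M implementation, only an illustration of including the elapsed time and configured timeout in the error surfaced to the user.

# Hypothetical illustration only -- not the vdsm/RHEV-M implementation -- of
# the descriptive message requested above.
def migration_timeout_message(elapsed, timeout, vm_name, src_host):
    """Build a user-facing message for a live migration aborted on timeout."""
    return ('Migration of VM %s from host %s aborted: no progress for '
            '%.0f seconds (configured migration_timeout is %d seconds).'
            % (vm_name, src_host, elapsed, timeout))

# Example, using the values seen in the source-host log below:
print migration_timeout_message(307.27, 300, 'vm_hostname', 'host_hostname')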


Additional info:

- Source host:

[...]
Thread-61770::INFO::2013-05-17 12:33:12,750::libvirtvm::413::vm.Vm::(run) vmId=`d6446340-b00a-4068-8778-2227f89776fd`::Migration Progress: 292 seconds elapsed, 50% of data processed, 50% of mem processed
Thread-61770::INFO::2013-05-17 12:33:26,731::libvirtvm::413::vm.Vm::(run) vmId=`d6446340-b00a-4068-8778-2227f89776fd`::Migration Progress: 306 seconds elapsed, 51% of data processed, 51% of mem processed
Thread-61769::DEBUG::2013-05-17 12:33:26,302::libvirtvm::351::vm.Vm::(run) vmId=`d6446340-b00a-4068-8778-2227f89776fd`::setting migration downtime to 250
Thread-61770::WARNING::2013-05-17 12:33:38,616::libvirtvm::399::vm.Vm::(run) vmId=`d6446340-b00a-4068-8778-2227f89776fd`::Migration is stuck: Hasn't progressed in 307.269540071 seconds. Aborting.
Thread-61770::DEBUG::2013-05-17 12:33:42,844::libvirtvm::416::vm.Vm::(stop) vmId=`d6446340-b00a-4068-8778-2227f89776fd`::stopping migration monitor thread
libvirtEventLoop::DEBUG::2013-05-17 12:33:51,554::libvirtvm::2808::vm.Vm::(_onLibvirtLifecycleEvent) vmId=`d6446340-b00a-4068-8778-2227f89776fd`::event Resumed detail 1 opaque None
Thread-61766::DEBUG::2013-05-17 12:33:51,577::libvirtvm::357::vm.Vm::(cancel) vmId=`d6446340-b00a-4068-8778-2227f89776fd`::canceling migration downtime thread
Thread-61766::DEBUG::2013-05-17 12:33:51,579::libvirtvm::416::vm.Vm::(stop) vmId=`d6446340-b00a-4068-8778-2227f89776fd`::stopping migration monitor thread
Thread-61769::DEBUG::2013-05-17 12:33:51,579::libvirtvm::354::vm.Vm::(run) vmId=`d6446340-b00a-4068-8778-2227f89776fd`::migration downtime thread exiting
Thread-61766::ERROR::2013-05-17 12:33:51,581::vm::195::vm.Vm::(_recover) vmId=`d6446340-b00a-4068-8778-2227f89776fd`::operation aborted: migration job: canceled by client
Thread-61766::ERROR::2013-05-17 12:33:58,758::vm::283::vm.Vm::(run) vmId=`d6446340-b00a-4068-8778-2227f89776fd`::Failed to migrate
[...]
Thread-61836::DEBUG::2013-05-17 12:34:00,605::BindingXMLRPC::903::vds::(wrapper) client [2.3.4.5]::call vmGetMigrationStatus with ('d6446340-b00a-4068-8778-2227f89776fd',) {}
Thread-61836::DEBUG::2013-05-17 12:34:00,605::BindingXMLRPC::910::vds::(wrapper) return vmGetMigrationStatus with {'status': {'message': 'Migration canceled', 'code': {'status': {'message': 'Migration not in progress', 'code': 47}}}}


- Destination host:

Thread-50523::ERROR::2013-05-17 12:33:51,488::utils::416::vm.Vm::(collect) vmId=`d6446340-b00a-4068-8778-2227f89776fd`::Stats function failed: <AdvancedStatsFunction _highWrite at 0x21d5d50>
Traceback (most recent call last):
  File "/usr/lib64/python2.6/site-packages/vdsm/utils.py", line 412, in collect
    statsFunction()
  File "/usr/lib64/python2.6/site-packages/vdsm/utils.py", line 287, in __call__
    retValue = self._function(*args, **kwargs)
  File "/usr/share/vdsm/libvirtvm.py", line 109, in _highWrite
    self._vm._dom.blockInfo(vmDrive.path, 0)
  File "/usr/share/vdsm/libvirtvm.py", line 525, in f
    raise toe
TimeoutError: Timed out during operation: cannot acquire state change lock
[...]
libvirtEventLoop::DEBUG::2013-05-17 12:33:51,509::libvirtvm::2808::vm.Vm::(_onLibvirtLifecycleEvent) vmId=`d6446340-b00a-4068-8778-2227f89776fd`::event Stopped detail 5 opaque None
Thread-50523::DEBUG::2013-05-17 12:33:51,509::task::957::TaskManager.Task::(_decref) Task=`579740f0-17e8-4ac9-8605-a078bc168489`::ref 0 aborting False
libvirtEventLoop::INFO::2013-05-17 12:33:51,512::vm::713::vm.Vm::(_onQemuDeath) vmId=`d6446340-b00a-4068-8778-2227f89776fd`::underlying process disconnected
libvirtEventLoop::INFO::2013-05-17 12:33:51,516::libvirtvm::2374::vm.Vm::(releaseVm) vmId=`d6446340-b00a-4068-8778-2227f89776fd`::Release VM resources
[...]
Thread-50546::DEBUG::2013-05-17 12:33:51,733::BindingXMLRPC::903::vds::(wrapper) client [1.2.3.4]::call vmDestroy with ('d6446340-b00a-4068-8778-2227f89776fd',) {}
[...]
Thread-50523::ERROR::2013-05-17 12:33:51,683::libvirtconnection::106::vds::(wrapper) Unknown libvirterror: ecode: 1 edom: 10
Traceback (most recent call last):
  File "/usr/lib64/python2.6/site-packages/vdsm/libvirtconnection.py", line 83, in wrapper
    ret = f(*args, **kwargs)
  File "/usr/lib64/python2.6/site-packages/libvirt.py", line 1849, in getCPUStats
    if ret is None: raise libvirtError ('virDomainGetCPUStats() failed', dom=self)
libvirtError: internal error No such domain <D6>Dc@<B0>
@h<87>x"'<F8><97>v<FD>
[...]
Thread-50546::INFO::2013-05-17 12:33:51,775::API::308::vds::(destroy) vmContainerLock acquired by vm d6446340-b00a-4068-8778-2227f89776fd
[...]
Thread-50546::DEBUG::2013-05-17 12:33:52,066::libvirtvm::2425::vm.Vm::(destroy) vmId=`d6446340-b00a-4068-8778-2227f89776fd`::destroy Called
Thread-50523::ERROR::2013-05-17 12:33:51,792::utils::416::vm.Vm::(collect) vmId=`d6446340-b00a-4068-8778-2227f89776fd`::Stats function failed: <AdvancedStatsFunction _sampleCpu at 0x21d5e30>
Traceback (most recent call last):
  File "/usr/lib64/python2.6/site-packages/vdsm/utils.py", line 412, in collect
    statsFunction()
  File "/usr/lib64/python2.6/site-packages/vdsm/utils.py", line 287, in __call__
    retValue = self._function(*args, **kwargs)
  File "/usr/share/vdsm/libvirtvm.py", line 137, in _sampleCpu
    cpuStats = self._vm._dom.getCPUStats(True, 0)
  File "/usr/share/vdsm/libvirtvm.py", line 517, in f
    ret = attr(*args, **kwargs)
  File "/usr/lib64/python2.6/site-packages/vdsm/libvirtconnection.py", line 83, in wrapper
    ret = f(*args, **kwargs)
  File "/usr/lib64/python2.6/site-packages/libvirt.py", line 1849, in getCPUStats
    if ret is None: raise libvirtError ('virDomainGetCPUStats() failed', dom=self)
libvirtError: internal error No such domain <D6>Dc@<B0>
@h<87>x"'<F8><97>v<FD>
[...]
Thread-50523::ERROR::2013-05-17 12:33:52,135::libvirtconnection::106::vds::(wrapper) Unknown libvirterror: ecode: 42 edom: 10
Traceback (most recent call last):
  File "/usr/lib64/python2.6/site-packages/vdsm/libvirtconnection.py", line 83, in wrapper
    ret = f(*args, **kwargs)
  File "/usr/lib64/python2.6/site-packages/libvirt.py", line 1820, in blockStats
    if ret is None: raise libvirtError ('virDomainBlockStats() failed', dom=self)
libvirtError: Domain not found: no domain with matching uuid 'd6446340-b00a-4068-8778-2227f89776fd'
[...]
Thread-50523::ERROR::2013-05-17 12:33:52,239::libvirtconnection::106::vds::(wrapper) Unknown libvirterror: ecode: 42 edom: 10
Traceback (most recent call last):
  File "/usr/lib64/python2.6/site-packages/vdsm/libvirtconnection.py", line 83, in wrapper
    ret = f(*args, **kwargs)
  File "/usr/lib64/python2.6/site-packages/libvirt.py", line 1826, in blockStatsFlags
    if ret is None: raise libvirtError ('virDomainBlockStatsFlags() failed', dom=self)
libvirtError: Domain not found: no domain with matching uuid 'd6446340-b00a-4068-8778-2227f89776fd'
Thread-50523::ERROR::2013-05-17 12:33:52,286::libvirtconnection::106::vds::(wrapper) Unknown libvirterror: ecode: 42 edom: 10
Traceback (most recent call last):
  File "/usr/lib64/python2.6/site-packages/vdsm/libvirtconnection.py", line 83, in wrapper
    ret = f(*args, **kwargs)
  File "/usr/lib64/python2.6/site-packages/libvirt.py", line 1867, in interfaceStats
    if ret is None: raise libvirtError ('virDomainInterfaceStats() failed', dom=self)
libvirtError: Domain not found: no domain with matching uuid 'd6446340-b00a-4068-8778-2227f89776fd'
libvirtEventLoop::ERROR::2013-05-17 12:33:52,364::libvirtconnection::106::vds::(wrapper) Unknown libvirterror: ecode: 42 edom: 10
Traceback (most recent call last):
  File "/usr/lib64/python2.6/site-packages/vdsm/libvirtconnection.py", line 83, in wrapper
    ret = f(*args, **kwargs)
  File "/usr/lib64/python2.6/site-packages/libvirt.py", line 381, in XMLDesc
    if ret is None: raise libvirtError ('virDomainGetXMLDesc() failed', dom=self)
libvirtError: Domain not found: no domain with matching uuid 'd6446340-b00a-4068-8778-2227f89776fd'
libvirtEventLoop::DEBUG::2013-05-17 12:33:52,380::utils::353::vm.Vm::(stop) vmId=`d6446340-b00a-4068-8778-2227f89776fd`::Stop statistics collection
libvirtEventLoop::DEBUG::2013-05-17 12:33:52,394::vmChannels::208::vds::(unregister) Delete fileno 89 from listener.
Thread-50523::DEBUG::2013-05-17 12:33:52,394::utils::384::vm.Vm::(run) vmId=`d6446340-b00a-4068-8778-2227f89776fd`::Stats thread finished
[...]
libvirtEventLoop::ERROR::2013-05-17 12:33:52,420::libvirtconnection::106::vds::(wrapper) Unknown libvirterror: ecode: 42 edom: 10
Traceback (most recent call last):
  File "/usr/lib64/python2.6/site-packages/vdsm/libvirtconnection.py", line 83, in wrapper
    ret = f(*args, **kwargs)
  File "/usr/lib64/python2.6/site-packages/libvirt.py", line 755, in destroyFlags
    if ret == -1: raise libvirtError ('virDomainDestroyFlags() failed', dom=self)
libvirtError: Domain not found: no domain with matching uuid 'd6446340-b00a-4068-8778-2227f89776fd'
[...]
Thread-50546::DEBUG::2013-05-17 12:33:52,625::libvirtvm::2419::vm.Vm::(deleteVm) vmId=`d6446340-b00a-4068-8778-2227f89776fd`::Total desktops after destroy of d6446340-b00a-4068-8778-2227f89776fd is 75
Thread-50546::DEBUG::2013-05-17 12:33:52,632::BindingXMLRPC::910::vds::(wrapper) return vmDestroy with {'status': {'message': 'Machine destroyed', 'code': 0}}
libvirtEventLoop::ERROR::2013-05-17 12:33:52,636::libvirtconnection::106::vds::(wrapper) Unknown libvirterror: ecode: 42 edom: 10
Traceback (most recent call last):
  File "/usr/lib64/python2.6/site-packages/vdsm/libvirtconnection.py", line 83, in wrapper
    ret = f(*args, **kwargs)
  File "/usr/lib64/python2.6/site-packages/libvirt.py", line 381, in XMLDesc
    if ret is None: raise libvirtError ('virDomainGetXMLDesc() failed', dom=self)
libvirtError: Domain not found: no domain with matching uuid 'd6446340-b00a-4068-8778-2227f89776fd'
libvirtEventLoop::DEBUG::2013-05-17 12:33:52,657::vm::1027::vm.Vm::(setDownStatus) vmId=`d6446340-b00a-4068-8778-2227f89776fd`::Changed state to Down: Admin shut down
libvirtEventLoop::DEBUG::2013-05-17 12:33:52,664::utils::353::vm.Vm::(stop) vmId=`d6446340-b00a-4068-8778-2227f89776fd`::Stop statistics collection
libvirtEventLoop::ERROR::2013-05-17 12:33:52,681::libvirtconnection::106::vds::(wrapper) Unknown libvirterror: ecode: 42 edom: 10
Traceback (most recent call last):
  File "/usr/lib64/python2.6/site-packages/vdsm/libvirtconnection.py", line 83, in wrapper
    ret = f(*args, **kwargs)
  File "/usr/lib64/python2.6/site-packages/libvirt.py", line 381, in XMLDesc
    if ret is None: raise libvirtError ('virDomainGetXMLDesc() failed', dom=self)
libvirtError: Domain not found: no domain with matching uuid 'd6446340-b00a-4068-8778-2227f89776fd'


- Manager:

2013-05-17 12:34:00,521 INFO  [org.ovirt.engine.core.vdsbroker.VdsUpdateRunTimeInfo] (QuartzScheduler_Worker-18) VM vm_hostname d6446340-b00a-4068-8778-2227f89776fd moved from MigratingFrom --> Up
2013-05-17 12:34:00,521 INFO  [org.ovirt.engine.core.vdsbroker.VdsUpdateRunTimeInfo] (QuartzScheduler_Worker-18) adding VM d6446340-b00a-4068-8778-2227f89776fd to re-run list
2013-05-17 12:34:00,522 ERROR [org.ovirt.engine.core.vdsbroker.VdsUpdateRunTimeInfo] (QuartzScheduler_Worker-18) Rerun vm d6446340-b00a-4068-8778-2227f89776fd. Called from vds src_hostname
2013-05-17 12:34:00,569 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.MigrateStatusVDSCommand] (pool-3-thread-49) START, MigrateStatusVDSCommand(HostName = src_hostname
, HostId = 1a62f776-695e-11e2-a97a-fb8bf5530f36, vmId=d6446340-b00a-4068-8778-2227f89776fd), log id: 3b3e8edd
2013-05-17 12:34:00,607 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.VdsBrokerCommand] (pool-3-thread-49) Failed in MigrateStatusVDS method, for vds: src_hostname
; host: 1.2.3.4
2013-05-17 12:34:00,607 ERROR [org.ovirt.engine.core.vdsbroker.VDSCommandBase] (pool-3-thread-49) Command MigrateStatusVDS execution failed. Exception: ClassCastException: java.util.HashMap cannot be cast to java.lang.Integer
2013-05-17 12:34:00,607 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.MigrateStatusVDSCommand] (pool-3-thread-49) FINISH, MigrateStatusVDSCommand, log id: 3b3e8edd
[...]
2013-05-17 12:34:00,781 WARN  [org.ovirt.engine.core.bll.MigrateVmCommand] (pool-3-thread-49) CanDoAction of action MigrateVm failed. Reasons:ACTION_TYPE_FAILED_VDS_VM_CLUSTER,VAR__ACTION__MIGRATE,VAR__TYPE__VM
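
Reading the source-host and Manager logs together, one observation (offered as a guess, not a confirmed root cause): at 12:34:00,605 the vmGetMigrationStatus reply nests a whole status dict under 'code', while MigrateStatusVDSCommand apparently expects an integer there, which matches the ClassCastException above and leaves the engine with only a generic "Internal Engine Error" to show. A hedged Python sketch of the two shapes, with the exact field layout assumed:

# Hedged sketch of the two response shapes, based only on the log lines above;
# the real protocol layout is assumed, not taken from vdsm or engine source.

# Shape the engine's MigrateStatusVDSCommand appears to expect: a flat status
# with an integer 'code' (the value 0 here is purely illustrative).
expected = {'status': {'message': 'Migration canceled', 'code': 0}}

# Shape the source host actually returned at 12:34:00,605 (see its log above):
# the 'code' field holds another status dict instead of an integer.
returned = {'status': {'message': 'Migration canceled',
                       'code': {'status': {'message': 'Migration not in progress',
                                           'code': 47}}}}

# A client that casts returned['status']['code'] to an integer fails exactly
# like the Manager log above ("HashMap cannot be cast to java.lang.Integer").
assert isinstance(expected['status']['code'], int)
assert not isinstance(returned['status']['code'], int)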


Uncompressed sosreports of both hosts, the manager and the guest are also available at http://jentrena-xw8600.usersys.redhat.com/00846482/ .

Comment 3 Michal Skrivanek 2013-05-27 12:22:46 UTC
The errors on the destination host point to some internal libvirt problem. Please take a look...

Comment 8 Martin Kletzander 2013-08-14 14:39:59 UTC
It is very hard to find the cause in these logs.  What is the name of the machine that didn't migrate?  Grepping for all the errors and looking at the logs shown in this bz, I can't find anything unusual in this bug.  Can you point me in the right direction?

Comment 9 Martin Kletzander 2013-08-15 14:05:18 UTC
As mentioned in comment #6, this is already tracked in bug 970645.  After discussing with Michal and going through the logs, I've found no other problem beyond the one mentioned, so I'm closing this as a DUPLICATE.

*** This bug has been marked as a duplicate of bug 970645 ***

