Bug 1252099

Summary: Migration failed: libvirtError in migrateToURI3

Product: Red Hat Enterprise Linux 7
Component: qemu-kvm-rhev
Version: 7.1
Status: CLOSED NEXTRELEASE
Severity: urgent
Priority: unspecified
Reporter: Israel Pinto <ipinto>
Assignee: Ademar Reis <areis>
QA Contact: Virtualization Bugs <virt-bugs>
Docs Contact:
CC: areis, bazulay, dyuan, ecohen, fjin, fromani, gklein, huding, ipinto, jsuchane, juzhang, knoel, lpeer, lsurette, michal.skrivanek, rbalakri, stefanha, tjelinek, virt-bugs, virt-maint, xfu, ycui, yeylon, zhwang
Target Milestone: pre-dev-freeze
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard: virt
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Last Closed: 2015-09-22 14:24:08 UTC
Attachments (all flags: none):
  vdsm and engine logs
  update_vdsm_engine_logs
  New_vdsm_ &&_engine_logs
  libvirt_logs_host1
  libvirt_logs_host_2

Description Israel Pinto 2015-08-10 17:26:22 UTC
Description of problem:
Migrating a VM with IO threads fails with the error:
libvirtError: Requested operation is not valid: domain 'vm_with_ui' is already active

Version-Release number of selected component (if applicable):
3.6.0-5
3.6.0-0.0.master.20150804111407.git122a3a0.el6
VDSM: vdsm-4.17.0-1239.git6575e3f.el7.noarch

How reproducible:
Not all the time.

Steps to Reproduce:
Migrate a VM manually.

Actual results:
Migration failed

Expected results:
Migration succeeds


Additional info:
VDSM log (host 10.35.161.159):
Thread-45223::ERROR::2015-08-10 19:56:24,132::migration::310::virt.vm::(run) vmId=`b88b433c-7eeb-4236-83f9-1997c9f9d0eb`::Failed to migrate
Traceback (most recent call last):
  File "/usr/share/vdsm/virt/migration.py", line 294, in run
    self._startUnderlyingMigration(time.time())
  File "/usr/share/vdsm/virt/migration.py", line 364, in _startUnderlyingMigration
    self._perform_migration(duri, muri)
  File "/usr/share/vdsm/virt/migration.py", line 403, in _perform_migration
    self._vm._dom.migrateToURI3(duri, params, flags)
  File "/usr/share/vdsm/virt/vm.py", line 206, in f
    ret = attr(*args, **kwargs)
  File "/usr/lib/python2.7/site-packages/vdsm/libvirtconnection.py", line 124, in wrapper
    ret = f(*args, **kwargs)
  File "/usr/lib64/python2.7/site-packages/libvirt.py", line 1712, in migrateToURI3
    if ret == -1: raise libvirtError ('virDomainMigrateToURI3() failed', dom=self)
libvirtError: Requested operation is not valid: domain 'vm_with_ui' is already active

Engine log:
2015-08-10 19:57:31,584 WARN  [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (org.ovirt.thread.pool-8-thread-24) [16b06d51] Correlation ID: 111e8cf8, Job ID: 617e72d7-b90f-48de-a6b5-5923edd6860a, Call Stack: null, Custom Event ID: -1, Message: Failed to migrate VM vm_with_ui to Host host_2 . Trying to migrate to another Host.
2015-08-10 19:57:31,823 WARN  [org.ovirt.engine.core.bll.MigrateVmCommand] (org.ovirt.thread.pool-8-thread-24) [16b06d51] CanDoAction of action 'MigrateVm' failed for user admin@internal. Reasons: VAR__ACTION__MIGRATE,VAR__TYPE__VM,VAR__ACTION__MIGRATE,VAR__TYPE__VM,VAR__ACTION__MIGRATE,VAR__TYPE__VM,SCHEDULING_NO_HOSTS
2015-08-10 19:57:31,824 INFO  [org.ovirt.engine.core.bll.MigrateVmCommand] (org.ovirt.thread.pool-8-thread-24) [16b06d51] Lock freed to object 'EngineLock:{exclusiveLocks='[b88b433c-7eeb-4236-83f9-1997c9f9d0eb=<VM, ACTION_TYPE_FAILED_VM_IS_BEING_MIGRATED$VmName vm_with_ui>]', sharedLocks='null'}'
2015-08-10 19:57:31,845 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (org.ovirt.thread.pool-8-thread-24) [16b06d51] Correlation ID: 111e8cf8, Job ID: 617e72d7-b90f-48de-a6b5-5923edd6860a, Call Stack: null, Custom Event ID: -1, Message: Migration failed, No available host found (VM: vm_with_ui, Source: host_1).
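
For context on where the failure above is raised: the VDSM traceback shows vm.py wrapping libvirt's virDomain.migrateToURI3(). A minimal libvirt-python sketch of such a call follows; the destination URI, bandwidth value and flag set are illustrative assumptions, not the exact parameters VDSM passes.

import libvirt

conn = libvirt.open('qemu:///system')          # source host connection
dom = conn.lookupByName('vm_with_ui')          # the domain from this report

duri = 'qemu+tls://dest.example.com/system'    # destination libvirtd URI (hypothetical)
params = {
    libvirt.VIR_MIGRATE_PARAM_BANDWIDTH: 32,   # MiB/s cap, illustrative only
}
flags = (libvirt.VIR_MIGRATE_LIVE |
         libvirt.VIR_MIGRATE_PEER2PEER |
         libvirt.VIR_MIGRATE_ABORT_ON_ERROR)

try:
    dom.migrateToURI3(duri, params, flags)
except libvirt.libvirtError as e:
    # The error from this report ("domain 'vm_with_ui' is already active")
    # surfaces here as a libvirtError raised by migrateToURI3().
    print('migration failed: %s' % e)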

Comment 1 Israel Pinto 2015-08-10 17:28:26 UTC
Created attachment 1061179 [details]
vdsm and engine logs

Comment 2 Israel Pinto 2015-08-11 07:07:34 UTC
Adding logs.
source host log:
Thread-66209::ERROR::2015-08-11 10:05:49,785::vm::1416::virt.vm::(_getRunningVmStats) vmId=`b88b433c-7eeb-4236-83f9-1997c9f9d0eb`::Error fetching vm stats
Traceback (most recent call last):
  File "/usr/share/vdsm/virt/vm.py", line 1413, in _getRunningVmStats
    vm_sample.interval)
  File "/usr/share/vdsm/virt/vmstats.py", line 43, in produce
    disks(vm, stats, first_sample, last_sample, interval)
  File "/usr/share/vdsm/virt/vmstats.py", line 262, in disks
    interval))
  File "/usr/share/vdsm/virt/vmstats.py", line 291, in _disk_rate
    (last_sample['block.%d.rd.bytes' % last_index] -
KeyError: 'block.0.rd.bytes'
Thread-66209::INFO::2015-08-11 10:05:49,810::xmlrpc::92::vds.XMLRPCServer::(_process_requests) Request handler for 127.0.0.1:54663 stopped

destination host log:
JsonRpc (StompReactor)::DEBUG::2015-08-11 10:01:08,968::stompreactor::236::Broker.StompAdapter::(handle_frame) Handling message <StompFrame command=u'SEND'>
Thread-31299::ERROR::2015-08-11 10:01:08,972::vm::780::virt.vm::(_startUnderlyingVm) vmId=`b88b433c-7eeb-4236-83f9-1997c9f9d0eb`::Failed to start a migration destination vm
Traceback (most recent call last):
  File "/usr/share/vdsm/virt/vm.py", line 750, in _startUnderlyingVm
    self._completeIncomingMigration()
  File "/usr/share/vdsm/virt/vm.py", line 2732, in _completeIncomingMigration
    self._incomingMigrationFinished.isSet(), usedTimeout)
  File "/usr/share/vdsm/virt/vm.py", line 2785, in _attachLibvirtDomainAfterMigration
    raise MigrationError(e.get_error_message())
MigrationError: Domain not found: no domain with matching uuid 'b88b433c-7eeb-4236-83f9-1997c9f9d0eb'
Thread-31299::INFO::2015-08-11 10:01:08,975::vm::1305::virt.vm::(setDownStatus) vmId=`b88b433c-7eeb-4236-83f9-1997c9f9d0eb`::Changed state to Down: VM failed to migrate (code=8)
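
As an aside on the source-host KeyError above: the rate calculation indexes 'block.%d.rd.bytes' directly in the bulk-stats samples, which blows up when that counter is missing from a sample taken mid-migration. A hedged sketch of a tolerant variant is below; the function name and sample values are made up, and this is not VDSM's actual code.

def disk_read_rate(first_sample, last_sample, index, interval):
    # Bytes/second read rate between two bulk-stats samples, or None
    # when the counter is absent from either sample.
    key = 'block.%d.rd.bytes' % index
    if key not in first_sample or key not in last_sample or interval <= 0:
        return None
    return (last_sample[key] - first_sample[key]) / float(interval)

# Example with made-up samples: ~1 MiB read over a 2 second interval.
first = {'block.0.rd.bytes': 1048576}
last = {'block.0.rd.bytes': 2097152}
print(disk_read_rate(first, last, 0, 2.0))     # 524288.0 bytes/s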

Comment 4 Michal Skrivanek 2015-08-11 08:00:05 UTC
Looking at the logs... there are multiple failures that seem unrelated to each other. Which one are you referring to?
The one incoming migration to the .159 host ends with "libvirtError: internal error: process exited while connecting to monitor:", which points to a QEMU crash. We would need the libvirt and QEMU logs.

Also note this is a recent feature on the qemu/libvirt side; it may behave differently on EL 7.2.
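
For reference, the per-domain QEMU log requested here normally lives under /var/log/libvirt/qemu/<domain>.log on each host. A small sketch for pulling the relevant lines; the path assumes libvirt's default log location and the keyword filter is just a convenience.

log_path = '/var/log/libvirt/qemu/vm_with_ui.log'   # default location, adjust if configured differently
with open(log_path) as f:
    for line in f.readlines()[-40:]:                # last ~40 lines around the failure
        if 'error' in line or 'migration' in line:
            print(line.rstrip())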

Comment 5 Michal Skrivanek 2015-08-11 08:00:32 UTC
Also note that the logs do not cover the period you highlighted in comment #2.

Comment 6 Tomas Jelinek 2015-08-11 08:13:33 UTC
This also looks a lot like the issue with starting a VM that had 12345 IO threads configured. Is this happening when you have that many IO threads configured? If so, there is a patch in progress that will limit the number of IO threads per VM; a sketch of that kind of limit follows.
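
To illustrate the kind of limit referred to above (the actual patch is not quoted in this bug, so the cap value and function name below are assumptions):

MAX_IOTHREADS = 127                 # hypothetical upper bound, not the value from the patch

def clamp_iothreads(requested):
    # Return a sane IO thread count for a VM definition.
    if requested < 1:
        return 1
    return min(requested, MAX_IOTHREADS)

print(clamp_iothreads(12345))       # capped to 127 under this hypothetical limit
print(clamp_iothreads(5))           # 5, the count used later in this report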

Comment 7 Israel Pinto 2015-08-11 08:34:41 UTC
Created attachment 1061346 [details]
update_vdsm_engine_logs

Comment 8 Israel Pinto 2015-08-11 08:49:14 UTC
Regarding comment 6: we cannot start a VM with 12345 IO threads, see bug:
https://bugzilla.redhat.com/show_bug.cgi?id=1251899
I see the failure with 5 IO threads.
I attached updated logs with the migration failures.
If we disable IO threads, migration works OK
(tested 5 times consecutively).
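
One way to double-check whether a running domain really has IO threads enabled or disabled is to look for the <iothreads> element in its live XML; a minimal libvirt-python sketch, assuming local access to the source host:

import libvirt

conn = libvirt.open('qemu:///system')
dom = conn.lookupByName('vm_with_ui')   # the VM from this report
xml = dom.XMLDesc(0)                    # live domain XML
print('<iothreads>' in xml)             # True when an IO thread count is configured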

Comment 9 Michal Skrivanek 2015-08-11 09:54:58 UTC
Thanks. But again, "libvirtError: internal error: process exited while connecting to monitor:" indicates it failed in QEMU, so we need the QEMU and libvirt logs as well.

Comment 10 Israel Pinto 2015-08-12 12:22:20 UTC
Created attachment 1061928 [details]
New_vdsm_ &&_engine_logs

Comment 11 Israel Pinto 2015-08-12 12:23:32 UTC
Created attachment 1061929 [details]
libvirt_logs_host1

Comment 12 Israel Pinto 2015-08-12 12:24:45 UTC
Created attachment 1061931 [details]
libvirt_logs_host_2

Comment 13 Israel Pinto 2015-08-12 12:26:51 UTC
Added new logs for libvirt and VDSM.
The migration also fails when IO threads are disabled.

Comment 14 Michal Skrivanek 2015-08-12 13:13:14 UTC
I can see QEMU crashed/exited right after creation on the destination side with:

"2015-08-12T11:38:15.551530Z qemu-kvm: load of migration failed: Input/output error"

Moving to the libvirt team for investigation.
Note this is the 7.1 qemu.

Nisim, I would suggest retesting on a RHEL 7.2 host.

Comment 16 Jaroslav Suchanek 2015-08-12 13:34:39 UTC
Moving one layer down to qemu-kvm-rhev for further investigation.

Comment 21 Israel Pinto 2015-08-30 13:17:40 UTC
Engine version: 3.6.0-0.12.master.el6
Hosts: RHEL 7.2
VDSM version: 3.6.0-0.12.master.el6

Did not reproduce.

Comment 22 Michal Skrivanek 2015-09-22 14:24:08 UTC
(In reply to Israel Pinto from comment #21)
> engine version:  3.6.0-0.12.master.el6
> hosts rhel 7.2: 
> vdsm version: 3.6.0-0.12.master.el6
> 
> Did not reproduce.

Then we're good, I guess.
All we need is 7.2, which is aligned with RHEV 3.6.