Bug 1371957 - [Migration] Failed to migrate VM in pause state
Summary: [Migration] Failed to migrate VM in pause state
Keywords:
Status: CLOSED DUPLICATE of bug 1273720
Alias: None
Product: ovirt-engine
Classification: oVirt
Component: BLL.Virt
Version: 3.6.7
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: ---
Target Release: ---
Assignee: Michal Skrivanek
QA Contact: meital avital
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2016-08-31 14:04 UTC by Israel Pinto
Modified: 2016-09-01 13:17 UTC
CC List: 2 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2016-09-01 08:19:24 UTC
oVirt Team: Virt
ipinto: planning_ack?
ipinto: devel_ack?
ipinto: testing_ack?


Attachments
engine.log (2.20 MB, application/x-gzip), 2016-08-31 14:04 UTC, Israel Pinto
libvirt_log_rhel (50.87 KB, application/x-gzip), 2016-08-31 14:17 UTC, Israel Pinto
vm_qemu_log_rhel_host (1.79 KB, application/x-gzip), 2016-08-31 14:19 UTC, Israel Pinto
vdsm_log_rhel (633.92 KB, application/x-gzip), 2016-08-31 14:19 UTC, Israel Pinto
vdsm_log_rhevh (475.68 KB, application/x-gzip), 2016-08-31 14:21 UTC, Israel Pinto
vm_qemu_log_rhevh_host (475.68 KB, application/x-gzip), 2016-08-31 14:21 UTC, Israel Pinto
RHEL_72_source_host_and_engine_log (681.18 KB, application/x-gzip), 2016-09-01 07:31 UTC, Israel Pinto
libivrt_vdsm_log_source_host (407.44 KB, application/x-gzip), 2016-09-01 08:04 UTC, Israel Pinto

Description Israel Pinto 2016-08-31 14:04:46 UTC
Created attachment 1196398 [details]
engine.log

Description of problem:
Failed to migrate a VM in paused state.

Version-Release number of selected component (if applicable):
Engine: 3.6.9-0.1.el6
Hosts:
RHEL 
OS Version:RHEL - 7.3 - 6.el7
Kernel Version:3.10.0 - 481.el7.x86_64
KVM Version:2.6.0 - 22.el7
LIBVIRT Version:libvirt-2.0.0-6.el7
VDSM Version:vdsm-4.17.34-5.git14b8cc2.el7
SPICE Version:0.12.4 - 18.el7
RHEV-H
OS Version:RHEV Hypervisor - 7.2 - 20160826.1.el7ev
Kernel Version:3.10.0 - 327.28.3.el7.x86_64
KVM Version:2.3.0 - 31.el7_2.16
LIBVIRT Version:libvirt-1.2.17-13.el7_2.5
VDSM Version:vdsm-4.17.34-1.el7ev
SPICE Version:0.12.4 - 15.el7_2.1



How reproducible:
All the time

Steps to Reproduce:
1. Create a VM and set it to start in paused mode
2. Run the VM
3. Migrate the VM
4. Repeat for each host type combination (see the API sketch below)
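
For reference, a minimal sketch of the same scenario through the REST API (Python with requests). The engine URL and credentials are placeholders, and the request bodies assume the standard oVirt start/migrate actions; none of it is taken from this environment.

import requests

API = "https://engine.example.com/ovirt-engine/api"   # placeholder engine URL
AUTH = ("admin@internal", "password")                  # placeholder credentials
VM_ID = "c369758e-ef57-4fc2-a3e4-01b005f720b1"         # test VM id (any VM id works)
HEADERS = {"Content-Type": "application/xml"}

# Steps 1-2: start the existing VM in paused mode
# (same as "Start in Pause Mode" in the webadmin UI).
requests.post(API + "/vms/" + VM_ID + "/start",
              data="<action><pause>true</pause></action>",
              headers=HEADERS, auth=AUTH, verify=False)

# Step 3: trigger migration and let the scheduler pick the destination host.
requests.post(API + "/vms/" + VM_ID + "/migrate",
              data="<action/>",
              headers=HEADERS, auth=AUTH, verify=False)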

Actual results:
Migration matrix:
RHEL to RHEL - Failed
RHEL to RHEV-H - Failed
RHEV-H to RHEL - PASS


Expected results:
Migration should succeed in all cases.

Additional info:
selinux status:
[root@lynx18 libvirt]# sestatus
SELinux status:                 enabled
SELinuxfs mount:                /sys/fs/selinux
SELinux root directory:         /etc/selinux
Loaded policy name:             targeted
Current mode:                   enforcing
Mode from config file:          enforcing
Policy MLS status:              enabled
Policy deny_unknown status:     allowed
Max kernel policy version:      28


[root@lynx19 ~]# sestatus
SELinux status:                 enabled
SELinuxfs mount:                /sys/fs/selinux
SELinux root directory:         /etc/selinux
Loaded policy name:             targeted
Current mode:                   enforcing
Mode from config file:          enforcing
Policy MLS status:              enabled
Policy deny_unknown status:     allowed
Max kernel policy version:      28

Comment 1 Israel Pinto 2016-08-31 14:16:51 UTC
Engine log:
Migrate from RHEV-H to RHEL:
2016-08-31 15:49:10,222 INFO  [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (org.ovirt.thread.pool-6-thread-31) [43eb903] Correlation ID: 43eb903, Job ID: b5b7b1d3-aa41-4d12-ba91-08c892cca4d3, Call Stack: null, Custom Event ID: -1, Message: Migration started (VM: mig_test, Source: host_mixed_2, Destination: host_mixed_1, User: admin@internal).
2016-08-31 15:49:13,312 INFO  [org.ovirt.engine.core.vdsbroker.VmAnalyzer] (ForkJoinPool-1-worker-15) [] RefreshVmList VM id 'c369758e-ef57-4fc2-a3e4-01b005f720b1' status = 'Paused' on VDS 'host_mixed_1' ignoring it in the refresh until migration is done
2016-08-31 15:49:14,312 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.DestroyVDSCommand] (ForkJoinPool-1-worker-15) [] START, DestroyVDSCommand(HostName = host_mixed_2, DestroyVmVDSCommandParameters:{runAsync='true', hostId='0761c6e3-5245-44d2-a41a-4bd0bd954f56', vmId='c369758e-ef57-4fc2-a3e4-01b005f720b1', force='false', secondsToWait='0', gracefully='false', reason='', ignoreNoVm='true'}), log id: 72328d18
2016-08-31 15:49:16,709 INFO  [org.ovirt.engine.core.vdsbroker.VmsStatisticsFetcher] (DefaultQuartzScheduler_Worker-71) [5fc775c9] Fetched 1 VMs from VDS '287ee2a7-22ff-4efd-92cc-396877c81a2f'
2016-08-31 15:49:17,705 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.DestroyVDSCommand] (ForkJoinPool-1-worker-15) [] FINISH, DestroyVDSCommand, log id: 72328d18
2016-08-31 15:49:17,716 INFO  [org.ovirt.engine.core.vdsbroker.VmAnalyzer] (ForkJoinPool-1-worker-15) [] Handing over VM 'mig_test'(c369758e-ef57-4fc2-a3e4-01b005f720b1) to Host '287ee2a7-22ff-4efd-92cc-396877c81a2f'. Setting VM to status 'MigratingTo'
2016-08-31 15:49:17,716 INFO  [org.ovirt.engine.core.vdsbroker.VmAnalyzer] (ForkJoinPool-1-worker-15) [] VM 'c369758e-ef57-4fc2-a3e4-01b005f720b1(mig_test) is running in db and not running in VDS 'host_mixed_2'
2016-08-31 15:49:25,610 INFO  [org.ovirt.engine.core.vdsbroker.VmsStatisticsFetcher] (DefaultQuartzScheduler_Worker-20) [70ffddad] Fetched 0 VMs from VDS '0761c6e3-5245-44d2-a41a-4bd0bd954f56'
2016-08-31 15:49:32,739 INFO  [org.ovirt.engine.core.vdsbroker.VmAnalyzer] (DefaultQuartzScheduler_Worker-97) [74c71f74] VM 'c369758e-ef57-4fc2-a3e4-01b005f720b1'(mig_test) moved from 'MigratingTo' --> 'Paused'
2016-08-31 15:49:32,772 INFO  [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (DefaultQuartzScheduler_Worker-97) [74c71f74] Correlation ID: null, Call Stack: null, Custom Event ID: -1, Message: VM mig_test has been paused.


Migrate from RHEL to RHEV-H:
2016-08-31 15:49:55,078 INFO  [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (org.ovirt.thread.pool-6-thread-4) [50d38b7f] Correlation ID: 50d38b7f, Job ID: ae5192cf-6075-4d7b-a31a-012e6b7632d8, Call Stack: null, Custom Event ID: -1, Message: Migration started (VM: mig_test, Source: host_mixed_1, Destination: host_mixed_2, User: admin@internal).
2016-08-31 15:49:57,660 INFO  [org.ovirt.engine.core.vdsbroker.VmsStatisticsFetcher] (DefaultQuartzScheduler_Worker-59) [15bb73ad] Fetched 1 VMs from VDS '0761c6e3-5245-44d2-a41a-4bd0bd954f56'
2016-08-31 15:49:57,668 INFO  [org.ovirt.engine.core.vdsbroker.VmAnalyzer] (DefaultQuartzScheduler_Worker-59) [15bb73ad] RefreshVmList VM id 'c369758e-ef57-4fc2-a3e4-01b005f720b1' status = 'NotResponding' on VDS 'host_mixed_2' ignoring it in the refresh until migration is done
2016-08-31 15:49:58,691 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.DestroyVDSCommand] (ForkJoinPool-1-worker-8) [] START, DestroyVDSCommand(HostName = host_mixed_2, DestroyVmVDSCommandParameters:{runAsync='true', hostId='0761c6e3-5245-44d2-a41a-4bd0bd954f56', vmId='c369758e-ef57-4fc2-a3e4-01b005f720b1', force='false', secondsToWait='0', gracefully='false', reason='', ignoreNoVm='true'}), log id: 2b128264
2016-08-31 15:50:01,542 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.DestroyVDSCommand] (ForkJoinPool-1-worker-8) [] Failed to destroy VM 'c369758e-ef57-4fc2-a3e4-01b005f720b1' because VM does not exist, ignoring
2016-08-31 15:50:01,542 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.DestroyVDSCommand] (ForkJoinPool-1-worker-8) [] FINISH, DestroyVDSCommand, log id: 2b128264
2016-08-31 15:50:01,546 INFO  [org.ovirt.engine.core.vdsbroker.VmAnalyzer] (ForkJoinPool-1-worker-8) [] RefreshVmList VM id 'c369758e-ef57-4fc2-a3e4-01b005f720b1' status = 'Down' on VDS 'host_mixed_2' ignoring it in the refresh until migration is done
2016-08-31 15:50:02,473 INFO  [org.ovirt.engine.core.vdsbroker.VmAnalyzer] (ForkJoinPool-1-worker-8) [] VM 'c369758e-ef57-4fc2-a3e4-01b005f720b1'(mig_test) moved from 'MigratingFrom' --> 'Paused'

Comment 2 Israel Pinto 2016-08-31 14:17:48 UTC
Created attachment 1196403 [details]
libvirt_log_rhel

Comment 3 Israel Pinto 2016-08-31 14:19:22 UTC
Created attachment 1196404 [details]
vm_qemu_log_rhel_host

Comment 4 Israel Pinto 2016-08-31 14:19:57 UTC
Created attachment 1196405 [details]
vdsm_log_rhel

Comment 5 Israel Pinto 2016-08-31 14:21:11 UTC
Created attachment 1196406 [details]
vdsm_log_rhevh

Comment 6 Israel Pinto 2016-08-31 14:21:57 UTC
Created attachment 1196407 [details]
vm_qemu_log_rhevh_host

Comment 7 Yaniv Kaul 2016-09-01 06:58:26 UTC
Israel:

1. Is it a regression?
2. Does it have anything to do with RHEL 7.3? Does it work with 7.2?
3. Why did migration fail? I see lots of logs, but no real reason. Do you have any idea?

Comment 8 Israel Pinto 2016-09-01 07:30:30 UTC
(In reply to Yaniv Kaul from comment #7)
> Israel:
> 
> 1. Is it a regression?
No, it is not a regression. As I understood from Tomas, migration of a paused VM should work. We had a negative test case that expected migration of a paused VM to fail; it always passed (i.e. the migration failed) because we only ran it RHEL to RHEL.
This time we ran it in a mixed environment and saw the migration work in one direction, so we are changing the test to a positive one.

> 2. Does it have anything to do with RHEL 7.3? Does it work with 7.2?
It is not related to the OS version; it also happens with RHEL 7.2.

> 3. Why did migration fail? I see lots of logs, but no real reason. Do you
> have any idea?
It fails only when the VM is running on a RHEL host (i.e. the source host is RHEL).
See the reproduction below with RHEL 7.2:

VDSM log:
jsonrpc.Executor/0::INFO::2016-09-01 10:14:26,445::__init__::512::jsonrpc.JsonRpcServer::(_serveRequest) RPC call Host.getStats succeeded in 0.01 seconds
Thread-257240::DEBUG::2016-09-01 10:14:26,527::migration::188::virt.vm::(_setupVdsConnection) vmId=`6adec1d9-519a-4ca7-b3b6-bae6369777a8`::Initiating connection with destination
Thread-257240::DEBUG::2016-09-01 10:14:26,528::stompreactor::408::jsonrpc.AsyncoreClient::(send) Sending response
Thread-257240::DEBUG::2016-09-01 10:14:26,531::migration::200::virt.vm::(_setupVdsConnection) vmId=`6adec1d9-519a-4ca7-b3b6-bae6369777a8`::Destination server is: virt-nested-vm14.scl.lab.tlv.redhat.com:54321
Thread-257240::DEBUG::2016-09-01 10:14:26,532::migration::246::virt.vm::(_prepareGuest) vmId=`6adec1d9-519a-4ca7-b3b6-bae6369777a8`::Migration started
Thread-257240::DEBUG::2016-09-01 10:14:26,540::migration::353::virt.vm::(run) vmId=`6adec1d9-519a-4ca7-b3b6-bae6369777a8`::migration semaphore acquired after 1 seconds
Thread-257240::DEBUG::2016-09-01 10:14:26,548::stompreactor::408::jsonrpc.AsyncoreClient::(send) Sending response
Thread-257240::INFO::2016-09-01 10:14:26,823::migration::407::virt.vm::(_startUnderlyingMigration) vmId=`6adec1d9-519a-4ca7-b3b6-bae6369777a8`::Creation of destination VM took: 0 seconds
Thread-257240::INFO::2016-09-01 10:14:26,823::migration::429::virt.vm::(_startUnderlyingMigration) vmId=`6adec1d9-519a-4ca7-b3b6-bae6369777a8`::starting migration to qemu+tls://virt-nested-vm14.scl.lab.tlv.redhat.com/system with miguri tcp://10.35.161.161
Thread-257240::DEBUG::2016-09-01 10:14:26,823::migration::481::virt.vm::(_perform_with_downtime_thread) vmId=`6adec1d9-519a-4ca7-b3b6-bae6369777a8`::performing migration with downtime thread
Thread-257241::DEBUG::2016-09-01 10:14:26,823::migration::620::virt.vm::(run) vmId=`6adec1d9-519a-4ca7-b3b6-bae6369777a8`::starting migration monitor thread
Thread-257241::DEBUG::2016-09-01 10:14:26,824::migration::647::virt.vm::(monitor_migration) vmId=`6adec1d9-519a-4ca7-b3b6-bae6369777a8`::setting initial migration downtime
Thread-257241::DEBUG::2016-09-01 10:14:26,824::migration::574::virt.vm::(_set_downtime) vmId=`6adec1d9-519a-4ca7-b3b6-bae6369777a8`::setting migration downtime to 101
Thread-257240::DEBUG::2016-09-01 10:14:30,559::migration::715::virt.vm::(stop) vmId=`6adec1d9-519a-4ca7-b3b6-bae6369777a8`::stopping migration monitor thread
Thread-257240::ERROR::2016-09-01 10:14:30,560::migration::252::virt.vm::(_recover) vmId=`6adec1d9-519a-4ca7-b3b6-bae6369777a8`::operation failed: domain is no longer running
Thread-257240::DEBUG::2016-09-01 10:14:30,560::stompreactor::408::jsonrpc.AsyncoreClient::(send) Sending response
Thread-257241::DEBUG::2016-09-01 10:14:30,561::migration::570::virt.vm::(stop) vmId=`6adec1d9-519a-4ca7-b3b6-bae6369777a8`::stopping migration downtime thread
Thread-257241::DEBUG::2016-09-01 10:14:30,561::migration::629::virt.vm::(run) vmId=`6adec1d9-519a-4ca7-b3b6-bae6369777a8`::stopped migration monitor thread
Thread-257240::DEBUG::2016-09-01 10:14:30,597::__init__::208::jsonrpc.Notification::(emit) Sending event {"params": {"notify_time": 7044937090, "6adec1d9-519a-4ca7-b3b6-bae6369777a8": {"status": "Paused"}}, "jsonrpc": "2.0", "method": "|virt|VM_status|6adec1d9-519a-4ca7-b3b6-bae6369777a8"}
Thread-257240::ERROR::2016-09-01 10:14:30,598::migration::381::virt.vm::(run) vmId=`6adec1d9-519a-4ca7-b3b6-bae6369777a8`::Failed to migrate
Traceback (most recent call last):
  File "/usr/share/vdsm/virt/migration.py", line 363, in run
    self._startUnderlyingMigration(time.time())
  File "/usr/share/vdsm/virt/migration.py", line 438, in _startUnderlyingMigration
    self._perform_with_downtime_thread(duri, muri)
  File "/usr/share/vdsm/virt/migration.py", line 489, in _perform_with_downtime_thread
    self._perform_migration(duri, muri)
  File "/usr/share/vdsm/virt/migration.py", line 476, in _perform_migration
    self._vm._dom.migrateToURI3(duri, params, flags)
  File "/usr/lib/python2.7/site-packages/vdsm/virt/virdomain.py", line 69, in f
    ret = attr(*args, **kwargs)
  File "/usr/lib/python2.7/site-packages/vdsm/libvirtconnection.py", line 123, in wrapper
    ret = f(*args, **kwargs)
  File "/usr/lib/python2.7/site-packages/vdsm/utils.py", line 916, in wrapper
    return func(inst, *args, **kwargs)
  File "/usr/lib64/python2.7/site-packages/libvirt.py", line 1836, in migrateToURI3
    if ret == -1: raise libvirtError ('virDomainMigrateToURI3() failed', dom=self)
libvirtError: operation failed: domain is no longer running

Engine log:
2016-09-01 10:14:24,710 INFO  [org.ovirt.engine.core.bll.MigrateVmCommand] (org.ovirt.thread.pool-6-thread-46) [729f8d76] Running command: MigrateVmCommand internal: false. Entities affected :  ID: 6adec1d9-519a-4ca7-b3b6-bae6369777a8 Type: VMAction group MIGRATE_VM with role type USER
2016-09-01 10:14:25,094 INFO  [org.ovirt.engine.core.vdsbroker.MigrateVDSCommand] (org.ovirt.thread.pool-6-thread-46) [729f8d76] START, MigrateVDSCommand( MigrateVDSCommandParameters:{runAsync='true', hostId='7e467b7f-765b-4fdd-951b-d22e885c3672', vmId='6adec1d9-519a-4ca7-b3b6-bae6369777a8', srcHost='seal08.qa.lab.tlv.redhat.com', dstVdsId='8b7697ec-70a6-4efb-8fe6-6a86bcaa875b', dstHost='virt-nested-vm14.scl.lab.tlv.redhat.com:54321', migrationMethod='ONLINE', tunnelMigration='false', migrationDowntime='0', autoConverge='false', migrateCompressed='false', consoleAddress='null', maxBandwidth='null', enableGuestEvents='false', convergenceSchedule='null'}), log id: 5381b7d8
2016-09-01 10:14:25,095 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.MigrateBrokerVDSCommand] (org.ovirt.thread.pool-6-thread-46) [729f8d76] START, MigrateBrokerVDSCommand(HostName = host_mixed_3, MigrateVDSCommandParameters:{runAsync='true', hostId='7e467b7f-765b-4fdd-951b-d22e885c3672', vmId='6adec1d9-519a-4ca7-b3b6-bae6369777a8', srcHost='seal08.qa.lab.tlv.redhat.com', dstVdsId='8b7697ec-70a6-4efb-8fe6-6a86bcaa875b', dstHost='virt-nested-vm14.scl.lab.tlv.redhat.com:54321', migrationMethod='ONLINE', tunnelMigration='false', migrationDowntime='0', autoConverge='false', migrateCompressed='false', consoleAddress='null', maxBandwidth='null', enableGuestEvents='false', convergenceSchedule='null'}), log id: 7a38c389
2016-09-01 10:14:26,273 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.MigrateBrokerVDSCommand] (org.ovirt.thread.pool-6-thread-46) [729f8d76] FINISH, MigrateBrokerVDSCommand, log id: 7a38c389
2016-09-01 10:14:26,641 INFO  [org.ovirt.engine.core.vdsbroker.MigrateVDSCommand] (org.ovirt.thread.pool-6-thread-46) [729f8d76] FINISH, MigrateVDSCommand, return: MigratingFrom, log id: 5381b7d8
2016-09-01 10:14:26,678 INFO  [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (org.ovirt.thread.pool-6-thread-46) [729f8d76] Correlation ID: 729f8d76, Job ID: 84451e5e-f4b5-4f57-b049-45d429e4ef84, Call Stack: null, Custom Event ID: -1, Message: Migration started (VM: golden_env_mixed_virtio_0, Source: host_mixed_3, Destination: host_mixed_1, User: admin@internal-authz).
2016-09-01 10:14:26,777 INFO  [org.ovirt.engine.core.vdsbroker.monitoring.VmsStatisticsFetcher] (DefaultQuartzScheduler10) [679b0b67] Fetched 1 VMs from VDS '7e467b7f-765b-4fdd-951b-d22e885c3672'
2016-09-01 10:14:26,777 WARN  [org.ovirt.engine.core.vdsbroker.monitoring.VmsMonitoring] (DefaultQuartzScheduler10) [679b0b67] skipping VM '6adec1d9-519a-4ca7-b3b6-bae6369777a8' from this monitoring cycle - the VM data has changed since fetching the data
2016-09-01 10:14:30,599 INFO  [org.ovirt.engine.core.vdsbroker.monitoring.VmAnalyzer] (ForkJoinPool-1-worker-1) [] VM '6adec1d9-519a-4ca7-b3b6-bae6369777a8' was reported as Down on VDS '8b7697ec-70a6-4efb-8fe6-6a86bcaa875b'(host_mixed_1)
2016-09-01 10:14:30,601 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.DestroyVDSCommand] (ForkJoinPool-1-worker-1) [] START, DestroyVDSCommand(HostName = host_mixed_1, DestroyVmVDSCommandParameters:{runAsync='true', hostId='8b7697ec-70a6-4efb-8fe6-6a86bcaa875b', vmId='6adec1d9-519a-4ca7-b3b6-bae6369777a8', force='false', secondsToWait='0', gracefully='false', reason='', ignoreNoVm='true'}), log id: 2f5980a1
2016-09-01 10:14:31,350 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.DestroyVDSCommand] (ForkJoinPool-1-worker-1) [] Failed to destroy VM '6adec1d9-519a-4ca7-b3b6-bae6369777a8' because VM does not exist, ignoring
2016-09-01 10:14:31,350 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.DestroyVDSCommand] (ForkJoinPool-1-worker-1) [] FINISH, DestroyVDSCommand, log id: 2f5980a1
2016-09-01 10:14:34,008 INFO  [org.ovirt.engine.core.vdsbroker.monitoring.VmsStatisticsFetcher] (DefaultQuartzScheduler8) [2620befb] Fetched 0 VMs from VDS '8b7697ec-70a6-4efb-8fe6-6a86bcaa875b'
2016-09-01 10:14:35,012 INFO  [org.ovirt.engine.core.vdsbroker.monitoring.VmsStatisticsFetcher] (DefaultQuartzScheduler10) [679b0b67] Fetched 0 VMs from VDS 'd400cf6c-7db1-4930-90da-c76b2f6a8555'
2016-09-01 10:14:42,794 INFO  [org.ovirt.engine.core.vdsbroker.monitoring.VmsStatisticsFetcher] (DefaultQuartzScheduler8) [2620befb] Fetched 1 VMs from VDS '7e467b7f-765b-4fdd-951b-d22e885c3672'
2016-09-01 10:14:42,797 INFO  [org.ovirt.engine.core.vdsbroker.monitoring.VmAnalyzer] (DefaultQuartzScheduler8) [2620befb] VM '6adec1d9-519a-4ca7-b3b6-bae6369777a8'(golden_env_mixed_virtio_0) moved from 'MigratingFrom' --> 'Paused'
2016-09-01 10:14:42,951 INFO  [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (DefaultQuartzScheduler8) [2620befb] Correlation ID: null, Call Stack: null, Custom Event ID: -1, Message: VM golden_env_mixed_virtio_0 has been paused.
2016-09-01 10:14:43,024 INFO  [org.ovirt.engine.core.bll.MigrateVmCommand] (DefaultQuartzScheduler8) [2620befb] Lock freed to object 'EngineLock:{exclusiveLocks='[6adec1d9-519a-4ca7-b3b6-bae6369777a8=<VM, ACTION_TYPE_FAILED_VM_IS_BEING_MIGRATED$VmName golden_env_mixed_virtio_0>]', sharedLocks='null'}'

Attached logs

Comment 9 Israel Pinto 2016-09-01 07:31:16 UTC
Created attachment 1196608 [details]
RHEL_72_source_host_and_engine_log

Comment 10 Yaniv Kaul 2016-09-01 07:42:32 UTC
1. You can't migrate every paused VM. For example, if it is paused due to IO errors, it must not be migrated. I hope you test that.
2. Did you try to correlate between the libvirt logs and the VDSM log? I couldn't.
3. I don't think engine log is interesting here.
4. Did you verify with Virt QE they are testing this? What were their results?

Comment 11 Israel Pinto 2016-09-01 08:03:27 UTC
(In reply to Yaniv Kaul from comment #10)
> 1. You can't migrate every paused VM. For example, if it is paused due to IO
> errors, it must not be migrated. I hope you test that.
It is a simpler case: we start the VM in paused mode (it is not paused due to an I/O error).
> 2. Did you try to correlate between the libvirt logs and the VDSM log? I
> couldn't.
I reran it and am attaching the libvirt log as well.
From the vdsm log:
Thread-72::ERROR::2016-09-01 10:52:28,345::migration::381::virt.vm::(run) vmId=`6adec1d9-519a-4ca7-b3b6-bae6369777a8`::Failed to migrate
Traceback (most recent call last):
  File "/usr/share/vdsm/virt/migration.py", line 363, in run
    self._startUnderlyingMigration(time.time())
  File "/usr/share/vdsm/virt/migration.py", line 438, in _startUnderlyingMigration
    self._perform_with_downtime_thread(duri, muri)
  File "/usr/share/vdsm/virt/migration.py", line 489, in _perform_with_downtime_thread
    self._perform_migration(duri, muri)
  File "/usr/share/vdsm/virt/migration.py", line 476, in _perform_migration
    self._vm._dom.migrateToURI3(duri, params, flags)
  File "/usr/lib/python2.7/site-packages/vdsm/virt/virdomain.py", line 69, in f
    ret = attr(*args, **kwargs)
  File "/usr/lib/python2.7/site-packages/vdsm/libvirtconnection.py", line 123, in wrapper
    ret = f(*args, **kwargs)
  File "/usr/lib/python2.7/site-packages/vdsm/utils.py", line 916, in wrapper
    return func(inst, *args, **kwargs)
  File "/usr/lib64/python2.7/site-packages/libvirt.py", line 1836, in migrateToURI3
    if ret == -1: raise libvirtError ('virDomainMigrateToURI3() failed', dom=self)
libvirtError: operation failed: domain is no longer running

From the libvirt log:

2016-09-01 07:52:28.310+0000: 28803: error : virNetClientProgramDispatchError:177 : operation failed: domain is no longer running
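
For reference, a minimal way to pull the matching entries out of both logs by VM id (the log paths below are the default locations, an assumption, not taken from the attachments):

VM_ID = "6adec1d9-519a-4ca7-b3b6-bae6369777a8"

def matching_lines(path, needle):
    # Return the log lines that mention the given string.
    with open(path) as f:
        return [line.rstrip("\n") for line in f if needle in line]

# Default log locations; adjust to the attached log files.
vdsm_hits = matching_lines("/var/log/vdsm/vdsm.log", VM_ID)
libvirt_hits = matching_lines("/var/log/libvirt/libvirtd.log", "domain is no longer running")

# Note: vdsm logs local time (10:52:28 above) while libvirtd logs UTC (07:52:28),
# so allow for the timezone offset when lining the two up.
for line in vdsm_hits + libvirt_hits:
    print(line)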

> 3. I don't think engine log is interesting here.
NP
> 4. Did you verify with Virt QE they are testing this? What were their
> results?
I will check with Virt QE about this case.

Comment 12 Israel Pinto 2016-09-01 08:04:35 UTC
Created attachment 1196616 [details]
libivrt_vdsm_log_source_host

Comment 13 Yaniv Kaul 2016-09-01 08:19:24 UTC
The issue is probably buried somewhere on the destination host:
ERROR: invalid runstate transition: 'inmigrate' -> 'postmigrate'

See /var/log/libvirt/qemu/golden_env_mixed_virtio_0.log:

2016-09-01 07:52:27.215+0000: starting up libvirt version: 1.2.17, package: 13.el7_2.5 (Red Hat, Inc. <http://bugzilla.redhat.com/bugzilla>, 2016-05-26-07:48:46, x86-020.build.eng.bos.redhat.com), qemu version: 2.3.0 (qemu-kvm-rhev-2.3.0-31.el7_2.21)
LC_ALL=C PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin QEMU_AUDIO_DRV=none /usr/libexec/qemu-kvm -name golden_env_mixed_virtio_0 -S -machine pc-i440fx-rhel7.2.0,accel=kvm,usb=off -cpu Conroe -m size=104
8576k,slots=16,maxmem=4294967296k -realtime mlock=off -smp 1,maxcpus=16,sockets=16,cores=1,threads=1 -numa node,nodeid=0,cpus=0,mem=1024 -uuid 6adec1d9-519a-4ca7-b3b6-bae6369777a8 -smbios type=1,manufacturer=Red
 Hat,product=RHEV Hypervisor,version=7.2-9.el7_2.1,serial=00000000-0000-0000-0000-0CC47A3A2B72,uuid=6adec1d9-519a-4ca7-b3b6-bae6369777a8 -no-user-config -nodefaults -chardev socket,id=charmonitor,path=/var/lib/libvirt/qemu/domain-golden_env_mixed_virtio_0/monitor.sock,server,nowait -mon chardev=charmonitor,id=monitor,mode=control -rtc base=2016-09-01T07:52:26,driftfix=slew -global kvm-pit.lost_tick_policy=discard -no-hpet -no-shutdown -boot strict=on -device piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 -device virtio-serial-pci,id=virtio-serial0,max_ports=16,bus=pci.0,addr=0x5 -drive if=none,id=drive-ide0-1-0,readonly=on,format=raw -device ide-cd,bus=ide.1,unit=0,drive=drive-ide0-1-0,id=ide0-1-0 -drive file=/rhev/data-center/3a68a7f2-982f-4829-94cb-57a9577bd9b1/4df3d6f1-c1ce-4c25-a9ea-e1415eb629c1/images/a91854bc-b39f-4ace-adc6-e55128894905/b45f3a61-74af-4f51-ac77-9c02b1186328,if=none,id=drive-virtio-disk0,format=qcow2,serial=a91854bc-b39f-4ace-adc6-e55128894905,cache=none,werror=stop,rerror=stop,aio=threads -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x6,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1 -netdev tap,fd=30,id=hostnet0,vhost=on,vhostfd=31 -device virtio-net-pci,netdev=hostnet0,id=net0,mac=00:1a:4a:16:01:5c,bus=pci.0,addr=0x3 -chardev socket,id=charchannel0,path=/var/lib/libvirt/qemu/channels/6adec1d9-519a-4ca7-b3b6-bae6369777a8.com.redhat.rhevm.vdsm,server,nowait -device virtserialport,bus=virtio-serial0.0,nr=1,chardev=charchannel0,id=channel0,name=com.redhat.rhevm.vdsm -chardev socket,id=charchannel1,path=/var/lib/libvirt/qemu/channels/6adec1d9-519a-4ca7-b3b6-bae6369777a8.org.qemu.guest_agent.0,server,nowait -device virtserialport,bus=virtio-serial0.0,nr=2,chardev=charchannel1,id=channel1,name=org.qemu.guest_agent.0 -chardev spicevmc,id=charchannel2,name=vdagent -device virtserialport,bus=virtio-serial0.0,nr=3,chardev=charchannel2,id=channel2,name=com.redhat.spice.0 -spice tls-port=5900,addr=10.35.161.161,x509-dir=/etc/pki/vdsm/libvirt-spice,tls-channel=default,tls-channel=main,tls-channel=display,tls-channel=inputs,tls-channel=cursor,tls-channel=playback,tls-channel=record,tls-channel=smartcard,tls-channel=usbredir,seamless-migration=on -k en-us -vnc 10.35.161.161:1,password -k en-us -device qxl-vga,id=video0,ram_size=67108864,vram_size=8388608,vgamem_mb=16,bus=pci.0,addr=0x2 -device intel-hda,id=sound0,bus=pci.0,addr=0x4 -device hda-duplex,id=sound0-codec0,bus=sound0.0,cad=0 -incoming tcp:0.0.0.0:49152 -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x7 -msg timestamp=on
Domain id=3 is tainted: hook-script
2016-09-01 07:52:27.218+0000: 3417: debug : virFileClose:102 : Closed fd 27
2016-09-01 07:52:27.218+0000: 3417: debug : virFileClose:102 : Closed fd 36
2016-09-01 07:52:27.219+0000: 3417: debug : virFileClose:102 : Closed fd 3
2016-09-01 07:52:27.221+0000: 3418: debug : virExec:691 : Run hook 0x7fcc095eba00 0x7fcc2d76bd50
2016-09-01 07:52:27.221+0000: 3418: debug : qemuProcessHook:3219 : Obtaining domain lock
2016-09-01 07:52:27.221+0000: 3418: debug : virSecuritySELinuxSetSecuritySocketLabel:2215 : Setting VM golden_env_mixed_virtio_0 socket context system_u:system_r:svirt_t:s0:c796,c857
2016-09-01 07:52:27.222+0000: 3418: debug : virDomainLockProcessStart:179 : plugin=0x7fcc1c16efe0 dom=0x7fcc18018fb0 paused=1 fd=0x7fcc2d76b880
2016-09-01 07:52:27.222+0000: 3418: debug : virDomainLockManagerNew:134 : plugin=0x7fcc1c16efe0 dom=0x7fcc18018fb0 withResources=1
2016-09-01 07:52:27.222+0000: 3418: debug : virLockManagerPluginGetDriver:281 : plugin=0x7fcc1c16efe0
2016-09-01 07:52:27.222+0000: 3418: debug : virLockManagerNew:305 : driver=0x7fcc00bf7000 type=0 nparams=5 params=0x7fcc2d76b730 flags=1
2016-09-01 07:52:27.222+0000: 3418: debug : virLockManagerLogParams:98 :   key=uuid type=uuid value=6adec1d9-519a-4ca7-b3b6-bae6369777a8
2016-09-01 07:52:27.222+0000: 3418: debug : virLockManagerLogParams:91 :   key=name type=string value=golden_env_mixed_virtio_0
2016-09-01 07:52:27.222+0000: 3418: debug : virLockManagerLogParams:79 :   key=id type=uint value=3
2016-09-01 07:52:27.222+0000: 3418: debug : virLockManagerLogParams:79 :   key=pid type=uint value=3418
2016-09-01 07:52:27.222+0000: 3418: debug : virLockManagerLogParams:94 :   key=uri type=cstring value=qemu:///system
2016-09-01 07:52:27.222+0000: 3418: debug : virDomainLockManagerNew:146 : Adding leases
2016-09-01 07:52:27.222+0000: 3418: debug : virDomainLockManagerNew:151 : Adding disks
2016-09-01 07:52:27.222+0000: 3418: debug : virDomainLockManagerAddImage:90 : Add disk /rhev/data-center/3a68a7f2-982f-4829-94cb-57a9577bd9b1/4df3d6f1-c1ce-4c25-a9ea-e1415eb629c1/images/a91854bc-b39f-4ace-adc6-e55128894905/b45f3a61-74af-4f51-ac77-9c02b1186328
2016-09-01 07:52:27.222+0000: 3418: debug : virLockManagerAddResource:332 : lock=0x7fcc18005ee0 type=0 name=/rhev/data-center/3a68a7f2-982f-4829-94cb-57a9577bd9b1/4df3d6f1-c1ce-4c25-a9ea-e1415eb629c1/images/a91854bc-b39f-4ace-adc6-e55128894905/b45f3a61-74af-4f51-ac77-9c02b1186328 nparams=0 params=(nil) flags=0
2016-09-01 07:52:27.222+0000: 3418: debug : virLockManagerAcquire:350 : lock=0x7fcc18005ee0 state='<null>' flags=3 action=0 fd=0x7fcc2d76b880
2016-09-01 07:52:27.222+0000: 3418: debug : virLockManagerSanlockAcquire:917 : Register sanlock 3
2016-09-01 07:52:27.222+0000: 3418: debug : virLockManagerSanlockAcquire:1011 : Acquire completed fd=3
2016-09-01 07:52:27.222+0000: 3418: debug : virLockManagerFree:387 : lock=0x7fcc18005ee0
2016-09-01 07:52:27.223+0000: 3418: info : virObjectUnref:259 : OBJECT_UNREF: obj=0x7fcc1c1072a0
2016-09-01 07:52:27.223+0000: 3418: debug : qemuProcessHook:3260 : Hook complete ret=0
2016-09-01 07:52:27.223+0000: 3418: debug : virExec:693 : Done hook 0
2016-09-01 07:52:27.223+0000: 3418: debug : virExec:700 : Setting child security label to system_u:system_r:svirt_t:s0:c796,c857
2016-09-01 07:52:27.223+0000: 3418: debug : virExec:730 : Setting child uid:gid to 107:107 with caps 0
2016-09-01 07:52:27.224+0000: 3418: debug : virCommandHandshakeChild:432 : Notifying parent for handshake start on 33
2016-09-01 07:52:27.224+0000: 3418: debug : virCommandHandshakeChild:440 : Waiting on parent for handshake complete on 34
2016-09-01 07:52:27.281+0000: 3418: debug : virFileClose:102 : Closed fd 33
2016-09-01 07:52:27.281+0000: 3418: debug : virFileClose:102 : Closed fd 34
2016-09-01 07:52:27.281+0000: 3418: debug : virCommandHandshakeChild:460 : Handshake with parent is done
2016-09-01T07:52:27.359279Z qemu-kvm: warning: CPU(s) not present in any NUMA nodes: 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
2016-09-01T07:52:27.359492Z qemu-kvm: warning: All CPU(s) up to maxcpus should be described in NUMA config
ERROR: invalid runstate transition: 'inmigrate' -> 'postmigrate'
2016-09-01 07:52:27.904+0000: shutting down


And Google reveals that error to be in https://bugzilla.redhat.com/show_bug.cgi?id=1273720

*** This bug has been marked as a duplicate of bug 1273720 ***

Comment 14 Michal Skrivanek 2016-09-01 13:17:08 UTC
Note this happens only when the guest has never run, i.e. the VM is started in paused mode, left like that, and then migrated.

