Bug 1409246

Summary: VM migration failing with "Returning backwards compatible migration error code"
Product: Red Hat Enterprise Virtualization Manager Reporter: Gordon Watson <gwatson>
Component: vdsmAssignee: Martin Betak <mbetak>
Status: CLOSED ERRATA QA Contact: Israel Pinto <ipinto>
Severity: high Docs Contact:
Priority: unspecified    
Version: 4.0.2CC: bazulay, ddash, gklein, gwatson, lsurette, mavital, mgoldboi, michal.skrivanek, mkalinin, shipatil, srevivo, tjelinek, trichard, ycui, ykaul
Target Milestone: ovirt-4.1.0-rcKeywords: ZStream
Target Release: ---   
Hardware: Unspecified   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Previously, the Manager didn't send migration limits to VDSM during VM migration operations. The destination VDSM understood this lack of migration limits as migration from an old source that didn't support the retry logic, failing the migration altogether instead of having the source attempt a proper retry. This could ultimately result in the host entering an error state; especially problematic, for example, when attempting to put a host to maintenance. Now, the Manager sends migrations limits as part of the VM migration operation, so the migration retry logic between the source and destination VDSM is properly triggered when necessary.
Story Points: ---
Clone Of:
: 1413939 (view as bug list) Environment:
Last Closed: 2017-04-25 00:41:28 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: Virt RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 1413939    

Description Gordon Watson 2016-12-30 17:01:37 UTC
Description of problem:

VM migration fails with;

  - src host;

Thread-12509491::INFO::2016-12-22 09:49:56,306::migration::407::virt.vm::(_startUnderlyingMigration) vmId=`36710f55-4f25-4c80-912e-d7c9dfc87b99`::Creation of destination VM took: 0 seconds
Thread-12509491::ERROR::2016-12-22 09:49:56,307::migration::252::virt.vm::(_recover) vmId=`36710f55-4f25-4c80-912e-d7c9dfc87b99`::migration destination error: Fatal error during migration


  - dest host;

jsonrpc.Executor/0::DEBUG::2016-12-22 09:49:56,304::API::601::vds::(migrationCreate) Migration create - Failed
jsonrpc.Executor/0::DEBUG::2016-12-22 09:49:56,305::API::607::vds::(migrationCreate) Returning backwards compatible migration error code


There are no clues in the VM's qemu log, the 'messages' file or the journald logs. VDSM should report the reason for the failure to create the VM.



Version-Release number of selected component (if applicable):

- RHEV 4.0.2
- RHEL 7.2 hosts;
  - vdsm-4.18.11-1.el7
  - libvirt-1.2.17-13.el7_2.5
  - qemu-kvm-rhev-2.3.0-31.el7_2.21


How reproducible:

Not reproducible.


Steps to Reproduce:
1.
2.
3.

Actual results:


Expected results:

A meaningful error.


Additional info:

A side-effect of this is that when putting a host into maintenance mode, if some migrations fail in this manner and they are all to the same host, that host can get placed into an 'ERROR' state.

Comment 15 Michal Skrivanek 2017-01-03 10:09:46 UTC
the problem is on the engine side where the incoming/outgoing limits are not sent

Comment 19 Israel Pinto 2017-02-07 13:11:38 UTC
Verify with:
Engine: Version 4.1.0.3-0.1.el7

Step:
migrate vm and check the incoming/outgoing limit is set.

Results:
Engine log: [maxOutgoingMigrations = 2, maxIncomingMigrations = 2]
2017-02-07 14:41:47,859+02 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.MigrateBrokerVDSCommand] (org.ovirt.thread.pool-6-thread-50) [3add7741-1207-4618-b2f7-d8e51cad6c6d] START, MigrateBrokerVDSCommand(HostName = host_mixed_1, MigrateVDSCommandParameters:{runAsync='true', hostId='b879d50c-67e0-479a-9300-a7b7b2f1339c', vmId='1b1217db-ff0a-4e98-b408-a4f71e8af776', srcHost='seal08.qa.lab.tlv.redhat.com', dstVdsId='30ef250f-3a7d-4ed5-a8ca-75834da14cdb', dstHost='seal09.qa.lab.tlv.redhat.com:54321', migrationMethod='ONLINE', tunnelMigration='false', migrationDowntime='0', autoConverge='true', migrateCompressed='false', consoleAddress='null', maxBandwidth='5000', enableGuestEvents='true', maxIncomingMigrations='2', maxOutgoingMigrations='2', convergenceSchedule='[init=[{name=setDowntime, params=[100]}], stalling=[{limit=1, action={name=setDowntime, params=[150]}}, {limit=2, action={name=setDowntime, params=[200]}}, {limit=3, action={name=setDowntime, params=[300]}}, {limit=4, action={name=setDowntime, params=[400]}}, {limit=6, action={name=setDowntime, params=[500]}}, {limit=-1, action={name=abort, params=[]}}]]'}), log id: 451aeef4

vdms log: [incomingLimit = 2 , outgoingLimit = 2]
2017-02-07 14:41:42,272 INFO  (periodic/2) [dispatcher] Run and protect: repoStats, Return response: {'06a1f8f4-81aa-49b0-b831-88284b222737': {'code': 0, 'actual': True, 'version': 0, 'acquired': True, 'delay': '0.000222759', 'lastCheck': '5.5', 'valid': True}, '62430d5b-8c57-4147-92dc-1ea907963ff9': {'code': 0, 'actual': True, 'version': 0, 'acquired': True, 'delay': '0.000242301', 'lastCheck': '0.9', 'valid': True}, 'fa65b176-cb84-41fd-8bf6-dae05dcc2595': {'code': 0, 'actual': True, 'version': 4, 'acquired': True, 'delay': '0.000257589', 'lastCheck': '0.9', 'valid': True}, 'bbb7e15c-7884-4d52-8bc4-d2de58e21a78': {'code': 0, 'actual': True, 'version': 4, 'acquired': True, 'delay': '0.000227469', 'lastCheck': '0.9', 'valid': True}} (logUtils:52)
2017-02-07 14:41:46,857 INFO  (jsonrpc/7) [vdsm.api] START migrate args=(<virt.vm.Vm object at 0x371fad0>, {'incomingLimit': 2, 'src': 'seal08.qa.lab.tlv.redhat.com', 'dstqemu': '10.35.117.90', 'autoConverge': 'true', 'tunneled': 'false', 'enableGuestEvents': True, 'dst': 'seal09.qa.lab.tlv.redhat.com:54321', 'convergenceSchedule': {'init': [{'params': ['100'], 'name': 'setDowntime'}], 'stalling': [{'action': {'params': ['150'], 'name': 'setDowntime'}, 'limit': 1}, {'action': {'params': ['200'], 'name': 'setDowntime'}, 'limit': 2}, {'action': {'params': ['300'], 'name': 'setDowntime'}, 'limit': 3}, {'action': {'params': ['400'], 'name': 'setDowntime'}, 'limit': 4}, {'action': {'params': ['500'], 'name': 'setDowntime'}, 'limit': 6}, {'action': {'params': [], 'name': 'abort'}, 'limit': -1}]}, 'vmId': '1b1217db-ff0a-4e98-b408-a4f71e8af776', 'abortOnError': 'true', 'outgoingLimit': 2, 'compressed': 'false', 'maxBandwidth': 5000, 'method': 'online', 'mode': 'remote'}) kwargs={} (api:37)