Bug 1703372 - Live Migration failed after update to ovirt-4.3
Summary: Live Migration failed after update to ovirt-4.3
Keywords:
Status: CLOSED INSUFFICIENT_DATA
Alias: None
Product: ovirt-engine
Classification: oVirt
Component: BLL.Virt
Version: ---
Hardware: Unspecified
OS: Unspecified
unspecified
unspecified
Target Milestone: ---
: ---
Assignee: Milan Zamazal
QA Contact: meital avital
URL:
Whiteboard:
Depends On:
Blocks:
TreeView+ depends on / blocked
 
Reported: 2019-04-26 09:20 UTC by Rubystar
Modified: 2023-09-14 05:27 UTC (History)
4 users (show)

Fixed In Version:
Clone Of:
Environment:
Last Closed: 2020-03-18 16:30:23 UTC
oVirt Team: Virt
Embargoed:
pm-rhel: ovirt-4.4+


Attachments (Terms of Use)

Description Rubystar 2019-04-26 09:20:16 UTC
Description of problem:
After upgrade 2 hosts to ovirt-4.3.3.6 live migration stopped working.
On non-updated hosts, migration works.
How can I know why my VM is not migrated? I analyzed the engine.log, but no results.

Version-Release number of selected component (if applicable):
ovirt-engine-4.3.3.6
vdsm-4.30.13-1.el7.x86_64

How reproducible:
100%

Steps to Reproduce:
1. Click "Migrate"
2.
3.

Actual results:
VM is migrated on other host in the cluster.

Expected results:
VM is still work on the same host.


Additional info:
Part of engine.log

2019-04-26 13:22:41,697+05 INFO  [org.ovirt.engine.core.bll.MigrateVmToServerCommand] (default task-83) [8fcb45b4-3dab-49f3-9cdf-e32552093249] Lock Acquired to object 'EngineLock:{exclusiveLocks='[0b525d1a-cbbd-442d-aa90-44b27b09b3e9=VM]', sharedLocks=''}'
2019-04-26 13:22:41,739+05 INFO  [org.ovirt.engine.core.bll.MigrateVmToServerCommand] (default task-83) [8fcb45b4-3dab-49f3-9cdf-e32552093249] Running command: MigrateVmToServerCommand internal: false. Entities affected :  ID: 0b525d1a-cbbd-442d-aa90-44b27b09b3e9 Type: VMAction group MIGRATE_VM with role type USER
2019-04-26 13:22:41,788+05 INFO  [org.ovirt.engine.core.vdsbroker.MigrateVDSCommand] (default task-83) [8fcb45b4-3dab-49f3-9cdf-e32552093249] START, MigrateVDSCommand( MigrateVDSCommandParameters:{hostId='fe0717b5-fe2e-43cd-9c57-a7cdfc62e3a1', vmId='0b525d1a-cbbd-442d-aa90-44b27b09b3e9', srcHost='vhs-ovirt07.my.org.com', dstVdsId='e884e167-94d6-4580-a6a1-488361ddb950', dstHost='vhs-ovirt08.my.org.com:54321', migrationMethod='ONLINE', tunnelMigration='false', migrationDowntime='0', autoConverge='true', migrateCompressed='false', consoleAddress='null', maxBandwidth='62', enableGuestEvents='true', maxIncomingMigrations='2', maxOutgoingMigrations='2', convergenceSchedule='[init=[{name=setDowntime, params=[100]}], stalling=[{limit=1, action={name=setDowntime, params=[150]}}, {limit=2, action={name=setDowntime, params=[200]}}, {limit=3, action={name=setDowntime, params=[300]}}, {limit=4, action={name=setDowntime, params=[400]}}, {limit=6, action={name=setDowntime, params=[500]}}, {limit=-1, action={name=abort, params=[]}}]]', dstQemu='172.20.27.33'}), log id: 775b8db0
2019-04-26 13:22:41,789+05 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.MigrateBrokerVDSCommand] (default task-83) [8fcb45b4-3dab-49f3-9cdf-e32552093249] START, MigrateBrokerVDSCommand(HostName = vhs-ovirt07, MigrateVDSCommandParameters:{hostId='fe0717b5-fe2e-43cd-9c57-a7cdfc62e3a1', vmId='0b525d1a-cbbd-442d-aa90-44b27b09b3e9', srcHost='vhs-ovirt07.my.org.com', dstVdsId='e884e167-94d6-4580-a6a1-488361ddb950', dstHost='vhs-ovirt08.my.org.com:54321', migrationMethod='ONLINE', tunnelMigration='false', migrationDowntime='0', autoConverge='true', migrateCompressed='false', consoleAddress='null', maxBandwidth='62', enableGuestEvents='true', maxIncomingMigrations='2', maxOutgoingMigrations='2', convergenceSchedule='[init=[{name=setDowntime, params=[100]}], stalling=[{limit=1, action={name=setDowntime, params=[150]}}, {limit=2, action={name=setDowntime, params=[200]}}, {limit=3, action={name=setDowntime, params=[300]}}, {limit=4, action={name=setDowntime, params=[400]}}, {limit=6, action={name=setDowntime, params=[500]}}, {limit=-1, action={name=abort, params=[]}}]]', dstQemu='172.20.27.33'}), log id: 59fbf8cb
2019-04-26 13:22:41,796+05 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.MigrateBrokerVDSCommand] (default task-83) [8fcb45b4-3dab-49f3-9cdf-e32552093249] FINISH, MigrateBrokerVDSCommand, return: , log id: 59fbf8cb
2019-04-26 13:22:41,803+05 INFO  [org.ovirt.engine.core.vdsbroker.MigrateVDSCommand] (default task-83) [8fcb45b4-3dab-49f3-9cdf-e32552093249] FINISH, MigrateVDSCommand, return: MigratingFrom, log id: 775b8db0
2019-04-26 13:22:41,821+05 INFO  [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (default task-83) [8fcb45b4-3dab-49f3-9cdf-e32552093249] EVENT_ID: VM_MIGRATION_START(62), Migration started (VM: us-ora-tst01, Source: vhs-ovirt07, Destination: vhs-ovirt08, User: admin@internal-authz). 
2019-04-26 13:22:41,828+05 INFO  [org.ovirt.engine.core.vdsbroker.monitoring.VmAnalyzer] (ForkJoinPool-1-worker-6) [] VM '0b525d1a-cbbd-442d-aa90-44b27b09b3e9'(us-ora-tst01) moved from 'MigratingFrom' --> 'Up'
2019-04-26 13:22:41,828+05 INFO  [org.ovirt.engine.core.vdsbroker.monitoring.VmAnalyzer] (ForkJoinPool-1-worker-6) [] Adding VM '0b525d1a-cbbd-442d-aa90-44b27b09b3e9'(us-ora-tst01) to re-run list
2019-04-26 13:22:41,833+05 ERROR [org.ovirt.engine.core.vdsbroker.monitoring.VmsMonitoring] (ForkJoinPool-1-worker-6) [] Rerun VM '0b525d1a-cbbd-442d-aa90-44b27b09b3e9'. Called from VDS 'vhs-ovirt07'
2019-04-26 13:22:41,836+05 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.MigrateStatusVDSCommand] (EE-ManagedThreadFactory-engine-Thread-1998) [] START, MigrateStatusVDSCommand(HostName = vhs-ovirt07, MigrateStatusVDSCommandParameters:{hostId='fe0717b5-fe2e-43cd-9c57-a7cdfc62e3a1', vmId='0b525d1a-cbbd-442d-aa90-44b27b09b3e9'}), log id: 4d96b471
2019-04-26 13:22:41,839+05 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.MigrateStatusVDSCommand] (EE-ManagedThreadFactory-engine-Thread-1998) [] FINISH, MigrateStatusVDSCommand, return: , log id: 4d96b471
2019-04-26 13:22:41,849+05 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (EE-ManagedThreadFactory-engine-Thread-1998) [] EVENT_ID: VM_MIGRATION_TO_SERVER_FAILED(120), Migration failed  (VM: us-ora-tst01, Source: vhs-ovirt07, Destination: vhs-ovirt08).
2019-04-26 13:22:41,853+05 INFO  [org.ovirt.engine.core.bll.MigrateVmToServerCommand] (EE-ManagedThreadFactory-engine-Thread-1998) [] Lock freed to object 'EngineLock:{exclusiveLocks='[0b525d1a-cbbd-442d-aa90-44b27b09b3e9=VM]', sharedLocks=''}'
2019-04-26 13:22:44,086+05 INFO  [org.ovirt.engine.core.sso.utils.AuthenticationUtils] (default task-81) [] User admin@internal successfully logged in with scopes: ovirt-app-api ovirt-ext=token-info:authz-search ovirt-ext=token-info:public-authz-search ovirt-ext=token-info:validate ovirt-ext=token:password-access
2019-04-26 13:22:44,104+05 INFO  [org.ovirt.engine.core.bll.aaa.CreateUserSessionCommand] (default task-83) [753cec10] Running command: CreateUserSessionCommand internal: false

Comment 1 Rubystar 2019-04-26 09:54:58 UTC
I tried to migrate another VM on this cluster and it got stuck. Run infinitely. The logs contain information only about the beginning of the migration.

Comment 2 Simone Tiraboschi 2019-05-08 09:35:02 UTC
Can you please attach vdsm.log for the relevant time frame?

Comment 3 Rubystar 2019-06-10 10:34:51 UTC
When I try to migrate VM I got an error:

2019-06-10 15:03:22,198+0500 INFO  (jsonrpc/7) [api.virt] FINISH migrate return={'status': {'message': 'Migration in progress', 'code': 0}, 'progress': 0} from=::ffff:172.20.27.28,44184, flow_id=e70a4ed5-06e8-4e72-aee6-b57ac926d332, vmId=6226b719-b0a2-42fd-9668-3e7324a784bb (api:54)
2019-06-10 15:03:22,198+0500 INFO  (jsonrpc/7) [jsonrpc.JsonRpcServer] RPC call VM.migrate succeeded in 0.01 seconds (__init__:312)
2019-06-10 15:03:22,217+0500 ERROR (migsrc/6226b719) [virt.vm] (vmId='6226b719-b0a2-42fd-9668-3e7324a784bb') [Errno -2] Name or service not known (migration:282)
2019-06-10 15:03:22,218+0500 ERROR (migsrc/6226b719) [virt.vm] (vmId='6226b719-b0a2-42fd-9668-3e7324a784bb') Failed to migrate (migration:450)
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/vdsm/virt/migration.py", line 402, in _regular_run
    self._setupVdsConnection()
  File "/usr/lib/python2.7/site-packages/vdsm/virt/migration.py", line 239, in _setupVdsConnection
    client = self._createClient(port)
  File "/usr/lib/python2.7/site-packages/vdsm/virt/migration.py", line 227, in _createClient
    client_socket = utils.create_connected_socket(host, int(port), sslctx)
  File "/usr/lib/python2.7/site-packages/vdsm/utils.py", line 435, in create_connected_socket
    socket.AF_UNSPEC, socket.SOCK_STREAM)
gaierror: [Errno -2] Name or service not known


Name or service not known????
is it DNS issue?
But my DNS work perfectly...
How do I understand what name or service is unknown?

Comment 4 Rubystar 2019-06-10 10:55:02 UTC
More logs engine.log (other VM):

2019-06-10 15:52:23,132+05 INFO  [org.ovirt.engine.core.bll.MigrateVmCommand] (default task-108) [9c4562e0-d860-4348-9839-1510f831dfa0] Lock Acquired to object 'EngineLock:{exclusiveLocks='[2f09264e-25bb-4ccc-ae70-4ef6e661e57b=VM]', sharedLocks=''}'
2019-06-10 15:52:23,166+05 INFO  [org.ovirt.engine.core.bll.MigrateVmCommand] (default task-108) [9c4562e0-d860-4348-9839-1510f831dfa0] Running command: MigrateVmCommand internal: false. Entities affected :  ID: 2f09264e-25bb-4ccc-ae70-4ef6e661e57b Type: VMAction group MIGRATE_VM with role type USER
2019-06-10 15:52:23,208+05 INFO  [org.ovirt.engine.core.vdsbroker.MigrateVDSCommand] (default task-108) [9c4562e0-d860-4348-9839-1510f831dfa0] START, MigrateVDSCommand( MigrateVDSCommandParameters:{hostId='e7b853f1-916e-4464-acbc-0624dccd150f', vmId='2f09264e-25bb-4ccc-ae70-4ef6e661e57b', srcHost='us-vhs-ovirt07.us.metinvest.com', dstVdsId='240f5e0c-0e87-4187-8ea1-df02885c3ad4', dstHost='us-vhs-ovirt08.us.metinvest.com:54321', migrationMethod='ONLINE', tunnelMigration='false', migrationDowntime='0', autoConverge='true', migrateCompressed='false', consoleAddress='null', maxBandwidth='62', enableGuestEvents='true', maxIncomingMigrations='2', maxOutgoingMigrations='2', convergenceSchedule='[init=[{name=setDowntime, params=[100]}], stalling=[{limit=1, action={name=setDowntime, params=[150]}}, {limit=2, action={name=setDowntime, params=[200]}}, {limit=3, action={name=setDowntime, params=[300]}}, {limit=4, action={name=setDowntime, params=[400]}}, {limit=6, action={name=setDowntime, params=[500]}}, {limit=-1, action={name=abort, params=[]}}]]', dstQemu='172.20.27.33'}), log id: 7001b351
2019-06-10 15:52:23,209+05 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.MigrateBrokerVDSCommand] (default task-108) [9c4562e0-d860-4348-9839-1510f831dfa0] START, MigrateBrokerVDSCommand(HostName = us-vhs-ovirt07, MigrateVDSCommandParameters:{hostId='e7b853f1-916e-4464-acbc-0624dccd150f', vmId='2f09264e-25bb-4ccc-ae70-4ef6e661e57b', srcHost='us-vhs-ovirt07.us.metinvest.com', dstVdsId='240f5e0c-0e87-4187-8ea1-df02885c3ad4', dstHost='us-vhs-ovirt08.us.metinvest.com:54321', migrationMethod='ONLINE', tunnelMigration='false', migrationDowntime='0', autoConverge='true', migrateCompressed='false', consoleAddress='null', maxBandwidth='62', enableGuestEvents='true', maxIncomingMigrations='2', maxOutgoingMigrations='2', convergenceSchedule='[init=[{name=setDowntime, params=[100]}], stalling=[{limit=1, action={name=setDowntime, params=[150]}}, {limit=2, action={name=setDowntime, params=[200]}}, {limit=3, action={name=setDowntime, params=[300]}}, {limit=4, action={name=setDowntime, params=[400]}}, {limit=6, action={name=setDowntime, params=[500]}}, {limit=-1, action={name=abort, params=[]}}]]', dstQemu='172.20.27.33'}), log id: 515c9bbe
2019-06-10 15:52:23,216+05 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.MigrateBrokerVDSCommand] (default task-108) [9c4562e0-d860-4348-9839-1510f831dfa0] FINISH, MigrateBrokerVDSCommand, return: , log id: 515c9bbe
2019-06-10 15:52:23,218+05 INFO  [org.ovirt.engine.core.vdsbroker.MigrateVDSCommand] (default task-108) [9c4562e0-d860-4348-9839-1510f831dfa0] FINISH, MigrateVDSCommand, return: MigratingFrom, log id: 7001b351
2019-06-10 15:52:23,226+05 INFO  [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (default task-108) [9c4562e0-d860-4348-9839-1510f831dfa0] EVENT_ID: VM_MIGRATION_START(62), Migration started (VM: us-nag-mon01, Source: us-vhs-ovirt07, Destination: us-vhs-ovirt08, User: d.filov@internal-authz). 
2019-06-10 15:52:28,443+05 INFO  [org.ovirt.engine.core.utils.servlet.CORSSupportFilter] (default task-108) [] CORS support is enabled for origins "*".
2019-06-10 15:52:37,689+05 INFO  [org.ovirt.engine.core.vdsbroker.monitoring.VmAnalyzer] (EE-ManagedThreadFactory-engineScheduled-Thread-100) [] VM '2f09264e-25bb-4ccc-ae70-4ef6e661e57b'(us-nag-mon01) moved from 'MigratingFrom' --> 'Up'
2019-06-10 15:52:37,689+05 INFO  [org.ovirt.engine.core.vdsbroker.monitoring.VmAnalyzer] (EE-ManagedThreadFactory-engineScheduled-Thread-100) [] Adding VM '2f09264e-25bb-4ccc-ae70-4ef6e661e57b'(us-nag-mon01) to re-run list
2019-06-10 15:52:37,698+05 INFO  [org.ovirt.engine.core.bll.provider.network.SyncNetworkProviderCommand] (EE-ManagedThreadFactory-engineScheduled-Thread-7) [43e87be3] Lock Acquired to object 'EngineLock:{exclusiveLocks='[76ff1170-c933-46ae-a6b8-8041b40bc1d4=PROVIDER]', sharedLocks=''}'
2019-06-10 15:52:37,698+05 ERROR [org.ovirt.engine.core.vdsbroker.monitoring.VmsMonitoring] (EE-ManagedThreadFactory-engineScheduled-Thread-100) [] Rerun VM '2f09264e-25bb-4ccc-ae70-4ef6e661e57b'. Called from VDS 'us-vhs-ovirt07'
2019-06-10 15:52:37,707+05 INFO  [org.ovirt.engine.core.bll.provider.network.SyncNetworkProviderCommand] (EE-ManagedThreadFactory-engineScheduled-Thread-7) [43e87be3] Running command: SyncNetworkProviderCommand internal: true.
2019-06-10 15:52:37,739+05 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.MigrateStatusVDSCommand] (EE-ManagedThreadFactory-engine-Thread-5698) [] START, MigrateStatusVDSCommand(HostName = us-vhs-ovirt07, MigrateStatusVDSCommandParameters:{hostId='e7b853f1-916e-4464-acbc-0624dccd150f', vmId='2f09264e-25bb-4ccc-ae70-4ef6e661e57b'}), log id: 6218858b
2019-06-10 15:52:37,743+05 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.MigrateStatusVDSCommand] (EE-ManagedThreadFactory-engine-Thread-5698) [] FINISH, MigrateStatusVDSCommand, return: , log id: 6218858b
2019-06-10 15:52:37,750+05 WARN  [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (EE-ManagedThreadFactory-engine-Thread-5698) [] EVENT_ID: VM_MIGRATION_TRYING_RERUN(128), Failed to migrate VM us-nag-mon01 to Host us-vhs-ovirt08 . Trying to migrate to another Host.
2019-06-10 15:52:37,798+05 WARN  [org.ovirt.engine.core.bll.MigrateVmCommand] (EE-ManagedThreadFactory-engine-Thread-5698) [] Validation of action 'MigrateVm' failed for user d.filov@internal-authz. Reasons: VAR__ACTION__MIGRATE,VAR__TYPE__VM,VAR__ACTION__MIGRATE,VAR__TYPE__VM,SCHEDULING_NO_HOSTS
2019-06-10 15:52:37,800+05 INFO  [org.ovirt.engine.core.bll.MigrateVmCommand] (EE-ManagedThreadFactory-engine-Thread-5698) [] Lock freed to object 'EngineLock:{exclusiveLocks='[2f09264e-25bb-4ccc-ae70-4ef6e661e57b=VM]', sharedLocks=''}'
2019-06-10 15:52:37,806+05 WARN  [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (EE-ManagedThreadFactory-engine-Thread-5698) [] EVENT_ID: VM_MIGRATION_NO_VDS_TO_MIGRATE_TO(166), No available host was found to migrate VM us-nag-mon01 to.
2019-06-10 15:52:37,808+05 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (EE-ManagedThreadFactory-engine-Thread-5698) [] EVENT_ID: VM_MIGRATION_FAILED(65), Migration failed  (VM: us-nag-mon01, Source: us-vhs-ovirt07).
2019-06-10 15:52:37,817+05 INFO  [org.ovirt.engine.core.sso.utils.AuthenticationUtils] (default task-108) [] User admin@internal successfully logged in with scopes: ovirt-app-api ovirt-ext=token-info:authz-search ovirt-ext=token-info:public-authz-search ovirt-ext=token-info:validate ovirt-ext=token:password-access
2019-06-10 15:52:37,947+05 INFO  [org.ovirt.engine.core.bll.provider.network.SyncNetworkProviderCommand] (EE-ManagedThreadFactory-engineScheduled-Thread-7) [43e87be3] Lock freed to object 'EngineLock:{exclusiveLocks='[76ff1170-c933-46ae-a6b8-8041b40bc1d4=PROVIDER]', sharedLocks=''}'

Comment 5 Ryan Barry 2019-06-10 12:02:35 UTC
Are you sure the source host can resolve the name of the destination host?

Comment 6 Rubystar 2019-06-11 03:41:31 UTC
Yes, I can resolve hosts.

When I try to migrate VM I got an error on ovirt-engine messages:

Jun 11 08:36:33 ovirt-mgmt01.example.com python[4702]: ::ffff:172.20.27.28 - - [11/Jun/2019 08:36:33] "POST /v2.0//tokens HTTP/1.1" 200 -
Jun 11 08:36:33 ovirt-mgmt01.example.com ovsdb-server[4625]: ovs|00011|stream_ssl|WARN|SSL_read: unexpected SSL connection close
Jun 11 08:36:33 ovirt-mgmt01.example.com ovsdb-server[4625]: ovs|00012|jsonrpc|WARN|ssl:[::ffff:127.0.0.1]:54728: receive error: Protocol error
Jun 11 08:36:33 ovirt-mgmt01.example.com ovsdb-server[4625]: ovs|00013|reconnect|WARN|ssl:[::ffff:127.0.0.1]:54728: connection dropped (Protocol error)
Jun 11 08:36:33 ovirt-mgmt01.example.com python[4702]: ::ffff:172.20.27.28 - - [11/Jun/2019 08:36:33] "GET /v2.0/networks HTTP/1.1" 200 -

Comment 7 Rubystar 2019-06-11 04:10:03 UTC
The logs from console:

Jun 11, 2019, 9:07:57 AM Migration failed (VM: testvm01, Source: vhs-ovirt07). 65 vhs-ovirt07 testvm01
Jun 11, 2019, 9:07:57 AM No available host was found to migrate VM testvm01 to. 166 vhs-ovirt07 testvm01
Jun 11, 2019, 9:07:57 AM Failed to migrate VM testvm01 to Host vhs-ovirt08 . Trying to migrate to another Host. 128 vhs-ovirt07 testvm01
Jun 11, 2019, 9:07:57 AM Migration started (VM: testvm01, Source: vhs-ovirt07, Destination: vhs-ovirt08, User: xxx@internal-authz).

Comment 8 Milan Zamazal 2019-07-18 15:30:28 UTC
"Name or service not known" is definitely a problem, "unexpected SSL connection close" may or may not be. Anyway, it looks like a network problem.

Do you use IPv6, or perhaps a mixed IPv6 and IPv4 networking? Can it be an IPv6 issue? A mixed IPv6 and IPv4 environment is not supported in oVirt.

When creating the cluster, did you use legacy Linux Bridge (the default) or OVS (tech preview) as the switch type?

If the problem still persists, I'd suggest looking into vdsm.log on the source host. Is the error always the same on repeated migration attempts? You should be able to see the requested host name earlier in vdsm.log, when starting the migration, as the destination server. Take the name from the log and try to ping it from the source host -- does it resolve this way?

Comment 9 Red Hat Bugzilla 2023-09-14 05:27:38 UTC
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 1000 days


Note You need to log in before you can comment on or make changes to this bug.