Description of problem: After upgrade 2 hosts to ovirt-4.3.3.6 live migration stopped working. On non-updated hosts, migration works. How can I know why my VM is not migrated? I analyzed the engine.log, but no results. Version-Release number of selected component (if applicable): ovirt-engine-4.3.3.6 vdsm-4.30.13-1.el7.x86_64 How reproducible: 100% Steps to Reproduce: 1. Click "Migrate" 2. 3. Actual results: VM is migrated on other host in the cluster. Expected results: VM is still work on the same host. Additional info: Part of engine.log 2019-04-26 13:22:41,697+05 INFO [org.ovirt.engine.core.bll.MigrateVmToServerCommand] (default task-83) [8fcb45b4-3dab-49f3-9cdf-e32552093249] Lock Acquired to object 'EngineLock:{exclusiveLocks='[0b525d1a-cbbd-442d-aa90-44b27b09b3e9=VM]', sharedLocks=''}' 2019-04-26 13:22:41,739+05 INFO [org.ovirt.engine.core.bll.MigrateVmToServerCommand] (default task-83) [8fcb45b4-3dab-49f3-9cdf-e32552093249] Running command: MigrateVmToServerCommand internal: false. Entities affected : ID: 0b525d1a-cbbd-442d-aa90-44b27b09b3e9 Type: VMAction group MIGRATE_VM with role type USER 2019-04-26 13:22:41,788+05 INFO [org.ovirt.engine.core.vdsbroker.MigrateVDSCommand] (default task-83) [8fcb45b4-3dab-49f3-9cdf-e32552093249] START, MigrateVDSCommand( MigrateVDSCommandParameters:{hostId='fe0717b5-fe2e-43cd-9c57-a7cdfc62e3a1', vmId='0b525d1a-cbbd-442d-aa90-44b27b09b3e9', srcHost='vhs-ovirt07.my.org.com', dstVdsId='e884e167-94d6-4580-a6a1-488361ddb950', dstHost='vhs-ovirt08.my.org.com:54321', migrationMethod='ONLINE', tunnelMigration='false', migrationDowntime='0', autoConverge='true', migrateCompressed='false', consoleAddress='null', maxBandwidth='62', enableGuestEvents='true', maxIncomingMigrations='2', maxOutgoingMigrations='2', convergenceSchedule='[init=[{name=setDowntime, params=[100]}], stalling=[{limit=1, action={name=setDowntime, params=[150]}}, {limit=2, action={name=setDowntime, params=[200]}}, {limit=3, action={name=setDowntime, params=[300]}}, {limit=4, action={name=setDowntime, params=[400]}}, {limit=6, action={name=setDowntime, params=[500]}}, {limit=-1, action={name=abort, params=[]}}]]', dstQemu='172.20.27.33'}), log id: 775b8db0 2019-04-26 13:22:41,789+05 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.MigrateBrokerVDSCommand] (default task-83) [8fcb45b4-3dab-49f3-9cdf-e32552093249] START, MigrateBrokerVDSCommand(HostName = vhs-ovirt07, MigrateVDSCommandParameters:{hostId='fe0717b5-fe2e-43cd-9c57-a7cdfc62e3a1', vmId='0b525d1a-cbbd-442d-aa90-44b27b09b3e9', srcHost='vhs-ovirt07.my.org.com', dstVdsId='e884e167-94d6-4580-a6a1-488361ddb950', dstHost='vhs-ovirt08.my.org.com:54321', migrationMethod='ONLINE', tunnelMigration='false', migrationDowntime='0', autoConverge='true', migrateCompressed='false', consoleAddress='null', maxBandwidth='62', enableGuestEvents='true', maxIncomingMigrations='2', maxOutgoingMigrations='2', convergenceSchedule='[init=[{name=setDowntime, params=[100]}], stalling=[{limit=1, action={name=setDowntime, params=[150]}}, {limit=2, action={name=setDowntime, params=[200]}}, {limit=3, action={name=setDowntime, params=[300]}}, {limit=4, action={name=setDowntime, params=[400]}}, {limit=6, action={name=setDowntime, params=[500]}}, {limit=-1, action={name=abort, params=[]}}]]', dstQemu='172.20.27.33'}), log id: 59fbf8cb 2019-04-26 13:22:41,796+05 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.MigrateBrokerVDSCommand] (default task-83) [8fcb45b4-3dab-49f3-9cdf-e32552093249] FINISH, MigrateBrokerVDSCommand, return: , log id: 59fbf8cb 2019-04-26 13:22:41,803+05 INFO [org.ovirt.engine.core.vdsbroker.MigrateVDSCommand] (default task-83) [8fcb45b4-3dab-49f3-9cdf-e32552093249] FINISH, MigrateVDSCommand, return: MigratingFrom, log id: 775b8db0 2019-04-26 13:22:41,821+05 INFO [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (default task-83) [8fcb45b4-3dab-49f3-9cdf-e32552093249] EVENT_ID: VM_MIGRATION_START(62), Migration started (VM: us-ora-tst01, Source: vhs-ovirt07, Destination: vhs-ovirt08, User: admin@internal-authz). 2019-04-26 13:22:41,828+05 INFO [org.ovirt.engine.core.vdsbroker.monitoring.VmAnalyzer] (ForkJoinPool-1-worker-6) [] VM '0b525d1a-cbbd-442d-aa90-44b27b09b3e9'(us-ora-tst01) moved from 'MigratingFrom' --> 'Up' 2019-04-26 13:22:41,828+05 INFO [org.ovirt.engine.core.vdsbroker.monitoring.VmAnalyzer] (ForkJoinPool-1-worker-6) [] Adding VM '0b525d1a-cbbd-442d-aa90-44b27b09b3e9'(us-ora-tst01) to re-run list 2019-04-26 13:22:41,833+05 ERROR [org.ovirt.engine.core.vdsbroker.monitoring.VmsMonitoring] (ForkJoinPool-1-worker-6) [] Rerun VM '0b525d1a-cbbd-442d-aa90-44b27b09b3e9'. Called from VDS 'vhs-ovirt07' 2019-04-26 13:22:41,836+05 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.MigrateStatusVDSCommand] (EE-ManagedThreadFactory-engine-Thread-1998) [] START, MigrateStatusVDSCommand(HostName = vhs-ovirt07, MigrateStatusVDSCommandParameters:{hostId='fe0717b5-fe2e-43cd-9c57-a7cdfc62e3a1', vmId='0b525d1a-cbbd-442d-aa90-44b27b09b3e9'}), log id: 4d96b471 2019-04-26 13:22:41,839+05 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.MigrateStatusVDSCommand] (EE-ManagedThreadFactory-engine-Thread-1998) [] FINISH, MigrateStatusVDSCommand, return: , log id: 4d96b471 2019-04-26 13:22:41,849+05 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (EE-ManagedThreadFactory-engine-Thread-1998) [] EVENT_ID: VM_MIGRATION_TO_SERVER_FAILED(120), Migration failed (VM: us-ora-tst01, Source: vhs-ovirt07, Destination: vhs-ovirt08). 2019-04-26 13:22:41,853+05 INFO [org.ovirt.engine.core.bll.MigrateVmToServerCommand] (EE-ManagedThreadFactory-engine-Thread-1998) [] Lock freed to object 'EngineLock:{exclusiveLocks='[0b525d1a-cbbd-442d-aa90-44b27b09b3e9=VM]', sharedLocks=''}' 2019-04-26 13:22:44,086+05 INFO [org.ovirt.engine.core.sso.utils.AuthenticationUtils] (default task-81) [] User admin@internal successfully logged in with scopes: ovirt-app-api ovirt-ext=token-info:authz-search ovirt-ext=token-info:public-authz-search ovirt-ext=token-info:validate ovirt-ext=token:password-access 2019-04-26 13:22:44,104+05 INFO [org.ovirt.engine.core.bll.aaa.CreateUserSessionCommand] (default task-83) [753cec10] Running command: CreateUserSessionCommand internal: false
I tried to migrate another VM on this cluster and it got stuck. Run infinitely. The logs contain information only about the beginning of the migration.
Can you please attach vdsm.log for the relevant time frame?
When I try to migrate VM I got an error: 2019-06-10 15:03:22,198+0500 INFO (jsonrpc/7) [api.virt] FINISH migrate return={'status': {'message': 'Migration in progress', 'code': 0}, 'progress': 0} from=::ffff:172.20.27.28,44184, flow_id=e70a4ed5-06e8-4e72-aee6-b57ac926d332, vmId=6226b719-b0a2-42fd-9668-3e7324a784bb (api:54) 2019-06-10 15:03:22,198+0500 INFO (jsonrpc/7) [jsonrpc.JsonRpcServer] RPC call VM.migrate succeeded in 0.01 seconds (__init__:312) 2019-06-10 15:03:22,217+0500 ERROR (migsrc/6226b719) [virt.vm] (vmId='6226b719-b0a2-42fd-9668-3e7324a784bb') [Errno -2] Name or service not known (migration:282) 2019-06-10 15:03:22,218+0500 ERROR (migsrc/6226b719) [virt.vm] (vmId='6226b719-b0a2-42fd-9668-3e7324a784bb') Failed to migrate (migration:450) Traceback (most recent call last): File "/usr/lib/python2.7/site-packages/vdsm/virt/migration.py", line 402, in _regular_run self._setupVdsConnection() File "/usr/lib/python2.7/site-packages/vdsm/virt/migration.py", line 239, in _setupVdsConnection client = self._createClient(port) File "/usr/lib/python2.7/site-packages/vdsm/virt/migration.py", line 227, in _createClient client_socket = utils.create_connected_socket(host, int(port), sslctx) File "/usr/lib/python2.7/site-packages/vdsm/utils.py", line 435, in create_connected_socket socket.AF_UNSPEC, socket.SOCK_STREAM) gaierror: [Errno -2] Name or service not known Name or service not known???? is it DNS issue? But my DNS work perfectly... How do I understand what name or service is unknown?
More logs engine.log (other VM): 2019-06-10 15:52:23,132+05 INFO [org.ovirt.engine.core.bll.MigrateVmCommand] (default task-108) [9c4562e0-d860-4348-9839-1510f831dfa0] Lock Acquired to object 'EngineLock:{exclusiveLocks='[2f09264e-25bb-4ccc-ae70-4ef6e661e57b=VM]', sharedLocks=''}' 2019-06-10 15:52:23,166+05 INFO [org.ovirt.engine.core.bll.MigrateVmCommand] (default task-108) [9c4562e0-d860-4348-9839-1510f831dfa0] Running command: MigrateVmCommand internal: false. Entities affected : ID: 2f09264e-25bb-4ccc-ae70-4ef6e661e57b Type: VMAction group MIGRATE_VM with role type USER 2019-06-10 15:52:23,208+05 INFO [org.ovirt.engine.core.vdsbroker.MigrateVDSCommand] (default task-108) [9c4562e0-d860-4348-9839-1510f831dfa0] START, MigrateVDSCommand( MigrateVDSCommandParameters:{hostId='e7b853f1-916e-4464-acbc-0624dccd150f', vmId='2f09264e-25bb-4ccc-ae70-4ef6e661e57b', srcHost='us-vhs-ovirt07.us.metinvest.com', dstVdsId='240f5e0c-0e87-4187-8ea1-df02885c3ad4', dstHost='us-vhs-ovirt08.us.metinvest.com:54321', migrationMethod='ONLINE', tunnelMigration='false', migrationDowntime='0', autoConverge='true', migrateCompressed='false', consoleAddress='null', maxBandwidth='62', enableGuestEvents='true', maxIncomingMigrations='2', maxOutgoingMigrations='2', convergenceSchedule='[init=[{name=setDowntime, params=[100]}], stalling=[{limit=1, action={name=setDowntime, params=[150]}}, {limit=2, action={name=setDowntime, params=[200]}}, {limit=3, action={name=setDowntime, params=[300]}}, {limit=4, action={name=setDowntime, params=[400]}}, {limit=6, action={name=setDowntime, params=[500]}}, {limit=-1, action={name=abort, params=[]}}]]', dstQemu='172.20.27.33'}), log id: 7001b351 2019-06-10 15:52:23,209+05 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.MigrateBrokerVDSCommand] (default task-108) [9c4562e0-d860-4348-9839-1510f831dfa0] START, MigrateBrokerVDSCommand(HostName = us-vhs-ovirt07, MigrateVDSCommandParameters:{hostId='e7b853f1-916e-4464-acbc-0624dccd150f', vmId='2f09264e-25bb-4ccc-ae70-4ef6e661e57b', srcHost='us-vhs-ovirt07.us.metinvest.com', dstVdsId='240f5e0c-0e87-4187-8ea1-df02885c3ad4', dstHost='us-vhs-ovirt08.us.metinvest.com:54321', migrationMethod='ONLINE', tunnelMigration='false', migrationDowntime='0', autoConverge='true', migrateCompressed='false', consoleAddress='null', maxBandwidth='62', enableGuestEvents='true', maxIncomingMigrations='2', maxOutgoingMigrations='2', convergenceSchedule='[init=[{name=setDowntime, params=[100]}], stalling=[{limit=1, action={name=setDowntime, params=[150]}}, {limit=2, action={name=setDowntime, params=[200]}}, {limit=3, action={name=setDowntime, params=[300]}}, {limit=4, action={name=setDowntime, params=[400]}}, {limit=6, action={name=setDowntime, params=[500]}}, {limit=-1, action={name=abort, params=[]}}]]', dstQemu='172.20.27.33'}), log id: 515c9bbe 2019-06-10 15:52:23,216+05 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.MigrateBrokerVDSCommand] (default task-108) [9c4562e0-d860-4348-9839-1510f831dfa0] FINISH, MigrateBrokerVDSCommand, return: , log id: 515c9bbe 2019-06-10 15:52:23,218+05 INFO [org.ovirt.engine.core.vdsbroker.MigrateVDSCommand] (default task-108) [9c4562e0-d860-4348-9839-1510f831dfa0] FINISH, MigrateVDSCommand, return: MigratingFrom, log id: 7001b351 2019-06-10 15:52:23,226+05 INFO [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (default task-108) [9c4562e0-d860-4348-9839-1510f831dfa0] EVENT_ID: VM_MIGRATION_START(62), Migration started (VM: us-nag-mon01, Source: us-vhs-ovirt07, Destination: us-vhs-ovirt08, User: d.filov@internal-authz). 2019-06-10 15:52:28,443+05 INFO [org.ovirt.engine.core.utils.servlet.CORSSupportFilter] (default task-108) [] CORS support is enabled for origins "*". 2019-06-10 15:52:37,689+05 INFO [org.ovirt.engine.core.vdsbroker.monitoring.VmAnalyzer] (EE-ManagedThreadFactory-engineScheduled-Thread-100) [] VM '2f09264e-25bb-4ccc-ae70-4ef6e661e57b'(us-nag-mon01) moved from 'MigratingFrom' --> 'Up' 2019-06-10 15:52:37,689+05 INFO [org.ovirt.engine.core.vdsbroker.monitoring.VmAnalyzer] (EE-ManagedThreadFactory-engineScheduled-Thread-100) [] Adding VM '2f09264e-25bb-4ccc-ae70-4ef6e661e57b'(us-nag-mon01) to re-run list 2019-06-10 15:52:37,698+05 INFO [org.ovirt.engine.core.bll.provider.network.SyncNetworkProviderCommand] (EE-ManagedThreadFactory-engineScheduled-Thread-7) [43e87be3] Lock Acquired to object 'EngineLock:{exclusiveLocks='[76ff1170-c933-46ae-a6b8-8041b40bc1d4=PROVIDER]', sharedLocks=''}' 2019-06-10 15:52:37,698+05 ERROR [org.ovirt.engine.core.vdsbroker.monitoring.VmsMonitoring] (EE-ManagedThreadFactory-engineScheduled-Thread-100) [] Rerun VM '2f09264e-25bb-4ccc-ae70-4ef6e661e57b'. Called from VDS 'us-vhs-ovirt07' 2019-06-10 15:52:37,707+05 INFO [org.ovirt.engine.core.bll.provider.network.SyncNetworkProviderCommand] (EE-ManagedThreadFactory-engineScheduled-Thread-7) [43e87be3] Running command: SyncNetworkProviderCommand internal: true. 2019-06-10 15:52:37,739+05 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.MigrateStatusVDSCommand] (EE-ManagedThreadFactory-engine-Thread-5698) [] START, MigrateStatusVDSCommand(HostName = us-vhs-ovirt07, MigrateStatusVDSCommandParameters:{hostId='e7b853f1-916e-4464-acbc-0624dccd150f', vmId='2f09264e-25bb-4ccc-ae70-4ef6e661e57b'}), log id: 6218858b 2019-06-10 15:52:37,743+05 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.MigrateStatusVDSCommand] (EE-ManagedThreadFactory-engine-Thread-5698) [] FINISH, MigrateStatusVDSCommand, return: , log id: 6218858b 2019-06-10 15:52:37,750+05 WARN [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (EE-ManagedThreadFactory-engine-Thread-5698) [] EVENT_ID: VM_MIGRATION_TRYING_RERUN(128), Failed to migrate VM us-nag-mon01 to Host us-vhs-ovirt08 . Trying to migrate to another Host. 2019-06-10 15:52:37,798+05 WARN [org.ovirt.engine.core.bll.MigrateVmCommand] (EE-ManagedThreadFactory-engine-Thread-5698) [] Validation of action 'MigrateVm' failed for user d.filov@internal-authz. Reasons: VAR__ACTION__MIGRATE,VAR__TYPE__VM,VAR__ACTION__MIGRATE,VAR__TYPE__VM,SCHEDULING_NO_HOSTS 2019-06-10 15:52:37,800+05 INFO [org.ovirt.engine.core.bll.MigrateVmCommand] (EE-ManagedThreadFactory-engine-Thread-5698) [] Lock freed to object 'EngineLock:{exclusiveLocks='[2f09264e-25bb-4ccc-ae70-4ef6e661e57b=VM]', sharedLocks=''}' 2019-06-10 15:52:37,806+05 WARN [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (EE-ManagedThreadFactory-engine-Thread-5698) [] EVENT_ID: VM_MIGRATION_NO_VDS_TO_MIGRATE_TO(166), No available host was found to migrate VM us-nag-mon01 to. 2019-06-10 15:52:37,808+05 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (EE-ManagedThreadFactory-engine-Thread-5698) [] EVENT_ID: VM_MIGRATION_FAILED(65), Migration failed (VM: us-nag-mon01, Source: us-vhs-ovirt07). 2019-06-10 15:52:37,817+05 INFO [org.ovirt.engine.core.sso.utils.AuthenticationUtils] (default task-108) [] User admin@internal successfully logged in with scopes: ovirt-app-api ovirt-ext=token-info:authz-search ovirt-ext=token-info:public-authz-search ovirt-ext=token-info:validate ovirt-ext=token:password-access 2019-06-10 15:52:37,947+05 INFO [org.ovirt.engine.core.bll.provider.network.SyncNetworkProviderCommand] (EE-ManagedThreadFactory-engineScheduled-Thread-7) [43e87be3] Lock freed to object 'EngineLock:{exclusiveLocks='[76ff1170-c933-46ae-a6b8-8041b40bc1d4=PROVIDER]', sharedLocks=''}'
Are you sure the source host can resolve the name of the destination host?
Yes, I can resolve hosts. When I try to migrate VM I got an error on ovirt-engine messages: Jun 11 08:36:33 ovirt-mgmt01.example.com python[4702]: ::ffff:172.20.27.28 - - [11/Jun/2019 08:36:33] "POST /v2.0//tokens HTTP/1.1" 200 - Jun 11 08:36:33 ovirt-mgmt01.example.com ovsdb-server[4625]: ovs|00011|stream_ssl|WARN|SSL_read: unexpected SSL connection close Jun 11 08:36:33 ovirt-mgmt01.example.com ovsdb-server[4625]: ovs|00012|jsonrpc|WARN|ssl:[::ffff:127.0.0.1]:54728: receive error: Protocol error Jun 11 08:36:33 ovirt-mgmt01.example.com ovsdb-server[4625]: ovs|00013|reconnect|WARN|ssl:[::ffff:127.0.0.1]:54728: connection dropped (Protocol error) Jun 11 08:36:33 ovirt-mgmt01.example.com python[4702]: ::ffff:172.20.27.28 - - [11/Jun/2019 08:36:33] "GET /v2.0/networks HTTP/1.1" 200 -
The logs from console: Jun 11, 2019, 9:07:57 AM Migration failed (VM: testvm01, Source: vhs-ovirt07). 65 vhs-ovirt07 testvm01 Jun 11, 2019, 9:07:57 AM No available host was found to migrate VM testvm01 to. 166 vhs-ovirt07 testvm01 Jun 11, 2019, 9:07:57 AM Failed to migrate VM testvm01 to Host vhs-ovirt08 . Trying to migrate to another Host. 128 vhs-ovirt07 testvm01 Jun 11, 2019, 9:07:57 AM Migration started (VM: testvm01, Source: vhs-ovirt07, Destination: vhs-ovirt08, User: xxx@internal-authz).
"Name or service not known" is definitely a problem, "unexpected SSL connection close" may or may not be. Anyway, it looks like a network problem. Do you use IPv6, or perhaps a mixed IPv6 and IPv4 networking? Can it be an IPv6 issue? A mixed IPv6 and IPv4 environment is not supported in oVirt. When creating the cluster, did you use legacy Linux Bridge (the default) or OVS (tech preview) as the switch type? If the problem still persists, I'd suggest looking into vdsm.log on the source host. Is the error always the same on repeated migration attempts? You should be able to see the requested host name earlier in vdsm.log, when starting the migration, as the destination server. Take the name from the log and try to ping it from the source host -- does it resolve this way?
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 1000 days