Description of problem:
Live migration between compute nodes fails with "Failed to connect to remote libvirt URI ... Host key verification failed."

How reproducible:
Always

Steps to Reproduce:
1. Deploy OSP16 with at least 1 controller and 2 computes.
2. Live migrate a VM from one compute to another.

Actual results:
~~~
2019-10-17 05:18:43.473 7 DEBUG nova.virt.libvirt.driver [-] [instance: bc2268e1-b936-4ab1-9e1a-5aed02958a0a] About to invoke the migrate API _live_migration_operation /usr/lib/python3.6/site-packages/nova/virt/libvirt/driver.py:8550
2019-10-17 05:18:43.515 7 ERROR nova.virt.libvirt.driver [-] [instance: bc2268e1-b936-4ab1-9e1a-5aed02958a0a] Live Migration failure: operation failed: Failed to connect to remote libvirt URI qemu+ssh://nova_migration.redhat.local:2022/system?keyfile=/etc/nova/migration/identity: Cannot recv data: Host key verification failed.: Connection reset by peer: libvirt.libvirtError: operation failed: Failed to connect to remote libvirt URI qemu+ssh://nova_migration.redhat.local:2022/system?keyfile=/etc/nova/migration/identity: Cannot recv data: Host key verification failed.: Connection reset by peer
2019-10-17 05:18:43.515 7 DEBUG nova.virt.libvirt.driver [-] [instance: bc2268e1-b936-4ab1-9e1a-5aed02958a0a] Migration operation thread notification thread_finished /usr/lib/python3.6/site-packages/nova/virt/libvirt/driver.py:8906
2019-10-17 05:18:43.592 7 DEBUG oslo_concurrency.lockutils [req-3d305166-216f-4003-b698-a8d74605a3ff d142225cb3234a5fb43ca20576bc8586 e72a01a6e6804afcad97897b58c9578b - default default] Lock "c6bf45ca-4175-4095-9802-7437aeb538e9" released by "nova.compute.manager.ComputeManager.terminate_instance.<locals>.do_terminate_instance" :: held 1.230s inner /usr/lib/python3.6/site-packages/oslo_concurrency/lockutils.py:339
2019-10-17 05:18:43.960 7 DEBUG nova.virt.libvirt.migration [-] [instance: bc2268e1-b936-4ab1-9e1a-5aed02958a0a] VM running on src, migration failed find_job_type /usr/lib/python3.6/site-packages/nova/virt/libvirt/migration.py:404
2019-10-17 05:18:43.960 7 DEBUG nova.virt.libvirt.driver [-] [instance: bc2268e1-b936-4ab1-9e1a-5aed02958a0a] Fixed incorrect job type to be 4 _live_migration_monitor /usr/lib/python3.6/site-packages/nova/virt/libvirt/driver.py:8720
2019-10-17 05:18:43.960 7 ERROR nova.virt.libvirt.driver [-] [instance: bc2268e1-b936-4ab1-9e1a-5aed02958a0a] Migration operation has aborted
~~~

Expected results:
Live migration succeeds.

Additional info:
The issue is the network name used in the tripleo-ssh-known-hosts Ansible role [1]. The role currently adds `[{{ host }}.{{ networks[network]['name'] }}]*`, which builds entries from the network *name* (`internal_api`) instead of the network hostname segment (`internalapi`), e.g. `[compute-0.internal_api]*`. The FQDN entry has the same issue: `[compute-0.internal_api.redhat.local]*`. As a result there is no matching key for `compute-0.internalapi.redhat.local`:

~~~
# BEGIN ANSIBLE MANAGED BLOCK
[192.168.24.51]*,[compute-0.redhat.local]*,[compute-0]*,[172.17.3.74]*,[compute-0.storage]*,[compute-0.storage.redhat.local]*,[172.17.1.121]*,[compute-0.internal_api]*,[compute-0.internal_api.redhat.local]*,[172.17.2.77]*,[compute-0.tenant]*,[compute-0.tenant.redhat.local]*, ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABgQC4sLXKd8YjMEyofVsFpvwBvnoK34kHzgJt1dF/MBqdvxF6jLc6HlROQ5Qb6dYI4Fjt1jgxUYWR1+iVuS5d08C1JJY2EyTQxII2F6RSw3PRkt3+QWtHmfOhH7ljJ6MlPbYCUPueeIefJSx3wTZcdMCw2cLVqnx9YlRMo30uPJO2Q7zmQ2UM4TwW+x3a7PEJXFhzkXAmER3dkxwX4C832iA3riP4JQ6pPcvUX50ZdK7fWhngPb0D1UAmPrzmS8zf61c1ZWXymmjGc4sEBmjp1RS+wdH/YNTreM5sofewrXqrjIpSwHuuX+YBt/qCXSjtWW99e/CguGaa6wpxYxo03vHZIcrZoupTSC6n8UvEUfk3ZiVBQuyAytv4QQAy9NPeZFaKpyiwDm68n+th4fzB/PX54VRnSNqZXac6qr/dgBcrfXOGjdijISbUas1XHIDSzES7NpPvX6ZoLwG4mN8l2/tz3sGugsK0RtKtmS3/WstFRTkgMi3q094PU6+KHlu+e4M=
[192.168.24.51]*,[compute-1.redhat.local]*,[compute-1]*,[172.17.3.16]*,[compute-1.storage]*,[compute-1.storage.redhat.local]*,[172.17.1.60]*,[compute-1.internal_api]*,[compute-1.internal_api.redhat.local]*,[172.17.2.116]*,[compute-1.tenant]*,[compute-1.tenant.redhat.local]*, ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABgQC1K1+8OcsnKvA0iIpirsyhM554V/2KpA/vfBRik3mxii/VAV2VbXY/pwnAyuqv+2WyjAo7cknJCIXDH8Qz98wgQM11dEmrtwiHho9QHnnTUzODBXIaZe69fQuq9YTHY2egL8SBtMG+ODX1IhRm83qe1+3aeW4HNMcYczf4i3IddO0vCE5gIygW2O4Q5MeCqe3XxQ2em1rHFa/GM0dlsso9EVt92nMfa+hXDj/u2iDFxueBWJ+qP51NkS/l0HcnfTqApcaVCdGWcjbHg4wozBR0IzF3sDj1fXNH3VnlOl9z00wg1QXaDwaShkYUxKhrYIcpsFga1KxlbBq/grK5pS0MzicS8acsogRnETVweTT0RAYYpBUIKOf4k55uRdtnckFrTFi3Fbj3wCCUMRFms9zQPuJQFmC4pK16zvLeI4hbA2JazqjZOMnSzKgBR3S0feUPTrDCyNn3vj4wZ5eW4kETOcynvSuQVKKHPifBS4FqAVtfKLdubXJFKieyhn4P66k=
[192.168.24.51]*,[controller-0.redhat.local]*,[controller-0]*,[172.17.3.83]*,[controller-0.storage]*,[controller-0.storage.redhat.local]*,[172.17.1.56]*,[controller-0.internal_api]*,[controller-0.internal_api.redhat.local]*,[172.17.2.130]*,[controller-0.tenant]*,[controller-0.tenant.redhat.local]*, ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABgQDLxjtFeJXvYfeFgz0s5cQtiaxUHclTXm7wrssf/rwL58DAyQlZs2R8g7mqy8RpD3qy9FL9wij4D+r5ofpbYtvrO5uYgajxw9coRDORGIgw+8ffCI6Fo+Uuyi+gBQfIyRj+T3OdqDwHY1Pvd3E3ZKvcQkU/9WWujgyz6TCvHmJxOaX376RRQzfnMeodlDXFzHnGutw2tanki3CQgr0vTIq3Ifgae00z7ihiBgQqXj+a0HjjscwX7mc3S5bnGJYFIdIvEoFiQzLDCH5P6Lp2xYEq/uWi1EzL2xL+UoolSkbxoCsRPpRu6cbN7vouoyOHcMIAookJvaereWpROpsZJ1V/T0Me8ErNbX9lxesYwbMKxIdfXcO/yaprVColoLcgmcuxrfGRfpDNP9lTNpd2bEQap8/UWynHsyv3Q1EV7CB6L5CPgdi3YAy+2DV5kTWrVU9C/LKPc1penaz6ByLQPcprfHz88lAEqlQSV8lUln10m8hvqswa6k0q5TRZGvEk0Z8=
# END ANSIBLE MANAGED BLOCK
~~~

The role needs to be modified to build the ssh_known_hosts line from the per-network `<network_name>_hostname` entries in the inventory, which already carry the correct FQDNs:

~~~
Controller:
  hosts:
    controller-0:
      ansible_host: 192.168.24.44
      canonical_hostname: controller-0.redhat.local
      ctlplane_hostname: controller-0.ctlplane.redhat.local
      ctlplane_ip: 192.168.24.44
      deploy_server_id: a1ffc38e-cd08-42ee-a271-32b4ce82b546
      enabled_networks: [ctlplane, storage, storage_mgmt, internal_api, tenant, external, management]
      external_hostname: controller-0.external.redhat.local
      external_ip: 10.0.0.110
      internal_api_hostname: controller-0.internalapi.redhat.local
      internal_api_ip: 172.17.1.56
      management_ip: 192.168.24.44
      storage_hostname: controller-0.storage.redhat.local
      storage_ip: 172.17.3.83
      storage_mgmt_hostname: controller-0.storagemgmt.redhat.local
      storage_mgmt_ip: 172.17.4.117
      tenant_hostname: controller-0.tenant.redhat.local
      tenant_ip: 172.17.2.130
~~~

[1] https://github.com/openstack/tripleo-ansible/blob/master/tripleo_ansible/roles/tripleo-ssh-known-hosts/tasks/main.yml#L55
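The mismatch can be sketched in a few lines of Python. This is only an illustration of the naming problem, not the actual role code; the variable names (`networks`, `host_vars`) mirror the Jinja2 expressions above but are otherwise hypothetical:

```python
# Network definition as the role sees it: the 'name' keeps the underscore.
networks = {"InternalApi": {"name": "internal_api"}}

# Per-network hostname as published in the tripleo inventory (underscore-free).
host_vars = {"internal_api_hostname": "compute-0.internalapi.redhat.local"}

host = "compute-0"
network = "InternalApi"
domain = "redhat.local"

# Current role behaviour: the pattern is derived from the network *name*.
current = f"{host}.{networks[network]['name']}.{domain}"
print(current)   # compute-0.internal_api.redhat.local

# Proposed behaviour: take the '<network_name>_hostname' inventory variable.
proposed = host_vars[f"{networks[network]['name']}_hostname"]
print(proposed)  # compute-0.internalapi.redhat.local

# The hostname nova actually resolves for migration is the inventory one,
# so the pattern built from the network name can never match it.
target = "compute-0.internalapi.redhat.local"
assert current != target and proposed == target
```

The fix therefore amounts to looking up the per-network hostname variable instead of concatenating the host with the network name.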
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2020:0283