Description of problem:
Live migration fails when hostnames are configured with "_" (underscore) due to inconsistent naming in /etc/hosts and /etc/ssh/authorized_keys.

Additional info:
This is easy to reproduce:

/home/stack/templates/network-environment.yaml
~~~
parameter_defaults:
  (...)
  ComputeHostnameFormat: '%stackname%-compute_v1-%index%'
~~~

Then, deploy a new stack:
~~~
(...)
Stack overcloud CREATE_COMPLETE
Host 10.0.0.5 not found in /home/stack/.ssh/known_hosts
Overcloud Endpoint: http://10.0.0.5:5000/v2.0
Overcloud Deployed
~~~

Then, verify:
~~~
[stack@undercloud-7 ~]$ source stackrc; nova list
+--------------------------------------+------------------------+--------+------------+-------------+---------------------+
| ID                                   | Name                   | Status | Task State | Power State | Networks            |
+--------------------------------------+------------------------+--------+------------+-------------+---------------------+
| 2cef3460-d931-4b8c-9e9d-d297e0f7b3bc | overcloud-compute_v1-0 | ACTIVE | -          | Running     | ctlplane=192.0.2.16 |
| 358951bc-3f29-4f92-8aad-2f9fecac8f8f | overcloud-compute_v1-1 | ACTIVE | -          | Running     | ctlplane=192.0.2.8  |
| 50261983-3699-4874-aeef-cd2ae52825df | overcloud-controller-0 | ACTIVE | -          | Running     | ctlplane=192.0.2.12 |
+--------------------------------------+------------------------+--------+------------+-------------+---------------------+
[stack@undercloud-7 ~]$ . overcloudrc
[stack@undercloud-7 ~]$ nova service-list
+----+------------------+------------------------------------+----------+---------+-------+----------------------------+-----------------+
| Id | Binary           | Host                               | Zone     | Status  | State | Updated_at                 | Disabled Reason |
+----+------------------+------------------------------------+----------+---------+-------+----------------------------+-----------------+
| 3  | nova-consoleauth | overcloud-controller-0.localdomain | internal | enabled | up    | 2018-01-20T16:34:35.000000 | -               |
| 4  | nova-scheduler   | overcloud-controller-0.localdomain | internal | enabled | up    | 2018-01-20T16:34:34.000000 | -               |
| 5  | nova-conductor   | overcloud-controller-0.localdomain | internal | enabled | up    | 2018-01-20T16:34:38.000000 | -               |
| 6  | nova-compute     | overcloud-compute-v1-0             | nova     | enabled | up    | 2018-01-20T16:34:44.000000 | -               |
| 7  | nova-compute     | overcloud-compute-v1-1             | nova     | enabled | up    | 2018-01-20T16:34:35.000000 | -               |
+----+------------------+------------------------------------+----------+---------+-------+----------------------------+-----------------+
[stack@undercloud-7 ~]$ nova hypervisor-list
+----+------------------------+-------+---------+
| ID | Hypervisor hostname    | State | Status  |
+----+------------------------+-------+---------+
| 1  | overcloud-compute-v1-0 | up    | enabled |
| 2  | overcloud-compute-v1-1 | up    | enabled |
+----+------------------------+-------+---------+
[stack@undercloud-7 ~]$ ssh heat-admin@192.0.2.16 hostname
The authenticity of host '192.0.2.16 (192.0.2.16)' can't be established.
ECDSA key fingerprint is SHA256:+MA5u0VzqiLp+Q3RdHvfcXy9R+xNO6HU8sfvLCFHeo0.
ECDSA key fingerprint is MD5:b3:27:98:11:86:5d:08:32:26:b6:ef:73:00:80:cd:73.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added '192.0.2.16' (ECDSA) to the list of known hosts.
overcloud-compute-v1-0
[stack@undercloud-7 ~]$ ssh heat-admin@192.0.2.16 "sudo grep compute_v1 /etc/ -R"
/etc/cloud/templates/hosts.redhat.tmpl:172.16.2.10 overcloud-compute_v1-0.localdomain overcloud-compute_v1-0
/etc/cloud/templates/hosts.redhat.tmpl:192.0.2.16 overcloud-compute_v1-0.external.localdomain overcloud-compute_v1-0.external
/etc/cloud/templates/hosts.redhat.tmpl:172.16.2.10 overcloud-compute_v1-0.internalapi.localdomain overcloud-compute_v1-0.internalapi
/etc/cloud/templates/hosts.redhat.tmpl:172.18.0.11 overcloud-compute_v1-0.storage.localdomain overcloud-compute_v1-0.storage
/etc/cloud/templates/hosts.redhat.tmpl:192.0.2.16 overcloud-compute_v1-0.storagemgmt.localdomain overcloud-compute_v1-0.storagemgmt
/etc/cloud/templates/hosts.redhat.tmpl:172.16.0.7 overcloud-compute_v1-0.tenant.localdomain overcloud-compute_v1-0.tenant
/etc/cloud/templates/hosts.redhat.tmpl:192.0.2.16 overcloud-compute_v1-0.management.localdomain overcloud-compute_v1-0.management
/etc/cloud/templates/hosts.redhat.tmpl:192.0.2.16 overcloud-compute_v1-0.ctlplane.localdomain overcloud-compute_v1-0.ctlplane
/etc/cloud/templates/hosts.redhat.tmpl:172.16.2.13 overcloud-compute_v1-1.localdomain overcloud-compute_v1-1
/etc/cloud/templates/hosts.redhat.tmpl:192.0.2.8 overcloud-compute_v1-1.external.localdomain overcloud-compute_v1-1.external
/etc/cloud/templates/hosts.redhat.tmpl:172.16.2.13 overcloud-compute_v1-1.internalapi.localdomain overcloud-compute_v1-1.internalapi
/etc/cloud/templates/hosts.redhat.tmpl:172.18.0.15 overcloud-compute_v1-1.storage.localdomain overcloud-compute_v1-1.storage
/etc/cloud/templates/hosts.redhat.tmpl:192.0.2.8 overcloud-compute_v1-1.storagemgmt.localdomain overcloud-compute_v1-1.storagemgmt
/etc/cloud/templates/hosts.redhat.tmpl:172.16.0.9 overcloud-compute_v1-1.tenant.localdomain overcloud-compute_v1-1.tenant
/etc/cloud/templates/hosts.redhat.tmpl:192.0.2.8 overcloud-compute_v1-1.management.localdomain overcloud-compute_v1-1.management
/etc/cloud/templates/hosts.redhat.tmpl:192.0.2.8 overcloud-compute_v1-1.ctlplane.localdomain overcloud-compute_v1-1.ctlplane
/etc/ssh/ssh_known_hosts:172.16.2.10,overcloud-compute_v1-0.localdomain,overcloud-compute_v1-0,192.0.2.16,overcloud-compute_v1-0.external.localdomain,overcloud-compute_v1-0.external,172.16.2.10,overcloud-compute_v1-0.internalapi.localdomain,overcloud-compute_v1-0.internalapi,172.18.0.11,overcloud-compute_v1-0.storage.localdomain,overcloud-compute_v1-0.storage,192.0.2.16,overcloud-compute_v1-0.storagemgmt.localdomain,overcloud-compute_v1-0.storagemgmt,172.16.0.7,overcloud-compute_v1-0.tenant.localdomain,overcloud-compute_v1-0.tenant,192.0.2.16,overcloud-compute_v1-0.management.localdomain,overcloud-compute_v1-0.management,192.0.2.16,overcloud-compute_v1-0.ctlplane.localdomain,overcloud-compute_v1-0.ctlplane ecdsa-sha2-nistp256 AAAAE2VjZHNhLXNoYTItbmlzdHAyNTYAAAAIbmlzdHAyNTYAAABBBJQhCsIK7sAkmHZ4xg72t2wV39gvrjtE6g1dqZLXmSTB5ubMaqFTytzqvt/T+C8oUKPgX1bQfnIz4VyuKCx4qwQ=
/etc/ssh/ssh_known_hosts:172.16.2.13,overcloud-compute_v1-1.localdomain,overcloud-compute_v1-1,192.0.2.8,overcloud-compute_v1-1.external.localdomain,overcloud-compute_v1-1.external,172.16.2.13,overcloud-compute_v1-1.internalapi.localdomain,overcloud-compute_v1-1.internalapi,172.18.0.15,overcloud-compute_v1-1.storage.localdomain,overcloud-compute_v1-1.storage,192.0.2.8,overcloud-compute_v1-1.storagemgmt.localdomain,overcloud-compute_v1-1.storagemgmt,172.16.0.9,overcloud-compute_v1-1.tenant.localdomain,overcloud-compute_v1-1.tenant,192.0.2.8,overcloud-compute_v1-1.management.localdomain,overcloud-compute_v1-1.management,192.0.2.8,overcloud-compute_v1-1.ctlplane.localdomain,overcloud-compute_v1-1.ctlplane ecdsa-sha2-nistp256 AAAAE2VjZHNhLXNoYTItbmlzdHAyNTYAAAAIbmlzdHAyNTYAAABBBNFj5SSzkFnn4e417m0Ut+wOXONtEFslAYnGafSpAxasZGdJEpESzN0OhPj4aJNRomA/t2f6Xm2wjRCEjsX+ZiM=
grep: /etc/grub2-efi.cfg: No such file or directory
grep: /etc/extlinux.conf: No such file or directory
/etc/puppet/hieradata/all_nodes.yaml: "overcloud-compute_v1-0",
/etc/puppet/hieradata/all_nodes.yaml: "overcloud-compute_v1-1"
/etc/puppet/hieradata/all_nodes.yaml: "overcloud-compute_v1-0",
/etc/puppet/hieradata/all_nodes.yaml: "overcloud-compute_v1-1"
/etc/puppet/hieradata/all_nodes.yaml: "overcloud-compute_v1-0",
/etc/puppet/hieradata/all_nodes.yaml: "overcloud-compute_v1-1"
/etc/puppet/hieradata/all_nodes.yaml: "overcloud-compute_v1-0",
/etc/puppet/hieradata/all_nodes.yaml: "overcloud-compute_v1-1"
/etc/puppet/hieradata/all_nodes.yaml: "overcloud-compute_v1-0",
/etc/puppet/hieradata/all_nodes.yaml: "overcloud-compute_v1-1"
/etc/puppet/hieradata/all_nodes.yaml: "overcloud-compute_v1-0",
/etc/puppet/hieradata/all_nodes.yaml: "overcloud-compute_v1-1"
/etc/puppet/hieradata/all_nodes.yaml: "overcloud-compute_v1-0.internalapi.localdomain",
/etc/puppet/hieradata/all_nodes.yaml: "overcloud-compute_v1-1.internalapi.localdomain"
/etc/puppet/hieradata/all_nodes.yaml: "overcloud-compute_v1-0",
/etc/puppet/hieradata/all_nodes.yaml: "overcloud-compute_v1-1"
/etc/puppet/hieradata/all_nodes.yaml: "overcloud-compute_v1-0",
/etc/puppet/hieradata/all_nodes.yaml: "overcloud-compute_v1-1"
/etc/puppet/hieradata/all_nodes.yaml: "overcloud-compute_v1-0",
/etc/puppet/hieradata/all_nodes.yaml: "overcloud-compute_v1-1"
/etc/puppet/hieradata/all_nodes.yaml: "overcloud-compute_v1-0",
/etc/puppet/hieradata/all_nodes.yaml: "overcloud-compute_v1-1"
/etc/puppet/hieradata/all_nodes.yaml: "overcloud-compute_v1-0",
/etc/puppet/hieradata/all_nodes.yaml: "overcloud-compute_v1-1"
/etc/puppet/hieradata/all_nodes.yaml: "overcloud-compute_v1-0",
/etc/puppet/hieradata/all_nodes.yaml: "overcloud-compute_v1-1"
/etc/puppet/hieradata/all_nodes.yaml: "overcloud-compute_v1-0",
/etc/puppet/hieradata/all_nodes.yaml: "overcloud-compute_v1-1"
/etc/puppet/hieradata/all_nodes.yaml: "overcloud-compute_v1-0",
/etc/puppet/hieradata/all_nodes.yaml: "overcloud-compute_v1-1"
/etc/puppet/hieradata/bootstrap_node.yaml:bootstrap_nodeid: overcloud-compute_v1-0
/etc/hosts:172.16.2.10 overcloud-compute_v1-0.localdomain overcloud-compute_v1-0
/etc/hosts:192.0.2.16 overcloud-compute_v1-0.external.localdomain overcloud-compute_v1-0.external
/etc/hosts:172.16.2.10 overcloud-compute_v1-0.internalapi.localdomain overcloud-compute_v1-0.internalapi
/etc/hosts:172.18.0.11 overcloud-compute_v1-0.storage.localdomain overcloud-compute_v1-0.storage
/etc/hosts:192.0.2.16 overcloud-compute_v1-0.storagemgmt.localdomain overcloud-compute_v1-0.storagemgmt
/etc/hosts:172.16.0.7 overcloud-compute_v1-0.tenant.localdomain overcloud-compute_v1-0.tenant
/etc/hosts:192.0.2.16 overcloud-compute_v1-0.management.localdomain overcloud-compute_v1-0.management
/etc/hosts:192.0.2.16 overcloud-compute_v1-0.ctlplane.localdomain overcloud-compute_v1-0.ctlplane
/etc/hosts:172.16.2.13 overcloud-compute_v1-1.localdomain overcloud-compute_v1-1
/etc/hosts:192.0.2.8 overcloud-compute_v1-1.external.localdomain overcloud-compute_v1-1.external
/etc/hosts:172.16.2.13 overcloud-compute_v1-1.internalapi.localdomain overcloud-compute_v1-1.internalapi
/etc/hosts:172.18.0.15 overcloud-compute_v1-1.storage.localdomain overcloud-compute_v1-1.storage
/etc/hosts:192.0.2.8 overcloud-compute_v1-1.storagemgmt.localdomain overcloud-compute_v1-1.storagemgmt
/etc/hosts:172.16.0.9 overcloud-compute_v1-1.tenant.localdomain overcloud-compute_v1-1.tenant
/etc/hosts:192.0.2.8 overcloud-compute_v1-1.management.localdomain overcloud-compute_v1-1.management
/etc/hosts:192.0.2.8 overcloud-compute_v1-1.ctlplane.localdomain overcloud-compute_v1-1.ctlplane
[stack@undercloud-7 ~]$
~~~

Live migration fails with both the "_" and the "-" form of the target hostname:
~~~
[stack@undercloud-7 ~]$ nova list
+--------------------------------------+--------------+--------+------------+-------------+----------------------------------------------------------------------+
| ID                                   | Name         | Status | Task State | Power State | Networks                                                             |
+--------------------------------------+--------------+--------+------------+-------------+----------------------------------------------------------------------+
| 7ee38f55-2f1e-4c95-b5f7-ddfedf5134a7 | cirros-test1 | ACTIVE | -          | Running     | private=192.168.0.12, 2000:192:168:1:f816:3eff:fec1:226d, 10.0.0.110 |
| 67cbc3cf-b379-4b45-92b4-9a690a5effd3 | rhel-test1   | ACTIVE | -          | Running     | private=192.168.0.3, 2000:192:168:1:f816:3eff:fe11:ca21, 10.0.0.107  |
+--------------------------------------+--------------+--------+------------+-------------+----------------------------------------------------------------------+
[stack@undercloud-7 ~]$ nova list --fields name,host
+--------------------------------------+--------------+------------------------+
| ID                                   | Name         | Host                   |
+--------------------------------------+--------------+------------------------+
| 7ee38f55-2f1e-4c95-b5f7-ddfedf5134a7 | cirros-test1 | overcloud-compute-v1-0 |
| 67cbc3cf-b379-4b45-92b4-9a690a5effd3 | rhel-test1   | overcloud-compute-v1-1 |
+--------------------------------------+--------------+------------------------+
[stack@undercloud-7 ~]$ nova live-migration cirros-test1 overcloud-compute-v1-1
[stack@undercloud-7 ~]$ nova list
+--------------------------------------+--------------+--------+------------+-------------+----------------------------------------------------------------------+
| ID                                   | Name         | Status | Task State | Power State | Networks                                                             |
+--------------------------------------+--------------+--------+------------+-------------+----------------------------------------------------------------------+
| 7ee38f55-2f1e-4c95-b5f7-ddfedf5134a7 | cirros-test1 | ACTIVE | -          | Running     | private=192.168.0.12, 2000:192:168:1:f816:3eff:fec1:226d, 10.0.0.110 |
| 67cbc3cf-b379-4b45-92b4-9a690a5effd3 | rhel-test1   | ACTIVE | -          | Running     | private=192.168.0.3, 2000:192:168:1:f816:3eff:fe11:ca21, 10.0.0.107  |
+--------------------------------------+--------------+--------+------------+-------------+----------------------------------------------------------------------+
[stack@undercloud-7 ~]$ nova list --fields name,host
+--------------------------------------+--------------+------------------------+
| ID                                   | Name         | Host                   |
+--------------------------------------+--------------+------------------------+
| 7ee38f55-2f1e-4c95-b5f7-ddfedf5134a7 | cirros-test1 | overcloud-compute-v1-0 |
| 67cbc3cf-b379-4b45-92b4-9a690a5effd3 | rhel-test1   | overcloud-compute-v1-1 |
+--------------------------------------+--------------+------------------------+
[stack@undercloud-7 ~]$ nova migration-list
+----+-------------+-----------+------------------------+------------------------+-----------+--------+--------------------------------------+------------+------------+----------------------------+----------------------------+----------------+
| Id | Source Node | Dest Node | Source Compute         | Dest Compute           | Dest Host | Status | Instance UUID                        | Old Flavor | New Flavor | Created At                 | Updated At                 | Type           |
+----+-------------+-----------+------------------------+------------------------+-----------+--------+--------------------------------------+------------+------------+----------------------------+----------------------------+----------------+
| 1  | -           | -         | overcloud-compute-v1-0 | overcloud-compute-v1-1 | -         | error  | 7ee38f55-2f1e-4c95-b5f7-ddfedf5134a7 | 1          | 1          | 2018-01-20T16:59:15.000000 | 2018-01-20T16:59:20.000000 | live-migration |
+----+-------------+-----------+------------------------+------------------------+-----------+--------+--------------------------------------+------------+------------+----------------------------+----------------------------+----------------+
[stack@undercloud-7 ~]$ nova live-migration cirros-test1 overcloud-compute_v1-1
ERROR (ClientException): Unexpected API Error. Please report this at http://bugs.launchpad.net/nova/ and attach the Nova API log if possible.
<class 'nova.exception.ComputeHostNotFound'> (HTTP 500) (Request-ID: req-53b96ca4-8bfb-4b51-bc0c-735063b1448c)
[stack@undercloud-7 ~]$
~~~

For the live migration with "-", the ERROR is:
~~~
[root@overcloud-compute-v1-0 ~]# grep ERROR /var/log/nova/nova-compute.log
2018-01-20 16:07:28.454 16992 ERROR nova.compute.manager [req-9c175cc6-0563-4feb-ba7d-a7081cd21cac - - - - -] No compute node record for host overcloud-compute-v1-0
2018-01-20 16:55:53.479 59019 DEBUG oslo_service.service [req-c7a22b42-71fc-4e37-a2d5-261a3c4e8fa2 - - - - -] logging_exception_prefix = %(asctime)s.%(msecs)03d %(process)d ERROR %(name)s %(instance)s log_opt_values /usr/lib/python2.7/site-packages/oslo_config/cfg.py:2622
2018-01-20 16:59:20.528 59019 ERROR nova.virt.libvirt.driver [req-a378cae1-71eb-49d3-a3c6-04b7133c1727 354194e670274527bb751e120fdb276b 4b9fb0b405434da6b215bca0bab4e654 - - -] [instance: 7ee38f55-2f1e-4c95-b5f7-ddfedf5134a7] Live Migration failure: operation failed: Failed to connect to remote libvirt URI qemu+ssh://nova_migration@overcloud-compute-v1-1/system?keyfile=/etc/nova/migration/identity: Cannot recv data: ssh: Could not resolve hostname overcloud-compute-v1-1: Name or service not known: Connection reset by peer
2018-01-20 16:59:20.575 59019 ERROR nova.virt.libvirt.driver [req-a378cae1-71eb-49d3-a3c6-04b7133c1727 354194e670274527bb751e120fdb276b 4b9fb0b405434da6b215bca0bab4e654 - - -] [instance: 7ee38f55-2f1e-4c95-b5f7-ddfedf5134a7] Migration operation has aborted
[root@overcloud-compute-v1-0 ~]#
~~~

I fixed /etc/hosts manually by adding the hyphenated names as aliases:
~~~
[root@overcloud-compute-v1-0 ~]# diff /etc/hosts{.bck,}
25c25
< 172.16.2.10 overcloud-compute_v1-0.localdomain overcloud-compute_v1-0
---
> 172.16.2.10 overcloud-compute_v1-0.localdomain overcloud-compute_v1-0 overcloud-compute-v1-0.localdomain overcloud-compute-v1-0
34c34
< 172.16.2.13 overcloud-compute_v1-1.localdomain overcloud-compute_v1-1
---
> 172.16.2.13 overcloud-compute_v1-1.localdomain overcloud-compute_v1-1 overcloud-compute-v1-1.localdomain overcloud-compute-v1-1
[root@overcloud-compute-v1-0 ~]#
~~~

This gets further, but then fails on host key verification:
~~~
2018-01-20 17:12:57.116 59019 ERROR nova.virt.libvirt.driver [req-1523be96-5fef-4d20-a275-b08235f206d1 354194e670274527bb751e120fdb276b 4b9fb0b405434da6b215bca0bab4e654 - - -] [instance: 7ee38f55-2f1e-4c95-b5f7-ddfedf5134a7] Live Migration failure: operation failed: Failed to connect to remote libvirt URI qemu+ssh://nova_migration@overcloud-compute-v1-1/system?keyfile=/etc/nova/migration/identity: Cannot recv data: Host key verification failed.: Connection reset by peer
2018-01-20 17:12:57.121 59019 ERROR nova.virt.libvirt.driver [req-1523be96-5fef-4d20-a275-b08235f206d1 354194e670274527bb751e120fdb276b 4b9fb0b405434da6b215bca0bab4e654 - - -] [instance: 7ee38f55-2f1e-4c95-b5f7-ddfedf5134a7] Migration operation has aborted
2018-01-20 17:13:05.980 59019 ERROR nova.virt.libvirt.host [req-06d738da-8383-4516-b681-256c1090688d - - - - -] Hostname has changed from overcloud-compute-v1-0 to overcloud-compute_v1-0.localdomain. A restart is required to take effect.
2018-01-20 17:13:05.999 59019 ERROR nova.virt.libvirt.host [req-06d738da-8383-4516-b681-256c1090688d - - - - -] Hostname has changed from overcloud-compute-v1-0 to overcloud-compute_v1-0.localdomain. A restart is required to take effect.
~~~

So finally, I also added the hyphenated names to /etc/ssh/ssh_known_hosts:
~~~
[root@overcloud-compute-v1-0 ~]# diff /etc/ssh/ssh_known_hosts{,.bck}
2,3c2,3
< 172.16.2.10,overcloud-compute_v1-0.localdomain,overcloud-compute_v1-0,overcloud-compute-v1-0.localdomain,overcloud-compute-v1-0,192.0.2.16,overcloud-compute_v1-0.external.localdomain,overcloud-compute_v1-0.external,172.16.2.10,overcloud-compute_v1-0.internalapi.localdomain,overcloud-compute_v1-0.internalapi,172.18.0.11,overcloud-compute_v1-0.storage.localdomain,overcloud-compute_v1-0.storage,192.0.2.16,overcloud-compute_v1-0.storagemgmt.localdomain,overcloud-compute_v1-0.storagemgmt,172.16.0.7,overcloud-compute_v1-0.tenant.localdomain,overcloud-compute_v1-0.tenant,192.0.2.16,overcloud-compute_v1-0.management.localdomain,overcloud-compute_v1-0.management,192.0.2.16,overcloud-compute_v1-0.ctlplane.localdomain,overcloud-compute_v1-0.ctlplane ecdsa-sha2-nistp256 AAAAE2VjZHNhLXNoYTItbmlzdHAyNTYAAAAIbmlzdHAyNTYAAABBBJQhCsIK7sAkmHZ4xg72t2wV39gvrjtE6g1dqZLXmSTB5ubMaqFTytzqvt/T+C8oUKPgX1bQfnIz4VyuKCx4qwQ=
< 172.16.2.13,overcloud-compute_v1-1.localdomain,overcloud-compute_v1-1,overcloud-compute-v1-1.localdomain,overcloud-compute-v1-1,192.0.2.8,overcloud-compute_v1-1.external.localdomain,overcloud-compute_v1-1.external,172.16.2.13,overcloud-compute_v1-1.internalapi.localdomain,overcloud-compute_v1-1.internalapi,172.18.0.15,overcloud-compute_v1-1.storage.localdomain,overcloud-compute_v1-1.storage,192.0.2.8,overcloud-compute_v1-1.storagemgmt.localdomain,overcloud-compute_v1-1.storagemgmt,172.16.0.9,overcloud-compute_v1-1.tenant.localdomain,overcloud-compute_v1-1.tenant,192.0.2.8,overcloud-compute_v1-1.management.localdomain,overcloud-compute_v1-1.management,192.0.2.8,overcloud-compute_v1-1.ctlplane.localdomain,overcloud-compute_v1-1.ctlplane ecdsa-sha2-nistp256 AAAAE2VjZHNhLXNoYTItbmlzdHAyNTYAAAAIbmlzdHAyNTYAAABBBNFj5SSzkFnn4e417m0Ut+wOXONtEFslAYnGafSpAxasZGdJEpESzN0OhPj4aJNRomA/t2f6Xm2wjRCEjsX+ZiM=
---
> 172.16.2.10,overcloud-compute_v1-0.localdomain,overcloud-compute_v1-0,192.0.2.16,overcloud-compute_v1-0.external.localdomain,overcloud-compute_v1-0.external,172.16.2.10,overcloud-compute_v1-0.internalapi.localdomain,overcloud-compute_v1-0.internalapi,172.18.0.11,overcloud-compute_v1-0.storage.localdomain,overcloud-compute_v1-0.storage,192.0.2.16,overcloud-compute_v1-0.storagemgmt.localdomain,overcloud-compute_v1-0.storagemgmt,172.16.0.7,overcloud-compute_v1-0.tenant.localdomain,overcloud-compute_v1-0.tenant,192.0.2.16,overcloud-compute_v1-0.management.localdomain,overcloud-compute_v1-0.management,192.0.2.16,overcloud-compute_v1-0.ctlplane.localdomain,overcloud-compute_v1-0.ctlplane ecdsa-sha2-nistp256 AAAAE2VjZHNhLXNoYTItbmlzdHAyNTYAAAAIbmlzdHAyNTYAAABBBJQhCsIK7sAkmHZ4xg72t2wV39gvrjtE6g1dqZLXmSTB5ubMaqFTytzqvt/T+C8oUKPgX1bQfnIz4VyuKCx4qwQ=
> 172.16.2.13,overcloud-compute_v1-1.localdomain,overcloud-compute_v1-1,192.0.2.8,overcloud-compute_v1-1.external.localdomain,overcloud-compute_v1-1.external,172.16.2.13,overcloud-compute_v1-1.internalapi.localdomain,overcloud-compute_v1-1.internalapi,172.18.0.15,overcloud-compute_v1-1.storage.localdomain,overcloud-compute_v1-1.storage,192.0.2.8,overcloud-compute_v1-1.storagemgmt.localdomain,overcloud-compute_v1-1.storagemgmt,172.16.0.9,overcloud-compute_v1-1.tenant.localdomain,overcloud-compute_v1-1.tenant,192.0.2.8,overcloud-compute_v1-1.management.localdomain,overcloud-compute_v1-1.management,192.0.2.8,overcloud-compute_v1-1.ctlplane.localdomain,overcloud-compute_v1-1.ctlplane ecdsa-sha2-nistp256 AAAAE2VjZHNhLXNoYTItbmlzdHAyNTYAAAAIbmlzdHAyNTYAAABBBNFj5SSzkFnn4e417m0Ut+wOXONtEFslAYnGafSpAxasZGdJEpESzN0OhPj4aJNRomA/t2f6Xm2wjRCEjsX+ZiM=
~~~

With both changes in place, the live migration succeeds:
~~~
[stack@undercloud-7 ~]$ nova reset-state cirros-test1 --active
Reset state for server cirros-test1 succeeded; new state is active
[stack@undercloud-7 ~]$ nova live-migration cirros-test1 overcloud-compute-v1-1
[stack@undercloud-7 ~]$ nova list --fields name,host
+--------------------------------------+--------------+------------------------+
| ID                                   | Name         | Host                   |
+--------------------------------------+--------------+------------------------+
| 7ee38f55-2f1e-4c95-b5f7-ddfedf5134a7 | cirros-test1 | overcloud-compute-v1-1 |
| 67cbc3cf-b379-4b45-92b4-9a690a5effd3 | rhel-test1   | overcloud-compute-v1-1 |
+--------------------------------------+--------------+------------------------+
~~~
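For illustration only, the manual ssh_known_hosts workaround described above could be sketched in Python roughly as follows. This is not what Director does; `add_hyphen_aliases` is a hypothetical helper, and it assumes the unhashed `names keytype key` line layout seen in the grep output. Unlike the manual edit, it adds an alias for every name that contains an underscore.

```python
def add_hyphen_aliases(known_hosts_line):
    """Insert a hyphenated alias after each host name that contains '_'.

    Hypothetical helper: assumes an unhashed known_hosts line of the form
    "name1,name2,... keytype base64key".
    """
    hosts_field, sep, rest = known_hosts_line.partition(" ")
    names = hosts_field.split(",")
    out = []
    for name in names:
        out.append(name)
        alias = name.replace("_", "-")
        # Only add the alias if it differs and is not already listed.
        if alias != name and alias not in names and alias not in out:
            out.append(alias)
    return ",".join(out) + sep + rest


if __name__ == "__main__":
    line = ("172.16.2.10,overcloud-compute_v1-0.localdomain,"
            "overcloud-compute_v1-0 ecdsa-sha2-nistp256 AAAA")
    print(add_hyphen_aliases(line))
```

The key type and key material are passed through untouched; only the comma-separated name list changes.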
Hello,

I applied the following:
~~~
ComputeHostnameFormat: '%stackname%-compute-v1-%index%'
~~~

Then, I reran `openstack overcloud deploy`. This removes the underscore names from both files:
~~~
[root@overcloud-compute-v1-0 ~]# grep '_v1' !$
grep '_v1' /etc/ssh/ssh_known_hosts
[root@overcloud-compute-v1-0 ~]# grep '_v1' /etc/hosts
[root@overcloud-compute-v1-0 ~]#
~~~

It also leads to a hostname change:
~~~
2018-01-20 18:07:34.066 59047 ERROR nova.virt.libvirt.host [req-57ec6d8c-80fc-414e-8a6f-96e57d593499 - - - - -] Hostname has changed from overcloud-compute-v1-1 to overcloud-compute-v1-1.localdomain. A restart is required to take effect.
~~~

After a restart on all computes:
~~~
[root@overcloud-compute-v1-1 ~]# systemctl restart openstack-nova-compute
[root@overcloud-compute-v1-1 ~]#
~~~

This leads to another rename of the compute services (note the .localdomain):
~~~
[stack@undercloud-7 ~]$ nova service-list
+----+------------------+------------------------------------+----------+---------+-------+----------------------------+-----------------+
| Id | Binary           | Host                               | Zone     | Status  | State | Updated_at                 | Disabled Reason |
+----+------------------+------------------------------------+----------+---------+-------+----------------------------+-----------------+
| 3  | nova-consoleauth | overcloud-controller-0.localdomain | internal | enabled | up    | 2018-01-20T19:00:40.000000 | -               |
| 4  | nova-scheduler   | overcloud-controller-0.localdomain | internal | enabled | up    | 2018-01-20T19:00:37.000000 | -               |
| 5  | nova-conductor   | overcloud-controller-0.localdomain | internal | enabled | up    | 2018-01-20T19:00:34.000000 | -               |
| 6  | nova-compute     | overcloud-compute-v1-0             | nova     | enabled | down  | 2018-01-20T18:07:31.000000 | -               |
| 7  | nova-compute     | overcloud-compute-v1-1             | nova     | enabled | down  | 2018-01-20T18:07:37.000000 | -               |
| 8  | nova-compute     | overcloud-compute-v1-0.localdomain | nova     | enabled | down  | 2018-01-20T18:58:45.000000 | -               |
| 9  | nova-compute     | overcloud-compute-v1-1.localdomain | nova     | enabled | up    | 2018-01-20T19:00:39.000000 | -               |
+----+------------------+------------------------------------+----------+---------+-------+----------------------------+-----------------+
~~~

Live migration then fails, partly because v1-0 is down in my environment. More importantly, the rename to .localdomain broke other things:
~~~
/var/log/nova/nova-conductor.log:2018-01-20 19:01:31.363 92122 ERROR oslo_messaging.rpc.server File "/usr/lib/python2.7/site-packages/nova/conductor/tasks/live_migrate.py", line 49, in _execute
/var/log/nova/nova-conductor.log:2018-01-20 19:01:31.363 92122 ERROR oslo_messaging.rpc.server self._check_host_is_up(self.source)
/var/log/nova/nova-conductor.log:2018-01-20 19:01:31.363 92122 ERROR oslo_messaging.rpc.server File "/usr/lib/python2.7/site-packages/nova/conductor/tasks/live_migrate.py", line 89, in _check_host_is_up
/var/log/nova/nova-conductor.log:2018-01-20 19:01:31.363 92122 ERROR oslo_messaging.rpc.server raise exception.ComputeServiceUnavailable(host=host)
/var/log/nova/nova-conductor.log:2018-01-20 19:01:31.363 92122 ERROR oslo_messaging.rpc.server ComputeServiceUnavailable: Compute service of overcloud-compute-v1-1 is unavailable at this time.
/var/log/nova/nova-conductor.log:2018-01-20 19:01:31.363 92122 ERROR oslo_messaging.rpc.server
~~~

- Andreas
Correction: it's not authorized_keys, it's /etc/ssh/ssh_known_hosts
The hostname has not been set (by cloud-init) because underscore is not a valid hostname character (see https://tools.ietf.org/html/rfc952). The hostname command will not accept this as a hostname, e.g.:

[root@undercloud stack]# hostname foo_
hostname: the specified hostname is invalid

Everything else has been configured expecting this invalid hostname to have been set. SSH public/private key authentication failing is the most obvious side effect of this, but there are likely to be other issues, e.g. the hostname being reported by nova-compute is also incorrect.

> oslo_messaging.rpc.server ComputeServiceUnavailable: Compute service of overcloud-compute-v1-1 is unavailable at this time.

What is the expectation here? That looks correct. It's now overcloud-compute-v1-1.localdomain. The overcloud-compute-v1-0 and overcloud-compute-v1-1 services should be deleted.
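As a rough illustration of the rule referenced above, a hostname check could look like the sketch below. It follows RFC 952 as relaxed by RFC 1123 (labels of letters, digits, and hyphens, not starting or ending with a hyphen); `is_valid_hostname` is a hypothetical helper and not necessarily the exact check the `hostname` command performs.

```python
import re

# One label: 1-63 letters, digits, or hyphens; no leading or trailing hyphen.
_LABEL = re.compile(r"(?!-)[A-Za-z0-9-]{1,63}(?<!-)")


def is_valid_hostname(name):
    """Return True if every dot-separated label is a valid hostname label."""
    if not name or len(name) > 253:
        return False
    return all(_LABEL.fullmatch(label) for label in name.split("."))


if __name__ == "__main__":
    for candidate in ("overcloud-compute-v1-0", "overcloud-compute_v1-0", "foo_"):
        print(candidate, is_valid_hostname(candidate))
```

Under this check the underscore names from the report are rejected, while the hyphenated names (with or without `.localdomain`) pass.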
Hi,

The hostname change does not work. Check the .localdomain. With the wrong settings, we don't get the .localdomain suffix:
~~~
| 3  | nova-consoleauth | overcloud-controller-0.localdomain | internal | enabled | up    | 2018-01-20T19:00:40.000000 | -               |
| 4  | nova-scheduler   | overcloud-controller-0.localdomain | internal | enabled | up    | 2018-01-20T19:00:37.000000 | -               |
| 5  | nova-conductor   | overcloud-controller-0.localdomain | internal | enabled | up    | 2018-01-20T19:00:34.000000 | -               |
| 6  | nova-compute     | overcloud-compute-v1-0             | nova     | enabled | down  | 2018-01-20T18:07:31.000000 | -               |
| 7  | nova-compute     | overcloud-compute-v1-1             | nova     | enabled | down  | 2018-01-20T18:07:37.000000 | -               |
| 8  | nova-compute     | overcloud-compute-v1-0.localdomain | nova     | enabled | down  | 2018-01-20T18:58:45.000000 | -               |
| 9  | nova-compute     | overcloud-compute-v1-1.localdomain | nova     | enabled |
~~~
~~~
/var/log/nova/nova-conductor.log:2018-01-20 19:01:31.363 92122 ERROR oslo_messaging.rpc.server ComputeServiceUnavailable: Compute service of overcloud-compute-v1-1 is unavailable at this time.
~~~

The database does not contain entries for `overcloud-compute-v1-1.localdomain`, but only for `overcloud-compute-v1-1`. That means that even after a rename via Director, instances cannot be migrated off the node unless someone goes into the database and fixes this manually.

Actually, I'm not asking for a mitigation here. I'm asking that we do not let customers set flavor names or ComputeHostnameFormat values that contain "_". Or, alternatively, that we correctly convert all of them from "_" to "-".

Overall, this is a product bug: we accept invalid input in our templates and/or do not convert "_" to "-" everywhere we should.
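To make the two options above concrete (reject the input, or convert it consistently), here is a minimal Python sketch. All function names are hypothetical, and the plain string substitution of `%stackname%` and `%index%` is only an assumed stand-in for however the templates actually render the format.

```python
import string

# Characters allowed in a rendered hostname: letters, digits, hyphen,
# plus the dot that separates labels.
ALLOWED = set(string.ascii_letters + string.digits + "-.")


def render_hostname(fmt, stackname, index):
    # Assumed stand-in for the real template substitution.
    return fmt.replace("%stackname%", stackname).replace("%index%", str(index))


def rejected_characters(fmt, stackname="overcloud"):
    """Option 1: list the characters that should cause the format to be rejected."""
    sample = render_hostname(fmt, stackname, 0)
    return sorted(set(sample) - ALLOWED)


def convert_underscores(fmt):
    """Option 2: convert '_' to '-' once, before the value is used anywhere."""
    return fmt.replace("_", "-")


if __name__ == "__main__":
    fmt = "%stackname%-compute_v1-%index%"
    print(rejected_characters(fmt))
    print(render_hostname(convert_underscores(fmt), "overcloud", 1))
```

The point of option 2 is that the conversion happens in one place, so /etc/hosts, ssh_known_hosts, and the name nova-compute reports are all derived from the same converted value.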
We added a validation in OSP11 to prevent the use of underscores in stack names, which is where this originally snuck in. I believe we have a validation in place for FFU as well. The RHEL documentation has some additional details on valid hostnames: https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/7/html/networking_guide/ch-configure_host_names#sec_Understanding_Host_Names

At this point I'm not sure there's much to do in OSP10 without possibly breaking existing deployments. If a user has an existing stack deployed, they'll probably need to update the role hostname format so that it does not contain a '_' and update the node if possible.