Bug 1619556 - FFU: controllers upgrade fails on TASK [tripleo-ssh-known-hosts : Add hosts key in /etc/ssh/ssh_known_hosts for live/cold-migration]
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: openstack-tripleo-heat-templates
Version: 13.0 (Queens)
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: urgent
Target Milestone: z3
Target Release: 13.0 (Queens)
Assignee: Jose Luis Franco
QA Contact: Yurii Prokulevych
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2018-08-21 07:57 UTC by Marius Cornea
Modified: 2020-09-02 05:41 UTC (History)

Fixed In Version: openstack-tripleo-heat-templates-8.0.4-27.el7ost.src.rpm
Doc Type: No Doc Update
Doc Text:
Clone Of:
Environment:
Last Closed: 2018-11-13 22:28:18 UTC
Target Upstream Version:
Embargoed:



Links
System                  ID              Private  Priority  Status  Summary                     Last Updated
OpenStack gerrit        598587          0        None      MERGED  Always lowercase role name  2020-09-02 05:36:45 UTC
Red Hat Product Errata  RHBA-2018:3587  0        None      None    None                        2018-11-13 22:29:36 UTC

Description Marius Cornea 2018-08-21 07:57:10 UTC
Description of problem:

FFU: controllers upgrade fails on TASK [tripleo-ssh-known-hosts : Add hosts key in /etc/ssh/ssh_known_hosts for live/cold-migration] 

 u'PLAY [Common roles for TripleO servers] ****************************************',
 u'',
 u'TASK [tripleo-bootstrap : Deploy required packages to bootstrap TripleO] *******',
 u'Monday 20 August 2018  11:18:53 -0400 (0:00:00.106)       0:00:04.688 ********* ',
 u'ok: [controller-r02-00] => {"changed": false, "failed": false, "msg": "", "rc": 0, "results": ["openstack-heat-agents-1.5.4-0.20180308153305.ecf43c7.el7ost.noarch providing openstack-heat-agents is already installed", "jq-1.3-4.el7ost.x86_64 providing jq is already installed"]}',
 u'ok: [controller-r00-00] => {"changed": false, "failed": false, "msg": "", "rc": 0, "results": ["openstack-heat-agents-1.5.4-0.20180308153305.ecf43c7.el7ost.noarch providing openstack-heat-agents is already installed", "jq-1.3-4.el7ost.x86_64 providing jq is already installed"]}',
 u'ok: [controller-r01-00] => {"changed": false, "failed": false, "msg": "", "rc": 0, "results": ["openstack-heat-agents-1.5.4-0.20180308153305.ecf43c7.el7ost.noarch providing openstack-heat-agents is already installed", "jq-1.3-4.el7ost.x86_64 providing jq is already installed"]}',
 u'',
 u'TASK [tripleo-bootstrap : Create /var/lib/heat-config/tripleo-config-download directory for deployment data] ***',
 u'Monday 20 August 2018  11:18:54 -0400 (0:00:00.796)       0:00:05.485 ********* ',
 u'changed: [controller-r02-00] => {"changed": true, "failed": false, "gid": 0, "group": "root", "mode": "0755", "owner": "root", "path": "/var/lib/heat-config/tripleo-config-download", "secontext": "unconfined_u:object_r:var_lib_t:s0", "size": 6, "state": "directory", "uid": 0}',
 u'changed: [controller-r01-00] => {"changed": true, "failed": false, "gid": 0, "group": "root", "mode": "0755", "owner": "root", "path": "/var/lib/heat-config/tripleo-config-download", "secontext": "unconfined_u:object_r:var_lib_t:s0", "size": 6, "state": "directory", "uid": 0}',
 u'changed: [controller-r00-00] => {"changed": true, "failed": false, "gid": 0, "group": "root", "mode": "0755", "owner": "root", "path": "/var/lib/heat-config/tripleo-config-download", "secontext": "unconfined_u:object_r:var_lib_t:s0", "size": 6, "state": "directory", "uid": 0}',
 u'',
 u'TASK [tripleo-ssh-known-hosts : Add hosts key in /etc/ssh/ssh_known_hosts for live/cold-migration] ***',
 u'Monday 20 August 2018  11:18:55 -0400 (0:00:00.380)       0:00:05.865 ********* ',
 u'fatal: [controller-r01-00]: FAILED! => {"failed": true, "msg": "The task includes an option with an undefined variable. The error was: \'dict object\' has no attribute u\'controller-r01-00\'\\n\\nThe error appears to have been in \'/usr/share/ansible/roles/tripleo-ssh-known-hosts/tasks/main.yml\': line 2, column 3, but may\\nbe elsewhere in the file depending on the exact syntax problem.\\n\\nThe offending line appears to be:\\n\\n---\\n- name: Add hosts key in /etc/ssh/ssh_known_hosts for live/cold-migration\\n  ^ here\\n\\nexception type: <class \'ansible.errors.AnsibleUndefinedVariable\'>\\nexception: \'dict object\' has no attribute u\'controller-r01-00\'"}',
 u'fatal: [controller-r02-00]: FAILED! => {"failed": true, "msg": "The task includes an option with an undefined variable. The error was: \'dict object\' has no attribute u\'controller-r01-00\'\\n\\nThe error appears to have been in \'/usr/share/ansible/roles/tripleo-ssh-known-hosts/tasks/main.yml\': line 2, column 3, but may\\nbe elsewhere in the file depending on the exact syntax problem.\\n\\nThe offending line appears to be:\\n\\n---\\n- name: Add hosts key in /etc/ssh/ssh_known_hosts for live/cold-migration\\n  ^ here\\n\\nexception type: <class \'ansible.errors.AnsibleUndefinedVariable\'>\\nexception: \'dict object\' has no attribute u\'controller-r01-00\'"}',
 u'fatal: [controller-r00-00]: FAILED! => {"failed": true, "msg": "The task includes an option with an undefined variable. The error was: \'dict object\' has no attribute u\'controller-r01-00\'\\n\\nThe error appears to have been in \'/usr/share/ansible/roles/tripleo-ssh-known-hosts/tasks/main.yml\': line 2, column 3, but may\\nbe elsewhere in the file depending on the exact syntax problem.\\n\\nThe offending line appears to be:\\n\\n---\\n- name: Add hosts key in /etc/ssh/ssh_known_hosts for live/cold-migration\\n  ^ here\\n\\nexception type: <class \'ansible.errors.AnsibleUndefinedVariable\'>\\nexception: \'dict object\' has no attribute u\'controller-r01-00\'"}',
 u'',
 u'NO MORE HOSTS LEFT *************************************************************',
 u'',
 u'PLAY RECAP *********************************************************************',
 u'controller-r00-00          : ok=4    changed=1    unreachable=0    failed=1   ',
 u'controller-r01-00          : ok=4    changed=1    unreachable=0    failed=1   ',
 u'controller-r02-00          : ok=4    changed=1    unreachable=0    failed=1   ',
 u'',
 u'Monday 20 August 2018  11:18:55 -0400 (0:00:00.107)       0:00:05.972 ********* ',
 u'=============================================================================== ']

Version-Release number of selected component (if applicable):
puddle 2018-08-16.1

How reproducible:
1/1

Steps to Reproduce:
1. Deploy OSP10 with 3 controller, 2 compute, and 3 ceph nodes, using the following custom hostnames:

parameter_defaults:
    HostnameMap:
        ceph-0: CEPH-R00-00
        ceph-1: CEPH-R01-00
        ceph-2: CEPH-R02-00
        compute-0: COMPUTE-R00-00
        compute-1: COMPUTE-R01-00
        controller-0: CONTROLLER-R00-00
        controller-1: CONTROLLER-R01-00
        controller-2: CONTROLLER-R02-00

Actual results:

openstack overcloud upgrade run --roles Controller --skip-tags validation fails on TASK [tripleo-ssh-known-hosts : Add hosts key in /etc/ssh/ssh_known_hosts for live/cold-migration]

Expected results:
No failure.

Additional info:

Comment 2 Jose Luis Franco 2018-08-21 15:31:59 UTC
The problem comes from the uppercase hostnames. The ssh_known_hosts variable, which is referenced in the failing task [0], uses the uppercase hostnames as its dictionary keys, while Ansible handles the hostnames in lowercase:

ssh_known_hosts": {"CEPH-R00-00": "172.17.3.15,CEPH-R00-00.localdomain,CEPH-R00-00,172.17.3.15,CEPH-R00-00.storage.localdomain,CEPH-R00-00.storage,172.17.4.16,CEPH-R00-00.storagemgmt.localdomain,CEPH-R00-00.storagemgmt,192.168.24.10,CEPH-R00-00.internalapi.localdomain,CEPH-R00-00.internalapi,192.168.24.10,CEPH-R00-00.tenant.localdomain,CEPH-R00-00.tenant,192.168.24.10,CEPH-R00-00.external.localdomain,CEPH-R00-00.external,192.168.24.10,CEPH-R00-00.management.localdomain,CEPH-R00-00.management,192.168.24.10,CEPH-R00-00.ctlplane.localdomain,CEPH-R00-00.ctlplane", "CEPH-R01-00": "172.17.3.20,CEPH-R01-00.localdomain,CEPH-R01-00,172.17.3.20,CEPH-R01-00.storage.localdomain,CEPH-R01-00.storage,172.17.4.19,CEPH-R01-00.storagemgmt.localdomain,CEPH-R01-00.storagemgmt,192.168.24.12,CEPH-R01-00.internalapi.localdomain,CEPH-R01-00.internalapi,192.168.24.12,CEPH-R01-00.tenant.localdomain,CEPH-R01-00.tenant,192.168.24.12,CEPH-R01-00.external.localdomain,CEPH-R01-00.external,192.168.24.12,CEPH-R01-00.management.localdomain,CEPH-R01-00.management,192.168.24.12,CEPH-R01-00.ctlplane.localdomain,CEPH-R01-00.ctlplane", "CEPH-R02-00": "172.17.3.14,CEPH-R02-00.localdomain,CEPH-R02-00,172.17.3.14,CEPH-R02-00.storage.localdomain,CEPH-R02-00.storage,172.17.4.13,CEPH-R02-00.storagemgmt.localdomain,CEPH-R02-00.storagemgmt,192.168.24.13,CEPH-R02-00.internalapi.localdomain,CEPH-R02-00.internalapi,192.168.24.13,CEPH-R02-00.tenant.localdomain,CEPH-R02-00.tenant,192.168.24.13,CEPH-R02-00.external.localdomain,CEPH-R02-00.external,192.168.24.13,CEPH-R02-00.management.localdomain,CEPH-R02-00.management,192.168.24.13,CEPH-R02-00.ctlplane.localdomain,CEPH-R02-00.ctlplane", "COMPUTE-R00-00": 
"172.17.1.13,COMPUTE-R00-00.localdomain,COMPUTE-R00-00,172.17.3.18,COMPUTE-R00-00.storage.localdomain,COMPUTE-R00-00.storage,192.168.24.6,COMPUTE-R00-00.storagemgmt.localdomain,COMPUTE-R00-00.storagemgmt,172.17.1.13,COMPUTE-R00-00.internalapi.localdomain,COMPUTE-R00-00.internalapi,172.17.2.18,COMPUTE-R00-00.tenant.localdomain,COMPUTE-R00-00.tenant,192.168.24.6,COMPUTE-R00-00.external.localdomain,COMPUTE-R00-00.external,192.168.24.6,COMPUTE-R00-00.management.localdomain,COMPUTE-R00-00.management,192.168.24.6,COMPUTE-R00-00.ctlplane.localdomain,COMPUTE-R00-00.ctlplane", "COMPUTE-R01-00": "172.17.1.20,COMPUTE-R01-00.localdomain,COMPUTE-R01-00,172.17.3.11,COMPUTE-R01-00.storage.localdomain,COMPUTE-R01-00.storage,192.168.24.14,COMPUTE-R01-00.storagemgmt.localdomain,COMPUTE-R01-00.storagemgmt,172.17.1.20,COMPUTE-R01-00.internalapi.localdomain,COMPUTE-R01-00.internalapi,172.17.2.17,COMPUTE-R01-00.tenant.localdomain,COMPUTE-R01-00.tenant,192.168.24.14,COMPUTE-R01-00.external.localdomain,COMPUTE-R01-00.external,192.168.24.14,COMPUTE-R01-00.management.localdomain,COMPUTE-R01-00.management,192.168.24.14,COMPUTE-R01-00.ctlplane.localdomain,COMPUTE-R01-00.ctlplane", "CONTROLLER-R00-00": "172.17.1.18,CONTROLLER-R00-00.localdomain,CONTROLLER-R00-00,172.17.3.24,CONTROLLER-R00-00.storage.localdomain,CONTROLLER-R00-00.storage,172.17.4.21,CONTROLLER-R00-00.storagemgmt.localdomain,CONTROLLER-R00-00.storagemgmt,172.17.1.18,CONTROLLER-R00-00.internalapi.localdomain,CONTROLLER-R00-00.internalapi,172.17.2.20,CONTROLLER-R00-00.tenant.localdomain,CONTROLLER-R00-00.tenant,10.0.0.111,CONTROLLER-R00-00.external.localdomain,CONTROLLER-R00-00.external,192.168.24.15,CONTROLLER-R00-00.management.localdomain,CONTROLLER-R00-00.management,192.168.24.15,CONTROLLER-R00-00.ctlplane.localdomain,CONTROLLER-R00-00.ctlplane", "CONTROLLER-R01-00": 
"172.17.1.15,CONTROLLER-R01-00.localdomain,CONTROLLER-R01-00,172.17.3.12,CONTROLLER-R01-00.storage.localdomain,CONTROLLER-R01-00.storage,172.17.4.11,CONTROLLER-R01-00.storagemgmt.localdomain,CONTROLLER-R01-00.storagemgmt,172.17.1.15,CONTROLLER-R01-00.internalapi.localdomain,CONTROLLER-R01-00.internalapi,172.17.2.11,CONTROLLER-R01-00.tenant.localdomain,CONTROLLER-R01-00.tenant,10.0.0.106,CONTROLLER-R01-00.external.localdomain,CONTROLLER-R01-00.external,192.168.24.17,CONTROLLER-R01-00.management.localdomain,CONTROLLER-R01-00.management,192.168.24.17,CONTROLLER-R01-00.ctlplane.localdomain,CONTROLLER-R01-00.ctlplane", "CONTROLLER-R02-00": "172.17.1.21,CONTROLLER-R02-00.localdomain,CONTROLLER-R02-00,172.17.3.26,CONTROLLER-R02-00.storage.localdomain,CONTROLLER-R02-00.storage,172.17.4.18,CONTROLLER-R02-00.storagemgmt.localdomain,CONTROLLER-R02-00.storagemgmt,172.17.1.21,CONTROLLER-R02-00.internalapi.localdomain,CONTROLLER-R02-00.internalapi,172.17.2.22,CONTROLLER-R02-00.tenant.localdomain,CONTROLLER-R02-00.tenant,10.0.0.103,CONTROLLER-R02-00.external.localdomain,CONTROLLER-R02-00.external,192.168.24.20,CONTROLLER-R02-00.management.localdomain,CONTROLLER-R02-00.management,192.168.24.20,CONTROLLER-R02-00.ctlplane.localdomain,CONTROLLER-R02-00.ctlplane"}}, "ansible_included_var_files": ["/var/lib/mistral/07674ebb-4f32-4a7a-ac6f-ad73beff16be/global_vars.yaml"], "changed": false, "failed": false}

The items that the task's with_items loop iterates over, however, are in lowercase:

[ 'controller-r02-00', 'controller-r01-00', 'controller-r00-00' ]

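When the task tries to access ssh_known_hosts['controller-r01-00'], no key with that exact case exists, so the lookup fails with the AnsibleUndefinedVariable error shown in the description.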

[0] - https://github.com/openstack/tripleo-common/blob/master/roles/tripleo-ssh-known-hosts/tasks/main.yml#L6
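
The mismatch can be reproduced outside Ansible with a plain Python dictionary. The sketch below is illustrative only: the dictionary values are shortened from the dump above, and the helper names missing_keys and lowercase_keys are hypothetical, not part of tripleo-common. Lowercasing the keys is shown as one way to make the lookup succeed; it is not claimed to be what the merged patch (titled "Always lowercase role name" in the Links section) does.

```python
# Keys as stored in ssh_known_hosts (uppercase, taken from HostnameMap).
# Values are shortened from the dump in this comment.
ssh_known_hosts = {
    "CONTROLLER-R00-00": "172.17.1.18,CONTROLLER-R00-00.localdomain,CONTROLLER-R00-00",
    "CONTROLLER-R01-00": "172.17.1.15,CONTROLLER-R01-00.localdomain,CONTROLLER-R01-00",
    "CONTROLLER-R02-00": "172.17.1.21,CONTROLLER-R02-00.localdomain,CONTROLLER-R02-00",
}

# Hostnames as the with_items loop sees them (lowercase).
inventory_hosts = ["controller-r02-00", "controller-r01-00", "controller-r00-00"]


def missing_keys(known_hosts, hosts):
    """Return the hosts that have no exact-case key in known_hosts."""
    return [host for host in hosts if host not in known_hosts]


def lowercase_keys(known_hosts):
    """Return a copy of known_hosts with every key lowercased (hypothetical helper)."""
    return {name.lower(): entry for name, entry in known_hosts.items()}


# Dictionary lookups are case-sensitive, so every lowercase host is missing.
print(missing_keys(ssh_known_hosts, inventory_hosts))
# prints ['controller-r02-00', 'controller-r01-00', 'controller-r00-00']

# After normalizing the keys, every host resolves.
print(missing_keys(lowercase_keys(ssh_known_hosts), inventory_hosts))
# prints []
```

The first call lists all three controllers as missing, which matches the "'dict object' has no attribute u'controller-r01-00'" failure in the description; the second returns an empty list once both sides use the same case.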

Comment 6 Amit Ugol 2018-10-16 09:59:24 UTC
Passed automation.

Comment 11 errata-xmlrpc 2018-11-13 22:28:18 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2018:3587

