Bug 1619556

Summary: FFU: controllers upgrade fails on TASK [tripleo-ssh-known-hosts : Add hosts key in /etc/ssh/ssh_known_hosts for live/cold-migration]
Product: Red Hat OpenStack
Reporter: Marius Cornea <mcornea>
Component: openstack-tripleo-heat-templates
Assignee: Jose Luis Franco <jfrancoa>
Status: CLOSED ERRATA
QA Contact: Yurii Prokulevych <yprokule>
Severity: urgent
Docs Contact:
Priority: high
Version: 13.0 (Queens)
CC: augol, dbecker, jfrancoa, jschluet, jstransk, mburns, morazi, sgolovat
Target Milestone: z3
Keywords: Triaged, ZStream
Target Release: 13.0 (Queens)
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version: openstack-tripleo-heat-templates-8.0.4-27.el7ost.src.rpm
Doc Type: No Doc Update
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2018-11-13 22:28:18 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description Marius Cornea 2018-08-21 07:57:10 UTC
Description of problem:

FFU: controllers upgrade fails on TASK [tripleo-ssh-known-hosts : Add hosts key in /etc/ssh/ssh_known_hosts for live/cold-migration] 

 u'PLAY [Common roles for TripleO servers] ****************************************',
 u'',
 u'TASK [tripleo-bootstrap : Deploy required packages to bootstrap TripleO] *******',
 u'Monday 20 August 2018  11:18:53 -0400 (0:00:00.106)       0:00:04.688 ********* ',
 u'ok: [controller-r02-00] => {"changed": false, "failed": false, "msg": "", "rc": 0, "results": ["openstack-heat-agents-1.5.4-0.20180308153305.ecf43c7.el7ost.noarch providing openstack-heat-agents is already installed", "jq-1.3-4.el7ost.x86_64 providing jq is already installed"]}',
 u'ok: [controller-r00-00] => {"changed": false, "failed": false, "msg": "", "rc": 0, "results": ["openstack-heat-agents-1.5.4-0.20180308153305.ecf43c7.el7ost.noarch providing openstack-heat-agents is already installed", "jq-1.3-4.el7ost.x86_64 providing jq is already installed"]}',
 u'ok: [controller-r01-00] => {"changed": false, "failed": false, "msg": "", "rc": 0, "results": ["openstack-heat-agents-1.5.4-0.20180308153305.ecf43c7.el7ost.noarch providing openstack-heat-agents is already installed", "jq-1.3-4.el7ost.x86_64 providing jq is already installed"]}',
 u'',
 u'TASK [tripleo-bootstrap : Create /var/lib/heat-config/tripleo-config-download directory for deployment data] ***',
 u'Monday 20 August 2018  11:18:54 -0400 (0:00:00.796)       0:00:05.485 ********* ',
 u'changed: [controller-r02-00] => {"changed": true, "failed": false, "gid": 0, "group": "root", "mode": "0755", "owner": "root", "path": "/var/lib/heat-config/tripleo-config-download", "secontext": "unconfined_u:object_r:var_lib_t:s0", "size": 6, "state": "directory", "uid": 0}',
 u'changed: [controller-r01-00] => {"changed": true, "failed": false, "gid": 0, "group": "root", "mode": "0755", "owner": "root", "path": "/var/lib/heat-config/tripleo-config-download", "secontext": "unconfined_u:object_r:var_lib_t:s0", "size": 6, "state": "directory", "uid": 0}',
 u'changed: [controller-r00-00] => {"changed": true, "failed": false, "gid": 0, "group": "root", "mode": "0755", "owner": "root", "path": "/var/lib/heat-config/tripleo-config-download", "secontext": "unconfined_u:object_r:var_lib_t:s0", "size": 6, "state": "directory", "uid": 0}',
 u'',
 u'TASK [tripleo-ssh-known-hosts : Add hosts key in /etc/ssh/ssh_known_hosts for live/cold-migration] ***',
 u'Monday 20 August 2018  11:18:55 -0400 (0:00:00.380)       0:00:05.865 ********* ',
 u'fatal: [controller-r01-00]: FAILED! => {"failed": true, "msg": "The task includes an option with an undefined variable. The error was: \'dict object\' has no attribute u\'controller-r01-00\'\\n\\nThe error appears to have been in \'/usr/share/ansible/roles/tripleo-ssh-known-hosts/tasks/main.yml\': line 2, column 3, but may\\nbe elsewhere in the file depending on the exact syntax problem.\\n\\nThe offending line appears to be:\\n\\n---\\n- name: Add hosts key in /etc/ssh/ssh_known_hosts for live/cold-migration\\n  ^ here\\n\\nexception type: <class \'ansible.errors.AnsibleUndefinedVariable\'>\\nexception: \'dict object\' has no attribute u\'controller-r01-00\'"}',
 u'fatal: [controller-r02-00]: FAILED! => {"failed": true, "msg": "The task includes an option with an undefined variable. The error was: \'dict object\' has no attribute u\'controller-r01-00\'\\n\\nThe error appears to have been in \'/usr/share/ansible/roles/tripleo-ssh-known-hosts/tasks/main.yml\': line 2, column 3, but may\\nbe elsewhere in the file depending on the exact syntax problem.\\n\\nThe offending line appears to be:\\n\\n---\\n- name: Add hosts key in /etc/ssh/ssh_known_hosts for live/cold-migration\\n  ^ here\\n\\nexception type: <class \'ansible.errors.AnsibleUndefinedVariable\'>\\nexception: \'dict object\' has no attribute u\'controller-r01-00\'"}',
 u'fatal: [controller-r00-00]: FAILED! => {"failed": true, "msg": "The task includes an option with an undefined variable. The error was: \'dict object\' has no attribute u\'controller-r01-00\'\\n\\nThe error appears to have been in \'/usr/share/ansible/roles/tripleo-ssh-known-hosts/tasks/main.yml\': line 2, column 3, but may\\nbe elsewhere in the file depending on the exact syntax problem.\\n\\nThe offending line appears to be:\\n\\n---\\n- name: Add hosts key in /etc/ssh/ssh_known_hosts for live/cold-migration\\n  ^ here\\n\\nexception type: <class \'ansible.errors.AnsibleUndefinedVariable\'>\\nexception: \'dict object\' has no attribute u\'controller-r01-00\'"}',
 u'',
 u'NO MORE HOSTS LEFT *************************************************************',
 u'',
 u'PLAY RECAP *********************************************************************',
 u'controller-r00-00          : ok=4    changed=1    unreachable=0    failed=1   ',
 u'controller-r01-00          : ok=4    changed=1    unreachable=0    failed=1   ',
 u'controller-r02-00          : ok=4    changed=1    unreachable=0    failed=1   ',
 u'',
 u'Monday 20 August 2018  11:18:55 -0400 (0:00:00.107)       0:00:05.972 ********* ',
 u'=============================================================================== ']

Version-Release number of selected component (if applicable):
puddle 2018-08-16.1

How reproducible:
1/1

Steps to Reproduce:
1. Deploy OSP10 with 3 controllers + 2 computes + 3 ceph nodes, using the following custom hostnames:

parameter_defaults:
    HostnameMap:
        ceph-0: CEPH-R00-00
        ceph-1: CEPH-R01-00
        ceph-2: CEPH-R02-00
        compute-0: COMPUTE-R00-00
        compute-1: COMPUTE-R01-00
        controller-0: CONTROLLER-R00-00
        controller-1: CONTROLLER-R01-00
        controller-2: CONTROLLER-R02-00

Actual results:

openstack overcloud upgrade run --roles Controller --skip-tags validation fails on TASK [tripleo-ssh-known-hosts : Add hosts key in /etc/ssh/ssh_known_hosts for live/cold-migration]

Expected results:
No failure.

Additional info:

Comment 2 Jose Luis Franco 2018-08-21 15:31:59 UTC
The problem comes from the uppercase hostnames. The ssh_known_hosts variable, which is referenced within the failing task [0], uses the uppercase hostname as its dictionary key, while ansible handles the hostnames in lowercase:

ssh_known_hosts": {"CEPH-R00-00": "172.17.3.15,CEPH-R00-00.localdomain,CEPH-R00-00,172.17.3.15,CEPH-R00-00.storage.localdomain,CEPH-R00-00.storage,172.17.4.16,CEPH-R00-00.storagemgmt.localdomain,CEPH-R00-00.storagemgmt,192.168.24.10,CEPH-R00-00.internalapi.localdomain,CEPH-R00-00.internalapi,192.168.24.10,CEPH-R00-00.tenant.localdomain,CEPH-R00-00.tenant,192.168.24.10,CEPH-R00-00.external.localdomain,CEPH-R00-00.external,192.168.24.10,CEPH-R00-00.management.localdomain,CEPH-R00-00.management,192.168.24.10,CEPH-R00-00.ctlplane.localdomain,CEPH-R00-00.ctlplane", "CEPH-R01-00": "172.17.3.20,CEPH-R01-00.localdomain,CEPH-R01-00,172.17.3.20,CEPH-R01-00.storage.localdomain,CEPH-R01-00.storage,172.17.4.19,CEPH-R01-00.storagemgmt.localdomain,CEPH-R01-00.storagemgmt,192.168.24.12,CEPH-R01-00.internalapi.localdomain,CEPH-R01-00.internalapi,192.168.24.12,CEPH-R01-00.tenant.localdomain,CEPH-R01-00.tenant,192.168.24.12,CEPH-R01-00.external.localdomain,CEPH-R01-00.external,192.168.24.12,CEPH-R01-00.management.localdomain,CEPH-R01-00.management,192.168.24.12,CEPH-R01-00.ctlplane.localdomain,CEPH-R01-00.ctlplane", "CEPH-R02-00": "172.17.3.14,CEPH-R02-00.localdomain,CEPH-R02-00,172.17.3.14,CEPH-R02-00.storage.localdomain,CEPH-R02-00.storage,172.17.4.13,CEPH-R02-00.storagemgmt.localdomain,CEPH-R02-00.storagemgmt,192.168.24.13,CEPH-R02-00.internalapi.localdomain,CEPH-R02-00.internalapi,192.168.24.13,CEPH-R02-00.tenant.localdomain,CEPH-R02-00.tenant,192.168.24.13,CEPH-R02-00.external.localdomain,CEPH-R02-00.external,192.168.24.13,CEPH-R02-00.management.localdomain,CEPH-R02-00.management,192.168.24.13,CEPH-R02-00.ctlplane.localdomain,CEPH-R02-00.ctlplane", "COMPUTE-R00-00": 
"172.17.1.13,COMPUTE-R00-00.localdomain,COMPUTE-R00-00,172.17.3.18,COMPUTE-R00-00.storage.localdomain,COMPUTE-R00-00.storage,192.168.24.6,COMPUTE-R00-00.storagemgmt.localdomain,COMPUTE-R00-00.storagemgmt,172.17.1.13,COMPUTE-R00-00.internalapi.localdomain,COMPUTE-R00-00.internalapi,172.17.2.18,COMPUTE-R00-00.tenant.localdomain,COMPUTE-R00-00.tenant,192.168.24.6,COMPUTE-R00-00.external.localdomain,COMPUTE-R00-00.external,192.168.24.6,COMPUTE-R00-00.management.localdomain,COMPUTE-R00-00.management,192.168.24.6,COMPUTE-R00-00.ctlplane.localdomain,COMPUTE-R00-00.ctlplane", "COMPUTE-R01-00": "172.17.1.20,COMPUTE-R01-00.localdomain,COMPUTE-R01-00,172.17.3.11,COMPUTE-R01-00.storage.localdomain,COMPUTE-R01-00.storage,192.168.24.14,COMPUTE-R01-00.storagemgmt.localdomain,COMPUTE-R01-00.storagemgmt,172.17.1.20,COMPUTE-R01-00.internalapi.localdomain,COMPUTE-R01-00.internalapi,172.17.2.17,COMPUTE-R01-00.tenant.localdomain,COMPUTE-R01-00.tenant,192.168.24.14,COMPUTE-R01-00.external.localdomain,COMPUTE-R01-00.external,192.168.24.14,COMPUTE-R01-00.management.localdomain,COMPUTE-R01-00.management,192.168.24.14,COMPUTE-R01-00.ctlplane.localdomain,COMPUTE-R01-00.ctlplane", "CONTROLLER-R00-00": "172.17.1.18,CONTROLLER-R00-00.localdomain,CONTROLLER-R00-00,172.17.3.24,CONTROLLER-R00-00.storage.localdomain,CONTROLLER-R00-00.storage,172.17.4.21,CONTROLLER-R00-00.storagemgmt.localdomain,CONTROLLER-R00-00.storagemgmt,172.17.1.18,CONTROLLER-R00-00.internalapi.localdomain,CONTROLLER-R00-00.internalapi,172.17.2.20,CONTROLLER-R00-00.tenant.localdomain,CONTROLLER-R00-00.tenant,10.0.0.111,CONTROLLER-R00-00.external.localdomain,CONTROLLER-R00-00.external,192.168.24.15,CONTROLLER-R00-00.management.localdomain,CONTROLLER-R00-00.management,192.168.24.15,CONTROLLER-R00-00.ctlplane.localdomain,CONTROLLER-R00-00.ctlplane", "CONTROLLER-R01-00": 
"172.17.1.15,CONTROLLER-R01-00.localdomain,CONTROLLER-R01-00,172.17.3.12,CONTROLLER-R01-00.storage.localdomain,CONTROLLER-R01-00.storage,172.17.4.11,CONTROLLER-R01-00.storagemgmt.localdomain,CONTROLLER-R01-00.storagemgmt,172.17.1.15,CONTROLLER-R01-00.internalapi.localdomain,CONTROLLER-R01-00.internalapi,172.17.2.11,CONTROLLER-R01-00.tenant.localdomain,CONTROLLER-R01-00.tenant,10.0.0.106,CONTROLLER-R01-00.external.localdomain,CONTROLLER-R01-00.external,192.168.24.17,CONTROLLER-R01-00.management.localdomain,CONTROLLER-R01-00.management,192.168.24.17,CONTROLLER-R01-00.ctlplane.localdomain,CONTROLLER-R01-00.ctlplane", "CONTROLLER-R02-00": "172.17.1.21,CONTROLLER-R02-00.localdomain,CONTROLLER-R02-00,172.17.3.26,CONTROLLER-R02-00.storage.localdomain,CONTROLLER-R02-00.storage,172.17.4.18,CONTROLLER-R02-00.storagemgmt.localdomain,CONTROLLER-R02-00.storagemgmt,172.17.1.21,CONTROLLER-R02-00.internalapi.localdomain,CONTROLLER-R02-00.internalapi,172.17.2.22,CONTROLLER-R02-00.tenant.localdomain,CONTROLLER-R02-00.tenant,10.0.0.103,CONTROLLER-R02-00.external.localdomain,CONTROLLER-R02-00.external,192.168.24.20,CONTROLLER-R02-00.management.localdomain,CONTROLLER-R02-00.management,192.168.24.20,CONTROLLER-R02-00.ctlplane.localdomain,CONTROLLER-R02-00.ctlplane"}}, "ansible_included_var_files": ["/var/lib/mistral/07674ebb-4f32-4a7a-ac6f-ad73beff16be/global_vars.yaml"], "changed": false, "failed": false}

The items that the task iterates over with with_items, however, come in lowercase:

[ 'controller-r02-00', 'controller-r01-00', 'controller-r00-00' ]

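Dictionary keys are case-sensitive, so when the task tries to access ssh_known_hosts['controller-r01-00'] the key can't be found, and the task fails with the undefined variable error from the description.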

[0] - https://github.com/openstack/tripleo-common/blob/master/roles/tripleo-ssh-known-hosts/tasks/main.yml#L6
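
To illustrate the mismatch outside of ansible, here is a minimal Python sketch. It assumes a shortened copy of the ssh_known_hosts dictionary (only the first three entries of each controller value are kept) and the lowercase host list shown above. The helper names lookup_exact and lookup_ignoring_case are hypothetical, and the case-insensitive lookup is only one possible way to reconcile the two spellings, not necessarily the change shipped in the fixed package.

```python
# Shortened copy of ssh_known_hosts: keys keep the uppercase names from HostnameMap.
ssh_known_hosts = {
    "CONTROLLER-R00-00": "172.17.1.18,CONTROLLER-R00-00.localdomain,CONTROLLER-R00-00",
    "CONTROLLER-R01-00": "172.17.1.15,CONTROLLER-R01-00.localdomain,CONTROLLER-R01-00",
    "CONTROLLER-R02-00": "172.17.1.21,CONTROLLER-R02-00.localdomain,CONTROLLER-R02-00",
}

# Host names as the failing task receives them through with_items (lowercase).
with_items_hosts = ["controller-r02-00", "controller-r01-00", "controller-r00-00"]


def lookup_exact(known_hosts, host):
    """Case-sensitive lookup, comparable to ssh_known_hosts[host] in the task."""
    return known_hosts[host]


def lookup_ignoring_case(known_hosts, host):
    """Hypothetical workaround: compare keys and host name in lowercase."""
    lowered = {key.lower(): value for key, value in known_hosts.items()}
    return lowered[host.lower()]


for host in with_items_hosts:
    try:
        lookup_exact(ssh_known_hosts, host)
        print(host, "found with exact lookup")
    except KeyError:
        # The lowercase name is not a key of the uppercase-keyed dictionary.
        print(host, "missing with exact lookup, found ignoring case:",
              lookup_ignoring_case(ssh_known_hosts, host))
```

With the uppercase keys, every exact lookup raises KeyError, which corresponds to the "'dict object' has no attribute" error that ansible reports for the task.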

Comment 6 Amit Ugol 2018-10-16 09:59:24 UTC
Passed automation.

Comment 11 errata-xmlrpc 2018-11-13 22:28:18 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2018:3587