Bug 1641085

Summary: Openshift-on-Openstack install playbook fails after PR #10409
Product: OpenShift Container Platform Reporter: Jon Uriarte <juriarte>
Component: Installer Assignee: Tomas Sedovic <tsedovic>
Status: CLOSED ERRATA QA Contact: Jon Uriarte <juriarte>
Severity: urgent Docs Contact:
Priority: high    
Version: 3.10.0 CC: aos-bugs, gcheresh, hasha, itbrown, jokerman, mmccomas, tsedovic, ushkalim, wsun
Target Milestone: --- Keywords: Triaged
Target Release: 3.10.z   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Cause: The OpenStack dynamic inventory always set the `openshift_kubelet_name_override` Ansible variable.
Consequence: This variable is only expected to be set during upgrades from 3.10 to 3.11. Setting it for brand-new deployments causes the openshift-ansible playbooks to fail with an error.
Fix: The inventory no longer sets the `openshift_kubelet_name_override` variable automatically.
Result: OpenStack cloud deployments now finish successfully.
Last Closed: 2019-01-10 09:27:10 UTC Type: Bug

Description Jon Uriarte 2018-10-19 15:20:39 UTC
Description of problem:

Cannot install OCP 3.10 on OSP. PR https://github.com/openshift/openshift-ansible/pull/10409 appears to cause this failure, but I don't know whether it is a bug in OCP or whether the OpenStack install playbook needs some reordering in how it calls the prerequisites playbook and sets facts.

Version-Release number of the following components:
rpm -q openshift-ansible
openshift-ansible-3.10.59-1.git.0.f9ba890.el7.noarch

rpm -q ansible
ansible-2.4.6.0-1.el7ae.noarch

ansible --version
ansible 2.4.6.0
  config file = /etc/ansible/ansible.cfg
  configured module search path = [u'/home/cloud-user/.ansible/plugins/modules', u'/usr/share/ansible/plugins/modules']
  ansible python module location = /usr/lib/python2.7/site-packages/ansible
  executable location = /usr/bin/ansible
  python version = 2.7.5 (default, Feb 20 2018, 09:19:12) [GCC 4.8.5 20150623 (Red Hat 4.8.5-28)]


How reproducible: always

Steps to Reproduce:
1. Install OSP 13
2. Run the OpenStack playbooks:
$ ansible-playbook --user openshift \
    -i "/usr/share/ansible/openshift-ansible/playbooks/openstack/inventory.py" \
    -i inventory \
    "/usr/share/ansible/openshift-ansible/playbooks/openstack/openshift-cluster/prerequisites.yml"

$ ansible-playbook --user openshift \
    -i "/usr/share/ansible/openshift-ansible/playbooks/openstack/inventory.py" \
    -i inventory \
    "/usr/share/ansible/openshift-ansible/playbooks/openstack/openshift-cluster/provision.yml"

$ ansible-playbook --user openshift \
    -i "/usr/share/ansible/openshift-ansible/playbooks/openstack/inventory.py" \
    -i inventory \
    red-hat-ca.yml

$ ansible-playbook --user openshift \
    -i "/usr/share/ansible/openshift-ansible/playbooks/openstack/inventory.py" \
    -i inventory \
    repos.yml

$ ansible-playbook --user openshift \
    -i "/usr/share/ansible/openshift-ansible/playbooks/openstack/inventory.py" \
    -i inventory \
    "/usr/share/ansible/openshift-ansible/playbooks/openstack/openshift-cluster/install.yml"
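
The playbook sequence above can also be expressed as a small script, which makes the ordering explicit. This is only a sketch: the helper name `build_command` and the `PLAYBOOKS` list are illustrative and not part of openshift-ansible, and it prints the commands rather than running them.

```python
import shlex

# Paths copied from the reproduction steps in this report.
INVENTORY = "/usr/share/ansible/openshift-ansible/playbooks/openstack/inventory.py"
CLUSTER_DIR = "/usr/share/ansible/openshift-ansible/playbooks/openstack/openshift-cluster"

# Playbooks in the order they were run; install.yml is the one that fails.
PLAYBOOKS = [
    CLUSTER_DIR + "/prerequisites.yml",
    CLUSTER_DIR + "/provision.yml",
    "red-hat-ca.yml",
    "repos.yml",
    CLUSTER_DIR + "/install.yml",
]


def build_command(playbook):
    """Return the ansible-playbook argument list for one playbook (hypothetical helper)."""
    return [
        "ansible-playbook",
        "--user", "openshift",
        "-i", INVENTORY,
        "-i", "inventory",
        playbook,
    ]


for playbook in PLAYBOOKS:
    # shlex.join (Python 3.8+) quotes arguments so the line can be pasted into a shell.
    print(shlex.join(build_command(playbook)))
```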


Actual results:

PLAY [Fail openshift_kubelet_name_override for new hosts] *********************************************************************************************************************************************************

TASK [Gathering Facts] ********************************************************************************************************************************************************************************************
ok: [app-node-0.openshift.example.com]
ok: [infra-node-0.openshift.example.com]
ok: [app-node-1.openshift.example.com]
ok: [master-0.openshift.example.com]

TASK [Fail when openshift_kubelet_name_override is defined] *******************************************************************************************************************************************************
fatal: [app-node-0.openshift.example.com]: FAILED! => {"changed": false, "failed": true, "msg": "openshift_kubelet_name_override Cannot be defined for new hosts"}
fatal: [app-node-1.openshift.example.com]: FAILED! => {"changed": false, "failed": true, "msg": "openshift_kubelet_name_override Cannot be defined for new hosts"}
fatal: [master-0.openshift.example.com]: FAILED! => {"changed": false, "failed": true, "msg": "openshift_kubelet_name_override Cannot be defined for new hosts"}
fatal: [infra-node-0.openshift.example.com]: FAILED! => {"changed": false, "failed": true, "msg": "openshift_kubelet_name_override Cannot be defined for new hosts"}
 [WARNING]: Could not create retry file '/usr/share/ansible/openshift-ansible/playbooks/openstack/openshift-cluster/install.retry'.         [Errno 13] Permission denied: u'/usr/share/ansible/openshift-ansible/playbooks/openstack/openshift-cluster/install.retry'


PLAY RECAP ********************************************************************************************************************************************************************************************************
app-node-0.openshift.example.com : ok=1    changed=0    unreachable=0    failed=1   
app-node-1.openshift.example.com : ok=1    changed=0    unreachable=0    failed=1   
infra-node-0.openshift.example.com : ok=1    changed=0    unreachable=0    failed=1   
master-0.openshift.example.com : ok=1    changed=0    unreachable=0    failed=1   



Failure summary:


  1. Hosts:    app-node-0.openshift.example.com, app-node-1.openshift.example.com, infra-node-0.openshift.example.com, master-0.openshift.example.com
     Play:     Fail openshift_kubelet_name_override for new hosts
     Task:     Fail when openshift_kubelet_name_override is defined
     Message:  openshift_kubelet_name_override Cannot be defined for new hosts



Expected results: successful installation

Comment 1 Tomas Sedovic 2018-10-22 07:25:36 UTC
The issue is with this line:

https://github.com/openshift/openshift-ansible/blob/6b1f210660771d2066186a7ba793f85d7c526285/roles/openshift_openstack/templates/heat_stack_server.yaml.j2#L247-L249

and the corresponding:

https://github.com/openshift/openshift-ansible/blob/6b1f210660771d2066186a7ba793f85d7c526285/playbooks/openstack/resources.py#L104-L105

This variable was introduced here: https://github.com/openshift/openshift-ansible/commit/1faee0942dec05b6f652669ad6cfced986a0cbc9

(it used to be called `openshift_hostname`)

And it is now disallowed for any new (non-upgrade) deployments:

https://github.com/openshift/openshift-ansible/pull/10409/files

So to resolve this, we must stop setting the variable in our dynamic inventory (resources.py).

I can't tell whether that will be enough, though. Not setting the variable may cause other trouble we'll have to figure out.
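
A minimal sketch of what "stop setting the variable" could look like in a dynamic inventory. This is not the actual resources.py or inventory.py code: the `build_hostvars` helper, the `server` dict shape, and the `extra_vars` parameter are assumptions made for illustration. The idea is that the override is never added automatically and only appears if the operator passes it explicitly, for example during an upgrade.

```python
OVERRIDE_VAR = "openshift_kubelet_name_override"


def build_hostvars(server, extra_vars=None):
    """Build Ansible host variables for one server (hypothetical helper).

    `server` is an illustrative dict with "name" and "address" keys; the
    real inventory works with OpenStack server objects instead.
    """
    hostvars = {"ansible_host": server["address"]}
    # The override is deliberately NOT set here. It is only present when
    # the caller supplies it explicitly through extra_vars.
    hostvars.update(extra_vars or {})
    return hostvars


server = {"name": "master-0.openshift.example.com", "address": "192.0.2.10"}
print(build_hostvars(server))
```

Whether the nodes still register under the expected names without the override is the open question raised above; the sketch does not address it.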

Comment 2 Jon Uriarte 2018-10-22 09:37:01 UTC
In OCP 3.10, 'openshift_kubelet_name_override' is set in inventory.py rather than in resources.py.

Commenting out the lines at https://github.com/openshift/openshift-ansible/blob/release-3.10/playbooks/openstack/inventory.py#L100-L101 in OCP 3.10 worked: the install playbook now finishes successfully.
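
One way to confirm the workaround before running install.yml is to inspect the inventory output for hosts that still define the variable. The sketch below assumes the inventory script follows the usual Ansible dynamic inventory convention of emitting JSON with host variables under `_meta.hostvars`; the `hosts_with_override` helper and the sample data are illustrative only.

```python
import json

OVERRIDE_VAR = "openshift_kubelet_name_override"


def hosts_with_override(inventory):
    """Return the sorted names of hosts whose hostvars define the override."""
    hostvars = inventory.get("_meta", {}).get("hostvars", {})
    return sorted(host for host, hv in hostvars.items() if OVERRIDE_VAR in hv)


# Illustrative sample; in practice this would be the parsed JSON output
# of the dynamic inventory script.
SAMPLE = json.loads("""
{
  "_meta": {
    "hostvars": {
      "master-0.openshift.example.com": {"ansible_host": "192.0.2.10"},
      "app-node-0.openshift.example.com": {
        "ansible_host": "192.0.2.11",
        "openshift_kubelet_name_override": "app-node-0"
      }
    }
  }
}
""")

print(hosts_with_override(SAMPLE))  # ['app-node-0.openshift.example.com']
```

An empty list would mean no host trips the "Cannot be defined for new hosts" check.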

Comment 10 GenadiC 2018-12-16 14:03:13 UTC
We can install OCP 3.10 on OSP

Comment 12 errata-xmlrpc 2019-01-10 09:27:10 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2019:0026