Bug 1667785 - upgrade to ocp 3.9 is not considering openstack cloud provider configuration
Summary: upgrade to ocp 3.9 is not considering openstack cloud provider configuration
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Cluster Version Operator
Version: 3.9.0
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ---
Target Release: 3.9.z
Assignee: Tzu-Mainn Chen
QA Contact: Weihua Meng
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2019-01-21 05:01 UTC by Sudarshan Chaudhari
Modified: 2022-03-13 16:48 UTC

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2019-02-20 08:46:56 UTC
Target Upstream Version:
Embargoed:




Links
System | ID | Private | Priority | Status | Summary | Last Updated
Red Hat Knowledge Base (Solution) | 3821602 | 0 | None | None | None | 2019-01-21 20:43:09 UTC
Red Hat Product Errata | RHBA-2019:0331 | 0 | None | None | None | 2019-02-20 08:47:02 UTC

Description Sudarshan Chaudhari 2019-01-21 05:01:36 UTC
Description of problem:

We are upgrading an OCP cluster from 3.6 to 3.11. The cluster runs on OpenStack but is *not* using the cloud provider. The cluster had been successfully upgraded to 3.7 (3.7.72), and we then started the 3.9 (3.9.60) control plane upgrade. The playbook failed with the following error:
~~~
[root@ocp08 ~]# tail -n 40 /tmp/upgrade/39/control_plane.log

TASK [Ensure various deps for running system containers are installed] *********************************************************************************************
skipping: [ocp22.example.com] => (item=atomic)
skipping: [ocp22.example.com] => (item=ostree)
skipping: [ocp22.example.com] => (item=runc)
skipping: [ocp21.example.com] => (item=atomic)
skipping: [ocp21.example.com] => (item=ostree)
skipping: [ocp21.example.com] => (item=runc)
skipping: [ocp14.example.com] => (item=atomic)
skipping: [ocp14.example.com] => (item=ostree)
skipping: [ocp14.example.com] => (item=runc)

PLAY [Initialize cluster facts] ************************************************************************************************************************************

TASK [Gathering Facts] *********************************************************************************************************************************************
ok: [ocp22.example.com]
ok: [ocp21.example.com]
ok: [ocp14.example.com]

TASK [Gather Cluster facts] ****************************************************************************************************************************************
fatal: [ocp21.example.com]: FAILED! => {"changed": false, "failed": true, "module_stderr": "Shared connection to x.x.x.188 closed.\r\n", "module_stdout": "Traceback (most recent call last):\r\n  File \"/tmp/ansible_7m8GWE/ansible_module_openshift_facts.py\", line 1678, in <module>\r\n    main()\r\n  File \"/tmp/ansible_7m8GWE/ansible_module_openshift_facts.py\", line 1665, in main\r\n    additive_facts_to_overwrite)\r\n  File \"/tmp/ansible_7m8GWE/ansible_module_openshift_facts.py\", line 1331, in __init__\r\n    additive_facts_to_overwrite)\r\n  File \"/tmp/ansible_7m8GWE/ansible_module_openshift_facts.py\", line 1361, in generate_facts\r\n    provider_facts = self.init_provider_facts()\r\n  File \"/tmp/ansible_7m8GWE/ansible_module_openshift_facts.py\", line 1503, in init_provider_facts\r\n    provider_info.get('metadata')\r\n  File \"/tmp/ansible_7m8GWE/ansible_module_openshift_facts.py\", line 375, in normalize_provider_facts\r\n    facts = normalize_openstack_facts(metadata, facts)\r\n  File \"/tmp/ansible_7m8GWE/ansible_module_openshift_facts.py\", line 340, in normalize_openstack_facts\r\n    if socket.gethostbyname(metadata['ec2_compat'][h_var]) == metadata['ec2_compat'][ip_var].split(',')[0]:\r\nAttributeError: 'list' object has no attribute 'split'\r\n", "msg": "MODULE FAILURE", "rc": 0}
fatal: [ocp22.example.com]: FAILED! => {"changed": false, "failed": true, "module_stderr": "Shared connection to x.x.x.189 closed.\r\n", "module_stdout": "Traceback (most recent call last):\r\n  File \"/tmp/ansible_oPRBEi/ansible_module_openshift_facts.py\", line 1678, in <module>\r\n    main()\r\n  File \"/tmp/ansible_oPRBEi/ansible_module_openshift_facts.py\", line 1665, in main\r\n    additive_facts_to_overwrite)\r\n  File \"/tmp/ansible_oPRBEi/ansible_module_openshift_facts.py\", line 1331, in __init__\r\n    additive_facts_to_overwrite)\r\n  File \"/tmp/ansible_oPRBEi/ansible_module_openshift_facts.py\", line 1361, in generate_facts\r\n    provider_facts = self.init_provider_facts()\r\n  File \"/tmp/ansible_oPRBEi/ansible_module_openshift_facts.py\", line 1503, in init_provider_facts\r\n    provider_info.get('metadata')\r\n  File \"/tmp/ansible_oPRBEi/ansible_module_openshift_facts.py\", line 375, in normalize_provider_facts\r\n    facts = normalize_openstack_facts(metadata, facts)\r\n  File \"/tmp/ansible_oPRBEi/ansible_module_openshift_facts.py\", line 340, in normalize_openstack_facts\r\n    if socket.gethostbyname(metadata['ec2_compat'][h_var]) == metadata['ec2_compat'][ip_var].split(',')[0]:\r\nAttributeError: 'list' object has no attribute 'split'\r\n", "msg": "MODULE FAILURE", "rc": 0}
fatal: [ocp14.example.com]: FAILED! => {"changed": false, "failed": true, "module_stderr": "Shared connection to x.x.x.181 closed.\r\n", "module_stdout": "Traceback (most recent call last):\r\n  File \"/tmp/ansible_tbdv8p/ansible_module_openshift_facts.py\", line 1678, in <module>\r\n    main()\r\n  File \"/tmp/ansible_tbdv8p/ansible_module_openshift_facts.py\", line 1665, in main\r\n    additive_facts_to_overwrite)\r\n  File \"/tmp/ansible_tbdv8p/ansible_module_openshift_facts.py\", line 1331, in __init__\r\n    additive_facts_to_overwrite)\r\n  File \"/tmp/ansible_tbdv8p/ansible_module_openshift_facts.py\", line 1361, in generate_facts\r\n    provider_facts = self.init_provider_facts()\r\n  File \"/tmp/ansible_tbdv8p/ansible_module_openshift_facts.py\", line 1503, in init_provider_facts\r\n    provider_info.get('metadata')\r\n  File \"/tmp/ansible_tbdv8p/ansible_module_openshift_facts.py\", line 375, in normalize_provider_facts\r\n    facts = normalize_openstack_facts(metadata, facts)\r\n  File \"/tmp/ansible_tbdv8p/ansible_module_openshift_facts.py\", line 340, in normalize_openstack_facts\r\n    if socket.gethostbyname(metadata['ec2_compat'][h_var]) == metadata['ec2_compat'][ip_var].split(',')[0]:\r\nAttributeError: 'list' object has no attribute 'split'\r\n", "msg": "MODULE FAILURE", "rc": 0}
    to retry, use: --limit @/usr/share/ansible/openshift-ansible/playbooks/byo/openshift-cluster/upgrades/v3_9/upgrade_control_plane.retry

PLAY RECAP *********************************************************************************************************************************************************
localhost                  : ok=11   changed=0    unreachable=0    failed=0
ocp14.example.com : ok=16   changed=0    unreachable=0    failed=1
ocp21.example.com : ok=16   changed=0    unreachable=0    failed=1
ocp22.example.com : ok=20   changed=0    unreachable=0    failed=1



Failure summary:


  1. Hosts:    ocp14.example.com, ocp21.example.com, ocp22.example.com
     Play:     Initialize cluster facts
     Task:     Gather Cluster facts
     Message:  MODULE FAILURE
~~~
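
For illustration only: the traceback shows `.split(',')` being called on a metadata value that is a list on these hosts. Below is a minimal sketch of a type-tolerant lookup, assuming the value may arrive either as a comma-separated string or as a list. The helper name `first_ip` and the sample dictionaries are hypothetical, and this is not the upstream fix.

```python
def first_ip(value):
    """Return the first address from a comma-separated string or a list.

    Hypothetical helper, not part of openshift_facts.py.
    """
    if isinstance(value, (list, tuple)):
        # List form: this is the shape that raised AttributeError in the log.
        return str(value[0]).strip() if value else ""
    # String form: the shape the failing line appears to expect.
    return str(value).split(",")[0].strip()


# Assumed metadata shapes, made up for this example.
as_string = {"local-ipv4": "10.0.0.5,10.0.0.6"}
as_list = {"local-ipv4": ["10.0.0.5", "10.0.0.6"]}

print(first_ip(as_string["local-ipv4"]))
print(first_ip(as_list["local-ipv4"]))
```

Both calls return the first address, so a comparison against `socket.gethostbyname(...)` would no longer depend on the type of the metadata value.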


Version-Release number of the following components:
rpm -q openshift-ansible:
atomic-openshift-utils.noarch                                 3.9.60-1.git.0.f0ebfaa.el7                                   rhel-7-server-ose-3.9-rpms

Steps to Reproduce:
1. Upgrade OCP 3.7 to OCP 3.9.60 on OpenStack.

Actual results:
The control plane upgrade failed.

Expected results:
- The control plane upgrade should have completed successfully.

Additional info:

This issue is fixed upstream:
https://github.com/openshift/openshift-ansible/commit/990c29ae04d57099116fbcc56c58f4f7f68507b7


Need info on whether this fix is already in OCP 3.9.60 or has yet to be included.

Comment 4 Vadim Rutkovsky 2019-01-21 11:53:31 UTC
Fixed in https://github.com/openshift/openshift-ansible/pull/11014

Comment 5 Weihua Meng 2019-01-28 10:29:56 UTC
I could not reproduce this bug. Could you help? Thanks in advance.

1. install OCP v3.6 without cloudprovider enabled, on OpenStack v10
2. upgrade to v3.7
3. upgrade to v3.9, with openshift-ansible-3.9.60-1.git.0.f0ebfaa.el7.noarch
I did not hit the reported error.

Comment 6 Tzu-Mainn Chen 2019-01-28 14:54:50 UTC
Sudarshan, could you explain how you set up the ec2 compat? I tested this code independently.

Comment 8 Weihua Meng 2019-02-11 00:11:53 UTC
Sudarshan, could you provide detailed steps to reproduce this bug?
Thanks.

Comment 9 Weihua Meng 2019-02-12 02:29:07 UTC
Fixed.

openshift-ansible-3.9.68-1.git.0.0c424ac.el7.noarch

Kernel Version: 3.10.0-862.14.4.el7.x86_64
Operating System: Red Hat Enterprise Linux Server 7.5 (Maipo)

Comment 11 errata-xmlrpc 2019-02-20 08:46:56 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2019:0331

