Description of problem:
We are upgrading an OCP cluster from 3.6 to 3.11. The cluster runs on OpenStack but does *not* use the cloud provider. The cluster had been successfully upgraded to 3.7 (3.7.72), and we then started the 3.9 (3.9.60) control plane upgrade. The playbook failed:

~~~
[root@ocp08 ~]# tail -n 40 /tmp/upgrade/39/control_plane.log
TASK [Ensure various deps for running system containers are installed] *********************************************************************************************
skipping: [ocp22.example.com] => (item=atomic)
skipping: [ocp22.example.com] => (item=ostree)
skipping: [ocp22.example.com] => (item=runc)
skipping: [ocp21.example.com] => (item=atomic)
skipping: [ocp21.example.com] => (item=ostree)
skipping: [ocp21.example.com] => (item=runc)
skipping: [ocp14.example.com] => (item=atomic)
skipping: [ocp14.example.com] => (item=ostree)
skipping: [ocp14.example.com] => (item=runc)

PLAY [Initialize cluster facts] ************************************************************************************************************************************

TASK [Gathering Facts] *********************************************************************************************************************************************
ok: [ocp22.example.com]
ok: [ocp21.example.com]
ok: [ocp14.example.com]

TASK [Gather Cluster facts] ****************************************************************************************************************************************
fatal: [ocp21.example.com]: FAILED! 
=> {"changed": false, "failed": true, "module_stderr": "Shared connection to x.x.x.188 closed.\r\n", "module_stdout": "Traceback (most recent call last):\r\n File \"/tmp/ansible_7m8GWE/ansible_module_openshift_facts.py\", line 1678, in <module>\r\n main()\r\n File \"/tmp/ansible_7m8GWE/ansible_module_openshift_facts.py\", line 1665, in main\r\n additive_facts_to_overwrite)\r\n File \"/tmp/ansible_7m8GWE/ansible_module_openshift_facts.py\", line 1331, in __init__\r\n additive_facts_to_overwrite)\r\n File \"/tmp/ansible_7m8GWE/ansible_module_openshift_facts.py\", line 1361, in generate_facts\r\n provider_facts = self.init_provider_facts()\r\n File \"/tmp/ansible_7m8GWE/ansible_module_openshift_facts.py\", line 1503, in init_provider_facts\r\n provider_info.get('metadata')\r\n File \"/tmp/ansible_7m8GWE/ansible_module_openshift_facts.py\", line 375, in normalize_provider_facts\r\n facts = normalize_openstack_facts(metadata, facts)\r\n File \"/tmp/ansible_7m8GWE/ansible_module_openshift_facts.py\", line 340, in normalize_openstack_facts\r\n if socket.gethostbyname(metadata['ec2_compat'][h_var]) == metadata['ec2_compat'][ip_var].split(',')[0]:\r\nAttributeError: 'list' object has no attribute 'split'\r\n", "msg": "MODULE FAILURE", "rc": 0} fatal: [ocp22.example.com]: FAILED! 
=> {"changed": false, "failed": true, "module_stderr": "Shared connection to x.x.x.189 closed.\r\n", "module_stdout": "Traceback (most recent call last):\r\n File \"/tmp/ansible_oPRBEi/ansible_module_openshift_facts.py\", line 1678, in <module>\r\n main()\r\n File \"/tmp/ansible_oPRBEi/ansible_module_openshift_facts.py\", line 1665, in main\r\n additive_facts_to_overwrite)\r\n File \"/tmp/ansible_oPRBEi/ansible_module_openshift_facts.py\", line 1331, in __init__\r\n additive_facts_to_overwrite)\r\n File \"/tmp/ansible_oPRBEi/ansible_module_openshift_facts.py\", line 1361, in generate_facts\r\n provider_facts = self.init_provider_facts()\r\n File \"/tmp/ansible_oPRBEi/ansible_module_openshift_facts.py\", line 1503, in init_provider_facts\r\n provider_info.get('metadata')\r\n File \"/tmp/ansible_oPRBEi/ansible_module_openshift_facts.py\", line 375, in normalize_provider_facts\r\n facts = normalize_openstack_facts(metadata, facts)\r\n File \"/tmp/ansible_oPRBEi/ansible_module_openshift_facts.py\", line 340, in normalize_openstack_facts\r\n if socket.gethostbyname(metadata['ec2_compat'][h_var]) == metadata['ec2_compat'][ip_var].split(',')[0]:\r\nAttributeError: 'list' object has no attribute 'split'\r\n", "msg": "MODULE FAILURE", "rc": 0} fatal: [ocp14.example.com]: FAILED! 
=> {"changed": false, "failed": true, "module_stderr": "Shared connection to x.x.x.181 closed.\r\n", "module_stdout": "Traceback (most recent call last):\r\n File \"/tmp/ansible_tbdv8p/ansible_module_openshift_facts.py\", line 1678, in <module>\r\n main()\r\n File \"/tmp/ansible_tbdv8p/ansible_module_openshift_facts.py\", line 1665, in main\r\n additive_facts_to_overwrite)\r\n File \"/tmp/ansible_tbdv8p/ansible_module_openshift_facts.py\", line 1331, in __init__\r\n additive_facts_to_overwrite)\r\n File \"/tmp/ansible_tbdv8p/ansible_module_openshift_facts.py\", line 1361, in generate_facts\r\n provider_facts = self.init_provider_facts()\r\n File \"/tmp/ansible_tbdv8p/ansible_module_openshift_facts.py\", line 1503, in init_provider_facts\r\n provider_info.get('metadata')\r\n File \"/tmp/ansible_tbdv8p/ansible_module_openshift_facts.py\", line 375, in normalize_provider_facts\r\n facts = normalize_openstack_facts(metadata, facts)\r\n File \"/tmp/ansible_tbdv8p/ansible_module_openshift_facts.py\", line 340, in normalize_openstack_facts\r\n if socket.gethostbyname(metadata['ec2_compat'][h_var]) == metadata['ec2_compat'][ip_var].split(',')[0]:\r\nAttributeError: 'list' object has no attribute 'split'\r\n", "msg": "MODULE FAILURE", "rc": 0}
        to retry, use: --limit @/usr/share/ansible/openshift-ansible/playbooks/byo/openshift-cluster/upgrades/v3_9/upgrade_control_plane.retry

PLAY RECAP *********************************************************************************************************************************************************
localhost                  : ok=11   changed=0    unreachable=0    failed=0
ocp14.example.com          : ok=16   changed=0    unreachable=0    failed=1
ocp21.example.com          : ok=16   changed=0    unreachable=0    failed=1
ocp22.example.com          : ok=20   changed=0    unreachable=0    failed=1

Failure summary:

  1. 
Hosts:    ocp14.example.com, ocp21.example.com, ocp22.example.com
     Play:     Initialize cluster facts
     Task:     Gather Cluster facts
     Message:  MODULE FAILURE
~~~

Version-Release number of the following components:
rpm -q openshift-ansible:
atomic-openshift-utils.noarch 3.9.60-1.git.0.f0ebfaa.el7 rhel-7-server-ose-3.9-rpms

Steps to Reproduce:
1. Upgrade OCP 3.7 to OCP 3.9.60 on OpenStack.

Actual results:
The control plane upgrade failed.

Expected results:
The control plane upgrade should have completed.

Additional info:
This issue is fixed upstream: https://github.com/openshift/openshift-ansible/commit/990c29ae04d57099116fbcc56c58f4f7f68507b7
We need to know whether this fix is already included in OCP 3.9.60 or has yet to be included.
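
For anyone triaging this: the traceback shows `.split(',')` being called on a value that is a list rather than a comma-separated string. The following is a minimal sketch that reproduces that error in isolation and shows one defensive way to handle both shapes. It is not the upstream fix. The helper name `first_ip`, the key `'local-ipv4'`, and the sample addresses are assumptions made up for illustration; the real module uses its own `ip_var` names and may run under Python 2, where string types differ.

```python
def first_ip(value):
    """Return the first address from a comma-separated string or a list.

    Hypothetical helper for illustration only; not part of openshift-ansible.
    """
    if isinstance(value, str):
        parts = value.split(',')
    else:
        parts = list(value)
    return parts[0].strip() if parts else None


# Made-up metadata shaped like the 'ec2_compat' section named in the traceback.
as_string = {'ec2_compat': {'local-ipv4': '10.0.0.5,10.0.0.6'}}
as_list = {'ec2_compat': {'local-ipv4': ['10.0.0.5', '10.0.0.6']}}

# The expression from the traceback only works when the value is a string.
try:
    as_list['ec2_compat']['local-ipv4'].split(',')[0]
except AttributeError as err:
    print("list form fails:", err)

# The defensive helper accepts either shape.
print(first_ip(as_string['ec2_compat']['local-ipv4']))
print(first_ip(as_list['ec2_compat']['local-ipv4']))
```

If the metadata on the affected hosts returns a list for the IP field, that would explain why the error appears on some OpenStack environments and not on others.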
Fixed in https://github.com/openshift/openshift-ansible/pull/11014
I could not reproduce this bug; could you help? Thanks in advance.

Steps I tried:
1. Installed OCP v3.6 without the cloud provider enabled, on OpenStack v10.
2. Upgraded to v3.7.
3. Upgraded to v3.9 with openshift-ansible-3.9.60-1.git.0.f0ebfaa.el7.noarch.

I did not hit the reported error.
Sudarshan, could you explain how you set up the ec2 compat? I tested this code independently.
Sudarshan, could you provide detailed steps to reproduce this bug? Thanks.
Fixed.

openshift-ansible-3.9.68-1.git.0.0c424ac.el7.noarch
Kernel Version: 3.10.0-862.14.4.el7.x86_64
Operating System: Red Hat Enterprise Linux Server 7.5 (Maipo)
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2019:0331