Bug 1891816
| Summary: | [UPI] [OSP] control-plane.yml provisioning playbook fails on OSP 16.1 | | |
|---|---|---|---|
| Product: | OpenShift Container Platform | Reporter: | Jon Uriarte <juriarte> |
| Component: | Cloud Compute | Assignee: | Matthew Booth <mbooth> |
| Cloud Compute sub component: | OpenStack Provider | QA Contact: | weiwei jiang <wjiang> |
| Status: | CLOSED ERRATA | Docs Contact: | |
| Severity: | medium | ||
| Priority: | medium | CC: | adduarte, egarcia, emacchi, m.andre, mbooth, mfedosin, pprinett, rlobillo, wjiang |
| Version: | 4.5 | Keywords: | UpcomingSprint |
| Target Milestone: | --- | Flags: | rlobillo: needinfo-; rlobillo: needinfo- |
| Target Release: | 4.7.0 | ||
| Hardware: | Unspecified | ||
| OS: | Unspecified | ||
| Whiteboard: | |||
| Fixed In Version: | python-openstacksdk-0.36.4-1.20201113235938.el8ost | Doc Type: | Bug Fix |
| Doc Text: |
Cause: A bug in openstacksdk caused a failure when requesting server groups on OSP 16.
Consequence: The UPI playbook control-plane.yaml fails in the "Create the Control Plane servers" task with a stack trace.
Fix: Update openstack-sdk on the bastion host executing UPI ansible tasks to at least python-openstacksdk-0.36.4-1.20201113235938.el8ost.
Result: The UPI playbook succeeds.
N.B. We didn't really fix this in OCP 4.7: we fixed it in OpenStack. The fixed openstack-sdk package is not yet released, so customers hitting this will have to request a hotfix for now.
|
Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2021-02-24 15:28:35 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | Category: | --- | |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
| Bug Depends On: | 1899192 | ||
| Bug Blocks: | |||
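The "Fixed In Version" field above amounts to a minimum-version floor on the openstacksdk installed on the bastion host running the UPI playbooks. As a rough illustration only (the helper functions below are hypothetical, not part of any OpenStack tooling), comparing an installed upstream SDK version against the 0.36.4 floor could look like:

```python
# Illustrative sketch: compare an installed openstacksdk version against the
# minimum fixed version from this bug's "Fixed In Version" field
# (python-openstacksdk-0.36.4-1.20201113235938.el8ost).
# The helpers here are hypothetical, written for this example only.

FIXED_IN = (0, 36, 4)  # upstream version carrying the server-group fix

def parse_version(version_string):
    """Turn a dotted version like '0.36.3' into a comparable tuple of ints."""
    return tuple(int(part) for part in version_string.split(".")[:3])

def has_server_group_fix(installed):
    """True if the installed SDK version is at least the fixed version."""
    return parse_version(installed) >= FIXED_IN

# The bastion host in this report had 0.36.3, which is affected:
print(has_server_group_fix("0.36.3"))  # False
print(has_server_group_fix("0.36.4"))  # True
print(has_server_group_fix("0.51.0"))  # True (upstream tag that first carried the fix)
```

Per the Doc Text note, the fixed RPM was not yet released at closing time, so a customer below this floor would need a hotfix rather than a normal update.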
This seems to be an openstacksdk bug. Here is a similar report: https://storyboard.openstack.org/#!/story/2007710 (bug 39843). It is addressed by this patch to openstacksdk, https://review.opendev.org/#/c/749381/ , which merged to master as of Sep 16 2020 and is included in branches: master; tags: 0.51.0.

@Jon Uriarte, could you test with openstacksdk-0.51.0 or later (master)? It seems the problem might be fixed there.

This could have been fixed upstream. Can you please verify that the bug still exists now that we recommend[1] `ansible-galaxy` to fetch the dependencies?

[1]: https://github.com/openshift/installer/pull/4379

Possibly we'll need to implement a temporary solution like https://github.com/openshift/installer/pull/4375 until the openstacksdk package containing the fix is more widespread.

I think perhaps https://github.com/openshift/installer/pull/4375 is not so temporary. If we make the version variable, this would probably be something good to have, since this type of defect will probably pop up again if a newer version appears and brings problems. Pinning the playbooks to a particular API version seems like a good idea.

Status: Emilien posted a backport of the upstream fix here: https://review.opendev.org/c/openstack/openstacksdk/+/763121/ , which has a +2 +W. Unfortunately it failed to merge due to timeout errors in various tests, all of which seem unrelated to the backport. I have resubmitted and hit the same issue again. I will spend some time trying to improve the timeout situation, or this seems unlikely to ever land.

The openstacksdk backport has now landed.

(In reply to rlobillo from comment #5)
> We also tried to run the control-plane.yaml playbook installing the
> collection through ansible-galaxy as mentioned in the documentation

We have reverted that change; that was my mistake. ansible-galaxy is not supported. If there's any reference to ansible-galaxy left in code or docs, then you can report it as a bug. Thanks!
Checked with python3-openstacksdk-0.36.4-1.20201113235938.el8ost.noarch and cannot reproduce this issue; moved to VERIFIED.
TASK [Create the Control Plane servers] ****************************************
task path: /root/jenkins/workspace/Launch Environment Flexy/private-templates/functionality-testing/aos-4_7/hosts/upi_on_openstack-scripts/04_control-plane.yaml:72
<localhost> ESTABLISH LOCAL CONNECTION FOR USER: root
<localhost> EXEC /bin/sh -c 'echo ~root && sleep 0'
<localhost> EXEC /bin/sh -c '( umask 77 && mkdir -p "` echo /root/.ansible/tmp `"&& mkdir "` echo /root/.ansible/tmp/ansible-tmp-1611570657.568383-4107914-230337991073578 `" && echo ansible-tmp-1611570657.568383-4107914-230337991073578="` echo /root/.ansible/tmp/ansible-tmp-1611570657.568383-4107914-230337991073578 `" ) && sleep 0'
Using module file /usr/lib/python3.6/site-packages/ansible/modules/cloud/openstack/os_server.py
<localhost> PUT /root/.ansible/tmp/ansible-local-410775579l6z6kd/tmpzm2x60rx TO /root/.ansible/tmp/ansible-tmp-1611570657.568383-4107914-230337991073578/AnsiballZ_os_server.py
<localhost> EXEC /bin/sh -c 'chmod u+x /root/.ansible/tmp/ansible-tmp-1611570657.568383-4107914-230337991073578/ /root/.ansible/tmp/ansible-tmp-1611570657.568383-4107914-230337991073578/AnsiballZ_os_server.py && sleep 0'
<localhost> EXEC /bin/sh -c '/usr/libexec/platform-python /root/.ansible/tmp/ansible-tmp-1611570657.568383-4107914-230337991073578/AnsiballZ_os_server.py && sleep 0'
<localhost> EXEC /bin/sh -c 'rm -f -r /root/.ansible/tmp/ansible-tmp-1611570657.568383-4107914-230337991073578/ > /dev/null 2>&1 && sleep 0'
<localhost> EXEC /bin/sh -c 'echo ~root && sleep 0'
changed: [localhost] => (item=[0, 'wj47uos125ag-kgxfx-master']) => {
"ansible_loop_var": "item",
"changed": true,
"id": "d4570999-a84a-4470-85da-e8f8f801482b",
......
"server_groups": null,
"status": "ACTIVE",
"tags": [],
"task_state": null,
"tenant_id": "542c6ebd48bf40fa857fc245c7572e30",
"terminated_at": null,
"trusted_image_certificates": null,
"updated": "2021-01-25T10:35:47Z",
"user_data": null,
"user_id": "b414646065ab99780ef1bbcba52c07d2033a6f99fd0b10a3b1b12fcb5e5275e1",
"vm_state": "active",
"volumes": []
}
}
META: ran handlers
META: ran handlers
PLAY RECAP *********************************************************************
localhost : ok=7 changed=5 unreachable=0 failed=0 skipped=3 rescued=0 ignored=0
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Moderate: OpenShift Container Platform 4.7.0 security, bug fix, and enhancement update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2020:5633 |
Description of problem:

The playbook 'control-plane.yml' for provisioning the master nodes in 4.5 and 4.6 UPI fails when the underlying OSP is 16.1. It works fine in OSP 13.

Version-Release number of selected component (if applicable):

OCP 4.5.0-0.nightly-2020-10-23-050031
OSP RHOS-16.1-RHEL-8-20201021.n.0

The playbooks are being executed from a bastion host with:

ansible 2.9.14
python3-openstacksdk-0.36.3
python3-openstackclient-4.0.0

How reproducible: always

Steps to Reproduce:

1. Install OSP 16.1 and create a bastion host (not a must for reproducing the issue; it can be run from the undercloud as well)
2. Run the provisioning playbooks for UPI as described in [1]:

ansible-playbook -i "/home/cloud-user/ostest/inventory.yaml" "/home/cloud-user/ostest/control-plane.yaml"

Actual results:

TASK [Create the Control Plane servers] ****************************************
failed: [localhost] (item=[0, 'ostest-vpwdz-master']) => {"ansible_loop_var": "item", "changed": false, "item": [0, "ostest-vpwdz-master"], "module_stderr": "/usr/lib/python3.6/site-packages/openstack/config/cloud_region.py:432: UserWarning: You have a configured API_VERSION with 'latest' in it. In the context of openstacksdk this doesn't make any sense.\n \"You have a configured API_VERSION with 'latest' in\"\n Traceback (most recent call last):\n File \"/home/cloud-user/.ansible/tmp/ansible-tmp-1603265045.369726-22447-253278130374381/AnsiballZ_os_server.py\", line 102, in <module>\n _ansiballz_main()\n File \"/home/cloud-user/.ansible/tmp/ansible-tmp-1603265045.369726-22447-253278130374381/AnsiballZ_os_server.py\", line 94, in _ansiballz_main\n invoke_module(zipped_mod, temp_path, ANSIBALLZ_PARAMS)\n File \"/home/cloud-user/.ansible/tmp/ansible-tmp-1603265045.369726-22447-253278130374381/AnsiballZ_os_server.py\", line 40, in invoke_module\n runpy.run_module(mod_name='ansible.modules.cloud.openstack.os_server', init_globals=None, run_name='__main__', alter_sys=True)\n File \"/usr/lib64/python3.6/runpy.py\", line 205, in run_module\n return _run_module_code(code, init_globals, run_name, mod_spec)\n File \"/usr/lib64/python3.6/runpy.py\", line 96, in _run_module_code\n mod_name, mod_spec, pkg_name, script_name)\n File \"/usr/lib64/python3.6/runpy.py\", line 85, in _run_code\n exec(code, run_globals)\n File \"/tmp/ansible_os_server_payload_r89pe0c8/ansible_os_server_payload.zip/ansible/modules/cloud/openstack/os_server.py\", line 759, in <module>\n File \"/tmp/ansible_os_server_payload_r89pe0c8/ansible_os_server_payload.zip/ansible/modules/cloud/openstack/os_server.py\", line 750, in main\n File \"/tmp/ansible_os_server_payload_r89pe0c8/ansible_os_server_payload.zip/ansible/modules/cloud/openstack/os_server.py\", line 547, in _create_server\n File \"/tmp/ansible_os_server_payload_r89pe0c8/ansible_os_server_payload.zip/ansible/modules/cloud/openstack/os_server.py\", line 417, in _exit_hostvars\n File \"/usr/lib/python3.6/site-packages/openstack/cloud/_compute.py\", line 1832, in get_openstack_vars\n return meta.get_hostvars_from_server(self, server)\n File \"/usr/lib/python3.6/site-packages/openstack/cloud/meta.py\", line 499, in get_hostvars_from_server\n expand_server_security_groups(cloud, server)\n File \"/usr/lib/python3.6/site-packages/openstack/cloud/meta.py\", line 471, in expand_server_security_groups\n groups = cloud.list_server_security_groups(server)\n File \"/usr/lib/python3.6/site-packages/openstack/cloud/_compute.py\", line 198, in list_server_security_groups\n server = self.compute.get_server(server)\n File \"/usr/lib/python3.6/site-packages/openstack/compute/v2/_proxy.py\", line 482, in get_server\n return self._get(_server.Server, server)\n File \"/usr/lib/python3.6/site-packages/openstack/proxy.py\", line 46, in check\n return method(self, expected, actual, *args, **kwargs)\n File \"/usr/lib/python3.6/site-packages/openstack/proxy.py\", line 447, in _get\n resource_type=resource_type.__name__, value=value))\n File \"/usr/lib/python3.6/site-packages/openstack/resource.py\", line 1321, in fetch\n self._translate_response(response, **kwargs)\n File \"/usr/lib/python3.6/site-packages/openstack/resource.py\", line 1134, in _translate_response\n dict.update(self, self.to_dict())\n File \"/usr/lib/python3.6/site-packages/openstack/resource.py\", line 969, in to_dict\n value = getattr(self, attr, None)\n File \"/usr/lib/python3.6/site-packages/openstack/resource.py\", line 580, in __getattribute__\n return object.__getattribute__(self, name)\n File \"/usr/lib/python3.6/site-packages/openstack/resource.py\", line 166, in __get__\n return _convert_type(value, self.type, self.list_type)\n File \"/usr/lib/python3.6/site-packages/openstack/resource.py\", line 66, in _convert_type\n ret.append(_convert_type(raw, list_type))\n File \"/usr/lib/python3.6/site-packages/openstack/resource.py\", line 82, in _convert_type\n return data_type(value)\n ValueError: dictionary update sequence element #0 has length 1; 2 is required\n ", "module_stdout": "", "msg": "MODULE FAILURE\n See stdout/stderr for the exact error", "rc": 1}
failed: [localhost] (item=[1, 'ostest-vpwdz-master']) => same error
failed: [localhost] (item=[2, 'ostest-vpwdz-master']) => same error

The task fails, but the VMs are deployed successfully.

Expected results: no errors

Additional info:

Commenting out the lines in [2], the task works OK. Tried with ansible 2.8 and 2.9, with the same result. Tried with python3-openstacksdk-0.36.3 (from the bastion host [3]) and python3-openstacksdk-0.36.4 (from the undercloud [4]), with no difference.

Workaround: add 'ignore_errors: yes' to the 'Create the Control Plane servers' task.

[1] https://docs.openshift.com/container-platform/4.5/installing/installing_openstack/installing-openstack-user.html
[2] https://github.com/openshift/installer/blob/release-4.5/upi/openstack/control-plane.yaml#L85-L86
[3] http://pulp.dist.prod.ext.phx2.redhat.com/content/dist/layered/rhel8/$basearch/openstack-tools/16/os/
[4] http://rhos-qe-mirror-tlv.usersys.redhat.com/rcm-guest/puddles/OpenStack/16.1-RHEL-8/RHOS-16.1-RHEL-8-20201021.n.0/compose/OpenStack/$basearch/os
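For reference, the ValueError at the bottom of the traceback ("dictionary update sequence element #0 has length 1; 2 is required") is what Python raises when dict() is applied to a bare string: each character becomes a "sequence element" of length 1 rather than a (key, value) pair. The sketch below is illustrative only; the real failure happens inside openstack/resource.py's _convert_type() when Nova, queried with API_VERSION 'latest', returns an attribute in a shape the SDK then coerces with dict().

```python
# Illustrative reproduction of the error class seen in the traceback.
# convert_to_dict() stands in for the SDK's data_type(value) coercion
# (data_type being dict); it is not the actual openstacksdk code.

def convert_to_dict(value):
    return dict(value)

# A well-formed entry, an iterable of (key, value) pairs, converts cleanly:
print(convert_to_dict([("name", "default")]))  # {'name': 'default'}

# A bare string does not: dict() iterates the characters, and each
# single-character "element" is not a pair of length 2.
try:
    convert_to_dict("default")
except ValueError as exc:
    print(exc)  # dictionary update sequence element #0 has length 1; 2 is required
```

This also explains why the VMs are created successfully despite the failed task: the error occurs while post-processing the server's attributes for hostvars, after the server itself is ACTIVE.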