Bug 1772080 - [IPI Baremetal]: gz openstack image suffix breaks deployment
Summary: [IPI Baremetal]: gz openstack image suffix breaks deployment
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Installer
Version: 4.3.0
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: ---
Target Release: 4.3.0
Assignee: Steven Hardy
QA Contact: Sasha Smolyak
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2019-11-13 15:17 UTC by Steven Hardy
Modified: 2020-02-06 11:42 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2020-01-23 11:12:28 UTC
Target Upstream Version:


Links
System | ID | Priority | Status | Summary | Last Updated
Github | openshift installer issues 2661 | None | closed | baremetal: gz openstack image suffix breaks deployment | 2020-02-06 11:39:13 UTC
Github | openshift installer pull 2657 | None | closed | Bug 1772599: Handle compressed images for libvirt and baremetal IPI | 2020-02-06 11:39:13 UTC
Github | openshift installer pull 2662 | None | closed | Bug 1772080: Strip ".gz" suffix from baremetal image vars | 2020-02-06 11:39:13 UTC
Github | openshift ironic-rhcos-downloader pull 12 | None | closed | If RHCOS image is gzipped, decompress first before passing to qemu-img | 2020-02-06 11:39:14 UTC
Github | openshift ironic-rhcos-downloader pull 13 | None | closed | Check file filename in cache instead of headers | 2020-02-06 11:39:14 UTC
Red Hat Bugzilla | 1771519 | unspecified | ASSIGNED | Unable to spawn a bootable openstack instance out of rhcos image | 2019-12-05 06:58:25 UTC
Red Hat Product Errata | RHBA-2020:0062 | None | None | None | 2020-01-23 11:12:53 UTC

Description Steven Hardy 2019-11-13 15:17:45 UTC
Description of problem:

The recent changes that added a .gz suffix to the openstack (and qemu) images in data/data/rhcos.json have broken baremetal IPI. There are two problems:

The bootstrap VM, which relies on the libvirt/qemu image, no longer boots. This may be resolved via openshift/installer#2657 or something similar.

The image URLs for terraform and for the machines now contain the .gz suffix. This is wrong because we now uncompress the file in https://github.com/openshift/ironic-rhcos-downloader, so we should strip the suffix from the path given in the tfvars and the machine providerSpec.
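
For illustration, the suffix stripping described above could look roughly like the following. This is only a minimal sketch in Python, not the installer's actual code (the installer is written in Go); the function name strip_gz_suffix and the example.com URL are hypothetical, and the sketch assumes the only change needed is dropping a trailing ".gz" from the URL path while leaving any query string untouched.

```python
from urllib.parse import urlsplit, urlunsplit


def strip_gz_suffix(image_url: str) -> str:
    """Return image_url with a trailing ".gz" removed from its path.

    Hypothetical helper: only the path component is changed, so a query
    string (for example a checksum parameter) is preserved as-is.
    """
    parts = urlsplit(image_url)
    path = parts.path
    if path.endswith(".gz"):
        path = path[: -len(".gz")]
    return urlunsplit(parts._replace(path=path))


if __name__ == "__main__":
    # Illustrative URL only; the host and query parameter are assumptions.
    print(strip_gz_suffix(
        "http://example.com/images/rhcos-openstack.x86_64.qcow2.gz?sha256=abc"
    ))
```

A URL whose path does not end in ".gz" is returned unchanged, so the helper would be safe to apply to both compressed and uncompressed image entries.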


How reproducible:
Always

Steps to Reproduce:
1. Attempt to deploy IPI baremetal via openshift-baremetal-install
2. Observe bootstrap VM fails to boot with a non-bootable disk image
3. Work around the bootstrap VM issue then observe the masters fail to deploy because the image URL contains the .gz suffix

Actual results:

Terraform fails with an error like:


level=error msg="Error: Bad request with: [PUT http://172.22.0.2:6385/v1/nodes/2e4048b2-6e1d-4812-9854-df4f5d4a7fcd/states/provision], error message: {\"error_message\": \"{\\\"faultcode\\\": \\\"Client\\\", \\\"faultstring\\\": \\\"Failed to validate deploy or power info for node 2e4048b2-6e1d-4812-9854-df4f5d4a7fcd. Error: Validation of image href http://172.22.0.2/images/rhcos-43.81.201911081536.0-openstack.x86_64.qcow2.gz/rhcos-43.81.201911081536.0-compressed.x86_64.qcow2.gz failed, reason: Got HTTP code 404 instead of 200 in response to HEAD request.\\\", \\\"debuginfo\\\": null}\"}"

After working around that, the machine providerSpec is similarly malformed; check `oc describe machineset ostest-worker-0 -n openshift-machine-api`.

Expected results:
Bootstrap, master and worker deployments should work as they did before the gzipped images were introduced.
Additional info:
Please attach logs from ansible-playbook with the -vvv flag

Comment 2 Sasha Smolyak 2019-12-11 12:05:28 UTC
The gz files are downloaded for the bootstrap node and unpacked, and the installation passes. Verified.
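
As a rough illustration of the "download, then unpack" behaviour verified above, the sketch below decompresses a file only if it is actually gzipped. It is an assumption-laden Python example, not the code from ironic-rhcos-downloader or the installer: the function name decompress_if_gzipped and the ".raw" fallback name are hypothetical, and detection relies on the standard gzip magic bytes (0x1f 0x8b) rather than on the file name.

```python
import gzip
import shutil
from pathlib import Path

# The first two bytes of every gzip stream.
GZIP_MAGIC = b"\x1f\x8b"


def decompress_if_gzipped(src: Path) -> Path:
    """Decompress src next to itself if it is gzipped; return the usable path.

    Hypothetical helper: a file that is not gzipped is returned unchanged.
    """
    with src.open("rb") as f:
        is_gzip = f.read(2) == GZIP_MAGIC
    if not is_gzip:
        return src

    # Drop a ".gz" suffix if present; otherwise pick a distinct output name.
    if src.suffix == ".gz":
        dest = src.with_suffix("")
    else:
        dest = src.with_name(src.name + ".raw")

    # Stream the data so large images are not loaded into memory at once.
    with gzip.open(src, "rb") as fin, dest.open("wb") as fout:
        shutil.copyfileobj(fin, fout)
    return dest
```

Checking the content rather than the suffix means the helper would behave the same whether or not the image name carries ".gz".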

Comment 4 errata-xmlrpc 2020-01-23 11:12:28 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:0062

