Bug 1564847 - The image tag of cri-o should be v3.9 instead of 3.9 while openshift_release is specified
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Installer
Version: 3.9.0
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ---
Target Release: 3.9.z
Assignee: Scott Dodson
QA Contact: Gan Huang
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2018-04-08 08:33 UTC by Gan Huang
Modified: 2018-06-06 15:47 UTC (History)

Fixed In Version:
Doc Type: No Doc Update
Doc Text:
Clone Of:
Environment:
Last Closed: 2018-06-06 15:46:20 UTC
Target Upstream Version:




Links
System ID Private Priority Status Summary Last Updated
Red Hat Product Errata RHBA-2018:1796 0 None None None 2018-06-06 15:47:03 UTC

Description Gan Huang 2018-04-08 08:33:12 UTC
Description of problem:
The image tag of cri-o should be v3.9 instead of 3.9 when openshift_release is specified.

Version-Release number of the following components:
openshift-ansible-3.9.19-1.git.0.34f4090.el7.noarch.rpm

How reproducible:
always

Steps to Reproduce:
1. Trigger installation with openshift_release specified:
# cat inventory
<--snip-->
openshift_release=v3.9
openshift_use_crio=true
<--snip-->


Actual results:
The installer tried to pull the cri-o:3.9 image, which the registry does not have (HTTP 404), so the installation failed.

TASK [container_runtime : Pre-pull CRI-O System Container image] ***************
Saturday 07 April 2018  23:42:03 -0400 (0:00:00.068)       0:01:53.785 ******** 
fatal: [qe-ghuang39-master-etcd-1.0407-s6f.qe.rhcloud.com]: FAILED! => {"changed": false, "cmd": ["atomic", "pull", "--storage", "ostree", "registry.access.redhat.com/openshift3/cri-o:3.9"], "delta": "0:00:01.375291", "end": "2018-04-07 23:42:06.973018", "failed": true, "msg": "non-zero return code", "rc": 1, "start": "2018-04-07 23:42:05.597727", "stderr": "time=\"2018-04-07T23:42:06-04:00\" level=fatal msg=\"Error determining manifest MIME type for docker://registry.access.redhat.com/openshift3/cri-o:3.9: error parsing HTTP 404 response body: invalid character 'F' looking for beginning of value: \"File not found.\\\"\"\" ", "stderr_lines": ["time=\"2018-04-07T23:42:06-04:00\" level=fatal msg=\"Error determining manifest MIME type for docker://registry.access.redhat.com/openshift3/cri-o:3.9: error parsing HTTP 404 response body: invalid character 'F' looking for beginning of value: \"File not found.\\\"\"\" "], "stdout": "", "stdout_lines": []}
fatal: [qe-ghuang39-node-registry-router-1.0407-s6f.qe.rhcloud.com]: FAILED! => {"changed": false, "cmd": ["atomic", "pull", "--storage", "ostree", "registry.access.redhat.com/openshift3/cri-o:3.9"], "delta": "0:00:01.765012", "end": "2018-04-07 23:42:07.416900", "failed": true, "msg": "non-zero return code", "rc": 1, "start": "2018-04-07 23:42:05.651888", "stderr": "time=\"2018-04-07T23:42:07-04:00\" level=fatal msg=\"Error determining manifest MIME type for docker://registry.access.redhat.com/openshift3/cri-o:3.9: error parsing HTTP 404 response body: invalid character 'F' looking for beginning of value: \"File not found.\\\"\"\" ", "stderr_lines": ["time=\"2018-04-07T23:42:07-04:00\" level=fatal msg=\"Error determining manifest MIME type for docker://registry.access.redhat.com/openshift3/cri-o:3.9: error parsing HTTP 404 response body: invalid character 'F' looking for beginning of value: \"File not found.\\\"\"\" "], "stdout": "", "stdout_lines": []}

Expected results:
The installer should pull the cri-o:v3.9 image instead.

Additional info:
Please attach logs from ansible-playbook with the -vvv flag

Comment 2 Johnny Liu 2018-04-09 09:17:41 UTC
The root cause in the log is the following task:

TASK [openshift_sanitize_inventory : Normalize openshift_release] **************
Monday 09 April 2018  04:48:40 -0400 (0:00:00.051)       0:00:03.028 ********** 
ok: [ec2-18-232-79-29.compute-1.amazonaws.com] => {"ansible_facts": {"openshift_release": "3.9"}, "changed": false, "failed": false}
ok: [ec2-52-90-214-49.compute-1.amazonaws.com] => {"ansible_facts": {"openshift_release": "3.9"}, "changed": false, "failed": false}


This task overwrites openshift_release with 3.9, dropping the 'v' prefix.
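
To illustrate the failure mode described above, here is a minimal Python sketch. It is not the openshift-ansible code. The function names (normalize_release, image_tag, crio_image) are hypothetical, and the only behavior taken from this report is that the normalized value loses its 'v' while the registry expects a tag of the form v3.9.

```python
def normalize_release(release: str) -> str:
    """Hypothetical stand-in for the 'Normalize openshift_release' step:
    drop a single leading 'v' so that 'v3.9' becomes '3.9'."""
    return release[1:] if release.startswith("v") else release


def image_tag(release: str) -> str:
    """Build an image tag that carries exactly one 'v' prefix,
    whether or not the input was already normalized."""
    return "v" + normalize_release(release)


def crio_image(repository: str, release: str) -> str:
    """Join a repository path and the 'v'-prefixed tag."""
    return f"{repository}:{image_tag(release)}"


if __name__ == "__main__":
    repo = "registry.access.redhat.com/openshift3/cri-o"
    # Using the normalized value directly reproduces the failing reference.
    print(f"{repo}:{normalize_release('v3.9')}")
    # Re-adding the prefix yields the tag the report expects.
    print(crio_image(repo, "v3.9"))
```

The point of the sketch is that the tag should be derived from the normalized value with the prefix re-added, rather than using the normalized value as-is.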

Comment 3 Gan Huang 2018-06-01 05:58:51 UTC
When I specify openshift_release=v3.9, pause_image is set to "registry.reg-aws.openshift.com:443/openshift3/ose-pod:3.9" in /etc/crio/crio.conf. As a result, no pods can start.

This is a very serious issue: it breaks the cri-o installation for 3.9.
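
A quick way to check an installed host for this symptom is to read the pause_image tag out of the configuration text. The following is a rough Python sketch under stated assumptions: the helper names (pause_image_tag, has_v_prefix) are hypothetical, and it assumes pause_image appears on one line as pause_image = "..." as quoted above.

```python
import re
from typing import Optional


def pause_image_tag(conf_text: str) -> Optional[str]:
    """Return the tag of the pause_image entry, or None if it is absent
    or untagged. Assumes a single-line pause_image = "..." setting."""
    match = re.search(r'^\s*pause_image\s*=\s*"([^"]+)"', conf_text, re.MULTILINE)
    if match is None:
        return None
    # Look only at the last path component so that a registry port
    # such as ':443' is not mistaken for the tag separator.
    name = match.group(1).rsplit("/", 1)[-1]
    if ":" not in name:
        return None
    return name.rsplit(":", 1)[1]


def has_v_prefix(tag: Optional[str]) -> bool:
    """True when the tag exists and starts with 'v'."""
    return tag is not None and tag.startswith("v")


if __name__ == "__main__":
    sample = 'pause_image = "registry.reg-aws.openshift.com:443/openshift3/ose-pod:3.9"\n'
    tag = pause_image_tag(sample)
    print(tag, has_v_prefix(tag))
```

On a real host the text would come from /etc/crio/crio.conf. The sample string here is the value quoted in this comment.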

Comment 4 Scott Dodson 2018-06-01 13:25:08 UTC
I think comment #3 should've been opened as a new bug since the rest of this bug describes system containers.

Anyway, we need to fix the pause_image definition.

https://github.com/openshift/openshift-ansible/blob/release-3.9/roles/container_runtime/defaults/main.yml#L140-L146

Comment 5 Scott Dodson 2018-06-01 15:51:48 UTC
https://github.com/openshift/openshift-ansible/pull/8601 should fix this

Comment 7 Gan Huang 2018-06-04 03:31:16 UTC
Verified in openshift-ansible-3.9.30-1.git.7.46f8678.el7.noarch

Comment 10 errata-xmlrpc 2018-06-06 15:46:20 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2018:1796

