Bug 1574899

Summary: ${version} for imageConfig.format is hardcoded to v3.10 when the user does not set openshift_image_tag
Product: OpenShift Container Platform
Reporter: Johnny Liu <jialiu>
Component: Installer
Assignee: Michael Gugino <mgugino>
Status: CLOSED ERRATA
QA Contact: Johnny Liu <jialiu>
Severity: medium
Docs Contact:
Priority: high
Version: 3.10.0
CC: aos-bugs, ccoleman, ghuang, gpei, jokerman, mmccomas, sdodson, wmeng
Target Milestone: ---
Keywords: Reopened
Target Release: 3.10.0
Hardware: Unspecified
OS: Unspecified
Doc Type: If docs needed, set a value
Last Closed: 2018-07-30 19:14:38 UTC
Type: Bug
Attachments:
Description: installation log with inventory file embedded
Flags: none

Description Johnny Liu 2018-05-04 09:59:10 UTC
Created attachment 1431204
installation log with inventory file embedded

Description of problem:
In roles/openshift_control_plane/defaults/main.yml and roles/openshift_node/defaults/main.yml, there is the following line:

l_os_registry_url: "{{ oreg_url | default(l_osm_registry_url_default) | regex_replace('${version}' | regex_escape, openshift_image_tag | default('${version}')) }}"

When the user does not set openshift_image_tag in the inventory file, the installer sets openshift_image_tag as a fact during installation. As a result, ${version} in imageConfig.format is replaced with v3.10.
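The substitution described above can be approximated in plain Python. This is only a sketch: it assumes the regex_replace and regex_escape filters behave like re.sub and re.escape, and the function name render_registry_url is hypothetical, not part of openshift-ansible.

```python
import re

VERSION_PLACEHOLDER = "${version}"


def render_registry_url(oreg_url, openshift_image_tag=None):
    """Approximate the l_os_registry_url expression (hypothetical helper).

    Mirrors: oreg_url | regex_replace('${version}' | regex_escape,
                                      openshift_image_tag | default('${version}'))
    """
    pattern = re.escape(VERSION_PLACEHOLDER)
    # default('${version}') only applies while the variable is undefined.
    if openshift_image_tag is None:
        replacement = VERSION_PLACEHOLDER
    else:
        replacement = openshift_image_tag
    # A function replacement keeps re.sub from interpreting backslashes in the tag.
    return re.sub(pattern, lambda _match: replacement, oreg_url)


oreg_url = "registry.reg-aws.openshift.com:443/openshift3/ose-${component}:${version}"

# Variable truly undefined: the placeholder survives.
unset_tag = render_registry_url(oreg_url)

# Installer has already set the fact to v3.10: the placeholder is overwritten.
fact_tag = render_registry_url(oreg_url, "v3.10")

print(unset_tag)
print(fact_tag)
```

The second call corresponds to the reported behavior: once the fact exists, the default never applies, even though the user never set the variable in the inventory.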

Consider the following scenario:
1. The user runs a fresh install. The openshift version is v3.10.0, and all images built for 3.10.0 also carry the v3.10 tag, so everything is okay.
2. Some days later, v3.10.1 is released, and the v3.10 tag on all images now points to v3.10.1.
3. The user deploys some apps. The kubelet pulls v3.10.1 images, while the openshift version installed in this cluster is still v3.10.0. This leads to a mismatch between the openshift version installed on the nodes and the version of the infra images pulled automatically when deploying apps.

This could cause unexpected behavior.
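The scenario can be modeled with a toy registry in which v3.10 is a floating tag. This is a sketch only: the dictionary, the build identifiers, and the pull function are hypothetical stand-ins, not a real registry API.

```python
# Hypothetical registry state at install time: v3.10 floats over v3.10.0.
registry_tags = {
    "v3.10.0": "build-3.10.0",
    "v3.10": "build-3.10.0",
}


def pull(tag):
    """Return the build a tag currently resolves to (hypothetical helper)."""
    return registry_tags[tag]


installed_version = "v3.10.0"   # binaries installed on the nodes
configured_tag = "v3.10"        # what imageConfig.format was hardcoded to

# Step 1: fresh install, the floating tag matches the installed version.
at_install = pull(configured_tag)

# Step 2: v3.10.1 is released and the floating tag moves.
registry_tags["v3.10.1"] = "build-3.10.1"
registry_tags["v3.10"] = "build-3.10.1"

# Step 3: a later app deployment pulls through the same floating tag.
after_release = pull(configured_tag)

print("matches at install:", at_install == pull(installed_version))
print("matches after release:", after_release == pull(installed_version))
```

The nodes stay on the installed version while the floating tag silently moves, which is the mismatch the report is concerned about.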


Version-Release number of the following components:
openshift-ansible-3.10.0-0.32.0.git.0.bb50d68.el7.noarch

How reproducible:
Always

Steps to Reproduce:
1. Set the following option in the inventory file, without setting openshift_image_tag:
oreg_url=registry.reg-aws.openshift.com:443/openshift3/ose-${component}:${version}
2. Trigger the installation.

Actual results:
After installation, check master and node config file.
[root@ip-172-18-10-201 ~]# grep imageConfig /etc/origin/master/master-config.yaml  -A 2
imageConfig:
  format: registry.reg-aws.openshift.com:443/openshift3/ose-${component}:v3.10
  latest: false
[root@ip-172-18-10-201 ~]# grep imageConfig /etc/origin/node/node-config.yaml  -A 2
imageConfig:
  format: "registry.reg-aws.openshift.com:443/openshift3/ose-${component}:v3.10"
  latest: false


Expected results:
When the user does not set openshift_image_tag, imageConfig.format should be "registry.reg-aws.openshift.com:443/openshift3/ose-${component}:${version}"
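The expected result could be checked automatically. The sketch below scans the text of a config file for the imageConfig.format value with a regular expression instead of a YAML parser; the function names are hypothetical, and the sample strings are taken from the output in this report.

```python
import re


def image_format(config_text):
    """Return the imageConfig.format value, or None if absent (hypothetical helper)."""
    match = re.search(r'imageConfig:\s*\n\s*format:\s*"?([^"\n]+)"?', config_text)
    return match.group(1).strip() if match else None


def keeps_version_placeholder(config_text):
    """True when the format still ends with the literal ${version} placeholder."""
    fmt = image_format(config_text)
    return fmt is not None and fmt.endswith(":${version}")


actual = '''imageConfig:
  format: "registry.reg-aws.openshift.com:443/openshift3/ose-${component}:v3.10"
  latest: false
'''

expected = '''imageConfig:
  format: registry.reg-aws.openshift.com:443/openshift3/ose-${component}:${version}
  latest: false
'''

print(keeps_version_placeholder(actual))
print(keeps_version_placeholder(expected))
```

The regular expression accepts both the quoted form seen in node-config.yaml and the unquoted form seen in master-config.yaml.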

Additional info:
Please attach logs from ansible-playbook with the -vvv flag

Comment 1 Scott Dodson 2018-05-04 12:42:58 UTC
This is intentional. If users want fine-grained control over things, they should set the exact image tag they'd like to use.

Comment 2 Johnny Liu 2018-05-04 15:39:53 UTC
Can we guarantee that the openshift v3.10.0 binaries installed from RPM always work well with the latest v3.10 images? For example, if the openshift node is running version 3.10.0, does it always work well with 3.10.N infra images? I don't think we can guarantee that. At the very least, it is hard for QE to cover this scenario in daily testing, because QE generally runs a fresh install that uses the latest binaries and pulls the latest images, so the versions are always consistent. We cannot guarantee something we do not cover.

BTW, if I were a customer, I would be surprised that the cluster is running the 3.10.0 openshift binaries but pulling 3.10.N infra images.

Once this is released, many people (GEE/customers) will have the same question as this bug. Assigning back to re-evaluate the risk.

Comment 3 Scott Dodson 2018-05-10 13:44:22 UTC
I think this is a real bug. We should not be substituting any value for ${version} when configuring the imageConfig.format value.

Comment 4 Michael Gugino 2018-05-10 15:17:37 UTC
PR Created: https://github.com/openshift/openshift-ansible/pull/8326

Comment 6 Johnny Liu 2018-05-16 10:50:52 UTC
Verified this bug with openshift-ansible-3.10.0-0.41.0.git.0.88119e4.el7.noarch. Result: PASS.


# grep imageConfig /etc/origin/master/master-config.yaml  -A 2
imageConfig:
  format: registry.reg-aws.openshift.com:443/openshift3/ose-${component}:${version}
  latest: false


# docker images
REPOSITORY                                                                  TAG                 IMAGE ID            CREATED             SIZE
registry.reg-aws.openshift.com:443/openshift3/ose-node                      v3.10               4f87ef7b7696        4 days ago          1.2 GB
registry.reg-aws.openshift.com:443/openshift3/ose-deployer                  v3.10.0-0.41.0      cbe56c624ba0        4 days ago          635 MB
registry.reg-aws.openshift.com:443/openshift3/ose-control-plane             v3.10               2a64c352cce0        4 days ago          635 MB
registry.reg-aws.openshift.com:443/openshift3/ose-service-catalog           v3.10               6b22ed5fb507        4 days ago          310 MB
registry.reg-aws.openshift.com:443/openshift3/ose-web-console               v3.10               1ca5b306a0bb        4 days ago          318 MB
registry.reg-aws.openshift.com:443/openshift3/ose-template-service-broker   v3.10               53a0bc1f2036        4 days ago          284 MB
registry.reg-aws.openshift.com:443/openshift3/registry-console              v3.10               88976c243ab7        4 days ago          232 MB
registry.reg-aws.openshift.com:443/openshift3/ose-pod                       v3.10.0-0.41.0      653bda7c8a49        4 days ago          214 MB
registry.access.redhat.com/rhel7/etcd                                       3.2.15              4f35b6516d22        5 weeks ago         256 MB


[root@qe-jialiu310-master-etcd-1 ~]# cat /etc/redhat-release 
Red Hat Enterprise Linux Server release 7.5 (Maipo)
[root@qe-jialiu310-master-etcd-1 ~]# uname -r
3.10.0-862.2.3.el7.x86_64

Comment 8 errata-xmlrpc 2018-07-30 19:14:38 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2018:1816

Comment 9 Clayton Coleman 2018-08-29 19:14:47 UTC
This was working as designed.