Bug 1634004
| Summary: | Upgrade to OCP 3.9 fails at task "Fail when OpenShift is not installed" | | |
|---|---|---|---|
| Product: | OpenShift Container Platform | Reporter: | Joel Rosental R. <jrosenta> |
| Component: | Cluster Version Operator | Assignee: | Patrick Dillon <padillon> |
| Status: | CLOSED ERRATA | QA Contact: | Johnny Liu <jialiu> |
| Severity: | urgent | Docs Contact: | |
| Priority: | urgent | CC: | aos-bugs, jkaur, jokerman, jrosenta, mmccomas, padillon, wmeng |
| Version: | 3.9.0 | Target Milestone: | --- |
| Target Release: | 3.9.z | Hardware: | Unspecified |
| OS: | Linux | Whiteboard: | |
| Fixed In Version: | | Doc Type: | Bug Fix |
| Story Points: | --- | Clone Of: | |
| Environment: | | Last Closed: | 2018-12-13 19:27:05 UTC |
| Type: | Bug | Regression: | --- |
| Mount Type: | --- | Documentation: | --- |
| CRM: | | Verified Versions: | |
| Category: | --- | oVirt Team: | --- |
| RHEL 7.3 requirements from Atomic Host: | | Cloudforms Team: | --- |
| Target Upstream Version: | | Embargoed: | |

Doc Text:

Cause: In containerized installs, openshift_facts used the default ose image names to determine the installed version. When a customer uses custom image names hosted on an internal registry, the default images cannot be pulled.

Consequence: openshift.common.version is left empty, and the upgrade fails at the task "Fail when OpenShift is not installed".

Fix: If a custom image name is configured, it is now passed to openshift_facts and used for version detection.

Result: openshift.common.version is set correctly in containerized installs that use a registry mirror, and the upgrade succeeds.
The key root cause is that openshift_facts gets the version from the hard-coded "openshift/ose" image.
I could reproduce this bug with openshift-ansible-3.9.43-1.git.0.d0bc600.el7.noarch when upgrading a containerized cluster from 3.7.64 to 3.9.41. The node-only upgrade failed at this step:
TASK [Fail when OpenShift is not installed] ************************************
skipping: [host-8-251-31.host.centralci.eng.rdu2.redhat.com] => {"changed": false, "skip_reason": "Conditional result was False"}
fatal: [host-8-251-237.host.centralci.eng.rdu2.redhat.com]: FAILED! => {"changed": false, "msg": "Verify OpenShift is already installed"}
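The failure mode can be sketched in a few lines. This is an illustrative model only, not the actual openshift_facts code: the function names and the registry mapping below are hypothetical, and pulling an image is simulated with a dictionary lookup.

```python
# Hedged sketch of the version-detection behavior described in this bug.
# Before the fix, containerized version detection always used the
# hard-coded default image name, even when only custom-named images
# existed on the internal registry mirror.
DEFAULT_IMAGE = "openshift/ose"  # hard-coded default before the fix

def detect_version(pullable_images, custom_image=None):
    """Return the version tag found on the image used for detection.

    pullable_images simulates what the registry can actually serve;
    an empty string models the empty openshift.common.version fact.
    """
    image = custom_image or DEFAULT_IMAGE  # the fix: prefer the custom image
    return pullable_images.get(image, "")

# The internal registry mirrors only the custom-named image:
registry = {"registry.example.com:5000/testing/ocp3/ose": "v3.9.41"}

print(repr(detect_version(registry)))
# -> '' : the empty fact that triggers "Fail when OpenShift is not installed"

print(detect_version(registry,
                     custom_image="registry.example.com:5000/testing/ocp3/ose"))
# -> v3.9.41 : with the custom image passed in, the version is found
```

With the empty result, the `when` condition on the verification task evaluates true for those hosts and the play aborts, which matches the fatal messages in the log above.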
Verified this bug with openshift-ansible-3.9.48-1.git.0.09f6c01.el7.noarch, and it passed. Inventory variables used for verification:
openshift_image_tag=v3.9.41
openshift_release=v3.9
openshift_cockpit_deployer_prefix=host-8-241-45.host.centralci.eng.rdu2.redhat.com:5000/testing/ocp3/
oreg_url=host-8-241-45.host.centralci.eng.rdu2.redhat.com:5000/testing/ocp3/ose-${component}:${version}
openshift_docker_additional_registries=host-8-241-45.host.centralci.eng.rdu2.redhat.com:5000
openshift_docker_insecure_registries=host-8-241-45.host.centralci.eng.rdu2.redhat.com:5000
osm_etcd_image=host-8-241-45.host.centralci.eng.rdu2.redhat.com:5000/rhel7/etcd
osm_image=host-8-241-45.host.centralci.eng.rdu2.redhat.com:5000/testing/ocp3/ose
osn_image=host-8-241-45.host.centralci.eng.rdu2.redhat.com:5000/testing/ocp3/node
osn_ovs_image=host-8-241-45.host.centralci.eng.rdu2.redhat.com:5000/testing/ocp3/openvswitch
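For context, the oreg_url value above contains `${component}` and `${version}` placeholders that openshift-ansible expands per image. A minimal sketch of that substitution follows; `expand_oreg_url` is a hypothetical name for illustration, and the registry host is shortened for readability:

```python
# Hedged sketch of oreg_url placeholder expansion; not the actual
# openshift-ansible implementation, just the substitution it performs.
def expand_oreg_url(oreg_url: str, component: str, version: str) -> str:
    """Fill in the ${component} and ${version} placeholders."""
    return (oreg_url
            .replace("${component}", component)
            .replace("${version}", version))

url = expand_oreg_url(
    "registry.example.com:5000/testing/ocp3/ose-${component}:${version}",
    component="node",
    version="v3.9.41",
)
print(url)  # registry.example.com:5000/testing/ocp3/ose-node:v3.9.41
```

This is why a single oreg_url line covers every component image (ose-node, ose-pod, and so on) pulled from the mirror during the upgrade.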
PLAY [Verify upgrade targets] **************************************************
TASK [Gathering Facts] *********************************************************
ok: [host-8-251-237.host.centralci.eng.rdu2.redhat.com]
ok: [host-8-251-31.host.centralci.eng.rdu2.redhat.com]
TASK [include_tasks] ***********************************************************
included: /home/slave2/workspace/Run-Ansible-Playbooks-Nextge/private-openshift-ansible/playbooks/common/openshift-cluster/upgrades/pre/verify_upgrade_targets.yml for host-8-251-31.host.centralci.eng.rdu2.redhat.com, host-8-251-237.host.centralci.eng.rdu2.redhat.com
TASK [Fail when OpenShift is not installed] ************************************
skipping: [host-8-251-31.host.centralci.eng.rdu2.redhat.com] => {"changed": false, "skip_reason": "Conditional result was False"}
skipping: [host-8-251-237.host.centralci.eng.rdu2.redhat.com] => {"changed": false, "skip_reason": "Conditional result was False"}
After upgrade, check:
[root@host-172-16-122-61 ~]# oc get node
NAME STATUS ROLES AGE VERSION
172.16.122.35 Ready compute 32m v1.9.1+a0ce1bc657
172.16.122.61 Ready master 34m v1.9.1+a0ce1bc657
[root@host-172-16-122-61 ~]# oc version
oc v3.9.41
kubernetes v1.9.1+a0ce1bc657
features: Basic-Auth GSSAPI Kerberos SPNEGO
Server https://172.16.122.61:8443
openshift v3.9.41
kubernetes v1.9.1+a0ce1bc657
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2018:3748
Description of problem:

Customer reports that an attempt to upgrade a containerized installation of OCP from version 3.7.57 to 3.9.41 fails at the task "Fail when OpenShift is not installed":

TASK [Fail when OpenShift is not installed] **********************************************************************************************************************************************
task path: /usr/share/ansible/openshift-ansible/playbooks/common/openshift-cluster/upgrades/pre/verify_upgrade_targets.yml:2
skipping: [master1.example.com] => {
    "changed": false,
    "skip_reason": "Conditional result was False",
    "skipped": true
}
skipping: [master2.example.com] => {
    "changed": false,
    "skip_reason": "Conditional result was False",
    "skipped": true
}
skipping: [master3.example.com] => {
    "changed": false,
    "skip_reason": "Conditional result was False",
    "skipped": true
}
fatal: [node1.example.com]: FAILED! => {
    "changed": false,
    "failed": true,
    "msg": "Verify OpenShift is already installed"
}
fatal: [node2.example.com]: FAILED! => {
    "changed": false,
    "failed": true,
    "msg": "Verify OpenShift is already installed"
}
[...]

Failure summary:

1. Hosts:   node1.example.com, node2.example.com
   Play:    Verify upgrade targets
   Task:    Fail when OpenShift is not installed
   Message: Verify OpenShift is already installed

Checking https://github.com/openshift/openshift-ansible/blob/release-3.9/roles/openshift_facts/library/openshift_facts.py#L903-L921, this appears to be where the fact gets defined; however, /etc/sysconfig/atomic-openshift-node has IMAGE_VERSION properly set:

# cat /etc/sysconfig/atomic-openshift-node
OPTIONS=--loglevel=0
CONFIG_FILE=/etc/origin/node/node-config.yaml
IMAGE_VERSION=v3.7.57

The customer tried removing the facts from /etc/ansible/facts.d and retrying the upgrade, in case there was a corrupted fact file, but got the same outcome.
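The IMAGE_VERSION check the reporter performed by hand can be expressed as a small parser for sysconfig-style KEY=VALUE files. This is a hedged sketch, not the openshift_facts implementation; `read_sysconfig_var` is a hypothetical helper, and the demo writes a throwaway file rather than touching /etc/sysconfig:

```python
import os
import tempfile

def read_sysconfig_var(path, key):
    """Return the value of a KEY=VALUE line in a sysconfig-style file,
    or None if the key (or the file) is missing."""
    try:
        with open(path) as f:
            for line in f:
                line = line.strip()
                if line.startswith(key + "="):
                    return line.split("=", 1)[1]
    except OSError:
        return None
    return None

# Demo with a throwaway file shaped like /etc/sysconfig/atomic-openshift-node:
with tempfile.NamedTemporaryFile("w", delete=False) as f:
    f.write("OPTIONS=--loglevel=0\n"
            "CONFIG_FILE=/etc/origin/node/node-config.yaml\n"
            "IMAGE_VERSION=v3.7.57\n")
    path = f.name

print(read_sysconfig_var(path, "IMAGE_VERSION"))  # v3.7.57
os.unlink(path)
```

That the file parses cleanly and carries the right value is consistent with the eventual root cause being elsewhere: the version fact was derived from a container image lookup, not from this file.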
Version-Release number of the following components:

# rpm -qa | grep atomic
atomic-openshift-clients-3.9.41-1.git.0.67432b0.el7.x86_64

bash-4.2$ ansible --version
ansible 2.4.6.0
  config file = /opt/myuser/home/openshift-leun/.ansible.cfg
  configured module search path = [u'/opt/myuser/home/openshift-leun/.ansible/plugins/modules', u'/usr/share/ansible/plugins/modules']
  ansible python module location = /usr/lib/python2.7/site-packages/ansible
  executable location = /bin/ansible
  python version = 2.7.5 (default, May 31 2018, 09:41:32) [GCC 4.8.5 20150623 (Red Hat 4.8.5-28)]

How reproducible:
I could not reproduce it myself.

Steps to Reproduce:
1. I could not reproduce it myself.

Actual results:
Please include the entire output from the last TASK line through the end of output if an error is generated.

Expected results:
The fact should be defined correctly and the upgrade should finish smoothly.