Bug 1443416 - [3.5] Running the config.yml playbook fails on the second run
Summary: [3.5] Running the config.yml playbook fails on the second run
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Installer
Version: 3.5.0
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: ---
Target Release: 3.5.z
Assignee: Russell Teague
QA Contact: Gan Huang
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2017-04-19 08:36 UTC by Johan Swensson
Modified: 2018-01-29 13:43 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
When openshift_image_tag is specified in an inventory as 3.X instead of a full tag of 3.X.x.x, the evaluation of `openshift_image_tag >= LooseVersion('3.X.0.0')` results in False. This caused the condition to be misapplied elsewhere in the code, resulting in invalid evaluation of version-specific facts. The version comparisons have been updated to compare against the terse minimum version of '3.X'.
Clone Of:
Clones: 1466762 1466770
Environment:
Last Closed: 2017-07-27 18:02:01 UTC
Target Upstream Version:
Embargoed:
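
The comparison described in the Doc Text can be illustrated with a small Python sketch. It is not the installer's code: `components` and `at_least` are hypothetical helpers that mimic the component-wise comparison LooseVersion performs for purely numeric versions, and stripping a leading "v" from the tag is an assumption made for the example.

```python
def components(version):
    """Split a purely numeric version such as 'v3.5' into [3, 5].

    Hypothetical helper; stripping the leading 'v' is an assumption.
    """
    return [int(part) for part in version.lstrip("v").split(".")]


def at_least(tag, minimum):
    """Compare component lists the way Python compares lists:
    element by element, with a shorter prefix sorting first."""
    return components(tag) >= components(minimum)


short_tag = "v3.5"       # tag as set in the reporter's inventory
full_tag = "v3.5.5.31"   # precise tag mentioned in comment 11

# Old style of check: minimum written as a four-part version.
old_short = at_least(short_tag, "3.5.0.0")
old_full = at_least(full_tag, "3.5.0.0")

# Updated style of check: terse minimum version.
new_short = at_least(short_tag, "3.5")
new_full = at_least(full_tag, "3.5")

print(old_short, old_full, new_short, new_full)
```

With the four-part minimum, the short tag compares as older than 3.5.0.0 because [3, 5] is a strict prefix of [3, 5, 0, 0]. With the terse minimum, both the short and the full tag pass.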


Attachments
ansible.log (187.56 KB, text/plain), 2017-05-22 11:59 UTC, Johan Swensson
inventory (7.72 KB, text/plain), 2017-06-29 13:15 UTC, Johan Swensson


Links
System ID Private Priority Status Summary Last Updated
Red Hat Product Errata RHBA-2017:1810 0 normal SHIPPED_LIVE OpenShift Container Platform atomic-openshift-utils bug fix and enhancement 2017-07-27 21:58:37 UTC

Description Johan Swensson 2017-04-19 08:36:26 UTC
Description of problem:
Re-running the installer on 3.5 after a successful deployment fails.

Version-Release number of selected component (if applicable):
openshift-ansible-3.5.53-1.git.0.8ade9f2.el7.noarch
openshift-ansible-roles-3.5.53-1.git.0.8ade9f2.el7.noarch
openshift-ansible-docs-3.5.53-1.git.0.8ade9f2.el7.noarch
openshift-ansible-lookup-plugins-3.5.53-1.git.0.8ade9f2.el7.noarch
openshift-ansible-filter-plugins-3.5.53-1.git.0.8ade9f2.el7.noarch
openshift-ansible-playbooks-3.5.53-1.git.0.8ade9f2.el7.noarch
openshift-ansible-callback-plugins-3.5.53-1.git.0.8ade9f2.el7.noarch
atomic-openshift-utils-3.5.53-1.git.0.8ade9f2.el7.noarch


How reproducible:
Every time on the second run.

Steps to Reproduce:
1. Run the installer with containerized=true
2. After a successful install, rerun the installer


Actual results:
TASK [openshift_master_certificates : file] ************************************
skipping: [master1.example.com] => (item=openshift-router.kubeconfig) 
skipping: [master1.example.com] => (item=openshift-router.key) 
skipping: [master1.example.com] => (item=openshift-router.crt) 
ok: [master2.example.com -> master1.example.com] => (item=admin.crt)
skipping: [master1.example.com] => (item=openshift-registry.kubeconfig) 
skipping: [master1.example.com] => (item=openshift-registry.key) 
skipping: [master1.example.com] => (item=openshift-registry.crt) 
skipping: [master1.example.com] => (item=service-signer.key) 
skipping: [master1.example.com] => (item=service-signer.crt) 
skipping: [master1.example.com] => (item=master.proxy-client.key) 
skipping: [master1.example.com] => (item=master.proxy-client.crt) 
ok: [master2.example.com -> master1.example.com] => (item=admin.kubeconfig)
ok: [master3.example.com -> master1.example.com] => (item=admin.key)
ok: [master2.example.com -> master1.example.com] => (item=admin.key)
ok: [master3.example.com -> master1.example.com] => (item=admin.crt)
skipping: [master1.example.com] => (item=serviceaccounts.public.key) 
skipping: [master1.example.com] => (item=serviceaccounts.private.key) 
ok: [master3.example.com -> master1.example.com] => (item=master.kubelet-client.crt)
ok: [master2.example.com -> master1.example.com] => (item=master.kubelet-client.crt)
ok: [master3.example.com -> master1.example.com] => (item=admin.kubeconfig)
skipping: [master1.example.com] => (item=ca-bundle.crt) 
ok: [master3.example.com -> master1.example.com] => (item=master.kubelet-client.key)
ok: [master2.example.com -> master1.example.com] => (item=master.kubelet-client.key)
skipping: [master1.example.com] => (item=ca.key) 
skipping: [master1.example.com] => (item=ca.crt) 
skipping: [master1.example.com] => (item=master.kubelet-client.key) 
skipping: [master1.example.com] => (item=master.kubelet-client.crt) 
ok: [master2.example.com -> master1.example.com] => (item=ca.key)
ok: [master3.example.com -> master1.example.com] => (item=ca.crt)
ok: [master2.example.com -> master1.example.com] => (item=ca.crt)
skipping: [master1.example.com] => (item=admin.kubeconfig) 
skipping: [master1.example.com] => (item=admin.key) 
ok: [master3.example.com -> master1.example.com] => (item=ca-bundle.crt)
ok: [master2.example.com -> master1.example.com] => (item=ca-bundle.crt)
ok: [master3.example.com -> master1.example.com] => (item=ca.key)
skipping: [master1.example.com] => (item=admin.crt) 
ok: [master3.example.com -> master1.example.com] => (item=serviceaccounts.private.key)
ok: [master2.example.com -> master1.example.com] => (item=serviceaccounts.private.key)
ok: [master2.example.com -> master1.example.com] => (item=serviceaccounts.public.key)
ok: [master3.example.com -> master1.example.com] => (item=serviceaccounts.public.key)
ok: [master2.example.com -> master1.example.com] => (item=master.proxy-client.crt)
ok: [master3.example.com -> master1.example.com] => (item=master.proxy-client.crt)
ok: [master2.example.com -> master1.example.com] => (item=master.proxy-client.key)
ok: [master3.example.com -> master1.example.com] => (item=master.proxy-client.key)
ok: [master2.example.com -> master1.example.com] => (item=service-signer.crt)
ok: [master3.example.com -> master1.example.com] => (item=service-signer.crt)
ok: [master2.example.com -> master1.example.com] => (item=service-signer.key)
ok: [master3.example.com -> master1.example.com] => (item=service-signer.key)
failed: [master2.example.com -> master1.example.com] (item=openshift-registry.crt) => {
    "failed": true, 
    "item": "openshift-registry.crt", 
    "path": "/etc/origin/generated-configs/master-master2.example.com/openshift-registry.crt", 
    "state": "absent"
}

MSG:

Error while linking: [Errno 2] No such file or directory

failed: [master3.example.com -> master1.example.com] (item=openshift-registry.crt) => {
    "failed": true, 
    "item": "openshift-registry.crt", 
    "path": "/etc/origin/generated-configs/master-master3.example.com/openshift-registry.crt", 
    "state": "absent"
}

MSG:

Error while linking: [Errno 2] No such file or directory

failed: [master2.example.com -> master1.example.com] (item=openshift-registry.key) => {
    "failed": true, 
    "item": "openshift-registry.key", 
    "path": "/etc/origin/generated-configs/master-master2.example.com/openshift-registry.key", 
    "state": "absent"
}

MSG:

Error while linking: [Errno 2] No such file or directory

failed: [master3.example.com -> master1.example.com] (item=openshift-registry.key) => {
    "failed": true, 
    "item": "openshift-registry.key", 
    "path": "/etc/origin/generated-configs/master-master3.example.com/openshift-registry.key", 
    "state": "absent"
}

MSG:

Error while linking: [Errno 2] No such file or directory

failed: [master2.example.com -> master1.example.com] (item=openshift-registry.kubeconfig) => {
    "failed": true, 
    "item": "openshift-registry.kubeconfig", 
    "path": "/etc/origin/generated-configs/master-master2.example.com/openshift-registry.kubeconfig", 
    "state": "absent"
}

MSG:

Error while linking: [Errno 2] No such file or directory

failed: [master3.example.com -> master1.example.com] (item=openshift-registry.kubeconfig) => {
    "failed": true, 
    "item": "openshift-registry.kubeconfig", 
    "path": "/etc/origin/generated-configs/master-master3.example.com/openshift-registry.kubeconfig", 
    "state": "absent"
}

MSG:

Error while linking: [Errno 2] No such file or directory

failed: [master2.example.com -> master1.example.com] (item=openshift-router.crt) => {
    "failed": true, 
    "item": "openshift-router.crt", 
    "path": "/etc/origin/generated-configs/master-master2.example.com/openshift-router.crt", 
    "state": "absent"
}

MSG:

Error while linking: [Errno 2] No such file or directory

failed: [master3.example.com -> master1.example.com] (item=openshift-router.crt) => {
    "failed": true, 
    "item": "openshift-router.crt", 
    "path": "/etc/origin/generated-configs/master-master3.example.com/openshift-router.crt", 
    "state": "absent"
}

MSG:

Error while linking: [Errno 2] No such file or directory

failed: [master2.example.com -> master1.example.com] (item=openshift-router.key) => {
    "failed": true, 
    "item": "openshift-router.key", 
    "path": "/etc/origin/generated-configs/master-master2.example.com/openshift-router.key", 
    "state": "absent"
}

MSG:

Error while linking: [Errno 2] No such file or directory

failed: [master3.example.com -> master1.example.com] (item=openshift-router.key) => {
    "failed": true, 
    "item": "openshift-router.key", 
    "path": "/etc/origin/generated-configs/master-master3.example.com/openshift-router.key", 
    "state": "absent"
}

MSG:

Error while linking: [Errno 2] No such file or directory

failed: [master2.example.com -> master1.example.com] (item=openshift-router.kubeconfig) => {
    "failed": true, 
    "item": "openshift-router.kubeconfig", 
    "path": "/etc/origin/generated-configs/master-master2.example.com/openshift-router.kubeconfig", 
    "state": "absent"
}

MSG:

Error while linking: [Errno 2] No such file or directory

failed: [master3.example.com -> master1.example.com] (item=openshift-router.kubeconfig) => {
    "failed": true, 
    "item": "openshift-router.kubeconfig", 
    "path": "/etc/origin/generated-configs/master-master3.example.com/openshift-router.kubeconfig", 
    "state": "absent"
}

MSG:

Error while linking: [Errno 2] No such file or directory
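
The "[Errno 2] No such file or directory" message above is what the operating system reports when a link is requested for a source file that does not exist. The sketch below reproduces that condition in isolation. It assumes the failing task creates hard links, which the log does not confirm, and the `try_hard_link` helper and the temporary paths are hypothetical.

```python
import errno
import os
import tempfile


def try_hard_link(src, dest):
    """Attempt a hard link; return None on success or the errno on failure."""
    try:
        os.link(src, dest)
    except OSError as exc:
        return exc.errno
    return None


with tempfile.TemporaryDirectory() as workdir:
    # The source file is never created, mirroring a certificate that is
    # absent on the second run (file name borrowed from the log above).
    missing_src = os.path.join(workdir, "openshift-registry.crt")
    dest = os.path.join(workdir, "linked-openshift-registry.crt")
    result = try_hard_link(missing_src, dest)

print(result == errno.ENOENT)
```

This only shows where the error code comes from; why the installer tried to link files that were not present is covered by the version comparison issue discussed later in this report.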


Expected results:
The second run should succeed.

Additional info:
The installer was run on an external bastion host.

Comment 1 Ed 2017-04-27 05:57:17 UTC
Having a similar problem on OCP 3.5. Our old 3.4 configuration worked fine.

Error conditions:
- Deploying multi-master topology with etcd and unschedulable masters
- RPM (default) installation method via yum and using ansible playbooks
- RHEL 7.X HVM

Comment 2 Russell Teague 2017-05-10 18:11:48 UTC
I have been unable to reproduce this with a containerized 3 master install.  Could you provide the following?
* Version of Ansible installed
* Complete ansible.log for the second run

Comment 3 Johan Swensson 2017-05-22 11:59:25 UTC
Created attachment 1281009
ansible.log

Ansible version is ansible-2.2.1.0-2.el7.noarch

Comment 4 Johan Swensson 2017-06-03 14:12:58 UTC
Might be worth mentioning that I'm using custom certificates for the public side of the masters and routers via:

openshift_master_named_certificates and openshift_hosted_router_certificate

Comment 6 Russell Teague 2017-06-19 20:45:04 UTC
The logic here may be causing an issue.  Investigating.

https://github.com/openshift/openshift-ansible/blame/master/roles/openshift_master_facts/filter_plugins/openshift_master.py#L552

Comment 7 Scott Dodson 2017-06-29 13:04:25 UTC
Johan,

Can you also provide your complete inventory?

Comment 8 Johan Swensson 2017-06-29 13:15:15 UTC
Created attachment 1292878
inventory

Comment 9 Russell Teague 2017-06-29 19:46:45 UTC
Proposed: https://github.com/openshift/openshift-ansible/pull/4647

Comment 11 Gan Huang 2017-07-04 10:24:45 UTC
I tried multiple times and eventually reproduced this issue:

openshift-ansible-3.5.91-1.git.0.28b3ddb.el7.noarch.rpm (The fix is not in)

==On the first run
Both openshift_image_tag and openshift_version were set to v3.5, with a containerized HA installation on Atomic Host. The installation succeeded without errors.

As a result, `IMAGE_VERSION` was set to v3.5, because openshift_image_tag was defined in the inventory hosts file.

# grep "IMAGE_VERSION" /etc/sysconfig/atomic-openshift-master
IMAGE_VERSION=v3.5

==On the second run
Reran the BYO playbook; the installation failed as in comment 1


I suspect this was simply incorrect usage: we shouldn't set "openshift_image_tag=v3.5" in the inventory hosts file.

AFAIK the correct way to use openshift_image_tag is to set a precise version (e.g. v3.5.5.31) instead of v3.5 (v3.5 should be used for openshift_version only). Has something changed that requires QE to update the test cases?

Please advise.

Comment 14 Gan Huang 2017-07-07 09:51:19 UTC
Verified with openshift-ansible-3.5.94-1, based on comment 11.

Comment 16 errata-xmlrpc 2017-07-27 18:02:01 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2017:1810

