Bug 1427067

Summary: Failed to redeploy certificates with Atomic Hosts because "etcd_is_atomic" is detected incorrectly
Product: OpenShift Container Platform
Reporter: Gan Huang <ghuang>
Component: Installer
Assignee: Tim Bielawa <tbielawa>
Status: CLOSED ERRATA
QA Contact: Gan Huang <ghuang>
Severity: medium
Docs Contact:
Priority: medium
Version: 3.5.0
CC: aos-bugs, gpei, jokerman, mmccomas
Target Milestone: ---
Keywords: Regression
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Cause: The fact "etcd_is_atomic" was detected incorrectly due to the role ordering of some fact-setting operations. Atomic Hosts do not support yum/repoquery/rpm commands.
Consequence: Atomic Hosts would attempt to run commands specific to managing/inspecting repositories and packages when they should not.
Fix: The ordering of role calls and fact updates was changed and wrapped in a meta-role to ensure they stay in the correct order.
Result: Atomic Hosts will not attempt to run unsupported rpm/yum/repoquery type commands because the etcd_is_atomic fact is correctly detected.
Story Points: ---
Clone Of:
Clones: 1442009 1442010 1442011
Environment:
Last Closed: 2017-04-12 19:02:42 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On:
Bug Blocks: 1397958

Description Gan Huang 2017-02-27 09:33:41 UTC
Description of problem:
Redeploying certificates on Atomic Hosts fails because etcd_is_atomic is detected incorrectly.

Version-Release number of selected component (if applicable):
openshift-ansible-3.5.15-1.git.0.8d2a456.el7.noarch

How reproducible:
always

Steps to Reproduce:
1. Trigger an HA installation on Atomic Hosts with the following inventory:
# cat inventory_hosts
<--snip-->
openshift_master_ca_certificate={'certfile': '/root/1488161445-02-Feb-26-Feb-2017/rootCA.pem', 'keyfile': '/root/1488161445-02-Feb-26-Feb-2017/rootCA.key'}
openshift_master_named_certificates=[{"certfile": "/root/1488161445-02-Feb-26-Feb-2017/openshift-127.lab.sjc.redhat.com.crt", "keyfile": "/root/1488161445-02-Feb-26-Feb-2017/openshift-127.lab.sjc.redhat.com.key", "cafile": "/root/1488161445-02-Feb-26-Feb-2017/rootCA.pem", "names": ["test.anotherhost.com","foo.anotherhost.com", "openshift-127.lab.sjc.redhat.com"]}]

<--snip-->
[masters]
openshift-147.lab.sjc.redhat.com  
openshift-136.lab.sjc.redhat.com  
openshift-137.lab.sjc.redhat.com  

[etcd]
openshift-147.lab.sjc.redhat.com  
openshift-136.lab.sjc.redhat.com 
openshift-137.lab.sjc.redhat.com 

2. After the installation completes, redeploy the certificates:
ansible-playbook -i inventory_hosts /usr/share/ansible/openshift-ansible/playbooks/byo/openshift-cluster/redeploy-openshift-ca.yml 

Actual results:
TASK [etcd_ca : debug] *********************************************************
ok: [openshift-147.lab.sjc.redhat.com] => {
    "etcd_is_atomic": false
}

TASK [etcd_ca : Install openssl] ***********************************************
fatal: [openshift-147.lab.sjc.redhat.com -> openshift-147.lab.sjc.redhat.com]: FAILED! => {
    "changed": false, 
    "failed": true
}

MSG:

Could not find a module for unknown.


NO MORE HOSTS LEFT *************************************************************
    to retry, use: --limit @/usr/share/ansible/openshift-ansible/playbooks/byo/openshift-cluster/redeploy-openshift-ca.retry

PLAY RECAP *********************************************************************

Expected results:
No errors.

Additional info:
After adding a debug task to the code (shown below), etcd_is_atomic turns out to be set to the wrong value: it is reported as false even though openshift-147.lab.sjc.redhat.com is an Atomic Host. The issue does not occur in a fresh installation.

# cat /usr/share/ansible/openshift-ansible/roles/etcd_ca/tasks/main.yml
---
- debug: var=etcd_is_atomic

- name: Install openssl
  package: name=openssl state=present
  when: not etcd_is_atomic | bool
  delegate_to: "{{ etcd_ca_host }}"
  run_once: true

<--snip-->
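
The guard above depends entirely on etcd_is_atomic having been set correctly earlier in the run. One way to make such a guard independent of fact ordering would be to check the host type immediately before the package task. The following is a minimal sketch under stated assumptions, not the change that was merged for this bug: it assumes that OSTree-based hosts such as Atomic Host expose the marker file /run/ostree-booted, and the registered variable name ostree_booted is hypothetical.

```yaml
---
# Sketch only: decide "is this an Atomic Host?" right where the answer is needed.
# Assumption: /run/ostree-booted exists on OSTree-based (Atomic) hosts
# and is absent on RPM-based hosts.
- name: Check whether the etcd CA host is OSTree-based
  stat:
    path: /run/ostree-booted
  register: ostree_booted            # hypothetical variable name
  delegate_to: "{{ etcd_ca_host }}"
  run_once: true

# Same task as in the original file, but guarded by the check above
# instead of by a fact that was set elsewhere.
- name: Install openssl
  package:
    name: openssl
    state: present
  when: not ostree_booted.stat.exists
  delegate_to: "{{ etcd_ca_host }}"
  run_once: true
```

Because both tasks are delegated to the same host and use run_once, the check and the install would refer to the same machine. This trades a small amount of duplicated detection for not relying on role ordering.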

Comment 1 Tim Bielawa 2017-03-08 18:33:38 UTC
Work-in-progress PR is open now: https://github.com/openshift/openshift-ansible/pull/3600

Still working this out and doing initial tests.

Comment 2 Tim Bielawa 2017-03-10 15:32:52 UTC
Merged
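
The Doc Text for this bug describes the fix as reordering the role calls and fact updates and wrapping them in a meta-role. As a rough illustration of that idea only: an Ansible role whose meta/main.yml lists dependencies runs those dependencies, in the order listed, before its own tasks. The role names below are hypothetical and are not taken from the merged PR.

```yaml
---
# roles/example_etcd_fact_setup/meta/main.yml   (hypothetical meta-role)
# This role has no tasks of its own; it exists only to pin the order
# in which the fact-setting roles run.
dependencies:
  - role: example_detect_host_type   # hypothetical: determines whether the host is Atomic
  - role: example_etcd_facts         # hypothetical: derives etcd_is_atomic from that result
```

A playbook would then reference example_etcd_fact_setup instead of calling the two roles separately, so the order cannot drift between the install and redeploy playbooks.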

Comment 3 Scott Dodson 2017-03-13 18:10:24 UTC
Backport to release-1.5: https://github.com/openshift/openshift-ansible/pull/3633

Comment 5 Gan Huang 2017-03-14 10:16:32 UTC
The issue does not occur with openshift-ansible-3.5.32-1.git.0.42cf266.el7.noarch.

Moving to VERIFIED.

Comment 7 errata-xmlrpc 2017-04-12 19:02:42 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2017:0903