Bug 1419654

Summary: [3.4] Containerized advanced installation fails due to missing CA certificate /etc/origin/master/ca.crt
Product: OpenShift Container Platform Reporter: Javier Ramirez <javier.ramirez>
Component: Installer Assignee: Scott Dodson <sdodson>
Status: CLOSED ERRATA QA Contact: Johnny Liu <jialiu>
Severity: high Docs Contact:
Priority: urgent    
Version: 3.4.1 CC: aos-bugs, bleanhar, chris.ganderton, erich, javier.ramirez, jokerman, mmccomas, rhowe
Target Milestone: ---   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: openshift-ansible-3.4.61-1.git.0.67ea55a.el7 Doc Type: Bug Fix
Doc Text:
Previously, containerized installations would fail if the path /etc/openshift existed prior to installation. The failure originated in code that migrated configuration directories from their 3.0 names to their 3.1 names. That code has been removed, so installation now succeeds even if /etc/openshift exists prior to installation.
Story Points: ---
Clone Of: Environment:
Last Closed: 2017-03-06 16:38:27 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 1267746    

Description Javier Ramirez 2017-02-06 16:28:03 UTC
Description of problem:
When running the advanced installer, we get the following error:


TASK [openshift_node_certificates : fail] **************************************
fatal: [master.example.com]: FAILED! => {
    "changed": false,
    "failed": true
}

MSG:

CA certificate /etc/origin/master/ca.crt doesn't exist on CA host master.example.com. Apply 'openshift_ca' role to master.example.com.

Version-Release number of selected component (if applicable):
3.4.1.2

How reproducible:
Always

Steps to Reproduce:
1. Run the ansible playbook

Actual results:
Fails with:
TASK [openshift_node_certificates : fail] **************************************
fatal: [master.example.com]: FAILED! => {
    "changed": false, 
    "failed": true
}

MSG:

CA certificate /etc/origin/master/ca.crt doesn't exist on CA host master.example.com. Apply 'openshift_ca' role to master.example.com.


Expected results:
Successful installation.

Additional info:

Comment 9 Scott Dodson 2017-02-13 19:00:53 UTC
https://github.com/openshift/openshift-ansible/pull/3343 proposed fix

Comment 10 openshift-github-bot 2017-02-13 20:57:19 UTC
Commit pushed to master at https://github.com/openshift/openshift-ansible

https://github.com/openshift/openshift-ansible/commit/cba1d9e8bcc087c550ce3a4221cd08dc6eac1240
Fix Bug 1419654 Remove legacy config_base fallback to /etc/openshift

If a host had /etc/openshift but not /etc/origin we were setting the
config_base to /etc/openshift in some places but not all. This code was
transitional in order to migrate between 3.0 and 3.1. Given that current
playbooks are only supported when moving from the previous version to
current version this should no longer be necessary.
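
The "in some places but not all" mismatch can be illustrated with a minimal Python sketch. This is not the openshift-ansible code: the function names (`legacy_config_base`, `fixed_config_base`, `ca_path`) and the `exists` callback are hypothetical and only model the behaviour described in this commit message.

```python
import posixpath


def legacy_config_base(exists):
    """Hypothetical model of the transitional fallback: use /etc/openshift
    when the legacy directory exists and /etc/origin does not."""
    if exists("/etc/openshift") and not exists("/etc/origin"):
        return "/etc/openshift"
    return "/etc/origin"


def fixed_config_base(exists):
    """Hypothetical model of the behaviour after the fix: no fallback."""
    return "/etc/origin"


def ca_path(config_base):
    """Build the CA certificate path under a given config base."""
    return posixpath.join(config_base, "master", "ca.crt")


# Model a host where /etc/openshift was created before installation.
present = {"/etc/openshift"}
exists = present.__contains__

# One code path applies the fallback, another hard-codes /etc/origin.
written = ca_path(legacy_config_base(exists))
expected = ca_path("/etc/origin")
print(written, expected, written == expected)
```

With the fallback removed, both code paths resolve to the same base, so the path that is written and the path that is checked agree.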

Comment 11 Scott Dodson 2017-02-13 21:00:19 UTC
release-1.4 backport https://github.com/openshift/openshift-ansible/pull/3344

Comment 13 Johnny Liu 2017-02-14 12:24:28 UTC
Tried to reproduce this bug with openshift-ansible-3.4.56-1.git.0.7ba9968.el7.noarch, creating the "/etc/openshift" directory before installation. Although the failing step is not the same one as in the initial report, the CA path can still be seen pointing to "/etc/openshift/master/ca.crt", and the installation finally failed at the "Copy the admin client config" step.

<--snip-->
TASK [openshift_node_certificates : Ensure CA certificate exists on openshift_ca_host] ***
Tuesday 14 February 2017  08:59:05 +0000 (0:00:00.140)       0:20:27.254 ****** 
ok: [ec2-54-89-125-34.compute-1.amazonaws.com -> ec2-204-236-250-35.compute-1.amazonaws.com] => {"changed": false, "stat": {"atime": 1487062120.258258, "checksum": "092119d34ae47f1281580d24d11278948260041e", "ctime": 1487062096.388439, "dev": 51714, "executable": false, "exists": true, "gid": 0, "gr_name": "root", "inode": 33642408, "isblk": false, "ischr": false, "isdir": false, "isfifo": false, "isgid": false, "islnk": false, "isreg": true, "issock": false, "isuid": false, "md5": "88448b19d1721e9f7064afae23d8b06b", "mode": "0644", "mtime": 1487062075.053601, "nlink": 3, "path": "/etc/openshift/master/ca.crt", "pw_name": "root", "readable": true, "rgrp": true, "roth": true, "rusr": true, "size": 1070, "uid": 0, "wgrp": false, "woth": false, "writeable": true, "wusr": true, "xgrp": false, "xoth": false, "xusr": false}}

TASK [openshift_node_certificates : fail] **************************************
Tuesday 14 February 2017  08:59:05 +0000 (0:00:00.633)       0:20:27.888 ****** 
skipping: [ec2-54-89-125-34.compute-1.amazonaws.com] => {"changed": false, "skip_reason": "Conditional check failed", "skipped": true}
<--snip-->

TASK [openshift_projects : Copy the admin client config(s)] ********************
Tuesday 14 February 2017  09:10:41 +0000 (0:00:00.540)       0:32:03.318 ****** 
fatal: [ec2-204-236-250-35.compute-1.amazonaws.com]: FAILED! => {"changed": false, "cmd": ["cp", "/etc/origin/master/admin.kubeconfig", "/tmp/openshift-ansible-7B5yYa/admin.kubeconfig"], "delta": "0:00:00.002807", "end": "2017-02-14 04:10:40.483319", "failed": true, "rc": 1, "start": "2017-02-14 04:10:40.480512", "stderr": "cp: cannot stat ‘/etc/origin/master/admin.kubeconfig’: No such file or directory", "stdout": "", "stdout_lines": [], "warnings": []}

I think this is the root cause.

Because the fix package, openshift-ansible-3.4.61-1.git.0.67ea55a.el7, is not yet attached to the advisory, I am moving this bug's status to "MODIFIED".

Comment 14 Johnny Liu 2017-02-16 05:43:27 UTC
openshift-ansible-3.4.61-1.git.0.67ea55a.el7 is now attached to the advisory, so moving this bug to ON_QA.

Comment 15 Johnny Liu 2017-02-16 05:43:47 UTC
Verified this bug with openshift-ansible-3.4.61-1.git.0.67ea55a.el7; result: PASS.

1. Before installation, create the /etc/openshift directory.
2. Trigger a containerized installation.
3. The installation finishes successfully.
<--snip-->
TASK [openshift_node_certificates : Ensure CA certificate exists on openshift_ca_host] ***
Thursday 16 February 2017  03:03:56 +0000 (0:00:00.267)       0:19:16.328 ***** 
ok: [ec2-54-86-13-123.compute-1.amazonaws.com -> ec2-54-86-13-123.compute-1.amazonaws.com] => {"changed": false, "stat": {"atime": 1487213726.7270463, "checksum": "a189a7b405e4514b0963d2459847196c8f1c74bb", "ctime": 1487213699.4620621, "dev": 51714, "executable": false, "exists": true, "gid": 0, "gr_name": "root", "inode": 67177395, "isblk": false, "ischr": false, "isdir": false, "isfifo": false, "isgid": false, "islnk": false, "isreg": true, "issock": false, "isuid": false, "md5": "fd91f172651104ee5ea9f89f35ffe233", "mode": "0644", "mtime": 1487213675.738946, "nlink": 3, "path": "/etc/origin/master/ca.crt", "pw_name": "root", "readable": true, "rgrp": true, "roth": true, "rusr": true, "size": 1070, "uid": 0, "wgrp": false, "woth": false, "writeable": true, "wusr": true, "xgrp": false, "xoth": false, "xusr": false}}

TASK [openshift_node_certificates : fail] **************************************
Thursday 16 February 2017  03:03:56 +0000 (0:00:00.678)       0:19:17.007 ***** 
skipping: [ec2-54-86-13-123.compute-1.amazonaws.com] => {"changed": false, "skip_reason": "Conditional check failed", "skipped": true}
<--snip-->

As seen from the above output, the CA path now points to "/etc/origin/master/ca.crt".
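
The same check can be scripted against the JSON that the "Ensure CA certificate exists on openshift_ca_host" task prints. The following is a rough sketch only: the helper names `ca_path_from_stat` and `uses_origin_base` are hypothetical, and the two sample strings are trimmed-down versions of the stat results shown in comments 13 and 15, keeping just the "exists" and "path" keys.

```python
import json


def ca_path_from_stat(result_json):
    """Return the path reported by a stat result if the file exists, else None."""
    stat = json.loads(result_json).get("stat", {})
    return stat.get("path") if stat.get("exists") else None


def uses_origin_base(path):
    """True when the reported path lives under /etc/origin."""
    return path is not None and path.startswith("/etc/origin/")


# Trimmed stat results: before the fix (3.4.56) and after the fix (3.4.61).
before = '{"changed": false, "stat": {"exists": true, "path": "/etc/openshift/master/ca.crt"}}'
after = '{"changed": false, "stat": {"exists": true, "path": "/etc/origin/master/ca.crt"}}'

for label, result in (("3.4.56", before), ("3.4.61", after)):
    path = ca_path_from_stat(result)
    print(label, path, uses_origin_base(path))
```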

Comment 17 errata-xmlrpc 2017-03-06 16:38:27 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2017:0448