Bug 1664889

Summary: Redeploy certificates playbook fails due to etcd related permissions issues
Product: OpenShift Container Platform
Reporter: Luke Stanton <lstanton>
Component: Installer
Assignee: Patrick Dillon <padillon>
Installer sub component: openshift-ansible
QA Contact: ge liu <geliu>
Status: CLOSED ERRATA
Docs Contact:
Severity: high
Priority: unspecified
CC: gpei, padillon
Version: 3.10.0
Keywords: Reopened
Target Milestone: ---
Target Release: 3.10.z
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Cause: Starting in 3.10, etcd certificates are owned by root, because etcd is expected to run as root in a static pod colocated on the master nodes.
Consequence: Customers running standalone etcd clusters who upgraded from earlier versions hit "permission denied" errors when etcd tried to access the certificates during an upgrade or a certificate redeploy.
Fix: Standalone etcd clusters are now supported: if the existing certificates are owned by etcd, the owner is set to etcd.
Result: When etcd runs in a standalone cluster, the certificates are owned by etcd and etcd can access them, so the upgrade or certificate redeploy can proceed.
Last Closed: 2019-02-20 10:11:10 UTC
Type: Bug

Description Luke Stanton 2019-01-09 22:44:55 UTC
Description of problem:

Running the redeploy-certificates.yml playbook fails because of etcd permission errors. The problem appears to be documented in https://github.com/openshift/openshift-ansible/issues/10289, but the associated pull request was apparently never merged.

Additional info:
I had to manually apply changes similar to https://github.com/openshift/openshift-ansible/pull/10291 to get the playbook working.

Comment 4 Patrick Dillon 2019-01-24 14:20:17 UTC
The known etcd permission issues were fixed by this PR: https://github.com/openshift/openshift-ansible/pull/10943

The customer had an invalid openshift_master_named_certificates entry in the inventory, which caused the install to fail. I could not reproduce any further etcd permission errors with the available information. If the fix above does not solve the problem, please open a new bug.

Comment 5 Patrick Dillon 2019-01-25 19:02:09 UTC
On closer inspection, the problem is that etcd is running in a separate cluster. Starting in 3.10, OpenShift expects etcd to run as root in a static pod. I have posted a PR that adds support for situations like this, where an existing etcd cluster is already in place: if existing certs are owned by etcd, that ownership is maintained.

PR: https://github.com/openshift/openshift-ansible/pull/11079
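
For readers who want to see the ownership rule from comment 5 spelled out, here is a minimal Python sketch. It is not the code from the PR (openshift-ansible implements this in Ansible tasks); the function names, the constants, and the example path are all hypothetical and exist only to illustrate the decision: keep etcd as owner when the existing cert already belongs to etcd, otherwise use root.

```python
import os
import pwd
from typing import Optional

# Assumed names for illustration only; they are not taken from openshift-ansible.
DEFAULT_OWNER = "root"      # owner expected when etcd runs as a static pod (3.10+)
STANDALONE_OWNER = "etcd"   # owner found on certs of a standalone etcd cluster


def current_owner(path: str) -> Optional[str]:
    """Return the user name that owns path, or None if it cannot be determined."""
    try:
        uid = os.stat(path).st_uid
    except FileNotFoundError:
        # No existing cert, so there is no ownership to preserve.
        return None
    try:
        return pwd.getpwuid(uid).pw_name
    except KeyError:
        # The uid has no passwd entry; treat the owner as unknown.
        return None


def choose_cert_owner(existing_owner: Optional[str]) -> str:
    """Preserve etcd ownership if it is already in place; otherwise use root."""
    if existing_owner == STANDALONE_OWNER:
        return STANDALONE_OWNER
    return DEFAULT_OWNER
```

A caller would combine the two, for example `choose_cert_owner(current_owner("/etc/etcd/server.crt"))`, where the path is an assumed example location. Keeping the decision in a pure function separate from the filesystem lookup makes the rule easy to check without touching real certificates.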

Comment 6 Scott Dodson 2019-01-28 14:47:56 UTC
The PR from comment 5 is included in openshift-ansible-3.11.75-1 and later.

Comment 7 ge liu 2019-01-30 09:34:52 UTC
Verified with openshift-ansible-3.10.104-1.git.0.79f87f7.el7.noarch.rpm.

Comment 9 errata-xmlrpc 2019-02-20 10:11:10 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2019:0328