Bug 1711431

Summary: Cannot recover from bad serving cert secret on Kube api server
Product: OpenShift Container Platform
Component: kube-apiserver
Version: 4.1.0
Target Release: 4.2.0
Reporter: Cesar Wong <cewong>
Assignee: Luis Sanchez <sanchezl>
QA Contact: Xingxing Xia <xxia>
CC: aos-bugs, calfonso, jokerman, mfojtik, mmccomas, sanchezl
Severity: medium
Priority: unspecified
Status: CLOSED ERRATA
Type: Bug
Hardware: All
OS: All
Clones: 1711447 (view as bug list)
Bug Blocks: 1711447
Last Closed: 2019-10-16 06:29:06 UTC

Description Cesar Wong 2019-05-17 19:10:20 UTC
Description of problem:
After specifying a bad secret (one that doesn't contain tls.crt/tls.key) as the serving cert for the API server, neither fixing the secret's contents nor pointing the apiserver/cluster resource at a good secret resolves the issue.

Version-Release number of selected component (if applicable):
4.1.0-rc.4

How reproducible:
Always

Steps to Reproduce:
1. In the openshift-config namespace, create a serving cert secret whose data uses the wrong keys (say crt/key instead of tls.crt/tls.key).

2. Modify the apiserver/cluster resource and specify the new secret as the default serving cert:

apiVersion: config.openshift.io/v1
kind: APIServer
metadata:
  name: cluster
spec:
  servingCerts:
    defaultServingCertificate:
      name: bad-creds

3. Wait for kube-apiserver operator to apply the configuration and observe that one of the kube-apiserver pods starts crashlooping (as expected).
4. Modify the original secret to contain proper keys (tls.crt/tls.key). Wait for a change in the kube-apiserver pods.
5. Modify apiserver/cluster to point to a newly created secret that contains the right keys:

apiVersion: config.openshift.io/v1
kind: APIServer
metadata:
  name: cluster
spec:
  servingCerts:
    defaultServingCertificate:
      name: good-creds

Wait for a change in the kube-apiserver pods.
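The steps above could be driven from the CLI roughly as follows. This is a sketch, not taken from the report: the secret names match the resource snippets above, but the certificate file names (server.crt/server.key) and the exact oc invocations are assumptions.

```shell
# Step 1: create a secret with the wrong data keys (crt/key instead of
# tls.crt/tls.key). server.crt and server.key are placeholder file names.
oc create secret generic bad-creds \
  --from-file=crt=server.crt \
  --from-file=key=server.key \
  -n openshift-config

# Step 2: point the cluster API server config at the bad secret.
oc patch apiserver cluster --type=merge \
  -p '{"spec":{"servingCerts":{"defaultServingCertificate":{"name":"bad-creds"}}}}'

# Step 4: rewrite the original secret with the correctly named keys
# (oc create secret tls always stores the data under tls.crt/tls.key).
oc create secret tls bad-creds \
  --cert=server.crt --key=server.key \
  -n openshift-config --dry-run=client -o yaml | oc replace -f -

# Step 5: or create a well-formed secret and switch the config to it.
oc create secret tls good-creds \
  --cert=server.crt --key=server.key -n openshift-config
oc patch apiserver cluster --type=merge \
  -p '{"spec":{"servingCerts":{"defaultServingCertificate":{"name":"good-creds"}}}}'
```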

Actual results:

After steps 4 and 5, nothing changes. The affected kube-apiserver pod keeps crashlooping, reporting that it cannot find tls.crt.


Expected results:

After modifying the secret or the configuration, the secret should be updated on the master and the new serving cert should take effect.

Additional info:

Removing the serving certificate configuration completely:
apiVersion: config.openshift.io/v1
kind: APIServer
metadata:
  name: cluster
spec: {}

and then waiting for the kube-apiserver pods to roll out and become stable again does work around the issue.
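As a sketch (the command is an assumption; the report only shows the resulting resource), the servingCerts stanza could be cleared with a JSON patch:

```shell
# Remove the servingCerts stanza from apiserver/cluster entirely,
# leaving spec: {} as shown above.
oc patch apiserver cluster --type=json \
  -p '[{"op":"remove","path":"/spec/servingCerts"}]'
```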

Comment 2 Luis Sanchez 2019-05-21 19:30:39 UTC
The work-around is to specify the secret in the named certificates section instead:

  servingCerts:
    namedCertificates:
    - names:
      - api.cluster_name.basedomain
      servingCertificate:
        name: my_secret_name
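The same work-around could be applied with a merge patch; this is a sketch, and the hostname and secret name below are the placeholders from the snippet above, to be replaced with real cluster values:

```shell
# Configure the serving cert via namedCertificates rather than
# defaultServingCertificate. Placeholders: api.cluster_name.basedomain,
# my_secret_name.
oc patch apiserver cluster --type=merge \
  -p '{"spec":{"servingCerts":{"namedCertificates":[{"names":["api.cluster_name.basedomain"],"servingCertificate":{"name":"my_secret_name"}}]}}}'
```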

Comment 3 Luis Sanchez 2019-08-01 13:14:36 UTC
In addition to the code fix, the docs were updated to remove the mention of defaultServingCert, leaving the namedCertificates section (as shown in the 'work-around' above) as the appropriate way to accomplish what the originator was attempting. https://github.com/openshift/openshift-docs/pull/15642

Comment 5 Xingxing Xia 2019-08-09 08:51:11 UTC
(In reply to Luis Sanchez from comment #3)
> In addition to the code fix, the docs were updated to remove the mention of
> defaultServingCert, leaving the namedCertificates section (as shown in
> the 'work-around' above) as the appropriate way to accomplish what the
> originator was attempting.
> https://github.com/openshift/openshift-docs/pull/15642

Right. For 4.2, bug 1731105#c3 verified that defaultServingCert was removed, and bug 1714771#c9 verified that namedCertificates works. So moving this bug to VERIFIED.

Comment 7 errata-xmlrpc 2019-10-16 06:29:06 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2019:2922