Bug 1714771 - Updating the kube-apiserver certificate with a new certificate fails to reload the kube-apiserver certificate
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: kube-apiserver
Version: 4.1.0
Hardware: All
OS: All
Priority: unspecified
Severity: medium
Target Milestone: ---
Target Release: 4.2.0
Assignee: Luis Sanchez
QA Contact: Xingxing Xia
URL:
Whiteboard:
Depends On:
Blocks: 1716622
 
Reported: 2019-05-28 19:14 UTC by Matt Woodson
Modified: 2019-10-16 06:29 UTC (History)
10 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
: 1716622 (view as bug list)
Environment:
Last Closed: 2019-10-16 06:29:26 UTC
Target Upstream Version:
Embargoed:


Attachments


Links
System ID Private Priority Status Summary Last Updated
Red Hat Product Errata RHBA-2019:2922 0 None None None 2019-10-16 06:29:43 UTC

Description Matt Woodson 2019-05-28 19:14:22 UTC
Description of problem:

Somewhat related to https://bugzilla.redhat.com/show_bug.cgi?id=1711431

We are attempting to rotate certificates for the kube-apiserver.  We have already applied a certificate by specifying it here:

  servingCerts:
    namedCertificates:
    - names:
      - api.cluster_name.basedomain
      servingCertificate:
        name: my_secret_name
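For context, this servingCerts stanza lives in the cluster-scoped apiserver/cluster resource; applying it can be sketched as a merge patch (the domain and secret name are the report's placeholders, and the resource path follows the OpenShift named-certificate procedure — lint the payload locally before handing it to oc):

```shell
# Merge patch carrying the namedCertificates stanza shown above
# (api.cluster_name.basedomain / my_secret_name are placeholders).
patch='{"spec":{"servingCerts":{"namedCertificates":[
  {"names":["api.cluster_name.basedomain"],
   "servingCertificate":{"name":"my_secret_name"}}]}}}'

# Validate the JSON locally; against a live cluster you would then run:
#   oc patch apiserver cluster --type=merge -p "$patch"
echo "$patch" | python3 -m json.tool > /dev/null && echo "patch ok"
```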

Everything is applied as desired.

Now, when this certificate expires, we need to renew it.  We do the renewal and replace the contents of the secret "my_secret_name" with the updated certificate, but the kube-apiserver never restarts and the new certificate is never applied.

Workaround: Delete the kube-apiserver pods, they start back up with the new certificate.



Version-Release number of selected component (if applicable):

version   4.1.0-rc.7   True        False         4h49m     Cluster version is 4.1.0-rc.7




Steps to Reproduce:

Steps already described


Actual results:
The new certs don't get applied


Expected results:

New cert to start serving

Comment 1 David Eads 2019-05-28 19:53:42 UTC
@sanchezl  We may be trying to auto-reload. Check the certs on disk.

Comment 2 David Eads 2019-05-28 19:56:49 UTC
Which pods did you delete?  Deleting a static pod doesn't do anything.

Also, you'll want to attach the must-gather output.

Comment 3 Matt Woodson 2019-05-28 20:59:15 UTC
I deleted the pods in the openshift-kube-apiserver namespace.  The pods I deleted were 'kube-apiserver-ip-???'.

I have the must-gather script downloaded, but which operator does it need to be run against?  I tried openshift-kube-apiserver-operator and kube-apiserver-operator.  Please advise, and I will get it added.

Comment 5 Luis Sanchez 2019-05-29 17:20:48 UTC
Recreation attempt:

Updated user cert in openshift-config namespace:

oc -n openshift-config create secret tls my_secret_name --cert cert.pem  --key privkey.pem --dry-run -o yaml  | oc --insecure-skip-tls-verify apply -f -

I noticed this event:

I0529 15:15:14.298881       1 event.go:221] Event(v1.ObjectReference{Kind:"Deployment", Namespace:"openshift-kube-apiserver-operator", Name:"kube-apiserver-operator", UID:"c025d5a1-8158-11e9-9b3d-0ab2223433b6", APIVersion:"apps/v1", ResourceVersion:"", FieldPath:""}): type: 'Normal' reason: 'SecretUpdated' Updated Secret/user-serving-cert-000 -n openshift-kube-apiserver because it changed

I confirmed {Secret/user-serving-cert-000 -n openshift-kube-apiserver} matched {Secret/my_secret_name -n openshift-config}. 
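One way to check that kind of byte-for-byte match is to compare SHA-256 fingerprints of the certs. A self-contained local sketch using a throwaway cert (on a live cluster you would instead pull the base64 `tls.crt` value from each secret, e.g. with `oc -n openshift-config get secret my_secret_name -o jsonpath='{.data.tls\.crt}'`):

```shell
# Throwaway self-signed cert standing in for the user-provided serving
# cert (the subject name is made up for this sketch).
openssl req -x509 -newkey rsa:2048 -nodes -days 1 \
  -subj "/CN=api.example.test" -keyout tls.key -out tls.crt 2>/dev/null

# Secrets store the PEM base64-encoded; simulate that here.
encoded=$(base64 < tls.crt | tr -d '\n')

# The fingerprints match iff the synced copy is byte-identical
# to the source cert.
src_fp=$(openssl x509 -in tls.crt -noout -fingerprint -sha256)
synced_fp=$(echo "$encoded" | base64 -d | openssl x509 -noout -fingerprint -sha256)
echo "$src_fp"
echo "$synced_fp"
```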

I was watching the returned certs with this command:

watch -n 0.1 "echo | openssl s_client -showcerts -connect api.sanchezl.devcluster.openshift.com:6443 2>/dev/null | openssl x509 -inform pem -noout -text | grep -E '(Not After\s*:|Issuer\s*:|Subject\s*:|DNS\s*:|Not Before\s*:|Serial Number\s*:)'"

I was getting both the old and new cert returned on random retries.
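The serial number is a quick way to tell the two certs apart in that output, since each issued certificate gets a unique one. A minimal local sketch, with two throwaway self-signed certs standing in for the old and renewed serving cert:

```shell
# Two throwaway certs for the same name, standing in for the old and
# the renewed serving certificate (names here are made up).
openssl req -x509 -newkey rsa:2048 -nodes -days 1 \
  -subj "/CN=api.example.test" -keyout old.key -out old.pem 2>/dev/null
openssl req -x509 -newkey rsa:2048 -nodes -days 1 \
  -subj "/CN=api.example.test" -keyout new.key -out new.pem 2>/dev/null

# openssl assigns a random serial to each cert, so comparing serials
# identifies which cert a given apiserver pod is actually serving.
old_serial=$(openssl x509 -in old.pem -noout -serial)
new_serial=$(openssl x509 -in new.pem -noout -serial)
echo "old: $old_serial"
echo "new: $new_serial"
```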

When I checked the files on disk, only 1 out of 3 kube-apiserver pods had the updated certificate on disk. The cert-syncer container logs in the pods which did not update had very terse logs:

I0529 14:53:11.501705       1 observer_polling.go:106] Starting file observer
I0529 14:53:11.503146       1 certsync_controller.go:161] Starting CertSyncer

while the cert-syncer container logs on the "working" pod were verbose.

must-gather logs captured (https://drive.google.com/file/d/1eJl2WBvkOS8ZtC2OqFMzKz_9x81cEADx/view?usp=sharing).

Comment 6 Luis Sanchez 2019-05-29 17:42:37 UTC
Forcing redeployment works to ensure the new certs are being used:

oc patch kubeapiserver/cluster --type=json -p '[ {"op": "replace", "path": "/spec/forceRedeploymentReason", "value": "pickup new certs" } ]'

Comment 7 Luis Sanchez 2019-06-03 18:39:58 UTC
PR https://github.com/openshift/library-go/pull/430

Comment 11 errata-xmlrpc 2019-10-16 06:29:26 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2019:2922

