Bug 1716622
Summary: | Updating the kube-apiserver certificate with a new certificate fails to reload the kube-apiserver certificate | |
---|---|---|---
Product: | OpenShift Container Platform | Reporter: | Greg Blomquist <gblomqui>
Component: | Master | Assignee: | Luis Sanchez <sanchezl>
Status: | CLOSED ERRATA | QA Contact: | Xingxing Xia <xxia>
Severity: | urgent | Docs Contact: |
Priority: | unspecified | |
Version: | 4.1.0 | CC: | aos-bugs, deads, gblomqui, jokerman, jupierce, mfojtik, mifiedle, mmccomas, mwoodson, sanchezl, sponnaga, wking, xtian, xxia
Target Milestone: | --- | Keywords: | OSE41z_next
Target Release: | 4.1.z | |
Hardware: | All | |
OS: | All | |
Whiteboard: | 4.1.2 | |
Fixed In Version: | | Doc Type: | If docs needed, set a value
Doc Text: | | Story Points: | ---
Clone Of: | 1714771 | Environment: |
Last Closed: | 2019-06-19 06:45:34 UTC | Type: | ---
Regression: | --- | Mount Type: | ---
Documentation: | --- | CRM: |
Verified Versions: | | Category: | ---
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: |
Cloudforms Team: | --- | Target Upstream Version: |
Embargoed: | | |
Bug Depends On: | 1714771 | |
Bug Blocks: | 1718956 | |
Comment 1
Greg Blomquist
2019-06-03 19:18:53 UTC
There are two other PRs still required for this BZ:
* https://github.com/openshift/cluster-kube-apiserver-operator/pull/492
* https://github.com/openshift/cluster-kube-controller-manager-operator/pull/257

Both PRs merged. https://openshift-release.svc.ci.openshift.org/releasestream/4.1.0-0.nightly/release/4.1.0-0.nightly-2019-06-05-223716?from=4.1.0 has both of the new PRs.

Tested 4.1.0-0.nightly-2019-06-05-233256: updating the kube-apiserver certificate with a new certificate still fails to reload/roll out.

First, add the certificate by following https://bugzilla.redhat.com/show_bug.cgi?id=1685704#c26 :

$ openssl genrsa -out custom2.key 1024
$ openssl req -new -key custom2.key -out custom2.csr
...skipped...
Common Name (eg, your name or your server's hostname) []:api.xxia-test.qe.devcluster.openshift.com
...skipped...
$ openssl x509 -req -days 1 -in custom2.csr -signkey custom2.key -out custom2.crt
$ oc create secret tls api-certs --cert=custom2.crt --key=custom2.key -n openshift-config
$ oc edit apiserver cluster
...
spec:
  servingCerts:
    namedCertificates:
    - names:
      - api.xxia-test.qe.devcluster.openshift.com
      servingCertificate:
        name: api-certs

Then new installer-7-ip-* pods run --> the kube-apiserver pods restart.
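The certificate-generation steps above can be reproduced locally as a self-contained sketch. This assumes nothing about the cluster: api.example.com is a hypothetical hostname standing in for the real API hostname, and the `oc` step that loads the result into the cluster appears only as a comment because it needs a live cluster.

```shell
# Generate a key and a self-signed serving certificate for a hypothetical
# hostname (api.example.com stands in for the real API hostname).
openssl genrsa -out custom.key 2048
openssl req -new -key custom.key -out custom.csr -subj "/CN=api.example.com"
openssl x509 -req -days 1 -in custom.csr -signkey custom.key -out custom.crt

# On a live cluster the pair would then be loaded as a TLS secret and
# referenced from the apiserver CR, e.g.:
#   oc create secret tls api-certs --cert=custom.crt --key=custom.key -n openshift-config

# Sanity-check the subject of the freshly signed certificate.
openssl x509 -in custom.crt -noout -subject
```

Note the sketch uses a 2048-bit key; the 1024-bit key in the reproduction above works for testing but is below current minimum-strength recommendations.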
Second, update the certificate with a new .crt:

$ openssl x509 -req -days 1 -in custom2.csr -signkey custom2.key -out custom2-2.crt
$ oc create secret tls api-certs --cert=custom2-2.crt --key=custom2.key -n openshift-config --dry-run -o yaml | oc apply -f -

Watch the pods: no new installer-8-ip-* pods appear, and the kube-apiserver pods never restart accordingly:

$ watch oc get po -n openshift-kube-apiserver
Every 2.0s: oc get po -n openshift-kube-apiserver                  fedora29: Thu Jun  6 12:51

NAME                                                        READY   STATUS      RESTARTS   AGE
installer-2-ip-10-0-133-233.us-east-2.compute.internal      0/1     Completed   0          3h15m
installer-2-ip-10-0-159-216.us-east-2.compute.internal      0/1     Completed   0          3h13m
installer-2-ip-10-0-172-171.us-east-2.compute.internal      0/1     Completed   0          3h15m
installer-3-ip-10-0-159-216.us-east-2.compute.internal      0/1     Completed   0          3h13m
installer-4-ip-10-0-159-216.us-east-2.compute.internal      0/1     Completed   0          3h12m
installer-5-ip-10-0-159-216.us-east-2.compute.internal      0/1     Completed   0          3h11m
installer-6-ip-10-0-133-233.us-east-2.compute.internal      0/1     Completed   0          3h7m
installer-6-ip-10-0-159-216.us-east-2.compute.internal      0/1     Completed   0          3h11m
installer-6-ip-10-0-172-171.us-east-2.compute.internal      0/1     Completed   0          3h9m
installer-7-ip-10-0-133-233.us-east-2.compute.internal      0/1     Completed   0          18m
installer-7-ip-10-0-159-216.us-east-2.compute.internal      0/1     Completed   0          16m
installer-7-ip-10-0-172-171.us-east-2.compute.internal      0/1     Completed   0          19m
kube-apiserver-ip-10-0-133-233.us-east-2.compute.internal   2/2     Running     0          17m
kube-apiserver-ip-10-0-159-216.us-east-2.compute.internal   2/2     Running     0          16m
kube-apiserver-ip-10-0-172-171.us-east-2.compute.internal   2/2     Running     0          19m

The API server doesn't restart when certificates change, only when the configuration changes. Check whether you are actually serving with the new certificates. Also, as a reminder, please include the `oc adm must-gather` report so we can avoid ping-ponging back and forth.
Accessed the cluster in question above; the rollouts were still on the same generation:

$ oc get pods -n openshift-kube-apiserver
NAME                                                        READY   STATUS      RESTARTS   AGE
installer-2-ip-10-0-133-233.us-east-2.compute.internal      0/1     Completed   0          14h
installer-2-ip-10-0-159-216.us-east-2.compute.internal      0/1     Completed   0          14h
installer-2-ip-10-0-172-171.us-east-2.compute.internal      0/1     Completed   0          14h
installer-3-ip-10-0-159-216.us-east-2.compute.internal      0/1     Completed   0          14h
installer-4-ip-10-0-159-216.us-east-2.compute.internal      0/1     Completed   0          14h
installer-5-ip-10-0-159-216.us-east-2.compute.internal      0/1     Completed   0          14h
installer-6-ip-10-0-133-233.us-east-2.compute.internal      0/1     Completed   0          14h
installer-6-ip-10-0-159-216.us-east-2.compute.internal      0/1     Completed   0          14h
installer-6-ip-10-0-172-171.us-east-2.compute.internal      0/1     Completed   0          14h
installer-7-ip-10-0-133-233.us-east-2.compute.internal      0/1     Completed   0          11h
installer-7-ip-10-0-159-216.us-east-2.compute.internal      0/1     Completed   0          11h
installer-7-ip-10-0-172-171.us-east-2.compute.internal      0/1     Completed   0          11h
kube-apiserver-ip-10-0-133-233.us-east-2.compute.internal   2/2     Running     0          11h
kube-apiserver-ip-10-0-159-216.us-east-2.compute.internal   2/2     Running     0          11h
kube-apiserver-ip-10-0-172-171.us-east-2.compute.internal   2/2     Running     0          11h

Retrieved the cert being served with:

openssl s_client -showcerts -servername api.xxia-0606.qe.devcluster.openshift.com -connect api.xxia-0606.qe.devcluster.openshift.com:6443 </dev/null | tee -a showcert.out

and compared it to tls.crt in the api-certs secret in the openshift-config namespace. The certificates did not match. Verified in the apiserver CR that api-certs is still the servingCertificate. I will include the openssl output and the secret output in the must-gather zip I will be linking shortly. Moving this back to ON_QA.
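The comparison above (served certificate vs. tls.crt in the api-certs secret) comes down to checking whether two PEM certificates are the same. A minimal local sketch, re-signing one CSR twice the way custom2.crt and custom2-2.crt were produced, shows that the two certificates get distinct SHA-256 fingerprints even though subject, key, and issuer are identical; against a live cluster, one side would come from `openssl s_client` and the other from the secret. All file names here are hypothetical.

```shell
# Re-sign the same CSR twice, mirroring the custom2.crt -> custom2-2.crt update.
openssl genrsa -out demo.key 2048
openssl req -new -key demo.key -out demo.csr -subj "/CN=api.example.com"
# Explicit serials make the two certificates byte-different, as a real
# re-signing would (modern OpenSSL randomizes the serial by default).
openssl x509 -req -days 1 -set_serial 01 -in demo.csr -signkey demo.key -out demo-1.crt
openssl x509 -req -days 1 -set_serial 02 -in demo.csr -signkey demo.key -out demo-2.crt

# Comparing fingerprints detects a stale served certificate: a server that
# still presents the old fingerprint has not reloaded the new one.
f1=$(openssl x509 -in demo-1.crt -noout -fingerprint -sha256)
f2=$(openssl x509 -in demo-2.crt -noout -fingerprint -sha256)
if [ "$f1" != "$f2" ]; then
  echo "fingerprints differ"
fi
```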
Using the correct hostname for the SNI (-servername) version of the openssl showcerts command does show that we are serving the correct cert:

openssl s_client -showcerts -servername api.xxia-test.qe.devcluster.openshift.com -connect api.xxia-0606.qe.devcluster.openshift.com:6443
CONNECTED(00000003)
---
Certificate chain
 0 s:C = US, ST = test, L = Default City, O = Default Company Ltd, CN = api.xxia-test.qe.devcluster.openshift.com
   i:C = US, ST = test, L = Default City, O = Default Company Ltd, CN = api.xxia-test.qe.devcluster.openshift.com
-----BEGIN CERTIFICATE-----
MIICjjCCAfcCFBWHWgPnkHlHbTENgY9DZA7NFkWIMA0GCSqGSIb3DQEBCwUAMIGF
MQswCQYDVQQGEwJVUzENMAsGA1UECAwEdGVzdDEVMBMGA1UEBwwMRGVmYXVsdCBD
aXR5MRwwGgYDVQQKDBNEZWZhdWx0IENvbXBhbnkgTHRkMTIwMAYDVQQDDClhcGku
eHhpYS10ZXN0LnFlLmRldmNsdXN0ZXIub3BlbnNoaWZ0LmNvbTAeFw0xOT.......

@deads verified the subject and signer from the cert, and I was able to verify the expiration:

echo | openssl s_client -showcerts -servername api.xxia-test.qe.devcluster.openshift.com -connect api.xxia-0606.qe.devcluster.openshift.com:6443 2>/dev/null | openssl x509 -text
Certificate:
    Data:
        Version: 1 (0x0)
        Serial Number:
            15:87:5a:03:e7:90:79:47:6d:31:0d:81:8f:43:64:0e:cd:16:45:88
        Signature Algorithm: sha256WithRSAEncryption
        Issuer: C = US, ST = test, L = Default City, O = Default Company Ltd, CN = api.xxia-test.qe.devcluster.openshift.com
        Validity
            Not Before: Jun  6 04:39:18 2019 GMT
            Not After : Jun  7 04:39:18 2019 GMT
        Subject: C = US, ST = test, L = Default City, O = Default Company Ltd, CN = api.xxia-test.qe.devcluster.openshift.com

Marking this VERIFIED on 4.1.0-0.nightly-2019-06-05-223716. The new certificate is served from disk without a new rollout.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.
https://access.redhat.com/errata/RHBA-2019:1382
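As a local illustration of the expiration check in the verification comment above: the Not After date can be pulled out with `openssl x509 -enddate` instead of reading the full `-text` dump, and `-checkend` gives a pass/fail answer via exit status. This sketch signs a throwaway 1-day certificate with a hypothetical hostname rather than piping from `openssl s_client`, which needs a live endpoint.

```shell
# Create a throwaway 1-day self-signed certificate to inspect.
openssl req -x509 -newkey rsa:2048 -nodes -keyout t.key -out t.crt \
  -days 1 -subj "/CN=api.example.com"

# Print only the expiry (the same Not After value shown in the -text dump).
openssl x509 -in t.crt -noout -enddate

# Exit status 0 means the certificate will still be valid 0 seconds from now,
# i.e. it is not expired right now.
openssl x509 -in t.crt -noout -checkend 0 && echo "certificate currently valid"
```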