https://github.com/openshift/library-go/pull/435
There are two other PRs still required for this BZ:
* https://github.com/openshift/cluster-kube-apiserver-operator/pull/492
* https://github.com/openshift/cluster-kube-controller-manager-operator/pull/257
Both PRs merged.
https://openshift-release.svc.ci.openshift.org/releasestream/4.1.0-0.nightly/release/4.1.0-0.nightly-2019-06-05-223716?from=4.1.0 has both of the new PRs.
Tested 4.1.0-0.nightly-2019-06-05-233256, updating the kube-apiserver certificate with a new certificate, still fails to reload/rollout:

First, add certificate by following https://bugzilla.redhat.com/show_bug.cgi?id=1685704#c26 :
$ openssl genrsa -out custom2.key 1024
$ openssl req -new -key custom2.key -out custom2.csr
...skipped...
Common Name (eg, your name or your server's hostname) []:api.xxia-test.qe.devcluster.openshift.com
...skipped...
$ openssl x509 -req -days 1 -in custom2.csr -signkey custom2.key -out custom2.crt
$ oc create secret tls api-certs --cert=custom2.crt --key=custom2.key -n openshift-config
$ oc edit apiserver cluster
...
spec:
  servingCerts:
    namedCertificates:
    - names:
      - api.xxia-test.qe.devcluster.openshift.com
      servingCertificate:
        name: api-certs

Then new installer-7-ip-* pods run --> kube-apiserver pods restart.

Second, update the certificate with new .crt:
$ openssl x509 -req -days 1 -in custom2.csr -signkey custom2.key -out custom2-2.crt
$ oc create secret tls api-certs --cert=custom2-2.crt --key=custom2.key -n openshift-config --dry-run -o yaml | oc apply -f -

Watch pods, no new installer-8-ip-* appear, kube-apiserver pods never restart accordingly:
$ watch oc get po -n openshift-kube-apiserver
Every 2.0s: oc get po -n openshift-kube-apiserver    fedora29: Thu Jun 6 12:51

NAME                                                         READY   STATUS      RESTARTS   AGE
installer-2-ip-10-0-133-233.us-east-2.compute.internal       0/1     Completed   0          3h15m
installer-2-ip-10-0-159-216.us-east-2.compute.internal       0/1     Completed   0          3h13m
installer-2-ip-10-0-172-171.us-east-2.compute.internal       0/1     Completed   0          3h15m
installer-3-ip-10-0-159-216.us-east-2.compute.internal       0/1     Completed   0          3h13m
installer-4-ip-10-0-159-216.us-east-2.compute.internal       0/1     Completed   0          3h12m
installer-5-ip-10-0-159-216.us-east-2.compute.internal       0/1     Completed   0          3h11m
installer-6-ip-10-0-133-233.us-east-2.compute.internal       0/1     Completed   0          3h7m
installer-6-ip-10-0-159-216.us-east-2.compute.internal       0/1     Completed   0          3h11m
installer-6-ip-10-0-172-171.us-east-2.compute.internal       0/1     Completed   0          3h9m
installer-7-ip-10-0-133-233.us-east-2.compute.internal       0/1     Completed   0          18m
installer-7-ip-10-0-159-216.us-east-2.compute.internal       0/1     Completed   0          16m
installer-7-ip-10-0-172-171.us-east-2.compute.internal       0/1     Completed   0          19m
kube-apiserver-ip-10-0-133-233.us-east-2.compute.internal    2/2     Running     0          17m
kube-apiserver-ip-10-0-159-216.us-east-2.compute.internal    2/2     Running     0          16m
kube-apiserver-ip-10-0-172-171.us-east-2.compute.internal    2/2     Running     0          19m
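In the second step above, custom2-2.crt is reissued from the same CSR and key, so the updated secret should still hold a matching pair. A local sanity check before running `oc apply` might look like the sketch below. The helper names `cert_pubkey`, `key_pubkey`, and `same_keypair` are hypothetical (they are not part of `oc` or `openssl`), and the sketch assumes an `openssl` build that provides the `pkey` subcommand.

```shell
# Hypothetical helpers for illustration only.

# Print the public key (PEM) embedded in a certificate file.
cert_pubkey() {
    openssl x509 -in "$1" -noout -pubkey 2>/dev/null
}

# Print the public key (PEM) derived from a private key file.
key_pubkey() {
    openssl pkey -in "$1" -pubout 2>/dev/null
}

# Succeed only if both files are readable and carry the same public key.
same_keypair() {
    cert_pub=$(cert_pubkey "$1") || return 1
    key_pub=$(key_pubkey "$2") || return 1
    [ -n "$cert_pub" ] && [ "$cert_pub" = "$key_pub" ]
}

# Example usage with the file names from this comment (not run here):
#   same_keypair custom2-2.crt custom2.key && echo "certificate and key match"
```

This only confirms that the certificate and key belong together; it says nothing about whether the API server has picked up the new certificate.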
The API server doesn't restart when certificates change, only when configuration changes. Check to see if you're actually serving with the new certificates. Also, as a reminder, please include the `oc adm must-gather` report so we can avoid ping-ponging back and forth.
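One way to check whether the new certificate is actually being served, without relying on pod restarts, is to compare fingerprints of the served certificate and the one stored in the secret. The following is a minimal sketch, not a supported procedure: `cert_fingerprint` and `same_fingerprint` are hypothetical helper names, and the usage lines assume the secret name, namespace, and hostnames that appear elsewhere in this bug.

```shell
# Hypothetical helper: print the SHA-256 fingerprint of the first PEM
# certificate read from stdin.
cert_fingerprint() {
    openssl x509 -noout -fingerprint -sha256 2>/dev/null
}

# Hypothetical helper: succeed only if both fingerprints are non-empty and equal.
same_fingerprint() {
    [ -n "$1" ] && [ "$1" = "$2" ]
}

# Example usage (not run here; requires cluster access):
#   served=$(openssl s_client \
#       -servername api.xxia-test.qe.devcluster.openshift.com \
#       -connect api.xxia-0606.qe.devcluster.openshift.com:6443 \
#       </dev/null 2>/dev/null | cert_fingerprint)
#   stored=$(oc get secret api-certs -n openshift-config \
#       -o jsonpath='{.data.tls\.crt}' | base64 -d | cert_fingerprint)
#   same_fingerprint "$served" "$stored" && echo "serving the cert from the secret"
```

Comparing fingerprints avoids eyeballing two long PEM blocks; an empty fingerprint on either side is treated as a mismatch so that a failed connection does not look like success.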
Accessed the cluster in question above and the rollouts were still on the same generation:

$ oc get pods -n openshift-kube-apiserver
NAME                                                         READY   STATUS      RESTARTS   AGE
installer-2-ip-10-0-133-233.us-east-2.compute.internal       0/1     Completed   0          14h
installer-2-ip-10-0-159-216.us-east-2.compute.internal       0/1     Completed   0          14h
installer-2-ip-10-0-172-171.us-east-2.compute.internal       0/1     Completed   0          14h
installer-3-ip-10-0-159-216.us-east-2.compute.internal       0/1     Completed   0          14h
installer-4-ip-10-0-159-216.us-east-2.compute.internal       0/1     Completed   0          14h
installer-5-ip-10-0-159-216.us-east-2.compute.internal       0/1     Completed   0          14h
installer-6-ip-10-0-133-233.us-east-2.compute.internal       0/1     Completed   0          14h
installer-6-ip-10-0-159-216.us-east-2.compute.internal       0/1     Completed   0          14h
installer-6-ip-10-0-172-171.us-east-2.compute.internal       0/1     Completed   0          14h
installer-7-ip-10-0-133-233.us-east-2.compute.internal       0/1     Completed   0          11h
installer-7-ip-10-0-159-216.us-east-2.compute.internal       0/1     Completed   0          11h
installer-7-ip-10-0-172-171.us-east-2.compute.internal       0/1     Completed   0          11h
kube-apiserver-ip-10-0-133-233.us-east-2.compute.internal    2/2     Running     0          11h
kube-apiserver-ip-10-0-159-216.us-east-2.compute.internal    2/2     Running     0          11h
kube-apiserver-ip-10-0-172-171.us-east-2.compute.internal    2/2     Running     0          11h

Retrieved the cert being served with:

openssl s_client -showcerts -servername api.xxia-0606.qe.devcluster.openshift.com -connect api.xxia-0606.qe.devcluster.openshift.com:6443 </dev/null | tee -a showcert.out

and compared it to tls.crt in the secret api-certs in the namespace openshift-config. The certificates did not match. Verified in the apiserver CR that api-certs is still the servingCertificate. I will include the openssl output and the secret output in the must-gather zip I will be linking shortly.
Moving this back ON_QA. Using the correct hostname for the SNI version of the openssl showcerts does show that we are serving the correct cert:

openssl s_client -showcerts -servername api.xxia-test.qe.devcluster.openshift.com -connect api.xxia-0606.qe.devcluster.openshift.com:6443
CONNECTED(00000003)
---
Certificate chain
 0 s:C = US, ST = test, L = Default City, O = Default Company Ltd, CN = api.xxia-test.qe.devcluster.openshift.com
   i:C = US, ST = test, L = Default City, O = Default Company Ltd, CN = api.xxia-test.qe.devcluster.openshift.com
-----BEGIN CERTIFICATE-----
MIICjjCCAfcCFBWHWgPnkHlHbTENgY9DZA7NFkWIMA0GCSqGSIb3DQEBCwUAMIGF
MQswCQYDVQQGEwJVUzENMAsGA1UECAwEdGVzdDEVMBMGA1UEBwwMRGVmYXVsdCBD
aXR5MRwwGgYDVQQKDBNEZWZhdWx0IENvbXBhbnkgTHRkMTIwMAYDVQQDDClhcGku
eHhpYS10ZXN0LnFlLmRldmNsdXN0ZXIub3BlbnNoaWZ0LmNvbTAeFw0xOT.......
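The earlier mismatch came from passing the connection hostname as `-servername` instead of the name listed under `namedCertificates`. A small guard against that mistake is to check which DNS name a certificate is valid for before choosing the SNI value. The sketch below uses `openssl x509 -checkhost`; the helper name `cert_matches_host` is hypothetical, and the match relies on the text that `openssl` prints for this option.

```shell
# Hypothetical helper: succeed if the PEM certificate on stdin is valid for
# the DNS name given as $1. `openssl x509 -checkhost` prints either
# "... does match certificate" or "... does NOT match certificate".
cert_matches_host() {
    openssl x509 -noout -checkhost "$1" 2>/dev/null \
        | grep -q "does match certificate"
}

# Example usage with names from this bug (not run here; custom2-2.crt is the
# certificate stored in the api-certs secret):
#   cert_matches_host api.xxia-test.qe.devcluster.openshift.com < custom2-2.crt \
#       && echo "use this name as -servername"
```

If the certificate only matches the name from `namedCertificates`, that name is the one to pass as `-servername`, while `-connect` can still point at the real API endpoint.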
@deads verified the subject and signer from the cert and I was able to verify the expiration:

echo | openssl s_client -showcerts -servername api.xxia-test.qe.devcluster.openshift.com -connect api.xxia-0606.qe.devcluster.openshift.com:6443 2>/dev/null | openssl x509 -text
Certificate:
    Data:
        Version: 1 (0x0)
        Serial Number:
            15:87:5a:03:e7:90:79:47:6d:31:0d:81:8f:43:64:0e:cd:16:45:88
        Signature Algorithm: sha256WithRSAEncryption
        Issuer: C = US, ST = test, L = Default City, O = Default Company Ltd, CN = api.xxia-test.qe.devcluster.openshift.com
        Validity
            Not Before: Jun 6 04:39:18 2019 GMT
            Not After : Jun 7 04:39:18 2019 GMT
        Subject: C = US, ST = test, L = Default City, O = Default Company Ltd, CN = api.xxia-test.qe.devcluster.openshift.com

Marking this VERIFIED on 4.1.0-0.nightly/release/4.1.0-0.nightly-2019-06-05-223716. New certificate is served from disk without new rollout.
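The expiration check above can also be done without reading the full `-text` dump. The following sketch uses `openssl x509 -enddate` and `-checkend`; the helper names `cert_not_after` and `cert_valid_for` are hypothetical and only illustrate the idea.

```shell
# Hypothetical helper: print the notAfter date of the PEM certificate on stdin.
cert_not_after() {
    openssl x509 -noout -enddate 2>/dev/null | sed 's/^notAfter=//'
}

# Hypothetical helper: succeed if the PEM certificate on stdin will still be
# valid $1 seconds from now (`-checkend` exits non-zero otherwise).
cert_valid_for() {
    openssl x509 -noout -checkend "$1" >/dev/null 2>&1
}

# Example usage against the served certificate (not run here):
#   echo | openssl s_client \
#       -servername api.xxia-test.qe.devcluster.openshift.com \
#       -connect api.xxia-0606.qe.devcluster.openshift.com:6443 2>/dev/null \
#       | cert_not_after
```

Because both test certificates in this bug were issued with `-days 1`, comparing the `Not After` value (or the serial number) is what distinguishes the reissued certificate from the original.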
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2019:1382