Created attachment 1541146 [details]
Listings

Description of problem:

kube-apiserver entered failing state after applying change to apiserver/cluster resources

..snippet of apiserver/cluster..
spec:
  servingCerts:
    namedCertificates:
    - names:
      - api.int-3.online-starter.openshift.com
      servingCertificate:
        name: api-certs

I created secrets/api-certs in openshift-config. The certificate is used for both the router and the master and is signed for two wildcards:
'DNS:*.apps.int-3.online-starter.openshift.com,DNS:*.int-3.online-starter.openshift.com'.

Version-Release number of selected component (if applicable):
version 4.0.0-0.alpha-2019-03-04-160136

Expected results:
kube-apiserver should not be failing and api.<cluster> should start serving using this certificate.

Additional info:
See listings in attachments.
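For reference, a wildcard SAN covers exactly one label level, so `*.int-3.online-starter.openshift.com` does cover `api.int-3.online-starter.openshift.com`; the failure here is about trust, not a name mismatch. A minimal sketch of the matching rule (illustrative only, not the apiserver's actual implementation):

```python
def wildcard_match(pattern: str, hostname: str) -> bool:
    """Return True if a DNS SAN pattern covers hostname.
    A '*' matches exactly one leftmost label (RFC 6125 style)."""
    p = pattern.lower().split(".")
    h = hostname.lower().split(".")
    if len(p) != len(h):
        return False
    if p[0] == "*":
        return p[1:] == h[1:]
    return p == h

# SANs from the certificate described above
sans = [
    "*.apps.int-3.online-starter.openshift.com",
    "*.int-3.online-starter.openshift.com",
]
print(any(wildcard_match(s, "api.int-3.online-starter.openshift.com") for s in sans))  # True
```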
Can you attach the logs of the kube-apiserver-operator and the relevant installer (installer-61-ip-10-0-163-238.us-east-2.compute.internal in this case)? The former spawns the latter, and the latter copies the custom cert (or it doesn't, for some reason). Also, all kube-apiserver-operator events would be helpful.
* The user is specifying a custom certificate to be presented by the apiserver when accessed via the load balancer.
* The cluster's kubelets access the apiserver via the load balancer address.
* The cluster's kubelets do not have the needed CA certificates to trust the certificate presented by the apiserver.

Still investigating how the user might configure the trusted CA certs used by the kubelets.
We need to be able to configure trusted CAs for Kubelets.
The kube-apiserver can be successfully configured to serve a customer-managed certificate. In this case, we have verified that the customer-managed certificate is indeed being served. Unfortunately, there is currently no way to configure the kubelets to trust customer-managed certificates. As a result, the cluster's nodes go into 'NotReady' state as their kubelets lose the ability to communicate with the apiserver. We need to be able to configure trusted CAs for kubelets.
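For context, the kubelet's trust anchor lives in its kubeconfig. A hypothetical sketch (all names, hosts, and values are placeholders, not from this cluster) of where the trusted CA bundle sits:

```yaml
# Sketch of a kubelet kubeconfig; all values are placeholders.
apiVersion: v1
kind: Config
clusters:
- name: local
  cluster:
    server: https://api.example-cluster.example.com:6443
    # This CA bundle is what the kubelet trusts when verifying the
    # apiserver's serving certificate. A customer-managed cert signed
    # by an external CA is rejected unless that CA is in this bundle.
    certificate-authority-data: <base64-encoded CA bundle>
```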
@sjenning more detailed analysis in comment 7
Ok, this bug requires changes in several areas:

1) The installer needs to create an internal name for the API endpoint (api-int.$clustername.$basedomain) so we can use the internal CA the kubelets use to connect to the API server, which will use SNI to select the right cert.
2) The KAS needs to have two apiserver serving certs: one for the external api name, which is potentially signed by the customer CA, and one for the internal api name, which will be signed by the long-lived self-signed CA the kubelet currently uses.
3) The kubelet bootstrap kubeconfig generated by the MCS needs to reference the internal api name.
4) The installer UPI documentation needs to document the internal-name DNS requirement.
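The SNI-based selection in item 1 boils down to a name-to-certificate dispatch. A toy sketch (hypothetical hostnames and file names, not the kube-apiserver's actual code):

```python
# Hypothetical cert paths; the real apiserver loads these from secrets.
CERTS = {
    "api.mycluster.example.com": "customer-signed.pem",
    "api-int.mycluster.example.com": "internal-kube-ca-signed.pem",
}

def select_cert(sni_hostname: str) -> str:
    """Pick a serving cert by the TLS SNI hostname, falling back to the
    internally signed cert for unknown names."""
    return CERTS.get(sni_hostname, "internal-kube-ca-signed.pem")

print(select_cert("api.mycluster.example.com"))      # customer-signed.pem
print(select_cert("api-int.mycluster.example.com"))  # internal-kube-ca-signed.pem
```

Because the two names resolve to different certs, the customer CA can sign the external one while kube-ca keeps signing the internal one that kubelets already trust.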
The installer builds the apiserver URL from the baseDomain provided in the install config:
https://github.com/openshift/installer/blob/91ba0f3fde6e36f06ba3609bee5bddf4ab0f5695/pkg/asset/manifests/utils.go#L36-L37

The "cluster" Infrastructure resource is created with apiServerURL in the Status:
https://github.com/openshift/installer/blob/91ba0f3fde6e36f06ba3609bee5bddf4ab0f5695/pkg/asset/manifests/infrastructure.go#L49-L49
https://github.com/openshift/api/blob/master/config/v1/types_infrastructure.go#L55-L58

The --apiserver-url that is fed into the bootstrap MCS -> ignition config -> /etc/kubernetes/kubeconfig -> kubelet bootstrap -> /var/lib/kubelet/kubeconfig comes from there. Thus, in order to change the apiserver URL the kubelets use, we need to change the installer code in the first link: s/api/api-int/.
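The required installer change amounts to swapping the host label when building the kubelet-facing URL. A rough sketch in Python (the real code is Go in pkg/asset/manifests/utils.go; cluster name, domain, and the 6443 port here are illustrative assumptions):

```python
def api_server_url(cluster_name: str, base_domain: str, internal: bool = False) -> str:
    """Build an apiserver URL in the installer's api.<cluster>.<domain>
    shape, with an optional internal (api-int) variant for kubelets."""
    label = "api-int" if internal else "api"
    return "https://{}.{}.{}:6443".format(label, cluster_name, base_domain)

print(api_server_url("mycluster", "example.com"))                 # external clients
print(api_server_url("mycluster", "example.com", internal=True))  # kubelet-facing
```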
And the installer needs to produce a *separate* serving cert signed for that name for the kube-apiserver.
Why couldn't the installer create a single cert good for both api. and api-int.? If a customer provided their own cert for api., it would be up to the apiserver to use that one instead of the one provided by the installer. Is the apiserver not capable of that?
The single cert would have to be signed by the customer CA if so configured. Then we end up with the same issue we have now: the kubelet doesn't trust the customer CA. The advantage of two certs for two different names is that they can have different signing CAs; the customer CA can sign the external one and kube-ca can remain the CA for the internal one. Unfortunately, I think David is saying that kube-ca can't sign for this new internal cert, for some reason that is unclear to me.
My suggestion would be that WE would ship with 1 cert good for "api" and "api-int". But a customer could add a second cert, which would be used instead of our single cert, for the "api" name.
(In reply to Justin Pierce from comment #0)
> Created attachment 1541146 [details]
> Listings
>
> Description of problem:
>
> kube-apiserver entered failing state after applying change to
> apiserver/cluster resources
>
> ..snippet of apiserver/cluster..
> spec:
>   servingCerts:
>     namedCertificates:
>     - names:
>       - api.int-3.online-starter.openshift.com
>       servingCertificate:
>         name: api-certs
>
> I created secrets/api-certs in openshift-config. The certificate is used for
> both the router and the master and signed for two wildcards:
> 'DNS:*.apps.int-3.online-starter.openshift.com,DNS:*.int-3.online-starter.openshift.com'.

I get that users will need to update the certificate used by the router, as that serves applications. Do you have a reason why you want to update the api.cluster_domain serving certificate? It is not user-facing in terms of applications and is only used by k8s clients, which already require the certificate-authority to trust the API server. _Just looking for more information on the motivation._

Previously, 3.x was served through the api endpoint, but that has also moved to its own route.

> Version-Release number of selected component (if applicable):
> version 4.0.0-0.alpha-2019-03-04-160136
>
> Expected results:
> kube-apiserver should not be failing and api.<cluster> should start serving
> using this certificate.
>
> Additional info:
> See listings in attachments.
(In reply to Abhinav Dahiya from comment #15)
> [...]
> previously 3.x was served through api endpoint, but that has also moved to
> its own route.

EDIT: previously in 3.x the console was served through the api endpoint, but that has also moved to its own route.
The oc client is not the only thing that talks to the apiserver. When you go to the console, the console gives you the HTML, JavaScript, and CSS, but the actual data shown comes from your browser talking directly to the apiserver. Customers write their own code that talks to the API server. Think of every single customer who uses their own Jenkins instance to deploy containers to OpenShift; the code they write may not be "oc". Lots of things other than oc talk to the apiserver, and those things expect that apiserver to have a real, trusted certificate. For example, you shouldn't have to click through certificate exceptions just to get the console to run. You should be able to sign a certificate, using either a public CA or your corp CA, and have the API server serve a certificate that is trusted outside of the cluster.
tracking doc https://docs.google.com/document/d/1kp4VYwbif7ppSJoj-xl8VhD-FiYVu_44kcylZGj2QVI/edit
Sending to Master. While the kubelet is affected by this, there is no change needed to the kubelet or anything managed by the Pod team. I've been mostly a middleman on this bug. Transferring it to David, who is doing almost all the work.
I think all the master team items are completed. The remaining item is updating the kubelet configuration in the installer, which I think the pod team owns.
should be the last part for this: https://github.com/openshift/installer/pull/1633
Tests are green on the PR. Just waiting for beta4 to be cut before merging this.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2019:0758