Description of problem:
Console cannot connect to the OAuth URL after a wildcard certificate issued by a custom PKI is applied to the ingress router. Users are not able to log in to the web console.

Version-Release number of selected component (if applicable):
$ oc get clusterversion
NAME      VERSION      AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.1.0-rc.4   True        False         19h     Cluster version is 4.1.0-rc.4

$ oc adm release info
console                    sha256:5397a1d5c54fef88344a0ec105d117aae69114bb61e4f3f6e4421c71c1795208
console-operator           sha256:81abea24d3bbde997aef6e786c89003b15a27b5d33f6a26a6056a663706b7f7a
cluster-ingress-operator   sha256:3156518d2677e69341a0d0dc745c56a2a86c85065ab907d45e796aa020e534af

How reproducible:
Always

Steps to Reproduce (per the documentation: https://docs.openshift.com/container-platform/4.1/networking/ingress/configuring-default-certificate.html):

1. Issue a wildcard certificate using your own (custom) PKI infrastructure.

2. Create a secret from the issued certificate/private key (put the certificate + CA into tls.crt):
oc --namespace openshift-ingress create secret tls custom-ingress-certs --cert=tls.crt --key=tls.key

3. Patch the ingress CR:
oc patch --type=merge --namespace openshift-ingress-operator ingresscontrollers/default \
  --patch '{"spec":{"defaultCertificate":{"name":"custom-ingress-certs"}}}'

4. Ensure that the certificate is applied:
echo | openssl s_client -showcerts -servername console-openshift-console.apps.vadim-01-ocp4.myinternalsite.com -connect console-openshift-console.apps.vadim-01-ocp4.myinternalsite.com:443 | head -n 5
depth=1 C = US, ST = TX, L = Dallas, O = Myinternal, OU = Site, CN = mypki.myinternalsite.com
verify error:num=20:unable to get local issuer certificate
CONNECTED(00000005)
---
Certificate chain
 0 s:/CN=*.apps.vadim-01-ocp4.myinternalsite.com
   i:/C=US/ST=TX/L=Dallas/O=Myinternal/OU=Site/CN=mypki.myinternalsite.com

Actual results:
Try to log in to the web console.
You'll be redirected to the OAuth URL (oauth-openshift.apps.vadim-01-ocp4.myinternalsite.com); enter your credentials there, and you will see the web console with an error like "Oops. Something went wrong" and be redirected back to the OAuth URL. There are multiple errors in the console pods:

2019/05/21 15:33:31 auth: failed to get latest auth source data: request to OAuth issuer endpoint https://oauth-openshift.apps.vadim-01-ocp4.myinternalsite.com/oauth/token failed: Head https://oauth-openshift.apps.vadim-01-ocp4.myinternalsite.com: x509: certificate signed by unknown authority

Expected results:
Users are able to log in to the console.

Additional info:
The console process doesn't trust the custom root CA. Once we apply a custom wildcard certificate using our own PKI, we should provide our CA to pods/processes so they can communicate with all exposed routes. To do this in OCP 4 you need to add the custom root CA to the configmap signing-cabundle in namespace openshift-service-ca; the service-ca operator will then populate it. I did this on my cluster:

1. My root CA was added to cm signing-cabundle in namespace openshift-service-ca:
oc get cm signing-cabundle -n openshift-service-ca -o yaml (ensure that my root CA is here)

2. It was populated into the openshift-console namespace, cm service-ca:
oc get cm service-ca -n openshift-console -o yaml (ensure that this CM has the same content as signing-cabundle)

3. Log in to the console pod:
oc rsh console-5f87f4cbcf-dr6rd
sh-4.2$ ps ax
PID TTY      STAT   TIME COMMAND
  1 ?        Ssl    0:08 /opt/bridge/bin/bridge --public-dir=/opt/bridge/static --config=/var/console-config/console-config.yaml --service-ca-file=/var/service-ca/service-ca.crt

4. The pod has the service-ca configmap mounted, with my root CA:
sh-4.2$ cat /var/service-ca/service-ca.crt (ensure that the content of this file is the same as the configmap)

5.
Check the console-config.yaml file - it has a different CA (serviceaccount-ca) to validate the OAuth route; see the oauthEndpointCAFile parameter:

sh-4.2$ cat /var/console-config/console-config.yaml
apiVersion: console.openshift.io/v1
auth:
  clientID: console
  clientSecretFile: /var/oauth-config/clientSecret
  logoutRedirect: ""
  oauthEndpointCAFile: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
clusterInfo:
  consoleBaseAddress: https://console-openshift-console.apps.vadim-01-ocp4.myinternalsite.com
  consoleBasePath: ""
  masterPublicURL: https://api.vadim-01-ocp4.myinternalsite.com:6443
customization:
  branding: ocp
  documentationBaseURL: https://docs.openshift.com/container-platform/4.1/
kind: ConsoleConfig
servingInfo:
  bindAddress: https://0.0.0.0:8443
  certFile: /var/serving-cert/tls.crt
  keyFile: /var/serving-cert/tls.key

Looks like the console pod uses the serviceaccount-ca bundle, which does not include the custom root CA.

6. Ensure that we can connect from the console pod using the custom CA bundle configured (by the openshift service-ca operator) via --service-ca-file=/var/service-ca/service-ca.crt:
sh-4.2$ curl --cacert /var/service-ca/service-ca.crt https://oauth-openshift.apps.vadim-01-ocp4.myinternalsite.com/oauth/token
{"error":"unsupported_grant_type","error_description":"The authorization grant type is not supported by the authorization server."}

7. Ensure that we cannot connect from the console pod using the CA bundle provided by the oauthEndpointCAFile parameter:
sh-4.2$ curl --cacert /var/run/secrets/kubernetes.io/serviceaccount/ca.crt https://oauth-openshift.apps.vadim-01-ocp4.myinternalsite.com/oauth/token
curl: (60) Peer's Certificate issuer is not recognized.
More details here: http://curl.haxx.se/docs/sslcerts.html
curl performs SSL certificate verification by default, using a "bundle" of Certificate Authority (CA) public keys (CA certs). If the default bundle file isn't adequate, you can specify an alternate file using the --cacert option.
If this HTTPS server uses a certificate signed by a CA represented in the bundle, the certificate verification probably failed due to a problem with the certificate (it might be expired, or the name might not match the domain name in the URL). If you'd like to turn off curl's verification of the certificate, use the -k (or --insecure) option.
sh-4.2$

I don't know how to customize the serviceaccount-ca bundle (/var/run/secrets/kubernetes.io/serviceaccount/ca.crt) - it is created from the kube-api CA certs and the router-ca certs, but the openshift-ingress operator creates router-ca only when the default wildcard certificate is used. So I didn't find a way to add a custom root CA into the kube-api CA certs (and I'm not sure it would make sense).
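The curl behavior above can be reproduced locally with generic openssl commands. This is a hedged sketch, not the cluster commands: all file names and CN values below are made up. It shows that a certificate verifies only against a bundle that contains its issuing root CA, which is exactly why the serviceaccount-ca bundle fails for a custom-PKI certificate.

```shell
# Custom root CA, standing in for the organization's PKI.
openssl req -x509 -newkey rsa:2048 -nodes -keyout ca.key -out ca.crt \
  -days 1 -subj "/CN=mypki.example.com" 2>/dev/null

# An unrelated CA, standing in for a trust bundle that lacks the custom root
# (like the serviceaccount-ca bundle in the report above).
openssl req -x509 -newkey rsa:2048 -nodes -keyout other.key -out other-ca.crt \
  -days 1 -subj "/CN=other.example.com" 2>/dev/null

# Wildcard server certificate signed by the custom CA.
openssl req -newkey rsa:2048 -nodes -keyout tls.key -out tls.csr \
  -subj "/CN=*.apps.example.com" 2>/dev/null
openssl x509 -req -in tls.csr -CA ca.crt -CAkey ca.key -CAcreateserial \
  -out tls.crt -days 1 2>/dev/null

# Verification succeeds only with the bundle that contains the issuing root:
openssl verify -CAfile ca.crt tls.crt
# Fails with "unable to get local issuer certificate":
openssl verify -CAfile other-ca.crt tls.crt || true
```

The same distinction plays out in the report: curl with /var/service-ca/service-ca.crt (contains the root) succeeds, while curl with the serviceaccount ca.crt (does not) fails.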
We include system roots in addition to the serviceaccount ca bundle. Is this a valid certificate?
Note that this is specifically called out as a prerequisite in the docs: > You must have a certificate/key pair in PEM-encoded files, where the certificate is signed by a trusted certificate authority and valid for the Ingress domain.
Yes, I noted this. Does it mean we cannot use our own PKI for wildcard certificates? What do you mean by "system roots"? CA certs from kube-api? The certificate is not valid against those if we use certs from our own PKI.
Sorry, by system roots you mean the root CAs from the node OS, right? They are not valid if we use our own PKI. In OCP 3, for RHEL nodes, there is a way to add an additional CA (easy to do in RHEL). But for OCP 4 with CoreOS it requires tweaking the machine config, which we should avoid doing. And I think that was the reason to create the openshift service-ca operator (to manage root CAs for pods).
The workaround also worked in my tests. Just a note about it: I needed to restart all the console pods (oc delete pod --all -n openshift-console), but it is possible that I only needed to do that because I was a bit impatient and did not wait long enough.
*** Bug 1723445 has been marked as a duplicate of this bug. ***
Also confirmed the workaround, and I did NOT need to delete any pods.
The workaround suggested in comment 7 is not suitable. The sa-token-secret/ca.crt is driven by two CA bundles (https://github.com/openshift/cluster-kube-controller-manager-operator/blob/master/pkg/operator/targetconfigcontroller/targetconfigcontroller.go#L301-L315). If you're making a change to the router's CA, you need to make your change to `oc get cm/router-ca -n openshift-config-managed`. The ingress operator should manage this.
Hello, if the ingress operator should manage this, then this bug should be moved to it. However, there is explicit code in the ingress operator that intentionally does not generate that secret, so I think the ingress operator team should confirm whether this is intentional behavior. If it is intentional, then the error has been to assume that router-ca is always present when it is not, creating the situation where custom CAs are not properly propagated to the service account token CA. In that case, we need some alternative, such as changing the router-ca behavior in the ingress operator and/or providing a proper placeholder to add custom CAs to be appended to the ones exposed at service account mounts (as we were able to do in 3.11 with the /etc/origin/master/ca-bundle.crt file). Not sure which one would be faster/safer.
Summary: To fix this issue, the ingress operator needs to update the router-ca configmap in the openshift-config-managed project:

`oc get cm/router-ca -n openshift-config-managed -o yaml`

We would also need to add to the documentation that the CA must be included with the server cert so that the operator will add it to the router-ca configmap:
https://docs.openshift.com/container-platform/4.1/networking/ingress-operator.html#nw-ingress-setting-a-custom-default-certificate_configuring-ingress

From there, once we fix the ingress operator so that it handles updating the router-ca configmap, the kube-controller-manager-operator will then handle updating the service CA bundle:
https://github.com/openshift/cluster-kube-controller-manager-operator/blob/master/pkg/operator/targetconfigcontroller/targetconfigcontroller.go#L301-L315

Lastly, with the service CA bundle updated, I assume that other operators will handle restarting other services so that their trust is updated; for example, the web console pod needs to be restarted. Currently there is no workaround.
Following is a possible solution using the proxy API that will be introduced in 4.2:

1. Generate a CA and certificate (for testing, if you do not already have a CA and certificate):

BASE_DOMAIN="$(oc get dns.config/cluster -o 'jsonpath={.spec.baseDomain}')"
INGRESS_DOMAIN="$(oc get ingress.config/cluster -o 'jsonpath={.spec.domain}')"
openssl genrsa -out example-ca.key 2048
openssl req -x509 -new -key example-ca.key -out example-ca.crt -days 1 -subj "/C=US/ST=NC/L=Chocowinity/O=OS3/OU=Eng/CN=$BASE_DOMAIN"
openssl genrsa -out example.key 2048
openssl req -new -key example.key -out example.csr -subj "/C=US/ST=NC/L=Chocowinity/O=OS3/OU=Eng/CN=*.$INGRESS_DOMAIN"
openssl x509 -req -in example.csr -CA example-ca.crt -CAkey example-ca.key -CAcreateserial -out example.crt -days 1

2. Configure the CA as the cluster proxy CA:

oc -n openshift-config create configmap custom-ca --from-file=ca-bundle.crt=example-ca.crt
oc patch proxy/cluster --type=merge --patch='{"spec":{"trustedCA":{"name":"custom-ca"}}}'

3. Configure the certificate as the ingresscontroller's default certificate:

oc -n openshift-ingress create secret tls custom-default-cert --cert=example.crt --key=example.key
oc -n openshift-ingress-operator patch ingresscontrollers/default --type=merge --patch='{"spec":{"defaultCertificate":{"name":"custom-default-cert"}}}'

I tested the above procedure on a development cluster, but it needs further testing to validate that it is a working, supportable solution. I'd appreciate feedback from anyone who can test the above steps and verify that they work for their use-cases.
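For reference, the objects involved in step 2 look roughly like this once the configmap is created and the proxy is patched. This is a sketch only; the PEM payload is abbreviated and would be the contents of example-ca.crt from step 1.

```yaml
# Sketch of the configmap created in step 2 and the patched proxy object.
apiVersion: v1
kind: ConfigMap
metadata:
  name: custom-ca
  namespace: openshift-config
data:
  ca-bundle.crt: |
    -----BEGIN CERTIFICATE-----
    ...contents of example-ca.crt...
    -----END CERTIFICATE-----
---
apiVersion: config.openshift.io/v1
kind: Proxy
metadata:
  name: cluster
spec:
  trustedCA:
    name: custom-ca
```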
I successfully validated the https://bugzilla.redhat.com/show_bug.cgi?id=1712525#c39 workaround with one caveat. My dev cluster has the proxy enabled, so instead of:

$ oc patch proxy/cluster --type=merge --patch='{"spec":{"trustedCA":{"name":"custom-ca"}}}'

I modified the configmap user-ca-bundle referenced by proxy/cluster trustedCA to include the example-ca.crt CA cert:

$ oc edit cm/user-ca-bundle -n openshift-config

If this is an acceptable fix, the https://docs.openshift.com/container-platform/4.1/networking/ingress-operator.html#nw-ingress-setting-a-custom-default-certificate_configuring-ingress doc should be updated to include directions for adding the CA cert.
Currently, the cluster-network-operator will only publish the combined user-provided trust bundle if it's referenced by proxy/cluster [1]. This is why proxy.spec.trustedCA must be added/modified to reference a configmap containing the custom CA cert(s). This could be expanded to better support ingress by:

1. Adding a similar trustedCA field to the ingress API that references the configmap used to supply the custom CA bundle.
2. Updating [1] to check ingress.trustedCA.

[1] https://github.com/openshift/cluster-network-operator/blob/master/pkg/controller/proxyconfig/controller.go#L215-L220

Preferably, the workaround can be documented and we can take time to design a solution for managing cluster-wide custom certs.
Reassigning to docs so that the solutions in #39 and #40 get documented (perhaps as part of ingress custom certificate docs cross-referencing the proxy docs for further background).
I marked comment #40 as public as there is nothing sensitive and goes a bit further toward a resolution.
If there's no workaround in the documentation, at least a warning indicating that not all custom certificates are valid should be included, to avoid more people breaking their clusters.

The current text in 4.2 may lead users to understand that they can use their corporate CAs, which is wrong: "Replacing the default wildcard certificate with one that is issued by a public or organizational CA will allow external clients to connect securely to applications running under the .apps sub-domain."

So, either the workaround is documented, or a warning not to use custom CAs to sign custom certificates for ingress is documented. But the documentation as it is today will lead to lots of users breaking their clusters and opening support cases.
(In reply to Sergio G. from comment #47)
> If there's no workaround in the documentation, at least a warning indicating
> that not all custom certificates are valid should be included to avoid more
> people breaking their clusters.
>
> Current text in 4.2 may lead users to understand that they can use their
> corporate CAs, which is wrong:
> "Replacing the default wildcard certificate with one that is issued by a
> public or organizational CA will allow external clients to connect securely
> to applications running under the .apps sub-domain."
>
> So, either the workaround is documented or a warning not to use custom CAs
> to sign custom certificates for ingress is documented. But the documentation
> as is today will lead to lots of users breaking their clusters and opening
> support cases.

Not sure which doc you looked at, but our official doc has a "WARNING" covering this[1].

[1]: https://access.redhat.com/documentation/en-us/openshift_container_platform/4.2/html/networking/configuring-ingress#nw-ingress-setting-a-custom-default-certificate_configuring-ingress
I just hit this problem on OCP v4.2, and I was able to resolve it by adding my internal CA to cm 'trusted-ca-bundle' in project "openshift-config-managed":

$ oc edit cm trusted-ca-bundle -n openshift-config-managed
(In reply to Jeff Li from comment #49)
> I just hit this problem on ocp v4.2, and I am able to resolve it by adding
> my internal CA to cm 'trusted-ca-bundle' in project
> "openshift-config-managed".
>
> $ oc edit cm trusted-ca-bundle -n openshift-config-managed

FYI, quoting Engineering:

~~~
you shouldn't be touching things in "openshift-config-managed"; those
resources are managed by the platform and your changes will be overwritten.

if you want to add content to that configmap, you need to add your CAs to
the user configmap referenced by your proxy configuration object, as
discussed here:
https://docs.openshift.com/container-platform/4.2/networking/enable-cluster-wide-proxy.html

Those CAs will then be added to the
openshift-config-managed/trusted-ca-bundle configmap by a controller.
~~~

Regards.
Hi, I have found an issue with the workaround in Comment#39. While it works for the console, Grafana and Prometheus don't trust the proxy CA bundle (they still rely on the service account CA), so you cannot access the Grafana dashboard. I have filed another bugzilla to get this fixed: https://bugzilla.redhat.com/show_bug.cgi?id=1768977
Should this bug be marked a duplicate of bug 1764704?
I don't think so. This bug is older. I understand this bug is now scoped to documenting the current workaround in Comment#39, and the other one is to provide a more definitive solution. But somebody please correct me if needed. Thanks and regards.
(In reply to Daneyon Hansen from comment #55) > Should this bug be marked a duplicate of bug 1764704? Probably bug 1764704 is a duplicate of this bug, but I haven't done that since this was changed to a doc bug.
Made initial draft of proposed changes: https://github.com/openshift/openshift-docs/pull/18004.
*** Bug 1764704 has been marked as a duplicate of this bug. ***
Resent the pull request. Reorganized the content to cover configuring a custom PKI as a separate article, referring to it from the ingress docs: https://github.com/openshift/openshift-docs/pull/18207
the doc PR looks good, thanks.
Published: https://docs.openshift.com/container-platform/4.2/networking/ingress-operator.html#nw-ingress-setting-a-custom-default-certificate_configuring-ingress https://docs.openshift.com/container-platform/4.2/networking/configuring-a-custom-pki.html#configuring-a-custom-pki https://access.redhat.com/documentation/en-us/openshift_container_platform/4.2/html/networking/configuring-ingress#nw-ingress-setting-a-custom-default-certificate_configuring-ingress https://access.redhat.com/documentation/en-us/openshift_container_platform/4.2/html/networking/configuring-a-custom-pki
@Cody ptal at https://jira.coreos.com/browse/NE-229. We encourage users to include any intermediate certs in the tls.crt of the secret containing a custom default certificate. We believe ordering matters: putting the server certificate first, followed by any intermediate certs, in tls.crt will suffice.
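The assembly described above can be sketched with generic openssl commands. All file names and CN values here are hypothetical; this is an illustration of the leaf-then-intermediate ordering in tls.crt, not the operator's code.

```shell
# Root CA (kept out of tls.crt; clients trust it separately).
openssl req -x509 -newkey rsa:2048 -nodes -keyout root.key -out root.crt \
  -days 1 -subj "/CN=Example Root" 2>/dev/null

# Intermediate CA signed by the root (must carry basicConstraints CA:TRUE).
printf "basicConstraints=CA:TRUE\n" > ca-ext.cnf
openssl req -newkey rsa:2048 -nodes -keyout int.key -out int.csr \
  -subj "/CN=Example Intermediate" 2>/dev/null
openssl x509 -req -in int.csr -CA root.crt -CAkey root.key -CAcreateserial \
  -extfile ca-ext.cnf -out int.crt -days 1 2>/dev/null

# Server (leaf) certificate signed by the intermediate.
openssl req -newkey rsa:2048 -nodes -keyout tls.key -out server.csr \
  -subj "/CN=*.apps.example.com" 2>/dev/null
openssl x509 -req -in server.csr -CA int.crt -CAkey int.key -CAcreateserial \
  -out server.crt -days 1 2>/dev/null

# Assemble tls.crt: server certificate first, then the intermediate(s).
cat server.crt int.crt > tls.crt

# The chain is complete: the leaf verifies against the root when the
# intermediate is supplied alongside it.
openssl verify -CAfile root.crt -untrusted int.crt server.crt
```

A quick sanity check on the ordering is `openssl x509 -in tls.crt -noout -subject`, which reads only the first certificate in the file and should print the wildcard server subject, not the intermediate's.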
(In reply to Pedro Amoedo from comment #51)
> (In reply to Jeff Li from comment #49)
> > I just hit this problem on ocp v4.2, and I am able to resolve it by adding
> > my internal CA to cm 'trusted-ca-bundle' in project
> > "openshift-config-managed".
> >
> > $ oc edit cm trusted-ca-bundle -n openshift-config-managed
>
> FYI, quoting Engineering:
>
> ~~~
> you shouldn't be touching things in "openshift-config-managed", those
> resources are managed by the platform and your changes will be overwritten.
>
> if you want to add content to that configmap, you need to add your CAs to
> the user configmap referenced by your proxy configuration object as
> discussed here:
> https://docs.openshift.com/container-platform/4.2/networking/enable-cluster-wide-proxy.html
>
> Those CAs will then be added to the
> openshift-config-managed/trusted-ca-bundle configmap by a controller.
> ~~~
>
> Regards.

Hi Pedro, do you know whether the issue and the fix you propose apply to OpenShift 4.2.19? We tried to apply a private-CA-signed certificate, but the console, Prometheus, Grafana, etc. didn't work. However, I have not yet tried to apply the certificate using the procedure that you described.

Regards!
(In reply to gdeprati.ar from comment #86)
> Hi Pedro, do you know whether the issue and the fix you propose apply to
> OpenShift 4.2.19? We tried to apply a private-CA-signed certificate, but
> the console, Prometheus, Grafana, etc. didn't work. However, I have not yet
> tried to apply the certificate using the procedure that you described.
>
> Regards!

Hi, I'm afraid that version 4.2.19 does NOT contain the fix. AFAIK, for 4.2.x this is still in progress; you can check the general status here[1]. My recommendation is to upgrade to 4.3.x[2][3] if possible; that way you can get rid of this issue. All default cluster routes will work as expected in 4.3.x when using a custom ingress certificate[4] along with a cluster-wide PKI (custom CA)[5].
[1] - https://issues.redhat.com/browse/MON-884
[2] - https://docs.openshift.com/container-platform/4.3/updating/updating-cluster-between-minor.html
[3] - https://access.redhat.com/solutions/4606811
[4] - https://docs.openshift.com/container-platform/4.3/networking/ingress-operator.html#nw-ingress-setting-a-custom-default-certificate_configuring-ingress
[5] - https://docs.openshift.com/container-platform/4.3/networking/configuring-a-custom-pki.html#configuring-a-custom-pki

Best Regards.
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 120 days