Description of problem:
While upgrading a cluster from 4.8.24 to 4.9.12, the upgrade stalls when the authentication clusteroperator goes to Degraded. The error is:

```
RouterCertsDegraded: secret/v4-0-config-system-router-certs.spec.data[apps.dev-alln-01.cae.cisco.com] -n openshift-authentication: certificate could not validate route hostname v4-0-config-system-custom-router-certs.apps.dev-alln-01.cae.cisco.com:
```

Version-Release number of selected component (if applicable):
4.9.12

How reproducible:
Reproducible

Steps to Reproduce:
1. Configure a custom ingress certificate without a wildcard on a 4.8.24 cluster.
2. Upgrade to 4.9.

Actual results:
The authentication clusteroperator goes to the Degraded state.

Expected results:
Upgrade completes successfully.

Additional info:
The PR that introduced these changes is:
https://github.com/openshift/cluster-authentication-operator/pull/430/files#diff-0d623dfd885adb20f991bda4c2453aebd732ca6dbb4d1d4be6e79805c3b48de6R311

See RH Support Case 03124512 for the must-gather.
OK, I might have found it:
https://github.com/openshift/cluster-authentication-operator/commit/7c29d664bd571ce5f8e99456a206584651d200a7#diff-0d623dfd885ad[…]e79805c3b48de6R311

In pkg/operator/starter.go, "v4-0-config-system-custom-router-certs" is passed as the last argument:
https://github.com/openshift/cluster-authentication-operator/commit/7c29d664bd571ce5f8e99456a206584651d200a7#diff-efa5ab900a24c[…]ffcf43ad19a67f0R59

but the constructor expects it as the second-to-last argument. As a result, the route name is used as the custom secret name and vice versa. The call currently passes:

    "v4-0-config-system-router-certs", "oauth-openshift", "v4-0-config-system-custom-router-certs",

It should be:

    "v4-0-config-system-router-certs", "v4-0-config-system-custom-router-certs", "oauth-openshift",
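To make the swap above concrete, here is a minimal Go sketch. It is not the operator's code: the `controller` type, `newPositional`, `Names`, and `newNamed` are hypothetical names made up for illustration, and only the three string values come from this bug. It shows why three adjacent `string` parameters compile fine when swapped, and one possible way (named fields) to make a wrong order easier to spot in review.

```go
package main

import "fmt"

// controller holds only the three names discussed in this bug.
// Hypothetical type, not the operator's real struct.
type controller struct {
	secretName       string
	customSecretName string
	routeName        string
}

// newPositional mimics a constructor with three adjacent string parameters.
// The compiler cannot tell the arguments apart, so a swapped call still builds.
func newPositional(secretName, customSecretName, routeName string) controller {
	return controller{
		secretName:       secretName,
		customSecretName: customSecretName,
		routeName:        routeName,
	}
}

// Names is a hypothetical options struct: every value is labeled at the call site.
type Names struct {
	SecretName       string
	CustomSecretName string
	RouteName        string
}

// newNamed builds the same controller from labeled fields.
func newNamed(n Names) controller {
	return controller{
		secretName:       n.SecretName,
		customSecretName: n.CustomSecretName,
		routeName:        n.RouteName,
	}
}

func main() {
	// Argument order as described in the comment above: route name and
	// custom secret name are swapped.
	swapped := newPositional(
		"v4-0-config-system-router-certs",
		"oauth-openshift",
		"v4-0-config-system-custom-router-certs",
	)

	// Same values, but each one is tied to its field name.
	fixed := newNamed(Names{
		SecretName:       "v4-0-config-system-router-certs",
		CustomSecretName: "v4-0-config-system-custom-router-certs",
		RouteName:        "oauth-openshift",
	})

	fmt.Println("swapped route name:", swapped.routeName)
	fmt.Println("fixed route name:  ", fixed.routeName)
}
```

With the swapped call, the route name becomes "v4-0-config-system-custom-router-certs", which matches the hostname prefix in the RouterCertsDegraded message. Whether an options struct is the right prevention for the real constructor is a judgment call for the maintainers; distinct string types per argument would be another option.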
Yes, I was thinking the same.
@ancollin mentioned that he has a workaround. Set to severity medium. Pull request created with solution, but still looking for a good way to prevent it from happening again. https://github.com/openshift/cluster-authentication-operator/pull/533
Verified on upgrade from ocp-release:4.8.24-x86_64 to 4.9.0-0.nightly-2022-02-02-193336.

Steps:
1. Created and applied a custom certificate with Issuer: C = US, ST = NY, O = Local Developement, L = Local Developement, CN = oauth-openshift.apps.<cluster-name>.openshift.com, subjectAltName = DNS:oauth-openshift.apps.<cluster-name>.openshift.com, OU = Local Developement
2. Upgraded cluster from ocp-release:4.8.24-x86_64 to 4.9.0-0.nightly-2022-02-02-193336

Actual Results:
Upgrade completes successfully.

$ oc get co authentication
authentication 4.9.0-0.nightly-2022-02-02-193336 True False False 6h44m

Expected Results:
Upgrade completes successfully.
Just for the sake of documentation: the "workaround" I had here was to add the "v4-0.*" route as a Subject Alternative Name (SAN) on the certificate we were using for the ingress router. In this case, we already had all of the platform routes added as SANs, since we route application routes through different IngressControllers.
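For anyone wanting to see why the extra SAN helps, below is a small Go sketch. It assumes the failing check behaves like Go's standard `(*x509.Certificate).VerifyHostname`; I have not confirmed that this is exactly what the operator calls. The `selfSigned` and `covers` helpers and the `apps.example.com` domain are made up for illustration.

```go
package main

import (
	"crypto/ecdsa"
	"crypto/elliptic"
	"crypto/rand"
	"crypto/x509"
	"crypto/x509/pkix"
	"fmt"
	"math/big"
	"time"
)

// selfSigned returns a throwaway self-signed certificate with the given DNS SANs.
// Hypothetical helper, for demonstration only.
func selfSigned(dnsNames ...string) (*x509.Certificate, error) {
	key, err := ecdsa.GenerateKey(elliptic.P256(), rand.Reader)
	if err != nil {
		return nil, err
	}
	tmpl := &x509.Certificate{
		SerialNumber: big.NewInt(1),
		Subject:      pkix.Name{CommonName: "example"},
		NotBefore:    time.Now().Add(-time.Hour),
		NotAfter:     time.Now().Add(time.Hour),
		DNSNames:     dnsNames,
	}
	// Template and parent are the same, so the certificate signs itself.
	der, err := x509.CreateCertificate(rand.Reader, tmpl, tmpl, &key.PublicKey, key)
	if err != nil {
		return nil, err
	}
	return x509.ParseCertificate(der)
}

// covers reports whether the certificate's SANs match the host name.
func covers(cert *x509.Certificate, host string) bool {
	return cert.VerifyHostname(host) == nil
}

func main() {
	oauth := "oauth-openshift.apps.example.com"
	swappedRoute := "v4-0-config-system-custom-router-certs.apps.example.com"

	// Certificate without a wildcard, valid only for the oauth route.
	before, err := selfSigned(oauth)
	if err != nil {
		panic(err)
	}
	// Same certificate with the "v4-0..." host added as a SAN (the workaround).
	after, err := selfSigned(oauth, swappedRoute)
	if err != nil {
		panic(err)
	}

	fmt.Println("before, oauth route:  ", covers(before, oauth))
	fmt.Println("before, swapped route:", covers(before, swappedRoute))
	fmt.Println("after, swapped route: ", covers(after, swappedRoute))
}
```

A wildcard certificate for `*.apps.example.com` would also pass for both hosts, which fits the reproduction step that says the problem shows up with a custom ingress certificate without a wildcard.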
Hi, this could be helpful: installation on 4.8 doesn't give a problem with a bare certificate, but on 4.9 it does. This fixed it: https://access.redhat.com/solutions/4542531
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Moderate: OpenShift Container Platform 4.10.3 security update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2022:0056
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 120 days