Description of problem:
A customized component route whose cert has no SAN does not mark Upgradeable as False to remind the user before upgrading to 4.10; see bug 2037274 for background.

Background:
When testing bug 2037274, we need to cover the scenario in which a 4.9 env has a customized component route with a cert that has no SAN. I tried the oauth-openshift route https://docs.openshift.com/container-platform/4.9/authentication/configuring-internal-oauth.html#customizing-the-oauth-server-url_configuring-internal-oauth with a cert that has no SAN. The customization is verified via web console login, but no operators are marked with Upgradeable as False to remind the user. The user should be reminded as in other scenarios, such as the one in bug 2037274#c10.
Confirmed with Dev https://coreos.slack.com/archives/CS05TR7BK/p1644401923172849?thread_ts=1644337510.355279&cid=CS05TR7BK that a separate bug is needed, so opening this tracker.

OpenShift release version:
4.9.0-0.nightly-2022-02-09-030305

How reproducible:
Always

Steps to Reproduce (in detail):
1. Prepare cert without SAN:
mkdir test_customized_oauth_cert_no_san
cd test_customized_oauth_cert_no_san
openssl genrsa -out caKey.pem 2048
openssl req -x509 -new -nodes -key caKey.pem -days 100000 -out caCert.pem -subj "/CN=xxia_test_ca"
openssl genrsa -out serverKey.pem 2048
cat > server_no_san.conf << EOF
[req]
req_extensions = v3_req
distinguished_name = req_distinguished_name
[req_distinguished_name]
[ v3_req ]
basicConstraints = CA:FALSE
keyUsage = nonRepudiation, digitalSignature, keyEncipherment
extendedKeyUsage = clientAuth, serverAuth
EOF
CUSTOM_DOMAIN=qe1.SNIPPED.com
openssl req -new -key serverKey.pem -out serverNoSAN.csr -subj "/CN=*.$CUSTOM_DOMAIN" -config server_no_san.conf
openssl x509 -req -in serverNoSAN.csr -CA caCert.pem -CAkey caKey.pem -CAcreateserial -out serverCertNoSAN.pem -days 100000 -extensions v3_req -extfile server_no_san.conf

2. Make the oauth route server use the cert:
NOTE: the customized route `auth-openshift-custom.CUSTOM_DOMAIN` needs to be resolvable. We can add an A record in route53:
1) Open https://console.aws.amazon.com/route53/home?region=us-east-2
2) In Hosted Zones, click on the item of CUSTOM_DOMAIN; it is already there, created for team use.
3) Click 'Create Record Set'
   - Name: your customized hostname, e.g. auth-openshift-custom.CUSTOM_DOMAIN
   - Type: A IPv4 Address
   - Value: the IP address where our route can be resolved; you can get it from `nslookup <default_oauth_route_hostname>`. For example:
     $ nslookup oauth-openshift.apps.YOUR_ENV_SUFFIX
     ...
     Non-authoritative answer:
     Name: oauth-openshift.apps.YOUR_ENV_SUFFIX
     Address: 18.189....
     ...
4) Save your changes

oc --namespace openshift-config create secret tls custom-auth-component --cert=serverCertNoSAN.pem --key=serverKey.pem
oc edit ingress.config cluster
...
spec:
  componentRoutes:
  - name: oauth-openshift
    namespace: openshift-authentication
    hostname: auth-openshift-custom.CUSTOM_DOMAIN # replace with above CUSTOM_DOMAIN value
    servingCertKeyPairSecret:
      name: custom-auth-component
...

This causes the oauth pods and KAS pods to be renewed. Wait a moment for the renewal to finish.

3. Log in to the console; it redirects to auth-openshift-custom.CUSTOM_DOMAIN. Enter user and password; the login succeeds, which verifies that the oauth-openshift cert and route work. But `oc get co -o yaml` does not show any operator with Upgradeable as False; all are still True.

The oauth-openshift route is one that customers like to customize. We should fix this so that Upgradeable becomes False when the route has an invalid cert without SAN. I guess other scenarios, https://docs.openshift.com/container-platform/4.9/web_console/customizing-the-web-console.html#customizing-the-console-route_customizing-web-console and https://docs.openshift.com/container-platform/4.9/security/certificates/replacing-default-ingress-certificate.html , have the same issue, but I have not tried them yet.

Actual results:
3. No operator shows Upgradeable as False

Expected results:
3. There should be operators marked with Upgradeable as False to remind the user before upgrading to 4.10.

Impact of the problem:
See https://bugzilla.redhat.com/show_bug.cgi?id=2037274#c0 for background

Additional info:
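Before creating the secret in step 2, it can be worth confirming that the cert from step 1 really lacks a SAN extension. The following is a minimal sketch only: the helper name `cert_has_san` is hypothetical (it does not come from this report), and it assumes `openssl` is available on PATH.

```shell
#!/usr/bin/env bash
# Hypothetical helper, not part of the original report.
# Succeeds (exit 0) when the PEM certificate passed as $1 carries an
# X509v3 Subject Alternative Name extension, fails otherwise.
# An unreadable or missing file also yields a failure, since openssl
# prints nothing for grep to match.
cert_has_san() {
    openssl x509 -in "$1" -noout -text 2>/dev/null \
        | grep -q 'X509v3 Subject Alternative Name'
}

# Example use against the file produced in step 1:
#   cert_has_san serverCertNoSAN.pem && echo "has SAN" || echo "no SAN"
```

For the cert generated in step 1, the helper would be expected to fail, because server_no_san.conf defines no subjectAltName in its v3_req section.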
I'm setting blocker- as this issue doesn't need to block the next z-stream release, but we may need to fix it in some 4.9.z release before 4.10.0 GA.

I don't understand why the SANless certificate works with OpenShift 4.9; neither cluster-authentication-operator nor oauth-server is setting the GODEBUG environment variable as far as I can tell using git-grep or ripgrep on their respective source repositories. Can you confirm that the same certificate works with OpenShift 4.9 and fails with OpenShift 4.10?
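Besides searching the source repositories, the running workloads could be inspected for a GODEBUG setting. This is a rough sketch under assumptions: the helper name `godebug_of` is hypothetical, it needs a logged-in `oc`, and the namespace and deployment names in the usage comment are assumptions that should be checked against the cluster.

```shell
#!/usr/bin/env bash
# Hypothetical helper, not part of the original report.
# Prints the value of the GODEBUG environment variable, if any, set on
# the containers of a deployment. $1 = namespace, $2 = deployment name.
# Prints nothing when no container defines GODEBUG.
godebug_of() {
    oc -n "$1" get deployment "$2" \
        -o jsonpath='{.spec.template.spec.containers[*].env[?(@.name=="GODEBUG")].value}'
}

# Example use (names are assumptions, verify them on the cluster):
#   godebug_of openshift-authentication oauth-openshift
```

An empty result only shows that the variable is not set in the pod template; it does not rule out other ways a process could obtain the setting.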
(In reply to Miciah Dashiel Butler Masters from comment #1)
> but we may need to fix it in some 4.9.z release before 4.10.0 GA.
Agree

> Can you confirm that the same certificate works with OpenShift 4.9 and fails with OpenShift 4.10?
Yesterday's 4.9 test showed all COs are good. Today I tested 4.10 with the same steps and got bad COs:
$ oc get co
NAME             VERSION                              AVAILABLE   PROGRESSING   DEGRADED   SINCE   MESSAGE
authentication   4.10.0-0.nightly-2022-02-09-225148   False       False         True       10m     OAuthServerRouteEndpointAccessibleControllerAvailable: Get "https://auth-openshift-custom.qe1.HIDDEN/healthz": x509: certificate relies on legacy Common Name field, use SANs instead
...
console          4.10.0-0.nightly-2022-02-09-225148   True        True          False      36m     SyncLoopRefreshProgressing: Working toward version 4.10.0-0.nightly-2022-02-09-225148, 1 replicas available
...
$ oc get po -n openshift-console
NAME                       READY   STATUS    RESTARTS      AGE
console-5886c6845d-xzbtp   1/1     Running   0             38m
console-6fc6b8884f-k5hsp   0/1     Running   3 (59s ago)   11m
console-6fc6b8884f-vvh4b   0/1     Running   3 (42s ago)   11m
...
$ oc logs -n openshift-console console-6fc6b8884f-k5hsp
... repeated same log lines ...
E0210 03:56:13.518829 1 auth.go:232] error contacting auth provider (retrying in 10s): request to OAuth issuer endpoint https://auth-openshift-custom.qe1.HIDDEN/oauth/token failed: Head "https://auth-openshift-custom.qe1.HIDDEN": x509: certificate relies on legacy Common Name field, use SANs instead

So it fails with 4.10 as expected, which further proves that this 4.9 bug needs to be fixed.
One more thing, the tested 4.10 env shows the below as well:
$ oc get ingress.config cluster -o yaml
...
status:
  componentRoutes:
  - conditions:
    - lastTransitionTime: "2022-02-10T03:44:57Z"
      message: 'unexpected error at auth-openshift-custom.HIDDEN: Get "https://auth-openshift-custom.qe1.HIDDEN/healthz": x509: certificate relies on legacy Common Name field, use SANs instead'
      reason: ErrorReachingOutToService
      status: "True"
      type: Progressing
I'm working on verification.
Verified in 4.9.0-0.nightly-2022-06-28-211928 with the original steps. After applying the non-SAN cert, the user is reminded by:
$ oc get co authentication
NAME             VERSION                             AVAILABLE   PROGRESSING   DEGRADED   SINCE   MESSAGE
authentication   4.9.0-0.nightly-2022-06-28-211928   True        False         True       11h     CustomRouteControllerDegraded: custom route configuration failed verification: [error validating secret openshift-config/custom-auth-component: [certificate relies on legacy Common Name field, use SANs instead:...
$ oc describe co authentication
...
Status:
  Conditions:
    Last Transition Time: 2022-06-29T13:53:19Z
    Message: CustomRouteControllerDegraded: custom route configuration failed verification: [error validating secret openshift-config/custom-auth-component: [certificate relies on legacy Common Name field, use SANs instead: CustomRouteControllerDegraded: sn=17889069629480321911; CustomRouteControllerDegraded: iss=CN=xxia_test_ca]] OAuthClientsControllerDegraded: no ingress for host auth-openshift-custom.qe1.SNIPPED.com in route oauth-openshift in namespace openshift-authentication
    Reason: CustomRouteController_SyncError::OAuthClientsController_SyncError
    Status: True
    Type: Degraded
    ...
    Last Transition Time: 2022-06-29T02:36:34Z
    Message: All is well
    Reason: AsExpected
    Status: True
    Type: Upgradeable
...

Based on the message above, moving to VERIFIED. After reverting the `oc edit ingress.config cluster` setting, `oc get co authentication` is back to normal.

But the PR comment says "prevent the upgrade", while the "Upgradeable" condition isn't "False". Is this expected, Pierre?
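To check the Upgradeable condition across all cluster operators at once, rather than describing them one by one, something like the following could be used. It is a sketch under assumptions: the helper name `list_not_upgradeable` is hypothetical, and it needs a logged-in `oc` plus `awk`.

```shell
#!/usr/bin/env bash
# Hypothetical helper, not part of the original report.
# Prints "<operator><TAB><status>" for every ClusterOperator whose
# Upgradeable condition is anything other than "True". Operators that
# report no Upgradeable condition at all are shown as "<unset>".
list_not_upgradeable() {
    oc get clusteroperators \
        -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.status.conditions[?(@.type=="Upgradeable")].status}{"\n"}{end}' \
        | awk -F'\t' 'NF > 0 && $2 != "True" { print $1 "\t" ($2 == "" ? "<unset>" : $2) }'
}

# Example use:
#   list_not_upgradeable
```

Empty output means every operator that reports the condition has Upgradeable set to True, which matches what the authentication operator shows in the output above.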
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (OpenShift Container Platform 4.9.41 bug fix update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2022:5434
> But from PR comment "prevent the upgrade", "Upgradeable" condition isn't "False", is this expected, Pierre?

Newly added certificates are validated by the respective operators and are outside the scope of this change IIUC. This very change should only catch already added certificates (that were added in a previous OCP version, where they were validated OK) that are detected as invalid upon upgrade, and set "NoUpgrade" in that case.