OpenShift 4.10 is going to be rebased against Kubernetes 1.23. This requires using Go 1.17. However, Go 1.17 removes support for certificates that rely on the legacy Common Name (CN) field instead of Subject Alternative Names (SANs), referred to below as invalid certificates; see https://go.dev/doc/go1.17. Formally, the temporary `GODEBUG=x509ignoreCN=0` flag has been removed.

This implies that, starting with OpenShift 4.10, invalid certificates will no longer be trusted because they will fail verification.

Example: Given the following certificate:

```
Certificate:
    Data:
        ...
        Subject: CN=foo-domain.com
        X509v3 extensions:
            X509v3 Key Usage: critical
                Digital Signature, Key Encipherment
            X509v3 Extended Key Usage:
                TLS Web Server Authentication
            X509v3 Basic Constraints: critical
                CA:FALSE
```

Verification of such a certificate against the `foo-domain.com` hostname will fail with the following error in Go 1.17:

```
x509: certificate relies on legacy Common Name field, use SANs instead
```

Server certificates are verified during TLS client handshakes, so a TLS (https) client that observes an invalid certificate will reject the connection attempt.

Certificates issued internally by the cluster are not affected. However, custom certificates can be configured in various cases:

- custom serving certificates for kube-apiserver
- custom API webhooks
- custom aggregated API endpoints
- custom certificates for route endpoints
- certificates of external auth identity providers

If invalid custom certificates are configured, this will break connections to critical core parts of OpenShift and thus lead to a degraded cluster.
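The verification failure described above can be reproduced outside a cluster. The following is a minimal sketch, not code from OpenShift: `newSelfSignedCert` is a hypothetical helper that builds a throwaway self-signed certificate, once with only a CN and once with a matching DNS SAN, and then calls the standard library's `VerifyHostname`. It assumes a Go 1.17 or later toolchain.

```go
package main

import (
	"crypto/ecdsa"
	"crypto/elliptic"
	"crypto/rand"
	"crypto/x509"
	"crypto/x509/pkix"
	"fmt"
	"math/big"
	"time"
)

// newSelfSignedCert is a hypothetical helper for this sketch. It creates a
// throwaway self-signed server certificate with the given Common Name and,
// optionally, DNS SANs. With dnsNames == nil no SAN extension is emitted.
func newSelfSignedCert(commonName string, dnsNames []string) (*x509.Certificate, error) {
	key, err := ecdsa.GenerateKey(elliptic.P256(), rand.Reader)
	if err != nil {
		return nil, err
	}
	tmpl := &x509.Certificate{
		SerialNumber:          big.NewInt(1),
		Subject:               pkix.Name{CommonName: commonName},
		NotBefore:             time.Now().Add(-time.Hour),
		NotAfter:              time.Now().Add(time.Hour),
		KeyUsage:              x509.KeyUsageDigitalSignature | x509.KeyUsageKeyEncipherment,
		ExtKeyUsage:           []x509.ExtKeyUsage{x509.ExtKeyUsageServerAuth},
		BasicConstraintsValid: true, // CA:FALSE, as in the example certificate
		DNSNames:              dnsNames,
	}
	// Self-signed: the template acts as its own issuer.
	der, err := x509.CreateCertificate(rand.Reader, tmpl, tmpl, &key.PublicKey, key)
	if err != nil {
		return nil, err
	}
	return x509.ParseCertificate(der)
}

func main() {
	const host = "foo-domain.com"
	for _, sans := range [][]string{nil, {host}} {
		cert, err := newSelfSignedCert(host, sans)
		if err != nil {
			panic(err)
		}
		// VerifyHostname only checks the hostname, not the chain of trust.
		fmt.Printf("SANs=%v -> VerifyHostname error: %v\n", cert.DNSNames, cert.VerifyHostname(host))
	}
}
```

On Go 1.17 or later the CN-only certificate should be rejected with the legacy Common Name error quoted above, while the variant carrying a DNS SAN should pass the hostname check.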
An OEP has been submitted with more details about mitigations: https://github.com/openshift/enhancements/pull/980
Will work on testing this.
Sorry, this is not fixed yet. Only the initial parts are done and much more work is needed, so setting back to ASSIGNED.
Note: this bugzilla refers to changes that make sense to be merged in 4.10 and which need to be backported to 4.9. Background: It doesn't make sense to have most changes present in OpenShift 4.10 as it is already based on Go 1.17.
Have read https://github.com/openshift/enhancements/pull/980/files and understood that this is a release blocker because 4.9 must implement the related metrics and upgrade prevention. Sorry for allocating time to it late :)

Checked the related PRs and the 4.9 PRs; they intend to expose metrics when invalid non-SAN CN certs are used.

No test is needed for 4.10.

( But tried to test `Verification against the `foo-domain.com` hostname of such certificate will fail with the following error in Go 1.17` of comment 0 with the cert below, which uses CN and no SAN:

Creating a custom apiserver cert (the openssl commands below are taken from https://github.com/giantswarm/grumpy/blob/instance_migration/gen_certs.sh):

# CREATE THE PRIVATE KEY FOR OUR CUSTOM CA
openssl genrsa -out certs/ca.key 2048
# GENERATE A CA CERT WITH THE PRIVATE KEY
openssl req -new -x509 -key certs/ca.key -out certs/ca.crt -config certs/ca_config.txt
# CREATE THE PRIVATE KEY FOR OUR SERVER
openssl genrsa -out certs/apiserver.key 2048
# CREATE A CSR FROM THE CONFIGURATION FILE AND OUR PRIVATE KEY
SERVER_HOST=`oc whoami --show-server | grep -o 'api[^:]*'`
openssl req -new -key certs/apiserver.key -subj "/CN=$SERVER_HOST" -out apiserver.csr -config certs/grumpy_config.txt
# CREATE THE CERT SIGNING THE CSR WITH THE CA CREATED BEFORE
openssl x509 -req -in apiserver.csr -CA certs/ca.crt -CAkey certs/ca.key -CAcreateserial -out certs/apiserver.crt
oc create secret tls api-certs --cert=certs/apiserver.crt --key=certs/apiserver.key -n openshift-config

This apiserver.crt is a custom cert with CN and no SAN, so it will be invalid in 4.10:

$ oc version
...
Server Version: 4.10.0-0.nightly-2022-01-13-061145
Kubernetes Version: v1.23.0+50f645e

oc patch --type=merge apiserver/cluster -p "
spec:
  servingCerts:
    namedCertificates:
    - servingCertificate:
        name: api-certs
"

But found that KAS can roll out with new pods, and `oc get co` does not show anything abnormal, which is strange.

However, checked the served cert via `echo | openssl s_client -connect api...:6443`, and it is not the custom one above. This seems to mean the custom cert that uses CN and no SAN is not taking effect, i.e. it is invalid. )
temporarily reassigning to remove code from 4.10/master.
Understood that 4.10 (master) does not need it given Go 1.17 already ensures it. Closing directly.
(In reply to Xingxing Xia from comment #10)
> Have read https://github.com/openshift/enhancements/pull/980/files , got to know this is a release blocker because 4.9 must implement related metrics and upgrade prevention, sorry for late allocating time on it :)
> Checked related PRs and 4.9 PRs, got to know they intend to expose metrics when invalid non-SAN CN certs are used.
> No test is needed for 4.10.
> But tried to test `Verification against the `foo-domain.com` hostname of such certificate will fail with the following error in Go 1.17` of comment 0 with below cert that uses CN and no SAN:
> Creating a customer apiserver cert (below openssl commands refer to https://github.com/giantswarm/grumpy/blob/instance_migration/gen_certs.sh):
> ...
> openssl req -new -key certs/apiserver.key -subj "/CN=$SERVER_HOST" -out apiserver.csr -config certs/grumpy_config.txt
> ...
> But found KAS can rollout with new pods, and oc get co does not show abnormal thing, strange.

My cert above had a SAN set in grumpy_config.txt, which is why I got the strange result. Re-commenting today with the correct, verified 4.10 result for a no-SAN cert: https://bugzilla.redhat.com/show_bug.cgi?id=2052467#c2 .
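Because a SAN can be added by the openssl config file without being obvious from the command line, it can help to inspect the generated certificate before using it as a test fixture. The following is a rough sketch under stated assumptions: `sanEntries` and `demoCertPEM` are hypothetical names, and `demoCertPEM` only stands in for the contents of a real file such as certs/apiserver.crt. The check lists the parsed SAN values, which approximates but is not identical to the check Go performs internally.

```go
package main

import (
	"crypto/ecdsa"
	"crypto/elliptic"
	"crypto/rand"
	"crypto/x509"
	"crypto/x509/pkix"
	"encoding/pem"
	"errors"
	"fmt"
	"math/big"
	"time"
)

// sanEntries (hypothetical helper) returns every SAN value found in the
// first PEM CERTIFICATE block of pemData. An empty result means the
// certificate can only be matched via its Common Name.
func sanEntries(pemData []byte) ([]string, error) {
	block, _ := pem.Decode(pemData)
	if block == nil || block.Type != "CERTIFICATE" {
		return nil, errors.New("no PEM CERTIFICATE block found")
	}
	cert, err := x509.ParseCertificate(block.Bytes)
	if err != nil {
		return nil, err
	}
	var sans []string
	sans = append(sans, cert.DNSNames...)
	sans = append(sans, cert.EmailAddresses...)
	for _, ip := range cert.IPAddresses {
		sans = append(sans, ip.String())
	}
	for _, u := range cert.URIs {
		sans = append(sans, u.String())
	}
	return sans, nil
}

// demoCertPEM (hypothetical helper) generates a self-signed certificate in
// PEM form so that the sketch runs without any files on disk.
func demoCertPEM(commonName string, dnsNames []string) ([]byte, error) {
	key, err := ecdsa.GenerateKey(elliptic.P256(), rand.Reader)
	if err != nil {
		return nil, err
	}
	tmpl := &x509.Certificate{
		SerialNumber: big.NewInt(1),
		Subject:      pkix.Name{CommonName: commonName},
		NotBefore:    time.Now().Add(-time.Hour),
		NotAfter:     time.Now().Add(time.Hour),
		DNSNames:     dnsNames,
	}
	der, err := x509.CreateCertificate(rand.Reader, tmpl, tmpl, &key.PublicKey, key)
	if err != nil {
		return nil, err
	}
	return pem.EncodeToMemory(&pem.Block{Type: "CERTIFICATE", Bytes: der}), nil
}

func main() {
	// "api.example.test" is a placeholder hostname, not a real cluster endpoint.
	for _, dns := range [][]string{nil, {"api.example.test"}} {
		pemData, err := demoCertPEM("api.example.test", dns)
		if err != nil {
			panic(err)
		}
		sans, err := sanEntries(pemData)
		if err != nil {
			panic(err)
		}
		fmt.Printf("SAN count=%d values=%v\n", len(sans), sans)
	}
}
```

To check a real file, the bytes returned by `os.ReadFile` could be passed to `sanEntries` in place of the generated PEM data.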
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Moderate: OpenShift Container Platform 4.10.3 security update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2022:0056