+++ This bug was initially created as a clone of Bug #1798887 +++

Description of problem:
The cluster network operator is rejecting an HTTPS readiness endpoint with a chain of trust rooted in an organizational CA configured with trustedCA.

Version-Release number of selected component (if applicable):
OCP 4.2.16

How reproducible:
Happened in test and production clusters

Steps to Reproduce:
1. Configure MITM proxy, trustedCA, and both HTTP and HTTPS readinessEndpoints (HTTP may not be needed to reproduce this)
2. View Cluster Network Operator logs

Actual results:
- lastTransitionTime: "2020-02-06T08:31:27Z"
  message: 'The configuration is invalid for proxy ''cluster'' (readinessEndpoint probe failed for endpoint ''https://www.google.com'': endpoint probe failed for endpoint ''https://www.google.com'' using proxy ''http://proxy.example.com:8080'': Get https://www.google.com: x509: certificate signed by unknown authority). Use ''oc edit proxy.config.openshift.io cluster'' to fix.'
  reason: InvalidProxyConfig
  status: "True"
  type: Degraded

Expected results:
HTTPS readiness endpoint should pass validation.

Additional info:
Confirmed chain of trust by doing an `oc rsh` to the network-operator pod, creating /tmp/ca-bundle.crt from the contents of the CM referenced by proxy/cluster trustedCA, and doing:

https_proxy=http://proxy.example.com:8080 curl https://www.google.com/ --cacert /tmp/ca-bundle.crt

Curl reported no errors.

--- Additional comment from Robert Bost on 2020-04-07 20:36:42 UTC ---

I can confirm same issue in 4.2.26

--- Additional comment from Robert Bost on 2020-04-07 23:15:03 UTC ---

The issue is here:
https://github.com/openshift/cluster-network-operator/blob/d69bd9eff18d142e33bfd380273edf386c30f1e5/pkg/controller/proxyconfig/validation.go#L246-L252

I believe the `proxy.Scheme == schemeHTTPS` needs to be changed to `proxy.Scheme == schemeHTTPS || endpoint.Scheme == schemeHTTPS`.
A MITM proxy will send back a certificate to the network operator performing a probe even if the proxy Scheme is HTTP. The presence of TLS is based on the endpoint Scheme. Ben or Daneyon, does this seem like the right fix?
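
To make the proposed condition concrete, here is a minimal standalone Go sketch. It is an illustration under assumptions, not the operator's code: `needsTrustedCA` and `mustParse` are hypothetical helper names, and `schemeHTTPS` is assumed to be the string "https". It only shows the decision of when the trustedCA bundle would need to be consulted; it does not perform a probe.

```go
package main

import (
	"fmt"
	"net/url"
)

// schemeHTTPS is assumed to match the constant of the same name
// referenced in validation.go.
const schemeHTTPS = "https"

// needsTrustedCA is a hypothetical helper (not part of
// cluster-network-operator). It reports whether a readiness probe may
// have to verify a certificate signed by the user-supplied CA: either
// the proxy itself is reached over TLS, or the endpoint is HTTPS and a
// MITM proxy may re-sign the endpoint's certificate.
func needsTrustedCA(proxyURL, endpointURL *url.URL) bool {
	return proxyURL.Scheme == schemeHTTPS || endpointURL.Scheme == schemeHTTPS
}

// mustParse is a small hypothetical helper that panics on a malformed URL.
func mustParse(raw string) *url.URL {
	u, err := url.Parse(raw)
	if err != nil {
		panic(err)
	}
	return u
}

func main() {
	// The combination from this report: plain-HTTP proxy, HTTPS endpoint.
	proxy := mustParse("http://proxy.example.com:8080")

	fmt.Println(needsTrustedCA(proxy, mustParse("https://www.google.com"))) // true
	fmt.Println(needsTrustedCA(proxy, mustParse("http://www.google.com")))  // false
}
```

With the original check (`proxy.Scheme == schemeHTTPS` alone), the first case would evaluate to false, which matches the "certificate signed by unknown authority" failure seen in the report.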
I pushed a fix for https://bugzilla.redhat.com/show_bug.cgi?id=1798887 and will cherry-pick to this PR when merged.
> We may need this backported to 4.3.z and 4.2.z as well?

Aniket, yes.
4.2 goes out of support when 4.5 is released, so we shouldn't need to backport to 4.2: https://access.redhat.com/support/policy/updates/openshift#dates
Tagged UpcomingSprint as multiple CI jobs failed after the PR of the dependent bug was tagged /lgtm.
*** Bug 1855356 has been marked as a duplicate of this bug. ***
I’m adding UpcomingSprint because I was occupied by fixing bugs with higher priority/severity, developing new features with higher priority, or developing new features to improve stability at a macro level. I will revisit this bug next sprint.
This is fixed in 4.6.