1798887 – [4.6] readinessEndpoint not using trustedCA for trust validation

Keywords:
Status:	CLOSED ERRATA
Alias:	None
Product:	OpenShift Container Platform
Classification:	Red Hat
Component:	Networking
Sub Component:
Version:	4.2.z
Hardware:	Unspecified
OS:	Unspecified
Priority:	unspecified
Severity:	low
Target Milestone:	---
Target Release:	4.6.0
Assignee:	Daneyon Hansen
QA Contact:	huirwang
Docs Contact:
URL:
Whiteboard:
Duplicates (1):	1791948
Depends On:
Blocks:	1849154 1855356

Reported:	2020-02-06 08:52 UTC by Chet Hosey
Modified:	2023-09-07 21:44 UTC
CC List:	4 users
Fixed In Version:
Doc Type:	If docs needed, set a value
Doc Text:
Clone Of:
Clones:	1821956
Environment:
Last Closed:	2020-10-27 15:55:05 UTC
Target Upstream Version:
Embargoed:

Links
System	ID	Private	Priority	Status	Summary	Last Updated
Github	openshift cluster-network-operator pull 613	0	None	closed	Bug 1798887: Fixes Proxy ReadinessEndpoints Validation	2021-02-16 17:40:35 UTC
Red Hat Product Errata	RHBA-2020:4196	0	None	None	None	2020-10-27 15:55:31 UTC

Description Chet Hosey 2020-02-06 08:52:21 UTC

Description of problem:

The cluster network operator is rejecting an HTTPS readiness endpoint with a chain of trust rooted in an organizational CA configured with trustedCA.


Version-Release number of selected component (if applicable):

OCP 4.2.16


How reproducible:

Reproduced in both test and production clusters.


Steps to Reproduce:
1. Configure a MITM proxy, trustedCA, and both HTTP and HTTPS readinessEndpoints (HTTP may not be needed to reproduce this).
2. View the Cluster Network Operator logs.

Actual results:

    - lastTransitionTime: "2020-02-06T08:31:27Z"
      message: 'The configuration is invalid for proxy ''cluster'' (readinessEndpoint
        probe failed for endpoint ''https://www.google.com'': endpoint probe failed for
        endpoint ''https://www.google.com'' using proxy ''http://proxy.example.com:8080'':
        Get https://www.google.com: x509: certificate signed by unknown authority). Use
        ''oc edit proxy.config.openshift.io cluster'' to fix.'
      reason: InvalidProxyConfig
      status: "True"
      type: Degraded


Expected results:

HTTPS readiness endpoint should pass validation.

Additional info:

Confirmed the chain of trust by running `oc rsh` into the network-operator pod, creating /tmp/ca-bundle.crt from the contents of the ConfigMap referenced by the proxy/cluster trustedCA field, and running:

    https_proxy=http://proxy.example.com:8080 curl https://www.google.com/ --cacert /tmp/ca-bundle.crt

Curl reported no errors.
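
For comparison, the same check can be sketched in Go, the language the network operator is written in. This is only an illustration of what the curl command above does, not the operator's actual probe code: `newProbeClient` is a hypothetical helper, and the proxy URL, endpoint, and bundle path are the example values from this report.

```go
package main

import (
	"crypto/tls"
	"crypto/x509"
	"errors"
	"fmt"
	"net/http"
	"net/url"
	"os"
	"time"
)

// newProbeClient is a hypothetical helper (not taken from the operator).
// It returns an HTTP client that sends requests through proxyURL and
// trusts only the CA certificates found in caPEM, similar to
// `curl --cacert` combined with the https_proxy variable.
func newProbeClient(proxyURL string, caPEM []byte) (*http.Client, error) {
	u, err := url.Parse(proxyURL)
	if err != nil {
		return nil, fmt.Errorf("parsing proxy URL: %w", err)
	}
	pool := x509.NewCertPool()
	if !pool.AppendCertsFromPEM(caPEM) {
		return nil, errors.New("no certificates found in CA bundle")
	}
	return &http.Client{
		Timeout: 10 * time.Second,
		Transport: &http.Transport{
			Proxy:           http.ProxyURL(u),
			TLSClientConfig: &tls.Config{RootCAs: pool},
		},
	}, nil
}

func main() {
	// Path and URLs are the example values used in this report.
	caPEM, err := os.ReadFile("/tmp/ca-bundle.crt")
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	client, err := newProbeClient("http://proxy.example.com:8080", caPEM)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	resp, err := client.Get("https://www.google.com/")
	if err != nil {
		// With a MITM proxy and a bundle that lacks the proxy's CA, this is
		// where an x509 "unknown authority" error would be expected.
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	defer resp.Body.Close()
	fmt.Println(resp.Status)
}
```

The proxy scheme here is plain HTTP, yet the client still needs the CA pool because the TLS handshake is driven by the HTTPS endpoint.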

Comment 1 Robert Bost 2020-04-07 20:36:42 UTC

I can confirm the same issue in 4.2.26.

Comment 2 Robert Bost 2020-04-07 23:15:03 UTC

The issue is here:

  https://github.com/openshift/cluster-network-operator/blob/d69bd9eff18d142e33bfd380273edf386c30f1e5/pkg/controller/proxyconfig/validation.go#L246-L252

I believe the condition `proxy.Scheme == schemeHTTPS` needs to be changed to `proxy.Scheme == schemeHTTPS || endpoint.Scheme == schemeHTTPS`. A MITM proxy will present its own certificate to the network operator during a probe even if the proxy scheme is HTTP; whether TLS is used depends on the endpoint scheme.

Ben or Daneyon, does this seem like the right fix?

Comment 3 Daneyon Hansen 2020-05-04 23:07:49 UTC

I pushed https://github.com/openshift/cluster-network-operator/pull/613 to fix the issue.

Comment 4 Daneyon Hansen 2020-05-27 16:30:32 UTC

Waiting for the associated PR to merge. This should be considered as a candidate for backport.

Comment 7 Daneyon Hansen 2020-06-18 19:23:39 UTC

*** Bug 1791948 has been marked as a duplicate of this bug. ***

Comment 8 Daneyon Hansen 2020-06-18 20:36:53 UTC

Retargeting to 4.6. The SDN team will handle the backport.

Comment 9 Daneyon Hansen 2020-06-19 17:16:38 UTC

Tagged UpcomingSprint as multiple CI jobs failed after the PR was tagged /lgtm.

Comment 17 errata-xmlrpc 2020-10-27 15:55:05 UTC

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory (OpenShift Container Platform 4.6 GA Images), and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:4196
