Bug 1788214 - create a CI job to test disconnected install/upgrade via proxy with custom CA root
Keywords:
Status: CLOSED WONTFIX
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Networking
Version: 4.2.z
Hardware: Unspecified
OS: Unspecified
Priority: low
Severity: medium
Target Milestone: ---
Target Release: ---
Assignee: Ben Bennett
QA Contact:
URL:
Whiteboard: SDN-CI-IMPACT
Depends On:
Blocks:
Reported: 2020-01-06 17:52 UTC by Chet Hosey
Modified: 2021-07-12 15:12 UTC (History)
CC List: 9 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2021-07-12 15:12:49 UTC
Target Upstream Version:
Embargoed:



Links
System ID Private Priority Status Summary Last Updated
Red Hat Bugzilla 1766907 0 high CLOSED CVO not using proxy settings 2023-12-15 16:53:17 UTC
Red Hat Bugzilla 1767669 0 high CLOSED NTP server settings in a restricted network for OpenShift 4 2023-10-06 18:43:47 UTC
Red Hat Bugzilla 1784201 1 None None None 2024-10-01 16:25:29 UTC

Description Chet Hosey 2020-01-06 17:52:45 UTC
We've seen multiple issues with components either failing to work through a corporate proxy that terminates and re-signs TLS connections, or ignoring proxy settings and trying to access the internet directly.

I suspect there is no CI job that tests the whole installation and upgrade process for a cluster typical of some corporate environments: that is, a cluster that connects through a proxy that uses a custom CA root for traffic inspection.

Examples include:

  - 4.2 installer not pulling images through such a proxy (case 02503426; abandoned in favor of 4.1 install)

  - CVO not using proxy settings (https://bugzilla.redhat.com/show_bug.cgi?id=1766907)

  - Image pull during deployment fails with MITM proxy (https://bugzilla.redhat.com/show_bug.cgi?id=1784201)

  - NTP server settings in a restricted network for OpenShift 4 (https://bugzilla.redhat.com/show_bug.cgi?id=1767669)

A test case that installed and upgraded an isolated cluster, connected only via a proxy and using a custom CA root, should help to shake these issues out before release.
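
One of the failure modes listed above, a component ignoring proxy settings and connecting directly, can be illustrated with a minimal Python sketch of the routing decision such a test would expect. The function name `expected_route`, the host names, and the simplified NO_PROXY suffix matching are assumptions for illustration only; they do not describe how any OpenShift component implements proxy handling.

```python
from urllib.parse import urlsplit


def expected_route(url: str, proxy_url: str, no_proxy: str = "") -> str:
    """Return "direct" if the URL's host matches a NO_PROXY entry,
    otherwise report that the request should go via the proxy.

    Simplified assumption: NO_PROXY is a comma-separated list of host
    names or domain suffixes; CIDR ranges and wildcards are not handled.
    """
    host = (urlsplit(url).hostname or "").lower()
    for entry in no_proxy.split(","):
        entry = entry.strip().lower().lstrip(".")
        if not entry:
            continue
        # Exact host match, or the host is a subdomain of the entry.
        if host == entry or host.endswith("." + entry):
            return "direct"
    return f"via {proxy_url}"


# Hypothetical values: an external registry should be proxied,
# while an in-cluster name should bypass the proxy.
proxy = "http://proxy.example.com:3128"
no_proxy = "localhost,.cluster.local"
print(expected_route("https://quay.io/v2/", proxy, no_proxy))
print(expected_route("https://api.cluster.local:6443/", proxy, no_proxy))
```

A CI check along these lines could compare the expected route with the connections a component actually makes; on a disconnected cluster, any direct attempt to reach an external host would simply fail.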

Comment 1 Stephen Cuppett 2020-01-06 17:56:00 UTC
Setting target release to 4.4 to perform investigation on the active development branch (will be re-set/cloned where fixes & backports, if any, are required).

Comment 2 Steve Kuznetsov 2020-01-16 22:35:39 UTC
Xiaoli has explained that this set of test cases already exists among those that QE runs. Do we have a compelling reason to duplicate it?

Comment 3 Chet Hosey 2020-01-16 22:38:31 UTC
It seems the major gap is in upgrading from 4.1 -> 4.2 and adding proxy configuration.

Comment 4 Chet Hosey 2020-01-17 00:06:10 UTC
My assumption is that if these issues *were* caught by existing test cases, they wouldn't have made it into the released product.

Do you disagree?

To clarify, there are at least two gaps:

- Proxied operations must be tested on a disconnected cluster. For example, prior to 4.2.13 the cluster version operator just ignored proxy settings and accessed the internet directly. If upgrade-via-proxy was tested, there's no way it was tested from a disconnected cluster because it wouldn't have worked.

- Upgrade from 4.1 -> 4.2 and enable proxy settings.

  - First, this caused the network operator to crashloop. This is resolved in 4.2.13.

  - Second, the machine config operator isn't configuring the hosts with the bundle listed in the proxy's trustedCA setting, so kubelet isn't able to pull images through a proxy that uses a corporate certificate authority.
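
To illustrate the trustedCA gap above from the client side: a client behind a TLS-inspecting proxy needs both the proxy address and the corporate CA bundle, or certificate verification fails. The following is a minimal Python sketch under stated assumptions. The function name `build_proxied_opener`, the proxy URL, and the bundle path are hypothetical, and the code does not reflect how kubelet or the machine config operator handle trust bundles.

```python
import ssl
import urllib.request
from typing import Optional


def build_proxied_opener(
    proxy_url: str, ca_bundle: Optional[str] = None
) -> urllib.request.OpenerDirector:
    """Return an opener that sends HTTP(S) requests via proxy_url and,
    if ca_bundle (a PEM file path) is given, also trusts its certificates."""
    # Start from the system trust store so public CAs keep working.
    context = ssl.create_default_context()
    if ca_bundle is not None:
        # Add the corporate root that the proxy uses to re-sign TLS traffic.
        context.load_verify_locations(cafile=ca_bundle)

    # Pass the proxy explicitly instead of relying on environment variables.
    proxy_handler = urllib.request.ProxyHandler(
        {"http": proxy_url, "https": proxy_url}
    )
    https_handler = urllib.request.HTTPSHandler(context=context)
    return urllib.request.build_opener(proxy_handler, https_handler)


# Hypothetical usage; the URL and path below are placeholders:
# opener = build_proxied_opener("http://proxy.example.com:3128",
#                               "/etc/pki/ca-trust/corp-root.pem")
# opener.open("https://quay.io/v2/")
```

Without the `load_verify_locations` step, a request through such a proxy would be expected to fail certificate verification, which is consistent with the image pull failure described above.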

Comment 5 Steve Kuznetsov 2020-03-17 16:38:04 UTC
Do we have an owner for this? The test infrastructure team is not a good fit for it.

Comment 6 Daneyon Hansen 2020-03-17 19:23:56 UTC
Minus upgrade coverage, https://github.com/openshift/release/pull/5308 is attempting to provide the necessary CI coverage.

Eric,

Can you provide an update on the status of PR 5308? Aside from upgrades, can you verify your PR addresses these use cases? Does a Jira card exist for creating an upgrade test for a proxied environment?

Comment 7 Steve Kuznetsov 2020-04-22 14:41:48 UTC
Eric, we need information on this one.

Comment 8 Stefan Schimanski 2020-04-22 15:17:59 UTC
Moving to 4.4z as this is certainly not blocking the release today or tomorrow.

Comment 11 ewolinet 2020-04-23 16:52:11 UTC
> Can you provide an update on the status of PR 5308? 

I had been focusing on Logging work to get that out before the feature freeze, so I haven't looked in a while. Last I saw, the AWS rehearsal job was still failing due to (I believe) gaps in other teams' tests, since the job now removes direct egress access. I had opened a bz to track this with the oauth team, but am not sure when it will be addressed.

I'll rerun the rehearsal job and see where things are -- it looked like the non-AWS platform rehearsals failed due to different issues, as the proxy work is restricted to AWS.


> Aside from upgrades, can you verify your PR addresses these use cases?
> ignoring proxy settings and trying to access the internet directly

https://github.com/openshift/release/pull/5308 covers this use case.


> Does a Jira card exist for creating an upgrade test for a proxied environment?

Not that I'm aware of. The only Jira card was to create the initial blackholed VPC job: https://issues.redhat.com/browse/DPTP-591
I feel that if there is going to be an upgrade test, it would fall on the test platform team to build upon release/5308.

Comment 14 Daneyon Hansen 2020-04-29 15:55:17 UTC
ewolinet,

I created separate bugs to address proxy CI coverage for a) each supported provider, b) an upgrade job, and c) a day-2 config job.

Comment 20 Ben Bennett 2020-10-16 13:54:21 UTC
Setting the target to 4.7.

