Bug 1766907

Summary: CVO not using proxy settings
Product: OpenShift Container Platform Reporter: Chet Hosey <ChetRHosey>
Component: Cluster Version Operator Assignee: Patrick Dillon <padillon>
Status: CLOSED ERRATA QA Contact: Gaoyun Pei <gpei>
Severity: high Docs Contact:
Priority: high    
Version: 4.2.0 CC: adahiya, ansverma, aos-bugs, aprajapa, asonmez, ckoep, dahernan, dmoessne, glso, gpei, jkaur, jokerman, mharri, mmcneill, padillon, palonsor, pkhaire, rbost, rsandu, sdodson, takito, vjaypurk, wking
Target Milestone: ---   
Target Release: 4.3.0   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
: 1775836 Environment:
Last Closed: 2020-01-23 11:09:47 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 1775836    

Description Chet Hosey 2019-10-30 09:16:01 UTC
Description of problem:

Running 4.2.0 behind a proxy. The OCP console reports that 4.2.2 is available. The upgrade is hung at "downloading update".

The CVO logs include timeouts reaching URLs under https://storage.googleapis.com/ and https://mirror.openshift.com/. I'm able to access them directly via a browser through the same proxy that the cluster uses, so I suspect the CVO isn't using the proxy.

How reproducible:

Unknown

Steps to Reproduce:
1. Run 4.2.0 behind a proxy.
2. Access cluster settings.
3. Start upgrade to 4.2.2.

Actual results:
Hung at "Working towards 4.2.2: downloading update".

Logs show:

I1030 08:54:00.640002       1 verify.go:332] unable to load signature: Get https://storage.googleapis.com/openshift-release/official/signatures/openshift/release/sha256=dc782b44cac3d59101904cc5da2b9d8bdb90e55a07814df50ea7a13071b0f5f0/signature-9: dial tcp 64.233.177.128:443: i/o timeout

I1030 08:54:30.640286       1 verify.go:332] unable to load signature: Get https://mirror.openshift.com/pub/openshift-v4/signatures/openshift/release/sha256=dc782b44cac3d59101904cc5da2b9d8bdb90e55a07814df50ea7a13071b0f5f0/signature-1: dial tcp 54.173.18.88:443: i/o timeout
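
A note on reading these log lines: the address after "dial tcp" is the one the client actually tried to connect to. Here it is a public IP on port 443 for the signature host, not a proxy address, which is consistent with the requests bypassing the proxy. A minimal sketch for pulling those fields out of such lines, assuming the message format shown above (the regex and the function name `parse_dial_failure` are illustrative, not part of the CVO):

```python
import re

# Matches the error shape seen in the log lines above:
#   "Get <url>: dial tcp <ip>:<port>: <error>"
DIAL_RE = re.compile(
    r"Get (?P<url>https?://(?P<host>[^/\s]+)\S*): "
    r"dial tcp (?P<ip>\d+\.\d+\.\d+\.\d+):(?P<port>\d+): (?P<error>.+)$"
)


def parse_dial_failure(line):
    """Return the requested host and the dialed address, or None if no match."""
    m = DIAL_RE.search(line.strip())
    if m is None:
        return None
    return {
        "host": m["host"],       # host named in the request URL
        "ip": m["ip"],           # address the TCP dial was attempted against
        "port": int(m["port"]),
        "error": m["error"],
    }


line = (
    "I1030 08:54:00.640002       1 verify.go:332] unable to load signature: "
    "Get https://storage.googleapis.com/openshift-release/official/signatures/"
    "openshift/release/sha256=dc782b44cac3d59101904cc5da2b9d8bdb90e55a07814df"
    "50ea7a13071b0f5f0/signature-9: dial tcp 64.233.177.128:443: i/o timeout"
)
print(parse_dial_failure(line))
```

If the dialed address matched the configured proxy's host and port instead, the timeout would point at the proxy itself rather than at a missing proxy configuration.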


Expected results:

Download should proceed via proxy.

Comment 1 Abhinav Dahiya 2019-10-30 16:04:56 UTC
Can you include the result of `oc adm must-gather`?

Comment 4 Chet Hosey 2019-10-31 06:22:01 UTC
The output is larger than Bugzilla will support; I've attached it as must-gather.local.8167234927129864303.tar.xz on Red Hat Support case 02503924. Please let me know of any issues. Thanks!

Comment 8 Gaoyun Pei 2019-11-25 09:49:47 UTC
Hi Patrick,

Since we don't have a nightly payload whose signature is located on https://mirror.openshift.com/ and needs to be verified, do you have any suggestion for how QE can verify this bug?
I checked the CVO pod with payload 4.3.0-0.nightly-2019-11-24-183610, and no proxy env var was being used there.

Comment 9 Patrick Dillon 2019-11-26 02:15:07 UTC
> I checked the CVO pod with payload 4.3.0-0.nightly-2019-11-24-183610, and no proxy env var was being used there.
That is to be expected. Perhaps unlike other operators, the CVO cannot use env vars to control the proxy. 

> Since we don't have a nightly payload whose signature is located on https://mirror.openshift.com/ and needs to be verified, do you have any suggestion for how QE can verify this bug?
I will try to come up with a plan tomorrow. If I cannot figure out something for 4.3, I will discuss whether we can test and merge 4.2 first. I will leave the needinfo in place until I give you a response tomorrow.
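
For readers unfamiliar with the distinction drawn above (proxy taken from env vars vs. proxy configured some other way), here is a minimal sketch of the general idea: an HTTP client that is handed its proxy explicitly rather than reading HTTPS_PROXY from the process environment. This is an illustration only and not the CVO's code, which is not shown in this bug. The proxy URL and the function name `build_opener_with_proxy` are assumptions made for the example.

```python
import urllib.request


def build_opener_with_proxy(proxy_url):
    """Build an opener that sends HTTPS requests through proxy_url.

    Passing an explicit mapping to ProxyHandler means the proxy does not
    depend on HTTPS_PROXY / https_proxy being set in the environment.
    """
    handler = urllib.request.ProxyHandler({"https": proxy_url})
    return urllib.request.build_opener(handler)


# Hypothetical proxy address, for illustration only.
opener = build_opener_with_proxy("http://proxy.example.com:3128")
# opener.open("https://mirror.openshift.com/") would now go via the proxy.
```

With this pattern, checking the pod for proxy env vars says nothing about whether the proxy is in use, which matches the "that is to be expected" reply above.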

Comment 10 Abhinav Dahiya 2019-11-26 16:57:38 UTC
I think we can test this by following these steps:

1) Create a cluster with proxy turned on, using a 4.3 nightly that has the fix.
2) Since the currently running CVO verifies the release image, you can now upgrade to any signed release image to check whether verification is happening correctly,
e.g.: oc adm upgrade quay.io/openshift-release-dev/ocp-release:4.2.8 --allow-explicit-upgrade

3) The CVO should start upgrading to the signed release image. We don't care whether the upgrade succeeds.

Comment 11 Abhinav Dahiya 2019-11-26 17:01:03 UTC
(In reply to Abhinav Dahiya from comment #10)
> I think we can test this by following these steps:
> 
> 1) Create a cluster with proxy turned on, using a 4.3 nightly that has the fix.
> 2) Since the currently running CVO verifies the release image, you can now
> upgrade to any signed release image to check whether verification is
> happening correctly,
> e.g.: oc adm upgrade quay.io/openshift-release-dev/ocp-release:4.2.8
> --allow-explicit-upgrade
> 

We only support verification for images referenced by digest, so make sure you use the digest instead of the tag. The digest-based pullspec for the image above is `quay.io/openshift-release-dev/ocp-release@sha256:4bf307b98beba4d42da3316464013eac120c6e5a398646863ef92b0e2c621230`

You can check by running `oc adm release info quay.io/openshift-release-dev/ocp-release:4.2.8`

> 3) The CVO should start upgrading to the signed release image. We don't
> care whether the upgrade succeeds.
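
To make the digest requirement concrete: the signature URLs in the logs from the description are keyed by `sha256=<digest>`, so a tag-only pullspec gives nothing to look up. Below is a minimal sketch that checks whether a pullspec is digest-based and builds the corresponding signature URLs. The two store prefixes and the `signature-N` suffix are copied from the log lines in the description; the helper names and the default index of 1 are assumptions for illustration, not the CVO's implementation.

```python
import re

# A digest-based pullspec looks like <repo>@sha256:<64 hex chars>.
DIGEST_RE = re.compile(r"^(?P<repo>[^@\s]+)@sha256:(?P<hex>[0-9a-f]{64})$")

# Signature locations as they appear in the log lines in the description.
SIGNATURE_STORES = (
    "https://storage.googleapis.com/openshift-release/official/signatures/openshift/release",
    "https://mirror.openshift.com/pub/openshift-v4/signatures/openshift/release",
)


def release_digest(pullspec):
    """Return the sha256 hex digest of a pullspec, or None if it is tag-based."""
    m = DIGEST_RE.match(pullspec)
    return m["hex"] if m else None


def signature_urls(pullspec, index=1):
    """Build candidate signature URLs for a digest-based pullspec."""
    digest = release_digest(pullspec)
    if digest is None:
        raise ValueError("pullspec must reference the image by digest, not by tag")
    return [f"{store}/sha256={digest}/signature-{index}" for store in SIGNATURE_STORES]


by_tag = "quay.io/openshift-release-dev/ocp-release:4.2.8"
by_digest = (
    "quay.io/openshift-release-dev/ocp-release@sha256:"
    "4bf307b98beba4d42da3316464013eac120c6e5a398646863ef92b0e2c621230"
)
print(release_digest(by_tag))
for url in signature_urls(by_digest):
    print(url)
```

On a proxied cluster, these are the URLs that need to be reachable through the proxy for verification to pass.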

Comment 19 Abhinav Dahiya 2019-12-17 23:49:00 UTC
*** Bug 1771284 has been marked as a duplicate of this bug. ***

Comment 22 Glenn Sommer 2020-01-13 08:47:08 UTC
I can confirm this bug still exists for a 4.2.8 to 4.2.12 update.
The workaround of manually setting the proxy environment for the cluster-version-operator pod seems to work.
However, this is required for each upgrade, because an upgrade also triggers an upgrade of the cluster-version-operator, which in turn still has the bug of the missing proxy environment.

Comment 23 Scott Dodson 2020-01-13 12:58:46 UTC
(In reply to Glenn Sommer from comment #22)
> I can confirm this bug still exists for a 4.2.8 to 4.2.12 update.

Glenn,

Our process requires that we have one bug per release to track these fixes as they need to each be verified. You can see the 4.2 bug in the list of bugs that this bug blocks.
It's resolved and the fix shipped in 4.2.13.

Please see 
https://bugzilla.redhat.com/show_bug.cgi?id=1775836
https://access.redhat.com/errata/RHBA-2020:0014

Comment 25 errata-xmlrpc 2020-01-23 11:09:47 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:0062

Comment 29 W. Trevor King 2020-10-16 20:32:49 UTC
CLOSED ERRATA bugs have been verified by QE for at least some use-cases.  If there are additional use cases which remain uncovered, please file new bugs to track them.  Steps to reproduce, cluster IDs, links to must-gathers, and all the other usual stuff will help get to the bottom of whatever issues you're seeing.