Description of problem:
Running 4.2.0 behind a proxy. The OCP console reports that 4.2.2 is available. The upgrade is hung at "downloading update".
The CVO logs include timeouts reaching URLs under https://storage.googleapis.com/ and https://mirror.openshift.com/. I'm able to access them directly via a browser through the same proxy as used for the cluster, so I suspect the CVO isn't using the proxy.
Steps to Reproduce:
1. Run 4.2.0 behind a proxy.
2. Access cluster settings.
3. Start upgrade to 4.2.2.
Actual results:
Hung at "Working towards 4.2.2: downloading update".
I1030 08:54:00.640002 1 verify.go:332] unable to load signature: Get https://storage.googleapis.com/openshift-release/official/signatures/openshift/release/sha256=dc782b44cac3d59101904cc5da2b9d8bdb90e55a07814df50ea7a13071b0f5f0/signature-9: dial tcp 22.214.171.124:443: i/o timeout
I1030 08:54:30.640286 1 verify.go:332] unable to load signature: Get https://mirror.openshift.com/pub/openshift-v4/signatures/openshift/release/sha256=dc782b44cac3d59101904cc5da2b9d8bdb90e55a07814df50ea7a13071b0f5f0/signature-1: dial tcp 126.96.36.199:443: i/o timeout
Expected results:
Download should proceed via proxy.
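For anyone triaging similar logs, here is a minimal Go sketch that pulls the requested URL and the dialed address out of the "unable to load signature" lines above. The function name `parseSignatureFailure` and the pattern are illustrative assumptions based only on the two lines quoted in this report, not anything taken from the CVO source. The dialed address is worth a look: if the request were going through the proxy, I would normally expect the dial to target the proxy's address rather than an address on port 443 for the remote host.

```go
package main

import (
	"fmt"
	"regexp"
)

// failurePattern matches the "unable to load signature" lines quoted in this
// report and captures the requested URL and the address that was dialed.
// The pattern is an assumption derived from those two lines only.
var failurePattern = regexp.MustCompile(`unable to load signature: Get (\S+): dial tcp (\S+): i/o timeout`)

// parseSignatureFailure returns the signature URL and the dialed address from
// a log line, or ok=false if the line does not have the expected shape.
func parseSignatureFailure(line string) (target, addr string, ok bool) {
	m := failurePattern.FindStringSubmatch(line)
	if m == nil {
		return "", "", false
	}
	return m[1], m[2], true
}

func main() {
	line := "I1030 08:54:30.640286 1 verify.go:332] unable to load signature: Get https://mirror.openshift.com/pub/openshift-v4/signatures/openshift/release/sha256=dc782b44cac3d59101904cc5da2b9d8bdb90e55a07814df50ea7a13071b0f5f0/signature-1: dial tcp 126.96.36.199:443: i/o timeout"
	if target, addr, ok := parseSignatureFailure(line); ok {
		fmt.Printf("request for %s dialed %s\n", target, addr)
	}
}
```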
Can you include the result of `oc adm must-gather`?
The output is larger than Bugzilla will support; I've attached it as must-gather.local.8167234927129864303.tar.xz on Red Hat Support case 02503924. Please let me know of any issues. Thanks!
Since we don't have a nightly payload with a signature on https://mirror.openshift.com/ that needs to be verified, is there any suggestion for how QE can verify this bug?
I checked the CVO pod with payload 4.3.0-0.nightly-2019-11-24-183610; no proxy env var was being used there.
> I checked the CVO pod with payload 4.3.0-0.nightly-2019-11-24-183610; no proxy env var was being used there.
That is to be expected. Perhaps unlike other operators, the CVO cannot use env vars to control the proxy.
> Since we don't have a nightly payload with a signature on https://mirror.openshift.com/ that needs to be verified, is there any suggestion for how QE can verify this bug?
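To illustrate the difference, here is a minimal Go sketch of a client whose proxy is set explicitly in code instead of being read from HTTP_PROXY/HTTPS_PROXY. This is not the actual CVO change. The names `newProxiedClient` and `proxyFor` and the address `http://proxy.example.com:3128` are hypothetical, and where the proxy address would come from (for example, the cluster's proxy configuration) is left out as an assumption.

```go
package main

import (
	"fmt"
	"net/http"
	"net/url"
)

// newProxiedClient returns an HTTP client that routes every request through
// proxyAddr, regardless of proxy variables in the process environment.
func newProxiedClient(proxyAddr string) (*http.Client, error) {
	proxyURL, err := url.Parse(proxyAddr)
	if err != nil {
		return nil, fmt.Errorf("parsing proxy address: %w", err)
	}
	if proxyURL.Scheme == "" || proxyURL.Host == "" {
		return nil, fmt.Errorf("proxy address %q needs a scheme and a host", proxyAddr)
	}
	// http.ProxyURL returns a function that always answers with proxyURL.
	transport := &http.Transport{Proxy: http.ProxyURL(proxyURL)}
	return &http.Client{Transport: transport}, nil
}

// proxyFor reports which proxy the client's transport would pick for target.
// It returns an empty string when no proxy applies.
func proxyFor(client *http.Client, target string) (string, error) {
	req, err := http.NewRequest(http.MethodGet, target, nil)
	if err != nil {
		return "", err
	}
	transport, ok := client.Transport.(*http.Transport)
	if !ok || transport.Proxy == nil {
		return "", nil
	}
	proxyURL, err := transport.Proxy(req)
	if err != nil || proxyURL == nil {
		return "", err
	}
	return proxyURL.String(), nil
}

func main() {
	client, err := newProxiedClient("http://proxy.example.com:3128") // hypothetical proxy
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	via, err := proxyFor(client, "https://mirror.openshift.com/")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("signature requests would go via", via)
}
```

A client built this way behaves the same whether or not the pod has proxy env vars, which is why checking the pod's environment would not tell you whether such a client is going through the proxy.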
I will try to come up with a plan tomorrow. If I cannot figure out something for 4.3, I will discuss whether we can test and merge 4.2 first. I will leave the needinfo in place until I give you a response tomorrow.
I think we can test this by following these steps:
1) Create a cluster with the proxy turned on, using a 4.3 nightly that has the fix.
2) Since the currently running CVO verifies the release image, you can now upgrade to any signed release image to check whether verification is happening correctly,
e.g.: oc adm upgrade --to-image quay.io/openshift-release-dev/ocp-release:4.2.8 --allow-explicit-upgrade
3) The CVO should start upgrading to the signed release image. _Don't care whether the upgrade succeeds._
(In reply to Abhinav Dahiya from comment #10)
> I think we can test this by following these steps:
> 1) create a cluster using 4.3 nightly that has the fix with proxy turned on.
> 2) since the currently running CVO verifies the release-image, now you can
> upgrade to any release-image that is signed to check if verification is
> happening correctly..
> eg: oc adm upgrade quay.io/openshift-release-dev/ocp-release:4.2.8
We only support verification for images referenced by digest, so make sure you use the digest instead of the tag. The digest-based pullspec for the image above is `quay.io/openshift-release-dev/ocp-release@sha256:4bf307b98beba4d42da3316464013eac120c6e5a398646863ef92b0e2c621230`.
You can check this by running `oc adm release info quay.io/openshift-release-dev/ocp-release:4.2.8`.
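As a rough illustration of why the digest matters: the signature URLs in the logs in the description are keyed by the image digest (`.../sha256=<hex>/signature-<n>`), so a tag alone gives nothing to look up. Below is a minimal Go sketch, assuming that URL layout and using a hypothetical helper named `signatureURL`. It is not the CVO's own code, and it does not claim a signature actually exists at the resulting URL.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// digestPattern accepts only sha256 digests, the form seen in this report.
var digestPattern = regexp.MustCompile(`^sha256:[0-9a-f]{64}$`)

// signatureURL builds the location of the index-th signature for a
// digest-based pullspec, following the layout seen in the logs:
//
//	<store>/sha256=<hex>/signature-<index>
//
// Tag-based pullspecs are rejected because they carry no digest.
func signatureURL(store, pullspec string, index int) (string, error) {
	at := strings.LastIndex(pullspec, "@")
	if at < 0 {
		return "", fmt.Errorf("pullspec %q is not digest-based", pullspec)
	}
	digest := pullspec[at+1:]
	if !digestPattern.MatchString(digest) {
		return "", fmt.Errorf("unsupported digest %q", digest)
	}
	hex := strings.TrimPrefix(digest, "sha256:")
	return fmt.Sprintf("%s/sha256=%s/signature-%d", strings.TrimSuffix(store, "/"), hex, index), nil
}

func main() {
	store := "https://mirror.openshift.com/pub/openshift-v4/signatures/openshift/release"
	pullspecs := []string{
		"quay.io/openshift-release-dev/ocp-release@sha256:4bf307b98beba4d42da3316464013eac120c6e5a398646863ef92b0e2c621230",
		"quay.io/openshift-release-dev/ocp-release:4.2.8",
	}
	for _, pullspec := range pullspecs {
		location, err := signatureURL(store, pullspec, 1)
		if err != nil {
			fmt.Println("cannot look up signature:", err)
			continue
		}
		fmt.Println(location)
	}
}
```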
> 3) CVO should start upgrading to the signed release-image.. _don't care
> about if the upgrade succeeds_
*** Bug 1771284 has been marked as a duplicate of this bug. ***
I can confirm this bug still exists for a 4.2.8 to 4.2.12 update.
The workaround of manually setting the proxy environment variables for the cluster-version-operator pod seems to work.
However, this is required for each upgrade, because an upgrade also upgrades the cluster-version-operator itself, and the new version still has the bug of the missing proxy environment.
(In reply to Glenn Sommer from comment #22)
> I can confirm this bug still exists for a 4.2.8 to 4.2.12 update.
Our process requires that we have one bug per release to track these fixes as they need to each be verified. You can see the 4.2 bug in the list of bugs that this bug blocks.
It's resolved and the fix shipped in 4.2.13.
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.
For information on the advisory, and where to find the updated
files, follow the link below.
If the solution does not work for you, open a new bug report.
CLOSED ERRATA bugs have been verified by QE for at least some use-cases. If there are additional use cases which remain uncovered, please file new bugs to track them. Steps to reproduce, cluster IDs, links to must-gathers, and all the other usual stuff will help get to the bottom of whatever issues you're seeing.