Bug 2094562 - Cluster wide proxy values passed to Update service operator pod, but not operand pod
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: OpenShift Update Service
Version: 4.10
Hardware: x86_64
OS: Linux
Priority: medium
Severity: medium
Target Milestone: ---
Target Release: 4.11.0
Assignee: Pratik Mahajan
QA Contact: Yang Yang
Docs Contact: Kathryn Alexander
Reported: 2022-06-07 22:00 UTC by Chad Crum
Modified: 2022-08-03 11:50 UTC
CC: 6 users

Fixed In Version:
Doc Type: No Doc Update
Doc Text:
Clone Of:
Environment:
Last Closed: 2022-08-03 11:50:52 UTC
Target Upstream Version:
Embargoed:




Links
Github openshift cincinnati-operator pull 144 (Merged): Bug 2094562: add support for cluster-wide proxy (last updated 2022-06-13 21:51:10 UTC)
Red Hat Product Errata RHEA-2022:5871 (last updated 2022-08-03 11:50:54 UTC)

Description Chad Crum 2022-06-07 22:00:00 UTC
Description of problem (please be as detailed as possible and provide log
snippets):

Installed a 4.10 IPI bare-metal cluster (IPv4, connected environment) with a cluster-wide proxy, then tried to deploy the Update Service operator and operand. The operator pod automatically received the cluster-wide proxy values as environment variables, but the operand pod did not, causing the graph builder to hang while trying to query the image registry.

# Graph builder hangs here
oc logs -f test2-597cbfbd8-mblhr graph-builder
        tracing_endpoint: None,
    }
[2022-06-07T21:50:42Z DEBUG graph_builder::graph] graph update triggered
[2022-06-07T21:50:42Z TRACE cincinnati::plugins] Running next plugin 'release-scrape-dockerv2'

Scaling back the operator deployment and adding the cluster-wide proxy env vars to the operand deployment allows the operand to deploy fully and the graph builder to query properly through the proxy.


Version of all relevant components (if applicable):

update-service-operator.v4.9.1   OpenShift Update Service   4.9.1 
4.10.0-0.nightly-2022-06-06-184250

Does this issue impact your ability to continue to work with the product
(please explain in detail what is the user impact)?

This was discovered while setting this up for a customer. The customer has a connected "hub" cluster (connected via proxy) that runs the update service, plus disconnected spoke clusters.

Is there any workaround available to the best of your knowledge?
Yes - scale back the operator deployment and manually add the proxy vars that are set in the operator deployment to the operand deployment.
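
In practice the workaround looks roughly like this (a sketch, not verified commands from this report; the `openshift-update-service` namespace and the `sample` operand deployment name match the later comments, and the proxy values are placeholders to be copied from `oc get proxy/cluster -o yaml`):

# Stop the operator from reconciling the operand deployment
oc -n openshift-update-service scale deployment/updateservice-operator --replicas=0

# Copy the cluster-wide proxy values into the operand deployment
oc -n openshift-update-service set env deployment/sample \
  HTTP_PROXY=http://proxy.example.com:3128 \
  HTTPS_PROXY=http://proxy.example.com:3128 \
  NO_PROXY=.cluster.local,.svc,localhost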

Rate from 1 - 5 the complexity of the scenario you performed that caused this
bug (1 - very simple, 5 - very complex)?
3

Is this issue reproducible?
Yes

Can this issue be reproduced from the UI?
Yes

If this is a regression, please provide more details to justify this:
No

Steps to Reproduce:
1. Have a connected IPv4 IPI hub cluster deployed with a cluster-wide proxy
2. Deploy update service operator via operatorhub
3. Create update service operand with the following spec:
spec:
  graphDataImage: quay.io/chadcrum0/cincinnati-graph-data-container:latest
  releases: quay.io/openshift-release-dev/ocp-release
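
For reference, a complete UpdateService custom resource for step 3 might look like the following (a sketch: the `sample` name, namespace, and replica count are illustrative, not taken from this report):

apiVersion: updateservice.operator.openshift.io/v1
kind: UpdateService
metadata:
  name: sample
  namespace: openshift-update-service
spec:
  replicas: 1
  releases: quay.io/openshift-release-dev/ocp-release
  graphDataImage: quay.io/chadcrum0/cincinnati-graph-data-container:latest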

Actual results:
# Graph builder hangs here
oc logs -f test2-597cbfbd8-mblhr graph-builder
        tracing_endpoint: None,
    }
[2022-06-07T21:50:42Z DEBUG graph_builder::graph] graph update triggered
[2022-06-07T21:50:42Z TRACE cincinnati::plugins] Running next plugin 'release-scrape-dockerv2'

Expected results:
# Graph builder properly queries image registry api via proxy

Additional info:

The spec uses public quay.io for the graph data image and releases in this scenario. This is an initial attempt to get everything working (without the added complexity of certs). I expect the next step will use an internal mirror registry for both values.

However, this seems like a valid scenario even when using a proxy to reach an internal registry.
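
The eventual fix (the cincinnati-operator pull 144 linked above) makes the operator propagate the cluster-wide proxy into the operand itself. A minimal, hypothetical Python sketch of that propagation logic (the function name and dict shapes are illustrative, not the operator's actual Go code):

```python
def inject_proxy_env(pod_spec: dict, proxy_status: dict) -> dict:
    """Copy cluster Proxy status values into every container's env list,
    mirroring what the operator needs to do for the operand deployment."""
    env_map = {
        "HTTP_PROXY": proxy_status.get("httpProxy", ""),
        "HTTPS_PROXY": proxy_status.get("httpsProxy", ""),
        "NO_PROXY": proxy_status.get("noProxy", ""),
    }
    for container in pod_spec.get("containers", []):
        env = container.setdefault("env", [])
        present = {entry["name"] for entry in env}
        for name, value in env_map.items():
            # Only add non-empty values that are not already set
            if value and name not in present:
                env.append({"name": name, "value": value})
    return pod_spec

# Operand pod spec as observed in this bug: no proxy env vars at all
spec = {"containers": [{"name": "graph-builder", "image": "cincinnati"}]}
status = {
    "httpProxy": "http://proxy:3128",
    "httpsProxy": "http://proxy:3128",
    "noProxy": ".cluster.local,.svc",
}
inject_proxy_env(spec, status)
```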

Comment 3 Yang Yang 2022-06-24 07:58:05 UTC
Reproducing it with OSUS 4.9.1 on OCP 4.10.18

Steps to reproduce:
1. Install a 4.10 OCP cluster with proxy enabled (w/o CA)
# oc get proxy -oyaml
apiVersion: v1
items:
- apiVersion: config.openshift.io/v1
  kind: Proxy
  metadata:
    creationTimestamp: "2022-06-24T06:45:50Z"
    generation: 1
    name: cluster
    resourceVersion: "624"
    uid: ec2ba7df-a596-4c9c-8fc3-13b2f05a20db
  spec:
    httpProxy: http://$user:pass@$ip:3128
    httpsProxy: http://$user:pass@$ip:3128
    noProxy: test.no-proxy.com
    trustedCA:
      name: ""
  status:
    httpProxy: http://$user:pass@$ip:3128
    httpsProxy: http://$user:pass@$ip:3128
    noProxy: .cluster.local,.svc,10.0.0.0/16,10.128.0.0/14,127.0.0.1,169.254.169.254,172.30.0.0/16,api-int.yanyang-0624b.qe.gcp.devcluster.openshift.com,localhost,metadata,metadata.google.internal,metadata.google.internal.,test.no-proxy.com
kind: List
metadata:
  resourceVersion: ""

2. Install osus4.9.1 on the cluster
# oc get pod
NAME                                      READY   STATUS    RESTARTS   AGE
sample-64f5b54867-kcfxc                   1/2     Running   0          8m2s
sample-ff75f8c58-c4wq5                    1/2     Running   0          12m
updateservice-operator-645c68cc7c-g8vcc   1/1     Running   0          17m

Graph builder is not ready.

# oc logs pod/sample-64f5b54867-kcfxc
Defaulted container "graph-builder" out of: graph-builder, policy-engine, graph-data (init)
[2022-06-24T07:44:24Z DEBUG graph_builder] application settings:
    AppSettings {
        address: ::,
        credentials_path: None,
        mandatory_client_parameters: {},
        manifestref_key: "io.openshift.upgrades.graph.release.manifestref",
        path_prefix: "",
        pause_secs: 300s,
        scrape_timeout_secs: None,
        port: 8080,
        registry: "quay.io",
        repository: "openshift-release-dev/ocp-release",
        status_address: ::,
        status_port: 9080,
        verbosity: Trace,
        fetch_concurrency: 16,
        metrics_required: {
            "graph_upstream_raw_releases",
        },
        plugin_settings: [
            ReleaseScrapeDockerv2Settings {
                registry: "quay.io",
                repository: "openshift-release-dev/ocp-release",
                manifestref_key: "io.openshift.upgrades.graph.release.manifestref",
                fetch_concurrency: 16,
                username: None,
                password: None,
                credentials_path: Some(
                    "/var/lib/cincinnati/registry-credentials/.dockerconfigjson",
                ),
            },
            OpenshiftSecondaryMetadataParserSettings {
                data_directory: "/var/lib/cincinnati/graph-data",
                key_prefix: "io.openshift.upgrades.graph",
                default_arch: "amd64",
                disallowed_errors: {},
            },
            EdgeAddRemovePlugin {
                key_prefix: "io.openshift.upgrades.graph",
                remove_all_edges_value: "*",
                remove_consumed_metadata: false,
            },
        ],
        tracing_endpoint: None,
    }
[2022-06-24T07:44:24Z DEBUG graph_builder::graph] graph update triggered
[2022-06-24T07:44:24Z TRACE cincinnati::plugins] Running next plugin 'release-scrape-dockerv2'

The graph builder hangs.

Checking the proxy settings in the OSUS pods:
The values are passed to the operator pod.

# oc get pod/updateservice-operator-645c68cc7c-g8vcc -oyaml | grep -i proxy
    - name: HTTP_PROXY
      value: http://$user:pass@$ip:3128
    - name: HTTPS_PROXY
      value: http://$user:pass@$ip:3128
    - name: NO_PROXY
      value: .cluster.local,.svc,10.0.0.0/16,10.128.0.0/14,127.0.0.1,169.254.169.254,172.30.0.0/16,api-int.yanyang-0624b.qe.gcp.devcluster.openshift.com,localhost,metadata,metadata.google.internal,metadata.google.internal.,test.no-proxy.com

However, the proxy settings are not passed to the operand pod:
# oc get pod/sample-64f5b54867-kcfxc -oyaml | grep -i proxy

The issue is reproduced.

Comment 4 Yang Yang 2022-06-28 07:00:36 UTC
Verifying with OSUS 5.0.0 on OCP 4.10.20
cincinnati-container-v5.0.0-4
cincinnati-operator-bundle-container-v5.0.0-3
cincinnati-operator-container-v5.0.0-4

Steps to reproduce:
1. Install a 4.10 OCP cluster with proxy enabled (w/o CA)
# oc get proxy -oyaml
apiVersion: v1
items:
- apiVersion: config.openshift.io/v1
  kind: Proxy
  metadata:
    creationTimestamp: "2022-06-28T02:44:05Z"
    generation: 1
    name: cluster
    resourceVersion: "618"
    uid: 9e901142-95e1-499e-931d-c0d124aa607d
  spec:
    httpProxy: http://$user:$pass@$ip:3128
    httpsProxy: http://$user:$pass@$ip:3128
    noProxy: test.no-proxy.com
    trustedCA:
      name: ""
  status:
    httpProxy: http://$user:$pass@$ip:3128
    httpsProxy: http://$user:$pass@$ip:3128
    noProxy: test.no-proxy.com,$ips
kind: List
metadata:
  resourceVersion: ""

2. Install osus5.0.0 on the cluster
# oc get all
NAME                                         READY   STATUS    RESTARTS   AGE
pod/sample-574c488d4f-lxvfj                  2/2     Running   0          3m54s
pod/updateservice-operator-54f7686cb-mdms8   1/1     Running   0          118m

NAME                                     TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)             AGE
service/sample-graph-builder             ClusterIP   172.30.236.196   <none>        8080/TCP,9080/TCP   31m
service/sample-policy-engine             ClusterIP   172.30.41.29     <none>        80/TCP,9081/TCP     31m
service/updateservice-operator-metrics   ClusterIP   172.30.161.56    <none>        8443/TCP            118m

NAME                                     READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/sample                   1/1     1            1           31m
deployment.apps/updateservice-operator   1/1     1            1           118m

NAME                                               DESIRED   CURRENT   READY   AGE
replicaset.apps/sample-574c488d4f                  1         1         1       3m54s
replicaset.apps/sample-86ccf6d94b                  0         0         0       31m
replicaset.apps/updateservice-operator-54f7686cb   1         1         1       118m

NAME                                    HOST/PORT                                                                                  PATH   SERVICES               PORT            TERMINATION   WILDCARD
route.route.openshift.io/sample-route   sample-route-openshift-update-service.apps.yanyang-0628a.qe.gcp.devcluster.openshift.com          sample-policy-engine   policy-engine   edge/None     None

# oc get pod/sample-574c488d4f-lxvfj -oyaml | grep -i proxy
    - name: HTTP_PROXY
      value: http://$user:$pass@$ip:3128
    - name: HTTPS_PROXY
      value: http://$user:$pass@$ip:3128
    - name: NO_PROXY
      value: test.no-proxy.com,$ips

3. Check the graph url works
# curl -sk --header Accept:application/json --output /dev/null --write-out "%{http_code}" "${POLICY_ENGINE_GRAPH_URI}?channel=stable-4.10"
200

4. Configure cluster to use the local osus
# oc adm upgrade 
Cluster version is 4.10.20

Upstream: https://sample-route-openshift-update-service.apps.yanyang-0628a.qe.gcp.devcluster.openshift.com/api/upgrades_info/v1/graph
Channel: candidate-4.11 (available channels: candidate-4.10, candidate-4.11)
No updates available. You may force an upgrade to a specific release image, but doing so may not be supported and may result in downtime or data loss.
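
For reference, the ${POLICY_ENGINE_GRAPH_URI} used in step 3 can be derived from the operand's route (a sketch; the `sample-route` name and the /api/upgrades_info/v1/graph path match the route and Upstream URL shown above):

POLICY_ENGINE_HOST="$(oc -n openshift-update-service get route sample-route -o jsonpath='{.spec.host}')"
POLICY_ENGINE_GRAPH_URI="https://${POLICY_ENGINE_HOST}/api/upgrades_info/v1/graph"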

Moving it to verified state.

Comment 6 errata-xmlrpc 2022-08-03 11:50:52 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (RHEA: OSUS Enhancement Update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2022:5871

