Version: 4.8.0-fc.7

Steps to reproduce:

1. Deploy an IPv6 cluster with a proxy.

[root@f29-h06-000-r640 ~]# oc get proxy cluster -o yaml
apiVersion: config.openshift.io/v1
kind: Proxy
metadata:
  creationTimestamp: "2021-06-02T19:45:33Z"
  generation: 1
  name: cluster
  resourceVersion: "671"
  uid: d0c6c2b5-fb10-4864-975f-3c45afb5ab5f
spec:
  httpProxy: http://[1000::beef]:3128
  httpsProxy: http://[1000::beef]:3128
  noProxy: 1000::/64,.vlan614.rdu2.scalelab.redhat.com,fd01::/48,fd02::/112
  trustedCA:
    name: ""
status:
  httpProxy: http://[1000::beef]:3128
  httpsProxy: http://[1000::beef]:3128
  noProxy: .cluster.local,.svc,.vlan614.rdu2.scalelab.redhat.com,1000::/64,127.0.0.1,api-int.vlan614.rdu2.scalelab.redhat.com,fd01::/48,fd02::/112,localhost

[root@f29-h06-000-r640 ~]# oc get network cluster -o yaml
apiVersion: config.openshift.io/v1
kind: Network
metadata:
  creationTimestamp: "2021-06-02T19:45:33Z"
  generation: 2
  name: cluster
  resourceVersion: "4062"
  uid: 1227b6fd-29c4-45d6-ade7-d154260b3b4a
spec:
  clusterNetwork:
  - cidr: fd01::/48
    hostPrefix: 64
  externalIP:
    policy: {}
  networkType: OVNKubernetes
  serviceNetwork:
  - fd02::/112
status:
  clusterNetwork:
  - cidr: fd01::/48
    hostPrefix: 64
  clusterNetworkMTU: 1400
  networkType: OVNKubernetes
  serviceNetwork:
  - fd02::/112
[root@f29-h06-000-r640 ~]#

2. Install assisted-service.

3. Create the agentclusterinstall.

[root@f29-h06-000-r640 ~]# oc get agentclusterinstall sno-agent-cluster-install -o json|jq ".spec"
{
  "clusterDeploymentRef": {
    "name": "sno-cluster-deployment"
  },
  "imageSetRef": {
    "name": "4.8"
  },
  "networking": {
    "clusterNetwork": [
      {
        "cidr": "fd01::/48",
        "hostPrefix": 64
      }
    ],
    "machineNetwork": [
      {
        "cidr": "1000::/64"
      }
    ],
    "serviceNetwork": [
      "fd02::/112"
    ]
  },
  "provisionRequirements": {
    "controlPlaneAgents": 1
  },
  "sshPublicKey": "ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQCzwAz3fnZcrca7mY/kVFpQGS2yI1uGd/+t3PMJn/C7Ppj1uIG32ufHkTq+SXh8Zg3xcy9v/Uome1mo3FP7PoGsWms5B9wzbooGhbA3rdph0/NxSzrHO3qcudcJsBM4GVJhcbFfbkzJVCPZQ94O/Y17oKjKuaBz69clPD29BlzKF4xCWzzbJW5Q8Y9tvWvDpCdVBM7VorpAn3MaA95xL6e15douWwwlhdI4dIOk/+8HcfgJnZGyOeLTnLVpjxQaFzTj3ScEud/5yd5wHcICrHH8Fbq419nN7VWjxbMNWUn182mcCCs0RXx2eyYq27yJvgkJS86n09SyLynX6ySqkFXN"
}

4. Create the clusterdeployment.

[root@f29-h06-000-r640 ~]# oc get cd single-node -o json|jq ".spec"
{
  "baseDomain": "qe.lab.redhat.com",
  "clusterInstallRef": {
    "group": "extensions.hive.openshift.io",
    "kind": "AgentClusterInstall",
    "name": "sno-agent-cluster-install",
    "version": "v1beta1"
  },
  "clusterName": "elvis",
  "controlPlaneConfig": {
    "servingCertificates": {}
  },
  "installed": false,
  "platform": {
    "agentBareMetal": {
      "agentSelector": {
        "matchLabels": {
          "bla": "aaa"
        }
      }
    }
  },
  "pullSecretRef": {
    "name": "pull-secret"
  }
}

5. Check the status of the agentclusterinstall.

oc get agentclusterinstall sno-agent-cluster-install -o json|jq ".status.conditions[0]"
{
  "lastProbeTime": "2021-06-04T14:13:50Z",
  "lastTransitionTime": "2021-06-04T14:13:50Z",
  "message": "The Spec could not be synced due to backend error: command oc adm release info -o template --template '{{.metadata.version}}' --insecure=false quay.io/openshift-release-dev/ocp-release:4.8.0-fc.7-x86_64 exited with non-zero exit code 1: \nerror: unable to read image quay.io/openshift-release-dev/ocp-release:4.8.0-fc.7-x86_64: Get \"https://quay.io/v2/\": dial tcp 3.213.173.170:443: connect: network is unreachable\n",
  "reason": "BackendError",
  "status": "False",
  "type": "SpecSynced"
}

Expected result:
Since the cluster uses a proxy, the release image was expected to be pulled successfully through it.
I have no problem running pods with images from quay on that cluster, without mirroring.
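A quick way to sanity-check the proxy path from a node (just a sketch; it assumes curl is available on the host and reuses the proxy address from the cluster Proxy object above):

curl -x http://[1000::beef]:3128 -sI https://quay.io/v2/

If the proxy is doing its job, this should come back with an HTTP response from quay.io rather than "network is unreachable".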
@sasha How do you install assisted service in step #2? It looks like the assisted service deployment does not use the proxy for some reason.
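One thing worth checking (a sketch; the deployment name assisted-service and the namespace assisted-installer are assumptions based on the operator defaults):

oc -n assisted-installer set env deploy/assisted-service --list | grep -i proxy

If that prints nothing, the service pods never received HTTP_PROXY/HTTPS_PROXY/NO_PROXY, and any direct call to quay.io would bypass the proxy.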
apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
  name: assisted-service-operator
  namespace: "assisted-installer"
spec:
  channel: "alpha"
  config:
    env:
    - name: IPV6_SUPPORT
      value: "true"
    - name: OPENSHIFT_VERSIONS
      value: '{"4.6":{"display_name":"4.6.16","release_version":"4.6.16","release_image":"quay.io/openshift-release-dev/ocp-release:4.6.16-x86_64","rhcos_image":"https://mirror.openshift.com/pub/openshift-v4/dependencies/rhcos/4.6/4.6.8/rhcos-4.6.8-x86_64-live.x86_64.iso","rhcos_rootfs":"https://mirror.openshift.com/pub/openshift-v4/dependencies/rhcos/4.6/4.6.8/rhcos-live-rootfs.x86_64.img","rhcos_version":"46.82.202012051820-0","support_level":"production"},"4.7":{"display_name":"4.7.9","release_version":"4.7.9","release_image":"quay.io/openshift-release-dev/ocp-release:4.7.9-x86_64","rhcos_image":"https://mirror.openshift.com/pub/openshift-v4/dependencies/rhcos/4.7/4.7.7/rhcos-4.7.7-x86_64-live.x86_64.iso","rhcos_rootfs":"https://mirror.openshift.com/pub/openshift-v4/dependencies/rhcos/4.7/4.7.7/rhcos-live-rootfs.x86_64.img","rhcos_version":"47.83.202103251640-0","support_level":"production","default":true},"4.8":{"display_name":"4.8.0-fc.7","release_version":"4.8.0-fc.7","release_image":"quay.io/openshift-release-dev/ocp-release:4.8.0-fc.7-x86_64","rhcos_image":"https://mirror.openshift.com/pub/openshift-v4/dependencies/rhcos/pre-release/4.8.0-fc.7/rhcos-4.8.0-fc.7-x86_64-live.x86_64.iso","rhcos_rootfs":"https://mirror.openshift.com/pub/openshift-v4/dependencies/rhcos/pre-release/4.8.0-fc.7/rhcos-live-rootfs.x86_64.img","rhcos_version":"48.84.202105281935-0","support_level":"beta"}}'
  installPlanApproval: Automatic
  name: assisted-service-operator
  source: assisted-installer-index
  sourceNamespace: openshift-marketplace
  startingCSV: assisted-service-operator.v0.0.4
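As a side note, OLM also allows forcing proxy settings onto the operator pod itself through the Subscription's spec.config.env. A minimal sketch (the proxy address and noProxy list are copied from the cluster Proxy object above):

spec:
  config:
    env:
    - name: HTTP_PROXY
      value: http://[1000::beef]:3128
    - name: HTTPS_PROXY
      value: http://[1000::beef]:3128
    - name: NO_PROXY
      value: 1000::/64,.vlan614.rdu2.scalelab.redhat.com,fd01::/48,fd02::/112

Note that this only affects the operator pod; it does not change the pods the operator creates.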
@sasha It's not a disconnected scenario, is it?
@alazar Do we support an IPv6 hub cluster in non-disconnected (connected?) environments?
We need to support IPv6, non-disconnected. However, the user will need to provide a registry mirror, since quay.io does not support IPv6.
@alazar but do we expect it to work with a proxy as in the bug's scenario? That is, a hub cluster talking to quay.io via an HTTP proxy.
If the hub cluster is configured with a proxy that can translate between IPv6 and IPv4, then it should work. Maybe the configuration here is wrong.
(In reply to vemporop from comment #3)
> @sasha It's not a disconnected scenario, is it?
> @alazar Do we support IPv6 hub cluster in non-disconnected (connected?) environments?

The cluster is disconnected.
This is the configuration we're using in our automation:

'proxy': {'httpProxy': 'http://[1001:db8::1]:3128',
          'httpsProxy': 'http://[1001:db8::1]:3128',
          'noProxy': '1001:db8::/120,2003:db8::/112,2002:db8::/53,.test-infra-cluster-assisted-installer.redhat.com'},

It does seem to be the same, so that's really weird. Test-infra runs this test on every PR with no issue whatsoever.
@sasha as it's a disconnected cluster, you should have set up image mirroring during the deployment of Assisted Installer. Trying to pull an image from quay.io doesn't make sense in this case.
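For example, a minimal ImageContentSourcePolicy redirecting the quay.io/ocpmetal images to a local mirror might look roughly like the sketch below (the mirror host mirror.example.com:5000 is a placeholder, and keep in mind that ICSP mirrors only apply to pulls by digest):

apiVersion: operator.openshift.io/v1alpha1
kind: ImageContentSourcePolicy
metadata:
  name: assisted-mirror
spec:
  repositoryDigestMirrors:
  - source: quay.io/ocpmetal
    mirrors:
    - mirror.example.com:5000/ocpmetal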
@sasha How did you manage to deploy the operator on the spoke cluster? I can't even reach a running assisted service when trying to use IPv6 with a proxy. It looks like opm has the same problem of not respecting the cluster-wide proxy.

# oc logs quay-io-ocpmetal-assisted-service-operator-bundle-latest -n assisted-installer
time="2021-06-15T13:42:04Z" level=info msg="adding to the registry" bundles="[quay.io/ocpmetal/assisted-service-operator-bundle:latest]"
time="2021-06-15T13:42:04Z" level=error msg="permissive mode disabled" bundles="[quay.io/ocpmetal/assisted-service-operator-bundle:latest]" error="[error resolving name : failed to do request: Head \"https://quay.io/v2/ocpmetal/assisted-service-operator-bundle/manifests/latest\": dial tcp 34.224.196.162:443: connect: network is unreachable, image \"quay.io/ocpmetal/assisted-service-operator-bundle:latest\": not found]"
Error: [error resolving name : failed to do request: Head "https://quay.io/v2/ocpmetal/assisted-service-operator-bundle/manifests/latest": dial tcp 34.224.196.162:443: connect: network is unreachable, image "quay.io/ocpmetal/assisted-service-operator-bundle:latest": not found]
Usage:
  opm registry add [flags]

Flags:
  -b, --bundle-images strings   comma separated list of links to bundle image
      --ca-file string          the root certificates to use when --container-tool=none; see docker/podman docs for certificate loading instructions
  -c, --container-tool string   tool to interact with container images (save, build, etc.). One of: [none, docker, podman] (default "none")
  -d, --database string         relative path to database file (default "bundles.db")
      --debug                   enable debug logging
  -h, --help                    help for add
      --mode string             graph update mode that defines how channel graphs are updated. One of: [replaces, semver, semver-skippatch] (default "replaces")
      --permissive              allow registry load errors

Global Flags:
      --skip-tls   skip TLS certificate verification for container image registries while pulling bundles or index
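If reaching quay.io from opm isn't an option, the usual workaround is to mirror the index image and point the CatalogSource at the mirror. A rough sketch, assuming the index image is quay.io/ocpmetal/assisted-service-index:latest (not confirmed in this bug) and the mirror registry host <FQDN>:5000 is a placeholder:

oc adm catalog mirror quay.io/ocpmetal/assisted-service-index:latest <FQDN>:5000
# then point the assisted-installer-index CatalogSource at the mirrored index image
# and apply the imageContentSourcePolicy.yaml manifest the command generates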
In the hub cluster, I appended the following entries to the spec of the assisted-service-operator.v0.0.4 CSV:

env:
- name: IPV6_SUPPORT
  value: "true"
- name: SERVICE_IMAGE
  value: <FQDN>:5000/ocpmetal/assisted-service:latest
- name: DATABASE_IMAGE
  value: <FQDN>:5000/ocpmetal/postgresql-12-centos7:latest
- name: AGENT_IMAGE
  value: <FQDN>:5000/ocpmetal/assisted-installer-agent:latest
- name: CONTROLLER_IMAGE
  value: <FQDN>:5000/ocpmetal/assisted-installer-controller:latest
- name: INSTALLER_IMAGE
  value: <FQDN>:5000/ocpmetal/assisted-installer:latest
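To confirm the override actually landed in the rendered operator deployment, something like this can be used (a sketch; the jsonpath assumes the operator deployment is the first entry in the CSV's install strategy):

oc -n assisted-installer get csv assisted-service-operator.v0.0.4 \
  -o jsonpath='{.spec.install.spec.deployments[0].spec.template.spec.containers[0].env}'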
Ok, let me sum up.

1. You have a cluster running over IPv6.
2. An IPv6 cluster cannot directly access quay.io (or any website, for that matter, because in our lab we don't have IPv6 routing outside).
3. Therefore, you installed the assisted operator by using a mirror registry.
4. You also have a cluster-wide proxy configured on the cluster.
5. When using the operator to install clusters (spokes), you wanted it to use the proxy instead of the mirror registry.
6. Assisted service (run by the operator) failed to access quay.io, which it needs in order to extract some info from the OCP release image.
7. Apparently, the service was not using the proxy.

According to my investigation, this particular bug may be caused by the fact that we don't copy the proxy definitions from the operator to the service pods. The operator gets these definitions automatically from OLM, but does not propagate them. See https://docs.openshift.com/container-platform/4.7/operators/admin/olm-configuring-proxy-support.html:

> Operators must handle setting environment variables for proxy settings in the pods for any managed Operands.

There also seems to be a broader problem of components not using the cluster-wide proxy configuration, which you won't encounter when using a mirror registry.
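For illustration, this is roughly the env block the operator would need to render into the assisted-service Deployment it manages, copying the values it itself receives from OLM (a sketch only; the container name and exact variable casing are assumptions, and the values are taken from the cluster Proxy object in the report):

spec:
  template:
    spec:
      containers:
      - name: assisted-service   # container name assumed
        env:
        - name: HTTP_PROXY
          value: http://[1000::beef]:3128
        - name: HTTPS_PROXY
          value: http://[1000::beef]:3128
        - name: NO_PROXY
          value: .cluster.local,.svc,.vlan614.rdu2.scalelab.redhat.com,1000::/64,127.0.0.1,api-int.vlan614.rdu2.scalelab.redhat.com,fd01::/48,fd02::/112,localhost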
Related bug https://bugzilla.redhat.com/show_bug.cgi?id=1972417
Submitted a fix. I was able to install the operator and apply CRs on an IPv6-only cluster using a proxy - without a mirror registry. I'm not sure why operator installation on IPv6 without a mirror failed before that, but it's OK now and I could test the fix.
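For anyone verifying, the same check as in the original report can be used; on a fixed build the SpecSynced condition should report status True instead of BackendError:

oc get agentclusterinstall sno-agent-cluster-install -o json | jq '.status.conditions[] | select(.type=="SpecSynced")'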
@alazar Don't we want this one in 4.8 also?
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Moderate: OpenShift Container Platform 4.9.0 bug fix and security update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2021:3759