Description of problem:

ICSP doesn't work for the guest cluster. I know the root cause is that the MCO is not installed for the guest cluster, but this blocks some OLM operator testing, so I'd like to use this bug to track it. Please feel free to close it if you have any concerns, thanks!

mac:~ jianzhang$ oc get pods
NAME                                   READY   STATUS             RESTARTS   AGE
descheduler-operator-f7b4b55d6-ngkgg   0/1     ImagePullBackOff   0          143m
mac:~ jianzhang$ oc get pods -o wide
NAME                                   READY   STATUS             RESTARTS   AGE    IP            NODE                                        NOMINATED NODE   READINESS GATES
descheduler-operator-f7b4b55d6-ngkgg   0/1     ImagePullBackOff   0          143m   10.135.1.64   ip-10-0-133-27.us-east-2.compute.internal   <none>           <none>

  Warning  Failed   121m (x4 over 122m)     kubelet  Error: ErrImagePull
  Normal   BackOff  2m32s (x527 over 122m)  kubelet  Back-off pulling image "registry.redhat.io/openshift4/ose-cluster-kube-descheduler-operator@sha256:b860f12876c18ee353d732076133a613e321305aa32255f268d952f5e03ed3a3"

Version-Release number of selected component (if applicable):
4.11

How reproducible:
always

Steps to Reproduce:
1. Install a HyperShift cluster.
2. Create an ICSP resource for unreleased image testing (registry.redhat.io).
mac:~ jianzhang$ oc get imagecontentsourcepolicy brew-registry -o yaml
apiVersion: operator.openshift.io/v1alpha1
kind: ImageContentSourcePolicy
metadata:
  annotations:
    kubectl.kubernetes.io/last-applied-configuration: |
      {"apiVersion":"operator.openshift.io/v1alpha1","kind":"ImageContentSourcePolicy","metadata":{"annotations":{},"name":"brew-registry"},"spec":{"repositoryDigestMirrors":[{"mirrors":["brew.registry.redhat.io"],"source":"registry.redhat.io"},{"mirrors":["brew.registry.redhat.io"],"source":"registry.stage.redhat.io"},{"mirrors":["brew.registry.redhat.io"],"source":"registry-proxy.engineering.redhat.com"}]}}
  creationTimestamp: "2022-05-23T08:15:58Z"
  generation: 1
  name: brew-registry
  resourceVersion: "44804"
  uid: d7069bc2-e2fb-4573-9bbc-4ff82fbdf80b
spec:
  repositoryDigestMirrors:
  - mirrors:
    - brew.registry.redhat.io
    source: registry.redhat.io
  - mirrors:
    - brew.registry.redhat.io
    source: registry.stage.redhat.io
  - mirrors:
    - brew.registry.redhat.io
    source: registry-proxy.engineering.redhat.com

3. Check whether the ICSP takes effect.
mac:~ jianzhang$ oc debug node/ip-10-0-133-27.us-east-2.compute.internal
Warning: would violate PodSecurity "restricted:latest": host namespaces (hostNetwork=true, hostPID=true), hostPath volumes (volume "host"), privileged (container "container-00" must not set securityContext.privileged=true), allowPrivilegeEscalation != false (container "container-00" must set securityContext.allowPrivilegeEscalation=false), unrestricted capabilities (container "container-00" must set securityContext.capabilities.drop=["ALL"]), restricted volume types (volume "host" uses restricted volume type "hostPath"), runAsNonRoot != true (pod or container "container-00" must set securityContext.runAsNonRoot=true), runAsUser=0 (container "container-00" must not set runAsUser=0), seccompProfile (pod or container "container-00" must set securityContext.seccompProfile.type to "RuntimeDefault" or "Localhost")
Starting pod/ip-10-0-133-27us-east-2computeinternal-debug ...
To use host binaries, run `chroot /host`
Pod IP: 10.0.133.27
If you don't see a command prompt, try pressing enter.
sh-4.4# chroot /host
sh-4.4#
sh-4.4# cat /etc/containers/registries.conf
unqualified-search-registries = ['registry.access.redhat.com', 'docker.io']

Actual results:
ICSP doesn't take effect.

Expected results:
ICSP can be applied successfully in the guest cluster.

Additional info:
The issue is that for HyperShift clusters, the ICSP inside the guest cluster is completely ignored. However, the same mapping can be specified on the HostedCluster resource itself: https://github.com/openshift/hypershift/blob/85a1e6b31352cf9ca3393e3cdce37dc6599faf23/api/v1alpha1/hostedcluster_types.go#L244
The specification mentioned by Cesar to support ICSP on a HyperShift guest cluster has been verified. The specification in the `hostedcluster` resource is:

- apiVersion: hypershift.openshift.io/v1alpha1
  kind: HostedCluster
  metadata:
    name: guest-cluster
    namespace: clusters
  spec:
    imageContentSources:
    - mirrors:
      - heli-dis-0601v2.mirror-registry.qe.azure.devcluster.openshift.com:6002
      source: registry.stage.redhat.io

1) When the guest cluster is created for the first time, the ICSP specification takes effect when the worker nodes are created.
2) For day 2 operations, you only need to edit the `hostedcluster` resource. The original worker nodes are destroyed and new nodes are created with the newest ICSP spec.
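As a rough illustration of how an in-cluster ICSP lines up with this field, the sketch below copies each `repositoryDigestMirrors` entry into the `source`/`mirrors` shape used by `imageContentSources`. The function name `icsp_to_image_content_sources` is hypothetical, the input is the `brew-registry` ICSP from the original report, and the one-to-one field mapping is an assumption drawn only from the two YAML snippets in this bug, not from the HyperShift source.

```python
import json


def icsp_to_image_content_sources(icsp):
    """Copy ICSP spec.repositoryDigestMirrors entries into the
    source/mirrors shape shown for HostedCluster spec.imageContentSources.

    Assumption: the two fields carry the same information one-to-one.
    """
    entries = icsp.get("spec", {}).get("repositoryDigestMirrors", [])
    return [
        {"source": entry["source"], "mirrors": list(entry.get("mirrors", []))}
        for entry in entries
    ]


# The brew-registry ICSP from the original report, reduced to its spec.
icsp = {
    "apiVersion": "operator.openshift.io/v1alpha1",
    "kind": "ImageContentSourcePolicy",
    "metadata": {"name": "brew-registry"},
    "spec": {
        "repositoryDigestMirrors": [
            {"mirrors": ["brew.registry.redhat.io"],
             "source": "registry.redhat.io"},
            {"mirrors": ["brew.registry.redhat.io"],
             "source": "registry.stage.redhat.io"},
            {"mirrors": ["brew.registry.redhat.io"],
             "source": "registry-proxy.engineering.redhat.com"},
        ]
    },
}

sources = icsp_to_image_content_sources(icsp)

# Fragment to merge into the HostedCluster spec.
fragment = {"spec": {"imageContentSources": sources}}
print(json.dumps(fragment, indent=2))
```

The printed fragment is only the part of the HostedCluster that changes; it still has to be added to the `hostedcluster` resource by hand, for example with `oc edit hostedcluster`.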
Thanks, He! After adding the `imageContentSources` by updating the hostedcluster, I can see the new worker nodes created (about 15 minutes later).

mac:~ jianzhang$ oc edit hostedcluster hypershift-ci-24127
mac:~ jianzhang$ oc get nodes
NAME                                         STATUS   ROLES    AGE     VERSION
ip-10-0-131-251.us-east-2.compute.internal   Ready    worker   23m     v1.24.0+bb9c2f1
ip-10-0-131-38.us-east-2.compute.internal    Ready    worker   12m     v1.24.0+bb9c2f1
ip-10-0-133-13.us-east-2.compute.internal    Ready    worker   7m29s   v1.24.0+bb9c2f1
ip-10-0-133-55.us-east-2.compute.internal    Ready    worker   17m     v1.24.0+bb9c2f1
mac:~ jianzhang$ oc debug node/ip-10-0-131-251.us-east-2.compute.internal
Warning: would violate PodSecurity "restricted:latest": host namespaces (hostNetwork=true, hostPID=true), privileged (container "container-00" must not set securityContext.privileged=true), allowPrivilegeEscalation != false (container "container-00" must set securityContext.allowPrivilegeEscalation=false), unrestricted capabilities (container "container-00" must set securityContext.capabilities.drop=["ALL"]), restricted volume types (volume "host" uses restricted volume type "hostPath"), runAsNonRoot != true (pod or container "container-00" must set securityContext.runAsNonRoot=true), runAsUser=0 (container "container-00" must not set runAsUser=0), seccompProfile (pod or container "container-00" must set securityContext.seccompProfile.type to "RuntimeDefault" or "Localhost")
Starting pod/ip-10-0-131-251us-east-2computeinternal-debug ...
To use host binaries, run `chroot /host`
Pod IP: 10.0.131.251
If you don't see a command prompt, try pressing enter.
sh-4.4# chroot /host
sh-4.4# cat /etc/containers/registries.conf
unqualified-search-registries = ["registry.access.redhat.com", "docker.io"]
short-name-mode = ""

[[registry]]
  prefix = ""
  location = "registry-proxy.engineering.redhat.com"
  mirror-by-digest-only = true

  [[registry.mirror]]
    location = "brew.registry.redhat.io"

[[registry]]
  prefix = ""
  location = "registry.redhat.io"
  mirror-by-digest-only = true

  [[registry.mirror]]
    location = "brew.registry.redhat.io"

[[registry]]
  prefix = ""
  location = "registry.stage.redhat.io"
  mirror-by-digest-only = true

  [[registry.mirror]]
    location = "brew.registry.redhat.io"
sh-4.4# exit
exit
sh-4.4# exit
exit

The OLM operators can be installed successfully, thanks!

mac:~ jianzhang$ oc get sub -A
NAMESPACE   NAME                                PACKAGE                             SOURCE             CHANNEL
default     cluster-kube-descheduler-operator   cluster-kube-descheduler-operator   redhat-operators   stable
mac:~ jianzhang$ oc get ip -n default
NAME            CSV                                                 APPROVAL    APPROVED
install-sgb5f   clusterkubedescheduleroperator.4.9.0-202205311507   Automatic   true
mac:~ jianzhang$ oc get csv -n default
NAME                                                DISPLAY                     VERSION              REPLACES   PHASE
clusterkubedescheduleroperator.4.9.0-202205311507   Kube Descheduler Operator   4.9.0-202205311507              Succeeded
mac:~ jianzhang$ oc get pods -n default
NAME                                   READY   STATUS    RESTARTS   AGE
descheduler-operator-5df6774f8-nkcc6   1/1     Running   0          20s

Hopefully the guest cluster won't need to restart or recreate the worker nodes for this in the future. But for now, LGTM.
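For anyone repeating this check on several nodes, a small helper can say which configured sources are still missing from a node's `registries.conf`. The sketch below is only an illustration: `render_registries` and `missing_sources` are hypothetical names, the rendered layout simply imitates the file printed above (including the alphabetical ordering seen there), and nothing here reflects how HyperShift actually generates the file.

```python
def render_registries(image_content_sources):
    """Render [[registry]] stanzas in the layout observed on the node.

    Assumption: one stanza per source, sorted by source name, with
    mirror-by-digest-only enabled, as in the output pasted in this bug.
    """
    blocks = []
    for entry in sorted(image_content_sources, key=lambda e: e["source"]):
        lines = [
            "[[registry]]",
            '  prefix = ""',
            f'  location = "{entry["source"]}"',
            "  mirror-by-digest-only = true",
        ]
        for mirror in entry["mirrors"]:
            lines += ["", "  [[registry.mirror]]", f'    location = "{mirror}"']
        blocks.append("\n".join(lines))
    return "\n\n".join(blocks)


def missing_sources(conf_text, image_content_sources):
    """Return the sources that have no matching location line in conf_text."""
    stripped = {line.strip() for line in conf_text.splitlines()}
    return [
        entry["source"]
        for entry in image_content_sources
        if f'location = "{entry["source"]}"' not in stripped
    ]


wanted = [
    {"source": "registry.redhat.io", "mirrors": ["brew.registry.redhat.io"]},
    {"source": "registry.stage.redhat.io", "mirrors": ["brew.registry.redhat.io"]},
]

# The file from the first debug session in this report, before the fix.
before = "unqualified-search-registries = ['registry.access.redhat.com', 'docker.io']"

print(missing_sources(before, wanted))
print(render_registries(wanted))
```

Feeding `missing_sources` the text of `/etc/containers/registries.conf` from a node and the list configured on the hostedcluster gives a quick yes/no on whether that node has picked up the mirrors.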
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Important: OpenShift Container Platform 4.11.0 bug fix and security update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2022:5069