Bug 2089720 - [Hypershift] ICSP doesn't work for the guest cluster
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: HyperShift
Version: 4.11
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: medium
Target Milestone: ---
Target Release: 4.11.0
Assignee: Alberto
QA Contact: He Liu
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2022-05-24 10:23 UTC by Jian Zhang
Modified: 2022-08-10 11:13 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2022-08-10 11:13:40 UTC
Target Upstream Version:
Embargoed:


Links
System ID Private Priority Status Summary Last Updated
Red Hat Product Errata RHSA-2022:5069 0 None None None 2022-08-10 11:13:54 UTC

Description Jian Zhang 2022-05-24 10:23:31 UTC
Description of problem:
ICSP doesn't work for the guest cluster. I know the root cause is that the MCO is not installed for the guest cluster, but it blocks testing of some OLM operators, so I'd like to use this bug to track it. Please feel free to close it if you have any concerns, thanks!

mac:~ jianzhang$ oc get pods
NAME                                   READY   STATUS             RESTARTS   AGE
descheduler-operator-f7b4b55d6-ngkgg   0/1     ImagePullBackOff   0          143m
mac:~ jianzhang$ oc get pods -o wide
NAME                                   READY   STATUS             RESTARTS   AGE    IP            NODE                                        NOMINATED NODE   READINESS GATES
descheduler-operator-f7b4b55d6-ngkgg   0/1     ImagePullBackOff   0          143m   10.135.1.64   ip-10-0-133-27.us-east-2.compute.internal   <none>           <none>

  Warning  Failed          121m (x4 over 122m)     kubelet            Error: ErrImagePull
  Normal   BackOff         2m32s (x527 over 122m)  kubelet            Back-off pulling image "registry.redhat.io/openshift4/ose-cluster-kube-descheduler-operator@sha256:b860f12876c18ee353d732076133a613e321305aa32255f268d952f5e03ed3a3"

Version-Release number of selected component (if applicable):
4.11

How reproducible:
always

Steps to Reproduce:
1. Install a Hypershift cluster.

2. Create an ICSP resource for testing unreleased images (registry.redhat.io).
mac:~ jianzhang$ oc get imagecontentsourcepolicy brew-registry -o yaml
apiVersion: operator.openshift.io/v1alpha1
kind: ImageContentSourcePolicy
metadata:
  annotations:
    kubectl.kubernetes.io/last-applied-configuration: |
      {"apiVersion":"operator.openshift.io/v1alpha1","kind":"ImageContentSourcePolicy","metadata":{"annotations":{},"name":"brew-registry"},"spec":{"repositoryDigestMirrors":[{"mirrors":["brew.registry.redhat.io"],"source":"registry.redhat.io"},{"mirrors":["brew.registry.redhat.io"],"source":"registry.stage.redhat.io"},{"mirrors":["brew.registry.redhat.io"],"source":"registry-proxy.engineering.redhat.com"}]}}
  creationTimestamp: "2022-05-23T08:15:58Z"
  generation: 1
  name: brew-registry
  resourceVersion: "44804"
  uid: d7069bc2-e2fb-4573-9bbc-4ff82fbdf80b
spec:
  repositoryDigestMirrors:
  - mirrors:
    - brew.registry.redhat.io
    source: registry.redhat.io
  - mirrors:
    - brew.registry.redhat.io
    source: registry.stage.redhat.io
  - mirrors:
    - brew.registry.redhat.io
    source: registry-proxy.engineering.redhat.com


3. Check whether the ICSP takes effect.

mac:~ jianzhang$ oc debug node/ip-10-0-133-27.us-east-2.compute.internal
Warning: would violate PodSecurity "restricted:latest": host namespaces (hostNetwork=true, hostPID=true), hostPath volumes (volume "host"), privileged (container "container-00" must not set securityContext.privileged=true), allowPrivilegeEscalation != false (container "container-00" must set securityContext.allowPrivilegeEscalation=false), unrestricted capabilities (container "container-00" must set securityContext.capabilities.drop=["ALL"]), restricted volume types (volume "host" uses restricted volume type "hostPath"), runAsNonRoot != true (pod or container "container-00" must set securityContext.runAsNonRoot=true), runAsUser=0 (container "container-00" must not set runAsUser=0), seccompProfile (pod or container "container-00" must set securityContext.seccompProfile.type to "RuntimeDefault" or "Localhost")
Starting pod/ip-10-0-133-27us-east-2computeinternal-debug ...
To use host binaries, run `chroot /host`
Pod IP: 10.0.133.27
If you don't see a command prompt, try pressing enter.
sh-4.4# chroot /host
sh-4.4#
sh-4.4# cat /etc/containers/registries.conf
unqualified-search-registries = ['registry.access.redhat.com', 'docker.io']
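
The check in step 3 can also be scripted. The following is a minimal sketch, not part of the original report: it reads the text of a registries.conf and lists the mirrors configured per source. The function name `mirrors_by_source` is hypothetical, and the reader only understands the simple `[[registry]]` / `[[registry.mirror]]` layout that appears in this bug; it is not a general TOML parser.

```python
from __future__ import annotations


def mirrors_by_source(registries_conf: str) -> dict[str, list[str]]:
    """Map each [[registry]] location to its [[registry.mirror]] locations.

    Hypothetical helper: a line-based reader for the layout seen in this
    bug only, not a general TOML parser.
    """
    result: dict[str, list[str]] = {}
    current = None      # location of the [[registry]] block being read
    in_mirror = False   # True while inside a [[registry.mirror]] table
    for raw in registries_conf.splitlines():
        line = raw.strip()
        if line == "[[registry]]":
            current, in_mirror = None, False
        elif line == "[[registry.mirror]]":
            in_mirror = True
        elif line.startswith("location"):
            value = line.split("=", 1)[1].strip().strip("\"'")
            if in_mirror and current is not None:
                result[current].append(value)
            elif not in_mirror:
                current = value
                result.setdefault(current, [])
    return result


# The file as found on the node in step 3: no [[registry]] entries at all.
conf_on_node = "unqualified-search-registries = ['registry.access.redhat.com', 'docker.io']\n"
print(mirrors_by_source(conf_on_node))  # prints {}
```

An empty result for the file above matches the observation that none of the ICSP mirrors reached the node.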


Actual results:
The ICSP doesn't take effect.

Expected results:
The ICSP takes effect in the guest cluster.


Additional info:

Comment 1 Cesar Wong 2022-05-24 20:03:32 UTC
The issue is that for HyperShift clusters, the ICSP inside the guest cluster is completely ignored.
However, the same mapping can be specified on the HostedCluster resource itself:
https://github.com/openshift/hypershift/blob/85a1e6b31352cf9ca3393e3cdce37dc6599faf23/api/v1alpha1/hostedcluster_types.go#L244
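
Since both resources in this bug use `source` / `mirrors` pairs, moving an existing ICSP to the HostedCluster is mostly a copy. A minimal sketch, assuming the field names that appear in this bug (`spec.repositoryDigestMirrors` on the ICSP, `spec.imageContentSources` on the HostedCluster); the function name is hypothetical and the authoritative schema is the linked hostedcluster_types.go:

```python
def icsp_to_image_content_sources(icsp: dict) -> list:
    """Copy the source/mirrors pairs out of an ICSP manifest (as a dict).

    Hypothetical helper; assumes the field names shown in this bug.
    """
    sources = []
    for entry in icsp.get("spec", {}).get("repositoryDigestMirrors", []):
        sources.append({
            "source": entry["source"],
            "mirrors": list(entry.get("mirrors", [])),
        })
    return sources


# The brew-registry ICSP from the description, reduced to the relevant fields.
icsp = {
    "apiVersion": "operator.openshift.io/v1alpha1",
    "kind": "ImageContentSourcePolicy",
    "metadata": {"name": "brew-registry"},
    "spec": {
        "repositoryDigestMirrors": [
            {"mirrors": ["brew.registry.redhat.io"], "source": "registry.redhat.io"},
            {"mirrors": ["brew.registry.redhat.io"], "source": "registry.stage.redhat.io"},
            {"mirrors": ["brew.registry.redhat.io"], "source": "registry-proxy.engineering.redhat.com"},
        ]
    },
}

# Fragment to place under the HostedCluster spec.
hosted_cluster_spec_fragment = {
    "imageContentSources": icsp_to_image_content_sources(icsp)
}
```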

Comment 3 He Liu 2022-06-13 15:45:18 UTC
The specification mentioned by Cesar to support ICSP on HyperShift guest cluster has been verified.

The specification in the `hostedcluster` resource is:

- apiVersion: hypershift.openshift.io/v1alpha1
  kind: HostedCluster
  metadata:
    name: guest-cluster
    namespace: clusters
  spec:
    imageContentSources:
    - mirrors:
      - heli-dis-0601v2.mirror-registry.qe.azure.devcluster.openshift.com:6002
      source: registry.stage.redhat.io

1) When the guest cluster is created for the first time, the ICSP specification takes effect when the worker nodes are created.
2) For day-2 operations, you just need to edit the `hostedcluster` resource. The original worker nodes are destroyed and new nodes are created with the newest ICSP spec.
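
For the day-2 case, the edit can also be prepared as a JSON merge patch instead of an interactive edit. A minimal sketch, not verified against this cluster: the function name `build_merge_patch` is hypothetical, and the example values are the ones from the spec above. Note that a JSON merge patch replaces a list wholesale, so the patch has to carry every mirror entry that should remain, not only the new one.

```python
import json


def build_merge_patch(image_content_sources: list) -> str:
    """Return a JSON merge patch that sets spec.imageContentSources.

    Hypothetical helper. Lists are replaced as a whole by a merge patch,
    so pass the complete desired list.
    """
    return json.dumps({"spec": {"imageContentSources": image_content_sources}})


patch = build_merge_patch([
    {
        "source": "registry.stage.redhat.io",
        "mirrors": [
            "heli-dis-0601v2.mirror-registry.qe.azure.devcluster.openshift.com:6002"
        ],
    }
])
print(patch)
```

The resulting string could then be passed to something like `oc patch hostedcluster guest-cluster -n clusters --type=merge -p "$PATCH"`; the resource name and namespace here are taken from the example above and would differ per environment.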

Comment 4 Jian Zhang 2022-06-14 07:22:39 UTC
Thanks, He!

After adding `imageContentSources` by updating the hostedcluster, I can see the new worker nodes created (about 15 minutes later).
mac:~ jianzhang$ oc edit hostedcluster hypershift-ci-24127 

mac:~ jianzhang$ oc get nodes
NAME                                         STATUS   ROLES    AGE     VERSION
ip-10-0-131-251.us-east-2.compute.internal   Ready    worker   23m     v1.24.0+bb9c2f1
ip-10-0-131-38.us-east-2.compute.internal    Ready    worker   12m     v1.24.0+bb9c2f1
ip-10-0-133-13.us-east-2.compute.internal    Ready    worker   7m29s   v1.24.0+bb9c2f1
ip-10-0-133-55.us-east-2.compute.internal    Ready    worker   17m     v1.24.0+bb9c2f1
mac:~ jianzhang$ oc debug node/ip-10-0-131-251.us-east-2.compute.internal
Warning: would violate PodSecurity "restricted:latest": host namespaces (hostNetwork=true, hostPID=true), privileged (container "container-00" must not set securityContext.privileged=true), allowPrivilegeEscalation != false (container "container-00" must set securityContext.allowPrivilegeEscalation=false), unrestricted capabilities (container "container-00" must set securityContext.capabilities.drop=["ALL"]), restricted volume types (volume "host" uses restricted volume type "hostPath"), runAsNonRoot != true (pod or container "container-00" must set securityContext.runAsNonRoot=true), runAsUser=0 (container "container-00" must not set runAsUser=0), seccompProfile (pod or container "container-00" must set securityContext.seccompProfile.type to "RuntimeDefault" or "Localhost")
Starting pod/ip-10-0-131-251us-east-2computeinternal-debug ...
To use host binaries, run `chroot /host`
Pod IP: 10.0.131.251
If you don't see a command prompt, try pressing enter.
sh-4.4# chroot /host
sh-4.4# cat /etc/containers/registries.conf
unqualified-search-registries = ["registry.access.redhat.com", "docker.io"]
short-name-mode = ""

[[registry]]
  prefix = ""
  location = "registry-proxy.engineering.redhat.com"
  mirror-by-digest-only = true

  [[registry.mirror]]
    location = "brew.registry.redhat.io"

[[registry]]
  prefix = ""
  location = "registry.redhat.io"
  mirror-by-digest-only = true

  [[registry.mirror]]
    location = "brew.registry.redhat.io"

[[registry]]
  prefix = ""
  location = "registry.stage.redhat.io"
  mirror-by-digest-only = true

  [[registry.mirror]]
    location = "brew.registry.redhat.io"
sh-4.4# exit
exit
sh-4.4# exit
exit

The OLM operators can be installed successfully, thanks!
mac:~ jianzhang$ oc get sub -A
NAMESPACE   NAME                                PACKAGE                             SOURCE             CHANNEL
default     cluster-kube-descheduler-operator   cluster-kube-descheduler-operator   redhat-operators   stable
mac:~ jianzhang$ oc get ip -n default
NAME            CSV                                                 APPROVAL    APPROVED
install-sgb5f   clusterkubedescheduleroperator.4.9.0-202205311507   Automatic   true
mac:~ jianzhang$ oc get csv -n default
NAME                                                DISPLAY                     VERSION              REPLACES   PHASE
clusterkubedescheduleroperator.4.9.0-202205311507   Kube Descheduler Operator   4.9.0-202205311507              Succeeded
mac:~ jianzhang$ oc get pods -n default
NAME                                   READY   STATUS    RESTARTS   AGE
descheduler-operator-5df6774f8-nkcc6   1/1     Running   0          20s


I hope the guest cluster won't need to restart or recreate the worker nodes for this in the future. But for now, LGTM.
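
To illustrate why the descheduler image now pulls, here is a simplified model of how the `[[registry]]` entries in the registries.conf above redirect a pull. It is a sketch, not the container runtime's actual code: the function name `pull_candidates` is hypothetical, it assumes every entry has `mirror-by-digest-only = true` as in the file shown, and it ignores the `prefix` field, which is empty in that file.

```python
from __future__ import annotations


def pull_candidates(image: str, mirrors: dict[str, list[str]]) -> list[str]:
    """Return the locations to try, in order, for an image reference.

    Simplified model (hypothetical helper): mirrors first, original
    location last, and only for digest references.
    """
    # mirror-by-digest-only = true: tag references are never redirected.
    if "@" not in image:
        return [image]
    # Use the longest configured source that is a prefix of the image name.
    matches = [
        s for s in mirrors
        if image.startswith(s + "/") or image.startswith(s + "@")
    ]
    if not matches:
        return [image]
    source = max(matches, key=len)
    suffix = image[len(source):]
    return [m + suffix for m in mirrors[source]] + [image]


# Mirror table equivalent to the registries.conf shown in this comment.
mirrors = {
    "registry-proxy.engineering.redhat.com": ["brew.registry.redhat.io"],
    "registry.redhat.io": ["brew.registry.redhat.io"],
    "registry.stage.redhat.io": ["brew.registry.redhat.io"],
}

image = (
    "registry.redhat.io/openshift4/ose-cluster-kube-descheduler-operator"
    "@sha256:b860f12876c18ee353d732076133a613e321305aa32255f268d952f5e03ed3a3"
)
for candidate in pull_candidates(image, mirrors):
    print(candidate)
```

Under this model, the first candidate for the image from the description is the brew.registry.redhat.io location, with the original registry.redhat.io reference kept as the last fallback.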

Comment 6 errata-xmlrpc 2022-08-10 11:13:40 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Important: OpenShift Container Platform 4.11.0 bug fix and security update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:5069

