Hi Evan, could you please take a look? This bug is blocking our current testing. Reproduction steps:

1. Install OCP 4.1.4.

2. Add a new RHEL 7 worker.

3. Create a project:
# oc create ns kubevirt-hyperconverged
# oc project kubevirt-hyperconverged

4. Log in to the web console, select the kubevirt-hyperconverged project, click Catalog -> Operator Management, click Create Subscription: no error.

5. Create an OperatorGroup:
[root@hp-dl360g9-16 ~]# cat <<EOF | oc create -f -
> apiVersion: operators.coreos.com/v1alpha2
> kind: OperatorGroup
> metadata:
>   name: hco-operatorgroup
>   namespace: kubevirt-hyperconverged
> EOF
operatorgroup.operators.coreos.com/hco-operatorgroup created

6. Add a patch for insecureRegistries:
# oc patch --type=merge --patch='{"spec":{"registrySources":{"insecureRegistries":["brew-pulp-docker01.web.prod.ext.phx2.redhat.com:8888"]}}}' image.config.openshift.io/cluster
image.config.openshift.io/cluster patched

Or edit /etc/containers/registries.conf:
[registries.insecure]
registries = ["brew-pulp-docker01.web.prod.ext.phx2.redhat.com:8888"]
# systemctl restart crio

7. Create a CatalogSource:
[root@hp-dl360g9-16 ~]# cat <<EOF | oc create -f -
> apiVersion: operators.coreos.com/v1alpha1
> kind: CatalogSource
> metadata:
>   name: hco-catalogsource
>   namespace: openshift-operator-lifecycle-manager
>   imagePullPolicy: Always
> spec:
>   sourceType: grpc
>   image: brew-pulp-docker01.web.prod.ext.phx2.redhat.com:8888/container-native-virtualization/hco-bundle-registry:v2.0.0-36
>   displayName: KubeVirt HyperConverged
>   publisher: Red Hat
> EOF
catalogsource.operators.coreos.com/hco-catalogsource created

8. Check that the pod hco-catalogsource-qzt4r is running:
[root@hp-dl360g9-16 ~]# oc get pods -n openshift-operator-lifecycle-manager
NAME                               READY   STATUS    RESTARTS   AGE
catalog-operator-dc45db975-zsgk9   1/1     Running   0          10d
hco-catalogsource-qzt4r            0/1     Running   0          54s
......

9. Log in to the web console, select the kubevirt-hyperconverged project, click Catalog -> Operator Management, click Create Subscription, and get an error:
https://user-images.githubusercontent.com/15416633/60945176-75f6d300-a31d-11e9-871b-242eb310538f.png

Regards,
Chenli Hu
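For reference, a quick way to check whether the catalog behind the failing Subscription page is actually serving packages (names taken from the steps above; this needs the live cluster, so the commands are illustrative only):

```shell
# Sketch only, using the CatalogSource name/namespace from the repro above;
# requires the affected cluster, so these commands are not runnable standalone.

# Inspect the CatalogSource status reported by the catalog operator:
oc -n openshift-operator-lifecycle-manager get catalogsource hco-catalogsource -o yaml

# If packageserver can reach the catalog, its packages show up here:
oc get packagemanifests -n kubevirt-hyperconverged
```

If the second command hangs or errors, that points at the packageserver/catalog connection rather than the web console itself.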
It looks like the hco catalogsource pod is not ready, right? Can you look at that pod or grab its logs? Even if that is the issue, the packageserver shouldn't 504 just because one catalog is bad. We will try to put together a small reproducer for this.
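For gathering the requested pod details, something like the following should do (pod name taken from the `oc get pods` output in the repro steps; these commands need the live cluster, so they are illustrative only):

```shell
# Sketch only: inspect the not-ready catalog source pod from the repro above.
# The pod name is the one shown earlier in this report and will differ on
# other clusters; requires access to the affected cluster.

# Events and readiness-probe failures show up in the describe output:
oc -n openshift-operator-lifecycle-manager describe pod hco-catalogsource-qzt4r

# Container logs from the registry server inside the pod:
oc -n openshift-operator-lifecycle-manager logs hco-catalogsource-qzt4r
```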
Evan,

> It looks like the hco catalogsource pod is not ready, right? Can you look at that pod or grab logs?

This issue also occurs after a RHEL 7 worker (location: Beijing) is added, even if nothing else is done. I have added the cluster info to the "Additional info:" section above for your further debugging.
Ryan, FYI
Hey Chenli,

I have been trying to reproduce this issue on a 4.1.4 cluster created by cluster-bot, but unfortunately I am unable to reproduce it at this point. There are two things I did differently from the steps you provided:

1. I didn't add a new RHEL 7 worker node, as the cluster already had multiple workers.

2. I tried to use the brew image reference, but it didn't work due to a VPN access problem with the AWS cluster I have. Instead, I used an actual "hco-bundle-registry" image from registry.redhat.io (registry.redhat.io/container-native-virtualization/hco-bundle-registry:v2.0.0).

In my case the "hco-catalogsource" pod was up and running as expected, no errors were observed, and I was able to create a Subscription for kubevirt-hyperconverged just fine.

Thanks,
Vu
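For anyone retrying this, a sketch of the alternate CatalogSource described above. The manifest shape follows the one in the repro steps; the name and namespace are carried over, and the image is the registry.redhat.io one mentioned here (a live cluster and registry.redhat.io pull credentials are assumed, so this is not runnable standalone):

```shell
# Sketch only: recreate the CatalogSource with the registry.redhat.io image
# mentioned above instead of the brew one. Requires a cluster with pull
# access to registry.redhat.io.
cat <<EOF | oc create -f -
apiVersion: operators.coreos.com/v1alpha1
kind: CatalogSource
metadata:
  name: hco-catalogsource
  namespace: openshift-operator-lifecycle-manager
spec:
  sourceType: grpc
  image: registry.redhat.io/container-native-virtualization/hco-bundle-registry:v2.0.0
  displayName: KubeVirt HyperConverged
  publisher: Red Hat
EOF
```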
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:0691