Description of problem:
After applying a community catalog source, the catalog pod restarts continuously because its health check fails.

Version-Release number of selected component (if applicable):
All 4.3.* versions.

How reproducible:
On AWS it seems OK, apparently because hardware performance there is good enough that the health check on the catalog source returns OK.

Steps to Reproduce:
1. Create the community catalog source on a platform other than AWS:

cat <<EOF > community-operator-catalog.yaml
apiVersion: operators.coreos.com/v1alpha1
kind: CatalogSource
metadata:
  name: community-operator-catalog
  namespace: openshift-marketplace
spec:
  displayName: Community Operator Catalog
  sourceType: grpc
  image: docker.io/wangzheng422/operator-catalog:community-2020-02-29
  publisher: Community
EOF
oc create -f community-operator-catalog.yaml

2. Observe that the pod restarts repeatedly.

Actual results:
The community operator catalog pod keeps restarting because the health check fails.

Expected results:
The community operator catalog pod runs without restarting.

Additional info:
I have fixed this and submitted a GitHub pull request.
*** Bug 1816986 has been marked as a duplicate of this bug. ***
Please note that the root cause of this issue has been addressed in a recent PR:
https://github.com/operator-framework/operator-registry/pull/227

The issue is that on some systems, the lack of an nsswitch config means that DNS resolution tries external sources before local files, so resolving `localhost` can take a long time.
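To check whether a given catalog image might be affected, one option is to inspect the `hosts:` line of its nsswitch configuration. The following is a minimal sketch, not part of the PR: the function name `check_hosts_order` is hypothetical, and it assumes the conventional `hosts: files dns` layout is what a healthy image should have.

```shell
#!/bin/sh
# Hypothetical helper (not from the PR): report the lookup order on the
# "hosts:" line of an nsswitch file.
# Prints one of: missing, no-hosts-line, files-first, dns-first, neither.
check_hosts_order() {
    conf="${1:-/etc/nsswitch.conf}"
    if [ ! -f "$conf" ]; then
        # No nsswitch config at all: the situation described above.
        echo "missing"
        return 0
    fi
    awk '
        /^[[:space:]]*hosts:/ {
            found = 1
            # Drop the key so only the source names remain as fields.
            sub(/^[[:space:]]*hosts:/, "")
            for (i = 1; i <= NF; i++) {
                if ($i == "files") { print "files-first"; exit }
                if ($i == "dns")   { print "dns-first";   exit }
            }
            print "neither"
            exit
        }
        END { if (!found) print "no-hosts-line" }
    ' "$conf"
}
```

Run inside the container (or against a copy of the file extracted from the image), `check_hosts_order` with no argument reads `/etc/nsswitch.conf`. A result of `missing` or `dns-first` would be consistent with the slow `localhost` resolution described here.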
[hui@localhost work]$ oc version
Client Version: 4.5.0-202004062101-f2b01c4
Server Version: 4.5.0-0.nightly-2020-04-14-221451
Kubernetes Version: v1.18.0-rc.1

Tested the case on Azure.

Steps to test:
1. Create the community catalog source on Azure:

cat <<EOF > community-operator-catalog.yaml
apiVersion: operators.coreos.com/v1alpha1
kind: CatalogSource
metadata:
  name: community-operator-catalog
  namespace: openshift-marketplace
spec:
  displayName: Community Operator Catalog
  sourceType: grpc
  image: docker.io/wangzheng422/operator-catalog:community-2020-02-29
  publisher: Community
EOF
oc create -f community-operator-catalog.yaml

2. The community operator catalog pod is running.

[hui@localhost test]$ oc get CatalogSource -n openshift-marketplace
NAME                         DISPLAY                      TYPE   PUBLISHER   AGE
certified-operators          Certified Operators          grpc   Red Hat     100m
community-operator-catalog   Community Operator Catalog   grpc   Community   9s
community-operators          Community Operators          grpc   Red Hat     100m
redhat-marketplace           Red Hat Marketplace          grpc   Red Hat     100m
redhat-operators             Red Hat Operators            grpc   Red Hat     100m

[hui@localhost test]$ oc get pods -n openshift-marketplace
NAME                                    READY   STATUS    RESTARTS   AGE
certified-operators-789976dc4-5ffsw     1/1     Running   0          88m
community-operator-catalog-cksdq        1/1     Running   0          10m
community-operators-5b948bd55-sb76t     1/1     Running   0          88m
marketplace-operator-6f979dc485-sgq6d   1/1     Running   0          89m
redhat-marketplace-5c87b57d76-xh2lz     1/1     Running   0          88m
redhat-operators-7c8fb9bcfd-4nwk9       1/1     Running   0          88m

The result is as expected.
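The RESTARTS column of `oc get pods` is what separates the failing case from the fixed one. As a rough sketch for repeating this check, the hypothetical helper below (`restarting_pods`, not part of any tool) filters that table down to pods with a non-zero restart count. It assumes the default table output with a header row.

```shell
#!/bin/sh
# Hypothetical helper: read `oc get pods` table output on stdin and print the
# name and restart count of every pod whose RESTARTS value is above zero.
restarting_pods() {
    awk '
        NR == 1 {
            # Locate the RESTARTS column from the header instead of hard-coding it.
            for (i = 1; i <= NF; i++) if ($i == "RESTARTS") col = i
            next
        }
        col && ($col + 0) > 0 { print $1, $col + 0 }
    '
}
```

Usage would be `oc get pods -n openshift-marketplace | restarting_pods`. On the pod listing in this comment it prints nothing, since every pod shows 0 restarts.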
The bug has been verified.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2020:2409