Description of problem:

The catalog-operator pod restarts periodically (exit code 2). The previous container's termination message captures a goroutine dump, truncated at the head:

message: "her.go:105 +0xe5\ncreated by k8s.io/apimachinery/pkg/watch.NewStreamWatcher\n\t/build/vendor/k8s.io/apimachinery/pkg/watch/streamwatcher.go:76 +0x130\n\ngoroutine 5545 [select, 7 minutes]:\ngolang.org/x/net/http2.(*clientStream).writeRequest(0xc00240a780, 0xc003321a00)\n\t/build/vendor/golang.org/x/net/http2/transport.go:1345 +0x9c9\ngolang.org/x/net/http2.(*clientStream).doRequest(0xc002efea80?, 0xc0009cc7a0?)\n\t/build/vendor/golang.org/x/net/http2/transport.go:1207 +0x1e\ncreated by golang.org/x/net/http2.(*ClientConn).RoundTrip\n\t/build/vendor/golang.org/x/net/http2/transport.go:1136 +0x30a\n\ngoroutine 5678 [select, 3 minutes]:\ngolang.org/x/net/http2.(*clientStream).writeRequest(0xc000b70480, 0xc0035d4500)\n\t/build/vendor/golang.org/x/net/http2/transport.go:1345 +0x9c9\ngolang.org/x/net/http2.(*clientStream).doRequest(0x6e5326?, 0xc002999e90?)\n\t/build/vendor/golang.org/x/net/http2/transport.go:1207 +0x1e\ncreated by golang.org/x/net/http2.(*ClientConn).RoundTrip\n\t/build/vendor/golang.org/x/net/http2/transport.go:1136 +0x30a\n\ngoroutine 5836 [select, 1 minutes]:\ngolang.org/x/net/http2.(*clientStream).writeRequest(0xc003b00180, 0xc003ff8a00)\n\t/build/vendor/golang.org/x/net/http2/transport.go:1345 +0x9c9\ngolang.org/x/net/http2.(*clientStream).doRequest(0x6e5326?, 0xc003a1c8d0?)\n\t/build/vendor/golang.org/x/net/http2/transport.go:1207 +0x1e\ncreated by golang.org/x/net/http2.(*ClientConn).RoundTrip\n\t/build/vendor/golang.org/x/net/http2/transport.go:1136 +0x30a\n\ngoroutine 5905 [chan receive, 1 minutes]:\ngithub.com/operator-framework/operator-lifecycle-manager/pkg/controller/registry/resolver.(*sourceInvalidator).GetValidChannel.func1()\n\t/build/vendor/github.com/operator-framework/operator-lifecycle-manager/pkg/controller/registry/resolver/source_registry.go:51 +0x85\ncreated by 
github.com/operator-framework/operator-lifecycle-manager/pkg/controller/registry/resolver.(*sourceInvalidator).GetValidChannel\n\t/build/vendor/github.com/operator-framework/operator-lifecycle-manager/pkg/controller/registry/resolver/source_registry.go:50 +0x231\n"
reason: Error
startedAt: "2022-06-27T00:00:59Z"

Version-Release number of selected component (if applicable):

mac:~ jianzhang$ oc exec catalog-operator-66cb8fd8c5-j7vkx -- olm --version
OLM version: 0.19.0
git commit: 8c2bd46147a90d58e98de73d34fd79477769f11f
mac:~ jianzhang$ oc get clusterversion
NAME      VERSION                              AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.11.0-0.nightly-2022-06-25-081133   True        False         10h     Cluster version is 4.11.0-0.nightly-2022-06-25-081133

How reproducible: always

Steps to Reproduce:
1. Install OCP 4.11
2. Check the OLM pods

Actual results:

The catalog-operator pod has restarted twice:

mac:~ jianzhang$ oc get pods
NAME                                      READY   STATUS      RESTARTS     AGE
catalog-operator-66cb8fd8c5-j7vkx         1/1     Running     2 (8h ago)   10h
collect-profiles-27605340-wgsvf           0/1     Completed   0            42m
collect-profiles-27605355-ffgxd           0/1     Completed   0            27m
collect-profiles-27605370-w7ds7           0/1     Completed   0            12m
olm-operator-6cfd444b8f-r5q4t             1/1     Running     0            10h
package-server-manager-66589d4bf8-csr7j   1/1     Running     0            10h
packageserver-59977db6cf-nkn5w            1/1     Running     0            10h
packageserver-59977db6cf-nxbnx            1/1     Running     0            10h

mac:~ jianzhang$ oc get pods catalog-operator-66cb8fd8c5-j7vkx -o yaml
apiVersion: v1 kind: Pod metadata: annotations: k8s.v1.cni.cncf.io/network-status: |- [{ "name": "openshift-sdn", "interface": "eth0", "ips": [ "10.130.0.26" ], "default": true, "dns": {} }] k8s.v1.cni.cncf.io/networks-status: |- [{ "name": "openshift-sdn", "interface": "eth0", "ips": [ "10.130.0.26" ], "default": true, "dns": {} }] openshift.io/scc: nonroot-v2 seccomp.security.alpha.kubernetes.io/pod: runtime/default creationTimestamp: "2022-06-26T23:12:45Z" generateName: catalog-operator-66cb8fd8c5- labels: app: catalog-operator pod-template-hash: 66cb8fd8c5 name: catalog-operator-66cb8fd8c5-j7vkx namespace: 
openshift-operator-lifecycle-manager ownerReferences: - apiVersion: apps/v1 blockOwnerDeletion: true controller: true kind: ReplicaSet name: catalog-operator-66cb8fd8c5 uid: bcf173be-97bc-4152-8cec-d45f820e167c resourceVersion: "67395" uid: bb34c37b-b22d-4412-bbc0-1e57a1b2bd3a spec: containers: - args: - --namespace - openshift-marketplace - --configmapServerImage=quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:033bd57b54bb2e827b0ae95a03cb6f7ceeb65422e85de635185cfed88c17b2b4 - --opmImage=quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:033bd57b54bb2e827b0ae95a03cb6f7ceeb65422e85de635185cfed88c17b2b4 - --util-image - quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:8d42dfe153a5bfa3a66e5e9d4664b4939aed376b4afbd9e098291c259daca2f8 - --writeStatusName - operator-lifecycle-manager-catalog - --tls-cert - /srv-cert/tls.crt - --tls-key - /srv-cert/tls.key - --client-ca - /profile-collector-cert/tls.crt command: - /bin/catalog env: - name: RELEASE_VERSION value: 4.11.0-0.nightly-2022-06-25-081133 image: quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:8d42dfe153a5bfa3a66e5e9d4664b4939aed376b4afbd9e098291c259daca2f8 imagePullPolicy: IfNotPresent livenessProbe: failureThreshold: 3 httpGet: path: /healthz port: 8443 scheme: HTTPS periodSeconds: 10 successThreshold: 1 timeoutSeconds: 1 name: catalog-operator ports: - containerPort: 8443 name: metrics protocol: TCP readinessProbe: failureThreshold: 3 httpGet: path: /healthz port: 8443 scheme: HTTPS periodSeconds: 10 successThreshold: 1 timeoutSeconds: 1 resources: requests: cpu: 10m memory: 80Mi securityContext: allowPrivilegeEscalation: false capabilities: drop: - ALL terminationMessagePath: /dev/termination-log terminationMessagePolicy: FallbackToLogsOnError volumeMounts: - mountPath: /srv-cert name: srv-cert readOnly: true - mountPath: /profile-collector-cert name: profile-collector-cert readOnly: true - mountPath: /var/run/secrets/kubernetes.io/serviceaccount name: kube-api-access-7txv8 readOnly: true 
dnsPolicy: ClusterFirst enableServiceLinks: true nodeName: ip-10-0-173-128.ap-southeast-1.compute.internal nodeSelector: kubernetes.io/os: linux node-role.kubernetes.io/master: "" preemptionPolicy: PreemptLowerPriority priority: 2000000000 priorityClassName: system-cluster-critical restartPolicy: Always schedulerName: default-scheduler securityContext: runAsNonRoot: true runAsUser: 65534 seLinuxOptions: level: s0:c20,c0 seccompProfile: type: RuntimeDefault serviceAccount: olm-operator-serviceaccount serviceAccountName: olm-operator-serviceaccount terminationGracePeriodSeconds: 30 tolerations: - effect: NoSchedule key: node-role.kubernetes.io/master operator: Exists - effect: NoExecute key: node.kubernetes.io/unreachable operator: Exists tolerationSeconds: 120 - effect: NoExecute key: node.kubernetes.io/not-ready operator: Exists tolerationSeconds: 120 - effect: NoSchedule key: node.kubernetes.io/memory-pressure operator: Exists volumes: - name: srv-cert secret: defaultMode: 420 secretName: catalog-operator-serving-cert - name: profile-collector-cert secret: defaultMode: 420 secretName: pprof-cert - name: kube-api-access-7txv8 projected: defaultMode: 420 sources: - serviceAccountToken: expirationSeconds: 3607 path: token - configMap: items: - key: ca.crt path: ca.crt name: kube-root-ca.crt - downwardAPI: items: - fieldRef: apiVersion: v1 fieldPath: metadata.namespace path: namespace - configMap: items: - key: service-ca.crt path: service-ca.crt name: openshift-service-ca.crt status: conditions: - lastProbeTime: null lastTransitionTime: "2022-06-26T23:16:59Z" status: "True" type: Initialized - lastProbeTime: null lastTransitionTime: "2022-06-27T01:19:09Z" status: "True" type: Ready - lastProbeTime: null lastTransitionTime: "2022-06-27T01:19:09Z" status: "True" type: ContainersReady - lastProbeTime: null lastTransitionTime: "2022-06-26T23:16:59Z" status: "True" type: PodScheduled containerStatuses: - containerID: 
cri-o://f3fce12480556b5ab279236c290acdf15e1bc850426078730dcfd333ecda6795 image: quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:8d42dfe153a5bfa3a66e5e9d4664b4939aed376b4afbd9e098291c259daca2f8 imageID: quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:8d42dfe153a5bfa3a66e5e9d4664b4939aed376b4afbd9e098291c259daca2f8 lastState: terminated: containerID: cri-o://5e3ce04f1cf538f679af008dcb62f9477131aae1772f527d6207fdc7bb247a55 exitCode: 2 finishedAt: "2022-06-27T01:19:08Z" message: "her.go:105 +0xe5\ncreated by k8s.io/apimachinery/pkg/watch.NewStreamWatcher\n\t/build/vendor/k8s.io/apimachinery/pkg/watch/streamwatcher.go:76 +0x130\n\ngoroutine 5545 [select, 7 minutes]:\ngolang.org/x/net/http2.(*clientStream).writeRequest(0xc00240a780, 0xc003321a00)\n\t/build/vendor/golang.org/x/net/http2/transport.go:1345 +0x9c9\ngolang.org/x/net/http2.(*clientStream).doRequest(0xc002efea80?, 0xc0009cc7a0?)\n\t/build/vendor/golang.org/x/net/http2/transport.go:1207 +0x1e\ncreated by golang.org/x/net/http2.(*ClientConn).RoundTrip\n\t/build/vendor/golang.org/x/net/http2/transport.go:1136 +0x30a\n\ngoroutine 5678 [select, 3 minutes]:\ngolang.org/x/net/http2.(*clientStream).writeRequest(0xc000b70480, 0xc0035d4500)\n\t/build/vendor/golang.org/x/net/http2/transport.go:1345 +0x9c9\ngolang.org/x/net/http2.(*clientStream).doRequest(0x6e5326?, 0xc002999e90?)\n\t/build/vendor/golang.org/x/net/http2/transport.go:1207 +0x1e\ncreated by golang.org/x/net/http2.(*ClientConn).RoundTrip\n\t/build/vendor/golang.org/x/net/http2/transport.go:1136 +0x30a\n\ngoroutine 5836 [select, 1 minutes]:\ngolang.org/x/net/http2.(*clientStream).writeRequest(0xc003b00180, 0xc003ff8a00)\n\t/build/vendor/golang.org/x/net/http2/transport.go:1345 +0x9c9\ngolang.org/x/net/http2.(*clientStream).doRequest(0x6e5326?, 0xc003a1c8d0?)\n\t/build/vendor/golang.org/x/net/http2/transport.go:1207 +0x1e\ncreated by golang.org/x/net/http2.(*ClientConn).RoundTrip\n\t/build/vendor/golang.org/x/net/http2/transport.go:1136 
+0x30a\n\ngoroutine 5905 [chan receive, 1 minutes]:\ngithub.com/operator-framework/operator-lifecycle-manager/pkg/controller/registry/resolver.(*sourceInvalidator).GetValidChannel.func1()\n\t/build/vendor/github.com/operator-framework/operator-lifecycle-manager/pkg/controller/registry/resolver/source_registry.go:51 +0x85\ncreated by github.com/operator-framework/operator-lifecycle-manager/pkg/controller/registry/resolver.(*sourceInvalidator).GetValidChannel\n\t/build/vendor/github.com/operator-framework/operator-lifecycle-manager/pkg/controller/registry/resolver/source_registry.go:50 +0x231\n" reason: Error startedAt: "2022-06-27T00:00:59Z" name: catalog-operator ready: true restartCount: 2 started: true state: running: startedAt: "2022-06-27T01:19:08Z" hostIP: 10.0.173.128 phase: Running podIP: 10.130.0.26 podIPs: - ip: 10.130.0.26 qosClass: Burstable startTime: "2022-06-26T23:16:59Z"

Expected results:

The catalog-operator runs without error restarts.

Additional info:

Despite the restarts, operators can still be subscribed successfully:

mac:~ jianzhang$ oc get sub -A
NAMESPACE                    NAME                     PACKAGE                  SOURCE            CHANNEL
jian                         learn                    learn                    qe-app-registry   beta
openshift-logging            cluster-logging          cluster-logging          qe-app-registry   stable
openshift-operators-redhat   elasticsearch-operator   elasticsearch-operator   qe-app-registry   stable

mac:~ jianzhang$ oc get pods -n jian
NAME                                                              READY   STATUS      RESTARTS   AGE
552b4660850a7fe1e1f142091eb5e4305f18af151727c56f70aa5dffc1dg8cg   0/1     Completed   0          54m
learn-operator-666b687bfb-7qppm                                   1/1     Running     0          54m
qe-app-registry-hbzxg                                             1/1     Running     0          58m

mac:~ jianzhang$ oc get csv -n jian
NAME                            DISPLAY                            VERSION   REPLACES                PHASE
elasticsearch-operator.v5.5.0   OpenShift Elasticsearch Operator   5.5.0                             Succeeded
learn-operator.v0.0.3           Learn Operator                     0.0.3     learn-operator.v0.0.2   Succeeded
The same restarts reproduce on a later nightly build:

mac:~ jianzhang$ oc get clusterversion
NAME      VERSION                              AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.11.0-0.nightly-2022-06-25-132614   True        False         145m    Cluster version is 4.11.0-0.nightly-2022-06-25-132614
mac:~ jianzhang$ oc get pods
NAME                                      READY   STATUS      RESTARTS        AGE
catalog-operator-869fb4bd4d-lbhgj         1/1     Running     3 (9m25s ago)   170m
collect-profiles-27606330-4wg5r           0/1     Completed   0               33m
collect-profiles-27606345-lmk4q           0/1     Completed   0               18m
collect-profiles-27606360-mksv6           0/1     Completed   0               3m17s
olm-operator-5f485d9d5f-wczjc             1/1     Running     0               170m
package-server-manager-6cf996b4cc-79lrw   1/1     Running     2 (156m ago)    170m
packageserver-5f668f98d7-2vjdn            1/1     Running     0               165m
packageserver-5f668f98d7-mb2wc            1/1     Running     0               165m

mac:~ jianzhang$ oc get pods catalog-operator-869fb4bd4d-lbhgj -o yaml
apiVersion: v1 kind: Pod metadata: annotations: k8s.v1.cni.cncf.io/network-status: |- [{ "name": "openshift-sdn", "interface": "eth0", "ips": [ "10.130.0.34" ], "default": true, "dns": {} }] k8s.v1.cni.cncf.io/networks-status: |- [{ "name": "openshift-sdn", "interface": "eth0", "ips": [ "10.130.0.34" ], "default": true, "dns": {} }] openshift.io/scc: nonroot-v2 seccomp.security.alpha.kubernetes.io/pod: runtime/default creationTimestamp: "2022-06-27T23:13:12Z" generateName: catalog-operator-869fb4bd4d- labels: app: catalog-operator pod-template-hash: 869fb4bd4d name: catalog-operator-869fb4bd4d-lbhgj namespace: openshift-operator-lifecycle-manager ownerReferences: - apiVersion: apps/v1 blockOwnerDeletion: true controller: true kind: ReplicaSet name: catalog-operator-869fb4bd4d uid: 3a1a8cd3-2151-4650-a96b-c1951d461c67 resourceVersion: "75671" uid: 2f06d663-0697-4e88-9e9a-f802c6641efe spec: containers: - args: - --namespace - openshift-marketplace - --configmapServerImage=quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:40552deca32531d2fe754f703613f24b514e27ffaa660b57d760f9cd984d9000 - 
--opmImage=quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:40552deca32531d2fe754f703613f24b514e27ffaa660b57d760f9cd984d9000 - --util-image - quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:ac69cd6ac451019bc6060627a1cceb475d2b6b521803addb31b47e501d5e5935 - --writeStatusName - operator-lifecycle-manager-catalog - --tls-cert - /srv-cert/tls.crt - --tls-key - /srv-cert/tls.key - --client-ca - /profile-collector-cert/tls.crt command: - /bin/catalog env: - name: RELEASE_VERSION value: 4.11.0-0.nightly-2022-06-25-132614 image: quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:ac69cd6ac451019bc6060627a1cceb475d2b6b521803addb31b47e501d5e5935 imagePullPolicy: IfNotPresent livenessProbe: failureThreshold: 3 httpGet: path: /healthz port: 8443 scheme: HTTPS periodSeconds: 10 successThreshold: 1 timeoutSeconds: 1 name: catalog-operator ports: - containerPort: 8443 name: metrics protocol: TCP readinessProbe: failureThreshold: 3 httpGet: path: /healthz port: 8443 scheme: HTTPS periodSeconds: 10 successThreshold: 1 timeoutSeconds: 1 resources: requests: cpu: 10m memory: 80Mi securityContext: allowPrivilegeEscalation: false capabilities: drop: - ALL terminationMessagePath: /dev/termination-log terminationMessagePolicy: FallbackToLogsOnError volumeMounts: - mountPath: /srv-cert name: srv-cert readOnly: true - mountPath: /profile-collector-cert name: profile-collector-cert readOnly: true - mountPath: /var/run/secrets/kubernetes.io/serviceaccount name: kube-api-access-sdjns readOnly: true dnsPolicy: ClusterFirst enableServiceLinks: true nodeName: ip-10-0-190-130.ap-south-1.compute.internal nodeSelector: kubernetes.io/os: linux node-role.kubernetes.io/master: "" preemptionPolicy: PreemptLowerPriority priority: 2000000000 priorityClassName: system-cluster-critical restartPolicy: Always schedulerName: default-scheduler securityContext: runAsNonRoot: true runAsUser: 65534 seLinuxOptions: level: s0:c20,c0 seccompProfile: type: RuntimeDefault serviceAccount: 
olm-operator-serviceaccount serviceAccountName: olm-operator-serviceaccount terminationGracePeriodSeconds: 30 tolerations: - effect: NoSchedule key: node-role.kubernetes.io/master operator: Exists - effect: NoExecute key: node.kubernetes.io/unreachable operator: Exists tolerationSeconds: 120 - effect: NoExecute key: node.kubernetes.io/not-ready operator: Exists tolerationSeconds: 120 - effect: NoSchedule key: node.kubernetes.io/memory-pressure operator: Exists volumes: - name: srv-cert secret: defaultMode: 420 secretName: catalog-operator-serving-cert - name: profile-collector-cert secret: defaultMode: 420 secretName: pprof-cert - name: kube-api-access-sdjns projected: defaultMode: 420 sources: - serviceAccountToken: expirationSeconds: 3607 path: token - configMap: items: - key: ca.crt path: ca.crt name: kube-root-ca.crt - downwardAPI: items: - fieldRef: apiVersion: v1 fieldPath: metadata.namespace path: namespace - configMap: items: - key: service-ca.crt path: service-ca.crt name: openshift-service-ca.crt status: conditions: - lastProbeTime: null lastTransitionTime: "2022-06-27T23:16:47Z" status: "True" type: Initialized - lastProbeTime: null lastTransitionTime: "2022-06-28T01:53:53Z" status: "True" type: Ready - lastProbeTime: null lastTransitionTime: "2022-06-28T01:53:53Z" status: "True" type: ContainersReady - lastProbeTime: null lastTransitionTime: "2022-06-27T23:16:47Z" status: "True" type: PodScheduled containerStatuses: - containerID: cri-o://2a83d3ba503aa760d27aeef28bf03af84b9ff134ea76c6545251f3253507c22b image: quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:ac69cd6ac451019bc6060627a1cceb475d2b6b521803addb31b47e501d5e5935 imageID: quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:ac69cd6ac451019bc6060627a1cceb475d2b6b521803addb31b47e501d5e5935 lastState: terminated: containerID: cri-o://2766bb3bad38c02f531915aac0321ffb18c7a993a7d84b27e8c7199935f7c858 exitCode: 2 finishedAt: "2022-06-28T01:53:52Z" message: "izer/streaming/streaming.go:77 
+0xa7\nk8s.io/client-go/rest/watch.(*Decoder).Decode(0xc002b12500)\n\t/build/vendor/k8s.io/client-go/rest/watch/decoder.go:49 +0x4f\nk8s.io/apimachinery/pkg/watch.(*StreamWatcher).receive(0xc0036178c0)\n\t/build/vendor/k8s.io/apimachinery/pkg/watch/streamwatcher.go:105 +0xe5\ncreated by k8s.io/apimachinery/pkg/watch.NewStreamWatcher\n\t/build/vendor/k8s.io/apimachinery/pkg/watch/streamwatcher.go:76 +0x130\n\ngoroutine 3479 [sync.Cond.Wait]:\nsync.runtime_notifyListWait(0xc003a987c8, 0x8)\n\t/usr/lib/golang/src/runtime/sema.go:513 +0x13d\nsync.(*Cond).Wait(0xc001f9bf10?)\n\t/usr/lib/golang/src/sync/cond.go:56 +0x8c\ngolang.org/x/net/http2.(*pipe).Read(0xc003a987b0, {0xc0036e4001, 0xdff, 0xdff})\n\t/build/vendor/golang.org/x/net/http2/pipe.go:76 +0xeb\ngolang.org/x/net/http2.transportResponseBody.Read({0x0?}, {0xc0036e4001?, 0x2?, 0x203511d?})\n\t/build/vendor/golang.org/x/net/http2/transport.go:2407 +0x85\nencoding/json.(*Decoder).refill(0xc002fc0640)\n\t/usr/lib/golang/src/encoding/json/stream.go:165 +0x17f\nencoding/json.(*Decoder).readValue(0xc002fc0640)\n\t/usr/lib/golang/src/encoding/json/stream.go:140 +0xbb\nencoding/json.(*Decoder).Decode(0xc002fc0640, {0x1d377c0, 0xc003523098})\n\t/usr/lib/golang/src/encoding/json/stream.go:63 +0x78\nk8s.io/apimachinery/pkg/util/framer.(*jsonFrameReader).Read(0xc003127770, {0xc0036dd500, 0x1000, 0x1500})\n\t/build/vendor/k8s.io/apimachinery/pkg/util/framer/framer.go:152 +0x19c\nk8s.io/apimachinery/pkg/runtime/serializer/streaming.(*decoder).Decode(0xc003502aa0, 0xc001f9bf10?, {0x2366870, 0xc0044dca80})\n\t/build/vendor/k8s.io/apimachinery/pkg/runtime/serializer/streaming/streaming.go:77 +0xa7\nk8s.io/client-go/rest/watch.(*Decoder).Decode(0xc00059f700)\n\t/build/vendor/k8s.io/client-go/rest/watch/decoder.go:49 +0x4f\nk8s.io/apimachinery/pkg/watch.(*StreamWatcher).receive(0xc0044dcd40)\n\t/build/vendor/k8s.io/apimachinery/pkg/watch/streamwatcher.go:105 +0xe5\ncreated by 
k8s.io/apimachinery/pkg/watch.NewStreamWatcher\n\t/build/vendor/k8s.io/apimachinery/pkg/watch/streamwatcher.go:76 +0x130\n" reason: Error startedAt: "2022-06-28T01:06:59Z" name: catalog-operator ready: true restartCount: 3 started: true state: running: startedAt: "2022-06-28T01:53:53Z" hostIP: 10.0.190.130 phase: Running podIP: 10.130.0.34 podIPs: - ip: 10.130.0.34 qosClass: Burstable startTime: "2022-06-27T23:16:47Z"
mac:~ jianzhang$ oc get pods package-server-manager-6cf996b4cc-79lrw -o yaml apiVersion: v1 kind: Pod metadata: annotations: k8s.v1.cni.cncf.io/network-status: |- [{ "name": "openshift-sdn", "interface": "eth0", "ips": [ "10.130.0.13" ], "default": true, "dns": {} }] k8s.v1.cni.cncf.io/networks-status: |- [{ "name": "openshift-sdn", "interface": "eth0", "ips": [ "10.130.0.13" ], "default": true, "dns": {} }] openshift.io/scc: nonroot-v2 seccomp.security.alpha.kubernetes.io/pod: runtime/default creationTimestamp: "2022-06-27T23:13:10Z" generateName: package-server-manager-6cf996b4cc- labels: app: package-server-manager pod-template-hash: 6cf996b4cc name: package-server-manager-6cf996b4cc-79lrw namespace: openshift-operator-lifecycle-manager ownerReferences: - apiVersion: apps/v1 blockOwnerDeletion: true controller: true kind: ReplicaSet name: package-server-manager-6cf996b4cc uid: 191cb627-a1f3-452b-aed2-336de7871004 resourceVersion: "19056" uid: 63cb7200-32d6-4b0d-bfb2-ba725cda7fbd spec: containers: - args: - --name - $(PACKAGESERVER_NAME) - --namespace - $(PACKAGESERVER_NAMESPACE) command: - /bin/psm - start env: - name: PACKAGESERVER_NAME value: packageserver - name: PACKAGESERVER_IMAGE value: quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:ac69cd6ac451019bc6060627a1cceb475d2b6b521803addb31b47e501d5e5935 - name: PACKAGESERVER_NAMESPACE valueFrom: fieldRef: apiVersion: v1 fieldPath: metadata.namespace - name: RELEASE_VERSION value: 4.11.0-0.nightly-2022-06-25-132614 image: quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:ac69cd6ac451019bc6060627a1cceb475d2b6b521803addb31b47e501d5e5935 imagePullPolicy: IfNotPresent livenessProbe: failureThreshold: 3 httpGet: path: /healthz port: 8080 scheme: HTTP initialDelaySeconds: 30 periodSeconds: 10 successThreshold: 1 timeoutSeconds: 1 name: package-server-manager readinessProbe: failureThreshold: 3 httpGet: path: /healthz port: 8080 scheme: HTTP initialDelaySeconds: 30 periodSeconds: 10 successThreshold: 1 
timeoutSeconds: 1 resources: requests: cpu: 10m memory: 50Mi securityContext: allowPrivilegeEscalation: false capabilities: drop: - ALL terminationMessagePath: /dev/termination-log terminationMessagePolicy: FallbackToLogsOnError volumeMounts: - mountPath: /var/run/secrets/kubernetes.io/serviceaccount name: kube-api-access-k9hh2 readOnly: true dnsPolicy: ClusterFirst enableServiceLinks: true nodeName: ip-10-0-190-130.ap-south-1.compute.internal nodeSelector: kubernetes.io/os: linux node-role.kubern
preemptionPolicy: PreemptLowerPriority priority: 2000000000 priorityClassName: system-cluster-critical restartPolicy: Always schedulerName: default-scheduler securityContext: runAsNonRoot: true runAsUser: 65534 seLinuxOptions: level: s0:c20,c0 seccompProfile: type: RuntimeDefault serviceAccount: olm-operator-serviceaccount serviceAccountName: olm-operator-serviceaccount terminationGracePeriodSeconds: 30 tolerations: - effect: NoSchedule key: node-role.kubernetes.io/master operator: Exists - effect: NoExecute key: node.kubernetes.io/unreachable operator: Exists tolerationSeconds: 120 - effect: NoExecute key: node.kubernetes.io/not-ready operator: Exists tolerationSeconds: 120 - effect: NoSchedule key: node.kubernetes.io/memory-pressure operator: Exists volumes: - name: kube-api-access-k9hh2 projected: defaultMode: 420 sources: - serviceAccountToken: expirationSeconds: 3607 path: token - configMap: items: - key: ca.crt path: ca.crt name: kube-root-ca.crt - downwardAPI: items: - fieldRef: apiVersion: v1 fieldPath: metadata.namespace path: namespace - configMap: items: - key: service-ca.crt path: service-ca.crt name: openshift-service-ca.crt status: conditions: - lastProbeTime: null lastTransitionTime: "2022-06-27T23:16:47Z" status: "True" type: Initialized - lastProbeTime: null lastTransitionTime: "2022-06-27T23:27:28Z" status: "True" type: Ready - lastProbeTime: null lastTransitionTime: "2022-06-27T23:27:28Z" status: "True" type: ContainersReady - lastProbeTime: null lastTransitionTime: "2022-06-27T23:16:47Z" status: "True" type: PodScheduled containerStatuses: - containerID: cri-o://2c56089fabbaa6d7067feb231750ad20422e299dc33dabc0cd19ceca5951ed3a image: quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:ac69cd6ac451019bc6060627a1cceb475d2b6b521803addb31b47e501d5e5935 imageID: quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:ac69cd6ac451019bc6060627a1cceb475d2b6b521803addb31b47e501d5e5935 lastState: terminated: containerID: 
cri-o://76cc5448de346ec636070a71868432c8f3aa213e16fe211c8fe18a3fee112d23 exitCode: 1 finishedAt: "2022-06-27T23:26:36Z" message: "d/vendor/github.com/spf13/cobra/command.go:856\ngithub.com/spf13/cobra.(*Command).ExecuteC\n\t/build/vendor/github.com/spf13/cobra/command.go:974\ngithub.com/spf13/cobra.(*Command).Execute\n\t/build/vendor/github.com/spf13/cobra/command.go:902\nmain.main\n\t/build/cmd/package-server-manager/main.go:36\nruntime.main\n\t/usr/lib/golang/src/runtime/proc.go:250\n1.6563723963629913e+09\tERROR\tFailed to get API Group-Resources\t{\"error\": \"Get \\\"https://172.30.0.1:443/api?timeout=32s\\\": dial tcp 172.30.0.1:443: connect: connection refused\"}\nsigs.k8s.io/controller-runtime/pkg/cluster.New\n\t/build/vendor/sigs.k8s.io/controller-runtime/pkg/cluster/cluster.go:160\nsigs.k8s.io/controller-runtime/pkg/manager.New\n\t/build/vendor/sigs.k8s.io/controller-runtime/pkg/manager/manager.go:322\nmain.run\n\t/build/cmd/package-server-manager/main.go:67\ngithub.com/spf13/cobra.(*Command).execute\n\t/build/vendor/github.com/spf13/cobra/command.go:856\ngithub.com/spf13/cobra.(*Command).ExecuteC\n\t/build/vendor/github.com/spf13/cobra/command.go:974\ngithub.com/spf13/cobra.(*Command).Execute\n\t/build/vendor/github.com/spf13/cobra/command.go:902\nmain.main\n\t/build/cmd/package-server-manager/main.go:36\nruntime.main\n\t/usr/lib/golang/src/runtime/proc.go:250\n1.6563723963631017e+09\tERROR\tsetup\tfailed to setup manager instance\t{\"error\": \"Get \\\"https://172.30.0.1:443/api?timeout=32s\\\": dial tcp 172.30.0.1:443: connect: connection 
refused\"}\ngithub.com/spf13/cobra.(*Command).execute\n\t/build/vendor/github.com/spf13/cobra/command.go:856\ngithub.com/spf13/cobra.(*Command).ExecuteC\n\t/build/vendor/github.com/spf13/cobra/command.go:974\ngithub.com/spf13/cobra.(*Command).Execute\n\t/build/vendor/github.com/spf13/cobra/command.go:902\nmain.main\n\t/build/cmd/package-server-manager/main.go:36\nruntime.main\n\t/usr/lib/golang/src/runtime/proc.go:250\nError: Get \"https://172.30.0.1:443/api?timeout=32s\": dial tcp 172.30.0.1:443: connect: connection refused\nencountered an error while executing the binary: Get \"https://172.30.0.1:443/api?timeout=32s\": dial tcp 172.30.0.1:443: connect: connection refused\n" reason: Error startedAt: "2022-06-27T23:26:11Z" name: package-server-manager ready: true restartCount: 2 started: true state: running: startedAt: "2022-06-27T23:26:54Z" hostIP: 10.0.190.130 phase: Running podIP: 10.130.0.13 podIPs: - ip: 10.130.0.13 qosClass: Burstable startTime: "2022-06-27T23:16:47Z"
Observed the error restarts again, now 6 of them, on the 2022-06-28 nightly (catalog-operator running with --debug this time):

mac:~ jianzhang$ oc get clusterversion
NAME      VERSION                              AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.11.0-0.nightly-2022-06-28-160049   True        False         5h57m   Cluster version is 4.11.0-0.nightly-2022-06-28-160049
mac:~ jianzhang$ oc get pods
NAME                                     READY   STATUS      RESTARTS      AGE
catalog-operator-7b88dddfbc-rsfhz        1/1     Running     6 (26m ago)   5h51m
collect-profiles-27608160-6m7r6          0/1     Completed   0             37m
collect-profiles-27608175-94n56          0/1     Completed   0             22m
collect-profiles-27608190-nbzcf          0/1     Completed   0             7m55s
olm-operator-5977ffb855-lgfn8            1/1     Running     0             9h
package-server-manager-75db6dcfc-hql4v   1/1     Running     0             9h
packageserver-5955fb79cd-9n56n           1/1     Running     0             9h
packageserver-5955fb79cd-xf6f6           1/1     Running     0             9h

mac:~ jianzhang$ oc get pods catalog-operator-7b88dddfbc-rsfhz -o yaml
apiVersion: v1 kind: Pod metadata: annotations: k8s.v1.cni.cncf.io/network-status: |- [{ "name": "openshift-sdn", "interface": "eth0", "ips": [ "10.130.0.121" ], "default": true, "dns": {} }] k8s.v1.cni.cncf.io/networks-status: |- [{ "name": "openshift-sdn", "interface": "eth0", "ips": [ "10.130.0.121" ], "default": true, "dns": {} }] openshift.io/scc: nonroot-v2 seccomp.security.alpha.kubernetes.io/pod: runtime/default creationTimestamp: "2022-06-29T02:46:23Z" generateName: catalog-operator-7b88dddfbc- labels: app: catalog-operator pod-template-hash: 7b88dddfbc name: catalog-operator-7b88dddfbc-rsfhz namespace: openshift-operator-lifecycle-manager ownerReferences: - apiVersion: apps/v1 blockOwnerDeletion: true controller: true kind: ReplicaSet name: catalog-operator-7b88dddfbc uid: 4c796f4d-b2f2-4e0d-b2de-6e8ab5f45ce2 resourceVersion: "278732" uid: 10e8c6e9-8f1c-4484-b934-c5208016c426 spec: containers: - args: - --debug - --namespace - openshift-marketplace - --configmapServerImage=quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:40552deca32531d2fe754f703613f24b514e27ffaa660b57d760f9cd984d9000 - 
--opmImage=quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:40552deca32531d2fe754f703613f24b514e27ffaa660b57d760f9cd984d9000 - --util-image - quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:ac69cd6ac451019bc6060627a1cceb475d2b6b521803addb31b47e501d5e5935 - --writeStatusName - operator-lifecycle-manager-catalog - --tls-cert - /srv-cert/tls.crt - --tls-key - /srv-cert/tls.key - --client-ca - /profile-collector-cert/tls.crt command: - /bin/catalog env: - name: RELEASE_VERSION value: 4.11.0-0.nightly-2022-06-28-160049 image: quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:ac69cd6ac451019bc6060627a1cceb475d2b6b521803addb31b47e501d5e5935 imagePullPolicy: IfNotPresent livenessProbe: failureThreshold: 3 httpGet: path: /healthz port: 8443 scheme: HTTPS periodSeconds: 10 successThreshold: 1 timeoutSeconds: 1 name: catalog-operator ports: - containerPort: 8443 name: metrics protocol: TCP readinessProbe: failureThreshold: 3 httpGet: path: /healthz port: 8443 scheme: HTTPS periodSeconds: 10 successThreshold: 1 timeoutSeconds: 1 resources: requests: cpu: 10m memory: 80Mi securityContext: allowPrivilegeEscalation: false capabilities: drop: - ALL terminationMessagePath: /dev/termination-log terminationMessagePolicy: FallbackToLogsOnError volumeMounts: - mountPath: /srv-cert name: srv-cert readOnly: true - mountPath: /profile-collector-cert name: profile-collector-cert readOnly: true - mountPath: /var/run/secrets/kubernetes.io/serviceaccount name: kube-api-access-cv642 readOnly: true dnsPolicy: ClusterFirst enableServiceLinks: true imagePullSecrets: - name: olm-operator-serviceaccount-dockercfg-f26bf nodeName: ip-10-0-130-83.ap-south-1.compute.internal nodeSelector: kubernetes.io/os: linux node-role.kubernetes.io/master: "" preemptionPolicy: PreemptLowerPriority priority: 2000000000 priorityClassName: system-cluster-critical restartPolicy: Always schedulerName: default-scheduler securityContext: runAsNonRoot: true runAsUser: 65534 seLinuxOptions: level: s0:c20,c0 
seccompProfile: type: RuntimeDefault serviceAccount: olm-operator-serviceaccount serviceAccountName: olm-operator-serviceaccount terminationGracePeriodSeconds: 30 tolerations: - effect: NoSchedule key: node-role.kubernetes.io/master operator: Exists - effect: NoExecute key: node.kubernetes.io/unreachable operator: Exists tolerationSeconds: 120 - effect: NoExecute key: node.kubernetes.io/not-ready operator: Exists tolerationSeconds: 120 - effect: NoSchedule key: node.kubernetes.io/memory-pressure operator: Exists volumes: - name: srv-cert secret: defaultMode: 420 secretName: catalog-operator-serving-cert - name: profile-collector-cert secret: defaultMode: 420 secretName: pprof-cert - name: kube-api-access-cv642 projected: defaultMode: 420 sources: - serviceAccountToken: expirationSeconds: 3607 path: token - configMap: items: - key: ca.crt path: ca.crt name: kube-root-ca.crt - downwardAPI: items: - fieldRef: apiVersion: v1 fieldPath: metadata.namespace path: namespace - configMap: items: - key: service-ca.crt path: service-ca.crt name: openshift-service-ca.crt status: conditions: - lastProbeTime: null lastTransitionTime: "2022-06-29T02:46:23Z" status: "True" type: Initialized - lastProbeTime: null lastTransitionTime: "2022-06-29T08:11:56Z" status: "True" type: Ready - lastProbeTime: null lastTransitionTime: "2022-06-29T08:11:56Z" status: "True" type: ContainersReady - lastProbeTime: null lastTransitionTime: "2022-06-29T02:46:23Z" status: "True" type: PodScheduled containerStatuses: - containerID: cri-o://64a99805cca11e9af66452d739a91259566a35a9e580bfac81dd7205f9272e18 image: quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:ac69cd6ac451019bc6060627a1cceb475d2b6b521803addb31b47e501d5e5935 imageID: quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:ac69cd6ac451019bc6060627a1cceb475d2b6b521803addb31b47e501d5e5935 lastState: terminated: containerID: cri-o://938e2eb90465caf8ac99ff4405cfbb9a0f03b16598474f944cca44c3e70af9df exitCode: 2 finishedAt: 
"2022-06-29T08:11:54Z" message: "ub.com/operator-framework/operator-lifecycle-manager/pkg/lib/kubestate/kubestate.go:128 +0xc3 fp=0xc003c11180 sp=0xc003c11118 pc=0x1a36603\ngithub.com/operator-framework/operator-lifecycle-manager/pkg/controller/operators/catalog/subscription.(*subscriptionSyncer).Sync(0xc000a2f420, {0x237a238, 0xc0004ce4c0}, {0x236a448, 0xc004993ce0})\n\t/build/vendor/github.com/operator-framework/operator-lifecycle-manager/pkg/controller/operators/catalog/subscription/syncer.go:77 +0x535 fp=0xc003c11648 sp=0xc003c11180 pc=0x1a4f535\ngithub.com/operator-framework/operator-lifecycle-manager/pkg/lib/queueinformer.(*QueueInformer).Sync(...)\n\t/build/vendor/github.com/operator-framework/operator-lifecycle-manager/pkg/lib/queueinformer/queueinformer.go:35\ngithub.com/operator-framework/operator-lifecycle-manager/pkg/lib/queueinformer.(*operator).processNextWorkItem(0xc0001e3ad0, {0x237a238, 0xc0004ce4c0}, 0xc000cee360)\n\t/build/vendor/github.com/operator-framework/operator-lifecycle-manager/pkg/lib/queueinformer/queueinformer_operator.go:287 +0x57c fp=0xc003c11f70 sp=0xc003c11648 pc=0x1a3ca7c\ngithub.com/operator-framework/operator-lifecycle-manager/pkg/lib/queueinformer.(*operator).worker(0x10000c0008fd6e0?, {0x237a238, 0xc0004ce4c0}, 0xc0004837b8?)\n\t/build/vendor/github.com/operator-framework/operator-lifecycle-manager/pkg/lib/queueinformer/queueinformer_operator.go:231 +0x45 fp=0xc003c11fb0 sp=0xc003c11f70 pc=0x1a3c4a5\ngithub.com/operator-framework/operator-lifecycle-manager/pkg/lib/queueinformer.(*operator).start.func3()\n\t/build/vendor/github.com/operator-framework/operator-lifecycle-manager/pkg/lib/queueinformer/queueinformer_operator.go:221 +0x32 fp=0xc003c11fe0 sp=0xc003c11fb0 pc=0x1a3c152\nruntime.goexit()\n\t/usr/lib/golang/src/runtime/asm_amd64.s:1571 +0x1 fp=0xc003c11fe8 sp=0xc003c11fe0 pc=0x4719c1\ncreated by 
github.com/operator-framework/operator-lifecycle-manager/pkg/lib/queueinformer.(*operator).start\n\t/build/vendor/github.com/operator-framework/operator-lifecycle-manager/pkg/lib/queueinformer/queueinformer_operator.go:221 +0x557\n" reason: Error startedAt: "2022-06-29T07:56:16Z" name: catalog-operator ready: true restartCount: 6 started: true state: running: startedAt: "2022-06-29T08:11:55Z" hostIP: 10.0.130.83 phase: Running podIP: 10.130.0.121 podIPs: - ip: 10.130.0.121 qosClass: Burstable startTime: "2022-06-29T02:46:23Z"
Please ignore comments 4 and 5; they have nothing to do with this issue.
Check the `previous` log. mac:~ jianzhang$ oc logs catalog-operator-f8ddcb57b-j5rf2 --previous time="2022-07-03T23:49:00Z" level=info msg="log level info" ... ... time="2022-07-04T03:43:25Z" level=info msg=syncing event=update reconciling="*v1alpha1.Subscription" selflink= time="2022-07-04T03:43:25Z" level=info msg=syncing event=update reconciling="*v1alpha1.Subscription" selflink= fatal error: concurrent map writes fatal error: concurrent map writes goroutine 559 [running]: runtime.throw({0x2048546?, 0x0?}) /usr/lib/golang/src/runtime/panic.go:992 +0x71 fp=0xc001f9c508 sp=0xc001f9c4d8 pc=0x43e9f1 runtime.mapassign_faststr(0x1d09880, 0xc0031847b0, {0x2079799, 0x2e}) /usr/lib/golang/src/runtime/map_faststr.go:295 +0x38b fp=0xc001f9c570 sp=0xc001f9c508 pc=0x419b4b github.com/operator-framework/operator-lifecycle-manager/pkg/controller/registry/reconciler.Pod(0xc001f4a900, {0x203dd0f, 0xf}, {0xc00132ccc0, 0x38}, {0xc003582d50, 0x13}, 0xc00452c1e0, 0xc0031847b0, 0x5, ...) /build/vendor/github.com/operator-framework/operator-lifecycle-manager/pkg/controller/registry/reconciler/reconciler.go:227 +0xbdf fp=0xc001f9cbb0 sp=0xc001f9c570 pc=0x1a475ff github.com/operator-framework/operator-lifecycle-manager/pkg/controller/registry/reconciler.(*grpcCatalogSourceDecorator).Pod(0xc001f9ce50, {0xc003582d50, 0x13}) /build/vendor/github.com/operator-framework/operator-lifecycle-manager/pkg/controller/registry/reconciler/grpc.go:125 +0xf9 fp=0xc001f9cc30 sp=0xc001f9cbb0 pc=0x1a42c99 github.com/operator-framework/operator-lifecycle-manager/pkg/controller/registry/reconciler.(*GrpcRegistryReconciler).currentPodsWithCorrectImageAndSpec(0xc001f9ce68?, {0xc001f4a900}, {0xc003582d50, 0x13}) /build/vendor/github.com/operator-framework/operator-lifecycle-manager/pkg/controller/registry/reconciler/grpc.go:190 +0x198 fp=0xc001f9ce48 sp=0xc001f9cc30 pc=0x1a437b8 
github.com/operator-framework/operator-lifecycle-manager/pkg/controller/registry/reconciler.(*GrpcRegistryReconciler).CheckRegistryServer(0xc000bcbf80?, 0x493b77?) /build/vendor/github.com/operator-framework/operator-lifecycle-manager/pkg/controller/registry/reconciler/grpc.go:453 +0x4c fp=0xc001f9ce88 sp=0xc001f9ce48 pc=0x1a45fcc github.com/operator-framework/operator-lifecycle-manager/pkg/controller/operators/catalog/subscription.(*catalogHealthReconciler).healthy(0x38ca8453?, 0xc001f4a900) /build/vendor/github.com/operator-framework/operator-lifecycle-manager/pkg/controller/operators/catalog/subscription/reconciler.go:196 +0x7e fp=0xc001f9ced0 sp=0xc001f9ce88 pc=0x1a4ae1e github.com/operator-framework/operator-lifecycle-manager/pkg/controller/operators/catalog/subscription.(*catalogHealthReconciler).health(0x1bc37c0?, 0xc003e7e7e0, 0x8?) /build/vendor/github.com/operator-framework/operator-lifecycle-manager/pkg/controller/operators/catalog/subscription/reconciler.go:159 +0x2a fp=0xc001f9cf10 sp=0xc001f9ced0 pc=0x1a4ac8a github.com/operator-framework/operator-lifecycle-manager/pkg/controller/operators/catalog/subscription.(*catalogHealthReconciler).catalogHealth(0xc000a59a90, {0xc003356a50, 0x11}) /build/vendor/github.com/operator-framework/operator-lifecycle-manager/pkg/controller/operators/catalog/subscription/reconciler.go:137 +0x387 fp=0xc001f9d040 sp=0xc001f9cf10 pc=0x1a4a827 github.com/operator-framework/operator-lifecycle-manager/pkg/controller/operators/catalog/subscription.(*catalogHealthReconciler).Reconcile(0xc000a59a90, {0x237a238, 0xc0003fe5c0}, {0x7f9f6e5b3900?, 0xc0050f64f0?}) /build/vendor/github.com/operator-framework/operator-lifecycle-manager/pkg/controller/operators/catalog/subscription/reconciler.go:82 +0x21e fp=0xc001f9d118 sp=0xc001f9d040 pc=0x1a4a1be github.com/operator-framework/operator-lifecycle-manager/pkg/lib/kubestate.ReconcilerChain.Reconcile({0xc0008e43c0?, 0x3, 0xc001f9d258?}, {0x237a238, 0xc0003fe5c0}, {0x7f9f6e5b3328?, 
0xc0050f6490?}) /build/vendor/github.com/operator-framework/operator-lifecycle-manager/pkg/lib/kubestate/kubestate.go:128 +0xc3 fp=0xc001f9d180 sp=0xc001f9d118 pc=0x1a36603 github.com/operator-framework/operator-lifecycle-manager/pkg/controller/operators/catalog/subscription.(*subscriptionSyncer).Sync(0xc0004dfd50, {0x237a238, 0xc0003fe5c0}, {0x236a448, 0xc003cbb760}) /build/vendor/github.com/operator-framework/operator-lifecycle-manager/pkg/controller/operators/catalog/subscription/syncer.go:77 +0x535 fp=0xc001f9d648 sp=0xc001f9d180 pc=0x1a4f535 github.com/operator-framework/operator-lifecycle-manager/pkg/lib/queueinformer.(*QueueInformer).Sync(...) /build/vendor/github.com/operator-framework/operator-lifecycle-manager/pkg/lib/queueinformer/queueinformer.go:35 github.com/operator-framework/operator-lifecycle-manager/pkg/lib/queueinformer.(*operator).processNextWorkItem(0xc00057a580, {0x237a238, 0xc0003fe5c0}, 0xc000954720) /build/vendor/github.com/operator-framework/operator-lifecycle-manager/pkg/lib/queueinformer/queueinformer_operator.go:287 +0x57c fp=0xc001f9df70 sp=0xc001f9d648 pc=0x1a3ca7c github.com/operator-framework/operator-lifecycle-manager/pkg/lib/queueinformer.(*operator).worker(0x0?, {0x237a238, 0xc0003fe5c0}, 0x0?) 
/build/vendor/github.com/operator-framework/operator-lifecycle-manager/pkg/lib/queueinformer/queueinformer_operator.go:231 +0x45 fp=0xc001f9dfb0 sp=0xc001f9df70 pc=0x1a3c4a5 github.com/operator-framework/operator-lifecycle-manager/pkg/lib/queueinformer.(*operator).start.func3() /build/vendor/github.com/operator-framework/operator-lifecycle-manager/pkg/lib/queueinformer/queueinformer_operator.go:221 +0x32 fp=0xc001f9dfe0 sp=0xc001f9dfb0 pc=0x1a3c152 runtime.goexit() /usr/lib/golang/src/runtime/asm_amd64.s:1571 +0x1 fp=0xc001f9dfe8 sp=0xc001f9dfe0 pc=0x4719c1 created by github.com/operator-framework/operator-lifecycle-manager/pkg/lib/queueinformer.(*operator).start /build/vendor/github.com/operator-framework/operator-lifecycle-manager/pkg/lib/queueinformer/queueinformer_operator.go:221 +0x557 Seems like it failed at: https://github.com/operator-framework/operator-lifecycle-manager/blob/master/pkg/controller/registry/reconciler/reconciler.go#L227
As Jian pointed out, the catalog operator is failing due to a concurrent map write at https://github.com/operator-framework/operator-lifecycle-manager/blob/master/pkg/controller/registry/reconciler/reconciler.go#L227. This is happening because:
- The grpcCatalogSourceDecorator includes a pointer to a catalogSource: https://github.com/operator-framework/operator-lifecycle-manager/blob/0b7970c3d609e1f483a0ba68f599a5ffd2e139b5/pkg/controller/registry/reconciler/grpc.go#L32-L35
- The grpcCatalogSourceDecorator's Annotations function returns the catalogSource's annotations: https://github.com/operator-framework/operator-lifecycle-manager/blob/0b7970c3d609e1f483a0ba68f599a5ffd2e139b5/pkg/controller/registry/reconciler/grpc.go#L66
- The grpcCatalogSourceDecorator's Pod function uses the annotations returned from its Annotations function: https://github.com/operator-framework/operator-lifecycle-manager/blob/0b7970c3d609e1f483a0ba68f599a5ffd2e139b5/pkg/controller/registry/reconciler/grpc.go#L125
- The catalogSource's annotations are updated within the Pod function here: https://github.com/operator-framework/operator-lifecycle-manager/blob/master/pkg/controller/registry/reconciler/reconciler.go#L227

Line 227 of reconciler.go directly mutates the catalogSource's annotations. The grpcCatalogSourceDecorator's Annotations function should return a copy of the annotations, or the decorator should be created with a deep copy of the catalogSource, to avoid mutating an object in the lister cache. This doesn't seem to be a blocker, but we should get a fix in swiftly.
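The copy-on-read fix described above can be sketched in Go. Note this is a minimal illustration, not the actual OLM code: `CatalogSource` and `safeAnnotations` here are hypothetical simplifications of the real types, showing why handing callers a copy of the map keeps mutations from reaching the shared lister-cache object:

```go
package main

import "fmt"

// CatalogSource stands in for the object held in the lister cache
// (hypothetical, simplified from the real OLM/Kubernetes types).
type CatalogSource struct {
	Annotations map[string]string
}

// safeAnnotations returns a fresh copy of the annotations map.
// Callers can write to the result (as reconciler.go:227 does) without
// mutating the cached object, so concurrent syncs no longer race on
// the same underlying map.
func safeAnnotations(cs *CatalogSource) map[string]string {
	out := make(map[string]string, len(cs.Annotations))
	for k, v := range cs.Annotations {
		out[k] = v
	}
	return out
}

func main() {
	cached := &CatalogSource{Annotations: map[string]string{"a": "1"}}

	podAnnotations := safeAnnotations(cached)
	podAnnotations["olm.pod-spec-hash"] = "abc" // mutation stays local

	fmt.Println(len(cached.Annotations)) // prints 1: cache untouched
}
```

The same effect can be had by constructing the decorator from `catalogSource.DeepCopy()`; either way, the cached object is treated as read-only, which is what client-go listers expect.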
1, Create a cluster with the fixed PR via the Cluster-bot. mac:~ jianzhang$ oc get clusterversion NAME VERSION AVAILABLE PROGRESSING SINCE STATUS version 4.11.0-0.ci.test-2022-07-13-022646-ci-ln-41fvni2-latest True False 126m Cluster version is 4.11.0-0.ci.test-2022-07-13-022646-ci-ln-41fvni2-latest 2, Subscribe some operators. mac:~ jianzhang$ oc get sub -A NAMESPACE NAME PACKAGE SOURCE CHANNEL default etcd etcd community-operators singlenamespace-alpha openshift-logging cluster-logging cluster-logging redhat-operators stable openshift-operators-redhat elasticsearch-operator elasticsearch-operator redhat-operators stable mac:~ jianzhang$ oc get sub -A NAMESPACE NAME PACKAGE SOURCE CHANNEL default etcd etcd community-operators singlenamespace-alpha openshift-logging cluster-logging cluster-logging redhat-operators stable openshift-operators-redhat elasticsearch-operator elasticsearch-operator redhat-operators stable mac:~ jianzhang$ mac:~ jianzhang$ mac:~ jianzhang$ oc get csv -n openshift-operators-redhat NAME DISPLAY VERSION REPLACES PHASE elasticsearch-operator.5.4.2 OpenShift Elasticsearch Operator 5.4.2 Succeeded mac:~ jianzhang$ oc get csv -n openshift-logging NAME DISPLAY VERSION REPLACES PHASE cluster-logging.5.4.2 Red Hat OpenShift Logging 5.4.2 Succeeded elasticsearch-operator.5.4.2 OpenShift Elasticsearch Operator 5.4.2 Succeeded mac:~ jianzhang$ oc get csv -n default NAME DISPLAY VERSION REPLACES PHASE elasticsearch-operator.5.4.2 OpenShift Elasticsearch Operator 5.4.2 Succeeded etcdoperator.v0.9.4 etcd 0.9.4 etcdoperator.v0.9.2 Succeeded 3, Check OLM catalog-operator pods status. 
mac:~ jianzhang$ oc get pods NAME READY STATUS RESTARTS AGE catalog-operator-546db7cdf5-7pldg 1/1 Running 0 145m collect-profiles-27628110-lr2nv 0/1 Completed 0 30m collect-profiles-27628125-br8b8 0/1 Completed 0 15m collect-profiles-27628140-m64gp 0/1 Completed 0 38s olm-operator-754d7f6f56-26qhw 1/1 Running 0 145m package-server-manager-77d5cbf696-v9w4p 1/1 Running 0 145m packageserver-6884994d98-2smtw 1/1 Running 0 143m packageserver-6884994d98-5d7jg 1/1 Running 0 143m mac:~ jianzhang$ oc logs catalog-operator-546db7cdf5-7pldg --previous Error from server (BadRequest): previous terminated container "catalog-operator" in pod "catalog-operator-546db7cdf5-7pldg" not found No terminated container. catalog-operator works well. Verify it.
Changed the status to VERIFIED based on comment 10.
The issue seems to affect OpenShift 4.10.39 as well: % oc get logs catalog-operator-6479d65774-znsnk... time="2022-11-18T14:44:20Z" level=info msg="log level info" time="2022-11-18T14:44:20Z" level=info msg="TLS keys set, using https for metrics" time="2022-11-18T14:44:28Z" level=info msg="connection established. cluster-version: v1.23.12+7566c4d" ... time="2022-11-18T16:02:39Z" level=info msg=syncing event=update reconciling="*v1alpha1.Subscription" selflink= fatal error: concurrent map writes goroutine 746 [running]: runtime.throw({0x1ee9827, 0xc00e2a85a8}) /usr/lib/golang/src/runtime/panic.go:1198 +0x71 fp=0xc00e2a8550 sp=0xc00e2a8520 pc=0x43bcd1 runtime.mapassign_faststr(0x1be0e40, 0xc004351200, {0x1f17e43, 0x2e}) /usr/lib/golang/src/runtime/map_faststr.go:211 +0x39c fp=0xc00e2a85b8 sp=0xc00e2a8550 pc=0x41949c github.com/operator-framework/operator-lifecycle-manager/pkg/controller/registry/reconciler.Pod(0xc009b1a480, {0x1edf878, 0xf}, {0xc0172e1e40, 0x1}, {0xc018a283a8, 0x13}, 0xc004351200, 0xc014ebc0f0, 0x5, ...)
/build/vendor/github.com/operator-framework/operator-lifecycle-manager/pkg/controller/registry/reconciler/reconciler.go:227 +0xbbf fp=0xc00e2a8bf0 sp=0xc00e2a85b8 pc=0x195e1ff github.com/operator-framework/operator-lifecycle-manager/pkg/controller/registry/reconciler.(*grpcCatalogSourceDecorator).Pod(0xc00e2a8e90, {0xc018a283a8, 0x13}) /build/vendor/github.com/operator-framework/operator-lifecycle-manager/pkg/controller/registry/reconciler/grpc.go:125 +0xf9 fp=0xc00e2a8c70 sp=0xc00e2a8bf0 pc=0x1959919 github.com/operator-framework/operator-lifecycle-manager/pkg/controller/registry/reconciler.(*GrpcRegistryReconciler).currentPodsWithCorrectImageAndSpec(0xc00e2a8ea8, {0xc009b1a480}, {0xc018a283a8, 0x13}) /build/vendor/github.com/operator-framework/operator-lifecycle-manager/pkg/controller/registry/reconciler/grpc.go:190 +0x1b1 fp=0xc00e2a8e88 sp=0xc00e2a8c70 pc=0x195a491 github.com/operator-framework/operator-lifecycle-manager/pkg/controller/registry/reconciler.(*GrpcRegistryReconciler).CheckRegistryServer(0xc00119f980, 0xc0163d6780) /build/vendor/github.com/operator-framework/operator-lifecycle-manager/pkg/controller/registry/reconciler/grpc.go:453 +0x4c fp=0xc00e2a8ec8 sp=0xc00e2a8e88 pc=0x195cb2c github.com/operator-framework/operator-lifecycle-manager/pkg/controller/operators/catalog/subscription.(*catalogHealthReconciler).healthy(0x3719777f, 0xc009b1a480) /build/vendor/github.com/operator-framework/operator-lifecycle-manager/pkg/controller/operators/catalog/subscription/reconciler.go:196 +0x7e fp=0xc00e2a8f10 sp=0xc00e2a8ec8 pc=0x19619be github.com/operator-framework/operator-lifecycle-manager/pkg/controller/operators/catalog/subscription.(*catalogHealthReconciler).health(0x1ab0320, 0xc00e7890e0, 0x4) /build/vendor/github.com/operator-framework/operator-lifecycle-manager/pkg/controller/operators/catalog/subscription/reconciler.go:159 +0x2a fp=0xc00e2a8f50 sp=0xc00e2a8f10 pc=0x196182a 
github.com/operator-framework/operator-lifecycle-manager/pkg/controller/operators/catalog/subscription.(*catalogHealthReconciler).catalogHealth(0xc0005f5860, {0xc000467b18, 0xc001c6bad0}) /build/vendor/github.com/operator-framework/operator-lifecycle-manager/pkg/controller/operators/catalog/subscription/reconciler.go:137 +0x372 fp=0xc00e2a9060 sp=0xc00e2a8f50 pc=0x19614f2 github.com/operator-framework/operator-lifecycle-manager/pkg/controller/operators/catalog/subscription.(*catalogHealthReconciler).Reconcile(0xc0005f5860, {0x21b7b70, 0xc0009c33c0}, {0x7fb4bc391e88, 0xc001c6b940}) /build/vendor/github.com/operator-framework/operator-lifecycle-manager/pkg/controller/operators/catalog/subscription/reconciler.go:82 +0x20e fp=0xc00e2a9128 sp=0xc00e2a9060 pc=0x1960ece github.com/operator-framework/operator-lifecycle-manager/pkg/lib/kubestate.ReconcilerChain.Reconcile({0xc0005ade40, 0x3, 0xc001c6b900}, {0x21b7b70, 0xc0009c33c0}, {0x7fb4bc3918b0, 0xc001c6b900}) /build/vendor/github.com/operator-framework/operator-lifecycle-manager/pkg/lib/kubestate/kubestate.go:128 +0xc3 fp=0xc00e2a9190 sp=0xc00e2a9128 pc=0x194cfe3 github.com/operator-framework/operator-lifecycle-manager/pkg/controller/operators/catalog/subscription.(*subscriptionSyncer).Sync(0xc0007ba620, {0x21b7b70, 0xc0009c33c0}, {0x21a1318, 0xc015a37de0}) /build/vendor/github.com/operator-framework/operator-lifecycle-manager/pkg/controller/operators/catalog/subscription/syncer.go:77 +0x532 fp=0xc00e2a9660 sp=0xc00e2a9190 pc=0x1964992 github.com/operator-framework/operator-lifecycle-manager/pkg/lib/queueinformer.(*QueueInformer).Sync(...) 
/build/vendor/github.com/operator-framework/operator-lifecycle-manager/pkg/lib/queueinformer/queueinformer.go:35 github.com/operator-framework/operator-lifecycle-manager/pkg/lib/queueinformer.(*operator).processNextWorkItem(0xc000022160, {0x21b7b70, 0xc0009c33c0}, 0xc00121b140) /build/vendor/github.com/operator-framework/operator-lifecycle-manager/pkg/lib/queueinformer/queueinformer_operator.go:287 +0x895 fp=0xc00e2a9f70 sp=0xc00e2a9660 pc=0x1953915 github.com/operator-framework/operator-lifecycle-manager/pkg/lib/queueinformer.(*operator).worker(0xc0003c84e0, {0x21b7b70, 0xc0009c33c0}, 0x0) /build/vendor/github.com/operator-framework/operator-lifecycle-manager/pkg/lib/queueinformer/queueinformer_operator.go:231 +0x45 fp=0xc00e2a9fb0 sp=0xc00e2a9f70 pc=0x1953025 github.com/operator-framework/operator-lifecycle-manager/pkg/lib/queueinformer.(*operator).start·dwrap·7() /build/vendor/github.com/operator-framework/operator-lifecycle-manager/pkg/lib/queueinformer/queueinformer_operator.go:221 +0x32 fp=0xc00e2a9fe0 sp=0xc00e2a9fb0 pc=0x1952cd2 runtime.goexit() /usr/lib/golang/src/runtime/asm_amd64.s:1581 +0x1 fp=0xc00e2a9fe8 sp=0xc00e2a9fe0 pc=0x46f181 created by github.com/operator-framework/operator-lifecycle-manager/pkg/lib/queueinformer.(*operator).start /build/vendor/github.com/operator-framework/operator-lifecycle-manager/pkg/lib/queueinformer/queueinformer_operator.go:221 +0x55c Clusterversion: 4.10.39 Are those crashes also blocking updates of already installed versions of operators? We saw that an installed operator wasn't marked as updatable even though there is a newer version in the packagemanifest. Also, are there any workarounds/mitigations to avoid those crashes?
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Moderate: OpenShift Container Platform 4.12.0 bug fix and security update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2022:7399
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 120 days