Description of problem:
The NFD operator pod goes into an error state on a 4.2.x cluster when deployed from the CLI using the release-4.2 branch on GitHub.

Version-Release number of selected component (if applicable):
4.2

How reproducible:
100%

Steps to Reproduce:
1. Install a 4.2.18 cluster
2. Clone https://github.com/openshift/cluster-nfd-operator
3. Checkout release-4.2
4. PULLPOLICY=Always make deploy
5. oc get pods -n openshift-nfd-operator

Actual results:
[ematysek@jump cluster-nfd-operator]$ oc get pods -n openshift-nfd-operator
NAME                           READY   STATUS             RESTARTS   AGE
nfd-operator-b7f4fbff8-ks85l   0/1     CrashLoopBackOff   4          3m40s

Expected results:
The nfd-operator pod is Running.

Additional info:
Pod logs:
[ematysek@jump cluster-nfd-operator]$ oc logs nfd-operator-b7f4fbff8-ks85l
{"level":"info","ts":1582223530.2681565,"logger":"cmd","msg":"Go Version: go1.13.5"}
{"level":"info","ts":1582223530.2681808,"logger":"cmd","msg":"Go OS/Arch: linux/amd64"}
{"level":"info","ts":1582223530.2681844,"logger":"cmd","msg":"Version of operator-sdk: v0.12.0"}
{"level":"info","ts":1582223530.2684205,"logger":"leader","msg":"Trying to become the leader."}
{"level":"info","ts":1582223532.286605,"logger":"leader","msg":"Found existing lock with my name. I was likely restarted."}
{"level":"info","ts":1582223532.2866318,"logger":"leader","msg":"Continuing as the leader."}
{"level":"info","ts":1582223534.2899957,"logger":"controller-runtime.metrics","msg":"metrics server is starting to listen","addr":":8080"}
{"level":"info","ts":1582223534.2903976,"logger":"cmd","msg":"Registering Components."}
{"level":"info","ts":1582223534.2910452,"logger":"controller-runtime.controller","msg":"Starting EventSource","controller":"nodefeaturediscovery-controller","source":"kind source: /, Kind="}
{"level":"info","ts":1582223534.2912865,"logger":"controller-runtime.controller","msg":"Starting EventSource","controller":"nodefeaturediscovery-controller","source":"kind source: /, Kind="}
{"level":"info","ts":1582223534.2914093,"logger":"controller-runtime.controller","msg":"Starting EventSource","controller":"nodefeaturediscovery-controller","source":"kind source: /, Kind="}
{"level":"info","ts":1582223534.2915206,"logger":"controller-runtime.controller","msg":"Starting EventSource","controller":"nodefeaturediscovery-controller","source":"kind source: /, Kind="}
{"level":"info","ts":1582223534.2916193,"logger":"controller-runtime.controller","msg":"Starting EventSource","controller":"nodefeaturediscovery-controller","source":"kind source: /, Kind="}
{"level":"error","ts":1582223534.2916377,"logger":"cmd","msg":"","error":"no kind is registered for the type v1.SecurityContextConstraints in scheme \"k8s.io/client-go/kubernetes/scheme/register.go:65\"","stacktrace":"github.com/go-logr/zapr.(*zapLogger).Error\n\t/go/src/github.com/openshift/cluster-nfd-operator/vendor/github.com/go-logr/zapr/zapr.go:128\nmain.main\n\t/go/src/github.com/openshift/cluster-nfd-operator/cmd/manager/main.go:92\nruntime.main\n\t/usr/local/go/src/runtime/proc.go:203"}
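The final entry is the one that matters: the operator exits at startup because OpenShift's v1.SecurityContextConstraints type is not registered in the client scheme it is using. Since the zap logs are structured JSON (one object per line), the failing entry can be pulled out mechanically rather than eyeballed. A minimal Python sketch over two lines abbreviated from the log above (not part of the operator itself):

```python
import json

# Two sample entries, abbreviated from the operator log above.
LOG_LINES = [
    '{"level":"info","ts":1582223534.29,"logger":"cmd","msg":"Registering Components."}',
    ('{"level":"error","ts":1582223534.29,"logger":"cmd","msg":"",'
     '"error":"no kind is registered for the type v1.SecurityContextConstraints'
     ' in scheme \\"k8s.io/client-go/kubernetes/scheme/register.go:65\\""}'),
]

def errors(lines):
    """Return the error strings from structured zap log lines."""
    out = []
    for line in lines:
        entry = json.loads(line)
        if entry.get("level") == "error":
            # zap puts the wrapped error under "error"; fall back to "msg".
            out.append(entry.get("error") or entry.get("msg", ""))
    return out

for err in errors(LOG_LINES):
    print(err)
```

The same filter works on the full `oc logs` output piped in line by line.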
*** Bug 1805394 has been marked as a duplicate of this bug. ***
Failed to deploy NFD operator from github https://github.com/openshift/cluster-nfd-operator release-4.2 branch:

Server Version: 4.2.0-0.nightly-2020-03-09-194140
Kubernetes Version: v1.14.6-152-g117ba1f

$ oc get pods -n openshift-nfd
NAME                            READY   STATUS         RESTARTS   AGE
nfd-operator-5d5d64b769-vjcct   0/1     ErrImagePull   0          9s

MacBook-Pro:cluster-nfd-operator walid$ oc get pods -n openshift-nfd
NAME                            READY   STATUS             RESTARTS   AGE
nfd-operator-5d5d64b769-vjcct   0/1     ImagePullBackOff   0          12s

$ oc get NodeFeatureDiscovery -A
NAMESPACE       NAME                AGE
openshift-nfd   nfd-master-server   25s

$ oc get pods -n openshift-nfd
NAME                            READY   STATUS         RESTARTS   AGE
nfd-operator-5d5d64b769-vjcct   0/1     ErrImagePull   0          45s

$ oc get events -n openshift-nfd
LAST SEEN   TYPE      REASON              OBJECT                               MESSAGE
59s         Normal    Scheduled           pod/nfd-operator-5d5d64b769-vjcct    Successfully assigned openshift-nfd/nfd-operator-5d5d64b769-vjcct to preserve-stage-42-6bscs-control-plane-2
10s         Normal    Pulling             pod/nfd-operator-5d5d64b769-vjcct    Pulling image "quay.io/zvonkok/cluster-nfd-operator:release-4.2"
9s          Warning   Failed              pod/nfd-operator-5d5d64b769-vjcct    Failed to pull image "quay.io/zvonkok/cluster-nfd-operator:release-4.2": rpc error: code = Unknown desc = Error reading manifest release-4.2 in quay.io/zvonkok/cluster-nfd-operator: manifest unknown: manifest unknown
9s          Warning   Failed              pod/nfd-operator-5d5d64b769-vjcct    Error: ErrImagePull
21s         Normal    BackOff             pod/nfd-operator-5d5d64b769-vjcct    Back-off pulling image "quay.io/zvonkok/cluster-nfd-operator:release-4.2"
21s         Warning   Failed              pod/nfd-operator-5d5d64b769-vjcct    Error: ImagePullBackOff
59s         Normal    SuccessfulCreate    replicaset/nfd-operator-5d5d64b769   Created pod: nfd-operator-5d5d64b769-vjcct
59s         Normal    ScalingReplicaSet   deployment/nfd-operator              Scaled up replica set nfd-operator-5d5d64b769 to 1

$ oc describe pods -n openshift-nfd nfd-operator-5d5d64b769-vjcct
Name:           nfd-operator-5d5d64b769-vjcct
Namespace:      openshift-nfd
Priority:       0
Node:           preserve-stage-42-6bscs-control-plane-2/10.0.97.253
Start Time:     Fri, 13 Mar 2020 09:06:40 -0400
Labels:         name=nfd-operator
                pod-template-hash=5d5d64b769
Annotations:    k8s.v1.cni.cncf.io/networks-status:
                  [{ "name": "openshift-sdn", "interface": "eth0", "ips": [ "10.130.0.81" ], "default": true, "dns": {} }]
                openshift.io/scc: anyuid
Status:         Pending
IP:             10.130.0.81
IPs:            <none>
Controlled By:  ReplicaSet/nfd-operator-5d5d64b769
Containers:
  nfd-operator:
    Container ID:
    Image:          quay.io/zvonkok/cluster-nfd-operator:release-4.2
    Image ID:
    Port:           60000/TCP
    Host Port:      0/TCP
    Command:
      cluster-nfd-operator
    State:          Waiting
      Reason:       ImagePullBackOff
    Ready:          False
    Restart Count:  0
    Readiness:      exec [stat /tmp/operator-sdk-ready] delay=4s timeout=1s period=10s #success=1 #failure=1
    Environment:
      WATCH_NAMESPACE:               openshift-nfd (v1:metadata.namespace)
      POD_NAME:                      nfd-operator-5d5d64b769-vjcct (v1:metadata.name)
      OPERATOR_NAME:                 cluster-nfd-operator
      NODE_FEATURE_DISCOVERY_IMAGE:  quay.io/zvonkok/node-feature-discovery:v4.2
    Mounts:
      /tmp from tmp (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from nfd-operator-token-gh8v8 (ro)
Conditions:
  Type              Status
  Initialized       True
  Ready             False
  ContainersReady   False
  PodScheduled      True
Volumes:
  tmp:
    Type:       EmptyDir (a temporary directory that shares a pod's lifetime)
    Medium:
    SizeLimit:  <unset>
  nfd-operator-token-gh8v8:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  nfd-operator-token-gh8v8
    Optional:    false
QoS Class:       BestEffort
Node-Selectors:  node-role.kubernetes.io/master=
Tolerations:     node-role.kubernetes.io/master:NoSchedule
                 node.kubernetes.io/not-ready:NoExecute for 300s
                 node.kubernetes.io/unreachable:NoExecute for 300s
Events:
  Type     Reason     Age                From                                              Message
  ----     ------     ----               ----                                              -------
  Normal   Scheduled  97s                default-scheduler                                 Successfully assigned openshift-nfd/nfd-operator-5d5d64b769-vjcct to preserve-stage-42-6bscs-control-plane-2
  Normal   Pulling    48s (x3 over 89s)  kubelet, preserve-stage-42-6bscs-control-plane-2  Pulling image "quay.io/zvonkok/cluster-nfd-operator:release-4.2"
  Warning  Failed     47s (x3 over 89s)  kubelet, preserve-stage-42-6bscs-control-plane-2  Failed to pull image "quay.io/zvonkok/cluster-nfd-operator:release-4.2": rpc error: code = Unknown desc = Error reading manifest release-4.2 in quay.io/zvonkok/cluster-nfd-operator: manifest unknown: manifest unknown
  Warning  Failed     47s (x3 over 89s)  kubelet, preserve-stage-42-6bscs-control-plane-2  Error: ErrImagePull
  Normal   BackOff    9s (x6 over 88s)   kubelet, preserve-stage-42-6bscs-control-plane-2  Back-off pulling image "quay.io/zvonkok/cluster-nfd-operator:release-4.2"
  Warning  Failed     9s (x6 over 88s)   kubelet, preserve-stage-42-6bscs-control-plane-2  Error: ImagePullBackOff
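The kubelet events pin this failure down: `manifest unknown` from quay.io means the release-4.2 tag did not exist in the repository at pull time, as opposed to an authentication or networking problem. A small Python sketch that classifies a pull-failure event message this way; the message string is copied from the events above, and the categories are an illustrative subset, not an exhaustive list:

```python
# Classify a kubelet image-pull failure from its event message.
# MESSAGE is copied from the `oc get events` output above.
MESSAGE = (
    'Failed to pull image "quay.io/zvonkok/cluster-nfd-operator:release-4.2": '
    "rpc error: code = Unknown desc = Error reading manifest release-4.2 in "
    "quay.io/zvonkok/cluster-nfd-operator: manifest unknown: manifest unknown"
)

def classify_pull_failure(message: str) -> str:
    if "manifest unknown" in message:
        return "tag missing in registry"   # the tag was never pushed
    if "unauthorized" in message:
        return "missing or bad pull secret"
    if "no such host" in message:
        return "registry unreachable"
    return "unknown"

print(classify_pull_failure(MESSAGE))
```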
This should work now; a release-4.2 tag is available on quay.io. (sha256:a3ed95caeb02ffe68cdd9fd84406680ae93d633cb16422d00e8a7c22955b46d4)
I was successfully able to deploy NFD using the 4.2 release on a 4.2.x cluster:

$ oc get clusterversion
NAME      VERSION   AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.2.34    True        False         71m     Cluster version is 4.2.34

$ oc get pods -n openshift-nfd
NAME                            READY   STATUS    RESTARTS   AGE
nfd-master-5br7b                1/1     Running   0          73s
nfd-master-c6zrd                1/1     Running   0          73s
nfd-master-r2xlb                1/1     Running   0          73s
nfd-operator-5d5d64b769-jshkc   1/1     Running   0          108s
nfd-worker-cfqt8                1/1     Running   2          74s
nfd-worker-mhvl6                1/1     Running   2          74s
nfd-worker-v48t8                1/1     Running   2          74s

$ oc get NodeFeatureDiscovery -A
NAMESPACE       NAME                AGE
openshift-nfd   nfd-master-server   97s

$ oc describe pod/nfd-operator-5d5d64b769-jshkc -n openshift-nfd
Name:           nfd-operator-5d5d64b769-jshkc
Namespace:      openshift-nfd
Priority:       0
Node:           ip-10-0-148-204.us-east-2.compute.internal/10.0.148.204
Start Time:     Tue, 02 Jun 2020 16:14:38 +0000
Labels:         name=nfd-operator
                pod-template-hash=5d5d64b769
Annotations:    k8s.v1.cni.cncf.io/networks-status:
                  [{ "name": "openshift-sdn", "interface": "eth0", "ips": [ "10.128.0.39" ], "default": true, "dns": {} }]
                openshift.io/scc: anyuid
Status:         Running
IP:             10.128.0.39
IPs:            <none>
Controlled By:  ReplicaSet/nfd-operator-5d5d64b769
Containers:
  nfd-operator:
    Container ID:   cri-o://536fd8a2e03afa24c2e4bc911fb8ae456b9fa31b70135dc12b06bd7bc11546b6
    Image:          quay.io/zvonkok/cluster-nfd-operator:release-4.2
    Image ID:       quay.io/zvonkok/cluster-nfd-operator@sha256:d54590cfb50c26c813ffd9d02dc157a725d1bf8fb75f748730dd8ae29a5c5476
    Port:           60000/TCP
    Host Port:      0/TCP
    Command:
      cluster-nfd-operator
    State:          Running
      Started:      Tue, 02 Jun 2020 16:15:08 +0000
    Ready:          True
    Restart Count:  0
    Readiness:      exec [stat /tmp/operator-sdk-ready] delay=4s timeout=1s period=10s #success=1 #failure=1
    Environment:
      WATCH_NAMESPACE:               openshift-nfd (v1:metadata.namespace)
      POD_NAME:                      nfd-operator-5d5d64b769-jshkc (v1:metadata.name)
      OPERATOR_NAME:                 cluster-nfd-operator
      NODE_FEATURE_DISCOVERY_IMAGE:  quay.io/zvonkok/node-feature-discovery:v4.2
    Mounts:
      /tmp from tmp (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from nfd-operator-token-6htxs (ro)
Conditions:
  Type              Status
  Initialized       True
  Ready             True
  ContainersReady   True
  PodScheduled      True
Volumes:
  tmp:
    Type:       EmptyDir (a temporary directory that shares a pod's lifetime)
    Medium:
    SizeLimit:  <unset>
  nfd-operator-token-6htxs:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  nfd-operator-token-6htxs
    Optional:    false
QoS Class:       BestEffort
Node-Selectors:  node-role.kubernetes.io/master=
Tolerations:     node-role.kubernetes.io/master:NoSchedule
                 node.kubernetes.io/not-ready:NoExecute for 300s
                 node.kubernetes.io/unreachable:NoExecute for 300s
Events:
  Type    Reason     Age    From                                                 Message
  ----    ------     ----   ----                                                 -------
  Normal  Scheduled  2m44s  default-scheduler                                    Successfully assigned openshift-nfd/nfd-operator-5d5d64b769-jshkc to ip-10-0-148-204.us-east-2.compute.internal
  Normal  Pulling    2m35s  kubelet, ip-10-0-148-204.us-east-2.compute.internal  Pulling image "quay.io/zvonkok/cluster-nfd-operator:release-4.2"
  Normal  Pulled     2m14s  kubelet, ip-10-0-148-204.us-east-2.compute.internal  Successfully pulled image "quay.io/zvonkok/cluster-nfd-operator:release-4.2"
  Normal  Created    2m14s  kubelet, ip-10-0-148-204.us-east-2.compute.internal  Created container nfd-operator
  Normal  Started    2m14s  kubelet, ip-10-0-148-204.us-east-2.compute.internal  Started container nfd-operator
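The verification above amounts to checking that every pod in the namespace is Running and fully ready. A minimal Python sketch of that check over the pod listing (the table text is copied inline and whitespace-normalized; a real check would read `oc get pods --no-headers` output directly):

```python
# Pod table copied from the `oc get pods -n openshift-nfd` output above.
PODS = """\
nfd-master-5br7b                1/1   Running   0   73s
nfd-master-c6zrd                1/1   Running   0   73s
nfd-master-r2xlb                1/1   Running   0   73s
nfd-operator-5d5d64b769-jshkc   1/1   Running   0   108s
nfd-worker-cfqt8                1/1   Running   2   74s
nfd-worker-mhvl6                1/1   Running   2   74s
nfd-worker-v48t8                1/1   Running   2   74s
"""

def all_healthy(table: str) -> bool:
    """True if every row is Running with all containers ready."""
    for line in table.strip().splitlines():
        name, ready, status = line.split()[:3]
        have, want = ready.split("/")  # READY column, e.g. "1/1"
        if status != "Running" or have != want:
            return False
    return True

print(all_healthy(PODS))
```

A row like `nfd-operator-b7f4fbff8-ks85l 0/1 CrashLoopBackOff 4 3m40s` from the original report would make the check fail on both conditions.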
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2020:2589