Bug 1815129 - OCP 4.2.z - nfd-worker pods fail to deploy in namespace other than default after NFD operator is deployed from OperatorHub
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Node Feature Discovery Operator
Version: 4.2.z
Hardware: x86_64
OS: Linux
Priority: high
Severity: high
Target Milestone: ---
Target Release: 4.2.z
Assignee: Zvonko Kosic
QA Contact: Walid A.
URL:
Whiteboard:
Duplicates: 1808503
Depends On: 1808061
Blocks:
Reported: 2020-03-19 14:45 UTC by Walid A.
Modified: 2020-05-26 18:28 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of: 1807620
Environment:
Last Closed: 2020-04-21 11:37:59 UTC
Target Upstream Version:
Embargoed:




Links
System ID Private Priority Status Summary Last Updated
Github openshift cluster-nfd-operator pull 79 0 None closed [release-4.2] Bug 1815129: Set the correct namespace for SCC when installed in non default namespace 2020-09-22 12:21:53 UTC
Red Hat Product Errata RHBA-2020:1450 0 None None None 2020-04-21 11:38:14 UTC

Comment 3 Walid A. 2020-04-17 04:57:46 UTC
Tested on OCP 4.2.28:

Server Version: 4.2.28
Kubernetes Version: v1.14.6-152-g117ba1f

Unable to deploy the NFD operator from OperatorHub in a custom namespace; the operator image cannot be pulled:

$ oc get pods -n test-nfd
NAME                            READY   STATUS             RESTARTS   AGE
nfd-operator-66546cdbfc-wp9pw   0/1     ImagePullBackOff   0          3h58m

$ oc logs -n test-nfd nfd-operator-66546cdbfc-wp9pw
Error from server (BadRequest): container "nfd-operator" in pod "nfd-operator-66546cdbfc-wp9pw" is waiting to start: trying and failing to pull image

$ oc get events -n test-nfd
LAST SEEN   TYPE      REASON    OBJECT                              MESSAGE
53m         Normal    Pulling   pod/nfd-operator-66546cdbfc-wp9pw   Pulling image "image-registry.openshift-image-registry.svc:5000/openshift/ose-cluster-nfd-operator@sha256:0199a7e0d7c3f71c90ea5ee8a04810ab017163430a3173ffb4eafd0b71b1ad40"
3m32s       Normal    BackOff   pod/nfd-operator-66546cdbfc-wp9pw   Back-off pulling image "image-registry.openshift-image-registry.svc:5000/openshift/ose-cluster-nfd-operator@sha256:0199a7e0d7c3f71c90ea5ee8a04810ab017163430a3173ffb4eafd0b71b1ad40"
13m         Warning   Failed    pod/nfd-operator-66546cdbfc-wp9pw   Error: ImagePullBackOff


$ oc describe -n test-nfd pod/nfd-operator-66546cdbfc-wp9pw
Name:           nfd-operator-66546cdbfc-wp9pw
Namespace:      test-nfd
Priority:       0
Node:           ip-10-0-162-203.us-west-2.compute.internal/10.0.162.203
Start Time:     Tue, 14 Apr 2020 12:33:26 -0400
Labels:         name=nfd-operator
                pod-template-hash=66546cdbfc
Annotations:    alm-examples:
                  [
                    {
                      "apiVersion": "nfd.openshift.io/v1alpha1",
                      "kind": "NodeFeatureDiscovery",
                      "metadata": {
                        "name": "nfd-master-server"
                      },
                      "spec": {
                        "namespace": "openshift-nfd"
                      }
                    }
                  ]
                capabilities: Basic Install
                categories: Database
                certified: false
                containerImage: 
                createdAt: 2019-05-30T00:00:00Z
                description:
                  This software enables node feature discovery for Kubernetes. It detects hardware features available on each node in a Kubernetes cluster, ...
                k8s.v1.cni.cncf.io/networks-status:
                  [{
                      "name": "openshift-sdn",
                      "interface": "eth0",
                      "ips": [
                          "10.128.0.66"
                      ],
                      "default": true,
                      "dns": {}
                  }]
                olm.operatorGroup: test-nfd-7fdxz
                olm.operatorNamespace: test-nfd
                olm.targetNamespaces: test-nfd
                openshift.io/scc: anyuid
                provider: Red Hat
                repository: https://github.com/openshift/cluster-nfd-operator
                support: Red Hat
Status:         Pending
IP:             10.128.0.66
IPs:            <none>
Controlled By:  ReplicaSet/nfd-operator-66546cdbfc
Containers:
  nfd-operator:
    Container ID:  
    Image:         image-registry.openshift-image-registry.svc:5000/openshift/ose-cluster-nfd-operator@sha256:0199a7e0d7c3f71c90ea5ee8a04810ab017163430a3173ffb4eafd0b71b1ad40
    Image ID:      
    Port:          60000/TCP
    Host Port:     0/TCP
    Command:
      cluster-nfd-operator
    State:          Waiting
      Reason:       ImagePullBackOff
    Ready:          False
    Restart Count:  0
    Readiness:      exec [stat /tmp/operator-sdk-ready] delay=4s timeout=1s period=10s #success=1 #failure=1
    Environment:
      WATCH_NAMESPACE:                (v1:metadata.annotations['olm.targetNamespaces'])
      POD_NAME:                      nfd-operator-66546cdbfc-wp9pw (v1:metadata.name)
      OPERATOR_NAME:                 cluster-nfd-operator
      NODE_FEATURE_DISCOVERY_IMAGE:  image-registry.openshift-image-registry.svc:5000/openshift/ose-node-feature-discovery@sha256:f8d60643622304dbb4d9fee5b0223c7a6c6d972480127c0352745598ccde39e2
    Mounts:
      /tmp from tmp (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from nfd-operator-token-rncsc (ro)
Conditions:
  Type              Status
  Initialized       True 
  Ready             False 
  ContainersReady   False 
  PodScheduled      True 
Volumes:
  tmp:
    Type:       EmptyDir (a temporary directory that shares a pod's lifetime)
    Medium:     
    SizeLimit:  <unset>
  nfd-operator-token-rncsc:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  nfd-operator-token-rncsc
    Optional:    false
QoS Class:       BestEffort
Node-Selectors:  node-role.kubernetes.io/master=
Tolerations:     node-role.kubernetes.io/master:NoSchedule
                 node.kubernetes.io/not-ready:NoExecute for 300s
                 node.kubernetes.io/unreachable:NoExecute for 300s
Events:
  Type     Reason   Age                       From                                                 Message
  ----     ------   ----                      ----                                                 -------
  Normal   Pulling  54m (x41 over 3h59m)      kubelet, ip-10-0-162-203.us-west-2.compute.internal  Pulling image "image-registry.openshift-image-registry.svc:5000/openshift/ose-cluster-nfd-operator@sha256:0199a7e0d7c3f71c90ea5ee8a04810ab017163430a3173ffb4eafd0b71b1ad40"
  Warning  Failed   14m (x986 over 3h59m)     kubelet, ip-10-0-162-203.us-west-2.compute.internal  Error: ImagePullBackOff
  Normal   BackOff  4m12s (x1031 over 3h59m)  kubelet, ip-10-0-162-203.us-west-2.compute.internal  Back-off pulling image "image-registry.openshift-image-registry.svc:5000/openshift/ose-cluster-nfd-operator@sha256:0199a7e0d7c3f71c90ea5ee8a04810ab017163430a3173ffb4eafd0b71b1ad40"

Comment 4 Walid A. 2020-04-17 17:07:10 UTC
Tested on a 4.2.29 stage cluster: NFD deployed successfully from OperatorHub in both a custom namespace and the default namespace. Nodes were labeled correctly:

$ oc get pods -n test-nfd
NAME                            READY   STATUS    RESTARTS   AGE
nfd-master-27svr                1/1     Running   0          21m
nfd-master-p6fvd                1/1     Running   0          21m
nfd-master-t6pqk                1/1     Running   0          21m
nfd-operator-69fb6cb8ff-hb6hp   1/1     Running   0          22m
nfd-worker-52g55                1/1     Running   2          21m
nfd-worker-6hmmm                1/1     Running   2          21m
nfd-worker-jkhr6                1/1     Running   2          21m
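
For repeated verification runs, the pod listings in this report can be checked mechanically rather than by eye. The following is a minimal sketch, not part of the original test procedure: it assumes the plain-text output of `oc get pods` has been saved to a string, and the helper name `not_ready_pods` is hypothetical. It flags any pod whose READY count is incomplete or whose STATUS is not Running (so a Completed pod would also be flagged).

```python
def not_ready_pods(table: str) -> list[str]:
    """Return names of pods that are not fully ready or not in Running status."""
    lines = [ln for ln in table.strip().splitlines() if ln.strip()]
    header = lines[0].split()
    # Locate the columns by header name instead of hard-coding positions.
    name_i = header.index("NAME")
    ready_i = header.index("READY")
    status_i = header.index("STATUS")
    bad = []
    for ln in lines[1:]:
        cols = ln.split()
        ready, total = cols[ready_i].split("/")
        if ready != total or cols[status_i] != "Running":
            bad.append(cols[name_i])
    return bad


# Listing from the 4.2.28 test (comment 3).
FAILING = """\
NAME                            READY   STATUS             RESTARTS   AGE
nfd-operator-66546cdbfc-wp9pw   0/1     ImagePullBackOff   0          3h58m
"""

# Listing from the 4.2.29 stage test (comment 4).
PASSING = """\
NAME                            READY   STATUS    RESTARTS   AGE
nfd-master-27svr                1/1     Running   0          21m
nfd-master-p6fvd                1/1     Running   0          21m
nfd-master-t6pqk                1/1     Running   0          21m
nfd-operator-69fb6cb8ff-hb6hp   1/1     Running   0          22m
nfd-worker-52g55                1/1     Running   2          21m
nfd-worker-6hmmm                1/1     Running   2          21m
nfd-worker-jkhr6                1/1     Running   2          21m
"""

print(not_ready_pods(FAILING))
print(not_ready_pods(PASSING))
```

The sketch does not look at the RESTARTS column; the nfd-worker pods above each show 2 restarts while still reporting Running.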

$ oc describe -n test-nfd pod/nfd-operator-69fb6cb8ff-hb6hp
Name:           nfd-operator-69fb6cb8ff-hb6hp
Namespace:      test-nfd
Priority:       0
Node:           4229-stage17-whrk7-control-plane-1/10.0.102.102
Start Time:     Fri, 17 Apr 2020 12:13:28 -0400
Labels:         name=nfd-operator
                pod-template-hash=69fb6cb8ff
Annotations:    alm-examples:
                  [
                    {
                      "apiVersion": "nfd.openshift.io/v1alpha1",
                      "kind": "NodeFeatureDiscovery",
                      "metadata": {
                        "name": "nfd-master-server"
                      },
                      "spec": {
                        "namespace": "openshift-nfd"
                      }
                    }
                  ]
                capabilities: Basic Install
                categories: Database
                certified: false
                containerImage: 
                createdAt: 2019-05-30T00:00:00Z
                description:
                  This software enables node feature discovery for Kubernetes. It detects hardware features available on each node in a Kubernetes cluster, ...
                k8s.v1.cni.cncf.io/networks-status:
                  [{
                      "name": "openshift-sdn",
                      "interface": "eth0",
                      "ips": [
                          "10.128.0.49"
                      ],
                      "default": true,
                      "dns": {}
                  }]
                olm.operatorGroup: test-nfd-vrdfw
                olm.operatorNamespace: test-nfd
                olm.targetNamespaces: test-nfd
                openshift.io/scc: anyuid
                provider: Red Hat
                repository: https://github.com/openshift/cluster-nfd-operator
                support: Red Hat
Status:         Running
IP:             10.128.0.49
IPs:            <none>
Controlled By:  ReplicaSet/nfd-operator-69fb6cb8ff
Containers:
  nfd-operator:
    Container ID:  cri-o://411d026bf8c102f98d7cfa10ca82639ebc0a27ddac04d2066e34459b96fe822a
    Image:         registry.stage.redhat.io/openshift4/ose-cluster-nfd-operator@sha256:0199a7e0d7c3f71c90ea5ee8a04810ab017163430a3173ffb4eafd0b71b1ad40
    Image ID:      registry.stage.redhat.io/openshift4/ose-cluster-nfd-operator@sha256:0199a7e0d7c3f71c90ea5ee8a04810ab017163430a3173ffb4eafd0b71b1ad40
    Port:          60000/TCP
    Host Port:     0/TCP
    Command:
      cluster-nfd-operator
    State:          Running
      Started:      Fri, 17 Apr 2020 12:13:40 -0400
    Ready:          True
    Restart Count:  0
    Readiness:      exec [stat /tmp/operator-sdk-ready] delay=4s timeout=1s period=10s #success=1 #failure=1
    Environment:
      WATCH_NAMESPACE:                (v1:metadata.annotations['olm.targetNamespaces'])
      POD_NAME:                      nfd-operator-69fb6cb8ff-hb6hp (v1:metadata.name)
      OPERATOR_NAME:                 cluster-nfd-operator
      NODE_FEATURE_DISCOVERY_IMAGE:  registry.stage.redhat.io/openshift4/ose-node-feature-discovery@sha256:f8d60643622304dbb4d9fee5b0223c7a6c6d972480127c0352745598ccde39e2
    Mounts:
      /tmp from tmp (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from nfd-operator-token-tvl86 (ro)
Conditions:
  Type              Status
  Initialized       True 
  Ready             True 
  ContainersReady   True 
  PodScheduled      True 
Volumes:
  tmp:
    Type:       EmptyDir (a temporary directory that shares a pod's lifetime)
    Medium:     
    SizeLimit:  <unset>
  nfd-operator-token-tvl86:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  nfd-operator-token-tvl86
    Optional:    false
QoS Class:       BestEffort
Node-Selectors:  node-role.kubernetes.io/master=
Tolerations:     node-role.kubernetes.io/master:NoSchedule
                 node.kubernetes.io/not-ready:NoExecute for 300s
                 node.kubernetes.io/unreachable:NoExecute for 300s
Events:
  Type    Reason     Age    From                                         Message
  ----    ------     ----   ----                                         -------
  Normal  Scheduled  2m17s  default-scheduler                            Successfully assigned test-nfd/nfd-operator-69fb6cb8ff-hb6hp to 4229-stage17-whrk7-control-plane-1
  Normal  Pulling    2m8s   kubelet, 4229-stage17-whrk7-control-plane-1  Pulling image "registry.stage.redhat.io/openshift4/ose-cluster-nfd-operator@sha256:0199a7e0d7c3f71c90ea5ee8a04810ab017163430a3173ffb4eafd0b71b1ad40"
  Normal  Pulled     2m6s   kubelet, 4229-stage17-whrk7-control-plane-1  Successfully pulled image "registry.stage.redhat.io/openshift4/ose-cluster-nfd-operator@sha256:0199a7e0d7c3f71c90ea5ee8a04810ab017163430a3173ffb4eafd0b71b1ad40"
  Normal  Created    2m5s   kubelet, 4229-stage17-whrk7-control-plane-1  Created container nfd-operator
  Normal  Started    2m5s   kubelet, 4229-stage17-whrk7-control-plane-1  Started container nfd-operator
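
Comparing the Image fields of the two `oc describe` outputs in this report shows the same digest being pulled through different registries: the in-cluster registry on 4.2.28 (pull failed) and registry.stage.redhat.io on 4.2.29 (pull succeeded). The sketch below makes that comparison explicit. The helper `split_image_ref` is hypothetical and only handles digest-pinned references that start with an explicit registry host, as the ones here do; it is not a general image-reference parser and draws no conclusion about the root cause.

```python
def split_image_ref(ref: str) -> tuple[str, str, str]:
    """Split 'host[:port]/repo/path@sha256:...' into (registry, repository, digest)."""
    name, _, digest = ref.partition("@")
    registry, _, repository = name.partition("/")
    return registry, repository, digest


DIGEST = "sha256:0199a7e0d7c3f71c90ea5ee8a04810ab017163430a3173ffb4eafd0b71b1ad40"

# Image field from the 4.2.28 pod (comment 3).
failed = split_image_ref(
    "image-registry.openshift-image-registry.svc:5000/openshift/"
    "ose-cluster-nfd-operator@" + DIGEST
)
# Image field from the 4.2.29 stage pod (comment 4).
pulled = split_image_ref(
    "registry.stage.redhat.io/openshift4/ose-cluster-nfd-operator@" + DIGEST
)

print("same digest:      ", failed[2] == pulled[2])
print("failed registry:  ", failed[0], "repository:", failed[1])
print("pulled registry:  ", pulled[0], "repository:", pulled[1])
```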

Comment 6 errata-xmlrpc 2020-04-21 11:37:59 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:1450

Comment 7 Zvonko Kosic 2020-05-26 18:28:28 UTC
*** Bug 1808503 has been marked as a duplicate of this bug. ***

