Description of problem:
This issue is related to the Node Feature Discovery operand, not the Operator. When trying to deploy the nfd-topology-updater daemonset from the node-feature-discovery GitHub repo on OCP 4.10.0-fc1 (Kubernetes 1.23), the pods cannot be deployed on the worker nodes:

10s Warning FailedCreate daemonset/nfd-topology-updater Error creating: pods "nfd-topology-updater-" is forbidden: unable to validate against any security context constraint:

The latest node-feature-discovery image was built from the GitHub master branch and pushed to a private quay registry repo, and the kustomization.yaml file was then updated to reflect that image and tag:

git clone https://github.com/openshift/node-feature-discovery.git
cd node-feature-discovery
IMAGE_REGISTRY=quay.io/<account_name> make image
# push image to your repo

We used image quay.io/<account_name>/node-feature-discovery:v0.10.0-devel-439-gea04e974:

IMAGE_REPO=quay.io/<account_name>/node-feature-discovery IMAGE_TAG_NAME=v0.10.0-devel-439-gea04e974 make yamls

# cat kustomization.yaml
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
namespace: node-feature-discovery
images:
- name: '*'
  newName: quay.io/<account_name>/node-feature-discovery
  newTag: v0.10.0-devel-439-gea04e974
resources:
- deployment/overlays/default

# oc apply -k .
# oc apply -k deployment/overlays/topologyupdater/
namespace/node-feature-discovery unchanged
customresourcedefinition.apiextensions.k8s.io/noderesourcetopologies.topology.node.k8s.io created
serviceaccount/nfd-master unchanged
serviceaccount/nfd-topology-updater created
serviceaccount/nfd-worker unchanged
role.rbac.authorization.k8s.io/nfd-worker unchanged
clusterrole.rbac.authorization.k8s.io/nfd-master unchanged
clusterrole.rbac.authorization.k8s.io/nfd-topology-updater created
rolebinding.rbac.authorization.k8s.io/nfd-worker configured
clusterrolebinding.rbac.authorization.k8s.io/nfd-master unchanged
clusterrolebinding.rbac.authorization.k8s.io/nfd-topology-updater created
service/nfd-master unchanged
deployment.apps/nfd-master configured
daemonset.apps/nfd-topology-updater created
securitycontextconstraints.security.openshift.io/nfd-worker configured

# oc get pods -n node-feature-discovery
NAME                          READY   STATUS    RESTARTS       AGE
nfd-master-7c77db9d7d-l7gt8   1/1     Running   0              88s
nfd-worker-l2ssw              1/1     Running   2 (103s ago)   3m51s
nfd-worker-ntgff              1/1     Running   2 (103s ago)   3m51s
nfd-worker-qcrw6              1/1     Running   2 (103s ago)   3m51s

# oc get all -n node-feature-discovery
NAME                              READY   STATUS    RESTARTS       AGE
pod/nfd-master-7c77db9d7d-l7gt8   1/1     Running   0              98s
pod/nfd-worker-l2ssw              1/1     Running   2 (113s ago)   4m1s
pod/nfd-worker-ntgff              1/1     Running   2 (113s ago)   4m1s
pod/nfd-worker-qcrw6              1/1     Running   2 (113s ago)   4m1s

NAME                 TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)    AGE
service/nfd-master   ClusterIP   172.30.145.252   <none>        8080/TCP   4m1s

NAME                                  DESIRED   CURRENT   READY   UP-TO-DATE   AVAILABLE   NODE SELECTOR   AGE
daemonset.apps/nfd-topology-updater   0         0         0       0            0           <none>          98s
daemonset.apps/nfd-worker             3         3         3       3            3           <none>          4m1s

NAME                         READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/nfd-master   1/1     1            1           4m1s

NAME                                    DESIRED   CURRENT   READY   AGE
replicaset.apps/nfd-master-7c77db9d7d   1         1         1       98s
replicaset.apps/nfd-master-84c867c6fc   0         0         0       4m1s

# oc get events -n node-feature-discovery
LAST SEEN TYPE REASON OBJECT MESSAGE
4m15s Normal Scheduled pod/nfd-master-7c77db9d7d-l7gt8 Successfully assigned node-feature-discovery/nfd-master-7c77db9d7d-l7gt8 to ip-10-0-214-37.us-east-2.compute.internal 4m14s Normal AddedInterface pod/nfd-master-7c77db9d7d-l7gt8 Add eth0 [10.130.0.51/23] from openshift-sdn 4m14s Normal Pulling pod/nfd-master-7c77db9d7d-l7gt8 Pulling image "gcr.io/k8s-staging-nfd/node-feature-discovery:master" 4m10s Normal Pulled pod/nfd-master-7c77db9d7d-l7gt8 Successfully pulled image "gcr.io/k8s-staging-nfd/node-feature-discovery:master" in 4.640525049s 4m9s Normal Created pod/nfd-master-7c77db9d7d-l7gt8 Created container nfd-master 4m9s Normal Started pod/nfd-master-7c77db9d7d-l7gt8 Started container nfd-master 4m16s Normal SuccessfulCreate replicaset/nfd-master-7c77db9d7d Created pod: nfd-master-7c77db9d7d-l7gt8 6m38s Normal Scheduled pod/nfd-master-84c867c6fc-qkwf7 Successfully assigned node-feature-discovery/nfd-master-84c867c6fc-qkwf7 to ip-10-0-134-36.us-east-2.compute.internal 6m37s Normal AddedInterface pod/nfd-master-84c867c6fc-qkwf7 Add eth0 [10.128.0.45/23] from openshift-sdn 6m37s Normal Pulling pod/nfd-master-84c867c6fc-qkwf7 Pulling image "quay.io/<account_name>/node-feature-discovery:v0.10.0-devel-439-gea04e974" 6m31s Normal Pulled pod/nfd-master-84c867c6fc-qkwf7 Successfully pulled image "quay.io/<account_name>/node-feature-discovery:v0.10.0-devel-439-gea04e974" in 5.968466214s 6m30s Normal Created pod/nfd-master-84c867c6fc-qkwf7 Created container nfd-master 6m30s Normal Started pod/nfd-master-84c867c6fc-qkwf7 Started container nfd-master 6m19s Warning Unhealthy pod/nfd-master-84c867c6fc-qkwf7 Readiness probe errored: rpc error: code = Unknown desc = command error: time="2022-01-14T22:53:43Z" level=error msg="exec failed: container_linux.go:380: starting container process caused: exec: \"/usr/bin/grpc_health_probe\": stat /usr/bin/grpc_health_probe: no such file or directory"... 
6m19s Warning Unhealthy pod/nfd-master-84c867c6fc-qkwf7 Liveness probe errored: rpc error: code = Unknown desc = command error: time="2022-01-14T22:53:43Z" level=error msg="exec failed: container_linux.go:380: starting container process caused: exec: \"/usr/bin/grpc_health_probe\": stat /usr/bin/grpc_health_probe: no such file or directory"... 6m9s Warning Unhealthy pod/nfd-master-84c867c6fc-qkwf7 Readiness probe errored: rpc error: code = Unknown desc = command error: time="2022-01-14T22:53:53Z" level=error msg="exec failed: container_linux.go:380: starting container process caused: exec: \"/usr/bin/grpc_health_probe\": stat /usr/bin/grpc_health_probe: no such file or directory"... 6m9s Warning Unhealthy pod/nfd-master-84c867c6fc-qkwf7 Liveness probe errored: rpc error: code = Unknown desc = command error: time="2022-01-14T22:53:53Z" level=error msg="exec failed: container_linux.go:380: starting container process caused: exec: \"/usr/bin/grpc_health_probe\": stat /usr/bin/grpc_health_probe: no such file or directory"... 5m59s Warning Unhealthy pod/nfd-master-84c867c6fc-qkwf7 Readiness probe errored: rpc error: code = Unknown desc = command error: time="2022-01-14T22:54:03Z" level=error msg="exec failed: container_linux.go:380: starting container process caused: exec: \"/usr/bin/grpc_health_probe\": stat /usr/bin/grpc_health_probe: no such file or directory"... 5m59s Warning Unhealthy pod/nfd-master-84c867c6fc-qkwf7 Liveness probe errored: rpc error: code = Unknown desc = command error: time="2022-01-14T22:54:03Z" level=error msg="exec failed: container_linux.go:380: starting container process caused: exec: \"/usr/bin/grpc_health_probe\": stat /usr/bin/grpc_health_probe: no such file or directory"... 
5m49s Warning Unhealthy pod/nfd-master-84c867c6fc-qkwf7 Liveness probe errored: rpc error: code = Unknown desc = command error: time="2022-01-14T22:54:13Z" level=error msg="exec failed: container_linux.go:380: starting container process caused: exec: \"/usr/bin/grpc_health_probe\": stat /usr/bin/grpc_health_probe: no such file or directory"... 5m49s Warning Unhealthy pod/nfd-master-84c867c6fc-qkwf7 Readiness probe errored: rpc error: code = Unknown desc = command error: time="2022-01-14T22:54:13Z" level=error msg="exec failed: container_linux.go:380: starting container process caused: exec: \"/usr/bin/grpc_health_probe\": stat /usr/bin/grpc_health_probe: no such file or directory"... 5m39s Warning Unhealthy pod/nfd-master-84c867c6fc-qkwf7 Liveness probe errored: rpc error: code = Unknown desc = command error: time="2022-01-14T22:54:23Z" level=error msg="exec failed: container_linux.go:380: starting container process caused: exec: \"/usr/bin/grpc_health_probe\": stat /usr/bin/grpc_health_probe: no such file or directory"... 4m49s Warning Unhealthy pod/nfd-master-84c867c6fc-qkwf7 (combined from similar events): Readiness probe errored: rpc error: code = Unknown desc = command error: time="2022-01-14T22:55:13Z" level=error msg="exec failed: container_linux.go:380: starting container process caused: exec: \"/usr/bin/grpc_health_probe\": stat /usr/bin/grpc_health_probe: no such file or directory"... 
6m39s Normal SuccessfulCreate replicaset/nfd-master-84c867c6fc Created pod: nfd-master-84c867c6fc-qkwf7
3m56s Normal SuccessfulDelete replicaset/nfd-master-84c867c6fc Deleted pod: nfd-master-84c867c6fc-qkwf7
6m39s Normal ScalingReplicaSet deployment/nfd-master Scaled up replica set nfd-master-84c867c6fc to 1
4m16s Normal ScalingReplicaSet deployment/nfd-master Scaled up replica set nfd-master-7c77db9d7d to 1
3m56s Normal ScalingReplicaSet deployment/nfd-master Scaled down replica set nfd-master-84c867c6fc to 0
10s Warning FailedCreate daemonset/nfd-topology-updater Error creating: pods "nfd-topology-updater-" is forbidden: unable to validate against any security context constraint: [provider "anyuid": Forbidden: not usable by user or serviceaccount, spec.volumes[0]: Invalid value: "hostPath": hostPath volumes are not allowed to be used, spec.volumes[1]: Invalid value: "hostPath": hostPath volumes are not allowed to be used, spec.volumes[2]: Invalid value: "hostPath": hostPath volumes are not allowed to be used, spec.containers[0].securityContext.runAsUser: Invalid value: 0: must be in the ranges: [1000650000, 1000659999], provider "nonroot": Forbidden: not usable by user or serviceaccount, provider "hostmount-anyuid": Forbidden: not usable by user or serviceaccount, provider "machine-api-termination-handler": Forbidden: not usable by user or serviceaccount, provider "hostnetwork": Forbidden: not usable by user or serviceaccount, provider "hostaccess": Forbidden: not usable by user or serviceaccount, provider "nfd-worker": Forbidden: not usable by user or serviceaccount, provider "node-exporter": Forbidden: not usable by user or serviceaccount, provider "privileged": Forbidden: not usable by user or serviceaccount]
6m38s Normal Scheduled pod/nfd-worker-l2ssw Successfully assigned node-feature-discovery/nfd-worker-l2ssw to ip-10-0-215-102.us-east-2.compute.internal
4m18s Normal Pulling pod/nfd-worker-l2ssw Pulling image
"quay.io/<account_name>/node-feature-discovery:v0.10.0-devel-439-gea04e974" 6m33s Normal Pulled pod/nfd-worker-l2ssw Successfully pulled image "quay.io/<account_name>/node-feature-discovery:v0.10.0-devel-439-gea04e974" in 5.798301286s 4m18s Normal Created pod/nfd-worker-l2ssw Created container nfd-worker 4m18s Normal Started pod/nfd-worker-l2ssw Started container nfd-worker 5m31s Normal Pulled pod/nfd-worker-l2ssw Successfully pulled image "quay.io/<account_name>/node-feature-discovery:v0.10.0-devel-439-gea04e974" in 233.442724ms 4m30s Warning BackOff pod/nfd-worker-l2ssw Back-off restarting failed container 4m18s Normal Pulled pod/nfd-worker-l2ssw Successfully pulled image "quay.io/<account_name>/node-feature-discovery:v0.10.0-devel-439-gea04e974" in 232.960876ms 6m38s Normal Scheduled pod/nfd-worker-ntgff Successfully assigned node-feature-discovery/nfd-worker-ntgff to ip-10-0-149-248.us-east-2.compute.internal 6m38s Warning FailedMount pod/nfd-worker-ntgff MountVolume.SetUp failed for volume "nfd-worker-conf" : failed to sync configmap cache: timed out waiting for the condition 6m38s Warning FailedMount pod/nfd-worker-ntgff MountVolume.SetUp failed for volume "kube-api-access-njx7m" : failed to sync configmap cache: timed out waiting for the condition 4m14s Normal Pulling pod/nfd-worker-ntgff Pulling image "quay.io/<account_name>/node-feature-discovery:v0.10.0-devel-439-gea04e974" 6m32s Normal Pulled pod/nfd-worker-ntgff Successfully pulled image "quay.io/<account_name>/node-feature-discovery:v0.10.0-devel-439-gea04e974" in 5.1910695s 4m14s Normal Created pod/nfd-worker-ntgff Created container nfd-worker 4m14s Normal Started pod/nfd-worker-ntgff Started container nfd-worker 5m31s Normal Pulled pod/nfd-worker-ntgff Successfully pulled image "quay.io/<account_name>/node-feature-discovery:v0.10.0-devel-439-gea04e974" in 283.662628ms 4m30s Warning BackOff pod/nfd-worker-ntgff Back-off restarting failed container 4m14s Normal Pulled pod/nfd-worker-ntgff Successfully 
pulled image "quay.io/<account_name>/node-feature-discovery:v0.10.0-devel-439-gea04e974" in 238.745232ms
6m38s Normal Scheduled pod/nfd-worker-qcrw6 Successfully assigned node-feature-discovery/nfd-worker-qcrw6 to ip-10-0-172-142.us-east-2.compute.internal
6m38s Warning FailedMount pod/nfd-worker-qcrw6 MountVolume.SetUp failed for volume "nfd-worker-conf" : failed to sync configmap cache: timed out waiting for the condition
6m38s Warning FailedMount pod/nfd-worker-qcrw6 MountVolume.SetUp failed for volume "kube-api-access-8xt8r" : failed to sync configmap cache: timed out waiting for the condition
4m19s Normal Pulling pod/nfd-worker-qcrw6 Pulling image "quay.io/<account_name>/node-feature-discovery:v0.10.0-devel-439-gea04e974"
6m32s Normal Pulled pod/nfd-worker-qcrw6 Successfully pulled image "quay.io/<account_name>/node-feature-discovery:v0.10.0-devel-439-gea04e974" in 5.106207986s
4m18s Normal Created pod/nfd-worker-qcrw6 Created container nfd-worker
4m18s Normal Started pod/nfd-worker-qcrw6 Started container nfd-worker
5m31s Normal Pulled pod/nfd-worker-qcrw6 Successfully pulled image "quay.io/<account_name>/node-feature-discovery:v0.10.0-devel-439-gea04e974" in 234.526522ms
4m30s Warning BackOff pod/nfd-worker-qcrw6 Back-off restarting failed container
4m19s Normal Pulled pod/nfd-worker-qcrw6 Successfully pulled image "quay.io/<account_name>/node-feature-discovery:v0.10.0-devel-439-gea04e974" in 238.138173ms
6m39s Normal SuccessfulCreate daemonset/nfd-worker Created pod: nfd-worker-l2ssw
6m39s Normal SuccessfulCreate daemonset/nfd-worker Created pod: nfd-worker-qcrw6
6m39s Normal SuccessfulCreate daemonset/nfd-worker Created pod: nfd-worker-ntgff

Version-Release number of selected component (if applicable):
# oc version
Client Version: 4.10.0-fc.1
Server Version: 4.10.0-fc.1
Kubernetes Version: v1.23.0+50f645e

How reproducible:
At least twice.

Steps to Reproduce:
1. See above.
Actual results:
nfd-topology-updater pods are not deployed, and the cluster topology CRs are not created.

Expected results:
nfd-topology-updater pods should be deployed on the worker nodes, and the corresponding cluster topology CRs created.

Additional info:
https://github.com/openshift/cluster-nfd-operator/pull/236 provides the proper fix.
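For context on why admission rejects the daemonset: the pod spec uses hostPath volumes and runAsUser: 0, and none of the SCCs available to the nfd-topology-updater service account allow either. Granting the service account a dedicated SCC is the general shape of the fix; the following is only an illustrative sketch (the SCC name, field values, and user entry are assumptions, not the actual content of the PR above):

# Illustrative SCC sketch, not the shipped fix:
apiVersion: security.openshift.io/v1
kind: SecurityContextConstraints
metadata:
  name: nfd-topology-updater
allowHostDirVolumePlugin: true   # permit the hostPath volumes the daemonset mounts
allowPrivilegedContainer: false
runAsUser:
  type: RunAsAny                 # the updater container requests runAsUser: 0
seLinuxContext:
  type: RunAsAny
fsGroup:
  type: RunAsAny
supplementalGroups:
  type: RunAsAny
volumes:
- hostPath
- configMap
- secret
- projected
users:
- system:serviceaccount:node-feature-discovery:nfd-topology-updater

With such an SCC in place (and its user entry matching the daemonset's service account), the FailedCreate admission error above would no longer apply; this mirrors how the existing nfd-worker SCC already unblocks the worker daemonset.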
Verified on OCP 4.10.0-fc1 with latest NFD bundle from cluster-nfd-operator master github repo. [root@ip-172-31-45-145 cluster-nfd-operator]# oc get pods NAME READY STATUS RESTARTS AGE 4e0596ddb96fa7fe0a9d66ab8e52f32a3202c797b5732cbe136b0fa01ddpdt9 0/1 Completed 0 24m nfd-controller-manager-78f5855596-rqdm6 2/2 Running 0 23m nfd-master-6s6dv 1/1 Running 0 15m nfd-master-jp8qg 1/1 Running 0 15m nfd-master-rdjmn 1/1 Running 0 15m nfd-topology-updater-8s4dl 1/1 Running 0 15m nfd-topology-updater-n5928 1/1 Running 0 15m nfd-topology-updater-vwxt4 1/1 Running 0 15m nfd-worker-d45qn 1/1 Running 0 15m nfd-worker-ddh6t 1/1 Running 0 15m nfd-worker-m9cqb 1/1 Running 0 15m quay-io-<reponame>-nfd-operator-bundle-4-10-29 1/1 Running 0 24m # oc get crd | grep topology noderesourcetopologies.topology.node.k8s.io 2022-01-28T22:59:58Z # oc get noderesourcetopologies.topology.node.k8s.io NAME AGE ip-10-0-149-248.us-east-2.compute.internal 15m ip-10-0-172-142.us-east-2.compute.internal 15m ip-10-0-215-102.us-east-2.compute.internal 15m # oc get nodes NAME STATUS ROLES AGE VERSION ip-10-0-134-36.us-east-2.compute.internal Ready master 14d v1.23.0+50f645e ip-10-0-149-248.us-east-2.compute.internal Ready worker 14d v1.23.0+50f645e ip-10-0-172-142.us-east-2.compute.internal Ready worker 14d v1.23.0+50f645e ip-10-0-176-9.us-east-2.compute.internal Ready master 14d v1.23.0+50f645e ip-10-0-214-37.us-east-2.compute.internal Ready master 14d v1.23.0+50f645e ip-10-0-215-102.us-east-2.compute.internal Ready worker 14d v1.23.0+50f645e # oc get nodes | grep worker ip-10-0-149-248.us-east-2.compute.internal Ready worker 14d v1.23.0+50f645e ip-10-0-172-142.us-east-2.compute.internal Ready worker 14d v1.23.0+50f645e ip-10-0-215-102.us-east-2.compute.internal Ready worker 14d v1.23.0+50f645e # oc version Client Version: 4.10.0-fc.1 Server Version: 4.10.0-fc.1 Kubernetes Version: v1.23.0+50f645e # oc describe noderesourcetopologies.topology.node.k8s.io ip-10-0-149-248.us-east-2.compute.internal Name: 
ip-10-0-149-248.us-east-2.compute.internal Namespace: Labels: <none> Annotations: <none> API Version: topology.node.k8s.io/v1alpha1 Kind: NodeResourceTopology Metadata: Creation Timestamp: 2022-01-28T23:08:43Z Generation: 1 Managed Fields: API Version: topology.node.k8s.io/v1alpha1 Fields Type: FieldsV1 fieldsV1: f:topologyPolicies: f:zones: Manager: nfd-master Operation: Update Time: 2022-01-28T23:08:43Z Resource Version: 6175792 UID: 75dfac83-249c-4f92-a6d6-733f047e8d2b Topology Policies: None Zones: Costs: Name: node-0 Value: 10 Name: node-0 Resources: Allocatable: 0 Available: 0 Capacity: 4 Name: cpu Type: Node Events: <none>
Re-opening this BZ as we are seeing a regression here from 4.10.0-fc1 on OCP 4.10.3 with CSV nfd.4.10.0-202202241648: the NFD topology updater pods fail to deploy when the Operator is deployed from OperatorHub and a nodefeaturediscovery instance is created with the topology updater option selected.

# oc get pods -n openshift-nfd
NAME                                    READY   STATUS    RESTARTS   AGE
nfd-controller-manager-68f474b8-6jh7p   2/2     Running   0          127m
nfd-master-qx6r2                        1/1     Running   0          125m
nfd-master-rnxlw                        1/1     Running   0          125m
nfd-master-vzpkq                        1/1     Running   0          125m
nfd-worker-dgsvx                        1/1     Running   0          105m
nfd-worker-dvpz4                        1/1     Running   0          125m
nfd-worker-x7vqk                        1/1     Running   0          125m
nfd-worker-zxh62                        1/1     Running   0          125m

# oc get nodefeaturediscovery -A -o yaml | grep topo
    topologyupdater: true

# oc get events -n openshift-nfd
LAST SEEN TYPE REASON OBJECT MESSAGE
127m Normal LeaderElection configmap/39f5e5c3.nodefeaturediscoveries.nfd.kubernetes.io nfd-controller-manager-68f474b8-6jh7p_690128ea-b635-4a7b-b3f8-8e9c2db02dfe became leader
127m Normal LeaderElection lease/39f5e5c3.nodefeaturediscoveries.nfd.kubernetes.io nfd-controller-manager-68f474b8-6jh7p_690128ea-b635-4a7b-b3f8-8e9c2db02dfe became leader
127m Normal Scheduled pod/nfd-controller-manager-68f474b8-6jh7p Successfully assigned openshift-nfd/nfd-controller-manager-68f474b8-6jh7p to ip-10-0-180-203.us-east-2.compute.internal
127m Normal AddedInterface pod/nfd-controller-manager-68f474b8-6jh7p Add eth0 [10.129.2.12/23] from openshift-sdn
127m Normal Pulling pod/nfd-controller-manager-68f474b8-6jh7p Pulling image "registry.redhat.io/openshift4/ose-kube-rbac-proxy@sha256:2f6b4a9a4467e08ebd5d633095b810b490dbf0bda171a5f6f9aeebc9cb763bba"
127m Normal Pulled pod/nfd-controller-manager-68f474b8-6jh7p Successfully pulled image "registry.redhat.io/openshift4/ose-kube-rbac-proxy@sha256:2f6b4a9a4467e08ebd5d633095b810b490dbf0bda171a5f6f9aeebc9cb763bba" in 3.268799214s
127m Normal Created pod/nfd-controller-manager-68f474b8-6jh7p Created container kube-rbac-proxy
127m Normal Started
pod/nfd-controller-manager-68f474b8-6jh7p Started container kube-rbac-proxy 127m Normal Pulling pod/nfd-controller-manager-68f474b8-6jh7p Pulling image "registry.redhat.io/openshift4/ose-cluster-nfd-operator@sha256:b0ef1959130e25021eae8dcf41a0aa37b939a9af8ce7bcbe05b629c39ff75235" 127m Normal Pulled pod/nfd-controller-manager-68f474b8-6jh7p Successfully pulled image "registry.redhat.io/openshift4/ose-cluster-nfd-operator@sha256:b0ef1959130e25021eae8dcf41a0aa37b939a9af8ce7bcbe05b629c39ff75235" in 4.424121185s 127m Normal Created pod/nfd-controller-manager-68f474b8-6jh7p Created container manager 127m Normal Started pod/nfd-controller-manager-68f474b8-6jh7p Started container manager 127m Normal SuccessfulCreate replicaset/nfd-controller-manager-68f474b8 Created pod: nfd-controller-manager-68f474b8-6jh7p 127m Normal ScalingReplicaSet deployment/nfd-controller-manager Scaled up replica set nfd-controller-manager-68f474b8 to 1 125m Normal Scheduled pod/nfd-master-qx6r2 Successfully assigned openshift-nfd/nfd-master-qx6r2 to ip-10-0-185-92.us-east-2.compute.internal 125m Normal AddedInterface pod/nfd-master-qx6r2 Add eth0 [10.128.0.76/23] from openshift-sdn 125m Normal Pulling pod/nfd-master-qx6r2 Pulling image "registry.redhat.io/openshift4/ose-node-feature-discovery@sha256:a4a2b1e30a63cc9c9642c66eaabd39c15918828b7ff443fa63f53c68ea652961" 125m Normal Pulled pod/nfd-master-qx6r2 Successfully pulled image "registry.redhat.io/openshift4/ose-node-feature-discovery@sha256:a4a2b1e30a63cc9c9642c66eaabd39c15918828b7ff443fa63f53c68ea652961" in 5.542965444s 125m Normal Created pod/nfd-master-qx6r2 Created container nfd-master 125m Normal Started pod/nfd-master-qx6r2 Started container nfd-master 125m Normal Scheduled pod/nfd-master-rnxlw Successfully assigned openshift-nfd/nfd-master-rnxlw to ip-10-0-206-193.us-east-2.compute.internal 125m Warning FailedMount pod/nfd-master-rnxlw MountVolume.SetUp failed for volume "kube-api-access-55dth" : failed to sync configmap cache: timed out 
waiting for the condition 125m Normal AddedInterface pod/nfd-master-rnxlw Add eth0 [10.129.0.73/23] from openshift-sdn 125m Normal Pulling pod/nfd-master-rnxlw Pulling image "registry.redhat.io/openshift4/ose-node-feature-discovery@sha256:a4a2b1e30a63cc9c9642c66eaabd39c15918828b7ff443fa63f53c68ea652961" 125m Normal Pulled pod/nfd-master-rnxlw Successfully pulled image "registry.redhat.io/openshift4/ose-node-feature-discovery@sha256:a4a2b1e30a63cc9c9642c66eaabd39c15918828b7ff443fa63f53c68ea652961" in 3.413573447s 125m Normal Created pod/nfd-master-rnxlw Created container nfd-master 125m Normal Started pod/nfd-master-rnxlw Started container nfd-master 125m Normal Scheduled pod/nfd-master-vzpkq Successfully assigned openshift-nfd/nfd-master-vzpkq to ip-10-0-154-175.us-east-2.compute.internal 125m Warning FailedMount pod/nfd-master-vzpkq MountVolume.SetUp failed for volume "kube-api-access-q4nzs" : failed to sync configmap cache: timed out waiting for the condition 125m Normal AddedInterface pod/nfd-master-vzpkq Add eth0 [10.130.0.62/23] from openshift-sdn 125m Normal Pulling pod/nfd-master-vzpkq Pulling image "registry.redhat.io/openshift4/ose-node-feature-discovery@sha256:a4a2b1e30a63cc9c9642c66eaabd39c15918828b7ff443fa63f53c68ea652961" 125m Normal Pulled pod/nfd-master-vzpkq Successfully pulled image "registry.redhat.io/openshift4/ose-node-feature-discovery@sha256:a4a2b1e30a63cc9c9642c66eaabd39c15918828b7ff443fa63f53c68ea652961" in 5.652410372s 125m Normal Created pod/nfd-master-vzpkq Created container nfd-master 125m Normal Started pod/nfd-master-vzpkq Started container nfd-master 125m Normal SuccessfulCreate daemonset/nfd-master Created pod: nfd-master-vzpkq 125m Normal SuccessfulCreate daemonset/nfd-master Created pod: nfd-master-qx6r2 125m Normal SuccessfulCreate daemonset/nfd-master Created pod: nfd-master-rnxlw 79m Warning FailedCreate daemonset/nfd-topology-updater Error creating: pods "nfd-topology-updater-" is forbidden: unable to validate against any 
security context constraint: [provider "anyuid": Forbidden: not usable by user or serviceaccount, spec.volumes[0]: Invalid value: "hostPath": hostPath volumes are not allowed to be used, spec.volumes[1]: Invalid value: "hostPath": hostPath volumes are not allowed to be used, spec.volumes[2]: Invalid value: "hostPath": hostPath volumes are not allowed to be used, spec.containers[0].securityContext.runAsUser: Invalid value: 0: must be in the ranges: [1000650000, 1000659999], spec.containers[0].securityContext.seLinuxOptions.level: Invalid value: "": must be s0:c26,c0, spec.containers[0].securityContext.seLinuxOptions.type: Invalid value: "container_runtime_t": must be , provider "nonroot": Forbidden: not usable by user or serviceaccount, provider "hostmount-anyuid": Forbidden: not usable by user or serviceaccount, provider "machine-api-termination-handler": Forbidden: not usable by user or serviceaccount, provider "hostnetwork": Forbidden: not usable by user or serviceaccount, provider "hostaccess": Forbidden: not usable by user or serviceaccount, provider "nfd-worker": Forbidden: not usable by user or serviceaccount, provider "nfd-topology-updater": Forbidden: not usable by user or serviceaccount, provider "node-exporter": Forbidden: not usable by user or serviceaccount, provider "privileged": Forbidden: not usable by user or serviceaccount] 9m29s Warning FailedCreate daemonset/nfd-topology-updater Error creating: pods "nfd-topology-updater-" is forbidden: unable to validate against any security context constraint: [provider "anyuid": Forbidden: not usable by user or serviceaccount, spec.volumes[0]: Invalid value: "hostPath": hostPath volumes are not allowed to be used, spec.volumes[1]: Invalid value: "hostPath": hostPath volumes are not allowed to be used, spec.volumes[2]: Invalid value: "hostPath": hostPath volumes are not allowed to be used, spec.containers[0].securityContext.runAsUser: Invalid value: 0: must be in the ranges: [1000650000, 1000659999], 
spec.containers[0].securityContext.seLinuxOptions.level: Invalid value: "": must be s0:c26,c0, spec.containers[0].securityContext.seLinuxOptions.type: Invalid value: "container_runtime_t": must be , provider "nonroot": Forbidden: not usable by user or serviceaccount, provider "hostmount-anyuid": Forbidden: not usable by user or serviceaccount, provider "machine-api-termination-handler": Forbidden: not usable by user or serviceaccount, provider "hostnetwork": Forbidden: not usable by user or serviceaccount, provider "hostaccess": Forbidden: not usable by user or serviceaccount, provider "nfd-worker": Forbidden: not usable by user or serviceaccount, provider "nfd-topology-updater": Forbidden: not usable by user or serviceaccount, provider "nvidia-driver": Forbidden: not usable by user or serviceaccount, provider "nvidia-gpu-feature-discovery": Forbidden: not usable by user or serviceaccount, provider "nvidia-mig-manager": Forbidden: not usable by user or serviceaccount, provider "nvidia-node-status-exporter": Forbidden: not usable by user or serviceaccount, provider "nvidia-operator-validator": Forbidden: not usable by user or serviceaccount, provider "node-exporter": Forbidden: not usable by user or serviceaccount, provider "nvidia-dcgm": Forbidden: not usable by user or serviceaccount, provider "nvidia-dcgm-exporter": Forbidden: not usable by user or serviceaccount, provider "privileged": Forbidden: not usable by user or serviceaccount] 105m Normal Scheduled pod/nfd-worker-dgsvx Successfully assigned openshift-nfd/nfd-worker-dgsvx to ip-10-0-134-44.us-east-2.compute.internal 105m Normal Pulling pod/nfd-worker-dgsvx Pulling image "registry.redhat.io/openshift4/ose-node-feature-discovery@sha256:a4a2b1e30a63cc9c9642c66eaabd39c15918828b7ff443fa63f53c68ea652961" 105m Normal Pulled pod/nfd-worker-dgsvx Successfully pulled image "registry.redhat.io/openshift4/ose-node-feature-discovery@sha256:a4a2b1e30a63cc9c9642c66eaabd39c15918828b7ff443fa63f53c68ea652961" in 
15.438947707s 105m Normal Created pod/nfd-worker-dgsvx Created container nfd-worker 105m Normal Started pod/nfd-worker-dgsvx Started container nfd-worker 125m Normal Scheduled pod/nfd-worker-dvpz4 Successfully assigned openshift-nfd/nfd-worker-dvpz4 to ip-10-0-180-203.us-east-2.compute.internal 125m Normal Pulling pod/nfd-worker-dvpz4 Pulling image "registry.redhat.io/openshift4/ose-node-feature-discovery@sha256:a4a2b1e30a63cc9c9642c66eaabd39c15918828b7ff443fa63f53c68ea652961" 125m Normal Pulled pod/nfd-worker-dvpz4 Successfully pulled image "registry.redhat.io/openshift4/ose-node-feature-discovery@sha256:a4a2b1e30a63cc9c9642c66eaabd39c15918828b7ff443fa63f53c68ea652961" in 4.492711352s 125m Normal Created pod/nfd-worker-dvpz4 Created container nfd-worker 125m Normal Started pod/nfd-worker-dvpz4 Started container nfd-worker 125m Normal Scheduled pod/nfd-worker-x7vqk Successfully assigned openshift-nfd/nfd-worker-x7vqk to ip-10-0-212-168.us-east-2.compute.internal 125m Normal Pulling pod/nfd-worker-x7vqk Pulling image "registry.redhat.io/openshift4/ose-node-feature-discovery@sha256:a4a2b1e30a63cc9c9642c66eaabd39c15918828b7ff443fa63f53c68ea652961" 125m Normal Pulled pod/nfd-worker-x7vqk Successfully pulled image "registry.redhat.io/openshift4/ose-node-feature-discovery@sha256:a4a2b1e30a63cc9c9642c66eaabd39c15918828b7ff443fa63f53c68ea652961" in 5.105295723s 125m Normal Created pod/nfd-worker-x7vqk Created container nfd-worker 125m Normal Started pod/nfd-worker-x7vqk Started container nfd-worker 125m Normal Scheduled pod/nfd-worker-zxh62 Successfully assigned openshift-nfd/nfd-worker-zxh62 to ip-10-0-134-224.us-east-2.compute.internal 125m Normal Pulling pod/nfd-worker-zxh62 Pulling image "registry.redhat.io/openshift4/ose-node-feature-discovery@sha256:a4a2b1e30a63cc9c9642c66eaabd39c15918828b7ff443fa63f53c68ea652961" 125m Normal Pulled pod/nfd-worker-zxh62 Successfully pulled image 
"registry.redhat.io/openshift4/ose-node-feature-discovery@sha256:a4a2b1e30a63cc9c9642c66eaabd39c15918828b7ff443fa63f53c68ea652961" in 8.248053299s 125m Normal Created pod/nfd-worker-zxh62 Created container nfd-worker 125m Normal Started pod/nfd-worker-zxh62 Started container nfd-worker 125m Normal SuccessfulCreate daemonset/nfd-worker Created pod: nfd-worker-x7vqk 125m Normal SuccessfulCreate daemonset/nfd-worker Created pod: nfd-worker-zxh62 125m Normal SuccessfulCreate daemonset/nfd-worker Created pod: nfd-worker-dvpz4 105m Normal SuccessfulCreate daemonset/nfd-worker Created pod: nfd-worker-dgsvx 127m Normal RequirementsUnknown clusterserviceversion/nfd.4.10.0-202202241648 requirements not yet checked 127m Normal RequirementsNotMet clusterserviceversion/nfd.4.10.0-202202241648 one or more requirements couldn't be found 127m Normal AllRequirementsMet clusterserviceversion/nfd.4.10.0-202202241648 all requirements found, attempting install 127m Normal InstallSucceeded clusterserviceversion/nfd.4.10.0-202202241648 waiting for install components to report healthy 127m Normal InstallWaiting clusterserviceversion/nfd.4.10.0-202202241648 installing: waiting for deployment nfd-controller-manager to become ready: deployment "nfd-controller-manager" not available: Deployment does not have minimum availability. 127m Normal InstallSucceeded clusterserviceversion/nfd.4.10.0-202202241648 install strategy completed with no errors # oc version Client Version: 4.10.0-fc.1 Server Version: 4.10.3 Kubernetes Version: v1.23.3+e419edf ==== Logs from NFD operator are also attached in private comment
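One detail worth noting from the regression's FailedCreate messages above: besides the hostPath and runAsUser: 0 rejections seen in the original report, admission now also rejects the pod's explicit seLinuxOptions (an empty level, and type "container_runtime_t"). So whatever SCC the operator creates must not only be usable by the nfd-topology-updater service account but must also permit those SELinux options. An illustrative SCC fragment (values assumed, not taken from the actual fix) would be:

# Illustrative SCC fragment only (assumed values, not the shipped fix):
seLinuxContext:
  type: RunAsAny   # accepts the pod's explicit seLinuxOptions, e.g. type container_runtime_t
users:
- system:serviceaccount:openshift-nfd:nfd-topology-updater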
Verified with an NFD operator bundle built from the https://github.com/openshift/cluster-nfd-operator.git master branch:

# git clone https://github.com/openshift/cluster-nfd-operator.git
# git status
On branch master
Your branch is up to date with 'origin/master'.
nothing to commit, working tree clean
# git log -1
commit 06ccbf9aca360330e64636c3110b65741cbaa60a (HEAD -> master, origin/release-4.12, origin/release-4.11, origin/master, origin/HEAD)
Merge: 2f9e1582 2e47de7e
Author: OpenShift Merge Robot <openshift-merge-robot.github.com>
Date: Thu Mar 17 13:26:12 2022 +0100
Merge pull request #248 from ArangoGutierrez/devel/rbac_fix
Bug 2042536: Fix SCC control

# operator-sdk run bundle "quay.io/wabouham/nfd-operator-bundle:4.10.5" -n openshift-nfd
INFO[0005] Successfully created registry pod: quay-io-wabouham-nfd-operator-bundle-4-10-5
INFO[0006] Created CatalogSource: nfd-catalog
INFO[0006] OperatorGroup "operator-sdk-og" created
INFO[0006] Created Subscription: nfd-v4-10-5-sub
INFO[0018] Approved InstallPlan install-mngw6 for the Subscription: nfd-v4-10-5-sub
INFO[0018] Waiting for ClusterServiceVersion "openshift-nfd/nfd.v4.10.5" to reach 'Succeeded' phase
INFO[0018] Waiting for ClusterServiceVersion "openshift-nfd/nfd.v4.10.5" to appear
INFO[0025] Found ClusterServiceVersion "openshift-nfd/nfd.v4.10.5" phase: Pending
INFO[0029] Found ClusterServiceVersion "openshift-nfd/nfd.v4.10.5" phase: Installing
INFO[0070] Found ClusterServiceVersion "openshift-nfd/nfd.v4.10.5" phase: Succeeded
INFO[0070] OLM has successfully installed "nfd.v4.10.5"

# From OperatorHub on the OpenShift console, created a NodeFeatureDiscovery instance and selected the topology updater option
# oc get nodefeaturediscovery -n openshift-nfd -o yaml | grep topo
    topologyupdater: true
# oc get pods -n openshift-nfd
NAME                                                              READY   STATUS      RESTARTS   AGE
32b89008b14c5c007493c3e5bed8ff190f27861bff8413c71947cc5325fftct   0/1     Completed   0          4m29s
nfd-controller-manager-6f749f986f-fw676                           2/2     Running     0          4m17s
nfd-master-4z65l                                                  1/1     Running     0          117s
nfd-master-nkkd9                                                  1/1     Running     0          117s
nfd-master-nw647                                                  1/1     Running     0          117s
nfd-topology-updater-2pnnp                                        1/1     Running     0          117s
nfd-topology-updater-fwvrb                                        1/1     Running     0          117s
nfd-topology-updater-gt9h9                                        1/1     Running     0          117s
nfd-worker-7fs2t                                                  1/1     Running     0          117s
nfd-worker-7tgtb                                                  1/1     Running     0          117s
nfd-worker-z2bzw                                                  1/1     Running     0          117s
quay-io-wabouham-nfd-operator-bundle-4-10-5                       1/1     Running     0          4m45s

### Note: the nfd-topology-updater pods are created as expected, one per worker node.

# oc get crd | grep topo
noderesourcetopologies.topology.node.k8s.io   2022-03-18T13:26:08Z
# oc get noderesourcetopologies.topology.node.k8s.io
NAME                                         AGE
ip-10-0-157-99.us-east-2.compute.internal    63s
ip-10-0-187-91.us-east-2.compute.internal    64s
ip-10-0-203-115.us-east-2.compute.internal   66s
# oc get nodes | grep worker
ip-10-0-157-99.us-east-2.compute.internal    Ready   worker   25h   v1.23.3+e419edf
ip-10-0-187-91.us-east-2.compute.internal    Ready   worker   25h   v1.23.3+e419edf
ip-10-0-203-115.us-east-2.compute.internal   Ready   worker   25h   v1.23.3+e419edf
# oc describe noderesourcetopologies.topology.node.k8s.io
Name:         ip-10-0-157-99.us-east-2.compute.internal
Namespace:
Labels:       <none>
Annotations:  <none>
API Version:  topology.node.k8s.io/v1alpha1
Kind:         NodeResourceTopology
Metadata:
  Creation Timestamp:  2022-03-18T13:28:43Z
  Generation:          1
  Managed Fields:
    API Version:  topology.node.k8s.io/v1alpha1
    Fields Type:  FieldsV1
    fieldsV1:
      f:topologyPolicies:
      f:zones:
    Manager:         nfd-master
    Operation:       Update
    Time:            2022-03-18T13:28:43Z
  Resource Version:  576487
  UID:               7e600f06-5863-4ccc-8278-d6931df7ca04
Topology Policies:
  None
Zones:
  Costs:
    Name:   node-0
    Value:  10
  Name:     node-0
  Resources:
    Allocatable:  0
    Available:    0
    Capacity:     4
    Name:         cpu
  Type:           Node
Events:           <none>

Name:         ip-10-0-187-91.us-east-2.compute.internal
Namespace:
Labels:       <none>
Annotations:  <none>
API Version:  topology.node.k8s.io/v1alpha1
Kind:         NodeResourceTopology
Metadata:
  Creation Timestamp:  2022-03-18T13:28:42Z
  Generation:          1
  Managed Fields:
    API Version:  topology.node.k8s.io/v1alpha1
    Fields Type:  FieldsV1
    fieldsV1:
      f:topologyPolicies:
      f:zones:
    Manager:         nfd-master
    Operation:       Update
    Time:            2022-03-18T13:28:42Z
  Resource Version:  576470
  UID:               76678a88-9389-42e1-8560-0cbcb0e8e568
Topology Policies:
  None
Zones:
  Costs:
    Name:   node-0
    Value:  10
  Name:     node-0
  Resources:
    Allocatable:  0
    Available:    0
    Capacity:     4
    Name:         cpu
  Type:           Node
Events:           <none>

Name:         ip-10-0-203-115.us-east-2.compute.internal
Namespace:
Labels:       <none>
Annotations:  <none>
API Version:  topology.node.k8s.io/v1alpha1
Kind:         NodeResourceTopology
Metadata:
  Creation Timestamp:  2022-03-18T13:28:40Z
  Generation:          1
  Managed Fields:
    API Version:  topology.node.k8s.io/v1alpha1
    Fields Type:  FieldsV1
    fieldsV1:
      f:topologyPolicies:
      f:zones:
    Manager:         nfd-master
    Operation:       Update
    Time:            2022-03-18T13:28:40Z
  Resource Version:  576429
  UID:               235fe8a9-fcaa-459f-9a06-ea92aff7c78d
Topology Policies:
  None
Zones:
  Costs:
    Name:   node-0
    Value:  10
  Name:     node-0
  Resources:
    Allocatable:  0
    Available:    0
    Capacity:     4
    Name:         cpu
  Type:           Node
Events:           <none>

# oc describe node | grep feature
    feature.node.kubernetes.io/cpu-cpuid.ADX=true
    feature.node.kubernetes.io/cpu-cpuid.AESNI=true
    feature.node.kubernetes.io/cpu-cpuid.AVX=true
    feature.node.kubernetes.io/cpu-cpuid.AVX2=true
    feature.node.kubernetes.io/cpu-cpuid.AVX512BW=true
    feature.node.kubernetes.io/cpu-cpuid.AVX512CD=true
    feature.node.kubernetes.io/cpu-cpuid.AVX512DQ=true
    feature.node.kubernetes.io/cpu-cpuid.AVX512F=true
    feature.node.kubernetes.io/cpu-cpuid.AVX512VL=true
    feature.node.kubernetes.io/cpu-cpuid.FMA3=true
    feature.node.kubernetes.io/cpu-cpuid.HYPERVISOR=true
    feature.node.kubernetes.io/cpu-cpuid.MPX=true
    feature.node.kubernetes.io/cpu-hardware_multithreading=true
    feature.node.kubernetes.io/kernel-config.NO_HZ=true
    feature.node.kubernetes.io/kernel-config.NO_HZ_FULL=true
    feature.node.kubernetes.io/kernel-selinux.enabled=true
    feature.node.kubernetes.io/kernel-version.full=4.18.0-305.40.2.el8_4.x86_64
    feature.node.kubernetes.io/kernel-version.major=4
    feature.node.kubernetes.io/kernel-version.minor=18
    feature.node.kubernetes.io/kernel-version.revision=0
    feature.node.kubernetes.io/pci-1d0f.present=true
    feature.node.kubernetes.io/storage-nonrotationaldisk=true
    feature.node.kubernetes.io/system-os_release.ID=rhcos
    feature.node.kubernetes.io/system-os_release.OSTREE_VERSION=410.84.202203141348-0
    feature.node.kubernetes.io/system-os_release.RHEL_VERSION=8.4
    feature.node.kubernetes.io/system-os_release.VERSION_ID=4.10
    feature.node.kubernetes.io/system-os_release.VERSION_ID.major=4
    feature.node.kubernetes.io/system-os_release.VERSION_ID.minor=10
    nfd.node.kubernetes.io/feature-labels:
    feature.node.kubernetes.io/cpu-cpuid.ADX=true
    feature.node.kubernetes.io/cpu-cpuid.AESNI=true
    feature.node.kubernetes.io/cpu-cpuid.AVX=true
    feature.node.kubernetes.io/cpu-cpuid.AVX2=true
    feature.node.kubernetes.io/cpu-cpuid.AVX512BW=true
    feature.node.kubernetes.io/cpu-cpuid.AVX512CD=true
    feature.node.kubernetes.io/cpu-cpuid.AVX512DQ=true
    feature.node.kubernetes.io/cpu-cpuid.AVX512F=true
    feature.node.kubernetes.io/cpu-cpuid.AVX512VL=true
    feature.node.kubernetes.io/cpu-cpuid.FMA3=true
    feature.node.kubernetes.io/cpu-cpuid.HYPERVISOR=true
    feature.node.kubernetes.io/cpu-cpuid.MPX=true
    feature.node.kubernetes.io/cpu-hardware_multithreading=true
    feature.node.kubernetes.io/kernel-config.NO_HZ=true
    feature.node.kubernetes.io/kernel-config.NO_HZ_FULL=true
    feature.node.kubernetes.io/kernel-selinux.enabled=true
    feature.node.kubernetes.io/kernel-version.full=4.18.0-305.40.2.el8_4.x86_64
    feature.node.kubernetes.io/kernel-version.major=4
    feature.node.kubernetes.io/kernel-version.minor=18
    feature.node.kubernetes.io/kernel-version.revision=0
    feature.node.kubernetes.io/pci-1d0f.present=true
    feature.node.kubernetes.io/storage-nonrotationaldisk=true
    feature.node.kubernetes.io/system-os_release.ID=rhcos
    feature.node.kubernetes.io/system-os_release.OSTREE_VERSION=410.84.202203141348-0
    feature.node.kubernetes.io/system-os_release.RHEL_VERSION=8.4
    feature.node.kubernetes.io/system-os_release.VERSION_ID=4.10
    feature.node.kubernetes.io/system-os_release.VERSION_ID.major=4
    feature.node.kubernetes.io/system-os_release.VERSION_ID.minor=10
    nfd.node.kubernetes.io/feature-labels:
    feature.node.kubernetes.io/cpu-cpuid.ADX=true
    feature.node.kubernetes.io/cpu-cpuid.AESNI=true
    feature.node.kubernetes.io/cpu-cpuid.AVX=true
    feature.node.kubernetes.io/cpu-cpuid.AVX2=true
    feature.node.kubernetes.io/cpu-cpuid.AVX512BW=true
    feature.node.kubernetes.io/cpu-cpuid.AVX512CD=true
    feature.node.kubernetes.io/cpu-cpuid.AVX512DQ=true
    feature.node.kubernetes.io/cpu-cpuid.AVX512F=true
    feature.node.kubernetes.io/cpu-cpuid.AVX512VL=true
    feature.node.kubernetes.io/cpu-cpuid.FMA3=true
    feature.node.kubernetes.io/cpu-cpuid.HYPERVISOR=true
    feature.node.kubernetes.io/cpu-cpuid.MPX=true
    feature.node.kubernetes.io/cpu-hardware_multithreading=true
    feature.node.kubernetes.io/kernel-config.NO_HZ=true
    feature.node.kubernetes.io/kernel-config.NO_HZ_FULL=true
    feature.node.kubernetes.io/kernel-selinux.enabled=true
    feature.node.kubernetes.io/kernel-version.full=4.18.0-305.40.2.el8_4.x86_64
    feature.node.kubernetes.io/kernel-version.major=4
    feature.node.kubernetes.io/kernel-version.minor=18
    feature.node.kubernetes.io/kernel-version.revision=0
    feature.node.kubernetes.io/pci-1d0f.present=true
    feature.node.kubernetes.io/storage-nonrotationaldisk=true
    feature.node.kubernetes.io/system-os_release.ID=rhcos
    feature.node.kubernetes.io/system-os_release.OSTREE_VERSION=410.84.202203141348-0
    feature.node.kubernetes.io/system-os_release.RHEL_VERSION=8.4
    feature.node.kubernetes.io/system-os_release.VERSION_ID=4.10
    feature.node.kubernetes.io/system-os_release.VERSION_ID.major=4
    feature.node.kubernetes.io/system-os_release.VERSION_ID.minor=10
    nfd.node.kubernetes.io/feature-labels:
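The feature labels above can be consumed directly by the scheduler; for example, a pod can be pinned to AVX-512-capable workers with a nodeSelector. The pod name and image below are hypothetical, only the label key/value is taken from the output above:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: avx512-workload                       # hypothetical name
spec:
  nodeSelector:
    feature.node.kubernetes.io/cpu-cpuid.AVX512F: "true"
  containers:
    - name: app
      image: registry.example.com/app:latest  # placeholder image
```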
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.11.0 extras and security update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:5070