Bug 1952145
| Summary: | OCP 4.8: Node Feature Discovery (NFD) Operator fails to deploy from OperatorHub: cluster-nfd-operator executable file not found in $PATH error | ||
|---|---|---|---|
| Product: | OpenShift Container Platform | Reporter: | Walid A. <wabouham> |
| Component: | Node Feature Discovery Operator | Assignee: | Carlos Eduardo Arango Gutierrez <carangog> |
| Status: | CLOSED ERRATA | QA Contact: | Walid A. <wabouham> |
| Severity: | high | Docs Contact: | |
| Priority: | unspecified | ||
| Version: | 4.8 | CC: | carangog, hmiyamot, lhorsley, satwsing, sejug |
| Target Milestone: | --- | ||
| Target Release: | 4.8.0 | ||
| Hardware: | Unspecified | ||
| OS: | Linux | ||
| Whiteboard: | |||
| Fixed In Version: | Doc Type: | If docs needed, set a value | |
| Doc Text: | Story Points: | --- | |
| Clone Of: | Environment: | ||
| Last Closed: | 2021-07-27 22:19:28 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | Category: | --- | |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
*** Bug 1928975 has been marked as a duplicate of this bug. ***

Faced the same issue on the ppc64le architecture with OCP build 4.8.0-0.nightly-ppc64le-2021-04-28-005545.
NFD version: 4.8.0-202104271317.p0
```
# oc get all -A | grep nfd
openshift-operators pod/nfd-operator-6ddddb9549-b4n92 0/1 CreateContainerError 0 5m35s
openshift-operators deployment.apps/nfd-operator 0/1 1 0 5m35s
openshift-operators replicaset.apps/nfd-operator-6ddddb9549 1 1 0 5m35s
```
```
# oc describe pod/nfd-operator-6ddddb9549-b4n92 -n openshift-operators
Name: nfd-operator-6ddddb9549-b4n92
Namespace: openshift-operators
Priority: 0
Node: master-2/193.168.200.213
Start Time: Mon, 03 May 2021 03:37:49 -0400
Labels: name=nfd-operator
pod-template-hash=6ddddb9549
Annotations: alm-examples:
[
{
"apiVersion": "nfd.openshift.io/v1",
"kind": "NodeFeatureDiscovery",
"metadata": {
"name": "nfd-master-server"
},
"spec": {
"operand": {
"image": "registry.redhat.io/openshift4/ose-node-feature-discovery:v4.8.0",
"imagePullPolicy": "Always",
"namespace": "node-feature-discovery-operator"
},
"workerConfig": {
"configData": "sources:\n pci:\n deviceLabelFields:\n - \"vendor\"\n deviceClassWhitelist:\n - \"0200\"\n ...
}
}
}
]
capabilities: Basic Install
categories: Database
certified: false
containerImage: registry.redhat.io/openshift4/ose-cluster-nfd-operator:v4.8.0
createdAt: 2019-05-30T00:00:00Z
description:
This software enables node feature discovery for Kubernetes. It detects hardware features available on each node in a Kubernetes cluster, ...
k8s.v1.cni.cncf.io/network-status:
[{
"name": "",
"interface": "eth0",
"ips": [
"10.130.0.100"
],
"default": true,
"dns": {}
}]
k8s.v1.cni.cncf.io/networks-status:
[{
"name": "",
"interface": "eth0",
"ips": [
"10.130.0.100"
],
"default": true,
"dns": {}
}]
olm.operatorGroup: global-operators
olm.operatorNamespace: openshift-operators
olm.skipRange: >=4.6.0 <4.8.0-202104271317.p0
olm.targetNamespaces:
openshift.io/scc: anyuid
operatorframework.io/properties:
{"properties":[{"type":"olm.gvk","value":{"group":"nfd.openshift.io","kind":"NodeFeatureDiscovery","version":"v1"}},{"type":"olm.package",...
provider: Red Hat
repository: https://github.com/openshift/cluster-nfd-operator
support: Red Hat
Status: Pending
IP: 10.130.0.100
IPs:
IP: 10.130.0.100
Controlled By: ReplicaSet/nfd-operator-6ddddb9549
Containers:
nfd-operator:
Container ID:
Image: registry.redhat.io/openshift4/ose-cluster-nfd-operator@sha256:de3498688673acea54b31707266db4c2d319672c3ad99c137411d8d70b64e2d1
Image ID:
Port: 60000/TCP
Host Port: 0/TCP
Command:
node-feature-discovery-operator
State: Waiting
Reason: CreateContainerError
Ready: False
Restart Count: 0
Environment:
WATCH_NAMESPACE: (v1:metadata.annotations['olm.targetNamespaces'])
POD_NAME: nfd-operator-6ddddb9549-b4n92 (v1:metadata.name)
OPERATOR_NAME: cluster-nfd-operator
NODE_FEATURE_DISCOVERY_IMAGE: registry.redhat.io/openshift4/ose-node-feature-discovery@sha256:355bf8f24c86fd1f6c0946d42d0e542338e3af534573a139a0e4471c0935b4c3
HTTP_PROXY: http://rdr-satwin-bastion-0:3128
HTTPS_PROXY: http://rdr-satwin-bastion-0:3128
NO_PROXY: .cluster.local,.rdr-satwin.redhat.com,.svc,10.0.0.0/16,10.128.0.0/14,127.0.0.1,172.30.0.0/16,193.168.200.0/24,api-int.rdr-satwin.redhat.com,localhost
OPERATOR_CONDITION_NAME: nfd.4.8.0-202104271317.p0
Mounts:
/var/run/secrets/kubernetes.io/serviceaccount from nfd-operator-token-5h24f (ro)
Conditions:
Type Status
Initialized True
Ready False
ContainersReady False
PodScheduled True
Volumes:
nfd-operator-token-5h24f:
Type: Secret (a volume populated by a Secret)
SecretName: nfd-operator-token-5h24f
Optional: false
QoS Class: BestEffort
Node-Selectors: node-role.kubernetes.io/master=
Tolerations: node-role.kubernetes.io/master:NoSchedule
node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 6m default-scheduler Successfully assigned openshift-operators/nfd-operator-6ddddb9549-b4n92 to master-2
Normal AddedInterface 5m58s multus Add eth0 [10.130.0.100/23]
Normal Pulled 5m42s kubelet Successfully pulled image "registry.redhat.io/openshift4/ose-cluster-nfd-operator@sha256:de3498688673acea54b31707266db4c2d319672c3ad99c137411d8d70b64e2d1" in 15.628274544s
Warning Failed 5m41s kubelet Error: container create failed: time="2021-05-03T07:38:09Z" level=error msg="container_linux.go:367: starting container process caused: exec: \"node-feature-discovery-operator\": executable file not found in $PATH"
Normal Pulled 5m37s kubelet Successfully pulled image "registry.redhat.io/openshift4/ose-cluster-nfd-operator@sha256:de3498688673acea54b31707266db4c2d319672c3ad99c137411d8d70b64e2d1" in 2.931851277s
Warning Failed 5m37s kubelet Error: container create failed: time="2021-05-03T07:38:13Z" level=error msg="container_linux.go:367: starting container process caused: exec: \"node-feature-discovery-operator\": executable file not found in $PATH"
Warning Failed 5m22s kubelet Error: container create failed: time="2021-05-03T07:38:28Z" level=error msg="container_linux.go:367: starting container process caused: exec: \"node-feature-discovery-operator\": executable file not found in $PATH"
Normal Pulled 5m22s kubelet Successfully pulled image "registry.redhat.io/openshift4/ose-cluster-nfd-operator@sha256:de3498688673acea54b31707266db4c2d319672c3ad99c137411d8d70b64e2d1" in 2.743734366s
Warning Failed 5m4s kubelet Error: container create failed: time="2021-05-03T07:38:46Z" level=error msg="container_linux.go:367: starting container process caused: exec: \"node-feature-discovery-operator\": executable file not found in $PATH"
Normal Pulled 5m4s kubelet Successfully pulled image "registry.redhat.io/openshift4/ose-cluster-nfd-operator@sha256:de3498688673acea54b31707266db4c2d319672c3ad99c137411d8d70b64e2d1" in 2.789621989s
```
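When triaging several failing pods, the name of the missing binary can be pulled out of the kubelet event text. The following is a minimal sketch, assuming the message format shown in the events above; `missing_exec` is a hypothetical helper name and is not part of `oc` or any OpenShift tooling.

```shell
# Hypothetical helper (not an oc/kubectl feature): print the executable name
# from a runc "executable file not found in $PATH" event message.
missing_exec() {
  # The event embeds the name as:  exec: \"NAME\": executable file not found
  # The backslashes before the quotes are treated as optional.
  printf '%s\n' "$1" |
    sed -n 's/.*exec: \\*"\([^"\\]*\)\\*": executable file not found.*/\1/p'
}

# Sample message copied from the events above. Single quotes keep $PATH
# and the backslashes literal.
msg='Error: container create failed: time="2021-05-03T07:38:09Z" level=error msg="container_linux.go:367: starting container process caused: exec: \"node-feature-discovery-operator\": executable file not found in $PATH"'

missing_exec "$msg"   # prints: node-feature-discovery-operator
```

The extracted name can then be compared with the container's `Command:` field in the `oc describe pod` output. Here both are `node-feature-discovery-operator`, which suggests the command set by the CSV does not match any binary shipped in the pulled image.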
Using the NFD operator bundle from the merged PR branch https://github.com/openshift/cluster-nfd-operator/pull/161, the NFD operator was successfully deployed with the `operator-sdk run bundle` command, after building the latest NFD operator and NFD bundle images and pushing them to my own quay.io repo. The NodeFeatureDiscoveries instance was then created from the OpenShift console (Operators -> Installed Operators -> NFD).

    # operator-sdk run bundle quay.io/wabouham/nfd-operator-bundle:0.0.1
    INFO[0009] Successfully created registry pod: quay-io-wabouham-nfd-operator-bundle-0-0-1
    INFO[0009] Created CatalogSource: node-feature-discovery-operator-catalog
    INFO[0010] OperatorGroup "operator-sdk-og" created
    INFO[0010] Created Subscription: node-feature-discovery-operator-v0-0-1-sub
    INFO[0015] Approved InstallPlan install-f5sb8 for the Subscription: node-feature-discovery-operator-v0-0-1-sub
    INFO[0015] Waiting for ClusterServiceVersion "default/node-feature-discovery-operator.v0.0.1" to reach 'Succeeded' phase
    INFO[0015] Waiting for ClusterServiceVersion "default/node-feature-discovery-operator.v0.0.1" to appear
    INFO[0046] Found ClusterServiceVersion "default/node-feature-discovery-operator.v0.0.1" phase: Pending
    INFO[0047] Found ClusterServiceVersion "default/node-feature-discovery-operator.v0.0.1" phase: Installing
    INFO[0078] Found ClusterServiceVersion "default/node-feature-discovery-operator.v0.0.1" phase: Succeeded
    INFO[0078] OLM has successfully installed "node-feature-discovery-operator.v0.0.1"

    # oc get pods -n default
    NAME READY STATUS RESTARTS AGE
    ff0c8a9d7f0601070764733f0bae54bb0110a7e8d656898cd9afd0c7d8mv4v2 0/1 Completed 0 2m19s
    nfd-controller-manager-5f85fd8ccc-fmngz 2/2 Running 0 106s
    quay-io-wabouham-nfd-operator-bundle-0-0-1 1/1 Running 0 2m32s

    # oc get pods -n default
    NAME READY STATUS RESTARTS AGE
    ff0c8a9d7f0601070764733f0bae54bb0110a7e8d656898cd9afd0c7d8mv4v2 0/1 Completed 0 5m49s
    nfd-controller-manager-5f85fd8ccc-fmngz 2/2 Running 0 5m16s
    nfd-master-66pbh 1/1 Running 0 113s
    nfd-master-ds8sn 1/1 Running 0 113s
    nfd-master-wx9lz 1/1 Running 0 113s
    nfd-worker-5ncvq 1/1 Running 0 112s
    nfd-worker-tbnkm 1/1 Running 0 112s
    nfd-worker-xlpnh 1/1 Running 0 112s
    quay-io-wabouham-nfd-operator-bundle-0-0-1 1/1 Running 0 6m2s

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (OpenShift Container Platform 4.8.2 extras update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2021:2435
Description of problem:

This is on OCP 4.8.0-0.nightly-2021-04-20-070522, IPI install on AWS. Using catalogsource qe-app-registry and clusterserviceversion/nfd.4.8.0-202104162013.p0, the NFD operator times out when trying to install it from console/OperatorHub:

    # oc get pods -n openshift-operators
    NAME READY STATUS RESTARTS AGE
    nfd-operator-7f67cd9b57-m8qg5 0/1 CreateContainerError 0 39m

    # oc describe pod nfd-operator-7f67cd9b57-m8qg5 -n openshift-operators
    Name: nfd-operator-7f67cd9b57-m8qg5
    Namespace: openshift-operators
    Priority: 0
    Node: ip-10-0-213-141.us-east-2.compute.internal/10.0.213.141
    Start Time: Wed, 21 Apr 2021 14:51:42 +0000
    Labels: name=nfd-operator
    pod-template-hash=7f67cd9b57
    Annotations: alm-examples:
    [
    {
    "apiVersion": "nfd.openshift.io/v1",
    "kind": "NodeFeatureDiscovery",
    "metadata": {
    "name": "nfd-master-server"
    },
    "spec": {
    "operand": {
    "image": "registry.redhat.io/openshift4/ose-node-feature-discovery:v4.8.0",
    "imagePullPolicy": "Always",
    "namespace": "node-feature-discovery-operator"
    },
    "workerConfig": {
    "configData": "sources:\n pci:\n deviceLabelFields:\n - \"vendor\"\n deviceClassWhitelist:\n - \"0200\"\n ...
    }
    }
    }
    ]
    capabilities: Basic Install
    categories: Database
    certified: false
    containerImage: registry.redhat.io/openshift4/ose-cluster-nfd-operator:v4.8.0
    createdAt: 2019-05-30T00:00:00Z
    description:
    This software enables node feature discovery for Kubernetes. It detects hardware features available on each node in a Kubernetes cluster, ...
    k8s.v1.cni.cncf.io/network-status:
    [{
    "name": "",
    "interface": "eth0",
    "ips": [
    "10.130.0.99"
    ],
    "default": true,
    "dns": {}
    }]
    k8s.v1.cni.cncf.io/networks-status:
    [{
    "name": "",
    "interface": "eth0",
    "ips": [
    "10.130.0.99"
    ],
    "default": true,
    "dns": {}
    }]
    olm.operatorGroup: global-operators
    olm.operatorNamespace: openshift-operators
    olm.skipRange: >=4.6.0 <4.8.0-202104162013.p0
    olm.targetNamespaces:
    openshift.io/scc: anyuid
    operatorframework.io/properties:
    {"properties":[{"type":"olm.gvk","value":{"group":"nfd.openshift.io","kind":"NodeFeatureDiscovery","version":"v1"}},{"type":"olm.package",...
    provider: Red Hat
    repository: https://github.com/openshift/cluster-nfd-operator
    support: Red Hat
    Status: Pending
    IP: 10.130.0.99
    IPs:
    IP: 10.130.0.99
    Controlled By: ReplicaSet/nfd-operator-7f67cd9b57
    Containers:
    nfd-operator:
    Container ID:
    Image: registry.redhat.io/openshift4/ose-cluster-nfd-operator@sha256:312e243d1feb44b3bb2150c084f1fbca0e737719cbdd4822e9c89d7cfd2bb2fd
    Image ID:
    Port: 60000/TCP
    Host Port: 0/TCP
    Command:
    cluster-nfd-operator
    State: Waiting
    Reason: CreateContainerError
    Ready: False
    Restart Count: 0
    Environment:
    WATCH_NAMESPACE: (v1:metadata.annotations['olm.targetNamespaces'])
    POD_NAME: nfd-operator-7f67cd9b57-m8qg5 (v1:metadata.name)
    OPERATOR_NAME: cluster-nfd-operator
    NODE_FEATURE_DISCOVERY_IMAGE: registry.redhat.io/openshift4/ose-node-feature-discovery@sha256:c11f20d88a0adee2c12e7c7409e84135f7439e872340811a51b6461616dd513f
    OPERATOR_CONDITION_NAME: nfd.4.8.0-202104162013.p0
    Mounts:
    /var/run/secrets/kubernetes.io/serviceaccount from nfd-operator-token-xz8sl (ro)
    Conditions:
    Type Status
    Initialized True
    Ready False
    ContainersReady False
    PodScheduled True
    Volumes:
    nfd-operator-token-xz8sl:
    Type: Secret (a volume populated by a Secret)
    SecretName: nfd-operator-token-xz8sl
    Optional: false
    QoS Class: BestEffort
    Node-Selectors: node-role.kubernetes.io/master=
    Tolerations: node-role.kubernetes.io/master:NoSchedule
    node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
    node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
    Events:
    Type Reason Age From Message
    ---- ------ ---- ---- -------
    Normal Scheduled 36m default-scheduler Successfully assigned openshift-operators/nfd-operator-7f67cd9b57-m8qg5 to ip-10-0-213-141.us-east-2.compute.internal
    Normal AddedInterface 36m multus Add eth0 [10.130.0.99/23]
    Normal Pulled 36m kubelet Successfully pulled image "registry.redhat.io/openshift4/ose-cluster-nfd-operator@sha256:312e243d1feb44b3bb2150c084f1fbca0e737719cbdd4822e9c89d7cfd2bb2fd" in 2.173252733s
    Warning Failed 36m kubelet Error: container create failed: time="2021-04-21T14:51:47Z" level=error msg="container_linux.go:367: starting container process caused: exec: \"cluster-nfd-operator\": executable file not found in $PATH"
    Warning Failed 36m kubelet Error: container create failed: time="2021-04-21T14:51:50Z" level=error msg="container_linux.go:367: starting container process caused: exec: \"cluster-nfd-operator\": executable file not found in $PATH"
    Normal Pulled 36m kubelet Successfully pulled image "registry.redhat.io/openshift4/ose-cluster-nfd-operator@sha256:312e243d1feb44b3bb2150c084f1fbca0e737719cbdd4822e9c89d7cfd2bb2fd" in 2.233554247s
    Normal Pulled 36m kubelet Successfully pulled image "registry.redhat.io/openshift4/ose-cluster-nfd-operator@sha256:312e243d1feb44b3bb2150c084f1fbca0e737719cbdd4822e9c89d7cfd2bb2fd" in 2.114799275s
    Warning Failed 36m kubelet Error: container create failed: time="2021-04-21T14:52:05Z" level=error msg="container_linux.go:367: starting container process caused: exec: \"cluster-nfd-operator\": executable file not found in $PATH"
    Normal Pulled 36m kubelet Successfully pulled image "registry.redhat.io/openshift4/ose-cluster-nfd-operator@sha256:312e243d1feb44b3bb2150c084f1fbca0e737719cbdd4822e9c89d7cfd2bb2fd" in 2.248943794s
    Warning Failed 36m kubelet Error: container create failed: time="2021-04-21T14:52:19Z" level=error msg="container_linux.go:367: starting container process caused: exec: \"cluster-nfd-operator\": executable file not found in $PATH"
    Normal Pulled 36m kubelet Successfully pulled image "registry.redhat.io/openshift4/ose-cluster-nfd-operator@sha256:312e243d1feb44b3bb2150c084f1fbca0e737719cbdd4822e9c89d7cfd2bb2fd" in 2.332834673s
    Warning Failed 36m kubelet Error: container create failed: time="2021-04-21T14:52:35Z" level=error msg="container_linux.go:367: starting container process caused: exec: \"cluster-nfd-operator\": executable file not found in $PATH"
    Normal Pulled 35m kubelet Successfully pulled image "registry.redhat.io/openshift4/ose-cluster-nfd-operator@sha256:312e243d1feb44b3bb2150c084f1fbca0e737719cbdd4822e9c89d7cfd2bb2fd" in 2.210708164s
    Warning Failed 35m kubelet Error: container create failed: time="2021-04-21T14:52:50Z" level=error msg="container_linux.go:367: starting container process caused: exec: \"cluster-nfd-operator\": executable file not found in $PATH"
    Warning Failed 35m kubelet Error: container create failed: time="2021-04-21T14:53:08Z" level=error msg="container_linux.go:367: starting container process caused: exec: \"cluster-nfd-operator\": executable file not found in $PATH"
    Normal Pulled 35m kubelet Successfully pulled image "registry.redhat.io/openshift4/ose-cluster-nfd-operator@sha256:312e243d1feb44b3bb2150c084f1fbca0e737719cbdd4822e9c89d7cfd2bb2fd" in 2.206978898s
    Normal Pulled 35m kubelet Successfully pulled image "registry.redhat.io/openshift4/ose-cluster-nfd-operator@sha256:312e243d1feb44b3bb2150c084f1fbca0e737719cbdd4822e9c89d7cfd2bb2fd" in 2.498489502s
    Warning Failed 35m kubelet Error: container create failed: time="2021-04-21T14:53:23Z" level=error msg="container_linux.go:367: starting container process caused: exec: \"cluster-nfd-operator\": executable file not found in $PATH"
    Normal Pulling 6m45s (x120 over 36m) kubelet Pulling image "registry.redhat.io/openshift4/ose-cluster-nfd-operator@sha256:312e243d1feb44b3bb2150c084f1fbca0e737719cbdd4822e9c89d7cfd2bb2fd"
    Normal Pulled 111s (x130 over 34m) kubelet (combined from similar events): Successfully pulled image "registry.redhat.io/openshift4/ose-cluster-nfd-operator@sha256:312e243d1feb44b3bb2150c084f1fbca0e737719cbdd4822e9c89d7cfd2bb2fd" in 2.111801928s

Version-Release number of selected component (if applicable):

    # oc version
    Client Version: 4.8.0-0.nightly-2021-04-05-174735
    Server Version: 4.8.0-0.nightly-2021-04-20-070522
    Kubernetes Version: v1.21.0-rc.0+b2955f1

How reproducible:
Every time

Steps to Reproduce:
1. Install an OCP 4.8 nightly cluster, IPI on AWS
2. Create a catalogsource to pull the latest 4.8 NFD operator images
3. From the OCP console, deploy NFD from OperatorHub using the nfd.4.8.0-202104162013.p0 csv

Actual results:
pod/nfd-operator-7f67cd9b57-mt45h 0/1 CreateContainerError. Fails to install from OperatorHub

Expected results:
Install Succeeded and NFD operator pod running

Additional info: