Bug 1815129
| Summary: | OCP 4.2.z - nfd-worker pods fail to deploy in namespace other than default after NFD operator is deployed from OperatorHub | ||
|---|---|---|---|
| Product: | OpenShift Container Platform | Reporter: | Walid A. <wabouham> |
| Component: | Node Feature Discovery Operator | Assignee: | Zvonko Kosic <zkosic> |
| Status: | CLOSED ERRATA | QA Contact: | Walid A. <wabouham> |
| Severity: | high | Docs Contact: | |
| Priority: | high | ||
| Version: | 4.2.z | CC: | carangog, ematysek, mifiedle, mpatel, sejug, wabouham, zkosic |
| Target Milestone: | --- | ||
| Target Release: | 4.2.z | ||
| Hardware: | x86_64 | ||
| OS: | Linux | ||
| Whiteboard: | |||
| Fixed In Version: | Doc Type: | If docs needed, set a value | |
| Doc Text: | Story Points: | --- | |
| Clone Of: | 1807620 | Environment: | |
| Last Closed: | 2020-04-21 11:37:59 UTC | Type: | --- |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | Category: | --- | |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
| Bug Depends On: | 1808061 | ||
| Bug Blocks: | |||
Tested on a 4.2.29 stage cluster: NFD deployed successfully from OperatorHub in both a custom namespace and the default namespace. Nodes were labeled correctly:
$ oc get pods -n test-nfd
NAME READY STATUS RESTARTS AGE
nfd-master-27svr 1/1 Running 0 21m
nfd-master-p6fvd 1/1 Running 0 21m
nfd-master-t6pqk 1/1 Running 0 21m
nfd-operator-69fb6cb8ff-hb6hp 1/1 Running 0 22m
nfd-worker-52g55 1/1 Running 2 21m
nfd-worker-6hmmm 1/1 Running 2 21m
nfd-worker-jkhr6 1/1 Running 2 21m
$ oc describe -n test-nfd pod/nfd-operator-69fb6cb8ff-hb6hp
Name: nfd-operator-69fb6cb8ff-hb6hp
Namespace: test-nfd
Priority: 0
Node: 4229-stage17-whrk7-control-plane-1/10.0.102.102
Start Time: Fri, 17 Apr 2020 12:13:28 -0400
Labels: name=nfd-operator
pod-template-hash=69fb6cb8ff
Annotations: alm-examples:
[
{
"apiVersion": "nfd.openshift.io/v1alpha1",
"kind": "NodeFeatureDiscovery",
"metadata": {
"name": "nfd-master-server"
},
"spec": {
"namespace": "openshift-nfd"
}
}
]
capabilities: Basic Install
categories: Database
certified: false
containerImage:
createdAt: 2019-05-30T00:00:00Z
description:
This software enables node feature discovery for Kubernetes. It detects hardware features available on each node in a Kubernetes cluster, ...
k8s.v1.cni.cncf.io/networks-status:
[{
"name": "openshift-sdn",
"interface": "eth0",
"ips": [
"10.128.0.49"
],
"default": true,
"dns": {}
}]
olm.operatorGroup: test-nfd-vrdfw
olm.operatorNamespace: test-nfd
olm.targetNamespaces: test-nfd
openshift.io/scc: anyuid
provider: Red Hat
repository: https://github.com/openshift/cluster-nfd-operator
support: Red Hat
Status: Running
IP: 10.128.0.49
IPs: <none>
Controlled By: ReplicaSet/nfd-operator-69fb6cb8ff
Containers:
nfd-operator:
Container ID: cri-o://411d026bf8c102f98d7cfa10ca82639ebc0a27ddac04d2066e34459b96fe822a
Image: registry.stage.redhat.io/openshift4/ose-cluster-nfd-operator@sha256:0199a7e0d7c3f71c90ea5ee8a04810ab017163430a3173ffb4eafd0b71b1ad40
Image ID: registry.stage.redhat.io/openshift4/ose-cluster-nfd-operator@sha256:0199a7e0d7c3f71c90ea5ee8a04810ab017163430a3173ffb4eafd0b71b1ad40
Port: 60000/TCP
Host Port: 0/TCP
Command:
cluster-nfd-operator
State: Running
Started: Fri, 17 Apr 2020 12:13:40 -0400
Ready: True
Restart Count: 0
Readiness: exec [stat /tmp/operator-sdk-ready] delay=4s timeout=1s period=10s #success=1 #failure=1
Environment:
WATCH_NAMESPACE: (v1:metadata.annotations['olm.targetNamespaces'])
POD_NAME: nfd-operator-69fb6cb8ff-hb6hp (v1:metadata.name)
OPERATOR_NAME: cluster-nfd-operator
NODE_FEATURE_DISCOVERY_IMAGE: registry.stage.redhat.io/openshift4/ose-node-feature-discovery@sha256:f8d60643622304dbb4d9fee5b0223c7a6c6d972480127c0352745598ccde39e2
Mounts:
/tmp from tmp (rw)
/var/run/secrets/kubernetes.io/serviceaccount from nfd-operator-token-tvl86 (ro)
Conditions:
Type Status
Initialized True
Ready True
ContainersReady True
PodScheduled True
Volumes:
tmp:
Type: EmptyDir (a temporary directory that shares a pod's lifetime)
Medium:
SizeLimit: <unset>
nfd-operator-token-tvl86:
Type: Secret (a volume populated by a Secret)
SecretName: nfd-operator-token-tvl86
Optional: false
QoS Class: BestEffort
Node-Selectors: node-role.kubernetes.io/master=
Tolerations: node-role.kubernetes.io/master:NoSchedule
node.kubernetes.io/not-ready:NoExecute for 300s
node.kubernetes.io/unreachable:NoExecute for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 2m17s default-scheduler Successfully assigned test-nfd/nfd-operator-69fb6cb8ff-hb6hp to 4229-stage17-whrk7-control-plane-1
Normal Pulling 2m8s kubelet, 4229-stage17-whrk7-control-plane-1 Pulling image "registry.stage.redhat.io/openshift4/ose-cluster-nfd-operator@sha256:0199a7e0d7c3f71c90ea5ee8a04810ab017163430a3173ffb4eafd0b71b1ad40"
Normal Pulled 2m6s kubelet, 4229-stage17-whrk7-control-plane-1 Successfully pulled image "registry.stage.redhat.io/openshift4/ose-cluster-nfd-operator@sha256:0199a7e0d7c3f71c90ea5ee8a04810ab017163430a3173ffb4eafd0b71b1ad40"
Normal Created 2m5s kubelet, 4229-stage17-whrk7-control-plane-1 Created container nfd-operator
Normal Started 2m5s kubelet, 4229-stage17-whrk7-control-plane-1 Started container nfd-operator
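The verification in this comment comes down to reading the `oc get pods` listing and confirming that every pod is `Running` with all of its containers ready. A minimal sketch of that check follows, assuming the plain five-column output shown above has been captured as text. The helper names `parse_pod_table` and `not_ready` are hypothetical and are not part of `oc` or the NFD operator.

```python
# Sketch only: checks captured `oc get pods` text, it does not call the cluster.
# parse_pod_table and not_ready are hypothetical helper names.

def parse_pod_table(text):
    """Turn whitespace-separated `oc get pods` output into a list of dicts."""
    lines = [ln for ln in text.strip().splitlines() if ln.strip()]
    header = lines[0].split()  # NAME READY STATUS RESTARTS AGE
    return [dict(zip(header, ln.split())) for ln in lines[1:]]


def not_ready(pods):
    """Return names of pods that are not Running or not fully ready."""
    bad = []
    for pod in pods:
        ready, total = pod["READY"].split("/")
        if pod["STATUS"] != "Running" or ready != total:
            bad.append(pod["NAME"])
    return bad


# Listing from the 4.2.29 verification in this comment.
VERIFIED_LISTING = """
NAME READY STATUS RESTARTS AGE
nfd-master-27svr 1/1 Running 0 21m
nfd-master-p6fvd 1/1 Running 0 21m
nfd-master-t6pqk 1/1 Running 0 21m
nfd-operator-69fb6cb8ff-hb6hp 1/1 Running 0 22m
nfd-worker-52g55 1/1 Running 2 21m
nfd-worker-6hmmm 1/1 Running 2 21m
nfd-worker-jkhr6 1/1 Running 2 21m
"""

# Listing from the earlier 4.2.28 failure reported in this bug.
FAILING_LISTING = """
NAME READY STATUS RESTARTS AGE
nfd-operator-66546cdbfc-wp9pw 0/1 ImagePullBackOff 0 3h58m
"""

if __name__ == "__main__":
    print("verified run, pods not ready:", not_ready(parse_pod_table(VERIFIED_LISTING)))
    print("failing run, pods not ready:", not_ready(parse_pod_table(FAILING_LISTING)))
```

The sketch assumes each row has exactly five whitespace-separated fields, as in the listings in this bug. Output from other `oc` versions may format columns differently and would need a more careful parser.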
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.
https://access.redhat.com/errata/RHBA-2020:1450
*** Bug 1808503 has been marked as a duplicate of this bug. ***
Tested on OCP 4.2.28:
Server Version: 4.2.28
Kubernetes Version: v1.14.6-152-g117ba1f
Unable to deploy NFD operator from OperatorHub in a custom namespace, operator image not found:
$ oc get pods -n test-nfd
NAME READY STATUS RESTARTS AGE
nfd-operator-66546cdbfc-wp9pw 0/1 ImagePullBackOff 0 3h58m
$ oc logs -n test-nfd nfd-operator-66546cdbfc-wp9pw
Error from server (BadRequest): container "nfd-operator" in pod "nfd-operator-66546cdbfc-wp9pw" is waiting to start: trying and failing to pull image
MacBook-Pro:.docker walid$ oc get events -n test-nfd
LAST SEEN TYPE REASON OBJECT MESSAGE
53m Normal Pulling pod/nfd-operator-66546cdbfc-wp9pw Pulling image "image-registry.openshift-image-registry.svc:5000/openshift/ose-cluster-nfd-operator@sha256:0199a7e0d7c3f71c90ea5ee8a04810ab017163430a3173ffb4eafd0b71b1ad40"
3m32s Normal BackOff pod/nfd-operator-66546cdbfc-wp9pw Back-off pulling image "image-registry.openshift-image-registry.svc:5000/openshift/ose-cluster-nfd-operator@sha256:0199a7e0d7c3f71c90ea5ee8a04810ab017163430a3173ffb4eafd0b71b1ad40"
13m Warning Failed pod/nfd-operator-66546cdbfc-wp9pw Error: ImagePullBackOff
$ oc describe -n test-nfd pod/nfd-operator-66546cdbfc-wp9pw
Name: nfd-operator-66546cdbfc-wp9pw
Namespace: test-nfd
Priority: 0
Node: ip-10-0-162-203.us-west-2.compute.internal/10.0.162.203
Start Time: Tue, 14 Apr 2020 12:33:26 -0400
Labels: name=nfd-operator
pod-template-hash=66546cdbfc
Annotations: alm-examples:
[
{
"apiVersion": "nfd.openshift.io/v1alpha1",
"kind": "NodeFeatureDiscovery",
"metadata": {
"name": "nfd-master-server"
},
"spec": {
"namespace": "openshift-nfd"
}
}
]
capabilities: Basic Install
categories: Database
certified: false
containerImage:
createdAt: 2019-05-30T00:00:00Z
description:
This software enables node feature discovery for Kubernetes. It detects hardware features available on each node in a Kubernetes cluster, ...
k8s.v1.cni.cncf.io/networks-status:
[{
"name": "openshift-sdn",
"interface": "eth0",
"ips": [
"10.128.0.66"
],
"default": true,
"dns": {}
}]
olm.operatorGroup: test-nfd-7fdxz
olm.operatorNamespace: test-nfd
olm.targetNamespaces: test-nfd
openshift.io/scc: anyuid
provider: Red Hat
repository: https://github.com/openshift/cluster-nfd-operator
support: Red Hat
Status: Pending
IP: 10.128.0.66
IPs: <none>
Controlled By: ReplicaSet/nfd-operator-66546cdbfc
Containers:
nfd-operator:
Container ID:
Image: image-registry.openshift-image-registry.svc:5000/openshift/ose-cluster-nfd-operator@sha256:0199a7e0d7c3f71c90ea5ee8a04810ab017163430a3173ffb4eafd0b71b1ad40
Image ID:
Port: 60000/TCP
Host Port: 0/TCP
Command:
cluster-nfd-operator
State: Waiting
Reason: ImagePullBackOff
Ready: False
Restart Count: 0
Readiness: exec [stat /tmp/operator-sdk-ready] delay=4s timeout=1s period=10s #success=1 #failure=1
Environment:
WATCH_NAMESPACE: (v1:metadata.annotations['olm.targetNamespaces'])
POD_NAME: nfd-operator-66546cdbfc-wp9pw (v1:metadata.name)
OPERATOR_NAME: cluster-nfd-operator
NODE_FEATURE_DISCOVERY_IMAGE: image-registry.openshift-image-registry.svc:5000/openshift/ose-node-feature-discovery@sha256:f8d60643622304dbb4d9fee5b0223c7a6c6d972480127c0352745598ccde39e2
Mounts:
/tmp from tmp (rw)
/var/run/secrets/kubernetes.io/serviceaccount from nfd-operator-token-rncsc (ro)
Conditions:
Type Status
Initialized True
Ready False
ContainersReady False
PodScheduled True
Volumes:
tmp:
Type: EmptyDir (a temporary directory that shares a pod's lifetime)
Medium:
SizeLimit: <unset>
nfd-operator-token-rncsc:
Type: Secret (a volume populated by a Secret)
SecretName: nfd-operator-token-rncsc
Optional: false
QoS Class: BestEffort
Node-Selectors: node-role.kubernetes.io/master=
Tolerations: node-role.kubernetes.io/master:NoSchedule
node.kubernetes.io/not-ready:NoExecute for 300s
node.kubernetes.io/unreachable:NoExecute for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Pulling 54m (x41 over 3h59m) kubelet, ip-10-0-162-203.us-west-2.compute.internal Pulling image "image-registry.openshift-image-registry.svc:5000/openshift/ose-cluster-nfd-operator@sha256:0199a7e0d7c3f71c90ea5ee8a04810ab017163430a3173ffb4eafd0b71b1ad40"
Warning Failed 14m (x986 over 3h59m) kubelet, ip-10-0-162-203.us-west-2.compute.internal Error: ImagePullBackOff
Normal BackOff 4m12s (x1031 over 3h59m) kubelet, ip-10-0-162-203.us-west-2.compute.internal Back-off pulling image "image-registry.openshift-image-registry.svc:5000/openshift/ose-cluster-nfd-operator@sha256:0199a7e0d7c3f71c90ea5ee8a04810ab017163430a3173ffb4eafd0b71b1ad40"
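The operator pull specs in the two `oc describe` outputs in this bug carry the same `sha256` digest but point at different registries. The sketch below splits each pull spec to make that difference explicit. It assumes the simple `host/path@digest` form seen here, and `split_image_ref` is a hypothetical helper, not an `oc` or Kubernetes API. It only compares the strings and does not establish why the pull failed.

```python
# Sketch only: string comparison of the two pull specs quoted in this bug.
# split_image_ref is a hypothetical helper; it assumes "host/path@digest" form.

def split_image_ref(ref):
    """Split 'registry/repository@digest' into (registry, repository, digest)."""
    name, _, digest = ref.partition("@")
    registry, _, repository = name.partition("/")
    return registry, repository, digest


DIGEST = "sha256:0199a7e0d7c3f71c90ea5ee8a04810ab017163430a3173ffb4eafd0b71b1ad40"

# Pull spec from the failing 4.2.28 pod (ImagePullBackOff).
FAILING_REF = (
    "image-registry.openshift-image-registry.svc:5000/"
    "openshift/ose-cluster-nfd-operator@" + DIGEST
)

# Pull spec from the working 4.2.29 pod.
WORKING_REF = (
    "registry.stage.redhat.io/"
    "openshift4/ose-cluster-nfd-operator@" + DIGEST
)

if __name__ == "__main__":
    for label, ref in (("failing", FAILING_REF), ("working", WORKING_REF)):
        registry, repository, digest = split_image_ref(ref)
        print(f"{label}: registry={registry} repository={repository}")
    # Both pods reference the same image content; only the location differs.
    same = split_image_ref(FAILING_REF)[2] == split_image_ref(WORKING_REF)[2]
    print("same digest:", same)
```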