Bug 1797678

Summary: OCP 4.3 - Node Feature Discovery (NFD) nfd-operator fails to deploy from CLI and github repo
Product: OpenShift Container Platform
Reporter: Carlos Eduardo Arango Gutierrez <carangog>
Component: Node Feature Discovery Operator
Assignee: Zvonko Kosic <zkosic>
Status: CLOSED ERRATA
QA Contact: Walid A. <wabouham>
Severity: medium
Docs Contact:
Priority: unspecified
Version: 4.3.0
CC: ematysek, joncp, mifiedle, ppod, scuppett, sejug, wabouham, zkosic
Target Milestone: ---
Keywords: Regression, Reopened, TestBlocker
Target Release: 4.3.z
Hardware: x86_64
OS: Linux
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: 1789560
Environment:
Last Closed: 2020-02-19 05:40:10 UTC
Type: ---
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On: 1789560    
Bug Blocks: 1802744, 1805427    

Description Carlos Eduardo Arango Gutierrez 2020-02-03 16:02:38 UTC
+++ This bug was initially created as a clone of Bug #1789560 +++

This bug was fixed in PR #53 and needs to be cherry-picked into the release-4.3 branch.

Comment 1 Carlos Eduardo Arango Gutierrez 2020-02-03 16:14:22 UTC
https://github.com/openshift/cluster-nfd-operator/pull/57

Comment 2 Carlos Eduardo Arango Gutierrez 2020-02-03 16:24:13 UTC

*** This bug has been marked as a duplicate of bug 1789560 ***

Comment 3 Carlos Eduardo Arango Gutierrez 2020-02-03 16:25:14 UTC
Target Release: --- → 4.3.0

Comment 4 Carlos Eduardo Arango Gutierrez 2020-02-03 16:43:30 UTC

*** This bug has been marked as a duplicate of bug 1789560 ***

Comment 6 Zvonko Kosic 2020-02-11 16:10:02 UTC
*** Bug 1800715 has been marked as a duplicate of this bug. ***

Comment 7 Walid A. 2020-02-11 19:04:01 UTC
Verification failed on an IPI AWS cluster running OCP 4.3.1:

# oc get clusterversion
NAME      VERSION   AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.3.1     True        False         3h34m   Cluster version is 4.3.1

Server Version: 4.3.1
Kubernetes Version: v1.16.2

cd $GOPATH
cd src/github.com/openshift
git clone https://github.com/openshift/cluster-nfd-operator.git
cd cluster-nfd-operator
git checkout release-4.3
PULLPOLICY=Always make deploy

# oc get pods -n openshift-nfd
NAME                           READY   STATUS         RESTARTS   AGE
nfd-operator-f77bc847f-5zgqv   0/1     ErrImagePull   0          49s
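
For anyone re-running this check, a small filter along the following lines could flag pods stuck on image pulls. This is only a sketch: the function name `image_pull_failures` is made up for illustration, and it assumes the default tabular `oc get pods` output with STATUS in the third column, as in the listing above.

```shell
# Hypothetical helper (not part of the operator repo): print the names of
# pods whose STATUS is ErrImagePull or ImagePullBackOff.
# Reads default tabular `oc get pods` output on stdin; assumes STATUS is
# the third whitespace-separated column.
image_pull_failures() {
    awk '$1 != "NAME" && ($3 == "ErrImagePull" || $3 == "ImagePullBackOff") { print $1 }'
}

# Usage (requires a logged-in oc session):
#   oc get pods -n openshift-nfd | image_pull_failures
```

An empty result means no pod in the listing is currently reporting an image-pull error; it does not by itself mean the pods are ready.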

# oc get events -n openshift-nfd
LAST SEEN   TYPE      REASON              OBJECT                              MESSAGE
<unknown>   Normal    Scheduled           pod/nfd-operator-f77bc847f-5zgqv    Successfully assigned openshift-nfd/nfd-operator-f77bc847f-5zgqv to ip-10-0-155-116.us-west-2.compute.internal
12s         Normal    Pulling             pod/nfd-operator-f77bc847f-5zgqv    Pulling image "quay.io/zvonkok/cluster-nfd-operator:release-4.3"
11s         Warning   Failed              pod/nfd-operator-f77bc847f-5zgqv    Failed to pull image "quay.io/zvonkok/cluster-nfd-operator:release-4.3": rpc error: code = Unknown desc = Error reading manifest release-4.3 in quay.io/zvonkok/cluster-nfd-operator: manifest unknown: manifest unknown
11s         Warning   Failed              pod/nfd-operator-f77bc847f-5zgqv    Error: ErrImagePull
27s         Normal    BackOff             pod/nfd-operator-f77bc847f-5zgqv    Back-off pulling image "quay.io/zvonkok/cluster-nfd-operator:release-4.3"
27s         Warning   Failed              pod/nfd-operator-f77bc847f-5zgqv    Error: ImagePullBackOff
64s         Normal    SuccessfulCreate    replicaset/nfd-operator-f77bc847f   Created pod: nfd-operator-f77bc847f-5zgqv
64s         Normal    ScalingReplicaSet   deployment/nfd-operator             Scaled up replica set nfd-operator-f77bc847f to 1
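
The image reference that failed to pull can be extracted from the events above with a filter like the one below. It is a rough sketch; `failed_images` is a hypothetical name, and the pattern assumes the event message wording shown in this comment ("Failed to pull image" followed by the quoted reference).

```shell
# Hypothetical helper: list the distinct image references named in
# 'Failed to pull image "<ref>"' event messages read from stdin.
failed_images() {
    sed -n 's/.*Failed to pull image "\([^"]*\)".*/\1/p' | sort -u
}

# Usage (requires a logged-in oc session):
#   oc get events -n openshift-nfd | failed_images
```

Each reference printed this way could then be checked against the registry by hand, for example with `skopeo inspect docker://<ref>`, to see whether the tag actually exists.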

# oc logs -n openshift-nfd nfd-operator-f77bc847f-5zgqv
Error from server (BadRequest): container "nfd-operator" in pod "nfd-operator-f77bc847f-5zgqv" is waiting to start: trying and failing to pull image

Comment 8 Walid A. 2020-02-12 06:58:15 UTC
After pushing new cluster-nfd-operator images to their respective registries, verification was successful:

# oc get clusterversion
NAME      VERSION   AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.3.1     True        False         3h34m   Cluster version is 4.3.1

Server Version: 4.3.1
Kubernetes Version: v1.16.2

cd $GOPATH
cd src/github.com/openshift
git clone https://github.com/openshift/cluster-nfd-operator.git
cd cluster-nfd-operator
git checkout release-4.3
PULLPOLICY=Always make deploy

# oc get pods -n openshift-nfd
NAME                           READY   STATUS    RESTARTS   AGE
nfd-master-4m5s4               1/1     Running   0          13m
nfd-master-jrs72               1/1     Running   0          13m
nfd-master-pz9dh               1/1     Running   0          13m
nfd-operator-f77bc847f-8t64t   1/1     Running   0          13m
nfd-worker-64mwk               1/1     Running   2          13m
nfd-worker-9phjq               1/1     Running   2          13m
nfd-worker-zn78b               1/1     Running   2          13m
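
The pass condition used here (every pod Running with all containers ready) could be scripted roughly as follows. This is a sketch under the same assumptions as the listing above: default tabular `oc get pods` output, READY in the second column and STATUS in the third. The name `all_pods_ready` is hypothetical.

```shell
# Hypothetical helper: exit 0 only if at least one pod is listed on stdin
# and every listed pod is Running with READY showing n/n.
all_pods_ready() {
    awk '
        NF == 0 || $1 == "NAME" { next }   # skip blank lines and the header row
        {
            seen = 1
            split($2, r, "/")              # READY column, e.g. "1/1"
            if ($3 != "Running" || r[1] != r[2]) bad = 1
        }
        END { exit (seen && !bad) ? 0 : 1 }
    '
}

# Usage (requires a logged-in oc session):
#   oc get pods -n openshift-nfd | all_pods_ready && echo "NFD pods are up"
```

The restart counts are deliberately ignored, since the nfd-worker pods above show two restarts each and were still accepted as verified.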

Comment 10 errata-xmlrpc 2020-02-19 05:40:10 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:0492