
Bug 1969560

Summary: Community NFD operator fails installation
Product: OpenShift Container Platform
Reporter: Christian LaPolt <christian.lapolt>
Component: Node Feature Discovery Operator
Assignee: Carlos Eduardo Arango Gutierrez <carangog>
Status: CLOSED ERRATA
QA Contact: Lena Horsley <lhorsley>
Severity: low
Docs Contact:
Priority: unspecified
Version: 4.9
CC: carangog, christian.lapolt, krmoser, lhorsley, sejug
Target Milestone: ---
Target Release: 4.9.0
Hardware: s390x
OS: Linux
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2021-10-18 17:19:23 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On:
Bug Blocks: 1934148

Description Christian LaPolt 2021-06-08 15:40:13 UTC
Description of problem:
On s390x there are two NFD operators on OperatorHub. The Red Hat-supplied one installs fine; the Community one fails. I understand that community operators are not supported by Red Hat, but for the sake of clarity for customers, that operator should probably not be tagged for s390x.

Version-Release number of selected component (if applicable):
4.7.0 provided by Red Hat

How reproducible:
Very

Steps to Reproduce:
1. Have a 4.7 or 4.8 cluster on Z (s390x)
2. Try to install the Community NFD operator from OperatorHub

Actual results:
Failed: install failed: deployment nfd-operator not ready before timeout: deployment "nfd-operator" exceeded its progress deadline

nfd-operator-59c6596f4d-kzvvb   0/1     CrashLoopBackOff   9          22m
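
A failure like the one above can be spotted quickly by filtering the pod listing for anything that is not fully ready. The following is only a minimal sketch: the function name not_ready_pods is hypothetical, and the openshift-operators namespace in the usage comment is an assumption taken from the later verification output, not from the failing cluster.

```shell
#!/bin/sh
# Hypothetical helper: reads `oc get pods` output on stdin and prints
# "<pod-name> <status>" for every pod that is not fully ready.
not_ready_pods() {
    awk '
        $1 == "NAME" { next }             # skip the header row
        NF < 3 { next }                   # skip blank or short lines
        $3 == "Completed" { next }        # finished pods are not failures
        {
            split($2, r, "/")             # READY column, e.g. "0/1"
            if (r[1] != r[2] || $3 != "Running")
                print $1, $3
        }
    '
}

# Intended use against a live cluster (the namespace is an assumption):
#   oc get pods -n openshift-operators | not_ready_pods
```

For a pod reported this way, `oc describe pod <name>` and `oc logs <name> --previous` are the usual next steps to see why the container keeps restarting.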

Expected results:
Either it installs successfully or is not available on the hub.

Additional info:

Comment 1 Carlos Eduardo Arango Gutierrez 2021-06-08 16:20:14 UTC
Thanks for this. I will take it as an action item to remove the Community s390x tag.

Comment 2 Carlos Eduardo Arango Gutierrez 2021-07-29 23:16:21 UTC
https://github.com/redhat-openshift-ecosystem/community-operators-prod/pull/47 merged

Comment 3 Lena Horsley 2021-08-06 20:37:59 UTC
Tested on build: 4.7.0-0.nightly-2021-08-05-210914 and 4.8.0-0.nightly-2021-08-05-031749
Cloud provider: AWS


================================================================================================

oc get clusterversion; oc get pods -n openshift-operators
NAME      VERSION                             AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.7.0-0.nightly-2021-08-05-210914   True        False         4h2m    Cluster version is 4.7.0-0.nightly-2021-08-05-210914
NAME                            READY   STATUS    RESTARTS   AGE
nfd-master-b52kg                1/1     Running   0          3h11m
nfd-master-jqkm9                1/1     Running   0          3h11m
nfd-master-k4nvw                1/1     Running   0          3h11m
nfd-operator-5db8945ff5-m7tv8   1/1     Running   0          3h12m
nfd-worker-5l2ww                1/1     Running   0          3h11m
nfd-worker-8rvfh                1/1     Running   0          3h11m
nfd-worker-9wkjb                1/1     Running   0          3h11m



oc get clusterversion; oc get pods -n openshift-operators
NAME      VERSION                             AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.8.0-0.nightly-2021-08-05-031749   True        False         3h46m   Cluster version is 4.8.0-0.nightly-2021-08-05-031749
NAME                            READY   STATUS    RESTARTS   AGE
nfd-master-8s2vh                1/1     Running   0          165m
nfd-master-fgxbk                1/1     Running   0          165m
nfd-master-m67gx                1/1     Running   0          165m
nfd-operator-5db8945ff5-7tw66   1/1     Running   0          166m
nfd-worker-9pl4r                1/1     Running   1          165m
nfd-worker-9sbg7                1/1     Running   1          165m
nfd-worker-xd8sw                1/1     Running   1          165m


================================================================================================

Additional testing performed (after installing NFD):
1. Display the node labels (oc describe nodes | grep feature; oc describe nodes | egrep 'Roles|pci')
2. Add a node/machineset
3. Deploy GPU operator and view metrics (GPU operator depends upon NFD)
4. Remove a node/machineset
5. Display the node labels (oc describe nodes | grep feature; oc describe nodes | egrep 'Roles|pci')
6. Log creation and stability.
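
The label checks in steps 1 and 5 could also be scripted so that the label set before and after adding or removing a node can be compared. This is a rough sketch, not the procedure used in the testing above: the function name nfd_labels is hypothetical, and it assumes the labels of interest carry the feature.node.kubernetes.io/ prefix that NFD applies.

```shell
#!/bin/sh
# Hypothetical helper: reads `oc get nodes --show-labels` output on stdin
# and prints each distinct NFD feature label once, sorted.
nfd_labels() {
    awk '{
        n = split($NF, labels, ",")       # LABELS is the last column
        for (i = 1; i <= n; i++)
            if (index(labels[i], "feature.node.kubernetes.io/") == 1)
                print labels[i]
    }' | sort -u
}

# Intended use against a live cluster:
#   oc get nodes --show-labels | nfd_labels > labels-before.txt
#   ... add or remove a node/machineset ...
#   oc get nodes --show-labels | nfd_labels > labels-after.txt
#   diff labels-before.txt labels-after.txt
```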

Comment 6 errata-xmlrpc 2021-10-18 17:19:23 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (OpenShift Container Platform 4.9 extras update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2021:3760