Bug 1910117 - NFD Community Operator watches for wrong CRD version and crashes repeatedly
Summary: NFD Community Operator watches for wrong CRD version and crashes repeatedly
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Node Feature Discovery Operator
Version: 4.6
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: ---
Target Release: 4.6.z
Assignee: Carlos Eduardo Arango Gutierrez
QA Contact: Walid A.
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2020-12-22 17:49 UTC by James Harmison
Modified: 2021-02-08 13:41 UTC

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2021-02-08 13:41:41 UTC
Target Upstream Version:
Embargoed:




Links
System | ID | Private | Priority | Status | Summary | Last Updated
Github | operator-framework community-operators pull 2950 | 0 | None | closed | update NFD operator OLM catalog | 2021-01-25 21:45:35 UTC
Red Hat Product Errata | RHSA-2021:0310 | 0 | None | None | None | 2021-02-08 13:41:57 UTC

Description James Harmison 2020-12-22 17:49:07 UTC
Description of problem:
Installing the community NFD operator from the alpha channel deploys the controller-manager image from the :latest tag. This image belongs to a much newer version of the operator, with an updated API version for the CRD. This causes a CrashLoopBackOff on the controller-manager and a failed operator installation.

Version-Release number of selected component (if applicable):
4.5 on alpha channel

How reproducible:
always

Steps to Reproduce:
1. Install the NFD community operator from the Operator catalog on a stable-4.6 cluster (or any cluster, as far as I can tell). I hit this while attempting to work around https://bugzilla.redhat.com/show_bug.cgi?id=1897346

Actual results:
The controller-manager pod for the NFD operator enters CrashLoopBackOff, with logs indicating that nodefeaturediscoveries.nfd.openshift.io/v1 doesn't exist.

Verify that CRD installed by bundle is nodefeaturediscoveries.nfd.openshift.io/v1alpha1:
https://github.com/openshift/cluster-nfd-operator/blob/release-4.5/manifests/olm-catalog/4.5/nfd.crd.yaml#L39

Verify that Deployment for controller-manager is tracking latest:
https://github.com/openshift/cluster-nfd-operator/blob/release-4.5/manifests/olm-catalog/4.5/nfd.v4.5.0.clusterserviceversion.yaml#L162
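The mismatch above can be illustrated with a small check, sketched here under assumptions: the CRD is represented as an already-parsed dictionary, the helper names served_versions and watch_is_satisfiable are hypothetical, and the sample manifest is a trimmed stand-in rather than the actual bundle content.

```python
def served_versions(crd: dict) -> set:
    """Collect the API versions a CRD manifest declares as served."""
    spec = crd.get("spec", {})
    # Entries in spec.versions carry a 'served' flag; this sketch treats
    # a missing flag as served (an assumption, not API behaviour).
    versions = {
        v["name"] for v in spec.get("versions", []) if v.get("served", True)
    }
    # Older v1beta1-style manifests may use the single spec.version field.
    if spec.get("version"):
        versions.add(spec["version"])
    return versions


def watch_is_satisfiable(crd: dict, watched_version: str) -> bool:
    """True if the version the controller watches is served by the CRD."""
    return watched_version in served_versions(crd)


# Hypothetical, trimmed stand-in for the CRD shipped in the 4.5 bundle.
bundle_crd = {
    "spec": {
        "group": "nfd.openshift.io",
        "names": {"plural": "nodefeaturediscoveries"},
        "version": "v1alpha1",
    }
}

# The :latest controller watches v1, which the bundle CRD does not serve.
print(watch_is_satisfiable(bundle_crd, "v1"))
print(watch_is_satisfiable(bundle_crd, "v1alpha1"))
```

A check like this only models the version comparison; it does not query a cluster.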

Expected results:
The image in the deployment should use a tag that is pinned to the appropriate version of the operator, registering a watch for v1alpha1.
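As a rough sketch of what "pinned" could mean in a lint step, the function below flags image references that carry no tag or the latest tag. The function name is_pinned and the example.com image names are hypothetical; this is not part of the operator or of OLM tooling.

```python
def is_pinned(image: str) -> bool:
    """Return True if the image uses a digest or a tag other than 'latest'."""
    # A digest reference (name@sha256:...) always identifies one image.
    if "@" in image:
        return True
    # Only the last path component can carry a tag; a colon earlier in the
    # reference is a registry port, e.g. registry:5000/repo/name.
    last = image.rsplit("/", 1)[-1]
    if ":" not in last:
        return False  # no tag: container runtimes default to 'latest'
    tag = last.split(":", 1)[1]
    return tag not in ("", "latest")


# Hypothetical image references for illustration only.
for ref in (
    "example.com/nfd/cluster-nfd-operator:latest",
    "example.com/nfd/cluster-nfd-operator:4.5",
    "example.com:5000/nfd/cluster-nfd-operator",
):
    print(ref, is_pinned(ref))
```

Running such a check against the image field of a ClusterServiceVersion before publishing would catch a deployment that tracks :latest.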

Additional info:

Comment 1 Ryan Kraus 2020-12-22 18:02:37 UTC
Should a published version of the operator really be targeting the latest tag?

Comment 2 James Harmison 2020-12-22 18:22:01 UTC
On master branch:
make deploy results in the opposite problem: the default image with the master tag looks for the v1alpha1 CRD, but v1 is present. After I adjusted the deployed image, everything works as expected, but the default is somehow wrong.

Comment 3 Carlos Eduardo Arango Gutierrez 2021-01-21 13:48:17 UTC
The PR has been merged; this is ready for QA.

Comment 4 Carlos Eduardo Arango Gutierrez 2021-01-25 14:11:56 UTC
Now that https://github.com/operator-framework/community-operators/pull/2950 has been merged, is this issue still happening, or are more fixes needed?
Thanks

Comment 5 Walid A. 2021-01-25 22:58:40 UTC
Verified that the community NFD operator can be successfully deployed from OperatorHub on OCP 4.6.0-0.nightly-2021-01-25-060359.

Comment 7 James Harmison 2021-02-01 14:04:35 UTC
Can confirm that the changes work for deploying to a 3-node 4.6 bare metal cluster from the community catalog. Thanks!

Just hoping for 1897346 to clear up soon in support of the official operator release channels.

Comment 9 errata-xmlrpc 2021-02-08 13:41:41 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.6.16 extras security update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2021:0310

