Bug 1914869
| Summary: | OCP 4.7 NFD - Operand configuration options for NodeFeatureDiscovery are empty, no supported image for ppc64le | ||||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|
| Product: | OpenShift Container Platform | Reporter: | pdsilva | ||||||||
| Component: | Node Feature Discovery Operator | Assignee: | Carlos Eduardo Arango Gutierrez <carangog> | ||||||||
| Status: | CLOSED ERRATA | QA Contact: | pdsilva | ||||||||
| Severity: | urgent | Docs Contact: | |||||||||
| Priority: | unspecified | ||||||||||
| Version: | 4.7 | CC: | aprabhak, carangog, danili, hmiyamot, pdsilva, sejug, yselkowi | ||||||||
| Target Milestone: | --- | Keywords: | Reopened | ||||||||
| Target Release: | 4.7.0 | ||||||||||
| Hardware: | ppc64le | ||||||||||
| OS: | Linux | ||||||||||
| Whiteboard: | |||||||||||
| Fixed In Version: | Doc Type: | If docs needed, set a value | |||||||||
| Doc Text: | Story Points: | --- | |||||||||
| Clone Of: | |||||||||||
| : | 1927489 (view as bug list) | Environment: | |||||||||
| Last Closed: | 2021-02-24 15:01:39 UTC | Type: | Bug | ||||||||
| Regression: | --- | Mount Type: | --- | ||||||||
| Documentation: | --- | CRM: | |||||||||
| Verified Versions: | Category: | --- | |||||||||
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |||||||||
| Cloudforms Team: | --- | Target Upstream Version: | |||||||||
| Embargoed: | |||||||||||
| Bug Depends On: | 1927489 | ||||||||||
| Bug Blocks: | |||||||||||
| Attachments: |
Description
pdsilva
2021-01-11 11:35:10 UTC
Created attachment 1746247 [details]
nfd-operand-screenshot.jpg
A specific procedure is required for testing OLM operators prior to GA: https://docs.engineering.redhat.com/display/MULTIARCH/How+To+Test+Red+Hat+ART+Operators The community images are not supported by Red Hat on any architecture, and in most cases are only available for x86_64. Attempting to use them would invalidate testing.
Making Yaakov's comment un-private so that the Power team can use the link for testing.
We have followed the instructions. We see the `nfd-operator` pod Running, but that alone does nothing for feature discovery. You need to configure it by creating an instance post-install. When you try to do that, the page asks you to fill in name, label, namespace, etc., but also the "Image" to pull and run. I don't know if that's supposed to be pre-filled, as 4.6 didn't ask for such a thing. I believe this bug should be reopened and looked at by the NFD team, including building and posting multi-arch images if that has not already been done. Thanks.
Re-opening the bug based on Hiro's comments in https://bugzilla.redhat.com/show_bug.cgi?id=1914869#c4. After discussing this bug with the Power testing team who opened it, I am setting this bug as a "Blocker+" because it is blocking an NFD regression test case executed by the Multi-Arch Power team; however, if the NFD Operator team believes otherwise, please feel free to let us know and make the appropriate changes.
The issue here is actually that in the default UI deployment of NFD, the master image is not set to the NODE_FEATURE_DISCOVERY_IMAGE supplied in the operator environment. If you add it in manually, via the GUI or yaml, nfd appears to deploy and work as expected. Full discussion is here: https://coreos.slack.com/archives/C0138QKKYTU/p1610468952274000?thread_ts=1610371907.258200&cid=C0138QKKYTU This is a regression in terms of the UI functionality from 4.6.
Created attachment 1748425 [details]
nfd-operand-4.7.0-202101161147.p0
The Operand fields - Image, Image Pull Policy and Namespace still appear empty with the 4.7.0-202101161147.p0 version of NFD. Screenshot attachment 1748425 [details] for reference.
# oc version
Client Version: 4.7.0-0.nightly-ppc64le-2021-01-18-024748
Server Version: 4.7.0-0.nightly-ppc64le-2021-01-18-024748
Kubernetes Version: v1.20.0+d9c52cc
# oc get packagemanifest | grep nfd
nfd Red Hat Operators v4.7 Stage 19m
# oc get csv | grep nfd
nfd.4.7.0-202101161147.p0 Node Feature Discovery 4.7.0-202101161147.p0 Succeeded
# oc get pods -A | grep nfd
openshift-operators nfd-operator-7c46664675-mvps2 1/1 Running 0 6m6s
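The expected behavior described earlier in this thread (the operand image defaulting to the NODE_FEATURE_DISCOVERY_IMAGE value from the operator environment when the field is left empty) can be illustrated with a minimal sketch. This is not the operator's actual code: the function name `resolve_operand_image` and the precedence rule (an explicit value wins, otherwise fall back to the environment) are assumptions made for illustration only.

```python
import os
from typing import Mapping, Optional

# Environment variable named in the discussion above.
ENV_VAR = "NODE_FEATURE_DISCOVERY_IMAGE"


def resolve_operand_image(spec_image: Optional[str],
                          env: Mapping[str, str] = os.environ) -> str:
    """Pick the image to deploy for the NFD operand.

    Hypothetical helper: an explicit, non-empty image from the instance
    spec wins; otherwise fall back to the image supplied in the operator
    environment. Raises ValueError if neither is available.
    """
    if spec_image and spec_image.strip():
        return spec_image.strip()
    fallback = env.get(ENV_VAR, "").strip()
    if not fallback:
        raise ValueError(f"no operand image set and {ENV_VAR} is empty")
    return fallback
```

With a fallback like this, an instance created from the UI with an empty Image field would still deploy the image shipped with the operator build, which matches the 4.6 behavior the reporters expected.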
Created attachment 1749519 [details]
NFD-operand-nfd.4.7.0-202101210137.p
Have re-deployed NFD with version nfd.4.7.0-202101210137.p. The fields get populated now but the image provided "quay.io/openshift/origin-node-feature-discovery:4.7" is not multi-arch. See Screenshot attachment 1749519 [details] for reference. We would need this image pre-populated as per the arch.
Cluster build details:
# oc version
Client Version: 4.7.0-0.nightly-ppc64le-2021-01-21-052650
Server Version: 4.7.0-0.nightly-ppc64le-2021-01-21-052650
Kubernetes Version: v1.20.0+91b6da5
# oc get packagemanifest | grep nfd
nfd Red Hat Operators v4.7 Stage 8h
# oc get csv | grep nfd
nfd.4.7.0-202101210137.p0 Node Feature Discovery 4.7.0-202101210137.p0 Succeeded
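The concern raised above is that the pre-populated image comes from quay.io rather than the Red Hat registry. A minimal sketch of how such a check could be automated in a test script follows. The helper names `registry_host` and `uses_expected_registry` are hypothetical, and treating registry.redhat.io as the only acceptable host is an assumption drawn from this thread, not a documented rule.

```python
def registry_host(image_ref: str) -> str:
    """Return the registry host of an image reference, or "" if none.

    The first path component counts as a host only when it contains a
    dot or a port, or is "localhost"; otherwise the reference is a short
    name without an explicit registry.
    """
    first, sep, _ = image_ref.partition("/")
    if sep and ("." in first or ":" in first or first == "localhost"):
        return first
    return ""


def uses_expected_registry(image_ref: str,
                           expected: str = "registry.redhat.io") -> bool:
    """True when the image reference points at the expected registry."""
    return registry_host(image_ref) == expected
```

For example, `uses_expected_registry("quay.io/openshift/origin-node-feature-discovery:4.7")` is false, while a reference under registry.redhat.io/openshift4/ passes. The check says nothing about which architectures an image actually provides; it only flags the registry mismatch reported here.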
Verified on AWS nightly build `4.7.0-0.nightly-2021-01-22-104107`: the NodeFeatureDiscoveries instance fields are pre-populated as expected. nfd csv version: nfd.4.7.0-202101230053.p0
When a release is GA, the production version of the Operators is pulled from the Red Hat registry (e.g. registry.redhat.io/openshift4/ose-cluster-nfd-operator : https://catalog.redhat.com/software/containers/openshift4/ose-cluster-nfd-operator/5d9e23f1bed8bd2245d9378c?container-tabs=overview). These ART-built images support all the architectures that Red Hat supports for multi-arch. Before that, you can manually use an image from https://brewweb.engineering.redhat.com/brew/search?match=glob&type=build&terms=node-feature-discovery-container-*4.7*
@pdsilva do you have an OCP environment on ppc64le to verify the fix? Thanks.
ART provides the multi-arch team with builds that point directly to the brew registry for testing. The latest builds should have been pointing to registry.redhat.io. Does this only start happening when the builds get pushed to stage?
The problem here is that the operator appears to populate the default nfd image with the quay origin URL instead of the corresponding registry.redhat.io address. It never used to do this; it has always pointed directly to the registry.redhat.io image corresponding to the release in question. This appears to be a bug.
I'm having Hiro attempt to reproduce the bug with the CFC stage index image: https://docs.engineering.redhat.com/display/CFC/Test If this fails, this operator may go live populating the operand image with the upstream/origin image (which is not something you would detect on x86, but that image doesn't work for multi-arch).
The latest staging index image in the 4.7 channel appears to be from 12/21/21 and does not include the fix above.
Have verified NFD installation on OCP 4.7.0-rc.1 on Power with the staging OperatorSource.
The installation is successful and the operand image shows registry.redhat.io/openshift4/ose-node-feature-discovery:v4.7.0. I have currently used the 202102130115.p0 image from brew registry "registry.redhat.io/openshift4/ose-node-feature-discovery@sha256:417e5b1e0c60f67b82462e3bd7a678ae913968b5908b6b717f706ed34520d071" with which the pods are in Running state.
# oc version
Client Version: 4.7.0-rc.1
Server Version: 4.7.0-rc.1
Kubernetes Version: v1.20.0+ba45583
# oc get csv | grep nfd
nfd.4.7.0-202102111715.p0 Node Feature Discovery 4.7.0-202102111715.p0 Succeeded
# oc get pods -A | grep nfd
openshift-operators nfd-master-44nvn 1/1 Running 0 49m
openshift-operators nfd-master-5lhhq 1/1 Running 0 47m
openshift-operators nfd-master-dnnb7 1/1 Running 0 48m
openshift-operators nfd-operator-65955df6f4-gfp94 1/1 Running 0 36h
openshift-operators nfd-worker-ct9tr 1/1 Running 0 45m
openshift-operators nfd-worker-g5wtp 1/1 Running 0 22m
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.
For information on the advisory (Moderate: OpenShift Container Platform 4.7.0 extras and security update), and where to find the updated files, follow the link below.
If the solution does not work for you, open a new bug report.
https://access.redhat.com/errata/RHSA-2020:5635
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 500 days
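The Running-state check from the verification comment above could be scripted rather than read by eye. The sketch below parses saved `oc get pods -A` text and reports matching pods that are not Running. It assumes the usual column order of that command (NAMESPACE, NAME, READY, STATUS, RESTARTS, AGE); the function name `pods_not_running` is hypothetical.

```python
from typing import Dict


def pods_not_running(listing: str, name_filter: str = "nfd") -> Dict[str, str]:
    """Map pod name -> status for matching pods whose STATUS is not Running.

    Assumes `oc get pods -A` column order:
    NAMESPACE NAME READY STATUS RESTARTS AGE
    """
    bad: Dict[str, str] = {}
    for line in listing.splitlines():
        cols = line.split()
        # Skip blank or short lines and pods that do not match the filter.
        if len(cols) < 4 or name_filter not in cols[1]:
            continue
        if cols[3] != "Running":
            bad[cols[1]] = cols[3]
    return bad
```

An empty result for the listing in this comment would confirm that the nfd-master, nfd-worker, and nfd-operator pods are all Running. As the thread notes, a Running `nfd-operator` pod alone is not sufficient, so a fuller check would also confirm that master and worker pods are present.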