Bug 1986570 - operand container image is required
Summary: operand container image is required
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Node Feature Discovery Operator
Version: 4.8
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: ---
Target Release: 4.8.z
Assignee: Carlos Eduardo Arango Gutierrez
QA Contact: Walid A.
URL:
Whiteboard:
Depends On: 1986305
Blocks:
Reported: 2021-07-27 19:37 UTC by Carlos Eduardo Arango Gutierrez
Modified: 2021-08-31 15:38 UTC (History)
CC List: 5 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of: 1986305
Environment:
Last Closed: 2021-08-31 15:38:30 UTC
Target Upstream Version:
Embargoed:


Links
System                  ID                                       Private  Priority  Status  Summary  Last Updated
Github                  openshift cluster-nfd-operator pull 189  0        None      None    None     2021-08-11 22:04:40 UTC
Red Hat Product Errata  RHBA-2021:3249                           0        None      None    None     2021-08-31 15:38:34 UTC

Description Carlos Eduardo Arango Gutierrez 2021-07-27 19:37:40 UTC
+++ This bug was initially created as a clone of Bug #1986305 +++

Description of problem:
When deploying or upgrading the NFD operator in a disconnected environment, the operand image always has to be edited manually.

If the user removes the image from the operand section, the config is applied to the cluster, but the operator fails to create the DaemonSet with the following error:

2021-07-26T16:15:50.238Z	ERROR	controller-runtime.manager.controller.nodefeaturediscovery	Reconciler error	{"reconciler group": "nfd.openshift.io", "reconciler kind": "NodeFeatureDiscovery", "name": "nfd-instance", "namespace": "openshift-nfd", "error": "DaemonSet.apps \"nfd-master\" is invalid: spec.template.spec.containers[0].image: Required value"}
github.com/go-logr/zapr.(*zapLogger).Error
	/go/src/github.com/openshift/cluster-nfd-operator/vendor/github.com/go-logr/zapr/zapr.go:132
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler
	/go/src/github.com/openshift/cluster-nfd-operator/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:267
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem
	/go/src/github.com/openshift/cluster-nfd-operator/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:235
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func1.1
	/go/src/github.com/openshift/cluster-nfd-operator/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:198
k8s.io/apimachinery/pkg/util/wait.JitterUntilWithContext.func1
	/go/src/github.com/openshift/cluster-nfd-operator/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:185
k8s.io/apimachinery/pkg/util/wait.BackoffUntil.func1
	/go/src/github.com/openshift/cluster-nfd-operator/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:155
k8s.io/apimachinery/pkg/util/wait.BackoffUntil
	/go/src/github.com/openshift/cluster-nfd-operator/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:156
k8s.io/apimachinery/pkg/util/wait.JitterUntil
	/go/src/github.com/openshift/cluster-nfd-operator/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:133
k8s.io/apimachinery/pkg/util/wait.JitterUntilWithContext
	/go/src/github.com/openshift/cluster-nfd-operator/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:185
k8s.io/apimachinery/pkg/util/wait.UntilWithContext
	/go/src/github.com/openshift/cluster-nfd-operator/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:99


How reproducible:
100%

Steps to Reproduce:
1. Deploy the NFD operator
2. Deploy the nfdConfig CR without an image in the operand section


Expected results:
If the image field is empty, the operator should use the environment variable that already exists in the nfd-operator (NODE_FEATURE_DISCOVERY_IMAGE, shown below). This selects the operand image that was built for the specific bundle version and mirrored from the bundle.

It will also allow switching to new images when the operator is upgraded, without requiring the user to check the CSV for related images and update the config.


- name: OPERATOR_NAME
  value: cluster-nfd-operator
- name: NODE_FEATURE_DISCOVERY_IMAGE
  value: registry.redhat.io/openshift4/ose-node-feature-discovery@sha256:5d9be97edd55051934bc9129fed627ef24211e8c1758e52253f64d0eeffadf3b
- name: OPERATOR_CONDITION_NAME
  value: node-feature-discovery-operator.v4.8.
image: registry.redhat.io/openshift4/ose-cluster-nfd-operator@sha256:1d1446af00893668d34452f367914ea2f43ad93a693ee7bf58905208b558ec79
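
The fallback requested under "Expected results" can be sketched as follows. This is only an illustration in Python (the operator itself is written in Go) and not the code from the fix. `resolve_operand_image` is a hypothetical helper, and the sketch assumes the CR is available as a plain dict and that the operator's environment carries NODE_FEATURE_DISCOVERY_IMAGE as in the env list above.

```python
import os

# Environment variable set on the operator container (see the env list above).
OPERAND_IMAGE_ENV = "NODE_FEATURE_DISCOVERY_IMAGE"


def resolve_operand_image(cr, environ=None):
    """Return the operand image for a NodeFeatureDiscovery CR (hypothetical helper).

    Prefer spec.operand.image when it is set; otherwise fall back to the
    image the operator was shipped with, taken from the environment.
    """
    if environ is None:
        environ = os.environ
    operand = (cr.get("spec") or {}).get("operand") or {}
    image = (operand.get("image") or "").strip()
    if image:
        return image
    fallback = (environ.get(OPERAND_IMAGE_ENV) or "").strip()
    if not fallback:
        # Neither source provides an image: fail early instead of creating
        # a DaemonSet whose container image is empty.
        raise ValueError(
            f"operand image is empty and {OPERAND_IMAGE_ENV} is not set"
        )
    return fallback
```

With this logic, a CR whose operand section has no image resolves to the value of NODE_FEATURE_DISCOVERY_IMAGE, while a CR that sets spec.operand.image explicitly keeps its own value.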

Comment 3 Walid A. 2021-08-17 21:34:47 UTC
Verified on OCP 4.8.0-0.nightly-2021-08-17-004424 on an AWS IPI cluster.
The NFD nodefeaturediscovery instance was created when the CR was deployed from a file with oc apply -f, with the image name removed from the operand section.

# cat  010_namespace.yml
apiVersion: v1
kind: Namespace
metadata:
  name: openshift-nfd

# oc apply -f 010_namespace.yml

Created a new NFD bundle from the release-4.8 branch of a clone of the NFD operator GitHub repo:

# git clone https://github.com/openshift/cluster-nfd-operator.git
# cd cluster-nfd-operator
# git checkout release-4.8

# export REGISTRY_AUTH_FILE=/root/<pull-secret.json>

# ORG=<username> PULLPOLICY=Always IMAGE_REGISTRY='quay.io/<username>' IMAGE_PUSH_CMD='podman push' IMAGE_BUILD_CMD='podman build' make image
# podman tag <nfd-operator-image-id> quay.io/<username>/cluster-nfd-operator:4.8.5
# podman login -u <username> quay.io
# podman push quay.io/<username>/cluster-nfd-operator:4.8.5

# export VERSION=4.8.5

ORG=<username> BUNDLE_IMG='quay.io/<username>/nfd-operator-bundle:4.8.5' PULLPOLICY=Always IMAGE_REGISTRY='quay.io/<username>' IMAGE_PUSH_CMD='podman push' IMAGE_BUILD_CMD='podman build' make bundle bundle-build bundle-push VERSION=4.8.5

operator-sdk run bundle  "quay.io/<username>/nfd-operator-bundle:4.8.5" -n openshift-nfd
INFO[0006] Successfully created registry pod: quay-io-<username>-nfd-operator-bundle-4-8-5 
INFO[0006] Created CatalogSource: node-feature-discovery-operator-catalog 
INFO[0006] OperatorGroup "operator-sdk-og" created      
INFO[0006] Created Subscription: node-feature-discovery-operator-v4-8-5-sub 
INFO[0010] Approved InstallPlan install-wwvfz for the Subscription: node-feature-discovery-operator-v4-8-5-sub 
INFO[0010] Waiting for ClusterServiceVersion "openshift-nfd/node-feature-discovery-operator.v4.8.5" to reach 'Succeeded' phase 
INFO[0010]   Waiting for ClusterServiceVersion "openshift-nfd/node-feature-discovery-operator.v4.8.5" to appear 
INFO[0017]   Found ClusterServiceVersion "openshift-nfd/node-feature-discovery-operator.v4.8.5" phase: Pending 
INFO[0018]   Found ClusterServiceVersion "openshift-nfd/node-feature-discovery-operator.v4.8.5" phase: InstallReady 
INFO[0019]   Found ClusterServiceVersion "openshift-nfd/node-feature-discovery-operator.v4.8.5" phase: Installing 
INFO[0049]   Found ClusterServiceVersion "openshift-nfd/node-feature-discovery-operator.v4.8.5" phase: Succeeded 
INFO[0049] OLM has successfully installed "node-feature-discovery-operator.v4.8.5" 

oc get clusterserviceversion.operators.coreos.com/node-feature-discovery-operator.v4.8.5 -n openshift-nfd -o json | jq -r '.metadata.annotations."alm-examples"' | jq .[0] | jq --arg ns openshift-nfd '.metadata.namespace = $ns' | jq --arg  ns openshift-nfd '.spec.namespace = $ns' > nfd_cr_4.8.5.json
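
For readers less familiar with jq, the pipeline above does roughly the following: it parses the alm-examples annotation of the CSV (a JSON string holding a list of example CRs), takes the first entry, and sets both metadata.namespace and spec.namespace. A minimal Python sketch of the same steps, assuming the CSV has already been loaded as a dict; `first_alm_example` is a hypothetical name used only for illustration.

```python
import json


def first_alm_example(csv, namespace):
    """Return the first alm-examples entry of a CSV dict with namespaces set."""
    # The annotation value is itself a JSON-encoded list of example CRs.
    examples = json.loads(csv["metadata"]["annotations"]["alm-examples"])
    cr = examples[0]
    # Mirror the two jq assignments: create the parent object if it is missing.
    cr.setdefault("metadata", {})["namespace"] = namespace
    cr.setdefault("spec", {})["namespace"] = namespace
    return cr
```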

# remove the image name from nfd_cr_4.8.5.json and rename CR file
cat nfd_cr_4.8.5.json_removed_operand_image 
{
  "apiVersion": "nfd.openshift.io/v1",
  "kind": "NodeFeatureDiscovery",
  "metadata": {
    "name": "nfd-instance",
    "namespace": "openshift-nfd"
  },
  "spec": {
    "customConfig": {
      "configData": "#    - name: \"more.kernel.features\"\n#      matchOn:\n#      - loadedKMod: [\"example_kmod3\"]\n#    - name: \"more.features.by.nodename\"\n#      value: customValue\n#      matchOn:\n#      - nodename: [\"special-.*-node-.*\"]\n"
    },
    "instance": "",
    "operand": {
      "imagePullPolicy": "Always",
      "namespace": "openshift-nfd"
    },
    "workerConfig": {
      "configData": "core:\n#  labelWhiteList:\n#  noPublish: false\n  sleepInterval: 60s\n#  sources: [all]\n#  klog:\n#    addDirHeader: false\n#    alsologtostderr: false\n#    logBacktraceAt:\n#    logtostderr: true\n#    skipHeaders: false\n#    stderrthreshold: 2\n#    v: 0\n#    vmodule:\n##   NOTE: the following options are not dynamically run-time configurable\n##         and require a nfd-worker restart to take effect after being changed\n#    logDir:\n#    logFile:\n#    logFileMaxSize: 1800\n#    skipLogHeaders: false\nsources:\n#  cpu:\n#    cpuid:\n##     NOTE: whitelist has priority over blacklist\n#      attributeBlacklist:\n#        - \"BMI1\"\n#        - \"BMI2\"\n#        - \"CLMUL\"\n#        - \"CMOV\"\n#        - \"CX16\"\n#        - \"ERMS\"\n#        - \"F16C\"\n#        - \"HTT\"\n#        - \"LZCNT\"\n#        - \"MMX\"\n#        - \"MMXEXT\"\n#        - \"NX\"\n#        - \"POPCNT\"\n#        - \"RDRAND\"\n#        - \"RDSEED\"\n#        - \"RDTSCP\"\n#        - \"SGX\"\n#        - \"SSE\"\n#        - \"SSE2\"\n#        - \"SSE3\"\n#        - \"SSE4.1\"\n#        - \"SSE4.2\"\n#        - \"SSSE3\"\n#      attributeWhitelist:\n#  kernel:\n#    kconfigFile: \"/path/to/kconfig\"\n#    configOpts:\n#      - \"NO_HZ\"\n#      - \"X86\"\n#      - \"DMI\"\n  pci:\n    deviceClassWhitelist:\n      - \"0200\"\n      - \"03\"\n      - \"12\"\n    deviceLabelFields:\n#      - \"class\"\n      - \"vendor\"\n#      - \"device\"\n#      - \"subsystem_vendor\"\n#      - \"subsystem_device\"\n#  usb:\n#    deviceClassWhitelist:\n#      - \"0e\"\n#      - \"ef\"\n#      - \"fe\"\n#      - \"ff\"\n#    deviceLabelFields:\n#      - \"class\"\n#      - \"vendor\"\n#      - \"device\"\n#  custom:\n#    - name: \"my.kernel.feature\"\n#      matchOn:\n#        - loadedKMod: [\"example_kmod1\", \"example_kmod2\"]\n#    - name: \"my.pci.feature\"\n#      matchOn:\n#        - pciId:\n#            class: [\"0200\"]\n#            vendor: [\"15b3\"]\n#            device: [\"1014\", \"1017\"]\n#        - pciId :\n#            vendor: [\"8086\"]\n#            device: [\"1000\", \"1100\"]\n#    - name: \"my.usb.feature\"\n#      matchOn:\n#        - usbId:\n#          class: [\"ff\"]\n#          vendor: [\"03e7\"]\n#          device: [\"2485\"]\n#        - usbId:\n#          class: [\"fe\"]\n#          vendor: [\"1a6e\"]\n#          device: [\"089a\"]\n#    - name: \"my.combined.feature\"\n#      matchOn:\n#        - pciId:\n#            vendor: [\"15b3\"]\n#            device: [\"1014\", \"1017\"]\n#          loadedKMod : [\"vendor_kmod1\", \"vendor_kmod2\"]\n"
    },
    "namespace": "openshift-nfd"
  }
}
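
The manual edit described above (removing the image name from the operand section) could also be scripted. The following Python sketch assumes the CR has been loaded as a dict; `remove_operand_image` is a hypothetical helper, and the image value in the demo is a placeholder, not the value from the actual CR.

```python
import copy
import json


def remove_operand_image(cr):
    """Return a copy of the CR with spec.operand.image removed, if present."""
    edited = copy.deepcopy(cr)
    # pop() with a default is a no-op when the key is already absent.
    edited.get("spec", {}).get("operand", {}).pop("image", None)
    return edited


if __name__ == "__main__":
    # Trimmed-down CR for illustration only.
    cr = {
        "apiVersion": "nfd.openshift.io/v1",
        "kind": "NodeFeatureDiscovery",
        "metadata": {"name": "nfd-instance", "namespace": "openshift-nfd"},
        "spec": {
            "operand": {
                "image": "placeholder.example/nfd:tag",
                "imagePullPolicy": "Always",
                "namespace": "openshift-nfd",
            }
        },
    }
    print(json.dumps(remove_operand_image(cr), indent=2))
```

The other operand fields (imagePullPolicy, namespace) are left untouched, which matches the CR shown above.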

# oc apply -f nfd_cr_4.8.5.json_removed_operand_image

# oc get pods -n openshift-nfd
NAME                                                              READY   STATUS      RESTARTS   AGE
f0641f817e848f07b6508f21783cf46079edac611f247dc064a30e7d8d4r2q9   0/1     Completed   0          100m
nfd-controller-manager-6dc6c6dc5c-v6fmk                           2/2     Running     0          100m
nfd-master-fwwbf                                                  1/1     Running     0          36m
nfd-master-lh628                                                  1/1     Running     0          36m
nfd-master-wfmz8                                                  1/1     Running     0          36m
nfd-worker-cgqnh                                                  1/1     Running     0          36m
nfd-worker-klps4                                                  1/1     Running     0          36m
nfd-worker-xnm7j                                                  1/1     Running     0          36m
quay-io-<username>-nfd-operator-bundle-4-8-5                      1/1     Running     0          100m

# oc describe pod/nfd-worker-cgqnh  -n openshift-nfd
Name:         nfd-worker-cgqnh
Namespace:    openshift-nfd
Priority:     0
Node:         ip-10-0-193-61.us-east-2.compute.internal/10.0.193.61
Start Time:   Tue, 17 Aug 2021 20:51:13 +0000
Labels:       app=nfd-worker
              controller-revision-hash=669869d9b9
              pod-template-generation=1
Annotations:  openshift.io/scc: nfd-worker
Status:       Running
IP:           10.0.193.61
IPs:
  IP:           10.0.193.61
Controlled By:  DaemonSet/nfd-worker
Containers:
  nfd-worker:
    Container ID:  cri-o://ab8d1ddc97d4f13dcea1b88538dc2870dcba5abb88f13a33bb2da7f7ffb39f2b
    Image:         quay.io/openshift/origin-node-feature-discovery:4.8
    Image ID:      quay.io/openshift/origin-node-feature-discovery@sha256:2cd02ec6e65e19b26d2e79898ab0f535916cf1f3e0bc894c1190370100b9719f
    Port:          <none>
    Host Port:     <none>
    Command:
      nfd-worker
    Args:
      --server=nfd-master:$(NFD_MASTER_SERVICE_PORT)
    State:          Running
      Started:      Tue, 17 Aug 2021 20:51:22 +0000
    Ready:          True
    Restart Count:  0
    Environment:
      NODE_NAME:   (v1:spec.nodeName)
    Mounts:
      /etc/kubernetes/node-feature-discovery from nfd-worker-config (rw)
      /etc/kubernetes/node-feature-discovery/custom.d/custom-rules from custom-config (ro)
      /etc/kubernetes/node-feature-discovery/features.d from nfd-features (rw)
      /etc/kubernetes/node-feature-discovery/source.d from nfd-hooks (rw)
      /host-boot from host-boot (ro)
      /host-etc/os-release from host-os-release (ro)
      /host-sys from host-sys (rw)
      /host-usr/lib from host-usr-lib (ro)
      /host-usr/src from host-usr-src (ro)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-c9drg (ro)
Conditions:
  Type              Status
  Initialized       True 
  Ready             True 
  ContainersReady   True 
  PodScheduled      True 
Volumes:
  host-boot:
    Type:          HostPath (bare host directory volume)
    Path:          /boot
    HostPathType:  
  host-os-release:
    Type:          HostPath (bare host directory volume)
    Path:          /etc/os-release
    HostPathType:  
  host-sys:
    Type:          HostPath (bare host directory volume)
    Path:          /sys
    HostPathType:  
  host-usr-lib:
    Type:          HostPath (bare host directory volume)
    Path:          /usr/lib
    HostPathType:  
  host-usr-src:
    Type:          HostPath (bare host directory volume)
    Path:          /usr/src
    HostPathType:  
  nfd-hooks:
    Type:          HostPath (bare host directory volume)
    Path:          /etc/kubernetes/node-feature-discovery/source.d
    HostPathType:  
  nfd-features:
    Type:          HostPath (bare host directory volume)
    Path:          /etc/kubernetes/node-feature-discovery/features.d
    HostPathType:  
  nfd-worker-config:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      nfd-worker
    Optional:  false
  custom-config:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      nfd-worker
    Optional:  false
  kube-api-access-c9drg:
    Type:                    Projected (a volume that contains injected data from multiple sources)
    TokenExpirationSeconds:  3607
    ConfigMapName:           kube-root-ca.crt
    ConfigMapOptional:       <nil>
    DownwardAPI:             true
    ConfigMapName:           openshift-service-ca.crt
    ConfigMapOptional:       <nil>
QoS Class:                   BestEffort
Node-Selectors:              <none>
Tolerations:                 :NoSchedule op=Exists
                             node.kubernetes.io/disk-pressure:NoSchedule op=Exists
                             node.kubernetes.io/memory-pressure:NoSchedule op=Exists
                             node.kubernetes.io/network-unavailable:NoSchedule op=Exists
                             node.kubernetes.io/not-ready:NoExecute op=Exists
                             node.kubernetes.io/pid-pressure:NoSchedule op=Exists
                             node.kubernetes.io/unreachable:NoExecute op=Exists
                             node.kubernetes.io/unschedulable:NoSchedule op=Exists
Events:
  Type    Reason     Age   From               Message
  ----    ------     ----  ----               -------
  Normal  Scheduled  36m   default-scheduler  Successfully assigned openshift-nfd/nfd-worker-cgqnh to ip-10-0-193-61.us-east-2.compute.internal
  Normal  Pulling    36m   kubelet            Pulling image "quay.io/openshift/origin-node-feature-discovery:4.8"
  Normal  Pulled     36m   kubelet            Successfully pulled image "quay.io/openshift/origin-node-feature-discovery:4.8" in 8.066785052s
  Normal  Created    36m   kubelet            Created container nfd-worker
  Normal  Started    36m   kubelet            Started container nfd-worker

Comment 4 ximhan 2021-08-20 07:11:22 UTC
OpenShift engineering has decided NOT to ship 4.8.6 on 8/23 due to the following issue:
https://bugzilla.redhat.com/show_bug.cgi?id=1995785
All the fixes will now be included in 4.8.7 on 8/30.

Comment 8 errata-xmlrpc 2021-08-31 15:38:30 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (OpenShift Container Platform 4.8.9 extras update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2021:3249

