Bug 1961484

Summary: nfd operator installation fails on Power, due to the kube-rbac-proxy image not being multi-arch
Product: OpenShift Container Platform
Reporter: Satwinder Singh <satwsing>
Component: Node Feature Discovery Operator
Assignee: Carlos Eduardo Arango Gutierrez <carangog>
Status: CLOSED ERRATA
QA Contact: Walid A. <wabouham>
Severity: medium
Priority: unspecified
Version: 4.8
CC: carangog, danili, jschinta, sejug
Target Milestone: ---
Target Release: 4.8.0
Hardware: ppc64le
OS: Linux
Last Closed: 2021-07-27 22:19:28 UTC
Type: Bug
Attachments:
Description: UI shows timeout error while installation
Flags: none

Description Satwinder Singh 2021-05-18 05:39:53 UTC
Created attachment 1784323
UI shows timeout error while installation

Description of problem:
NFD operator installation fails on Power (ppc64le): the nfd-controller-manager pod goes into `CrashLoopBackOff` because the kube-rbac-proxy image is not multi-arch, so its binary cannot run on ppc64le.
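
One way to confirm that an image is not multi-arch is to look at its raw manifest: a multi-arch image is published as a manifest list with one entry per platform, while a single-arch image has no `manifests` array at all. The sketch below is a minimal illustration of that check. The sample JSON documents and the function names are hypothetical and are not the real registry response for kube-rbac-proxy; in practice the raw manifest could be fetched with a tool such as `skopeo inspect --raw`.

```python
import json
from typing import List

# Illustrative manifest list (hypothetical digests, NOT a real registry response).
SAMPLE_MANIFEST_LIST = json.dumps({
    "schemaVersion": 2,
    "mediaType": "application/vnd.docker.distribution.manifest.list.v2+json",
    "manifests": [
        {"digest": "sha256:aaaa", "platform": {"architecture": "amd64", "os": "linux"}},
        {"digest": "sha256:bbbb", "platform": {"architecture": "arm64", "os": "linux"}},
    ],
})

# Illustrative single-arch image manifest: it has "config" and "layers"
# but no "manifests" array, so it carries no per-platform entries.
SAMPLE_SINGLE_ARCH = json.dumps({
    "schemaVersion": 2,
    "mediaType": "application/vnd.docker.distribution.manifest.v2+json",
    "config": {"digest": "sha256:cccc"},
    "layers": [],
})


def supported_architectures(raw_manifest: str) -> List[str]:
    """Return the sorted architectures listed in a manifest list.

    Returns an empty list for a single-arch manifest, which has no
    "manifests" array to enumerate.
    """
    doc = json.loads(raw_manifest)
    archs = {
        entry.get("platform", {}).get("architecture")
        for entry in doc.get("manifests", [])
    }
    archs.discard(None)
    return sorted(archs)


def supports(raw_manifest: str, arch: str) -> bool:
    """True if the manifest list contains an entry for the given architecture."""
    return arch in supported_architectures(raw_manifest)


if __name__ == "__main__":
    print(supported_architectures(SAMPLE_MANIFEST_LIST))
    print(supports(SAMPLE_MANIFEST_LIST, "ppc64le"))
```

An empty result from `supported_architectures` means the tag points at a single image manifest, which is what "not multi-arch" amounts to here.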


Version-Release number of selected component (if applicable):
4.8.0-202105131518.p0


Steps to Reproduce:
1. Install the NFD operator directly from the UI.


Actual results:
NFD status
```
# oc get all -A | grep nfd
openshift-operators                                pod/nfd-controller-manager-c65f4cfb4-cc5kj                            1/2     CrashLoopBackOff   6          7m22s
openshift-operators                                service/nfd-controller-manager-metrics-service     ClusterIP      172.30.215.67    <none>                                 8443/TCP                       7m23s
openshift-operators                                deployment.apps/nfd-controller-manager                   0/1     1            0           7m23s
openshift-operators                                replicaset.apps/nfd-controller-manager-c65f4cfb4                    1         1         0       7m23s
```

Expected results:
Installation should complete.


Additional info:

oc describe pod nfd-controller-manager-c65f4cfb4-cc5kj -n openshift-operators
```
# oc describe pod nfd-controller-manager-c65f4cfb4-cc5kj -n openshift-operators
Name:         nfd-controller-manager-c65f4cfb4-cc5kj
Namespace:    openshift-operators
Priority:     0
Node:         worker-1/193.168.200.119
Start Time:   Mon, 17 May 2021 09:55:54 -0400
Labels:       control-plane=controller-manager
             pod-template-hash=c65f4cfb4
Annotations:  alm-examples:
               [
                 {
                   "apiVersion": "nfd.openshift.io/v1",
                   "kind": "NodeFeatureDiscovery",
                   "metadata": {
                     "name": "nfd-instance",
                     "namespace": "openshift-nfd"
                   },
                   "spec": {
                     "customConfig": {
                       "configData": "#    - name: \"more.kernel.features\"\n#      matchOn:\n#      - loadedKMod: [\"example_kmod3\"]\n#    - name: \"mo...
                     },
                     "instance": "",
                     "operand": {
                       "image": "registry.redhat.io/openshift4/ose-node-feature-discovery@sha256:3ef63f75d9d949d74f05f6071b649deacbac9889a029bdb42487ee7f...
                       "imagePullPolicy": "Always",
                       "namespace": "openshift-nfd"
                     },
                     "workerConfig": {
                       "configData": "#core:\n#  labelWhiteList:\n#  noPublish: false\n#  sleepInterval: 60s\n#  sources: [all]\n#  klog:\n#    addDirHea...
                     }
                   }
                 }
               ]
             capabilities: Basic Install
             categories: Database
             containerImage:
               registry.redhat.io/openshift4/ose-cluster-nfd-operator@sha256:ae4a807cf0bbcee0f3124c3c3c24b433ec90902731a40b551afc402ee0d5dbcc
             description:
               This software enables node feature discovery for OpenShift. It detects hardware features available on each node in an OpenShift cluster, a...
             k8s.v1.cni.cncf.io/network-status:
               [{
                   "name": "",
                   "interface": "eth0",
                   "ips": [
                       "10.128.3.60"
                   ],
                   "default": true,
                   "dns": {}
               }]
             k8s.v1.cni.cncf.io/networks-status:
               [{
                   "name": "",
                   "interface": "eth0",
                   "ips": [
                       "10.128.3.60"
                   ],
                   "default": true,
                   "dns": {}
               }]
             olm.operatorGroup: global-operators
             olm.operatorNamespace: openshift-operators
             olm.skipRange: >=4.6.0 <4.8.0
             olm.targetNamespaces:
             openshift.io/scc: anyuid
             operatorframework.io/properties:
               {"properties":[{"type":"olm.gvk","value":{"group":"nfd.openshift.io","kind":"NodeFeatureDiscovery","version":"v1"}},{"type":"olm.package",...
             operators.operatorframework.io/builder: operator-sdk-v1.4.0+git
             operators.operatorframework.io/project_layout: go.kubebuilder.io/v3
             provider: Red Hat
             repository: https://github.com/openshift/cluster-nfd-operator
             support: Red Hat
Status:       Running
IP:           10.128.3.60
IPs:
 IP:           10.128.3.60
Controlled By:  ReplicaSet/nfd-controller-manager-c65f4cfb4
Containers:
 kube-rbac-proxy:
   Container ID:  cri-o://0e1a655d10d087a1d1139f65ce02cb676bcc52f5aff48127ad023b95368e7748
   Image:         gcr.io/kubebuilder/kube-rbac-proxy:v0.5.0
   Image ID:      gcr.io/kubebuilder/kube-rbac-proxy@sha256:e10d1d982dd653db74ca87a1d1ad017bc5ef1aeb651bdea089debf16485b080b
   Port:          8443/TCP
   Host Port:     0/TCP
   Args:
     --secure-listen-address=0.0.0.0:8443
     --upstream=http://127.0.0.1:8080/
     --logtostderr=true
     --v=10
   State:          Waiting
     Reason:       CrashLoopBackOff
   Last State:     Terminated
     Reason:       Error
     Exit Code:    1
     Started:      Mon, 17 May 2021 10:07:17 -0400
     Finished:     Mon, 17 May 2021 10:07:17 -0400
   Ready:          False
   Restart Count:  7
   Environment:
     HTTP_PROXY:               http://rdr-satwin-bastion-0:3128
     HTTPS_PROXY:              http://rdr-satwin-bastion-0:3128
     NO_PROXY:                 .cluster.local,.rdr-satwin.redhat.com,.svc,10.0.0.0/16,10.128.0.0/14,127.0.0.1,172.30.0.0/16,193.168.200.0/24,api-int.rdr-satwin.redhat.com,localhost
     OPERATOR_CONDITION_NAME:  node-feature-discovery-operator.v4.8.0
   Mounts:
     /var/run/secrets/kubernetes.io/serviceaccount from nfd-operator-token-pp2sl (ro)
 manager:
   Container ID:  cri-o://542986d3688e1a938a59e88bbeee9bc4d9aa734c4125064a00dee5da42d5411d
   Image:         registry.redhat.io/openshift4/ose-cluster-nfd-operator@sha256:ae4a807cf0bbcee0f3124c3c3c24b433ec90902731a40b551afc402ee0d5dbcc
   Image ID:      registry.redhat.io/openshift4/ose-cluster-nfd-operator@sha256:4c8449713ea9550604402a6e03abe956f25ca3f4b9fda417e65eff38beec1d85
   Port:          <none>
   Host Port:     <none>
   Command:
     /node-feature-discovery-operator
   Args:
     --health-probe-bind-address=:8081
     --metrics-bind-address=127.0.0.1:8080
     --leader-elect
   State:          Running
     Started:      Mon, 17 May 2021 09:56:18 -0400
   Ready:          True
   Restart Count:  0
   Liveness:       http-get http://:8081/healthz delay=15s timeout=1s period=20s #success=1 #failure=3
   Readiness:      http-get http://:8081/readyz delay=5s timeout=1s period=10s #success=1 #failure=3
   Environment:
     WATCH_NAMESPACE:                (v1:metadata.annotations['olm.targetNamespaces'])
     POD_NAME:                      nfd-controller-manager-c65f4cfb4-cc5kj (v1:metadata.name)
     OPERATOR_NAME:                 cluster-nfd-operator
     NODE_FEATURE_DISCOVERY_IMAGE:  registry.redhat.io/openshift4/ose-node-feature-discovery@sha256:3ef63f75d9d949d74f05f6071b649deacbac9889a029bdb42487ee7f89423204
     HTTP_PROXY:                    http://rdr-satwin-bastion-0:3128
     HTTPS_PROXY:                   http://rdr-satwin-bastion-0:3128
     NO_PROXY:                      .cluster.local,.rdr-satwin.redhat.com,.svc,10.0.0.0/16,10.128.0.0/14,127.0.0.1,172.30.0.0/16,193.168.200.0/24,api-int.rdr-satwin.redhat.com,localhost
     OPERATOR_CONDITION_NAME:       node-feature-discovery-operator.v4.8.0
   Mounts:
     /var/run/secrets/kubernetes.io/serviceaccount from nfd-operator-token-pp2sl (ro)
Conditions:
 Type              Status
 Initialized       True
 Ready             False
 ContainersReady   False
 PodScheduled      True
Volumes:
 nfd-operator-token-pp2sl:
   Type:        Secret (a volume populated by a Secret)
   SecretName:  nfd-operator-token-pp2sl
   Optional:    false
QoS Class:       BestEffort
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
 Type     Reason          Age                   From               Message
 ----     ------          ----                  ----               -------
 Normal   Scheduled       13m                   default-scheduler  Successfully assigned openshift-operators/nfd-controller-manager-c65f4cfb4-cc5kj to worker-1
 Normal   AddedInterface  13m                   multus             Add eth0 [10.128.3.60/23]
 Normal   Pulling         13m                   kubelet            Pulling image "gcr.io/kubebuilder/kube-rbac-proxy:v0.5.0"
 Normal   Pulled          13m                   kubelet            Successfully pulled image "gcr.io/kubebuilder/kube-rbac-proxy:v0.5.0" in 6.288728546s
 Normal   Pulling         13m                   kubelet            Pulling image "registry.redhat.io/openshift4/ose-cluster-nfd-operator@sha256:ae4a807cf0bbcee0f3124c3c3c24b433ec90902731a40b551afc402ee0d5dbcc"
 Normal   Pulled          13m                   kubelet            Successfully pulled image "registry.redhat.io/openshift4/ose-cluster-nfd-operator@sha256:ae4a807cf0bbcee0f3124c3c3c24b433ec90902731a40b551afc402ee0d5dbcc" in 13.615753565s
 Normal   Created         13m                   kubelet            Created container manager
 Normal   Started         13m                   kubelet            Started container manager
 Normal   Created         12m (x4 over 13m)     kubelet            Created container kube-rbac-proxy
 Normal   Started         12m (x4 over 13m)     kubelet            Started container kube-rbac-proxy
 Normal   Pulled          12m (x3 over 13m)     kubelet            Container image "gcr.io/kubebuilder/kube-rbac-proxy:v0.5.0" already present on machine
 Warning  BackOff         3m14s (x48 over 13m)  kubelet            Back-off restarting failed container

```

-------------------------------------------------

# oc logs nfd-controller-manager-c65f4cfb4-cc5kj -n openshift-operators kube-rbac-proxy
```
standard_init_linux.go:219: exec user process caused: exec format error
```
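
An `exec format error` usually means the kernel refused to execute the binary, most often because it was built for a different CPU architecture than the node. A minimal sketch of how that could be checked, assuming the container's binary has been extracted to a local file: read the first 20 bytes of the ELF header and decode the `e_machine` field. The function names and the synthetic headers are illustrative only; the machine codes are taken from the ELF specification.

```python
import struct

# e_machine values from the ELF specification (subset).
ELF_MACHINES = {21: "ppc64", 22: "s390", 62: "x86-64", 183: "aarch64"}


def elf_machine(header: bytes) -> str:
    """Decode the target machine from the first 20 bytes of an ELF file."""
    if len(header) < 20 or header[:4] != b"\x7fELF":
        raise ValueError("not an ELF binary")
    # EI_DATA (byte 5): 1 = little-endian, 2 = big-endian.
    if header[5] not in (1, 2):
        raise ValueError("invalid ELF data encoding")
    byte_order = "<" if header[5] == 1 else ">"
    # e_machine is a 16-bit field at offset 18, right after e_type.
    (machine,) = struct.unpack_from(byte_order + "H", header, 18)
    return ELF_MACHINES.get(machine, "unknown ({})".format(machine))


def fake_header(machine: int, little_endian: bool = True) -> bytes:
    """Build a synthetic 20-byte ELF header for demonstration (hypothetical helper)."""
    # Magic, EI_CLASS=2 (64-bit), EI_DATA, EI_VERSION=1, then 9 padding bytes.
    ident = b"\x7fELF" + bytes([2, 1 if little_endian else 2, 1]) + bytes(9)
    byte_order = "<" if little_endian else ">"
    # e_type=2 (executable) followed by e_machine.
    return ident + struct.pack(byte_order + "HH", 2, machine)


if __name__ == "__main__":
    print(elf_machine(fake_header(62)))                       # x86-64 binary
    print(elf_machine(fake_header(21)))                       # ppc64 (little-endian here, i.e. ppc64le)
    print(elf_machine(fake_header(22, little_endian=False)))  # s390 family
```

On a ppc64le worker, a binary whose header decodes to anything other than `ppc64` with little-endian data would be expected to fail in this way.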

Comment 2 Satwinder Singh 2021-05-20 05:36:07 UTC
The issue is now resolved:

NFD Version: 4.8.0-202105191051.p0.assembly.stream

```
#  oc get all -A | grep nfd
openshift-operators                                pod/nfd-controller-manager-76d5b5c466-rn4gr                           2/2     Running     0          4m27s
openshift-operators                                pod/nfd-master-9vpgc                                                  1/1     Running     0          98s
openshift-operators                                pod/nfd-master-cwrr8                                                  1/1     Running     0          99s
openshift-operators                                pod/nfd-master-nw66q                                                  1/1     Running     0          98s
openshift-operators                                pod/nfd-worker-94rvq                                                  1/1     Running     0          98s
openshift-operators                                pod/nfd-worker-zrpw7                                                  1/1     Running     0          98s
openshift-operators                                service/nfd-controller-manager-metrics-service     ClusterIP      172.30.3.20      <none>                                 8443/TCP                       4m29s
openshift-operators                                service/nfd-master                                 ClusterIP      172.30.36.84     <none>                                 12000/TCP                      99s
openshift-operators                      daemonset.apps/nfd-master                    3         3         3       3            3           node-role.kubernetes.io/master=                            99s
openshift-operators                      daemonset.apps/nfd-worker                    2         2         2       2            2           <none>                                                     98s
openshift-operators                                deployment.apps/nfd-controller-manager                   1/1     1            1           4m27s
openshift-operators                                replicaset.apps/nfd-controller-manager-76d5b5c466                   1         1         1       4m27s

----------------------------------------------------------------------------------------------------------------------------------

# oc get pods -A -o wide | grep nfd
openshift-operators                                nfd-controller-manager-76d5b5c466-rn4gr                           2/2     Running     0          5m3s    10.128.3.89       worker-1   <none>           <none>
openshift-operators                                nfd-master-9vpgc                                                  1/1     Running     0          2m14s   10.130.1.42       master-2   <none>           <none>
openshift-operators                                nfd-master-cwrr8                                                  1/1     Running     0          2m15s   10.129.1.93       master-0   <none>           <none>
openshift-operators                                nfd-master-nw66q                                                  1/1     Running     0          2m14s   10.128.1.57       master-1   <none>           <none>
openshift-operators                                nfd-worker-94rvq                                                  1/1     Running     0          2m14s   193.168.200.29    worker-0   <none>           <none>
openshift-operators                                nfd-worker-zrpw7                                                  1/1     Running     0          2m14s   193.168.200.119   worker-1   <none>           <none>

```

Comment 3 jschinta 2021-05-21 08:34:12 UTC
I can confirm it is solved on s390x as well.

NFD Version: 4.8.0-202105191051.p0.assembly.stream

Comment 5 Dan Li 2021-06-09 12:56:18 UTC
Based on Comment 2 and Comment 3, both Satwinder (Power) and Jan (Z) have confirmed that this bug has been resolved. Moving to Verified.

Comment 8 errata-xmlrpc 2021-07-27 22:19:28 UTC
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory (OpenShift Container Platform 4.8.2 extras update), and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2021:2435