Description of problem:
OpenShift server version 4.6.0-0.nightly-s390x-2020-09-25-054206 uses the image registry.redhat.io/redhat/community-operator-index:v4.6. This results in an exec format error for community operators, as only an x86 image is built. I used `skopeo inspect docker://registry.redhat.io/redhat/community-operator-index:v4.6` to verify that there is no s390x build (nor a ppc64le build) for this image.

Version-Release number of selected component (if applicable):

How reproducible:

Steps to Reproduce:
1.
2.
3.

Actual results:
All "community-operator-*" pods are failing in the "openshift-marketplace" namespace.

Expected results:
`oc get pods -n openshift-marketplace` should show all pods as "Running".
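For anyone who wants to repeat the architecture check, here is a minimal sketch. It assumes the tag resolves to a manifest list and that `skopeo inspect --raw` is used to fetch it unmodified; `list_arches` is a hypothetical helper name, not part of skopeo. If the tag points at a single-architecture image rather than a manifest list, the raw manifest carries no architecture fields and the helper prints nothing.

```shell
# list_arches: hypothetical helper. Reads a manifest list (JSON) on stdin and
# prints each distinct "architecture" value, one per line.
list_arches() {
  # Strip whitespace so the pattern matches regardless of JSON formatting,
  # then pull out the quoted value that follows each "architecture" key.
  tr -d ' \n\t' | grep -o '"architecture":"[^"]*"' | cut -d'"' -f4 | sort -u
}

# Example use against the registry (needs skopeo and registry credentials):
# skopeo inspect --raw docker://registry.redhat.io/redhat/community-operator-index:v4.6 | list_arches
```

On a correctly built multi-arch image, the output should include s390x and ppc64le alongside amd64.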
Created attachment 1716673: log file from one running community-operators pod
[root@ospamgrs3 ~]# oc get pods -n openshift-marketplace
NAME                                    READY   STATUS             RESTARTS   AGE
certified-operators-56648cfd98-scwr9    1/1     Running            0          9h
certified-operators-gnh6c               1/1     Running            0          4h7m
community-operators-5c686               0/1     CrashLoopBackOff   117        9h
community-operators-77df6c68b7-jbdgb    1/1     Running            0          9h
community-operators-ddgwg               0/1     CrashLoopBackOff   116        9h
marketplace-operator-784d9f5896-64bt7   1/1     Running            0          9h
redhat-marketplace-5765ff97c-gfp59      1/1     Running            0          9h
redhat-marketplace-5lfs9                1/1     Running            0          3h50m
redhat-operators-58b4d5c978-qmqwd       1/1     Running            0          9h
redhat-operators-vrfln                  1/1     Running            0          9h

[root@ospamgrs3 ~]# oc logs community-operators-ddgwg -n openshift-marketplace
standard_init_linux.go:219: exec user process caused: exec format error

[root@ospamgrs3 ~]# oc logs community-operators-5c686 -n openshift-marketplace
standard_init_linux.go:219: exec user process caused: exec format error

[root@ospamgrs3 ~]# oc logs community-operators-77df6c68b7-jbdgb -n openshift-marketplace > community-operators-77df6c68b7-jbdgb.log
(log attached)
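The manual steps above (list the pods, then pull logs from each failing one) could be scripted roughly as follows. This is only a sketch: `crashlooping_pods` is a hypothetical helper name, and it assumes the default `oc get pods` column layout, in which STATUS is the third column.

```shell
# crashlooping_pods: hypothetical helper. Reads `oc get pods` output on stdin
# and prints the names of pods whose STATUS column is CrashLoopBackOff.
crashlooping_pods() {
  # NR > 1 skips the header row; $1 is NAME and $3 is STATUS.
  awk 'NR > 1 && $3 == "CrashLoopBackOff" { print $1 }'
}

# Example use on a live cluster (assumes a logged-in `oc` session):
# oc get pods -n openshift-marketplace | crashlooping_pods |
#   while read -r pod; do
#     echo "== $pod"
#     oc logs "$pod" -n openshift-marketplace --tail=5
#   done
```

An "exec format error" in the log of every pod listed would point to an architecture mismatch rather than a problem in the operator catalog itself.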
The same error is also observed on ppc64le as of October 2nd, 2020:

openshift-marketplace   community-operators-2dnhq   0/1   CrashLoopBackOff   7    16m
openshift-marketplace   community-operators-qrhpp   0/1   CrashLoopBackOff   11   37m

This error was seen on a 4.6 install as well as a 4.4.27 install.
This looks to be fixed for 4.6 s390x as of the latest nightlies (4.6.0-0.nightly-s390x-2020-10-08-182421 here):

[dorzel@rock-kvmlp-1 ocp4-workdir]$ oc get pods -n openshift-marketplace
NAME                                 READY   STATUS    RESTARTS   AGE
certified-operators-gf8c5            1/1     Running   0          45m
community-operators-vnwxt            1/1     Running   0          45m
marketplace-operator-8dd9598-9jcz8   1/1     Running   0          50m
redhat-marketplace-cvq7k             1/1     Running   0          45m
redhat-operators-vl724               1/1     Running   0          45m

[dorzel@rock-kvmlp-1 ocp4-workdir]$ oc get pod community-operators-vnwxt -n openshift-marketplace -o jsonpath={.spec.containers[0].image}
registry.redhat.io/redhat/community-operator-index:latest

It is my understanding that ppc64le should be fixed as well, since the manifest list is now working correctly for this image.
According to the latest z-stream testing, this error still occurs on the 4.4.28 nightly for ppc64le. It does not occur in the 4.5.z and 4.6.0-RC nightlies. The test results are here: https://docs.google.com/spreadsheets/d/1PuW0zyBg7moLIiXq8tQ0cFyz427NGx8R-cStv5Mt7ok/edit#gid=1433717023
Adding "UpcomingSprint" as the team will not get to this bug during this sprint.
Looking into the 4.4.28 ppc64le community operator errors, they do not seem to be related to this bug. There are no image pull errors or exec format errors that would indicate a missing build for ppc64le. In fact, there are no errors in the logs at all. The only thing that seems to be wrong is this message from the cluster events:

"Liveness probe failed: timeout: failed to connect service "localhost:50051" within 1s"

I'm getting the same error intermittently with other operator pods, which leads me to believe it is resource related. This is typical of my ppc64le cluster, which also reports fairly high API server latency. Maybe this is similar to what's going on in CI?

Since this is still resolved for 4.5/4.6, and the 4.4.28 failure (ppc64le only) does not appear to be this bug, I think this can be closed.