Bug 1991508 - ppc64le and s390x CI jobs are failing with exec format errors
Summary: ppc64le and s390x CI jobs are failing with exec format errors
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Multi-Arch
Version: 4.9
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ---
Target Release: 4.9.0
Assignee: Deep Mistry
QA Contact: Jeremy Poulin
URL:
Whiteboard:
Duplicates: 1991629
Depends On:
Blocks:
Reported: 2021-08-09 10:13 UTC by Stephen Benjamin
Modified: 2021-10-18 17:45 UTC (History)
CC List: 4 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
job=periodic-ci-openshift-multiarch-master-nightly-4.9-ocp-installer-remote-libvirt-ppc64le=all job=periodic-ci-openshift-multiarch-master-nightly-4.9-ocp-installer-remote-libvirt-s390x=all
Last Closed: 2021-10-18 17:45:29 UTC
Target Upstream Version:
Embargoed:
dmistry: needinfo+


Links
System                  ID                            Private  Priority  Status  Summary  Last Updated
Github                  openshift ci-tools pull 2237  0        None      None    None     2021-08-10 14:40:45 UTC
Red Hat Issue Tracker   MULTIARCH-1549                0        None      None    None     2021-08-09 10:16:24 UTC
Red Hat Product Errata  RHSA-2021:3759                0        None      None    None     2021-10-18 17:45:41 UTC

Description Stephen Benjamin 2021-08-09 10:13:58 UTC
periodic-ci-openshift-multiarch-master-nightly-4.9-ocp-e2e-compact-remote-libvirt-ppc64le
periodic-ci-openshift-multiarch-master-nightly-4.9-ocp-installer-remote-libvirt-s390x

are failing frequently in CI; see:
https://testgrid.k8s.io/redhat-openshift-ocp-release-4.9-informing#periodic-ci-openshift-multiarch-master-nightly-4.9-ocp-e2e-compact-remote-libvirt-ppc64le

Error:
INFO[2021-08-09T04:56:54Z] standard_init_linux.go:219: exec user process caused: exec format error
INFO[2021-08-09T04:58:17Z] Imported release 4.9.0-0.nightly-2021-08-07-175228 created at 2021-08-07 17:54:17 +0000 UTC with 141 images to tag release:latest
INFO[2021-08-09T04:58:17Z] Ran for 1m36s
ERRO[2021-08-09T04:58:17Z] Some steps failed:
ERRO[2021-08-09T04:58:17Z]
  * could not run steps: step [release:s390x-latest] failed: failed to get CLI image: unable to find the 'cli' image in the provided release image: the pod ci-op-53q5qixg/release-images-s390x-latest-cli failed after 9s (failed containers: release): ContainerFailed one or more containers exited
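
For context, the kernel reports "exec format error" (ENOEXEC) when it is asked to run a binary it cannot execute, most commonly one built for a different CPU architecture. The following is a minimal sketch, assuming a binary has been extracted from the suspect image, of how the ELF header reveals the architecture it was built for. The names elf_machine and binary_arch are illustrative only and are not part of any CI tooling.

```python
import struct

# e_machine values defined by the ELF specification (subset).
EM_PPC64, EM_S390, EM_X86_64, EM_AARCH64 = 21, 22, 62, 183


def elf_machine(header: bytes) -> str:
    """Name the architecture recorded in the first 20 bytes of an ELF file."""
    if len(header) < 20 or header[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    is_64bit = header[4] == 2        # EI_CLASS: 1 = 32-bit, 2 = 64-bit
    little_endian = header[5] == 1   # EI_DATA: 1 = little-endian, 2 = big-endian
    # e_machine is a 16-bit field at offset 18, stored in the file's byte order.
    fmt = "<H" if little_endian else ">H"
    (machine,) = struct.unpack_from(fmt, header, 18)
    if machine == EM_PPC64:
        return "ppc64le" if little_endian else "ppc64"
    if machine == EM_S390:
        return "s390x" if is_64bit else "s390"
    names = {EM_X86_64: "x86_64", EM_AARCH64: "aarch64"}
    return names.get(machine, f"unknown({machine})")


def binary_arch(path: str) -> str:
    """Read just enough of the file at `path` to classify it."""
    with open(path, "rb") as f:
        return elf_machine(f.read(20))
```

Comparing the result with the architecture of the node that ran the pod would show whether an architecture mismatch is the cause.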

Comment 1 Deep Mistry 2021-08-09 13:18:52 UTC
Can you point to a run of the periodic-ci-openshift-multiarch-master-nightly-4.9-ocp-e2e-compact-remote-libvirt-ppc64le job that failed with a similar issue?

Can you confirm whether the failing ppc64le job is https://prow.ci.openshift.org/job-history/gs/origin-ci-test/logs/periodic-ci-openshift-multiarch-master-nightly-4.9-ocp-installer-remote-libvirt-ppc64le ?

Comment 3 brad.williams 2021-08-09 15:38:39 UTC
The *s390x* job that is successful (https://prow.ci.openshift.org/view/gs/origin-ci-test/logs/periodic-ci-openshift-multiar[…]4.9-ocp-installer-remote-libvirt-s390x/1424421723737952256) is pulling *BOTH* the s390x and ppc64le images...


INFO[2021-08-08T17:26:22Z] Resolved release latest to registry.ci.openshift.org/ocp/release:4.9.0-0.nightly-2021-08-07-175228 
INFO[2021-08-08T17:26:22Z] Resolved release ppc64le-initial to registry.ci.openshift.org/ocp-ppc64le/release-ppc64le:4.9.0-0.nightly-ppc64le-2021-08-07-155716 
INFO[2021-08-08T17:26:22Z] Resolved release ppc64le-latest to registry.ci.openshift.org/ocp-ppc64le/release-ppc64le:4.9.0-0.nightly-ppc64le-2021-08-07-172251 
INFO[2021-08-08T17:26:22Z] Resolved release s390x-initial to registry.ci.openshift.org/ocp-s390x/release-s390x:4.9.0-0.nightly-s390x-2021-08-07-155712 
INFO[2021-08-08T17:26:22Z] Resolved release s390x-latest to registry.ci.openshift.org/ocp-s390x/release-s390x:4.9.0-0.nightly-s390x-2021-08-07-172256 
INFO[2021-08-08T17:26:22Z] Resolved release initial to registry.ci.openshift.org/ocp/release:4.9.0-0.nightly-2021-08-06-170119 


The *s390x* jobs that are failing (https://prow.ci.openshift.org/view/gs/origin-ci-test/logs/periodic-ci-openshift-multiar[…]4.9-ocp-installer-remote-libvirt-s390x/1424595440749252608) are only pulling the *ppc64le* images...

INFO[2021-08-09T04:56:41Z] Resolved release initial to registry.ci.openshift.org/ocp/release:4.9.0-0.nightly-2021-08-06-170119 
INFO[2021-08-09T04:56:41Z] Resolved release latest to registry.ci.openshift.org/ocp/release:4.9.0-0.nightly-2021-08-07-175228 
INFO[2021-08-09T04:56:41Z] Resolved release ppc64le-initial to registry.ci.openshift.org/ocp-ppc64le/release-ppc64le:4.9.0-0.nightly-ppc64le-2021-08-07-155716 
INFO[2021-08-09T04:56:41Z] Resolved release ppc64le-latest to registry.ci.openshift.org/ocp-ppc64le/release-ppc64le:4.9.0-0.nightly-ppc64le-2021-08-07-172251 

I checked the release-controller logic and it doesn't appear to have crossed the streams (s390x <-> ppc64le) anywhere.

I have also verified that the *ppc64le* failures are caused by the same issue, except they are pulling the *s390x* images.
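
The comparison above can be reproduced mechanically. This is a minimal sketch, assuming the ci-operator log keeps the "Resolved release <name> to <pull spec>" line format quoted in this comment. The helper names resolved_releases and has_release_for are hypothetical, and the sample text is the failing s390x log excerpt from above.

```python
import re

# Matches lines such as:
#   INFO[...] Resolved release s390x-latest to registry.../release-s390x:<tag>
RESOLVED = re.compile(r"Resolved release (\S+) to (\S+)")


def resolved_releases(log_text):
    """Return {release name: pull spec} for every 'Resolved release' line."""
    return dict(RESOLVED.findall(log_text))


def has_release_for(log_text, arch):
    """True if any resolved release name starts with '<arch>-'."""
    return any(name.startswith(arch + "-") for name in resolved_releases(log_text))


# Excerpt from the failing s390x run quoted in this comment.
FAILING_S390X_LOG = """\
INFO[2021-08-09T04:56:41Z] Resolved release initial to registry.ci.openshift.org/ocp/release:4.9.0-0.nightly-2021-08-06-170119
INFO[2021-08-09T04:56:41Z] Resolved release latest to registry.ci.openshift.org/ocp/release:4.9.0-0.nightly-2021-08-07-175228
INFO[2021-08-09T04:56:41Z] Resolved release ppc64le-initial to registry.ci.openshift.org/ocp-ppc64le/release-ppc64le:4.9.0-0.nightly-ppc64le-2021-08-07-155716
INFO[2021-08-09T04:56:41Z] Resolved release ppc64le-latest to registry.ci.openshift.org/ocp-ppc64le/release-ppc64le:4.9.0-0.nightly-ppc64le-2021-08-07-172251
"""

if __name__ == "__main__":
    for arch in ("ppc64le", "s390x"):
        status = "resolved" if has_release_for(FAILING_S390X_LOG, arch) else "MISSING"
        print(f"{arch}: {status}")
```

Against the failing s390x run, the check finds ppc64le releases but no s390x ones, which matches the observation in this comment.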

Comment 5 Deep Mistry 2021-08-10 13:17:12 UTC
*** Bug 1991629 has been marked as a duplicate of this bug. ***

Comment 7 Dan Li 2021-08-10 14:21:44 UTC
Setting "Blocker-" after a chat with Deep.

Comment 8 Dan Li 2021-08-10 16:38:20 UTC
Hi Deep, do you think this bug will reach "ON_QA" by the end of this sprint (August 14th)? If not, we might want to add the "reviewed-in-sprint" flag.

Comment 12 errata-xmlrpc 2021-10-18 17:45:29 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.9.0 bug fix and security update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2021:3759

