Description of problem:
ci-openshift-cluster-network-operator-master-e2e-agnostic-upgrade is failing most of the time:
https://prow.ci.openshift.org/job-history/origin-ci-test/pr-logs/directory/pull-ci-openshift-cluster-network-operator-master-e2e-agnostic-upgrade

Additional info:
For CNO, e2e-agnostic-upgrade is failing most of the time. The same is true for other projects, for example console-operator:
https://prow.ci.openshift.org/job-history/origin-ci-test/pr-logs/directory/pull-ci-openshift-console-operator-master-e2e-agnostic-upgrade

Others seem to be luckier:
https://prow.ci.openshift.org/job-history/origin-ci-test/pr-logs/directory/pull-ci-openshift-cluster-monitoring-operator-master-e2e-agnostic-upgrade
https://prow.ci.openshift.org/job-history/origin-ci-test/pr-logs/directory/pull-ci-openshift-cluster-etcd-operator-master-e2e-agnostic-upgrade
This e2e-agnostic-upgrade job is really just the e2e-azure-upgrade job [0]; I'm not sure why "agnostic" is used in the name. The periodic version of this job is also pretty unhealthy [1]. I've pinged the TRT team about this job [2] to see if they have any lead on its health. The first few jobs I looked at failed while bringing up initial resources, before any tests even ran.

If the infra is not very stable, I'd argue for switching these presubmit jobs from azure to aws to reduce the noise devs have to deal with on their PRs. Here's a PR [3] to do just that. If that's reasonable, please comment on the PR and add a /lgtm.

[0] https://github.com/openshift/release/blob/a3830da4426d5afb00765e809a1e8c8f6a48e422/ci-operator/config/openshift/cluster-network-operator/openshift-cluster-network-operator-release-4.11.yaml#L66-L69
[1] https://sippy.ci.openshift.org/sippy-ng/jobs/4.10/analysis?filters=%7B%22items%22%3A%5B%7B%22columnField%22%3A%22name%22%2C%22operatorValue%22%3A%22equals%22%2C%22value%22%3A%22periodic-ci-openshift-release-master-ci-4.10-e2e-azure-upgrade%22%7D%5D%7D
[2] https://coreos.slack.com/archives/C01CQA76KMX/p1647298980599429
[3] https://github.com/openshift/release/pull/26977
Marking verified, as the fix for this was to move the agnostic job from azure to aws. The job is now named e2e-aws-upgrade as well. The PR that does this was merged, and new PR checks are using the new job. Example: https://github.com/openshift/cluster-network-operator/pull/1339
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Important: OpenShift Container Platform 4.11.0 bug fix and security update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2022:5069