Bug 1705100 - [ci] e2e-aws-operator flakes
Summary: [ci] e2e-aws-operator flakes
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Networking
Version: 4.1.0
Hardware: Unspecified
OS: Unspecified
Priority: urgent
Severity: urgent
Target Milestone: ---
Target Release: 4.1.0
Assignee: Miciah Dashiel Butler Masters
QA Contact: Hongan Li
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2019-05-01 13:31 UTC by Dan Mace
Modified: 2022-08-04 22:24 UTC (History)

Fixed In Version:
Doc Type: No Doc Update
Doc Text:
Clone Of:
Environment:
Last Closed: 2019-06-04 10:48:19 UTC
Target Upstream Version:
Embargoed:


Links
System ID Private Priority Status Summary Last Updated
Github openshift cluster-ingress-operator pull 223 0 None closed Bug 1705100: TestIngressControllerUpdate: Re-get between updates 2020-08-11 21:20:53 UTC
Github openshift cluster-ingress-operator pull 225 0 None closed Bug 1705100: TestIngressControllerScale: Re-get between updates 2020-08-11 21:20:53 UTC
Red Hat Product Errata RHBA-2019:0758 0 None None None 2019-06-04 10:48:27 UTC

Description Dan Mace 2019-05-01 13:31:05 UTC
Description of problem:

The ingress-operator e2e tests are flaking more frequently, which is blocking PRs.

https://openshift-gce-devel.appspot.com/build/origin-ci-test/pr-logs/pull/openshift_cluster-ingress-operator/222/pull-ci-openshift-cluster-ingress-operator-master-e2e-aws-operator/841


https://search.svc.ci.openshift.org/?search=TestIngressControllerUpdate&maxAge=168h&context=2&type=all


--- FAIL: TestIngressControllerUpdate (12.79s)
	operator_test.go:367: failed to reset IngressController: Operation cannot be fulfilled on ingresscontrollers.operator.openshift.io "default": the object has been modified; please apply your changes to the latest version and try again
	operator_test.go:381: failed to get recreated CA certificate configmap: timed out waiting for the condition

=== RUN   TestRouterCACertificate
--- FAIL: TestRouterCACertificate (11.60s)
	operator_test.go:595: failed to get CA certificate: timed out waiting for the condition


Version-Release number of selected component (if applicable):


How reproducible:


Steps to Reproduce:
1.
2.
3.

Actual results:


Expected results:


Additional info:

Comment 1 Miciah Dashiel Butler Masters 2019-05-01 19:10:04 UTC
I believe TestIngressControllerUpdate fails on its second update to the ingress controller because, between the test's two updates, something else (possibly the status sync code) updates the ingress controller, which changes its resource version. The test then submits its second update with a stale resource version, and the API server rejects it with the "object has been modified" error.

The solution is to re-get the resource before the second update.

TestRouterCACertificate fails because TestIngressControllerUpdate fails before restoring the original default certificate secret reference.  Fixing the failure in TestIngressControllerUpdate should prevent the failure in TestRouterCACertificate.

PR: https://github.com/openshift/cluster-ingress-operator/pull/223

Comment 3 Miciah Dashiel Butler Masters 2019-05-02 16:11:19 UTC
We had a similar problem in TestIngressControllerScale, which should also be fixed now.

PR: https://github.com/openshift/cluster-ingress-operator/pull/225

Comment 4 Hongan Li 2019-05-08 06:36:05 UTC
I'm marking this as verified, since no similar failures have appeared in recent CI test runs.

Comment 6 errata-xmlrpc 2019-06-04 10:48:19 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2019:0758

