1705100 – [ci] e2e-aws-operator flakes

Summary: [ci] e2e-aws-operator flakes

Keywords:
Status:	CLOSED ERRATA
Alias:	None
Product:	OpenShift Container Platform
Classification:	Red Hat
Component:	Networking
Sub Component:
Version:	4.1.0
Hardware:	Unspecified
OS:	Unspecified
Priority:	urgent
Severity:	urgent
Target Milestone:	---
Target Release:	4.1.0
Assignee:	Miciah Dashiel Butler Masters
QA Contact:	Hongan Li
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:

Reported:	2019-05-01 13:31 UTC by Dan Mace
Modified:	2022-08-04 22:24 UTC
CC List:	2 users
Fixed In Version:
Doc Type:	No Doc Update
Doc Text:
Clone Of:
Environment:
Last Closed:	2019-06-04 10:48:19 UTC
Target Upstream Version:
Embargoed:

Links
System	ID	Priority	Status	Summary	Last Updated
Github	openshift cluster-ingress-operator pull 223	None	closed	Bug 1705100: TestIngressControllerUpdate: Re-get between updates	2020-08-11 21:20:53 UTC
Github	openshift cluster-ingress-operator pull 225	None	closed	Bug 1705100: TestIngressControllerScale: Re-get between updates	2020-08-11 21:20:53 UTC
Red Hat Product Errata	RHBA-2019:0758	None	None	None	2019-06-04 10:48:27 UTC

Description Dan Mace 2019-05-01 13:31:05 UTC

Description of problem:

The ingress-operator e2e test flake is happening more frequently and is blocking PRs.

https://openshift-gce-devel.appspot.com/build/origin-ci-test/pr-logs/pull/openshift_cluster-ingress-operator/222/pull-ci-openshift-cluster-ingress-operator-master-e2e-aws-operator/841


https://search.svc.ci.openshift.org/?search=TestIngressControllerUpdate&maxAge=168h&context=2&type=all


--- FAIL: TestIngressControllerUpdate (12.79s)
	operator_test.go:367: failed to reset IngressController: Operation cannot be fulfilled on ingresscontrollers.operator.openshift.io "default": the object has been modified; please apply your changes to the latest version and try again
	operator_test.go:381: failed to get recreated CA certificate configmap: timed out waiting for the condition

=== RUN   TestRouterCACertificate
--- FAIL: TestRouterCACertificate (11.60s)
	operator_test.go:595: failed to get CA certificate: timed out waiting for the condition


Comment 1 Miciah Dashiel Butler Masters 2019-05-01 19:10:04 UTC

I believe TestIngressControllerUpdate fails on its second update to the ingress controller because, between the test's two updates, something else (possibly the status sync code) updates the ingress controller, which changes its resource version. The test's second update then carries a stale resource version and is rejected with the "object has been modified" error shown above.

The solution is to re-get the resource before the second update.
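
To illustrate the re-get approach, here is a minimal, self-contained Go sketch. It does not use the real operator or client code: the `store`, `object`, `setReplicas`, and `errConflict` names are hypothetical stand-ins that only mimic resource-version checking on update. It shows an update with a stale copy being rejected, and an update that re-gets first succeeding.

```go
package main

import (
	"errors"
	"fmt"
)

// errConflict is a hypothetical stand-in for the API server's conflict error.
var errConflict = errors.New("the object has been modified")

// object is a hypothetical stand-in for an IngressController resource.
type object struct {
	resourceVersion int
	replicas        int
}

// store is a hypothetical in-memory stand-in for the API server.
type store struct {
	current object
}

// get returns a copy of the stored object, including its current resource version.
func (s *store) get() object {
	return s.current
}

// update applies obj only if its resource version matches the stored one,
// then bumps the stored resource version.
func (s *store) update(obj object) error {
	if obj.resourceVersion != s.current.resourceVersion {
		return errConflict
	}
	obj.resourceVersion++
	s.current = obj
	return nil
}

// setReplicas re-gets the object immediately before updating it, so the
// update carries the latest resource version.
func setReplicas(s *store, replicas int) error {
	latest := s.get()
	latest.replicas = replicas
	return s.update(latest)
}

func main() {
	s := &store{current: object{resourceVersion: 1, replicas: 1}}

	// The test fetches the object once and keeps the copy around.
	stale := s.get()

	// Something else (for example, a status sync) updates the object in between.
	if err := setReplicas(s, 2); err != nil {
		fmt.Println("unexpected error:", err)
		return
	}

	// Reusing the stale copy is rejected because its resource version is old.
	stale.replicas = 3
	fmt.Println("update with stale copy:", s.update(stale))

	// Re-getting before the update picks up the new resource version.
	fmt.Println("update after re-get:", setReplicas(s, 3))
}
```

In a test that uses client-go, the `retry.RetryOnConflict` helper from `k8s.io/client-go/util/retry` is one way to wrap a get-then-update step so that a conflict triggers another attempt. Whether the linked PR uses that helper is not stated in this report.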

TestRouterCACertificate fails because TestIngressControllerUpdate fails before restoring the original default certificate secret reference.  Fixing the failure in TestIngressControllerUpdate should prevent the failure in TestRouterCACertificate.

PR: https://github.com/openshift/cluster-ingress-operator/pull/223

Comment 3 Miciah Dashiel Butler Masters 2019-05-02 16:11:19 UTC

We had a similar problem in TestIngressControllerScale, which should also be fixed now.

PR: https://github.com/openshift/cluster-ingress-operator/pull/225

Comment 4 Hongan Li 2019-05-08 06:36:05 UTC

I'm going to mark this as verified since no similar issue has appeared in recent CI tests.

Comment 6 errata-xmlrpc 2019-06-04 10:48:19 UTC

Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2019:0758
