Bug 2030972 - TestAdminAck should succeed: vulnerable to API-server hiccups
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Test Framework
Version: 4.8
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: low
Target Milestone: ---
Target Release: 4.11.0
Assignee: W. Trevor King
QA Contact:
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2021-12-10 06:50 UTC by W. Trevor King
Modified: 2022-07-25 04:31 UTC (History)

Fixed In Version:
Doc Type: No Doc Update
Doc Text:
Clone Of:
Environment:
Last Closed: 2022-04-27 03:01:16 UTC
Target Upstream Version:
Embargoed:


Links
System ID Private Priority Status Summary Last Updated
Github openshift origin pull 27041 0 None open Bug 2030972: test/extended/util/openshift/clusterversionoperator: Survive API hiccups 2022-04-21 19:30:50 UTC

Description W. Trevor King 2021-12-10 06:50:36 UTC
The test case:

  [sig-cluster-lifecycle] TestAdminAck should succeed [Suite:openshift/conformance/parallel]

is vulnerable to brief API-server hiccups like [1]:

  Dec  6 00:00:10.440: FAIL: Error accessing configmap openshift-config-managed/admin-gates: Get "https://api.ci-op-g2m38jp7-eafe9.origin-ci-int-aws.dev.rhcloud.com:6443/api/v1/namespaces/openshift-config-managed/configmaps/admin-gates": dial tcp: lookup api.ci-op-g2m38jp7-eafe9.origin-ci-int-aws.dev.rhcloud.com on 172.30.0.10:53: no such host

and [2]:

  Dec  9 19:53:20.747: FAIL: Error accessing configmap openshift-config-managed/admin-gates: Get "https://api.ci-op-w5q90zpi-9278e.origin-ci-int-aws.dev.rhcloud.com:6443/api/v1/namespaces/openshift-config-managed/configmaps/admin-gates": dial tcp 100.21.251.165:6443: i/o timeout

We should make those errors non-fatal.  Logging the error and then bailing out to wait for the next poll round might work, but we want to ensure that at least one round actually succeeds, so we don't claim "success" if all our attempts were "I couldn't actually connect to the Kube API-server to check".

[1]: https://bugzilla.redhat.com/show_bug.cgi?id=2026806#c8
[2]: https://bugzilla.redhat.com/show_bug.cgi?id=2027929#c1

Comment 2 W. Trevor King 2022-02-04 00:47:53 UTC
Moving back to NEW, because I haven't had time to work on it.  Leaving myself in as the assignee, because I don't want to dump fixing this on the Test Framework folks.

