Bug 1752132 - [GCP] e2e failure:[Feature:Builds][webhook] TestWebhook [Suite:openshift/conformance/parallel]
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Build
Version: 4.2.0
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ---
Target Release: 4.2.0
Assignee: Corey Daley
QA Contact: wewang
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2019-09-13 19:37 UTC by David Eads
Modified: 2019-10-16 06:41 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2019-10-16 06:41:11 UTC
Target Upstream Version:
Embargoed:



Links
System ID Private Priority Status Summary Last Updated
Github openshift origin pull 23803 0 None closed Bug 1752132: Adding By statements for Webhook Test 2020-09-27 07:25:37 UTC
Red Hat Product Errata RHBA-2019:2922 0 None None None 2019-10-16 06:41:23 UTC

Description David Eads 2019-09-13 19:37:46 UTC
[Feature:Builds][webhook] TestWebhook [Suite:openshift/conformance/parallel] failed in four of the last ten runs on https://testgrid.k8s.io/redhat-openshift-release-informing#redhat-canary-openshift-ocp-installer-e2e-gcp-4.2&sort-by-failures= . To meet our objectives, the overall failure rate must be at most 1/4; four failures in ten runs exceeds that.

Link to a job showing the failure: https://prow.k8s.io/view/gcs/origin-ci-test/logs/canary-openshift-ocp-installer-e2e-gcp-4.2/260

Comment 1 Corey Daley 2019-09-13 21:04:51 UTC
From what I can find in the logs, the actual error is not bubbling up through the WaitForAccessAllowed call in the "adding to binding" step of this test. Creating the binding itself appears to complete, but possibly either 1.) the permissions are not being reloaded, or 2.) the cache is not being updated to reflect the newly created RoleBinding.

It may be worth adding additional logging here, or maybe someone can point me to where the additional logs are located if they are already being written somewhere?

Comment 2 Corey Daley 2019-09-13 21:06:34 UTC
Seems that I can't edit my previous comment ...

It also seems that this is not really an issue with the Build component but some kind of Auth issue?

Comment 3 Corey Daley 2019-09-16 13:58:52 UTC
Out of the last 14 runs, there were three failures that were unrelated to this bugzilla.

The last three failures were caused by:
#280  level=fatal msg="Bootstrap failed to complete: failed to wait for bootstrapping to complete: timed out waiting for the condition" 
#279 level=error msg="Error: Error waiting to create Image: Error waiting for Creating Image: timeout while waiting for state to become 'DONE' (last state: 'RUNNING', timeout: 4m0s)"
#275  fail [k8s.io/kubernetes/test/e2e/framework/framework.go:338]: Sep 15 14:34:54.275: Couldn't delete ns: "e2e-test-build-webhooks-ht54s": namespace e2e-test-build-webhooks-ht54s was not deleted with limit: timed out waiting for the condition, namespace is empty but is not yet removed (&errors.errorString{s:"namespace e2e-test-build-webhooks-ht54s was not deleted with limit: timed out waiting for the condition, namespace is empty but is not yet removed"})

Comment 4 Corey Daley 2019-09-16 15:10:41 UTC
OK, ignore all of that. The real issue is:
https://prow.k8s.io/view/gcs/origin-ci-test/logs/canary-openshift-ocp-installer-e2e-gcp-4.2/260#0:build-log.txt%3A7194

 Sep 13 16:17:16.745: INFO: Couldn't delete ns: "e2e-test-build-webhooks-dbqzv": namespace e2e-test-build-webhooks-dbqzv was not deleted with limit: timed out waiting for the condition, namespace is empty but is not yet removed (&errors.errorString{s:"namespace e2e-test-build-webhooks-dbqzv was not deleted with limit: timed out waiting for the condition, namespace is empty but is not yet removed"})
...
fail [k8s.io/kubernetes/test/e2e/framework/framework.go:338]: Sep 13 16:17:16.745: Couldn't delete ns: "e2e-test-build-webhooks-dbqzv": namespace e2e-test-build-webhooks-dbqzv was not deleted with limit: timed out waiting for the condition, namespace is empty but is not yet removed (&errors.errorString{s:"namespace e2e-test-build-webhooks-dbqzv was not deleted with limit: timed out waiting for the condition, namespace is empty but is not yet removed"})

Comment 6 wewang 2019-09-19 02:22:04 UTC
Verified in version:
4.2.0-0.nightly-2019-09-19-004703

https://prow.svc.ci.openshift.org/view/gcs/origin-ci-test/logs/canary-openshift-ocp-installer-e2e-gcp-4.2/337

Comment 7 errata-xmlrpc 2019-10-16 06:41:11 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2019:2922

