1805821 – [build-cop] extended test failure: Unexpected error in oauth ldap

Keywords:
Status:	CLOSED ERRATA
Alias:	None
Product:	OpenShift Container Platform
Classification:	Red Hat
Component:	apiserver-auth
Sub Component:
Version:	4.3.0
Hardware:	Unspecified
OS:	Unspecified
Priority:	medium
Severity:	medium
Target Milestone:	---
Target Release:	4.5.0
Assignee:	Venkata Siva Teja Areti
QA Contact:	pmali
Docs Contact:
URL:
Whiteboard:
Duplicates (1):	1812186
Depends On:
Blocks:

Reported:	2020-02-21 15:31 UTC by Venkata Siva Teja Areti
Modified:	2020-07-13 17:17 UTC
CC List:	6 users
Fixed In Version:
Doc Type:	No Doc Update
Doc Text:
Clone Of:
Environment:
Last Closed:	2020-07-13 17:16:59 UTC
Target Upstream Version:
Embargoed:

Links
System	ID	Private	Priority	Status	Summary	Last Updated
Github	openshift origin pull 24597	0	None	closed	Bug 1805821: change restart policy of command pod to RestartPolicyOnFailure	2020-09-09 04:57:27 UTC
Red Hat Product Errata	RHBA-2020:2409	0	None	None	None	2020-07-13 17:17:19 UTC

Description Venkata Siva Teja Areti 2020-02-21 15:31:39 UTC

Description of problem:

An extended oauth ldap test has failed 26 times in the last 14 days:

https://search.svc.ci.openshift.org/?search=errours+encountered+trying+to+run+ldapsearch+pod&maxAge=336h&context=2&type=all

There are two kinds of failures:

https://prow.svc.ci.openshift.org/view/gcs/origin-ci-test/logs/release-openshift-ocp-installer-e2e-azure-ovn-4.3/516

fail [github.com/openshift/origin/test/extended/oauth/oauth_ldap.go:53]: Unexpected error:
    <*errors.errorString | 0xc000f841f0>: {
        s: "errours encountered trying to run ldapsearch pod: [error waiting for the pod 'runonce-ldapsearch-pod' to complete: timed out waiting for the condition]",
    }
    errours encountered trying to run ldapsearch pod: [error waiting for the pod 'runonce-ldapsearch-pod' to complete: timed out waiting for the condition]
occurred

https://prow.svc.ci.openshift.org/view/gcs/origin-ci-test/logs/release-openshift-ocp-installer-e2e-metal-serial-4.3/937

fail [github.com/openshift/origin/test/extended/oauth/groupsync.go:32]: Unexpected error:
    <*errors.errorString | 0xc002f39070>: {
        s: "errours encountered trying to run ldapsearch pod: [command pod runonce-ldapsearch-pod did not complete: Get https://147.75.102.37:10250/containerLogs/e2e-test-ldap-group-sync-5cmcx/runonce-ldapsearch-pod/runonce-ldapsearch-pod: remote error: tls: internal error]",
    }
    errours encountered trying to run ldapsearch pod: [command pod runonce-ldapsearch-pod did not complete: Get https://147.75.102.37:10250/containerLogs/e2e-test-ldap-group-sync-5cmcx/runonce-ldapsearch-pod/runonce-ldapsearch-pod: remote error: tls: internal error]
occurred

Comment 1 Venkata Siva Teja Areti 2020-02-21 21:40:12 UTC

https://search.svc.ci.openshift.org/?search=The+specified+key+does+not+exist&maxAge=6h&context=2&type=all

Comment 5 Venkata Siva Teja Areti 2020-05-08 14:39:24 UTC

As of today, a number of failures show LDAP client pod errors.

Error logs from the client pod:

May  6 17:32:59.580: INFO: runonce-ldapsearch-pod[e2e-test-oauth-ldap-2gzs5].container[runonce-ldapsearch-pod].log
ldap_start_tls: Can't contact LDAP server (-1)
ldap_sasl_bind(SIMPLE): Can't contact LDAP server (-1)

The pod listing for the namespace shows the server pod Running and Ready, while the client pod has Failed:

May  6 17:32:59.479: INFO: POD                               NODE                               PHASE    GRACE  CONDITIONS
May  6 17:32:59.479: INFO: openldap-server-7f479cc6f5-v7lq6  qvp04xw3-3c054-8dfh6-worker-ttm74  Running         [{Initialized True 0001-01-01 00:00:00 +0000 UTC 2020-05-06 17:32:37 +0000 UTC  } {Ready True 0001-01-01 00:00:00 +0000 UTC 2020-05-06 17:32:53 +0000 UTC  } {ContainersReady True 0001-01-01 00:00:00 +0000 UTC 2020-05-06 17:32:53 +0000 UTC  } {PodScheduled True 0001-01-01 00:00:00 +0000 UTC 2020-05-06 17:32:37 +0000 UTC  }]
May  6 17:32:59.480: INFO: runonce-ldapsearch-pod            qvp04xw3-3c054-8dfh6-worker-26gxz  Failed          [{Initialized True 0001-01-01 00:00:00 +0000 UTC 2020-05-06 17:32:54 +0000 UTC  } {Ready False 0001-01-01 00:00:00 +0000 UTC 2020-05-06 17:32:59 +0000 UTC ContainersNotReady containers with unready status: [runonce-ldapsearch-pod]} {ContainersReady False 0001-01-01 00:00:00 +0000 UTC 2020-05-06 17:32:59 +0000 UTC ContainersNotReady containers with unready status: [runonce-ldapsearch-pod]} {PodScheduled True 0001-01-01 00:00:00 +0000 UTC 2020-05-06 17:32:54 +0000 UTC  }]

There may be a race condition between the client and the server: the client pod may start ldapsearch before the LDAP server accepts connections. I updated the PR with a fix.
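
The PR linked to this bug changes the restart policy of the command pod to RestartPolicyOnFailure. The following is a minimal Python simulation of why restarting on failure can tolerate a start-up race like this one. It assumes the client fails only because the server is not reachable yet. The names `run_with_restart_on_failure` and `FlakyLdapClient` are hypothetical and exist only for illustration; this is not code from openshift/origin, and the kubelet's real restart back-off is not modeled.

```python
import time
from typing import Callable, Tuple


def run_with_restart_on_failure(
    run_once: Callable[[], int],
    max_restarts: int = 5,
    backoff_seconds: float = 0.0,
) -> Tuple[int, int]:
    """Rerun `run_once` until it exits 0 or the restart budget is spent.

    Returns (last_exit_code, total_attempts). This loosely mirrors an
    OnFailure restart policy: a non-zero exit triggers another run, a
    zero exit ends the loop.
    """
    attempts = 0
    while True:
        attempts += 1
        exit_code = run_once()
        # Stop on success, or once the initial run plus all restarts are used.
        if exit_code == 0 or attempts > max_restarts:
            return exit_code, attempts
        time.sleep(backoff_seconds)


class FlakyLdapClient:
    """Hypothetical client that fails until the simulated server is up."""

    def __init__(self, server_ready_after: int) -> None:
        self.calls = 0
        # Number of attempts that fail before the server becomes reachable.
        self.server_ready_after = server_ready_after

    def __call__(self) -> int:
        self.calls += 1
        # Non-zero stands in for "Can't contact LDAP server".
        return 0 if self.calls > self.server_ready_after else 1


if __name__ == "__main__":
    client = FlakyLdapClient(server_ready_after=2)
    code, attempts = run_with_restart_on_failure(client, max_restarts=5)
    print(f"exit code {code} after {attempts} attempt(s)")
```

With a run-once policy the first failed attempt is final, which matches the Failed client pod in the listing above; with restart-on-failure the same client eventually succeeds once the server is reachable, as long as the restart budget is not exhausted first.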

Comment 6 Venkata Siva Teja Areti 2020-05-08 14:40:56 UTC

The logs in the previous comment came from this CI failure:

https://prow.svc.ci.openshift.org/view/gcs/origin-ci-test/logs/release-openshift-ocp-installer-e2e-openstack-4.5/670

Comment 10 Venkata Siva Teja Areti 2020-05-11 17:06:57 UTC

*** Bug 1812186 has been marked as a duplicate of this bug. ***

Comment 13 errata-xmlrpc 2020-07-13 17:16:59 UTC

Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:2409