
Bug 1862322

Summary:	[sig-operator] an end user can use OLM can subscribe to the operator: Timed out after 300.000s
Product:	OpenShift Container Platform	Reporter:	W. Trevor King <wking>
Component:	OLM	Assignee:	Evan Cordell <ecordell>
OLM sub component:	OLM	QA Contact:	Jian Zhang <jiazha>
Status:	CLOSED DUPLICATE	Docs Contact:
Severity:	urgent
Priority:	urgent	CC:	nhale, sttts
Version:	4.6
Target Milestone:	---
Target Release:	4.6.0
Hardware:	Unspecified
OS:	Unspecified
Whiteboard:
Fixed In Version:		Doc Type:	If docs needed, set a value
Doc Text:		Story Points:	---
Clone Of:		Environment:	[sig-operator] an end user can use OLM can subscribe to the operator
Last Closed:	2020-08-06 12:07:11 UTC	Type:	Bug
Regression:	---	Mount Type:	---
Documentation:	---	CRM:
Verified Versions:		Category:	---
oVirt Team:	---	RHEL 7.3 requirements from Atomic Host:
Cloudforms Team:	---	Target Upstream Version:
Embargoed:

Description W. Trevor King 2020-07-31 03:51:25 UTC

The test:
[sig-operator] an end user can use OLM can subscribe to the operator

is failing frequently in CI; see the search results:
https://search.svc.ci.openshift.org/?maxAge=168h&context=1&type=bug%2Bjunit&name=&maxMatches=5&maxBytes=20971520&groupBy=job&search=%5C%5Bsig-operator%5C%5D+an+end+user+can+use+OLM+can+subscribe+to+the+operator

For example:

$ w3m -dump -cols 200 'https://search.ci.openshift.org/?maxAge=12h&type=junit&search=%5C%5Bsig-operator%5C%5D+an+end+user+can+use+OLM+can+subscribe+to+the+operator' | grep 'failures match' | sort
promote-release-openshift-machine-os-content-e2e-aws-4.6 - 15 runs, 80% failed, 42% of failures match
pull-ci-cri-o-cri-o-master-e2e-aws - 49 runs, 82% failed, 10% of failures match
pull-ci-openshift-cloud-credential-operator-master-e2e-aws - 5 runs, 80% failed, 50% of failures match
...
pull-ci-operator-framework-operator-lifecycle-manager-master-e2e-gcp - 20 runs, 75% failed, 80% of failures match
pull-ci-operator-framework-operator-registry-master-e2e-aws - 7 runs, 86% failed, 33% of failures match
rehearse-10377-pull-ci-openshift-cluster-monitoring-operator-release-4.6-e2e - 1 runs, 100% failed, 100% of failures match
rehearse-8108-pull-ci-openshift-openshift-ansible-master-e2e-aws-scaleup-rhel7 - 5 runs, 60% failed, 67% of failures match
release-openshift-ocp-e2e-aws-scaleup-rhel7-4.6 - 2 runs, 100% failed, 100% of failures match
release-openshift-ocp-installer-e2e-aws-fips-4.6 - 2 runs, 100% failed, 50% of failures match
release-openshift-ocp-installer-e2e-aws-ovn-4.6 - 2 runs, 100% failed, 50% of failures match
release-openshift-ocp-installer-e2e-azure-4.6 - 2 runs, 100% failed, 50% of failures match
release-openshift-ocp-installer-e2e-gcp-4.6 - 2 runs, 100% failed, 50% of failures match
release-openshift-ocp-installer-e2e-gcp-ovn-4.6 - 2 runs, 100% failed, 50% of failures match
release-openshift-origin-installer-e2e-azure-4.6 - 15 runs, 87% failed, 31% of failures match
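
To compare jobs at a glance, the summary lines above can be parsed and ranked. The following is a minimal Python sketch, not part of the original report. The helper names (parse_summary, rank_by_impact) and the "impact" figure are hypothetical, introduced only for illustration. It assumes each line has the exact "<job> - N runs, X% failed, Y% of failures match" shape shown above, and because the percentages in that output are already rounded, the derived figure is only approximate.

```python
import re

# Matches lines such as:
#   "some-job - 15 runs, 80% failed, 42% of failures match"
# The format is assumed from the search output quoted in this report.
LINE_RE = re.compile(
    r"^(?P<job>\S+) - (?P<runs>\d+) runs, "
    r"(?P<failed>\d+)% failed, (?P<match>\d+)% of failures match$"
)


def parse_summary(line):
    """Parse one summary line; return None for non-matching lines such as '...'."""
    m = LINE_RE.match(line.strip())
    if m is None:
        return None
    failed_pct = int(m["failed"])
    match_pct = int(m["match"])
    return {
        "job": m["job"],
        "runs": int(m["runs"]),
        "failed_pct": failed_pct,
        "match_pct": match_pct,
        # Approximate share of all runs that hit this failure
        # (hypothetical metric; inputs are rounded percentages).
        "impact_pct": failed_pct * match_pct / 100,
    }


def rank_by_impact(lines):
    """Return parsed summaries sorted by approximate impact, highest first."""
    parsed = [p for p in map(parse_summary, lines) if p is not None]
    return sorted(parsed, key=lambda p: p["impact_pct"], reverse=True)
```

Feeding the grep output line by line into rank_by_impact puts the jobs where nearly every run hits this failure at the top and silently skips the elided "..." line.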

Looks like this test is failing across the board on 4.6. An example release job is [1], which dies with:

STEP: Found 0 events.
Jul 31 02:01:33.785: INFO: POD  NODE  PHASE  GRACE  CONDITIONS
Jul 31 02:01:33.785: INFO: 
Jul 31 02:01:33.849: INFO: skipping dumping cluster info - cluster too large
Jul 31 02:01:33.908: INFO: Deleted {user.openshift.io/v1, Resource=users  e2e-test-olm-23440-bf62c-user}, err: <nil>
Jul 31 02:01:33.973: INFO: Deleted {oauth.openshift.io/v1, Resource=oauthclients  e2e-client-e2e-test-olm-23440-bf62c}, err: <nil>
Jul 31 02:01:34.008: INFO: Deleted {oauth.openshift.io/v1, Resource=oauthaccesstokens  0sOayY4ZSvmD5tbwFfzHXgAAAAAAAAAA}, err: <nil>
[AfterEach] [sig-operator] an end user can use OLM
  github.com/openshift/origin@/test/extended/util/client.go:134
Jul 31 02:01:34.008: INFO: Waiting up to 7m0s for all (but 100) nodes to be ready
STEP: Destroying namespace "e2e-test-olm-23440-bf62c" for this suite.
Jul 31 02:01:34.082: INFO: Running AfterSuite actions on all nodes
Jul 31 02:01:34.082: INFO: Running AfterSuite actions on node 1
fail [github.com/openshift/origin@/test/extended/operators/olm.go:199]: Timed out after 300.000s.
Expected
    <string>: 
not to equal
    <string>: 

[1]: https://prow.ci.openshift.org/view/gcs/origin-ci-test/logs/release-openshift-ocp-e2e-aws-scaleup-rhel7-4.6/1288974231769452544

Comment 1 Stefan Schimanski 2020-07-31 05:42:43 UTC

Seems to perma-fail now. Disabled the test.

Comment 7 Evan Cordell 2020-08-06 11:59:18 UTC

The cause of the flakiness was resolved via other fixes in operator-registry; the attached PR re-enables the test.

Comment 8 Evan Cordell 2020-08-06 12:07:11 UTC


*** This bug has been marked as a duplicate of bug 1857928 ***