Bug 1923737

Summary:	openshift-tests should better support use of --count with early/late tests and support --fail-fast
Product:	OpenShift Container Platform	Reporter:	Clayton Coleman <ccoleman>
Component:	Test Framework	Assignee:	Devan Goodwin <dgoodwin>
Status:	CLOSED WONTFIX	QA Contact:
Severity:	high	Docs Contact:
Priority:	unspecified
Version:	4.7	CC:	jiazha, pmuller, skuznets
Target Milestone:	---
Target Release:	4.7.0
Hardware:	Unspecified
OS:	Unspecified
Whiteboard:
Fixed In Version:		Doc Type:	If docs needed, set a value
Doc Text:		Story Points:	---
Clone Of:		Environment:
Last Closed:	2022-02-02 18:15:35 UTC	Type:	Bug
Regression:	---	Mount Type:	---
Documentation:	---	CRM:
Verified Versions:		Category:	---
oVirt Team:	---	RHEL 7.3 requirements from Atomic Host:
Cloudforms Team:	---	Target Upstream Version:
Embargoed:

Description Clayton Coleman 2021-02-01 18:23:08 UTC

In general we are leaning more heavily into iteration. Perform a number of cleanups on the test code (to be ported to 4.6 as well) that unify some temporary hacks into a more principled structure:

--fail-fast should terminate cleanly when a test fails (and highlight the test that failed)

[Early] and [Late] should not be run multiple times with --count

[Early] and [Late] differ between jobs and should not overlap

SCC pre-test checks were extremely slow (polling every 15s) and should take no more than 30s, with much shorter check intervals

Cloud provider initialization should not occur unless needed

Disruptive tests should skip [Late] but not [Early]
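
To make the [Early]/[Late] requirement concrete, here is a minimal sketch of how a run plan could be ordered so that only the regular tests are repeated. It is not the openshift-tests implementation: the function name `build_run_plan`, the tag matching by substring, and the assumption that `count` is a positive integer are all hypothetical choices for illustration.

```python
from typing import List


def build_run_plan(tests: List[str], count: int) -> List[str]:
    """Return the ordered list of test names for one invocation.

    Assumptions (illustrative only): a test is "early" or "late" when its
    name contains the literal tag "[Early]" or "[Late]", and count >= 1.
    """
    early = [t for t in tests if "[Early]" in t]
    # A test tagged with both is treated as early so the groups never overlap.
    late = [t for t in tests if "[Late]" in t and "[Early]" not in t]
    normal = [t for t in tests if "[Early]" not in t and "[Late]" not in t]

    # Only the normal tests are multiplied by count; the early and late
    # tests bracket the whole run exactly once.
    return early + normal * count + late
```

With `count` set to 3, each regular test appears three times in the plan while every [Early] and [Late] test appears once, at the start and end respectively.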

Comment 5 Jian Zhang 2021-03-04 10:06:42 UTC

1. `--count` works as expected: if it is `-1`, the test runs forever.
[root@preserve-olm-env origin]# ./openshift-tests run all --dry-run|grep "OLM Report Upgradeable"|./openshift-tests run --count -1 -f -
openshift-tests version: v4.1.0-3659-g7c51c89
started: (0/1/1) "[sig-operator] an end user can use OLM Report Upgradeable in OLM ClusterOperators status [Suite:openshift/conformance/parallel]"

passed: (3.8s) 2021-03-04T09:45:44 "[sig-operator] an end user can use OLM Report Upgradeable in OLM ClusterOperators status [Suite:openshift/conformance/parallel]"

started: (0/2/2) "[sig-operator] an end user can use OLM Report Upgradeable in OLM ClusterOperators status [Suite:openshift/conformance/parallel]"

passed: (2.8s) 2021-03-04T09:45:47 "[sig-operator] an end user can use OLM Report Upgradeable in OLM ClusterOperators status [Suite:openshift/conformance/parallel]"

started: (0/3/3) "[sig-operator] an end user can use OLM Report Upgradeable in OLM ClusterOperators status [Suite:openshift/conformance/parallel]"
...

But with `--count -5` it runs only once; it should run forever for any `--count` < 0.

[root@preserve-olm-env origin]# ./openshift-tests run all --dry-run|grep "OLM Report Upgradeable"|./openshift-tests run --count -5 -f -
openshift-tests version: v4.1.0-3659-g7c51c89
started: (0/1/1) "[sig-operator] an end user can use OLM Report Upgradeable in OLM ClusterOperators status [Suite:openshift/conformance/parallel]"

passed: (2.8s) 2021-03-04T09:48:33 "[sig-operator] an end user can use OLM Report Upgradeable in OLM ClusterOperators status [Suite:openshift/conformance/parallel]"


Timeline:

Mar 04 09:48:31.559 I ns/e2e-test-olm-23440-ksd65 namespace/e2e-test-olm-23440-ksd65 reason/CreatedSCCRanges created SCC ranges
Mar 04 09:48:31.699 - 1s    W ns/openshift-marketplace pod/qe-app-registry-r2ztw node/ip-10-0-179-41.us-east-2.compute.internal pod has been pending longer than a minute
Mar 04 09:48:31.699 - 1s    W ns/e2e-runtimeclass-7115 pod/test-runtimeclass-e2e-runtimeclass-7115-non-conflict-runti882v2 node/ip-10-0-140-208.us-east-2.compute.internal pod has been pending longer than a minute
Mar 04 09:48:31.699 - 1s    W ns/openshift-marketplace pod/qe-app-registry-2h64f node/ip-10-0-140-208.us-east-2.compute.internal pod has been pending longer than a minute

1 pass, 0 skip (2.8s)

With `--count 0` it runs once; it should not run at all.

[root@preserve-olm-env origin]# ./openshift-tests run all --dry-run|grep "OLM Report Upgradeable"|./openshift-tests run --count 0 -f -
openshift-tests version: v4.1.0-3659-g7c51c89
started: (0/1/1) "[sig-operator] an end user can use OLM Report Upgradeable in OLM ClusterOperators status [Suite:openshift/conformance/parallel]"

passed: (2.9s) 2021-03-04T09:47:51 "[sig-operator] an end user can use OLM Report Upgradeable in OLM ClusterOperators status [Suite:openshift/conformance/parallel]"


Timeline:

Mar 04 09:47:49.026 I ns/e2e-test-olm-23440-hwk9h namespace/e2e-test-olm-23440-hwk9h reason/CreatedSCCRanges created SCC ranges
Mar 04 09:47:49.134 - 999ms W ns/openshift-marketplace pod/qe-app-registry-2h64f node/ip-10-0-140-208.us-east-2.compute.internal pod has been pending longer than a minute
Mar 04 09:47:49.134 - 999ms W ns/e2e-runtimeclass-7115 pod/test-runtimeclass-e2e-runtimeclass-7115-non-conflict-runti882v2 node/ip-10-0-140-208.us-east-2.compute.internal pod has been pending longer than a minute
Mar 04 09:47:49.134 - 999ms W ns/openshift-marketplace pod/qe-app-registry-r2ztw node/ip-10-0-179-41.us-east-2.compute.internal pod has been pending longer than a minute

1 pass, 0 skip (2.9s)
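
The behavior expected in this comment (any negative `--count` repeats forever, `0` runs nothing, a positive value runs that many times) can be sketched as follows. This is only an illustration of the expected semantics, not the code in openshift-tests; the helper name `iterations` is hypothetical.

```python
import itertools
from typing import Iterator


def iterations(count: int) -> Iterator[int]:
    """Yield 1-based iteration numbers for a given --count value."""
    if count < 0:
        # Any negative value, not just -1, means "repeat until interrupted".
        return itertools.count(1)
    # count == 0 yields nothing, so no test is started at all.
    return iter(range(1, count + 1))
```

A caller would loop over `iterations(count)` and start the selected tests once per yielded number, so `-5` and `-1` behave identically.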


`--fail-fast` works as expected: it exits the suite as soon as a test fails.
[root@preserve-olm-env origin]# ./openshift-tests run all --dry-run|grep "\[sig-node\] RuntimeClass"
"[sig-node] RuntimeClass  should support RuntimeClasses API operations [Conformance] [Suite:openshift/conformance/parallel/minimal] [Suite:k8s]"
"[sig-node] RuntimeClass should reject a Pod requesting a RuntimeClass with an unconfigured handler [NodeFeature:RuntimeHandler] [Disabled:Broken] [Suite:k8s]"
"[sig-node] RuntimeClass should reject a Pod requesting a RuntimeClass with conflicting node selector [Disabled:Broken] [Suite:k8s]"
"[sig-node] RuntimeClass should reject a Pod requesting a deleted RuntimeClass [NodeFeature:RuntimeHandler] [Disabled:Broken] [Suite:k8s]"
"[sig-node] RuntimeClass should reject a Pod requesting a non-existent RuntimeClass [NodeFeature:RuntimeHandler] [Disabled:Broken] [Suite:k8s]"
"[sig-node] RuntimeClass should run a Pod requesting a RuntimeClass with a configured handler [NodeFeature:RuntimeHandler] [Disabled:Broken] [Suite:k8s]"
"[sig-node] RuntimeClass should run a Pod requesting a RuntimeClass with scheduling with taints [Serial]  [Disabled:Broken] [Suite:k8s]"
"[sig-node] RuntimeClass should run a Pod requesting a RuntimeClass with scheduling without taints  [Disabled:Broken] [Suite:k8s]"

[root@preserve-olm-env origin]# ./openshift-tests run all --dry-run|grep "\[sig-node\] RuntimeClass"|./openshift-tests run --fail-fast -f -
openshift-tests version: v4.1.0-3659-g7c51c89
started: (0/1/8) "[sig-node] RuntimeClass should reject a Pod requesting a RuntimeClass with conflicting node selector [Disabled:Broken] [Suite:k8s]"

started: (0/2/8) "[sig-node] RuntimeClass should run a Pod requesting a RuntimeClass with scheduling without taints  [Disabled:Broken] [Suite:k8s]"

started: (0/3/8) "[sig-node] RuntimeClass should reject a Pod requesting a non-existent RuntimeClass [NodeFeature:RuntimeHandler] [Disabled:Broken] [Suite:k8s]"

started: (0/4/8) "[sig-node] RuntimeClass  should support RuntimeClasses API operations [Conformance] [Suite:openshift/conformance/parallel/minimal] [Suite:k8s]"

started: (0/5/8) "[sig-node] RuntimeClass should run a Pod requesting a RuntimeClass with a configured handler [NodeFeature:RuntimeHandler] [Disabled:Broken] [Suite:k8s]"

started: (0/6/8) "[sig-node] RuntimeClass should reject a Pod requesting a deleted RuntimeClass [NodeFeature:RuntimeHandler] [Disabled:Broken] [Suite:k8s]"

started: (0/7/8) "[sig-node] RuntimeClass should reject a Pod requesting a RuntimeClass with an unconfigured handler [NodeFeature:RuntimeHandler] [Disabled:Broken] [Suite:k8s]"

passed: (1.4s) 2021-03-04T09:20:59 "[sig-node] RuntimeClass should reject a Pod requesting a RuntimeClass with conflicting node selector [Disabled:Broken] [Suite:k8s]"
...
failed: (5m4s) 2021-03-04T09:26:01 "[sig-node] RuntimeClass should run a Pod requesting a RuntimeClass with a configured handler [NodeFeature:RuntimeHandler] [Disabled:Broken] [Suite:k8s]"
skipped: (5m4s) 2021-03-04T09:26:01 "[sig-node] RuntimeClass should run a Pod requesting a RuntimeClass with scheduling without taints  [Disabled:Broken] [Suite:k8s]"

Timeline:
...
Failing tests:

[sig-node] RuntimeClass should run a Pod requesting a RuntimeClass with a configured handler [NodeFeature:RuntimeHandler] [Disabled:Broken] [Suite:k8s]

error: 1 fail, 5 pass, 1 skip (5m4s)

But it should report 2 skipped test cases, so the results summary is incorrect.
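
One reading of the output above is that the eighth test was never started once `--fail-fast` triggered, and so was left out of the tally. Under that assumption, a summary that counts every planned test without a recorded result as skipped would give the expected numbers. The sketch below is hypothetical (the name `summarize` and the result strings are not taken from openshift-tests) and omits the `error:` prefix and the duration.

```python
from collections import Counter
from typing import Dict, List


def summarize(planned: List[str], results: Dict[str, str]) -> str:
    """Build a summary line.

    results maps a test name to 'pass', 'fail' or 'skip'; a planned test
    with no entry (for example, never started) is counted as a skip.
    """
    tally = Counter(results.get(name, "skip") for name in planned)
    parts = []
    if tally["fail"]:
        parts.append(f"{tally['fail']} fail")
    parts.append(f"{tally['pass']} pass")
    parts.append(f"{tally['skip']} skip")
    return ", ".join(parts)
```

For eight planned tests with five passes, one failure, one explicit skip, and one test that never ran, this yields `1 fail, 5 pass, 2 skip`, so the counts add up to the number of planned tests.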

Comment 7 Steve Kuznetsov 2021-05-17 19:24:17 UTC

@clayton what is left here?

Comment 9 Red Hat Bugzilla 2023-09-15 01:00:10 UTC

The needinfo request[s] on this closed bug have been removed as they have been unresolved for 500 days