2084361 – ci jobs exit 2 for an unknown reason

Summary: ci jobs exit 2 for an unknown reason

Keywords:
Status:	CLOSED WORKSFORME
Alias:	None
Product:	OpenShift Container Platform
Classification:	Red Hat
Component:	Test Framework
Sub Component:
Version:	4.11
Hardware:	Unspecified
OS:	Unspecified
Priority:	medium
Severity:	medium
Target Milestone:	---
Target Release:	4.11.0
Assignee:	OpenShift Release Oversight
QA Contact:
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:

Reported:	2022-05-12 00:05 UTC by jamo luhrsen
Modified:	2022-11-21 19:44 UTC
CC List:	3 users
Fixed In Version:
Doc Type:	If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed:	2022-11-21 19:44:26 UTC
Target Upstream Version:
Embargoed:

Links
System	ID	Private	Priority	Status	Summary	Last Updated
Github	openshift origin pull 27136	0	None	open	Bug 2084361: Add some debugging information for mysterious exit 2 errors	2022-05-15 21:10:15 UTC
Github	openshift origin pull 27230	0	None	open	Bug 2084361: Print error msg before os.Exit()	2022-06-08 17:26:06 UTC

Description jamo luhrsen 2022-05-12 00:05:30 UTC

Some jobs, like these two [0][1], are showing up as failures, but there
is no clear reason for the failure other than that the test container
exited with status 2.

There is a brief slack conversation about this here[2].

A quick query on search.ci shows that this happened in 15 of the 332 failed
jobs from the last 12 hours:

    curl -s 'https://search.ci.openshift.org/search?maxAge=12h&type=build-log&context=1&search=Step.*openshift-e2e-test+failed' | jq -r 'to_entries[].value | to_entries[].value[].context[]' | grep 'exit status' | sort | uniq -c
    317 error: failed to execute wrapped command: exit status 1
     15 error: failed to execute wrapped command: exit status 2
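For anyone who wants to repeat the tally without jq, here is a rough Python sketch of the same counting step. It assumes the search.ci response has the shape implied by the jq path above (job URL -> search string -> list of matches, each with a "context" list of lines). The function name and the sample data are hypothetical and hand-written for illustration; they are not real search results.

```python
from collections import Counter


def tally_exit_statuses(results):
    """Count context lines that mention 'exit status'.

    Walks the same path as the jq filter:
    to_entries[].value | to_entries[].value[].context[]
    """
    counts = Counter()
    for per_job in results.values():        # job URL -> {search string: [matches]}
        for matches in per_job.values():    # search string -> list of matches
            for match in matches:
                # Unlike jq, a match without "context" is skipped, not an error.
                for line in match.get("context", []):
                    if "exit status" in line:
                        counts[line.strip()] += 1
    return counts


# Hypothetical sample in the assumed response shape (not real data).
sample_results = {
    "https://example.invalid/job/1": {
        "Step.*openshift-e2e-test failed": [
            {"context": ["error: failed to execute wrapped command: exit status 1"]},
        ]
    },
    "https://example.invalid/job/2": {
        "Step.*openshift-e2e-test failed": [
            {"context": [
                "some unrelated log line",
                "error: failed to execute wrapped command: exit status 2",
            ]},
        ]
    },
}

for line, count in sorted(tally_exit_statuses(sample_results).items()):
    print(f"{count:7d} {line}")
```

To use it against live data, the JSON returned by the curl command above could be loaded with json.load and passed to tally_exit_statuses in place of the sample.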

The test container failure message does not help:

    {"component":"entrypoint","error":"wrapped process failed: exit status 2","file":"k8s.io/test-infra/prow/entrypoint/run.go:80","func":"k8s.io/test-infra/prow/entrypoint.Options.Run","level":"error","msg":"Error executing test process","severity":"error","time":"2022-05-11T08:16:30Z"}
    error: failed to execute wrapped command: exit status 2
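When scanning many build logs for this message, the exit status can be pulled out of the entrypoint JSON line mechanically. The sketch below assumes the log line is a single JSON object with an "error" field, as in the example above. The helper name is hypothetical, and this is only an illustration, not part of any existing tooling.

```python
import json
import re


def wrapped_exit_status(log_line):
    """Return the exit status named in an entrypoint JSON log line, or None."""
    try:
        record = json.loads(log_line)
    except json.JSONDecodeError:
        return None  # plain-text lines (not JSON) are ignored
    if not isinstance(record, dict):
        return None
    found = re.search(r"exit status (\d+)", str(record.get("error", "")))
    return int(found.group(1)) if found else None


# The entrypoint line quoted in this report.
entrypoint_line = (
    '{"component":"entrypoint","error":"wrapped process failed: exit status 2",'
    '"file":"k8s.io/test-infra/prow/entrypoint/run.go:80",'
    '"func":"k8s.io/test-infra/prow/entrypoint.Options.Run","level":"error",'
    '"msg":"Error executing test process","severity":"error",'
    '"time":"2022-05-11T08:16:30Z"}'
)

print(wrapped_exit_status(entrypoint_line))
```

Note that this only recovers the status code that the wrapper reports. It does not explain why the test process exited with that code, which is the open question in this bug.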


I cannot find any other clues in the job artifacts. All of the tests and
steps appear to be OK in these jobs, so they should be marked as passing.

[0] https://prow.ci.openshift.org/view/gs/origin-ci-test/logs/periodic-ci-openshift-release-master-ci-4.11-upgrade-from-stable-4.10-e2e-aws-ovn-upgrade/1524255508167397376
[1] https://prow.ci.openshift.org/view/gs/origin-ci-test/logs/periodic-ci-openshift-release-master-ci-4.11-upgrade-from-stable-4.10-e2e-aws-ovn-upgrade/1524255509832536064
[2] https://coreos.slack.com/archives/C01CQA76KMX/p1652304143240249

Comment 2 jamo luhrsen 2022-05-18 15:02:54 UTC

Moving back to ASSIGNED, because the PR associated with this bz was only for
debugging purposes. I haven't come across another example of this since the
debug PR went in. When I do, I'll reply here with what I find.

Comment 4 Devan Goodwin 2022-11-21 19:44:26 UTC

The command above no longer returns any exit status 2 results. Closing.
