Even though all tests pass, "make unit-test" occasionally exits with a non-zero return code. This happens because the "vet" utility, which is called by "go test", is killed and fails:

    go test -v -race ./pkg/... ./test/utils/credentials
    /usr/lib/golang/pkg/tool/linux_amd64/vet: signal: killed
    /usr/lib/golang/pkg/tool/linux_amd64/vet: signal: killed
    /usr/lib/golang/pkg/tool/linux_amd64/vet: signal: killed
    /usr/lib/golang/pkg/tool/linux_amd64/vet: signal: killed
    (...)

Here is an example job:
https://storage.googleapis.com/origin-ci-test/pr-logs/pull/openshift_release/23165/rehearse-23165-pull-ci-openshift-azure-file-csi-driver-master-unit2/1455228476146585600/build-log.txt

For the time being we have disabled "go vet" in our "unit" CI job, but that should be reverted and the underlying issue fixed:
https://github.com/openshift/release/commit/c8066e2e385e4563433cf06f08cae62bb73dd636#diff-75299b45d9fd8e4fae4211fb7c9dcba6d02cbf691985d9c1bf776895f3cd005aR45
While the CI job was "fixed" to run `go test -vet=off`, here we want to investigate *why* CI kills our `make unit-test`.
This is the test container as executed by CI:

    - resources:
        limits:
          memory: 4Gi
        requests:
          cpu: 100m
          memory: 200Mi

With `/bin/time -v make unit-test` in the same pod I got:

    Command being timed: "make unit-test"
    User time (seconds): 309.93
    System time (seconds): 30.03
    Percent of CPU this job got: 493%
    Elapsed (wall clock) time (h:mm:ss or m:ss): 1:08.92
    Average shared text size (kbytes): 0
    Average unshared data size (kbytes): 0
    Average stack size (kbytes): 0
    Average total size (kbytes): 0
    Maximum resident set size (kbytes): 1687828
    Average resident set size (kbytes): 0
    Major (requiring I/O) page faults: 45
    Minor (reclaiming a frame) page faults: 4179941
    Voluntary context switches: 443988
    Involuntary context switches: 55552
    Swaps: 0
    File system inputs: 15296
    File system outputs: 3266488
    Socket messages sent: 0
    Socket messages received: 0
    Signals delivered: 0
    Page size (bytes): 4096
    Exit status: 0

The tests need ~1.6 GB of memory.
This says that the current average is 2Gi:
https://resources.ci.openshift.org/usage/pods?branch=master&container=test&org=openshift&repo=azure-file-csi-driver&target=unit&variant=

I'm restoring `make test-unit` and adding bigger limits in the linked PR.
Verified: the latest unit test job (https://prow.ci.openshift.org/view/gs/origin-ci-test/pr-logs/pull/openshift_azure-file-csi-driver/6/pull-ci-openshift-azure-file-csi-driver-master-unit/1458151165031092224) now passes.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Moderate: OpenShift Container Platform 4.10.3 security update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2022:0056