+++ This bug was initially created as a clone of Bug #1765294 +++

Description of problem:
[Feature:OpenShiftControllerManager] TestDockercfgTokenDeletedController [Suite:openshift/conformance/parallel]
fail [github.com/onsi/ginkgo/internal/leafnodes/runner.go:113]: timeout: sa1-dockercfg-zdx4x

Additional info:
https://prow.svc.ci.openshift.org/view/gcs/origin-ci-test/logs/release-openshift-ocp-installer-e2e-openstack-4.3/228
https://testgrid.k8s.io/redhat-openshift-release-4.3-informing-ocp#release-openshift-ocp-installer-e2e-openstack-4.3

--- Additional comment from Anurag saxena on 2019-10-25 21:02:11 UTC ---

--- Additional comment from Sergiusz Urbaniak on 2019-10-28 10:46:55 UTC ---

confirming the issue is still persistent in e2e tests.

--- Additional comment from errata-xmlrpc on 2019-10-29 20:27:32 UTC ---

Bug report changed to ON_QA status by Errata System.
A QE request has been submitted for advisory RHBA-2019:46256-02
https://errata.devel.redhat.com/advisory/46256

--- Additional comment from wewang on 2019-10-31 08:37:58 UTC ---

It still exist in e2e test: 4.3.0-0.nightly-2019-10-31-050543
https://prow.svc.ci.openshift.org/view/gcs/origin-ci-test/logs/release-openshift-ocp-installer-e2e-gcp-4.3/294

--- Additional comment from Adam Kaplan on 2019-11-01 20:58:35 UTC ---

--- Additional comment from Ben Parees on 2019-11-04 19:44:22 UTC ---

I wonder if this is a watch issue in the test...can we replace the logic in waitForSecretDelete that looks for the deletion event with an explicit poll that simply looks for the secret in question to go missing?

--- Additional comment from Ricardo Maraschini on 2019-11-07 15:43:31 UTC ---

I have just sent a patch that migrates away from watch, let's see if it is an issue there.
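For readers following the watch-versus-poll suggestion above, here is a minimal sketch of what "poll until the secret goes missing" could look like. It is not the code from the actual patch: `secretGetter`, `errNotFound`, and `waitForSecretGone` are hypothetical names invented for illustration, and the fake getter stands in for a real API client lookup.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// errNotFound is a hypothetical stand-in for an API "not found" error.
var errNotFound = errors.New("secret not found")

// secretGetter is a hypothetical lookup function: it returns nil while the
// secret exists, errNotFound once it is gone, or another error on failure.
type secretGetter func(name string) error

// waitForSecretGone polls get every interval until the secret is missing,
// a non-NotFound error occurs, or timeout elapses.
func waitForSecretGone(get secretGetter, name string, interval, timeout time.Duration) error {
	deadline := time.Now().Add(timeout)
	for {
		err := get(name)
		if errors.Is(err, errNotFound) {
			return nil // the secret is gone, which is the success condition
		}
		if err != nil {
			return fmt.Errorf("looking up %s: %w", name, err)
		}
		if time.Now().After(deadline) {
			return fmt.Errorf("timeout: %s", name)
		}
		time.Sleep(interval)
	}
}

func main() {
	// Fake getter: the secret "exists" for the first three lookups.
	calls := 0
	get := func(name string) error {
		calls++
		if calls > 3 {
			return errNotFound
		}
		return nil
	}
	err := waitForSecretGone(get, "sa1-dockercfg-zdx4x", 10*time.Millisecond, time.Second)
	fmt.Println(err, calls)
}
```

Unlike a watch, this approach cannot miss a deletion event that fires before the watch is established or while it is being re-established: it only asks whether the object still exists. The trade-off is extra lookups against the API server for the duration of the wait.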
--- Additional comment from Oleg Bulatov on 2019-11-08 11:45:03 UTC ---

The PR that Ricardo mentioned: https://github.com/openshift/origin/pull/24103

--- Additional comment from OpenShift Automated Release Tooling on 2019-11-13 19:28:01 UTC ---

Elliott changed bug status from ('MODIFIED',) to ON_QA.

--- Additional comment from wewang on 2019-11-15 07:14:38 UTC ---

[Feature:OpenShiftControllerManager] TestDockercfgTokenDeletedController [Suite:openshift/conformance/parallel] is verified in:
https://prow.svc.ci.openshift.org/view/gcs/origin-ci-test/logs/release-openshift-origin-installer-e2e-gcp-4.3/801

--- Additional comment from Petr Muller on 2019-11-20 18:19:58 UTC ---

This test failure also occurred in a machine-os-content promotion job
https://prow.svc.ci.openshift.org/view/gcs/origin-ci-test/logs/release-promote-openshift-machine-os-content-e2e-aws-4.3/3992

It looks like the fix is supposed to be in, can you please check if the above is the same thing?

--- Additional comment from Oleg Bulatov on 2019-11-21 00:15:33 UTC ---

Watch has been replaced by Poll, but the test still flakes [1].

[1] https://testgrid.k8s.io/redhat-openshift-ocp-release-4.3-blocking#release-openshift-origin-installer-e2e-gcp-4.3&include-filter-by-regex=TestDockercfgTokenDeletedController

--- Additional comment from Ricardo Maraschini on 2019-11-21 15:16:58 UTC ---

I have this test running individually here for more than 1 hour. It takes less than 10 seconds to complete and I had not even a single failure. Starting to look to see if there may be any problem due to parallel tests.

--- Additional comment from Oleg Bulatov on 2019-11-21 16:41:15 UTC ---

Adding debug information to the test: https://github.com/openshift/origin/pull/24187

--- Additional comment from Adam Kaplan on 2019-11-22 18:31:34 UTC ---

Moving to 4.4.0, we will likely need to backport to 4.3.0 once we determine the root cause.
--- Additional comment from Adam Kaplan on 2019-11-26 15:42:13 UTC ---

--- Additional comment from Adam Kaplan on 2019-11-26 15:42:51 UTC ---

Moving this to 4.3.0 given the impact of this bug.

--- Additional comment from Ed Santiago on 2019-11-27 15:16:00 UTC ---

Still seeing this in recent runs:
https://prow.svc.ci.openshift.org/view/gcs/origin-ci-test/logs/release-openshift-ocp-installer-e2e-gcp-4.3/478
(plus many from last night)

--- Additional comment from Ben Parees on 2019-11-27 15:19:52 UTC ---

This test was (temporarily) disabled as of 15 hours ago:
https://github.com/openshift/origin/pull/24221

maybe it hadn't made it through the ART cycle though.

--- Additional comment from Oleg Bulatov on 2019-11-27 15:37:20 UTC ---

It's disabled only in master (4.4). Do we want to disable it in 4.3?

--- Additional comment from Ben Parees on 2019-11-27 15:55:19 UTC ---

ugh. yes. thanks Oleg.

--- Additional comment from Adam Kaplan on 2019-12-02 14:32:03 UTC ---

Note too that once we uncover the root cause of the flake, we need a 4.3 backport anyway for the .0 release or a z-stream update.
All attached PRs are only about gathering additional information. Switching back to ASSIGNED.
FYI: the test sometimes passes in CI: https://prow.svc.ci.openshift.org/view/gcs/origin-ci-test/logs/release-openshift-ocp-installer-e2e-gcp-4.3/531 but it failed in: https://prow.svc.ci.openshift.org/view/gcs/origin-ci-test/logs/release-openshift-origin-installer-e2e-gcp-4.3/977
*** Bug 1814453 has been marked as a duplicate of this bug. ***
From [1]:

> The backport to 4.3.z is on hold until 4.4.0 goes GA.

Also [2]. But the 4.4 bug 1806792 is VERIFIED, we run a lot of 4.4 CI, and we have 4.4 RCs out in the wild. Can we declare "soaked enough" at some point before 4.4.0 and land this backport to address the most common cause of 4.3 CI failures (which is what this bug was yesterday, although today other failure modes have pulled ahead ;)?

[1]: https://bugzilla.redhat.com/show_bug.cgi?id=1814453#c2
[2]: https://github.com/openshift/openshift-controller-manager/pull/72#pullrequestreview-371497080
Checked in version: 4.3.0-0.ci-2020-03-26-003534 [Feature:OpenShiftControllerManager] TestDockercfgTokenDeletedController [Suite:openshift/conformance/parallel] passed in job: https://prow.svc.ci.openshift.org/view/gcs/origin-ci-test/logs/release-openshift-origin-installer-e2e-gcp-4.3/1670
Reopening. This likely caused the regression in https://bugzilla.redhat.com/show_bug.cgi?id=1785023.
Moving back to VERIFIED - fix for regression is being tracked in https://bugzilla.redhat.com/show_bug.cgi?id=1785023 and its dependent BZs.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2020:1262