Bug 1852916 - [sig-apps][Feature:DeploymentConfig] deploymentconfigs adoption will orphan all RCs and adopt them back when recreated
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: openshift-controller-manager
Version: 4.5
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: ---
Target Release: 4.6.0
Assignee: Maciej Szulik
QA Contact: zhou ying
URL:
Whiteboard: workloads, LifecycleStale
Duplicates: 1852995
Depends On:
Blocks:
Reported: 2020-07-01 15:07 UTC by Corey Daley
Modified: 2020-10-27 16:12 UTC (History)

Fixed In Version:
Doc Type: No Doc Update
Doc Text:
Clone Of:
Environment:
[sig-apps][Feature:DeploymentConfig] deploymentconfigs adoption will orphan all RCs and adopt them back when recreated [Feature:DeploymentConfig] deploymentconfigs adoption [Conformance] will orphan all RCs and adopt them back when recreated
Last Closed: 2020-10-27 16:11:46 UTC
Target Upstream Version:
Embargoed:


Links
System ID Private Priority Status Summary Last Updated
Github openshift origin pull 25408 0 None closed Bug 1868061: switch watchtools.UntilWithoutRetry to watchtools.UntilWithSync in deployment tests 2020-09-14 02:37:38 UTC
Red Hat Product Errata RHBA-2020:4196 0 None None None 2020-10-27 16:12:05 UTC

Comment 2 W. Trevor King 2020-07-02 04:38:51 UTC
Linked 'launch' job was via cluster-bot, which you can see by clicking through from the job-detail page to the ProwJob YAML, which opens with:

metadata:
  annotations:
    ci-chat-bot.openshift.io/channel: ""
    ci-chat-bot.openshift.io/expires: "13500"
    ci-chat-bot.openshift.io/jobInputs: '[{"Image":"","Version":"4.5.0-0.latest","Refs":[{"org":"openshift","repo":"origin","base_ref":"master","base_sha":"5d8c7f115968f4e3cdc44e047be2585bec6f1e7e","pulls":[{"number":25217,"author":"system:serviceaccount:ci:ci-chat-bot","sha":"db81e9892f66b8c838f545e8051c53df5696d39e"}]},{"org":"openshift","repo":"oauth-server","base_ref":"master","base_sha":"9eee6c6eeaf48f5426dd501c0726f777eb850847","pulls":[{"number":50,"author":"system:serviceaccount:ci:ci-chat-bot","sha":"9de5c160e16fee7fddc29ea2423d083548be8d6b"}]}]}]'
    ci-chat-bot.openshift.io/jobParams: test=e2e
    ci-chat-bot.openshift.io/mode: test
    ci-chat-bot.openshift.io/ns: ci-ln-mbpkz5b
    ci-chat-bot.openshift.io/originalMessage: test e2e openshift/origin#25217,openshift/oauth-server#50
    ci-chat-bot.openshift.io/platform: aws
    ci-chat-bot.openshift.io/user: U9UTT7MAT
    prow.k8s.io/job: release-openshift-origin-installer-launch-aws
  creationTimestamp: "2020-07-01T11:29:18Z"
  generation: 3
  labels:
    ci-chat-bot.openshift.io/launch: "true"
    prow.k8s.io/build-id: ""
    prow.k8s.io/id: chat-bot-2020-07-01-112918.7695
    prow.k8s.io/job: release-openshift-origin-installer-launch-aws
    prow.k8s.io/type: periodic
  name: chat-bot-2020-07-01-112918.7695
  namespace: ci
  resourceVersion: "66292321"
  selfLink: /apis/prow.k8s.io/v1/namespaces/ci/prowjobs/chat-bot-2020-07-01-112918.7695
  uid: 5e016563-bd89-45a9-a2d1-231e7bf78005

I dunno if we care about 4.5-latest cluster-bot jobs, since it's possible someone was mucking about with the cluster as it ran.  Also in the referenced job, you can see that this test-case was flaky, not fatal (it failed once, but passed on retest).  When it failed, the failure message was:

  timed out waiting for the condition

which is not very helpful.  As I mentioned in the similar bug 1852995, I'd be in favor of work to improve that message to say what we timed out waiting for and how far along we were, to give folks some guidance when they try to figure out what got stuck and why.

Comment 3 Maciej Szulik 2020-07-02 08:09:27 UTC
Based on https://sippy-bparees.svc.ci.openshift.org/?release=4.6 this failed 6 times, so I'm marking it low priority accordingly.

Comment 4 Maciej Szulik 2020-07-02 08:09:36 UTC
*** Bug 1852995 has been marked as a duplicate of this bug. ***

Comment 6 Jan Safranek 2020-08-12 09:06:11 UTC
Noticed this today when debugging unrelated Watch errors:

fail [github.com/openshift/origin/test/extended/deployments/util.go:751]: watch closed unexpectedly
Expected
    <bool>: false
to be equivalent to
    <bool>: true


And (note a different test case!)

[sig-apps][Feature:DeploymentConfig] deploymentconfigs with minimum ready seconds set should not transition the deployment to Complete before satisfied [Suite:openshift/conformance/parallel]

fail [github.com/openshift/origin@/test/extended/deployments/deployments.go:1090]: Unexpected error:
    <*errors.errorString | 0xc000540410>: {
        s: "watch closed before UntilWithoutRetry timeout",
    }
    watch closed before UntilWithoutRetry timeout
occurred

https://prow.ci.openshift.org/view/gs/origin-ci-test/logs/release-openshift-ocp-installer-e2e-metal-compact-4.6/1293389102703448064


There may be some low hanging fruit here: don't use UntilWithoutRetry in https://github.com/openshift/origin/blob/705b5e6987ec79af02b54e749f52f4bc67c455d9/test/extended/deployments/util.go

From UntilWithoutRetry comments:
// Warning: Unless you have a very specific use case (probably a special Watcher) don't use this function!!!
// Warning: This will fail e.g. on API timeouts and/or 'too old resource version' error.
// Warning: You are most probably looking for a function *Until* or *UntilWithSync* below,
// Warning: solving such issues.
// TODO: Consider making this function private to prevent misuse when the other occurrences in our codebase are gone.

(API timeout or similar error is most probably the case here)

Comment 7 Maciej Szulik 2020-08-12 11:14:03 UTC
This potentially will be fixed in https://github.com/openshift/origin/pull/25408

Comment 8 Maciej Szulik 2020-09-10 18:47:30 UTC
Fix landed in https://github.com/openshift/origin/pull/25010

Comment 10 zhou ying 2020-09-15 02:26:11 UTC
Checked the latest 4.6 test runs on GCP and Azure; can't reproduce this issue now. Will move to verified status.

https://testgrid.k8s.io/redhat-openshift-ocp-release-4.6-informing#release-openshift-ocp-installer-e2e-gcp-4.6
https://testgrid.k8s.io/redhat-openshift-ocp-release-4.6-informing#release-openshift-ocp-installer-e2e-azure-4.6

Comment 13 errata-xmlrpc 2020-10-27 16:11:46 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (OpenShift Container Platform 4.6 GA Images), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:4196

