Bug 1843462
Summary: | daemonset, deployment, and replicaset status can permafail | |||
---|---|---|---|---|
Product: | OpenShift Container Platform | Reporter: | Maciej Szulik <maszulik> | |
Component: | kube-controller-manager | Assignee: | Maciej Szulik <maszulik> | |
Status: | CLOSED ERRATA | QA Contact: | zhou ying <yinzhou> | |
Severity: | high | Docs Contact: | ||
Priority: | high | |||
Version: | 4.5 | CC: | aos-bugs, bparees, deads, mfojtik, wking, yinzhou | |
Target Milestone: | --- | Keywords: | Upgrades | |
Target Release: | 4.5.0 | |||
Hardware: | Unspecified | |||
OS: | Unspecified | |||
Whiteboard: | ||||
Fixed In Version: | Doc Type: | Bug Fix | ||
Doc Text: |
Cause:
In certain cases, a NotFound error was swallowed by the controller logic.
Consequence:
The missing NotFound event left the controller unaware of the missing pods.
Fix:
Properly react to NotFound events, which indicate that the pod was already removed by a different actor.
Result:
Controllers (deployment, daemonset, replicaset, and others) now properly react to a pod NotFound event.
|
Story Points: | --- | |
Clone Of: | 1843187 | |||
: | 1843876 (view as bug list) | Environment: | ||
Last Closed: | 2020-07-13 17:42:56 UTC | Type: | --- | |
Regression: | --- | Mount Type: | --- | |
Documentation: | --- | CRM: | ||
Verified Versions: | Category: | --- | ||
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | ||
Cloudforms Team: | --- | Target Upstream Version: | ||
Embargoed: | ||||
Bug Depends On: | 1843187 | |||
Bug Blocks: | 1843876 |
Description
Maciej Szulik
2020-06-03 10:55:48 UTC
Aligning Keywords with the upstream bug.

This is currently blocked by https://bugzilla.redhat.com/show_bug.cgi?id=1845889, which is waiting for a backport of https://github.com/openshift/origin/pull/25091 to land and fix the k8s conformance tests.

Checked with payload: 4.5.0-0.nightly-2020-06-20-194346:

Open 2 terminals and, at the same time, delete one pod of the deployment in the first terminal and scale down the deployment in the second terminal. Check the deployment: no new pod is created.

[zhouying@dhcp-140-138 ~]$ oc get po
NAME                      READY   STATUS      RESTARTS   AGE
ruby-ex-1-build           0/1     Completed   0          2m20s
ruby-ex-76567d646-2bppr   1/1     Running     0          12s
ruby-ex-76567d646-q86k4   1/1     Running     0          12s
ruby-ex-76567d646-tw49k   1/1     Running     0          84s

[zhouying@dhcp-140-138 ~]$ oc delete po/ruby-ex-76567d646-tw49k
pod "ruby-ex-76567d646-tw49k" deleted

[zhouying@dhcp-140-138 ~]$ oc scale deploy/ruby-ex --replicas=2
deployment.apps/ruby-ex scaled

[zhouying@dhcp-140-138 ~]$ oc get po
NAME                      READY   STATUS        RESTARTS   AGE
ruby-ex-1-build           0/1     Completed     0          3m56s
ruby-ex-76567d646-2bppr   1/1     Running       0          108s
ruby-ex-76567d646-q86k4   1/1     Running       0          108s
ruby-ex-76567d646-tw49k   1/1     Terminating   0          3m

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:2409

Removing UpgradeBlocker from this older bug, to remove it from the suspect queue described in [1]. If you feel like this bug still needs to be a suspect, please add the keyword again.

[1]: https://github.com/openshift/enhancements/pull/475