Bug 1843319
Summary: | daemonsets fail to rollout during upgrade | |||
---|---|---|---|---|
Product: | OpenShift Container Platform | Reporter: | Ben Parees <bparees> | |
Component: | kube-controller-manager | Assignee: | Tomáš Nožička <tnozicka> | |
Status: | CLOSED ERRATA | QA Contact: | zhou ying <yinzhou> | |
Severity: | high | Docs Contact: | ||
Priority: | high | |||
Version: | 4.5 | CC: | aos-bugs, deads, maszulik, mfojtik, tnozicka, wking | |
Target Milestone: | --- | Keywords: | Reopened | |
Target Release: | 4.6.0 | |||
Hardware: | Unspecified | |||
OS: | Unspecified | |||
Whiteboard: | ||||
Fixed In Version: | Doc Type: | Bug Fix | ||
Doc Text: |
Cause: A DaemonSet was deleted and recreated.
Consequence: The recreated DaemonSet could get stuck for up to 5 minutes while the stale expectations expired.
Fix: The DaemonSet controller now clears expectations when a DaemonSet is recreated.
Result: A recreated DaemonSet no longer gets stuck.
|
Story Points: | --- | |
Clone Of: | ||||
: | 1845242 | Environment: | |
Last Closed: | 2020-10-27 16:04:37 UTC | Type: | Bug | |
Regression: | --- | Mount Type: | --- | |
Documentation: | --- | CRM: | ||
Verified Versions: | Category: | --- | ||
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | ||
Cloudforms Team: | --- | Target Upstream Version: | ||
Embargoed: | ||||
Bug Depends On: | ||||
Bug Blocks: | 1845242 |
Description
Ben Parees
2020-06-03 02:31:41 UTC
Yeah, David already opened bug 1843187, so I'll close this as a duplicate.

*** This bug has been marked as a duplicate of bug 1843187 ***

fyi, I've confirmed the pods are actually ready, so this is likely the expectations bug we are tracking.

This is a different bug. The GC controller is not cleaning up six pods in openshift-monitoring that do not have valid owner references.

Reopening based on https://bugzilla.redhat.com/show_bug.cgi?id=1843319#c3

This is causing upgrade failures in 4.5, what is the basis for deferring it?

(In general, comments should always be added to a bug explaining a deferral when the bug is deferred.)

(In reply to Ben Parees from comment #5)
> This is causing upgrade failures in 4.5, what is the basis for deferring it?
>
> (in general comments should always be added to a bug explaining a deferral,
> when the bug is deferred)

After bug 1843187 is fixed, Tomas will need to dig through the logs and identify what is causing the actual problem. The fix has to land in 4.6 first and only then be back-ported (through a clone of this BZ) to 4.5. It's not that we are deferring this bug; we're following the process, but it takes a bit of time to nail down the root cause and find a fix.

I think the DS expectations didn't get cleared in the re-create case; working on a fix upstream.

> The fix has to land in 4.6 first and only then be back-ported (through clone of this BZ) to 4.5. It's not that
> we are deferring this bug, we're following the process, but it takes a bit of time to nail down the root cause and
> find a fix.
you can still open the 4.5 clone now so we have a complete view of our blocker list for 4.5. Otherwise 4.5 risks going out the door w/o this being addressed (because no one except you and I are aware it affects 4.5, it doesn't show up on any 4.5 lists).
So by not opening the 4.5 BZ now, you are (implicitly) saying you're ok shipping as is/deferring this bug.
This bug is actively being worked on.

Checked with the unit test code; the issue has been fixed:

[root@dhcp-140-138 daemon]# go test -v -run TestExpectationsOnRecreate
...
=== RUN TestExpectationsOnRecreate
I0702 16:40:46.789765 8026 shared_informer.go:223] Waiting for caches to sync for test dsc
I0702 16:40:46.890030 8026 shared_informer.go:230] Caches are synced for test dsc
--- PASS: TestExpectationsOnRecreate (0.41s)
PASS
ok k8s.io/kubernetes/pkg/controller/daemon 0.426s

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory (OpenShift Container Platform 4.6 GA Images), and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:4196