Bug 1559987

| Summary: | Unable to delete deploymentconfig |
| --- | --- |
| Product: | OpenShift Container Platform |
| Reporter: | Robert Bost <rbost> |
| Component: | Master |
| Assignee: | Michal Fojtik <mfojtik> |
| Status: | CLOSED ERRATA |
| QA Contact: | Wang Haoran <haowang> |
| Severity: | high |
| Docs Contact: | |
| Priority: | high |
| Version: | 3.7.0 |
| CC: | acomabon, aos-bugs, bfurtado, deads, dsafford, fshaikh, glamb, jdesousa, jkaur, jmalde, jokerman, kmendez, maszulik, mfojtik, mmccomas, openshift-bugs-escalate, rbost, smunilla, sthangav, stwalter, suchaudh |
| Target Milestone: | --- |
| Keywords: | Reopened |
| Target Release: | 3.7.z |
| Hardware: | Unspecified |
| OS: | Unspecified |
| Whiteboard: | |
| Fixed In Version: | v3.7.49-1 |
| Doc Type: | Bug Fix |
Doc Text:
Cause: In some cases the shared informer cache was not initialized properly, or failed to initialize at all.
Consequence: Controllers such as the garbage collector got stuck waiting for the caches to be initialized.
Fix: When the cache is stuck, do not wait for it to initialize; instead, forward the request directly to storage (etcd) to unblock the controllers.
Result: Controllers can reach the resources without being blocked on cache initialization.
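To make the fix concrete, here is a minimal Go sketch of the pattern the doc text describes: bound the wait for cache sync, and on timeout fall back to a direct storage read. Everything here (`waitForCacheOrStorage`, `readFromCache`, `readFromStorage`, the timeout value) is a hypothetical stand-in, not the actual OpenShift implementation.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// waitForCacheOrStorage bounds the wait for an informer cache to sync.
// If the cache never becomes ready, the request is forwarded to storage
// (etcd) directly instead of blocking the caller forever.
// Hypothetical sketch; not OpenShift code.
func waitForCacheOrStorage(
	cacheSynced <-chan struct{},
	timeout time.Duration,
	readFromCache func() (string, error),
	readFromStorage func() (string, error),
) (string, error) {
	select {
	case <-cacheSynced:
		return readFromCache()
	case <-time.After(timeout):
		// Cache is stuck: go straight to storage to unblock controllers.
		return readFromStorage()
	}
}

func main() {
	stuck := make(chan struct{}) // never closed: simulates a cache that fails to sync

	obj, err := waitForCacheOrStorage(
		stuck,
		100*time.Millisecond,
		func() (string, error) { return "", errors.New("cache never synced") },
		func() (string, error) { return "deploymentconfig read from etcd", nil },
	)
	fmt.Println(obj, err)
}
```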
| Story Points: | --- |
| --- | --- |
| Clone Of: | |
| : | 1678028 (view as bug list) |
| Environment: | |
| Last Closed: | 2018-07-11 09:57:57 UTC |
| Type: | Bug |
| Regression: | --- |
| Mount Type: | --- |
| Documentation: | --- |
| CRM: | |
| Verified Versions: | |
| Category: | --- |
| oVirt Team: | --- |
| RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- |
| Target Upstream Version: | |
| Embargoed: | |
| Bug Depends On: | |
| Bug Blocks: | 1267746, 1678028 |
Description
Robert Bost
2018-03-23 16:29:20 UTC
Can you please provide a dump of the pods and replication controllers associated with this DC? An API server and controllers journal would also help the analysis.

We do not currently have those details, but are requesting them now from another instance of the issue. Leaving needinfo set.

We would need to see controller logs from the time this removal was being invoked (at least for +1h after the initial oc delete invocation). It looks like there were some problems removing the dependent objects (either replication controllers or pods) the DC owned. Without the dependents being properly removed, the DC itself won't be removed either (see the sketch after this thread). I'd like to investigate the logs to confirm that theory and examine what might be causing this problem.

I've reviewed the attached logs and unfortunately I can't figure out exactly what's going on. The logs suggest that everything is working as expected (that is, I don't see any errors), but I can't verify any theory without the full YAML of the dependent resources or garbage collector logs at a higher level. The next time this situation happens, before applying the workaround, please gather the following data:

- controller logs with loglevel at least 2 (the level at which the garbage collector produces valuable output)
- full YAML for all the resources involved; in a case similar to the one described in comment 1, that means the deployment config, replication controllers, and pods

Resetting needinfo.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2018:1798
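The dependency rule discussed in the comments above (a DC cannot be removed while its RCs and pods still reference it) can be illustrated with a toy Go model. Object names like dc/myapp are made up, and this is only a simplified analogy for ownerReference-based garbage collection, not the real controller logic.

```go
package main

import "fmt"

// obj is a toy stand-in for an API object with an ownerReference.
type obj struct {
	name  string
	owner string // name of the owning object; "" means no owner
}

// deletable reports whether owner can be finalized: only when no live
// object still lists it as its owner.
func deletable(owner string, live []obj) bool {
	for _, o := range live {
		if o.owner == owner {
			return false
		}
	}
	return true
}

func main() {
	live := []obj{
		{name: "dc/myapp"},
		{name: "rc/myapp-1", owner: "dc/myapp"},
		{name: "pod/myapp-1-abcde", owner: "rc/myapp-1"},
	}
	fmt.Println(deletable("dc/myapp", live)) // false: an RC still references the DC

	// Once the dependents are gone, the DC can be removed.
	live = live[:1]
	fmt.Println(deletable("dc/myapp", live)) // true
}
```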