Bug 1559987 - Unable to delete deploymentconfig
Status: CLOSED ERRATA
Product: OpenShift Container Platform
Classification: Red Hat
Component: Master
Version: 3.7.0
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ---
Target Release: 3.7.z
Assigned To: Michal Fojtik
QA Contact: Wang Haoran
Keywords: Reopened
Depends On:
Blocks: 1267746
 
Reported: 2018-03-23 12:29 EDT by Robert Bost
Modified: 2018-07-27 09:50 EDT
CC List: 21 users

See Also:
Fixed In Version: v3.7.49-1
Doc Type: Bug Fix
Doc Text:
Cause: In some cases the shared informer caches were not initialized properly or failed to initialize. Consequence: Controllers such as the garbage collector got stuck waiting for the caches to be initialized. Fix: If the cache is stuck, do not wait for it to initialize; instead, forward the request directly to storage (etcd) to unblock the controllers. Result: Controllers can reach the resources without getting stuck waiting for the cache to initialize.
Story Points: ---
Clone Of:
Environment:
Last Closed: 2018-07-11 05:57:57 EDT
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


Attachments


External Trackers:
- Red Hat Knowledge Base (Solution) 3391401 (last updated 2018-03-23 20:07 EDT)
- Red Hat Product Errata RHBA-2018:1798 (last updated 2018-06-26 02:43 EDT)

Description Robert Bost 2018-03-23 12:29:20 EDT
Description of problem: 

Unable to delete the deploymentconfig resource. The deploymentconfig has a finalizer, but we are unable to find the resource blocking its deletion:

# oc get dc -o yaml NAME_OF_DC
apiVersion: v1
kind: DeploymentConfig
metadata:
  creationTimestamp: 2018-03-12T00:49:16Z
  deletionGracePeriodSeconds: 0
  deletionTimestamp: 2018-03-21T18:41:29Z
  finalizers:
  - foregroundDeletion
  generation: 29
  labels:
    app: firsttestgateway
  name: firsttestgateway
  namespace: first-dt
  resourceVersion: "125930908"
  selfLink: /oapi/v1/namespaces/first-dt/deploymentconfigs/firsttestgateway
  ...

Version-Release number of selected component (if applicable): atomic-openshift-3.7.23-1.git.0.8edc154.el7.x86_64


How reproducible: Reproducer steps unclear

Actual results: Running `oc delete dc/firsttestgateway` returned a success message, but running `oc get dc` afterwards still showed the deploymentconfig. The deploymentconfig hung around for days until we manually deleted the finalizer from the dc YAML and ran `oc delete` again.
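For reference, a minimal sketch of the manual workaround described above (removing the foregroundDeletion finalizer and deleting again), using the name and namespace from the YAML above; this is a last-resort workaround, not a general fix:

# oc patch dc/firsttestgateway -n first-dt --type merge -p '{"metadata":{"finalizers":null}}'
# oc delete dc/firsttestgateway -n first-dt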
Comment 1 Michal Fojtik 2018-03-26 04:22:32 EDT
Can you please provide a dump of the pods/replication controllers associated with this DC? Additionally, the API server and controller journals would be helpful for analysis.
Comment 2 Robert Bost 2018-03-26 11:40:15 EDT
We do not currently have those details but are requesting them now from another instance of the issue. Leaving needinfo set.
Comment 4 Maciej Szulik 2018-03-28 06:19:50 EDT
We would need to see controller logs from the time this removal was being invoked (at least for one hour after the initial oc delete invocation). It looks like there were some problems removing the dependent objects (either replication controllers or pods) that the DC owned. Until the dependents are properly removed, the DC itself won't be removed either. I'd like to investigate the logs to further confirm that theory and examine what might be causing this problem.
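As a rough illustration of what to check (assuming jq is available, and using the namespace and DC name from the description), the direct dependents that foreground deletion waits on can be listed by matching their ownerReferences against the DC:

# oc get rc,pods -n first-dt -o json | jq -r '.items[] | select(.metadata.ownerReferences[]?.name == "firsttestgateway") | "\(.kind)/\(.metadata.name)"'

Pods are normally owned by the replication controllers rather than by the DC itself, so the same filter can be rerun with a replication controller name to walk one level further down.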
Comment 6 Maciej Szulik 2018-04-05 03:44:17 EDT
I've reviewed the attached logs and unfortunately I can't figure out exactly what's going on. The logs suggest that everything is working as expected (by which I mean I don't see any errors), but I can't verify any theory without the full YAML of the dependent resources or garbage collector logs at a higher log level.

I'd suggest that the next time this situation happens, before applying the workaround, please gather the following data (a command sketch follows this list):

- controller logs with a loglevel of at least 2 (this is the level at which the garbage collector produces valuable output)
- full YAML for all the resources involved; in a case similar to the one described in comment 1, that means the deployment config, replication controllers, and pods.
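A rough sketch of that data gathering on an RPM-based OpenShift 3.7 master; the service name and sysconfig file assume the standard atomic-openshift installation, and the namespace is taken from the description:

- Set OPTIONS=--loglevel=2 in /etc/sysconfig/atomic-openshift-master-controllers and restart the service:
  # systemctl restart atomic-openshift-master-controllers
- After reproducing the failed delete, capture the controller journal:
  # journalctl -u atomic-openshift-master-controllers --since "1 hour ago" > controllers.log
- Collect full YAML for the involved resources:
  # oc get dc,rc,pods -n first-dt -o yaml > first-dt-resources.yaml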
Comment 29 Robert Bost 2018-05-21 09:44:47 EDT
Resetting needinfo
Comment 43 errata-xmlrpc 2018-06-26 02:43:43 EDT
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2018:1798
