Description of problem:
Unable to delete a deploymentconfig resource. The deploymentconfig carries a foregroundDeletion finalizer, but we cannot find the blocking resource:

# oc get dc -o yaml NAME_OF_DC
apiVersion: v1
kind: DeploymentConfig
metadata:
  creationTimestamp: 2018-03-12T00:49:16Z
  deletionGracePeriodSeconds: 0
  deletionTimestamp: 2018-03-21T18:41:29Z
  finalizers:
  - foregroundDeletion
  generation: 29
  labels:
    app: firsttestgateway
  name: firsttestgateway
  namespace: first-dt
  resourceVersion: "125930908"
  selfLink: /oapi/v1/namespaces/first-dt/deploymentconfigs/firsttestgateway
...

Version-Release number of selected component (if applicable):
atomic-openshift-3.7.23-1.git.0.8edc154.el7.x86_64

How reproducible:
Reproducer steps unclear.

Actual results:
Running `oc delete dc/firsttestgateway` returned a success message, but `oc get dc` still showed the deploymentconfig afterwards. The deploymentconfig hung around for days until we manually deleted the finalizer from the dc yaml and ran `oc delete` again.
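For context on why the DC lingers: with foreground deletion, the API server marks the object with a deletionTimestamp but keeps it until its finalizers list is empty. Below is a minimal sketch of that rule in plain Python (no cluster access; the dict mirrors the yaml above, and the final step is the manual workaround from this report, not a recommended fix):

```python
def is_fully_deleted(obj):
    """An object with a deletionTimestamp only disappears once every
    finalizer has been removed from it."""
    return obj.get("deletionTimestamp") is not None and not obj.get("finalizers")

# Mirrors the stuck deploymentconfig from the yaml dump above.
dc = {
    "name": "firsttestgateway",
    "deletionTimestamp": "2018-03-21T18:41:29Z",
    "finalizers": ["foregroundDeletion"],
}

assert not is_fully_deleted(dc)  # stuck: finalizer still present

# The manual workaround from the report: strip the finalizer, after
# which the pending deletion can complete.
dc["finalizers"].remove("foregroundDeletion")
assert is_fully_deleted(dc)
```

Normally the garbage collector, not a human, removes the foregroundDeletion finalizer once all dependents are gone; the workaround bypasses that check.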
Can you please provide a dump of the pods/replication controllers associated with this DC? Additionally, API server and controller journals would be helpful for analysis.
We do not currently have those details but are requesting them now from another instance of the issue. Leaving needinfo set.
We would need to see controller logs from the time this removal was being invoked (at least up to +1h after the initial oc delete invocation). It looks like there were some problems removing the dependent objects (either replication controllers or pods) the DC owned. Without the dependents being properly removed, the actual DC won't be removed either. I'd like to investigate the logs to further confirm that theory and examine what might be causing this problem.
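The theory above can be sketched as follows: with foreground deletion, the garbage collector only removes the owning DC after every dependent whose ownerReferences point at it with blockOwnerDeletion set has been deleted. This is a hedged illustration, not the controller's actual code; the UID and object names are made up:

```python
def blocking_dependents(owner_uid, objects):
    """Return the dependents that still block foreground deletion of
    the owner: anything whose ownerReferences name the owner's UID
    with blockOwnerDeletion set."""
    return [
        o for o in objects
        if any(ref.get("uid") == owner_uid and ref.get("blockOwnerDeletion")
               for ref in o.get("ownerReferences", []))
    ]

dc_uid = "dc-uid-1234"  # hypothetical UID for the deploymentconfig
rcs = [
    {"name": "firsttestgateway-1",
     "ownerReferences": [{"uid": dc_uid, "blockOwnerDeletion": True}]},
]

# While such a replication controller exists, the DC's
# foregroundDeletion finalizer stays in place.
assert blocking_dependents(dc_uid, rcs)

# Once all dependents are gone, the garbage collector can clear the
# finalizer and the DC is finally removed.
assert not blocking_dependents(dc_uid, [])
```

This is why a stuck dependent (an RC or pod that fails to delete) keeps the DC around indefinitely, matching the behavior reported in comment 0.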
I've reviewed the attached logs and unfortunately I can't figure out exactly what's going on. The logs suggest that everything is working as expected (by which I mean I don't see any errors), but I can't verify any theory without the full yaml of the dependent resources or garbage collector logs at a higher level. The next time this situation happens, before applying the workaround, please gather the following data:
- controller logs with loglevel 2 or higher (this is the level at which the garbage collector produces valuable output)
- full yamls for all the resources involved; in a case similar to the one described in comment 1, that means the deployment config, replication controllers, and pods.
Resetting needinfo
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2018:1798