Bug 1568607 - template instance controller tries to use garbage collection across namespaces
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Templates
Version: 3.10.0
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: medium
Target Milestone: ---
Target Release: 3.10.0
Assignee: Ben Parees
QA Contact: XiuJuan Wang
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2018-04-17 22:36 UTC by Ben Parees
Modified: 2018-07-30 19:13 UTC

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2018-07-30 19:13:03 UTC
Target Upstream Version:
Embargoed:
xiuwang: needinfo-


Attachments
crd related resources (4.54 KB, text/plain), 2018-05-21 05:36 UTC, XiuJuan Wang
FinalizerError (5.17 KB, text/plain), 2018-05-22 06:21 UTC, XiuJuan Wang


Links
System ID | Private | Priority | Status | Summary | Last Updated
Red Hat Product Errata RHBA-2018:1816 | 0 | None | None | None | 2018-07-30 19:13:21 UTC

Description Ben Parees 2018-04-17 22:36:07 UTC
The template instance controller sets ownerrefs on the objects it creates, pointing back to the templateinstance object.

Since the objects being created can:

1) be cluster scoped resources
2) be created in a different namespace from where the template instance was created

this results in objects that have dangling owner references.

For case (1), it's unclear what the GC will do with the object.
For case (2), the object will likely be GCed as soon as the GC concludes that the owner-referenced object doesn't exist (because it doesn't exist in the object's namespace).
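
The two cases above can be made concrete with a small sketch. It assumes that an owner reference carries no namespace field, so the owner is looked up in the dependent's own namespace (or at cluster scope for a cluster-scoped dependent). The `Obj` type and the `owner_ref_resolvable` helper are hypothetical names for illustration, not part of any OpenShift or Kubernetes API.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Obj:
    kind: str
    name: str
    namespace: Optional[str]  # None means cluster-scoped


def owner_ref_resolvable(dependent: Obj, owner: Obj) -> bool:
    """Return True if an ownerref from dependent to owner can be resolved.

    Assumption: the owner is searched for in the dependent's namespace,
    because the ownerref itself has no namespace field.
    """
    if owner.namespace is None:
        # A cluster-scoped owner is visible from any dependent.
        return True
    # A namespaced owner is only found by dependents in the same namespace.
    return dependent.namespace == owner.namespace


ti = Obj("TemplateInstance", "example", "project-a")
same_ns = Obj("ConfigMap", "cm", "project-a")
other_ns = Obj("ConfigMap", "cm", "project-b")           # case (2)
cluster = Obj("CustomResourceDefinition", "foos.example.com", None)  # case (1)

for dep in (same_ns, other_ns, cluster):
    print(dep.kind, dep.namespace, owner_ref_resolvable(dep, ti))
```

Only the same-namespace object ends up with an ownerref that resolves; the other two are the dangling references this bug is about.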

We need to decide how to deal w/ these scenarios.

Comment 1 Ben Parees 2018-04-18 12:31:31 UTC
My current thinking is to stop depending on GC and just iterate over and delete the objects when the template instance is deleted. However, that requires reconciliation logic that can determine whether we've missed a delete event, which is challenging to implement at best.

Comment 2 Jordan Liggitt 2018-04-18 14:47:14 UTC
How do you find all the objects created from the template? Are the resources (and UIDs) captured somewhere immutable? Aren't there cases like generateName that make it non-deterministic which objects were created?

Comment 3 Ben Parees 2018-04-18 15:07:15 UTC
well, we know the objects because the templateinstance object has a full copy of the template it instantiated, but you're right about generated names, we wouldn't be able to find those.

the next option would be to apply a uniquely generated label to all the objects the templateinstance controller generates (much like the ownerref today, but as a label instead) and then go find the objects later.
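
A minimal in-memory sketch of that label approach follows. The label key is the one reported later in this bug (comment 9); using a generated UUID as the label value, and the helper names `label_objects` and `find_owned`, are assumptions for illustration only. The point is that an object with a generated name is still found, because the lookup goes by label rather than by name.

```python
import uuid

OWNER_LABEL = "template.openshift.io/template-instance-owner"


def label_objects(objects, owner_id):
    """Stamp every object with the owner label before it is created."""
    for obj in objects:
        labels = obj.setdefault("metadata", {}).setdefault("labels", {})
        labels[OWNER_LABEL] = owner_id


def find_owned(cluster_objects, owner_id):
    """Return the objects whose owner label matches owner_id."""
    return [
        o for o in cluster_objects
        if o.get("metadata", {}).get("labels", {}).get(OWNER_LABEL) == owner_id
    ]


owner_id = str(uuid.uuid4())  # assumed label value, unique per templateinstance
created = [
    # Name as it might look after generateName; it cannot be predicted up front.
    {"kind": "ConfigMap", "metadata": {"name": "cm-x7k2p", "namespace": "project-b"}},
    # Cluster-scoped object: no namespace at all.
    {"kind": "CustomResourceDefinition", "metadata": {"name": "foos.example.com"}},
]
label_objects(created, owner_id)

unrelated = {"kind": "Secret", "metadata": {"name": "other", "namespace": "project-b"}}
owned = find_owned(created + [unrelated], owner_id)
```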

Comment 4 Ben Parees 2018-04-18 15:08:47 UTC
at which point, once we also add reconciliation to find orphaned objects that we missed the templateinstance delete event for, we've basically reinvented garbage collection. So the logical question would be: is there any chance of garbage collection being enhanced to allow cross-namespace ownerrefs and/or cluster-scoped objects to have ownerrefs to namespaced objects?

Comment 6 XiuJuan Wang 2018-05-09 08:05:02 UTC
Ben,
Could you give an example of how to create the templateinstance and related objects in different namespaces?

Comment 7 Ben Parees 2018-05-09 13:49:06 UTC
Create a template that uses parameterized namespace values like:

https://github.com/openshift/origin/blob/master/examples/prometheus/prometheus.yaml#L38

In particular, use two different parameters with two different values, so that one object gets put in the namespace based on the value of PARAM1 and the second object gets put in the namespace based on the value of PARAM2.

Register the template with the template service broker (put it in the openshift namespace) and then provision it using the service catalog/TSB. (Provisioning a template via the TSB will result in creating a templateinstance for the template.)

You should see:

1) object1 gets created in PARAM1's namespace
2) object2 gets created in PARAM2's namespace
3) when you de-provision the instance, object1+object2 get deleted (and the templateinstance object gets deleted)

The additional namespaces must exist and your user must have permission to create/delete objects in those namespaces.
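
The effect of the two namespace parameters can be sketched as follows. This is a simplified stand-in for template processing, not the actual implementation: it only handles the `${NAME}` string form, and the object names, parameter values, and the `substitute` helper are hypothetical.

```python
import re

# Two objects whose namespaces come from two different parameters.
template_objects = [
    {"kind": "ConfigMap", "metadata": {"name": "object1", "namespace": "${PARAM1}"}},
    {"kind": "ConfigMap", "metadata": {"name": "object2", "namespace": "${PARAM2}"}},
]


def substitute(value, params):
    """Recursively replace ${NAME} references in strings with params[NAME]."""
    if isinstance(value, dict):
        return {k: substitute(v, params) for k, v in value.items()}
    if isinstance(value, list):
        return [substitute(v, params) for v in value]
    if isinstance(value, str):
        # Raises KeyError if the template references an undefined parameter.
        return re.sub(r"\$\{(\w+)\}", lambda m: params[m.group(1)], value)
    return value


rendered = substitute(template_objects, {"PARAM1": "project-a", "PARAM2": "project-b"})
namespaces = [o["metadata"]["namespace"] for o in rendered]
print(namespaces)
```

With two different parameter values, the two objects land in two different namespaces, neither of which has to be the namespace holding the templateinstance.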

Comment 9 XiuJuan Wang 2018-05-18 08:01:05 UTC
Server https://ec2-***.compute-1.amazonaws.com:8443
openshift v3.10.0-0.47.0
kubernetes v1.10.0+b81c8f8


1) Cross-namespace resources are deprovisioned when the templateinstance is deprovisioned.
 a. Provision a clusterserviceclass that includes cross-namespace resources.
 b. All resources are labeled with template.openshift.io/template-instance-owner.
 c. Deprovision the templateinstance.
 d. The cross-namespace resources have been deleted.


2) A cluster-scoped resource (a CRD) can be provisioned,
but deprovisioning the serviceinstance and templateinstance fails, even though I destroyed the CRD manually.

 a. Add the cluster-admin policy:
    #oc adm policy add-cluster-role-to-user cluster-admin system:serviceaccount:openshift-infra:template-instance-controller
    #oc adm policy add-cluster-role-to-user cluster-admin common_user
 b. Create a template (dummy) that includes CRD resources under the openshift project.
 c. After the clusterserviceclass syncs, provision the dummy template.
 d. The serviceinstance and templateinstance status go to ready, and the CRD resource (which has the label template.openshift.io/template-instance-owner) is created.
 e. Deprovision the serviceinstance or templateinstance. Both resources still exist after deprovisioning, even though I destroyed the CRD manually.

Comment 10 Ben Parees 2018-05-18 14:03:23 UTC
how are you "Deprovisioning the templateinstance"?  That's not a thing you can do, you can only deprovision a serviceinstance.

please provide a yaml dump of the objects that are being left behind.

is deprovision reporting any error?

Comment 11 XiuJuan Wang 2018-05-21 05:35:09 UTC
I deprovisioned the templateinstance via the REST API [1], or deleted the templateinstance from the web console.
Neither works.

[1] https://github.com/openshift/origin/blob/master/pkg/templateservicebroker/servicebroker/test-scripts/deprovision.sh

Comment 12 XiuJuan Wang 2018-05-21 05:36:20 UTC
Created attachment 1439382 [details]
crd related resources

Comment 13 Ben Parees 2018-05-21 18:36:44 UTC
ok it looks like the CRD is not being deleted for some reason, so the templateinstance finalizer is blocking the deletion of the templateinstance.
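
A rough sketch of how a finalizer can block deletion in this way is shown below. It assumes, for illustration only, that the delete of the CRD is refused; the finalizer name, the `Forbidden` exception, and the `finalize` helper are hypothetical and do not reflect the controller's actual code.

```python
FINALIZER = "example.test/template-instance-cleanup"  # hypothetical name


class Forbidden(Exception):
    """Stands in for an API server refusing a delete request."""


def finalize(instance, owned, delete):
    """Delete all owned objects; drop the finalizer only if every delete succeeds.

    Returns the objects that could not be deleted. While that list is
    non-empty the finalizer stays in place, so the instance cannot go away.
    """
    failed = []
    for obj in owned:
        try:
            delete(obj)
        except Forbidden:
            failed.append(obj)
    if not failed:
        instance["metadata"]["finalizers"].remove(FINALIZER)
    return failed


def delete_without_crd_rights(obj):
    # Assumed failure mode: the caller may not delete CRDs.
    if obj["kind"] == "CustomResourceDefinition":
        raise Forbidden("cannot delete " + obj["kind"])


instance = {"metadata": {"name": "dummy", "finalizers": [FINALIZER]}}
owned = [{"kind": "CustomResourceDefinition"}, {"kind": "ConfigMap"}]

first_try = finalize(instance, owned, delete_without_crd_rights)
blocked = FINALIZER in instance["metadata"]["finalizers"]

# Once every delete is allowed, the finalizer is removed.
second_try = finalize(instance, owned, lambda obj: None)
```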

Can you get level 5 logs from the master/controller process that cover the time period when you did the deprovision?

Comment 14 XiuJuan Wang 2018-05-22 06:21:22 UTC
Created attachment 1439869 [details]
FinalizerError

Comment 15 XiuJuan Wang 2018-05-22 06:23:40 UTC
Before I provisioned the CRD resources, I had added these policies:
    #oc adm policy add-cluster-role-to-user cluster-admin system:serviceaccount:openshift-infra:template-instance-controller
    #oc adm policy add-cluster-role-to-user cluster-admin common_user

But the error in the log says there is no permission to delete the CRD.

Comment 16 Ben Parees 2018-05-22 14:26:06 UTC
oh, right, there's another SA involved now.

you need to also run:
oc adm policy add-cluster-role-to-user cluster-admin system:serviceaccount:openshift-infra:template-instance-finalizer-controller

to grant the finalizer permission to clean up the CRD.

sorry about that.

Comment 17 XiuJuan Wang 2018-05-23 08:38:51 UTC
After adding 'oc adm policy add-cluster-role-to-user cluster-admin system:serviceaccount:openshift-infra:template-instance-finalizer-controller',
I could deprovision the CRD serviceinstance, templateinstance, and CRD resources.

Moving this bug to verified.

openshift v3.10.0-0.50.0
kubernetes v1.10.0+b81c8f8

Comment 19 errata-xmlrpc 2018-07-30 19:13:03 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2018:1816

