Bug 1550418 - [ASB] Zombie project issue - while deleting a project that contains deprovision failed instance
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Service Broker
Version: 3.9.0
Hardware: Unspecified
OS: Unspecified
Target Milestone: ---
Target Release: 3.10.0
Assignee: Erik Nelson
QA Contact: Jian Zhang
Depends On: 1566924
Reported: 2018-03-01 08:50 UTC by Jian Zhang
Modified: 2018-07-30 19:10 UTC

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Last Closed: 2018-07-30 19:10:04 UTC
Target Upstream Version:


System                   ID               Priority   Status   Summary   Last Updated
Red Hat Product Errata   RHBA-2018:1816   None       None     None      2018-07-30 19:10:30 UTC

Description Jian Zhang 2018-03-01 08:50:05 UTC
Description of problem:

Reference bug: https://bugzilla.redhat.com/show_bug.cgi?id=1548311 (a similar issue).
Zombie project issue: a project that contains a service instance whose deprovision failed cannot be deleted.
Error message:
[2018-03-01T08:25:05.968Z] [ERROR] - Error occurred getting service instance [ 63b3d030-d91e-4dbe-902e-ca593e8597d5 ] after deprovision job:

Version-Release number of selected component (if applicable):
The ASB version: 1.1.15
[root@host-172-16-120-67 ~]# docker run --rm --entrypoint=asbd registry.reg-aws.openshift.com:443/openshift3/ose-ansible-service-broker:v3.9.1 --version

How reproducible:

Steps to Reproduce:
1, Configure the broker with the registry below, which hosts an example APB whose deprovision fails:

registry:
  - type: dockerhub
    name: dh
    url: registry.hub.docker.com
    org: zjianbjz
    tag: latest
    white_list:
      - ".*-apb$"

2, Provision the "Hello Test" APB in a project called "test", selecting the "faildeprovision" plan in the web UI.
3, Provisioning succeeds.
4, Deprovision the instance (this fails, as expected), then delete the "test" project.
5, Check the project status.
Actual results: 
The serviceinstance and the "test" project cannot be deleted; the project stays in Terminating:

[root@host-172-16-120-87 ~]# oc get ns
NAME                                STATUS        AGE
default                             Active        5h
dh-hello-test-apb-depr-lsn72        Active        35m
install-test                        Active        5h
kube-public                         Active        5h
kube-service-catalog                Active        5h
kube-system                         Active        5h
logging                             Active        5h
management-infra                    Active        5h
openshift                           Active        5h
openshift-ansible-service-broker    Active        5h
openshift-infra                     Active        5h
openshift-node                      Active        5h
openshift-template-service-broker   Active        5h
openshift-web-console               Active        5h
test                                Terminating   41m

[root@host-172-16-120-87 ~]# oc get serviceinstance -n test
NAME                      AGE
dh-hello-test-apb-q97lw   21m

Expected results:
The serviceinstance and the "test" project should be deleted successfully.

Additional info:
The ASB logs:
[2018-03-01T08:25:05.953Z] [INFO] - ASYNC deprovision in progress
[2018-03-01T08:25:05.954Z] [DEBUG] - skipping deprovision and sending complete msg to channel
[2018-03-01T08:25:05.954Z] [DEBUG] - received deprovision message from buffer
- - [01/Mar/2018:08:25:05 +0000] "DELETE /ansible-service-broker/v2/service_instances/63b3d030-d91e-4dbe-902e-ca593e8597d5?accepts_incomplete=true&plan_id=43d3e23d214c26dbebc0879e44425db4&service_id=03b69500305d9859bb9440d9f9023784 HTTP/1.1" 202 58
[2018-03-01T08:25:05.968Z] [ERROR] - Error occurred getting service instance [ 63b3d030-d91e-4dbe-902e-ca593e8597d5 ] after deprovision job:
[2018-03-01T08:25:05.972Z] [WARNING] - Broker configured to *NOT* launch and run APB unbind
[2018-03-01T08:25:05.972Z] [DEBUG] - Dao::DeleteBindInstance -> [ 61d0749b-19d4-4eb4-911d-099ad9c8c04c ]
[2018-03-01T08:25:05.986Z] [INFO] - Could not find a service instance in dao - 100: Key not found (/service_instance/63b3d030-d91e-4dbe-902e-ca593e8597d5) [722]
- - [01/Mar/2018:08:25:05 +0000] "DELETE /ansible-service-broker/v2/service_instances/63b3d030-d91e-4dbe-902e-ca593e8597d5?accepts_incomplete=true&plan_id=43d3e23d214c26dbebc0879e44425db4&service_id=03b69500305d9859bb9440d9f9023784 HTTP/1.1" 410 3
[2018-03-01T08:25:06.189Z] [DEBUG] - service_id: 03b69500305d9859bb9440d9f9023784
[2018-03-01T08:25:06.189Z] [DEBUG] - plan_id: 43d3e23d214c26dbebc0879e44425db4
[2018-03-01T08:25:06.189Z] [DEBUG] - operation:  96130bde-6cfa-4008-b4d0-3599b3a41c16
[2018-03-01T08:25:06.189Z] [DEBUG] - state: succeeded
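
To see how the broker answered each deprovision request, the DELETE entries can be pulled out of the log. This is only a minimal sketch: it assumes the log was saved to a file (the name asb.log is hypothetical) and that the access-log entries look like the ones above. The function name delete_statuses is illustrative and not part of the broker.

```shell
# Read an ASB log on stdin and print "<instance-id> <http-status>"
# for every DELETE request against /service_instances/<id>.
# Lines that are not such requests are ignored.
delete_statuses() {
  sed -n 's|.*"DELETE [^ ]*/service_instances/\([0-9a-f-]*\)[^"]*" \([0-9][0-9][0-9]\) .*|\1 \2|p'
}

# Usage (asb.log is a hypothetical file holding the broker log):
#   delete_statuses < asb.log
```

For the log above, this prints instance 63b3d030-d91e-4dbe-902e-ca593e8597d5 twice, first with status 202 and then with status 410.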

Comment 1 John Matthews 2018-03-02 17:29:45 UTC
Aligning to 3.10.0

Below issue is related:

Comment 2 Ryan Hallisey 2018-03-02 22:03:13 UTC
workaround: for i in $(oc get projects  | grep Terminating| awk '{print $1}'); do echo $i; oc get serviceinstance -n $i -o yaml | sed "/kubernetes-incubator/d"| oc apply -f - ; done
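
The one-liner is easier to review when split into steps. The following is a sketch of the same workaround, not an official procedure. The function names are made up for illustration, and it assumes oc is logged in with permission to edit serviceinstances in the affected projects. As far as I understand it, deleting the finalizer line only stops the catalog from holding the object until the broker confirms the deprovision; it does not clean up anything the APB created.

```shell
# Print the first column (the project name) of every line on stdin
# that contains "Terminating", as in `oc get projects` output.
terminating_names() {
  awk '/Terminating/ {print $1}'
}

# Drop every line that mentions the service-catalog finalizer
# from the YAML read on stdin.
strip_catalog_finalizer() {
  sed '/kubernetes-incubator/d'
}

# For each Terminating project, re-apply its serviceinstances
# without the finalizer so that deletion can complete.
release_terminating_projects() {
  for project in $(oc get projects | terminating_names); do
    echo "$project"
    oc get serviceinstance -n "$project" -o yaml \
      | strip_catalog_finalizer \
      | oc apply -f -
  done
}
```

After sourcing the file, run release_terminating_projects. The two filter functions only read stdin, so they can be checked against saved oc output before touching a cluster.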

Comment 3 Erik Nelson 2018-05-01 15:30:52 UTC
There have likely been enough changes in both the catalog and the broker since this bz was filed that the problem does not exist anymore for a 3.10 release.

Can you please retest this with the latest catalog (v0.1.16) and the latest broker (1.2.8-1)?

Comment 4 Jian Zhang 2018-05-02 07:35:49 UTC

OK, I will test it later. Changed status to "MODIFIED" since the latest version of the service catalog is "v3.10.0-0.31.0;Upstream:v0.1.13". Also, for version 1.2.8 of the ASB, bug 1566924 blocks testing as well.

Comment 5 Erik Nelson 2018-05-09 18:21:30 UTC
So while debugging a related issue, tracked on Trello (https://trello.com/c/Kb3CVqkH), I discovered a broker bug that *may* have prevented the zombie project referenced in this bz from being deleted. I patched the broker and confirmed that the following PR ensures the broker responds correctly during failed async deprovisions. I'm also able to fully delete the project that the failed service was deployed to; no ServiceInstance resources are left behind.


Comment 7 Jian Zhang 2018-05-16 08:28:56 UTC
Verified.

The target namespace that the failed serviceinstance was deployed to can be deleted successfully.

The ASB version: 1.2.11
Service catalog: v3.10.0-0.46.0;Upstream:v0.1.18

Comment 9 errata-xmlrpc 2018-07-30 19:10:04 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

