Bug 1583064

Summary: Should not launch multi unbind sandboxes frequently while unbind failed
Product: OpenShift Container Platform
Reporter: Zhang Cheng <chezhang>
Component: Service Broker
Assignee: Jesus M. Rodriguez <jesusr>
Status: CLOSED ERRATA
QA Contact: Zhang Cheng <chezhang>
Severity: medium
Docs Contact:
Priority: medium
Version: 3.10.0
CC: aos-bugs, jiazha, jmatthew, zhsun, zitang
Target Milestone: ---
Target Release: 3.10.0
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Cause: The Service Catalog changed how it handles failed bind and unbind actions: it now retries the action repeatedly for a long period of time. APBs that are missing a bind or unbind action cause bind or unbind requests to fail.
Consequence: Because the Service Catalog detects a failure, it repeatedly calls the Ansible Service Broker, which creates a new project for each APB it launches in response to the action.
Fix: The Ansible Service Broker now detects that the action is missing, marks the job as succeeded, and returns an error string in the message.
Result: Only one namespace is created for a missing bind or unbind action playbook. If the bind or unbind playbook *exists* and fails, this fix does not change the behavior: we expect that to fail, requiring APB authors to fix their APBs.
Story Points: ---
Clone Of:
Environment:
Last Closed: 2018-07-30 19:16:18 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description Zhang Cheng 2018-05-28 07:59:08 UTC
Description of problem: 
The broker should not repeatedly launch multiple unbind sandboxes when an unbind fails.


service-catalog and asb image versions in use:
# service-catalog --version
v3.10.0-0.53.0;Upstream:v0.1.19
# asbd --version
1.2.14


How reproducible:
Always


Steps to Reproduce:
1. Set launch_apb_on_bind: true in the broker ConfigMap, roll out the asb pod, and sync with the service catalog
2. Provision a postgresql APB from the web console
3. Create a servicebinding from the web console
4. Delete the servicebinding from the web console


Actual results:  
4. Multiple unbind sandboxes are launched in rapid succession:
rh-postgresql-apb-unbi-25kcp        Active    25m
rh-postgresql-apb-unbi-28nz8        Active    25m
rh-postgresql-apb-unbi-2gf8l        Active    24m
rh-postgresql-apb-unbi-2rzzn        Active    26m
rh-postgresql-apb-unbi-427cf        Active    27m
rh-postgresql-apb-unbi-42qmm        Active    22m
rh-postgresql-apb-unbi-4d5l6        Active    20m
rh-postgresql-apb-unbi-4nh27        Active    24m
rh-postgresql-apb-unbi-4tfnp        Active    27m
rh-postgresql-apb-unbi-57s29        Active    26m
rh-postgresql-apb-unbi-589pr        Active    24m
rh-postgresql-apb-unbi-59s6v        Active    21m
rh-postgresql-apb-unbi-5b2qh        Active    26m
rh-postgresql-apb-unbi-5d8fz        Active    20m
rh-postgresql-apb-unbi-64lkg        Active    24m
rh-postgresql-apb-unbi-6jcns        Active    23m
rh-postgresql-apb-unbi-6mftd        Active    22m


Expected results: 
4. Multiple unbind sandboxes should not be launched repeatedly


Additional info:
None

Comment 1 Jesus M. Rodriguez 2018-05-30 20:28:40 UTC
If the unbind fails, we really want to let them know it failed, because it
is probably something the APB developer needs to fix. But if the action
is non-existent, which is the case with this bug, we really need to prevent
the multiple sandboxes.
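
The distinction drawn in this comment (a missing action versus an action that exists and fails) can be illustrated with a small sketch. This is not the broker's actual code, which is written in Go; the names JobState, JobResult, ActionNotFound and finish_job are hypothetical and exist only for illustration. It assumes the broker can tell "no playbook for this action" apart from "playbook ran and failed".

```python
from dataclasses import dataclass
from enum import Enum


class JobState(Enum):
    SUCCEEDED = "succeeded"
    FAILED = "failed"


@dataclass
class JobResult:
    state: JobState
    message: str = ""


class ActionNotFound(Exception):
    """Hypothetical error: the APB ships no playbook for the requested action."""


def finish_job(action, run_action):
    """Run an APB action and decide which state is reported back to the caller."""
    try:
        run_action()
    except ActionNotFound:
        # Missing playbook: retrying cannot help, so report success and carry
        # the explanation in the message instead of triggering more retries.
        return JobResult(JobState.SUCCEEDED, f"{action} action not found in APB")
    except Exception as err:
        # The playbook exists but failed: keep reporting a failure so the
        # APB author notices and fixes it.
        return JobResult(JobState.FAILED, str(err))
    return JobResult(JobState.SUCCEEDED)
```

With this shape, a caller that retries on failure would stop after one attempt for a missing action, while a genuinely failing playbook would still be reported as failed.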

Comment 3 Jesus M. Rodriguez 2018-05-30 20:32:13 UTC
$ oc get projects
NAME                                DISPLAY NAME   STATUS
ansible-service-broker                             Active
blog-project                                       Active
default                                            Active
dh-postgresql-apb-bind-zc82n                       Active
dh-postgresql-apb-prov-jq97t                       Active
dh-postgresql-apb-unbi-vpn2r                       Active

Comment 4 Jesus M. Rodriguez 2018-05-30 20:33:43 UTC
Looking at the asb log BEFORE this fix:

$ grep UNBINDING multi-unbind-asb.log 
time="2018-05-30T17:31:07Z" level=info msg="                       UNBINDING                            "
time="2018-05-30T17:31:11Z" level=info msg="                       UNBINDING                            "
time="2018-05-30T17:31:16Z" level=info msg="                       UNBINDING                            "
time="2018-05-30T17:31:21Z" level=info msg="                       UNBINDING                            "
time="2018-05-30T17:31:26Z" level=info msg="                       UNBINDING                            "
time="2018-05-30T17:31:31Z" level=info msg="                       UNBINDING                            "
time="2018-05-30T17:31:40Z" level=info msg="                       UNBINDING                            "
time="2018-05-30T17:31:44Z" level=info msg="                       UNBINDING                            "
time="2018-05-30T17:31:49Z" level=info msg="                       UNBINDING                            "
time="2018-05-30T17:31:55Z" level=info msg="                       UNBINDING                            "
time="2018-05-30T17:31:59Z" level=info msg="                       UNBINDING                            "
time="2018-05-30T17:32:04Z" level=info msg="                       UNBINDING                            "
time="2018-05-30T17:32:08Z" level=info msg="                       UNBINDING                            "
time="2018-05-30T17:32:13Z" level=info msg="                       UNBINDING                            "
time="2018-05-30T17:32:17Z" level=info msg="                       UNBINDING                            "
time="2018-05-30T17:32:23Z" level=info msg="                       UNBINDING                            "
time="2018-05-30T17:32:29Z" level=info msg="                       UNBINDING                            "
time="2018-05-30T17:32:33Z" level=info msg="                       UNBINDING                            "
time="2018-05-30T17:32:38Z" level=info msg="                       UNBINDING                            "
time="2018-05-30T17:32:47Z" level=info msg="                       UNBINDING                            "
time="2018-05-30T17:32:51Z" level=info msg="                       UNBINDING                            "
time="2018-05-30T17:32:56Z" level=info msg="                       UNBINDING                            "
time="2018-05-30T17:33:03Z" level=info msg="                       UNBINDING                            "
time="2018-05-30T17:33:08Z" level=info msg="                       UNBINDING                            "
time="2018-05-30T17:33:12Z" level=info msg="                       UNBINDING                            "


Looking at the asb log AFTER this fix:
$ grep UNBINDING asb.log
time="2018-05-30T20:03:10Z" level=info msg="                       UNBINDING                            "
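
To quantify the retry rate from a log like the ones above, a rough sketch along these lines could be used. It assumes the logrus-style time="..." field shown in the grep output; the helper name retry_gaps is made up for this example.

```python
import re
from datetime import datetime

# Matches the time="..." field at the start of each broker log line.
TIME_RE = re.compile(r'time="([^"]+)"')


def retry_gaps(log_text):
    """Return the seconds between consecutive UNBINDING log lines."""
    stamps = []
    for line in log_text.splitlines():
        if "UNBINDING" not in line:
            continue
        match = TIME_RE.search(line)
        if match:
            stamps.append(datetime.strptime(match.group(1), "%Y-%m-%dT%H:%M:%SZ"))
    return [(b - a).total_seconds() for a, b in zip(stamps, stamps[1:])]


sample = '''time="2018-05-30T17:31:07Z" level=info msg="                       UNBINDING                            "
time="2018-05-30T17:31:11Z" level=info msg="                       UNBINDING                            "
time="2018-05-30T17:31:16Z" level=info msg="                       UNBINDING                            "'''

print(retry_gaps(sample))  # [4.0, 5.0]
```

A log with a single UNBINDING line, as in the AFTER case, yields an empty list because there are no retries to measure.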

Comment 5 David Zager 2018-05-31 03:15:26 UTC
https://errata.devel.redhat.com/advisory/33505 updated with the following builds:

openshift-enterprise-apb-tools-v3.10.0-0.32.0.3
openshift-enterprise-asb-container-v3.10.0-0.54.0.1
openshift-enterprise-mariadb-apb-v3.10.0-0.51.0.1
openshift-enterprise-mediawiki-apb-v3.10.0-0.54.0.1
openshift-enterprise-mediawiki-container-v3.10.0-0.54.0.0
openshift-enterprise-mysql-apb-v3.10.0-0.54.0.1
openshift-enterprise-postgresql-apb-v3.10.0-0.54.0.1

Comment 6 Zhang Cheng 2018-05-31 07:20:43 UTC
Verified and passed with asb:1.2.16

Following the original steps, the unbind sandbox is triggered automatically when the bind fails, and multiple sandboxes are no longer launched. LGTM.

# oc get ns | grep postgre
rh-postgresql-apb-bind-ml4qw        Active    11m
rh-postgresql-apb-prov-k25kz        Active    13m
rh-postgresql-apb-unbi-6pzvb        Active    10m

# oc logs asb-2-89lg4 | grep UNBINDING
time="2018-05-31T07:09:17Z" level=info msg="                       UNBINDING                            "
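
A check like the one above can be automated by counting sandbox namespaces per APB action. The sketch below is only an illustration: the function name sandbox_counts is hypothetical, and it assumes sandbox names end in "-" plus a 5-character random suffix, as in the listings in this report.

```python
from collections import Counter


def sandbox_counts(ns_output):
    """Count namespaces per name prefix, given `oc get ns`-style text."""
    counts = Counter()
    for line in ns_output.splitlines():
        fields = line.split()
        if not fields:
            continue
        prefix, sep, suffix = fields[0].rpartition("-")
        # Assumption: a sandbox name is "<apb>-<action>-<5 random chars>".
        if sep and len(suffix) == 5:
            counts[prefix] += 1
    return counts


sample = """rh-postgresql-apb-bind-ml4qw        Active    11m
rh-postgresql-apb-prov-k25kz        Active    13m
rh-postgresql-apb-unbi-6pzvb        Active    10m"""

print(sandbox_counts(sample)["rh-postgresql-apb-unbi"])  # 1
```

A count greater than one for an "-unbi" prefix would indicate that the repeated sandbox launches described in this bug are still occurring.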

Comment 8 errata-xmlrpc 2018-07-30 19:16:18 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2018:1816