Bug 1570603 - Should not launch multi depr sandboxes while provision failed
Summary: Should not launch multi depr sandboxes while provision failed
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Service Broker
Version: 3.10.0
Hardware: Unspecified
OS: Unspecified
medium
high
Target Milestone: ---
: 3.10.0
Assignee: David Zager
QA Contact: Zhang Cheng
URL:
Whiteboard:
Depends On:
Blocks:
TreeView+ depends on / blocked
 
Reported: 2018-04-23 10:20 UTC by Zhang Cheng
Modified: 2018-07-30 19:14 UTC (History)
7 users (show)

Fixed In Version:
Doc Type: No Doc Update
Doc Text:
undefined
Clone Of:
Environment:
Last Closed: 2018-07-30 19:13:42 UTC
Target Upstream Version:


Attachments (Terms of Use)


Links
System ID Priority Status Summary Last Updated
Red Hat Product Errata RHBA-2018:1816 None None None 2018-07-30 19:14:17 UTC

Description Zhang Cheng 2018-04-23 10:20:11 UTC
Description of problem: 
Should not lanuch so many depr sandboxs while provision failed.


service-catalog & asb image using images from brew registry:
# service-catalog --version
v3.10.0-0.27.0;Upstream:v0.1.13
# asbd --version
1.2.6

How reproducible:
Always


Steps to Reproduce:
1. Provison a mediawiki apb with same username and passwd (provision should failed by expect)
2. Look at namespace by `oc get ns | grep mediawiki`

Actual results:
1. Provison a mediawiki apb failed by expect.
2. Auto lanuch many depr sandboxs
# oc get ns | grep mediawiki
rh-mediawiki-apb-depr-4kf9z         Active    5m
rh-mediawiki-apb-depr-4vzlq         Active    2s
rh-mediawiki-apb-depr-5z5sr         Active    3m
rh-mediawiki-apb-depr-7kjsl         Active    2m
rh-mediawiki-apb-depr-b4p9t         Active    6m
rh-mediawiki-apb-depr-bms7h         Active    2m
rh-mediawiki-apb-depr-d74sl         Active    5m
rh-mediawiki-apb-depr-dvqx4         Active    1m
rh-mediawiki-apb-depr-fhdv4         Active    42s
rh-mediawiki-apb-depr-fkhr7         Active    4m
rh-mediawiki-apb-depr-fq4cc         Active    2m
rh-mediawiki-apb-depr-hh5lj         Active    1m
rh-mediawiki-apb-depr-hqj2k         Active    5m
rh-mediawiki-apb-depr-j5jn5         Active    4m
rh-mediawiki-apb-depr-j85sm         Active    16s
rh-mediawiki-apb-depr-kfrnc         Active    3m
rh-mediawiki-apb-depr-kwnf8         Active    5m
rh-mediawiki-apb-depr-mf5m9         Active    4m
rh-mediawiki-apb-depr-mxsff         Active    29s
rh-mediawiki-apb-depr-n7qqx         Active    2m
rh-mediawiki-apb-depr-p5s28         Active    3m
rh-mediawiki-apb-depr-prvf7         Active    56s
rh-mediawiki-apb-depr-rc8sv         Active    1m
rh-mediawiki-apb-depr-rgj7v         Active    6m
rh-mediawiki-apb-depr-rqz62         Active    3m
rh-mediawiki-apb-depr-rt7dk         Active    4m
rh-mediawiki-apb-depr-z8wfv         Active    4m
rh-mediawiki-apb-depr-zhmsk         Active    2m
rh-mediawiki-apb-depr-zqzbf         Active    1m
rh-mediawiki-apb-prov-r7jlc         Active    6m

Expected results:
2. should not launch any depr sandbox.


Addition info: 
Attach asb logs:
time="2018-04-23T10:08:44Z" level=info msg="============================================================"
time="2018-04-23T10:08:44Z" level=info msg="                       PROVISIONING                         "
time="2018-04-23T10:08:44Z" level=info msg="============================================================"
time="2018-04-23T10:08:44Z" level=info msg="Spec.ID: 03b69500305d9859bb9440d9f9023784"
time="2018-04-23T10:08:44Z" level=info msg="Spec.Name: rh-mediawiki-apb"
time="2018-04-23T10:08:44Z" level=info msg="Spec.Image: registry.access.stage.redhat.com/openshift3/mediawiki-apb:v3.10"
time="2018-04-23T10:08:44Z" level=info msg="Spec.Description: Mediawiki apb implementation"
time="2018-04-23T10:08:44Z" level=info msg="============================================================"
time="2018-04-23T10:08:44Z" level=info msg="Checking if namespace %s exists.cover"
time="2018-04-23T10:08:44Z" level=info msg="Creating RoleBinding apb-4c98884c-4d97-4342-ac56-ca61ce224205"
time="2018-04-23T10:08:44Z" level=info msg="Creating RoleBinding apb-4c98884c-4d97-4342-ac56-ca61ce224205"
time="2018-04-23T10:08:44Z" level=info msg="Successfully created apb sandbox: [ %s ], with %s permissions in namespace %sapb-4c98884c-4d97-4342-ac56-ca61ce224205editrh-mediawiki-apb-prov-r7jlc"
time="2018-04-23T10:08:44Z" level=info msg="Running post create sandbox fuctions if defined."
time="2018-04-23T10:08:44Z" level=info msg="Proxy configuration present. Applying to APB before execution:"
time="2018-04-23T10:08:44Z" level=info msg="HTTP_PROXY=\"\""
time="2018-04-23T10:08:44Z" level=info msg="HTTPS_PROXY=\"\""
time="2018-04-23T10:08:44Z" level=info msg="NO_PROXY=\"\""
time="2018-04-23T10:08:44Z" level=info msg="Creating pod \"apb-4c98884c-4d97-4342-ac56-ca61ce224205\" in the rh-mediawiki-apb-prov-r7jlc namespace"
[2018-04-23T10:08:44.898Z] [INFO] - All Jobs for instance: 4c8d3c6b-46de-11e8-9590-0a580a80000b in state:  in progress - 
[]apb.JobState{apb.JobState{Token:"7e728f5c-15a7-4716-8869-d6a45d15a875", State:"in progress", Podname:"", Method:"provision", Error:"", Description:"action started"}}
[2018-04-23T10:08:44.898Z] [INFO] - Provision requested for instance 4c8d3c6b-46de-11e8-9590-0a580a80000b, but job is already in progress
[2018-04-23T10:08:44.898Z] [ERROR] - provision error provision in progress
10.128.0.1 - - [23/Apr/2018:10:08:44 +0000] "PUT /ansible-service-broker/v2/service_instances/4c8d3c6b-46de-11e8-9590-0a580a80000b?accepts_incomplete=true HTTP/1.1" 202 58
10.128.0.1 - - [23/Apr/2018:10:08:45 +0000] "GET /ansible-service-broker/v2/service_instances/4c8d3c6b-46de-11e8-9590-0a580a80000b/last_operation?operation=7e728f5c-15a7-4716-8869-d6a45d15a875&plan_id=43d3e23d214c26dbebc0879e44425db4&service_id=03b69500305d9859bb9440d9f9023784 HTTP/1.1" 200 64
10.128.0.1 - - [23/Apr/2018:10:08:45 +0000] "GET /ansible-service-broker/v2/service_instances/4c8d3c6b-46de-11e8-9590-0a580a80000b/last_operation?operation=7e728f5c-15a7-4716-8869-d6a45d15a875&plan_id=43d3e23d214c26dbebc0879e44425db4&service_id=03b69500305d9859bb9440d9f9023784 HTTP/1.1" 200 64
10.128.0.1 - - [23/Apr/2018:10:08:49 +0000] "GET /ansible-service-broker/v2/service_instances/4c8d3c6b-46de-11e8-9590-0a580a80000b/last_operation?operation=7e728f5c-15a7-4716-8869-d6a45d15a875&plan_id=43d3e23d214c26dbebc0879e44425db4&service_id=03b69500305d9859bb9440d9f9023784 HTTP/1.1" 200 64
time="2018-04-23T10:08:50Z" level=error msg="Provision or Update action failed - Pod [ apb-4c98884c-4d97-4342-ac56-ca61ce224205 ] failed with exit code [2]"
time="2018-04-23T10:08:50Z" level=info msg="Destroying APB sandbox..."
time="2018-04-23T10:08:50Z" level=info msg="Successfully deleted rolebinding apb-4c98884c-4d97-4342-ac56-ca61ce224205, namespace rh-mediawiki-apb-prov-r7jlc"
time="2018-04-23T10:08:50Z" level=info msg="Successfully deleted rolebinding apb-4c98884c-4d97-4342-ac56-ca61ce224205, namespace cover"
[2018-04-23T10:08:50.709Z] [ERROR] - broker::provision error occurred. Pod [ apb-4c98884c-4d97-4342-ac56-ca61ce224205 ] failed with exit code [2]
time="2018-04-23T10:08:50Z" level=error msg="Provision APB error: Pod [ apb-4c98884c-4d97-4342-ac56-ca61ce224205 ] failed with exit code [2]"
10.128.0.1 - - [23/Apr/2018:10:08:57 +0000] "GET /ansible-service-broker/v2/service_instances/4c8d3c6b-46de-11e8-9590-0a580a80000b/last_operation?operation=7e728f5c-15a7-4716-8869-d6a45d15a875&plan_id=43d3e23d214c26dbebc0879e44425db4&service_id=03b69500305d9859bb9440d9f9023784 HTTP/1.1" 200 129
[2018-04-23T10:08:57.485Z] [INFO] - All Jobs for instance: 4c8d3c6b-46de-11e8-9590-0a580a80000b in state:  in progress - 
[]apb.JobState{}
time="2018-04-23T10:08:57Z" level=error msg="Unable to load secret '4c8d3c6b-46de-11e8-9590-0a580a80000b' from namespace 'openshift-ansible-service-broker'"
time="2018-04-23T10:08:57Z" level=error msg="unable to get secret data for 4c8d3c6b-46de-11e8-9590-0a580a80000b, in namespace: openshift-ansible-service-broker"
[2018-04-23T10:08:57.488Z] [INFO] - ASYNC deprovision in progress
10.128.0.1 - - [23/Apr/2018:10:08:57 +0000] "DELETE /ansible-service-broker/v2/service_instances/4c8d3c6b-46de-11e8-9590-0a580a80000b?accepts_incomplete=true&plan_id=43d3e23d214c26dbebc0879e44425db4&service_id=03b69500305d9859bb9440d9f9023784 HTTP/1.1" 202 58
time="2018-04-23T10:08:57Z" level=info msg="============================================================"
time="2018-04-23T10:08:57Z" level=info msg="                      DEPROVISIONING                        "
time="2018-04-23T10:08:57Z" level=info msg="============================================================"
time="2018-04-23T10:08:57Z" level=info msg="ServiceInstance.Id: 03b69500305d9859bb9440d9f9023784"
time="2018-04-23T10:08:57Z" level=info msg="ServiceInstance.Name: rh-mediawiki-apb"
time="2018-04-23T10:08:57Z" level=info msg="ServiceInstance.Image: registry.access.stage.redhat.com/openshift3/mediawiki-apb:v3.10"
time="2018-04-23T10:08:57Z" level=info msg="ServiceInstance.Description: Mediawiki apb implementation"
time="2018-04-23T10:08:57Z" level=info msg="============================================================"
time="2018-04-23T10:08:57Z" level=info msg="Creating RoleBinding apb-a80e3b0e-ac46-419e-a12b-5c265cbd9626"
time="2018-04-23T10:08:57Z" level=info msg="Creating RoleBinding apb-a80e3b0e-ac46-419e-a12b-5c265cbd9626"
time="2018-04-23T10:08:57Z" level=info msg="Successfully created apb sandbox: [ %s ], with %s permissions in namespace %sapb-a80e3b0e-ac46-419e-a12b-5c265cbd9626editrh-mediawiki-apb-depr-rgj7v"
time="2018-04-23T10:08:57Z" level=info msg="Running post create sandbox fuctions if defined."
time="2018-04-23T10:08:57Z" level=info msg="Proxy configuration present. Applying to APB before execution:"
time="2018-04-23T10:08:57Z" level=info msg="HTTP_PROXY=\"\""
time="2018-04-23T10:08:57Z" level=info msg="HTTPS_PROXY=\"\""
time="2018-04-23T10:08:57Z" level=info msg="NO_PROXY=\"\""
time="2018-04-23T10:08:57Z" level=info msg="Creating pod \"apb-a80e3b0e-ac46-419e-a12b-5c265cbd9626\" in the rh-mediawiki-apb-depr-rgj7v namespace"
[2018-04-23T10:09:06.482Z] [ERROR] - broker::deprovision error occurred. Pod [ apb-a80e3b0e-ac46-419e-a12b-5c265cbd9626 ] failed with exit code [2]
time="2018-04-23T10:09:06Z" level=error msg="Deprovision action failed - Pod [ apb-a80e3b0e-ac46-419e-a12b-5c265cbd9626 ] failed with exit code [2]"
time="2018-04-23T10:09:06Z" level=info msg="Destroying APB sandbox..."
time="2018-04-23T10:09:06Z" level=info msg="Successfully deleted rolebinding apb-a80e3b0e-ac46-419e-a12b-5c265cbd9626, namespace rh-mediawiki-apb-depr-rgj7v"
time="2018-04-23T10:09:06Z" level=info msg="Successfully deleted rolebinding apb-a80e3b0e-ac46-419e-a12b-5c265cbd9626, namespace cover"
10.128.0.1 - - [23/Apr/2018:10:09:13 +0000] "GET /ansible-service-broker/v2/service_instances/4c8d3c6b-46de-11e8-9590-0a580a80000b/last_operation?operation=45453ceb-3f11-43ac-b100-1da5c40a7cf3&plan_id=43d3e23d214c26dbebc0879e44425db4&service_id=03b69500305d9859bb9440d9f9023784 HTTP/1.1" 200 131
10.128.0.1 - - [23/Apr/2018:10:09:13 +0000] "GET /ansible-service-broker/v2/service_instances/4c8d3c6b-46de-11e8-9590-0a580a80000b/last_operation?operation=45453ceb-3f11-43ac-b100-1da5c40a7cf3&plan_id=43d3e23d214c26dbebc0879e44425db4&service_id=03b69500305d9859bb9440d9f9023784 HTTP/1.1" 200 131
10.128.0.1 - - [23/Apr/2018:10:09:13 +0000] "GET /ansible-service-broker/v2/service_instances/4c8d3c6b-46de-11e8-9590-0a580a80000b/last_operation?operation=45453ceb-3f11-43ac-b100-1da5c40a7cf3&plan_id=43d3e23d214c26dbebc0879e44425db4&service_id=03b69500305d9859bb9440d9f9023784 HTTP/1.1" 200 131
10.128.0.1 - - [23/Apr/2018:10:09:13 +0000] "GET /ansible-service-broker/v2/service_instances/4c8d3c6b-46de-11e8-9590-0a580a80000b/last_operation?operation=45453ceb-3f11-43ac-b100-1da5c40a7cf3&plan_id=43d3e23d214c26dbebc0879e44425db4&service_id=03b69500305d9859bb9440d9f9023784 HTTP/1.1" 200 131
10.128.0.1 - - [23/Apr/2018:10:09:13 +0000] "GET /ansible-service-broker/v2/service_instances/4c8d3c6b-46de-11e8-9590-0a580a80000b/last_operation?operation=45453ceb-3f11-43ac-b100-1da5c40a7cf3&plan_id=43d3e23d214c26dbebc0879e44425db4&service_id=03b69500305d9859bb9440d9f9023784 HTTP/1.1" 200 131
[2018-04-23T10:09:13.692Z] [INFO] - All Jobs for instance: 4c8d3c6b-46de-11e8-9590-0a580a80000b in state:  in progress - 
[]apb.JobState{}
time="2018-04-23T10:09:13Z" level=error msg="Unable to load secret '4c8d3c6b-46de-11e8-9590-0a580a80000b' from namespace 'openshift-ansible-service-broker'"
time="2018-04-23T10:09:13Z" level=error msg="unable to get secret data for 4c8d3c6b-46de-11e8-9590-0a580a80000b, in namespace: openshift-ansible-service-broker"
[2018-04-23T10:09:13.694Z] [INFO] - ASYNC deprovision in progress
10.128.0.1 - - [23/Apr/2018:10:09:13 +0000] "DELETE /ansible-service-broker/v2/service_instances/4c8d3c6b-46de-11e8-9590-0a580a80000b?accepts_incomplete=true&plan_id=43d3e23d214c26dbebc0879e44425db4&service_id=03b69500305d9859bb9440d9f9023784 HTTP/1.1" 202 58
time="2018-04-23T10:09:13Z" level=info msg="============================================================"
time="2018-04-23T10:09:13Z" level=info msg="                      DEPROVISIONING                        "
time="2018-04-23T10:09:13Z" level=info msg="============================================================"
time="2018-04-23T10:09:13Z" level=info msg="ServiceInstance.Id: 03b69500305d9859bb9440d9f9023784"
time="2018-04-23T10:09:13Z" level=info msg="ServiceInstance.Name: rh-mediawiki-apb"
time="2018-04-23T10:09:13Z" level=info msg="ServiceInstance.Image: registry.access.stage.redhat.com/openshift3/mediawiki-apb:v3.10"
time="2018-04-23T10:09:13Z" level=info msg="ServiceInstance.Description: Mediawiki apb implementation"
time="2018-04-23T10:09:13Z" level=info msg="============================================================"
time="2018-04-23T10:09:13Z" level=info msg="Creating RoleBinding apb-48edc83b-5ea4-45c1-a0bb-2a7d347d65b5"
[2018-04-23T10:09:13.923Z] [INFO] - All Jobs for instance: 4c8d3c6b-46de-11e8-9590-0a580a80000b in state:  in progress - 
[]apb.JobState{apb.JobState{Token:"05436063-4983-4437-85b3-2945e1ce48e2", State:"in progress", Podname:"", Method:"deprovision", Error:"", Description:"action started"}}
[2018-04-23T10:09:13.923Z] [INFO] - Deprovision requested for instance 4c8d3c6b-46de-11e8-9590-0a580a80000b, but job is already in progress
10.128.0.1 - - [23/Apr/2018:10:09:13 +0000] "DELETE /ansible-service-broker/v2/service_instances/4c8d3c6b-46de-11e8-9590-0a580a80000b?accepts_incomplete=true&plan_id=43d3e23d214c26dbebc0879e44425db4&service_id=03b69500305d9859bb9440d9f9023784 HTTP/1.1" 202 3
time="2018-04-23T10:09:13Z" level=info msg="Creating RoleBinding apb-48edc83b-5ea4-45c1-a0bb-2a7d347d65b5"
time="2018-04-23T10:09:13Z" level=info msg="Successfully created apb sandbox: [ %s ], with %s permissions in namespace %sapb-48edc83b-5ea4-45c1-a0bb-2a7d347d65b5editrh-mediawiki-apb-depr-b4p9t"
time="2018-04-23T10:09:13Z" level=info msg="Running post create sandbox fuctions if defined."
time="2018-04-23T10:09:14Z" level=info msg="Proxy configuration present. Applying to APB before execution:"
time="2018-04-23T10:09:14Z" level=info msg="HTTP_PROXY=\"\""
time="2018-04-23T10:09:14Z" level=info msg="HTTPS_PROXY=\"\""
time="2018-04-23T10:09:14Z" level=info msg="NO_PROXY=\"\""
time="2018-04-23T10:09:14Z" level=info msg="Creating pod \"apb-48edc83b-5ea4-45c1-a0bb-2a7d347d65b5\" in the rh-mediawiki-apb-depr-b4p9t namespace"
10.128.0.1 - - [23/Apr/2018:10:09:14 +0000] "GET /ansible-service-broker/v2/service_instances/4c8d3c6b-46de-11e8-9590-0a580a80000b/last_operation?operation=05436063-4983-4437-85b3-2945e1ce48e2&plan_id=43d3e23d214c26dbebc0879e44425db4&service_id=03b69500305d9859bb9440d9f9023784 HTTP/1.1" 200 64
10.128.0.1 - - [23/Apr/2018:10:09:14 +0000] "GET /ansible-service-broker/v2/service_instances/4c8d3c6b-46de-11e8-9590-0a580a80000b/last_operation?operation=05436063-4983-4437-85b3-2945e1ce48e2&plan_id=43d3e23d214c26dbebc0879e44425db4&service_id=03b69500305d9859bb9440d9f9023784 HTTP/1.1" 200 64
10.128.0.1 - - [23/Apr/2018:10:09:18 +0000] "GET /ansible-service-broker/v2/service_instances/4c8d3c6b-46de-11e8-9590-0a580a80000b/last_operation?operation=05436063-4983-4437-85b3-2945e1ce48e2&plan_id=43d3e23d214c26dbebc0879e44425db4&service_id=03b69500305d9859bb9440d9f9023784 HTTP/1.1" 200 64
time="2018-04-23T10:09:22Z" level=error msg="Deprovision action failed - Pod [ apb-48edc83b-5ea4-45c1-a0bb-2a7d347d65b5 ] failed with exit code [2]"
time="2018-04-23T10:09:22Z" level=info msg="Destroying APB sandbox..."
[2018-04-23T10:09:22.964Z] [ERROR] - broker::deprovision error occurred. Pod [ apb-48edc83b-5ea4-45c1-a0bb-2a7d347d65b5 ] failed with exit code [2]
time="2018-04-23T10:09:22Z" level=info msg="Successfully deleted rolebinding apb-48edc83b-5ea4-45c1-a0bb-2a7d347d65b5, namespace rh-mediawiki-apb-depr-b4p9t"
time="2018-04-23T10:09:22Z" level=info msg="Successfully deleted rolebinding apb-48edc83b-5ea4-45c1-a0bb-2a7d347d65b5, namespace cover"

Comment 1 Shawn Hurley 2018-04-29 16:37:35 UTC
I am seeing this behavior in service catalog 0.1.13 but not in 0.1.11.

@pmorie Do you know which behavior is correct for 3.10?

Comment 2 John Matthews 2018-05-02 13:32:42 UTC
Clearing needinfo

Plan for this BZ is to update the mediawiki APB so it's deprovision playbook is idempotent.

That is the first issue here, the deprovision should be able to run multiple times and ensure it has cleaned up the artifacts, if they do not exist it should _not_ treat that as an error.

For longer term, we will investigate and determine how we want to handle failure cases and a possible rate limiter so we will control how often we attempt a deprovision.
Trello for longer term approach here (post 3.10):  https://trello.com/c/hSGqK7wm

Summary:
For 3.10, we will update the deprovision playbook in mediawiki so it is idempotent

Comment 6 David Zager 2018-05-24 17:33:13 UTC
https://errata.devel.redhat.com/advisory/33505 moved to QE

openshift-enterprise-asb-container-v3.10.0-0.51.0.1
openshift-enterprise-mediawiki-apb-v3.10.0-0.51.0.1
openshift-enterprise-postgresql-apb-v3.10.0-0.51.0.1
openshift-enterprise-mysql-apb-v3.10.0-0.51.0.1
openshift-enterprise-mariadb-apb-v3.10.0-0.51.0.1
openshift-enterprise-apb-tools-v3.10.0-0.32.0.2

Comment 7 Zhang Cheng 2018-05-28 07:48:12 UTC
I want to know what is our expected in currently. I tried with openshift-enterprise-mediawiki-apb-v3.10.0-0.51.0.1 and latest asb:1.2.14, but still get similar result.
1. Will automatic launch depr sandbox if mediawiki provision failed
2. Still launch multi depr sandboxes while provision failed, but not frequently.

rh-mediawiki-apb-depr-4w8c6         Active    33m
rh-mediawiki-apb-depr-kpdwq         Active    40m
rh-mediawiki-apb-depr-kxjtd         Active    1h
rh-mediawiki-apb-depr-mnsqr         Active    29m
rh-mediawiki-apb-prov-fnjl5         Active    1h

Comment 8 David Zager 2018-05-29 02:27:12 UTC
From jmatthews: 


Plan for this BZ is to update the mediawiki APB so it's deprovision playbook is idempotent.




You should expect to see the multiple sandboxes, however they should now be idempotent.

Comment 9 Zhang Cheng 2018-05-29 02:43:49 UTC
That is OK. Verified in here.

Comment 11 errata-xmlrpc 2018-07-30 19:13:42 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2018:1816


Note You need to log in before you can comment on or make changes to this bug.