Description of problem: Should not lanuch so many depr sandboxs while provision failed. service-catalog & asb image using images from brew registry: # service-catalog --version v3.10.0-0.27.0;Upstream:v0.1.13 # asbd --version 1.2.6 How reproducible: Always Steps to Reproduce: 1. Provison a mediawiki apb with same username and passwd (provision should failed by expect) 2. Look at namespace by `oc get ns | grep mediawiki` Actual results: 1. Provison a mediawiki apb failed by expect. 2. Auto lanuch many depr sandboxs # oc get ns | grep mediawiki rh-mediawiki-apb-depr-4kf9z Active 5m rh-mediawiki-apb-depr-4vzlq Active 2s rh-mediawiki-apb-depr-5z5sr Active 3m rh-mediawiki-apb-depr-7kjsl Active 2m rh-mediawiki-apb-depr-b4p9t Active 6m rh-mediawiki-apb-depr-bms7h Active 2m rh-mediawiki-apb-depr-d74sl Active 5m rh-mediawiki-apb-depr-dvqx4 Active 1m rh-mediawiki-apb-depr-fhdv4 Active 42s rh-mediawiki-apb-depr-fkhr7 Active 4m rh-mediawiki-apb-depr-fq4cc Active 2m rh-mediawiki-apb-depr-hh5lj Active 1m rh-mediawiki-apb-depr-hqj2k Active 5m rh-mediawiki-apb-depr-j5jn5 Active 4m rh-mediawiki-apb-depr-j85sm Active 16s rh-mediawiki-apb-depr-kfrnc Active 3m rh-mediawiki-apb-depr-kwnf8 Active 5m rh-mediawiki-apb-depr-mf5m9 Active 4m rh-mediawiki-apb-depr-mxsff Active 29s rh-mediawiki-apb-depr-n7qqx Active 2m rh-mediawiki-apb-depr-p5s28 Active 3m rh-mediawiki-apb-depr-prvf7 Active 56s rh-mediawiki-apb-depr-rc8sv Active 1m rh-mediawiki-apb-depr-rgj7v Active 6m rh-mediawiki-apb-depr-rqz62 Active 3m rh-mediawiki-apb-depr-rt7dk Active 4m rh-mediawiki-apb-depr-z8wfv Active 4m rh-mediawiki-apb-depr-zhmsk Active 2m rh-mediawiki-apb-depr-zqzbf Active 1m rh-mediawiki-apb-prov-r7jlc Active 6m Expected results: 2. should not launch any depr sandbox. Addition info: Attach asb logs: time="2018-04-23T10:08:44Z" level=info msg="============================================================" time="2018-04-23T10:08:44Z" level=info msg=" PROVISIONING " time="2018-04-23T10:08:44Z" level=info msg="============================================================" time="2018-04-23T10:08:44Z" level=info msg="Spec.ID: 03b69500305d9859bb9440d9f9023784" time="2018-04-23T10:08:44Z" level=info msg="Spec.Name: rh-mediawiki-apb" time="2018-04-23T10:08:44Z" level=info msg="Spec.Image: registry.access.stage.redhat.com/openshift3/mediawiki-apb:v3.10" time="2018-04-23T10:08:44Z" level=info msg="Spec.Description: Mediawiki apb implementation" time="2018-04-23T10:08:44Z" level=info msg="============================================================" time="2018-04-23T10:08:44Z" level=info msg="Checking if namespace %s exists.cover" time="2018-04-23T10:08:44Z" level=info msg="Creating RoleBinding apb-4c98884c-4d97-4342-ac56-ca61ce224205" time="2018-04-23T10:08:44Z" level=info msg="Creating RoleBinding apb-4c98884c-4d97-4342-ac56-ca61ce224205" time="2018-04-23T10:08:44Z" level=info msg="Successfully created apb sandbox: [ %s ], with %s permissions in namespace %sapb-4c98884c-4d97-4342-ac56-ca61ce224205editrh-mediawiki-apb-prov-r7jlc" time="2018-04-23T10:08:44Z" level=info msg="Running post create sandbox fuctions if defined." time="2018-04-23T10:08:44Z" level=info msg="Proxy configuration present. Applying to APB before execution:" time="2018-04-23T10:08:44Z" level=info msg="HTTP_PROXY=\"\"" time="2018-04-23T10:08:44Z" level=info msg="HTTPS_PROXY=\"\"" time="2018-04-23T10:08:44Z" level=info msg="NO_PROXY=\"\"" time="2018-04-23T10:08:44Z" level=info msg="Creating pod \"apb-4c98884c-4d97-4342-ac56-ca61ce224205\" in the rh-mediawiki-apb-prov-r7jlc namespace" [2018-04-23T10:08:44.898Z] [INFO] - All Jobs for instance: 4c8d3c6b-46de-11e8-9590-0a580a80000b in state: in progress - []apb.JobState{apb.JobState{Token:"7e728f5c-15a7-4716-8869-d6a45d15a875", State:"in progress", Podname:"", Method:"provision", Error:"", Description:"action started"}} [2018-04-23T10:08:44.898Z] [INFO] - Provision requested for instance 4c8d3c6b-46de-11e8-9590-0a580a80000b, but job is already in progress [2018-04-23T10:08:44.898Z] [ERROR] - provision error provision in progress 10.128.0.1 - - [23/Apr/2018:10:08:44 +0000] "PUT /ansible-service-broker/v2/service_instances/4c8d3c6b-46de-11e8-9590-0a580a80000b?accepts_incomplete=true HTTP/1.1" 202 58 10.128.0.1 - - [23/Apr/2018:10:08:45 +0000] "GET /ansible-service-broker/v2/service_instances/4c8d3c6b-46de-11e8-9590-0a580a80000b/last_operation?operation=7e728f5c-15a7-4716-8869-d6a45d15a875&plan_id=43d3e23d214c26dbebc0879e44425db4&service_id=03b69500305d9859bb9440d9f9023784 HTTP/1.1" 200 64 10.128.0.1 - - [23/Apr/2018:10:08:45 +0000] "GET /ansible-service-broker/v2/service_instances/4c8d3c6b-46de-11e8-9590-0a580a80000b/last_operation?operation=7e728f5c-15a7-4716-8869-d6a45d15a875&plan_id=43d3e23d214c26dbebc0879e44425db4&service_id=03b69500305d9859bb9440d9f9023784 HTTP/1.1" 200 64 10.128.0.1 - - [23/Apr/2018:10:08:49 +0000] "GET /ansible-service-broker/v2/service_instances/4c8d3c6b-46de-11e8-9590-0a580a80000b/last_operation?operation=7e728f5c-15a7-4716-8869-d6a45d15a875&plan_id=43d3e23d214c26dbebc0879e44425db4&service_id=03b69500305d9859bb9440d9f9023784 HTTP/1.1" 200 64 time="2018-04-23T10:08:50Z" level=error msg="Provision or Update action failed - Pod [ apb-4c98884c-4d97-4342-ac56-ca61ce224205 ] failed with exit code [2]" time="2018-04-23T10:08:50Z" level=info msg="Destroying APB sandbox..." time="2018-04-23T10:08:50Z" level=info msg="Successfully deleted rolebinding apb-4c98884c-4d97-4342-ac56-ca61ce224205, namespace rh-mediawiki-apb-prov-r7jlc" time="2018-04-23T10:08:50Z" level=info msg="Successfully deleted rolebinding apb-4c98884c-4d97-4342-ac56-ca61ce224205, namespace cover" [2018-04-23T10:08:50.709Z] [ERROR] - broker::provision error occurred. Pod [ apb-4c98884c-4d97-4342-ac56-ca61ce224205 ] failed with exit code [2] time="2018-04-23T10:08:50Z" level=error msg="Provision APB error: Pod [ apb-4c98884c-4d97-4342-ac56-ca61ce224205 ] failed with exit code [2]" 10.128.0.1 - - [23/Apr/2018:10:08:57 +0000] "GET /ansible-service-broker/v2/service_instances/4c8d3c6b-46de-11e8-9590-0a580a80000b/last_operation?operation=7e728f5c-15a7-4716-8869-d6a45d15a875&plan_id=43d3e23d214c26dbebc0879e44425db4&service_id=03b69500305d9859bb9440d9f9023784 HTTP/1.1" 200 129 [2018-04-23T10:08:57.485Z] [INFO] - All Jobs for instance: 4c8d3c6b-46de-11e8-9590-0a580a80000b in state: in progress - []apb.JobState{} time="2018-04-23T10:08:57Z" level=error msg="Unable to load secret '4c8d3c6b-46de-11e8-9590-0a580a80000b' from namespace 'openshift-ansible-service-broker'" time="2018-04-23T10:08:57Z" level=error msg="unable to get secret data for 4c8d3c6b-46de-11e8-9590-0a580a80000b, in namespace: openshift-ansible-service-broker" [2018-04-23T10:08:57.488Z] [INFO] - ASYNC deprovision in progress 10.128.0.1 - - [23/Apr/2018:10:08:57 +0000] "DELETE /ansible-service-broker/v2/service_instances/4c8d3c6b-46de-11e8-9590-0a580a80000b?accepts_incomplete=true&plan_id=43d3e23d214c26dbebc0879e44425db4&service_id=03b69500305d9859bb9440d9f9023784 HTTP/1.1" 202 58 time="2018-04-23T10:08:57Z" level=info msg="============================================================" time="2018-04-23T10:08:57Z" level=info msg=" DEPROVISIONING " time="2018-04-23T10:08:57Z" level=info msg="============================================================" time="2018-04-23T10:08:57Z" level=info msg="ServiceInstance.Id: 03b69500305d9859bb9440d9f9023784" time="2018-04-23T10:08:57Z" level=info msg="ServiceInstance.Name: rh-mediawiki-apb" time="2018-04-23T10:08:57Z" level=info msg="ServiceInstance.Image: registry.access.stage.redhat.com/openshift3/mediawiki-apb:v3.10" time="2018-04-23T10:08:57Z" level=info msg="ServiceInstance.Description: Mediawiki apb implementation" time="2018-04-23T10:08:57Z" level=info msg="============================================================" time="2018-04-23T10:08:57Z" level=info msg="Creating RoleBinding apb-a80e3b0e-ac46-419e-a12b-5c265cbd9626" time="2018-04-23T10:08:57Z" level=info msg="Creating RoleBinding apb-a80e3b0e-ac46-419e-a12b-5c265cbd9626" time="2018-04-23T10:08:57Z" level=info msg="Successfully created apb sandbox: [ %s ], with %s permissions in namespace %sapb-a80e3b0e-ac46-419e-a12b-5c265cbd9626editrh-mediawiki-apb-depr-rgj7v" time="2018-04-23T10:08:57Z" level=info msg="Running post create sandbox fuctions if defined." time="2018-04-23T10:08:57Z" level=info msg="Proxy configuration present. Applying to APB before execution:" time="2018-04-23T10:08:57Z" level=info msg="HTTP_PROXY=\"\"" time="2018-04-23T10:08:57Z" level=info msg="HTTPS_PROXY=\"\"" time="2018-04-23T10:08:57Z" level=info msg="NO_PROXY=\"\"" time="2018-04-23T10:08:57Z" level=info msg="Creating pod \"apb-a80e3b0e-ac46-419e-a12b-5c265cbd9626\" in the rh-mediawiki-apb-depr-rgj7v namespace" [2018-04-23T10:09:06.482Z] [ERROR] - broker::deprovision error occurred. Pod [ apb-a80e3b0e-ac46-419e-a12b-5c265cbd9626 ] failed with exit code [2] time="2018-04-23T10:09:06Z" level=error msg="Deprovision action failed - Pod [ apb-a80e3b0e-ac46-419e-a12b-5c265cbd9626 ] failed with exit code [2]" time="2018-04-23T10:09:06Z" level=info msg="Destroying APB sandbox..." time="2018-04-23T10:09:06Z" level=info msg="Successfully deleted rolebinding apb-a80e3b0e-ac46-419e-a12b-5c265cbd9626, namespace rh-mediawiki-apb-depr-rgj7v" time="2018-04-23T10:09:06Z" level=info msg="Successfully deleted rolebinding apb-a80e3b0e-ac46-419e-a12b-5c265cbd9626, namespace cover" 10.128.0.1 - - [23/Apr/2018:10:09:13 +0000] "GET /ansible-service-broker/v2/service_instances/4c8d3c6b-46de-11e8-9590-0a580a80000b/last_operation?operation=45453ceb-3f11-43ac-b100-1da5c40a7cf3&plan_id=43d3e23d214c26dbebc0879e44425db4&service_id=03b69500305d9859bb9440d9f9023784 HTTP/1.1" 200 131 10.128.0.1 - - [23/Apr/2018:10:09:13 +0000] "GET /ansible-service-broker/v2/service_instances/4c8d3c6b-46de-11e8-9590-0a580a80000b/last_operation?operation=45453ceb-3f11-43ac-b100-1da5c40a7cf3&plan_id=43d3e23d214c26dbebc0879e44425db4&service_id=03b69500305d9859bb9440d9f9023784 HTTP/1.1" 200 131 10.128.0.1 - - [23/Apr/2018:10:09:13 +0000] "GET /ansible-service-broker/v2/service_instances/4c8d3c6b-46de-11e8-9590-0a580a80000b/last_operation?operation=45453ceb-3f11-43ac-b100-1da5c40a7cf3&plan_id=43d3e23d214c26dbebc0879e44425db4&service_id=03b69500305d9859bb9440d9f9023784 HTTP/1.1" 200 131 10.128.0.1 - - [23/Apr/2018:10:09:13 +0000] "GET /ansible-service-broker/v2/service_instances/4c8d3c6b-46de-11e8-9590-0a580a80000b/last_operation?operation=45453ceb-3f11-43ac-b100-1da5c40a7cf3&plan_id=43d3e23d214c26dbebc0879e44425db4&service_id=03b69500305d9859bb9440d9f9023784 HTTP/1.1" 200 131 10.128.0.1 - - [23/Apr/2018:10:09:13 +0000] "GET /ansible-service-broker/v2/service_instances/4c8d3c6b-46de-11e8-9590-0a580a80000b/last_operation?operation=45453ceb-3f11-43ac-b100-1da5c40a7cf3&plan_id=43d3e23d214c26dbebc0879e44425db4&service_id=03b69500305d9859bb9440d9f9023784 HTTP/1.1" 200 131 [2018-04-23T10:09:13.692Z] [INFO] - All Jobs for instance: 4c8d3c6b-46de-11e8-9590-0a580a80000b in state: in progress - []apb.JobState{} time="2018-04-23T10:09:13Z" level=error msg="Unable to load secret '4c8d3c6b-46de-11e8-9590-0a580a80000b' from namespace 'openshift-ansible-service-broker'" time="2018-04-23T10:09:13Z" level=error msg="unable to get secret data for 4c8d3c6b-46de-11e8-9590-0a580a80000b, in namespace: openshift-ansible-service-broker" [2018-04-23T10:09:13.694Z] [INFO] - ASYNC deprovision in progress 10.128.0.1 - - [23/Apr/2018:10:09:13 +0000] "DELETE /ansible-service-broker/v2/service_instances/4c8d3c6b-46de-11e8-9590-0a580a80000b?accepts_incomplete=true&plan_id=43d3e23d214c26dbebc0879e44425db4&service_id=03b69500305d9859bb9440d9f9023784 HTTP/1.1" 202 58 time="2018-04-23T10:09:13Z" level=info msg="============================================================" time="2018-04-23T10:09:13Z" level=info msg=" DEPROVISIONING " time="2018-04-23T10:09:13Z" level=info msg="============================================================" time="2018-04-23T10:09:13Z" level=info msg="ServiceInstance.Id: 03b69500305d9859bb9440d9f9023784" time="2018-04-23T10:09:13Z" level=info msg="ServiceInstance.Name: rh-mediawiki-apb" time="2018-04-23T10:09:13Z" level=info msg="ServiceInstance.Image: registry.access.stage.redhat.com/openshift3/mediawiki-apb:v3.10" time="2018-04-23T10:09:13Z" level=info msg="ServiceInstance.Description: Mediawiki apb implementation" time="2018-04-23T10:09:13Z" level=info msg="============================================================" time="2018-04-23T10:09:13Z" level=info msg="Creating RoleBinding apb-48edc83b-5ea4-45c1-a0bb-2a7d347d65b5" [2018-04-23T10:09:13.923Z] [INFO] - All Jobs for instance: 4c8d3c6b-46de-11e8-9590-0a580a80000b in state: in progress - []apb.JobState{apb.JobState{Token:"05436063-4983-4437-85b3-2945e1ce48e2", State:"in progress", Podname:"", Method:"deprovision", Error:"", Description:"action started"}} [2018-04-23T10:09:13.923Z] [INFO] - Deprovision requested for instance 4c8d3c6b-46de-11e8-9590-0a580a80000b, but job is already in progress 10.128.0.1 - - [23/Apr/2018:10:09:13 +0000] "DELETE /ansible-service-broker/v2/service_instances/4c8d3c6b-46de-11e8-9590-0a580a80000b?accepts_incomplete=true&plan_id=43d3e23d214c26dbebc0879e44425db4&service_id=03b69500305d9859bb9440d9f9023784 HTTP/1.1" 202 3 time="2018-04-23T10:09:13Z" level=info msg="Creating RoleBinding apb-48edc83b-5ea4-45c1-a0bb-2a7d347d65b5" time="2018-04-23T10:09:13Z" level=info msg="Successfully created apb sandbox: [ %s ], with %s permissions in namespace %sapb-48edc83b-5ea4-45c1-a0bb-2a7d347d65b5editrh-mediawiki-apb-depr-b4p9t" time="2018-04-23T10:09:13Z" level=info msg="Running post create sandbox fuctions if defined." time="2018-04-23T10:09:14Z" level=info msg="Proxy configuration present. Applying to APB before execution:" time="2018-04-23T10:09:14Z" level=info msg="HTTP_PROXY=\"\"" time="2018-04-23T10:09:14Z" level=info msg="HTTPS_PROXY=\"\"" time="2018-04-23T10:09:14Z" level=info msg="NO_PROXY=\"\"" time="2018-04-23T10:09:14Z" level=info msg="Creating pod \"apb-48edc83b-5ea4-45c1-a0bb-2a7d347d65b5\" in the rh-mediawiki-apb-depr-b4p9t namespace" 10.128.0.1 - - [23/Apr/2018:10:09:14 +0000] "GET /ansible-service-broker/v2/service_instances/4c8d3c6b-46de-11e8-9590-0a580a80000b/last_operation?operation=05436063-4983-4437-85b3-2945e1ce48e2&plan_id=43d3e23d214c26dbebc0879e44425db4&service_id=03b69500305d9859bb9440d9f9023784 HTTP/1.1" 200 64 10.128.0.1 - - [23/Apr/2018:10:09:14 +0000] "GET /ansible-service-broker/v2/service_instances/4c8d3c6b-46de-11e8-9590-0a580a80000b/last_operation?operation=05436063-4983-4437-85b3-2945e1ce48e2&plan_id=43d3e23d214c26dbebc0879e44425db4&service_id=03b69500305d9859bb9440d9f9023784 HTTP/1.1" 200 64 10.128.0.1 - - [23/Apr/2018:10:09:18 +0000] "GET /ansible-service-broker/v2/service_instances/4c8d3c6b-46de-11e8-9590-0a580a80000b/last_operation?operation=05436063-4983-4437-85b3-2945e1ce48e2&plan_id=43d3e23d214c26dbebc0879e44425db4&service_id=03b69500305d9859bb9440d9f9023784 HTTP/1.1" 200 64 time="2018-04-23T10:09:22Z" level=error msg="Deprovision action failed - Pod [ apb-48edc83b-5ea4-45c1-a0bb-2a7d347d65b5 ] failed with exit code [2]" time="2018-04-23T10:09:22Z" level=info msg="Destroying APB sandbox..." [2018-04-23T10:09:22.964Z] [ERROR] - broker::deprovision error occurred. Pod [ apb-48edc83b-5ea4-45c1-a0bb-2a7d347d65b5 ] failed with exit code [2] time="2018-04-23T10:09:22Z" level=info msg="Successfully deleted rolebinding apb-48edc83b-5ea4-45c1-a0bb-2a7d347d65b5, namespace rh-mediawiki-apb-depr-b4p9t" time="2018-04-23T10:09:22Z" level=info msg="Successfully deleted rolebinding apb-48edc83b-5ea4-45c1-a0bb-2a7d347d65b5, namespace cover"
I am seeing this behavior in service catalog 0.1.13 but not in 0.1.11. @pmorie Do you know which behavior is correct for 3.10?
Clearing needinfo Plan for this BZ is to update the mediawiki APB so it's deprovision playbook is idempotent. That is the first issue here, the deprovision should be able to run multiple times and ensure it has cleaned up the artifacts, if they do not exist it should _not_ treat that as an error. For longer term, we will investigate and determine how we want to handle failure cases and a possible rate limiter so we will control how often we attempt a deprovision. Trello for longer term approach here (post 3.10): https://trello.com/c/hSGqK7wm Summary: For 3.10, we will update the deprovision playbook in mediawiki so it is idempotent
https://github.com/ansibleplaybookbundle/mediawiki-apb/pull/31
https://brewweb.engineering.redhat.com/brew/buildinfo?buildID=696144 https://brewweb.engineering.redhat.com/brew/buildinfo?buildID=696139 https://brewweb.engineering.redhat.com/brew/buildinfo?buildID=696141 https://brewweb.engineering.redhat.com/brew/buildinfo?buildID=696142
There are all the PRs associated with making our main 4 APBs idempotent: https://github.com/ansibleplaybookbundle/mediawiki-apb/pull/31 https://github.com/ansibleplaybookbundle/postgresql-apb/pull/45 https://github.com/ansibleplaybookbundle/mariadb-apb/pull/31 https://github.com/ansibleplaybookbundle/mysql-apb/pull/31
https://errata.devel.redhat.com/advisory/33505 moved to QE openshift-enterprise-asb-container-v3.10.0-0.51.0.1 openshift-enterprise-mediawiki-apb-v3.10.0-0.51.0.1 openshift-enterprise-postgresql-apb-v3.10.0-0.51.0.1 openshift-enterprise-mysql-apb-v3.10.0-0.51.0.1 openshift-enterprise-mariadb-apb-v3.10.0-0.51.0.1 openshift-enterprise-apb-tools-v3.10.0-0.32.0.2
I want to know what is our expected in currently. I tried with openshift-enterprise-mediawiki-apb-v3.10.0-0.51.0.1 and latest asb:1.2.14, but still get similar result. 1. Will automatic launch depr sandbox if mediawiki provision failed 2. Still launch multi depr sandboxes while provision failed, but not frequently. rh-mediawiki-apb-depr-4w8c6 Active 33m rh-mediawiki-apb-depr-kpdwq Active 40m rh-mediawiki-apb-depr-kxjtd Active 1h rh-mediawiki-apb-depr-mnsqr Active 29m rh-mediawiki-apb-prov-fnjl5 Active 1h
From jmatthews: Plan for this BZ is to update the mediawiki APB so it's deprovision playbook is idempotent. You should expect to see the multiple sandboxes, however they should now be idempotent.
That is OK. Verified in here.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2018:1816