Created attachment 1304671
deprovision.png

Description of problem:
Installed with openshift-ansible with service-catalog enabled. When deprovisioning, the instance cannot be deleted from the console. Checking on the server shows that several of the apb pods are in Error status.

Version-Release number of selected component (if applicable):
openshift v3.6.170
kubernetes v1.6.1+5115d708d7
etcd 3.2.1

How reproducible:
Always

Steps to Reproduce:
1. Delete the Provisioned Services

[root@host-8-175-47 ~]# oc get po -n dma1
NAME                                       READY     STATUS      RESTARTS   AGE
apb-06b91fb8-f13c-4e9e-b17e-6a76d38ca4eb   0/1       Error       0          19m
apb-1083b611-08cd-407d-9a31-d582ad16411a   0/1       Error       0          19m
apb-139593ba-875a-4c3f-8c56-2b98ba24b839   0/1       Completed   0          3h
apb-2688706d-06ce-4eec-939a-d93e91b97e5f   0/1       Error       0          19m
apb-67b0b448-a7aa-4b9a-8ed1-3d4d9affa46e   0/1       Error       0          18m
apb-75c75052-adb6-4ec9-9954-e8639e3b8f09   0/1       Error       0          19m
apb-7aa3bc64-c6f1-46d3-8a67-51047bd30fc2   0/1       Error       0          19m
apb-8081f382-e561-474f-9c66-76ac8aa55d60   0/1       Completed   0          19m
apb-9ca2aeeb-2b65-4081-acaa-77e015a317bb   0/1       Completed   0          3h
apb-ae6e62d5-29a9-4ac7-97b9-1a3a76c30327   0/1       Error       0          19m
apb-c52c93f2-b8da-45ca-8053-0bf7b31eb92f   0/1       Completed   0          19m
apb-c8c8f9cc-8e18-4d94-b20e-45211a6d3f79   0/1       Error       0          19m
apb-ee777918-e099-4555-aedc-4735eb48df2e   0/1       Error       0          19m
apb-f8c57ae3-4f4e-491e-9dc4-bb9dd625eff9   0/1       Error       0          19m
postgresql-1-b2whj                         1/1       Running     0          3h

[root@host-8-175-47 ~]# oc logs apb-06b91fb8-f13c-4e9e-b17e-6a76d38ca4eb -n dma1
Openshift cluster credentials not provided.
Assuming the broker is running inside an Openshift cluster
Attempting to login with a service account...
 [WARNING]: Could not create retry file '/opt/apb/actions/deprovision.retry'. [Errno 13] Permission denied: u'/opt/apb/actions/deprovision.retry'
[root@host-8-175-47 ~]#

[root@host-8-175-47 ~]# oc logs apb-1083b611-08cd-407d-9a31-d582ad16411a -n dma1
Openshift cluster credentials not provided.
Assuming the broker is running inside an Openshift cluster
Attempting to login with a service account...
Logged into "https://kubernetes.default:443" as "system:serviceaccount:dma1:apb-1083b611-08cd-407d-9a31-d582ad16411a" using the token provided.

You have one project on this server: "dma1"

Using project "dma1".
Welcome! See 'oc help' to get started.

PLAY [Deprovision mediawiki123-apb from openshift] *****************************

TASK [ansible.kubernetes-modules : Intall latest openshift client] *************
skipping: [localhost]

TASK [openshift_v1_route] ******************************************************
ok: [localhost]

TASK [k8s_v1_persistent_volume_claim] ******************************************
 [WARNING]: Could not create retry file '/opt/apb/actions/deprovision.retry'. [Errno 13] Permission denied: u'/opt/apb/actions/deprovision.retry'

2.
3.

Actual results:

Expected results:

Additional info:
Confirmed the issue on the development side. A WIP PR with initial fixes against the broker was written today:
https://github.com/openshift/ansible-service-broker/pull/306

Working on a suitable test environment to debug and test.
redacted comment:

I dug through the catalog code to investigate how async deprovision is handled, and asked in the Slack channel for guidance on where to look. Paul pointed me to a few places, but I could not find any call to last_operation after an async delete. Then mkibbe confirmed what I had suspected: the catalog has a queue on which it places polling jobs to monitor async operations. The provision path adds the instance to that queue, but the deprovision path did not.

This is Michael's response:

mkibbe [6:01 PM]
It seems to be a bug in the catalog, looking at how async instance provisioning is handled, it should add the instance to the polling queue

mkibbe [6:34 PM]
@zeus @pmorie 99% sure it's a bug. I'll make an issue and put out a fix.

He posted a PR today:
https://github.com/kubernetes-incubator/service-catalog/pull/1067

Issue posted:
https://github.com/kubernetes-incubator/service-catalog/issues/1066
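To make the missing step concrete, here is a minimal Go sketch of the behaviour described above. It is not the service-catalog code: `pollingQueue`, `brokerResponse`, `reconcileProvision`, and `reconcileDeprovision` are hypothetical names used only for illustration, and the instance key is made up. It assumes the fix amounts to the deprovision path enqueuing the instance for last_operation polling in the same way the provision path already does.

```go
package main

import "fmt"

// pollingQueue is a hypothetical stand-in for the catalog's queue of
// instances whose asynchronous operations still need last_operation polling.
type pollingQueue struct {
	keys []string
}

// Add registers an instance key for later polling.
func (q *pollingQueue) Add(key string) { q.keys = append(q.keys, key) }

// brokerResponse models only the part of a broker reply that matters here.
type brokerResponse struct {
	Async bool // true when the broker accepted the request asynchronously
}

// reconcileProvision enqueues the instance for polling when the broker
// handles the provision request asynchronously.
func reconcileProvision(q *pollingQueue, key string, resp brokerResponse) {
	if resp.Async {
		q.Add(key)
	}
}

// reconcileDeprovision mirrors the provision path. Per the comment above,
// this enqueue step is what the deprovision path was missing, so the
// catalog never polled last_operation after an async delete.
func reconcileDeprovision(q *pollingQueue, key string, resp brokerResponse) {
	if resp.Async {
		q.Add(key)
	}
}

func main() {
	q := &pollingQueue{}
	// "example-ns/example-instance" is an illustrative key, not a real one.
	reconcileProvision(q, "example-ns/example-instance", brokerResponse{Async: true})
	reconcileDeprovision(q, "example-ns/example-instance", brokerResponse{Async: true})
	fmt.Println("instances queued for polling:", len(q.keys))
}
```

Without the `q.Add` call in the deprovision path, an asynchronous delete would be accepted by the broker but its completion would never be observed by the catalog, which matches the symptom reported in this bug.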
Checked with ansible-service-broker-0.9.11, and deprovision still failed.

# oc describe instance mediawiki-apb-hzp09 -n wjiang
Name:         mediawiki-apb-hzp09
Namespace:    wjiang
Labels:       <none>
Events:
  FirstSeen   LastSeen   Count   From                                 SubObjectPath   Type      Reason                  Message
  ---------   --------   -----   ----                                 -------------   --------  ------                  -------
  24m         24m        1       service-catalog-controller-manager                   Normal    Provisioning            The instance is being provisioned asynchronously
  16m         2m         5       service-catalog-controller-manager                   Warning   DeprovisionCallFailed   deprovision call failed

# oc logs -f apb-03f4bf6c-80cc-44c8-af51-691dcc84f843 -n wjiang
Openshift cluster credentials not provided.
Assuming the broker is running inside an Openshift cluster
Attempting to login with a service account...
Logged into "https://kubernetes.default:443" as "system:serviceaccount:wjiang:apb-03f4bf6c-80cc-44c8-af51-691dcc84f843" using the token provided.

You have one project on this server: "wjiang"

Using project "wjiang".
Welcome! See 'oc help' to get started.

PLAY [Deprovision mediawiki123-apb from openshift] *****************************

TASK [ansible.kubernetes-modules : Intall latest openshift client] *************
skipping: [localhost]

TASK [openshift_v1_route] ******************************************************
changed: [localhost]

TASK [k8s_v1_persistent_volume_claim] ******************************************
changed: [localhost]

TASK [openshift_v1_deployment_config] ******************************************
changed: [localhost]

TASK [k8s_v1_replication_controller] *******************************************
changed: [localhost]

TASK [openshift_v1_deployment_config] ******************************************
changed: [localhost]

TASK [k8s_v1_service] **********************************************************
changed: [localhost]

PLAY RECAP *********************************************************************
localhost                  : ok=6    changed=6    unreachable=0    failed=0

Logs from ansible-service-broker pod:
[2017-07-31T07:19:39.859Z] [INFO] ASYNC deprovision in progress
[2017-07-31T07:19:39.859Z] [NOTICE] ============================================================
[2017-07-31T07:19:39.859Z] [NOTICE] DEPROVISIONING
[2017-07-31T07:19:39.859Z] [NOTICE] ============================================================
[2017-07-31T07:19:39.859Z] [NOTICE] ServiceInstance.Id: 4fbcc051-03af-40c7-86e2-dc12977e6b5d
[2017-07-31T07:19:39.859Z] [NOTICE] ServiceInstance.Name: mediawiki-apb
[2017-07-31T07:19:39.859Z] [NOTICE] ServiceInstance.Image: openshift3/mediawiki-apb
[2017-07-31T07:19:39.859Z] [NOTICE] ServiceInstance.Description: Mediawiki123 apb implementation
[2017-07-31T07:19:39.859Z] [NOTICE] ============================================================
[2017-07-31T07:21:11.542Z] [ERROR] TIMED OUT WAITING FOR CONTAINER TO COME UP
[2017-07-31T07:21:11.542Z] [INFO] Destroying APB sandbox...
[2017-07-31T07:21:12.039Z] [ERROR] error from deprovision - &errors.errorString{s:"TIMED OUT WAITING FOR CONTAINER TO COME UP"}
[2017-07-31T07:21:12.039Z] [ERROR] broker::Deprovision error occurred.
[2017-07-31T07:21:12.039Z] [ERROR] TIMED OUT WAITING FOR CONTAINER TO COME UP
[2017-07-31T07:22:32.82Z] [ERROR] Could not find a service instance in dao - 100: Key not found (/service_instance/f9dd3a88-b297-411c-b447-df527afdc0db) [50]
10.129.0.1 - - [31/Jul/2017:07:22:32 +0000] "DELETE /v2/service_instances/f9dd3a88-b297-411c-b447-df527afdc0db?accepts_incomplete=true&plan_id=4c10ff42-be89-420a-9bab-27a9bef9aed8&service_id=4fbcc051-03af-40c7-86e2-dc12977e6b5d HTTP/1.1" 410 3
Moving back to ON_QA to ask for this to be retested.

For 3.6.0, the deprovision workflow is not ideal, yet it should be sufficient for tech-preview.

We expect that when mediawiki is deprovisioned, the resources the APB created will be cleaned up. The service instance will remain for some time afterwards because of an issue with Service Catalog reconciling state. This delay is expected for 3.6.0 and is noted in the release note BZ below.

Release Note for 3.6.0:
https://bugzilla.redhat.com/show_bug.cgi?id=1476012

BZ to track the fix in a future release:
https://bugzilla.redhat.com/show_bug.cgi?id=1475949

As for the errors in the ansible-service-broker log, those are extra noise, not true errors. The noise results from the Service Catalog calling deprovision multiple times on an APB that has already been deprovisioned.
https://bugzilla.redhat.com/show_bug.cgi?id=1476026
Checked again with:

# openshift version
openshift v3.6.172.0.1
kubernetes v1.6.1+5115d708d7
etcd 3.2.1

Marking the bug as verified, since the deprovision stage now deletes the resources created by the provision stage.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2017:3188