Bugzilla will be upgraded to version 5.0 on a still to be determined date in the near future. The original upgrade date has been delayed.
Bug 1507440 - some servicebindings will be blocked to unbind status when delete multi servicebindings at same time
some servicebindings will be blocked to unbind status when delete multi servi...
Status: CLOSED CURRENTRELEASE
Product: OpenShift Container Platform
Classification: Red Hat
Component: Service Broker (Show other bugs)
3.7.0
Unspecified Unspecified
urgent Severity urgent
: ---
: 3.9.0
Assigned To: Jay Boyd
XiuJuan Wang
:
Depends On:
Blocks:
  Show dependency treegraph
 
Reported: 2017-10-30 05:26 EDT by XiuJuan Wang
Modified: 2018-11-02 10:49 EDT (History)
17 users (show)

See Also:
Fixed In Version:
Doc Type: No Doc Update
Doc Text:
undefined
Story Points: ---
Clone Of:
Environment:
Last Closed: 2018-08-28 10:46:39 EDT
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


Attachments (Terms of Use)
failedtodeprovisionserviceinstance (124.70 KB, image/png)
2017-10-30 05:58 EDT, XiuJuan Wang
no flags Details
three x controller logs (8.64 MB, application/zip)
2018-02-08 11:39 EST, David Caldwell
no flags Details
oc describe all -n kube-service-catalog (15.98 KB, text/plain)
2018-02-08 11:40 EST, David Caldwell
no flags Details


External Trackers
Tracker ID Priority Status Summary Last Updated
Red Hat Knowledge Base (Solution) 3305241 None None None 2018-01-02 02:08 EST

  None (edit)
Description XiuJuan Wang 2017-10-30 05:26:20 EDT
Description of problem:
Provision a clusterserviceclass via browse catalog,after provision is ready,create binding twice.Can't deprovision serviceinstance via browse catalog after binding 2 mins.
Could deprovision serviceinstance successfully after binding immediately

Version-Release number of selected component (if applicable):
brew-*/ose-service-catalog   v3.7.0-0.185.0.0    ef0dd815574a        2 days ago          266 MB


How reproducible:
always

Steps to Reproduce:
1.Provision jenkins-ephemral clusterserviceclass via browse catalog
2.After serviceinstance is ready,creating binging twice
3.Wait 2 mins, deprovision serviceinstance via browse catalog

Actual results:
Can't deprovision successfully due to 'Delete instance blocked by existing ServiceBindings associated with this instance. All credentials must be removed first'

Bindings have been deleted when checked from cli side.

Expected results:
Can't deprovision successfully 

Additional info:
Comment 1 XiuJuan Wang 2017-10-30 05:58 EDT
Created attachment 1345358 [details]
failedtodeprovisionserviceinstance
Comment 2 Paul Morie 2017-10-30 10:49:23 EDT
This sounds to me like this is working as expected - deprovision of the instance is blocked until there are no bindings to the instance remaining.  I'm putting NEEDINFO on this so we can dig into why the bindings were not fully deleted after so much time.  Would you please provide:

- service catalog controller logs
- service catalog api server logs
- output of `oc get servicebindings -o yaml`
- output of `oc get serviceinstances -o yaml`
Comment 3 Matthew Staebler 2017-10-30 22:16:24 EDT
I have reproduced this bug.

When the Jenkins (Ephemeral) service is deleted via the web console, the two ServiceBindings and the ServiceInstance are all deleted in rapid succession. The service catalog controller sends Unbind requests to the template service broker for the two ServiceBindings in rapid succession. When the template service broker receives the first Unbind request, the template service broker makes modifications to the provisioned instance that temporarily makes it unavailable. When the template service broker receives the second Unbind request, the template service broker rejects the request with a 500 response indicating that the instance has been modified and the request should be retried with the latest version [1].

Subsequent Unbind requests for the binding succeed. The binding is deleted from the template service broker. However, the service catalog retains the ServiceBinding and thinks that it has not been unbound. The only reason that the service catalog sends another Unbind request is because there is an outstanding bug in the service catalog that causes the controller to repeat OSB requests. If that bug were fixed, then the service catalog would not send any additional Unibind requests after receiving the 500 response from the initial Unbind request for the binding.


[1] This is the full text of the 500 response from the template service broker. "f95fb251-289-4002-9323-0cfe751d2040" is the ID of the instance.

Operation cannot be fulfilled on brokertemplateinstances.template.openshift.io "f95fb251-289-4002-9323-0cfe751d2040": the object has been modified; please apply your changes to the latest version and try again
Comment 5 Matthew Staebler 2017-10-31 12:55:20 EDT
Upstream issue created to address this bug.

https://github.com/kubernetes-incubator/service-catalog/issues/1503
Comment 9 XiuJuan Wang 2017-11-10 04:47:40 EST
Update my bug description.

some servicebindings will be blocked to unbind status when delete multi servicebindings at same time.

#oc  get servicebinding  -n xiu12 
NAME                            AGE
mongodb-ephemeral-ncd5b-j78hl   51s
mongodb-ephemeral-ncd5b-mhhhw   48s
mongodb-ephemeral-ncd5b-wbvbk   45s
# oc  delete servicebinding  --all  -n xiu12 
servicebinding "mongodb-ephemeral-ncd5b-j78hl" deleted
servicebinding "mongodb-ephemeral-ncd5b-mhhhw" deleted
servicebinding "mongodb-ephemeral-ncd5b-wbvbk" deleted

oc  get servicebinding  -n xiu12 
NAME                            AGE
mongodb-ephemeral-ncd5b-mhhhw   3m
mongodb-ephemeral-ncd5b-wbvbk   3m

# oc  get  servicebinding mongodb-ephemeral-ncd5b-mhhhw  -n xiu12  -o yaml 
apiVersion: servicecatalog.k8s.io/v1beta1
kind: ServiceBinding
metadata:
  creationTimestamp: 2017-11-10T09:42:15Z
  deletionGracePeriodSeconds: 0
  deletionTimestamp: 2017-11-10T09:43:12Z
  finalizers:
  - kubernetes-incubator/service-catalog
  generateName: mongodb-ephemeral-ncd5b-
  generation: 2
  name: mongodb-ephemeral-ncd5b-mhhhw
  namespace: xiu12
  resourceVersion: "16515"
  selfLink: /apis/servicecatalog.k8s.io/v1beta1/namespaces/xiu12/servicebindings/mongodb-ephemeral-ncd5b-mhhhw
  uid: 6f8e8246-c5fb-11e7-a8a8-0a580a800003
spec:
  externalID: 15f1c433-03a3-4912-b6d6-30bf29003a29
  instanceRef:
    name: mongodb-ephemeral-ncd5b
  secretName: mongodb-ephemeral-ncd5b-credentials-gk1s7
  userInfo:
    groups:
    - system:cluster-admins
    - system:authenticated
    uid: ""
    username: system:admin
status:
  conditions:
  - lastTransitionTime: 2017-11-10T09:43:12Z
    message: 'Unbind call failed. Error unbinding from ServiceInstance "xiu12/mongodb-ephemeral-ncd5b"
      of ClusterServiceClass (K8S: "4926f6cb-c5ec-11e7-9993-fa163e1a3633" ExternalName:
      "mongodb-ephemeral") at ClusterServiceBroker "template-service-broker": Status:
      500; ErrorMessage: <nil>; Description: Operation cannot be fulfilled on brokertemplateinstances.template.openshift.io
      "e9fd4c45-e565-42be-80bc-2a67165011bb": the object has been modified; please
      apply your changes to the latest version and try again; ResponseError: <nil>'
    reason: UnbindCallFailed
    status: Unknown
    type: Ready
  - lastTransitionTime: 2017-11-10T09:43:12Z
    message: 'Unbind call failed. Error unbinding from ServiceInstance "xiu12/mongodb-ephemeral-ncd5b"
      of ClusterServiceClass (K8S: "4926f6cb-c5ec-11e7-9993-fa163e1a3633" ExternalName:
      "mongodb-ephemeral") at ClusterServiceBroker "template-service-broker": Status:
      500; ErrorMessage: <nil>; Description: Operation cannot be fulfilled on brokertemplateinstances.template.openshift.io
      "e9fd4c45-e565-42be-80bc-2a67165011bb": the object has been modified; please
      apply your changes to the latest version and try again; ResponseError: <nil>'
    reason: UnbindCallFailed
    status: "True"
    type: Failed
  externalProperties:
    userInfo:
      extra:
        scopes.authorization.openshift.io:
        - user:full
      groups:
      - system:authenticated:oauth
      - system:authenticated
      uid: ""
      username: xiuwang
  orphanMitigationInProgress: false
  reconciledGeneration: 2
Comment 10 XiuJuan Wang 2017-11-10 04:58:57 EST
Depend on https://bugzilla.redhat.com/show_bug.cgi?id=1511847.
When deprovision serviceinstance from web console,multi servicebindings will be removed firstly at same time.So there is the behavior as comment #0 description.
Comment 14 Jay Boyd 2018-01-25 09:28:55 EST
I reproduce this in MASTER, one of the unbind requests fails with error 500:

I0125 13:59:28.209966       1 event.go:218] Event(v1.ObjectReference{Kind:"ServiceBinding", Namespace:"proj", Name:"jenkins-ephemeral-gxs6s-9q5jk", UID:"9acd5e9f-0087-11e8-a158-0242ac110004", APIVersion:"servic
ecatalog.k8s.io", ResourceVersion:"686", FieldPath:""}): type: 'Warning' reason: 'UnbindCallFailed' Error unbinding from ServiceInstance "proj/jenkins-ephemeral-gxs6s" of ClusterServiceClass (K8S: "a61a7f29-008
6-11e8-a43f-c85b76aab8ec" ExternalName: "jenkins-ephemeral") at ClusterServiceBroker "template-service-broker": Status: 500; ErrorMessage: <nil>; Description: Operation cannot be fulfilled on brokertemplateinst
ances.template.openshift.io "958386a6-dd59-4f9e-b526-a0b3c73a1225": the object has been modified; please apply your changes to the latest version and try again; ResponseError: <nil>


and any subsequent delete attempts are rejected:
I0125 14:01:50.966188       1 controller_binding.go:615] ServiceBinding "proj/jenkins-ephemeral-gxs6s-9q5jk": Processing Delete
I0125 14:01:50.966207       1 controller_binding.go:625] ServiceBinding "proj/jenkins-ephemeral-gxs6s-9q5jk": Not processing delete event because unbinding has failed


This has been recently fixed upstream by https://github.com/kubernetes-incubator/service-catalog/pull/1653  on January 8, 2018.

There is no work around, --force=true does not work.
Comment 19 Jay Boyd 2018-02-07 10:35:25 EST
fixed by https://github.com/openshift/origin/pull/18480
Comment 23 David Caldwell 2018-02-08 11:39 EST
Created attachment 1393263 [details]
three x controller logs

Attach in response to comment #22
Comment 24 David Caldwell 2018-02-08 11:40 EST
Created attachment 1393264 [details]
oc describe all -n kube-service-catalog

Attached in response to comment #22
Comment 30 XiuJuan Wang 2018-02-12 21:47:12 EST
The fix has merged into ocp v3.9.0-0.42.0.
And can't reproduce this bug,could deprovision serviceinstance successfully when multiple servicebinding  existing.
Please help set this bug to on_qa when you are free.
Thanks
Comment 31 XiuJuan Wang 2018-02-13 21:31:24 EST
As comment #30, move this bug to verified
Comment 42 Alexander Rydekull 2018-11-02 07:00:50 EDT
Issue still exists in 3.11:

[root@ocpmaster1 ~]# oc get projects
NAME                                DISPLAY NAME        STATUS
default                                                 Active
kube-public                                             Active
kube-service-catalog                                    Active
kube-system                                             Active
management-infra                                        Active
openshift                                               Active
openshift-ansible-service-broker                        Active
openshift-console                                       Active
openshift-infra                                         Active
openshift-logging                                       Active
openshift-monitoring                                    Active
openshift-node                                          Active
openshift-sdn                                           Active
openshift-storage                                       Active
openshift-template-service-broker                       Active
openshift-web-console                                   Active
test                                                    Terminating
[root@ocpmaster1 ~]# oc get projects | grep Terminating
test                                                    Terminating
[root@ocpmaster1 ~]# oc get serviceinstance -n test
NAME                           CLASS                                        PLAN      STATUS         AGE
django-psql-persistent-qjqnb   ClusterServiceClass/django-psql-persistent   default   Provisioning   1d
[root@ocpmaster1 ~]# 

Implement workaround from: https://github.com/openshift/origin/issues/19660#issuecomment-388797440

[root@ocpmaster1 ~]# for i in $(oc get projects  | grep Terminating| awk '{print $1}'); do echo $i; oc get serviceinstance -n $i -o yaml | sed "/kubernetes-incubator/d"| oc apply -f - ; done
test
Warning: oc apply should be used on resource created by either oc create --save-config or oc apply
serviceinstance.servicecatalog.k8s.io/django-psql-persistent-qjqnb configured
[root@ocpmaster1 ~]#

Verify that it is gone.

[root@ocpmaster1 ~]# oc get projects
NAME                                DISPLAY NAME        STATUS
default                                                 Active
kube-public                                             Active
kube-service-catalog                                    Active
kube-system                                             Active
management-infra                                        Active
openshift                                               Active
openshift-ansible-service-broker                        Active
openshift-console                                       Active
openshift-infra                                         Active
openshift-logging                                       Active
openshift-monitoring                                    Active
openshift-node                                          Active
openshift-sdn                                           Active
openshift-storage                                       Active
openshift-template-service-broker                       Active
openshift-web-console                                   Active
[root@ocpmaster1 ~]#


Success!
Comment 43 Jay Boyd 2018-11-02 10:49:53 EDT
Hey Alex, this is a thorny issue.  Initially there were some bugs in earlier releases attributing to these issues where we had Service Instances or Service Bindings that clould not be deleted.  What we primarily see now though is the Service Broker is having issues freeing up resources when the user asks for them to be deleted.  Customers may be charged for any resources allocated by a Broker so we don't want to proceed with forgetting about them.

Because the Broker is telling Service Catalog that the resources may not be properly deleted or are in some unknown state, Service Catalog is not going to free up the tracking of them.

Take a look at the status of the resources - try `oc describe django-psql-persistent-qjqnb` and review the events and status.  

This is well documented by https://access.redhat.com/articles/3441161

Without knowing the status of your service instance, I'd initially say this is working as designed.   That work around you quoted is a really big hammer and would not be advisable in any production environment.  If the status shows some issues/errors from the Broker, please take it up with the Broker team.   If you think you have other issues, please open a new BZ.

You may want to review https://bugzilla.redhat.com/show_bug.cgi?id=1623918#c9 (comments 9 & 10) - my guess is your issue falls into this category.

Note You need to log in before you can comment on or make changes to this bug.