Bug 1610714 - [ASB] async binding failed
Summary: [ASB] async binding failed
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Service Broker
Version: 3.11.0
Hardware: Unspecified
OS: Unspecified
medium
medium
Target Milestone: ---
: 3.11.0
Assignee: Jesus M. Rodriguez
QA Contact: Jian Zhang
URL:
Whiteboard:
Depends On:
Blocks:
TreeView+ depends on / blocked
 
Reported: 2018-08-01 09:07 UTC by Jian Zhang
Modified: 2018-10-11 07:23 UTC (History)
6 users (show)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2018-10-11 07:23:00 UTC
Target Upstream Version:
Embargoed:


Attachments (Terms of Use)


Links
System ID Private Priority Status Summary Last Updated
Red Hat Product Errata RHBA-2018:2652 0 None None None 2018-10-11 07:23:50 UTC

Description Jian Zhang 2018-08-01 09:07:02 UTC
Description of problem:

Async binding failed. Errors:
time="2018-08-01T08:12:58Z" level=debug msg="state: failed"
time="2018-08-01T08:12:58Z" level=debug msg="description: Error occurred during bind. Please contact administrator if the issue persists."
time="2018-08-01T08:12:58Z" level=debug msg="job state has an error. Assuming that any error here is human readable. err - Pod [ bundle-438177b2-6493-4dfd-857f-8c5f3d4843e1 ] failed with exit code [2]"


Version-Release number of selected component (if applicable):
ASB version: 1.3.5
SC version: v0.1.25

How reproducible:
always

Steps to Reproduce:
1.  Enable the ASYNC of the ASB, set the launch_apb_on_bind to true as the following:
broker:
  ...
  launch_apb_on_bind: true

2.  Provision an DB APB(for example, PostgreSQL) in web console.
Since our APBs stored in the stage registry have not support async yet, so we need to use the below registry for this test.
  - type: dockerhub
    name: dh
    url:  https://registry.hub.docker.com 
    org:  mhrivnak
    tag:  latest
    white_list:
      - ".*-apb$"

3. Provision the PostgreSQL, and then create a binding.

Actual results:
time="2018-08-01T08:12:58Z" level=debug msg="description: Error occurred during bind. Please contact administrator if the issue persists."
time="2018-08-01T08:12:58Z" level=debug msg="job state has an error. Assuming that any error here is human readable. err - Pod [ bundle-438177b2-6493-4dfd-857f-8c5f3d4843e1 ] failed with exit code [2]"
10.128.0.8 - - [01/Aug/2018:08:12:58 +0000] "GET /ansible-service-broker/v2/service_instances/662dfe22-955d-11e8-805b-0a580a800004/service_bindings/194cf865-9562-11e8-805b-0a580a800004/last_operation?operation=0ebaa467-eeb9-47fc-949e-a024d2e5cbb4&plan_id=7f4a5e35e4af2beb70076e72fab0b7ff&service_id=1dda1477cace09730bd8ed7a6505607e HTTP/1.1" 200 124

[root@ip-172-18-15-195 ~]# oc describe bundlebinding 194cf865-9562-11e8-805b-0a580a800004
Name:         194cf865-9562-11e8-805b-0a580a800004
Namespace:    openshift-ansible-service-broker
Labels:       <none>
Annotations:  <none>
API Version:  automationbroker.io/v1alpha1
Kind:         BundleBinding
Metadata:
  Creation Timestamp:  2018-08-01T08:08:40Z
  Generation:          1
  Resource Version:    58571
  Self Link:           /apis/automationbroker.io/v1alpha1/namespaces/openshift-ansible-service-broker/bundlebindings/194cf865-9562-11e8-805b-0a580a800004
  UID:                 198ac7b9-9562-11e8-a230-0ee5a8cf3156
Spec:
  Bundle Instance:
    Name:      662dfe22-955d-11e8-805b-0a580a800004
  Parameters:  {"_apb_last_requesting_user":"jiazha","_apb_plan_id":"dev","_apb_provision_creds":{"DB_ADMIN_PASSWORD":"7oQfhZCc1ub7Ftx0HxFs","DB_HOST":"postgresql","DB_NAME":"admin","DB_PASSWORD":"test","DB_PORT":"5432","DB_TYPE":"postgres","DB_USER":"admin"},"_apb_service_binding_id":"194cf865-9562-11e8-805b-0a580a800004","_apb_service_class_id":"1dda1477cace09730bd8ed7a6505607e","_apb_service_instance_id":"662dfe22-955d-11e8-805b-0a580a800004"}
Status:
  Jobs:
    0 Ebaa 467 - Eeb 9 - 47 Fc - 949 E - A 024 D 2 E 5 Cbb 4:
      Description:         Error occurred during bind. Please contact administrator if the issue persists.
      Error:               Pod [ bundle-438177b2-6493-4dfd-857f-8c5f3d4843e1 ] failed with exit code [2]
      Last Modified Time:  2018-08-01T08:11:13Z
      Method:              bind
      Podname:             
      State:               failed
  Last Description:        Error occurred during bind. Please contact administrator if the issue persists.
  State:                   failed
Events:                    <none>

Expected results:
Async binding success.

Additional info:

Comment 1 Jesus M. Rodriguez 2018-08-06 19:53:05 UTC
Registry in configuration:

data:
  broker-config: |
    registry:
      - type: dockerhub
        name: dh
        url: https://registry.hub.docker.com
        org: mhrivnak
        tag: latest
        white_list:
          - '.*-apb$'

Final output from bind job:

time="2018-08-06T19:13:32Z" level=debug msg="get binding instance: ca878c37-99ac-11e8-bddd-0242ac110007"
time="2018-08-06T19:13:32Z" level=debug msg="service_id: 1dda1477cace09730bd8ed7a6505607e"
time="2018-08-06T19:13:32Z" level=debug msg="plan_id: 7f4a5e35e4af2beb70076e72fab0b7ff"
time="2018-08-06T19:13:32Z" level=debug msg="operation:  f6b3d487-09b8-409c-9a0f-1f96c349ca52"
time="2018-08-06T19:13:32Z" level=debug msg="state: succeeded"
time="2018-08-06T19:13:32Z" level=debug msg="description: bind job completed"
172.17.0.10 - - [06/Aug/2018:19:13:32 +0000] "GET /automation-broker/v2/service_instances/9a2672a8-99ac-11e8-bddd-0242ac110007/service_bindings/ca878c37-99ac-11e8-bddd-0242ac110007/last_operation?operation=f6b3d487-09b8-409c-9a0f-1f96c349ca52&plan_id=7f4a5e35e4af2beb70076e72fab0b7ff&service_id=1dda1477cace09730bd8ed7a6505607e HTTP/1.1" 200 66

Bundlebinding CR

$ oc describe bundlebindings ca878c37-99ac-11e8-bddd-0242ac110007
Name:         ca878c37-99ac-11e8-bddd-0242ac110007
Namespace:    automation-broker
Labels:       <none>
Annotations:  <none>
API Version:  automationbroker.io/v1alpha1
Kind:         BundleBinding
Metadata:
  Cluster Name:
  Creation Timestamp:  2018-08-06T19:13:24Z
  Generation:          1
  Resource Version:    14372
  Self Link:           /apis/automationbroker.io/v1alpha1/namespaces/automation-broker/bundlebindings/ca878c37-99ac-11e8-bddd-0242ac110007
  UID:                 cabaee21-99ac-11e8-b60c-c85b76145add
Spec:
  Bundle Instance:
    Name:      9a2672a8-99ac-11e8-bddd-0242ac110007
  Parameters:  {"_apb_last_requesting_user":"admin","_apb_plan_id":"dev","_apb_service_binding_id":"ca878c37-99ac-11e8-bddd-0242ac110007","_apb_service_class_id":"1dda1477cace09730bd8ed7a6505607e","_apb_service_instance_id":"9a2672a8-99ac-11e8-bddd-0242ac110007"}
Status:
  Jobs:
    F 6 B 3 D 487 - 09 B 8 - 409 C - 9 A 0 F - 1 F 96 C 349 Ca 52:
      Description:         bind job completed
      Last Modified Time:  2018-08-06T19:13:31Z
      Method:              bind
      Podname:
      State:               succeeded
  Last Description:        bind job completed
  State:                   succeeded
Events:                    <none>

Comment 2 Jesus M. Rodriguez 2018-08-06 19:56:41 UTC
I was not able to recreate the failure. Using mhrivnak's repo and APB for PostgreSQL, I was able to provision and create a binding.

In the original comment, it looks like the APB's bind playbook failed with exit code 2. The broker is behaving correctly as the error came from the APB. The broker handled the error and remained running.

The image I see for the APB I used is as follows:


$ docker images | grep mhrivnak
mhrivnak/postgresql-apb                                latest              981d00a7b33b        5 months ago        1.47GB

Comment 3 Jesus M. Rodriguez 2018-08-06 20:29:52 UTC
During a failure the apb for the bind should still be around. It would be helpful to capture the log of that pod. Might be something like dh-postgresq-bind*... something like that. It will definitely had -bind- in the name.

oc logs of that pod to see what the APB reported beyond the exit code.

Comment 4 Jian Zhang 2018-08-08 08:22:23 UTC
Jesus,

I retest it with the latest version:
ASB: 1.3.7
Service catalog: v0.1.27

Unfortunately, I cannot reproduce this issue either. The binding action was succeeded, I do not know why. Maybe the issue has been fixed in somewhere in the latest version. But, I confirm I encountered this issue with the 1.3.5 version.

And, For the 1.3.7 version, unbinding was failed, as below:

time="2018-08-08T08:13:44Z" level=info msg="ASYNC unbinding in progress"
time="2018-08-08T08:13:44Z" level=debug msg="set job state for instance: 0c985cb7-9ae1-11e8-8391-0a580a800004 token: 7d5287a5-2939-4b2c-b592-4ce78acc40bb"
time="2018-08-08T08:13:44Z" level=error msg="Could not find binding 0c985cb7-9ae1-11e8-8391-0a580a800004 associated with job state 7d5287a5-2939-4b2c-b592-4ce78acc40bb - bundlebindings.automationbroker.io \"0c985cb7-9ae1-11e8-8391-0a580a800004\" not found"
time="2018-08-08T08:13:44Z" level=error msg="Failed to start new job for async unbind\nbundlebindings.automationbroker.io \"0c985cb7-9ae1-11e8-8391-0a580a800004\" not found"
time="2018-08-08T08:13:44Z" level=error msg="Unknown error: bundlebindings.automationbroker.io \"0c985cb7-9ae1-11e8-8391-0a580a800004\" not found"
10.128.0.1 - - [08/Aug/2018:08:13:44 +0000] "DELETE /ansible-service-broker/v2/service_instances/0c985cb7-9ae1-11e8-8391-0a580a800004/service_bindings/3c6adaf2-9ae1-11e8-8391-0a580a800004?accepts_incomplete=true&plan_id=7f4a5e35e4af2beb70076e72fab0b7ff&service_id=1dda1477cace09730bd8ed7a6505607e HTTP/1.1" 500 109

Comment 5 Jesus M. Rodriguez 2018-08-17 21:06:07 UTC
Unbind problem fixed by PR #1050
https://github.com/openshift/ansible-service-broker/pull/1050

Comment 6 Dylan Murray 2018-08-22 14:47:44 UTC
New builds added to advisory:
openshift-enterprise-asb-container-v3.11.0-0.20.0.0
openshift-enterprise-apb-tools-container-v3.11.0-0.20.0.0

Comment 7 Jian Zhang 2018-08-23 07:54:43 UTC
It works well, LGTM.

The SC version: v3.11.0-0.20.0;Upstream:v0.1.29
The ASB version: 1.3.11

Comment 9 errata-xmlrpc 2018-10-11 07:23:00 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2018:2652


Note You need to log in before you can comment on or make changes to this bug.