Note: This bug is displayed in read-only format because the product is no longer active in Red Hat Bugzilla.

Bug 1583503

Summary: After failed update, subsequent identical update request returned as success
Product: OpenShift Container Platform
Reporter: Zhang Cheng <chezhang>
Component: Service Broker
Assignee: Jason Montleon <jmontleo>
Status: CLOSED ERRATA
QA Contact: Zhang Cheng <chezhang>
Severity: medium
Docs Contact:
Priority: medium
Version: 3.10.0
CC: aos-bugs, dzager, jaboyd, jiazha, jmatthew, jmontleo, jpeeler, mluksa, pmorie, zhsun, zitang
Target Milestone: ---
Target Release: 3.11.0
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: No Doc Update
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2018-10-11 07:20:02 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On: 1615229
Bug Blocks:

Description Zhang Cheng 2018-05-29 07:03:24 UTC
Description of problem: 
ServiceInstance update fails after recovering planUpdatable from "false" to "true"


service-catalog and ASB image versions:
# service-catalog --version
v3.10.0-0.53.0;Upstream:v0.1.19
# asbd --version
1.2.14


How reproducible:
Always


Steps to Reproduce:
1. Provision a PostgreSQL APB (with the dev plan) from the web console
2. Update clusterserviceclass of PostgreSQL (APB) and change PlanUpdatable=false
3. Try to update spec.clusterServicePlanExternalName of serviceinstance (from "dev" to "prod")
4. Update clusterserviceclass of PostgreSQL (APB) and change PlanUpdatable=true
5. Try to update spec.clusterServicePlanExternalName of serviceinstance (from "dev" to "prod")
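For reference, steps 2-5 can be driven from the CLI roughly as follows. This is a hedged sketch: the ClusterServiceClass UID and instance name are taken from the output later in this report, and `spec.planUpdatable` is assumed to be the field backing PlanUpdatable; adjust names for your cluster.

```shell
# Step 2: disable plan updates on the class (UID taken from this report).
oc patch clusterserviceclass d5915e05b253df421efe6e41fb6a66ba \
  --type=merge -p '{"spec":{"planUpdatable":false}}'

# Step 3: request the plan change on the instance (expected to be rejected).
oc patch serviceinstance rh-postgresql-apb-mrcxz -n test3 \
  --type=merge -p '{"spec":{"clusterServicePlanExternalName":"prod"}}'

# Step 4: re-enable plan updates on the class.
oc patch clusterserviceclass d5915e05b253df421efe6e41fb6a66ba \
  --type=merge -p '{"spec":{"planUpdatable":true}}'

# Step 5: retry the same plan change and watch the instance conditions.
oc patch serviceinstance rh-postgresql-apb-mrcxz -n test3 \
  --type=merge -p '{"spec":{"clusterServicePlanExternalName":"prod"}}'
oc describe serviceinstance rh-postgresql-apb-mrcxz -n test3
```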


Actual results:  
3. ServiceInstance update failed, as expected.
5. The sandbox pod failed at "TASK [postgresql-apb : Restore database]" and a warning event was emitted: "UpdateInstanceCallFailed, Update call failed: Error occurred during update. Please contact administrator if the issue persists." However, the final event and message say "The instance was updated successfully".

Sandbox log:
# oc logs apb-bf6d0246-57b7-48c6-b00b-7b1e51e9101b

PLAY [postgresql-apb update] ***************************************************

TASK [ansibleplaybookbundle.asb-modules : debug] *******************************
skipping: [localhost]

TASK [postgresql-apb : Update last operation] **********************************
changed: [localhost]

TASK [postgresql-apb : Determine cluster type] *********************************
changed: [localhost]

TASK [postgresql-apb : Set facts] **********************************************
ok: [localhost]

TASK [postgresql-apb : Include variables based on ansible version] *************
skipping: [localhost]

TASK [postgresql-apb : Backup database] ****************************************
changed: [localhost]

TASK [postgresql-apb : Rsync to APB] *******************************************
changed: [localhost]

TASK [postgresql-apb : Take down old deployment and pvc (if it exists)] ********
ok: [localhost] => (item=pvc.yaml)
changed: [localhost] => (item=deployment.yaml)

TASK [postgresql-apb : Update last operation] **********************************
changed: [localhost]

TASK [postgresql-apb : Set postgresql objects state=present] *******************
ok: [localhost] => (item={u'state': u'present', u'name': u'service.yaml'})
changed: [localhost] => (item={u'state': u'present', u'name': u'pvc.yaml'})
changed: [localhost] => (item={u'state': u'present', u'name': u'deployment.yaml'})

TASK [postgresql-apb : Update last operation] **********************************
changed: [localhost]

TASK [postgresql-apb : Wait for deployment to become available] ****************
FAILED - RETRYING: Wait for deployment to become available (12 retries left).Result was: {
    "attempts": 1, 
    "changed": false, 
    "msg": "Deployment available status: False", 
    "retries": 13
}
FAILED - RETRYING: Wait for deployment to become available (11 retries left).Result was: {
    "attempts": 2, 
    "changed": false, 
    "msg": "Deployment available status: False", 
    "retries": 13
}
ok: [localhost] => {
    "attempts": 3, 
    "msg": "Deployment available status: False"
}

TASK [postgresql-apb : rsync db backup to new pod] *****************************
changed: [localhost]

TASK [postgresql-apb : Restore database] ***************************************
fatal: [localhost]: FAILED! => {"changed": true, "cmd": "oc exec -it -n test3 postgresql-95a70ff4-6302-11e8-a7d4-0a580a80000c-1-7sxvc -- /bin/bash -c \"psql -f /tmp/db/dump\"", "delta": "0:00:02.188371", "end": "2018-05-29 05:45:45.956295", "msg": "non-zero return code", "rc": 2, "start": "2018-05-29 05:45:43.767924", "stderr": "Unable to use a TTY - input is not a terminal or the right kind of file\npsql:/tmp/db/dump:14: ERROR:  role \"admin\" already exists\npsql:/tmp/db/dump:16: ERROR:  role \"postgres\" already exists\npsql:/tmp/db/dump:28: ERROR:  duplicate key value violates unique constraint \"pg_database_datname_index\"\nDETAIL:  Key (datname)=(admin) already exists.\npsql:/tmp/db/dump:30: FATAL:  terminating connection due to administrator command\npsql:/tmp/db/dump:33: \\connect: FATAL:  the database system is shutting down\ncommand terminated with exit code 2", "stderr_lines": ["Unable to use a TTY - input is not a terminal or the right kind of file", "psql:/tmp/db/dump:14: ERROR:  role \"admin\" already exists", "psql:/tmp/db/dump:16: ERROR:  role \"postgres\" already exists", "psql:/tmp/db/dump:28: ERROR:  duplicate key value violates unique constraint \"pg_database_datname_index\"", "DETAIL:  Key (datname)=(admin) already exists.", "psql:/tmp/db/dump:30: FATAL:  terminating connection due to administrator command", "psql:/tmp/db/dump:33: \\connect: FATAL:  the database system is shutting down", "command terminated with exit code 2"], "stdout": "SET\nSET\nSET\nALTER ROLE\nALTER ROLE\nREVOKE\nGRANT", "stdout_lines": ["SET", "SET", "SET", "ALTER ROLE", "ALTER ROLE", "REVOKE", "GRANT"]}

PLAY RECAP *********************************************************************
localhost                  : ok=11   changed=9    unreachable=0    failed=1   


ServiceInstance description:
# oc describe serviceinstance -n test3
Name:         rh-postgresql-apb-mrcxz
Namespace:    test3
Labels:       app=test-instance
Annotations:  <none>
API Version:  servicecatalog.k8s.io/v1beta1
Kind:         ServiceInstance
Metadata:
  Creation Timestamp:  2018-05-29T05:38:58Z
  Finalizers:
    kubernetes-incubator/service-catalog
  Generate Name:     rh-postgresql-apb-
  Generation:        2
  Resource Version:  32144
  Self Link:         /apis/servicecatalog.k8s.io/v1beta1/namespaces/test3/serviceinstances/rh-postgresql-apb-mrcxz
  UID:               95a710d8-6302-11e8-a7d4-0a580a80000c
Spec:
  Cluster Service Class External Name:  rh-postgresql-apb
  Cluster Service Class Ref:
    Name:                              d5915e05b253df421efe6e41fb6a66ba
  Cluster Service Plan External Name:  prod
  Cluster Service Plan Ref:
    Name:       4acaf1511a92890cd8910b1d8473be97
  External ID:  95a70ff4-6302-11e8-a7d4-0a580a80000c
  Parameters From:
    Secret Key Ref:
      Key:          parameters
      Name:         rh-postgresql-apb-parameters2xn3j
  Update Requests:  0
  User Info:
    Groups:
      system:masters
      system:cluster-admins
      system:authenticated
    UID:       
    Username:  system:admin
Status:
  Async Op In Progress:  false
  Conditions:
    Last Transition Time:  2018-05-29T05:46:38Z
    Message:               The instance was updated successfully
    Reason:                InstanceUpdatedSuccessfully
    Status:                True
    Type:                  Ready
  Deprovision Status:      Required
  External Properties:
    Cluster Service Plan External ID:    4acaf1511a92890cd8910b1d8473be97
    Cluster Service Plan External Name:  prod
    Parameter Checksum:                  dbac93dc4526449a7c2650fa3916e9370052e331128cba9d8d3c0567393fa453
    Parameters:
      Postgresql _ Database:  <redacted>
      Postgresql _ Password:  <redacted>
      Postgresql _ User:      <redacted>
      Postgresql _ Version:   <redacted>
    User Info:
      Groups:
        system:masters
        system:cluster-admins
        system:authenticated
      UID:                        
      Username:                   system:admin
  Observed Generation:            2
  Orphan Mitigation In Progress:  false
  Provision Status:               Provisioned
  Reconciled Generation:          2
Events:
  Type     Reason                       Age                From                                Message
  ----     ------                       ----               ----                                -------
  Normal   Provisioning                 10m                service-catalog-controller-manager  The instance is being provisioned asynchronously
  Normal   Provisioning                 10m (x2 over 10m)  service-catalog-controller-manager  The instance is being provisioned asynchronously (action started)
  Normal   Provisioning                 10m                service-catalog-controller-manager  The instance is being provisioned asynchronously (0%: Starting)
  Normal   ProvisionedSuccessfully      9m                 service-catalog-controller-manager  The instance was provisioned successfully
  Normal   UpdatingInstance             4m                 service-catalog-controller-manager  The instance is being updated asynchronously
  Normal   UpdatingInstance             4m (x2 over 4m)    service-catalog-controller-manager  The instance is being updated asynchronously (action started)
  Normal   UpdatingInstance             4m                 service-catalog-controller-manager  The instance is being updated asynchronously (0%: Starting)
  Normal   UpdatingInstance             4m (x2 over 4m)    service-catalog-controller-manager  The instance is being updated asynchronously (25%: Backup complete)
  Normal   UpdatingInstance             3m                 service-catalog-controller-manager  The instance is being updated asynchronously (50%: Objects updated state=present)
  Warning  UpdateInstanceCallFailed     2m                 service-catalog-controller-manager  Update call failed: Error occurred during update. Please contact administrator if the issue persists.
  Normal   InstanceUpdatedSuccessfully  2m                 service-catalog-controller-manager  The instance was updated successfully


Expected results: 
5. The ServiceInstance update should report a consistent result everywhere (sandbox, events, ASB log, etc.).


Additional info: 
Attached ASB logs:
time="2018-05-29T05:44:30Z" level=info msg="============================================================"
time="2018-05-29T05:44:30Z" level=info msg="                       UPDATING                             "
time="2018-05-29T05:44:30Z" level=info msg="============================================================"
time="2018-05-29T05:44:30Z" level=info msg="Spec.ID: d5915e05b253df421efe6e41fb6a66ba"
time="2018-05-29T05:44:30Z" level=info msg="Spec.Name: rh-postgresql-apb"
[2018-05-29T05:44:30.726Z] [DEBUG] - JobStateSubscriber Notify : msg state {be44a5db-913a-45ca-acbd-7062603f92fd in progress  update  action started} 
[2018-05-29T05:44:30.726Z] [DEBUG] - set job state for instance: 95a70ff4-6302-11e8-a7d4-0a580a80000c token: be44a5db-913a-45ca-acbd-7062603f92fd
time="2018-05-29T05:44:30Z" level=info msg="Spec.Image: registry.access.stage.redhat.com/openshift3/postgresql-apb:v3.10"
time="2018-05-29T05:44:30Z" level=info msg="Spec.Description: SCL PostgreSQL apb implementation"
time="2018-05-29T05:44:30Z" level=info msg="============================================================"
time="2018-05-29T05:44:30Z" level=info msg="Checking if namespace %s exists.test3"
time="2018-05-29T05:44:30Z" level=info msg="Creating RoleBinding apb-bf6d0246-57b7-48c6-b00b-7b1e51e9101b"
time="2018-05-29T05:44:30Z" level=info msg="Creating RoleBinding apb-bf6d0246-57b7-48c6-b00b-7b1e51e9101b"
time="2018-05-29T05:44:30Z" level=info msg="Successfully created apb sandbox: [ %s ], with %s permissions in namespace %sapb-bf6d0246-57b7-48c6-b00b-7b1e51e9101beditrh-postgresql-apb-upda-xjvcb"
time="2018-05-29T05:44:30Z" level=info msg="Running post create sandbox fuctions if defined."
time="2018-05-29T05:44:30Z" level=info msg="Proxy configuration present. Applying to APB before execution:"
time="2018-05-29T05:44:30Z" level=info msg="HTTP_PROXY=\"http://file.rdu.redhat.com:3128\""
time="2018-05-29T05:44:30Z" level=info msg="HTTPS_PROXY=\"http://file.rdu.redhat.com:3128\""
time="2018-05-29T05:44:30Z" level=info msg="NO_PROXY=\".centralci.eng.rdu2.redhat.com,.cluster.local,.svc,169.254.169.254,172.16.120.88,172.30.0.1,qe-chezhang-round4master-etcd-1,qe-chezhang-round4nrr-1,.default\""
time="2018-05-29T05:44:30Z" level=info msg="Creating pod \"apb-bf6d0246-57b7-48c6-b00b-7b1e51e9101b\" in the rh-postgresql-apb-upda-xjvcb namespace"
[2018-05-29T05:44:31.762Z] [DEBUG] - service_id: d5915e05b253df421efe6e41fb6a66ba
[2018-05-29T05:44:31.762Z] [DEBUG] - plan_id: 4acaf1511a92890cd8910b1d8473be97
[2018-05-29T05:44:31.762Z] [DEBUG] - operation:  be44a5db-913a-45ca-acbd-7062603f92fd
[2018-05-29T05:44:31.767Z] [DEBUG] - state: in progress
[2018-05-29T05:44:31.767Z] [DEBUG] - description: action started
10.128.0.9 - - [29/May/2018:05:44:31 +0000] "GET /ansible-service-broker/v2/service_instances/95a70ff4-6302-11e8-a7d4-0a580a80000c/last_operation?operation=be44a5db-913a-45ca-acbd-7062603f92fd&plan_id=4acaf1511a92890cd8910b1d8473be97&service_id=d5915e05b253df421efe6e41fb6a66ba HTTP/1.1" 200 64
[2018-05-29T05:44:33.79Z] [DEBUG] - service_id: d5915e05b253df421efe6e41fb6a66ba
[2018-05-29T05:44:33.79Z] [DEBUG] - plan_id: 4acaf1511a92890cd8910b1d8473be97
[2018-05-29T05:44:33.79Z] [DEBUG] - operation:  be44a5db-913a-45ca-acbd-7062603f92fd
[2018-05-29T05:44:33.794Z] [DEBUG] - state: in progress
[2018-05-29T05:44:33.794Z] [DEBUG] - description: action started
10.128.0.9 - - [29/May/2018:05:44:33 +0000] "GET /ansible-service-broker/v2/service_instances/95a70ff4-6302-11e8-a7d4-0a580a80000c/last_operation?operation=be44a5db-913a-45ca-acbd-7062603f92fd&plan_id=4acaf1511a92890cd8910b1d8473be97&service_id=d5915e05b253df421efe6e41fb6a66ba HTTP/1.1" 200 64
[2018-05-29T05:44:35.64Z] [DEBUG] - JobStateSubscriber Notify : msg state {be44a5db-913a-45ca-acbd-7062603f92fd in progress  update  0%: Starting} 
[2018-05-29T05:44:35.641Z] [DEBUG] - set job state for instance: 95a70ff4-6302-11e8-a7d4-0a580a80000c token: be44a5db-913a-45ca-acbd-7062603f92fd
[2018-05-29T05:44:37.81Z] [DEBUG] - service_id: d5915e05b253df421efe6e41fb6a66ba
[2018-05-29T05:44:37.81Z] [DEBUG] - plan_id: 4acaf1511a92890cd8910b1d8473be97
[2018-05-29T05:44:37.81Z] [DEBUG] - operation:  be44a5db-913a-45ca-acbd-7062603f92fd
[2018-05-29T05:44:37.814Z] [DEBUG] - state: in progress
[2018-05-29T05:44:37.814Z] [DEBUG] - description: 0%: Starting
10.128.0.9 - - [29/May/2018:05:44:37 +0000] "GET /ansible-service-broker/v2/service_instances/95a70ff4-6302-11e8-a7d4-0a580a80000c/last_operation?operation=be44a5db-913a-45ca-acbd-7062603f92fd&plan_id=4acaf1511a92890cd8910b1d8473be97&service_id=d5915e05b253df421efe6e41fb6a66ba HTTP/1.1" 200 62
[2018-05-29T05:44:45.284Z] [DEBUG] - JobStateSubscriber Notify : msg state {be44a5db-913a-45ca-acbd-7062603f92fd in progress  update  25%: Backup complete} 
[2018-05-29T05:44:45.284Z] [DEBUG] - set job state for instance: 95a70ff4-6302-11e8-a7d4-0a580a80000c token: be44a5db-913a-45ca-acbd-7062603f92fd
[2018-05-29T05:44:45.838Z] [DEBUG] - service_id: d5915e05b253df421efe6e41fb6a66ba
[2018-05-29T05:44:45.838Z] [DEBUG] - plan_id: 4acaf1511a92890cd8910b1d8473be97
[2018-05-29T05:44:45.838Z] [DEBUG] - operation:  be44a5db-913a-45ca-acbd-7062603f92fd
[2018-05-29T05:44:45.846Z] [DEBUG] - state: in progress
[2018-05-29T05:44:45.846Z] [DEBUG] - description: 25%: Backup complete
10.128.0.9 - - [29/May/2018:05:44:45 +0000] "GET /ansible-service-broker/v2/service_instances/95a70ff4-6302-11e8-a7d4-0a580a80000c/last_operation?operation=be44a5db-913a-45ca-acbd-7062603f92fd&plan_id=4acaf1511a92890cd8910b1d8473be97&service_id=d5915e05b253df421efe6e41fb6a66ba HTTP/1.1" 200 70
[2018-05-29T05:45:01.877Z] [DEBUG] - service_id: d5915e05b253df421efe6e41fb6a66ba
[2018-05-29T05:45:01.877Z] [DEBUG] - plan_id: 4acaf1511a92890cd8910b1d8473be97
[2018-05-29T05:45:01.877Z] [DEBUG] - operation:  be44a5db-913a-45ca-acbd-7062603f92fd
[2018-05-29T05:45:01.882Z] [DEBUG] - state: in progress
[2018-05-29T05:45:01.882Z] [DEBUG] - description: 25%: Backup complete
10.128.0.9 - - [29/May/2018:05:45:01 +0000] "GET /ansible-service-broker/v2/service_instances/95a70ff4-6302-11e8-a7d4-0a580a80000c/last_operation?operation=be44a5db-913a-45ca-acbd-7062603f92fd&plan_id=4acaf1511a92890cd8910b1d8473be97&service_id=d5915e05b253df421efe6e41fb6a66ba HTTP/1.1" 200 70
[2018-05-29T05:45:20.947Z] [DEBUG] - JobStateSubscriber Notify : msg state {be44a5db-913a-45ca-acbd-7062603f92fd in progress  update  50%: Objects updated state=present} 
[2018-05-29T05:45:20.947Z] [DEBUG] - set job state for instance: 95a70ff4-6302-11e8-a7d4-0a580a80000c token: be44a5db-913a-45ca-acbd-7062603f92fd
[2018-05-29T05:45:33.905Z] [DEBUG] - service_id: d5915e05b253df421efe6e41fb6a66ba
[2018-05-29T05:45:33.905Z] [DEBUG] - plan_id: 4acaf1511a92890cd8910b1d8473be97
[2018-05-29T05:45:33.905Z] [DEBUG] - operation:  be44a5db-913a-45ca-acbd-7062603f92fd
[2018-05-29T05:45:33.909Z] [DEBUG] - state: in progress
[2018-05-29T05:45:33.909Z] [DEBUG] - description: 50%: Objects updated state=present
10.128.0.9 - - [29/May/2018:05:45:33 +0000] "GET /ansible-service-broker/v2/service_instances/95a70ff4-6302-11e8-a7d4-0a580a80000c/last_operation?operation=be44a5db-913a-45ca-acbd-7062603f92fd&plan_id=4acaf1511a92890cd8910b1d8473be97&service_id=d5915e05b253df421efe6e41fb6a66ba HTTP/1.1" 200 84
[2018-05-29T05:45:47.249Z] [DEBUG] - JobStateSubscriber Notify : msg state {be44a5db-913a-45ca-acbd-7062603f92fd in progress  update  50%: Objects updated state=present} 
[2018-05-29T05:45:47.249Z] [DEBUG] - set job state for instance: 95a70ff4-6302-11e8-a7d4-0a580a80000c token: be44a5db-913a-45ca-acbd-7062603f92fd
time="2018-05-29T05:45:47Z" level=error msg="Provision or Update action failed - Pod [ apb-bf6d0246-57b7-48c6-b00b-7b1e51e9101b ] failed with exit code [2]"
time="2018-05-29T05:45:47Z" level=info msg="Destroying APB sandbox..."
time="2018-05-29T05:45:47Z" level=info msg="Successfully deleted rolebinding apb-bf6d0246-57b7-48c6-b00b-7b1e51e9101b, namespace rh-postgresql-apb-upda-xjvcb"
time="2018-05-29T05:45:47Z" level=info msg="Successfully deleted rolebinding apb-bf6d0246-57b7-48c6-b00b-7b1e51e9101b, namespace test3"
time="2018-05-29T05:45:47Z" level=error msg="Update APB error: Pod [ apb-bf6d0246-57b7-48c6-b00b-7b1e51e9101b ] failed with exit code [2]"
[2018-05-29T05:45:47.265Z] [ERROR] - broker::update error occurred. Pod [ apb-bf6d0246-57b7-48c6-b00b-7b1e51e9101b ] failed with exit code [2]
[2018-05-29T05:45:47.265Z] [DEBUG] - closing channel for job be44a5db-913a-45ca-acbd-7062603f92fd
[2018-05-29T05:45:47.265Z] [DEBUG] - JobStateSubscriber Notify : msg state {be44a5db-913a-45ca-acbd-7062603f92fd failed  update Pod [ apb-bf6d0246-57b7-48c6-b00b-7b1e51e9101b ] failed with exit code [2] Error occurred during update. Please contact administrator if the issue persists.} 
[2018-05-29T05:45:47.265Z] [DEBUG] - set job state for instance: 95a70ff4-6302-11e8-a7d4-0a580a80000c token: be44a5db-913a-45ca-acbd-7062603f92fd
[2018-05-29T05:46:37.937Z] [DEBUG] - service_id: d5915e05b253df421efe6e41fb6a66ba
[2018-05-29T05:46:37.937Z] [DEBUG] - plan_id: 4acaf1511a92890cd8910b1d8473be97
[2018-05-29T05:46:37.937Z] [DEBUG] - operation:  be44a5db-913a-45ca-acbd-7062603f92fd
[2018-05-29T05:46:37.942Z] [DEBUG] - state: failed
[2018-05-29T05:46:37.942Z] [DEBUG] - description: Error occurred during update. Please contact administrator if the issue persists.
[2018-05-29T05:46:37.942Z] [DEBUG] - job state has an error. Assuming that any error here is human readable. err - Pod [ apb-bf6d0246-57b7-48c6-b00b-7b1e51e9101b ] failed with exit code [2]
10.128.0.9 - - [29/May/2018:05:46:37 +0000] "GET /ansible-service-broker/v2/service_instances/95a70ff4-6302-11e8-a7d4-0a580a80000c/last_operation?operation=be44a5db-913a-45ca-acbd-7062603f92fd&plan_id=4acaf1511a92890cd8910b1d8473be97&service_id=d5915e05b253df421efe6e41fb6a66ba HTTP/1.1" 200 126
[2018-05-29T05:46:38.088Z] [DEBUG] - get service instance: 95a70ff4-6302-11e8-a7d4-0a580a80000c
[2018-05-29T05:46:38.09Z] [DEBUG] - get spec: d5915e05b253df421efe6e41fb6a66ba
[2018-05-29T05:46:38.096Z] [INFO] - All Jobs for instance: 95a70ff4-6302-11e8-a7d4-0a580a80000c in state:  in progress - 
[]apb.JobState{}
[2018-05-29T05:46:38.096Z] [DEBUG] - get spec: d5915e05b253df421efe6e41fb6a66ba
[2018-05-29T05:46:38.098Z] [DEBUG] - Update received the following Request.PlanID: [4acaf1511a92890cd8910b1d8473be97]
[2018-05-29T05:46:38.098Z] [DEBUG] - Plan transition NOT requested as part of update
[2018-05-29T05:46:38.098Z] [DEBUG] - Validating update parameters...
[2018-05-29T05:46:38.098Z] [DEBUG] - Request Params: map[]
[2018-05-29T05:46:38.099Z] [DEBUG] - Previous Params: map[_apb_service_class_id:d5915e05b253df421efe6e41fb6a66ba _apb_service_instance_id:95a70ff4-6302-11e8-a7d4-0a580a80000c postgresql_database:admin postgresql_password:test postgresql_user:admin postgresql_version:9.6 _apb_last_requesting_user:system:admin _apb_plan_id:prod]
[2018-05-29T05:46:38.099Z] [DEBUG] - Changed Params: map[]
[2018-05-29T05:46:38.099Z] [DEBUG] - Validated Params: map[]
[2018-05-29T05:46:38.099Z] [WARNING] - Returning without running the APB. No changes were actually requested
10.128.0.9 - - [29/May/2018:05:46:38 +0000] "PATCH /ansible-service-broker/v2/service_instances/95a70ff4-6302-11e8-a7d4-0a580a80000c?accepts_incomplete=true HTTP/1.1" 200 3
[2018-05-29T05:47:28.314Z] [DEBUG] - Unable to find originating origin header
[2018-05-29T05:47:28.314Z] [INFO] - AnsibleBroker::Catalog
[2018-05-29T05:47:28.314Z] [DEBUG] - Dao::BatchGetSpecs
[2018-05-29T05:47:28.319Z] [DEBUG] - Filtering secret parameters out of specs...
10.128.0.9 - - [29/May/2018:05:47:28 +0000] "GET /ansible-service-broker/v2/catalog HTTP/1.1" 200 33399

Comment 1 Jeff Peeler 2018-05-29 15:12:15 UTC
This looks more like an issue with the broker.

Comment 2 Jason Montleon 2018-05-29 19:01:20 UTC
"FATAL:  the database system is shutting down" and 
"role \"admin\" already exists" 
make me think that your backing up to the old database pod, or even perhaps the pv with data is getting reused, but then why would it be shutting down. I'm not sure how this happened.

Comment 3 Jason Montleon 2018-05-29 19:38:39 UTC
I went from 9.6 Prod -> 9.5 Prod -> 9.4 Prod -> 9.5 Prod -> 9.6 Prod -> 9.6 Dev -> 9.5 Dev -> 9.4 Dev -> 9.5 Dev -> 9.6 Dev with mediawiki bound and it remained usable all the way through.

Thus far I have not been able to reproduce this error.

Comment 4 Jason Montleon 2018-05-29 19:45:33 UTC
I edited the clusterserviceclass by hand and disabled planUpdatable. I was still able to modify the version, but not the plan and it worked fine. I restored the planUpdatable setting and was able to update the plan and version successfully.

Comment 5 Jason Montleon 2018-05-29 19:57:02 UTC
Can you please try to reproduce this again and, if you can, provide any missing steps for getting there?

Comment 6 Zhang Cheng 2018-05-30 03:16:34 UTC
Jason,

First, planUpdatable only prevents plan updates (such as dev/prod); it does not affect version updates. A normal plan/version update cannot recreate this problem, which is why I described the reproduction steps above.

1. Provision a PostgreSQL APB (with the dev plan) from the web console
2. Update clusterserviceclass of PostgreSQL (APB) and change PlanUpdatable=false
3. Try to update spec.clusterServicePlanExternalName of serviceinstance (from "dev" to "prod")
4. Update clusterserviceclass of PostgreSQL (APB) and change PlanUpdatable=true
5. Try to update spec.clusterServicePlanExternalName of serviceinstance (from "dev" to "prod")

I think the difference between our attempts is that I tried to update the plan name (dev/prod) while you were updating the version.

Comment 7 Zihan Tang 2018-05-30 08:13:20 UTC
It looks like there are two bugs here.
1. The update of the serviceinstance should succeed in step 5 of comment 6.
2. When the update fails on the ASB side, the condition of the serviceinstance is wrong: it should be failed, not succeeded. This scenario can be reproduced with the following steps:
   1. Provision mysql-apb 5.6 (v3.10.0-0.51.0.1 will fail the update due to a bug) 
   2. Update to 5.7 
   3. Check the serviceinstance
ASB log:
time="2018-05-30T07:35:59Z" level=error msg="Update APB error: Pod [ apb-34202692-4dd4-46cf-8b0c-61c041ac2303 ] failed with exit code [2]"
[2018-05-30T07:35:59.217Z] [ERROR] - broker::update error occurred. Pod [ apb-34202692-4dd4-46cf-8b0c-61c041ac2303 ] failed with exit code [2]

serviceinstance: 
# oc describe serviceinstance
Name:         rh-mysql-apb
Namespace:    w9y-q
Labels:       app=serviceinstance-template
Annotations:  openshift.io/generated-by=OpenShiftNewApp
API Version:  servicecatalog.k8s.io/v1beta1
Kind:         ServiceInstance
Metadata:
  Creation Timestamp:  2018-05-30T07:33:30Z
  Finalizers:
    kubernetes-incubator/service-catalog
  Generation:        2
  Resource Version:  43165
  Self Link:         /apis/servicecatalog.k8s.io/v1beta1/namespaces/w9y-q/serviceinstances/rh-msyql-apb
  UID:               c00e0c53-63db-11e8-a675-0a580a800005
Spec:
  Cluster Service Class External Name:  rh-mysql-apb
  Cluster Service Class Ref:
    Name:                              2c259ddd8059b9bc65081e07bf20058f
  Cluster Service Plan External Name:  dev
  Cluster Service Plan Ref:
    Name:       ba144caf21607ad5d50c06bdd0f72db3
  External ID:  c00e0bf4-63db-11e8-a675-0a580a800005
  Parameters From:
    Secret Key Ref:
      Key:          parameters
      Name:         rh-mysql-apb-parameters-new
  Update Requests:  1
  User Info:
    Extra:
      Scopes . Authorization . Openshift . Io:
        user:full
    Groups:
      system:authenticated:oauth
      system:authenticated
    UID:       
    Username:  zitang2
Status:
  Async Op In Progress:  false
  Conditions:
    Last Transition Time:  2018-05-30T07:36:10Z
    Message:               The instance was updated successfully
    Reason:                InstanceUpdatedSuccessfully
    Status:                True
    Type:                  Ready
  Deprovision Status:      Required
  External Properties:
    Cluster Service Plan External ID:    ba144caf21607ad5d50c06bdd0f72db3
    Cluster Service Plan External Name:  dev
    Parameter Checksum:                  1209c36edd3b77e0f897e839249a679e624dcc486be7733c3477bc7db26b723b
    Parameters:
      Mariadb _ Database:         <redacted>
      Mariadb _ Password:         <redacted>
      Mariadb _ Root _ Password:  <redacted>
      Mariadb _ User:             <redacted>
      Mariadb _ Version:          <redacted>
    User Info:
      Extra:
        Scopes . Authorization . Openshift . Io:
          user:full
      Groups:
        system:authenticated:oauth
        system:authenticated
      UID:                        
      Username:                   zitang2
  Observed Generation:            2
  Orphan Mitigation In Progress:  false
  Provision Status:               Provisioned
  Reconciled Generation:          2
Events:
  Type     Reason                       Age                 From                                Message
  ----     ------                       ----                ----                                -------
  Warning  ErrorWithParameters          14m (x10 over 14m)  service-catalog-controller-manager  failed to prepare parameters nil: secrets "rh-mysql-apb-parameters" not found
  Normal   Provisioning                 14m                 service-catalog-controller-manager  The instance is being provisioned asynchronously
  Normal   Provisioning                 14m (x2 over 14m)   service-catalog-controller-manager  The instance is being provisioned asynchronously (action started)
  Normal   Provisioning                 13m (x3 over 14m)   service-catalog-controller-manager  The instance is being provisioned asynchronously (0%: Starting)
  Normal   ProvisionedSuccessfully      13m                 service-catalog-controller-manager  The instance was provisioned successfully
  Normal   UpdatingInstance             12m                 service-catalog-controller-manager  The instance is being updated asynchronously
  Normal   UpdatingInstance             12m (x2 over 12m)   service-catalog-controller-manager  The instance is being updated asynchronously (action started)
  Normal   UpdatingInstance             12m (x2 over 12m)   service-catalog-controller-manager  The instance is being updated asynchronously (0%: Starting)
  Normal   UpdatingInstance             12m                 service-catalog-controller-manager  The instance is being updated asynchronously (50%: Objects updated state=present)
  Warning  UpdateInstanceCallFailed     11m                 service-catalog-controller-manager  Update call failed: Error occurred during update. Please contact administrator if the issue persists.
  Normal   InstanceUpdatedSuccessfully  11m                 service-catalog-controller-manager  The instance was updated successfully

Comment 8 Jason Montleon 2018-05-30 12:40:38 UTC
The update to 5.7 should be fixed (hopefully in an RPM/container build today). As mentioned in the associated bug, that appears to be caused by a change in mysql_upgrade behavior in 5.7.21.

I agree the logs below look incorrect. I am not sure if this is a catalog or broker bug though. We'll have to look into that. It looks like we reported the error correctly.

"service-catalog-controller-manager  Update call failed: Error occurred during
update. Please contact administrator if the issue persists.
  Normal   InstanceUpdatedSuccessfully  11m                
service-catalog-controller-manager  The instance was updated successfully"

Comment 9 Jason Montleon 2018-05-30 12:47:02 UTC
After discussing with Shawn Hurley we're pretty confident that we're returning the correct status code when the APB fails to update.

Comment 10 Marko Luksa 2018-05-31 11:09:37 UTC
@Jason, how does the broker respond if it receives another last_operation request immediately after it responds with an error?

Comment 11 Jay Boyd 2018-06-01 19:59:57 UTC
I reproduced it. Immediately after the Catalog controller polls and gets back the status of the failed update, it makes another update request to the Broker. The second update request is identical to the previous (now failed) update, saying to change to version 5.7. The Broker sees no change and returns a 200 success.

Two issues here:

1) catalog shouldn't be making an additional update request.

2) The Broker says this update is identical to the last update, which has completed, hence the 200 status code. Catalog says great, READY=TRUE. One of them is wrong.
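The broker-side decision described in this comment can be sketched as follows. This is a simplified, hypothetical illustration, not the actual ansible-service-broker code; the function and parameter names are invented.

```python
# Simplified illustration of the dedup check described above -- NOT the actual
# ansible-service-broker code. The broker diffs the requested update against
# the parameters recorded from the last update, with no memory of whether that
# update actually succeeded inside the APB sandbox.
def handle_update(requested_params, previous_params, last_update_failed):
    # Compute which parameters actually differ from the recorded state.
    changed = {k: v for k, v in requested_params.items()
               if previous_params.get(k) != v}
    if not changed:
        # Bug: the retry of a failed update is byte-for-byte identical, so the
        # broker answers 200 and the catalog flips the instance to Ready=True,
        # even though last_update_failed is True.
        return 200, "no changes were actually requested"
    return 202, "update accepted"

# Retry of the failed 5.6 -> 5.7 update: the requested params match the stored
# state, so the broker reports success although the APB never ran again.
status, msg = handle_update({"mysql_version": "5.7"},
                            {"mysql_version": "5.7"},
                            last_update_failed=True)
```

The sketch shows why consulting `last_update_failed` (i.e., the outcome of the previous job) before short-circuiting would avoid the spurious success.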

Comment 12 Jay Boyd 2018-06-01 20:02:24 UTC
I have an upstream PR for a similar issue which should get into 3.10. This PR seems to work around the double-update issue here: https://github.com/kubernetes-incubator/service-catalog/pull/2065 for https://bugzilla.redhat.com/show_bug.cgi?id=1577076

Comment 13 Jay Boyd 2018-06-05 19:13:23 UTC
In comment 11 I was wrong about #1: the catalog now retries non-terminal errors (relatively new behavior in Service Catalog, around version v0.1.12). On the update retry, the broker looks at the update and realizes it is the same as what the current state is supposed to be, so it returns a 200 success.

There is an outstanding issue in Catalog where the retries are not properly backed off; https://github.com/kubernetes-incubator/service-catalog/pull/2065 is fixing that. However, the underlying problem here still persists when you make an additional update request after the failed update. I don't believe the broker should say the second update is successful.

I discussed with Shawn and Jason in IRC and I believe they are pretty familiar with the issue.  I'd be happy to discuss if needed.

Comment 14 John Matthews 2018-07-12 18:29:59 UTC
Aligning to 3.11

Comment 15 Jason Montleon 2018-08-07 15:53:29 UTC
PR:
https://github.com/openshift/ansible-service-broker/pull/1032

Please be aware that the continued error responses are going to trigger the same bug we see when provision and deprovision fail, where requests endlessly loop and spawn new bundle containers. It's the same bug with the same root cause (and hopefully the same eventual fix), so it won't need a new BZ.

Comment 16 David Zager 2018-08-10 20:45:56 UTC
Errata https://errata.devel.redhat.com/brew/save_builds/33505 updated:

openshift-enterprise-mariadb-apb-v3.11.0-0.13.0.1
openshift-enterprise-mediawiki-apb-v3.11.0-0.13.0.1
openshift-enterprise-postgresql-apb-v3.11.0-0.13.0.1
openshift-enterprise-mysql-apb-v3.11.0-0.13.0.1
openshift-enterprise-asb-container-v3.11.0-0.13.0.1

Comment 17 Zhang Cheng 2018-08-14 07:20:07 UTC
The original problem is Verified and Passed with:

service-catalog: v3.11.0-0.14.0;Upstream:v0.1.27
asbd: 1.3.8
APB images: v3.11.0-0.14.0.0

Comment 19 errata-xmlrpc 2018-10-11 07:20:02 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2018:2652