Bug 1548122

Summary: Service Catalog does not refresh ServiceClasses after removing from catalog
Product: OpenShift Container Platform Reporter: Jesus M. Rodriguez <jesusr>
Component: Service CatalogAssignee: Jay Boyd <jaboyd>
Status: CLOSED CURRENTRELEASE QA Contact: Zihan Tang <zitang>
Severity: high Docs Contact:
Priority: urgent    
Version: 3.7.0CC: andcosta, aos-bugs, chezhang, cshereme, dapark, jaboyd, jmatthew, mrobson, pmorie, rbost, ssadhale, travi, wkulhane
Target Milestone: ---   
Target Release: 3.10.0   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
If a Service Broker removes Service Classes or Service Plans, Service Catalog will mark the classes/plans as RemovedFromBrokerCatalog and they will not be able to be used with new service instances or plans. If the Service Broker re-adds the Classes or Plans, Service Catalog should reset the RemovedFromBrokerCatalog flag. This bug prevented that reset from happening leaving the classes and plans unusable. This mostly impacts developers that are creating new Service Broker offerings.
Story Points: ---
Clone Of: Environment:
Last Closed: 2018-12-20 21:42:17 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 1541524    
Attachments:
Description Flags
various output files
none
recovered /v2/catalog response
none
controller manager logs none

Description Jesus M. Rodriguez 2018-02-22 19:05:21 UTC
Description of problem:

If the Service Catalog calls the /v2/catalog endpoint and a provisioned ServiceInstance's ServiceClass is removed from the Broker's catalog response, the Service Catalog will mark it as:

removedFromBrokerCatalog: true

When the Service Broker recovers and the service class returns, the next relist by the Service Catalog should recover the missing Service Class and update the flag to false:

removedFromBrokerCatalog: false

Version-Release number of selected component (if applicable):


How reproducible:
I had to modify the Automation Broker to randomly fail so that it would remove some Service specs from the /v2/catalog response.

Steps to Reproduce:
1. Have a test broker return say 4 service class specs from the /v2/catalog endpoint
2. Verify that the Service Catalog has 4 ClusterServiceClasses
3. Provision 1 of the Service Classes
4. Have the test broker return 3 service class specs from the /v2/catalog, the one that is missing MUST BE the one provisioned in step 3.
5. Have the Service Catalog relist with the test broker
6. Verify that the Provisioned Service Class reference is marked with:

   removedFromBrokerCatalog: true

7. Now, have the test broker return all 4 service class specs from the /v2/catalog again.
8. Have the Service Catalog relist with the test broker
9. Verify that the flag for the provisioned Service Class ref again

Actual results:

10. The removedFromBrokerCatalog: flag remains true forever

Expected results:

10. The removedFromBrokerCatalog: flag should flip back to false once the service class spec returns to the /v2/catalog payload.

Additional info:

Comment 1 Jesus M. Rodriguez 2018-02-22 19:07:08 UTC
Also happens in the latest 3.9.0 alpha as well as the GA 3.7 releases.

Comment 2 Jesus M. Rodriguez 2018-02-22 19:15:01 UTC
Created attachment 1399506 [details]
various output files

The tarball contains the following files:

* catalog-response-with-2-specs-removed.log

This is the /v2/catalog json response that removes 2 service specs

* initial-catalog-response-with-all-specs-present.log

This is the /v2/catalog json response that contains all of the service specs when the broker started.

* provisioned-instances.log

This is the result of oc get serviceinstances --all-namespaces -o yaml

* serviceclasses-with-provisioned-marked-as-removed.log

This file contains the output of oc get clusterserviceclass -o yaml

Comment 3 Jesus M. Rodriguez 2018-02-22 19:16:15 UTC
Created attachment 1399507 [details]
recovered /v2/catalog response

This file is the output of /v2/catalog endpoint after the broker recovers and restores the missing service class specs.

Comment 4 Jesus M. Rodriguez 2018-02-22 19:22:25 UTC
Upstream issue:
https://github.com/kubernetes-incubator/service-catalog/issues/1757

Comment 6 Paul Morie 2018-02-26 19:23:48 UTC
Jay-

Can you take a look at this?

Comment 7 Jesus M. Rodriguez 2018-02-26 20:55:10 UTC
Created attachment 1401027 [details]
controller manager logs

Recreated the problem. The catalog response contained 29 APBs. Then I provisioned 2 of them. I then had the catalog return 27 APBs. Did the relist on the service catalog and the 2 serviceclasses that were provisioned were correctly marked as removedFromBrokerCatalog.

I then had the catalog response from the broker again return 29 APBs. Did the relist on the service catalog twice. They flags never get reset.

Comment 8 Jay Boyd 2018-02-28 15:58:10 UTC
Fixed upstream by https://github.com/kubernetes-incubator/service-catalog/pull/1770   Requires us to revendor Service Catalog into OpenShift

Comment 12 Jay Boyd 2018-03-07 14:29:13 UTC
https://github.com/openshift/origin/pull/18788 has been merged into 3.9

Comment 14 Zihan Tang 2018-03-09 05:44:07 UTC
verified.
service-catalog version: 0.1.9

step:
using 'apb remove' to delete clusterserviceclass, after the asb refresh interval, the apb will come back
1.set broker-config:
   broker:
      dev_broker: true
      refresh_interval: 600s
2.Decrease relistDuration and make sure rsync setting in controller-manager.   #oc edit clusterservicebroker ansible-service-broker
  relistBehavior: Duration
  relistDuration: 5m0s
# oc edit daemonset controller-manager -n kube-service-catalog
   spec:
      containers:
      - args:
        - 5m
3. provision a postgresql apb
4. delete postgresql class 
    [root@dhcp-140-42 zitang]# apb list --broker asb-1338-openshift-ansible-service-broker.apps.0307-cbv.qe.rhcloud.comContacting the ansible-service-broker at: https://asb-1338-openshift-ansible-service-broker.apps.0307-cbv.qe.rhcloud.com/ansible-service-broker/v2/catalog
ID                                NAME               DESCRIPTION                        
2c259ddd8059b9bc65081e07bf20058f  rh-mariadb-apb     Mariadb apb implementation         
03b69500305d9859bb9440d9f9023784  rh-mediawiki-apb   Mediawiki apb implementation       
73ead67495322cc462794387fa9884f5  rh-mysql-apb       Software Collections MySQL APB     
d5915e05b253df421efe6e41fb6a66ba  rh-postgresql-apb  SCL PostgreSQL apb implementation 
[root@dhcp-140-42 zitang]# apb remove --id d5915e05b253df421efe6e41fb6a66ba  --broker asb-1338-openshift-ansible-service-broker.apps.0307-cbv.qe.rhcloud.comContacting the ansible-service-broker at: https://asb-1338-openshift-ansible-service-broker.apps.0307-cbv.qe.rhcloud.com/ansible-service-broker/v2/apb/d5915e05b253df421efe6e41fb6a66ba
Successfully relisted the Service Catalog
Successfully deleted APB

the /v2/catalog will not return 'postgresql-apb'
5. check clusterserviceclass
[root@host-172-16-120-80 ~]# oc get clusterserviceclass -o=custom-columns=NAME:.metadata.name,EXTERNAL\ NAME:.spec.externalName,REMOVED:.status.removedFromBrokerCatalog 
NAME                               EXTERNAL NAME       REMOVED
03b69500305d9859bb9440d9f9023784   rh-mediawiki-apb    false
2c259ddd8059b9bc65081e07bf20058f   rh-mariadb-apb      false
73ead67495322cc462794387fa9884f5   rh-mysql-apb        false
d5915e05b253df421efe6e41fb6a66ba   rh-postgresql-apb   true

6. wait about 10min to wait asb refresh and catalog getting new apbs, then check the clusterserviceclass
[root@host-172-16-120-80 ~]# oc get clusterserviceclass -o=custom-columns=NAME:.metadata.name,EXTERNAL\ NAME:.spec.externalName,REMOVED:.status.removedFromBrokerCatalog 
NAME                               EXTERNAL NAME       REMOVED
03b69500305d9859bb9440d9f9023784   rh-mediawiki-apb    false
2c259ddd8059b9bc65081e07bf20058f   rh-mariadb-apb      false
73ead67495322cc462794387fa9884f5   rh-mysql-apb        false
d5915e05b253df421efe6e41fb6a66ba   rh-postgresql-apb   false

Comment 15 Zihan Tang 2018-03-09 06:31:57 UTC
In comment 14 step6 , even the clusterserviceclass come back, but the clusterserviceplan 'removedFromBrokerCatalog' is still 'true', so change it to 'ASSIGNED'

In step 6:
[root@host-172-16-120-80 ~]# oc get clusterserviceclass -o=custom-columns=NAME:.metadata.name,EXTERNAL\ NAME:.spec.externalName,REMOVED:.status.removedFromBrokerCatalog
NAME                               EXTERNAL NAME       REMOVED
03b69500305d9859bb9440d9f9023784   rh-mediawiki-apb    false
2c259ddd8059b9bc65081e07bf20058f   rh-mariadb-apb      false
73ead67495322cc462794387fa9884f5   rh-mysql-apb        false
d5915e05b253df421efe6e41fb6a66ba   rh-postgresql-apb   false
[root@host-172-16-120-80 ~]# oc get clusterserviceplan -o=custom-columns=NAME:.metadata.name,EXTERNAL\ NAME:.spec.externalName,REMOVED:.status.removedFromBrokerCatalog
NAME                               EXTERNAL NAME   REMOVED
43d3e23d214c26dbebc0879e44425db4   default         false
465063705516630c3d3e2de7798b7041   prod            false
4acaf1511a92890cd8910b1d8473be97   prod            true
93a78b607a2e18776587f0a35ebe92a6   prod            false
9783fc2e859f9179833a7dd003baa841   dev             true
ac68a02cfc1148713f61edb99213a17d   dev             false
ba144caf21607ad5d50c06bdd0f72db3   dev             false

Comment 16 Jay Boyd 2018-03-09 15:16:44 UTC
excellent catch.  I am fixing this upstream via https://github.com/kubernetes-incubator/service-catalog/pull/1824  and also updated the unit test.

Comment 17 Jay Boyd 2018-03-09 19:22:26 UTC
we will not be patching 3.9, but there is a work around for any plans that are erroneously marked as RemovedFromBroker.  You can edit the plan (oc edit clusterserviceplan xxxx) and change the RemovedFromBroker attribute.

Comment 19 Jay Boyd 2018-04-04 13:25:30 UTC
There were two issues reported here:

1) original issue, RemovedFromBroker not reset on Service Classes.  That was fixed in 3.9.0

2) On verifying the fix for #1, QE found that RemovedFromBroker was not reset on Plans.  That was fixed upstream and then revendored by https://github.com/openshift/origin/pull/19117 and landed in 3.9.z

Comment 20 Jay Boyd 2018-04-05 14:25:45 UTC
Please note:  I listed wrong information in comment #19 for item 2.  The fix has not yet landed in 3.9.z but was delivered to master with https://github.com/openshift/origin/pull/19105.

There is an effort planned to revendor a new version of Catalog to 3.9.z but we have not yet closed on the details for it.

Comment 21 Zihan Tang 2018-04-13 05:35:26 UTC
image is ready, changed to ON_QA

Comment 22 Zihan Tang 2018-04-13 05:37:02 UTC
verified 
step is following #comment14

Service-catalog version:
v3.10.0-0.21.0;Upstream:v0.1.13

Comment 23 Jay Boyd 2018-05-09 18:33:14 UTC
*** Bug 1575943 has been marked as a duplicate of this bug. ***

Comment 26 Jay Boyd 2019-03-25 18:43:17 UTC
@Saurabh let's go with a new BZ please & include as many details as you have.