Bug 1575943 - Duplicate ClusterServiceClass in Catalog causes template deployment failure
Summary: Duplicate ClusterServiceClass in Catalog causes template deployment failure
Status: CLOSED DUPLICATE of bug 1548122
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Service Catalog
Version: 3.7.1
Hardware: All
OS: All
Target Milestone: ---
Assignee: Jay Boyd
QA Contact: Zhang Cheng
Depends On:
Reported: 2018-05-08 11:07 UTC by Andre Costa
Modified: 2021-09-09 13:58 UTC (History)

Fixed In Version:
Doc Type: No Doc Update
Doc Text:
Clone Of:
Last Closed: 2018-05-09 18:33:14 UTC
Target Upstream Version:

Attachments (Terms of Use)
controller-manager_pod_logs (105.05 KB, text/plain)
2018-05-08 11:07 UTC, Andre Costa

Description Andre Costa 2018-05-08 11:07:03 UTC
Created attachment 1433143

The customer created a project to hold their own custom templates, separate from the standard OpenShift template projects, and reconfigured the template service broker ConfigMap (cm) to fetch templates from it.

Custom templates failed to deploy with error:
The service is not yet ready. The instance references a ClusterServiceClass that does not exist. References a non-existent ClusterServiceClass (ExternalName: "orange-grafana-dbms") or there is more than one (found: 2)

Looking at the clusterserviceclass objects, several share the same external name but have different .metadata.name values:

cloud@ansible-dev: ~ > oc get clusterserviceclass -o=custom-columns="id:.metadata.name,external name:spec.externalName" | grep orange
1d73b621-30d6-11e8-9531-029dea4665d2   orange-prometheus-dbms
2c74fd94-2db6-11e8-9531-029dea4665d2   orange-cassandra
325b866f-30d6-11e8-9531-029dea4665d2   orange-grafana-dbms
51ad1c2a-496c-11e8-9ed0-029dea4665d2   orange-grafana-dbms
54e54479-4969-11e8-9ed0-029dea4665d2   orange-prometheus-dbms
7e58a7ba-496f-11e8-9ed0-029dea4665d2   orange-mariadb-galera
835aea69-496f-11e8-9ed0-029dea4665d2   orange-mariadb-standalone
90bef2e4-496f-11e8-9ed0-029dea4665d2   orange-cassandra
fe38dd15-49e2-11e8-9ed0-029dea4665d2   orange-elasticsearch-persistent
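
The listing above can be checked for repeated external names mechanically. The following is a minimal sketch, assuming the two-column text produced by the custom-columns command has been captured into a string; the function name and the sample constant are hypothetical, and the sample holds only a subset of the rows shown above.

```python
from collections import defaultdict
from typing import Dict, List

# Subset of the listing above, including the header row printed by custom-columns.
SAMPLE_LISTING = """\
id                                     external name
2c74fd94-2db6-11e8-9531-029dea4665d2   orange-cassandra
325b866f-30d6-11e8-9531-029dea4665d2   orange-grafana-dbms
51ad1c2a-496c-11e8-9ed0-029dea4665d2   orange-grafana-dbms
7e58a7ba-496f-11e8-9ed0-029dea4665d2   orange-mariadb-galera
"""


def find_duplicate_external_names(listing: str) -> Dict[str, List[str]]:
    """Group class IDs by external name; keep names that appear more than once."""
    ids_by_name: Dict[str, List[str]] = defaultdict(list)
    for line in listing.splitlines():
        parts = line.split()
        if len(parts) != 2:
            # Skips blank lines and the three-word header ("id external name").
            continue
        class_id, external_name = parts
        ids_by_name[external_name].append(class_id)
    return {name: ids for name, ids in ids_by_name.items() if len(ids) > 1}


if __name__ == "__main__":
    for name, ids in find_duplicate_external_names(SAMPLE_LISTING).items():
        print(name, ids)
```

Any external name reported here maps to more than one ClusterServiceClass, which is the condition the "found: 2" error message complains about.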

Taking orange-grafana-dbms as an example, oc describe on the two IDs shows:

cloud@ansible-dev: ~ > oc describe clusterserviceclass 325b866f-30d6-11e8-9531-029dea4665d2
Name:           325b866f-30d6-11e8-9531-029dea4665d2
  Creation Timestamp:   2018-03-26T09:20:09Z
  External Metadata:
    Console . Openshift . Io / Icon Class:      fa fa-cogs
    Display Name:                               orange-grafana-dbms
    Documentation URL:                          https://gitlab.forge.orange-labs.fr/kubernetes/deployment/dbms-ops/orange-grafana-dbms-oc-3.7
  External Name:                                orange-grafana-dbms
  Plan Updatable:                               false
  Removed From Broker Catalog:  true
cloud@ansible-dev: ~ > oc describe clusterserviceclass 51ad1c2a-496c-11e8-9ed0-029dea4665d2
Name:           51ad1c2a-496c-11e8-9ed0-029dea4665d2
  Creation Timestamp:   2018-04-26T16:26:10Z
  External Metadata:
    Console . Openshift . Io / Icon Class:      fa fa-cogs
    Display Name:                               Orange Grafana Dbms
    Documentation URL:                          https://gitlab.forge.orange-labs.fr/openshift/dbms-ops/orange-grafana-dbms-openshift
    Provider Display Name:                      Orange, Inc. (DBEI Team)
    Support URL:                                http://dbei.si.fr.intraorange/
  External Name:                                orange-grafana-dbms
  Plan Updatable:                               false
  Removed From Broker Catalog:  false

It looks as though, whenever templates and image streams (is) are deleted and recreated, the previous clusterserviceclass changes its status to Removed From Broker Catalog: true but is not deleted from etcd. The problem is only resolved by deleting the clusterserviceclass manually. Is this a bug, or do we always need to delete the clusterserviceclass manually before updating the templates?

The logs from the controller pod show many pending reconciliation processes for ServiceBinding and ServiceInstance; I know some bugs have been reported about this for OCP 3.7. Could this also cause the clusterserviceclass to be stuck in 'removedFromBrokerCatalog: true'?
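
To decide which objects are candidates for manual cleanup, the stale records can be separated from the active ones. This is a rough sketch, assuming the input is the `items` list from `oc get clusterserviceclass -o json` and that the flag lives at `status.removedFromBrokerCatalog`; that field path is an assumption, and the function name is hypothetical.

```python
from typing import Any, Dict, List


def is_removed(item: Dict[str, Any]) -> bool:
    """True if the class is flagged as removed from the broker catalog."""
    # Assumed field path: status.removedFromBrokerCatalog (missing means False).
    return bool(item.get("status", {}).get("removedFromBrokerCatalog", False))


def stale_duplicates(items: List[Dict[str, Any]]) -> List[str]:
    """Return IDs of removed classes whose external name an active class also uses."""
    active_names = {
        item["spec"]["externalName"] for item in items if not is_removed(item)
    }
    return [
        item["metadata"]["name"]
        for item in items
        if is_removed(item) and item["spec"]["externalName"] in active_names
    ]


if __name__ == "__main__":
    # Two records mirroring the orange-grafana-dbms pair described above.
    items = [
        {
            "metadata": {"name": "325b866f-30d6-11e8-9531-029dea4665d2"},
            "spec": {"externalName": "orange-grafana-dbms"},
            "status": {"removedFromBrokerCatalog": True},
        },
        {
            "metadata": {"name": "51ad1c2a-496c-11e8-9ed0-029dea4665d2"},
            "spec": {"externalName": "orange-grafana-dbms"},
            "status": {"removedFromBrokerCatalog": False},
        },
    ]
    print(stale_duplicates(items))
```

The sketch only reports IDs; whether a reported class can safely be deleted still depends on whether any service instances reference it.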

Comment 1 Jay Boyd 2018-05-09 18:33:14 UTC
Duplicate of https://bugzilla.redhat.com/show_bug.cgi?id=1548122

The description below applies to service plans as well as service classes:

When the broker stops listing a class in its GetCatalog response, Service Catalog detects this. If there are no instances of the class, it deletes the class; if there are instances, it sets the class's status removedFromBrokerCatalog=true. If the broker re-adds the same class and/or plan, Service Catalog should reset removedFromBrokerCatalog to false.

This was not working properly until 3.10: in 3.7 and 3.9 the deletion is detected but the status is never reset when the broker starts listing the class/plan again.
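
The intended behavior and the 3.7/3.9 defect can be illustrated with a simplified model. This is only a sketch of the rules described in this comment, not the Service Catalog implementation; the `ServiceClass` type, the `reconcile` function, and the `reset_on_relist` switch are all hypothetical names introduced for illustration.

```python
from dataclasses import dataclass, replace
from typing import Dict, Set


@dataclass(frozen=True)
class ServiceClass:
    external_name: str
    instance_count: int = 0
    removed_from_broker_catalog: bool = False


def reconcile(
    stored: Dict[str, ServiceClass],
    listed_ids: Set[str],
    reset_on_relist: bool = True,
) -> Dict[str, ServiceClass]:
    """Apply one catalog relist to already-stored classes.

    Creation of newly listed classes is out of scope for this sketch.
    reset_on_relist=False mimics the 3.7/3.9 behavior described above.
    """
    result: Dict[str, ServiceClass] = {}
    for class_id, cls in stored.items():
        if class_id in listed_ids:
            # Broker lists the class (again): the flag should be cleared.
            if reset_on_relist:
                cls = replace(cls, removed_from_broker_catalog=False)
            result[class_id] = cls
        elif cls.instance_count > 0:
            # No longer listed but still in use: keep it and flag it.
            result[class_id] = replace(cls, removed_from_broker_catalog=True)
        # No longer listed and unused: dropped, i.e. deleted.
    return result


if __name__ == "__main__":
    stored = {
        "in-use": ServiceClass("orange-grafana-dbms", instance_count=1),
        "unused": ServiceClass("orange-cassandra"),
    }
    after_removal = reconcile(stored, listed_ids=set())
    print(after_removal)
    print(reconcile(after_removal, {"in-use"}))
    print(reconcile(after_removal, {"in-use"}, reset_on_relist=False))
```

With `reset_on_relist=False` the flag stays true after the broker relists the class, which is the stuck state that the workaround addresses.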

As a workaround, you can edit the service class/service plan and reset the removedFromBrokerCatalog status after the broker relists the class/plan. For example:

$ oc edit clusterserviceclass xxxx

and change the Status/removedFromBrokerCatalog to false.

*** This bug has been marked as a duplicate of bug 1548122 ***

Comment 2 Jay Boyd 2018-05-11 13:42:30 UTC
Correction to comment #1: the fix was also delivered to OSE v3.9.21-1 by https://github.com/openshift/ose/pull/1200
