Note: This bug is displayed in read-only format because the product is no longer active in Red Hat Bugzilla.

Bug 1583518

Summary: Incorrect event when update ServiceInstance plan from "valid" -> "invalid" -> "valid" (Event Rate Limter)
Product: OpenShift Container Platform Reporter: Zhang Cheng <chezhang>
Component: Service CatalogAssignee: Jay Boyd <jaboyd>
Status: CLOSED WONTFIX QA Contact: Zhang Cheng <chezhang>
Severity: medium Docs Contact:
Priority: medium    
Version: 3.10.0CC: aos-bugs, jaboyd, jiazha, jmatthew, zhsun, zitang
Target Milestone: ---   
Target Release: 4.2.0   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2019-05-01 14:56:43 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Attachments:
Description Flags
full-spec with failed message
none
full-spec with successful message none

Description Zhang Cheng 2018-05-29 07:26:25 UTC
Description of problem: 
Incorrect event when update ServiceInstance plan from "valid" -> "invalid" -> "valid"


service-catalog & asb image using images:
# service-catalog --version
v3.10.0-0.53.0;Upstream:v0.1.19
# asbd --version
1.2.14


How reproducible:
Always


Steps to Reproduce:
1. Provision a PostgreSql apb(with dev plan) from web console
2. Update spec.clusterServicePlanExternalName of serviceinstance (from "dev" to "test")
3. Update spec.clusterServicePlanExternalName of serviceinstance (from "test" to "prod")
4. Check the status and event in serviceinstance description.


Actual results:  
4. Serviceinstance update successfully, but none correct event(should get event "The instance was updated successfully" at the end.)
# oc describe serviceinstance
Name:         rh-postgresql-apb-hg5qw
Namespace:    test2
Labels:       <none>
Annotations:  <none>
API Version:  servicecatalog.k8s.io/v1beta1
Kind:         ServiceInstance
Metadata:
  Creation Timestamp:  2018-05-29T05:14:16Z
  Finalizers:
    kubernetes-incubator/service-catalog
  Generate Name:     rh-postgresql-apb-
  Generation:        3
  Resource Version:  28769
  Self Link:         /apis/servicecatalog.k8s.io/v1beta1/namespaces/test2/serviceinstances/rh-postgresql-apb-hg5qw
  UID:               226768ca-62ff-11e8-a7d4-0a580a80000c
Spec:
  Cluster Service Class External Name:  rh-postgresql-apb
  Cluster Service Class Ref:
    Name:                              d5915e05b253df421efe6e41fb6a66ba
  Cluster Service Plan External Name:  prod
  Cluster Service Plan Ref:
    Name:       4acaf1511a92890cd8910b1d8473be97
  External ID:  22676868-62ff-11e8-a7d4-0a580a80000c
  Parameters From:
    Secret Key Ref:
      Key:          parameters
      Name:         rh-postgresql-apb-parametersd11f3
  Update Requests:  0
  User Info:
    Groups:
      system:masters
      system:cluster-admins
      system:authenticated
    UID:      
    Username:  system:admin
Status:
  Async Op In Progress:  false
  Conditions:
    Last Transition Time:  2018-05-29T05:22:08Z
    Message:               The instance was updated successfully
    Reason:                InstanceUpdatedSuccessfully
    Status:                True
    Type:                  Ready
  Deprovision Status:      Required
  External Properties:
    Cluster Service Plan External ID:    4acaf1511a92890cd8910b1d8473be97
    Cluster Service Plan External Name:  prod
    Parameter Checksum:                  dbac93dc4526449a7c2650fa3916e9370052e331128cba9d8d3c0567393fa453
    Parameters:
      Postgresql _ Database:  <redacted>
      Postgresql _ Password:  <redacted>
      Postgresql _ User:      <redacted>
      Postgresql _ Version:   <redacted>
    User Info:
      Groups:
        system:masters
        system:cluster-admins
        system:authenticated
      UID:                        
      Username:                   system:admin
  Observed Generation:            3
  Orphan Mitigation In Progress:  false
  Provision Status:               Provisioned
  Reconciled Generation:          3
Events:
  Type     Reason                            Age               From                                Message
  ----     ------                            ----              ----                                -------
  Normal   Provisioning                      9m                service-catalog-controller-manager  The instance is being provisioned asynchronously
  Normal   Provisioning                      9m (x2 over 9m)   service-catalog-controller-manager  The instance is being provisioned asynchronously (action started)
  Normal   Provisioning                      9m                service-catalog-controller-manager  The instance is being provisioned asynchronously (0%: Starting)
  Normal   ProvisionedSuccessfully           8m                service-catalog-controller-manager  The instance was provisioned successfully
  Warning  ReferencesNonexistentServicePlan  5m (x15 over 6m)  service-catalog-controller-manager  References a non-existent ClusterServicePlan {PlanExternalName:"testdev"} on ClusterServiceClass d5915e05b253df421efe6e41fb6a66ba {ClassExternalName:"rh-postgresql-apb"} or there is more than one (found: 0)
  Normal   UpdatingInstance                  4m                service-catalog-controller-manager  The instance is being updated asynchronously
  Normal   UpdatingInstance                  4m (x2 over 4m)   service-catalog-controller-manager  The instance is being updated asynchronously (action started)
  Normal   UpdatingInstance                  4m (x2 over 4m)   service-catalog-controller-manager  The instance is being updated asynchronously (0%: Starting)
  Normal   UpdatingInstance                  4m                service-catalog-controller-manager  The instance is being updated asynchronously (25%: Backup complete)


Expected results: 
4. Should get event "The instance was updated successfully" at the end of description.


Addition info: 
None.

Comment 1 John Matthews 2018-05-29 15:43:47 UTC
Aligning this to 3.11 as we are late in 3.10 cycle and don't intend public access to modifying the CustomResources in 3.10, they will be locked down and not available to a regular user.

Comment 2 Zihan Tang 2018-05-30 07:21:16 UTC
The event and the conditions sometimes not match even a success update.

# oc describe serviceinstance
Name:         rh-mysql-apb
Namespace:    j1jy0
Labels:       app=serviceinstance-template
Annotations:  openshift.io/generated-by=OpenShiftNewApp
API Version:  servicecatalog.k8s.io/v1beta1
Kind:         ServiceInstance
Metadata:
  Creation Timestamp:  2018-05-30T07:12:43Z
  Finalizers:
    kubernetes-incubator/service-catalog
  Generation:        2
  Resource Version:  40240
  Self Link:         /apis/servicecatalog.k8s.io/v1beta1/namespaces/j1jy0/serviceinstances/rh-mysql-apb
  UID:               d8f750c0-63d8-11e8-a675-0a580a800005
Spec:
  Cluster Service Class External Name:  rh-mysql-apb
  Cluster Service Class Ref:
    Name:                              73ead67495322cc462794387fa9884f5
  Cluster Service Plan External Name:  prod
  Cluster Service Plan Ref:
    Name:       465063705516630c3d3e2de7798b7041
  External ID:  d8f75009-63d8-11e8-a675-0a580a800005
  Parameters From:
    Secret Key Ref:
      Key:          parameters
      Name:         rh-mysql-apb-parameters-new
  Update Requests:  1
  User Info:
    Extra:
      Scopes . Authorization . Openshift . Io:
        user:full
    Groups:
      system:authenticated:oauth
      system:authenticated
    UID:       
    Username:  zitang2
Status:
  Async Op In Progress:  false
  Conditions:
    Last Transition Time:  2018-05-30T07:17:30Z
    Message:               The instance was updated successfully
    Reason:                InstanceUpdatedSuccessfully
    Status:                True
    Type:                  Ready
  Deprovision Status:      Required
  External Properties:
    Cluster Service Plan External ID:    465063705516630c3d3e2de7798b7041
    Cluster Service Plan External Name:  prod
    Parameter Checksum:                  f1ebdb8e3b192571a5c73a26ebff73ba4bc007efdc093ca98ad2b5132b263654
    Parameters:
      Mysql _ Database:  <redacted>
      Mysql _ Password:  <redacted>
      Mysql _ User:      <redacted>
      Mysql _ Version:   <redacted>
    User Info:
      Extra:
        Scopes . Authorization . Openshift . Io:
          user:full
      Groups:
        system:authenticated:oauth
        system:authenticated
      UID:                        
      Username:                   zitang2
  Observed Generation:            2
  Orphan Mitigation In Progress:  false
  Provision Status:               Provisioned
  Reconciled Generation:          2
Events:
  Type     Reason                   Age               From                                Message
  ----     ------                   ----              ----                                -------
  Warning  ErrorWithParameters      4m (x10 over 5m)  service-catalog-controller-manager  failed to prepare parameters nil: secrets "rh-mysql-apb-parameters" not found
  Normal   Provisioning             4m                service-catalog-controller-manager  The instance is being provisioned asynchronously
  Normal   Provisioning             3m (x6 over 4m)   service-catalog-controller-manager  The instance is being provisioned asynchronously (action started)
  Normal   ProvisionedSuccessfully  2m                service-catalog-controller-manager  The instance was provisioned successfully
  Normal   UpdatingInstance         2m                service-catalog-controller-manager  The instance is being updated asynchronously
  Normal   UpdatingInstance         2m (x2 over 2m)   service-catalog-controller-manager  The instance is being updated asynchronously (action started)
  Normal   UpdatingInstance         2m (x2 over 2m)   service-catalog-controller-manager  The instance is being updated asynchronously (0%: Starting)
  Normal   UpdatingInstance         1m                service-catalog-controller-manager  The instance is being updated asynchronously (25%: Backup complete)
  Normal   UpdatingInstance         1m                service-catalog-controller-manager  The instance is being updated asynchronously (50%: Objects updated state=present)

Comment 3 Jesus M. Rodriguez 2018-08-13 16:54:04 UTC
Changing from dev to test plan resulted in the expected error message in the status:

  status:
    asyncOpInProgress: false
    conditions:
    - lastTransitionTime: 2018-08-13T16:49:18Z
      message: 'The instance references a ClusterServicePlan that does not exist.
        References a non-existent ClusterServicePlan {PlanExternalName:"test"} on
        ClusterServiceClass 1dda1477cace09730bd8ed7a6505607e {ClassExternalName:"dh-postgresql-apb"}
        or there is more than one (found: 0)'
      reason: ReferencesNonexistentServicePlan
      status: "False"
      type: Ready

Comment 4 Jesus M. Rodriguez 2018-08-13 16:54:46 UTC
Created attachment 1475622 [details]
full-spec with failed message

Comment 5 Jesus M. Rodriguez 2018-08-13 16:55:39 UTC
Going from test to prod resulted in a successful message:

  status:
    asyncOpInProgress: true
    conditions:
    - lastTransitionTime: 2018-08-13T16:49:18Z
      message: The instance is being updated asynchronously
      reason: UpdatingInstance
      status: "False"
      type: Ready
    currentOperation: Update

Comment 6 Jesus M. Rodriguez 2018-08-13 16:56:05 UTC
Created attachment 1475623 [details]
full-spec with successful message

Comment 7 Jesus M. Rodriguez 2018-08-13 16:56:51 UTC
Please re-test using ansible-service-broker 1.3.8-1 or newer

Comment 9 Jesus M. Rodriguez 2018-08-16 19:49:13 UTC
After researching the problem some more, the broker is returning the correct information from the update. The events are not controlled by the broker. I believe this is a service catalog bug with setting the update status event.

Comment 10 Jay Boyd 2018-08-21 19:05:42 UTC
Something is indeed wrong in the event logging.  I reproduced the issue, then created a test broker to more easily debug.  The missing events does not always happen.  Here is an example where you can see the current operation in progress in the conditions (& the Catalog Controller log) but it's missing from the events:

Status:
  Async Op In Progress:  true
  Conditions:
    Last Transition Time:  2018-08-21T18:46:16Z
    Message:               The instance is being updated asynchronously (operation in progress: 0.75%)
    Reason:                UpdatingInstance
    Status:                False
    Type:                  Ready
  Current Operation:       Update
  Deprovision Status:      Required
  External Properties:
    Cluster Service Plan External ID:    86064792-7ea2-467b-af93-ac9694d96d5x
    Cluster Service Plan External Name:  premium
  In Progress Properties:
    Cluster Service Plan External ID:    86064792-7ea2-467b-af93-ac9694d96d5b
    Cluster Service Plan External Name:  default
  Observed Generation:                   15
  Operation Start Time:                  2018-08-21T18:48:16Z
  Orphan Mitigation In Progress:         false
  Provision Status:                      Provisioned
  Reconciled Generation:                 11
Events:
  Type     Reason                            Age                 From                                Message
  ----     ------                            ----                ----                                -------
  Normal   Provisioning                      19m                 service-catalog-controller-manager  The instance is being provisioned asynchronously
  Normal   Provisioning                      19m                 service-catalog-controller-manager  The instance is being provisioned asynchronously (operation in progress: 0.25%)
  Normal   Provisioning                      18m                 service-catalog-controller-manager  The instance is being provisioned asynchronously (operation in progress: 0.5%)
  Normal   Provisioning                      18m                 service-catalog-controller-manager  The instance is being provisioned asynchronously (operation in progress: 0.75%)
  Normal   ProvisionedSuccessfully           18m                 service-catalog-controller-manager  The instance was provisioned successfully
  Normal   UpdatingInstance                  18m                 service-catalog-controller-manager  The instance is being updated asynchronously
  Normal   UpdatingInstance                  18m                 service-catalog-controller-manager  The instance is being updated asynchronously (operation in progress: 0.25%)
  Normal   UpdatingInstance                  18m                 service-catalog-controller-manager  The instance is being updated asynchronously (operation in progress: 0.5%)
  Normal   UpdatingInstance                  18m                 service-catalog-controller-manager  The instance is being updated asynchronously (operation in progress: 0.75%)
  Normal   InstanceUpdatedSuccessfully       18m                 service-catalog-controller-manager  The instance was updated successfully
  Warning  ReferencesNonexistentServicePlan  15m (x12 over 16m)  service-catalog-controller-manager  References a non-existent ClusterServicePlan {ClusterServicePlanExternalName:"invalid"} on ClusterServiceClass 4f6e6cf6-ffdd-425f-a2c7-3c9258ad246a {ClusterServiceClassExternalName:"example-starter-pack-service"} or there is more than one (found: 0)
  Warning  ReferencesNonexistentServicePlan  15m (x3 over 15m)   service-catalog-controller-manager  References a non-existent ClusterServicePlan {ClusterServicePlanExternalName:"dev"} on ClusterServiceClass 4f6e6cf6-ffdd-425f-a2c7-3c9258ad246a {ClusterServiceClassExternalName:"example-starter-pack-service"} or there is more than one (found: 0)
  Warning  ReferencesNonexistentServicePlan  12m                 service-catalog-controller-manager  References a non-existent ClusterServicePlan {ClusterServicePlanExternalName:"XXX"} on ClusterServiceClass 4f6e6cf6-ffdd-425f-a2c7-3c9258ad246a {ClusterServiceClassExternalName:"example-starter-pack-service"} or there is more than one (found: 0)
  Warning  ReferencesNonexistentServicePlan  3m (x13 over 3m)    service-catalog-controller-manager  References a non-existent ClusterServicePlan {ClusterServicePlanExternalName:"XXXXXX"} on ClusterServiceClass 4f6e6cf6-ffdd-425f-a2c7-3c9258ad246a {ClusterServiceClassExternalName:"example-starter-pack-service"} or there is more than one (found: 0)
  Normal   UpdatingInstance                  2m (x2 over 4m)     service-catalog-controller-manager  The instance is being updated asynchronously
  Normal   UpdatingInstance                  2m (x2 over 4m)     service-catalog-controller-manager  The instance is being updated asynchronously (operation in progress: 0.25%)
  Normal   UpdatingInstance                  2m (x2 over 4m)     service-catalog-controller-manager  The instance is being updated asynchronously (operation in progress: 0.5%)
  Normal   UpdatingInstance                  2m (x2 over 4m)     service-catalog-controller-manager  The instance is being updated asynchronously (operation in progress: 0.75%)
  Normal   InstanceUpdatedSuccessfully       2m (x2 over 3m)     service-catalog-controller-manager  The instance was updated successfully
  Warning  ReferencesNonexistentServicePlan  2m (x2 over 2m)     service-catalog-controller-manager  References a non-existent ClusterServicePlan {ClusterServicePlanExternalName:"XXXXXXXXXXxxxxxxxxXXXXXXXXX"} on ClusterServiceClass 4f6e6cf6-ffdd-425f-a2c7-3c9258ad246a {ClusterServiceClassExternalName:"example-starter-pack-service"} or there is more than one (found: 0)

Comment 11 Jian Zhang 2018-08-22 07:07:02 UTC
I also encounter the event losing issue. I did below options:
changed the clusterserviceplan name of the serviceinstance:
 "dev" -> "dev-123" -> "dev".

  Warning  ErrorWithParameters               20m (x2 over 20m)   service-catalog-controller-manager  failed to prepare parameters nil: secrets "brew-postgresql-apb-parametersrdwk7" not found
  Normal   Provisioning                      20m                 service-catalog-controller-manager  The instance is being provisioned asynchronously
  Normal   Provisioning                      20m (x3 over 20m)   service-catalog-controller-manager  The instance is being provisioned asynchronously (action started)
  Normal   Provisioning                      20m                 service-catalog-controller-manager  The instance is being provisioned asynchronously (100%: Done)
  Normal   ProvisionedSuccessfully           20m                 service-catalog-controller-manager  The instance was provisioned successfully
  Warning  ReferencesNonexistentServicePlan  13m (x14 over 13m)  service-catalog-controller-manager  References a non-existent ClusterServicePlan {ClusterServicePlanExternalName:"dev-123"} on ClusterServiceClass 48749329dd289591e11ba737f15fc71b {ClusterServiceClassExternalName:"brew-postgresql-apb"} or there is more than one (found: 0)
  Normal   UpdatingInstance                  12m (x3 over 12m)   service-catalog-controller-manager  The instance is being updated asynchronously (action started)

Finally, the serviceinstance updated completed, but I did NOT get events showing the progress, such as: "The instance is being updated asynchronously (50%: Objects updated state=present)" only get the "(action started)" event.

Comment 12 Jian Zhang 2018-08-22 07:08:13 UTC
The service catalog version:
sh-4.2$ service-catalog --version
v3.11.0-0.17.0;Upstream:v0.1.27

Comment 13 Jay Boyd 2018-08-27 18:12:08 UTC
Most easily reproduced by editing an instance and setting it to an invalid plan.  The issue is that we produce too many events.  You are allowed a maximum of 25 events in a 5 minute period and then only 1 event per 5 minutes.

I think we could lessen the chance of events being categorized as spam if we can suppress reporting the same error event too many times.

https://github.com/kubernetes-incubator/service-catalog/issues/2302

Comment 14 Jay Boyd 2018-09-04 12:48:32 UTC
This issue is not new, it's been here since the initial release of Service Catalog and it impacts the list of events - - the status is and controller logs still contain all of the correct information.  Fixing this will require some redesign in the Catalog controller, Let's address this in 4.0.

Comment 15 Jay Boyd 2019-03-27 20:15:32 UTC
Most likely won't fix this but leaving it here for final triage.  Its the rate limiter that is causing this issue