Bug 1567847

Summary: Sending multiple provision request to asb when provision the same apb for the second time
Product: OpenShift Container Platform Reporter: Zihan Tang <zitang>
Component: Service CatalogAssignee: Marko Luksa <mluksa>
Status: CLOSED NEXTRELEASE QA Contact: Zihan Tang <zitang>
Severity: medium Docs Contact:
Priority: medium    
Version: 3.10.0CC: chezhang, jaboyd, jiazha, jmatthew, mluksa, suchaudh, zhsun, zitang
Target Milestone: ---   
Target Release: 3.10.z   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2018-10-18 14:30:39 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Attachments:
Description Flags
controller log none

Description Zihan Tang 2018-04-16 09:51:02 UTC
Description of problem:
When provision postgresql-apb, the controller-manager sent multiple provision request 

Version-Release number of selected component (if applicable):
v3.10.0-0.21.0;Upstream:v0.1.13

How reproducible:
always

Steps to Reproduce:
1. set controller-manager log level to '6'
2. provision postgresql-apb
3. check the controller-manager log

Actual results:

in step 3, the controller manager sent multiple request to asb
#oc logs -f controller-manager-htxcv -n kube-service-catalog | grep -i provision

I0416 09:39:47.626477       1 controller_instance.go:1134] ServiceInstance "test/rh-postgresql-apb-wcsp5": Provision request for ServiceInstance in-flight to Broker
I0416 09:39:48.225491       1 controller_instance.go:1134] ServiceInstance "test/rh-postgresql-apb-wcsp5": Provision request for ServiceInstance in-flight to Broker
I0416 09:39:48.825947       1 controller_instance.go:1134] ServiceInstance "test/rh-postgresql-apb-wcsp5": Provision request for ServiceInstance in-flight to Broker
I0416 09:39:49.424534       1 controller_instance.go:383] ServiceInstance "test/rh-postgresql-apb-wcsp5": Provisioning a new ServiceInstance of ClusterServiceClass (K8S: "d5915e05b253df421efe6e41fb6a66ba" ExternalName: "rh-postgresql-apb") at ClusterServiceBroker "ansible-service-broker"
I0416 09:39:49.515830       1 controller_instance.go:1134] ServiceInstance "test/rh-postgresql-apb-wcsp5": The instance is being provisioned asynchronously
I0416 09:39:49.543159       1 event.go:218] Event(v1.ObjectReference{Kind:"ServiceInstance", Namespace:"test", Name:"rh-postgresql-apb-wcsp5", UID:"19e4be5b-415a-11e8-ad4c-0a580a80000c", APIVersion:"servicecatalog.k8s.io", ResourceVersion:"62100", FieldPath:""}): type: 'Normal' reason: 'Provisioning' The instance is being provisioned asynchronously
I0416 09:39:50.049012       1 controller_instance.go:383] ServiceInstance "test/rh-postgresql-apb-wcsp5": Provisioning a new ServiceInstance of ClusterServiceClass (K8S: "d5915e05b253df421efe6e41fb6a66ba" ExternalName: "rh-postgresql-apb") at ClusterServiceBroker "ansible-service-broker"
I0416 09:39:50.132615       1 controller_instance.go:1134] ServiceInstance "test/rh-postgresql-apb-wcsp5": The instance is being provisioned asynchronously
I0416 09:39:50.274203       1 controller_instance.go:1134] ServiceInstance "test/rh-postgresql-apb-wcsp5": The instance is being provisioned asynchronously (action started)
I0416 09:39:50.274536       1 event.go:218] Event(v1.ObjectReference{Kind:"ServiceInstance", Namespace:"test", Name:"rh-postgresql-apb-wcsp5", UID:"19e4be5b-415a-11e8-ad4c-0a580a80000c", APIVersion:"servicecatalog.k8s.io", ResourceVersion:"62107", FieldPath:""}): type: 'Normal' reason: 'Provisioning' The instance is being provisioned asynchronously (action started)
I0416 09:39:50.574787       1 controller_instance.go:1134] ServiceInstance "test/rh-postgresql-apb-wcsp5": The instance is being provisioned asynchronously (action started)
I0416 09:39:50.575297       1 event.go:218] Event(v1.ObjectReference{Kind:"ServiceInstance", Namespace:"test", Name:"rh-postgresql-apb-wcsp5", UID:"19e4be5b-415a-11e8-ad4c-0a580a80000c", APIVersion:"servicecatalog.k8s.io", ResourceVersion:"62156", FieldPath:""}): type: 'Normal' reason: 'Provisioning' The instance is being provisioned asynchronously (action started)
I0416 09:39:54.623849       1 controller_instance.go:1134] ServiceInstance "test/rh-postgresql-apb-wcsp5": The instance is being provisioned asynchronously (action started)
I0416 09:39:54.624366       1 event.go:218] Event(v1.ObjectReference{Kind:"ServiceInstance", Namespace:"test", Name:"rh-postgresql-apb-wcsp5", UID:"19e4be5b-415a-11e8-ad4c-0a580a80000c", APIVersion:"servicecatalog.k8s.io", ResourceVersion:"62156", FieldPath:""}): type: 'Normal' reason: 'Provisioning' The instance is being provisioned asynchronously (action started)

Expected results:
only sent 1 provision request.

Additional info:

Comment 1 Paul Morie 2018-04-19 23:23:03 UTC
Marko, you've done a lot of work on this area. Would you take a look?

Comment 2 Marko Luksa 2018-04-20 06:05:19 UTC
I'm unable to reproduce this. In my case, the provision request is only sent once. The behavior you're seeing is consistent with an older version of the service catalog.

Please confirm that you're really running v0.1.13 of the service catalog. This issue was fixed in v0.1.11. 

To see the controller's version, please run: oc logs -f controller-manager-htxcv -n kube-service-catalog | head

Comment 3 Zhang Cheng 2018-04-24 01:54:08 UTC
*** Bug 1570578 has been marked as a duplicate of this bug. ***

Comment 4 Zihan Tang 2018-04-26 06:20:07 UTC
When provision apb , I find logs in asb: 
[2018-04-26T05:59:18.611Z] [INFO] - All Jobs for instance: f41d69a8-4916-11e8-9198-0a580a800003 in state:  in progress - 
[]apb.JobState{apb.JobState{Token:"822e03f4-8929-4f58-b98f-99f153e37be2", State:"in progress", Podname:"", Method:"provision", Error:"", Description:"action started"}}
[2018-04-26T05:59:18.611Z] [INFO] - Provision requested for instance f41d69a8-4916-11e8-9198-0a580a800003, but job is already in progress
[2018-04-26T05:59:18.611Z] [ERROR] - provision error provision in progress

It's not always reproduced,  about 80%. 

controller version: v0.1.13
in the log : Service Catalog version v0.0.0-master+$Format:%h$ (built 2018-04-25T14:16:15Z)
# oc exec controller-manager-f2q9t  -- /usr/bin/service-catalog --version
v3.10.0-0.29.0;Upstream:v0.1.13

Comment 5 Marko Luksa 2018-05-04 08:46:00 UTC
Where are you getting the Service Catalog image from? I'd like to try this out with the exact same image, because I still haven't been able to reproduce this bug with the images I've tried (which includes v0.1.13).

Comment 7 Zihan Tang 2018-05-04 09:24:02 UTC
And when I use the 
service-catalog: v3.10.0-0.32.0;Upstream:v0.1.13  and ASB:v1.2.7 
 
still find the issue in asb when provision 
[2018-05-04T09:20:41.507Z] [INFO] - All Jobs for instance: 695336f2-4f7c-11e8-9ef9-0a580a800005 in state:  in progress - 
[]apb.JobState{apb.JobState{Token:"d934a5b8-8bda-4498-9351-31d465204e21", State:"in progress", Podname:"", Method:"provision", Error:"", Description:"action started"}}
[2018-05-04T09:20:41.507Z] [INFO] - Provision requested for instance 695336f2-4f7c-11e8-9ef9-0a580a800005, but job is already in progress
[2018-05-04T09:20:41.507Z] [ERROR] - provision error provision in progress

Comment 8 Zihan Tang 2018-05-14 09:50:30 UTC
In asb 1.2.10, service-catalog v3.10.0-0.38.0;Upstream:v0.1.16
this bug is 'NOT ALWAYS' reproduced.

If it is the first time provision the certain apb(no this serviceinstance exist in all namespace), it's not reproduced.
if provision the apb for the second time (more than one this intance exist in all namespace), this will be reproduced.

ASB logs like: 
[2018-05-14T09:38:18.559Z] [INFO] - All Jobs for instance: 86b216c0-575a-11e8-be4e-0a580a800003 in state:  in progress - 
[]apb.JobState{apb.JobState{Token:"8065f043-e832-451d-a2ec-c255f382c427", State:"in progress", Podname:"", Method:"provision", Error:"", Description:"action started"}}
[2018-05-14T09:38:18.559Z] [INFO] - Provision requested for instance 86b216c0-575a-11e8-be4e-0a580a800003, but job is already in progress
[2018-05-14T09:38:18.559Z] [ERROR] - provision error provision in progress

Comment 9 Jay Boyd 2018-05-22 15:16:55 UTC
@Marko, we're closing down bugs for 3.10.  We need to fix this ASAP or move the target release out to 3.11.  I'm thinking we probably want to bump this out, do you agree?

Comment 10 Marko Luksa 2018-05-23 13:06:24 UTC
I still can't reproduce this. I've provisioned five "PostgreSQL (APB)" services in different namespaces and there is always only one provision request. 

I'm using service catalog v0.1.19. 

Are you sure you don't have two instances of the controller manager running? Possibly in different namespaces?

Please attach the full logs of the controller manager (with --v 6) so we can confirm that it's in fact a single controller sending multiple requests to the ansible service broker.

Comment 11 Zihan Tang 2018-05-24 03:33:19 UTC
I use service catalog v3.10.0-0.50.0;Upstream:v0.1.19,
but this is not always reproduced.
I use mysql-apb and maria-apb , and reproduced this time:
asb log:
[2018-05-24T03:21:25.992Z] [INFO] - All Jobs for instance: 89bac308-5f01-11e8-9f13-0a580a800005 in state:  in progress - 
[]apb.JobState{apb.JobState{Token:"9ff3413a-1517-47c4-951c-038677ae7149", State:"in progress", Podname:"", Method:"provision", Error:"", Description:"action started"}}
[2018-05-24T03:21:25.992Z] [INFO] - Provision requested for instance 89bac308-5f01-11e8-9f13-0a580a800005, but job is already in progress
[2018-05-24T03:21:25.992Z] [ERROR] - provision error provision in progress
10.128.0.1 - - [24/May/2018:03:21:25 +0000] "PUT /ansible-service-broker/v2/service_instances/89bac308-5f01-11e8-9f13-0a580a800005?accepts_incomplete=true HTTP/1.1" 202 58
time="2018-05-24T03:21:26Z" level=info msg="Successfully created apb sandbox: [ %s ], with %s permissions in namespace %sapb-05c186c9-c41b-478e-8a2a-4a1807529f1ceditdh-mysql-apb-prov-nd2cf"
time="2018-05-24T03:21:26Z" level=info msg="Running post create sandbox fuctions if defined."
.....
time="2018-05-24T03:23:18Z" level=info msg="Creating RoleBinding apb-ed3e0c6f-3fed-4be8-b7ca-d1a70ff47452"
[2018-05-24T03:23:18.508Z] [INFO] - All Jobs for instance: cc64c254-5f01-11e8-9f13-0a580a800005 in state:  in progress - 
[]apb.JobState{apb.JobState{Token:"5833e48d-250a-4e24-8a32-f04dd2c62b78", State:"in progress", Podname:"", Method:"provision", Error:"", Description:"action started"}}
[2018-05-24T03:23:18.508Z] [INFO] - Provision requested for instance cc64c254-5f01-11e8-9f13-0a580a800005, but job is already in progress
[2018-05-24T03:23:18.508Z] [ERROR] - provision error provision in progress

Comment 12 Zihan Tang 2018-05-24 03:34:35 UTC
Created attachment 1440933 [details]
controller log

Here's the controller's log when I provision postgresql-apb, mysql-apb, mariadb-apb.

Comment 13 Zihan Tang 2018-05-24 08:34:52 UTC
The apbs I used is v3.10.0-0.32.0.0
I'm not sure whether it is caused by apb.

Comment 14 Marko Luksa 2018-05-28 13:13:48 UTC
I was able to reproduce this last week and figure out what is causing it. 

This issue is almost always reproducible like this:

- oc apply -f serviceinstance.yaml ; sleep 1 ; oc apply -f secret.yaml

where the secret.yaml holds the secret that serviceinstance.yaml refers to in its `parametersFrom`. 

It doesn't matter if this is the first ServiceInstance you're provisioning or if there are existing instances in the cluster.

Comment 15 Marko Luksa 2018-05-29 14:59:26 UTC
This problem isn't trivial to fix, so I'm moving this out of 3.10.

Comment 17 Jay Boyd 2018-09-25 18:21:18 UTC
What is the impact of this bug?  Is it limited to confusion when reviewing the logs and events on the service instance because it appears the instance is provisioned multiple times?  With enough time, the instance is eventually provisioned and works, correct?

Comment 18 Zihan Tang 2018-09-26 09:19:36 UTC
(In reply to Jay Boyd from comment #17)
> What is the impact of this bug?  Is it limited to confusion when reviewing
> the logs and events on the service instance because it appears the instance
> is provisioned multiple times?  
This is reported according to card: 
https://trello.com/c/oRfQX2Ih/110-5-catalog310-reconciliation-loop-rework
the asb log shows it receive mulitple request from service catalog

With enough time, the instance is eventually
> provisioned and works, correct?
Yes, it provisioned and works

Besides, for 3.11, asb didn't report this kind of log any more.

Comment 19 Jay Boyd 2018-09-26 14:04:34 UTC
Thanks Zihan.