Bug 1771931

Summary: Catalogsource AVAILABLE will be false when enable it (message: 'Available: v1beta1.servicecatalog.k8s.io is not ready: 503')
Product: OpenShift Container Platform Reporter: Jian Zhang <jiazha>
Component: Service Catalog Assignee: Jesus M. Rodriguez <jesusr>
Status: CLOSED ERRATA QA Contact: Fan Jia <jfan>
Severity: urgent Docs Contact:
Priority: urgent    
Version: 4.3.0 CC: bandrade, chuo, scolange, wzheng
Target Milestone: ---   
Target Release: 4.3.0   
Hardware: Unspecified   
OS: Unspecified   
Last Closed: 2020-01-23 11:12:28 UTC Type: Bug
Bug Blocks: 1769700    

Description Jian Zhang 2019-11-13 09:41:50 UTC
Description of problem:
The ClusterOperator service-catalog-apiserver is unavailable. It reports the message:
'Available: v1beta1.servicecatalog.k8s.io is not ready: 503'

Version-Release number of selected component (if applicable):
4.3.0-0.nightly-2019-11-12-185229 

How reproducible:
always

Steps to Reproduce:
1. Install an OCP 4.3 with proxy, for example https://openshift-qe-jenkins.rhev-ci-vms.eng.rdu2.redhat.com/job/Launch%20Environment%20Flexy/71316/artifact/workdir/install-dir/auth/kubeconfig/*view*/

2. Enable Service Catalog.

3. Check the status of the service-catalog-apiserver.

Actual results:
The service-catalog-apiserver is unavailable.
mac:~ jianzhang$ oc get co |grep service-catalog-apiserver
service-catalog-apiserver                  4.3.0-0.nightly-2019-11-12-185229   False       False         False      41m

Got the message: 'Available: v1beta1.servicecatalog.k8s.io is not ready: 503'
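
For scripting this check, here is a minimal sketch that splits one row of `oc get co` output into named fields. It assumes the default column order (NAME, VERSION, AVAILABLE, PROGRESSING, DEGRADED, SINCE); `parse_co_line` is a hypothetical helper, not part of `oc`.

```python
def parse_co_line(line):
    """Split one `oc get co` data row into a dict keyed by column name.

    Assumes the six default columns are all populated; a row with an
    empty VERSION column would need different handling.
    """
    fields = line.split()
    if len(fields) != 6:
        raise ValueError(f"expected 6 columns, got {len(fields)}")
    name, version, available, progressing, degraded, since = fields
    return {
        "name": name,
        "version": version,
        "available": available == "True",
        "progressing": progressing == "True",
        "degraded": degraded == "True",
        "since": since,
    }


# Row copied from the output above.
row = parse_co_line(
    "service-catalog-apiserver                  "
    "4.3.0-0.nightly-2019-11-12-185229   False       False         False      41m"
)
print(row["name"], "available:", row["available"])
```

A row where `available` is False while `degraded` is also False matches the state reported here.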

Expected results:
The service-catalog-apiserver should be in Available status.

Additional info:
Debug steps:
1) I checked the APIService v1beta1.servicecatalog.k8s.io, and it reports Available. The apiserver pods and the service also look fine. See below:

mac:~ jianzhang$ oc get apiservice v1beta1.servicecatalog.k8s.io  -o yaml
apiVersion: apiregistration.k8s.io/v1
kind: APIService
metadata:
  annotations:
    service.alpha.openshift.io/inject-cabundle: "true"
  creationTimestamp: "2019-11-13T08:22:42Z"
  name: v1beta1.servicecatalog.k8s.io
  resourceVersion: "42456"
  selfLink: /apis/apiregistration.k8s.io/v1/apiservices/v1beta1.servicecatalog.k8s.io
  uid: c894b3c1-52df-42f5-8544-621f9e2b4f1d
spec:
  caBundle: xxx
  group: servicecatalog.k8s.io
  groupPriorityMinimum: 9900
  service:
    name: api
    namespace: openshift-service-catalog-apiserver
    port: 443
  version: v1beta1
  versionPriority: 15
status:
  conditions:
  - lastTransitionTime: "2019-11-13T08:22:43Z"
    message: all checks passed
    reason: Passed
    status: "True"
    type: Available
mac:~ jianzhang$ oc get pods -n openshift-service-catalog-apiserver 
NAME              READY   STATUS    RESTARTS   AGE
apiserver-7vng6   1/1     Running   0          76m
apiserver-8plzb   1/1     Running   0          76m
apiserver-p6dj5   1/1     Running   0          76m
mac:~ jianzhang$ oc get svc -n openshift-service-catalog-apiserver 
NAME   TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)   AGE
api    ClusterIP   172.30.150.220   <none>        443/TCP   76m

2) Following the check in the operator code (https://github.com/openshift/cluster-svcat-apiserver-operator/blob/master/pkg/operator/workloadcontroller/apigroup.go#L20-L27),
I queried the API group directly, and it responds correctly. See below:
mac:~ jianzhang$ oc get --raw /apis/servicecatalog.k8s.io/v1beta1
{"kind":"APIResourceList","apiVersion":"v1","groupVersion":"servicecatalog.k8s.io/v1beta1","resources":[{"name":"clusterservicebrokers","singularName":"","namespaced":false,"kind":"ClusterServiceBroker","verbs":["create","delete","deletecollection","get","list","patch","update","watch"]},{"name":"clusterservicebrokers/status","singularName":"","namespaced":false,"kind":"ClusterServiceBroker","verbs":["get","patch","update"]},{"name":"clusterserviceclasses","singularName":"","namespaced":false,"kind":"ClusterServiceClass","verbs":["create","delete","deletecollection","get","list","patch","update","watch"]},{"name":"clusterserviceclasses/status","singularName":"","namespaced":false,"kind":"ClusterServiceClass","verbs":["get","patch","update"]},{"name":"clusterserviceplans","singularName":"","namespaced":false,"kind":"ClusterServicePlan","verbs":["create","delete","deletecollection","get","list","patch","update","watch"]},{"name":"clusterserviceplans/status","singularName":"","namespaced":false,"kind":"ClusterServicePlan","verbs":["get","patch","update"]},{"name":"servicebindings","singularName":"","namespaced":true,"kind":"ServiceBinding","verbs":["create","delete","deletecollection","get","list","patch","update","watch"]},{"name":"servicebindings/status","singularName":"","namespaced":true,"kind":"ServiceBinding","verbs":["get","patch","update"]},{"name":"servicebrokers","singularName":"","namespaced":true,"kind":"ServiceBroker","verbs":["create","delete","deletecollection","get","list","patch","update","watch"]},{"name":"servicebrokers/status","singularName":"","namespaced":true,"kind":"ServiceBroker","verbs":["get","patch","update"]},{"name":"serviceclasses","singularName":"","namespaced":true,"kind":"ServiceClass","verbs":["create","delete","deletecollection","get","list","patch","update","watch"]},{"name":"serviceclasses/status","singularName":"","namespaced":true,"kind":"ServiceClass","verbs":["get","patch","update"]},{"name":"serviceinstances","singularName":"","namespaced":true,"kind":"ServiceInstance","verbs":["create","delete","deletecollection","get","list","patch","update","watch"]},{"name":"serviceinstances/reference","singularName":"","namespaced":true,"kind":"ServiceInstance","verbs":["get","patch","update"]},{"name":"serviceinstances/status","singularName":"","namespaced":true,"kind":"ServiceInstance","verbs":["get","patch","update"]},{"name":"serviceplans","singularName":"","namespaced":true,"kind":"ServicePlan","verbs":["create","delete","deletecollection","get","list","patch","update","watch"]},{"name":"serviceplans/status","singularName":"","namespaced":true,"kind":"ServicePlan","verbs":["get","patch","update"]}]}
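
The message format ('... is not ready: 503') suggests a check that requests the group/version discovery path and reports the HTTP status when it is not 200. The following is only a rough sketch of that idea under that assumption, not the operator's actual Go code; `check_api_group` and the `fetch_status` callable are hypothetical names.

```python
def check_api_group(fetch_status, group, version):
    """Return None if the group/version is served, else a 'not ready' message.

    fetch_status is a hypothetical callable that takes a URL path and
    returns the HTTP status code of a GET against the cluster API.
    """
    path = f"/apis/{group}/{version}"
    status = fetch_status(path)
    if status == 200:
        return None
    # The message shape mirrors the one seen in this report (an assumption).
    return f"{version}.{group} is not ready: {status}"


# Stubbed responses stand in for a live cluster.
failing = check_api_group(lambda path: 503, "servicecatalog.k8s.io", "v1beta1")
passing = check_api_group(lambda path: 200, "servicecatalog.k8s.io", "v1beta1")
print(failing)
print(passing)
```

If the real check behaves like this, a 503 seen by the operator while `oc get --raw` succeeds from outside would point at a difference in how the two requests reach the API.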

Comment 1 Jian Zhang 2019-11-13 10:14:05 UTC
Also, the apiserver-operator pod runs on the same node (wzheng-lfd4d-m-2) as one of the apiserver pods. See below:
mac:~ jianzhang$ oc get pods -n openshift-service-catalog-apiserver  -o wide
NAME              READY   STATUS    RESTARTS   AGE   IP            NODE                                       NOMINATED NODE   READINESS GATES
apiserver-7vng6   1/1     Running   0          32m   10.129.0.47   wzheng-lfd4d-m-0.c.openshift-qe.internal   <none>           <none>
apiserver-8plzb   1/1     Running   0          32m   10.128.0.55   wzheng-lfd4d-m-1.c.openshift-qe.internal   <none>           <none>
apiserver-p6dj5   1/1     Running   0          32m   10.130.0.38   wzheng-lfd4d-m-2.c.openshift-qe.internal   <none>           <none>

mac:~ jianzhang$ oc get pods -n openshift-service-catalog-apiserver-operator   -o wide
NAME                                                            READY   STATUS    RESTARTS   AGE   IP            NODE                                       NOMINATED NODE   READINESS GATES
openshift-service-catalog-apiserver-operator-86cfd8c774-rdz87   1/1     Running   0          48m   10.130.0.13   wzheng-lfd4d-m-2.c.openshift-qe.internal   <none>           <none>
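
The node comparison above can be done mechanically. This is a small sketch that parses the two `oc get pods -o wide` listings from this comment and reports which apiserver pods share a node with the operator; `pod_nodes` is a hypothetical helper.

```python
def pod_nodes(wide_output):
    """Map pod name -> node name from `oc get pods -o wide` text output."""
    lines = [line for line in wide_output.splitlines() if line.strip()]
    header = lines[0].split()
    # NAME and NODE both precede the multi-word headers ("NOMINATED NODE",
    # "READINESS GATES"), so their positions match the data columns.
    name_idx, node_idx = header.index("NAME"), header.index("NODE")
    rows = (line.split() for line in lines[1:])
    return {fields[name_idx]: fields[node_idx] for fields in rows}


APISERVER = """
NAME              READY   STATUS    RESTARTS   AGE   IP            NODE                                       NOMINATED NODE   READINESS GATES
apiserver-7vng6   1/1     Running   0          32m   10.129.0.47   wzheng-lfd4d-m-0.c.openshift-qe.internal   <none>           <none>
apiserver-8plzb   1/1     Running   0          32m   10.128.0.55   wzheng-lfd4d-m-1.c.openshift-qe.internal   <none>           <none>
apiserver-p6dj5   1/1     Running   0          32m   10.130.0.38   wzheng-lfd4d-m-2.c.openshift-qe.internal   <none>           <none>
"""

OPERATOR = """
NAME                                                            READY   STATUS    RESTARTS   AGE   IP            NODE                                       NOMINATED NODE   READINESS GATES
openshift-service-catalog-apiserver-operator-86cfd8c774-rdz87   1/1     Running   0          48m   10.130.0.13   wzheng-lfd4d-m-2.c.openshift-qe.internal   <none>           <none>
"""

operator_nodes = set(pod_nodes(OPERATOR).values())
shared = sorted(
    pod for pod, node in pod_nodes(APISERVER).items() if node in operator_nodes
)
print(shared)
```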

Comment 2 Jian Zhang 2019-11-14 09:58:08 UTC
We hit this issue again. It is unrelated to whether the proxy is enabled for the cluster.
Cluster version: 4.3.0-0.nightly-2019-11-13-233341
Raising the priority.
It seems to be related to: https://github.com/openshift/cluster-svcat-apiserver-operator/blob/master/pkg/operator/workloadcontroller/workload_controller_openshiftapiserver_v311_00.go#L123-L128

mac:~ jianzhang$ oc get co service-catalog-apiserver  -o yaml
apiVersion: config.openshift.io/v1
kind: ClusterOperator
metadata:
  creationTimestamp: "2019-11-14T01:53:47Z"
  generation: 1
  name: service-catalog-apiserver
  resourceVersion: "258019"
  selfLink: /apis/config.openshift.io/v1/clusteroperators/service-catalog-apiserver
  uid: 98c644a7-0681-11ea-8ad2-fa163e424534
spec: {}
status:
  conditions:
  - lastTransitionTime: "2019-11-14T01:53:48Z"
    reason: AsExpected
    status: "False"
    type: Degraded
  - lastTransitionTime: "2019-11-14T08:46:31Z"
    reason: AsExpected
    status: "False"
    type: Progressing
  - lastTransitionTime: "2019-11-14T08:48:14Z"
    message: 'Available: v1beta1.servicecatalog.k8s.io is not ready: 503'
    reason: Available
    status: "False"
    type: Available
  - lastTransitionTime: "2019-11-14T05:46:15Z"
    reason: AsExpected
    status: "True"
    type: Upgradeable
  extension: null
  relatedObjects:
  - group: ""
    name: openshift-config
    resource: namespaces
  - group: ""
    name: openshift-config-managed
    resource: namespaces
  - group: ""
    name: openshift-service-catalog-apiserver-operator
    resource: namespaces
  - group: ""
    name: openshift-service-catalog-apiserver
    resource: namespaces
  - group: apiregistration.k8s.io
    name: v1beta1.servicecatalog.k8s.io
    resource: apiservices
  versions:
  - name: operator
    version: 4.3.0-0.nightly-2019-11-13-233341
  - name: service-catalog-apiserver
    version: ""


mac:~ jianzhang$ oc get servicecatalogapiserver cluster -o yaml
apiVersion: operator.openshift.io/v1
kind: ServiceCatalogAPIServer
metadata:
  annotations:
    release.openshift.io/create-only: "true"
  creationTimestamp: "2019-11-14T01:48:56Z"
  generation: 4
  name: cluster
  resourceVersion: "258018"
  selfLink: /apis/operator.openshift.io/v1/servicecatalogapiservers/cluster
  uid: ebb68227-0680-11ea-add8-fa163ec78d70
spec:
  logLevel: Normal
  managementState: Managed
status:
  conditions:
  - lastTransitionTime: "2019-11-14T08:48:14Z"
    message: 'v1beta1.servicecatalog.k8s.io is not ready: 503'
    status: "False"
    type: Available
  - lastTransitionTime: "2019-11-14T08:46:31Z"
    status: "False"
    type: Progressing
  - lastTransitionTime: "2019-11-14T01:53:48Z"
    reason: Removed
    status: "False"
    type: Degraded
  - lastTransitionTime: "2019-11-14T05:46:15Z"
    reason: NoUnsupportedConfigOverrides
    status: "True"
    type: UnsupportedConfigOverridesUpgradeable
  - lastTransitionTime: "2019-11-14T05:46:15Z"
    status: "False"
    type: ResourceSyncControllerDegraded
  - lastTransitionTime: "2019-11-14T05:46:21Z"
    status: "False"
    type: WorkloadDegraded
  generations:
  - group: apps
    hash: ""
    lastGeneration: 1
    name: apiserver
    namespace: openshift-service-catalog-apiserver
    resource: daemonsets
  observedGeneration: 4
  readyReplicas: 0

Comment 3 Jesus M. Rodriguez 2019-11-22 01:24:45 UTC
Recreated this in my 4.3 cluster:
apiVersion: config.openshift.io/v1
kind: ClusterOperator
metadata:
  creationTimestamp: "2019-11-20T15:55:51Z"
  generation: 1
  name: service-catalog-apiserver
  resourceVersion: "597646"
  selfLink: /apis/config.openshift.io/v1/clusteroperators/service-catalog-apiserver
  uid: bad93f4c-9477-4130-adbe-1d341a9f11df
spec: {}
status:
  conditions:
  - lastTransitionTime: "2019-11-21T20:16:03Z"
    reason: AsExpected
    status: "False"
    type: Degraded
  - lastTransitionTime: "2019-11-22T00:59:47Z"
    reason: AsExpected
    status: "False"
    type: Progressing
  - lastTransitionTime: "2019-11-22T00:59:44Z"
    message: 'Available: v1beta1.servicecatalog.k8s.io is not ready: 503'
    reason: Available
    status: "False"
    type: Available
  - lastTransitionTime: "2019-11-20T16:26:56Z"
    reason: AsExpected
    status: "True"
    type: Upgradeable
  extension: null
  relatedObjects:
  - group: ""
    name: openshift-config
    resource: namespaces
  - group: ""
    name: openshift-config-managed
    resource: namespaces
  - group: ""
    name: openshift-service-catalog-apiserver-operator
    resource: namespaces
  - group: ""
    name: openshift-service-catalog-apiserver
    resource: namespaces
  - group: apiregistration.k8s.io
    name: v1beta1.servicecatalog.k8s.io
    resource: apiservices
  versions:
  - name: operator
    version: 4.3.0-0.ci-2019-11-20-134433
  - name: service-catalog-apiserver
    version: ""

Comment 4 Jesus M. Rodriguez 2019-11-22 01:25:31 UTC
After the fix:
apiVersion: config.openshift.io/v1
kind: ClusterOperator
metadata:
  creationTimestamp: "2019-11-20T15:55:51Z"
  generation: 1
  name: service-catalog-apiserver
  resourceVersion: "599140"
  selfLink: /apis/config.openshift.io/v1/clusteroperators/service-catalog-apiserver
  uid: bad93f4c-9477-4130-adbe-1d341a9f11df
spec: {}
status:
  conditions:
  - lastTransitionTime: "2019-11-21T20:16:03Z"
    reason: AsExpected
    status: "False"
    type: Degraded
  - lastTransitionTime: "2019-11-22T00:59:47Z"
    reason: AsExpected
    status: "False"
    type: Progressing
  - lastTransitionTime: "2019-11-22T01:05:45Z"
    reason: AsExpected
    status: "True"
    type: Available
  - lastTransitionTime: "2019-11-20T16:26:56Z"
    reason: AsExpected
    status: "True"
    type: Upgradeable
  extension: null
  relatedObjects:
  - group: ""
    name: openshift-config
    resource: namespaces
  - group: ""
    name: openshift-config-managed
    resource: namespaces
  - group: ""
    name: openshift-service-catalog-apiserver-operator
    resource: namespaces
  - group: ""
    name: openshift-service-catalog-apiserver
    resource: namespaces
  - group: apiregistration.k8s.io
    name: v1beta1.servicecatalog.k8s.io
    resource: apiservices
  versions:
  - name: operator
    version: 4.3.0-0.ci-2019-11-20-134433
  - name: service-catalog-apiserver
    version: ""
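
To make the before/after difference explicit, here is a short sketch that compares the condition statuses from Comment 3 and Comment 4. The conditions are transcribed by hand from the YAML above into plain dicts; `diff_conditions` is a hypothetical helper.

```python
def diff_conditions(before, after):
    """Return {type: (old_status, new_status)} for conditions that changed."""
    old = {c["type"]: c["status"] for c in before}
    new = {c["type"]: c["status"] for c in after}
    return {
        cond_type: (old.get(cond_type), new.get(cond_type))
        for cond_type in sorted(set(old) | set(new))
        if old.get(cond_type) != new.get(cond_type)
    }


# Statuses transcribed from the two ClusterOperator dumps.
before_fix = [
    {"type": "Degraded", "status": "False"},
    {"type": "Progressing", "status": "False"},
    {"type": "Available", "status": "False"},
    {"type": "Upgradeable", "status": "True"},
]
after_fix = [
    {"type": "Degraded", "status": "False"},
    {"type": "Progressing", "status": "False"},
    {"type": "Available", "status": "True"},
    {"type": "Upgradeable", "status": "True"},
]

changes = diff_conditions(before_fix, after_fix)
print(changes)
```

Only the Available condition differs between the two dumps; the other three conditions keep their status.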

Comment 6 Fan Jia 2019-11-22 07:30:03 UTC
The latest nightly build doesn't include the fix PR. I will test once a nightly build with the fix is ready.

Comment 8 Fan Jia 2019-11-25 07:14:25 UTC
Test environment:
Cluster version: 4.3.0-0.nightly-2019-11-24-183610

Test result:
1. Enabled service-catalog-apiserver and service-catalog-controller-manager.
2. # oc get clusteroperators service-catalog-apiserver -o yaml
apiVersion: config.openshift.io/v1
kind: ClusterOperator
metadata:
  creationTimestamp: "2019-11-25T02:35:01Z"
  generation: 1
  name: service-catalog-apiserver
  resourceVersion: "71640"
  selfLink: /apis/config.openshift.io/v1/clusteroperators/service-catalog-apiserver
  uid: 5fa29b0b-a102-40f7-8d51-21695cab0f28
spec: {}
status:
  conditions:
  - lastTransitionTime: "2019-11-25T02:35:01Z"
    reason: AsExpected
    status: "False"
    type: Degraded
  - lastTransitionTime: "2019-11-25T05:43:05Z"
    reason: AsExpected
    status: "False"
    type: Progressing
  - lastTransitionTime: "2019-11-25T05:44:16Z"
    reason: AsExpected
    status: "True"
    type: Available
  - lastTransitionTime: "2019-11-25T05:42:56Z"
    reason: AsExpected
    status: "True"
    type: Upgradeable
  extension: null
  relatedObjects:
  - group: ""
    name: openshift-config
    resource: namespaces
  - group: ""
    name: openshift-config-managed
    resource: namespaces
  - group: ""
    name: openshift-service-catalog-apiserver-operator
    resource: namespaces
  - group: ""
    name: openshift-service-catalog-apiserver
    resource: namespaces
  - group: apiregistration.k8s.io
    name: v1beta1.servicecatalog.k8s.io
    resource: apiservices
  versions:
  - name: operator
    version: 4.3.0-0.nightly-2019-11-24-183610
  - name: service-catalog-apiserver
    version: ""
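
The verification above amounts to checking that all four conditions have the expected status. A minimal sketch of that check follows, with the statuses transcribed by hand from the YAML; the `EXPECTED` mapping and `unexpected_conditions` are hypothetical names, and the expected values are taken from the healthy output in this comment.

```python
# Expected statuses, taken from the healthy ClusterOperator shown above.
EXPECTED = {
    "Degraded": "False",
    "Progressing": "False",
    "Available": "True",
    "Upgradeable": "True",
}


def unexpected_conditions(conditions, expected=EXPECTED):
    """Return {type: actual_status} for conditions that differ from expected.

    A condition missing from the input is reported with status None.
    """
    actual = {c["type"]: c["status"] for c in conditions}
    return {
        cond_type: actual.get(cond_type)
        for cond_type, want in expected.items()
        if actual.get(cond_type) != want
    }


verified = [
    {"type": "Degraded", "status": "False"},
    {"type": "Progressing", "status": "False"},
    {"type": "Available", "status": "True"},
    {"type": "Upgradeable", "status": "True"},
]
# Same conditions with Available flipped, as in the original failure.
failing = [dict(c, status="False") if c["type"] == "Available" else c for c in verified]

print(unexpected_conditions(verified))
print(unexpected_conditions(failing))
```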

Comment 10 errata-xmlrpc 2020-01-23 11:12:28 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:0062